Move BearSSL from STACK_PROXY to a real, thunked 2nd stack (esp8266#5168)

* Update to BearSSL 0.6+ release, add AES_CCM modes

Pull in latest BearSSL head (0.6 + minor additions) release and add AES_CCM
modes to the encryption options.

* Enable the aes_ccm initialization in client/server

* Initial attempt

* Working code with second stack thunking

* Remove #ifdefs in .S file, not needed.

* Clean up thunks and remove separate stack flag

* Fix PIO assembler errors

* Remove #ifdef code changes, ensure same code as PC

Remove "#ifdef ESP8266;...;#else;...;#endif" brackets in BearSSL to
ensure the host-tested code is the same as the ESP8266-run code.

* Move to latest BearSSL w/EC progmem savings

* Merge with master

* Add br_thunk_* calls to do ref counting, painting

Add reference counting br_thunk_add/del_ref() to replace stack handling code
in the class.

Add in stack painting and max usage calculation.

* Add in postmortem stack dump hooks

When a crash occurs while in the second stack, dump the BSSL stack and
then also the stack that it was called from (either cont or sys).

* Update stack dump to match decoder expectations

* Move thunk to code core for linkage

The thunk code needs to be visible to the core routines, so move it to the
cores/esp8266 directory.  Probably need to refactor the stack setup and the
bearssl portion to avoid dependency on bearssl libs in cores/esp8266

* Add 2nd stack dump utility routine

* Refactor once more, update stack size, add stress

Make stack_thunks generic, remove bearssl include inside of cores/esp8266.

Allocate the stack on a WiFiServerSecure object creation to avoid
fragmentation since we will need to allocate the stack to do any
connected work, anyway.

A stress test is now included which checks the total BearSSL second
stack usage for a variety of TLS handshake and certificate options
from badssl.org.

* Update to latest to-thunks branch

* Add BearSSL device test using stack stress

Add a series of SSL connection and transmission tests to the device tests
that stress BearSSL and its stack usage.

Modify device tests to include a possible SPIFFS generation and
upload when a make_spiffs.py file is present in a test directory.

* Use bearssl/master branch, not /to-thunks branch

Update to use the merged master branch of bearssl.  Should have no code
changes.
earlephilhower authored and devyte committed Nov 15, 2018
1 parent 41de43a commit 2f43807
Showing 17 changed files with 563 additions and 41 deletions.
122 changes: 122 additions & 0 deletions cores/esp8266/StackThunk.c
@@ -0,0 +1,122 @@
/*
StackThunk.c - Allow use of a second stack for BearSSL calls
BearSSL uses a significant amount of stack space, much larger than
the default Arduino core stack. These routines handle swapping
between a secondary, user-allocated stack on the heap and the real
stack.
Copyright (c) 2017 Earle F. Philhower, III. All rights reserved.
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Modified 8 May 2015 by Hristo Gochkov (proper post and file upload handling)
*/

#include <stdint.h>
#include <stdlib.h>
#include "StackThunk.h"

uint32_t *stack_thunk_ptr = NULL;
uint32_t *stack_thunk_top = NULL;
uint32_t *stack_thunk_save = NULL; /* Saved A1 while in BearSSL */
uint32_t stack_thunk_refcnt = 0;

#define _stackSize (5600/4)
#define _stackPaint 0xdeadbeef

/* Add a reference, and allocate the stack if necessary */
void stack_thunk_add_ref()
{
  stack_thunk_refcnt++;
  if (stack_thunk_refcnt == 1) {
    stack_thunk_ptr = (uint32_t *)malloc(_stackSize * sizeof(uint32_t));
    stack_thunk_top = stack_thunk_ptr + _stackSize - 1;
    stack_thunk_save = NULL;
    stack_thunk_repaint();
  }
}

/* Drop a reference, and free stack if no more in use */
void stack_thunk_del_ref()
{
  if (stack_thunk_refcnt == 0) {
    /* Error! */
    return;
  }
  stack_thunk_refcnt--;
  if (!stack_thunk_refcnt) {
    free(stack_thunk_ptr);
    stack_thunk_ptr = NULL;
    stack_thunk_top = NULL;
    stack_thunk_save = NULL;
  }
}

void stack_thunk_repaint()
{
  for (int i=0; i < _stackSize; i++) {
    stack_thunk_ptr[i] = _stackPaint;
  }
}

/* Simple accessor functions used by postmortem */
uint32_t stack_thunk_get_refcnt() {
  return stack_thunk_refcnt;
}

uint32_t stack_thunk_get_stack_top() {
  return (uint32_t)stack_thunk_top;
}

uint32_t stack_thunk_get_stack_bot() {
  return (uint32_t)stack_thunk_ptr;
}

uint32_t stack_thunk_get_cont_sp() {
  return (uint32_t)stack_thunk_save;
}

/* Return the number of bytes ever used since the stack was created */
uint32_t stack_thunk_get_max_usage()
{
  uint32_t cnt = 0;

  /* No stack == no usage by definition! */
  if (!stack_thunk_ptr) {
    return 0;
  }

  for (cnt=0; (cnt < _stackSize) && (stack_thunk_ptr[cnt] == _stackPaint); cnt++) {
    /* Noop, all work done in for() */
  }
  return 4 * (_stackSize - cnt);
}

/* Print the stack from the first used 16-byte chunk to the top, decodable by the exception decoder */
void stack_thunk_dump_stack()
{
  /* Start at the lowest address and walk up toward stack_thunk_top */
  uint32_t *pos = stack_thunk_ptr;
  while (pos < stack_thunk_top) {
    if ((pos[0] != _stackPaint) || (pos[1] != _stackPaint) || (pos[2] != _stackPaint) || (pos[3] != _stackPaint))
      break;
    pos += 4;
  }
  ets_printf(">>>stack>>>\n");
  while (pos < stack_thunk_top) {
    ets_printf("%08x: %08x %08x %08x %08x\n", pos, pos[0], pos[1], pos[2], pos[3]);
    pos += 4;
  }
  ets_printf("<<<stack<<<\n");
}
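The reference counting and stack painting in StackThunk.c can be tried out on a host machine. What follows is a minimal sketch under stated assumptions, not the core's code: every `sketch_` name is hypothetical, the NULL check after `malloc` is an addition that the listing above does not have, and `sketch_simulate_use` only imitates a downward-growing stack by overwriting words from the highest address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_STACK_WORDS (5600 / 4)   /* same word count as _stackSize above */
#define SKETCH_PAINT 0xdeadbeefu        /* same pattern as _stackPaint above */

static uint32_t *sketch_stack = NULL;   /* lowest address of the region */
static uint32_t sketch_refcnt = 0;

/* Fill every word of the region with the paint pattern. */
static void sketch_repaint(void) {
    for (size_t i = 0; i < SKETCH_STACK_WORDS; i++) {
        sketch_stack[i] = SKETCH_PAINT;
    }
}

/* The first user allocates and paints. Returns 0 on success, -1 if malloc fails
   (the failure path is an assumption of this sketch). */
static int sketch_add_ref(void) {
    if (sketch_refcnt == 0) {
        sketch_stack = (uint32_t *)malloc(SKETCH_STACK_WORDS * sizeof(uint32_t));
        if (!sketch_stack) {
            return -1;
        }
        sketch_repaint();
    }
    sketch_refcnt++;
    return 0;
}

/* The last user frees the region. */
static void sketch_del_ref(void) {
    if (sketch_refcnt == 0) {
        return;
    }
    if (--sketch_refcnt == 0) {
        free(sketch_stack);
        sketch_stack = NULL;
    }
}

/* Imitate a call chain that consumed `bytes` bytes: overwrite from the top down. */
static void sketch_simulate_use(size_t bytes) {
    size_t words = bytes / 4;
    assert(sketch_stack != NULL && words <= SKETCH_STACK_WORDS);
    for (size_t i = 0; i < words; i++) {
        sketch_stack[SKETCH_STACK_WORDS - 1 - i] = 0;
    }
}

/* High-water mark: count still-painted words from the lowest address upward. */
static uint32_t sketch_max_usage(void) {
    size_t untouched = 0;
    if (!sketch_stack) {
        return 0;
    }
    while (untouched < SKETCH_STACK_WORDS && sketch_stack[untouched] == SKETCH_PAINT) {
        untouched++;
    }
    return (uint32_t)(4 * (SKETCH_STACK_WORDS - untouched));
}
```

The scan runs from the lowest address because the deepest call leaves its mark nearest the bottom of the region. One limitation of any paint-based measurement: a stack word that happens to hold the paint value is counted as unused.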
82 changes: 82 additions & 0 deletions cores/esp8266/StackThunk.h
@@ -0,0 +1,82 @@
/*
StackThunk.h - Allow use of a second stack for BearSSL calls
BearSSL uses a significant amount of stack space, much larger than
the default Arduino core stack. These routines handle swapping
between a secondary, user-allocated stack on the heap and the real
stack.
Copyright (c) 2017 Earle F. Philhower, III. All rights reserved.
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Modified 8 May 2015 by Hristo Gochkov (proper post and file upload handling)
*/

#ifndef _STACKTHUNK_H
#define _STACKTHUNK_H

#ifdef __cplusplus
extern "C" {
#endif

extern void stack_thunk_add_ref();
extern void stack_thunk_del_ref();
extern void stack_thunk_repaint();

extern uint32_t stack_thunk_get_refcnt();
extern uint32_t stack_thunk_get_stack_top();
extern uint32_t stack_thunk_get_stack_bot();
extern uint32_t stack_thunk_get_cont_sp();
extern uint32_t stack_thunk_get_max_usage();
extern void stack_thunk_dump_stack();

// Globals required for thunking operation
extern uint32_t *stack_thunk_ptr;
extern uint32_t *stack_thunk_top;
extern uint32_t *stack_thunk_save;
extern uint32_t stack_thunk_refcnt;

// Thunking macro
#define make_stack_thunk(fcnToThunk) \
__asm("\n\
.text\n\
.literal_position\n\
\n\
.text\n\
.global thunk_"#fcnToThunk"\n\
.type thunk_"#fcnToThunk", @function\n\
.align 4\n\
thunk_"#fcnToThunk":\n\
addi a1, a1, -16 /* Allocate space for saved registers on stack */\n\
s32i a0, a1, 12 /* Store A0, trounced by calls */\n\
s32i a15, a1, 8 /* Store A15 (our temporary one) */\n\
movi a15, stack_thunk_save /* Store A1(SP) in temp space */\n\
s32i a1, a15, 0\n\
movi a15, stack_thunk_top /* Load A1(SP) with thunk stack */\n\
l32i.n a1, a15, 0\n\
call0 "#fcnToThunk" /* Do the call */\n\
movi a15, stack_thunk_save /* Restore A1(SP) */\n\
l32i.n a1, a15, 0\n\
l32i.n a15, a1, 8 /* Restore the saved registers */\n\
l32i.n a0, a1, 12\n\
addi a1, a1, 16 /* Free up stack and return to caller */\n\
ret\n\
.size thunk_"#fcnToThunk", . - thunk_"#fcnToThunk"\n");

#ifdef __cplusplus
}
#endif

#endif
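The `make_stack_thunk` macro builds each symbol name by stringifying its argument with `#` and relying on adjacent string literals being concatenated. The assembly itself only runs on Xtensa, but the naming step can be checked on any host. A small sketch, where `SKETCH_THUNK_SYMBOL` and `sketch_symbol_for_recvapp_ack` are hypothetical helpers and not part of StackThunk.h:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: produces the same symbol name the thunk macro emits,
   using # stringification followed by string literal concatenation. */
#define SKETCH_THUNK_SYMBOL(fcn) "thunk_" #fcn

/* Returns the symbol name that a thunk for br_ssl_engine_recvapp_ack would get. */
static const char *sketch_symbol_for_recvapp_ack(void) {
    return SKETCH_THUNK_SYMBOL(br_ssl_engine_recvapp_ack);
}
```

Because an argument used with `#` is not macro-expanded, the generated name follows the token exactly as written at the point where the macro is invoked.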
16 changes: 14 additions & 2 deletions cores/esp8266/core_esp8266_postmortem.c
@@ -29,6 +29,7 @@
#include "cont.h"
#include "pgmspace.h"
#include "gdb_hooks.h"
#include "StackThunk.h"

extern void __real_system_restart_local();

@@ -147,6 +148,17 @@ void __wrap_system_restart_local() {
offset = 0x10;
}

ets_printf_P("\n>>>stack>>>\n");

if (sp > stack_thunk_get_stack_bot() && sp <= stack_thunk_get_stack_top()) {
// In BearSSL: dump the BSSL second stack, then reset SP back to the main cont stack
ets_printf_P("\nctx: bearssl \n");
ets_printf_P("sp: %08x end: %08x offset: %04x\n", sp, stack_thunk_get_stack_top(), offset);
print_stack(sp + offset, stack_thunk_get_stack_top());
offset = 0; // No offset needed anymore, the exception info was stored in the bssl stack
sp = stack_thunk_get_cont_sp();
}

if (sp > cont_stack_start && sp < cont_stack_end) {
ets_printf_P("\nctx: cont \n");
stack_end = cont_stack_end;
@@ -162,6 +174,8 @@ void __wrap_system_restart_local() {

print_stack(sp + offset, stack_end);

ets_printf_P("<<<stack<<<\n");

// Use cap-X formatting to ensure the standard EspExceptionDecoder doesn't match the address
if (umm_last_fail_alloc_addr) {
ets_printf_P("\nlast failed alloc call: %08X(%d)\n", (uint32_t)umm_last_fail_alloc_addr, umm_last_fail_alloc_size);
@@ -175,7 +189,6 @@ void __wrap_system_restart_local() {


static void ICACHE_RAM_ATTR print_stack(uint32_t start, uint32_t end) {
ets_printf_P("\n>>>stack>>>\n");
for (uint32_t pos = start; pos < end; pos += 0x10) {
uint32_t* values = (uint32_t*)(pos);

@@ -185,7 +198,6 @@ static void ICACHE_RAM_ATTR print_stack(uint32_t start, uint32_t end) {
ets_printf_P("%08x: %08x %08x %08x %08x %c\n",
pos, values[0], values[1], values[2], values[3], (looksLikeStackFrame)?'<':' ');
}
ets_printf_P("<<<stack<<<\n");
}

static void uart_write_char_d(char c) {
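The postmortem change decides which stack a crashing SP belongs to by comparing it against address ranges. The sketch below restates that decision as a pure function that can be tested on a host. The type and function names are hypothetical, and treating "not second stack, not cont" as the sys stack follows the commit message rather than code visible in this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the three contexts named in the commit message. */
typedef enum { SKETCH_CTX_BEARSSL, SKETCH_CTX_CONT, SKETCH_CTX_SYS } sketch_ctx_t;

typedef struct {
    uint32_t thunk_bot;   /* stack_thunk_get_stack_bot(), 0 when no second stack exists */
    uint32_t thunk_top;   /* stack_thunk_get_stack_top(), 0 when no second stack exists */
    uint32_t cont_start;  /* lower bound of the cont stack */
    uint32_t cont_end;    /* upper bound of the cont stack */
} sketch_bounds_t;

/* Same comparisons as the hunk above: (bot, top] for the second stack,
   (start, end) for cont, anything else is assumed to be the sys stack. */
static sketch_ctx_t sketch_classify_sp(uint32_t sp, const sketch_bounds_t *b) {
    if (sp > b->thunk_bot && sp <= b->thunk_top) {
        return SKETCH_CTX_BEARSSL;
    }
    if (sp > b->cont_start && sp < b->cont_end) {
        return SKETCH_CTX_CONT;
    }
    return SKETCH_CTX_SYS;
}
```

When no second stack is allocated, both accessors return 0, so the first comparison can never match and the dump falls through to the cont or sys stack as before. In the real handler a match on the second stack is followed by a second pass with SP replaced by the saved value from `stack_thunk_get_cont_sp()`.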
14 changes: 12 additions & 2 deletions libraries/ESP8266WiFi/src/BearSSLHelpers.cpp
@@ -27,6 +27,7 @@
#include <stdlib.h>
#include <string.h>
#include <Arduino.h>
#include <StackThunk.h>
#include "BearSSLHelpers.h"

namespace brssl {
@@ -825,5 +826,14 @@ bool X509List::append(const uint8_t *derCert, size_t derLen) {
return true;
}

};

// Second stack thunked helpers
make_stack_thunk(br_ssl_engine_recvapp_ack);
make_stack_thunk(br_ssl_engine_recvapp_buf);
make_stack_thunk(br_ssl_engine_recvrec_ack);
make_stack_thunk(br_ssl_engine_recvrec_buf);
make_stack_thunk(br_ssl_engine_sendapp_ack);
make_stack_thunk(br_ssl_engine_sendapp_buf);
make_stack_thunk(br_ssl_engine_sendrec_ack);
make_stack_thunk(br_ssl_engine_sendrec_buf);

};
12 changes: 12 additions & 0 deletions libraries/ESP8266WiFi/src/BearSSLHelpers.h
@@ -136,6 +136,18 @@ class Session {
br_ssl_session_parameters _session;
};

// Stack thunked versions of calls
extern "C" {
extern unsigned char *thunk_br_ssl_engine_recvapp_buf( const br_ssl_engine_context *cc, size_t *len);
extern void thunk_br_ssl_engine_recvapp_ack(br_ssl_engine_context *cc, size_t len);
extern unsigned char *thunk_br_ssl_engine_recvrec_buf( const br_ssl_engine_context *cc, size_t *len);
extern void thunk_br_ssl_engine_recvrec_ack(br_ssl_engine_context *cc, size_t len);
extern unsigned char *thunk_br_ssl_engine_sendapp_buf( const br_ssl_engine_context *cc, size_t *len);
extern void thunk_br_ssl_engine_sendapp_ack(br_ssl_engine_context *cc, size_t len);
extern unsigned char *thunk_br_ssl_engine_sendrec_buf( const br_ssl_engine_context *cc, size_t *len);
extern void thunk_br_ssl_engine_sendrec_ack(br_ssl_engine_context *cc, size_t len);
};

};

#endif
42 changes: 16 additions & 26 deletions libraries/ESP8266WiFi/src/WiFiClientSecureBearSSL.cpp
@@ -34,6 +34,7 @@ extern "C" {
#include "ESP8266WiFi.h"
#include "WiFiClient.h"
#include "WiFiClientSecureBearSSL.h"
#include "StackThunk.h"
#include "lwip/opt.h"
#include "lwip/ip.h"
#include "lwip/tcp.h"
@@ -43,14 +44,17 @@ extern "C" {
#include "c_types.h"
#include "coredecls.h"

namespace BearSSL {

// BearSSL needs a very large stack, larger than the entire ESP8266 Arduino
// default one. This shared_pointer is allocated on first use and cleared
// on last cleanup, with only one stack no matter how many SSL objects.
std::shared_ptr<uint8_t> WiFiClientSecure::_bearssl_stack = nullptr;

// The BearSSL thunks in use for now
#define br_ssl_engine_recvapp_ack thunk_br_ssl_engine_recvapp_ack
#define br_ssl_engine_recvapp_buf thunk_br_ssl_engine_recvapp_buf
#define br_ssl_engine_recvrec_ack thunk_br_ssl_engine_recvrec_ack
#define br_ssl_engine_recvrec_buf thunk_br_ssl_engine_recvrec_buf
#define br_ssl_engine_sendapp_ack thunk_br_ssl_engine_sendapp_ack
#define br_ssl_engine_sendapp_buf thunk_br_ssl_engine_sendapp_buf
#define br_ssl_engine_sendrec_ack thunk_br_ssl_engine_sendrec_ack
#define br_ssl_engine_sendrec_buf thunk_br_ssl_engine_sendrec_buf

namespace BearSSL {

void WiFiClientSecure::_clear() {
// TLS handshake may take more than the 5 second default timeout
@@ -91,16 +95,7 @@ WiFiClientSecure::WiFiClientSecure() : WiFiClient() {
_clear();
_clearAuthenticationSettings();
_certStore = nullptr; // Don't want to remove cert store on a clear, should be long lived
_ensureStackAvailable();
_local_bearssl_stack = _bearssl_stack;
}

void WiFiClientSecure::_ensureStackAvailable() {
if (!_bearssl_stack) {
const int stacksize = 4500; // Empirically determined stack for EC and RSA connections
_bearssl_stack = std::shared_ptr<uint8_t>(new uint8_t[stacksize], std::default_delete<uint8_t[]>());
br_esp8266_stack_proxy_init(_bearssl_stack.get(), stacksize);
}
stack_thunk_add_ref();
}

WiFiClientSecure::~WiFiClientSecure() {
@@ -110,11 +105,8 @@ WiFiClientSecure::~WiFiClientSecure() {
}
free(_cipher_list);
_freeSSL();
_local_bearssl_stack = nullptr;
// If there are no other uses than the initial creation, free the stack
if (_bearssl_stack.use_count() == 1) {
_bearssl_stack = nullptr;
}
// Serial.printf("Max stack usage: %d bytes\n", stack_thunk_get_max_usage());
stack_thunk_del_ref();
if (_deleteChainKeyTA) {
delete _ta;
delete _chain;
@@ -127,8 +119,7 @@ WiFiClientSecure::WiFiClientSecure(ClientContext* client,
int iobuf_in_size, int iobuf_out_size, const X509List *client_CA_ta) {
_clear();
_clearAuthenticationSettings();
_ensureStackAvailable();
_local_bearssl_stack = _bearssl_stack;
stack_thunk_add_ref();
_iobuf_in_size = iobuf_in_size;
_iobuf_out_size = iobuf_out_size;
_client = client;
@@ -146,8 +137,7 @@ WiFiClientSecure::WiFiClientSecure(ClientContext *client,
int iobuf_in_size, int iobuf_out_size, const X509List *client_CA_ta) {
_clear();
_clearAuthenticationSettings();
_ensureStackAvailable();
_local_bearssl_stack = _bearssl_stack;
stack_thunk_add_ref();
_iobuf_in_size = iobuf_in_size;
_iobuf_out_size = iobuf_out_size;
_client = client;
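The `#define br_ssl_engine_... thunk_br_ssl_engine_...` lines in this file redirect existing call sites to the thunks without editing each call. A host-side sketch of that redirection pattern follows. All names are hypothetical stand-ins, and the wrapper only counts calls where the real thunk would switch SP to the second stack.

```c
#include <assert.h>
#include <stddef.h>

static int sketch_thunk_calls = 0;   /* how many calls went through the wrapper */

/* Stand-in for a library entry point (hypothetical). */
static size_t sketch_engine_ack(size_t len) {
    return len;
}

/* Stand-in for the generated thunk: the real one swaps stacks around the call,
   this one only records that it was used. */
static size_t thunk_sketch_engine_ack(size_t len) {
    sketch_thunk_calls++;
    return sketch_engine_ack(len);
}

/* From here on, every use of the original name resolves to the wrapper. */
#define sketch_engine_ack thunk_sketch_engine_ack

/* An unmodified call site: it still names the original function. */
static size_t sketch_caller(size_t len) {
    return sketch_engine_ack(len);
}
```

The redirect must come after the wrapper's own definition, otherwise the wrapper would call itself. Limiting the `#define` lines to one translation unit keeps the redirection local to the code that is meant to run on the second stack.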