Skip to content

Commit

Permalink
PoC cache configuration control (esp8266#7060)
Browse files Browse the repository at this point in the history
* PoC cache configuration control

Expaned boards.txt.py to allow new MMU options and create revised .ld's
Updated eboot to pass 48K IRAM segments.
Added Cache_Read_Enable intercept to modify call for 16K ICACHE
Update platform.txt to pass new mmu options through to compiler and linker preprocessor.
Added quick example: esp8266/MMU48K

* Style corrections
Added MMU_ qualifier to new defines.
Moved changes into their own file.
Don't know how to fix platformio issue.

* Added detailed description for Cache_Read_Enable.
Updated tools/sizes.py to report correct IRAM size and indicate ICACHE size.
Merged in earlephilhower's work on unaligned exception. Refactored and added
support for store operations and changed the name to be more closely aligned
with its function. Improved crash reporting path.

* Style and MMU_SEC_HEAP corrections.

* Improved asm register usage.
Added some inline functions to aid in byte and short access to iRAM.
 * only byte read has been tested
Updated .ld file to work better with platform.io; however, I am still
missing some steps, so platformio will still fail.

* Interesting glitch in boards.txt after github merge. A new board in
master was missing new additions added by boards.txt.py in the PR.
Which the CI flags when it rebuilds boards.txt.

* Support for 2nd Heap, excess IRAM, through umm_malloc.

Adapted changes to umm_malloc, Esp.cpp, StackThunk.cpp,
WiFiClientSecureBearSSL.cpp, and virtualmem.ino to irammem.ino from
@earlephilhower PR esp8266#6994.

Reworked umm_malloc to use context pointers instead of copy context.
umm_malloc now supports allocations from IRAM. Added class
HeapSelectIram, ... to aid in selecting alternate heaps,
modeled after class InterruptLock.
Restrict alloc request from ISRs to DRAM.

Never ending improvements to debug printing.

Sec Heap option now pulls in free IRAM left over in the 1st 32K block.
Managed through umm_malloc with HeapSelectIram.

Updated examples.

* Post push CI cleanup.

* Cleanup part II

* Cleanup part III

* Updates to support platformio, maybe.

* Added exception C wrapper replacement.

* CI Cleanup

* CI Cleanup II

Don't know what to do with platformio it doesn't like my .S file.
ifdef out USE_ISR_SAFE_EXC_WRAPPER to block the new assemlby module
from building on platformio only.

* Changes to exc-c-wrapper-handler.S to assemble under platformio.

* For platformio, Correction to toolchain-xtensa include path.
@mcspr, Thankyou!

* Temporarily added --print-memory-usage to ld parameters for cross-checking IRAM size.

* undo change to platform.txt

* correct merge conflict. take 1

* Fixed #if... for building umm_get_oom_count. It was not building when UMM_STATS_FULL was used.

* Commented out XMC support. Compatibility issues with PoC when using 16K ICACHE.

* Corrected size.py, DRAM bracketing changed to not include ICACHE with DRAM total.

* Added additional _context for support of use of UMM_INLINE_METRICS.
Corrected some UMM_POSION missed edits.

* Changes to clear errors and warnings from toolchain 10.1

Several fixes and improvements to example MMU48K.

With the improved optimization in toolchain 10.1 The example divide by 0
exception was failing with a HWDT event instead of its exception handler.
The compiler saw the obscured divide by 0 and replaced it with a break point.

* Isolated incompatable definitions related to _xtos_set_exception_handler.
GDBSTUB definitions are different from the BootROM's.

* Update tools/platformio-build.py

Co-authored-by: Max Prokhorov <[email protected]>

* Requested changes

Changed mmu related usages of ETS_... defines to DBG_MMU_...

Cleanup in example MMU48K.ino. Removed stale memory reference macro
and mmu_status print statement. Cleanup printf '\n' to be '\r\n'.

Improved issolation of development debug prints from the rest of the debug prints.

* Corrected comment. And added missing include.

* Improve comment.

* style and comment correction

* Added draft mmu.rst file and updated index.
Updated example HeapMetric.ino to also illustrate use of IRAM
Improved comments in exc-c-wrapper-handler.S. Added insurance IRQ disable.

* Updated mmu.rst

Improved function name uniqueness for is_iram, is_dram, and is_icache by
adding prefix mmu_. Also, made them available outside of a debug build.
Made pointer precision width more specific.

Made some of the static inline functions in mmu_irm.h safe for ISRs by
setting then for always inline.

* Add a default MMU_IRAM_SIZE value for a new CI test to pass.

Extended use 'umm_heap_context_t *_context' argument in ..._core functions
and expanded its usage to reduce unnecessary repeated calls to
umm_info(NULL, false), also removed recursion from umm_info(NULL, true).

Fixed stack buffer length in umm_info_safe_printf_P and heap.cpp.

Added example for creating an IRAM reserve section.

Updated mmu.rst. Grammar and spelling corrections.

* CI appeasement

* CI appeasement with comment correction.

* Ensure SYS always runs with DRAM Heap selected.

* Add/move heap stack overflow/underflow check to Esp.cpp where the event was discarded.

* Improved comment clarity of purpose for IramReserve.ino. Clean up MMU48K.ino

* Added missing #include

* Corrected usage of warning

* CI appeasement and use #message not #pragma message

* Updated git version of eboot.elf to match build version.
Good test catch.

* Remove conditional build option USE_ISR_SAFE_EXC_WRAPPER, always install.

Use the replacement wrapper on non32xfer_exception_handler install.

Added comments to code describing some exception handling issues.

* Updated mmu.rst

* Expanded and clarified comments.

Limited access to some detailed typdefs/prototypes to .cpp
modules, to avoid future build conflicts.

Completed TODO for verifing that the "C" structure struct __exception_frame
matches the ASM version.

Fixed some typo's, code rot, and added some more cases in examaple irammem.ino.
Refactored a little and reordered printing to ease comparison between methods.

Corrected `#ifdef __cplusplus` coverage area. Cleaned up `extern "C" ...` usage.
Fixes issues with including mmu_iram.h or esp8266_undocumented.h in .c files.

* Style fixes and more cleanup

* Style fix

* Remove unnessasary IRAM_ATTR from install_non32xfer_exception_handler

Some comment tuning.

In the context of _xtos_set_exception_handler and the functions it registers,
changed to type int for exception cause type. This is also the type used by gdbstub
and some other Xtensa files I found.
  • Loading branch information
mhightower83 authored Dec 6, 2020
1 parent 47e7b2d commit 8b662ed
Show file tree
Hide file tree
Showing 69 changed files with 3,926 additions and 298 deletions.
410 changes: 410 additions & 0 deletions boards.txt

Large diffs are not rendered by default.

4 changes: 3 additions & 1 deletion bootloaders/eboot/eboot.c
Original file line number Diff line number Diff line change
Expand Up @@ -68,7 +68,9 @@ int load_app_from_flash_raw(const uint32_t flash_addr)
load = true;
}

if (address >= 0x40100000 && address < 0x40108000) {
// The final IRAM size, once boot has completed, can be either 32K or 48K.
// Allow for the higher in range testing.
if (address >= 0x40100000 && address < 0x4010C000) {
load = true;
}

Expand Down
Binary file modified bootloaders/eboot/eboot.elf
Binary file not shown.
4 changes: 4 additions & 0 deletions cores/esp8266/Arduino.h
Original file line number Diff line number Diff line change
Expand Up @@ -225,6 +225,10 @@ void optimistic_yield(uint32_t interval_us);
#include <cstdlib>
#include <cmath>


#include "mmu_iram.h"


using std::min;
using std::max;
using std::round;
Expand Down
19 changes: 7 additions & 12 deletions cores/esp8266/Esp-frag.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,14 +30,15 @@ void EspClass::getHeapStats(uint32_t* hfree, uint16_t* hmax, uint8_t* hfrag)
// 100 * (1 - sqrt(sum(hole-size²)) / sum(hole-size))

umm_info(NULL, false);
uint8_t block_size = umm_block_size();

uint32_t free_size = umm_free_heap_size_core(umm_get_current_heap());
if (hfree)
*hfree = ummHeapInfo.freeBlocks * block_size;
*hfree = free_size;
if (hmax)
*hmax = (uint16_t)ummHeapInfo.maxFreeContiguousBlocks * block_size;
*hmax = (uint16_t)umm_max_block_size_core(umm_get_current_heap());
if (hfrag) {
if (ummHeapInfo.freeBlocks) {
*hfrag = 100 - (sqrt32(ummHeapInfo.freeBlocksSquared) * 100) / ummHeapInfo.freeBlocks;
if (free_size) {
*hfrag = umm_fragmentation_metric_core(umm_get_current_heap());
} else {
*hfrag = 0;
}
Expand All @@ -46,11 +47,5 @@ void EspClass::getHeapStats(uint32_t* hfree, uint16_t* hmax, uint8_t* hfrag)

uint8_t EspClass::getHeapFragmentation()
{
#ifdef UMM_INLINE_METRICS
return (uint8_t)umm_fragmentation_metric();
#else
uint8_t hfrag;
getHeapStats(nullptr, nullptr, &hfrag);
return hfrag;
#endif
return (uint8_t)umm_fragmentation_metric();
}
76 changes: 66 additions & 10 deletions cores/esp8266/Esp.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,8 @@
#include "cont.h"

#include "coredecls.h"
#include "umm_malloc/umm_malloc.h"
// #include "core_esp8266_vm.h"
#include <pgmspace.h>

extern "C" {
Expand Down Expand Up @@ -516,45 +518,45 @@ uint8_t *EspClass::random(uint8_t *resultArray, const size_t outputSizeBytes) co
{
/**
* The ESP32 Technical Reference Manual v4.1 chapter 24 has the following to say about random number generation (no information found for ESP8266):
*
*
* "When used correctly, every 32-bit value the system reads from the RNG_DATA_REG register of the random number generator is a true random number.
* These true random numbers are generated based on the noise in the Wi-Fi/BT RF system.
* When Wi-Fi and BT are disabled, the random number generator will give out pseudo-random numbers.
*
*
* When Wi-Fi or BT is enabled, the random number generator is fed two bits of entropy every APB clock cycle (normally 80 MHz).
* Thus, for the maximum amount of entropy, it is advisable to read the random register at a maximum rate of 5 MHz.
* A data sample of 2 GB, read from the random number generator with Wi-Fi enabled and the random register read at 5 MHz,
* has been tested using the Dieharder Random Number Testsuite (version 3.31.1).
* The sample passed all tests."
*
*
* Since ESP32 is the sequal to ESP8266 it is unlikely that the ESP8266 is able to generate random numbers more quickly than 5 MHz when run at a 80 MHz frequency.
* A maximum random number frequency of 0.5 MHz is used here to leave some margin for possibly inferior components in the ESP8266.
* It should be noted that the ESP8266 has no Bluetooth functionality, so turning the WiFi off is likely to cause RANDOM_REG32 to use pseudo-random numbers.
*
* It is possible that yield() must be called on the ESP8266 to properly feed the hardware random number generator new bits, since there is only one processor core available.
*
* It is possible that yield() must be called on the ESP8266 to properly feed the hardware random number generator new bits, since there is only one processor core available.
* However, no feeding requirements are mentioned in the ESP32 documentation, and using yield() could possibly cause extended delays during number generation.
* Thus only delayMicroseconds() is used below.
*/
*/

constexpr uint8_t cooldownMicros = 2;
static uint32_t lastCalledMicros = micros() - cooldownMicros;

uint32_t randomNumber = 0;

for(size_t byteIndex = 0; byteIndex < outputSizeBytes; ++byteIndex)
{
if(byteIndex % 4 == 0)
{
// Old random number has been used up (random number could be exactly 0, so we can't check for that)

uint32_t timeSinceLastCall = micros() - lastCalledMicros;
if(timeSinceLastCall < cooldownMicros)
delayMicroseconds(cooldownMicros - timeSinceLastCall);

randomNumber = RANDOM_REG32;
lastCalledMicros = micros();
}

resultArray[byteIndex] = randomNumber;
randomNumber >>= 8;
}
Expand Down Expand Up @@ -969,3 +971,57 @@ String EspClass::getSketchMD5()
result = md5.toString();
return result;
}

void EspClass::enableVM()
{
#ifdef UMM_HEAP_EXTERNAL
if (!vmEnabled)
install_vm_exception_handler();
vmEnabled = true;
#endif
}

void EspClass::setExternalHeap()
{
#ifdef UMM_HEAP_EXTERNAL
if (vmEnabled)
umm_push_heap(UMM_HEAP_EXTERNAL);
#endif
}

void EspClass::setIramHeap()
{
#ifdef UMM_HEAP_IRAM
umm_push_heap(UMM_HEAP_IRAM);
#endif
}

void EspClass::setDramHeap()
{
#if defined(UMM_HEAP_EXTERNAL) && !defined(UMM_HEAP_IRAM)
if (vmEnabled) {
if (!umm_push_heap(UMM_HEAP_DRAM)) {
panic();
}
}
#elif defined(UMM_HEAP_IRAM)
if (!umm_push_heap(UMM_HEAP_DRAM)) {
panic();
}
#endif
}

void EspClass::resetHeap()
{
#if defined(UMM_HEAP_EXTERNAL) && !defined(UMM_HEAP_IRAM)
if (vmEnabled) {
if (!umm_pop_heap()) {
panic();
}
}
#elif defined(UMM_HEAP_IRAM)
if (!umm_pop_heap()) {
panic();
}
#endif
}
10 changes: 9 additions & 1 deletion cores/esp8266/Esp.h
Original file line number Diff line number Diff line change
Expand Up @@ -123,7 +123,7 @@ class EspClass {
uint8_t getBootMode();

#if defined(F_CPU) || defined(CORE_MOCK)
constexpr
constexpr
#endif
inline uint8_t getCpuFreqMHz() const __attribute__((always_inline))
{
Expand Down Expand Up @@ -215,7 +215,15 @@ class EspClass {
#else
uint32_t getCycleCount();
#endif // !defined(CORE_MOCK)
void enableVM();
void setDramHeap();
void setIramHeap();
void setExternalHeap();
void resetHeap();
private:
#ifdef UMM_HEAP_EXTERNAL
bool vmEnabled = false;
#endif
/**
* @brief Replaces @a byteCount bytes of a 4 byte block on flash
*
Expand Down
9 changes: 9 additions & 0 deletions cores/esp8266/StackThunk.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,8 @@
#include "debug.h"
#include "StackThunk.h"
#include <ets_sys.h>
#include <umm_malloc/umm_malloc.h>
#include <umm_malloc/umm_heap_select.h>

extern "C" {

Expand All @@ -48,7 +50,14 @@ void stack_thunk_add_ref()
{
stack_thunk_refcnt++;
if (stack_thunk_refcnt == 1) {
DBG_MMU_PRINTF("\nStackThunk malloc(%u)\n", _stackSize * sizeof(uint32_t));
// The stack must be in DRAM, or an Soft WDT will follow. Not sure why,
// maybe too much time is consumed with the non32-bit exception handler.
// Also, interrupt handling on an IRAM stack would be very slow.
// Strings on the stack would be very slow to access as well.
HeapSelectDram ephemeral;
stack_thunk_ptr = (uint32_t *)malloc(_stackSize * sizeof(uint32_t));
DBG_MMU_PRINTF("StackThunk stack_thunk_ptr: %p\n", stack_thunk_ptr);
if (!stack_thunk_ptr) {
// This is a fatal error, stop the sketch
DEBUGV("Unable to allocate BearSSL stack\n");
Expand Down
22 changes: 21 additions & 1 deletion cores/esp8266/core_esp8266_main.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,9 @@ extern "C" {
#include <core_version.h>
#include "gdb_hooks.h"
#include "flash_quirks.h"
#include <umm_malloc/umm_malloc.h>
#include <core_esp8266_non32xfer.h>


#define LOOP_TASK_PRIORITY 1
#define LOOP_QUEUE_SIZE 1
Expand Down Expand Up @@ -204,7 +207,9 @@ static void loop_wrapper() {
static void loop_task(os_event_t *events) {
(void) events;
s_cycles_at_yield_start = ESP.getCycleCount();
ESP.resetHeap();
cont_run(g_pcont, &loop_wrapper);
ESP.setDramHeap();
if (cont_check(g_pcont) != 0) {
panic();
}
Expand Down Expand Up @@ -256,6 +261,7 @@ void init_done() {
std::set_terminate(__unhandled_exception_cpp);
do_global_ctors();
esp_schedule();
ESP.setDramHeap();
}

/* This is the entry point of the application.
Expand Down Expand Up @@ -311,14 +317,22 @@ extern "C" void app_entry_redefinable(void)
cont_t s_cont __attribute__((aligned(16)));
g_pcont = &s_cont;

#ifdef DEV_DEBUG_MMU_IRAM
DBG_MMU_PRINT_STATUS();

DBG_MMU_PRINT_IRAM_BANK_REG(0, "");

DBG_MMU_PRINTF("\nCall call_user_start()\n");
#endif

/* Call the entry point of the SDK code. */
call_user_start();
}

static void app_entry_custom (void) __attribute__((weakref("app_entry_redefinable")));

extern "C" void app_entry (void)
{
umm_init();
return app_entry_custom();
}

Expand All @@ -342,6 +356,12 @@ extern "C" void user_init(void) {

cont_init(g_pcont);

#if defined(NON32XFER_HANDLER) || defined(MMU_IRAM_HEAP)
install_non32xfer_exception_handler();
#endif
#if defined(MMU_IRAM_HEAP)
umm_init_iram();
#endif
preinit(); // Prior to C++ Dynamic Init (not related to above init() ). Meant to be user redefinable.

ets_task(loop_task,
Expand Down
Loading

0 comments on commit 8b662ed

Please sign in to comment.