Skip to content
This repository has been archived by the owner on Jul 20, 2022. It is now read-only.

Increased timeout to 16MB spiffs flash problem #127

Merged
merged 6 commits into from
Jun 10, 2022

Conversation

robdobsn
Copy link

@robdobsn robdobsn commented Jun 4, 2022

When flashing an ESP32 with 16MB flash chip, during flash of spiffs partition (~13MB) there is a long delay when the first block is written. This causes a time-out and results in the flash failing. Increasing the DEFAULT_TIMEOUT value from 3000 to 40000 fixes this problem. I found that the actual delay is around 25 seconds so a timeout of 40 seconds seems reasonable and probably doesn't make the user experience too much worse in cases where communication really does break down.

robdobsn added 2 commits June 2, 2022 18:06
There is a greater than 25s delay when writing a 13MB partition to 16MB flash
So 30s timeout might be insufficient in some cases
Hence increased to 40s
src/const.ts Outdated
@@ -185,7 +185,7 @@ export const USB_RAM_BLOCK = 0x800;
export const ESP_RAM_BLOCK = 0x1800;

// Timeouts
export const DEFAULT_TIMEOUT = 3000;
export const DEFAULT_TIMEOUT = 40000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't just increase the timeout for all tasks to 40 seconds. In that case we might as well have no timeout at all.

  • Why does it take 25 seconds before we get a response to writing a block?
  • If we can't solve that, can we detect we're writing a spiffs partition and increase the timeout only then?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've looked into this a little more and I think maybe I'm misunderstanding something about how the stub vs main loader work. I based my code for using esp-web-flasher on the esp-web-tools repo ... https://github.com/esphome/esp-web-tools - in this repo all flashing is done with the stub (https://github.com/esphome/esp-web-tools/blob/main/src/flash.ts line 185). In this case the timeout used is the DEFAULT_TIMEOUT. I guess if some (or all) of the flashing was done with the main instance of ESPLoader then the timeout would be calculated using the ERASE_REGION_TIMEOUT_PER_MB value which is 30 seconds per megabyte and that would be ample. However, I have tried flashing the spiffs partition with the main ESPLoader instance and I get a different error "Didn't get enough status bytes". The issue here seems to be that a packet with 2 status bytes is the first thing received after switching to the main ESPLoader and this throws an error because 4 status bytes are expected when not using the stub. I can "fix" this by simply commenting out the code which checks the status bytes altogether and it "works" - but obviously some error checking has been removed. I'm not sure how to proceed as I don't really understand when the stub should be used and when it shouldn't - for uploading the bootloader I guess - but for the partition table too? And maybe there is a correct way to switch from using the stub to using the main ESPLoader that I'm not following?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timeoutPerMb is only applied for when it needs to erase. An image is broken into blocks which are individually written with a command. I still don't understand why that would require higher timeouts when more writes are being done?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick video of flashing the same files with esptool https://youtu.be/NezsdWi1kqY and the ~24 second delay can be seen. I don't know why it stops at this point. I guess it would be possible to borrow the timeout setting code which calculates a timeout based on the size of the file to be flashed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we adapt the timeout code that ESPTool uses?

Copy link
Author

@robdobsn robdobsn Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic which controls timeout in esptool is around line 496 in cmds.py (https://github.com/espressif/esptool/blob/master/esptool/cmds.py) and is as follows:

    block_timeout = max(
        DEFAULT_TIMEOUT,
        timeout_per_mb(ERASE_WRITE_TIMEOUT_PER_MB, block_uncompressed),
    )
    if not esp.IS_STUB:
        timeout = (
            block_timeout  # ROM code writes block to flash before ACKing
        )
        esp.flash_defl_block(block, seq, timeout=timeout)

When flashing larger blocks with the stub this results in timout being calculated from the block size and, in the case of the ~14MB block the result is a bit over 367 seconds. The reason it is done like this I guess is that the blocks are compressed so a small write command can expand into a large amount of flash writing - in my case the file system is mainly empty initially.

In the esp-web-flasher the equivalent logic seems to be in esp_loader.ts around line 748 (https://github.com/NabuCasa/esp-web-flasher/blob/main/src/esp_loader.ts) and looks like this:

    if (this.IS_STUB) {
      writeSize = size; // stub expects number of bytes here, manages erasing internally
      timeout = DEFAULT_TIMEOUT;
    } else {
      writeSize = eraseBlocks * flashWriteSize; // ROM expects rounded up to erase block size
      timeout = timeoutPerMb(ERASE_REGION_TIMEOUT_PER_MB, writeSize); // ROM performs the erase up front
    }
    buffer = pack("<IIII", writeSize, numBlocks, flashWriteSize, offset);
    await this.checkCommand(ESP_FLASH_DEFL_BEGIN, buffer, 0, timeout);

The logic seems to be the opposite way around regarding the stub. So that in the case of esp-web-flasher the timeout for any size block (when flashing with the stub) is 3 seconds.

However, this only calculates the timeout for the flashDeflBegin function and the same timeout value needs to be used in the flashDeflBlock function.

I have updated the PR as follows:

  • reverted the original change to the DEFAULT_TIMEOUT.
  • reversed the logic described above i.e. line 748 of esp_loader.ts now says: if (!this.IS_STUB) {
  • return the timeout value from the flashDeflBegin function (it used to return the numBlocks but the return value was ignored (line 761)
  • use the timeout value returned from flashDeflBegin (line 569) as the timeout parameter in the call to flashDeflBlock (line 611)
  • in the flashDeflBlock function pass the timeout parameter to the checkCommand function (line 659)

This works for my application. I haven't done any other testing but I expect that it is a relatively safe change as the behaviour is now much closer to that of esptool.

@TD-er
Copy link

TD-er commented Jun 4, 2022

Why not making this timeout depending on the binary size?

src/const.ts Outdated
@@ -1,303 +1,303 @@
import { toByteArray } from "./util";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've changed the new lines of this file, please revert.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted - apologies

@@ -745,7 +747,7 @@ export class ESPLoader extends EventTarget {
let timeout = 0;
let buffer;

if (this.IS_STUB) {
if (!this.IS_STUB) {
writeSize = size; // stub expects number of bytes here, manages erasing internally
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The if-check also checks for write size, not just timeout. You're changing how that is determined based on it being a stub now too.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are correct - my mistake. I have now moved the timeout logic to be the other way around so the writeSize calculation is as it was previously.

@balloob balloob merged commit 8a86662 into NabuCasa:main Jun 10, 2022
@balloob
Copy link
Member

balloob commented Jun 10, 2022

Released in ESP Web Flasher 5.1.4

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants