uart_configure crash with ambiq apollo3 #84136

Open
fariouche opened this issue Jan 17, 2025 · 6 comments
Labels
bug The issue is a bug, or the PR is fixing a bug platform: Ambiq Ambiq priority: low Low impact/importance bug

@fariouche

Describe the bug
I'm getting a crash in uart_configure on the apollo3 board.
I've enabled mcuboot, and I can see the mcuboot debug traces over the UART. When mcuboot hands over to the Zephyr app, I see the Zephyr boot banner, and then the app crashes when it reaches my call to uart_configure.
I haven't tested without mcuboot yet.

The debugger shows the following stack trace:

#0  arch_system_halt (reason=25)
    at zephyr/kernel/fatal.c:30
#1  0x00047e38 in k_sys_fatal_error_handler (reason=<optimized out>, 
    esf=<optimized out>)
    at zephyr/kernel/fatal.c:44
#2  0x00038310 in z_fatal_error (reason=<optimized out>, esf=<optimized out>)
    at zephyr/kernel/fatal.c:119
#3  0x00040a0c in z_arm_fatal_error (reason=<optimized out>, 
    esf=esf@entry=0x1001d940 <z_interrupt_stacks+2048>)
    at zephyr/arch/arm/core/fatal.c:86
#4  0x00028912 in z_arm_fault (msp=<optimized out>, psp=<optimized out>, 
    exc_return=<optimized out>, callee_regs=<optimized out>)
    at zephyr/arch/arm/core/cortex_m/fault.c:1107
#5  0x000289dc in z_arm_usage_fault ()
    at zephyr/arch/arm/core/cortex_m/fault_s.S:102
#6  <signal handler called>
#7  pl011_disable (dev=0x48f80 <__device_dts_ord_58>)
    at zephyr/drivers/serial/uart_pl011_registers.h:42
#8  pl011_runtime_configure_internal (dev=0x48f80 <__device_dts_ord_58>, 
    cfg=0x1001ed48 <z_main_stack+3976>, disable=<optimized out>)
    at zephyr/drivers/serial/uart_pl011.c:217
#9  0x00022ae8 in z_impl_uart_configure (cfg=0x1001ed48 <z_main_stack+3976>, 
    dev=0x48f80 <__device_dts_ord_58>)
    at zephyr/include/zephyr/drivers/uart.h:648
#10 uart_configure (cfg=0x1001ed48 <z_main_stack+3976>, 
    dev=0x48f80 <__device_dts_ord_58>)
    at build/zephyr/include/generated/zephyr/syscalls/uart.h:157

So the fault occurs in pl011_disable. I verified that __device_dts_ord_58 is uart0 in the dts.

I'm getting the same crash with all calls to the uart_irq_xxx functions too.

The exact same code works with other Zephyr boards.

I hope this rings a bell for someone here! Meanwhile I will try to reproduce the issue with a Zephyr sample app.

To Reproduce
I haven't tested with hello world sample app yet.

  1. mkdir build; cd build
  2. cmake -DBOARD=board\apollo3
  3. make
  4. See error

Expected behavior
The UART keeps working after the call to uart_configure.

Impact
showstopper obviously

Logs and console output

*** Using Zephyr OS build v4.0.0-2-gad0afbde8448 ***
I: Starting bootloader
I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Secondary image: magic=bad, swap_type=0x0, copy_done=0x0, image_ok=0x0
I: Boot source: none
I: Secondary image of image pair (0.) is unreachable. Treat it as empty
I: Image index: 0, Swap type: none
I: Bootloader chainload address offset: 0xc000
I: Image version: v1.3.0
I: Jumping to the first image slot

*** Booting Zephyr OS build v4.0.0-2-gad0afbde8448 ***

Environment:

  • Linux
  • Zephyr 4.0.0

Additional context
This is the resulting uart0 node for this build (from zephyr.dts):

uart0: uart@4001c000 {
          compatible = "ambiq,uart", "arm,pl011";
          reg = < 0x4001c000 0x1000 >;
          interrupts = < 0xf 0x0 >;
          interrupt-names = "UART0";
          status = "okay";
          clocks = < &uartclk >;
          ambiq,pwrcfg = < &pwrcfg 0x8 0x80 >;
          zephyr,pm-device-runtime-auto;
          current-speed = < 0xf4240 >;
          pinctrl-0 = < &uart0 >;
          pinctrl-names = "default";
          label = "uart0";
          phandle = < 0x4 >;
  };
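
For readers skimming the node above: zephyr.dts prints every cell in hexadecimal, so the configured baud rate is easy to misread. A minimal sketch of the conversion follows; the helper name dts_cell_value is hypothetical and not part of Zephyr.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not a Zephyr API: parse one devicetree cell as it is
 * printed in zephyr.dts. Base 0 lets strtoul accept both "0x..." and decimal.
 */
static uint32_t dts_cell_value(const char *cell)
{
    return (uint32_t)strtoul(cell, NULL, 0);
}
```

With this, current-speed = < 0xf4240 > is 1000000 baud, and the first interrupts cell, 0xf, is 15.
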

fariouche added the "bug" label on Jan 17, 2025.

@fariouche
Author

From what I see, it crashes in pl011_disable().

What I do not understand is that get_uart(dev) returns address 0x4001C000, which is correct, and accesses to that address worked fine earlier. The crash is caused by a memory read at that same address, and the debugger can no longer read it either:

Reading 76 bytes @ address 0x4001C000
WARNING: Failed to read memory @ address 0x4001C000
Cannot access memory at address 0x4001c000

Maybe power management disabled the UART?

@fariouche
Author

fariouche commented Jan 17, 2025

I can confirm this: the issue appears after the call to (void)pm_device_runtime_auto_enable(dev); in init.c.

With CONFIG_PM_DEVICE_RUNTIME disabled, the application runs.
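
The observation above fits a usage-counted power model: once runtime PM is enabled on an idle device, the device is suspended, and touching its registers without first taking a reference fails. The following is a toy model in plain C, not Zephyr code; every toy_* name is hypothetical. In Zephyr the reference is taken and released with pm_device_runtime_get() and pm_device_runtime_put() from <zephyr/pm/device_runtime.h>. Whether the PL011 driver in v4.0.0 does this around its register accesses was not checked in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a usage-counted peripheral. NOT Zephyr code. */
struct toy_uart {
    int usage;          /* number of active users */
    bool powered;       /* true while the block is powered */
    uint32_t ctrl_reg;  /* stands in for a UART control register */
};

/* Enabling runtime PM suspends a device that nobody is using. */
static void toy_pm_runtime_enable(struct toy_uart *u)
{
    if (u->usage == 0) {
        u->powered = false;
    }
}

/* Take a reference; power up on the 0 -> 1 transition. */
static void toy_pm_get(struct toy_uart *u)
{
    if (u->usage++ == 0) {
        u->powered = true;
    }
}

/* Drop a reference; power down on the 1 -> 0 transition. */
static void toy_pm_put(struct toy_uart *u)
{
    if (u->usage > 0 && --u->usage == 0) {
        u->powered = false;
    }
}

/* Register write; returns -1 where real hardware would fault. */
static int toy_reg_write(struct toy_uart *u, uint32_t value)
{
    if (!u->powered) {
        return -1;
    }
    u->ctrl_reg = value;
    return 0;
}

/* Reconfigure while holding a reference so the block stays powered. */
static int toy_configure_guarded(struct toy_uart *u, uint32_t value)
{
    toy_pm_get(u);
    int ret = toy_reg_write(u, value);
    toy_pm_put(u);
    return ret;
}
```

In this model a bare toy_reg_write() after toy_pm_runtime_enable() fails, while toy_configure_guarded() succeeds and leaves the device suspended again afterwards. A possible experiment on the real board, under the same assumption, is to bracket the uart_configure call with the get/put pair and see whether the fault disappears.
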

kartben added the "priority: low" label on Jan 21, 2025.

@AlessandroLuo
Contributor

@fariouche I tried the hello world sample with PM_DEVICE_RUNTIME enabled and could not reproduce your issue.
Have you tried enlarging your stack size?

@fariouche
Author

This is strange. I remember that I had to comment out the line that calls am_hal_cachectrl_control in apollo3/hal/am_hal_pwrctrl.c, as it was crashing at boot.
Maybe that is the reason, then? I'm not sure why it crashes there (it crashed when calling set_LPMMODE, but I wanted to investigate a bit more before submitting the issue).

@fariouche
Author

I had some time to debug a bit more.
I uncommented the call to am_hal_cachectrl_control(AM_HAL_CACHECTRL_CONTROL_LPMMODE_RECOMMENDED, 0); to see the crash again.
It crashed in AM_HAL_CACHECTRL_CONTROL_LPMMODE_RESET when executing set_LPMMODE(AM_HAL_CACHECTRL_FLASHCFG_LPMMODE_NEVER). I lost the debugging session while single-stepping: it happened after entering am_hal_flash_store_ui32, at the point where pFunc is called (value 0x10010f21 <SRAM_write_ui32+1>) with pui32Address == 0x40018004 and ui32Value == 0x1751.
If I attach after the crash, I see that it is inside z_arm_usage_fault:

#0  arch_system_halt (reason=20)
    at /home/fariouche/zephyr/kernel/fatal.c:30
#1  0x00050d7a in k_sys_fatal_error_handler (reason=<optimized out>, 
    esf=<optimized out>)
    at /home/fariouche/zephyr/kernel/fatal.c:44
#2  0x0003a0c4 in z_fatal_error (reason=<optimized out>, esf=<optimized out>)
    at /home/fariouche/zephyr/kernel/fatal.c:119
Read 4 bytes @ address 0x00029A52 (Data = 0xED00E7D1)
Reading 64 bytes @ address 0x1001D780
Read 4 bytes @ address 0x00029A80 (Data = 0xBF00BD01)
Reading 64 bytes @ address 0x00029A40
#3  0x00045bc4 in z_arm_fatal_error (reason=<optimized out>, 
    esf=esf@entry=0x1001d784 <z_interrupt_stacks+2052>)
    at /home/fariouche/zephyr/arch/arm/core/fatal.c:86
#4  0x00029a52 in z_arm_fault (msp=<optimized out>, psp=<optimized out>, 
    exc_return=<optimized out>, callee_regs=<optimized out>)
    at /home/fariouche/zephyr/arch/arm/core/cortex_m/fault.c:1107
#5  0x00029a80 in z_arm_usage_fault ()
    at /home/fariouche/zephyr/arch/arm/core/cortex_m/fault_s.S:102
Read 4 bytes @ address 0x1001D764 (Data = 0x0003A0C5)
Read 4 bytes @ address 0x10013E30 (Data = 0x00000000)
#6  <signal handler called>
#7  0x10013e30 in _kernel ()
Read 4 bytes @ address 0x1001D748 (Data = 0x40018004)
Reading 64 bytes @ address 0x1001CF80
#8  0x1001d748 in z_interrupt_stacks ()
Read 4 bytes @ address 0x1001D748 (Data = 0x40018004)
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
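
A note on the pFunc value 0x10010f21 <SRAM_write_ui32+1> quoted before the trace: the +1 does not mean the call lands one byte into the function. On Cortex-M, a function pointer carries bit 0 set to select Thumb state, and the instruction is fetched from the address with that bit cleared. A small sketch with hypothetical helper names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a Cortex-M branch target selects Thumb state. */
static bool is_thumb_target(uint32_t target)
{
    return (target & UINT32_C(1)) != 0u;
}

/* The address the core fetches from is the target with bit 0 cleared. */
static uint32_t code_address(uint32_t target)
{
    return target & ~UINT32_C(1);
}
```

So the helper starts at 0x10010f20, in the same 0x1001xxxx range as z_main_stack and z_interrupt_stacks in the traces, which means it runs from RAM. Whether running this routine from RAM is related to the fault is not established in the thread.
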

I tried increasing the main stack to 5K, but the issue is still the same.

I suppose I have a problem with the low-power mode, and that it is also what makes the UART crash when I comment out this function.
I'm using an Apollo3 Blue dev kit (AMA3B1KK-KBR EVB).

What could be the issue?

@fariouche
Author

Any idea why I get this crash?
Thanks!


4 participants