Add MachO dump writer to createdump (dotnet#51150)
* Add MachO dump writer to createdump

createdump now generates true Mach-O dumps on macOS instead of the hacky ELF core dumps.

Setting the COMPlus_DbgEnableElfDumpOnMacOS environment variable is no longer needed.
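To tell which kind of dump a given file is, the first four bytes are enough. The sketch below is only an illustration and is not taken from createdump, SOS, or lldb; the names `DumpFormat`, `ReadLe32`, and `ClassifyMagic` are hypothetical. It assumes a 64-bit little-endian Mach-O file (magic `MH_MAGIC_64`, 0xfeedfacf) and the standard ELF magic (`0x7f 'E' 'L' 'F'`).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types for illustration; not part of createdump.
enum class DumpFormat { Elf, MachO64, Unknown };

// Assemble the first four bytes of a file into a little-endian 32-bit value.
inline std::uint32_t ReadLe32(const std::uint8_t* bytes)
{
    return static_cast<std::uint32_t>(bytes[0]) |
           (static_cast<std::uint32_t>(bytes[1]) << 8) |
           (static_cast<std::uint32_t>(bytes[2]) << 16) |
           (static_cast<std::uint32_t>(bytes[3]) << 24);
}

// Classify a dump by its magic number (already read as little-endian).
inline DumpFormat ClassifyMagic(std::uint32_t magic)
{
    constexpr std::uint32_t kElfMagic = 0x464c457fu;     // "\x7fELF" read little-endian
    constexpr std::uint32_t kMachO64Magic = 0xfeedfacfu; // MH_MAGIC_64
    if (magic == kElfMagic)
    {
        return DumpFormat::Elf;
    }
    if (magic == kMachO64Magic)
    {
        return DumpFormat::MachO64;
    }
    return DumpFormat::Unknown;
}
```

A caller would read the first four bytes of the dump file, pass them through `ReadLe32`, and hand the result to `ClassifyMagic`.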

Add a special thread info memory region containing the OS thread ids, which are missing from Mach-O core dumps. This allows SOS to map the thread indexes to thread ids. The address (0x7fffffff00000000) of this special memory region is above the highest user address (0x0007FFFFFFFFF000) and below a kernel reserved address (0xffffff8000xxxxxx), which is kind of moot because dumps don't include any kernel regions. lldb seems just fine with this memory region.
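The address reasoning above can be checked at compile time. This is a minimal sketch using the three values quoted in this message; the constant and function names are hypothetical, and the kernel address is taken as 0xffffff8000000000 on the assumption that the `x` digits can be zeroed to get a lower bound.

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in the commit message; the names are hypothetical.
constexpr std::uint64_t kHighestUserAddress = 0x0007FFFFFFFFF000ULL;
constexpr std::uint64_t kSpecialThreadInfoAddress = 0x7fffffff00000000ULL;
// Assumption: "0xffffff8000xxxxxx" with the x digits zeroed.
constexpr std::uint64_t kKernelReservedBase = 0xffffff8000000000ULL;

// True when the address lies strictly between user space and the kernel range.
constexpr bool IsInUnusedGap(std::uint64_t address)
{
    return address > kHighestUserAddress && address < kKernelReservedBase;
}

// The special region's address must not collide with either range.
static_assert(IsInUnusedGap(kSpecialThreadInfoAddress),
              "special thread info address must sit between user and kernel space");
```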

The changes also include ARM64 support, but since I don't have an M1 device I can't build or test them. I'm hoping Steve can at least review them.

Add --verbose/TRACE_VERBOSE support to tone down all the macho dump generation spew.
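The two-level tracing can be pictured with the following standalone sketch. It mirrors the pattern in this change (one flag per level, each gating its own variadic print function) but is not the createdump code: the `sketch` namespace, the flag names, and the `VPrint` helper are hypothetical, and returning the character count is an addition to make the behavior observable.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

namespace sketch {

// Hypothetical stand-ins for g_diagnostics and g_diagnosticsVerbose.
bool diagnostics = false;
bool diagnosticsVerbose = false;

// Print to stdout only when enabled; returns the number of characters written.
inline int VPrint(bool enabled, const char* format, va_list args)
{
    if (!enabled)
    {
        return 0;
    }
    int written = std::vfprintf(stdout, format, args);
    std::fflush(stdout);
    return written;
}

// Normal diagnostics: gated by the diagnostics flag.
inline int Trace(const char* format, ...)
{
    va_list args;
    va_start(args, format);
    int written = VPrint(diagnostics, format, args);
    va_end(args);
    return written;
}

// Noisy per-region output: gated by the separate verbose flag.
inline int TraceVerbose(const char* format, ...)
{
    va_list args;
    va_start(args, format);
    int written = VPrint(diagnosticsVerbose, format, args);
    va_end(args);
    return written;
}

} // namespace sketch
```

With only `diagnostics` set, `Trace` prints and `TraceVerbose` stays silent, which is the "tone down the spew" effect described above.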

Issue: dotnet#48664

* Fix build problem

* Update docs

* Code review feedback

Co-authored-by: Juan Sebastian Hoyos Ayala <[email protected]>
mikem8361 and hoyosjs authored Apr 15, 2021
1 parent 56e36a0 commit 79ad10c
Showing 25 changed files with 1,185 additions and 952 deletions.
4 changes: 3 additions & 1 deletion docs/design/coreclr/botr/xplat-minidump-generation.md
@@ -42,7 +42,9 @@ There will be some differences gathering the crash information but these platfor

### OS X ###

As of .NET 5.0, createdump is supported on MacOS but instead of the MachO dump format, it generates the ELF coredumps. This is because of time constraints developing a MachO dump writer on the generation side and a MachO reader for the diagnostics tooling side (dotnet-dump and CLRMD). This means the native debuggers like gdb and lldb will not work with these dumps but the dotnet-dump tool will allow the managed state to be analyzed. Because of this behavior an additional environment variable will need to be set (COMPlus_DbgEnableElfDumpOnMacOS=1) along with the ones below in the Configuration/Policy section.
On .NET 5.0, createdump supported generating dumps on macOS, but it generated ELF core dumps instead of the Mach-O dump format. This was because of time constraints developing a Mach-O dump writer on the generation side and a Mach-O reader for the diagnostics tooling side (dotnet-dump and CLRMD). This means the native debuggers like gdb and lldb will not work with dumps obtained from apps running on a 5.0 runtime, but the dotnet-dump tool will allow the managed state to be analyzed. Because of this behavior, an additional environment variable needs to be set (COMPlus_DbgEnableElfDumpOnMacOS=1) along with the ones below in the Configuration/Policy section.

Starting with .NET 6.0, createdump generates native Mach-O core files and the COMPlus_DbgEnableElfDumpOnMacOS variable is deprecated.

### Windows ###

8 changes: 2 additions & 6 deletions docs/workflow/debugging/coreclr/debugging.md
@@ -52,17 +52,13 @@ Only lldb is supported by SOS. Gdb can be used to debug the coreclr code but wit
1. Perform a build of the coreclr repo.
2. Install the corefx managed assemblies to the binaries directory.
3. cd to build's binaries: `cd ./artifacts/bin/coreclr/Linux.x64.Debug`
4. Start lldb: `lldb-3.9 corerun HelloWorld.exe linux`
4. Start lldb: `lldb corerun HelloWorld.exe linux`
6. Launch program: `process launch -s`
7. To stop annoying breaks on SIGUSR1/SIGUSR2 signals used by the runtime run: `process handle -s false SIGUSR1 SIGUSR2`
8. Get to a point where coreclr is initialized by setting a breakpoint (i.e. `breakpoint set -n LoadLibraryExW` and then `process continue`) or stepping into the runtime.
9. Run a SOS command like `clrstack` or `sos VerifyHeap`. The command name is case sensitive.

You can combine steps 4-8 and pass everything on the lldb command line:

`lldb-3.9 -o "plugin load libsosplugin.so" -o "process launch -s" -o "process handle -s false SIGUSR1 SIGUSR2" -o "breakpoint set -n LoadLibraryExW" corerun HelloWorld.exe linux`

For .NET Core version 1.x and 2.0.x, libsosplugin.so is built for and will only work with version 3.6 of lldb. For .NET Core 2.1, the plugin is built for 3.9 lldb and will work with 3.8 and 3.9 lldb.
See https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-sos for information on how to install SOS.

**Note:** _corerun_ is a simple host that does not support resolving NuGet dependencies. It relies on libraries being locatable via the `CORE_LIBRARIES` environment variable or present in the same directory as the corerun executable. The instructions above are equally applicable to the _dotnet_ host, however - e.g. for step 4 `lldb-3.9 dotnet bin/Debug/netcoreapp2.1/MvcApplication.dll` will let you debug _MvcApplication_ in the same manner.

20 changes: 4 additions & 16 deletions docs/workflow/debugging/libraries/unix-instructions.md
@@ -5,19 +5,8 @@ CoreFX can be debugged on unix using both lldb and visual studio code

## Using lldb and SOS

- Install SOS and lldb. See https://github.com/dotnet/diagnostics/blob/main/documentation/sos.md and https://docs.microsoft.com/en-us/dotnet/core/diagnostics/dotnet-sos for setup instructions.
- Run the test using msbuild at least once with `/t:Test`.
- [Install version 3.9 of lldb](../coreclr/debugging.md#debugging-core-dumps-with-lldb) and launch lldb with dotnet as the process and arguments matching the arguments used when running the test through msbuild.
- Load the sos plugin using `plugin load libsosplugin.so`.
- Type `soshelp` to get help. You can now use all sos commands like `bpmd`.

You may need to supply a path to load SOS. It can be found next to libcoreclr.so. For example:
```
(lldb) plugin load libsosplugin.so
error: no such file
(lldb) image list libcoreclr.so
[ 0] ..... /home/dan/dotnet/shared/Microsoft.NETCoreApp/2.0.4/libcoreclr.so
(lldb) plugin load /home/dan/dotnet/shared/Microsoft.NETCoreApp/2.0.4/libsosplugin.so
```

## Debugging core dumps with lldb

@@ -31,20 +20,19 @@ There are instructions for installing lldb and SOS [here](https://github.com/dot
Once you have everything listed above, you are ready to start debugging. You need to specify an extra parameter to lldb in order for it to correctly resolve the symbols for libcoreclr.so. Use a command like this:

```
lldb-3.9 -O "settings set target.exec-search-paths <runtime-path>" -o "plugin load <path-to-libsosplugin.so>" --core <core-file-path> <host-path>
lldb-3.9 -O "settings set target.exec-search-paths <runtime-path>" --core <core-file-path> <host-path>
```

- `<runtime-path>`: The path containing libcoreclr.so.dbg, as well as the rest of the runtime and framework assemblies.
- `<core-file-path>`: The path to the core dump you are attempting to debug.
- `<host-path>`: The path to the dotnet or corerun executable, potentially in the `<runtime-path>` folder.
- `<path-to-libsosplugin.so>`: The path to libsosplugin.so, should be in the `<runtime-path>` folder.

lldb should start debugging successfully at this point. You should see stacktraces with resolved symbols for libcoreclr.so. At this point, you can run `plugin load <libsosplugin.so-path>`, and begin using SOS commands, as above.
lldb should start debugging successfully at this point. You should see stacktraces with resolved symbols for libcoreclr.so. At this point you can begin using SOS commands provided you've set it up as described in the links.

Also see this [link](https://github.com/dotnet/diagnostics/blob/master/documentation/debugging-coredump.md) in the diagnostics repo.

##### Example

```
lldb-3.9 -O "settings set target.exec-search-paths /home/parallels/Downloads/System.Drawing.Common.Tests/home/helixbot/dotnetbuild/work/2a74cf82-3018-4e08-9e9a-744bb492869e/Payload/shared/Microsoft.NETCore.App/$(ProductVersion)/" -o "plugin load /home/parallels/Downloads/System.Drawing.Common.Tests/home/helixbot/dotnetbuild/work/2a74cf82-3018-4e08-9e9a-744bb492869e/Payload/shared/Microsoft.NETCore.App/$(ProductVersion)/libsosplugin.so" --core /home/parallels/Downloads/System.Drawing.Common.Tests/home/helixbot/dotnetbuild/work/2a74cf82-3018-4e08-9e9a-744bb492869e/Work/f6414a62-9b41-4144-baed-756321e3e075/Unzip/core /home/parallels/Downloads/System.Drawing.Common.Tests/home/helixbot/dotnetbuild/work/2a74cf82-3018-4e08-9e9a-744bb492869e/Payload/shared/Microsoft.NETCore.App/$(ProductVersion)/dotnet
lldb-3.9 -O "settings set target.exec-search-paths /home/parallels/Downloads/System.Drawing.Common.Tests/home/helixbot/dotnetbuild/work/2a74cf82-3018-4e08-9e9a-744bb492869e/Payload/shared/Microsoft.NETCore.App/$(ProductVersion)/" --core /home/parallels/Downloads/System.Drawing.Common.Tests/home/helixbot/dotnetbuild/work/2a74cf82-3018-4e08-9e9a-744bb492869e/Work/f6414a62-9b41-4144-baed-756321e3e075/Unzip/core /home/parallels/Downloads/System.Drawing.Common.Tests/home/helixbot/dotnetbuild/work/2a74cf82-3018-4e08-9e9a-744bb492869e/Payload/shared/Microsoft.NETCore.App/$(ProductVersion)/dotnet
```
3 changes: 2 additions & 1 deletion eng/testing/debug-dump-template.md
@@ -141,7 +141,8 @@ setsymbolserver -directory %LOUTDIR%/shared/Microsoft.NETCore.App/6.0.0
---
## If it's a macOS dump
Instructions for debugging dumps on macOS are essentially the same as [Linux](#If-it's-a-Linux-dump-on-Linux...) with one exception: `dotnet-dump` cannot analyze macOS system dumps: you must use `lldb` for those. `dotnet-dump` can only analyze dumps created by `dotnet-dump` or `createdump`, by the runtime on crashes when the appropriate environment variables are set, or the [`blame-hang` setting of `dotnet test`](https://docs.microsoft.com/en-us/dotnet/core/tools/dotnet-test).
Instructions for debugging dumps on macOS are essentially the same as [Linux](#If-it's-a-Linux-dump-on-Linux...) with one exception: `dotnet-dump` cannot analyze macOS system dumps yet: you must use `lldb` for those. As of .NET 6, createdump on macOS
will start generating native Mach-O core files. dotnet-dump and ClrMD are still being worked on to handle these dumps.
---
# Other Helpful Information
2 changes: 2 additions & 0 deletions src/coreclr/debug/createdump/CMakeLists.txt
@@ -63,12 +63,14 @@ if(CLR_CMAKE_HOST_OSX)
add_executable_clr(createdump
crashinfomac.cpp
threadinfomac.cpp
dumpwritermacho.cpp
${CREATEDUMP_SOURCES}
)
else()
add_executable_clr(createdump
crashinfounix.cpp
threadinfounix.cpp
dumpwriterelf.cpp
${CREATEDUMP_SOURCES}
${PAL_REDEFINES_FILE}
)
27 changes: 22 additions & 5 deletions src/coreclr/debug/createdump/crashinfo.cpp
@@ -122,10 +122,13 @@ CrashInfo::GatherCrashInfo(MINIDUMP_TYPE minidumpType)
return false;
}
#endif
TRACE("Module addresses:\n");
for (const MemoryRegion& region : m_moduleAddresses)
if (g_diagnosticsVerbose)
{
region.Trace();
TRACE_VERBOSE("Module addresses:\n");
for (const MemoryRegion& region : m_moduleAddresses)
{
region.Trace();
}
}
// If full memory dump, include everything regardless of permissions
if (minidumpType & MiniDumpWithFullMemory)
@@ -589,7 +592,7 @@ CrashInfo::CombineMemoryRegions()

TRACE("CombineMemoryRegions: FINISHED\n");

if (g_diagnostics)
if (g_diagnosticsVerbose)
{
TRACE("Memory Regions:\n");
for (const MemoryRegion& region : m_memoryRegions)
@@ -619,7 +622,21 @@ CrashInfo::SearchMemoryRegions(const std::set<MemoryRegion>& regions, const Memo
void
CrashInfo::Trace(const char* format, ...)
{
if (g_diagnostics) {
if (g_diagnostics)
{
va_list args;
va_start(args, format);
vfprintf(stdout, format, args);
fflush(stdout);
va_end(args);
}
}

void
CrashInfo::TraceVerbose(const char* format, ...)
{
if (g_diagnosticsVerbose)
{
va_list args;
va_start(args, format);
vfprintf(stdout, format, args);
8 changes: 6 additions & 2 deletions src/coreclr/debug/createdump/crashinfo.h
@@ -5,7 +5,6 @@
#include "../dbgutil/machoreader.h"
#else
#include "../dbgutil/elfreader.h"
#endif

// typedef for our parsing of the auxv variables in /proc/pid/auxv.
#if TARGET_64BIT
@@ -29,6 +28,8 @@ typedef __typeof__(((elf_aux_entry*) 0)->a_un.a_val) elf_aux_val_t;
// All interesting auxv entry types are AT_SYSINFO_EHDR and below
#define AT_MAX (AT_SYSINFO_EHDR + 1)

#endif

class CrashInfo : public ICLRDataEnumMemoryRegionsCallback,
#ifdef __APPLE__
public MachOReader
@@ -53,8 +54,8 @@ class CrashInfo : public ICLRDataEnumMemoryRegionsCallback,
std::set<MemoryRegion> m_allMemoryRegions; // all memory regions on MacOS
#else
std::array<elf_aux_val_t, AT_MAX> m_auxvValues; // auxv values
#endif
std::vector<elf_aux_entry> m_auxvEntries; // full auxv entries
#endif
std::vector<ThreadInfo*> m_threads; // threads found and suspended
std::set<MemoryRegion> m_moduleMappings; // module memory mappings
std::set<MemoryRegion> m_otherMappings; // other memory mappings
@@ -87,8 +88,10 @@ class CrashInfo : public ICLRDataEnumMemoryRegionsCallback,
inline const std::set<MemoryRegion> ModuleMappings() const { return m_moduleMappings; }
inline const std::set<MemoryRegion> OtherMappings() const { return m_otherMappings; }
inline const std::set<MemoryRegion> MemoryRegions() const { return m_memoryRegions; }
#ifndef __APPLE__
inline const std::vector<elf_aux_entry> AuxvEntries() const { return m_auxvEntries; }
inline size_t GetAuxvSize() const { return m_auxvEntries.size() * sizeof(elf_aux_entry); }
#endif

// IUnknown
STDMETHOD(QueryInterface)(___in REFIID InterfaceId, ___out PVOID* Interface);
@@ -122,4 +125,5 @@ class CrashInfo : public ICLRDataEnumMemoryRegionsCallback,
bool ValidRegion(const MemoryRegion& region);
void CombineMemoryRegions();
void Trace(const char* format, ...);
void TraceVerbose(const char* format, ...);
};
18 changes: 6 additions & 12 deletions src/coreclr/debug/createdump/crashinfomac.cpp
@@ -15,8 +15,6 @@ CrashInfo::Initialize()
fprintf(stderr, "task_for_pid(%d) FAILED %x %s\n", m_pid, result, mach_error_string(result));
return false;
}
m_auxvEntries.push_back(elf_aux_entry { AT_BASE, { 0 } });
m_auxvEntries.push_back(elf_aux_entry { AT_NULL, { 0 } });
return true;
}

@@ -110,7 +108,7 @@ CrashInfo::EnumerateMemoryRegions()
TRACE("mach_vm_region_recurse for address %016llx %08llx FAILED %x %s\n", address, size, result, mach_error_string(result));
break;
}
TRACE("%016llx - %016llx (%06llx) %08llx %s %d %d %d %c%c%c %02x\n",
TRACE_VERBOSE("%016llx - %016llx (%06llx) %08llx %s %d %d %d %c%c%c %02x\n",
address,
address + size,
size / PAGE_SIZE,
@@ -291,9 +289,11 @@ void CrashInfo::VisitSegment(MachOModule& module, const segment_command_64& segm
const auto& found = m_moduleMappings.find(moduleRegion);
if (found == m_moduleMappings.end())
{
TRACE("VisitSegment: ");
moduleRegion.Trace();

if (g_diagnosticsVerbose)
{
TRACE_VERBOSE("VisitSegment: ");
moduleRegion.Trace();
}
// Add this module segment to the module mappings list
m_moduleMappings.insert(moduleRegion);

@@ -380,9 +380,3 @@ CrashInfo::ReadProcessMemory(void* address, void* buffer, size_t size, size_t* r
*read = numberOfBytesRead;
return size == 0 || numberOfBytesRead > 0;
}

// For src/inc/llvm/ELF.h
Elf64_Ehdr::Elf64_Ehdr()
{
}

7 changes: 5 additions & 2 deletions src/coreclr/debug/createdump/createdump.h
@@ -11,11 +11,13 @@
#endif

extern void trace_printf(const char* format, ...);
extern void trace_verbose_printf(const char* format, ...);
extern bool g_diagnostics;
extern bool g_diagnosticsVerbose;

#ifdef HOST_UNIX
#define TRACE(args...) trace_printf(args)
#define TRACE_VERBOSE(args...)
#define TRACE_VERBOSE(args...) trace_verbose_printf(args)
#else
#define TRACE(args, ...)
#define TRACE_VERBOSE(args, ...)
@@ -82,7 +84,8 @@ typedef int T_CONTEXT;
#include <string>
#ifdef HOST_UNIX
#ifdef __APPLE__
#include "mac.h"
#include <mach/mach.h>
#include <mach/mach_vm.h>
#endif
#include "datatarget.h"
#include "threadinfo.h"