PLCrashReporter includes support for monitoring crashes via an in-process Mach exception handler. There are a small number of crash cases that will not be caught via a POSIX signal handler, but can be caught via a Mach exception handler:
Unfortunately, the latter issue (__assert) cannot be handled on iOS; trapping abort requires that a Mach exception handler operate out-of-process, which is impossible on iOS. On Mac OS X, this will only be handled once we've implemented fully out-of-process crash execution.
On Mac OS X, the Mach exception implementation is fully supported using entirely public API. On iOS, the APIs required are not fully public – more details on the implications of this for exception handling on iOS may be found in Mach Exceptions on iOS below. It is worth noting that even where the Mach exception APIs are fully supported, kernel-internal constants, as well as architecture-specific trap information, may be required to fully interpret a Mach exception's root cause.
For example, the EXC_SOFTWARE exception is dispatched for four different failure types, using the exception code to differentiate failure types:
Of those four types, only the constant required to interpret the SIGKILL behavior (EXC_SOFT_SIGNAL) is publicly defined. For the remaining three failure types, the constant values are kernel implementation-private, defined only in the available kernel sources. On iOS, these sources are unavailable, and while they generally do match the Mac OS X implementation, there are no guarantees that this is – or will remain – the case in the future.
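As an illustration of the dispatch described above, the following is a minimal sketch of classifying an EXC_SOFTWARE exception by its first code. All `SKETCH_`-prefixed names are hypothetical and not part of PLCrashReporter. The three kernel-private values are assumptions taken from the published Mac OS X kernel sources, so the same caveat applies: nothing guarantees that they hold on iOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Publicly defined as EXC_SOFT_SIGNAL in <mach/exception_types.h>. */
#define SKETCH_EXC_SOFT_SIGNAL      0x10003

/* ASSUMPTION: kernel-private values as they appear in the published
 * Mac OS X kernel sources; they may differ or change on iOS. */
#define SKETCH_EXC_UNIX_BAD_SYSCALL 0x10000 /* SIGSYS */
#define SKETCH_EXC_UNIX_BAD_PIPE    0x10001 /* SIGPIPE */
#define SKETCH_EXC_UNIX_ABORT       0x10002 /* SIGABRT */

typedef enum {
    SKETCH_SOFT_UNKNOWN = 0,
    SKETCH_SOFT_BAD_SYSCALL,
    SKETCH_SOFT_BAD_PIPE,
    SKETCH_SOFT_ABORT,
    SKETCH_SOFT_SIGNAL
} sketch_soft_kind_t;

/* Classify an EXC_SOFTWARE exception from its code array.
 * Only codes[0] is inspected; unrecognized values map to UNKNOWN. */
sketch_soft_kind_t sketch_classify_exc_software(const int64_t *codes, unsigned count) {
    assert(codes != NULL || count == 0);
    if (count == 0)
        return SKETCH_SOFT_UNKNOWN;

    switch (codes[0]) {
    case SKETCH_EXC_UNIX_BAD_SYSCALL: return SKETCH_SOFT_BAD_SYSCALL;
    case SKETCH_EXC_UNIX_BAD_PIPE:    return SKETCH_SOFT_BAD_PIPE;
    case SKETCH_EXC_UNIX_ABORT:       return SKETCH_SOFT_ABORT;
    case SKETCH_EXC_SOFT_SIGNAL:      return SKETCH_SOFT_SIGNAL;
    default:                          return SKETCH_SOFT_UNKNOWN;
    }
}
```

Returning an explicit "unknown" result rather than guessing keeps the sketch honest about the undocumented values.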
Likewise, interpretation of particular fault types requires information regarding the underlying machine traps that triggered the Mach exceptions. For example, a floating point trap on x86/x86-64 will trigger an EXC_ARITHMETIC, with a subcode value containing the value of the FPU status register. Determining the exact FPU cause requires extracting the actual exception flags from the status register as per the x86 architecture documentation. The exact format of this subcode value is not actually documented outside the kernel, and may change in future releases.
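To make the extraction step concrete, here is a minimal sketch that pulls the six x87 exception flags out of a status word. The bit positions follow the x86 architecture documentation. That the subcode carries the raw status word in its low bits is an assumption, since the subcode format is undocumented. All `sketch_` and `SKETCH_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* x87 status word exception flags (bits 0-5). */
enum {
    SKETCH_X87_INVALID_OP  = 1 << 0, /* IE */
    SKETCH_X87_DENORMAL    = 1 << 1, /* DE */
    SKETCH_X87_ZERO_DIVIDE = 1 << 2, /* ZE */
    SKETCH_X87_OVERFLOW    = 1 << 3, /* OE */
    SKETCH_X87_UNDERFLOW   = 1 << 4, /* UE */
    SKETCH_X87_PRECISION   = 1 << 5  /* PE */
};

#define SKETCH_X87_EXCEPTION_MASK 0x3Fu

/* ASSUMPTION: the subcode holds the raw FPU status word in its low bits.
 * Returns only the six exception flag bits. */
uint32_t sketch_x87_exception_flags(uint64_t subcode) {
    return (uint32_t)(subcode & SKETCH_X87_EXCEPTION_MASK);
}

/* Human-readable name for one exception flag, by bit index (0-5). */
const char *sketch_x87_flag_name(unsigned bit) {
    static const char *const names[6] = {
        "invalid operation", "denormal operand", "divide by zero",
        "overflow", "underflow", "precision (inexact)"
    };
    assert(bit < 6);
    return names[bit];
}
```

Several flags may be set at once, so a caller would typically test each bit of the returned mask rather than expect a single cause.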
While we have the advantage of access to the x86 kernel sources, the situation on ARM is even less clear. The actual use of the Mach exception codes and subcodes is largely undefined by both headers and publicly available documentation, and the available x86 kernel sources are of little use in interpreting this data.
As such, while Mach exceptions may catch some cases that BSD signals cannot, they are not a perfect solution, and may also provide less insight into the actual failures that occur. By comparison, the BSD signal interface is both fully defined and architecture independent, with any necessary interpretation of the Mach exception codes handled in-kernel at the time of exception dispatch. It is generally recommended by Apple as the preferred interface, and should also be preferred by PLCrashReporter API clients.
Enabling in-process Mach exception handlers will conflict with any attached debuggers; the debugger may suspend the process's Mach exception handling thread, which will result in any exception messages sent via the debugger being lost, as the in-process handler will be unable to receive and forward the messages.
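For comparison, the BSD signal interface can be exercised with nothing more than standard POSIX calls. The following is a minimal sketch of installing a `sigaction` handler with `SA_SIGINFO`; it is not how PLCrashReporter installs its handlers, and the `sketch_` names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Last signal number seen by the handler; sig_atomic_t is safe to
 * write from within a signal handler. */
static volatile sig_atomic_t sketch_last_signo = 0;

/* Handler: records the signal number only. A real crash reporter
 * must restrict itself to async-signal-safe operations here. */
static void sketch_on_signal(int signo, siginfo_t *info, void *uap) {
    (void)info;
    (void)uap;
    sketch_last_signo = signo;
}

/* Install the handler for one signal. Returns 0 on success, -1 on error. */
int sketch_install_handler(int signo) {
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sketch_on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Accessor for the last signal number recorded by the handler. */
int sketch_last_signal(void) {
    return (int)sketch_last_signo;
}
```

The handler receives a fully interpreted signal number and `siginfo_t`, with no architecture-specific decoding left to the client.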
A Mach exception handler may conflict with any managed runtime that registers a BSD signal handler that can safely handle otherwise fatal signals, allowing execution to proceed. This includes products such as Xamarin for iOS.
In such a case, PLCrashReporter will write a crash report for non-fatal signals, as there is no immediate mechanism for determining whether a signal handler exists and whether it can safely handle the failure. This can result in unexpected delays in application execution, increased I/O to disk, and other undesirable operations.
The APIs required for Mach exception handling are not fully public on iOS. After we filed a request with Apple DTS to clarify the status of the Mach exception APIs on iOS, and implemented a Mach exception handler using only supported API, DTS provided the following guidance:
Our engineers have reviewed your request and have determined that this would be best handled as a bug report, which you have already filed. There is no documented way of accomplishing this, nor is there a workaround possible.
Due to user request, PLCrashReporter provides an optional implementation of Mach exception handling for both iOS and Mac OS X.
This implementation uses only supported API on Mac OS X, and depends on limited undefined API on iOS. The reporter may be excluded entirely at build time by modifying the PLCRASH_FEATURE_MACH_EXCEPTIONS build configuration; it may also be disabled at runtime by configuring the PLCrashReporter instance appropriately via PLCrashReporterConfig.
The iOS implementation is written almost entirely using public API, and links against no actual private symbols; the use of undocumented functionality is limited to assuming the use of specific msgh_id values (see below for details). As a result, it may be considered perfectly safe to include the Mach Exception code in the standard build, and enable/disable it at runtime.
The following issues exist in the iOS implementation:
The msgh_id values required for an exception reply message are not available from the available headers and must be hard-coded. This prevents one from safely replying to exception messages, which means that it is impossible to (correctly) inform the server that an exception has not been handled.
Impact: This can lead to the process locking up and not dispatching to the host exception handler (eg, Apple's crash reporter), depending on the behavior of the kernel exception code.
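The hard-coding described above can be sketched as follows. The request IDs are assumptions based on the exception subsystems as declared in the Mac OS X MIG definitions (exc.defs and mach_exc.defs), and the reply offset of 100 is the usual MIG convention; neither is confirmed for iOS. All `SKETCH_` and `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: request msgh_id values as generated from the Mac OS X
 * exc.defs (base 2401) and mach_exc.defs (base 2405) subsystems. */
enum {
    SKETCH_MSGID_EXCEPTION_RAISE                     = 2401,
    SKETCH_MSGID_EXCEPTION_RAISE_STATE               = 2402,
    SKETCH_MSGID_EXCEPTION_RAISE_STATE_IDENTITY      = 2403,
    SKETCH_MSGID_MACH_EXCEPTION_RAISE                = 2405,
    SKETCH_MSGID_MACH_EXCEPTION_RAISE_STATE          = 2406,
    SKETCH_MSGID_MACH_EXCEPTION_RAISE_STATE_IDENTITY = 2407
};

/* ASSUMPTION: MIG replies use the request ID plus 100. */
#define SKETCH_MIG_REPLY_OFFSET 100

/* Returns 1 if the ID is one of the exception request IDs above. */
int sketch_is_exception_request(int32_t msgh_id) {
    return (msgh_id >= SKETCH_MSGID_EXCEPTION_RAISE &&
            msgh_id <= SKETCH_MSGID_EXCEPTION_RAISE_STATE_IDENTITY) ||
           (msgh_id >= SKETCH_MSGID_MACH_EXCEPTION_RAISE &&
            msgh_id <= SKETCH_MSGID_MACH_EXCEPTION_RAISE_STATE_IDENTITY);
}

/* Computes the hard-coded reply msgh_id for an exception request. */
int32_t sketch_reply_msgh_id(int32_t request_msgh_id) {
    assert(sketch_is_exception_request(request_msgh_id));
    return request_msgh_id + SKETCH_MIG_REPLY_OFFSET;
}
```

Because these values come from headers that are unavailable on iOS, any code relying on them should be treated as depending on undocumented behavior.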
The mach_* structure/type variants required by MACH_EXCEPTION_CODES are not publicly defined (on Mac OS X, these are provided by mach_exc.defs). This prevents one from forwarding exception messages to an existing handler that was registered with a MACH_EXCEPTION_CODES behavior (eg, forwarding is entirely non-functional on ARM64 devices).
Impact: This can break forwarding to any task exception handler that registers itself with MACH_EXCEPTION_CODES, including other handlers registered within the current process, eg, by a managed runtime. This could also result in misinterpretation of a Mach exception message, in the case where the message format is modified by Apple to be incompatible with the existing 32-bit format.
This is the case with LLDB; it will register a task exception handler with MACH_EXCEPTION_CODES set. Failure to correctly forward these exceptions will result in the debugger breaking in interesting ways; for example, changes to the set of dyld-loaded images are detected by setting a breakpoint on the dyld image registration functions, and this functionality will break if the exception is not correctly forwarded.
Since Mach exception handling is important for a fully functional crash reporter, we have also filed a radar to request that the API be made public: Radar: rdar://12939497 RFE: Provide mach_exc.defs for iOS
At the time of this writing, the radar remains open/unresolved.