Migrate from breakpad to crashpad

Issue #1995 resolved
Marshall Greenblatt created an issue

Chrome is now using Crashpad instead of Breakpad for crash reporting on OS X and Windows. CEF should migrate those platforms to Crashpad as well.

As part of this migration we will be adding improved APIs, tooling and documentation for building and testing crash reporting in CEF-based applications. More details to follow.

Crashpad announcement: https://groups.google.com/a/chromium.org/d/msg/chromium-dev/6eouc7q2j_g/AaUralOb3acJ

Chrome crashpad implementation on OS X: https://bugs.chromium.org/p/chromium/issues/detail?id=390217

Chrome crashpad implementation on Windows: https://bugs.chromium.org/p/chromium/issues/detail?id=546288

Crashpad implementation for content_shell: https://bugs.chromium.org/p/chromium/issues/detail?id=466890

Comments (27)

  1. Marshall Greenblatt reporter

    On Windows, Chrome uses an ELF (Early Loading Framework) DLL [1] to handle crash reporting. This DLL is loaded before anything else in the process, even before WinMain is called, allowing for the earliest possible initialization of crash reporting. The ELF is then paired with a "run as crash handler" process type [2] which eliminates the need for a separate crash service executable. The crash handler process is launched indirectly from the ELF's DllMain function [3], so all of this logic can be handled internally by the ELF and CEF (via CefExecuteProcess).

    [1] https://cs.chromium.org/chromium/src/chrome_elf/

    [2] https://cs.chromium.org/chromium/src/chrome/app/chrome_exe_main_win.cc?rcl=0&l=223

    [3] https://cs.chromium.org/chromium/src/chrome_elf/chrome_elf_main.cc?rcl=0&l=18
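
    To illustrate the "run as crash handler" idea, here is a minimal sketch of how a process might detect that it was launched in that role by inspecting its command line. The "--type=crashpad-handler" switch is the one that shows up in the error report later in this thread; the function names are hypothetical and this is not the actual Chromium or CEF code.

```cpp
#include <string>

// Returns the value of the "--type=" switch, or an empty string when the
// switch is absent (which this sketch treats as the main/browser process).
std::string GetProcessType(int argc, const char* const argv[]) {
  const std::string prefix = "--type=";
  for (int i = 1; i < argc; ++i) {
    const std::string arg = argv[i];
    if (arg.compare(0, prefix.size(), prefix) == 0)
      return arg.substr(prefix.size());
  }
  return std::string();
}

// True when the process was launched to act as the crash handler.
bool IsCrashHandlerProcess(int argc, const char* const argv[]) {
  return GetProcessType(argc, argv) == "crashpad-handler";
}
```

    In a real application this dispatch happens inside CefExecuteProcess, so client code would not normally need to write it.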

  2. Marshall Greenblatt reporter

    There are some problems using the chrome_elf approach with CEF. In summary:

    A. CEF ships binaries that are shared by many different applications. The application will customize and compile the main executable and dynamically link the libcef DLL (wrapping the Content API), which itself dynamically links the chrome_elf DLL. We therefore need to configure/compile upload and crash key definitions outside of chrome_elf (see examples #1 and #2 below). Perhaps we can load these values from a separate file during chrome_elf startup and CEF-based applications would then install this file along with the application so that it's available on first run.

    B. CEF does not use MetricsServicesManager and, if possible, we should avoid adding this dependency just to configure Crashpad preferences (see example #3 below). We can perhaps move Crashpad-related functionality out of ChromeMetricsServicesManagerClient and into something that can be shared by all Crashpad consumers.

    Specific examples:

    1. The crash upload URL is currently either hard-coded or retrieved via the CHROME_CRASHPAD_SERVER_URL environment variable in PlatformCrashpadInitialization() [1]. With chrome_elf this function is called before the application's WinMain function so it's not possible to set environment variables programmatically in the application's WinMain function like we did with Breakpad. Since CEF ships binaries that are shared by many different applications we need a way to set this URL that does not require setting external environment variables or re-compiling Chromium.

    2. Annotations [2] and crash keys [3] are currently handled by code compiled into chrome_elf. Each application using CEF is likely to have application-specific keys that it cares about, so we need to retrieve these keys from a source external to chrome_elf.

    3. The logic for enabling crash upload seems to be handled by MetricsServicesManager::UpdateUploadPermissions() which calls back into chrome_elf via SetUploadConsentImpl [4] to update a settings database file. This file is then checked by CrashReportUploadThread::ProcessPendingReport() [5] to determine if the report should be uploaded. It seems that we could enable upload without using MetricsServicesManager by extracting the logic in ChromeMetricsServicesManagerClient::UpdateRunningServices [6] that calls SetUploadConsentImpl.

    [1] https://cs.chromium.org/chromium/src/components/crash/content/app/crashpad_win.cc?q=PlatformCrashpadInitialization&sq=package:chromium&l=53&dr=CSs

    [2] https://cs.chromium.org/chromium/src/components/crash/content/app/crashpad_win.cc?q=GetPlatformCrashpadAnnotations&sq=package:chromium&l=33&dr=CSs

    [3] https://cs.chromium.org/chromium/src/chrome/app/chrome_crash_reporter_client_win.cc?q=RegisterCrashKeysHelper&sq=package:chromium&l=71&dr=CSs

    [4] https://cs.chromium.org/chromium/src/components/crash/content/app/crashpad.cc?q=SetUploadConsentImpl&sq=package:chromium&l=311&dr=C

    [5] https://cs.chromium.org/chromium/src/third_party/crashpad/crashpad/handler/crash_report_upload_thread.cc?q=CrashReportUploadThread::ProcessPendingReport&sq=package:chromium&l=188&dr=CSs

    [6] https://cs.chromium.org/chromium/src/chrome/browser/metrics/chrome_metrics_services_manager_client.cc?q=ChromeMetricsServicesManagerClient::UpdateRunningServices&sq=package:chromium&l=253&dr=CSs

  3. Marshall Greenblatt reporter

    Options moving forward:

    1. Don't use chrome_elf. Instead, initialize crash reporting in the application's WinMain as we do currently with Breakpad.

    • Advantages: We can define and compile crash key definitions in the client application. No need to distribute chrome_elf.dll with the client application.
    • Disadvantages: We don't get reporting of very-early startup crashes.

    2. Build chrome_elf along with the client application.

    • Advantages: The client application can optionally not build/link chrome_elf.dll (disables crash reporting).
    • Disadvantages: We include another static library in the binary distribution and make client build setup more complicated.

    3. Create a version of chrome_elf that reads crash keys from file on startup.

    • Advantages: We can ship chrome_elf in binary form. No special crash-related setup in the client application.
    • Disadvantages: Adds another file that we need to distribute with the client application. Reading this file may break chrome_elf's "load really early" contract (add dependencies on system libraries), negating some of its advantages.

  4. Marshall Greenblatt reporter

    Other considerations for client applications:

    • Need a way to customize the path and command-line for the crash dump process. For managed apps like .Net or Java it will likely not be the same as the loading executable.
    • Maybe support both compiled-in settings and environment variable settings. A managed app could ship chrome_elf without changes, and rely on the user setting environment variables to customize behavior.

  5. Marshall Greenblatt reporter

    Looks like we'll go with 3 above (create a version of chrome_elf that reads crash keys from file on startup).

  6. Said Elkhazendar

    Hi Marshall. When migrating from branch 2785 to 2883, I ran into a COM registration error. I have a COM component that uses libcef, and when trying to register the DLL by calling "regsvr32 Mylibrary.dll" I get an error saying "--type=crashpad-handler" is not valid. Please see the attached image. Is there a way to disable the crashpad handler, or are there any possible workarounds? This also throws exceptions when trying to load libcef.dll from Mylibrary.dll when calling CefInitialize.

    Thanks

    RegistrationFail.png

  7. Marshall Greenblatt reporter

    @said_elkhazendar : What 2883 branch build/revision are you using? Are you calling CefExecuteProcess or CefInitialize from DllMain in your Mylibrary.dll? How are you handling the other process types (renderer, gpu, etc)?

  8. Said Elkhazendar

    Hi Marshall

    Using version 3.2883.1528.gf557d32 (x86). The browser process calls CefInitialize; renderer processes use CefExecuteProcess.

  9. Said Elkhazendar

    Hi Marshall

    Changing chromium\src\components\crash\content\app\crashpad.cc to pass false as the 3rd param of InitializeCrashpadImpl, which disables the embedded handler (crashpad-handler), seems to fix the COM registration issue. Do you think it is a safe change? If yes, can this change be added as a configuration option (command line or CefSettings)?

    Thank You

  10. Said Elkhazendar

    Hi Chris,

    Thanks, seems related, except I am loading from a C++ application as part of a COM component. So the new handler seems to have this side effect on wrapped COM libraries.

  11. Marshall Greenblatt reporter

    @said_elkhazendar : Don't call CefInitialize from your DllMain function. You should instead export a function from your DLL that the application can call explicitly to initialize CEF. That way it will not be called when your DLL is loaded by regsvr32.
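
    A minimal sketch of that pattern follows. The exported function name and the StartCef helper are hypothetical, and the helper is stubbed so the example is self-contained; in a real DLL it would fill in CefSettings and call CefInitialize. On Windows the function would additionally be exported with __declspec(dllexport) or a .def file.

```cpp
namespace {

int g_start_calls = 0;  // only used to show that startup runs once

// Placeholder for the real startup work (for example, calling
// CefInitialize). Stubbed here so the sketch compiles on its own.
bool StartCef() {
  ++g_start_calls;
  return true;
}

}  // namespace

// Hypothetical entry point that the host application calls explicitly.
// Nothing here runs from DllMain, so loading the DLL (for example via
// regsvr32) does not start CEF.
extern "C" bool MyLibrary_InitializeCef() {
  // A function-local static is initialized exactly once, and since C++11
  // that initialization is thread-safe.
  static const bool ok = StartCef();
  return ok;
}
```

    The host application calls MyLibrary_InitializeCef() once during its own startup; repeated calls return the cached result.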

  12. Marshall Greenblatt reporter

    The crash reporter rate-limiting/retry logic is implemented in CrashReportUploadThread [1]. The current implementation can be summarized as follows:

    1. Process pending crash reports every 15 minutes (value passed to WorkerThread constructor) or when notified of a new crash (ReportPending called). Results in a call to ProcessPendingReport for each pending report.

    2. Attempt to upload each report at most one time. Skip upload if:

    A. Uploading is disabled (CrashReporterClient methods GetCollectStatsConsent, GetCollectStatsInSample or ReportingIsEnforcedByPolicy return false), or;

    B. A report was uploaded less than 1 hour ago (kUploadAttemptIntervalSeconds), or;

    C. The upload failed.

    There's an open issue [2] to improve this logic in crashpad; however, more immediate improvements are required for CEF consumers. CEF will implement the following retry strategy:

    1. (same as above) Process pending crash reports every 15 minutes (value passed to WorkerThread constructor) or when notified of a new crash (ReportPending called). Results in a call to ProcessPendingReport for each pending report.

    2. Attempt to upload each report repeatedly using a backoff scheme until the upload is successful or 24 hours have passed.

    3. Limit the number of uploads per 24 hour period (default 5).

    [1] https://cs.chromium.org/chromium/src/third_party/crashpad/crashpad/handler/crash_report_upload_thread.cc?sq=package:chromium&dr=CSs

    [2] https://bugs.chromium.org/p/crashpad/issues/detail?id=23
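
    The proposed retry strategy can be sketched as a per-report decision function. This is only an illustration: the thread does not specify the backoff scheme, so the doubling delay that starts at 15 minutes is an assumption, and all type and function names are hypothetical rather than taken from CrashReportUploadThread.

```cpp
#include <cstdint>

constexpr int64_t kSecondsPerDay = 24 * 60 * 60;
constexpr int64_t kInitialBackoffSeconds = 15 * 60;  // assumed starting delay

enum class UploadDecision { kUpload, kWait, kGiveUp };

struct ReportState {
  int64_t created_time;       // when the crash was recorded (seconds)
  int64_t last_attempt_time;  // time of the last attempt; unused if none
  int failed_attempts;        // upload attempts that have failed so far
};

// Delay before the next attempt: 15 min, 30 min, 1 h, ... capped at one day.
int64_t BackoffSeconds(int failed_attempts) {
  int64_t delay = kInitialBackoffSeconds;
  for (int i = 1; i < failed_attempts && delay < kSecondsPerDay; ++i)
    delay *= 2;
  return delay < kSecondsPerDay ? delay : kSecondsPerDay;
}

UploadDecision DecideUpload(const ReportState& report, int64_t now,
                            int uploads_in_last_day, int max_uploads_per_day) {
  if (now - report.created_time >= kSecondsPerDay)
    return UploadDecision::kGiveUp;  // step 2: stop retrying after 24 hours
  if (uploads_in_last_day >= max_uploads_per_day)
    return UploadDecision::kWait;  // step 3: daily upload limit reached
  if (report.failed_attempts > 0 &&
      now - report.last_attempt_time < BackoffSeconds(report.failed_attempts))
    return UploadDecision::kWait;  // step 2: still inside the backoff window
  return UploadDecision::kUpload;
}
```

    The function would be evaluated for each pending report on every processing pass (step 1), with max_uploads_per_day defaulting to 5.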

  13. Marshall Greenblatt reporter

    @amaitland : It's complicated by the early initialization of crash reporting on Windows (e.g. before CefInitialize is called). I don't know what the solution will be yet.

  14. Marshall Greenblatt reporter

    @amaitland : The crash handler executable is determined in crashpad_win.cc PlatformCrashpadInitialization [1]. If we pass an |embedded_handler| value of false to that method then it will use "crashpad_handler.exe" from the same directory as the main executable. Would that be sufficient for your purposes, or do you need to specify an arbitrary path and executable name?

    [1] https://cs.chromium.org/chromium/src/components/crash/content/app/crashpad_win.cc?q=PlatformCrashpadInitialization&sq=package:chromium&l=56&dr=CSs

  15. Marshall Greenblatt reporter

    NOTE: This comment is a rough draft of documentation added in https://bitbucket.org/chromiumembedded/cef/wiki/CrashReporting.md


    Crashpad will be configured using an INI-style config file named "crash_reporter.cfg" placed next to the main application executable (on Win, Linux) or in the top-level app bundle Resources directory (on macOS). If the config file exists then crashpad will be enabled and a crashpad-handler process instance will be launched. File contents are as follows:

    # Comments start with a hash character and must be on their own line.
    
    [Config]
    ProductName=<Value of the "prod" crash key; defaults to "cef">
    ProductVersion=<Value of the "ver" crash key; defaults to the CEF version>
    AppName=<Windows only; App-specific folder name component for storing crash information; defaults to "CEF">
    ExternalHandler=<Windows only; Name of the external handler exe to use instead of re-launching the main exe; defaults to empty>
    ServerURL=<crash server URL; defaults to empty>
    RateLimitEnabled=<True if uploads should be rate limited; defaults to true>
    MaxUploadsPerDay=<Max uploads per 24 hours, used if rate limit is enabled; defaults to 5>
    MaxDatabaseSizeInMb=<Total crash report disk usage greater than this value will cause older reports to be deleted; defaults to 20>
    MaxDatabaseAgeInDays=<Crash reports older than this value will be deleted; defaults to 5>
    
    [CrashKeys]
    my_key1=<small|medium|large>
    my_key2=<small|medium|large>
    
    • If "ProductName" and/or "ProductVersion" are set then the specified values will be included in the crash dump metadata. On macOS if these values are set to empty then they will be retrieved from the Info.plist file using the "CFBundleName" and "CFBundleShortVersionString" keys respectively.
    • If "AppName" is specified on Windows then crash information (metrics, database and dumps) will be stored under "C:\Users\[user]\AppData\Local\[AppName]\User Data". This value is ignored on macOS and Linux; use CefSettings.user_data_path instead on those platforms.
    • If "ExternalHandler" is specified on Windows then the specified exe will be launched as the crashpad-handler (it may be either an absolute path or a path relative to the main exe directory). This value is ignored on macOS and Linux; use CefSettings.browser_subprocess_path instead on Linux.
    • If "ServerURL" is specified then crashes will be uploaded to the server; otherwise, they will only be stored on the local machine.
    • "RateLimitEnabled" and "MaxUploadsPerDay" feed into the rate limiting scheme described in the comments above. Rate limiting is not supported on Linux.
    • "MaxDatabaseSizeInMb" and "MaxDatabaseAgeInDays" specify the conditions under which existing reports will be deleted from the client machine. On Windows each dump is about 600KB, so a "MaxDatabaseSizeInMb" value of 20 equates to about 34 crash reports stored on disk.
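
    As an illustration of the file format described above, here is a minimal sketch of a parser for INI-style text with "#" comment lines, "[Section]" headers and "key=value" pairs. It is not CEF's parser (that logic lives in CefCrashReporterClient::ReadCrashConfigFile); the names are hypothetical, and trimming whitespace around keys and values is an assumption.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// section name -> (key -> value)
using CrashConfig = std::map<std::string, std::map<std::string, std::string>>;

// Strips leading and trailing spaces, tabs and line-ending characters.
std::string Trim(const std::string& s) {
  const char* ws = " \t\r\n";
  const std::size_t begin = s.find_first_not_of(ws);
  if (begin == std::string::npos)
    return std::string();
  const std::size_t end = s.find_last_not_of(ws);
  return s.substr(begin, end - begin + 1);
}

CrashConfig ParseCrashConfig(const std::string& contents) {
  CrashConfig config;
  std::string section;
  std::istringstream stream(contents);
  std::string line;
  while (std::getline(stream, line)) {
    line = Trim(line);
    if (line.empty() || line[0] == '#')
      continue;  // blank line or comment on its own line
    if (line.front() == '[' && line.back() == ']') {
      section = Trim(line.substr(1, line.size() - 2));
      continue;
    }
    const std::size_t eq = line.find('=');
    if (eq == std::string::npos || section.empty())
      continue;  // malformed line, or a pair outside any section
    config[section][Trim(line.substr(0, eq))] = Trim(line.substr(eq + 1));
  }
  return config;
}
```

    Values come back as strings; a caller would still need to convert entries such as "MaxUploadsPerDay" to integers and apply the documented defaults for missing keys.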

    Any number of crash keys can be specified for use by the application. Crash key values will be truncated based on the specified size (small = crash_keys::kSmallSize (63 bytes), medium = crash_keys::kMediumSize (252 bytes), large = crash_keys::kLargeSize (1008 bytes)). The value of crash keys can be set from any thread or process using a new CefSetCrashKeyValue function exposed by CEF. These key/value pairs will be sent to the crash server along with the crash dump file.
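
    The truncation rule can be sketched as follows, using the byte limits quoted above. The function names are hypothetical, and treating an unrecognized size name as a limit of zero is an assumption made only to keep the example total.

```cpp
#include <cstddef>
#include <string>

// Byte limits for the three crash key sizes named in the config file.
constexpr std::size_t kSmallSize = 63;
constexpr std::size_t kMediumSize = 252;
constexpr std::size_t kLargeSize = 1008;

// Maps a size name from the [CrashKeys] section to its byte limit.
// Returns 0 for an unrecognized name (an assumption of this sketch).
std::size_t MaxSizeForCrashKey(const std::string& size_name) {
  if (size_name == "small")
    return kSmallSize;
  if (size_name == "medium")
    return kMediumSize;
  if (size_name == "large")
    return kLargeSize;
  return 0;
}

// Returns |value| cut down to the byte limit for |size_name|.
std::string TruncateCrashKeyValue(const std::string& value,
                                  const std::string& size_name) {
  return value.substr(0, MaxSizeForCrashKey(size_name));
}
```

    Note that a byte-based cut like this one can split a multi-byte UTF-8 character, so values near the limit are best kept to ASCII.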

    The config logic is implemented in CEF/Chromium as follows (code coming soon):

    1. Parse the value from "crash_reporter.cfg" in CefCrashReporterClient::ReadCrashConfigFile.
    2. Add the flag to the |arguments| array in CefCrashReporterClient::GetCrashOptionalArguments (called from PlatformCrashpadInitialization in crashpad_win.cc and crashpad_mac.mm).
    3. Add the flag and logic to support the value in HandlerMain [1] (which executes in the crashpad-handler process). For example, "MaxUploadsPerDay" is passed to the CrashReportUploadThread-derived class constructor and used in CefCrashReportUploadThread::ProcessPendingReport.

    [1] https://cs.chromium.org/chromium/src/third_party/crashpad/crashpad/handler/handler_main.cc?dr=C&q=handler_main.cc+HandlerMain&sq=package:chromium&l=140
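
    Step 2 above (turning parsed config values into handler arguments) could look roughly like the sketch below. The struct, the function name and the flag spellings are illustrative assumptions, not the names used by CefCrashReporterClient::GetCrashOptionalArguments.

```cpp
#include <string>
#include <vector>

// Hypothetical holder for values parsed from the [Config] section,
// initialized to the documented defaults.
struct CrashHandlerOptions {
  bool rate_limit_enabled = true;
  int max_uploads_per_day = 5;
  int max_database_size_mb = 20;
  int max_database_age_days = 5;
};

// Appends one command-line flag per option to |arguments|. The flag
// spellings are placeholders chosen for this example.
void AppendHandlerArguments(const CrashHandlerOptions& options,
                            std::vector<std::string>* arguments) {
  if (!options.rate_limit_enabled)
    arguments->push_back("--no-rate-limit");
  arguments->push_back("--max-uploads=" +
                       std::to_string(options.max_uploads_per_day));
  arguments->push_back("--max-db-size=" +
                       std::to_string(options.max_database_size_mb));
  arguments->push_back("--max-db-age=" +
                       std::to_string(options.max_database_age_days));
}
```

    The crashpad-handler process would then parse these flags in its HandlerMain and pass the values on to the upload thread (step 3).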

  16. amaitland

    amaitland : It's complicated by the early initialization of crash reporting on Windows (e.g. before CefInitialize is called). I don't know what the solution will be yet.

    @magreenblatt Thank you for the quick reply!

    amaitland : The crash handler executable is determined in crashpad_win.cc PlatformCrashpadInitialization [1]. If we pass an |embedded_handler| value of false to that method then it will use "crashpad_handler.exe" from the same directory as the main executable. Would that be sufficient for your purposes, or do you need to specify an arbitrary path and executable name?

    I'm sure that could have been made to work :+1:

    If the "crashpad.cfg" file exists then crashpad will be enabled and a crashpad-handler process instance will be launched

    Crashpad being enabled based on the existence of a cfg file sounds like a flexible solution, thanks for taking the time to look into the problem.

  17. Steven Bush

    I noticed that the crashpad-handler process includes --database and --metrics-dir on the command line and these are defaulting to chromium subfolders within the user's AppData/Local folders on Windows. Will it also be possible to configure these locations? It would be nice to contain these within our application's existing local temp folder.

  18. Marshall Greenblatt reporter

    Standardize product/version/platform crash keys in master revision 18ce862, 2924 branch revision df1d25f and 2883 branch revision 88ff29a. Crashes on all platforms will now contain a "platform" key with a value from the set ("win32", "win64", "linux32", "linux64", "macos").
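
    For illustration, one way to derive such a "platform" value at compile time is sketched below. The function name is hypothetical and the macro checks are an assumption about how the value could be chosen, not the code from the revisions listed above.

```cpp
#include <string>

// Returns one of "win32", "win64", "linux32", "linux64" or "macos",
// selected from standard compiler-defined platform macros.
std::string GetPlatformCrashKeyValue() {
#if defined(_WIN64)
  return "win64";
#elif defined(_WIN32)
  return "win32";
#elif defined(__APPLE__)
  return "macos";
#elif defined(__linux__)
  // Distinguish 32-bit and 64-bit Linux builds by pointer width.
  return sizeof(void*) == 8 ? "linux64" : "linux32";
#else
  return "unknown";  // platform outside the documented set
#endif
}
```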

  19. Marshall Greenblatt reporter

    macOS has a system crash reporter that's responsible for showing the crash UI dialog ("[Application] quit unexpectedly"). By default Chromium will forward [1] main process crashes to the system crash reporter [2]. In CEF we will not show the crash UI dialog by default, and instead make this behavior configurable via a new BrowserCrashForwardingEnabled option in crash_reporter.cfg.

    Added in master revision 661fa72, 2987 branch revision 1ad8ea0 and 2924 branch revision d513424.

    [1] https://cs.chromium.org/chromium/src/components/crash/content/app/crashpad.cc?type=cs&q=crashpad_info-%3Eset_system_crash_reporter_forwarding&l=146

    [2] https://cs.chromium.org/chromium/src/third_party/crashpad/crashpad/handler/mac/crash_report_exception_handler.cc?q=client_options.system_crash_reporter_forwarding&l=191
