add coredumpd support
This only does something when used with a new enough KCrash.
Coredumpd is a coredump handler that comes with systemd. When a process dumps its core, the core is sent to coredumpd, which records the crash in the systemd journal and stores the core on disk. This allows us to pick up the crash after the fact and file a bug report, for example when software crashes on session logout.
To facilitate bug reporting, KCrash writes the metadata we ordinarily get through ARGV to disk as an INI file. Since we still want to support both operation modes, this commit introduces a large amount of extra tooling specifically meant to connect coredumpd crashes to metadata, and metadata to drkonqi argvs. All of this depends on systemd. It generally works with version 245, but 248 is strongly recommended because of various refinements and bugfixes.
Architecturally a coredumpd crash works like this:
KCrash
The app crashes. KCrash's signal handler runs. It records the metadata
to a file in ~/.cache. It then re-raises the signal to trigger a core
dump.
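As a rough illustration of the metadata file, the sketch below writes and reads back an INI record. The file name, the [KCrash] group name, and the keys are hypothetical and picked for the example only. The real handler also has to work within signal-handler constraints, which this Python sketch ignores.

```python
import configparser
from pathlib import Path


def write_crash_metadata(cache_dir: Path, pid: int, fields: dict[str, str]) -> Path:
    """Write crash metadata as an INI file and return its path.

    The "<pid>.ini" file name and the [KCrash] group are assumptions
    made for this sketch, not the layout KCrash necessarily uses.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    parser = configparser.ConfigParser(interpolation=None)
    parser["KCrash"] = {"pid": str(pid), **fields}
    path = cache_dir / f"{pid}.ini"
    with path.open("w", encoding="utf-8") as handle:
        parser.write(handle)
    return path


def read_crash_metadata(path: Path) -> dict[str, str]:
    """Read the [KCrash] group back as a plain dictionary."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path, encoding="utf-8")
    return dict(parser["KCrash"])
```

Because the record lives on disk rather than in ARGV, it survives until something picks the crash up later.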
coredumpd
Coredumpd gets invoked by the kernel, captures the core, records the
crash with all the metadata it has available (proc maps, pid, time,
etc.) to journald. It does this by invoking an instance of
systemd-coredump@.service.
drkonqi-coredump-processor@.service
This is wanted by systemd-coredump@.service and instantiated using the
same instance "name" as coredump@ (this then allows us to find the correct
crash). The processor connects to journald and searches/waits for the
crash of the matching coredump@ instance to appear in the journal. Once
the crash record has been found, a connection to a user-scope socket is
opened...
drkonqi-coredump-launcher.socket
Is a user-scope socket that exists purely for
drkonqi-coredump-processor@.service to talk to. When a connection is
opened, an instance of drkonqi-coredump-launcher@.service is spun up to
deal with the traffic.
drkonqi-coredump-launcher@.service
Is the actual launcher service. It is socket-activated from system scope. On the socket it gets the crash metadata streamed from the system-level processor, which forwards the data it looked up and thereby eliminates the need to talk to journald again.
The launcher then glues the coredumpd metadata into the same file as the KCrash metadata, turning the .ini file into a comprehensive record of the crash.
Once the file is complete it forks drkonqi with the same arguments as though KCrash had invoked it directly so the user can file a crash report.
Drkonqi
Drkonqi itself has grown a new CoredumpBackend analogous to the KCrash backend. Its main concern is preparing the core for tracing. Depending on the systemd version, that is either delegated to coredumpctl (the CLI for coredumpd) or partially done on our end. In either case coredumpctl is a runtime requirement, so that we do not have to concern ourselves with where coredumpd actually stores a core (it could be compressed, on disk, or in the journal).
gdbrc now also supports the coredump backend: the commandline templates are extended with core-based tracing, for the coredump backend only. As a side effect, debuggers can now have a corefile template variable, which is the path to the on-disk corefile in the event that the legacy coredumpd backend is used. coredumpd 248 and newer allows us to invoke gdb through coredumpctl directly, eliminating the need to faff about with core files manually on our end.
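The two code paths can be pictured with the sketch below. It only builds command lines and is not how DrKonqi's templates are actually written. The function name and parameters are hypothetical. It relies on coredumpctl's --debugger-arguments option (added in systemd 248) for the direct path, and on coredumpctl dump --output for the legacy path.

```python
import shlex


def tracing_commands(systemd_version: int, pid: int,
                     gdb_args: list[str], core_path: str) -> list[list[str]]:
    """Return the commands needed to trace the core of the given PID."""
    if systemd_version >= 248:
        # coredumpctl locates the core and starts gdb on it for us.
        return [[
            "coredumpctl",
            "--debugger=gdb",
            "--debugger-arguments=" + shlex.join(gdb_args),
            "debug", str(pid),
        ]]
    # Legacy path: extract the core to disk first, then point gdb at it.
    return [
        ["coredumpctl", f"--output={core_path}", "dump", str(pid)],
        ["gdb", *gdb_args, f"--core={core_path}"],
    ]
```

In the legacy case core_path plays the role of the corefile template variable. With 248 and newer no core file ever has to be handled on our side.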
Everything else stays the same. As far as the UI bits are concerned nothing changes between a kcrash backend and a coredump backend.
Metadata file presence currently doubles as a "this crash has not been dealt with" indicator. As such, metadata files are only cleaned up if the user somehow interacts with drkonqi to dismiss the dialog. This is to assist with future development to implement "an application has crashed in the past" style behavior (e.g. when apps crashed on logout).
drkonqi-coredump-cleanup.{service,timer}
Is a cleanup system in case crashes fall through the cracks and don't get their metadata files cleaned up. This is largely a stop-gap measure, because this commit does not deal with actually picking up crashes that happened at logout; that requires additional UI engineering first.