Draft: Make the startup of the service more robust

Ingo Klöcker requested to merge work/kloecker/more-robust-startup into master

These two changes make the startup of the service (and the client script) more robust:

  • The services try for up to 30 minutes to reach the SFTP server. The client scripts try for up to 5 minutes.
  • The services make several attempts (8 attempts in about 25 seconds) to fetch the project settings.
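
For discussion, here is a minimal sketch of the two retry strategies listed above. It is not the code from this merge request. The helper names `retry_until` and `retry`, the fixed 10-second interval, and the doubling backoff starting at 0.2 seconds are all assumptions chosen for illustration. The backoff values happen to give 8 attempts with about 25 seconds of total waiting, but the actual schedule may differ.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_until(
    operation: Callable[[], T],
    timeout: float,
    interval: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Deadline-based retry (hypothetical helper).

    Calls operation() until it succeeds. Gives up and re-raises the last
    error once another wait of `interval` seconds would pass the deadline.
    """
    deadline = clock() + timeout
    while True:
        try:
            return operation()
        except Exception:
            if clock() + interval > deadline:
                raise
            sleep(interval)


def retry(
    operation: Callable[[], T],
    attempts: int,
    initial_delay: float,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Count-based retry with exponential backoff (hypothetical helper).

    Calls operation() at most `attempts` times, multiplying the wait by
    `factor` after every failure. Re-raises the error of the last attempt.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise
            sleep(delay)
            delay *= factor
    raise ValueError("attempts must be at least 1")
```

Under these assumptions, reaching the SFTP server would look like `retry_until(connect, timeout=30 * 60, interval=10)` and fetching the project settings like `retry(fetch_settings, attempts=8, initial_delay=0.2)`, where `connect` and `fetch_settings` stand in for the real functions. The `clock` and `sleep` parameters exist only so that the timing can be tested without actually waiting.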

These changes should prevent the service from exiting too quickly on startup. As a side effect, they also prevent the services from being restarted too many times within a short time span, which had caused systemd to stop restarting them.

I'm open to comments about the timeouts and retry periods. Retrying for 30 minutes to reach the SFTP server may sound extreme, but without the SFTP server the service is not operational. We could still exit more quickly and let systemd restart the service. Conversely, retrying for (only) about 25 seconds to fetch the project settings may sound too short, but it's certainly long enough to prevent systemd from marking the service as bad.

Thoughts?

P.S. None of the settings for systemd service units allow throttling/delaying restarts. Therefore I chose the above approach to mitigate the problem that systemd stopped restarting our services.