Transaction creation not possible
Incident Report for Signhost Verified Signing
Postmortem

What happened?

On Friday, February 2, at 17:05, the message queue that handles messaging between our applications and endpoints ran into network issues. These issues arose because an underlying process ran out of memory.

What did we do?

After noticing the issue, we tried to revive the system by stopping all nodes but one and then restarting them, hoping to regain a quorum. When this didn’t work, we decided to remove all cluster nodes and their data and rebuild the whole cluster. The rebuild itself took only a couple of minutes, after which our platform quickly became fully functional again. Around 17:50 the whole platform was functioning properly again.

What will we do better?

  • Improve the cleanup of stale queues.
  • Respond more quickly to these network issues, as we now know what works and what doesn't.
  • Improve the resilience of our queueing system in general.
Posted Feb 09, 2024 - 17:30 CET

Resolved
The incident has been resolved. The issue in our message queue handling system was found and fixed. We are investigating the root cause to improve our message queue system going forward, and will update this status page with a further cause analysis when the investigation is complete.
Posted Feb 02, 2024 - 18:09 CET
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Feb 02, 2024 - 18:01 CET
Identified
We have found the cause of the issue and are working on a fix.
Posted Feb 02, 2024 - 17:49 CET
Investigating
We are investigating an issue with transaction creation.
Posted Feb 02, 2024 - 17:11 CET
This incident affected: API, ID Proof, Webforms, and Evidos App.