Delay in transaction creation
Incident Report for Evidos trust services
Postmortem

Last week our signing platform suffered a major outage. Transaction creation in both the portal and the API was delayed. We apologize for any problems our customers experienced during this outage, and we want to use this report to inform you about the steps we took to prevent it from recurring. We are committed to guaranteeing the high uptime you are used to from our platform.

Intro

Last week we saw the transaction processing times of our signing service rise rapidly. We activated an emergency protocol to stop the queue from growing further. As a result, creating new transactions in both the portal and the API was not possible for about 2 hours.

Problem

On February 4th around 13:30 CET we encountered a problem with our database, which meant that new transactions could not be processed. An emergency protocol was activated automatically to prevent timeouts and a large queue buildup. This resulted in ‘not available’ messages and error QR codes, so that customers were at least made directly aware that transaction creation was not possible.

All transactions queued before the emergency protocol was activated were parked for later re-entry into our regular signing service.

A database error should be quick to diagnose and fix. A compounding problem was that, some time earlier, we had migrated to a new server environment. In this new environment we had not been granted all the rights we were used to having in our previous environment, so we unexpectedly had to depend on our web hosting party to fully diagnose and fix the problem. Because of this extra layer of communication, a problem that we should have been able to diagnose and fix right away took longer to solve than expected.

Fix

After we got in contact with our hosting party, the problem was diagnosed and fixed. Around 15:32 CET our Portal and API functionality was restored. Around midnight the next day (Friday) we restored the previously parked transactions. As a result, some transactions may appear as processed around midnight on Friday, even though they were actually created or interacted with earlier.

Mitigation

To mitigate these problems in the future, we have verified that the rights we have on our server environment allow us to respond to and repair problems on our platform quickly. Furthermore, we have expanded our logging and testing capabilities so that we can catch similar problems at an earlier stage. Finally, we are in the process of retooling some database services to prevent and relieve bottlenecks.

These measures should ensure that:

  1. Problems with our database occur less often.
  2. If we see problems, we can diagnose them more quickly.
  3. Once we have diagnosed a problem, our quick response teams have full control to fix it right away.

This will help us guarantee our high uptime and keep our customers’ environments operational and responsive in the future.

Posted Feb 10, 2021 - 14:54 CET

Resolved
Transactions can be created again. The incident has been resolved. For today's pending transactions with the status 'waiting for document' or 'in progress', please wait 24 hours. We will try to process these transactions again.
If the pending transactions in question cannot wait, we recommend withdrawing them and creating them again.
Posted Feb 04, 2021 - 17:13 CET
Update
The cause of the delay has been identified. Transactions can be created again. We are continuing to monitor for any further issues.
Posted Feb 04, 2021 - 16:54 CET
Update
We are continuing to monitor for any further issues.
Posted Feb 04, 2021 - 15:32 CET
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Feb 04, 2021 - 15:32 CET
Update
We are continuing to work on a fix for this issue.
Posted Feb 04, 2021 - 15:16 CET
Identified
The issue has been identified and a fix is being implemented.
Posted Feb 04, 2021 - 15:04 CET
Update
We are continuing to investigate this issue.
Posted Feb 04, 2021 - 14:42 CET
Update
The delay in the creation of transactions is still an issue. Please bear with us. We are in close contact with our hosting provider and the issue has our full attention.
Posted Feb 04, 2021 - 14:33 CET
Investigating
We are currently experiencing a delay in creating new transactions, which may result in a 500 error message with a QR code. If you see this error message, please try again later. Sorry for the inconvenience.
Posted Feb 04, 2021 - 13:38 CET
This incident affected: API and Portal.