CyberGrants Is Slow To Respond

Postmortem

Issue Summary:
On Monday, May 5, 2025, at approximately 11:00 EDT, we became aware that some CyberGrants users were experiencing slow responses from the application, and that some were also encountering intermittent errors. Following an initial investigation, the issue was resolved, and normal service resumed at approximately 12:20 EDT, about 80 minutes later.

Root Cause:
A significant increase in CPU utilization was observed following an email sent by one customer that departed from the usual delivery timing and process. The customer sent a mass email to a large population of users, which prompted immediate activity on the CyberGrants platform.
While the communication itself was not inherently problematic, the resulting user activity triggered a customization within the system. The action that users requested in response to the email was invalid, and the system attempted to process a high volume of these invalid requests concurrently.
This behavior resulted in unusually high CPU consumption: more than double the peak levels previously recorded, including those experienced during high-traffic periods such as Giving Tuesday and National Volunteer Month combined.
Once the root cause was identified, a corrective script was run on the database to artificially validate the invalid responses. Once this change propagated, system performance returned to normal. The issue was isolated to clients hosted on the same instance as the initiating customer.
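For illustration only, the failure mode described above (a burst of invalid requests each reaching a costly processing step) can be sketched in Python. This is not CyberGrants code: the request shape, the action names, and the functions `is_valid`, `expensive_process`, and `handle` are all hypothetical. The sketch shows one common mitigation, which is to reject invalid requests with a cheap check before any expensive work begins.

```python
from dataclasses import dataclass

# Hypothetical allowlist of actions; the real set is not described in this report.
ALLOWED_ACTIONS = {"register", "donate", "volunteer"}

# Counter used only to show how often the costly path is reached.
calls = {"expensive": 0}


@dataclass
class Request:
    user_id: int
    action: str


def is_valid(req: Request) -> bool:
    """Cheap membership check performed before any costly work."""
    return req.action in ALLOWED_ACTIONS


def expensive_process(req: Request) -> str:
    """Stand-in for the CPU-heavy step (assumed, not the real implementation)."""
    calls["expensive"] += 1
    return f"processed {req.action} for user {req.user_id}"


def handle(req: Request) -> str:
    # Fail fast: an invalid request never reaches the costly step.
    if not is_valid(req):
        return "rejected: invalid action"
    return expensive_process(req)
```

With this ordering, a large wave of invalid requests costs one set lookup each instead of a full processing attempt.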

Prevention:

  • Load testing specific to these types of unstable queries has been added to our standard set of test practices
  • The customization for the initiating client has been amended to allow only valid database queries of the kind involved in this incident
  • All queries that have the potential to cause similar issues have been reviewed and, where appropriate and necessary, optimized
  • The initiating client has been advised of the standard practices pertaining to emails
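As a rough illustration of the load-testing item in the list above, the following Python sketch fires a burst of requests at a handler from a bounded thread pool and reports how long the burst took. It is a sketch under stated assumptions, not the test suite referred to in this report: `handle` is a hypothetical stand-in for the code under test, the action names are invented, and `run_burst` and its `max_workers` parameter are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(action: str) -> str:
    """Hypothetical handler under test; rejects anything not allowlisted."""
    if action not in {"valid_action"}:
        return "rejected"
    time.sleep(0.01)  # placeholder for real processing work
    return "ok"


def run_burst(handler, actions, max_workers=16):
    """Send all actions through the handler concurrently and time the burst."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(handler, actions))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    # Simulate a mass-email spike made up entirely of invalid requests.
    results, elapsed = run_burst(handle, ["invalid_action"] * 5000)
    print(f"{results.count('rejected')} rejected in {elapsed:.3f}s")
```

A test of this kind would typically assert that every invalid request is rejected and that the elapsed time stays within an agreed budget; the budget itself depends on the environment and is not specified here.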
Posted May 22, 2025 - 13:43 EDT

Resolved

This incident has been resolved.
Posted May 05, 2025 - 14:31 EDT

Monitoring

The performance issue has been addressed. We will continue to monitor to ensure there is no regression.
Posted May 05, 2025 - 12:37 EDT

Investigating

CyberGrants is currently experiencing an issue causing the application to respond very slowly and/or time out. We are actively reviewing the issue and working to resolve it as quickly as possible. Please monitor this page for further updates.
Posted May 05, 2025 - 11:45 EDT
This incident affected: CyberGrants.