Issue Summary:
On Sunday 5th October 2025, from approximately 04:28 to 05:48 ET, some CyberGrants users were unable to use the full functionality of the product. The issue recurred from 21:55 ET that evening until 08:47 ET the following day, Monday 6th October. Investigation found several very large reports running concurrently. Once these were stopped manually, the affected servers returned to their normal operational state and the issue was resolved.
Root Cause:
Some application servers became overloaded after a very large report, which takes well over an hour to complete, was requested repeatedly in rapid succession over a prolonged period. For comparison, the vast majority of other reports are less than a thousandth of its size. The report in question is an anomaly and consumes a significant amount of memory and CPU. Each new request arrived after only a fraction of the time the report takes to complete, so the runs accumulated and server after server became fully utilized. From the requester's point of view each request timed out, but every request was in fact still running in the background.
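As a rough illustration of this pile-up, the Python sketch below counts how many copies of a report are executing at once when new requests arrive faster than a run can finish. The 90-minute run time, 10-minute request interval, and 3-hour window are hypothetical figures chosen only for illustration, and `concurrent_runs` is a hypothetical helper, not part of the product.

```python
def concurrent_runs(run_minutes, interval_minutes, window_minutes):
    """Return, for each minute in the window, how many report runs are active.

    Assumes a new run starts every `interval_minutes` and that each run
    keeps executing for `run_minutes`, even if the requester has timed out.
    """
    starts = range(0, window_minutes, interval_minutes)
    return [
        sum(1 for s in starts if s <= t < s + run_minutes)
        for t in range(window_minutes)
    ]


# Hypothetical figures: a 90-minute report re-requested every 10 minutes
# over a 3-hour window.
load = concurrent_runs(run_minutes=90, interval_minutes=10, window_minutes=180)
peak = max(load)
print(f"peak concurrent runs: {peak}")
```

Under these assumed figures the number of simultaneous runs climbs until it reaches roughly the run time divided by the request interval, and stays there for as long as the requests continue.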
Prevention: