A client recently shared a scaling problem they faced with their tech infrastructure:
They had two services, where Service A would call a webhook on Service B whenever a failure event occurred.
One of their customers accidentally configured this trigger to fire on success events instead of failure events.
Since success events vastly outnumber failure events, Service B, which receives the webhook, was soon overloaded, with CPU usage close to its maximum.
They asked for my suggestions to overcome this problem.
The simplest and quickest solution that came to my mind was to add a queue between the two services to manage the load.
The queue can accept events from Service A at any input rate (anywhere from 10 req/s to 10,000 req/s), while on the output side it calls Service B's webhook at a fixed rate (for example, 50 req/s).
This ensures that Service B receives events at a fixed rate, keeping its load constant, while Service A remains free to send webhooks at whatever rate it wants!
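To make the idea concrete, here is a minimal sketch of the pattern using Python's standard-library queue and a throttled consumer thread. The names (`deliver_webhook`, `event_queue`, the 50 req/s rate) are my own illustrative assumptions, and the HTTP call to Service B is replaced with a stub; in production you would use a managed queue (e.g. SQS, RabbitMQ) rather than an in-process one.

```python
import queue
import threading
import time

# In-process stand-ins for the real components. event_queue plays the role
# of the buffer between the services; deliver_webhook stands in for the
# HTTP POST to Service B's webhook endpoint.
event_queue = queue.Queue()
delivered = []

def deliver_webhook(event):
    # Placeholder for the actual HTTP call to Service B.
    delivered.append(event)

def producer(n_events):
    # Service A enqueues events at whatever rate it likes; a sudden
    # burst is absorbed by the queue instead of hitting Service B.
    for i in range(n_events):
        event_queue.put(f"event-{i}")

def consumer(rate_per_sec, stop):
    # Drain the queue at a fixed rate so Service B sees constant load.
    interval = 1.0 / rate_per_sec
    while not stop.is_set() or not event_queue.empty():
        try:
            event = event_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        deliver_webhook(event)
        time.sleep(interval)  # throttle: at most rate_per_sec calls/sec

stop = threading.Event()
t = threading.Thread(target=consumer, args=(50, stop))
t.start()
producer(20)   # a burst of 20 events arrives all at once
stop.set()     # signal the consumer to exit once the queue is drained
t.join()
print(len(delivered))
```

Note the key property: the producer finishes instantly, but the consumer spaces deliveries by `1 / rate_per_sec` seconds, so Service B never sees more than the configured rate regardless of how bursty Service A is.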