During the incident, customers experienced slow search performance. The issue affected all users of the Search platform and degraded the overall booking experience.
The incident was caused by a backend processing component entering a stalled state, which prevented tasks from being assigned and processed as expected. This resulted in search slowness and incomplete responses for some requests. The degraded state persisted for several hours without triggering automated detection.
The issue was resolved when a scheduled service restart script ran, restoring normal operations.
To mitigate the risk of recurrence, Traveltek is implementing the following measures:
Automation: Restart the backend service automatically using scheduled or condition-based triggers, so that a stalled service recovers without manual intervention.
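
As an illustration only, a condition-based trigger could be sketched as a small watchdog that restarts the service when no task has been assigned for a set period. This is a minimal sketch under stated assumptions, not Traveltek's implementation: the class name `StallWatchdog`, the `restart` hook, and the 300-second threshold are all hypothetical.

```python
import time
from typing import Callable


class StallWatchdog:
    """Triggers a restart when no task progress is seen for too long.

    All names and values here are illustrative assumptions.
    """

    def __init__(
        self,
        restart: Callable[[], None],           # hypothetical hook that restarts the service
        stall_after_s: float = 300.0,           # assumed threshold, not taken from the report
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.restart = restart
        self.stall_after_s = stall_after_s
        self.clock = clock
        self.last_progress = clock()
        self.restarts = 0

    def record_progress(self) -> None:
        """Call whenever the service assigns or completes a task."""
        self.last_progress = self.clock()

    def check(self) -> bool:
        """Restart the service if it looks stalled; return True if a restart ran."""
        idle_for = self.clock() - self.last_progress
        if idle_for < self.stall_after_s:
            return False
        self.restart()
        self.restarts += 1
        # Reset the timer so the service gets a full window to recover
        # before another restart is attempted.
        self.last_progress = self.clock()
        return True
```

In a setup like this, the service would call `record_progress()` each time a task is assigned, and a scheduler would call `check()` at a regular interval. The restart hook itself (for example, a call into the existing restart script) is left to the caller.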