Page Delivery Issues Observed
Postmortem

What happened?

Between 11:47 AM CET and 11:59 AM CET, several Helix services failed and returned 429 (Too Many Requests). Customer website delivery was mostly unaffected (only 5 uncached pages returned a 504 error), but publishing content was not possible during this window.

How did it happen?

The I/O Runtime team reported that the invokers appear to have died, that there were issues in the ew1 cluster, and that traffic was redirected to the ue1 cluster.

What are we doing now?

We have asked the I/O Runtime team to take action so that this issue does not happen again.

Posted Nov 17, 2020 - 12:54 UTC

Resolved
This incident has been resolved.
Posted Nov 17, 2020 - 11:04 UTC
Investigating
We are observing issues that are affecting page delivery for Project Helix customers. The issue is under active investigation, and we are working to resolve it as quickly as possible.
Posted Nov 17, 2020 - 10:50 UTC