The Matrix.org homeserver [2] is experiencing a “major outage” [1.1]: the server is offline [1.3]. As of 2025-09-03T00:14Z, the timeline of events is as follows:
- On 2025-09-02T17:39Z, an issue with the matrix.org database was reported by status.matrix.org [1.3].
- At 2025-09-02T19:01:27Z, the Matrix.org Foundation reported that the matrix.org secondary database had lost its filesystem to a RAID failure at 2025-09-02T11:17Z, and that the primary database filesystem was then lost at 2025-09-02T17:26Z. [5]
- At 2025-09-02T21:39:25Z, the Matrix.org Foundation reported that it had been unable to restore the primary database filesystem to a state it was confident in running as a primary, so it is restoring a 55TB database snapshot from the previous night. The restoration is expected to take more than 10 hours to recover the data, then more than 4 hours to restore it, then more than 3 hours to catch up on missing traffic. [6]
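The three stated durations are lower bounds, so they give only an earliest possible completion time, not an estimate. The sketch below adds them up under two assumptions that the Foundation's post does not confirm: that the phases run strictly one after another, and that the clock starts at the moment the post was published. The names `announced`, `phase_minimum_hours`, and `earliest_completion` are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Publication time of the second Foundation post [6].
# Assumption: the restore began no earlier than this moment.
announced = datetime(2025, 9, 2, 21, 39, 25, tzinfo=timezone.utc)

# Lower bounds, in hours, quoted for each phase of the restore.
# Assumption: the phases are sequential and do not overlap.
phase_minimum_hours = {
    "recover snapshot data": 10,
    "restore database": 4,
    "catch up on missing traffic": 3,
}


def earliest_completion(start, minimum_hours):
    """Return the earliest time at which all phases could have finished."""
    total = timedelta(hours=sum(minimum_hours.values()))
    return start + total


if __name__ == "__main__":
    done = earliest_completion(announced, phase_minimum_hours)
    print(sum(phase_minimum_hours.values()), "hours minimum")
    print("No earlier than", done.strftime("%Y-%m-%dT%H:%M:%SZ"))
```

Under those assumptions the minimum total is 17 hours, which places the earliest possible recovery at 2025-09-03T14:39:25Z. Because every figure is quoted as "greater than", the real completion time can only be later.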
References
- Type: Website. Publisher: “Matrix.org”. Accessed: 2025-09-03T00:18Z. URI: https://status.matrix.org/.
- This is the “system status and incidents” website for The Matrix.org Homeserver [4].
- Type: Image.
- Type: Image.
- Type: Image.
- Type: Meta.
- On status.matrix.org, the site that tracks the system status and incidents of The Matrix.org Homeserver [3.1], the section “Matrix.org” carries the description “The Matrix deployment on matrix.org” [1.2]. Within that section, Synapse, which is the component reported to have the outage [1.1], is a Matrix homeserver [5].
- Type: Article. Title: “The Matrix.org Homeserver”. Publisher: “The Matrix.org Foundation”. Accessed: 2025-09-03T00:30Z. URI: https://matrix.org/homeserver/about/.
- Type: Text. Location: ¶3.
System status and incidents
- Type: Text. Location: ¶3.
- Type: Website. Title: “Servers”. Publisher: “The Matrix.org Foundation”. Accessed: 2025-09-03T00:38Z. URI: https://matrix.org/ecosystem/servers/.
- Type: Image.
- Type: Image.
- Type: Post>Text. Author: “The Matrix.org Foundation” (“@matrix@mastodon.matrix.org”). Publisher: “mastodon.social”. Published: 2025-09-02T19:01:27.000Z. Accessed: 2025-09-03T00:46Z. URI: https://mastodon.matrix.org/@matrix/115136245785561439.
So: the matrix.org database secondary lost its FS due to a RAID failure earlier today (11:17 UTC). Then, we lost the primary at 17:26. We’re trying to restore the primary DB FS (which could be fastish), while also doing a point-in-time backup restore from last night (which takes >10h). We believe the incremental DB traffic since last night is intact however. Apologies for the downtime; folks on their own homeserver are of course not impacted.
-
“FS” is presumed to mean “filesystem” from:
[…] the matrix.org database secondary lost its FS due to a RAID failure earlier today (11:17 UTC). Then, we lost the primary at 17:26. We’re trying to restore the primary DB FS […] [5]
combined with
[…] we haven’t been able to restore the DB primary filesystem to a state we’re confident in running as a primary […] [6]
-
“earlier today (11:17 UTC)” is assumed to mean 2025-09-02T11:17Z, as this post was published on 2025-09-02T19:01:27.000Z.
-
Since no other qualifying information is given, and since it follows 11:17 in the account, “17:26” is presumed to mean 2025-09-02T17:26Z.
-
[…] Then, we lost the primary at 17:26 […]
This is presumed to be referring to the primary database filesystem given the context:
[…] the matrix.org database secondary lost its FS due to a RAID failure earlier today (11:17 UTC). Then, we lost the primary at 17:26 […]
-
“DB” is presumed to mean “database” given the following context:
[…] the matrix.org database secondary lost its FS due to a RAID failure earlier today (11:17 UTC). Then, we lost the primary at 17:26. We’re trying to restore the primary DB FS […]
-
- Type: Post>Text. Author: “The Matrix.org Foundation” (“@matrix@mastodon.matrix.org”). Publisher: “mastodon.social”. Published: 2025-09-02T21:39:25.000Z. Accessed: 2025-09-03T00:50Z. URI: https://mastodon.matrix.org/@matrix/115136866878237078.
Sorry, but it’s bad news: we haven’t been able to restore the DB primary filesystem to a state we’re confident in running as a primary (especially given our experiences with slow-burning postgres db corruption). So we’re having to do a full 55TB DB snapshot restore from last night, which will take >10h to recover the data, and then >4h to actually restore, and then >3h to catch up on missing traffic. Huge apologies for the outage. Again, folks using their own homeservers are not impacted.
- “DB” is presumed to mean “database” [5].
Cross-posts: