To give a quick update on what happened last night:
What happened
Just when I was taking a shower (of course), the account management service ran out of memory (for reasons yet to be determined). As soon as I got out, I saw that @molp had called me and I knew something serious was wrong. A quick reboot fixed the issue.
We discovered a few things that I will look into or have already looked into:
- Monitoring of said service leaves a bit to be desired. Not that it would have helped much in this particular situation, but still.
- The root cause was the service running out of memory, which might be related to changes introduced by the recent Spring Cleaning update.
- Ironically, it was almost certainly unrelated to the database update that happened earlier that day.
- Michi could have easily fixed the issue but didn't have access to the server. He should have had access, and that has already been sorted out by our ops partner.
Advice for similar incidents in the future
In the future, when there's an issue that affects all players and/or servers, don't bother opening a ticket via email. I only see those the next morning (on a workday) at the earliest. The same goes for a post on the forums these days (like this very thread). Due to my very constrained time budget, I tend to check the helpdesk only once a day and the forums a few more times during "breaks". But definitely not during off hours (at least not regularly).
Consequently, a mention on Discord is probably the quickest way to reach me (which is also why you should generally avoid mentioning me directly unless it's an ongoing conversation).
That said, under normal circumstances, we do have monitoring in place that should alert me if a service goes down. But can't hurt to give me a ping.