
Posted
17 minutes ago, Zombra said:

Forgive a ground level question, but does the server selection screen automatically update when server status changes, or will I need to refresh it?  I'm leaving the client open on a 2nd monitor so I can see when we're back online 🙂

 

It occasionally polls the servers to check the status. You can also check the Server Status channel in the Discord. 

  • Thanks 1

Lockely's AE Tales:

H: The Rook's Gambit (Arc ID 49351), P: Best Left Buried (WIP)

Posted
On 1/24/2024 at 6:23 AM, General Idiot said:

Personally I distinctly remember getting lag spikes of the kind described here back on live. And given the discussion of large leagues I can't help but wonder if that barrier thing was part of what made incarnate trials lag so horribly when they first released too. BAF especially I remember having upwards of thirty seconds delay on some things because the server was just that far behind.

 

So it's possible this was an issue on live too and just not considered enough of a problem to invest time and resources into solving it for no direct return.

 

 

I remember the Rikti Warzone mass raids being the worst. Once we were up on the ship on Live, all those powers made the server play like stop-motion animation. It's significantly better now, but it can still obviously be improved. Hami was never as bad, but it was still a lag fest on Live, and the BAF was bad on the pulls, though after the first two bosses it wasn't so bad.

 

I also remember NCSoft constantly saying they were doing everything they could to address the issue, but they never really did.

 

It's actually nice to see it finally getting addressed and hopefully sorted. Thanks, team!

  • Thumbs Up 1
Posted

Any thoughts on if this will help alleviate the issues with the Auction House, Character Items, and Super Packs? Or if this will allow the removal of the timers put on claiming items from character items?

 

A while back, these became issues for some players: character items and some features (like super packs) would lock up entirely until the next reboot, often after placing items on the AH. I don't recall all the details (the posts are here in the forums), but some of the blame was placed on Excelsior's server woes. Changes were put in place to limit the speed of claiming items: attempting to claim faster than about one item per second would essentially lock up your ability to claim more items for around 5 seconds. And last week, for the first time in probably over a year, I had character items and super packs lock up for hours with the reply "Character or Account item claim failed, another request is pending."
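To illustrate the kind of throttle described above, here is a minimal sketch of a per-character claim limiter. The thresholds (one claim per second, a five-second lockout) come from the behaviour described in this post; the class and function names are hypothetical and this is not the actual server implementation.

```python
import time

# Hypothetical thresholds based on the behaviour described above;
# the real server-side values and mechanism may differ.
MIN_CLAIM_INTERVAL = 1.0   # seconds allowed between claims
LOCKOUT_DURATION = 5.0     # lockout applied after claiming too fast


class ClaimThrottle:
    """Sketch of a per-character claim throttle (illustrative only)."""

    def __init__(self):
        self.last_claim = float("-inf")
        self.locked_until = 0.0

    def try_claim(self, now=None):
        now = time.monotonic() if now is None else now
        if now < self.locked_until:
            return False  # still locked out, claim rejected
        if now - self.last_claim < MIN_CLAIM_INTERVAL:
            # Claimed too quickly: start the lockout window.
            self.locked_until = now + LOCKOUT_DURATION
            return False
        self.last_claim = now
        return True


if __name__ == "__main__":
    throttle = ClaimThrottle()
    print(throttle.try_claim(now=0.0))  # True  - first claim succeeds
    print(throttle.try_claim(now=0.5))  # False - too fast, triggers lockout
    print(throttle.try_claim(now=3.0))  # False - still inside the lockout
    print(throttle.try_claim(now=6.0))  # True  - lockout has expired
```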

Posted
On 1/23/2024 at 10:21 PM, Lunar Ronin said:

 

There weren't as many people or as much activity back on the live servers. I've found that people seriously misremember how small the live servers were in comparison to Excelsior and even Everlasting, likely skewed by modern MMORPGs. I've seen a few people over the years say that Infinity was a small server back on live. It was the third most populated server back then, just behind Freedom and Virtue. :classic_laugh:


So fan servers are more populated now than the official servers were just before the shutdown? Very impressive. What about right now vs. when the game first launched in the early 2000s?

  • City Council
Posted
7 hours ago, Tenebrose said:

Any thoughts on if this will help alleviate the issues with the Auction House, Character Items, and Super Packs? Or if this will allow the removal of the timers put on claiming items from character items?

 

A while back, these became issues for some players: character items and some features (like super packs) would lock up entirely until the next reboot, often after placing items on the AH. I don't recall all the details (the posts are here in the forums), but some of the blame was placed on Excelsior's server woes. Changes were put in place to limit the speed of claiming items: attempting to claim faster than about one item per second would essentially lock up your ability to claim more items for around 5 seconds. And last week, for the first time in probably over a year, I had character items and super packs lock up for hours with the reply "Character or Account item claim failed, another request is pending."

 

These databases are on a separate database host and wouldn't be impacted by this upgrade. If/when we upgrade that server, AccountServer (which handles the account-wide inventory and super packs) may see some performance uplift. It's possible we may look at potential improvements on the code side at some point in the near future as well.

  • Thanks 1
  • Thumbs Up 2

If you need help, please submit a support request here or use /petition in-game.

 

Got time to spare? Want to see Homecoming thrive? Consider volunteering as a Game Master!

Posted
12 hours ago, Cipher said:

It's possible we may look at potential improvements on the code side at some point in the near future as well.

 

If I may, I fully support you all in any back-end code improvements you make that would improve overall performance and fix long-standing issues with the game engine. To be honest, I'd happily give up new costume pieces, missions, and power sets while these more important tasks get done for the overall, long-term health and growth of the game. As always, thank you all for your efforts here.

  • Thumbs Up 1
  • 1 year later
Posted
On 1/23/2024 at 7:38 AM, Telephone said:

Hello! I'm Telephone, and you may remember me from previous technical posts such as

 

 

As many of you are likely aware, performance on Excelsior during peak times has been less than optimal for some time, and the recent increased player count since the license announcement has not improved things.

 

We've spent a lot of time over the last couple of years profiling the shards and even more over the last couple of weeks in an attempt to resolve the performance issues, but we're now at a point where we need to actually upgrade our hardware, or, to be more specific, our SQL host hardware.

 

 

The SQL Queue

 

The issues we are encountering are caused by what the City of Heroes™ server software calls the SQL Queue. Everything players do is ultimately committed to a back-end database. Under normal operation, most of these operations are both parallelized and asynchronous, and they commit quickly (within nanoseconds), but there are some operations which are much larger and take more time to perform.
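As a rough illustration of that pattern (not the actual server code), the sketch below fans small writes out across a worker pool so they complete almost immediately, while an occasional large operation simply occupies a worker for longer. All function names and timings are invented.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Toy stand-ins for database commits; timings are illustrative only.
def small_commit(op_id):
    time.sleep(0.001)          # a fast, routine write
    return f"op {op_id}: committed"

def large_commit(op_id):
    time.sleep(0.5)            # a bulk operation (e.g. saving a big roster)
    return f"op {op_id}: committed (large)"

if __name__ == "__main__":
    # Many small operations run in parallel and finish quickly; the one
    # large operation ties up a worker for longer without blocking the
    # rest of the queue.
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(small_commit, i) for i in range(20)]
        futures.append(pool.submit(large_commit, "bulk"))
        for f in futures:
            print(f.result())
```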

 

There are also certain operations which have stricter timing requirements to maintain database integrity; these operations often come with what is called a barrier. When a barrier occurs, the entire SQL queue must be drained before continuing, meaning our normal large pool of asynchronous operations has to stop and wait on the barrier.
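A minimal sketch of what a barrier does to such a queue, using invented operations and timings: everything already in flight must finish before the barrier runs, and nothing new starts until it completes.

```python
from concurrent.futures import ThreadPoolExecutor, wait
import time

# Illustrative sketch of a "barrier" in an asynchronous commit queue.
def normal_op(i):
    time.sleep(0.05)           # an ordinary parallel write
    return i

def barrier_op():
    time.sleep(0.2)            # an operation with strict ordering needs
    return "barrier done"

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=8) as pool:
        in_flight = [pool.submit(normal_op, i) for i in range(16)]

        # A barrier arrives: drain everything already queued...
        wait(in_flight)
        print("queue drained, running barrier")

        # ...run the barrier by itself...
        print(pool.submit(barrier_op).result())

        # ...and only then resume normal parallel work.
        resumed = [pool.submit(normal_op, i) for i in range(16, 24)]
        wait(resumed)
        print("resumed normal operation")
```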

 

One of the biggest barrier culprits (until a very recent fix by @Number Six) was the disbanding of a large league, which is why you may have noticed that, at the end of a Hamidon raid, the shard often seemed to lag for a while when the league was disbanded.

 

While we were able to find a solution to this barrier issue, there are other operations where the barrier can't be removed without significant rearchitecting. When the shard is very busy and there are a lot of other large operations taking place, a barrier can cause the entire shard to lag for several seconds, and this often gets into a vicious cycle where some of those other large operations may have their own barrier, or there may be database conflicts and the entire SQL operation is rolled back, rebuilt by the server, and sent to the database again with another barrier.
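The retry cycle described above can be sketched roughly like this; the conflict rate and delays are invented purely to illustrate how a rolled-back transaction gets rebuilt and resubmitted, paying the barrier cost again each time.

```python
import random
import time

# Sketch of the retry cycle: when a large transaction hits a conflict it
# is rolled back, rebuilt, and resubmitted, bringing its barrier (and its
# cost) with it again. Conflict probability and delays are invented.
def submit_with_barrier(attempt):
    """Pretend to drain the queue and commit; sometimes conflict."""
    time.sleep(0.1)                       # cost of draining + committing
    if random.random() < 0.5:             # invented conflict rate
        raise RuntimeError("database conflict, transaction rolled back")
    return f"committed on attempt {attempt}"

if __name__ == "__main__":
    random.seed(1)
    attempt = 1
    while True:
        try:
            print(submit_with_barrier(attempt))
            break
        except RuntimeError as err:
            print(f"attempt {attempt}: {err}; rebuilding and retrying")
            attempt += 1
```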

 

The queue does eventually drain, but it could be a period of many seconds or even minutes until everything finally unwinds. If it gets particularly bad, the shard may enter what's called Overload Protection, where (among other load-shedding measures) new logins are temporarily forced to queue, even though the shard has not reached its player limit.
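A simplified sketch of the kind of check Overload Protection implies: the thresholds and parameter names here are invented, and only illustrate the idea that login queueing can trigger on database backlog rather than the player cap.

```python
# Invented thresholds for illustration only; not the shard's real limits.
PLAYER_CAP = 2000              # hypothetical shard player limit
BACKLOG_THRESHOLD = 5000       # hypothetical SQL-queue depth that sheds load

def should_queue_login(current_players, sql_queue_depth):
    if current_players >= PLAYER_CAP:
        return True                         # normal cap-based login queue
    if sql_queue_depth > BACKLOG_THRESHOLD:
        return True                         # overload protection: shed load
    return False

if __name__ == "__main__":
    # The shard is well under its player cap, but the SQL queue is backed
    # up, so new logins are queued anyway.
    print(should_queue_login(current_players=1200, sql_queue_depth=8000))  # True
    print(should_queue_login(current_players=1200, sql_queue_depth=100))   # False
```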

 

 

Throw More Hardware At It

 

Homecoming has been in operation for nearly five years now, and in that time the size of our databases and their indices has grown significantly. Our existing primary North America SQL hosts (of which we have two) are legacy OVH Advance-3 servers, with Xeon D-2141 CPUs (8 cores, 16 threads, 2.2-2.7 GHz), 64 GiB of RAM, and two NVMe drives (in a mirrored configuration). Excelsior's database alone is over 100 GB in size, and during peak time sees many thousands of transactions per second, so we have simply outgrown the hardware we have been running on.

 

We're planning to upgrade both of them to the newest iteration of the OVH Advance-3, which is a Ryzen 5900X (12 cores, 24 threads, 3.8-4.7 GHz), 128 GiB of RAM, and four NVMe drives (in a RAID10 configuration). The main benefits we expect are that the doubling of RAM will hold many more indices in memory, and moving to four drives instead of two will double our I/O capacity.

 

We're upgrading the one that Excelsior (and Everlasting) are on first, and if that works well, we will upgrade the other one (hosting Torchbearer, Indomitable, and global services) next month.

 

 

Power Underwhelming?

 

It's possible (but unlikely) that even this upgrade would not be enough to resolve the issues on Excelsior. The most likely cause of this would be insufficient RAM; the newest model of OVH Advance-3 has a maximum of 128 GiB of RAM, so we would need to go up to an Advance-4 to get more RAM (this would also increase the number of cores, but at a lower clock speed). This would be a somewhat significant cost increase, but if it becomes necessary we will explore it.

 

There's also the possibility that the issue can't be resolved by more hardware; some of the SQL Queue problems are fundamental to the system design. We haven't stopped looking at fixes from the software side, and while some of the barrier operations must remain barriers, there are other potential fixes we can make to reduce load and database contention.

 

 

TL;DR

 

We believe our SQL hosts are no longer up to the task of handling our shards as they have grown, and need to upgrade their hardware.

 

We're planning to spend approximately $600 this month and $600 next month in one-time charges on upgrading our primary North America SQL hosts. Our ongoing costs will increase by about $250-$300 per month ($125-$150 per month per database host; the amount is a little difficult to calculate due to taxes and SQL licensing costs).

@Telephone and @Number Six

 

Are there any plans to upgrade the hardware further, or has the quality of the hardware been sufficient over the past two years?
