cforciea

Members
  • Posts

    10
Posts posted by cforciea

  1. Wait, is this the talks that TonyV said were happening with Titan Network in April, or the talks TonyV said were happening with Titan Network in August, or the talks that were happening when the game shut down, or maybe this guy from 2014 https://www.cohtitan.com/forum/index.php?topic=10284.0?

     

    How about you guys stop talking about this until it actually happens? Giving people potentially false hope in order to give yourself an extra air of legitimacy is just cruel.

     

    (edited to clear up name confusion)

    • Like 2
  2. We do have a replica, but it's using log shipping since we're running the cheap version of SQL server, so it's not online and available for queries. The replica is inside the secure part of the network in case we need to go active/active and run some shards there, which I believe was done this morning as the primary SQL box got maxed out a few times last night.

     

    You have a license that lets you go active-active and not active-passive? You can also read off of log-shipped replicas as long as they aren't currently in the middle of a restore operation, and restore operations are on a schedule. Repeatedly citing security is a red herring; you already have firewall pinholes going to the secure part of your network, and I'm willing to bet it's a lot easier to validate the security properties of a microservice using read-only db access to serve GET requests of effectively nonsensitive data than it is to be sure that NCSoft's auth server isn't going to do something evil.

     

    FWIW, the super-high priority list contains issues that are affecting people daily. Excelsior frequently has more than 2,000 connected -- it was nearly 2,500 last night when it started queuing people and it was running really close to the instability line. It's the same issue that took down Torchbearer yesterday -- at around 2,300 online with multiple MSRs going, dbserver couldn't keep up and the queue of deferred updates to send to the SQL server grew to over 60,000 before it became unresponsive. That caused many to get disconnected and stuck in the dreaded "character is still logging out" state. There's also a problem somewhere in the netlink code that can hard crash an entire shard if global services (chatserver / accountserver) go offline unexpectedly, but the bug only seems to show up when there are lots of people connected and getting a good crash dump is proving tricky.

     

    Running without a way for people to protect their own character data also affects them daily; the effect is just in the form of risk. People just tend to be bad at risk management. Even so, I'm curious what would happen if you sent out a survey asking your players which is more important: being able to play during peak hours with less time in login queues and fewer crashes, or providing mitigation against the risk of their characters all going away. I have a funny feeling that there is a very large segment that would choose the latter, but they don't understand the technology or the situation well enough to realize that it's something that could happen to them, and that there's a way to mitigate it. And thus they don't go asking for it.

     

  3. Add me to the list of people who would rather see them do it correctly than rush something through that is problematic or causes a problem for the servers. I agree with the devs' approach.

     

    There are lots of options here besides "provide a client-side system for autosaving characters with RSA signatures" and "make a tool that hoses the game servers." I'm in no way advocating for doing anything dangerous or poorly architected, regardless of any claims about what doing it "correctly" would look like.

  4. dbquery is the "brute force" method because it's the most direct method and the one that works now to produce the character in the text container format. But dbquery doesn't touch SQL directly, so indices and such don't matter. Instead it opens a connection to the dbserver just like a map starting up would do and asks for it from there.

     

     

    In testing we found that, for some as-yet-unknown reason, dbquery can also cause the dbserver to hiccup when it's run. That's really bad, since every map in a shard depends on dbserver being responsive not only to load/save characters but also to relay chat, mission updates, teams, moving between zones, etc. The dbservers are already struggling to handle more than 2,000 players online due to some design issues unrelated to the backend database, and a hiccup at the wrong time is catastrophic for a lot of people.

     

    I'll own up to misunderstanding what you meant by dbquery. I had assumed you either just meant generally a database query, or something like the bit of LINQ library that also generates database queries, not an existing executable. I also got off my butt and looked at the source for the tool other servers are running, and yep, that's what's going on. My bad.

     

    So that leaves either writing a tool to extract the data from SQL directly and format it the same way, or coming up with some other method. For security reasons the web tier shouldn't be allowed to touch the game database directly, and an extra replica for that would incur more license costs - at least until the pgsql conversion is done.

     

    I wouldn't particularly worry about "the same format". As long as the data is parseable, somebody can always write a tool later to munge it to however it needs to look, especially if you open source whatever model code you'd be using to serialize it.

     

    If the concern is just security, it'd be pretty trivial to throw up a lightweight API server using a framework in your language of choice, toss nginx in front of it, and limit requests to a specific incoming IP. Then your public-facing web server can just query that. The amount of code difference there is pretty negligible.
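    To make the idea above concrete, here is a minimal sketch of a read-only lookup service, using only the Python standard library. Everything specific in it is an assumption for illustration: the `/characters/<id>` path, the `ALLOWED_CLIENTS` address, and the in-memory `CHARACTERS` dict standing in for a read-only database query. It is not how Homecoming's servers are built.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical: the single address allowed to call this service
# (for example, the public-facing web server).
ALLOWED_CLIENTS = {"10.0.0.5"}

# Hypothetical stand-in for a read-only database lookup.
CHARACTERS = {"1001": {"name": "Example Hero", "level": 50}}


def handle_get(path, client_ip):
    """Return (status, body) for a GET request. Nothing here writes data."""
    if client_ip not in ALLOWED_CLIENTS:
        return 403, {"error": "forbidden"}
    prefix = "/characters/"
    if not path.startswith(prefix):
        return 404, {"error": "not found"}
    character = CHARACTERS.get(path[len(prefix):])
    if character is None:
        return 404, {"error": "not found"}
    return 200, character


class ExportHandler(BaseHTTPRequestHandler):
    """Only do_GET is defined, so other HTTP methods are rejected by default."""

    def do_GET(self):
        status, body = handle_get(self.path, self.client_address[0])
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(host="127.0.0.1", port=8080):
    """Build the server; call .serve_forever() on the result to run it."""
    return HTTPServer((host, port), ExportHandler)
```

    If nginx sits in front as a reverse proxy, the service only ever sees nginx's address, so in that setup the IP restriction would more likely live in the nginx config (its `allow` and `deny` directives) than in the application check shown here.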

     

    Actually using a replica would indeed generate more costs, but as I alluded to, I'd think you'd want a replica anyway. The alternative is all sorts of potential hilarity when you lose a drive on your db server and need to do a RAID rebuild, or a rollback if the db eats its fingers. If the goal is stability, that should be pretty high on your list regardless of any export functionality.

     

     

    There are also logistical issues that we've discussed the best way to handle. The website is the most obvious route. However, it's based on old forum software and at the risk of spoiling something that shouldn't be public yet, there are plans to replace it with something more modern for account management and other functions. It doesn't make a lot of sense to write something for the current site that will just have to be rewritten in a month or two.

     

    If you do your interesting logic in an API layer separate from the website to alleviate your existing security concern, then all you are rewriting is the ability to read and display the output from that backend API call.

     

    Web access may not be the best way to go for other reasons as well. On such an alt-heavy game, it could become quite burdensome to manually download your characters whenever you want to make a backup copy.

     

    One other approach we're considering (and the one I'm personally advocating for) is to modify the protocol so that the game client receives a complete copy of the character on logout and automatically saves it in a backup folder. This would be opt-in to avoid privacy issues if you're playing on somebody else's computer, but once enabled would allow players the peace of mind of knowing they always have a current backup. This also shifts the burden of producing the character dump to the mapserver, which already has a full copy anyway so there is zero additional load on the dbserver.
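    The client-side half of that proposal could look roughly like the sketch below. It assumes the character arrives as a plain dict and is written as JSON; the function name, the file naming, and the `opted_in` flag are all hypothetical, and the actual client and protocol are not shown in this thread.

```python
import json
from pathlib import Path


def save_backup(character, backup_dir, opted_in):
    """Write one character to the backup folder, but only if the player opted in."""
    if not opted_in:
        return None
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    # Assumes the name is safe to use as a filename; a real tool would sanitize it.
    target = backup_dir / f"{character['name']}.json"
    tmp = backup_dir / (target.name + ".tmp")
    tmp.write_text(json.dumps(character, indent=2), encoding="utf-8")
    # Rename over the old file so a crash mid-write never leaves a truncated backup.
    tmp.replace(target)
    return target
```

    Writing to a temporary file and renaming it means the previous backup stays intact until the new one is complete.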

     

    Finally there is talk of adding a cryptographic signature so that the character export can be verified as authentic. This would be useful both in a worst-case-scenario recovery contingency and for server operators who want to allow automated imports of verified characters. This would be coupled with adding a character UUID to prevent duplication.
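    As a rough illustration of signing plus a UUID, here is a sketch that uses HMAC-SHA256 from the Python standard library. Note the swap: HMAC is a symmetric stand-in, not the asymmetric (RSA-style) signature being discussed, so only someone holding the key could verify an export. `SERVER_KEY`, the record layout, and the function names are assumptions made for the example.

```python
import hashlib
import hmac
import json
import uuid

# Hypothetical server-side secret. With HMAC, verification needs this same key;
# a public-key signature would let anyone verify without it.
SERVER_KEY = b"replace-with-a-real-secret"


def _canonical(record):
    """Serialize deterministically so the same record always yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_export(character, key=SERVER_KEY):
    """Attach a UUID (if missing) and a MAC computed over the canonical form."""
    record = dict(character)
    record.setdefault("uuid", str(uuid.uuid4()))
    signature = hmac.new(key, _canonical(record), hashlib.sha256).hexdigest()
    return {"character": record, "signature": signature}


def verify_export(export, key=SERVER_KEY):
    """Recompute the MAC and compare in constant time."""
    expected = hmac.new(key, _canonical(export["character"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, export["signature"])


def import_once(export, seen_uuids, key=SERVER_KEY):
    """Accept a verified export only if its UUID has not been imported before."""
    if not verify_export(export, key):
        return False
    char_id = export["character"]["uuid"]
    if char_id in seen_uuids:
        return False
    seen_uuids.add(char_id)
    return True
```

    The UUID check is what blocks duplicates: a second import of the same signed file verifies fine but is refused because its UUID is already in `seen_uuids`.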

     

    Burdensome is substantially better than nonexistent. Don't let perfect be the enemy of good enough. It's perfectly possible to throw together something that gives us safety now and then still build those other, cooler things later.

     

    Besides, I bet if you create a crappy web interface, other people could throw together a tool to log in to your website, crawl through the character list, and scrape all of the characters without you having to build anything else.

     

    Right now the server team's development priorities are focused almost entirely on stability. There is an extensive rewrite of parts of dbserver ongoing to address its design issues preventing smooth scaling above 2,000 players per shard. There is also a plan for a potential PostgreSQL migration, as its design meshes well with the update-heavy workload and it could reduce licensing costs in the long term. The powers and frontend/UI people are working on different things in their areas of expertise, of course.

     

    TL;DR: Character exports are something we think is a good idea, and something we're planning to look at once we're comfortable with the stability of the servers. But we're not going to slap something together and throw thousands of players at it without being confident it will work - we'll do it right.

     

    Given how the centralized control model worked out for me in 2012, I tend to think that doing it "right" in this case is at least throwing together something simple so your players can manage their own character data before worrying about whether you can get more than 2000 players on a shard, but you are, of course, entitled to your own priorities.

     

  5. When is Homecoming going to support character to text file exports? There's no way I can possibly convince myself to play on a server where I'm a C&D away from losing all my characters again.

     

    That's something we've looked at and had internal discussions on. The short answer is that it's something we would like to be able to do; there are just some technical details that need to be worked out to be able to do it at scale. The brute force method (using dbquery) causes lag for other players when you use it on a server that's hosting more than a few hundred online.

     

    Huh? Relational databases are really good at pulling small amounts of data out of a bunch of tables using indices. I can't imagine what you mean by that being a brute force method. It's literally the primary job of an RDBMS. If you were feeling really iffy about it, just make a downstream replica of your db server and have the queries run there. I'm assuming you guys have one of those anyway, unless you like fun large-scale outages and rollbacks because your db server fell over.
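    For anyone who wants to see the indexed-lookup point in action, here is a small sketch using SQLite from the Python standard library. SQLite is only a stand-in for the SQL Server backend mentioned in this thread, and the `characters` and `powers` tables are a made-up schema, not the game's real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE characters (id INTEGER PRIMARY KEY, account_id INTEGER, name TEXT);
CREATE TABLE powers (id INTEGER PRIMARY KEY, character_id INTEGER, power_name TEXT);
CREATE INDEX idx_powers_character ON powers (character_id);
""")

# Populate 1,000 hypothetical characters with three powers each.
conn.executemany(
    "INSERT INTO characters VALUES (?, ?, ?)",
    [(i, i % 100, f"hero{i}") for i in range(1, 1001)],
)
conn.executemany(
    "INSERT INTO powers (character_id, power_name) VALUES (?, ?)",
    [(i, f"power{p}") for i in range(1, 1001) for p in range(3)],
)


def export_character(conn, character_id):
    """Pull one character's rows from both tables via primary key and index."""
    row = conn.execute(
        "SELECT name FROM characters WHERE id = ?", (character_id,)
    ).fetchone()
    if row is None:
        return None
    powers = [
        name
        for (name,) in conn.execute(
            "SELECT power_name FROM powers WHERE character_id = ? ORDER BY id",
            (character_id,),
        )
    ]
    return {"name": row[0], "powers": powers}


# Ask the planner how it would find one character's powers.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT power_name FROM powers WHERE character_id = ?",
    (42,),
).fetchall()
uses_index = any("idx_powers_character" in row[3] for row in plan)
```

    `uses_index` reports whether the planner chose the index rather than scanning the whole `powers` table, which is the behavior the post is relying on.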
