_NOPE_ Posted June 19, 2019 Aaaaaannnnnndddd... we're processing! Not sure how GOOD the data is yet though... I'm going to run it for a bit, and then dump it and inspect it... then adjust accordingly. But, it's a start. I'm out.
Healix Posted June 19, 2019 Forever grateful to be back in my city!
Jeneki Posted June 20, 2019 It's worth mentioning that there was at least one forum purge over the years. I remember it quite well, as it deleted several excellent humor posts from the Champion subforum. If you can't find the old post you are looking for, you might have to look at an older snapshot of the forum than the one taken just before shutdown.
Zep Posted June 20, 2019 Author Looks like you have things well under control. I was also thinking you could just Box/Google Drive/whatever me a zip file to uncompress, with a file to execute -- if it would be worth the time/hassle. ** Asus TUF x670E Gaming, Ryzen 7950x, AIO Corsair H150i Elite, TridentZ 192GB DDR5 6400, Sapphire 7900XTX, 48" 4K Samsung 3d & 56" 4k UHD, NVME Sabrent Rocket 2TB, MP600 Pro 8tb, MP700 2 TB. HDD Seagate 12TB ** ** Corsair Voyager a1600 **
_NOPE_ Posted June 20, 2019 I'm out.
Zep Posted June 20, 2019 Author YEAH!
Zep Posted June 22, 2019 Author Inquiry: When your glorious work is done how will we access the data? I would likely start by wanting to find my own posts - in many cases that would lead to wanting to read whole threads and lead to old guides etc.
_NOPE_ Posted June 24, 2019
Zep said: "Inquiry: When your glorious work is done how will we access the data? I would likely start by wanting to find my own posts - in many cases that would lead to wanting to read whole threads and lead to old guides etc."
Sorry, I was on a family trip over the weekend. I'm currently planning on a web interface similar to the Dev Tracker, but with a "Search Page". Now, just to let you know, I've stopped my parsing process for two reasons:
[*]I just wanted some sample data so that I can start constructing the front end and make sure that all looks good.
[*]I realized that I forgot about thread titles in my schema.... probably doesn't do much good to have a bunch of threads that are just known by their ThreadID. That'd be like having to navigate the internet by going to http://192.168.1.1 instead of http://www.google.com - namely, it would SUCK. So I need to parse the Titles and get them into a unique column so that they are displayable/searchable as well.
No ETA at this time. Trust me, this is a LOT of work! And I have to balance doing this against work/family/Playing CoH for Sanity. I appreciate your patience in advance. We'll get there, but it might just take a while. I'm out.
_NOPE_ Posted June 24, 2019 You know what? I just changed my mind, sorta. I'm going to start it off as a desktop application, because that's where I'm comfortable and I know I can crank that out WAY quicker. And then after I have that working, I might turn it into a web site... because there's a WHOLE lot I still have to learn about building modern, dynamic websites in code. I BARELY got the Dev Tracker working, to be honest, and I still haven't spent the time to figure out how to NOT make the thing load as one giant web page that takes forever to load. So, yeah... desktop app is where I'm going to start. I'm out.
Zep Posted June 24, 2019 Author I will never rush you on this. I hope no one else would. I do get curious and like to help. Thank you again for doing this as it is a large amount of work.
_NOPE_ Posted June 24, 2019 Not to get you TOO excited... but I'm beta-testing the basic interface now, and it seems functional. The resulting screens look like crap, but it's a start: I'm out.
chigiabelo Posted June 24, 2019 That looks very nice, PK! Good job!
_NOPE_ Posted June 24, 2019 Search by UserName (full or partial) is now a thing: I'm out.
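For anyone curious how a full-or-partial username search might look against a SQL table, here is a minimal sketch in Python with SQLite. It is not the actual app's code: the `users` table, the `username` column, and the function names are all assumptions made for illustration. The one detail worth noting is that `_` and `%` are wildcards in SQL `LIKE`, so names such as `_Zep_` or `_NOPE_` need escaping if they are meant to match literally.

```python
import sqlite3

ESCAPE_CHAR = "\\"  # a single backslash, used as the LIKE escape character


def escape_like(term: str) -> str:
    """Escape LIKE wildcards so '_' and '%' in a username match literally."""
    return (
        term.replace(ESCAPE_CHAR, ESCAPE_CHAR + ESCAPE_CHAR)
        .replace("%", ESCAPE_CHAR + "%")
        .replace("_", ESCAPE_CHAR + "_")
    )


def find_users(conn: sqlite3.Connection, term: str) -> list:
    """Return usernames containing `term` anywhere (full or partial match).

    Assumes a hypothetical table: users(username TEXT).
    """
    pattern = "%" + escape_like(term) + "%"
    sql = "SELECT username FROM users WHERE username LIKE ? ESCAPE '\\' ORDER BY username"
    return [row[0] for row in conn.execute(sql, (pattern,))]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO users VALUES (?)",
        [("_Zep_",), ("_NOPE_",), ("Zephyr",), ("Healix",)],
    )
    print(find_users(conn, "_Zep_"))
    print(find_users(conn, "zep"))
```

Without the escaping step, a search for `_Zep_` would also match any name that has one arbitrary character on each side of `Zep`, which is probably not what the searcher wants.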
Zep Posted June 24, 2019 Author You are awesome..... Find many for _Zep_ ? Just curious :) **NO RUSH** I am a slow leveler and have a lot of alts, so it will be a while before I can use any of my old builds anyway -- I'll probably need to do some adapting to newer conditions too. I am hoping to find one or more of my origin stories more so than builds - again no rush. :) I truly appreciate the time and work you are putting into this. I tried going through the Wayback Machine and found a total of (1) of my old posts before giving up. Thanks!!!!! You should get merits for this - HINT HINT HINT to the Devs.
_NOPE_ Posted June 24, 2019 Honestly, I haven't been looking for specific usernames, just test data to play with. Now that I think the system is MOSTLY "good enough", I have to change my parser to start uploading to my server instead of to my personal PC. Then, once the data is all on my server, I can modify my front-end app to point to that server instead of my PC, and I'll be ready for initial release. When I release this, I'm going to make a read-only user account for my SQL database and release the source code for both the parser and the front end, so that anyone with sufficient programming skills can improve on what I started, if they want to go to the Internet Archive and download their own copy of the CoH forums archive. The parsing has been ROUGH, there's a LOT of "junk" and corruption in a lot of the files that the Wayback Machine stored... I don't see any way around that, TBH... I'm out.
Obitus Posted June 24, 2019 This is great work. Thanks PK. I have a handful of my old posts saved via bookmark through the archived site, but it's extremely difficult to find specific posts/threads using that method.
_NOPE_ Posted June 24, 2019
Obitus said: "This is great work. Thanks PK. I have a handful of my old posts saved via bookmark through the archived site, but it's extremely difficult to find specific posts/threads using that method."
Speaking of the Wayback Machine... if for some reason the parsed HTML that's on my servers is bad, I've added a button that lets you go right to the Wayback Machine page for the specific Page/Post/Thread you're looking at in the program: I'm out.
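A button like that mostly needs a snapshot URL to open. As a rough sketch (not the program's actual code), the Wayback Machine addresses a capture as `https://web.archive.org/web/<timestamp>/<original URL>`, where the timestamp is `YYYYMMDDhhmmss`. The function name and the example URL below are placeholders made up for illustration.

```python
from datetime import datetime

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, captured_at: datetime) -> str:
    """Build a Wayback Machine link for a page as captured at a given time.

    The archive redirects to the nearest capture it holds if there is no
    snapshot at exactly this timestamp.
    """
    stamp = captured_at.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


if __name__ == "__main__":
    # Placeholder URL; the real forum URLs are not given in this thread.
    print(wayback_url("http://example.com/forum/thread/123", datetime(2012, 11, 30)))
```

Storing the original URL and capture time next to each parsed post would be enough to rebuild the link on demand.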
_NOPE_ Posted June 25, 2019 Status update: I've rewritten the parser to send data to my remote MySQL database instead of my local MSSQL database, and it's now starting to populate the ACTUAL tables that will be used by the front end: And this is still just from processing file #1 of.... 18,869. :o I'm out.
chigiabelo Posted June 25, 2019 How big is the MSSQL database, file size?
_NOPE_ Posted June 25, 2019
chigiabelo said: "How big is the MSSQL database, file size?"
Oh, I never ran the full parser on all of the files, I stopped the parser after I had a good enough sample size to work with... so I stopped at like file 26 or something like that. So, I have no clue how big this thing is gonna get. I'm out.
_NOPE_ Posted June 25, 2019 Made the changes to the reader to read from the remote MySQL database, looks like it's working! Sorry, no _Zep_ user yet, but then again, I'm only through parsing file #3 of 18,869: I'm out.
City Council Widower Posted June 25, 2019
Zep said: "You should get merits for this - HINT HINT HINT to the Dev's."
I can't give him in-game merits, but I can give him Widower's Medal for Exceptional Coolness.
"We need Widower. He's a drop of sanity in a bowl of chaos - very important." - Cipher
Are you also a drop of sanity in a bowl of chaos? Consider applying to be a Game Master!
_NOPE_ Posted June 25, 2019 FYI, I'm going to try to post a daily update on how the data upsert process is going. Here's what we've got so far, so you can see it'll be a while. But hey, at least I have the files themselves, and they aren't going anywhere (they're just on my PC), so the only thing slowing things down right now is my mandatory database checks to ensure that we're NOT adding the same data into the database twice... because for some reason this archive contains a LOT of duplicated data.... I'm out.
_NOPE_ Posted June 25, 2019 I'd also stopped and restarted the process, and didn't realize that it was restarting from SCRATCH every time, so I'm now making a "skip" file for every file that I process - if, when I come across a file, I find its "skip" brother/sister, I'm skipping processing the file. A poor man's resume "tool", but it lets me deal with issues with power/internet outages going forward. I'm out.
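The "skip" file idea above can be sketched in a few lines of Python. This is only an illustration of the technique, not the parser's real code: the `.skip` suffix, the function names, and the `parse` callback are assumptions. The key detail is that the marker is written only after a file has been processed successfully, so a crash or outage mid-file means that file is retried on the next run.

```python
from pathlib import Path
from typing import Callable, Iterable


def marker_for(path: Path) -> Path:
    """Return the sibling 'skip' marker for a data file (hypothetical naming)."""
    return path.with_name(path.name + ".skip")


def process_all(files: Iterable[Path], parse: Callable[[Path], None]) -> int:
    """Process every file that has no skip marker yet; return how many ran."""
    processed = 0
    for path in files:
        marker = marker_for(path)
        if marker.exists():
            continue  # already handled in an earlier run
        parse(path)     # may raise; in that case no marker is written
        marker.touch()  # record success so a restart skips this file
        processed += 1
    return processed
```

A typical call would pass a sorted list of archive files, for example `process_all(sorted(Path("archive").glob("*.html")), parse_file)`, where both the directory and `parse_file` are placeholders.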
chigiabelo Posted June 25, 2019
_NOPE_ said: "... the only thing slowing things right now is my mandatory database checks to ensure that we're NOT adding the same data into the database twice... because for some reason this archive contains a LOT of duplicated data..."
I've had a similar challenge importing Azure billing information into a SQL server. We import the current month daily, and there are usually slight variances in the last few days, so we reimport from the beginning of the month to the current date, which can be hundreds of megabytes late in the month due to our size. The method I have found that works faster than checking whether a record already exists is to set a unique constraint on the table, based on the fields that need to be unique, and then wrap the importing code in a try-catch. If a duplicate exists, the SQL server fails the attempt to import that one record, the app catches the exception and goes on to the next item.
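Here is a minimal sketch of that constraint-plus-try/catch approach, written in Python against an in-memory SQLite database so it runs anywhere. The `posts` table, its columns, and the choice of `(thread_id, post_id)` as the unique key are assumptions for illustration; on MySQL or MSSQL the same idea applies with that driver's own duplicate-key exception.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a hypothetical posts table with a uniqueness constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE posts (
            thread_id INTEGER NOT NULL,
            post_id   INTEGER NOT NULL,
            body      TEXT    NOT NULL,
            UNIQUE (thread_id, post_id)  -- the database enforces no duplicates
        )
        """
    )
    return conn


def import_posts(conn: sqlite3.Connection, rows) -> tuple:
    """Insert rows one by one; let the constraint reject duplicates.

    Returns (inserted, skipped). No SELECT-before-INSERT check is needed.
    """
    inserted = skipped = 0
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO posts (thread_id, post_id, body) VALUES (?, ?, ?)",
                row,
            )
            inserted += 1
        except sqlite3.IntegrityError:
            skipped += 1  # duplicate key: move on to the next record
    conn.commit()
    return inserted, skipped
```

The trade-off is that the first copy of a record wins; if later copies in the archive are cleaner, an upsert would be the variant to look at instead.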