Oubliette_Red Posted June 27, 2019 PK, are the parsed files being reported properly? The last couple of runs I've done have said "Now processing file # 0 of 1." or "# 0 of 3". It seems to continue as expected and creates the .warcskip file(s). [Quoted post: CHEATER CHEATER PUMPKIN EATER! Heh, that's fine, it all goes to the cause. And we're in this for the long haul. I just wonder what it would take to get others to help, because really, the program takes VERY little resources, runs in the background, and doesn't really interfere with anything (unless they come across an error). Speaking of which, I'll get that patched up shortly. Also, with the Viewer now, you don't really have to report the files that you processed. Hell, I probably should do away with creating those WARCSKIP files too, since I'm also using the database check, it's kind of redundant...] If it's removing the files anyway, it probably doesn't need the .warcskip files. But I throw them back into the archive folder and use them as placeholders for what I've already parsed. Dislike certain sounds? Silence/Modify specific sounds. Looking for modified whole powerset sfx? Check out Michiyo's modder or Solerverse's thread. Got a punny character? You should share it.
_NOPE_ (Author) Posted June 27, 2019 Yes, they should be. Let me know if your DateTimeStamps in the Viewer don't show up properly. :P Also, new Parser just pushed out, no more WARCSKIP files! The bug you found was fixed! I'm out.
justicebeliever Posted June 27, 2019 I was all set to help until I realized that the file is 219 GB, and I would barely have room to store it compressed, much less unzip it... "The opposite of a fact is falsehood, but the opposite of one profound truth may very well be another profound truth." - Niels Bohr Global Handle: @JusticeBeliever ... Home servers on Live: Guardian ... Playing on: Everlasting
Oubliette_Red Posted June 27, 2019 Hmmm... parser is hung up on a file, not progressing and no error. "boards.cityofheroes.com-threads-range-15130-20120904-150600, line 29072 of 313737" Pulled the other files out and reran the same file. Got the following error:
Zep Posted June 27, 2019 Going to bed - torrent is 99.9% downloaded - let it sit overnight. Get it going tomorrow. ** Asus TUF x670E Gaming, Ryzen 7950x, AIO Corsair H150i Elite, TridentZ 192GB DDR5 6400, Sapphire 7900XTX, 48" 4K Samsung 3d & 56" 4k UHD, NVME Sabrent Rocket 2TB, MP600 Pro 8tb, MP700 2 TB. HDD Seagate 12TB ** ** Corsair Voyager a1600 **
Amatyr Posted June 27, 2019 [Quoted post: And when it reached 99% it reported that it was stalled. Although it also reported that 219G was dl'd, which was the size of the file. I think it was just a reporting issue and that it actually completed, as the archive folder reported the same size.] My 99.9% (and stalled at this point) downloaded 219 GB tar file is definitely not extracting properly :( No problem with disk space, but 7zip only reports 123 files (which is ~1.5 GB). Tried bash tar as well, no change. But I'm processing the files it will give me, and that's working. Excelsior Global Channel - for your server wide chat and forming TFs, Trials, Radios, Farms, whatever you want to do - /chan_join Excelsior today!
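For anyone who wants to check how much of a partial download is actually readable before trying to extract it, here is a minimal Python sketch using the standard `tarfile` module. The function name `count_readable_members` is made up for illustration, and the sketch only counts the top-level members of the tar, not the files inside any nested archives. A truncated file can occasionally end on a clean boundary, so a "clean end" result is a hint rather than proof that the download is complete.

```python
import tarfile


def count_readable_members(tar_path):
    """Return (count, clean_end) for the regular files readable in tar_path.

    clean_end is False when reading stopped on an error, which is what a
    truncated or damaged download usually produces.
    """
    count = 0
    try:
        # "r:*" lets tarfile detect plain, gzip, bz2 or xz archives.
        with tarfile.open(tar_path, "r:*") as archive:
            for member in archive:
                if member.isfile():
                    count += 1
    except (tarfile.TarError, EOFError) as exc:
        # Report what was readable before the archive broke off.
        print(f"Stopped early: {exc}")
        return count, False
    return count, True
```

Comparing the count against the number reported by someone with a known-good copy would show how much is missing.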
Zep Posted June 27, 2019 I'm still getting a real low download speed. I think there is a small bit that is only available from a really slow source. Maybe I'll try the direct download option.
_NOPE_ (Author) Posted June 27, 2019 [Quoting Oubliette_Red: Hmmm... parser is hung up on a file, not progressing and no error. "boards.cityofheroes.com-threads-range-15130-20120904-150600, line 29072 of 313737" Pulled the other files out and reran the same file. Got the following error:] Yeah, this is what I'm talking about when I talk about old, junk data. It's apparently just an HTML page with no content whatsoever, weird: I'm processing the rest of that file now, so don't worry about that one.
_NOPE_ (Author) Posted June 27, 2019 [Quoting Zep: Going to bed - torrent is 99.9% downloaded - let it sit overnight. Get it going tomorrow.] Yeah, it looks like the Internet Archive sat on this for a while and then stuck it at the bottom of its resource pile, since it's been around so long (and probably nobody cared about it until now!). Thanks for your efforts!
_NOPE_ (Author) Posted June 27, 2019 [Quoting Amatyr: And when it reached 99% it reported that it was stalled. Although it also reported that 219G was dl'd, which was the size of the file. I think it was just a reporting issue and that it actually completed, as the archive folder reported the same size. My 99.9% (and stalled at this point) downloaded 219 GB tar file is definitely not extracting properly :( No problem with disk space, but 7zip only reports 123 files (which is ~1.5 GB). Tried bash tar as well, no change. But I'm processing the files it will give me, and that's working.] Thanks for your efforts! You can try the direct download option if you want. I had both running, and the direct download completed faster, so I cancelled the torrent. Maybe the torrent is in fact missing a bit? Ugh.
_NOPE_ (Author) Posted June 27, 2019 [Quoting Zep: I'm still getting a real low download speed. I think there is a small bit that is only available from a really slow source. Maybe I'll try the direct download option.] Yes, try that!
_NOPE_ (Author) Posted June 27, 2019 I just published a new version. Let me know if that auto-stops your processing after the next file processes, like I intend it to. What I'm doing now is, if I run across an error in parsing (like, for example, a User Page like Red's error), I'm just throwing it into the "Pages" bucket. Then, sometime later, we can look at the Pages bucket and see if there's anything more we can do to salvage that data. If anyone has any better ideas than that, let me know. I kind of expected this to just be the first round of parsing anyway, where we got the bulk done, and then there'd be additional parsing as people request more and more specific queries, which would necessitate additional "cross-reference" tables, which in turn would necessitate additional parsing for that data.
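The "throw it in the Pages bucket" approach can be sketched in a few lines of Python. This is only an illustration of the idea and not the Parser's actual code: `parse_records` and the `parse_one` callback are hypothetical names, and the real Parser presumably writes to its database rather than to in-memory lists.

```python
def parse_records(records, parse_one):
    """Parse each record, routing anything that fails into a 'pages' bucket.

    parse_one is a stand-in for whatever function parses a single record.
    Returns (parsed, pages): pages holds (record, error message) pairs
    that can be revisited in a later pass.
    """
    parsed, pages = [], []
    for record in records:
        try:
            parsed.append(parse_one(record))
        except Exception as exc:
            # Do not stop the run; keep the raw record for later salvage.
            pages.append((record, str(exc)))
    return parsed, pages
```

Keeping the error message next to the raw record makes the later salvage pass easier, since failures can be grouped by cause.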
_NOPE_ (Author) Posted June 27, 2019 Aaaaaaand, we have a new king! Thanks Cipher!
_NOPE_ (Author) Posted June 27, 2019 By the way, I just figured out that since I didn't make it a single-instance application, you can have multiple instances running and process more faster: You don't HAVE to of course... just saying....
Amatyr Posted June 27, 2019 OK boss :o I stopped adding instances because, with the small number of files I have right now, they started trying to process the same one. Also, I started the web download; hopefully I get the full archive this time.
Cipher (City Council) Posted June 27, 2019 [Quoting Amatyr: OK boss :o I stopped adding instances because, with the small number of files I have right now, they started trying to process the same one. Also, I started the web download; hopefully I get the full archive this time.] Hopefully you have better luck with this than I have had. I had the same issue with the torrent, where the last 0.2% was completely unavailable, leaving only a small percentage of files recoverable from the tar. The direct download has now failed at 137/219 GB four times in a row after hours of trying, so it appears there's an issue with the direct download as well. If anyone manages to get the full archive and wants to upload or seed it, I can contribute more than just the 1,700 or so files that were recoverable. If you need help, please submit a support request here or use /petition in-game. Got time to spare? Want to see Homecoming thrive? Consider volunteering as a Game Master!
_NOPE_ (Author) Posted June 27, 2019 I've got the full archive; it succeeded for me last night. I'm in the process of decompressing it again. I plan to unzip it out of ALL of the different formats it's in, then rezip it into a single format, for convenience and to reduce the number of steps. I suppose I could host it myself; I've got infinite space on my shared host, though the speed might be questionable. 18869 files, for reference.
Amatyr Posted June 27, 2019 If you could set up a torrent that we know has the good data, after downloading it I'd be able to leave it seeding for a good while. The time to download is less of a problem if we know the end result will be good.
_NOPE_ (Author) Posted June 27, 2019 As soon as I get the whole thing unzipped the 12 times necessary into a flat folder, and then rezipped the ONE time using 7z (proven to be the highest compression rate), I'll set up a new torrent. I can even keep it running on my PC, for whatever that's worth. Maybe I could learn how to set up my server as a tracker, might have to investigate that...
_NOPE_ (Author) Posted June 27, 2019 FYI, I'm also claiming boards.cityofheroes.com-threads-range-15775-20120907-054529, which I think is the biggest file.
Oubliette_Red Posted June 27, 2019 [Quoting Amatyr: OK boss :o I stopped adding instances because, with the small number of files I have right now, they started trying to process the same one. Also, I started the web download; hopefully I get the full archive this time.] You can get around this by placing chunks of files in different folders and pointing each instance of the parser to a different folder.
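The folder split described above can be scripted instead of done by hand. Here is a minimal Python sketch, assuming the archive files sit loose in one folder; the function name `split_into_folders` and the `instance_01`, `instance_02`, ... folder names are invented for illustration and are not something the Parser expects.

```python
import shutil
from pathlib import Path


def split_into_folders(source_dir, instance_count):
    """Deal the files in source_dir round-robin into one subfolder per instance.

    Returns the list of subfolders so each parser instance can be pointed
    at its own folder.
    """
    source = Path(source_dir)
    # Collect files before creating subfolders so only loose files are moved.
    files = sorted(p for p in source.iterdir() if p.is_file())
    folders = []
    for i in range(instance_count):
        folder = source / f"instance_{i + 1:02d}"
        folder.mkdir(exist_ok=True)
        folders.append(folder)
    for index, path in enumerate(files):
        target = folders[index % instance_count] / path.name
        shutil.move(str(path), str(target))
    return folders
```

Dealing files round-robin keeps the folders roughly the same size in file count, though not necessarily in bytes, so one instance may still finish well before another.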
_NOPE_ (Author) Posted June 27, 2019 Sorry, is this too awkward? What would work better? I'm basically just trying to randomize things so everyone doesn't end up processing the same files and wasting processing time.
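The randomizing idea, combined with a check against what has already been parsed, might look something like this in Python. It is a guess at the general shape, not the Parser's real logic: `pick_next_file` is a hypothetical name, and `already_parsed` stands in for whatever the database check returns.

```python
import random


def pick_next_file(all_files, already_parsed, rng=random):
    """Pick a random file that has not been parsed yet, or None if none remain.

    already_parsed is assumed to be a set of file names known to be done.
    rng can be the random module or a random.Random instance.
    """
    remaining = [name for name in all_files if name not in already_parsed]
    if not remaining:
        return None
    return rng.choice(remaining)
```

With only a handful of files on disk, two instances will still collide often, which matches what was reported above; the random pick only helps once the pool of remaining files is large.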
Amatyr Posted June 27, 2019 Splitting the files off manually is a really good suggestion, and file clash is mostly an issue because of the small set of files it can select randomly from. Ultimately, I'd love the parser to handle multiple runs itself. Some option to just tell it to run 10 instances of itself and get on with it. Use my excessive modern computing resources!
Oubliette_Red Posted June 27, 2019 I just started running multiple instances to parse all of the strictly 20k files, so 20,202-20,999.
Oubliette_Red Posted June 27, 2019 [Quoting _NOPE_: Sorry, is this too awkward? What would work better? I'm basically just trying to randomize things so everyone doesn't end up processing the same files and wasting processing time.] I suppose we ( I ) could stick to a strictly numerical filename process to avoid stepping on each other's toes. My apologies if I started a poor trend. <.<
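If volunteers do claim files by number, picking out a claimed block can be automated. The Python sketch below assumes every file name follows the pattern seen earlier in the thread, "boards.cityofheroes.com-threads-range-15130-20120904-150600", and that the number after "range-" is the one being claimed. Both are assumptions, and `files_in_range` is a made-up helper name.

```python
import re

# Captures the number that follows "-threads-range-" in an archive file name.
RANGE_PATTERN = re.compile(r"-threads-range-(\d+)-")


def files_in_range(file_names, low, high):
    """Return the names whose range number falls within low..high inclusive."""
    selected = []
    for name in file_names:
        match = RANGE_PATTERN.search(name)
        if match and low <= int(match.group(1)) <= high:
            selected.append(name)
    return selected
```

For example, `files_in_range(names, 20202, 20999)` would return only the files in the 20k block mentioned above, and names that do not match the pattern are skipped rather than guessed at.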