Posted

I posted some builds and origin stories years ago that I would now like to get back to.

** Asus TUF x670E Gaming, Ryzen 7950x, AIO Corsair H150i Elite, TridentZ 192GB DDR5 6400, Sapphire 7900XTX, 48" 4K Samsung 3d & 56" 4k UHD, NVME Sabrent Rocket 2TB, MP600 Pro 8tb, MP700 2 TB. HDD Seagate 12TB **


** Corsair Voyager a1600 **

Posted

Thank you both!

 

Any thoughts on how to get the search function to work?


Posted

You cannot. Think about it for a moment: the search functionality on an online forum depends on that forum's backend and database, neither of which is copied by the Wayback Machine. All it does is take static snapshots of the pages themselves.
Posted

Thank you.

 

Yes, I don't expect the old forum functions to work. Are there any other ways to search the old data?

I have some bios and builds posted that I would like to find again.


Posted

Hmmm... that gives me an idea. I was able to make a Dev Tracker by parsing XML, so I wonder if I could do something similar: a one-time program that would parse the Wayback Machine and create essentially a "database" of posts. Hmmmmm...

 


 

I'm out.
Posted
You absolutely could, yes.
Posted

You would be my mostest favorite person in the whole wide world for at least a week!

 

I am betting others would feel the same.

 

How much raw data would forums like that use? Out of curiosity. 


Posted

Just plain text? Probably a few megabytes. They say you could fit the Library of Congress on a 1.44MB floppy if it were plaintext. Graphics and such make that grow... exponentially.

Posted

Not to disparage the mighty floppy, but the wiki says the Library of Congress has over 38 million books. Some might just be coloring books (already filled in), but still, I don't see that fitting.

I am getting some old Apple IIs working... the 140k floppies have some serious storage then? :) I have 20 of them and they are even two-sided.

Let's wait till I get COH running on them (KIDDING).
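
For the curious, the floppy math can be checked in a few lines. This is only a back-of-the-envelope sketch: the 38 million figure is the one quoted above, the 1.44MB floppy is taken as 1440 KiB, and the Apple II disks are taken at a nominal 140 KB per side as stated in the post.

```python
# Back-of-the-envelope check of the floppy claim.
FLOPPY_BYTES = 1440 * 1024      # a "1.44MB" floppy holds 1,474,560 bytes
BOOKS = 38_000_000              # book count quoted in the post above

# How many bytes each book would get if they all shared one floppy.
bytes_per_book = FLOPPY_BYTES / BOOKS

# Twenty two-sided Apple II disks at a nominal 140 KB per side.
apple_total_kb = 20 * 2 * 140

print(f"{bytes_per_book:.3f} bytes per book on one floppy")
print(f"{apple_total_kb} KB across the Apple II disks")
```

Each book would get a small fraction of a single byte, so the claim does not hold even for very short books.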


Posted
I'd expect around a gigabyte, quite possibly more if the forum was particularly long-running or busy, such as the official forum for an eight-year run of a popular MMO.

Posted

I was just wondering if it would be possible/practical to grab all the plain text, load it into Word (or whatever), and search that. It's sounding less practical by the moment.


Posted
Oh hell no. Way too much stuff to grab manually. Some form of automation would be necessary to just grab the pages from the archive, and if you're doing that you might as well use a DOM parser to scrape the details of each post into a proper database.
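
To give a rough idea of what the DOM-parser route could look like, here is a minimal sketch using only Python's standard library. The markup is an assumption for illustration: it pretends each post sits in a `<div class="post" data-author="...">`, which is almost certainly not what the archived pages actually use, so the class name and attribute would need adjusting against a real snapshot.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect the text of every <div class="post"> on one archived page.

    The "post" class and the data-author attribute are hypothetical;
    a real archived forum page will have its own markup.
    """

    def __init__(self):
        super().__init__()
        self.posts = []       # finished posts: {"author": ..., "text": ...}
        self._depth = 0       # > 0 while inside a post <div>, counts nesting
        self._author = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attrs = dict(attrs)
        if self._depth:
            self._depth += 1  # a nested <div> inside the current post
        elif "post" in (attrs.get("class") or "").split():
            self._depth = 1
            self._author = attrs.get("data-author", "unknown")
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace left over from the markup.
                text = " ".join(" ".join(self._chunks).split())
                self.posts.append({"author": self._author, "text": text})

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html):
    """Return a list of {"author", "text"} dicts found in one page."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts
```

Running `extract_posts` over every downloaded page would yield rows ready to be loaded into whatever database ends up being used.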
Posted

That's what I'm doing now. About 25% done in 8 hours.

 

I can set up Hyper-V if you ever need me to run a VM to do work related to COH. I'd be happy to leave the machine running overnight, etc.


Posted

That's what I'm doing now. About 25% done in 8 hours.

I'm going to start giving you influence every hour on the hour for the rest of the week.

Face front, true believers!

Posted

Sorry, it looks like my algorithm was faulty. It's a "Microsoft Minute": every time the spider finds a new "folder" in the Wayback Machine, that's a whole lot of other files that weren't previously included in the calculations.

[Screenshot: zz5eWKw.png]

 

And this is just the first step of GETTING the data locally. After that, I still have to make a program to "parse" it into a database format that's searchable. So, it's coming... Soon™
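
As a sketch of what that second "parse it into a searchable database" step might look like, here is one possible shape using SQLite from Python's standard library. The table layout, the `build_index` and `search` names, and the plain `LIKE` matching are all assumptions for illustration, not a description of the actual tool being written.

```python
import sqlite3


def build_index(posts, path=":memory:"):
    """Store post records (dicts with author, url, body) in SQLite."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, author TEXT NOT NULL, "
        "url TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO posts (author, url, body) VALUES (:author, :url, :body)",
        posts,
    )
    conn.commit()
    return conn


def search(conn, term, author=None):
    """Return (author, url, body) rows whose body contains term.

    LIKE is case-insensitive for ASCII in SQLite by default; the
    wildcard characters in the term are escaped so they match literally.
    """
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    sql = "SELECT author, url, body FROM posts WHERE body LIKE ? ESCAPE '\\'"
    params = [f"%{escaped}%"]
    if author is not None:
        sql += " AND author = ?"
        params.append(author)
    sql += " ORDER BY id"
    return conn.execute(sql, params).fetchall()
```

Filtering by `author` is what would make remembering an old forum name useful: `search(conn, "build", author="SomeOldName")` narrows the results to one poster.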

Posted

Thank you so very much for helping with this!!!

 

I appreciate the update.

 

Now I just need to make sure I remember my old forum name :)


Posted

Update today:

 

11303 files (16.76%)

 

Note again the "Microsoft Minute" happening here. Even though I've downloaded an additional 2,000+ files, I've found even MORE files than I'd found before, so that reduces the total percentage. Honestly, this could take weeks. The spider has to go slow so it doesn't trigger a ban from the Wayback Machine.
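
The shrinking percentage is what you would expect from any spider that discovers its workload as it goes. Below is a minimal sketch of such a crawl loop, not the actual spider: `fetch` is a hypothetical function supplied by the caller that returns a page's content plus the links found on it, and the five-second delay is an arbitrary assumption, not a documented Wayback Machine limit.

```python
import time
from collections import deque


def crawl(start, fetch, delay_seconds=5.0, sleep=time.sleep, report=None):
    """Breadth-first crawl that pauses between requests.

    fetch(url) must return (content, links_found_on_that_page).
    report(done, known), if given, is called after every page, so
    done / known is the running "percent complete".
    """
    queue = deque([start])
    seen = {start}            # every URL discovered so far
    pages = {}                # URL -> downloaded content
    while queue:
        url = queue.popleft()
        content, links = fetch(url)
        pages[url] = content
        for link in links:
            if link not in seen:      # newly discovered work
                seen.add(link)
                queue.append(link)
        if report:
            report(len(pages), len(seen))
        if queue:
            sleep(delay_seconds)      # stay slow to avoid hammering the server
    return pages
```

Whenever one page links to many unseen pages, `known` jumps faster than `done`, so the reported percentage drops even though more files are on disk.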

Posted

Thanks!


Posted

Update:

 

16279 files (15.57%)

 

We're in this for the LONG haul.... every time I think we're making progress, the spider finds more files to index. This could take a LONG while, so don't expect anything any time soon on this.
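
To put numbers on that, the two updates can be turned back into implied totals. This is just arithmetic on the figures reported in the thread (11303 files at 16.76%, then 16279 files at 15.57%); since the percentages are rounded, the totals are approximate.

```python
def implied_total(downloaded, percent_done):
    """Total number of known files implied by a progress report."""
    return round(downloaded / (percent_done / 100))


first = implied_total(11303, 16.76)    # earlier update
second = implied_total(16279, 15.57)   # later update

newly_downloaded = 16279 - 11303
newly_discovered = second - first

print(f"known files: {first} -> {second}")
print(f"downloaded {newly_downloaded} more, discovered {newly_discovered} more")
```

Between the two updates the spider discovered several times more new files than it downloaded, which is why the percentage went down.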

Posted

Thank you for doing this just the same.

 

 


Posted

Don't thank me until I have RESULTS. Until then, it's all fluff and numbers....

 

 

... that should totally be a band name... "Fluff and Numbers".

