Does anybody have any code that can be used to monitor a Cred Forums board, including logging threads and hashing images, possibly storing images, preferably in a structured database?
I'm trying to get something together by tonight to monitor the Cred Forums board for the debate.
I could code something myself but I'm not sure I have enough time to put anything meaningful together.
That's a lot of stuff to just have lying around... best I've got is a program to save all of the images in a thread.
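For the image hashing and storing part of the request, something along these lines might be a starting point. It is only a sketch: `hash_image` and `save_if_new` are made-up helper names, SHA-256 is just one reasonable choice of hash, and it assumes the image bytes have already been downloaded by whatever saver you use.

```python
import hashlib
import os
import tempfile


def hash_image(data):
    """Return the SHA-256 hex digest of raw image bytes."""
    return hashlib.sha256(data).hexdigest()


def save_if_new(data, directory, ext=".jpg"):
    """Write the image under its hash; return (path, True) if it was new."""
    digest = hash_image(data)
    path = os.path.join(directory, digest + ext)
    if os.path.exists(path):
        return path, False  # identical bytes already stored, skip the write
    with open(path, "wb") as fh:
        fh.write(data)
    return path, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(save_if_new(b"fake image bytes", tmp))
        print(save_if_new(b"fake image bytes", tmp))  # duplicate, not rewritten
```

Naming files by their hash means reposts of the same image collapse into one file, which also helps with the storage worry.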
Juan Reyes
I'll see what I can do OP
Lincoln Bell
Ask NSA. I'm sure they'll be happy to help you.
Blake Phillips
I'm not going to do it, but I can tell you it's possible and how I would go about it. Make it a bash script. Use curl to fetch the pages and sed or another text filter to pull out the posts, then store them in files or a database. Pretty easy really.
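That fetch, filter, store pipeline could look roughly like this in Python instead of bash. Everything specific here is an assumption: the `postMessage` class name, the `m123`-style ids, the sample HTML, and the `posts` table are invented for illustration, and the fetch step is left out so the snippet runs offline.

```python
import re
import sqlite3

# Assumed markup: each post body sits in <blockquote class="postMessage" id="m123">.
POST_RE = re.compile(
    r'<blockquote class="postMessage" id="m(\d+)">(.*?)</blockquote>',
    re.DOTALL,
)


def extract_posts(html):
    """Regex stand-in for the sed step: return (post_id, text) pairs."""
    posts = []
    for post_id, body in POST_RE.findall(html):
        text = re.sub(r"<br\s*/?>", "\n", body)  # keep line breaks
        text = re.sub(r"<[^>]+>", "", text)      # drop remaining tags
        posts.append((int(post_id), text.strip()))
    return posts


def store_posts(conn, posts):
    """Insert posts, ignoring ones already saved on an earlier run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.executemany("INSERT OR IGNORE INTO posts VALUES (?, ?)", posts)
    conn.commit()


if __name__ == "__main__":
    sample = '<blockquote class="postMessage" id="m1">hello<br>world</blockquote>'
    conn = sqlite3.connect(":memory:")
    store_posts(conn, extract_posts(sample))
    print(conn.execute("SELECT * FROM posts").fetchall())
```

Regexes on HTML are brittle, same as sed would be, so expect to adjust the pattern once you have looked at the real page source.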
Zachary Brown
OP here... that would work awesome. I can add the ability to do anything else I need, that's a perfect starting point...
What language do you have it written in?
Grayson Diaz
It's not that I'm unable to do it myself, it's that I only have 2 hours to throw something together. A starting block of code would be nice to save me the work, but since I haven't got one yet I'm just coding in Python.
Isaiah Perez
>nano
Astute. My editor of choice for programming.
John Allen
The most annoying part of the project is going to be making the 4ch source code readable by a human so you can start to decompose it... It's one long f'in line with no breaks.
Readable? Computers don't care if it's readable. The issue is that Cred Forums doesn't have a resolvable URL, so that throws curl out the window. I think PHP could do it though.
Elijah Martinez
A computer doesn't, but I do. How the hell am I going to be able to write an algorithm to decompose source HTML if I don't know what the hell it's saying?
Nolan Adams
Dude, they've got a JSON API. Should be a CS 101 level weekend project. Afternoon project for someone who's got experience programming (fizzbuzz doesn't count).
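If the board does serve JSON, the parsing side shrinks to a few lines. A rough sketch, with the caveat that the payload shape used here (`posts`, `no`, `com`, `tim`, `ext`) is an assumption about what such an API returns and should be checked against the real docs. The `SAMPLE` string stands in for the HTTP response so this runs offline.

```python
import json

SAMPLE = """
{"posts": [
  {"no": 100, "com": "first post", "tim": 1475000000000, "ext": ".jpg"},
  {"no": 101, "com": "a reply"}
]}
"""


def parse_thread(raw):
    """Turn a thread's JSON text into a list of plain dicts."""
    rows = []
    for post in json.loads(raw).get("posts", []):
        has_image = "tim" in post and "ext" in post
        rows.append({
            "id": post["no"],
            "text": post.get("com", ""),  # image-only posts may have no text
            # Assumed convention: the stored file name is <tim><ext>.
            "image": f'{post["tim"]}{post["ext"]}' if has_image else None,
        })
    return rows


if __name__ == "__main__":
    for row in parse_thread(SAMPLE):
        print(row)
```

No HTML decomposition needed this way, which sidesteps the one-long-line problem entirely.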
Eli Moore
perfect. thanks for the help. just what I needed.
Levi Nguyen
Fuckin amazing. Thank you sir, you are a gentleman and a scholar.
Hudson Green
I'm going to spoon feed you a little since you're retarded.
Right click on the page and press "inspect element" inside a post.
Find the name of the unique elements (probably a class) and make your program search for those.
protip: you won't figure it out before the debate.
Maybe by the next debate....
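The inspect-element approach described above, finding the class that marks a post and searching for it, might be sketched with the standard library's HTML parser like so. The `postMessage` class and the `blockquote` tag are guesses standing in for whatever inspect element actually shows, and `PostCollector` is a hypothetical name.

```python
from html.parser import HTMLParser


class PostCollector(HTMLParser):
    """Collect the text inside every <blockquote> carrying a given class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.posts = []
        self._buffer = None  # None means we are currently outside a post

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "blockquote" and self.css_class in classes:
            self._buffer = []
        elif tag == "br" and self._buffer is not None:
            self._buffer.append("\n")

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "blockquote" and self._buffer is not None:
            self.posts.append("".join(self._buffer).strip())
            self._buffer = None


if __name__ == "__main__":
    parser = PostCollector("postMessage")
    parser.feed(
        '<div class="post"><blockquote class="postMessage">hi<br>there'
        "</blockquote></div>"
    )
    print(parser.posts)
```

This does not handle a blockquote nested inside another one, so treat it as a first pass rather than a complete scraper.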
Hunter Thomas
Poster, I'm gonna work on it. Email at [email protected], that's my phony email.
Owen Clark
>Does anybody have any code that can be used to monitor a Cred Forums board including logging threads and hashing images, possibly storing images, preferably in a structured database?
So you want to make Cred Forums archive no.21327322? You're gonna need a lot of storage space.
Grayson Clark
Cred Forums is always so obvious.
Levi King
I think you're actually serious. Is there really a difference between pretending to be retarded or actually being retarded?
Hunter Powell
(you)
Cooper Morales
Just storing the text doesn't take up much space, and most boards are pretty low volume. On even a modest home connection, an old P4 can pull down all the new messages, run metrics on them, compress the text using zlib, shove the messages & metrics into a local db, and sleep for a few minutes without breaking a sweat.
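The compress-and-store step from that description could be sketched like this with zlib and SQLite. The `messages` table, its columns, and the two metrics (byte length and word count) are placeholders picked for illustration, not anything the thread specifies.

```python
import sqlite3
import zlib


def open_db(path=":memory:"):
    """Open the database and make sure the (assumed) messages table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY, body BLOB, byte_len INTEGER, word_count INTEGER)"
    )
    return conn


def store_message(conn, post_id, text):
    """Compress one message with zlib and save it with simple metrics."""
    raw = text.encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO messages VALUES (?, ?, ?, ?)",
        (post_id, zlib.compress(raw), len(raw), len(text.split())),
    )
    conn.commit()


def load_message(conn, post_id):
    """Fetch a stored message and decompress it back to text."""
    (blob,) = conn.execute(
        "SELECT body FROM messages WHERE id = ?", (post_id,)
    ).fetchone()
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    conn = open_db()
    store_message(conn, 1, "hello hello hello world")
    print(load_message(conn, 1))
```

Very short posts can come out slightly larger after compression, so compressing per message mostly pays off on longer text.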
Daniel Flores
Is OP still here? file_get_contents is pulling the pages. Run it every minute or so, filter the results, and bam.
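Since OP is working in Python rather than PHP, the same fetch-every-minute idea might look like this. `fetch` is a rough counterpart of `file_get_contents` built on `urllib.request.urlopen`; `poll` and its parameters are hypothetical names, and no real board URL is assumed.

```python
import time
import urllib.request


def fetch(url):
    """Download a URL and return its body as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def poll(url, handle, fetch_fn=fetch, interval=60, rounds=None, sleep_fn=time.sleep):
    """Fetch url every `interval` seconds and pass changed pages to `handle`.

    rounds=None loops forever; a number stops after that many fetches.
    """
    last = None
    done = 0
    while rounds is None or done < rounds:
        page = fetch_fn(url)
        if page != last:  # only hand over pages that differ from the last one
            handle(page)
            last = page
        done += 1
        if rounds is None or done < rounds:
            sleep_fn(interval)
```

The `fetch_fn` and `sleep_fn` arguments exist only so the loop can be tried out without hitting the network or actually waiting; in real use you would call something like `poll(board_url, my_filter)` and put the filtering step in `my_filter`.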
Andrew White
...
Bentley Perez
You know there are multiple pieces of Cred Forums archiving software, right?
Jacob Brooks
Are you really the OP? You don't need a program for that, just a script.