Cred Forums board monitoring

Does anybody have any code that can be used to monitor a Cred Forums board, including logging threads, hashing images, and possibly storing images, preferably in a structured database?

I'm trying to get something together by tonight to monitor the Cred Forums board for the debate.

I could code something myself, but I'm not sure I have enough time to put anything meaningful together.

Anybody have anything that can help me out?
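The image-hashing half of the request is small enough to sketch up front: digest the raw bytes so duplicate images can be spotted across threads. A minimal sketch; nothing here is board-specific.

```python
import hashlib

def hash_image(data: bytes) -> str:
    """Hex SHA-256 digest of raw image bytes, for duplicate detection."""
    return hashlib.sha256(data).hexdigest()
```

Identical bytes always produce the same digest, so a re-post of the same file collides on the hash even if the filename changes.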

bamp

404

Nice try, NSA.

Try it.

>nano

That's a lot of stuff to just have lying around... best I've got is a program to save all of the images in a thread.

I'll see what I can do OP

Ask NSA. I'm sure they'll be happy to help you.

I'm not going to do it, but I can tell you it's possible and how I would go about it. Make it a bash script: use curl to fetch the pages and sed (or another text filter) to pull out the posts, then store them in files or a database. Pretty easy, really.
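That curl-plus-sed pipeline, sketched in Python (which OP says he's working in). The `blockquote` pattern is an assumption about the board's markup; check the real page source first.

```python
import re
import urllib.request

# Assumed markup: post bodies wrapped in <blockquote> tags.
POST_RE = re.compile(r"<blockquote[^>]*>(.*?)</blockquote>", re.DOTALL)

def fetch(url: str) -> str:
    """Python stand-in for `curl url`."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

def extract_posts(html: str) -> list[str]:
    """Python stand-in for the sed filter: pull out raw post bodies."""
    return POST_RE.findall(html)
```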

OP here... that would work awesome. I can add anything else I need on top of it; that's a perfect starting point...

What language do you have it written in?

It's not that I'm unable to do it myself, it's that I only have 2 hours to throw something together. A starting block of code would save me the work, but since I haven't got one yet I'm just coding it in Python.

>nano
Astute. Nano is my editor of choice for programming.

The most annoying part of the project is going to be making the 4ch source code human-readable so you can start to decompose it... It's one long f'in line with no breaks.
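Before reaching for an online tool, a one-liner gets you most of the way: break the single long line at tag boundaries.

```python
import re

def unflatten(html: str) -> str:
    """Insert a newline between back-to-back tags so the source is skimmable."""
    return re.sub(r">\s*<", ">\n<", html)
```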

>textfixer.com/html/uncompress-html-code.php

>those colors
>nano
>python

I hope this is a troll post

10/10 if so

Readable? Computers don't care if it's readable. The issue is that Cred Forums doesn't have a resolvable URL, so that throws curl out the window. I think PHP could do it, though.

A computer doesn't, but I do. How the hell am I going to write an algorithm to decompose the source HTML if I don't know what the hell it's saying?

Dude, they've got a JSON API. Should be a CS 101-level weekend project. An afternoon project for someone who's got programming experience (fizzbuzz doesn't count).
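If the board really does expose a 4chan-style JSON API (one document per thread with a "posts" array), there's no HTML to decompose at all. The field names below ("no", "com", "md5") follow that convention and are assumptions here.

```python
import json

def summarize_thread(raw: str) -> list[dict]:
    """Reduce a thread's JSON to post number, comment text, and image hash."""
    thread = json.loads(raw)
    return [
        {
            "no": p.get("no"),        # post number
            "com": p.get("com", ""),  # comment HTML
            "md5": p.get("md5"),      # base64 MD5 of the attached image, if any
        }
        for p in thread.get("posts", [])
    ]
```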

perfect. thanks for the help. just what I needed.

Fuckin amazing. Thank you sir, you are a gentleman and a scholar.

I'm going to spoon feed you a little since you're retarded.

Right-click inside a post and choose "Inspect Element".

Find the name of the unique elements (probably a class) and make your program search for those.
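The inspect-element advice, as stdlib code: walk the page and collect the text inside any tag carrying the target class. The class name you pass in is whatever inspect-element shows; "postMessage" below is just a guess. Deliberately naive: any end tag closes the current match, which is fine for flat post markup but not for nested tags.

```python
from html.parser import HTMLParser

class ClassScraper(HTMLParser):
    """Collect the text content of every element with a given class."""

    def __init__(self, cls: str):
        super().__init__()
        self.cls = cls
        self.inside = False
        self.hits: list[str] = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.cls in classes:
            self.inside = True
            self.hits.append("")

    def handle_endtag(self, tag):
        # Naive: any end tag ends the match. Fine for flat post markup.
        self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.hits[-1] += data
```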

Clearly you don't know anything about anything.

Use this you little babby:

scotch.io/tutorials/scraping-the-web-with-node-js

protip: you won't figure it out before the debate.

Maybe by the next debate....

I'm gonna work on it. Email me at [email protected] (that's my phony email).

>Does anybody have any code that can be used to monitor a Cred Forums board, including logging threads, hashing images, and possibly storing images, preferably in a structured database?
So you want to make Cred Forums archive no.21327322?
You're gonna need a lot of storage space.

Cred Forums is always so obvious.

I think you're actually serious. Is there really a difference between pretending to be retarded and actually being retarded?

(you)

Just storing the text doesn't take up much space, and most boards are pretty low-volume. Even on a modest home connection, an old P4 can pull down all the new messages, run metrics on them, compress the text with zlib, shove the messages and metrics into a local DB, and sleep for a few minutes without breaking a sweat.
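That pipeline on a toy scale: zlib-compress each message body and store it in a local SQLite database. The table layout is made up for the sketch.

```python
import sqlite3
import zlib

def store_post(db: sqlite3.Connection, no: int, text: str) -> None:
    """Compress a post body and upsert it by post number."""
    db.execute(
        "INSERT OR REPLACE INTO posts (no, body) VALUES (?, ?)",
        (no, zlib.compress(text.encode("utf-8"))),
    )

def load_post(db: sqlite3.Connection, no: int) -> str:
    """Fetch and decompress one post body."""
    row = db.execute("SELECT body FROM posts WHERE no = ?", (no,)).fetchone()
    return zlib.decompress(row[0]).decode("utf-8")

db = sqlite3.connect(":memory:")  # use a file path in real use
db.execute("CREATE TABLE posts (no INTEGER PRIMARY KEY, body BLOB)")
```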

Is OP still here? file_get_contents will pull the pages. Run it every minute or so, filter the results, and bam.
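The Python shape of that run-it-every-minute suggestion. The fetcher is passed in so the loop doesn't care whether file_get_contents, curl, or urllib sits underneath; the handle step is whatever filtering you bolt on.

```python
import time

def poll(fetcher, handle, interval=60, rounds=None):
    """Call fetcher(), hand the page to handle(), sleep, repeat.

    rounds=None loops forever; pass a number to stop (handy for testing).
    """
    n = 0
    while rounds is None or n < rounds:
        handle(fetcher())
        time.sleep(interval)
        n += 1
```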

...

You know there are already multiple Cred Forums archiving programs, right?

Are you really the OP? You don't need a program for that, just a script.