Gregg Pollack included a great review of CloudCrowd in a recent episode of his show, Scaling Rails. CloudCrowd will still be Greek to truly non-technical readers, but if you have enough of a handle on software development to wish you understood “scaling” better, his review just might help.
Our latest release, Docsplit, is a command-line utility and Ruby library for splitting documents into distinct components such as raw text (which you need for searches), page thumbnails, and document metadata (details like the document’s author or the number of pages it contains).
Splitting documents apart is a key function for DocumentCloud: everything else DocumentCloud does depends on the presence of one or another of these pieces. Docsplit got a lot of attention when we released it on Monday, and we’re all looking forward to seeing what other folks do with it.