How would I download files (video) with Python using wget and save them locally? There will be a bunch of files, so how do I know that one file is downloaded so as to automatically start downloding another one?
It is a twisted-based web crawler framework. Still under heavy development but it works already. Has many goodies:
Example code to extract information about all torrent files added today in the mininova torrent site, by using a XPath selector on the HTML returned:
class Torrent(ScrapedItem): pass class MininovaSpider(CrawlSpider): domain_name = 'mininova.org' start_urls = ['http://www.mininova.org/today'] rules = [Rule(RegexLinkExtractor(allow=['/tor/\d+']), 'parse_torrent')] def parse_torrent(self, response): x = HtmlXPathSelector(response) torrent = Torrent() torrent.url = response.url torrent.name = x.x("//h1/text()").extract() torrent.description = x.x("//div[@id='description']").extract() torrent.size = x.x("//div[@id='info-left']/p/text()").extract() return [torrent]