simplest wget/curl replacement with Perl ;-)

usage: lwp-get.pl your_url > output_file

    #!/usr/bin/perl
    use LWP::Simple;
    getprint($ARGV[0]) if $ARGV[0];

Auto download pdf from www.estado.com.br

This script downloads all PDFs from http://jpdf.estado.com.br.
Note: you must be a subscriber of this newspaper in order to download the PDFs.
You must have wget installed.

    #!/bin/bash
    export http_proxy="http://10.1.1.1:8000"

    # Value of the site's User cookie
    cookie='sMkjwKA67H8FDcsZX5'
    dd=$(date +%d)
    mm=$(date +%m)
    yyyy=$(date +%Y)

    index="http://jpdf.estado.com.br/menupdfi.php?E=SP&D=$dd/$mm/$yyyy&A=/estadopdf/sp/paginas/$yyyy/$mm/$dd/A01.pdf"

    rm -f index.txt
    ./wget/wget -nc -k -S -U Mozilla --proxy --header "Cookie: User=$cookie " -O index.txt "$index"
    if [ ! -f index.txt ]; then exit 1; fi

    # Collect the first-page path of every section listed in the index
    l=$(gawk 'BEGIN {FS="\""} /option VALUE="\/estadopdf/ { print $2 }' index.txt)

    for x in $l; do
        # Section code: file name without the 01.pdf suffix
        section=${x##*/}
        section=${section%01.pdf}
        # Skip the classifieds
        if [ "$section" = "Cl" ]; then continue; fi
        # Skip the Guide
        if [ "$section" = "Q" ]; then continue; fi

        y=http://jpdf.estado.com.br${x%01.pdf}
        i=1
        flag=0

        while [ $i -lt 40 ]; do
            filename=$(printf "%s%02d.pdf" "$y" "$i")

            echo "=================================================================="
            echo "$filename"
            echo "=================================================================="
            ./wget/wget -P estado -nc -k -S -U Mozilla --proxy --header "Cookie: User=$cookie " "$filename"
            # Count failed downloads; after two failures move on to the next section
            if [ $? -ne 0 ]; then
                flag=$((flag + 1))
                if [ $flag -gt 1 ]; then
                    echo "Next section..."; break
                fi
            fi

            sleep 1
            i=$((i + 1))
        done
    done
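
The section filter in the script works on the first-page path of each section (the part before 01.pdf). A minimal sketch of that step as two standalone helpers follows. It assumes the section code is the file-name prefix, with Cl for the classifieds and Q for the guide as in the script's comments; the function names section_code and skip_section and the example path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the section code of a first-page PDF path,
# e.g. /estadopdf/sp/paginas/2008/01/15/Cl01.pdf -> Cl
section_code() {
    name=${1##*/}                    # drop the directory part: Cl01.pdf
    printf '%s\n' "${name%01.pdf}"   # drop the 01.pdf suffix: Cl
}

# Hypothetical helper: succeed when the section should be skipped
# (assumed codes: Cl = classifieds, Q = guide).
skip_section() {
    case $(section_code "$1") in
        Cl|Q) return 0 ;;
        *)    return 1 ;;
    esac
}
```

Inside the loop this could be used as: if skip_section "$x"; then continue; fi. Note that the comparison is a string match (case or =), since -eq only compares integers.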

Searching through your Twitter archive

This Ruby code downloads previous Twitter entries as HTML files to a local directory.

    (1..5).each {|i| `wget --user=jrobertson --password=secret 'http://twitter.com/account/archive?page=#{i}'`}

Then, using grep -i <keyword> *, you can search for anything you've posted on Twitter in the past.

Note: Use Wget sparingly.
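
The grep step can be wrapped in a small shell function that lists only the names of the downloaded pages containing a keyword. This is a sketch, not part of the original snippet; the function name search_archive and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the downloaded pages that mention a keyword.
# Usage: search_archive KEYWORD [DIRECTORY]
search_archive() {
    keyword=$1
    dir=${2:-.}    # default to the current directory
    # -i ignore case, -l print only the names of matching files,
    # -F treat the keyword as a literal string, -- ends the options
    grep -ilF -- "$keyword" "$dir"/*
}
```

For example, search_archive wget . prints the name of every file in the current directory that mentions "wget" in any letter case.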

Wget's switch to preserve the filename

Download a file and preserve the filename. Without the -O switch, the next time the file is downloaded the new copy would be saved as projxmlbase.rb.1.

    wget -O projxmlbase.rb http://xml.mysqueak.info/p/projxmlbase.rb

Ruby equivalent of a simple Wget

Reads the file contents from a URL. I used this code to work around a problem with the Ruby RSS reader, which couldn't read the RSS file from digg.com because the website would not allow files to be downloaded without a User-Agent string being supplied.

    require 'open-uri'
    # URI.open is the open-uri entry point; a bare open() no longer accepts URLs in Ruby 3
    puts URI.open('http://digg.com/rss/index.xml',
         'User-Agent' => 'Ruby-Wget').read

using wget to download content protected by referer and cookies

1. Get the base URL and save its cookies to a file.
2. Get the protected content using the stored cookies.

    $ wget --cookies=on --keep-session-cookies --save-cookies=cookie.txt http://first_page
    $ wget --referer=http://first_page --cookies=on --load-cookies=cookie.txt --keep-session-cookies --save-cookies=cookie.txt http://second_page
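
The two steps can be combined into one shell function. This is only a sketch under a few assumptions: the name fetch_protected and its arguments are hypothetical, and --cookies=on is left out because cookie handling is on by default in wget.

```shell
#!/bin/sh
# Hypothetical wrapper around the two steps above.
# Usage: fetch_protected FIRST_URL SECOND_URL [COOKIE_FILE]
fetch_protected() {
    first=$1
    second=$2
    jar=${3:-cookie.txt}    # cookie file, cookie.txt as in the snippet
    # Step 1: visit the base page and store its cookies, session cookies included.
    wget --keep-session-cookies --save-cookies="$jar" "$first" || return
    # Step 2: request the protected page with the stored cookies and a Referer header.
    wget --referer="$first" --load-cookies="$jar" \
         --keep-session-cookies --save-cookies="$jar" "$second"
}
```

Called as fetch_protected http://first_page http://second_page, it stops after step 1 if the base page cannot be fetched.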

Download the pys60 tutorial

Many wonderful examples by Jürgen Scheible.
I exclude the zip, exe and pdf files to save the site's bandwidth.

    C:\>wget -kmnp -l3 -Rzip,exe,pdf http://www.mobilenin.com/pys60/menu.htm

You can just browse the site.

Making a backup of your del.icio.us bookmarks

    wget http://del.icio.us/api/posts/all --http-user=YOURUSERNAME --http-password=YOURPASSWORD

Use wget to display contents of a URL

    wget -qO - url

Mirror a site with wget

I create a batch file (w.bat) containing just
    wget -krmnp %1

Then I call

    C:\>w [what-ever-url]
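
On a Unix-like shell the same shortcut can be sketched as a function instead of a batch file. The function name w mirrors the batch file name; the usage check is an addition that is not in the original.

```shell
#!/bin/sh
# Shell counterpart of w.bat: mirror a site with wget.
# -k convert links for local viewing, -r recursive, -m mirror, -np never ascend to the parent directory
w() {
    if [ $# -ne 1 ]; then
        echo "usage: w URL" >&2
        return 2
    fi
    wget -krmnp "$1"
}
```

Placed in a shell startup file, it is called the same way as the batch version: w [what-ever-url].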