Perform a Rails find() and iterate over the resulting records in groups

module ActiveRecord
  class Base
    # This method lets you iterate over the results of a .find, in groups.
    # (Basically an interface to LIMIT and OFFSET.)
    # Anything you can pass as options to .find, you can pass here.
    # Example 1:
    #   Order.each_by(100, :conditions => { :cc_processed_at => nil }) do |order|
    #     # do stuff with order
    #   end
    # Example 2 (a two-argument block also receives the offset of the
    # current group, not a per-record index):
    #   Person.each_by(50, :order => 'name') do |person, offset|
    #     # do stuff with person and offset
    #   end
    # Pass :update => true in the options to print a message before each group is
    # fetched from the db.
    #
    # Author: Elliot Winkler <elliot.winkler@gmail.com>
    # Source: http://snippets.dzone.com/posts/show/5461
    def self.each_by(group_size, options={}, &blk)
      update = options.delete(:update) || false
      num_records = count(options.except(:from))
      return 0 if num_records == 0
      also_pass_offset = (blk.arity == 2)
      # Stop at the last valid offset so that an exact multiple of group_size
      # does not trigger an extra, empty query.
      0.step(num_records - 1, group_size) do |offset|
        find_options = { :offset => offset, :limit => group_size }.merge(options)
        if update
          if num_records == 1
            puts ">> Reading the only record."
          else
            start_offset = offset + 1
            end_offset   = offset + group_size
            end_offset   = num_records if num_records < end_offset
            puts ">> Reading records #{start_offset}-#{end_offset}."
          end
        end 
        find(:all, find_options).each do |record|
          also_pass_offset ? blk.call(record, offset) : blk.call(record)
        end
      end
      num_records
    end
  end
end
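
The paging idea behind each_by does not depend on Rails. As a minimal sketch in Python (the name batch_windows is hypothetical and not part of the snippet above), the offset and limit of each group can be computed up front, so that a record count that is an exact multiple of the group size does not produce an empty trailing group:

```python
def batch_windows(num_records, group_size):
    """Yield (offset, limit) pairs covering num_records in groups of group_size."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    # range() stops before num_records, so an exact multiple of group_size
    # does not produce an extra, empty window at the end.
    for offset in range(0, num_records, group_size):
        yield offset, min(group_size, num_records - offset)


if __name__ == "__main__":
    for offset, limit in batch_windows(250, 100):
        print("records %d-%d" % (offset + 1, offset + limit))
```

For 250 records in groups of 100 this yields the windows (0, 100), (100, 100) and (200, 50); each pair maps directly onto the :offset and :limit options of a find call.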

Simple File.find

# Simple File.find by c00lryguy
# Thanks to justinwr for adding what I forgot to do
# ------------------------------
# Usage:
#     * = wildcard in filename
#   File.find("E:\\") => All files in E:\ and its subdirs
#   File.find("E:\\Ruby", "*.rb") => All .rb files in E:\Ruby and its subdirs
#   File.find("E:\\", "*.rb", false) => All .rb files in E:\, but not in its subdirs
class File
  # The default pattern is "*" rather than "*.*" so that files without an
  # extension are matched too.
  def self.find(dir, filename="*", subdirs=true)
    parts = dir.split(/\\/)
    parts << "**" if subdirs
    Dir[File.join(parts, filename)]
  end
end
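
For comparison, here is a rough Python equivalent of the same idea, assuming the standard pathlib module. The function name find and its parameters simply mirror the Ruby snippet; unlike Dir[], this sketch filters out directories and returns only regular files.

```python
from pathlib import Path


def find(directory, pattern="*", subdirs=True):
    """Return sorted paths of files under directory whose names match pattern."""
    root = Path(directory)
    # rglob descends into subdirectories; glob stays in the top level.
    candidates = root.rglob(pattern) if subdirs else root.glob(pattern)
    return sorted(str(p) for p in candidates if p.is_file())
```

Usage follows the Ruby version: find("E:/Ruby", "*.rb") searches recursively, and find("E:/", "*.rb", subdirs=False) stays in the top-level directory.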

Creating a bucket in Amazon S3 through an irb session

1) Log into an irb session, and enter your S3 login details.
require 'rubygems'
require 'aws/s3'

  AWS::S3::Base.establish_connection!(
    :access_key_id     => 'REPLACE_ME',
    :secret_access_key => 'REPLACE_ME'
  )

output:
=> #<AWS::S3::Connection:0xb75e0594 @http=#<Net::HTTP s3.amazonaws.com:80 open=false>, @secret_access_key="", @options={:server=>"s3.amazonaws.com", :access_key_id=>"", :port=>80, :secret_access_key=>"", :persistent=>true}, @access_key_id="19S45GYAGWK8DC2B8VG2">

2) Browse the existing buckets.
AWS::S3::Service.buckets

output:
=> [#<AWS::S3::Bucket:0xb75cc850 @object_cache=[], @attributes={"name"=>"ogg.twitteraudio.com", "creation_date"=>Sat Apr 26 10:40:16 UTC 2008}>, #<AWS::S3::Bucket:0xb75cc83c @object_cache=[], @attributes={"name"=>"t1000", "creation_date"=>Fri Apr 25 21:35:21 UTC 2008}>, #<AWS::S3::Bucket:0xb75cc814 @object_cache=[], @attributes={"name"=>"t2000", "creation_date"=>Fri Apr 25 21:53:15 UTC 2008}>]

3) List the bucket names programmatically.
AWS::S3::Service.buckets.each {|b| puts b.name}

output:
ogg.twitteraudio.com
t1000
t2000


4) Add a new bucket called t3000.
AWS::S3::Bucket.create('t3000')

output:
=> true

5) Observe that creating the same bucket again doesn't cause an error.
AWS::S3::Bucket.create('t3000')

output:
=> true

6) View the buckets again.
AWS::S3::Service.buckets

output:
=> [#<AWS::S3::Bucket:0xb75cc850 @object_cache=[], @attributes={"name"=>"ogg.twitteraudio.com", "creation_date"=>Sat Apr 26 10:40:16 UTC 2008}>, #<AWS::S3::Bucket:0xb75cc83c @object_cache=[], @attributes={"name"=>"t1000", "creation_date"=>Fri Apr 25 21:35:21 UTC 2008}>, #<AWS::S3::Bucket:0xb75cc814 @object_cache=[], @attributes={"name"=>"t2000", "creation_date"=>Fri Apr 25 21:53:15 UTC 2008}>]

Note: You would expect t3000 to be listed here, but it didn't appear. I first suspected the bucket permissions; the update at the end of this snippet gives the actual reason.

7) Let's then look for bucket t3000.
t3000 = AWS::S3::Bucket.find('t3000')

output:
=> #<AWS::S3::Bucket:0xb76df724 @object_cache=[], @attributes={"prefix"=>nil, "name"=>"t3000", "marker"=>nil, "max_keys"=>1000, "is_truncated"=>false, "xmlns"=>"http://s3.amazonaws.com/doc/2006-03-01/"}>

8) Now that we've found the bucket let's upload a text file called works.txt.
file = "works.txt"

output:
=> "works.txt"

AWS::S3::S3Object.store(file, open(file), 't3000', :access => :public_read)

output:
=> #<AWS::S3::S3Object::Response:0x-608926458 200 OK>

9) Because the file was stored with :access => :public_read, it can be viewed at http://t3000.s3.amazonaws.com/works.txt

References:
http://amazon.rubyforge.org/
upload_to_s3 - Ruby S3 upload client [dzone.com]

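*update: 14:30 30 April 2008 *
The bucket t3000 didn't show up with the statement Service.buckets because the aws-s3 library caches the bucket list after the first call and I didn't force a reload. Passing :reload, as in Service.buckets(:reload), should bypass the cache (Bucket.objects(:reload) does the same for the objects inside a bucket).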

Reference: spatten design - Amazon S3, Ruby and Rails slides [spattendesign.com]
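
The stale listing in step 6 is typical of a memoized call. The following toy Python sketch is not the aws-s3 implementation; the class CachedListing and its reload flag are hypothetical names used only to illustrate why a cached list misses a bucket created after the first call, and why forcing a reload fixes it.

```python
class CachedListing:
    """Toy stand-in for a client that memoizes a remote listing (hypothetical)."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that performs the real request
        self._cache = None

    def buckets(self, reload=False):
        # Only hit the backend on first use or when a reload is requested.
        if reload or self._cache is None:
            self._cache = list(self._fetch())
        return self._cache


remote = ["t1000", "t2000"]              # pretend server-side state
listing = CachedListing(lambda: remote)
listing.buckets()                        # first call fills the cache
remote.append("t3000")                   # bucket created after the first listing
stale = listing.buckets()                # served from the cache, t3000 missing
fresh = listing.buckets(reload=True)     # refetched, t3000 present
```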

process email files like unix find

I call this program whitelist. It runs a command on each file in a list, but only when the file is an email whose From address appears in a whitelist (or, with -v, does not appear in it).

It's useful for maintaining whitelisted mailboxes and analysing mailboxes. With a few more tests it might be a generically useful tool.

#!/usr/bin/python
# Copyright (C) 2008 by Tapsell-Ferrier Limited

# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program; see the file COPYING.  If not, write to the
# Free Software Foundation, Inc.,   51 Franklin Street, Fifth Floor,
# Boston, MA  02110-1301  USA

import commands
import email.Parser
import sys
import re
import getopt
import os
import os.path

try:
    from email.utils import parseaddr
except ImportError:
    # Older Pythons without email.utils
    from rfc822 import parseaddr


def help():
    print """whitelist.py -h
whitelist.py [-v] [-f whitelist filename] command ; filelist [-]

Execute the specified command (which must be shell escaped if calling
from shell) on all the files in the filelist or, if - is present in
the filelist, read from stdin (like xargs) whenever the file is an
email that contains a from address specified in the whitelist.

Like xargs, or find, the command can include {} as a replacement token
for the matched filename.

The command can also be a header reference, for example:

  $FROM

will print the specified mail's From address.

Options:

 -v   specifies that the test is to be negated, executing the action if
      the file does NOT contain a from address in the whitelist.

 -f   specifies a whitelist, the default is $HOME/.addresses

For example:

 whitelist.py -f .wlist wc \{} \; maildir/cur/*

runs wc on each file in maildir/cur with a FROM address matching
something in the whitelist; or:

 find maildir/INBOX/cur -type f | whitelist.py -v mv \{} mailbox/TRASH/cur \; -

mv's all files in the INBOX with FROMs not matching the whitelist into
a TRASH folder.

  find maildir/Greylist/new -type f | whitelist.py -v \$TO \; -

displays the TO address of all messages where the from didn't match
the whitelist.
"""


def read_whitelisted(filename):
    fd = open(filename)
    data = fd.read()
    fd.close()
    return data.split()

def get_msg(filename):
    fd = open(filename)
    try:
        msg = email.Parser.HeaderParser().parse(fd, True)
        return msg
    finally:
        fd.close()

action_re = re.compile("\{}")

def handle(filenames_fn, action, whitelist, negate=False):
    for filename in filenames_fn():
        msg = get_msg(filename)
        realname, addr = parseaddr(msg["from"])
        result = addr in whitelist

        if negate:
            result = not result

        if result:
            try:
                m = re.match("\$(.+)", action)
                result = msg[m.group(1)]
            except Exception:
                cmd_str = action_re.sub(filename, action)
                os.system(cmd_str)
            else:
                print result


def main(args):
    negate = False
    whitelist_filename = os.path.join(os.environ["HOME"], ".addresses")
    opts, args = getopt.getopt(args, "hvf:")
    for o,a in opts:
        if o == "-h":
            help()
            sys.exit(0)

        elif o == "-v":
            negate = True

        elif o == "-f":
            whitelist_filename = a

    if not os.access(whitelist_filename, os.F_OK):
        print >>sys.stderr, "whitelist.py   -  no whitelist filename\n"
        help()
        sys.exit(1)

    cmdstr = " ".join(args)
    m = re.match("(.*) ;([ ]*.*)", cmdstr)
    if not m:
        sys.exit(1)

    cmd = m.group(1)
    files = m.group(2).strip().split(" ")

    def ffn():
        for f in files:
            if f == "-":
                for innerf in sys.stdin:
                    yield innerf.strip()
            else:
                yield f
        return

    whitelist = read_whitelisted(whitelist_filename)
    handle(ffn, cmd, whitelist, negate)


if __name__ == "__main__":
    main(sys.argv[1:])

# End
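
The script above targets Python 2. As a hedged sketch of just its core test in Python 3, assuming the standard email package, the helper names sender_address and is_whitelisted below are hypothetical and not part of the original program. They parse only the headers of a message, extract the bare From address, and check it against the whitelist, optionally negated as with -v.

```python
from email.parser import HeaderParser
from email.utils import parseaddr


def sender_address(raw_message):
    """Return the bare address from the From header of a raw message string."""
    msg = HeaderParser().parsestr(raw_message, headersonly=True)
    _realname, addr = parseaddr(msg.get("From", ""))
    return addr


def is_whitelisted(raw_message, whitelist, negate=False):
    """True when the sender is in the whitelist (or is not, if negate is set)."""
    result = sender_address(raw_message) in set(whitelist)
    return not result if negate else result
```

Like the original, the comparison is an exact string match, so addresses that differ only in letter case are treated as different senders.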

Parsing Errors on Command Line

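Print PHP parse errors on the command line by running the PHP linter (php -l) on every .php file under the current directory. The \! negates the -exec test, so find also prints the name of each file that fails the check. Useful for those extreme cases where you can't get errors to print to the browser.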

find . -name \*.php \! -exec php -l {} \;

Shell script to recursively find files with the same name and replace text within each of them

This shell script recursively finds every file with a given name and replaces text within each of them. FIND is interpolated into a Perl regular expression, so escape any regex metacharacters (and any /) that it contains.

FILE="filename.txt"
FIND="old text"
REPLACE="new text"
find . -name "$FILE" -print0 | xargs -0p perl -pi -w -e "s/$FIND/$REPLACE/g;"


Remove the p flag from xargs if confirmation is not necessary.
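
The same job can be sketched in Python, under the assumption that the files are UTF-8 text. The function name replace_in_named_files is hypothetical. Unlike the Perl one-liner, this version treats the search text literally rather than as a regular expression, and it does not ask for confirmation.

```python
import os


def replace_in_named_files(root, filename, old, new):
    """Replace old with new in every file called filename below root.

    Returns the list of paths that were changed.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename not in filenames:
            continue
        path = os.path.join(dirpath, filename)
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
        updated = text.replace(old, new)  # literal match, not a regex
        if updated != text:
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(updated)
            changed.append(path)
    return changed
```

A call such as replace_in_named_files(".", "filename.txt", "old text", "new text") corresponds to the variables used in the shell script.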

find and replace text from the shell

Snagged from http://snippets.dzone.com/posts/show/116

find . -name '*.txt' -print0 |xargs -0 perl -pi -e 's/find/replace/g'

find line in text file, add to another text file

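REM Batch: search a .txt / .csv / etc file for the first argument (case-insensitive) and append the matching lines to another .txt / .csv / etc file.
REM The second find skips the "---------- FOO.CSV" header line that find prints when it is given a file name.
find /i "%1" foo.csv | find /i "%1" >> bar.txt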
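
For readers not on Windows, the same filter can be sketched in Python. The function name append_matches is hypothetical and the files are assumed to be UTF-8 text. It copies every line that contains the search text, ignoring case, onto the end of the target file, which is what the /i flag and the >> redirection do in the batch line.

```python
def append_matches(needle, source_path, target_path):
    """Append lines of source_path containing needle (case-insensitive) to target_path.

    Returns the number of lines appended.
    """
    needle = needle.lower()
    count = 0
    with open(source_path, encoding="utf-8") as source, \
            open(target_path, "a", encoding="utf-8") as target:
        for line in source:
            if needle in line.lower():
                # Make sure a final line without a newline still ends cleanly.
                target.write(line if line.endswith("\n") else line + "\n")
                count += 1
    return count
```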

Recursively find files by filename pattern.

Scans a directory and all of its subdirectories for files whose names match a regular expression. Each match is passed to the callback provided as the third argument. A simple example:

function my_handler($filename) {
  echo $filename . "\n";
}
find_files('c:/', '/\.php$/', 'my_handler');


And the actual snippet

function find_files($path, $pattern, $callback) {
  // Normalise to forward slashes with exactly one trailing slash.
  $path = rtrim(str_replace("\\", "/", $path), '/') . '/';
  $dir = dir($path);
  if ($dir === false) {
    return; // unreadable directory: skip it
  }
  // Read all entries first so the handle is closed before recursing.
  $entries = array();
  while (false !== ($entry = $dir->read())) {
    $entries[] = $entry;
  }
  $dir->close();
  foreach ($entries as $entry) {
    $fullname = $path . $entry;
    if ($entry != '.' && $entry != '..' && is_dir($fullname)) {
      find_files($fullname, $pattern, $callback);
    } else if (is_file($fullname) && preg_match($pattern, $entry)) {
      call_user_func($callback, $fullname);
    }
  }
}
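
The same traversal with a callback can be sketched in Python, assuming the standard os.walk and re modules. The function keeps the name and argument order of the PHP snippet for easy comparison, but it is only an illustration and not a port of every detail; for instance, the pattern is a plain Python regex without the surrounding slashes.

```python
import os
import re


def find_files(path, pattern, callback):
    """Call callback(full_path) for each file below path whose name matches pattern."""
    regex = re.compile(pattern)
    # os.walk handles the recursion that the PHP version does by hand.
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in sorted(filenames):
            if regex.search(name):
                callback(os.path.join(dirpath, name))
```

A call such as find_files("c:/", r"\.php$", print) prints each matching path, much like my_handler in the example above.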