
Simple S3 Utils - Copy Bucket To Bucket

12.30.2007
        Some simple S3 utils in Ruby

Written specifically to copy one bucket to another (e.g. to refresh a staging bucket from production for testing)

PLEASE BE CAREFUL WITH THIS - THERE IS CODE THAT DELETES ALL CONTENTS OF A BUCKET

Example:
a = AmazonS3Asset.new
a.copy_over_bucket("myapp_production", "myapp_staging")

require 'aws/s3'
require 'mechanize'

class AmazonS3Asset
  
  include AWS::S3
  S3ID = "your s3 id"
  S3KEY = "your s3 key"
  
  def initialize
    puts "connecting..."
    AWS::S3::Base.establish_connection!(
      :access_key_id     => S3ID,
      :secret_access_key => S3KEY
    )
  end

  def delete_key(bucket, key)
    if exists?(bucket, key) 
      S3Object.delete key, bucket
    end
  end
  
  def empty_bucket(bucket)
    bucket_keys(bucket).each do |k|
      puts "deleting #{k}"
      delete_key(bucket,k)
    end
  end
  
  def bucket_keys(bucket)
    b = Bucket.find(bucket)
    b.objects.collect {|o| o.key}
  end

  def copy_over_bucket(from_bucket, to_bucket)
    puts "Replacing #{to_bucket} with contents of #{from_bucket}"
    # empty to_bucket first
    empty_bucket(to_bucket)
    bucket_keys(from_bucket).each do |k|
      copy_between_buckets(from_bucket, to_bucket, k)
    end
  end
  
  def copy_between_buckets(from_bucket, to_bucket, from_key, to_key = nil)
    if exists?(from_bucket, from_key)
      to_key = from_key if to_key.nil?
      puts "Copying #{from_bucket}.#{from_key} to #{to_bucket}.#{to_key}"
      url = "http://s3.amazonaws.com/#{from_bucket}/#{from_key}"
      filename = download(url)
      store_file(to_bucket,to_key,filename)
      File.delete(filename)
      return "http://s3.amazonaws.com/#{to_bucket}/#{to_key}"
    else
      puts "#{from_bucket}.#{from_key} didn't exist"
      return nil
    end
  end

  def store_file(bucket, key, filename)
    puts "Storing #{filename} in #{bucket}.#{key}"
    # open with a block so the file handle is closed after the upload
    File.open(filename) do |file|
      S3Object.store(
        key,
        file,
        bucket,
        :access => :public_read
      )
    end
  end

  def download(url, save_as = nil)
    if save_as.nil?
      Dir.mkdir("amazon_s3_temp") if !File.exists?("amazon_s3_temp")
      save_as = File.join("amazon_s3_temp",File.basename(url))
    end
    begin
      puts "Saving #{url} to #{save_as}"
      agent = WWW::Mechanize.new {|a| a.log = Logger.new(STDERR) }
      img = agent.get(url)
      img.save_as(save_as)
      return save_as
    rescue
      raise "Failed on " + url + "  " + save_as
    end
  end

  def exists?(bucket,key)
    begin
      res = S3Object.find key, bucket
    rescue 
      res = nil
    end
    return !res.nil?
  end
      
end
    

Comments

Snippets Manager replied on Thu, 2010/08/12 - 10:14am

I don't think the copy_over_bucket method is going to work if the (source) bucket has more than 1,000 objects, because bucket_keys is not going to return all the keys:

def bucket_keys(bucket)
  b = Bucket.find(bucket)
  b.objects.collect {|o| o.key}
end

From the Amazon S3 Technical FAQs at http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1109&categoryID=55#03 :

Q: "I only see 1000 objects in my bucket, what gives?"

"When you list the content of a bucket the results come back in blocks (currently 1000 keys or less). If the IsTruncated node is true in the response from Amazon S3, you'll need to make another request to get the rest of your keys. Your code must be prepared to handle this scenario."

"If the IsTruncated flag is set, request the next page of results by setting Marker to the value of the NextMarker node from the last Amazon S3 response."

So I think you could lose data if you rely on this copy method!

Stephan
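The marker-based paging described above can be sketched generically, independent of the S3 client. In this sketch, `all_keys` and `fetch_page` are hypothetical names; the block stands in for the real call (e.g. `Bucket.find(bucket, :marker => marker).objects`), and a short page signals the end of the listing:

```ruby
# Generic marker-based pagination: keep requesting pages until a page
# comes back empty or shorter than page_size (i.e. the listing is done).
# fetch_page is a caller-supplied block standing in for the real S3 call;
# it receives the last key seen so far (nil on the first request).
def all_keys(page_size = 1000, &fetch_page)
  keys = []
  marker = nil
  loop do
    page = fetch_page.call(marker)
    break if page.empty?
    keys.concat(page)
    break if page.size < page_size   # short page => no more results
    marker = page.last               # resume after the last key we saw
  end
  keys
end
```

Simulating a 2,500-key bucket with a block that serves 1,000 keys per request shows all three pages being collected in order.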

Snippets Manager replied on Fri, 2010/10/08 - 3:42pm

Totally agree with Stephan. I think the code needs to be modified like this:

def bucket_keys(bucket)
  marker_str = ""
  return_objs = []
  begin
    b = Bucket.find(bucket, :marker => marker_str)
    marker_str = b.objects.last.key unless b.objects.empty?
    return_objs.concat b.objects.collect { |o| o.key }
  end while not b.objects.empty?
  return_objs
end

Snippets Manager replied on Wed, 2011/02/16 - 5:57pm

I know this was posted a while ago, but is there any way to still get it to work? When I try to run this script it throws an exception and quits:

/home/austin/Documents/csrware/ruby/AmazonS3Asset.rb:77: warning: toplevel constant Mechanize referenced by Mechanize::Mechanize
/home/austin/Documents/csrware/ruby/AmazonS3Asset.rb:82:in `rescue in download': Failed on http://s3.amazonaws.com/artest1/notsoexecutable  amazon_s3_temp/notsoexecutable (RuntimeError)
        from /home/austin/Documents/csrware/ruby/AmazonS3Asset.rb:75:in `download'
        from /home/austin/Documents/csrware/ruby/AmazonS3Asset.rb:50:in `copy_between_buckets'
        from /home/austin/Documents/csrware/ruby/AmazonS3Asset.rb:41:in `block in copy_over_bucket'
        from /home/austin/Documents/csrware/ruby/AmazonS3Asset.rb:40:in `each'
        from /home/austin/Documents/csrware/ruby/AmazonS3Asset.rb:40:in `copy_over_bucket'
        from ./docopy.rb:6:in `<main>'
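The "toplevel constant Mechanize referenced by Mechanize::Mechanize" warning is the giveaway here: the mechanize gem dropped its WWW:: namespace in version 1.0, so on newer gems the snippet's `WWW::Mechanize.new` resolves through a deprecation shim and can misbehave. One way to cope with either gem version is a version-tolerant lookup; `mechanize_class` is a hypothetical helper, not part of the original snippet:

```ruby
# Resolve the Mechanize class under either the pre-1.0 namespace
# (WWW::Mechanize) or the post-1.0 one (plain Mechanize); raise a
# LoadError if the gem is not available at all.
def mechanize_class
  return WWW::Mechanize if defined?(WWW::Mechanize)
  return Mechanize if defined?(Mechanize)
  raise LoadError, "mechanize gem is not available"
end
```

The download method could then build its agent with `mechanize_class.new` instead of `WWW::Mechanize.new`.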