Cheapest Rsync Replacement (with Ruby)
I often use rsync to keep a local copy of some HTTPD logs (around 200MB at the moment). Since they are append-only, having rsync compute and compare checksums for the parts I already have seems wasteful: both my box and the one I'm copying from would be happier if they didn't have to process a couple hundred MB for nothing. (...)
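The append-only idea can be tried out locally before involving SSH. The following is only a minimal sketch: `append_tail` is a hypothetical helper, not part of the script, and it assumes the destination is an exact prefix of the source.

```ruby
# Hypothetical helper (not part of the original script): copy only the bytes
# of src that lie beyond the current size of dst. Assumes src is append-only
# and dst is an exact prefix of it; nothing is checksummed or verified.
def append_tail(src, dst, block_size = 8192)
  offset = File.exist?(dst) ? File.size(dst) : 0
  copied = 0
  File.open(src, "rb") do |is|
    is.seek(offset)                      # skip the part we already have
    File.open(dst, "ab") do |os|
      while (data = is.read(block_size)) # read returns nil at end of file
        os.write(data)
        copied += data.bytesize
      end
    end
  end
  copied
end

# Usage (file names are placeholders):
#   append_tail("access.log", "access.log.copy")
```

The script below does the same thing, except that the reading half runs on the remote host and its output is streamed back over SSH.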
#!/usr/bin/env ruby
REMOTE_RUBY = "ruby"
# TODO: allow REMOTE_RUBY to be specified via a cmdline opt
# Expect exactly two arguments: a host:path source and an existing local file.
if ARGV.size != 2 || ARGV[0][/:/].nil? || !File.exist?(ARGV[1])
  puts <<EOF
ruby logfetcher.rb host:path/to/src dst
EOF
  exit 1
end
FILE = ARGV[1]
# Split on the first colon only, so remote paths containing ':' stay intact.
REMOTE_HOST, REMOTE_FILE = ARGV[0].split(/:/, 2)
BLOCK_SIZE = 8192
osize = File.size(FILE)
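# FIXME: cheap escaping. Only double quotes are handled, so paths containing
# quotes, backslashes, $ or backticks can break the remote command.
command = "File.open(#{REMOTE_FILE.inspect}){|f| " +
          "f.pos = #{osize}; print f.read(#{BLOCK_SIZE}) until f.eof? }"
command.gsub!(/"/){'\\"'}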
fetched = 0
t = nil
$stdout.sync = true
print "Establishing connection\r"
File.open(FILE, "ab") do |os|
  # Run the reader on the remote host; it prints everything past byte osize.
  IO.popen(%{ssh #{REMOTE_HOST} #{REMOTE_RUBY} -e '"#{command}"'}) do |is|
    until is.eof?
      data = is.read(BLOCK_SIZE)
      t ||= Time.new # ignore the time it takes to establish the SSH connection
      fetched += data.size
      print "Read #{fetched} \r"
      os.write(data)
    end
  end
end
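print(" " * 50 + "\r")
# t is still nil when there was nothing new to fetch; avoid nil and zero-division errors.
dt = t ? Time.new - t : 0.0
puts "Fetched #{fetched} bytes."
puts "Total size #{osize + fetched}."
puts "Needed %4.1f seconds." % dt
puts "Average speed %d bytes/sec." % (dt > 0 ? fetched / dt : 0)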
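The escaping FIXME in the script could be addressed with the standard `shellwords` library. This is only a sketch: `remote_tail_command` is a hypothetical helper, not part of the script, and it assumes a Bourne-compatible shell on both the local and the remote side.

```ruby
require "shellwords"

# Hypothetical helper (not in the original script): build the full ssh command
# line for streaming everything past byte `offset` of a remote file.
# Assumes Bourne-compatible shells locally and remotely.
def remote_tail_command(host, remote_file, offset, block_size = 8192, remote_ruby = "ruby")
  script = "File.open(#{remote_file.inspect}){|f| " \
           "f.pos = #{offset}; print f.read(#{block_size}) until f.eof? }"
  # ssh hands its command to the remote shell, so escape the script for that shell...
  remote = [remote_ruby, "-e", script].shelljoin
  # ...and escape the whole remote command again for the local shell used by IO.popen.
  ["ssh", host, remote].shelljoin
end

# Usage (host and path are placeholders):
#   IO.popen(remote_tail_command("example.org", "/var/log/httpd/access.log", osize))
```

The two escaping passes mirror the two shells involved: the local one started by `IO.popen` and the remote one started by ssh.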
Source: Cheapest rsync replacement (http://eigenclass.org/hiki.rb?cheap+rsync)





