DZone Snippets is a public source code repository. Easily build up your personal collection of code snippets, categorize them with tags / keywords, and share them with the world
Perform A Rails Find() And Iterate Over The Resulting Records In Groups
module ActiveRecord
class Base
# This method lets you iterate over the results of a .find, in groups. (Basically
# an interface to LIMIT.) Anything you can pass as options to .find, you can pass here.
#
# Returns the number of records found so you can determine whether or not something
# actually happened.
#
# === Specific options
# * <tt>:by</tt> - Number of records in each group
# * <tt>:update</tt> - If set to true, a message like 'Reading records 1-50' will be
# output before each group of records is fetched from the database.
#
# === Examples
# # default # of records is 100...
# Order.find_each(:conditions => { :cc_processed_at => nil }) do |order|
# # do stuff with order
# end
#
# # ...but you can explicitly specify it
# Person.find_each(:by => 50, :include => :address, :order => 'name') do |person, index|
# # do stuff with person and index
# end
#
# # using the return value
# num_records = Order.find_each do |order|
# # ...
# end
# puts "No orders found." if num_records.zero?
#
# # you can get the offset per iteration too
# Order.find_each do |order, offset|
# puts "Offset: #{offset}"
# end
#
# # even works with named scopes!
# Dog.with_ears_cut_short.with_tail_cut_long.find_each do |dog|
# # ...
# end
#
# === Caveats
# If you're resaving the records that .find_each returns such that executing the query with
# the same offset and limit on the next iteration would omit the records, .find_each may end
# up skipping records on subsequent iterations. For instance, if you have something like:
#
# Order.find_each(:conditions => { :processed => false }) do |order|
# order.processed = true
# order.save!
# end
#
# Here's the problem. On the first iteration .find_each will execute a <tt>SELECT .. LIMIT 0,100</tt>
# query, and the records will get saved. The next iteration, it will execute <tt>SELECT .. LIMIT 100,100</tt>.
# However, the records that were just saved no longer match the conditions in the query,
# but the query still assumes that they're there, so in effect we're double-skipping them.
#
# In this case what you probably want to do is pull a list of all the ids that match the
# conditions at the outset, and then iterate through those in groups. Unfortunately this
# doesn't do that quite yet ;)
def self.find_each(options={}, &blk)
group_size = options.delete(:by) || 100
update = options.delete(:update) || false
num_records = count(options.except(:from))
return 0 if num_records == 0
#raise "Number of records: #{num_records}"
also_pass_offset = (blk.arity == 2)
0.step(num_records, group_size) do |offset|
find_options = { :offset => offset, :limit => group_size, :order => "#{table_name}.#{primary_key}" }.merge(options)
if update
if num_records == 1
puts ">> Reading the only record."
else
start_offset = offset + 1
end_offset = offset + group_size
end_offset = num_records if num_records < end_offset
puts ">> Reading records #{start_offset}-#{end_offset} of #{num_records} (offset: #{offset}, limit: #{group_size})"
end
end
find(:all, find_options).each do |record|
also_pass_offset ? blk.call(record, offset) : blk.call(record)
end
end
num_records
end
end
end
Also see: * http://weblog.jamisbuck.org/2007/4/6/faking-cursors-in-activerecord * http://weblog.jamisbuck.org/2007/2/28/poor-man-s-pagination * http://pseudocursors.rubyforge.org/





