Snippets has posted 5883 posts at DZone.

Variance, Mean, Normalizing Functions, Euclidean And Other Distances

02.09.2007
Here, sum, mean and variance were inspired by Peter's inline sum code:

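# Sum of all elements (nil for an empty array)
class Array; def sum; inject( nil ) { |sum,x| sum ? sum+x : x }; end; end
# Arithmetic mean, always computed as a float
class Array; def mean; self.sum/self.size.to_f; end; end
# NOTE: despite its name, this returns the population standard deviation (square root of the variance)
class Array; def variance; mean = self.mean; Math.sqrt(inject( nil ) { |var,x| var ? var+((x-mean)**2) : ((x-mean)**2)}/self.size.to_f); end; end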
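
Because the variance method above takes a square root, it returns the standard deviation rather than the variance. A minimal sketch that keeps the two quantities apart is shown here; the names population_variance and population_std_dev are hypothetical and not part of the original snippet, and the functions are standalone rather than Array patches.

```ruby
# Hypothetical helper names, not part of the original snippet.

# Population variance: the mean of the squared deviations from the mean.
def population_variance(values)
  mean = values.inject(0.0) { |acc, v| acc + v } / values.size
  values.inject(0.0) { |acc, v| acc + (v - mean)**2 } / values.size
end

# Population standard deviation: the square root of the variance.
def population_std_dev(values)
  Math.sqrt(population_variance(values))
end

data = [2, 4, 4, 4, 5, 5, 7, 9]
puts population_variance(data)  # 4.0
puts population_std_dev(data)   # 2.0
```

Both functions divide by the number of elements (population form), matching the original code; dividing by size - 1 instead would give the sample estimate.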

If you want to normalize a random variable (array) so that mean = 0 and variance = 1, you can transform your array x by calling:
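# Standardizes the array x in place (map!) so that mean = 0 and variance = 1
def standardize_random_variable(x)
  mean = x.mean
  # Array#variance above returns the standard deviation, which is the correct divisor here
  std_dev = x.variance
  x.map!{|a| (a-mean)/std_dev }
end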
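
As a rough illustration of what standardizing does to the data, here is a self-contained sketch. The name standardized is hypothetical; unlike the function above it returns a new array instead of modifying its argument, and it assumes that raising an error is acceptable when every value is identical (zero spread).

```ruby
# Hypothetical, non-mutating variant: returns a new array instead of using map!.
def standardized(values)
  n = values.size.to_f
  mean = values.inject(0.0) { |acc, v| acc + v } / n
  std_dev = Math.sqrt(values.inject(0.0) { |acc, v| acc + (v - mean)**2 } / n)
  # Assumption: a constant array has zero spread, so the scaling is undefined.
  raise ArgumentError, "standard deviation is zero" if std_dev.zero?
  values.map { |v| (v - mean) / std_dev }
end

z = standardized([2, 4, 4, 4, 5, 5, 7, 9])
p z  # => [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The input has mean 5 and standard deviation 2, so each value is shifted by 5 and divided by 2.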

To compute the distance between two arrays of data, a and b, call one of the following functions.

## Distance Functions

# Sum of (x-y)^2
def euclidean_squared_distance(a,b)
  b = b.to_a
  a = a.to_a
  sum_of_diff_sq = 0
  (0...a.size).each { |i| sum_of_diff_sq+=((a[i].to_f-b[i].to_f)**2)}
  sum_of_diff_sq 
end

# Square root of sum of (x-y)^2
def euclidean_distance(neighbor,xq)
  Math.sqrt(euclidean_squared_distance(neighbor,xq))
end

# Sum of abs(x-y)
def cityblock_distance(neighbor,xq)
  xq = xq.to_a
  neighbor = neighbor.to_a
  abs_diff = 0
  # Ruby has no Math.abs; call abs on the numeric difference instead
  (0...xq.size).each { |i| abs_diff+=(xq[i].to_f-neighbor[i].to_f).abs }
  abs_diff
end
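
The distance functions can be checked against a 3-4-5 right triangle. The sketch below is a hypothetical zip-based rewrite with shortened names (squared_euclidean, euclidean, cityblock), not the original functions, and it assumes both arrays have the same length.

```ruby
# Hypothetical zip-based variants of the distance functions above.
# Assumption: a and b have the same length.

# Sum of (x-y)^2
def squared_euclidean(a, b)
  a.zip(b).inject(0.0) { |acc, (x, y)| acc + (x.to_f - y.to_f)**2 }
end

# Square root of sum of (x-y)^2
def euclidean(a, b)
  Math.sqrt(squared_euclidean(a, b))
end

# Sum of abs(x-y)
def cityblock(a, b)
  a.zip(b).inject(0.0) { |acc, (x, y)| acc + (x.to_f - y.to_f).abs }
end

a = [0, 0]
b = [3, 4]
puts squared_euclidean(a, b)  # 25.0
puts euclidean(a, b)          # 5.0
puts cityblock(a, b)          # 7.0
```

Pairing elements with zip avoids indexing by hand; starting inject at 0.0 keeps the result a float even for empty input.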
    

Comments

Snippets Manager replied on Sat, 2007/12/08 - 7:18pm

NOTE: the variance function in fact returns the standard deviation (sqrt of the variance)!!!