Calculating the geometric mean of a long list of random doubles

So, I ran into a problem today while building a restricted Boltzmann machine (RBM). It should be trivial, but it is proving surprisingly difficult. Basically, I initialize 2k values to random doubles between 0 and 1.

What I would like to do is calculate the geometric mean of this dataset. The problem I am facing is that, because the dataset is so long, multiplying everything together always underflows to zero, while taking the appropriate root at every step just rails to 1.

I could split the list into chunks, but I think that's really ugly. Any ideas on how to do this in an elegant way?

In theory, I would like to extend my current RBM code to closer to 15k+ entries, and to be able to run the RBM across multiple threads. Unfortunately, this rules out Apache Commons Math (its geometric mean method is not synchronized) and longs.

+5
3 answers

Wow, using a large decimal type is overkill!

Just take the logarithm of everything, find the arithmetic mean, and then exponentiate.
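A minimal sketch of that idea in Java, assuming every input is strictly positive (a zero would give a log of negative infinity). The class and method names are made up for illustration:

```java
import java.util.Random;

final class LogGeometricMean {
    /** Geometric mean of strictly positive values, computed as exp(mean(ln x)). */
    static double geometricMean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("need at least one value");
        }
        double logSum = 0.0;
        for (double v : values) {
            // The log of a small positive number is only a modest negative
            // number, so the running sum cannot underflow the way a product does.
            logSum += Math.log(v);
        }
        return Math.exp(logSum / values.length);
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        double[] values = new double[15_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = 1.0 - rng.nextDouble(); // in (0, 1], so log(0) never occurs
        }
        System.out.println(geometricMean(values));
    }
}
```

The method keeps no shared state, so separate threads can call it on their own arrays without synchronization. For uniform random inputs the result should land near 1/e, since the expected value of ln(x) on (0, 1) is -1.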

+11

Mehrdad's logarithm solution certainly works. You can do it faster (and perhaps more accurately), though:

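  • Calculate the sum of the exponents of the numbers, say S.
  • Set all of the exponents to zero so that each number is between 1/2 and 1.
  • Group the numbers into bunches of at most 1000.
    • For each group, compute the product of the numbers. This will not underflow.
    • Add the exponent of the product to S and set its exponent to zero.
  • You now have about 1/1000 as many numbers. Repeat steps 2 and 3 until only one number remains.
  • Call the remaining number T. The geometric mean is T^(1/N) · 2^(S/N), where N is the size of the input.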
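
Here is a rough sketch of this exponent-stripping approach in Java, not a tuned implementation. It assumes all inputs are positive, finite, normal doubles, and it uses `Math.getExponent` and `Math.scalb` to pull each number's binary exponent out and rescale the mantissa into [1/2, 1). The class name and the group size constant are my own choices:

```java
final class ExponentGeometricMean {
    private static final int GROUP_SIZE = 1000; // 1000 factors >= 1/2 cannot underflow

    static double geometricMean(double[] values) {
        int n = values.length;
        if (n == 0) {
            throw new IllegalArgumentException("need at least one value");
        }
        long exponentSum = 0; // S: sum of all binary exponents removed so far
        double[] mantissas = new double[n];
        for (int i = 0; i < n; i++) {
            double v = values[i];
            if (!(v >= Double.MIN_NORMAL) || Double.isInfinite(v)) {
                throw new IllegalArgumentException("expected a positive normal double: " + v);
            }
            int e = Math.getExponent(v) + 1;  // v = m * 2^e with m in [1/2, 1)
            exponentSum += e;
            mantissas[i] = Math.scalb(v, -e); // exact: only the exponent changes
        }
        int count = n;
        while (count > 1) {
            int out = 0;
            for (int start = 0; start < count; start += GROUP_SIZE) {
                int end = Math.min(start + GROUP_SIZE, count);
                double product = 1.0;
                for (int j = start; j < end; j++) {
                    product *= mantissas[j];
                }
                // Strip the product's exponent again and fold it into S.
                int e = Math.getExponent(product) + 1;
                exponentSum += e;
                mantissas[out++] = Math.scalb(product, -e);
            }
            count = out; // roughly 1/1000 as many numbers remain
        }
        double t = mantissas[0]; // T, in [1/2, 1)
        return Math.pow(t, 1.0 / n) * Math.pow(2.0, (double) exponentSum / n);
    }
}
```

Calling `ExponentGeometricMean.geometricMean(values)` needs only two `Math.pow` calls in total, instead of one `Math.log` per element, which is where the potential speedup comes from.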
+1

It seems that after enough multiplications, double precision is no longer sufficient: the product picks up too many leading zeros, if you will.

The wiki page for arbitrary precision arithmetic shows several ways to solve this problem. In Java, BigDecimal seems to be the way to go, though at the expense of speed.
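
For what it's worth, a BigDecimal version might look something like the sketch below. It assumes positive, finite inputs, rounds each multiplication to `MathContext.DECIMAL128`, and then splits the final product into a decimal mantissa and exponent so that the N-th root can be taken with ordinary `Math.pow`. The class name is made up, and the choice of precision is my assumption:

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class BigDecimalGeometricMean {
    static double geometricMean(double[] values) {
        int n = values.length;
        if (n == 0) {
            throw new IllegalArgumentException("need at least one value");
        }
        MathContext mc = MathContext.DECIMAL128; // 34 significant digits per step
        BigDecimal product = BigDecimal.ONE;
        for (double v : values) {
            // new BigDecimal(double) is exact; the scale tracks the tiny magnitude.
            product = product.multiply(new BigDecimal(v), mc);
        }
        if (product.signum() == 0) {
            return 0.0; // at least one input was zero
        }
        // product = mantissa * 10^exponent10 with mantissa in [0.1, 1)
        int exponent10 = product.precision() - product.scale();
        double mantissa = product.scaleByPowerOfTen(-exponent10).doubleValue();
        return Math.pow(mantissa, 1.0 / n) * Math.pow(10.0, (double) exponent10 / n);
    }
}
```

This does one BigDecimal multiplication per element, so expect it to be noticeably slower than working with plain doubles.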

0