By joojoogame@yahoo.com
via jtoee.blogspot.com
Published: May 18 2008 / 20:41
By joojoogame@yahoo.com
via jtoee.blogspot.com
Published: May 18 2008 / 20:41
Comments
bhargett replied ago:
Short and to the point. Good tip I know I have done it the wrong way, but then again I don't iterate over maps much as of yet.
petvirus replied ago:
wtf, why are the code samples saved as images? havent these people heard of syntax highlighting? its the new web 2.0 thing!
,
willcode4beer replied ago:
I thought everyone iterated over the entry set....
Oh well, voted up because I hope to never see that first example in the wild.
killerweb replied ago:
Funny to see posts like this and people just agreeing. It would seem it would be faster to do it that way, but is it really? I mean under the covers it might not be doing what you think it's doing. getValue() could be looking up a the value as well and not being a storage interface. And after poking at HashMap and using some metrics on maps of 10,000 items, it's just as I thought. This is why Interfaces can be miss-leading sometimes.
joojoogame@yahoo.com replied ago:
I am not sure how your metric was calculated. For 10,000 items the results for both iterations are not conclusive. For an iteration of 10,000,000 items there is a clear winner among the iteration methods, atleast in JDK 1.5 running on a 3 gig machine with 2 GB RAM. Here is the metric
Bad iterate: 1063 ms
Good iterate: 891 ms
The JVM implementation for get(key) calculates the hash everytime contrary to what you say. From the JVM source for get(key)
int hash = hash(k);
int i = indexFor(hash, table.length);
Entry> e = table[i];
From the JVM source for entryset()
Set>> es = entrySet;
return (es != null ? es : (entrySet = (Set>>) (Set) new EntrySet()));
However that said it is very unlikely for anyone to iterate 10,000,000 items using a HashMap. Although this is a better way to iterate over a HashMap developers will not see a performance increase until they hit huge numbers.
killerweb replied ago:
Geez your going to really make me prove this one ;) Anyway here goes...
1,000,000 Items, hard to test over that number because even hiccups in CPU time can throw off the number.
Million items, More trials for better results. If you look the numbers are so close I would keep to my first prior statement, It just doesn't matter.
Plus even your numbers say the same thing, if you got 10,000,000 items and saw only a difference of 172ms! I think it just proves my point more. Run it 10 times and see what you get. CPU burps can always throw off large data sets.
Test Trial [1]
Map.Entry duration[80]
Map.get(key) duration[70]
Test Trial [2]
Map.Entry duration[70]
Map.get(key) duration[70]
Test Trial [3]
Map.Entry duration[70]
Map.get(key) duration[70]
Test Trial [4]
Map.Entry duration[61]
Map.get(key) duration[60]
Test Trial [5]
Map.Entry duration[60]
Map.get(key) duration[70]
Test Trial [6]
Map.Entry duration[61]
Map.get(key) duration[70]
Test Trial [7]
Map.Entry duration[60]
Map.get(key) duration[50]
Test Trial [8]
Map.Entry duration[60]
Map.get(key) duration[50]
Test Trial [9]
Map.Entry duration[60]
Map.get(key) duration[50]
Test Trial [10]
Map.Entry duration[60]
Map.get(key) duration[80]
Anyway on to real work ;)
killerweb replied ago:
Oh and one more thing..lol... It's not a better way to iterate over maps by just saying, "it's just a better way". If your using generics with Map, defining the generic twice once at the Map> level and then again at loop time, makes code look sloppy. Double the work with no payout. Ahhh... just my 2 cents, too much RebBull today lmao.
joojoogame@yahoo.com replied ago:
This seems weird. Tests consistently won over for the "Good code" on my machine at least, over a span of 10 - 15 tests.
As for readability it would seem to be a preference. I prefer the Map.Entry version. Thanks for your comments and scrutiny, this was an interesting one since most of us assume Map.Entry to be more efficient :)
MattRussell replied ago:
Hmm. No arguments are given as to why using Map.Entry is the best approach to map iteration. Granted that the "Good code" might be more efficient. On the other hand, the "Not So Good Code" is a bit more readable, mostly because it's more concise. So I'd personally prefer the "bad" version unless I'd identified a performance bottleneck in that region of code -- unless I'm missing something?
peter_lawrey replied ago:
Sometimes the access cost is fairly high.
Say you have a Map which is backed by a database table.
In that case entrySet() can return a SELECT * FROM table
However, getting all the keys and then looking up each key can be very expensive.
However, most of the time it doesn't make a significant difference to the user how you access the map. In this case I would suggest you write in a manner you think is is easier to read.
,
Voters For This Link (27)
Voters Against This Link (5)