I've a hash that uses numbers as keys. The numbers aren't necessary sequential. Here is an example:
my $href = { '6' => '154', '4' => '152', '1' => '80', '3' => '151', '8' => '150' };
I just want the hash value with the lowest key value - in the example value, it's '1' => '80' and discard the rest (by emptying the hash).Is there a quick way to do that? Thanks in advance :)
Something like
my $lowest_value = $href->{ ( sort {$a <=> $b} keys %$href )[0] };
should give you what you need.
Hope that helps,
-- [189756|Foxcub]
#include [http://www.liquidfusion.org.uk|www.liquidfusion.org.uk]
Update: Added {$a <=> $b} to the sort to ensure it performs a numeric sort over an ASCII sort - thanks to those who pointed that out.
But would I be able to get to the key as well? As I was waiting for solutions, I came up with the following:
my @sorted = sort keys %$href;
my $value = $href->{$sorted[0]};
%$href = ();
$href->{$sorted[0]} = $value;
It gets me what I was looking for but is there a better way to do it?
just use sort to order the keys of the hash
As a general rule, don't sort to retrieve the min/max of a list. That's spectacularly inefficient. Scan the list once for a cost of O(n) and retrieve the information required. If you sort, depending on your algorithm and dataset you'll usually get around O(n log n). A quick benchmark bears this out:
#! /usr/local/bin/perl -w
use strict;
use Benchmark qw/cmpthese/;
my %h;
$h{ int(rand 1_000_000) } = rand 1000 for 1..10_000;
sub min_by_sort { ( sort {$a <=> $b} keys %h )[0] }
sub min_by_scan {
my $min = undef;
(not defined $min or $min > $_) and $min = $_ for keys %h;
$min;
}
print "min_by_sort = ", min_by_sort(), "\n";
print "min_by_scan = ", min_by_scan(), "\n";
cmpthese(
shift || -5,
{
min_by_sort => \&min_by_sort,
min_by_scan => \&min_by_scan,
}
)
For a list of ten elements, the scan does much better than the sort:
min_by_sort = 107
min_by_scan = 107
Benchmark: running min_by_scan, min_by_sort, each for at least 5 CPU seconds...
min_by_scan: 4 wallclock secs ( 5.44 usr + 0.00 sys = 5.44 CPU) @ 95284.97/s (n=518112)
min_by_sort: 6 wallclock secs ( 5.12 usr + 0.00 sys = 5.12 CPU) @ 152355.18/s (n=779630)
Rate min_by_scan min_by_sort
min_by_scan 95285/s -- -37%
min_by_sort 152355/s 60% --
update: [eserte] points out that I misread the results. Somehow I managed to past in the results of something else (and I forget what). The correct results for a list of 10 elements are as follows:
min_by_scan: 6 wallclock secs ( 5.30 usr + 0.00 sys = 5.30 CPU) @ 93681.46/s (n=496219)
min_by_sort: 6 wallclock secs ( 5.28 usr + 0.00 sys = 5.28 CPU) @ 71843.03/s (n=379421)
Rate min_by_sort min_by_scan
min_by_sort 71843/s -- -23%
min_by_scan 93681/s 30% --
If the hash contains around 10 000 keys (note that the quick and dirty initialisation used means that I might generate the same key twice) the scan method is still a winner, although sort is catching up because of the pure-C/perl-ops diffential.
min_by_sort = 301
min_by_scan = 301
Benchmark: running min_by_scan, min_by_sort, each for at least 5 CPU seconds...
min_by_scan: 6 wallclock secs ( 5.48 usr + 0.01 sys = 5.48 CPU) @ 94.63/s (n=519)
min_by_sort: 6 wallclock secs ( 5.30 usr + 0.00 sys = 5.30 CPU) @ 73.14/s (n=388)
Rate min_by_sort min_by_scan
min_by_sort 73.1/s -- -23%
min_by_scan 94.6/s 29% --
So for really large hashes you should use sort, otherwise scan. This is only due to the fact that sort is a single perl op (modern perls can recognise sort {$a <=> $b}) at which point you drop down into optimised C. The scan loop will run a small loop of ops through the interpreter which limits its flat-out performance. If you were writing both in C the scan would always win.
For instance, if you disable the inlined sort by creating a comparator like sort {$a+1 <=> $b+1} sort's improvement quickly evaporates and the scan continues to be the best performer. It's something peculiar to Perl, and something to keep in mind.
update: A thought just came to me over lunch: even if you consider the speed argument weak (which is fine by me) a more damning problem is how much memory is used. The sort will require a second copy of the list to be held in memory. Given Perl's propensity to consume memory, this will be a significant factor for large lists.
- another intruder with the mooring of the heat of the Perl
my $low = (keys %h)[0];
for( keys %h )
{
if ($_ < $low ){ $low = $_; }
}
my $s_k = $low;
my $s_v = $h{$low};
%h=();
$h{$s_k}=$s_v;
Because of the way that hashes are stored in memory, the order of keys returned is not guaranteed to be ordered or even consistent. If you want to get the lowest key, then something like...
my @ordered_list = sort keys %myhash;
...will put the lowest key (sorted asciibetically) into the $ordered_list[0] element. See this and this for futher info on sort and keys.
Hope this helps.
Update: Or you could use Foxcub's far more elegant solution!
"Stercus! Dixit Pooh. Eeyore, missilis lux navigii heffalumporum iaculas. Piglet, mecum ad cellae migratae secundae concurras."
#!perl
use strict;
use warnings;
my $href = { '6' => '154', '4' => '152', '1' => '80', '3' => '151', '8' => '150' };
my $key = (sort {$a <=> $b} keys %$href)[0];
my $value = $href->{$key};
print "$key => $value\n";
__END__
1 => 80
I was wondering why when I have
my $href = { '11' => '154', '4' => '152', '8' => '150' };
'11' => '154' was returned as the value instead of '4' => '152'Your code solves it :)
my $href = { '6' => '154', '4' => '152', '1' => '80', '3' => '151', '8' => '150' };
use List::Util qw(min);
my $min = min keys %$href;
$href = {$min => $href->{$min}};
perlmonks.org content © perlmonks.org and BUU, CombatSquirrel, Elgon, eserte, grinder, kiat, rir, Tanalis
prlmnks.org © 2006 edmund von der burg (eccles & toad)
v 0.03