opendir( DIR, '.' )||die "Can't open $dir";
my(@filenames)=readdir ( DIR );
closedir ( DIR );
@filenames=sort {$a cmp $b} @filenames;
i want to optimize this piece of code. is there some way i can do this
Code tags added by [Arunbear]
Well, aside from calling my @filenames = sort readdir( DIR ) in one step (since by default sort uses effectively the comparitor you've shown, and that copies the results around one less time) what more do you think could be optimized? Doesn't get much more bare bones than that. If you're not interested in every filename that readdir is going to return you might stick a grep in between there, but there's not really much more to be pared down.
sort with no explicit comparison is already optimized away and does the same as what you've shown, so just remove the block. Oh, and BTW: you probably already knew, but "premature optimization is the root of all evil".
You didn't say what you wanted to optimize, so here's another take on it:
my @filenames = do {
opendir my $d, "." or die;
sort readdir $d
};
Note that this only works on modern perls.
[duff]
On my OS I'd do
my @files = glob '*';
as glob always produces a sorted list on my system. I'm not sure if that is true on other systems?
On some OSs that doesn't include hidden files:
$ perl -le 'opendir my $fd, "."; my @files = readdir $fd; closedir $fd; print scalar @files' 48 $ perl -le 'my @files = <*>; print scalar @files' 7 $ _
But including them is not difficult:
$ perl -le 'my @files = (<*>,<.*>); print scalar @files' 48 $ _
It seems that [doc://readdir] is faster, though:
use Benchmark qw(:all) ;
cmpthese(20000, {
readdir => sub { opendir my $fd, "."; my @files = readdir $fd; closedir $fd; },
glob => sub { my @files = (<*>,<.*>); }
});
__END__
Rate glob readdir
glob 3170/s -- -75%
readdir 12739/s 302% --
Update: Of course [doc://glob] is slower, since in this computer it also sorts the result.
--
David Serrano
If it's an efficient sort algorithm you're after, you should note that "In 5.7, the quicksort implementation was replaced with a stable mergesort algorithm whose worst-case behavior is O(NlogN)." (perldoc). According to Numerical Recipes, "For the basic task of sorting N elements, the best algorithms require on the order of several times N log2 N operations." That means that mergesort is just about as efficient as you can get.
Incidentally, Numerical Recipes is an astonishingly good book. Knuth may be the definitive reference, and nag may have many algorithms fine-tuned to specific cases, but NR is exceptionally good value for money. You can even download free source code in FORTRAN or C. If you're new to Perl, I'd see the lack of a Perl version as a good thing. I think it's a useful exercise in learning a language to try translating into it.
Update: Of course, all of this is overkill for the number of items likely to be found in any sensible directory!
heh. I've actually had to scold some of my co-workers, telling them they really should not be putting 300,000 files in a single directory, and there are easy enough ways to avoid this... Even 50,000 seems like too much to me (but then, I learned programming on a pdp 11-10, and my sense of scale may be a bit old-fashioned).
heh. I've actually had to scold some of my co-workers, telling them they really should not be putting 300,000 files in a single directory, and there are easy enough ways to avoid this...
Anything more than a screenful seems excessive to me! But then, I grew up on a Data General S-250, IIRC (Think PDP 11-70 - that sorta era, but less successful.)
Been a while since I have referred anyone to this, but it is a pretty good introductory discussion <shamelessplug>(and response)</shamelessplug> on why some directory structures slow down after a certain number of documents are included. Not necessarily useful for those Old System Administrators1.
1 - OSA's never die, they just put down roots.
--MidLifeXis
Do you want it to run faster?
Do you want to use less code?
Do you want it to be more readable, maintainable, portable, or more robust?
glob or similar will be less code. What you've given (other than the lack of codetags) is almost as good as you can do for balancing all of the criteria above.
Incidentally, IIRC Perl has builtin optimizations for the following sort subs:
{$a cmp $b} # default
{$b cmp $a}
{$a <=> $b}
{$b <=> $a}
These are the most common. So using any one of these should be indistinguishable from the default (no sort sub) case.
-QM
--
Quantum Mechanics: The dreams stuff is made of
perlmonks.org content © perlmonks.org and Anonymous Monk, blazar, BrowserUk, duff, Fletch, graff, Hue-Bond, MidLifeXis, Polonius, QM, smokemachine
prlmnks.org © 2006 edmund von der burg (eccles & toad)
v 0.03