I know variants of the following bug have been discussed in the Monastery recently, but I didn't find this particular one. My apologies in advance if I missed it.
Spot the bug:
sub foo {
    my $hashref = shift;
    while ( my ( $key, $value ) = each %$hashref ) {
        return if sometest( $key );
        frobnicate( $key, $value );
    }
}
BTW, there's nothing arcane about this bug. It's vanilla Perl, but it's easy to miss.
The problem is that each keeps its iterator state in the hash itself, not in the loop. If foo is called twice with the same hashref, and the first call leaves the while loop early because of the result of sometest, then in the second call the iteration will not start from the beginning; it resumes where the first call left off. Conceivably this could be the desired behavior, but I bet that in most cases it isn't.
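A minimal sketch of the effect, assuming a three-key hash. The helper count_pairs and its $bail_early flag are made up for illustration; they stand in for foo and sometest. The counts do not depend on the hash's internal ordering:

```perl
use strict;
use warnings;

# Hypothetical helper: counts how many pairs an each-loop visits,
# optionally returning right after the first pair (like sometest firing).
sub count_pairs {
    my ( $hashref, $bail_early ) = @_;
    my $seen = 0;
    while ( my ( $key, $value ) = each %$hashref ) {
        $seen++;
        return $seen if $bail_early;
    }
    return $seen;
}

my %h = ( a => 1, b => 2, c => 3 );

my $first  = count_pairs( \%h, 1 );    # visits 1 pair, leaves the iterator mid-hash
my $second = count_pairs( \%h, 0 );    # resumes there: only the remaining 2 pairs
my $third  = count_pairs( \%h, 0 );    # previous loop ran to the end, so a full pass

print "$first $second $third\n";       # prints "1 2 3"
```

The second call silently skips one pair, which is exactly the kind of thing that is hard to notice in testing.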
To fix it, call keys on the hash before the loop; this resets the hash's iterator:
sub foo {
    my $hashref = shift;
    keys %$hashref;    # void context: just resets the each iterator
    while ( my ( $key, $value ) = each %$hashref ) {
        return if sometest( $key );
        frobnicate( $key, $value );
    }
}
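To check that the reset does what is claimed, here is a small sketch along the same lines. The helper count_pairs_reset and its $bail_early flag are hypothetical names, not part of the original snippet:

```perl
use strict;
use warnings;

# Hypothetical helper: same early-exit loop as foo, but it resets the
# hash's iterator with keys before calling each.
sub count_pairs_reset {
    my ( $hashref, $bail_early ) = @_;
    keys %$hashref;    # reset the iterator left behind by any earlier loop
    my $seen = 0;
    while ( my ( $key, $value ) = each %$hashref ) {
        $seen++;
        return $seen if $bail_early;
    }
    return $seen;
}

my %h = ( a => 1, b => 2, c => 3 );

my $early = count_pairs_reset( \%h, 1 );    # exits after the first pair
my $full  = count_pairs_reset( \%h, 0 );    # still sees all 3 pairs

print "$early $full\n";                     # prints "1 3"
```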
the lowliest monk
It's one of the reasons I tend to shy away from each(). I prefer instead to get the keys, then use the keys to look up the values. That way there's never a problem with each-loop exits, and I can also sort the keys.
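A sketch of what that might look like for the foo from the original post. The bodies of sometest and frobnicate here are invented stand-ins so the example runs on its own; only the loop shape is the point:

```perl
use strict;
use warnings;

my @log;

# Hypothetical stand-ins for the thread's sometest() and frobnicate().
sub sometest   { my $key = shift; return $key eq 'c' }
sub frobnicate { my ( $key, $value ) = @_; push @log, "$key=$value" }

sub foo {
    my $hashref = shift;
    # The key list is built up front, so an early return leaves no
    # iterator state behind, and the keys can be sorted.
    for my $key ( sort keys %$hashref ) {
        return if sometest($key);
        frobnicate( $key, $hashref->{$key} );
    }
    return;
}

my %h = ( a => 1, b => 2, c => 3, d => 4 );
foo( \%h );
foo( \%h );    # the second call starts from the beginning again

print "@log\n";    # prints "a=1 b=2 a=1 b=2"
```

The cost is building the full key list in memory, which only matters for very large or tied hashes.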
-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.
By "bug", I presume you mean "bug in your code". The "exiting from the middle of an each-loop" has been discussed here before, and the behavior of Perl is as documented.
Sorry I gave the impression (probably through my choice of title) that I thought this was a bug in Perl. I did mean a bug in the snippet I presented. Perl's behavior is, as you say, amply documented.
the lowliest monk
The problem is people have a bad habit of forgetting to reset the iterator, and heaven help you if the general routine calls back out to uncontrolled code from inside the loop. Can you say "infinite loop"? :-)
"The problem is people have a bad habit of forgetting to reset the iterator,"
Yes, and? As long as you don't forget it in your general routine (unless you have good reason not to).
"heaven help you if the general routine calls back out to uncontrolled code from inside the loop."
Yes, and? Presumably the uncontrolled code was passed in as an argument, one way or the other (otherwise, it wouldn't be uncontrolled). It's the user's responsibility, just as it's the user's responsibility to not pass in vital files as arguments to unlink.
Iterators hand the programmer some rope, but what doesn't? That's not a reason to avoid iterators altogether.
I think the problem is not with iterators as such, so much as with the fact that the scope and workings of this particular iterator are so non-obvious. In other words, it's the "magic" that bites here, IMO, not the iterator. I posted a hash iterator class some months ago that lets one create multiple simultaneous lexically-scoped iterators on any hash, but somehow it feels like overkill for most situations. Maybe this is a facility that should be built-in.
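For readers who have not seen the idea, here is a rough closure-based sketch of a lexically-scoped hash iterator. This is not the class mentioned above; make_hash_iter is a made-up name, and the sketch assumes it is acceptable to snapshot the keys when the iterator is created:

```perl
use strict;
use warnings;

# Hypothetical constructor: returns a closure that walks a snapshot of
# the hash's keys. Each closure has its own position, independent of
# each() and of any other iterator on the same hash.
sub make_hash_iter {
    my $hashref = shift;
    my @keys    = keys %$hashref;    # snapshot taken at creation time
    my $i       = 0;
    return sub {
        return if $i >= @keys;       # empty list when exhausted
        my $key = $keys[ $i++ ];
        return ( $key, $hashref->{$key} );
    };
}

my %h = ( a => 1, b => 2, c => 3 );

# Two simultaneous traversals of the same hash, nested.
my $pairs = 0;
my $outer = make_hash_iter( \%h );
while ( my ( $k1, $v1 ) = $outer->() ) {
    my $inner = make_hash_iter( \%h );
    while ( my ( $k2, $v2 ) = $inner->() ) {
        $pairs++;
    }
}

print "$pairs\n";    # prints "9"
```

Nesting two each-loops on the same hash would not give 9, because both loops would share the one iterator stored in the hash.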
the lowliest monk
my @lines = <$file_handle>;
for my $line (@lines) {
    ...
}
but I bet people will comment on that if you start promoting that style on Perlmonks.
I find iterators bound to the data they act on far more natural than iterators bound to a location in the source code. It translates to an OO approach much more naturally.
"I think it would be better to bind lexically to the value myself."
What does "bind lexically to the value" mean?
"Having some code you pulled in cause side effects is definitely not what most people think of as DWIW."
Well, perhaps not in a language like LISP or Haskell, but it's pretty hard to do something in Perl that doesn't have side-effects. Even looking at a value can change it (and hence, is a side-effect). Assignment is a side-effect. warn is a side-effect. Processing @ARGV is a side-effect. Reading from a handle is a side-effect.
People don't actually mind side-effects. In fact, most people will expect side-effects. What people don't like is unexpected side-effects. But I never advocated not telling what code does.
"Personally, I don't consider your two examples as side effects"
I actually gave five examples of side effects (looking at a value, assignment, warn, processing @ARGV, reading from a handle).
Now, you might not call them side-effects, but then you have a private definition of a side-effect. Using private definitions doesn't contribute to successful communication.
I don't think it's fair to equate each(%hash) and filehandles, the reason being that I can have multiple filehandles open on the same file without any interference problems. But with each, you always have to consider that any traversal of the hash has to be completely independent. Thus a routine that takes a filename and traverses the file, calling a callback to process each chunk, is totally safe. The equivalent routine that takes a hash and traverses it using each, calling out to a callback, runs the serious risk that the hash's iterator will be reset or changed by user code.
I can think of all kinds of behaviour that is totally safe with filehandles and is not safe with each(%hash). They just aren't the same thing.
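To make the callback risk concrete, here is a small sketch. The names walk_with_each, walk_with_keys and the $cap guard are invented for the demo; the guard exists only so that the each-based version terminates:

```perl
use strict;
use warnings;

my %h = ( a => 1, b => 2, c => 3 );

# Hypothetical user callback: merely counting the keys of the same hash
# resets its iterator.
my $callback = sub { my $n = keys %h; return $n };

# each-based traversal. $cap stops the demo, which would otherwise loop
# forever because the callback keeps rewinding the iterator.
sub walk_with_each {
    my ( $hashref, $cb, $cap ) = @_;
    my $calls = 0;
    keys %$hashref;    # start from a known position
    while ( my ( $key, $value ) = each %$hashref ) {
        $cb->( $key, $value );
        last if ++$calls >= $cap;
    }
    keys %$hashref;    # leave the iterator reset for later callers
    return $calls;
}

# keys-based traversal: the key list is built before the loop starts,
# so the callback cannot disturb it.
sub walk_with_keys {
    my ( $hashref, $cb ) = @_;
    my $calls = 0;
    for my $key ( keys %$hashref ) {
        $cb->( $key, $hashref->{$key} );
        $calls++;
    }
    return $calls;
}

my $each_calls = walk_with_each( \%h, $callback, 10 );
my $keys_calls = walk_with_keys( \%h, $callback );

print "$each_calls $keys_calls\n";    # prints "10 3"
```

With a three-key hash the each version only stops because of the cap, while the keys version makes exactly three callback calls.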
And while it's true that you can open multiple handles to the same file, when was the last time you saw a program opening multiple handles on STDIN or on a socket, to name a few common streams programs read from?
You missed the point. If you have two subs that open filehandles on the same file, they don't need to worry about each other. If, however, you have two subs that 'open' an each on the same hash, you are in serious trouble. I couldn't really quote you numbers on the use of either, but filehandles are certainly very different from each.
Unlike you, I was talking about filehandles, not files.
Well, you are playing semantic games. If you mean streams or sockets then talk about streams or sockets.
Yes, it mattered when Perl was running on 10 MHz, 1-megabyte machines, and databases were tough, so a DBM hash was a cool thing. Those are long gone. {grin}
-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.
Also a possible problem is the call to frobnicate; what if it also does something to affect the iterator of the hash passed to foo? If that's a possibility, you'd need to fetch all the keys up front (or use the lexical-iterator suggestion made by someone else in this thread).
perlmonks.org content © perlmonks.org and demerphq, eric256, merlyn, Perl Mouse, shotgunefx, tlm, ysth