use English; and performance
asz
created: 2006-03-02 10:53:06
hello,

i recently had to use $', so reading perlvar more carefully, i noticed it warns about using $':
The use of this variable anywhere in a program imposes a considerable performance penalty on all regular expression matches. See "BUGS".
...but in the BUGS section it states that:
Due to an unfortunate accident of Perl's implementation, use English imposes a considerable performance penalty on all regular expression matches in a program,[...]
now i'm confused :) ... i don't understand whether it is use English; (i.e. $POSTMATCH) that reduces performance, or simply using $'.
thank you for your time!

:)))))
Re: use English; and performance
created: 2006-03-02 11:04:03

Both. Read the section on performance in perldoc English.

Basically, using $`, $&, or $' anywhere in a program imposes a performance penalty on all regular expression matches. Because the English module aliases these variables (as $PREMATCH, $MATCH, and $POSTMATCH), loading it imposes the same penalty.
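
If you only need the text around one match, you can rebuild it from the offset arrays @- and @+ instead. perlvar documents $' as equivalent to substr($var, $+[0]). A minimal sketch of that idea, where split_around_match is a made-up helper name and not anything from the docs:

```perl
use strict;
use warnings;

# Hypothetical helper: returns (prematch, match, postmatch) for the first
# match of $re in $str, using only @- and @+ (never $`, $& or $').
sub split_around_match {
    my ($str, $re) = @_;
    return unless $str =~ $re;
    return (
        substr($str, 0, $-[0]),              # what $` would hold
        substr($str, $-[0], $+[0] - $-[0]),  # what $& would hold
        substr($str, $+[0]),                 # what $' would hold
    );
}

my ($pre, $match, $post) =
    split_around_match('keyword1 "value with spaces" rest', qr/"(.+?)"/);
print "pre=[$pre] match=[$match] post=[$post]\n";
```

The substr calls copy only the pieces you actually ask for, and only for this one match.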

Re^2: use English; and performance
asz
created: 2006-03-02 11:56:20
i'm trying to create a string tokenizer for a config file parser and the best that i've managed to think of is this:
#!/usr/bin/perl

use strict;
use Data::Dumper;

my $line = q[keyword1 value keyword2 "value with spaces" keyword3 value];

print Dumper tokenize_line($line);

sub tokenize_line {
    my $line = shift;

    my @tokens;
    while ($line =~ /(\S+)/g) {
        # every non-space match is a token
        push @tokens, $1;

        # anything in double-quotes is a single token
        if ($line =~ /\G\s*"(.+?)"/) {
            push @tokens, $1;
            # continue from this last match
            $line = $';
        }
    }

    return \@tokens;
}
which outputs this:
$VAR1 = [
          'keyword1',
          'value',
          'keyword2',
          'value with spaces',
          'keyword3',
          'value'
        ];
i know it's an ugly hack, trying to substitute the original string with the rest of the matched pattern ($line = $';), but in my previous attempts i would use split and substr to achieve the same results... and it was very ugly :)
what would be a better way to write this? thank you all for your time and advice!
:)))))
Re^3: use English; and performance
created: 2006-03-03 11:09:55

Use /gc in your speculative match. /c prevents pos() from being reset on match failure.
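
Something along these lines, as a rough sketch rather than a drop-in replacement: every match is anchored with \G and uses /gc, so a failed attempt leaves pos() where it was and the next pattern gets tried at the same spot. There is no need for $' or for rewriting $line. Note that it differs slightly from your version: a quoted string is accepted anywhere a token can start, and "" yields an empty token.

```perl
#!/usr/bin/perl
use strict;
use warnings;

sub tokenize_line {
    my ($line) = @_;
    my @tokens;

    pos($line) = 0;
    while (pos($line) < length $line) {
        if    ($line =~ /\G\s+/gc)       { next }              # skip whitespace
        elsif ($line =~ /\G"([^"]*)"/gc) { push @tokens, $1 }  # quoted token
        elsif ($line =~ /\G(\S+)/gc)     { push @tokens, $1 }  # bare token
        else                             { last }              # nothing matched
    }

    return \@tokens;
}

my $line = q[keyword1 value keyword2 "value with spaces" keyword3 value];
print join('|', @{ tokenize_line($line) }), "\n";
```

An unterminated quote falls through to the bare-token branch, so it comes back as a token with the quote character still attached. Whether that is what a config parser wants is up to you.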

Makeshifts last the longest.

Re: use English; and performance
created: 2006-03-02 11:09:16
so.. to use it without the performance issues.. straight from the docs:
use English qw(-no_match_vars);



This is not a Signature...
Re^2: use English; and performance
created: 2006-03-02 11:15:40

The unary minus quotes identifiers, so the following is sufficient:

use English -no_match_vars;
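
The long names for everything other than the match variables still work as usual. A small sketch (describe_process is just an illustrative name):

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

# $PROCESS_ID and $PROGRAM_NAME are the English names for $$ and $0.
sub describe_process {
    return "pid=$PROCESS_ID program=$PROGRAM_NAME";
}

print describe_process(), "\n";
```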

perlmonks.org content © perlmonks.org and Aristotle, asz, duff, ikegami, monkey_boy