Matching quote characters?
Anonymous Monk
created: 2006-08-05 10:01:21
Is there an elegant way (one expression) to match an opening quote character followed by some characters and an optional trailing quote character? I tried this:
my $string = '"foo"';
   $string =~ s/\"(.*?)\"?/$1/;

print $string;
But the .*? always gobbles up the trailing " and the \"? never matches it. I thought non-greedy .* matched as few non-newline characters as possible. Why is it matching the final " if it could leave it for "? and the pattern would still match?
Re: Matching quote characters?
created: 2006-08-05 10:13:12

Tell it not to match "

$string =~ s/"([^"]+)"?/$1/;

Update: after I posted, I realised that if you don't care for the closing quote, then you can simply remove all quotes:

$string =~ s/"//g;
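As a quick check of the first substitution above, here is a minimal sketch. The helper name unquote and the test strings are made up for illustration. It also shows one edge case: because [^"]+ needs at least one character, an empty pair of quotes is left untouched (using * instead of + would change that).

```perl
use strict;
use warnings;

# Hypothetical helper: run the negated-class substitution on a copy.
sub unquote {
    my ($text) = @_;
    $text =~ s/"([^"]+)"?/$1/;
    return $text;
}

my $closed   = unquote('"foo"');   # closing quote present
my $unclosed = unquote('"foo');    # closing quote missing
my $empty    = unquote('""');      # nothing between the quotes

print "$closed -- $unclosed -- $empty\n";   # foo -- foo -- ""
```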
--
Leviathan.
Re: Matching quote characters?
created: 2006-08-05 10:25:58
try
my $string = '"foo"';
my $string2 = '"foo';
   $string =~ s/\"(.*?)\"|\"(.*?)/$1$2/;
   $string2 =~ s/\"(.*?)\"|\"(.*?)/$1$2/;
print "$string -- $string2\n";
I think in the $string =~ s/\"(.*?)\"?/$1/; version both the .*? and the \"? are non-greedy, so it's a question of which one is less greedy when they compete, and \"? wins because it's non-greedy.

Updated: Corrected my muddle-headedness about "non-greedy", thanks to betterworld.

Re^2: Matching quote characters?
created: 2006-08-05 15:22:22
I think in the $string =~ s/\"(.*?)\"?/$1/; version both the .*? and the \"? are non-greedy
No, \"? is greedy. \"?? would be non-greedy.
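A small sketch of that difference, in case it helps. The variable names and the two-quote test string are illustrative only.

```perl
use strict;
use warnings;

# Greedy optional quote: "? takes the quote when one is available.
my ($greedy) = '""' =~ /^("?)/;

# Non-greedy optional quote: "?? prefers to match nothing.
my ($lazy) = '""' =~ /^("??)/;

print "greedy: [$greedy]\n";   # greedy: ["]
print "lazy:   [$lazy]\n";     # lazy:   []
```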
Re: Matching quote characters?
created: 2006-08-05 11:31:43

    Why is it matching the final " if it could leave it for "? and the pattern would still match?

In the greedy case, it's because that's how regexes work; if .* can match the quote ("), it will.  Note that it would be different if the trailing quote was non-optional; then the greedy match would only match up to but not including the final quote.

In the non-greedy case, of course, it's the same problem; the non-greedy succeeds by matching nothing at all, as the optional final quote does not have to be matched either.

Here's another way to look at the effect of a "non-optional" final quote for both the greedy and non-greedy cases:

    $string =~ s/\"(.*)\"/$1/;
    #                  <= Final quote " pushes left against
    #                     otherwise greedy match (.*)

    $string =~ s/\"(.*?)\"/$1/;
    #                   => Final quote " pulls right against
    #                      otherwise non-greedy match (.*?)

So when the final quote is optional, neither of the above constraints gets enforced; the greedy match can be maximally greedy, and the non-greedy match can be minimally greedy.
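To see all four combinations side by side, here is a rough sketch. The helper mark_capture and the sample input are assumptions of mine; the helper wraps whatever $1 captured in <...> so that an empty capture stays visible.

```perl
use strict;
use warnings;

# Hypothetical helper: apply one pattern to a copy of the input and
# wrap whatever $1 captured in <...> so an empty capture stays visible.
sub mark_capture {
    my ($pattern, $text) = @_;
    $text =~ s/$pattern/<$1>/;
    return $text;
}

my $input = '"foo" and "bar"';

my %result = (
    greedy_required => mark_capture(qr/"(.*)"/,   $input),  # <foo" and "bar>
    lazy_required   => mark_capture(qr/"(.*?)"/,  $input),  # <foo> and "bar"
    greedy_optional => mark_capture(qr/"(.*)"?/,  $input),  # <foo" and "bar">
    lazy_optional   => mark_capture(qr/"(.*?)"?/, $input),  # <>foo" and "bar"
);

printf "%-16s %s\n", $_, $result{$_} for sort keys %result;
```

With a required closing quote, the capture is bounded on the right in both cases. With an optional one, the greedy capture runs to the end of the string and the non-greedy capture is empty.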

And by the way, you don't need to escape the quotes in a regex.  \" can be just ".


s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: Matching quote characters?
created: 2006-08-05 15:19:41
my $string = '"foo"';
   $string =~ s/\"(.*?)\"?/$1/;

print $string;
But the .*? always gobbles up the trailing "
Actually it doesn't. Your non-greedy .*? matches as few characters as possible: Zero. Then, one or zero " follows. You can see it from the following snippet:
my $string = '"foo"';
   $string =~ s/"(.*?)"?/_$1_/; # note that you don't have to escape "

print $string; # prints __foo"
For a solution to your problem, see the other posts.
Re: Matching quote characters?
created: 2006-08-06 18:02:48
The object is to get the shortest possible string that starts with " and ends with either " or the end of the sample. So:
while (<DATA>) {
    chomp;
    print "$1\n" while m/(".*?(?:"|$))/g;
}

__DATA__
abc
"def
"ghi"
jkl"
mno"pqr"stu
v"wx"y"z"
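
The same pattern can also be tried without a __DATA__ section. This is only a sketch: the helper name quoted_chunks is hypothetical, and the sample lines are copied from the data above.

```perl
use strict;
use warnings;

# Hypothetical helper: return every quoted chunk found in one line,
# where a chunk ends at the next quote or at the end of the line.
sub quoted_chunks {
    my ($line) = @_;
    return $line =~ m/(".*?(?:"|$))/g;
}

my @lines = ('abc', '"def', '"ghi"', 'jkl"', 'mno"pqr"stu', 'v"wx"y"z"');

for my $line (@lines) {
    my @chunks = quoted_chunks($line);
    print "$line => ", (@chunks ? join(' ', @chunks) : '(no match)'), "\n";
}
```

Note that a line such as jkl" yields a lone quote, since the chunk starts at the quote and ends immediately at the end of the line.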

perlmonks.org content © perlmonks.org and Anonymous Monk, betterworld, Leviathan, liverpole, rodion, TedPride
