Understanding alternation
Anonymous Monk
created: 2006-02-03 09:23:29
Hi, being new to Perl, I'm struggling to find a simple explanation of how something like the following code works.
$text = "The dog is black cat is white and the fox does not like the cow or pig";
if ($text =~ /(dog|cow|pig)/) { print "match found" }
Does the match quit once it finds the first occurrence of one of the alternatives? E.g. in this case it finds a match on dog, so it exits.
Then, if I add /g:
/(dog|cat|fox|cow|pig)/g
Will that then try to match each alternative in the string and then return?
What if I wanted to build a regular expression that continues to search the string for the other alternatives even after the first is matched? E.g. I may wish to split the above string up based on the alternatives provided, say by putting a newline just before each place a match is found, so I end up with the following output:
The dog is black
cat is white and the
fox does not like the
cow or
pig
Can I do that using the alternation (dog|cat|fox|cow|pig)?
Re: Understanding alternation
g0n
created: 2006-02-03 10:06:18
'Does the match quit once it finds the first occurrence of one of the alternatives?'

Yes.
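
A minimal sketch of that behaviour, using the string from the question. Without /g the match stops at the leftmost place in the string where any alternative fits, so reordering the alternatives doesn't change which word is found here. The variable names ($word, $offset, $first) are only for illustration.

```perl
use strict;
use warnings;

my $text = "The dog is black cat is white and the fox does not like the cow or pig";

# Without /g the engine stops at the first (leftmost) successful match.
my ($word) = $text =~ /(dog|cow|pig)/;
my $offset = $-[0];    # start offset of that match within $text
print "matched '$word' at offset $offset\n";

# Listing the alternatives in another order gives the same word,
# because "dog" still occurs earliest in the string.
my ($first) = $text =~ /(pig|cow|dog)/;
print "first match with reordered alternatives: $first\n";
```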

'Will that then try and match each alternative in the string and then return?'

Sort of. It will keep looking through the string, even after it's found a match, until it doesn't find any more matches. It's a fine distinction, but it doesn't look through the string for the first alternative, then look through again for the second, etc., AFAIK. Note that in the context you are using it (the condition of an if), it will just return true, not a list of matches.
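
To make the single left-to-right pass visible, here is a small sketch that uses /g in scalar context inside a while loop, so each iteration picks up where the previous match ended. The @found array is just an illustrative name.

```perl
use strict;
use warnings;

my $text = "The dog is black cat is white and the fox does not like the cow or pig";

my @found;
# In scalar context, each /g match resumes at pos($text), the position
# where the previous match ended, so the string is scanned only once.
while ($text =~ /(dog|cat|fox|cow|pig)/g) {
    push @found, $1;
    print "found '$1', next search starts at offset ", pos($text), "\n";
}

# The words come back in string order, not in the order they are
# listed inside the alternation.
print "in order: @found\n";
```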

'Can I do that using the alternation ..'

Yes, like this:

$text = "The dog is black cat is white and the fox does not like the cow or pig";
$text =~ s/(dog|cat|fox|cow|pig)/\n$1/g;
print $text;

($1 contains the value matched by the pattern, so you are replacing each match with itself, prepended by \n).

--------------------------------------------------------------

"If there is such a phenomenon as absolute evil, it consists in treating another human being as a thing."
John Brunner, "The Shockwave Rider".

Re^2: Understanding alternation
created: 2006-02-03 10:15:01
I'd throw a couple of \b's in there, so it only matches on word boundaries. eg:
$text =~ s/\b(dog|cat|fox|cow|pig)\b/\n$1/g;
Otherwise, you'd match stuff like "cowardly" and "pigheaded" ;)
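A quick sketch of the difference, assuming a made-up sample sentence that contains those longer words ($sample, $loose and $bounded are illustrative names only):

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen to include "cowardly" and "pigheaded".
my $sample = "The cowardly dog chased the pigheaded pig";

# Without \b, "cow" inside "cowardly" and "pig" inside "pigheaded" also match.
(my $loose = $sample) =~ s/(dog|cat|fox|cow|pig)/\n$1/g;

# With \b on both sides, only whole words match.
(my $bounded = $sample) =~ s/\b(dog|cat|fox|cow|pig)\b/\n$1/g;

print "without \\b:\n$loose\n\n";
print "with \\b:\n$bounded\n";
```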
Re^2: Understanding alternation
created: 2006-02-03 10:38:08
Ok seems fairly straightforward thanks for the help
Re: Understanding alternation
created: 2006-02-03 10:47:36

For the sake of performance, you should switch REGEX alternation with short-circuit alternation:

my $text = 'The dog is black cat is white and the fox does not like the cow or pig';
print 'match found', "\n" if ( $text =~ /dog/ || /cow/ || /pig/ );

This works fine for simple patterns like the ones you're using. The Camel book has a nice explanation of that in the Common Practices chapter.

Alceu Rodrigues de Freitas Junior
---------------------------------
"You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill
Re^2: Understanding alternation
created: 2006-02-03 11:01:05
[glasswalk3r],
Your code doesn't do what you think it does. You are saying: if $text contains dog, or $_ contains cow, or $_ contains pig. This can be solved in one of two ways.
local $_ = 'The dog is black cat is white and the fox does not like the cow or pig';
print "match found\n" if /dog/ || /cow/ || /pig/;

# or by explicitly stating
print "match found\n" if $text =~ /dog/ || $text =~ /cow/ || $text =~ /pig/;
It is worth noting that alternation in regexen can be expensive but [demerphq]'s patch to bleed perl (and hopefully the recently released 5.8.8) can make it much less so.
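
A small sketch of the binding problem described above: =~ binds tighter than ||, so only the first pattern is applied to $text. The contents of $_ and the /zebra/ pattern are assumptions picked purely to make the two forms disagree.

```perl
use strict;
use warnings;

my $text = 'The dog is black cat is white and the fox does not like the cow or pig';

# Illustrative value for $_; it deliberately contains none of the animals.
local $_ = 'nothing relevant here';

# Only /zebra/ is bound to $text; the bare /pig/ is tested against $_.
my $implicit = ($text =~ /zebra/ || /pig/) ? 1 : 0;

# Here both patterns are bound to $text explicitly.
my $explicit = ($text =~ /zebra/ || $text =~ /pig/) ? 1 : 0;

print "implicit form matched: $implicit\n";
print "explicit form matched: $explicit\n";
```

The first form fails even though $text contains "pig", because the second pattern never looks at $text.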

Cheers - [Limbic~Region|L~R]

Re: Understanding alternation
created: 2006-02-03 10:57:42
At a slight tangent from your question.

In list context, a regex with /g returns all the matches.

#!/bin/perl5

use strict;
use warnings;

my $text = "The dog is black cat is white and the fox does not like the cow or pig";
my (@array) = $text =~ /(cow|dog|pig)/g;
print "@array\n"; 

__DATA__
---------- Capture Output ----------
> "C:\Perl\bin\perl.exe" _new.pl
dog cow pig

> Terminated with exit code 0.

perlmonks.org content © perlmonks.org and Anonymous Monk, g0n, glasswalk3r, Limbic~Region, McDarren, wfsp
