my $string = "abcdefghi";
my @fields = split /(?=.{3})/, $string;
They expected this to mean "split $string at every location that is followed by three characters (and then skip ahead three characters!)", but what it really means is "split $string at every location that is followed by three characters". They ended up getting ("a", "b", "c", "d", "e", "f", "ghi").
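To see why the fields come out as single characters, it can help to list the offsets at which the zero-width lookahead succeeds. This is only a minimal sketch; the names @offsets and @fields are illustrative and not part of the original post.

```perl
use strict;
use warnings;

my $string = "abcdefghi";

# Record every offset at which the lookahead (?=.{3}) matches.
# $-[0] is the start offset of the most recent successful match.
my @offsets;
while ($string =~ /(?=.{3})/g) {
    push @offsets, $-[0];
}
print "@offsets\n";    # 0 1 2 3 4 5 6

# split cuts at each of those offsets except 0, because a zero-width
# match at the very start of the string produces no leading empty field.
my @fields = split /(?=.{3})/, $string;
print "@fields\n";     # a b c d e f ghi
```

Nothing in the pattern consumes characters, so the next attempt starts one character further along rather than three.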
So how can you use split() to do this? Someone said "Couldn't you abuse \G?", and that reminded me of the internal assignment to $_ of the string being matched against, and the resulting use of pos()! I present:
my @fields = split /(?(?{pos() % 3})(?!))/, $string;
Another trick that works is to capture the split characters, which also places them in @fields and makes pos advance beyond them. Since every group except possibly the last is consumed by the separator match, the ordinary split fields are mostly empty strings, so we need to filter out false elements with grep:
my $string = join '', 'a'..'z';
my @fields = grep {$_} split /(.{3})/, $string;
print "@fields\n";
__END__
abc def ghi jkl mno pqr stu vwx yz
After Compline,
Zaxo
Yep, I tried with defined first because that's the way I thought it worked, too. With that, my result is:
 abc  def  ghi  jkl  mno  pqr  stu  vwx yz
Note the extra spaces, indicating that there are defined empty strings instead of undef values in those positions.
After Compline,
Zaxo
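The difference between filtering on truth and filtering on emptiness can be sketched as follows. The input "abc0" is a made-up example, chosen so that the trailing chunk is the string "0", which Perl treats as false; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $string = "abc0";

# The capturing group returns the separator itself, interleaved with
# the (mostly empty) ordinary fields.
my @raw = split /(.{3})/, $string;     # ("", "abc", "0")

# Filtering on truth drops the empty strings, but also the chunk "0".
my @truthy = grep { $_ } @raw;         # ("abc")

# Filtering on length drops only the empty strings.
my @nonempty = grep { length } @raw;   # ("abc", "0")

print scalar(@raw), " raw, ", scalar(@truthy), " true, ",
      scalar(@nonempty), " non-empty\n";
```

For the a..z input used above both filters give the same list, so this only matters when a trailing short chunk can be "0".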
I find unpack more suitable for this task.
print for unpack '(A3)*', "abcdefghi";
abc def ghi
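One detail worth keeping in mind with this approach: the A format strips trailing whitespace from each chunk, while the lowercase a format returns the bytes unchanged. A small sketch, using a made-up input string and illustrative variable names:

```perl
use strict;
use warnings;

my $string = "ab cd  e";    # 8 characters: "ab ", "cd ", " e"

# 'A3' strips trailing spaces (and NULs) from every chunk.
my @stripped = unpack '(A3)*', $string;
# 'a3' keeps each chunk exactly as it appears in the string.
my @kept     = unpack '(a3)*', $string;

print join('|', @stripped), "\n";   # ab|cd| e
print join('|', @kept), "\n";       # ab |cd | e
```

Either way the final chunk is simply shorter when the length is not a multiple of three.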
Yep, I know. I remember it being added.
I also remember it from the last time you told me.
And the time before that.
So, what is your point?
I know, I know. You're just "expanding knowledge".
Perhaps you should also consider adding footnotes to all your posts that use or recommend other features that have not been around forever? Like say, the 3-arg open; or even hashes?
(*) For the pedantic, 3 years, 8 months, 16 days 4 hours (approx. at the time of posting).
See Re: Version, version, why change the version..
So now I'm gonna ask you the same question. What is your point?
Are you seriously suggesting that no post on PM can mention the use of a 5.8.x feature?
Or that if they are mentioned, then the post must also duplicate the deltas and give a history of each feature's inception?
my $string = "abcdefghi";
my @fields = $string =~ /.{1,3}/g;
Hmm, $string =~ /.{1,3}/g should even be faster than split /(?(?{pos() % 3})(?!))/, $string.
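A caveat for the match-based version: without the /s modifier, . does not match a newline, so newlines in the input are silently skipped instead of ending up in a chunk. A minimal sketch with a made-up input:

```perl
use strict;
use warnings;

my $string = "ab\ncdef";

# Without /s the newline is never matched, so it disappears.
my @without_s = $string =~ /.{1,3}/g;     # "ab", "cde", "f"

# With /s the newline counts as an ordinary character.
my @with_s    = $string =~ /.{1,3}/gs;    # "ab\n", "cde", "f"

print length(join '', @without_s), "\n";  # 6
print length(join '', @with_s), "\n";     # 7
```

If the strings may contain newlines, adding /s keeps the chunks a faithful partition of the input.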
I hereby propose that we patch split such that its first argument, if it's a reference to an integer, will split the string into chunks of characters, each with as many chars as that integer (except the last, of course). Come on! Who's with me? :-)
(for the humor impaired, I'm not being serious)
I'm not sure split is the right choice for extracting fixed-length substrings. Isn't that really what substr is for (I mean, if you don't want to use unpack)?
sub split_len {
    ## split_len( $chars, $string[, $limit] )
    ## - splits $string into chunks of $chars chars
    ## - limits number of segments returned to $limit, if provided
    my ($chars, $string, $limit) = @_;
    my ($i, @result);
    for ($i = 0; ($i+$chars) < length($string); $i+=$chars) {
        last if (defined $limit && @result >= $limit);
        push @result, substr($string, $i, $chars);
    }
    # deal with any short remainders
    # (this also picks up a full-length final chunk)
    return @result if (defined $limit && @result >= $limit);
    if ($i < length($string)) {
        push @result, substr($string, $i);
    }
    return @result;
}
# deal with any short remainders
substr does it for us: "If OFFSET and LENGTH specify a substring that is partly outside the string, only the part within the string is returned". This is my version. Doesn't implement $limit (nor parameter checking) but features $start:
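sub split_len {
    my ($str, $start, $len) = @_;
    my @ret;
    # strictly less than: stopping before $strlen avoids a trailing
    # empty string when the remaining length is a multiple of $len
    for (my $strlen = length $str; $start < $strlen; $start += $len) {
        push @ret, substr $str, $start, $len;
    }
    return @ret;
}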
my $c = join '', 'a'..'z';
print "@{[ split_len $c, 0, 3 ]}\n";
print "@{[ split_len $c, 0, 4 ]}\n";
print "@{[ split_len $c, 3, 4 ]}\n";
__END__
abc def ghi jkl mno pqr stu vwx yz
abcd efgh ijkl mnop qrst uvwx yz
defg hijk lmno pqrs tuvw xyz
--
David Serrano
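For comparison, the two substr-based ideas in this subthread ($limit and $start) could be combined roughly as below. This is only a sketch under stated assumptions: the name chunk_string, its argument order, and the check on $len are hypothetical and do not come from either post.

```perl
use strict;
use warnings;

# chunk_string( $str, $len [, $start [, $limit ]] )   (hypothetical helper)
# Returns chunks of $len characters beginning at offset $start,
# and at most $limit chunks if $limit is given.
sub chunk_string {
    my ($str, $len, $start, $limit) = @_;
    $start = 0 unless defined $start;
    # Assumption: reject a non-positive $len, which would loop forever.
    die "chunk length must be a positive integer\n"
        unless defined $len && $len =~ /^[1-9][0-9]*$/;

    my @chunks;
    for (my $pos = $start; $pos < length($str); $pos += $len) {
        last if defined $limit && @chunks >= $limit;
        # substr returns only the part inside the string, so a short
        # final chunk needs no special handling.
        push @chunks, substr($str, $pos, $len);
    }
    return @chunks;
}

my $c = join '', 'a'..'z';
print join(' ', chunk_string("abcdefghi", 3)), "\n";        # abc def ghi
print join(' ', chunk_string($c, 4, 3)), "\n";              # defg hijk lmno pqrs tuvw xyz
print join(' ', chunk_string("abcdefghi", 3, 0, 2)), "\n";  # abc def
```

Unlike split's LIMIT, the limit here discards the rest of the string rather than returning it as the last element.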
perlmonks.org content © perlmonks.org and ambrus, BrowserUk, chibiryuu, duff, Hue-Bond, ikegami, japhy, radiantmatrix, Roy Johnson, Zaxo