I don't know how useful this will be to anyone. It's rather obscure. perldoc -f split says:
As a special case, specifying a PATTERN of space (' ') will split on white space just as "split" with no arguments does. Thus, "split(' ')" can be used to emulate awk's default behavior, whereas "split(/ /)" will give you as many null initial fields as there are leading spaces. A "split" on "/\s+/" is like a "split(' ')" except that any leading whitespace produces a null first field. A "split" with no arguments really does a "split(' ', $_)" internally.
That reads like stereo instructions. I think I understand it, but it appears that B::Deparse is having problems with it:
#!/usr/bin/perl -w
$string = ' abc def ';
@array1 = split " ", $string;
@array2 = split ?\s+?, $string;
print '2: ' . ( join '|', @array1 ) . "\n";
print '3: ' . ( join '|', @array2 ) . "\n";
Run it through deparse and you get:
BEGIN { $^W = 1; }
$string = ' abc def ';
@array1 = split(?\s+?, $string, 0);
@array2 = split(?\s+?, $string, 0);
print '2: ' . join('|', @array1) . "\n";
print '3: ' . join('|', @array2) . "\n";
Run this script and you get the following output:
2: abc|def
3: |abc|def
This caused me to trust Deparse a little less. I'm going to be more careful about relying on what it spits out.
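For reference, the three cases that the perldoc passage distinguishes can be sketched as below. This is only an illustration: the sample string (two leading spaces, one trailing) and the variable names are made up for the example and are not from the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Illustrative input: two leading spaces and one trailing space.
my $string = '  abc def ';

my @awk    = split ' ',   $string;   # special case: leading whitespace is skipped
my @regex  = split /\s+/, $string;   # leading whitespace yields one empty first field
my @single = split / /,   $string;   # one empty field per leading space

print 'split " ":   ', join('|', @awk),    "\n";   # abc|def
print 'split /\s+/: ', join('|', @regex),  "\n";   # |abc|def
print 'split / /:   ', join('|', @single), "\n";   # ||abc|def
```

In all three cases the trailing empty field is dropped, because no LIMIT is given.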
C:\>perl -MO=Concise -e"@f = split ' ', shift"
9  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
8     <@> split[t5] vK ->9
3        </> pushre(/"\\s+"/ => @f) s*/64 ->4
6        <1> shift sK/1 ->7
5           <1> rv2av[t4] sKRM/1 ->6
4              <#> gv[*ARGV] s ->5
7        <$> const[IV 0] s ->8
-e syntax OK

C:\>perl -MO=Concise -e"@f = split /\s+/, shift"
9  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
8     <@> split[t5] vK ->9
3        </> pushre(/"\\s+"/ => @f) s/64 ->4
6        <1> shift sK/1 ->7
5           <1> rv2av[t4] sKRM/1 ->6
4              <#> gv[*ARGV] s ->5
7        <$> const[IV 0] s ->8
-e syntax OK

C:\>perl -MO=Concise -e"@f = split ?\s+?, shift"
9  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
8     <@> split[t5] vK ->9
3        </> pushre(/"\\s+"/ => @f) s/64 ->4
6        <1> shift sK/1 ->7
5           <1> rv2av[t4] sKRM/1 ->6
4              <#> gv[*ARGV] s ->5
7        <$> const[IV 0] s ->8
-e syntax OK

B::Deparse cautions you that the output might not be what you expect (or that it might be a bug, which you should report), but B::Concise seems to agree that there's no difference (but don't take my word for it, it might just be a similar bug).
Thanks. I didn't know about Concise.
The only difference I can see in these two is the asterisk (*):
</> pushre(/"\\s+"/ => @f) s*/64 ->4 (split " ")
</> pushre(/"\\s+"/ => @f) s/64 ->4 (split ?\s+?)
The most I've been able to find out is that the '*' means 'do something weird for this op' and a reference to op.h, which says 'On pushre, re is /\s+/ imp. by split " "'. We already know this.
So this is gonna have to be one of those cases where I'm just gonna have to accept it as a peculiarity of perl. This is not a big issue as I won't be running into this any time soon again.
Also, the magic only happens when assigning to an array. When split is used in an implied context or assigned to a scalar, the " " is *not* converted to ?\s+?.
Well, Concise doesn't even distinguish between
perl -MO=Concise -we 'warn "foo"=~/f./g'
and
perl -MO=Concise -we 'warn "foo"=~/f./'
so I wouldn't trust it so much.
See what happens when you put that into a list context:
C:\MattPietrek>perl -MO=Concise -e"'foofoo' =~ /f../"
5  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
4     </> match(/"f.."/) vKS/RTIME ->5
3        <$> const[PV "foofoo"] s ->4
-e syntax OK

C:\MattPietrek>perl -MO=Concise -e"'foofoo' =~ /f../g"
5  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
4     </> match(/"f.."/) vKS/RTIME ->5
3        <$> const[PV "foofoo"] s ->4
-e syntax OK

C:\MattPietrek>perl -MO=Concise -e"() = 'foofoo' =~ /f../g"
8  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
7     <2> aassign[t1] vKS/COMMON ->8
-        <1> ex-list lK ->6
3           <0> pushmark s ->4
5           </> match(/"f.."/) lKS/RTIME ->6
4              <$> const[PV "foofoo"] s ->5
-        <1> ex-list lK ->7
6           <0> pushmark s ->7
-           <0> stub lPRM* ->-
-e syntax OK
Well, Deparse doesn't seem to work on my notebook at the moment for this script, so I can't check it and this may be totally irrelevant, but are you sure you want to use ?\s+? instead of /\s+/? Maybe it doesn't matter in split(), but from perldoc perlop:
?PATTERN?
This is just like the "/pattern/" search, except that it matches only once between calls to the reset() operator. This is a useful optimization when you want to see only the first occurrence of something in each file of a set of files, for instance. Only "??" patterns local to the current package are reset.
I have to admit I've never used this, but it does sound a little scary.
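A small sketch of the match-once behaviour described in that perlop passage, in case it helps. The helper name first_foo and the sample list are hypothetical. The pattern is written as m?foo? here because the bare ?PATTERN? form was deprecated in Perl 5.14 and removed in 5.22, while m?PATTERN? works on older and newer perls alike.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the elements matched by a one-shot pattern.
# The m?foo? op remembers that it has matched until reset() is called.
sub first_foo { return grep { $_ =~ m?foo? } @_ }

my @lines = ('foo 1', 'bar', 'foo 2');

my @first  = first_foo(@lines);   # ('foo 1'): the pattern fires only once
my @second = first_foo(@lines);   # empty: same op, already used up
reset;                            # re-arm m?? patterns in the current package
my @third  = first_foo(@lines);   # ('foo 1') again

print "first:  @first\n";
print "second: @second\n";
print "third:  @third\n";
```

The second call returning nothing is the "scary" part: the state lives in the match op itself, not in the data being matched.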
One of the marks of a good obfuscation is one that runs fine as coded, but fails to run in its B::Deparse'd version. B::Deparse isn't perfect. Only perl can 100% reliably parse Perl. And even then it's probably only 99.99999% reliable. ;)
Dave
Deparse doesn't parse perl either. It generates perl. Every time Deparse fails to produce source code that compiles back to the same thing, that's a bug. There are no obfuscations that should be undeparseable.
⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
You might as well be. To deparse that, you'd want B::Deparse to be able to write to a file instead of just STDOUT.
⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
Yeah, I suppose you're right that that can be worked around. How about a more extreme example then?
BEGIN { undef %:: if %O:: }
I don't think there's any way to deparse that.
Agreed. Several modules depend on the proper functioning of B::Deparse. If there is a difference, you should report the bug with perlbug. Please provide the test scripts so that the problem can be investigated.
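One possible shape for such a test script is a round trip: deparse a sub, compile the generated text again, and compare the two on the same input. This is only a sketch under assumptions. The variable names and sample input are invented for illustration, it uses B::Deparse's coderef2text method, and whether the two results agree depends on the perl and B::Deparse versions in use.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B::Deparse;

# The construct under suspicion: split on the special-cased ' ' pattern.
my $orig = sub { my @f = split ' ', $_[0]; return join '|', @f };

# coderef2text returns the sub body as a block of Perl source.
my $text = B::Deparse->new->coderef2text($orig);

# Compile the generated source back into a sub.
my $copy = eval "sub $text"
    or die "deparsed code failed to compile: $@";

my $in = ' abc def ';
printf "original: %s\n", $orig->($in);
printf "deparsed: %s\n", $copy->($in);
print $orig->($in) eq $copy->($in)
    ? "round trip agrees\n"
    : "round trip DIFFERS\n";
```

If the last line reports a difference, the script plus the output of perl -V would be a reasonable starting point for a perlbug report.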
perlmonks.org content © perlmonks.org and ambrus, BrowserUk, davido, diotalevi, harleypig, Jenda, PodMaster, Steve_p, truedfx