Deparse isn't as reliable as I thought
harleypig
created: 2006-03-03 16:47:17

I don't know how useful this will be to anyone. It's rather obscure. perldoc -f split says:

As a special case, specifying a PATTERN of space (' ') will split on white space just as "split" with no arguments does. Thus, "split(' ')" can be used to emulate awk's default behavior, whereas "split(/ /)" will give you as many null initial fields as there are leading spaces. A "split" on "/\s+/" is like a "split(' ')" except that any leading whitespace produces a null first field. A "split" with no arguments really does a "split(' ', $_)" internally.
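As a minimal sketch of the three cases that passage describes, the snippet below splits one string three ways. The variable names are made up for illustration; the input is the same string used in the script further down.

```perl
use strict;
use warnings;

# Three leading, three separating and three trailing spaces.
my $string = '   abc   def   ';

my @awk    = split ' ',   $string;   # awk-style: leading whitespace is skipped
my @single = split / /,   $string;   # one null field per leading space
my @regex  = split /\s+/, $string;   # a single null first field

# Trailing empty fields are dropped in all three cases.
print "split ' '   : ", join('|', @awk),    "\n";   # abc|def
print "split / /   : ", join('|', @single), "\n";   # |||abc|||def
print "split /\\s+/ : ", join('|', @regex),  "\n";   # |abc|def
```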

That reads like stereo instructions. I think I understand it but it appears that B::Deparse is having problems with it:

#!/usr/bin/perl -w

$string = '   abc   def   ';

@array1 = split " ", $string;

@array2 = split ?\s+?, $string;

print '2: ' . ( join '|', @array1 ) . "\n";

print '3: ' . ( join '|', @array2 ) . "\n";

Run it through deparse and you get:

BEGIN { $^W = 1; }
$string = '   abc   def   ';
@array1 = split(?\s+?, $string, 0);
@array2 = split(?\s+?, $string, 0);
print '2: ' . join('|', @array1) . "\n";
print '3: ' . join('|', @array2) . "\n";

Run the original script (not the deparsed version) and you get the following output:

2: abc|def
3: |abc|def

The deparsed code shows the same split for both arrays, yet the results differ. I'm going to be a little more careful about trusting what Deparse spits out.

Harley J Pig
Re: Deparse isn't as reliable as I thought
created: 2006-03-03 17:37:09
Not exactly :)
C:\>perl -MO=Concise -e"@f = split ' ', shift"
9  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
8     <@> split[t5] vK ->9
3         pushre(/"\\s+"/ => @f) s*/64 ->4
6        <1> shift sK/1 ->7
5           <1> rv2av[t4] sKRM/1 ->6
4              <#> gv[*ARGV] s ->5
7        <$> const[IV 0] s ->8
-e syntax OK

C:\>perl -MO=Concise -e"@f = split /\s+/, shift"
9  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
8     <@> split[t5] vK ->9
3         pushre(/"\\s+"/ => @f) s/64 ->4
6        <1> shift sK/1 ->7
5           <1> rv2av[t4] sKRM/1 ->6
4              <#> gv[*ARGV] s ->5
7        <$> const[IV 0] s ->8
-e syntax OK

C:\>perl -MO=Concise -e"@f = split ?\s+?, shift"
9  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
8     <@> split[t5] vK ->9
3         pushre(/"\\s+"/ => @f) s/64 ->4
6        <1> shift sK/1 ->7
5           <1> rv2av[t4] sKRM/1 ->6
4              <#> gv[*ARGV] s ->5
7        <$> const[IV 0] s ->8
-e syntax OK
B::Deparse cautions you that the output might not be what you expect (or that it might be a bug, which you should report), but B::Concise seems to agree that there's no difference (but don't take my word for it, it might just be a similar bug).

Re^2: Deparse isn't as reliable as I thought
created: 2006-03-03 19:21:29

Thanks. I didn't know about Concise.

The only difference I can see in these two is the asterisk (*):

 pushre(/"\\s+"/ => @f) s*/64 ->4 (split " ")
 pushre(/"\\s+"/ => @f) s/64 ->4 (split ?\s+?)

The most I've been able to find out is that the '*' means 'do something weird for this op' and a reference to op.h, which says 'On pushre, re is /\s+/ imp. by split " "'. We already know this.

So this is gonna have to be one of those cases where I'm just gonna have to accept it as a peculiarity of perl. This is not a big issue as I won't be running into this any time soon again.

Also, the magic only happens when assigning to an array. When the split is used in an implied context or assigned to a scalar, the " " is *not* converted to ?\s+?.

Harley J Pig
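For anyone who wants to compare the two op trees without reading the full dumps, here is a rough sketch that captures B::Concise output in a string and keeps only the lines that mention split or pushre. The sub names are made up for illustration, /\s+/ stands in for ?\s+? because newer perls require the m?...? spelling, and the op names and flags printed depend heavily on the perl version, so no particular output is promised.

```perl
use strict;
use warnings;
use B::Concise ();

# Hypothetical subs: the same split written both ways.
sub split_awk   { my @f = split ' ',   $_[0]; return @f }
sub split_regex { my @f = split /\s+/, $_[0]; return @f }

# Render one sub with B::Concise into a buffer and return only the
# lines that mention the split or pushre ops.
sub concise_split_lines {
    my ($code) = @_;
    my $buf = '';
    B::Concise::walk_output(\$buf);              # capture instead of printing
    B::Concise::compile('-basic', $code)->();    # walk the sub's op tree
    return grep { /\b(?:split|pushre)\b/ } split /\n/, $buf;
}

print "split ' ':\n",    map { "  $_\n" } concise_split_lines(\&split_awk);
print "split /\\s+/:\n", map { "  $_\n" } concise_split_lines(\&split_regex);
```

Any difference in the flags column between the two listings is the kind of marker the asterisk above represents.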
Re^2: Deparse isn't as reliable as I thought
created: 2006-03-05 18:35:42

Well, Concise doesn't even distinguish between

perl -MO=Concise -we 'warn "foo"=~/f./g'
and
perl -MO=Concise -we 'warn "foo"=~/f./'
so I wouldn't trust it so much.
Re^3: Deparse isn't as reliable as I thought
created: 2006-03-05 18:52:56

See what happens when you put that into a list context:

C:\MattPietrek>perl -MO=Concise -e"'foofoo' =~ /f../"
5  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
4      match(/"f.."/) vKS/RTIME ->5
3        <$> const[PV "foofoo"] s ->4
-e syntax OK

C:\MattPietrek>perl -MO=Concise -e"'foofoo' =~ /f../g"
5  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
4      match(/"f.."/) vKS/RTIME ->5
3        <$> const[PV "foofoo"] s ->4
-e syntax OK

C:\MattPietrek>perl -MO=Concise -e"() = 'foofoo' =~ /f../g"
8  <@> leave[1 ref] vKP/REFC ->(end)
1     <0> enter ->2
2     <;> nextstate(main 1 -e:1) v ->3
7     <2> aassign[t1] vKS/COMMON ->8
-        <1> ex-list lK ->6
3           <0> pushmark s ->4
5            match(/"f.."/) lKS/RTIME ->6
4              <$> const[PV "foofoo"] s ->5
-        <1> ex-list lK ->7
6           <0> pushmark s ->7
-           <0> stub lPRM* ->-
-e syntax OK

Re: Deparse isn't as reliable as I thought
created: 2006-03-03 20:32:29

Well, Deparse doesn't seem to work on my notebook at the moment for this script, so I can't check it out and this may be totally irrelevant, but ... are you sure you want to use ?\s+? instead of /\s+/? Maybe it doesn't matter in split(), but ... from perldoc perlop:

?PATTERN?
This is just like the "/pattern/" search, except that it matches only once between calls to the reset() operator. This is a useful optimization when you want to see only the first occurrence of something in each file of a set of files, for instance. Only "??" patterns local to the current package are reset.
I have to admit I've never used this, but it does sound a little scary.
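A small sketch of that match-once behaviour, assuming a perl recent enough to need the m?...? spelling (the bare ?...? form was removed in later releases). The helper name first_foo is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the words accepted by one m?foo? operator.
sub first_foo {
    my @hits;
    for my $word (@_) {
        push @hits, $word if $word =~ m?foo?;   # succeeds once, then stays off
    }
    return @hits;
}

my @first  = first_foo(qw(foo bar foo));   # only the first "foo" matches
my @second = first_foo(qw(foo foo));       # already used up: nothing matches
reset;                                     # re-arm ?? searches in this package
my @third  = first_foo(qw(foo foo));       # matches once again

print scalar(@first), ' ', scalar(@second), ' ', scalar(@third), "\n";   # 1 0 1
```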

Re: Deparse isn't as reliable as I thought
created: 2006-03-04 00:33:16

One of the marks of a good obfuscation is one that runs fine as coded, but fails to run in its B::Deparse'd version. B::Deparse isn't perfect. Only perl can 100% reliably parse Perl. And even then it's probably only 99.99999% reliable. ;)


Dave

Re^2: Deparse isn't as reliable as I thought
created: 2006-03-04 01:03:02

Deparse doesn't parse perl either. It generates perl. Every time Deparse fails to produce source code that compiles back to the same thing, that's a bug. There are no obfuscations that should be undeparseable.

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊
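One way to test that claim for a given sub is a round trip: deparse it, compile the deparsed text, and check that both versions behave the same on some input. This is only a sketch. The names split_awk, split_regex and survives_round_trip are made up for illustration, /\s+/ stands in for ?\s+?, and the result for the split ' ' case depends on the perl and B::Deparse version in use.

```perl
use strict;
use warnings;
use B::Deparse;

# Hypothetical subs: the same split written both ways.
sub split_awk   { my @f = split ' ',   $_[0]; return @f }
sub split_regex { my @f = split /\s+/, $_[0]; return @f }

# Deparse a sub, recompile the text, and compare behaviour on one input.
sub survives_round_trip {
    my ($code, $input) = @_;
    my $body  = B::Deparse->new->coderef2text($code);   # "{ ... }" text
    my $again = eval "sub $body"
        or die "deparsed code did not compile: $@";
    return join('|', $code->($input)) eq join('|', $again->($input)) ? 1 : 0;
}

my $string = '   abc   def   ';
print "split ' '   round trip ok: ", survives_round_trip(\&split_awk,   $string), "\n";
print "split /\\s+/ round trip ok: ", survives_round_trip(\&split_regex, $string), "\n";
```

A 0 in either line would be a Deparse bug worth reporting.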

Re^3: Deparse isn't as reliable as I thought
created: 2006-03-04 04:17:38
What about BEGIN { close STDOUT; }? (No, I'm not entirely serious here.)
Re^4: Deparse isn't as reliable as I thought
created: 2006-03-04 11:29:03

You might as well be. To deparse that, you'd want B::Deparse to be able to write to a file instead of just STDOUT.

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re^5: Deparse isn't as reliable as I thought
created: 2006-03-04 14:36:14

Yeah, I suppose you're right that that can be worked around. How about a more extreme example then?

BEGIN { undef %:: if %O:: }

I don't think there's any way to deparse that.

Re^3: Deparse isn't as reliable as I thought
created: 2006-03-04 08:41:20

Agreed. Several modules depend on the proper functioning of B::Deparse. If there is a difference, you should report the bug with perlbug. Please provide the test scripts so this problem can be investigated.

perlmonks.org content © perlmonks.org and ambrus, BrowserUk, davido, diotalevi, harleypig, Jenda, PodMaster, Steve_p, truedfx
