Using a regex function from a hash
perlNinny
created: 2006-08-02 14:43:51
What I want is a function hash that when it finds a line with one thing in it, will work on that line with the function its associated with. See the code below for an example:
#!/bin/perl

my @view = ("ignore","ignore","dont\\ignore\\this.vms","ignore");
my %functions_hash = ( '\.vms' => 's/\///');

foreach my $data (@views)
{
   while (($key, $value) = each %functions_hash)
   {
      if( $data =~ /$key/ )
      {
          $data =~ $value;
          print "data is now $data\n";
      }
   }
}
Re: Using a regex function from a hash
created: 2006-08-02 14:48:34

Bad news: regexen and substitutions don't work that way.

Good news: if use an anonymous sub instead you can do what you want.

my %functions = ( "\\.vms" => sub { my $t = shift; $t =~ s{/}{}; $t } );

Then call the sub when you get a match.

$data = $value->( $data );
Re: Using a regex function from a hash
created: 2006-08-02 14:49:12
You could wrap the substitution in a code reference:

my %functions_hash = ( '\.vms' => sub { $_[0] =~ s{/}{} } );
...
if ( $data =~ /$key/ ) { 
    $value->( $data );
}
Re^2: Using a regex function from a hash
created: 2006-08-03 13:23:26
EXCELLENT. now a follow up. I am trying to get the word out of a line without much success. Here is some example lines:

Created by GeorgeO at the parker inn
date : 06-Aug-06:22.34.01 buncahstuffIdon'twant
Created by 7of9 at the parker inn
date : 06-Jan-06:22.34.01 buncahstuffIdon'twant
Please get me:
GeorgeO
06-Aug-06:22.34.01
7of9
06-Jan-06:22.34.01

my %functions_hash = 
( '^Created by ' => sub { $_[0] =~ /^Created by (\w+)/; $_[0] = $1; },
'^date :' => sub ( $_[0] =~ /^data :(#HELP!!)/; $_[0] = $1
 );
...
if ( $data =~ /$key/ ) { 
    $value->( $data );
}
Re^3: Regex, extracting stuff from the middle of a line
created: 2006-08-03 15:13:01
Messed up my question on the regex thing. Here is some example lines:

Created by a.buncha.stuff.I.dont.want\GeorgeO at the parker inn
date : 06-Aug-06:22.34.01 buncahstuffIdon'twant
Created by a.buncha.stuff.I.dont.want\7of9 at the parker inn
date : 06-Jan-06:22.34.01 buncahstuffIdon'twant

Please get me:
GeorgeO
06-Aug-06:22.34.01
7of9
06-Jan-06:22.34.01

2006-08-03 Moved and retitled by GrandFather, as per Monastery guidelines
Original title: 'Regex, extracting stuff from the middle of a line'

Re: Re^3: Regex, extracting stuff from the middle of a line
created: 2006-08-03 15:28:04
You should escape the special '\' symbol:
m|\\(\w+)| and print $1, "\n";
For the date/time:
m|:\s+(\S+)| and print $1, "\n";
See [doc://perlre]


Re: Re^3: Regex, extracting stuff from the middle of a line
created: 2006-08-03 15:29:14

Your requirements are very vague.

  • You didn't specify which components are constant and which are not. I assumed "\" and "date : " are constant.
  • You didn't specify of which chars the variable parts could be made. I assumed what is represented by "GeorgeO" could contain spaces (which required specifying " at the parker inn"),
  • You didn't specufy from where the lines came. I assumed a file handle.
  • You didn't specify what to do with the extracted values. I print them.
while (<$fh>) {
   print("$1\n") if /\\(.*?) at the parker inn$/
                 or /^date : (\S+)/;
}
Re: Re^3: Regex, extracting stuff from the middle of a line
created: 2006-08-03 15:29:40

And your code that's not working looks like . . .?

Show some effort of your own before submitting work requests. See node 172086.

Update: Oop, didn't realize this was a followup to a prior question.

Probably because it was exceedingly vague and offered no context or indication otherwise, so the how-not-to-ask-ing's still relevant.

Re: Re^3: Regex, extracting stuff from the middle of a line
created: 2006-08-03 16:38:36

While ikegami said pretty much everything I wanted to say, I suggest that if you can come up with a set of explicit rules for what you want removed or, conversely, kept, the code will be easy (perhaps not the most elegant possible).

emc

Outside of a dog, a book is man's best friend. Inside of a dog it's too dark to read.

Groucho Marx
Re: Using a regex function from a hash
created: 2006-08-02 15:25:26
Hi [perlNinny],

Another thing you could do, if you only want to apply regex substitutions, is to make the values of the hash list references, where the first item in the list is the left side of the substitution, and the second item is the right side of the substitution.

Here's an example:

#!/usr/bin/perl -w

use strict;
use warnings;
                                                                            
my @views = (
    "ignore",
    "abbbbc",
    "mn123op",
    "ignore",
    "uvwxy",
);

my %functions_hash = (
    'ab+c$' => [ 'b+',    'x' ],
    'mn\d+' => [ '(\d+)', '' ],
    'uvwxy' => [ 'uvw',   'UVW' ],
);

foreach my $data (@views)
{
   while (my ($key, $pvalue) = each %functions_hash)
   {
      if ($data =~ /$key/)
      {
          my ($from, $to) = ($pvalue->[0], $pvalue->[1]);
          $data =~ s/$from/$to/;
          print "data is now $data\n";
      }
   }
}

Which gives these results:

data is now axc
data is now mnop
data is now UVWxy

s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re^2: Using a regex function from a hash
created: 2006-08-03 04:56:35
..or put another way..?
my @view = ("ignore","ignore","dont\\ignore\\this.vms","ignore");
my %functions_hash = ( "\.vms" => [ qr/\\/, "" ] );

foreach my $data (@view) {
   while ((my $key, my $value) = each %functions_hash) {
      if( $data =~ /$key/ ) {
          my ( $match, $substitute ) = @{$value};
          $data =~ s/$match/$substitute/g;
          print "data is now $data\n";
      }
   }
}
Re: Using a regex function from a hash
created: 2006-08-03 05:54:21
Because there is no built-in regexp-based lookup into a hash, the regexps might as well be paired with a reference to the code for each regexp, a.k.a a 'dispatch table':
my @dispatch = [ [ '\.vms',
                   sub { my $fnam = shift;
                         $fnam =~ s/\///g;
                         return $fnam; }
                 ]
               # ,[] ...
               ];

for ( my $v=0; $v <= $#view; $v++ ) {
    REGEXP: for ( my $d=0; $d <= $#dispatch; $d++ ) {
        if ( $view[$v] =~ /$dispatch[ $d ][0]/ ) {
            $view[$v]=&{$dispatch[$d][1]} ($view[$v]);
            last REGEXP;
        }
    } 
}
(updated to correct a typo - see reply below)

-M

Free your mind

Re^2: Using a regex function from a hash
created: 2006-08-03 13:34:24
This makes the most sense to me, but, I think this line:
           $view[$v]=&{$dispatch[$d][0]} ($view[$v]);
should probably be this
           $view[$v]=&{$dispatch[$d][1]} ($view[$v]);
Re^3: Using a regex function from a hash
created: 2006-08-04 11:09:17
absolutely - well spotted!

-M

Free your mind

Re: Using a regex function from a hash
created: 2006-08-03 16:58:06
In particular, have a look at the "Execute command" part.
#!/usr/bin/perl
#
# $Id: whatprog,v 1.2 1994/11/20 06:19:00 weingart Exp weingart $
#
# Whatnowproc for MH-6.8

# Get the command line arguments
push(@ARGV, split(' ', `mhparam whatnow`));

# Dispatch table
%dispatch = (
        'alias',        'alias_proc',
        'echo',         'echo_proc',
        'edit',         'edit_proc',
        'encrypt',      'encrypt_proc',
        'env',          'env_proc',
        'mime',         'mime_proc',
        'quit',         'quit_proc',
        'send',         'send_proc',
        'set',          'set_proc',
        'sign',         'sign_proc',
        'unalias',      'unalias_proc',
        'unset',        'unset_proc',
);

# Alias table
%aliases = (
);

# Variables table
%var = (
        'prompt',               '"Draft($message): "',
        'alias_level',  10,
);


# Mainline
{
        # Init variables
        foreach $key (keys %ENV){
                next if($key !~ m/^mh/);

                $var{$key} = $ENV{$key};
        }
        split(/\//, $ENV{'mhdraft'});
        $var{'message'} = pop(@_);

        # Read init file
        if(open(INIT, "$ENV{HOME}/.whatnowrc")){
                while(){
                        &do_command($_);
                }
                close(INIT);
        }

        # Command loop
        &prompt;
        while(<>){
                # Execute command
                &do_command($_);

                &prompt;
        }
        exit(0);
}


# Handle command
sub do_command {
        local($cmd) = $_[0];
        local(@cmd);

        # Massage line into list
        chop($cmd);
        @cmd = &do_token($cmd);
        return if($#cmd == -1);

        # Interpolate vars
        @cmd = &do_vars(@cmd);

        # Do aliases
        @cmd = &do_aliases(@cmd);

        # Execute cmd
        if(defined($dispatch{$cmd[0]})){
                &{ $dispatch{$cmd[0]} }(@cmd);
                print "$@\n" if($@);
        }else{
                print "Not finished yet\n";
        }
}

# Do aliases
sub do_aliases {
        local(@args) = @_;
        local($deep) = 0;

        while(defined($aliases{$args[0]}) && ($deep != $var{'alias_level'})){
                $args[0] = $aliases{$args[0]} if(defined($aliases{$args[0]}));
                $deep++;

                print "Infinite recursion...\n" if($deep == $var{'alias_level'});
        }
        if($deep >= $var{'alias_level'}){
                &prompt;
                next;
        }

        return(@args);
}


# Print out prompt
sub prompt {
        local($message);
        local($prompt);

        split('/', $ENV{'mhdraft'});
        $message = pop(@_);

        if(defined($var{'prompt'})){
                $prompt = eval("$var{'prompt'}");
                print "$prompt";
        }else{
                print "Draft $message> ";
        }
        flush;
}


# Unalias an alias
sub unalias_proc {
        local(@args) = @_;
        local($cmd);

        $cmd = shift(@args);
        $cmd = shift(@args);

        if(!defined($aliases{$cmd})){
                if($cmd !~ m/^\s*$/){
                        print "Alias $cmd does not exist!\n";
                }else{
                        print "Huh, say what?\n";
                }
        }else{
                delete($aliases{$cmd});
        }
}


# Alias some command
sub alias_proc {
        local(@args) = @_;
        local($cmd, $exp, $tmp);

        $cmd = shift(args);
        $cmd = shift(args);
        $exp = join(' ', @args);

        if(defined($dispatch{$cmd})){
                print "Can not alias that!\n";
                return;
        }

        if($exp !~ m/^\s*$/){
                $aliases{$cmd} = $exp;
        }else{
                foreach $tmp (keys %aliases){
                        print "$tmp\t->\t$aliases{$tmp}\n";
                }
        }
}


# Echo arguments
sub echo_proc {
        local(@args) = @_;

        shift(@args);
        print join(' ', @args);
        print "\n";
}


# Set a variable
sub set_proc {
        local(@args) = @_;
        local($tmp);

        if($#args == 0){
                foreach $tmp (keys %var){
                        print "$tmp = $var{$tmp}\n";
                }
        }else{
                $var{$args[1]} = $args[3];
        }
}


# Unset a variable
sub unset_proc {
        local(@args) = @_;
        local($tmp);

        return if($#args != 1);

        $tmp = $args[1];
        delete $var{$tmp} if(defined($var{$tmp}));
}


# Interpolate variables
sub do_vars {
        local(@args) = @_;
        local($tmp);

        foreach $tmp (@args){
                next if($tmp !~ m/^\$([a-zA-Z]\w*)/);

                if(!defined($var{$1})){
                        print "\$$1 is not defined.\n";
                }else{
                        $tmp =~ s/\$(\w+)/$var{"$1"}/;
                }
        }

        return(@args);
}


# Encrypt a document
sub encrypt_proc {
        print "Hang on sloopy!\n";
        print @_;
}


# Sign a document
sub sign_proc {
        print "Hang on sloopy!\n";
        print @_;
}


# Mime a document
sub mime_proc {
        local($mimeproc);
        local(@mimeproc);

        chop($mimeproc = `mhparam buildmimeproc`);
        chop($mimeproc = `mhparam automhnproc`) if($mimeproc eq '');
        @mimeproc = split(/\s+/, $mimeproc);
        system(@mimeproc, "$ENV{'mhdraft'}");
}


# Send a document
sub send_proc {
        local($sendproc);
        local(@sendproc);
        local($domime);

        chop($domime = `mhparam automimeproc`);
        &mime_proc if($domime eq '1');
        chop($sendproc = `mhparam sendproc`);
        @sendproc = split(/\s+/, $sendproc);
        system(@sendproc, "$ENV{'mhdraft'}");
}


# Edit a document
sub edit_proc {
        system("$ENV{'mheditor'}", "$ENV{'mhdraft'}");
}


# Print environment
sub env_proc {
        local($i);

        foreach $i (keys %ENV){
                next if($i !~ m/^mh/i);
                print "$i => $ENV{$i}\n";
        }
}


# Quit this
sub quit_proc {
        local(@args);
        local($tmp);

        $tmp = join(' ', @_);
        @args = split(/\s+/, $tmp);

        if(!grep(/^-nodel(ete)?/, @args)){
                $tmp = $ENV{'mhdraft'};
                $tmp =~ s|/(\d+)$|/,$1|;
                rename($ENV{'mhdraft'}, $tmp);
        }

        exit(0);
}


# Tokenize line
sub do_token {
        local($line) = $_[0];
        local(@match) = ();
        local(@what) = ();
        local($i, $tmp);


        for($i = 0; $line ne ''; $i++){

                # BLANK
                if($line =~ m/^(\s+)/){
                        $line = substr($line, length($1));
                }

                # WORD
                if($line =~ m/^(\w+)/){
                        $what[$i] = 'WORD';
                        $match[$i] = $1;
                        $line = substr($line, length($1));
                        next;
                }

                # VAR
                if($line =~ m/^(\$[a-zA-Z]\w*)/){
                        $what[$i] = 'VAR';
                        $match[$i] = $1;
                        $line = substr($line, length($1));
                        next;
                }

                # STRING
                if($line =~ m/^("[^"]*")/){
                        $what[$i] = 'STRING';
                        $match[$i] = $1;
                        $line = substr($line , length($1));
                        next;
                }

                # SPECIAL
                if($line =~ m/^([=])/){
                        $what[$i] = 'SPECIAL';
                        $match[$i] = $1;
                        $line = substr($line, length($1));
                        next;
                }

                # Comment
                if($line =~ m/^(#.*)/){
                        $line = substr($line, length($1));
                        next;
                }

                # ERROR
                if($line =~ m/^(.+)$/){
                        print "Found ERROR($1).\n";
                        $line = substr($line, length($1));
                        next;
                }

        }

        return(@match);
}

2006-08-04 [id://340870|Retitled] by [planetscape], as per Monastery [id://341118|guidelines] ( keep:0 edit:13 reap:0 )
Original title: 'I use something like this:'

Re: Using a regex function from a hash
created: 2006-08-04 16:40:42

Not to be picky, as the code is pretty clear (and complete!) but all those local() declarations on the functions shoould really be my() declarations, IMO... local() does not create a local variable... (Perl 5! check the docs)

best regards!

--
our $Perl6 is Fantastic;

Re^2: Using a regex function from a hash
created: 2006-08-11 15:46:49
Much of this was written quite some time ago... :) Yes, it needs a rewrite, but other things usually take precedence.. -T.
Re: Using a regex function from a hash
created: 2006-08-03 23:56:53
Your approach could be fine if you wanted to restrict the allowed functions only to simple replacements. With a few minor tweaks I get this:

use strict;

my @views = ("ignore","ignore","dont\\ignore\\this.vms","ignore");

my %regex_hash = (
	'\.vms' => [ qr/\\/ , '/' ],
   );

for my $data (@views) {
	for my $re_key ( grep { $data =~ /$_/ } keys %regex_hash ) {
		$data =~ s!$regex_hash{$re_key}[0]!$regex_hash{$re_key}[1]!g;
	}
}

print join "\n", @views;
and the result is:
ignore
ignore
dont/ignore/this.vms
In regex_hash the key is the regex (as a string) to match against the view data. The first entry in the value arrayref is the match portion of the substitution. The second entry in the value arrayref is the replace portion of the substitution.

You'd have to be careful of using ! (in this implementation) in the key or the second value because I used it for the substitution delimiter.

Becuase of this and the flexibility offered by a sub dispatch table you're probably better off going that way, but I thought I'd show you how close you were to making it work!

Update: Just realised that [liverpole] already pointed this out. Using qr as in my example is a good idea however - it is more efficient and will protect you against quoting issues. See [perlre] if you are unfamiliar with qr

perlmonks.org content © perlmonks.org and Anonymous Monk, aufflick, Excalibor, Fletch, friedo, ikegami, liverpole, monarch, Moron, perlNinny, sh1tn, swampyankee, weingart

prlmnks.org © 2006 edmund von der burg (eccles & toad)

v 0.03