<?xml version="1.0" encoding="UTF-8"?>



<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">

    <channel>
        <title>perlmeditation</title>
        <link>http://prlmnks.org/list/</link>
        <description>RSS feeds from perlmonks.org</description>
        <language>en</language>
        <ttl>5</ttl>

        

<item>
    <title>Polynomial JAPH revisited (tweetiepooh)</title>
    <link>http://prlmnks.org/html/580957.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580957.html</guid>

    <description>
        In my [id://577605|earlier submission] I showed a means of JAPHing using a polynomial generated by Excel.  Here now is some code to calculate the equation from a string all in Perl.&lt;p&gt;It works fine with the JAPH string up to 19 characters, above which I think the limitations on internal maths stuff break it.  It&#39;s not obfuscation at this point, and suggestions to get it to work with the full 25 characters in JAPH (including line feed) would be fun.  Note the code has been cobbled together on a Friday afternoon and I&#39;ve made no attempt to make it elegant.&lt;p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/local/bin/perl -w
use strict;
use Math::Polynomial;

my %h = ();
my $s = &quot;Just another Perl h&quot;;

# put string into a hash {0=&gt;J, 1=&gt;u ...)
map { $h{$_}=ord(substr($s,$_,1))}(0..length($s)-1);

# use the library to get the polynomial factors
my @p = split / /, Math::Polynomial::interpolate(%h);

# generate the calculation string from factors
my $p;
my $i = length($s)-1;
map{ $p .= &quot;+$_*\$i**$i&quot;;$i-- } @p;

# tidy the string, remove $i**0 (=1)
# add .5 to total for rounding purposes
$p =~ s/\+\-/\-/g;
$p =~ s/.*\(//;
$p =~ s/\).*/.5/;
print &quot;$p\n&quot;;

# test the equation
for $i (0..length($s)-1) {
    print chr eval $p;
}&lt;/pre&gt;&lt;b&gt;Update&lt;/b&gt;&lt;p&gt;Links and spelling corrected as per chatterbox
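        &lt;p&gt;One possible route to the full 25 characters, offered as a sketch under assumptions rather than as the original approach: the polynomial through the points (i, ord of character i) is unique, so it can also be written in Newton forward-difference form and evaluated with integer arithmetic only. The names @d and newton_eval are hypothetical, and Math::Polynomial is not used here.&lt;/p&gt;

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $s = "Just another Perl hacker\n";    # 25 characters, line feed included
my @y = map { ord } split //, $s;

# Forward differences at 0: $d[$k] is the k-th difference of the
# sequence of character codes. Only integer subtraction is involved.
my @d;
my @row = @y;
while (@row) {
    push @d, $row[0];
    @row = map { $row[$_ + 1] - $row[$_] } 0 .. $#row - 1;
}

# Evaluate f(x) = sum over k of d[k] * C(x, k) for a non-negative integer x.
sub newton_eval {
    my ($x, @diffs) = @_;
    my ($sum, $binom) = (0, 1);          # $binom holds C(x, k)
    for my $k (0 .. $#diffs) {
        $sum += $diffs[$k] * $binom;
        # C(x, k+1) = C(x, k) * (x - k) / (k + 1); the division is exact
        $binom = $binom * ($x - $k) / ($k + 1);
    }
    return $sum;
}

my $decoded = join '', map { chr( newton_eval($_, @d) ) } 0 .. length($s) - 1;
print $decoded;
```

        &lt;p&gt;For a 25-character ASCII string every intermediate value here is an integer far below 2**53, so no rounding term such as the .5 above is needed. The trade-off is that the result is a table of differences rather than a single expression in powers of $i.&lt;/p&gt;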
    </description>
</item>

        

<item>
    <title>iThreads for OOP (Zukoff)</title>
    <link>http://prlmnks.org/html/580843.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580843.html</guid>

    <description>
        I&#39;ve investigated what modules CPAN has for ithreads. Some of them are too old, and some don&#39;t work as well on Win32 as on *nix. I was impressed with: Perl Scalable ithreaded Component Helper Extensions &lt;a href=&quot;http://www.presicient.com/psiche/&quot;&gt;http://www.presicient.com/psiche/&lt;/a&gt;
    </description>
</item>

        

<item>
    <title>A faster?, safer, user transparent, shared variable &quot;locking&quot; mechanism. (BrowserUk)</title>
    <link>http://prlmnks.org/html/580692.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580692.html</guid>

    <description>
        &lt;p&gt;In perl, all user data is stored in scalars (SV*s). &lt;p&gt;What if the user-mutable shared data, the shared scalars, were allocated from a read-only memory section? &lt;p&gt;Every time the user attempted to write to them, an exception would be raised.&lt;p&gt;When an exception occurs, the operation, whatever it may be, is re-done within the auspices of a critical section, having temporarily set the section read-write. It is set back to read-only before the critical section ends. &lt;p&gt;As I understand it (on x86), the read-only/read-write status of a section of memory is an attribute of its &lt;a href=&quot;/out/http/?url=en.wikipedia.org%2Fwiki%2FLocal_Descriptor_Table&quot;&gt;LDT entry&lt;/a&gt;. These are read and written using single opcodes (SLDT/LLDT) and so the process of switching from read-only to read-write and back ought to be pretty quick.&lt;p&gt;The steps are:&lt;ol&gt;&lt;li&gt;enter exception handler.&lt;/li&gt;&lt;li&gt;enter critical section.&lt;/li&gt;&lt;li&gt;set the attributes of the shared data memory section to read-write.&lt;/li&gt;&lt;li&gt;reattempt the excepting operation (opcode?)&lt;/li&gt;&lt;li&gt;reset the shared data memory section read-only.&lt;/li&gt;&lt;li&gt;exit critical section&lt;/li&gt;&lt;li&gt;exit exception handler&lt;/li&gt;&lt;li&gt;continue with original program at the next operation.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;As a critical section cannot be preempted, this effectively locks the shared data without the need for semaphores.&lt;p&gt;The removal of semaphores from the picture vastly reduces the complexity of the problem, and vastly reduces the overhead of checking for locks. &lt;p&gt;The absence of the requirement for semaphores means that the same (existing) code for manipulating non-shared SV*s can also be used for shared SV*s, with no impact on performance for operations on non-shared data. &lt;p&gt;All locking becomes redundant. 
And serialisation of write accesses to shared data is dealt with entirely by a single mechanism that is controlled by the CPU itself, that of raising exceptions. &lt;p&gt;These will only occur when writing. And only when writing to shared data. &lt;p&gt;All other accesses are entirely free of any impact attributable to the provision of shared data.&lt;p&gt;The only other impact is that when variables are declared shared, they get allocated from the shared (most times, read-only) memory section.&lt;hr /&gt;&lt;ul&gt;&lt;li&gt;Can this work? &lt;/li&gt;&lt;li&gt;Would it be faster? &lt;/li&gt;&lt;li&gt;Could it be retro-fitted to perl 5? &lt;/li&gt;&lt;li&gt;Is there any merit in it for perl 6 (or Parrot or other implementation)?&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Win32/win64 certainly have all the primitives to implement this. I suspect, but do not know, that many/most other OSs (on x86 at least) have similar primitives in order to implement it. What about other platforms/hardware?&lt;p&gt;I do not have the requisite XS/perlguts skills to try it, or reach a conclusion on whether it could be done there. So I am throwing the idea out there to see what people think.&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-171588&quot;&gt;&lt;hr /&gt;&lt;font size=1 &gt;&lt;div&gt;Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.&lt;/div&gt;&lt;div&gt;Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?&lt;/div&gt;&lt;div&gt;&quot;Science is about questioning the status quo. Questioning authority&quot;. &lt;/div&gt;&lt;div&gt;In the absence of evidence, opinion is indistinguishable from prejudice.&lt;/div&gt;&lt;/font&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>OT: Improving IE7 SDK Docs (was Let&#39;s suppose...) (footpad)</title>
    <link>http://prlmnks.org/html/580686.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580686.html</guid>

    <description>
        &lt;p&gt;The recent release of IE7 has gotten me to thinking.  (Always a dangerous sign.)&lt;/p&gt;&lt;p&gt;Let&#39;s suppose you could set aside the political BS and actually talk to the documentation team for &lt;a href=&quot;http://msdn.microsoft.com/ie/&quot;&gt;Internet Explorer&#39;s SDK&lt;/a&gt;, the folks that document IE&#39;s support for HTML, CSS, and other browser idiosyncrasies.  Let&#39;s say you could tell them what you didn&#39;t like about the existing documentation and could tell them what would make the documentation &quot;pop.&quot;&lt;/p&gt;&lt;p&gt;Mind you, this isn&#39;t a request to bash IE; that&#39;s been done quite effectively elsewhere.  Instead, it&#39;s a serious attempt to solicit feedback from a community that has typically not been consulted with respect to this documentation set.&lt;/p&gt;&lt;p&gt;Nor is it a request for feature support in the next release of IE.  Yes, we all know it would be sweet if IE would support ACID2; however, today it doesn&#39;t and this isn&#39;t that sort of thread.  This is strictly a focus on the developer documentation.  What&#39;s confusing, what&#39;s missing, what would make that material a useful resource for you?&lt;/p&gt;&lt;p&gt;(And before you bash me for being off-topic, this does affect those of us using perl to write Web 1.0-style CGI scripts.  There is an audience that I feel has been left out of the discussion, and I wonder what feedback that community would provide if granted the opportunity.)&lt;/p&gt;&lt;p&gt;&lt;i&gt;--f&lt;/i&gt;&lt;/p&gt;
    </description>
</item>

        

<item>
    <title>Spot the bug! (tlm)</title>
    <link>http://prlmnks.org/html/580652.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580652.html</guid>

    <description>
        &lt;p&gt;Just got my butt chomped by this one, so I thought I&#39;d share.&lt;/p&gt;&lt;p&gt;What&#39;s wrong with this function?&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;sub foo {
  return eval { bar( shift ) } || 0;
}&lt;/pre&gt;The answer does not depend on the definition of &lt;tt class=&quot;inline_code&quot;&gt;bar&lt;/tt&gt;.  E.g. let &lt;tt class=&quot;inline_code&quot;&gt;bar&lt;/tt&gt; be&lt;pre class=&quot;block_code&quot;&gt;sub bar { return shift; }&lt;/pre&gt;&lt;spoiler&gt;To see what&#39;s wrong, try this:&lt;pre class=&quot;block_code&quot;&gt;print foo( 3 ), &quot;\n&quot;;
$@ = 3;
print foo( $@ ), &quot;\n&quot;;
__END__
3
0&lt;/pre&gt;The moral of the story: never use &lt;tt class=&quot;inline_code&quot;&gt;@_&lt;/tt&gt; inside an &lt;tt class=&quot;inline_code&quot;&gt;eval&lt;/tt&gt;.&lt;/spoiler&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-439528&quot;&gt;&lt;p&gt;&lt;small&gt;the lowliest monk&lt;/small&gt;&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;
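        &lt;p&gt;A minimal sketch of one way to sidestep the problem in the spoiler above, assuming the same trivial bar: copy the argument into a lexical before the eval block is entered, since eval resets $@ on entry and @_ aliases the caller&#39;s variables. The name foo_fixed is hypothetical.&lt;/p&gt;

```perl
use strict;
use warnings;

sub bar { return shift }

# Copy the argument first: eval clears $@ when it starts, and the
# elements of @_ are aliases to whatever the caller passed in.
sub foo_fixed {
    my ($arg) = @_;
    return eval { bar($arg) } || 0;
}

$@ = 3;
my $result = foo_fixed($@);
print "$result\n";    # prints 3
```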
    </description>
</item>

        

<item>
    <title>Using DBM::Deep and Parallel::ForkManager for a generalized parallel hashmap function builder (followup to &quot;reducing pain of parallelization with FP&quot;) (tphyahoo)</title>
    <link>http://prlmnks.org/html/580608.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580608.html</guid>

    <description>
        This is a followup to &lt;p&gt;&lt;a href=&quot;/html/579976.html&quot;&gt;Using functional programming to reduce the pain of parallel-execution programming (with threads, forks, or name your poison)&lt;/a&gt;&lt;p&gt;In that post, I was attempting to create numerous variations of a function builder I&#39;m calling &lt;tt class=&quot;inline_code&quot;&gt;hashmap_parallel&lt;/tt&gt;. &lt;p&gt;This function builder should theoretically allow me to implement, for example, many of the utilities in &lt;a href=&quot;/out/mod/List::MoreUtils&quot;&gt;List::MoreUtils&lt;/a&gt; in a transparently parallel way. Eg, instead of using &lt;tt class=&quot;inline_code&quot;&gt;flexygrep&lt;/tt&gt; to build up a parallel grepping function on top of a parallel mapping function and an element test, as shown below, I would use flexy_any, flexy_all, flexy_pairwise, flexy_mesh, or what have you. These could all build on top of diverse custom hashmap_parallel implementations, which you could plug in at will depending on your resources and the particular bottleneck for the computation you are faced with.&lt;p&gt;So, if you have multiple processors, you can take advantage of that without cluttering up your code. And if you happen to have a 1000 commodity-box cluster, perhaps you too will one day be able to &lt;a href=&quot;http://labs.google.com/papers/mapreduce-osdi04.pdf&quot;&gt;sort a 10^10 element array in under two minutes&lt;/a&gt;, with a simple call to &lt;tt class=&quot;inline_code&quot;&gt;distributed_sort($cmp_function, $my_big_list)&lt;/tt&gt;&lt;p&gt;Since posting the above meditation, I have been able to implement hashmap_parallel in two ways.&lt;p&gt;1) threads (with help from &lt;a href=&quot;/out/node/LanceDeeply&quot;&gt;LanceDeeply&lt;/a&gt;)&lt;br&gt;2) forks, along with multiple DBM::Deep data stores (using Parallel::ForkManager on &lt;a href=&quot;/out/node/tilly&quot;&gt;tilly&lt;/a&gt;&#39;s advice, using DBM::Deep because... well, I couldn&#39;t think of any other way to do it that would work for any type of data.)&lt;br&gt;&lt;p&gt;&lt;p&gt;In addition, I have a simple &quot;serial hashmap&quot; function that I can plug into my &quot;flexygrep&quot; grep builder, as a sanity check.&lt;p&gt;At present, the threading solution is faster. I&#39;m attributing this to the multiple hard drive read/writes that are required with the multiple DBM::Deep solution. &lt;p&gt;I was hoping to also implement a solution using a single DBM::Deep db and locking, which should at least reduce the read time. However, I was unable to do so. My test is to match a collection of letters against &lt;tt class=&quot;inline_code&quot;&gt;/[abc]/&lt;/tt&gt;, where this test is made after sleeping for one second (we delay so that there is an artificial &quot;reason&quot; to want to take advantage of parallelism). There should be 9 matches in each case. &lt;p&gt;&lt;pre class=&quot;block_code&quot;&gt;perl test_hashmap.pl
test strings: a b c a b c a b c
ok 1 - threads, matches: 9 -- a a b c c b b a c -- should be 9
ok 2 - threads, time elapsed: 2, wanted max seconds: 2
ok 3 - fork with many dbm::deep dbs, matches: 9 -- a a b c c b b a c -- should be 9
ok 4 - fork with many dbm::deep dbs, time elapsed: 1, wanted max seconds: 2
Use of uninitialized value in join or string at test_hashmap.pl line 48.
not ok 5 - fork with 1 dbm::deep db, matches: 7 -- a b b b  c c -- should be 9
#   Failed test &#39;fork with 1 dbm::deep db, matches: 7 -- a b b b  c c -- should be 9&#39;
#   in test_hashmap.pl at line 48.
ok 6 - fork with 1 dbm::deep db, time elapsed: 1, wanted max seconds: 2
test strings: a b c d e f g h i j k l m n o p q r s t u v w x y z a b c d e f g h i j k l m n o p q r s t u v w x y z a b c d e f g h i j k l m n o p q r s t u v w x y z
ok 7 - threads, matches: 9 -- a b a c c b c b a -- should be 9
ok 8 - threads, time elapsed: 2, wanted max seconds: 2
ok 9 - fork with many dbm::deep dbs, matches: 9 -- a b a c c b c b a -- should be 9
not ok 10 - fork with many dbm::deep dbs, time elapsed: 9, wanted max seconds: 2
#   Failed test &#39;fork with many dbm::deep dbs, time elapsed: 9, wanted max seconds: 2&#39;
#   in test_hashmap.pl at line 49.
not ok 11 - fork with 1 dbm::deep db, matches: 8 -- b c a c b c a b -- should be 9
#   Failed test &#39;fork with 1 dbm::deep db, matches: 8 -- b c a c b c a b -- should be 9&#39;
#   in test_hashmap.pl at line 48.
not ok 12 - fork with 1 dbm::deep db, time elapsed: 9, wanted max seconds: 2
#   Failed test &#39;fork with 1 dbm::deep db, time elapsed: 9, wanted max seconds: 2&#39;
#   in test_hashmap.pl at line 49.
1..12
# Looks like you failed 4 tests of 12.&lt;/pre&gt;&lt;p&gt;&lt;p&gt;In addition to the test failures for the single DBM::Deep file shown above, about 25% of the time the single DBM::Deep file case seems to lock up, and the test never completes. &lt;p&gt;I would be very interested in receiving advice for how to do hashmap_parallel with a single DBM::Deep. And if there are suggestions for other strategies I could try out, or changes to my function builders that would speed things up, I&#39;ll be much obliged.&lt;p&gt;Thanks.&lt;pre class=&quot;block_code&quot;&gt;test_hashmap.pl:
#!/usr/bin/perl
use strict;
use warnings;
use Test::More qw( no_plan );
use Data::Dumper;
use Grep; # several different versions of grep, built up functionally.

my $slow_matches_b = sub {
   sleep 1;
   return unless $_[0];
   return 1 if $_[0] =~ /[abc]/;
};
my $justafew_letters = [ my @letters = ((&#39;a&#39;..&#39;c&#39;) x 3) ];
my $lotsa_letters = [ ((&#39;a&#39;..&#39;z&#39;) x 3) ];

for (
     $justafew_letters,
     $lotsa_letters
    ){
  testem($_)
}

sub testem {
  my $test_strings = shift or die &quot;no test strings&quot;;
  print &quot;test strings: @$test_strings\n&quot;;
  my $paralell_tests = [
       { testname =&gt; &#39;threads&#39;,
         function =&gt; sub { Grep::threadgrep( $_[0], $_[1]) }
       },
       { testname =&gt; &#39;fork with many dbm::deep dbs&#39;,
         function =&gt; sub { Grep::fork_manydbs_grep($_[0], $_[1]) }
       },
       { testname =&gt; &#39;fork with 1 dbm::deep db&#39;,
         function =&gt; sub { Grep::fork_onedb_grep($_[0], $_[1]) }
       },
       #{ testname =&gt; &#39;simple serial map&#39;,
       #  function =&gt; sub { Grep::slowgrep($_[0], $_[1]) }
       #}
     ];
  my $max_seconds_parallel=2; # parallel execution should speed things up
  for (@$paralell_tests ) {
    my $timestarted=time;
    my $test_name = $_-&gt;{testname};
    my $matches = $_-&gt;{function}-&gt;($slow_matches_b, $test_strings);
    my $timeelapsed=time - $timestarted;
    my $num_matches = @$matches;
    ok( $num_matches == 9, &quot;$test_name, matches: $num_matches -- @$matches -- should be 9&quot;);
    ok( $timeelapsed &lt;= $max_seconds_parallel, &quot;$test_name, time elapsed: $timeelapsed, wanted max seconds: $max_seconds_parallel&quot;);
  }
}
#my ($timestarted, $timeelapsed);

Grep.pm:
package Grep;
use strict;
use warnings;
use Data::Dumper;
use Map;

# grep can be parallelized by building it on top of map_parallel
# which uses forks, threads, distributed computations with MapReduce
# or some such black magic
# in some cases this may be faster, but not always,
# it depends where your bottleneck is.
# Whatever black magic is going on in the background,
# by abstracting it out, the code we get is clean and easy to read.
sub threadgrep {
  my $test_function = shift;
  my $in_array = shift;
  my $map_function = sub { Map::hashmap_parallel_threads(@_)};
  return flexygrep($test_function, $map_function, $in_array);
}

sub fork_onedb_grep {
  my $test_function = shift;
  my $in_array = shift;
  my $map_function = sub { Map::hashmap_parallel_forks_onedb(@_)};
  return flexygrep($test_function, $map_function, $in_array);
}

sub fork_manydbs_grep {
  my $test_function = shift;
  my $in_array = shift;
  my $map_function = sub { Map::hashmap_parallel_forks_manydbs(@_)};
  return flexygrep($test_function, $map_function, $in_array);
}

# or you could do it in a non-forked/threaded/distributed/whatever
# way, by basing it on the conceptually simpler function map_serial.
sub slowgrep {
  my $test_function = shift;
  my $in_array = shift;
  my $map_function = sub { Map::hashmap_serialized(@_)};
  return flexygrep($test_function, $map_function, $in_array);
}

sub flexygrep {
  my $test_function = shift;
  my $hashmap_function = shift;
  my $in_array = shift;
  my $in_hash = Map::hash_from_array($in_array);
  my $result_hash = $hashmap_function-&gt;($test_function, $in_hash);
  my $out_array = [];
  for my $key (keys %$result_hash) {
    if ( my $out_true = $result_hash-&gt;{$key}-&gt;{out} ) {
      push @$out_array, $result_hash-&gt;{$key}-&gt;{in}
    }
  }
  return $out_array;
}
1;

Map.pm:
package Map;
use strict;
use warnings;
# Black magic for doing stuff in parallel is encapsulated here
# use MapReduce; -- not yet, but it&#39;s on the list.
use Data::Dumper;

sub hash_from_array {
  my $array = shift;
  my $hash;
  for my $index (0..$#$array) {
    $hash-&gt;{$index}-&gt;{in} = $array-&gt;[$index];
  }
  return $hash;
}

# input is a function (eg, my $sub_multiply by ten = { return $_[0] * 10 } ), and
# a hash like
# my $input_values = { blee =&gt; { in =&gt; 1 },
#                      blah =&gt; { in =&gt; 2}
#                    }
# output is a hash like
#{ blee =&gt; { in =&gt; 1, out =&gt; 10 },
#  blah =&gt; { in =&gt; 2, out =&gt; 20 }
#}
sub hashmap_serial {
  my $function = shift;
  my $hash = shift;
  die &quot;bad hash&quot; . Dumper($hash) if grep { ! defined($hash-&gt;{$_}-&gt;{in}) } (keys %$hash);
  # hash keys are processed in whatever order
  for my $key ( keys %$hash) {
    my $in = $hash-&gt;{$key}-&gt;{in};
    my $out = $function-&gt;($in);
    #print &quot;result for $in is $out\n&quot;;
    $hash-&gt;{$key}-&gt;{out} = $out;
  }
  return $hash;
}

# does the same thing as hashmap_serial
# but saves the value on the hard drive
# (serialized in this context means a memory value gets put on the hard disk,
# not to be confused with the sense of &quot;serial as opposed to parallel&quot;)
sub hashmap_serialized {
  my $function = shift;
  my $hash = shift;
  die &quot;bad hash&quot; . Dumper($hash) if grep { ! defined($hash-&gt;{$_}-&gt;{in}) } (keys %$hash);
  use File::Path qw(mkpath);
  my $dir=&quot;c:/tmp/map_serialized&quot;;
  mkpath($dir) unless -d &quot;$dir&quot;;
  die &quot;no directory: $dir&quot; unless -d &quot;$dir&quot;;
  my $file=&quot;$dir/$$.db&quot;;
  my $db = DBM::Deep-&gt;new( $file );
  $db-&gt;{result}=$hash;
  for my $key ( keys %$hash ) {
    my $in = $hash-&gt;{$key}-&gt;{in};
    my $out = $function-&gt;($in);
    $hash-&gt;{$key}-&gt;{out} = $out;
  }
  #unlink $file;
  #die &quot;couldn&#39;t delete file&quot; if -f $file;
  return $hash;
}

# uses many DBM::Deep stores, along with forks
# works
# but slow
sub hashmap_parallel_forks_manydbs {
  my $function = shift;
  my $hash = shift;
  die &quot;bad hash&quot; . Dumper($hash) if grep { ! defined($hash-&gt;{$_}-&gt;{in}) } (keys %$hash);
  use File::Temp qw(tempdir);
  my $dir = tempdir();
  use Parallel::ForkManager;
  use DBM::Deep;
  my $pm=new Parallel::ForkManager(10);
  for my $key ( keys %$hash ) {
    $pm-&gt;start and next;
    my $in = $hash-&gt;{$key}-&gt;{in};
    my $out = $function-&gt;($in);
    my $file=&quot;$dir/$key&quot;;
    #print &quot;file: $file\n&quot;;
    my $db = DBM::Deep-&gt;new( $file );
    $db-&gt;{result}=$out;
    #print &quot;in $in, out $out\n&quot;;
    $pm-&gt;finish;
  }
  $pm-&gt;wait_all_children;
  for my $key ( keys %$hash ) {
    my $file=&quot;$dir/$key&quot;;
    my $db = DBM::Deep-&gt;new( $file );
    defined( my $out = $db-&gt;{result} ) or die &quot;no result in $file for key $key&quot;;
    $hash-&gt;{$key}-&gt;{out}=$out;
  }
  #print &quot;hash: &quot; . Dumper($hash);
  return $hash;
}

# tries to use one locked DBM::Deep store, along with forks
# doesn&#39;t work
sub hashmap_parallel_forks_onedb {
  my $function = shift;
  my $hash = shift;
  die &quot;bad hash&quot; . Dumper($hash) if grep { ! defined($hash-&gt;{$_}-&gt;{in}) } (keys %$hash);
  use File::Temp qw(tempfile);
  my ($fh, $file) = tempfile();
  use DBM::Deep;
  my $db = DBM::Deep-&gt;new(
                file =&gt; $file,
                locking =&gt; 1
        );
  use Parallel::ForkManager;
  my $pm=new Parallel::ForkManager(10);
  for my $key ( keys %$hash ) {
    $pm-&gt;start and next;
    my $in = $hash-&gt;{$key}-&gt;{in};
    my $out = $function-&gt;($in);
    #print &quot;in $in, out $out\n&quot;;
    #$db-&gt;lock();
    $db-&gt;{result}-&gt;{$key}-&gt;{in}=$in;
    $db-&gt;{result}-&gt;{$key}-&gt;{out}=$out;
    #$db-&gt;unlock();
    $pm-&gt;finish;
  }
  $pm-&gt;wait_all_children;
  my $result = $db-&gt;{result};;
  #print &quot;result: &quot; . Dumper($result);
  return $result;
  #return $hash;
}

#works
sub hashmap_parallel_threads {
    my $function = shift;
    my $hash = shift;
    use threads;
    my @threads;
    for ( keys %$hash ) {
      my $in = $hash-&gt;{$_}-&gt;{in};
      my $t = threads-&gt;create( sub { map_element($_, $function, $in ) } );
      push @threads, $t;
    }
    #wait for threads to return ( this implementation is bound by slowest thread )
    my %results = map { %{ $_-&gt;join() }; } @threads;
    #print Dumper \%results;
    return {%results};
}

sub map_element {
      my $key = shift;
      my $function = shift;
      my $in = shift;
      my $out = $function-&gt;($in);
      return { $key =&gt; {in =&gt;  $in,out =&gt; $out       }     };
}
1;&lt;/pre&gt;
    </description>
</item>

        

<item>
    <title>Things you should need to know before using Perl regexes. (Humour, with a serious point) (BrowserUk)</title>
    <link>http://prlmnks.org/html/580514.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580514.html</guid>

    <description>
        &lt;p&gt;This is not a tutorial.&lt;h2 align=center&gt;Perl&#39;s regex engine is not lightweight&lt;/h2&gt;&lt;p&gt;Every time you use &lt;tt class=&quot;inline_code&quot;&gt;$`&lt;/tt&gt;, &lt;tt class=&quot;inline_code&quot;&gt;$&amp;&lt;/tt&gt; or &lt;tt class=&quot;inline_code&quot;&gt;$&#39;&lt;/tt&gt;, the entire scalar you are searching is copied. &lt;p&gt;What&#39;s more, not only are the scalars that you process using the regex that contains a reference to one of those variables copied, but &lt;b&gt;every scalar processed by every regex in your entire program also gets copied&lt;/b&gt;. &lt;p&gt;Further, every time you use capturing brackets, all the captured chunks are also copied--again.&lt;p&gt;And, even correctly written regexes that use two or more variable length matches (&lt;re&gt;* or &lt;re&gt;+ etc.) can consume prodigious amounts of runtime stack and cpu.&lt;p&gt;Badly and/or naively written regexes that use nested quantifiers can have exponential runtimes, and if the scalar they operate on is anything more than modestly sized, can completely consume your process stack before finally trapping, having consumed all your process memory allocation or system swap space--whichever runs out first.&lt;p&gt;Dooom, gloom, despondency.&lt;p&gt;More doom, gloom and despondency.&lt;p&gt;Blah, blah, blah.&lt;p&gt;Oh, and &lt;a href=&quot;/out/node/here&quot;&gt;here&lt;/a&gt; is a solution that prevents some of the problems by wrapping each call to the regex engine. &lt;p&gt;It starts another process and sends your scalars and the regex to it via sockets. That other process runs the regex on your behalf, and sends the results back via another socket. 
This neatly eliminates the &lt;tt class=&quot;inline_code&quot;&gt;$&amp;&lt;/tt&gt; problem, and allows recovery from the stack runaway/memory exhaustion problems whilst keeping your main process&#39; memory requirements to a minimum.&lt;hr /&gt;&lt;H2 align=center&gt;This is not a serious attack on the perl regex engine!&lt;/h2&gt;&lt;p&gt;Whilst much of the above is and has been true for the past 5 (8?, 10?) years, &lt;b&gt;most of it could not be otherwise&lt;/b&gt;. &lt;p&gt;And the point is that the regex engine isn&#39;t lightweight, and has some &lt;i&gt;vagaries&lt;/i&gt; and caveats,&lt;p&gt;&lt;b&gt;but that hasn&#39;t prevented thousands of programmers from writing 100s of thousands of perfectly functional, useful, beneficial scripts that use Perl&#39;s regex engine&lt;/b&gt;&lt;p&gt;Note:The stack problem has been very cleverly fixed in a recent build,
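        &lt;p&gt;For comparison, a lighter-weight way to avoid &lt;tt class=&quot;inline_code&quot;&gt;$`&lt;/tt&gt;, &lt;tt class=&quot;inline_code&quot;&gt;$&amp;&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;$&#39;&lt;/tt&gt; without a second process is to rebuild the three pieces from the match offsets. This is only a sketch, assuming perl 5.6 or later for the @- and @+ arrays; the helper name split_match is hypothetical.&lt;/p&gt;

```perl
use strict;
use warnings;

# Return (prematch, match, postmatch) using the offset arrays @- and @+
# rather than the three special match variables.
sub split_match {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    my ($start, $end) = ($-[0], $+[0]);
    return (
        substr($text, 0, $start),
        substr($text, $start, $end - $start),
        substr($text, $end),
    );
}

my @parts = split_match("The quick brown fox", qr/quick\s+\w+/);
print join('|', @parts), "\n";    # prints: The |quick brown| fox
```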
    </description>
</item>

        

<item>
    <title>Perl is dead (cLive ;-))</title>
    <link>http://prlmnks.org/html/580321.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580321.html</guid>

    <description>
        &lt;p&gt;Excuse me if this has been discussed before (I couldn&#39;t find it though). This is a rather &lt;a href=&quot;http://steve.yegge.googlepages.com/ancient-languages-perl&quot;&gt;amusing rant&lt;/a&gt; from an Amazon developer.&lt;/p&gt;&lt;p&gt;Even though it is a little OTT in places, it does definitely have some valid points.&lt;/p&gt;&lt;p&gt;I love Perl, but my faith in its future is not that great right now. I feel Python beckoning ;-)&lt;/p&gt;
    </description>
</item>

        

<item>
    <title>How many words does it take? (Limbic~Region)</title>
    <link>http://prlmnks.org/html/580093.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580093.html</guid>

    <description>
        All,&lt;br /&gt;This meditation is about not giving up on a problem just because it is [wp://Np_complete|NP complete].  Common responses from seasoned programmers to NP complete questions include:&lt;ul&gt;&lt;li&gt;Don&#39;t bother, it is NP complete&lt;/li&gt;&lt;li&gt;Find a heuristic approximation and hope it is good enough&lt;/li&gt;&lt;li&gt;Brute force is the only way to go&lt;/li&gt;&lt;/ul&gt;Response 2 is actually pretty good advice when it applies.  Response 3 may be true but not all brute force approaches are created equal.  The fact is that real world problems need to be solved regardless of CS theory.  Just because something is NP complete doesn&#39;t mean there isn&#39;t a smart(er|ish)? way of solving it.&lt;p&gt;I decided to work with [id://348444].  I was familiar with it because I had already provided a fairly [id://545167|fast approximation algorithm].  The problem stated is simple:  &lt;i&gt;Given a dictionary file, find the minimal number of words that contain a given set of letters.&lt;/i&gt;&lt;/p&gt;I set out a list of constraints which turned the general problem into a specific one:&lt;ul&gt;&lt;li&gt;The dictionary file = 2of12 from the [http://wordlist.sourceforge.net/|Official 12Dicts Package]&lt;/li&gt;&lt;li&gt;Letters of both question and answer = a-z&lt;/li&gt;&lt;li&gt;Answer need only be fewest words, not fewest letters&lt;/li&gt;&lt;li&gt;Question may not contain duplicate letters - abc ok, abbc not ok&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Originally, I wanted to precalculate enough information to allow each question to be solvable in polynomial time.  While I still believe this approach would work with some problems, I ended up precalculating all solutions and functionally turned it into a lookup table.  To do this, I needed to reduce the problem space enough to make a brute force solution feasible.  Here are the guiding optimization principles I employed:&lt;/p&gt;&lt;READMORE&gt;&lt;h4&gt;Not all solutions need to be found&lt;/h4&gt;&lt;p&gt;There are a finite number of questions that can be asked because the letter set has been restricted to a-z.  Using the combinatorial formula C = N! / (K!(N - K)!) with N = 26 and summing over K = 1..26, there are 67_108_863 (that is, 2**26 - 1) different possible questions.  Since the solution that applies to &#39;abc&#39; also applies to &#39;ab&#39;, &#39;ac&#39;, &#39;bc&#39;, &#39;a&#39;, &#39;b&#39;, and &#39;c&#39; - even fewer questions actually have to be answered.&lt;/p&gt;&lt;h4&gt;Not all words need be considered&lt;/h4&gt;&lt;p&gt;Because of the constraints, we can reduce a word to the unique list of letters it contains.  The word &#39;screwdriver&#39; becomes &#39;cdeirsvw&#39; and the word &#39;disc&#39; becomes &#39;cdis&#39;.  Because there are no letters in &#39;disc&#39; that are not contained in &#39;screwdriver&#39;, &#39;disc&#39; need not be considered.  Applying this principle further, we end up with only the words with the longest strings of unique characters.&lt;/p&gt;&lt;h4&gt;Use a breadth-first search&lt;/h4&gt;&lt;p&gt;Instead of solving questions at random, a [wp://Breadth-first_search|BFS] will provide the most efficient solution path.  Start by finding all questions that can be solved using a single word from our reduced dictionary, then 2 at a time, then 3 at a time, etc. until the longest possible question (a-z) can be answered.&lt;/p&gt;&lt;h4&gt;Not all combinations of words need be considered&lt;/h4&gt;&lt;p&gt;Since our goal is to get to a-z, when combining words we need only consider words that contain letters we don&#39;t already have.  Further, we can use the reduction technique to only include the longest list of &quot;new&quot; characters.  As an example, if we start with &#39;hello&#39; and are considering combining it with &#39;windowpane&#39; and &#39;weapon&#39;, we are really considering &#39;adinpw&#39; and &#39;anpw&#39; and we can safely ignore &#39;weapon&#39;.&lt;/p&gt;&lt;h4&gt;Not all derived answers need be considered&lt;/h4&gt;&lt;p&gt;Since we know that the answer for &#39;abc&#39; also answers &#39;ab&#39;, &#39;ac&#39;, &#39;bc&#39;, &#39;a&#39;, &#39;b&#39;, and &#39;c&#39; - we only need solve for &#39;abc&#39; and we get the [wp://Powerset|powerset] for free.  We can take advantage of this when solving new questions because we can skip the portions of the powerset that have been solved by some previous question.  For instance, if we are looking at &#39;abcd&#39;, &#39;abce&#39;, and &#39;abcfg&#39; - we know we can skip &#39;abc&#39; and its descendants when solving the second and third questions.&lt;/p&gt;&lt;/READMORE&gt;&lt;p&gt;Even with these guiding principles in mind before I started, I still followed a number of dead-ends before finding the right approach.  I will attempt to highlight the mistakes I made but will only be showing code for the actual solution.&lt;/p&gt;&lt;h4&gt;Phase 1:  Reduce the dictionary&lt;/h4&gt;&lt;p&gt;Number of lines in input:  61_406&lt;br /&gt;Number of lines in output: 3_477&lt;br /&gt;Execution time:  1 min 15 seconds&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl
use strict;
use warnings;
use constant  WORD =&gt; 0;
use constant   LEN =&gt; 1;
use constant NFORM =&gt; 2;
use Inline C =&gt;;

my $file = $ARGV[0] || &#39;dictionary.txt&#39;;
open(my $fh, &#39;&lt;&#39;, $file) or die &quot;Unable to open &#39;$file&#39; for reading: $!&quot;;
my @word;
while (&lt;$fh&gt;) {
    $_ = lc;
    tr/a-z//cd;
    my %uniq = map {$_ =&gt; undef} split //;
    my $len = keys %uniq;
    push @word, [$_, $len, join &#39;&#39;, sort keys %uniq];
}
@word = sort {$b-&gt;[LEN] &lt;=&gt; $a-&gt;[LEN] } @word;
for my $i (0 .. $#word - 1) {
    next if ! defined $word[$i];
    for my $j ($i + 1 .. $#word) {
        next if ! defined $word[$j];
        $word[$j] = undef if ! distinct($word[$i][NFORM], $word[$j][NFORM]);
    }
}
for (grep defined, @word) {
    print join &quot;\t&quot;, $_-&gt;[NFORM], $_-&gt;[WORD];
    print &quot;\n&quot;;
}
__END__
__C__
int distinct(unsigned char *str1, unsigned char *str2) {
    /* Actual code has 256 0s - truncated for post */
    char exists[256] = {};
    /* Turn array into a hash */
    while (*str1) {
        exists[*str1++] = 1;
    }
    /* Determine if str2 contains any chars str1 does not */
    while (*str2) {
        if (! exists[*str2++]) return 1;
    }
    return 0;
}&lt;/pre&gt;&lt;/p&gt;&lt;h4&gt;Phase 2:  Two at a time&lt;/h4&gt;&lt;p&gt;Number of lines in input:  3_477&lt;br /&gt;Number of lines in output: 636_186&lt;br /&gt;Execution time:  9 min 35 seconds&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl
use strict;
use warnings;
use constant NEW_LET =&gt; 1;
use constant     LEN =&gt; 2;
use Inline C =&gt;;

my $file = $ARGV[0] || &#39;phase1.data&#39;;
open(my $fh, &#39;&lt;&#39;, $file) or die &quot;Unable to open &#39;$file&#39; for reading: $!&quot;;
my @word;
while (&lt;$fh&gt;) {
    chomp;
    push @word, [split /\t/];
}
my %seen_nform;
for my $i (0 .. $#word - 1) {
    my ($nform1, $str1) = @{$word[$i]};
    my (@new_word, %seen);
    for my $j ($i + 1 .. $#word) {
        my ($nform2, $str2) = @{$word[$j]};
        my $new_let = diff($nform2, $nform1);
        next if ! $new_let || $seen{$new_let}++;
        push @new_word, [$str2, $new_let, length($new_let)];
    }
    @new_word = sort { $b-&gt;[LEN] &lt;=&gt; $a-&gt;[LEN] } @new_word;
    for my $i2 (0 .. $#new_word - 1) {
        next if ! defined $new_word[$i2];
        for my $j2 ($i2 + 1 .. $#new_word) {
            next if ! defined $new_word[$j2];
            $new_word[$j2] = undef if ! distinct($new_word[$i2][NEW_LET], $new_word[$j2][NEW_LET]);
        }
    }
    for (grep defined, @new_word) {
        my $str2 = $_-&gt;[0];
        my %uniq = map {$_ =&gt; undef} split //, $nform1 . $str2;
        my $new_nform = join &#39;&#39;, sort keys %uniq;
        next if $seen_nform{$new_nform}++;
        print join &quot;\t&quot;, $new_nform, $str1, $str2;
        print &quot;\n&quot;;
    }
}
__END__
__C__
SV* diff ( char *str1, char *str2 ) {
    SV *sv= newSVpvn( &quot;&quot;, 0 );
    int result_index= 0;
    char *result= SvGROW( sv, 257);
    /* identify all chars present in str2 */
    while ( *str1 &amp;&amp; *str2 ) {
        if ( *str1 &lt; *str2)
            result[ result_index++ ]= *str1++;
        while ( *str1 &amp;&amp; *str1 == *str2) {
            str1++; str2++;
        }
        if ( *str1 &gt; *str2 )
            str2++;
    }
    while (*str1)
        result[ result_index++ ]= *str1++;
    result[ result_index ]= 0;
    SvCUR_set( sv, result_index );
    return sv;
}
int distinct(unsigned char *str1, unsigned char *str2) {
    /* Actual code has 256 0s - truncated for post */
    char exists[256] = {};
    /* Turn array into a hash */
    while (*str1) {
        exists[*str1++] = 1;
    }
    /* Determine if str2 contains any chars str1 does not */
    while (*str2) {
        if (! exists[*str2++]) return 1;
    }
    return 0;
}&lt;/pre&gt;&lt;/p&gt;&lt;h4&gt;Phase 3:  Three at a time&lt;/h4&gt;&lt;p&gt;Number of lines in input:  636_186&lt;br /&gt;Number of lines in output: 8_809_183&lt;br /&gt;Execution time:  401 min 10 seconds&lt;/p&gt;&lt;p&gt;Yes, that&#39;s java code you see.  I originally wrote it in Perl which looked very much like phase 2 but it took between 6 1/3 and 11 1/2 hours depending on which machine and what variation of the algorithm I used.  
If anyone is interested in the perl counterpart - please let me know.&lt;/p&gt;&lt;p&gt;&lt;pre class=&quot;block_code&quot;&gt;import java.io.File;import java.io.IOException;import java.util.ArrayList;import java.util.BitSet;import java.util.HashMap;import java.util.HashSet;import java.util.LinkedHashMap;import java.util.Map;import java.util.Scanner; public class Phase3 {    public static final String regex = &quot;[a-z]+&quot;;    public static char alphabet[] = {        &#39;a&#39;, &#39;b&#39;, &#39;c&#39;, &#39;d&#39;, &#39;e&#39;, &#39;f&#39;, &#39;g&#39;, &#39;h&#39;, &#39;i&#39;, &#39;j&#39;, &#39;k&#39;, &#39;l&#39;, &#39;m&#39;,        &#39;n&#39;, &#39;o&#39;, &#39;p&#39;, &#39;q&#39;, &#39;r&#39;, &#39;s&#39;, &#39;t&#39;, &#39;u&#39;, &#39;v&#39;, &#39;w&#39;, &#39;x&#39;, &#39;y&#39;, &#39;z&#39;,    };    public static BitSet[] seen_nform = {        new BitSet(),        new BitSet(26),       new BitSet(325),     new BitSet(2600),        new BitSet(14950),   new BitSet(65780),    new BitSet(230230),  new BitSet(657800),        new BitSet(1562275), new BitSet(3124550),  new BitSet(5311735), new BitSet(7726160),        new BitSet(9657700), new BitSet(10400600), new BitSet(9657700), new BitSet(7726160),        new BitSet(5311735), new BitSet(3124550),  new BitSet(1562275), new BitSet(657800),        new BitSet(230230),  new BitSet(65780),    new BitSet(14950),   new BitSet(2600),        new BitSet(325),     new BitSet(26),       new BitSet(1)    };        public static HashMap&lt;Character, Integer&gt; lookup = new HashMap&lt;Character, Integer&gt;(26);    public static void main(String[] args) {        initlookup();        Scanner scr1, scr2;        String  line;        LinkedHashMap&lt;String, String&gt; word = new LinkedHashMap&lt;String, String&gt;(3477);        // Load phase1.data into word        try {            scr1 = new Scanner(new File(args[0]));            while (scr1.hasNextLine()) {                line  = scr1.nextLine();                
scr2  = new Scanner(line);                word.put(scr2.next(regex), scr2.next(regex));            }        }        catch (IOException ioe) {            ioe.printStackTrace();        }        // Process phase2.data        String nform, str1, str2;        try {            scr1 = new Scanner(new File(args[1]));            while (scr1.hasNextLine()) {                line  = scr1.nextLine();                scr2  = new Scanner(line);                nform = scr2.next(regex);                str1  = scr2.next(regex);                str2  = scr2.next(regex);                HashSet&lt;String&gt; seen = new HashSet&lt;String&gt;();                ArrayList&lt;String[]&gt; new_word = new ArrayList&lt;String[]&gt;();                for (Map.Entry&lt;String, String&gt; entry : word.entrySet()) {                    String new_nform = uniq(nform, entry.getValue());                    int len = new_nform.length();                    int bit = getBit(len, new_nform);                    if (seen_nform[len].get(bit)) { continue; }                    String new_let = diff(entry.getKey(), nform);                    if (new_let.length() == 0 || seen.contains(new_let)) { continue; }                    seen.add(new_let);                    String[] e = {entry.getValue(), new_let, new_nform, Integer.toString(bit)};                    new_word.add(e);                }                String[] new1, new2;                int idx;                int end = new_word.size();                for (int i = 0; i &lt; end - 1; ++i) {                    new1 = new_word.get(i);                    if (new1 == null) { continue; }                    for (int j = i + 1; j &lt; end; ++j) {                        new2 = new_word.get(j);                        if (new2 == null) { continue; }                        idx = j;                        if (new2[1].length() &gt; new1[1].length()) {                            String[] temp = new1;                            new1 = new2;                            new2 = 
temp;                            idx  = i;                        }                        int res = distinct(new1[1], new2[1]);                        if (res == 0) {                            new_word.set(idx, null);                        }                    }                }                for (String[] e : new_word) {                    if (e == null) { continue; }                    String str3 = e[0];                    String new_nform = e[2];                    int len = new_nform.length();                    int bit = Integer.decode(e[3]);                    /* Probably not needed */                    if (seen_nform[len].get(bit)) { continue; }                    seen_nform[len].set(bit);                    System.out.println(new_nform + &quot;\t&quot; + str1 + &quot;\t&quot; + str2 + &quot;\t&quot; + str3);                }            }        }        catch (IOException ioe) {            ioe.printStackTrace();        }    }    private static void initlookup () {        // Actual code goes a-z, reduced for post        lookup.put(&#39;a&#39;, new Integer(&quot;1&quot;));        lookup.put(&#39;b&#39;, new Integer(&quot;2&quot;));    }    // Determine Chars in str1 not present in str2    private static String diff(String str1, String str2) {        int len = str2.length();        HashSet&lt;Character&gt; have = new HashSet&lt;Character&gt;(len);        for (Character c : str2.toCharArray()) {            have.add(c);        }        StringBuilder new_let = new StringBuilder(len);        for (Character c : str1.toCharArray()) {            if (have.contains(c)) {                continue;            }            new_let.append(c);        }        return new_let.toString();    }    // Determine if str2 (shorter) contains any chars not in str1 (longer)    private static int distinct(String str1, String str2) {        HashSet&lt;Character&gt; have = new HashSet&lt;Character&gt;(str1.length());        for (Character c : str1.toCharArray()) {            
have.add(c);        }        for (Character c : str2.toCharArray()) {            if (have.contains(c)) {                continue;            }            return 1;        }        return 0;    }     // List of unique chars in alphabetic order in str1 &amp; str2    private static String uniq(String str1, String str2) {        HashSet&lt;Character&gt; have = new HashSet&lt;Character&gt;();        for (Character c : str1.toCharArray()) {            have.add(c);        }        for (Character c : str2.toCharArray()) {            have.add(c);        }        StringBuilder unique = new StringBuilder(26);        for (char let : alphabet) {            if (have.contains(let)) {                unique.append(let);            }        }        return unique.toString();    }    private static int binomial(int n, int k) {        int c = 1;        for (int i = 0; i &lt; k; ++i) {            c *= n - i;            c /= i + 1;        }        return c;    }    private static int getBit(int len, String str) {        int    sum = 0;        for (int i = 0; i &lt; len; ++i) {            sum += binomial(lookup.get(str.charAt(i)) - 1, i + 1);        }        return sum;    }}&lt;/pre&gt;&lt;/p&gt;&lt;h4&gt;Phase 4:  Four at a time (new approach)&lt;/h4&gt;&lt;p&gt;Number of lines in input:  8_809_183&lt;br /&gt;Number of lines in output: 1&lt;br /&gt;Execution time:  0 min 7 seconds&lt;/p&gt;&lt;p&gt;The last phase brought the solution close enough just to scan for a word that contained &lt;i&gt;all&lt;/i&gt; the missing letters to reach a-z.&lt;/p&gt;&lt;p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perluse strict;use warnings;use Inline C =&gt;;my $file = $ARGV[0] || &#39;phase1.data&#39;;open(my $fh, &#39;&lt;&#39;, $file) or die &quot;Unable to open &#39;$file&#39; for reading: $!&quot;;my @word;while (&lt;$fh&gt;) {    chomp;    my ($nform, $word) = split /\t/;    push @word, $word;}$file = $ARGV[1] || &#39;phase3.data&#39;;open($fh, &#39;&lt;&#39;, $file) or die 
&quot;Unable to open &#39;$file&#39; for reading: $!&quot;;while (&lt;$fh&gt;) {    my ($nform, $str1, $str2, $str3) = split /\t/;    for (@word) {        if (! diff(&#39;abcdefghijklmnopqrstuvwxyz&#39;, $_ . $nform)) {            print &quot;abcdefghijklmnopqrstuvwxyz\t$_\t$str1\t$str2\t$str3&quot;;            exit;        }    }}print &quot;No Cigar\n&quot;;__END____C__SV *diff(char *str1, char *str2) {    /* Actual code has 256 0s - truncated for post */    char exists[256] = {};    SV *ret_sv = newSVpvn(&quot;&quot;,0);        /* identify all chars present in str2 */    while (*str2) {        exists[(U8)*str2++] = 1;    }    /* Determine chars in str1 not in str2 */    for ( ; *str1 ; str1++ )         if (! exists[(U8)*str1])             sv_catpvn(ret_sv,str1,1);        return ret_sv;}&lt;/pre&gt;&lt;/p&gt;&lt;h4&gt;Phase 5:  Generate powersets&lt;/h4&gt;&lt;p&gt;Number of lines in input:  9_448_847&lt;br /&gt;Number of lines in output: 67_108_863&lt;br /&gt;Execution time:  66 min 20 seconds&lt;/p&gt;&lt;p&gt;Yep, more Java.  
I also have a Perl version if anyone is interested but I wanted the final solution to take less than 8 hours to run so that someone wanting to reproduce my results could.&lt;/p&gt;&lt;p&gt;&lt;pre class=&quot;block_code&quot;&gt;import java.io.File;import java.io.IOException;import java.util.ArrayList;import java.util.BitSet;import java.util.HashMap;import java.util.Scanner; public class Phase5 {    public static final String regex = &quot;[a-z]+&quot;;    public static BitSet[] seen = {        new BitSet(),        new BitSet(26),       new BitSet(325),     new BitSet(2600),        new BitSet(14950),   new BitSet(65780),    new BitSet(230230),  new BitSet(657800),        new BitSet(1562275), new BitSet(3124550),  new BitSet(5311735), new BitSet(7726160),        new BitSet(9657700), new BitSet(10400600), new BitSet(9657700), new BitSet(7726160),        new BitSet(5311735), new BitSet(3124550),  new BitSet(1562275), new BitSet(657800),        new BitSet(230230),  new BitSet(65780),    new BitSet(14950),   new BitSet(2600),        new BitSet(325),     new BitSet(26),       new BitSet(1)    };        public static HashMap&lt;Character, Integer&gt; lookup = new HashMap&lt;Character, Integer&gt;(26);    public static void main(String[] args) {        initlookup();        String  line = null;        Scanner s2   = null;        for (String filename : args) {            try {                Scanner scr = new Scanner(new File(filename));                while (scr.hasNextLine()) {                    line = scr.nextLine();                    s2 = new Scanner(line);                    if (! 
s2.hasNext(regex)) {                        continue;                    }                    getSets(s2.next(regex), line);                }            }            catch (IOException ioe) {                ioe.printStackTrace();            }        }    }    private static void initlookup () {        // Actual code goes a-z, reduced for post        lookup.put(&#39;a&#39;, new Integer(&quot;1&quot;));        lookup.put(&#39;b&#39;, new Integer(&quot;2&quot;));    }    private static void getSets(String str, String line) {        int len = str.length();        int bit = getBit(len, str);        if (! seen[len].get(bit)) {            seen[len].set(bit);            System.out.println(str + &quot;\t&quot; + line);            for (StringBuilder set : subsets(str)) {                getSets(set.toString(), line);            }        }    }    private static ArrayList&lt;StringBuilder&gt; subsets(String str) {        ArrayList&lt;StringBuilder&gt; subs = new ArrayList&lt;StringBuilder&gt;();        if (str.length() == 1) {            return subs;        }        for (int i = 0; i &lt; str.length(); ++i) {            StringBuilder set = new StringBuilder(str);            set.deleteCharAt(i);            subs.add(set);        }        return subs;    }    private static int binomial(int n, int k) {        int c = 1;        for (int i = 0; i &lt; k; ++i) {            c *= n - i;            c /= i + 1;        }        return c;    }    private static int getBit(int len, String str) {        int    sum = 0;        String key;        for (int i = 0; i &lt; len; ++i) {            sum += binomial(lookup.get(str.charAt(i)) - 1, i + 1);        }        return sum;    }}&lt;/pre&gt;&lt;/p&gt;&lt;h4&gt;Creating a working final product&lt;/h4&gt;&lt;p&gt;I decided to use a [cpan://DBD::SQlite] database with a very simple [http://www.gatcomb.org/dictdemo/demo.cgi|web front end to demo it].  
The split into 26 tables was left over from an earlier disk space optimization idea that I didn&#39;t pursue so the following code could be simplified further:&lt;/p&gt;Split the data by length and convert solution words to indices (took 12 minutes)&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perluse strict;use warnings;use Storable;my %fwd;my $file = $ARGV[0] || &#39;phase1.data&#39;;open(my $fh, &#39;&lt;&#39;, $file) or die &quot;Unable to open &#39;$file&#39; for reading: $!&quot;;my $n;while ( &lt;$fh&gt; ) {    chomp;    my ($nform, $word) = split /\t/;    $fwd{$word} = ++$n;}my %sol_fh;for (1 .. 26) {    my $file_name = sprintf(&quot;%.2d&quot;, $_) . &quot;.data&quot;;    open($sol_fh{$_}, &#39;&gt;&#39;, $file_name) or die $!;}$file = $ARGV[1] || &#39;phase5.data&#39;;open($fh, &#39;&lt;&#39;, $file) or die &quot;Unable to open &#39;$file&#39; for reading: $!&quot;;while ( &lt;$fh&gt; ) {    chomp;    my ($nform, undef, @words) = split /\t/;    $_ = $fwd{$_} for @words;    print { $sol_fh{length($nform)} } $nform, &quot;\t&quot;, (join &quot;-&quot;, @words), &quot;\n&quot;;}my %rev = reverse %fwd;store \%rev, &#39;sol.rev&#39;;&lt;/pre&gt;Build the database and use the sqlite3 shell interface to import the data&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perluse strict;use warnings;use DBD::SQLite;my $dbh = DBI-&gt;connect(&quot;dbi:SQLite:dbname=solution.db&quot;,&quot;&quot;,&quot;&quot;) or die $DBI::errstr;for my $size (1 .. 
26) {    my $sql = &quot;CREATE TABLE solution$size (nform TEXT, solution TEXT)&quot;;    $dbh-&gt;do($sql) or die $dbh-&gt;errstr;}$dbh-&gt;disconnect or die $dbh-&gt;errstr;&lt;/pre&gt;Create a simple web front end&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -Tuse strict;use warnings;use CGI::Simple;use DBD::SQLite;use HTML::Template;use Storable;$CGI::Simple::POST_MAX        = 1024;$CGI::Simple::DISABLE_UPLOADS = 1;#$CGI::Simple::DEBUG = 1;my $q = CGI::Simple-&gt;new();# TODO: if unable to get exclusive lock on $0 display_error()# TODO: Add error handling - duh!$q-&gt;param(&#39;answer&#39;) ? display_answer() : display_question();sub display_question {    my $template = HTML::Template-&gt;new(filename =&gt; &#39;question.tmpl&#39;);    print $q-&gt;header;    print $template-&gt;output();    exit(0);}sub display_answer {    my $template = HTML::Template-&gt;new(filename =&gt; &#39;answer.tmpl&#39;);    my $input = get_input();    my $len   = length($input);    my $rev   = retrieve(&#39;sol.rev&#39;);    my $dbh   = DBI-&gt;connect(&quot;dbi:SQLite:dbname=solution.db&quot;,&quot;&quot;,&quot;&quot;)        or die $DBI::errstr;    my $sth   = $dbh-&gt;prepare(&quot;SELECT solution FROM solution$len WHERE nform=?&quot;)        or die $dbh-&gt;errstr;    $sth-&gt;execute($input)                                                  or die $dbh-&gt;errstr;    my @word;    while (my @row = $sth-&gt;fetchrow_array() ) {        push @word, map { {WORD =&gt; $rev-&gt;{$_}} } split /-/, $row[0];    }    $template-&gt;param(USER_INPUT =&gt; $input);    $template-&gt;param(WORD_LIST  =&gt; \@word);    print $q-&gt;header;    print $template-&gt;output();    #$sth-&gt;finish();    #$dbh-&gt;disconnect;    exit(0);}sub get_input {    my $input = $q-&gt;param(&#39;question&#39;) || &#39;&#39;;    $input = lc($input);    $input =~ tr/a-z//cd;    $input ||= &#39;abcdefghijklmnopqrstuvwxyz&#39;;    my %uniq = map {$_ =&gt; undef} split //, $input;    return join 
&#39;&#39;, sort keys %uniq;}&lt;/pre&gt;Include a question template&lt;pre class=&quot;block_code&quot;&gt;&lt;html&gt;&lt;head&gt;&lt;title&gt;How many words does it take.....&lt;/title&gt;&lt;/head&gt;&lt;body&gt;&lt;center&gt;&lt;h2&gt;How many words does it take....&lt;/h2&gt;&lt;p&gt;Please enter a unique list of lower case letters (a-z) below.&lt;br /&gt;Be warned that some of the &quot;words&quot; may not be safe for work (NSFW).&lt;/p&gt;&lt;p&gt;&lt;form name=&quot;display_question&quot; action=&quot;demo.cgi&quot;&gt;&lt;input type=text name=&quot;question&quot;&gt;&lt;input type=hidden name=&quot;answer&quot; value=&quot;1&quot;&gt;&lt;br /&gt;&lt;input type=&quot;submit&quot; value=&quot;Submit&quot;&gt;&lt;/form&gt;&lt;/p&gt;&lt;/center&gt;&lt;/body&gt;&lt;/html&gt;&lt;/pre&gt;Include an answer template&lt;pre class=&quot;block_code&quot;&gt;&lt;html&gt;&lt;head&gt;&lt;title&gt;It takes.....&lt;/title&gt;&lt;/head&gt;&lt;body&gt;&lt;center&gt;&lt;h2&gt;For &lt;TMPL_VAR NAME=USER_INPUT&gt; it takes....&lt;/h2&gt;&lt;p&gt;&lt;TMPL_LOOP NAME=WORD_LIST&gt;    Word: &lt;TMPL_VAR NAME=WORD&gt;    &lt;br /&gt;&lt;/TMPL_LOOP&gt;&lt;/p&gt;&lt;/center&gt;&lt;/body&gt;&lt;/html&gt;&lt;/pre&gt;&lt;h1&gt;Conclusion&lt;/h1&gt;&lt;p&gt;Answering all 67+ million questions using a dictionary of over 64K words can take less than 8 hours, and the resulting lookup table (SQLite) takes 2GB of disk space.  If this were a real world problem, that would not be a considerable investment.&lt;/p&gt;&lt;h4&gt;Notes&lt;/h4&gt;&lt;p&gt;I freely admit that I spent far more than 8 hours on this specific project.  I also admit that each problem is unique and has no guarantee of being feasible.  I came to the Monastery at least twice looking for help with optimizations:&lt;ul&gt;&lt;li&gt;[id://579130]&lt;/li&gt;&lt;li&gt;[id://576101]&lt;/li&gt;&lt;/ul&gt;I had to find the right balance of trading memory for speed since 1GB of RAM was not enough in many cases.  
I also had to find the right balance between reducing the output for the next phase and the amount of work required for that reduction.&lt;/p&gt;&lt;p&gt;The code as it appears can likely be optimized.  I spent my time looking for algorithm-related optimizations rather than micro-optimizations.  While I used both [cpan://Inline::C] and Java in this project, I am not much more than a neophyte in either.  In fact, this thread contains all the Java I have ever written in my life.  I had help in those areas, but the contributors were limited by my description of what I was trying to do, so all deficiencies are mine rather than theirs.&lt;/p&gt;&lt;p&gt;I would like to thank everyone who helped me with this project.  Rather than naming the few I can remember and offending those I can&#39;t, I will just say &quot;you know who you are&quot;.  You can try it out [http://www.gatcomb.org/dictdemo/demo.cgi|here].  I welcome any questions or comments on the code and algorithms.&lt;/p&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-180961&quot;&gt;&lt;p&gt;Cheers - [Limbic~Region|L~R]&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;
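&lt;p&gt;As a cross-check on the counting and bookkeeping described above, here is a minimal Java sketch (an illustration, not the code that produced the results). It recomputes the number of possible questions, reduces a word to its normalized form, tests whether one word makes another redundant, and ranks a normalized form with the same sum-of-binomials scheme that getBit uses in Phase 3 and Phase 5. The class and method names (QuestionSketch, nform, covers, rank, totalQuestions) are made up for this example, and rank uses c - &#39;a&#39; where the original uses its lookup table minus one.&lt;/p&gt;

```java
import java.util.Locale;

final class QuestionSketch {
    static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // Normalized form: the distinct letters a-z of a word, in alphabetical order.
    static String nform(String word) {
        boolean[] present = new boolean[26];
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            int idx = ALPHABET.indexOf(c);
            if (idx >= 0) {
                present[idx] = true;
            }
        }
        StringBuilder sb = new StringBuilder(26);
        for (int i = 0; i != 26; ++i) {
            if (present[i]) {
                sb.append(ALPHABET.charAt(i));
            }
        }
        return sb.toString();
    }

    // True when every letter of 'shorter' also occurs in 'longer',
    // i.e. 'shorter' adds nothing and can be dropped from the dictionary.
    static boolean covers(String longer, String shorter) {
        for (char c : shorter.toCharArray()) {
            if (longer.indexOf(c) == -1) {
                return false;
            }
        }
        return true;
    }

    // Binomial coefficient C(n, k); yields 0 when k exceeds n (n must not be negative).
    static long binomial(int n, int k) {
        long c = 1;
        for (int i = 0; i != k; ++i) {
            c = c * (n - i) / (i + 1);
        }
        return c;
    }

    // Number of non-empty letter sets drawn from n letters: sum of C(n, k) for k = 1..n.
    static long totalQuestions(int n) {
        long total = 0;
        for (int k = 1; k != n + 1; ++k) {
            total += binomial(n, k);
        }
        return total;
    }

    // Position of a normalized form among all forms of the same length
    // (combinatorial number system); expects sorted, distinct letters a-z.
    static long rank(String nform) {
        long sum = 0;
        for (int i = 0; i != nform.length(); ++i) {
            sum += binomial(nform.charAt(i) - 'a', i + 1);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(totalQuestions(26));
        System.out.println(nform("screwdriver") + " covers " + nform("disc") + ": "
                + covers(nform("screwdriver"), nform("disc")));
        System.out.println(rank("yz"));
    }
}
```

&lt;p&gt;Within one length, rank gives each normalized form its own slot, which is why the BitSet arrays in Phase 3 and Phase 5 are sized with the binomial coefficients 26, 325, 2600 and so on.&lt;/p&gt;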
    </description>
</item>

        

<item>
    <title>Parrot, threads &amp; fears for the future. (BrowserUk)</title>
    <link>http://prlmnks.org/html/580004.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/580004.html</guid>

    <description>
        &lt;p&gt;The future is threaded.&lt;p&gt;At the commodity hardware level, this is already the case thanks to Intel Hyperthreading; dual-core processors from Intel, AMD (and others); AMP, SMP &amp; NUMA; dual, quad &amp; 8-way motherboards. In the future it will be more so, e.g. the Cell processor and the quad-core technology from Intel and AMD coming in 2007.&lt;p&gt;At the software level, very little existing software makes use of threading. There are several reasons for this:&lt;ol&gt;&lt;li&gt;Software needs to be written from the ground up with threading in mind in order to properly benefit from it.&lt;/li&gt;&lt;li&gt;Retro-fitting threading to existing applications is rarely effective because threading exacerbates the effect of every bad programming practice.    &lt;ul&gt;&lt;li&gt;Re-entrancy issues cannot be glossed over, whether in the application code, language runtime or OS.    &lt;/li&gt;&lt;li&gt;Global data-structures become even more vulnerable.   &lt;/li&gt;&lt;li&gt;Tightly coupled code causes low granularity. Low granularity can make threading expensive.   &lt;/li&gt;&lt;li&gt;Memory management becomes paramount.        &lt;ul&gt;&lt;li&gt;Stop-the-world GC will have a disastrous effect upon performance.       &lt;/li&gt;&lt;li&gt;Monolithic heap management will suck the life out of efficiency.       &lt;/li&gt;&lt;/ul&gt;   &lt;/li&gt;&lt;/ul&gt;   &lt;/li&gt;&lt;li&gt;Existing code that can benefit from threading often uses event-driven and/or state machine techniques to approximate those benefits,      but code designed to utilise those techniques is usually structured such that it does not lend itself to conversion to threading.   &lt;/li&gt;&lt;li&gt;Existing code that already achieves parallelism through forking, and that could also benefit from shared state, is often difficult to adapt to threading because assumptions that can be safely made when forking no longer hold true with threading.   
&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Redeveloping existing, successful applications that could benefit from threading is often slow to happen. Again, for a variety of reasons:&lt;ol&gt;&lt;li&gt;Mid-life, ground-up redesign of an existing application is always a major undertaking, even where there are clear benefits to doing so. &lt;/li&gt;&lt;li&gt;In many environments threading is seen as hard.&lt;/li&gt;&lt;li&gt;In many environments, there is a lack of both understanding and the skills required to implement threading well.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Even when developing new applications that could obviously benefit from threading, it is overlooked, ignored or explicitly ruled out. Again there are a variety of reasons why this happens, most of which are already covered above.&lt;p&gt;Much of the resistance/reluctance to utilise threading can be attributed to a single factor--there are few if any good tools available. &lt;ul&gt;&lt;li&gt;Most languages--that the majority of people use--either do not support threading at all, or support it as an afterthought, and then only at the lowest level.&lt;/li&gt;&lt;li&gt;There are few good abstractions of shared state.   If every program still had to deal with disk-bound data by directly manipulating physical blocks and freespace chains, very few programs that manipulate file-based data would exist--that&#39;s most programs in existence.   Today&#39;s ubiquitous hierarchical filesystems make it seem as if there was never any other way, and that tends to imply that they are perfect. 
But we also have RDBMSs, which are most definitely not hierarchical, and not file-based (although they often live on hierarchical filesystems).&lt;/li&gt;&lt;li&gt;Most of today&#39;s programming tools--compilers, interpreters, editors, debuggers, runtime libraries etc.--are the latest evolutions of the same tools that go back years, in many cases decades.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;And like any other type of existing application, adaptation to threading is difficult, resisted, expensive and risky. Redesign from the ground up with threading in mind implies throwing away thousands or millions of development hours in thoroughly tried and tested tools and libraries.    &lt;h4 align=center&gt;Could that possibly be worth the trouble?&lt;/h4&gt;   &lt;p&gt;What if an existing, popular, powerful and flexible language was already being redesigned from the ground up? &lt;p&gt;If that language was already looking to support simple syntax and intuitive semantics for Distributed Operations?&lt;pre class=&quot;block_code&quot;&gt;    if( any( @list ) == constant ) ...    @list1 &gt;&gt; += &lt;&lt; @list2&lt;/pre&gt;    &lt;p&gt;And those DistOps inherently lent themselves to being run concurrently on multiple hyperthreads/cores/CPUs?&lt;p&gt;What if the entire tool-chain to support that new language was also already being redesigned from the ground up?&lt;p&gt;Doesn&#39;t it make sense to write those tools with threading not just &quot;in mind&quot;, but as a high priority?&lt;p&gt;Will those tools be all they could be, if their architects &quot;Don&#39;t do threads&quot;? If the implementors &quot;do not see the need for threads&quot;? &lt;p&gt;Indeed, does it bode well for the future of those tools if the implementors do not use the languages that those tools are to support, and don&#39;t have the slightest feel for what will drive the needs and uses of those languages in the future? 
&lt;p&gt;Do &lt;ul&gt;&lt;li&gt;the complete absence of a threads.pdd from the specification; &lt;/li&gt;&lt;li&gt;the fact that the term &quot;threads&quot; appears only 35 times in the entire documentation set; &lt;/li&gt;&lt;li&gt;the fact that the &quot;failed&quot; ithreads model, so widely denigrated and despised, is being nearly exactly replicated for the underpinnings of the new language; &lt;/li&gt;&lt;/ul&gt;  &lt;p&gt;inspire &lt;i&gt;you&lt;/i&gt; with confidence?&lt;p&gt;What about userspace threading? Many languages provide this, and many programmers find its determinism and light weight lend themselves to many things that they want to do. It&#39;s not a replacement for kernel threads, but if threads != interpreter, then providing primitives to allow cooperative user threading to run within preemptive kernel threading becomes not just possible but almost trivial, whether this is provided at the VM level or the language level. Trying to hack userspace threading into a language when thread == interpreter becomes a completely different ballgame.   &lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-171588&quot;&gt;&lt;hr /&gt;&lt;font size=1 &gt;&lt;div&gt;Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.&lt;/div&gt;&lt;div&gt;Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?&lt;/div&gt;&lt;div&gt;&quot;Science is about questioning the status quo. Questioning authority&quot;. &lt;/div&gt;&lt;div&gt;In the absence of evidence, opinion is indistinguishable from prejudice.&lt;/div&gt;&lt;/font&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>Creating a co-operative framework for testing (gmax)</title>
    <link>http://prlmnks.org/html/579983.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/579983.html</guid>

    <description>
        &lt;p&gt;Hi Monks,&lt;/p&gt;&lt;p&gt;I recently applied for a post of QA developer at &lt;a href=&quot;http://datacharmer.blogspot.com/2006/09/riding-dolphin.html&quot;&gt;an open source company&lt;/a&gt; and I started working there, with great delight, about one month ago. One of the reasons for my hiring was my active participation in the community: blogging, writing articles, submitting bug reports, and answering forum and newsgroup questions. My background as a tester was considered, of course, but the stress was on my community links.&lt;/p&gt;&lt;h3&gt;Inviting an open source community to Quality Assurance tasks&lt;/h3&gt;&lt;p&gt;Thus, one of my tasks is now to organize a framework for cooperation between the Quality Assurance department and the community.&lt;/p&gt;&lt;p&gt;It is not an easy task to tackle. I can see how much an active community can contribute (bug reports, test cases, code patches, usability feedback), but I can also see the clash between a QA department populated by professional testers and the erratic behavior of a large community. Somehow I have to find a way of reconciling these two worlds, and make them work harmoniously toward a common goal.&lt;/p&gt;&lt;p&gt;My plan (yes, in spite of the task being so tough, I do have a plan) is to motivate potential contributors with durable benefits, to get the best possible outcome from this relationship.&lt;/p&gt;&lt;p&gt;During the past twelve months, the company has promoted such cooperation with a contest: win an iPod if you submit many bugs, and/or write articles or blog posts about our products. The results were quite good in the beginning, but they faded away after a few weeks. Some of the prizes are yet to be awarded. Why did this policy fail to provide the expected results? 
IMO, because it was targeting the wrong quality of a potential contributor: it was appealing to competitiveness, instead of tickling pride and the desire for recognition, which work much better in an open source environment. But especially, it did not address the main aspect of open source success, i.e. &lt;b&gt;mutual benefit&lt;/b&gt;. &lt;/p&gt;&lt;p&gt;Why does someone contribute to an open source project? There are many reasons, but the paramount one should be striving for improvement. The main point of open source is being able to modify something that barely works for you into something that fits your needs appropriately. Thus the mutual benefit: when you add a feature or fix a bug in an open source product that affects your work, you make that product better for you, but you are also improving its overall value.&lt;/p&gt;&lt;h3&gt;The challenge of testing&lt;/h3&gt;&lt;p&gt;Among the activities that make up the quality of a software product, testing is perhaps the most visible one. There are other elements, such as policies for coding, internal code reviews, and several organizational issues that will affect the final quality. But testing is a key element in software development. Thus, my company has a huge regression test suite that does a good job of keeping each build free of bugs.&lt;/p&gt;&lt;p&gt;As everyone knows, testing is never enough. Despite all the efforts from the developers and the QA engineers, there is always a bug that escapes their attention and affects the user. Why? Surely because testing everything is impossible, but also because developers and QA engineers are highly trained people and they approach problems from a different perspective than the final users.&lt;/p&gt;&lt;p&gt;Common users take the product for what they need, and just throw at it the commands that will solve their problem, regardless of any other concern that the developers could take into account. This naive behavior is what finds the most appalling bugs. 
&lt;/p&gt;&lt;p&gt;The challenge, then, is to combine the scientific approach of a QA department with the cleverness of willing contributors, who are able to find bugs that elude most of the professionals.&lt;/p&gt;&lt;h3&gt;The lessons of Perl&lt;/h3&gt;&lt;p&gt;That&#39;s why (finally) I am coming here to ask for advice. My job is not as a pure Perl developer (although Perl is widely used in our testing infrastructure), so this problem is not related to Perl as a language, but to Perl as a community. I was accustomed to testing before using Perl, but it is in the Perl community that I found the most efficient way of testing.&lt;/p&gt;&lt;p&gt;I can see that the Perl community has a very good testing infrastructure. I see how it is organized, how it works, but the reasons why it is so good escape me. &lt;/p&gt;&lt;h3&gt;Seeking advice&lt;/h3&gt;&lt;p&gt;Here are the questions:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Why is the Perl testing infrastructure so effective?&lt;/li&gt;&lt;li&gt;If I wanted to export some of the qualities of Perl testing to a non-Perl product, what should I focus on?&lt;/li&gt;&lt;li&gt;What motivates a (QA) contributor?&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;So, fire away. Any insight on this matter could be valuable.&lt;/p&gt;&lt;p&gt;Thanks in advance.&lt;/p&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-127116&quot;&gt;&lt;pre&gt; _  _ _  _  (_|| | |(_|&gt;&lt; _|   &lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>Using functional programming to reduce the pain of parallel-execution programming (with threads, forks, or name your poison) (tphyahoo)</title>
    <link>http://prlmnks.org/html/579976.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/579976.html</guid>

    <description>
        This is a followup from two recent posts:&lt;p&gt;&lt;a href=&quot;/html/579540.html&quot;&gt;Could there be ThreadedMapReduce (and/or ForkedMapReduce) instead of DistributedMapReduce?&lt;/a&gt;&lt;p&gt;&lt;a href=&quot;/html/579015.html&quot;&gt;using parallel processing to concatenate a string, where order of concatenation doesn&#39;t matter&lt;/a&gt;&lt;p&gt;In a sense it is a followup from an older post as well: &lt;p&gt;&lt;a href=&quot;/html/436173.html&quot;&gt;What is the fastest way to download a bunch of web pages?&lt;/a&gt;&lt;p&gt;The common theme to all these posts is doing something in parallel to accomplish a task, where each bit of the task is independent of the other bits. &lt;p&gt;It turns out that this is the essence of the MapReduce algorithm that Google uses to build much of its most important code.  Google uses cluster computing to accomplish this, but the same &quot;parallelization&quot; logic could also be used on a single computer running multiple processes, or even a single process with threads. (Ira Woodhouse has indicated that the next version of MapReduce released to the CPAN will probably have configuration options for &lt;a href=&quot;http://perlmonks.org/?node_id=579891&quot;&gt;single computer clusters&lt;/a&gt; in order to do exactly that.)&lt;p&gt;Currently, the canonical example of a situation where this would be useful for me, on my job, is to download a bunch of web pages and verify that each page regex-matches something that it should.  In other words, grep, where part of the grep test is to download a web page.&lt;p&gt;Throughout my experience with Perl, I have had numerous other situations where this basic idea -- break down a task into components that can run in parallel, run them, and then reassemble the results -- would have been helpful. 
Sometimes I would do this, but because I&#39;m not an experienced thread programmer, was sometimes on Windows and sometimes not, sometimes had to recompile perl, and so on, it was always an unpleasant experience. There were many occasions when I thought, maybe I could use parallelization here... but painful... not enough time to debug.... I&#39;ll just make do with having it be a little slower.&lt;p&gt;An example of a time when I could have, but didn&#39;t, use functional programming based on parallelization to speed things up, was when I had ~50,000 web pages to parse and reformat using a variety of functions based on HTML::TreeBuilder. None of the files cared what the other files looked like, so I could have processed several at the same time. But I got the basic system working serially, and it got the job done in a bit over 48 hours. This was too slow, so I did some simple things with forks and got it down to under a day, which was acceptable because the script only had to run once. &lt;p&gt;But I remember thinking the forking code was the ugliest, and hardest to debug, in my program. &lt;p&gt;If I had had 5 million web pages instead of 50,000, and wanted to split the computation among multiple processors somehow, putting the job queue stuff together for it along the lines I had used till then would have been a nightmare. Even though all I was doing was grep, with a little network communication inside the grep test function.&lt;p&gt;Later when I came across the article &lt;p&gt;&lt;a href=&quot;http://www.softpanorama.org/People/Ousterhout/Threads/index.shtml&quot;&gt;Why Threads are a Bad Idea&lt;/a&gt;&lt;p&gt;it all rang very true to me. Okay, I was working with forks, in a simple context, and this presentation is maybe more about threads in systems programming. But the basic difficulties apply to any situation where you are running stuff in an order that isn&#39;t guaranteed. 
It&#39;s different from running stuff through a simple, ordered, for loop in ways that are quite subtle and hard to detect.&lt;p&gt;What frustrated me the most is the feeling that I couldn&#39;t encapsulate the logic that I want, that I have to keep writing the &quot;ugly bits&quot; again and again.&lt;p&gt;This was before I had heard of functional programming. Since then, I have been learning a lot about functional programming, and trying to incorporate it into my bag of tools. I originally became interested in this after reading Paul Graham&#39;s &lt;a href=&quot;http://www.bookshelf.jp/texi/onlisp/onlisp.html&quot;&gt;On Lisp&lt;/a&gt;. But I want to apply functional programming to make my life easier in Perl. Joel Spolsky suggests in &lt;p&gt;&lt;a href=&quot;http://www.joelonsoftware.com/items/2006/08/01.html&quot;&gt;Can Your Programming Language Do This&lt;/a&gt;&lt;p&gt;that functional programming is a good technique to hide the &quot;ugly but important bits&quot; of your code.  &lt;p&gt;In the article, Spolsky suggests that this is exactly what Google has done with their MapReduce algorithm. Google programmers can write code that says the equivalent of&lt;p&gt;&lt;tt class=&quot;inline_code&quot;&gt;my @results = distributed_mapreduce_grep ( $test_function, [ @in_array ]);&lt;/tt&gt;&lt;p&gt;and this would do exactly what &lt;p&gt;&lt;tt class=&quot;inline_code&quot;&gt;my @results = grep { $test_function-&gt;($_) } @array&lt;/tt&gt;&lt;p&gt;would do in Perl. Except that it works on a cluster. So you can process a lot more data, faster. &lt;p&gt;And the &quot;ugly bits&quot; are hidden.&lt;p&gt;I want to start hiding the ugly bits of my code, using functional programming.&lt;p&gt;The following is my attempt to do that.&lt;p&gt;Unfortunately it still isn&#39;t working. 
But I think it&#39;s an interesting read, and I&#39;m also hoping someone can plug something in there that will make it work.&lt;p&gt;What&#39;s nice is that the &quot;ugly bit&quot; that isn&#39;t working is encapsulated. This is the function &lt;tt class=&quot;inline_code&quot;&gt;hashmap_parallel&lt;/tt&gt;. Currently my &quot;parallelization strategy&quot; involves forking off processes and storing values in a DBM::Deep hard disk store. But actually I don&#39;t care about the implementation details, I just want it to work. &lt;p&gt;****************************&lt;p&gt;UPDATE: Thanks to &lt;a href=&quot;/out/node/LanceDeeply&quot;&gt;LanceDeeply&lt;/a&gt;, I now have code that works, using threads for the parallelization.&lt;p&gt;I kept the function that doesn&#39;t work as hashmap_parallel_forks, which I am still hoping to get working. The code that does work is called hashmap_parallel_threads. The test script has also been updated accordingly.&lt;p&gt;If anyone else wants to shoot me some candidates for their favorite way of implementing transparent parallelization with map, I will add them to the catalog.&lt;p&gt;****************************&lt;p&gt;Hashmap here means essentially the same as the &quot;map&quot; half in MapReduce. It processes a set, where order doesn&#39;t matter. I have several hashmap functions in this code, two of which work, and one (the one that executes in parallel) which doesn&#39;t. &lt;p&gt;These &quot;mapping&quot; functions are given as an argument to the function builder &lt;tt class=&quot;inline_code&quot;&gt;flexygrep&lt;/tt&gt;, which returns a grepping function. So, as a consequence, two of my grepping functions work and one (the parallel one) doesn&#39;t.&lt;p&gt;If I can get &lt;tt class=&quot;inline_code&quot;&gt;hashmap_parallel&lt;/tt&gt; to work, I&#39;m thinking I could theoretically use this to build other functions, like sort_parallel, permute_parallel, you name it. Sometimes, for efficiency reasons, this will make sense. 
A lot of the time it won&#39;t. Depends what your bottleneck is -- cpu, memory, network, disk io, etc. But the good news is that once the parallel mapping function -- or mapping functions -- work, you can just plug them in and try. A lot easier than writing threading code for all scenarios.&lt;p&gt;Again, the bit that I need to get to work -- but which will pay major dividends in maintainability when I do -- is &lt;tt class=&quot;inline_code&quot;&gt;hashmap_parallel&lt;/tt&gt;. &lt;p&gt;Now, here&#39;s a little test output. &lt;pre class=&quot;block_code&quot;&gt;$ ./test_hashmap.pl
ok 1 - parallel-y threadgrep works
not ok 2 - parallel-y forkgrep works
#   Failed test &#39;parallel-y forkgrep works&#39;
#   in ./test_hashmap.pl at line 21.
ok 3 - serially executing code works
1..3
# Looks like you failed 1 test of 3.
&lt;/pre&gt;And here&#39;s the code&lt;pre class=&quot;block_code&quot;&gt;test_hashmap:
#!/usr/bin/perl
use strict;
use warnings;
use Test::More qw( no_plan );
use Data::Dumper;
use Grep;

my $slow_matches_b = sub { sleep 1;
                           return unless $_[0];
                           return 1 if $_[0] =~ /b/;
                         };

my $test_strings = [ (&#39;blee&#39;,&#39;blah&#39;,&#39;bloo&#39;, &#39;qoo&#39;, &#39;fwee&#39; ) ];
my $matches;

$matches = Grep::threadgrep( $slow_matches_b, $test_strings );
ok( @$matches == 3, &quot;parallel-y threadgrep works&quot;  );
# should get blee, blah bloo, but not fwee or qoo

$matches = Grep::forkgrep( $slow_matches_b, $test_strings );
ok( @$matches == 3, &quot;parallel-y forkgrep works&quot;  );

$matches = Grep::slowgrep( $slow_matches_b, $test_strings );
ok( @$matches == 3, &quot;serially executing code works&quot;  );

Grep.pm:
package Grep;
use strict;
use warnings;
use Data::Dumper;
use Map;

# grep can be parallelized by building it on top of map_parallel
# which uses forks, threads, distributed computations with MapReduce
# or some such black magic
# in some cases this may be faster, but not always,
# it depends where your bottleneck is.
# Whatever black magic is going on in the background,
# by abstracting it out, the code we get is clean and easy to read.
sub threadgrep {
  my $test_function = shift;
  my $in_array = shift;
  my $map_function = sub { Map::hashmap_parallel_threads(@_)};
  return flexygrep($test_function, $map_function, $in_array);
}

sub forkgrep {
  my $test_function = shift;
  my $in_array = shift;
  my $map_function = sub { Map::hashmap_parallel_forks(@_)};
  return flexygrep($test_function, $map_function, $in_array);
}

# or you could do it in a non-forked/threaded/distributed/whatever
# way, by basing it on the conceptually simpler function map_serial.
sub slowgrep {
  my $test_function = shift;
  my $in_array = shift;
  my $map_function = sub { Map::hashmap_serialized(@_)};
  return flexygrep($test_function, $map_function, $in_array);
}

sub flexygrep {
  my $test_function = shift;
  my $hashmap_function = shift;
  my $in_array = shift;
  my $in_hash = Map::hash_from_array($in_array);
  my $result_hash = $hashmap_function-&gt;($test_function, $in_hash);
  my $out_array = [];
  for my $key (keys %$result_hash) {
    if ( my $out_true = $result_hash-&gt;{$key}-&gt;{out} ) {
      push @$out_array, $result_hash-&gt;{$key}-&gt;{in}
    }
  }
  return $out_array;
}

1;

Map.pm:
package Map;
use strict;
use warnings;

# Black magic for doing stuff in parallel is encapsulated here
# use MapReduce;
use Parallel::ForkManager;
use threads;
# use threads::shared qw(is_shared);
use DBM::Deep;
use Data::Dumper;

sub hash_from_array {
  my $array = shift;
  my $hash;
  for my $index (0..$#$array) {
    $hash-&gt;{$index}-&gt;{in} = $array-&gt;[$index];
  }
  return $hash;
}

# input is a function (eg, my $multiply_by_ten = sub { return $_[0] * 10 } ), and
# a hash like
# my $input_values = { blee =&gt; { in =&gt; 1 },
#                      blah =&gt; { in =&gt; 2}
#                    }
# output is a hash like
#{ blee =&gt; { in =&gt; 1, out =&gt; 10 },
#  blah =&gt; { in =&gt; 2, out =&gt; 20 }
#}
sub hashmap_serial {
  my $function = shift;
  my $hash = shift;
  die &quot;bad hash&quot; . Dumper($hash) if grep { ! defined($hash-&gt;{$_}-&gt;{in}) } (keys %$hash);
  # hash keys are processed in whatever order
  for my $key ( keys %$hash) {
    my $in = $hash-&gt;{$key}-&gt;{in};
    my $out = $function-&gt;($in);
    #print &quot;result for $in is $out\n&quot;;
    $hash-&gt;{$key}-&gt;{out} = $out;
  }
  return $hash;
}

# does the same thing as hashmap_serial
# but saves the value on the hard drive
# (serialized in this context means a memory value gets put on the hard disk,
# not to be confused with the sense of &quot;serial as opposed to parallel&quot;)
sub hashmap_serialized {
  my $function = shift;
  my $hash = shift;
  die &quot;bad hash&quot; . Dumper($hash) if grep { ! defined($hash-&gt;{$_}-&gt;{in}) } (keys %$hash);
  use File::Path qw(mkpath);
  my $dir=&quot;c:/tmp/map_serialized&quot;;
  mkpath($dir) unless -d &quot;$dir&quot;;
  die &quot;no directory: $dir&quot; unless -d &quot;$dir&quot;;
  my $file=&quot;$dir/$$.db&quot;;
  my $db = DBM::Deep-&gt;new( $file );
  $db-&gt;{result}=$hash;
  for my $key ( keys %$hash ) {
    my $in = $hash-&gt;{$key}-&gt;{in};
    my $out = $function-&gt;($in);
    $hash-&gt;{$key}-&gt;{out} = $out;
  }
  #unlink $file;
  #die &quot;couldn&#39;t delete file&quot; if -f $file;
  return $hash;
}

# same idea, but uses forks to compute &quot;out&quot; values in a parallel way
# doesn&#39;t work.
sub hashmap_parallel_forks {
  my $function = shift;
  my $hash = shift;
  die &quot;bad hash&quot; . Dumper($hash) if grep { ! defined($hash-&gt;{$_}-&gt;{in}) } (keys %$hash);
  return {};
  use File::Path qw(mkpath);
  my $dir=&quot;c:/tmp/map_serialized&quot;;
  mkpath($dir) unless -d &quot;$dir&quot;;
  die &quot;no directory: $dir&quot; unless -d &quot;$dir&quot;;
  my $file=&quot;$dir/$$.db&quot;;
  my $db = DBM::Deep-&gt;new( $file );
  $db-&gt;{result}=$hash;
  my $pm=new Parallel::ForkManager(10);
  for my $key ( keys %$hash ) {
    $pm-&gt;start and next;
    my $in = $hash-&gt;{$key}-&gt;{in};
    my $out = $function-&gt;($in);
    print &quot;in $in, out $out\n&quot;;
    $hash-&gt;{$key}-&gt;{out} = $out;
    $pm-&gt;finish;
  }
  $pm-&gt;wait_all_children;
  print &quot;hash: &quot; . Dumper($hash);
  #unlink $file;
  #die &quot;couldn&#39;t delete file&quot; if -f $file;
  #die &quot;forkgrep result: &quot; . Dumper($hash);
  return $hash;
}

#works
sub hashmap_parallel_threads {
    my $function = shift;
    my $hash = shift;
    my @threads;
    for ( keys %$hash ) {
      my $in = $hash-&gt;{$_}-&gt;{in};
      my $t = threads-&gt;create( sub { map_element($_, $function, $in ) } );
      push @threads, $t;
    }
    #   wait for threads to return ( this implementation is bound by slowest thread )
    my %results = map { %{ $_-&gt;join() }; } @threads;
    #print Dumper \%results;
    return {%results};
}

sub map_element {
      my $key = shift;
      my $function = shift;
      my $in = shift;
      my $out = $function-&gt;($in);
      return { $key =&gt; {
                        in =&gt;  $in,
                        out =&gt; $out
                       }
             };
}

1;
&lt;/pre&gt;&lt;p&gt;UPDATE: Seemingly relevant comments from Jenda at [id://580019]: &lt;p&gt;&quot;You can only transparently paralelize map{} if the executed block is side-effect-free.....&quot;&lt;p&gt;I&#39;m actually not sure if my code here is side-effect free or not. 
Hm...&lt;p&gt;*****************************************************&lt;p&gt;Posts I&#39;m looking at to see if I can use something there to get hashmap_parallel to work...:&lt;p&gt;[id://237089]
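&lt;p&gt;One likely reason a fork-based map comes back empty-handed is that each child process works on its own copy of the hash, so values assigned in the child never reach the parent. A minimal sketch of one way around that, assuming every result is a simple one-line scalar, is to hand each child a pipe. The name hashmap_parallel_pipes is hypothetical and not part of the Map.pm shown above; it uses only core pipe, fork and waitpid.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of the original Map.pm: a fork-based
# hashmap that sends each child's result back to the parent over a pipe.
# Assumes every "out" value is a simple one-line scalar.
sub hashmap_parallel_pipes {
    my ( $function, $hash ) = @_;
    my %reader_for;    # key => [ child pid, read end of that child's pipe ]

    for my $key ( keys %$hash ) {
        pipe( my $reader, my $writer ) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ( $pid == 0 ) {
            # Child: compute one value, write it to the pipe, and exit.
            close $reader;
            my $out = $function->( $hash->{$key}{in} );
            print {$writer} defined $out ? "1$out\n" : "0\n";
            close $writer;
            exit 0;
        }

        # Parent: keep only the read end and move on to the next key.
        close $writer;
        $reader_for{$key} = [ $pid, $reader ];
    }

    for my $key ( keys %reader_for ) {
        my ( $pid, $reader ) = @{ $reader_for{$key} };
        my $line = readline($reader);
        close $reader;
        waitpid( $pid, 0 );
        chomp $line;

        # First character flags defined/undefined; the rest is the value.
        $hash->{$key}{out} = substr( $line, 0, 1 ) ? substr( $line, 1 ) : undef;
    }
    return $hash;
}

my $result = hashmap_parallel_pipes(
    sub { $_[0] =~ /b/ ? 1 : 0 },
    { 0 => { in => 'blee' }, 1 => { in => 'qoo' } },
);
print "$_: $result->{$_}{out}\n" for sort keys %$result;
```

&lt;p&gt;The sketch starts one process per key with no upper limit, so a real version would want to cap the number of children (as Parallel::ForkManager does) and serialize anything more complex than a scalar.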
    </description>
</item>

        

<item>
    <title>C++ vs. Perl mention on shygypsy.com (dhoss)</title>
    <link>http://prlmnks.org/html/579920.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/579920.html</guid>

    <description>
        &lt;p&gt;Hey all,&lt;/p&gt;&lt;p&gt;If you&#39;ve been in the CB within the past 24 hours, you&#39;ve probably at least heard of [http://shygypsy.com/farm/p.cgi|Funny Farm], a pretty sweet little game by Igor Naverniouk.&lt;/p&gt;&lt;p&gt;Looking around [http://www.shygypsy.com|his site], I found on his [http://shygypsy.com/acm/|UVA tools page] that he says, and I quote:&lt;blockquote&gt;ShyGypsy labs&lt;br&gt;August 4, 2006 02:19&lt;br&gt;I&#39;m playing around with some better ways of ranking people and problems. Here is a preview of a prototype. I have also discovered that I can easily write CGI scripts in C++. Perl is great, but I can get stuff done a hundred times faster in C++. &lt;/blockquote&gt;&lt;em&gt;&quot;I can get stuff done a hundred times faster in C++&quot;&lt;/em&gt;? Did I miss something? If I recall correctly, it takes at least 5-6 more lines of code in C++ than in Perl to write a simple &quot;Hello world&quot; program. &lt;/p&gt;&lt;p&gt;Example:&lt;p&gt;C++:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#include &lt;iostream&gt;
using namespace std;

int main ()
{
  cout &lt;&lt; &quot;Hello World!&quot;;
  return 0;
}
&lt;/pre&gt;&lt;p&gt;Perl:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -w
use strict;
print &quot;Hello world!\n&quot;;
&lt;/pre&gt;&lt;p&gt;or, even less:&lt;/p&gt;&lt;tt class=&quot;inline_code&quot;&gt;C:\&gt;perl -e &quot;print &#39;hello world!&#39;&quot;&lt;/tt&gt;&lt;/p&gt;&lt;p&gt;I&#39;m not trying to dig up an old Jihad, and I normally would just search the site for responses on this, but what could possibly justify him saying that?  
It could very well be that he&#39;s more comfortable with C++ and can simply code quicker in C++; logically that&#39;s sound because he&#39;s got 2 C++ courses he teaches at U of Toronto, but I just can&#39;t fathom this.&lt;/p&gt;&lt;p&gt;Another possibility is he has many code libraries in C++, mainly algorithms and such, and maybe that puts C++ on a more level playing field with Perl since it eliminates a lot of low-end coding.  But do libraries really make a &quot;100 times faster&quot; difference?&lt;/p&gt;&lt;p&gt;Please, correct me, downvote me, chastise me if I&#39;m off base and haven&#39;t done my research, but I just can&#39;t possibly see how &lt;em&gt;coding&lt;/em&gt; in C++ could possibly be any faster than coding in Perl.&lt;/p&gt;&lt;p&gt;Your wisdom is appreciated, monks.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;P.S. I just threw this in meditations, move as needed.&lt;/strong&gt;&lt;/p&gt;&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-203787&quot;&gt;&lt;i&gt;meh.&lt;/i&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>Book update: &quot;Exploring Programming Language Architecture in Perl&quot; (billh)</title>
    <link>http://prlmnks.org/html/579895.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/579895.html</guid>

    <description>
        Is it ok to post updates here? I&#39;ve got no-one else to talk to :-)&lt;br/&gt;Anyway, assuming it&#39;s ok, I&#39;ve just finished the first draft of chapter 13 on continuations over at &lt;a href=&quot;http://billhails.net/Book/interpreter-0-0-10.html&quot;&gt;billhails.net&lt;/a&gt;.&lt;br/&gt;It&#39;s still quite rough, and it needs rearranging (move the &quot;what&#39;s so cool about continuations&quot; bits up to the front so the reader feels more motivated to read the rest) and subtitling.&lt;br/&gt;I&#39;m slightly worried that I&#39;m leaning rather too heavily on one source, the venerable and wonderful &quot;Essentials of Programming Languages&quot;, but hopefully by the time I&#39;ve finished I&#39;ll be able to present a more balanced account.&lt;br/&gt;Any and all constructive criticism is welcome: anything I&#39;ve missed, any points that are too laboured etc.&lt;br/&gt;And enjoy of course.&lt;br/&gt;PS There are now PostScript and PDF versions linked from the cover page.&lt;br/&gt;&lt;b&gt;Update: &lt;/b&gt; changed title as per recommendation&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-543551&quot;&gt;Bill H&lt;br&gt;&lt;pre class=&quot;block_code&quot;&gt;perl -e &#39;print sub { &quot;Hello @{[shift]}!\n&quot; }-&gt;(&quot;World&quot;)&#39;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;i&gt;See also: [id://578390] and [id://543564]&lt;/i&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;small&gt;Edited by [planetscape] - added &quot;see also&quot; links&lt;/small&gt;&lt;/p&gt;
    </description>
</item>

        

<item>
    <title>Perl needs Zend (EvanCarroll)</title>
    <link>http://prlmnks.org/html/579777.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/579777.html</guid>

    <description>
        &lt;strong&gt;I feel that it is important to say, I have down voted every response to this for lack of utility. Please post a response that has content, and not just mindless perl fanaticism.&lt;/strong&gt;&lt;p&gt;Fellow monks, I think I&#39;ve finally answered the question as to why PHP is, generally speaking, more successful. I have to point to their corporate backing which largely comes from Zend. Zend offers other commercial products above and beyond the free PHP stuff, and they take the proceeds from these products and dump them back into the language. Zend actually produces seven products, all of which can be found on their PHP powered website at &lt;a href=&quot;/out/http/?url=www.zend.com%2Fproducts&quot;&gt;zend products.&lt;/a&gt; Zend even produces an Optimizer which makes all of your PHP code run up to &quot;25 times faster.&quot; That&#39;s really good news for PHP, especially considering PHP is pretty fast anyway.&lt;/p&gt;&lt;p&gt;I mean look at this, the Zend core actually &lt;a href=&quot;/out/http/?url=www.zend.com%2Fproducts%2Fzend_core%2Fzend_core_for_oracle&quot;&gt;runs Oracle&lt;/a&gt; and has been ported to &lt;a href=&quot;/out/http/?url=www.zend.com%2Fproducts%2Fzend_core%2Fzend_core_for_ibm&quot;&gt;IBM&lt;/a&gt;. Which makes me think maybe the core being more full featured than Perl&#39;s is the reason for PHP&#39;s success. Obviously, it has some advantages; or why would Big Blue and Oracle have adopted it? 
Maybe it&#39;s the additional free support Zend provides that the Perl Foundation doesn&#39;t, &lt;a href=&quot;/out/http/?url=www.zend.com%2Fforums%2Findex.php%3Ft%3Dthread%26frm_id%3D2&quot;&gt;IBM support&lt;/a&gt; and &lt;a href=&quot;/out/http/?url=www.zend.com%2Fforums%2Findex.php%3Ft%3Dthread%26frm_id%3D3&quot;&gt;Oracle support&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;/out/http/?url=www.zend.com%2Fstore%2Fzend_php_certification&quot;&gt;Zend even offers certificates&lt;/a&gt;, and they are &quot;confidant you will pass the first time.&quot; That&#39;s probably because the language has no learning curve whatsoever. You are almost born knowing it! And, if you get both certificates you get a $25 discount. Let&#39;s say, though, you find PHP cryptic; for 769.00 you can get &lt;a href=&quot;/out/http/?url=www.zend.com%2Fstore%2Fzend_php_training%2Fphp_professional_training&quot;&gt;Zend professional training&lt;/a&gt;. Another important thing Zend offers is &lt;a href=&quot;/out/http/?url=www.zend.com%2Fproducts%2Fconsulting_services&quot;&gt;consultation&lt;/a&gt;, which isn&#39;t readily available in the Perl community.&lt;/p&gt;&lt;p&gt;The solution to this problem is simple: Perl needs some really big corporate backing too, from Google, Yahoo, Microsoft or Apple. Maybe there is some way we can get Larry to hand over the project to Zend. Or maybe some day Zend will build a Perl optimizer for us that will run our applications 25 times faster. We probably also need these big corporate guys to run on our core as well, at least Apple; surely the Perl core can handle them.&lt;/p&gt;&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-474782&quot;&gt;&lt;br&gt;&lt;br&gt;Evan Carroll&lt;br&gt;www.EvanCarroll.com&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>RFC: Perl meta programming (bennymack)</title>
    <link>http://prlmnks.org/html/579458.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/579458.html</guid>

    <description>
        &lt;p&gt;Dear Monks,&lt;/p&gt;&lt;p&gt;Herein lies my latest half-baked Perl extension. This one is for reducing the amount of boiler-plate code necessary for writing Perl subs.&lt;/p&gt;&lt;p&gt;As per usual, I will jump right to the code which I always assume will speak for itself ( but rarely does ).&lt;/p&gt;&lt;p&gt;Here is an example of a module that uses my &quot;handy&quot; new module.&lt;/p&gt;SomeTestPackage2.pm&lt;pre class=&quot;block_code&quot;&gt;package SomeTestPackage2;
use strict;
use warnings;
use Data::Dumper;
use constant IVARS =&gt; qw[$none $href $aref $non_empty_aref];
BEGIN {
    use base qw[SomeAttributes2];
    __PACKAGE__-&gt;import( IVARS );
}
use constant custom_aref =&gt; q{
    confess( &#39;!!!package!!! !!!ivar!!! must be an array ref&#39; ) if &#39;ARRAY&#39; ne ref !!!ivar!!!;
};
sub self_sub : Method( qw/ $none :none $href :href $aref custom_aref $non_empty_aref :non_empty_aref / ) { #
    return ( ref $self, ref $href, ref $aref, scalar @{ $non_empty_aref } );
}
1;
&lt;/pre&gt;&lt;p&gt;The previous code snippet shows how the meta programming interface works for the most part.&lt;/p&gt;&lt;p&gt;First, you specify a list of instance variables, or whatever, that you would like to pull into subs. This pre-declaration is necessary for syntax reasons.&lt;/p&gt;&lt;p&gt;Next, you can specify &quot;custom&quot; attributes as package subs if you desire. The difference between a custom attribute and a canned/common one is that canned attributes begin with a colon and custom ones do not.&lt;/p&gt;&lt;p&gt;Now, in your sub attribute you specify a list of variables you&#39;d like pulled into your sub along with any attributes you&#39;d like called on them.&lt;/p&gt;&lt;p&gt;This is the test script running the prior code.&lt;/p&gt;test2.pl&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl
use strict;
use warnings;
use Test::More qw[no_plan];
use SomeTestPackage2;

do { # Successful call to self_sub
    my( $test1 ) = bless( {}, &#39;SomeTestPackage2&#39; );
    my( @self_sub ) = $test1-&gt;self_sub( &#39;hi&#39;, { }, [ ], [ 1 ] );
    ok( $self_sub[0] eq &#39;SomeTestPackage2&#39;, &#39;ref $self&#39; );
    ok( $self_sub[1] eq &#39;HASH&#39;,             &#39;ref $href&#39; );
    ok( $self_sub[2] eq &#39;ARRAY&#39;,            &#39;ref $aref&#39; );
    ok( $self_sub[3] &gt; 0,                   &#39;non empty array&#39; );
};
&lt;/pre&gt;&lt;p&gt;Here is the output of running the previous script.&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;$ perl test2.pl
ok 1 - ref $self
ok 2 - ref $href
ok 3 - ref $aref
ok 4 - non empty array
1..4
&lt;/pre&gt;&lt;p&gt;And finally, here is the code that is compiled from the &quot;Method&quot; attribute on the &quot;self_sub&quot;.&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;sub {
    local *__ANON__=&#39;SomeTestPackage2::self_sub&#39;;
    my( $self, $none, $href, $aref, $non_empty_aref ) = @_;
    confess( &#39;SomeTestPackage2::self_sub $non_empty_aref needs a non-empty array reference&#39; ) if &#39;ARRAY&#39; ne ref $non_empty_aref or not @{ $non_empty_aref };
    confess( &#39;SomeTestPackage2 $aref must be an array ref&#39; ) if &#39;ARRAY&#39; ne ref $aref;
    confess( &#39;SomeTestPackage2::self_sub $href needs a hash reference&#39; ) if &#39;HASH&#39; ne ref $href;
    confess( &#39;SomeTestPackage2::self_sub needs a SomeTestPackage2 reference&#39; ) if ref $self ne &#39;SomeTestPackage2&#39;;
    package SomeTestPackage2;
    use warnings;
    use strict &#39;refs&#39;;
    return ref $self, ref $href, ref $aref, scalar @{$non_empty_aref;};
}
&lt;/pre&gt;&lt;p&gt;That&#39;s it! Is it crap?&lt;/p&gt;
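&lt;p&gt;The !!!package!!! and !!!ivar!!! placeholders in a custom attribute presumably get filled in before the generated sub is compiled. The sketch below shows one way such a template could be expanded with a single substitution. It is an assumption about the mechanism, not the actual SomeAttributes2 code, and expand_check is a hypothetical name.&lt;/p&gt;

```perl
use strict;
use warnings;

# Hypothetical sketch: expand the !!!package!!! and !!!ivar!!! placeholders
# of a check template into Perl source for one variable. The function name
# and the substitution approach are assumptions, not the module's own code.
sub expand_check {
    my ( $template, $package, $ivar ) = @_;
    my %value_for = ( package => $package, ivar => $ivar );

    # Copy the template, then replace every !!!name!!! with its value.
    ( my $code = $template ) =~ s/!!!(\w+)!!!/$value_for{$1}/g;
    return $code;
}

my $custom_aref = q{confess( '!!!package!!! !!!ivar!!! must be an array ref' ) if 'ARRAY' ne ref !!!ivar!!!;};
print expand_check( $custom_aref, 'SomeTestPackage2', '$aref' ), "\n";
```

&lt;p&gt;Run on the custom_aref template, this yields the same check line for $aref that appears in the compiled sub above.&lt;/p&gt;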
    </description>
</item>

        

<item>
    <title>RFC: Perl Testing -- How to Introduce to a team (NovMonk)</title>
    <link>http://prlmnks.org/html/579344.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/579344.html</guid>

    <description>
        Wise monks,&lt;br&gt;&lt;br&gt;My company is facing a transition.  We have been writing perl scripts-- little snippets of code to get certain text outputs-- or modifying longer programs (adding logic to affect printed text output).  We have been using shell scripts to drive our processes-- the most ungodly mess you can imagine, and which varies from job to job based on who set it up, and how many years ago.  The good news: we have recently purchased some new software and we are retooling our existing processes to use it, and moving from flat file to database (MySQL) data.&lt;br&gt;&lt;br&gt;As we talked about designing this new framework, I timidly suggested adding testing modules to the process Up Front, and as a reward, I was tasked with researching testing, and training our developers to do it.  The closest any of us have been to testing is adding print statements to code while debugging-- one of us (not me) actually uses the Perl debugger with some facility.  &lt;br&gt;&lt;br&gt;It&#39;s not a matter of convincing them we need to be testing-- everybody&#39;s sold on that.  We are all just getting overwhelmed with where to start.  I&#39;ve read the posts on testing here found through Super Search, and I have printed some out to share.  I&#39;ve gone to CPAN and printed off several of the modules (Test::Simple, Test::More, the tutorial on testing, etc).  I&#39;ve been reading books on testing and software construction in general: &lt;a href=&quot;/out/isbn/0201795264&quot;&gt;Perl Medic&lt;/a&gt;, &lt;a href=&quot;/out/isbn/0201700549&quot;&gt;Perl Debugged&lt;/a&gt;, &lt;a href=&quot;/out/isbn/0596100922&quot;&gt;Perl testing: A Developer&#39;s Notebook&lt;/a&gt; , and &lt;a href=&quot;/out/isbn/0735619670&quot;&gt;Code Complete&lt;/a&gt;.&lt;br&gt;&lt;br&gt;One problem is: most of us are self taught, or have only the haziest ideas about Object Oriented Programming, or writing/ using modules at all.  
Some things in our process are obvious: you want to test that needed files are where you expect them to be, are in the right format, and that the program generates the expected outfiles and sends them where they need to go.  But-- What else?  If you&#39;re supposed to design tests before you code, what else should we be looking for?&lt;br&gt;&lt;br&gt;My plan for training is to pass out some of the CPAN material on Test::Simple and Test::More, and the posts from here, and parts of the books, some of which we are buying as a company. I&#39;m also going to touch on Perl Tidy and Perl Critic, as ways for us to enforce these coding standards we&#39;re agreeing on.  What I&#39;m looking for from you guys is other suggestions, things I might think about, ways you&#39;ve presented testing to programmers you&#39;ve mentored over the years.  Things you wish you&#39;d known when you were starting out.  Testing procedures you inherited when you started with a company, that you wish you could change.  &lt;br&gt;&lt;br&gt;We&#39;ve this fantastic opportunity to design procedures and policies that will make our lives easier, make us better programmers, and make our company, and us, lots of money.  And that is also a little overwhelming.&lt;br&gt;&lt;br&gt;Many thanks in advance,&lt;br&gt;&lt;br&gt;NovMonk
    </description>
</item>

        

<item>
    <title>OO in Perl 6 (Scott7477)</title>
    <link>http://prlmnks.org/html/578618.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578618.html</guid>

    <description>
        In his latest &lt;a href=&quot;/out/http/?url=dev.perl.org%2Fperl6%2Fdoc%2Fdesign%2Fapo%2FA12.html&quot;&gt;Apocalypse&lt;/a&gt;, Larry Wall indicates that Perl 6 &quot;uses . instead of -&gt; to dereference an object&quot; due to the idea that &quot;The use of arrow where most of the rest of the world uses dot was confusing.&quot;  I think that this is a Very Good Thing, for a couple of reasons.&lt;br&gt;&lt;br&gt;One reason is fewer keystrokes; going from the dash key to the &quot;&gt;&quot; key has always seemed a little bit of a pain :)  The second reason is that this change will likely make it easier to introduce people to Perl.  This will be one less thing for folks new to Perl to go &quot;Huh?&quot; at when you show them some code.  
    </description>
</item>

        

<item>
    <title>(OT) X terminal output speed on linux (zentara)</title>
    <link>http://prlmnks.org/html/578511.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578511.html</guid>

    <description>
        This is just a curious discovery I made while testing &lt;a href=http://software.schmorp.de/pkg/rxvt-unicode&gt; urxvt &lt;/a&gt;.&lt;p&gt;I ran this simple script to print out 500,000 lines, and timed it. Both the xterm and the urxvt have 20k scrollback buffers.&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl
use warnings;
use strict;

# time how long the terminal takes to display 500,000 lines
my $start = time;
for (1..500000){
    print &quot;$_\n&quot;;
}
my $end = time;
my $diff = $end - $start;    # elapsed wall-clock seconds
print &quot;$diff\n&quot;;&lt;/pre&gt;To my utter amazement (and I&#39;m still wondering if it&#39;s an xterm bug), the xterm took a whopping 1067 seconds, while the urxvt took only 19 seconds. &lt;p&gt;I would be interested in hearing the results from other people, on their xterm speed, and maybe k-terminal, etc.&lt;p&gt; Anyone have clues as to why this occurs?&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-131741&quot;&gt;&lt;hr /&gt;I&#39;m not really a human, but I play one on earth.&lt;a href = http://zentara.net/japh.html&gt;Cogito ergo sum a bum&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>The state of audio processing with Perl (Joost)</title>
    <link>http://prlmnks.org/html/578420.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578420.html</guid>

    <description>
        Hi there, fellow monks!&lt;p&gt;My name is &lt;a href=&quot;/out/node/Joost&quot;&gt;Joost&lt;/a&gt; and I&#39;m going to talk about audio processing in perl.&lt;p&gt;I&#39;ve been playing with &quot;live&quot; (not really realtime, but close enough) audio generation/processing in perl for some time now and I&#39;m not really impressed by the code on CPAN to do that.&lt;p&gt;For one thing, there appears to be no portable module to play or record audio with any kind of precision. I mean just some code to throw an audio buffer at the soundcard (OIW, in-process handling of data vs playing an audio file). Sure there&#39;s &lt;a href=&quot;/out/cpan/Audio::Play&quot;&gt;Audio::Play&lt;/a&gt; but that only supports monophonic sound and doesn&#39;t record.&lt;p&gt;Also, most audio modules use incompatible formats for their data and lots of them are badly documented and/or haven&#39;t been updated for years. In other words, coupling these modules is a mess (and slow).&lt;p&gt;More issues: reading and writing different audio file formats. &lt;a href=&quot;/out/cpan/Audio::SndFile&quot;&gt;I&#39;ve taken a stab at that just now&lt;/a&gt; - but it still lacks some important formats.&lt;p&gt;Here are some of the things I feel are needed:&lt;p&gt;&lt;ul&gt;&lt;li&gt;One audio data interchange format. &lt;a href=&quot;/out/cpan/Audio::Data&quot;&gt;Audio::Data&lt;/a&gt; seems to me to be a good starting point, but it lacks low-level access (i.e. a C level interface and packed float/double strings for fast processing).&lt;li&gt;A portable audio playback/recording library. Preferably one that will handle everything that Audio::Play handles but with unlimited channels and a &lt;a href=&quot;/out/http/?url=jackaudio.org&quot;&gt;JACK&lt;/a&gt; interface.&lt;li&gt;Some kind of processing API that allows coupling of perl &amp; XS code. 
Something like my &lt;a href=&quot;/out/cpan/Audio::LADSPA&quot;&gt;Audio::LADSPA&lt;/a&gt; module, but more flexible and with an easier Perl interface.&lt;/ul&gt;&lt;p&gt;So here&#39;s my question: is anyone interested in this kind of thing? By myself I might be able to get this stuff done in a year or so, considering the time I&#39;ve got to spend, but I&#39;d really like people to supply questions, burning needs, code and/or documentation to get this thing on the right track.&lt;p&gt;Also, if you are writing any audio modules on CPAN, I&#39;d like to know about it. &lt;p&gt;Please let me know I&#39;m not the only one interested in this kind of thing :-)&lt;p&gt;In other words: &lt;b&gt;audio people of perl, let&#39;s hear your crazy ideas!&lt;/b&gt;&lt;p&gt;Cheers. J.&lt;p&gt;&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-157432&quot;&gt;&lt;em&gt;&lt;a href=&quot;/out/id/149675&quot;&gt;&quot;What should it profit a man, if he should win a flame war, yet lose his cool?&quot;&lt;/a&gt;&lt;/em&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>RFC: more of that almost-book (billh)</title>
    <link>http://prlmnks.org/html/578390.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578390.html</guid>

    <description>
        Hi, tell me what you think:&lt;br/&gt;&lt;a href=&quot;http:billhails.net/Book&quot;&gt;Exploring Programming Language Architecture in Perl&lt;/a&gt;.&lt;br/&gt;Some of you may remember I posted a link to this many moons ago.&lt;br/&gt;I haven&#39;t really given up on it but it&#39;s become something of a spare-time project, I&#39;ve finished the object-oriented extension and started writing the chapter on continuations. I&#39;ve corrected a lot of bloopers in the earlier chapters and I&#39;ve also learned far more about XML and CSS than anyone should have to :-), but at least it looks prettier now.&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-543551&quot;&gt;Bill H&lt;br&gt;&lt;pre class=&quot;block_code&quot;&gt;perl -e &#39;print sub { &quot;Hello @{[shift]}!\n&quot; }-&gt;(&quot;World&quot;)&#39;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>GUI toolkit choice (jbert)</title>
    <link>http://prlmnks.org/html/578368.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578368.html</guid>

    <description>
        Hi,&lt;br&gt;A quick search only showed a Tk/Gtk discussion from 2001, so I hope I haven&#39;t missed something and that this isn&#39;t a FAQ. Also, I&#39;m hoping that we can discuss things without having a religious war about which is &#39;best&#39;.&lt;p&gt;That said, I&#39;m interested in finding out more about the various options for GUI coding, particularly - but not exclusively - from perl.&lt;p&gt;The main options I know of are:&lt;ul&gt;&lt;li&gt;Gtk+ - originally written for the GIMP, used by the GNOME people and some other apps&lt;li&gt;Qt - written by Trolltech, used by the KDE people and some other apps&lt;li&gt;Tk - from Tcl and ported to perl early on&lt;li&gt;WxWindows - seems good but I know little of it (the DrScheme IDE is written in it and seems decent)&lt;/ul&gt;All of these are cross-platform (at least Linux and Windows, and I would guess Mac and &#39;general unix/X windows&#39; as well).&lt;p&gt;I&#39;ve used Gtk+ from perl and C++ on Linux with good results, and the same scripts run on Windows (once Gtk is installed), which is nice.&lt;p&gt;If anyone has experience of more than one of these toolkits and can provide some thoughts on what they see as their most significant strengths and weaknesses, I&#39;d be grateful.&lt;p&gt;And I&#39;d hesitate to use this thread to ask about the popularity of the different toolkits, but perhaps that would make a good poll?
    </description>
</item>

        

<item>
    <title>The purpose of testing (g0n)</title>
    <link>http://prlmnks.org/html/578279.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578279.html</guid>

    <description>
        Testing has been a bit of a problem for me over the last few months, as some will have heard from my CB rants. Not the &#39;make test and watch the TAP output come pouring out&#39; testing - oh dear no. For much of the last 8 months, I&#39;ve been doing a great deal of manual testing, of the &#39;type these commands, compare 50 fields in the database output with the 50 fields printed in the test script, putting a p (pass) or f (fail) next to each one, sign and date, then go on to the next one of the 150 tests&#39; type. Yuk! Human beings did not evolve to do this kind of thing.&lt;p&gt;So I enthused about automated testing, tried to get my colleagues on side (with a degree of success), but there is one major stumbling block. The Quality Assurance team. Correctly, they&#39;re independent. Unfortunately, they don&#39;t permit automated testing unless it&#39;s done with an approved, validated (company validated that is) automated testing tool (TestDirector, for example). They&#39;re also entirely non technical, and really only concerned with the quality (in terms of change tracking, consistency etc) of documents. The net result of this is that testing is an extraordinary time overhead, and we have to think carefully about what tests to run for a given release. This means that testing is not as thorough as it could, or should be, and bugs creep through. Not as many as you might expect in this situation, but nevertheless more bugs find their way into production than I consider acceptable.&lt;p&gt;This stalemate has been going on for some time. Years actually. Then on Monday something happened. A big, fat bug in some of my code showed up in production. Embarrassing. This bug meant that I now have to run a manual report daily for the next couple of weeks until we can patch, to take the place of the automated report that I broke. 
Embarrassing and irritating, especially since another bug had been emergency fixed that morning.&lt;p&gt;At that point I realised that, just like the QA people, I&#39;d lost sight of the real issue - testing is about finding bugs, not filling in forms. If the formal, QA approved testing is less thorough than it should be, we have to make sure that the code gets properly tested some other way.&lt;p&gt;So I got to work writing unit tests with [cpan://Test::More].&lt;p&gt;3 days&#39; work later I&#39;ve got one of the components up to 50% test coverage and found 3 bugs in edge cases that have never shown up in production. Unfortunately we probably can&#39;t test everything this way, since the perl code is only one component, running in an embedded perl interpreter inside a proprietary application. Integration testing still needs to be done the old way, so our test overhead has gone up by the amount of effort needed to write unit tests, but at least the chances of bugs getting through are reduced.&lt;p&gt;Another advantage of testing with the Perl testing modules is the availability of [cpan://Devel::Cover]. Because the unit testing is informal and unvalidated, test cases can be added any time. If someone has a few minutes spare, a quick run of the test suite with [cpan://Devel::Cover] will show up opportunities for improving the testing.&lt;p&gt;Something else I&#39;d lost sight of is the fact that we primarily want to test &lt;i&gt;our&lt;/i&gt; code, not someone else&#39;s. A lot of our code depends heavily on [cpan://Net::LDAP], so the need to provide a correctly configured directory server looked like a barrier to automated testing. However, end to end integration testing covers the &#39;get data back from the directory server&#39; test case. 
If there&#39;s no directory server easily available for unit testing, we can invade the dependency&#39;s name space to let us test our own code:&lt;p&gt;&lt;pre class=&quot;block_code&quot;&gt;use strict;
use warnings;
use Test::More tests =&gt; 2;

require &#39;MyCode.pl&#39;;

# Replace the Net::LDAP methods our code calls with test stubs.
# (Assumes MyCode.pl calls them as methods, so the invocant comes first.)
{
    no warnings qw(once redefine);
    *Net::LDAP::bind = \&amp;ldapbind;
    *Net::LDAP::new  = \&amp;ldapnew;
}

MyCode::bindToLDAP(&quot;hostname&quot;,&quot;port&quot;,&quot;cn=binddn&quot;,&quot;password&quot;);

sub ldapnew
{
    my ($class, $host) = @_;
    is($host,&quot;hostname:port&quot;,&quot;Check that Net::LDAP::new receives the right params&quot;);
    return bless {}, $class;    # so that bind can be called on the result
}

sub ldapbind
{
    my ($self, %params) = @_;
    my %comparison =
    (
        dn=&gt;&quot;cn=binddn&quot;,
        password=&gt;&quot;password&quot;,
    );
    is_deeply(\%params,\%comparison,&quot;Check that Net::LDAP::bind gets the right params&quot;);
}&lt;/pre&gt;&lt;p&gt;I&#39;m hoping I can get the vendor of the core application to give us information on externally accessing the test functions in their application via XS, so that we can extend the unit tests to include the application config. I&#39;m not hopeful on that front, but it&#39;s worth a try.&lt;p&gt;One final note: in the mindless drudgery of manual testing, I&#39;d also forgotten how much fun one can have writing tests to try and break things :-)&lt;p&gt;&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-421540&quot;&gt;&lt;p&gt;--------------------------------------------------------------&lt;p&gt;&lt;small&gt;&quot;If there is such a phenomenon as absolute evil, it consists in treating another human being as a thing.&quot;&lt;br&gt;John Brunner, &quot;The Shockwave Rider&quot;.&lt;/small&gt;&lt;/div&gt;&lt;/div&gt;
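&lt;p&gt;The name space trick in this post can also be scoped to a single block, so the override does not leak into later tests. What follows is only a minimal, self-contained sketch of that idea: the packages &lt;tt class=&quot;inline_code&quot;&gt;Fake::Directory&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;MyCode&lt;/tt&gt; and their subs are hypothetical stand-ins (not Net::LDAP and not the code described above), and it assumes Perl 5.14 or later for the package block syntax.&lt;/p&gt;

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for a dependency we do not want to call for real.
package Fake::Directory {
    sub lookup { die "would contact a real server\n" }
}

# Hypothetical code under test: it formats whatever the dependency returns.
package MyCode {
    sub describe {
        my ($uid) = @_;
        my $name = Fake::Directory::lookup($uid);
        return "$uid is $name";
    }
}

package main;

{
    # Replace the dependency's sub for this block only;
    # 'local' puts the original back when the block ends.
    no warnings 'redefine';
    local *Fake::Directory::lookup = sub {
        my ($uid) = @_;
        is($uid, 'jdoe', 'dependency receives the uid unchanged');
        return 'John Doe';
    };
    is(MyCode::describe('jdoe'), 'jdoe is John Doe', 'our formatting is correct');
}

done_testing();
```

&lt;p&gt;Because the override is made with &lt;tt class=&quot;inline_code&quot;&gt;local&lt;/tt&gt;, each test block can install its own stub without affecting the others.&lt;/p&gt;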
    </description>
</item>

        

<item>
    <title>Make sure you&#39;re solving the right problem (talexb)</title>
    <link>http://prlmnks.org/html/578159.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578159.html</guid>

    <description>
        &lt;p&gt;This is a followup on some recent posts of mine .. in case anyone new to Perl wonders if it is ever used for real work.&lt;/p&gt;&lt;p&gt;My web application handles document processing (conversion from PDF to XML), and it&#39;s currently running on a dozen servers, the busiest of which is used by two teams, one here in Toronto and the other in Mumbai, India. The work day starts about midnight local time (930am in Mumbai) and goes to about 8 or 9pm, and lately we&#39;ve been running 7 days a week to keep up with demand. If things go South, it lands in my lap. I hate pages at 4am, so I do what I can to avoid that situation.&lt;/p&gt;&lt;p&gt;Anyway, recently, I &lt;a href=&quot;/html/577445.html&quot;&gt;wrote about&lt;/a&gt; a problem I&#39;d been having with a sleepy CGI, and eventually rediscovered the excellent tool &lt;tt class=&quot;inline_code&quot;&gt;strace&lt;/tt&gt; thanks to [jbert|some] [sgifford|respondents]. But that led me to my next problem, as to why the CGI would be freezing at&lt;pre class=&quot;block_code&quot;&gt;select(5, [4], [], [4], NULL&lt;/pre&gt;as &lt;tt class=&quot;inline_code&quot;&gt;strace&lt;/tt&gt; was showing me. After some more research, I discovered this was the connection to the database. Could it be as simple as the CGI waiting for the database that was causing the problem?&lt;/p&gt;&lt;p&gt;Oh boy. The answer is yes, as it turns out. And the CGI was not &#39;going to sleep&#39; -- it was continuing to run, but was patiently waiting for the database, not sleeping.&lt;/p&gt;&lt;p&gt;And the solution to the database slowdown (PostgreSQL, in my case) was the simple application of&lt;pre class=&quot;block_code&quot;&gt;ANALYZE VERBOSE DOCUMENTS;&lt;/pre&gt;and the response time for the main query went from 20 seconds to about 550ms. 
Yay!&lt;/p&gt;&lt;p&gt;Since I&#39;ve seen this performance problem before, I&#39;m now going to keep an eagle eye on the system today and find out how long it takes before the performance starts to drop, then set up a &lt;tt class=&quot;inline_code&quot;&gt;cron&lt;/tt&gt; job to &lt;tt class=&quot;inline_code&quot;&gt;ANALYZE&lt;/tt&gt; the suspect table again, most likely every four hours or so.&lt;/p&gt;&lt;p&gt;This is all a result of &lt;a href=&quot;/html/573972.html&quot;&gt;watching my system closely&lt;/a&gt;, something I think is extremely valuable in any Engineering job -- staying on top of the performance of your system is very important in my current situation of developer, supporter and maintainer.&lt;/p&gt;&lt;p&gt;The moral of the story is, follow the data (I know, it sounds like CSI). Processes very rarely &#39;go to sleep&#39; unless they&#39;re told to. Where&#39;s the CPU time (or bandwidth, or your other resources) going? Follow that lead, and you&#39;ll find the answer to your problem.&lt;/p&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-131279&quot;&gt;&lt;p&gt;Alex / &lt;a href=&quot;/out/node/talexb&quot;&gt;talexb&lt;/a&gt; / Toronto&lt;/p&gt;&lt;p&gt;&lt;small&gt;&quot;&lt;a alt=&quot;GrokLaw, by Pamela Jones&quot; href=&quot;http://www.groklaw.net&quot;&gt;Groklaw&lt;/a&gt; is the open-source mentality applied to legal research&quot; ~ Linus Torvalds&lt;/small&gt;&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>Check out g2 .. a new graphics interface (zentara)</title>
    <link>http://prlmnks.org/html/578135.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578135.html</guid>

    <description>
        I just saw &lt;a href= http://g2.sourceforge.net/&gt; g2 &lt;/a&gt; today on http://freshmeat.net. It is a C lib for generating 2d graphics. I thought it would be worth mentioning here, because it contains a perl module interface, and it works quite well. &lt;p&gt;Its big advantage is that it can output to multiple types simultaneously. See screenshots at above link.&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-131741&quot;&gt;&lt;hr /&gt;I&#39;m not really a human, but I play one on earth.&lt;a href = http://zentara.net/japh.html&gt;Cogito ergo sum a bum&lt;/a&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>Automated regression testing (rinceWind)</title>
    <link>http://prlmnks.org/html/578110.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578110.html</guid>

    <description>
        &lt;p&gt;Where I am working, we have a requirement to provide an automatic regression test suite for an existing live application. The application is C++, not perl (though there is some perl glue already deployed for application monitoring). I&#39;d quite like to do this using Perl, as I think the language is ideally suited. I also have permission to use any CPAN modules of my choice. I will still need to convince colleagues of the wisdom of this choice.&lt;/p&gt;&lt;p&gt;I use the Test::More and Test::Harness mechanisms in unit tests for modules used by the application glue. But, the full regression test is a much bigger prospect. On the automation side, I will need to be doing the following:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;Starting and stopping application components (daemons)&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Interfacing to the application through command line tools&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Trawling and tailing log files to detect events&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Database access, including updates&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;It occurs to me that there is a fundamental difference between an automation &quot;step&quot; and a test. If a test fails, I want the script to carry on and run more tests. If a step fails, I want it to pause requiring manual intervention. The pause could eventually time out, bailing the run, allowing the whole suite to be run in &quot;lights out mode&quot; unattended over a weekend.&lt;/p&gt;&lt;p&gt;My plan is to put the automation steps in first, and add the tests afterwards.&lt;/p&gt;&lt;p&gt;I&#39;d be very interested in hearing from anybody who has attempted or achieved anything like this before. What tools have already been written that can help me? What problems and gotchas should I be careful to avoid?&lt;/p&gt;&lt;!-- Node text goes above. 
Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-144850&quot;&gt;&lt;p&gt;&lt;small&gt;--&lt;br /&gt;&lt;br /&gt;Oh Lord, won&#39;t you burn me a Knoppix CD ?&lt;br /&gt;My friends all rate Windows, I must disagree.&lt;br /&gt;Your powers of persuasion will set them all free,&lt;br /&gt;So oh Lord, won&#39;t you burn me a Knoppix CD ? &lt;br /&gt; &lt;em&gt;(Misquoting Janis Joplin)&lt;/em&gt;&lt;/small&gt;&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;
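&lt;p&gt;The difference between an automation &quot;step&quot; and a test described in this post could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name &lt;tt class=&quot;inline_code&quot;&gt;step&lt;/tt&gt; and the placeholder step body are hypothetical, and a failed step here bails out of the run immediately instead of pausing for manual intervention with a timeout.&lt;/p&gt;

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper: run one automation step and abort the whole
# run if it dies or returns false (no pause/timeout in this sketch).
sub step {
    my ($name, $code) = @_;
    my $ok = eval { $code->() };
    BAIL_OUT("step '$name' failed: " . ($@ || 'returned false')) unless $ok;
    note("step ok: $name");
    return $ok;
}

# Steps set up the environment; a failure here stops everything.
step('start daemon', sub { 1 });    # placeholder for starting a real daemon

# Tests record pass or fail, and the script carries on either way.
ok(1 + 1 == 2, 'first check');
is(lc 'READY', 'ready', 'second check');

done_testing();
```

&lt;p&gt;Keeping steps out of the pass/fail count means the test plan only counts real checks, while a broken environment still stops the run early.&lt;/p&gt;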
    </description>
</item>

        

<item>
    <title>Finding &quot;Accidental Contributions&quot; with Google Code Search (jasonk)</title>
    <link>http://prlmnks.org/html/578030.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578030.html</guid>

    <description>
        &lt;p&gt;Someone mentioned today on &lt;a href=&quot;http://www.digg.com/tech_news/Google_Code_Search_23_000_email_addresses_of_developers&quot;&gt;digg&lt;/a&gt; that it was relatively easy to find &lt;a href=&quot;http://www.google.com/codesearch?hl=en&amp;lr=&amp;q=%5Cs%5Cw%2B%28%5B-%2B.%5D%5Cw%2B%29*%40%5Cw%2B%28%5B-.%5D%5Cw%2B%29%2B%5Cs&amp;btnG=Search&quot;&gt;thousands of developer email addresses&lt;/a&gt; by searching for a suitable regexp on &lt;a href=&quot;http://www.google.com/codesearch&quot;&gt;Google Code Search&lt;/a&gt;.  Since I was already bored (hey, I was reading digg, it couldn&#39;t have been too busy a day) I decided to search for myself and see what would pop up.&lt;/p&gt;&lt;p&gt;The first couple of pages were unsurprising, mostly links to &lt;a href=&quot;http://www.cpan.org/modules/by-authors/id/J/JA/JASONK/&quot;&gt;my modules on CPAN&lt;/a&gt;, but then I stumbled across something interesting.  In a little &lt;a href=&quot;http://www.cs.rochester.edu/~gildea/Map/mapzoom.v0.9.tar.gz&quot;&gt;program I had never heard of&lt;/a&gt; was a copy of &lt;a href=&quot;/out/cpan/Geo::ShapeFile&quot;&gt;Geo::ShapeFile&lt;/a&gt;.  
The fact that they were distributing my module along with their program isn&#39;t a big deal, it&#39;s licensed under the Artistic license, and they credited me and everything else, what caught my eye about it was this comment in the &lt;a href=&quot;http://www.google.com/codesearch?q=show:VRitBwkxeT4:bIY7Gx-_I_I:BH34tVf3fyE&amp;sa=N&amp;ct=rd&amp;cs_p=http://www.cs.rochester.edu/~gildea/Map/mapzoom.v0.9.tar.gz&amp;cs_f=mapzoom/LICENSE&quot;&gt;LICENSE&lt;/a&gt; file:&lt;/p&gt;&lt;blockquote&gt;The version of Geo::Shapfile in this distribution includes fixes for machines of different endianness, not yet available on CPAN.&lt;/blockquote&gt;&lt;p&gt;I downloaded the source and perused it a bit and discovered that the author was even kind enough to include RCS files for the libraries that had been modified, which made it very easy to figure out what local changes he had made, and to find that indeed there were fixes for some endian-related issues in the module which have vexed me for quite a while (sadly my current job does not involve mapping, so I don&#39;t have as much time to put into this module as I would like).  
The most interesting part was that the solution he used seems to be better than my own solution had been, and appears to have been written before the problem was first reported to me.&lt;/p&gt;&lt;p&gt;So, have you checked &lt;a href=&quot;http://www.google.com/codesearch&quot;&gt;Google&lt;/a&gt; to see what people might have neglected to contribute to your projects?&lt;/p&gt;&lt;p&gt;Along the way I also found some other interesting tidbits that I hadn&#39;t known...&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;http://www.google.com/codesearch?q=+kohles&amp;start=30&amp;sa=N&quot;&gt;That I&#39;m mentioned in Jifty::DBI&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.google.com/codesearch?q=+kohles+show:l0TivvNk9QQ:sbLLnUhOfYw:6Qq0VCDKrVg&amp;sa=N&amp;cd=57&amp;ct=rc&amp;cs_p=http://www.achievo.org/files/achievo-1.2.rc1.zip&amp;cs_f=achievo-1.2.rc1/doc/AUTHORS#a0&quot;&gt;a project I vaguely remember hearing something about long ago, that lists me as a contributor&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.google.com/codesearch?q=show:UXXX2_DEyw0:NZXe4USObiA:FNVRaGIErrE&amp;sa=N&amp;ct=rd&amp;cs_p=http://www.mondoirc.net/ircd/services/epona-1.4.14.tar.gz&amp;cs_f=epona-1.4.14/Changes.old&quot;&gt;a project I don&#39;t think I&#39;ve ever heard of that lists me in their Changes file&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://www.google.com/codesearch?q=+kohles+show:eQ8D4rEMNhw:1rVbR8GAUcU:z20xUDL8LtY&amp;sa=N&amp;cd=61&amp;ct=rc&amp;cs_p=ftp://katerina.frederic.k12.wi.us/pilot-link/0.12.0-pre1/pilot-link-0.12.0-pre1.tar.gz&amp;cs_f=pilot-link-0.12.0-pre1/ChangeLog#a0&quot;&gt;One of my very first contributions to an open source project&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;And last (and certainly least) a reminder that &lt;a 
href=&quot;http://www.google.com/codesearch?q=+kohles+show:mp_WvrVpefA:AQOYhA5QlqY:9mAhF7FZjvU&amp;sa=N&amp;cd=67&amp;ct=rc&amp;cs_p=http://hpux.cs.utah.edu/ftp/hpux/Text/CGD-1.2.1/CGD-1.2.1-src-10.20.tar.gz&amp;cs_f=CGD-1.2.1/REAL11.NEW#a0&quot;&gt;I was even more of a geek in college&lt;/a&gt;&lt;/p&gt;&lt;!-- Node text goes above. Div tags should contain sig only --&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-231445&quot;&gt;&lt;hr&gt;&lt;table border=0 width=100%&gt;&lt;tr&gt;&lt;th&gt;&lt;small&gt;We&#39;re not surrounded, we&#39;re in a target-rich environment!&lt;/small&gt;&lt;/th&gt;&lt;/tr&gt;&lt;/table&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>Collective Hacking Projects: Let&#39;s Hear About Them! (jkeenan1)</title>
    <link>http://prlmnks.org/html/578020.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/578020.html</guid>

    <description>
        About two years ago this time I helped organize a Perl Seminar NY contingent of the &lt;a href=&quot;/out/link/http://qa.perl.org/phalanx/&quot;&gt;Phalanx project&lt;/a&gt;.  When I pitched the project to our members, I emphasized &lt;a href=&quot;/out/link/http://thenceforward.net/perl/yapc/YAPC-NA-2005/tt.tgz&quot;&gt;the joys of collective hacking&lt;/a&gt;.  And it proved very enjoyable indeed (and profitable for &lt;a href=&quot;/out/link/http://www.drinkgoodstuff.com/&quot;&gt;the bar where we met to do this hacking&lt;/a&gt;!)&lt;p&gt;We didn&#39;t do as much collective hacking in the 2005-06 season, but with our &lt;a href=&quot;/out/link/http://tech.groups.yahoo.com/group/perlsemny/message/669&quot;&gt;2006-07 Perl Seminar NY season about to begin&lt;/a&gt;, I&#39;d like to rev up some interest among our members.&lt;p&gt;I know of some of the collective hacking projects going on in the Perl community right now, and I&#39;ll be participating in the &lt;a href=&quot;/out/link/http://hackathon.info&quot;&gt;Chicago Perl hackathon next month&lt;/a&gt;.  But just in case I&#39;ve missed any, I&#39;d like to ask the Monks to respond to this Meditation by posting ongoing projects (e.g., Vanilla Perl, Module::Build) and their weblinks.&lt;p&gt;And, for extra credit, if you think that any of these projects could be broken up into components suitable for &lt;b&gt;face-to-face&lt;/b&gt; hacking by local Perlmonger groups, please so indicate.&lt;p&gt;I like online projects, but I don&#39;t like to hack or drink alone.&lt;p&gt;Thanks in advance.&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-360854&quot;&gt;Jim Keenan&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>PPM - It&#39;s a GUI!! (mikasue)</title>
    <link>http://prlmnks.org/html/577889.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/577889.html</guid>

    <description>
        &lt;p&gt;I recently installed ActivePerl 5.8.8 Build 819 on a desktop.  After reading the release notes, I was anxious to explore the new version of the Perl Package Manager.  At first the GUI was a little confusing because it&#39;s not clear what is being listed.  However, after playing with it, I think it&#39;s great.&lt;/p&gt;&lt;p&gt;Listing and installing packages are much easier tasks to complete.  Adding a repository is as easy as typing in the URL and giving it a name.  I can see all packages installed or all packages available for install.  With the click of a button I can upgrade and verify packages.&lt;/p&gt;&lt;p&gt;All is not lost for those who like the command line shell.  Running ppm repo lists all your repositories in a nice bordered table.  Other commands are available too; just run ppm help at the command line.&lt;/p&gt;&lt;p&gt;I like the new version of PPM.  What do you monks think about it?&lt;/p&gt;
    </description>
</item>

        

<item>
    <title>Oracle OpenWorld Pilgrimage (Thwack)</title>
    <link>http://prlmnks.org/html/577879.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/577879.html</guid>

    <description>
        I will be traveling to Oracle&#39;s OpenWorld conference in San Francisco (Oct. 22-26 2006). I am curious whether any other Perl Monks and/or Perl related programmers will be there. I would like to meet up, say hi, and maybe do a lunch or something of that sort. It would be fun to meet other developers/administrators that use or focus on Perl with Oracle. 
    </description>
</item>

        

<item>
    <title>Startups Ideas from Perlmonks (monkfan)</title>
    <link>http://prlmnks.org/html/577850.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/577850.html</guid>

    <description>
        Having visited many social networking sites like &lt;a href=&quot;/out/http/?url=www.reddit.com&quot;&gt;reddit&lt;/a&gt;, &lt;a href=&quot;/out/http/?url=del.icio.us&quot;&gt;del.icio.us&lt;/a&gt;, &lt;a href=&quot;/out/http/?url=www.technorati.com&quot;&gt;Technorati&lt;/a&gt;, and &lt;a href=&quot;/out/node/http:www.digg.com&quot;&gt;Digg&lt;/a&gt;, I think Perlmonks is in some sense far superior to those guys. &lt;br&gt;&lt;br&gt;There are many fundamental features that Perlmonks has and they don&#39;t, e.g. Chatterbox, Bookmarklet, voting systems, and the closely knit community (my knowledge is blatantly pale wrt PM&#39;s wealth of features).  &lt;br&gt;&lt;br&gt;Have you guys ever thought about what kind of business ideas one could learn from our Perlmonks?&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-393886&quot;&gt;&lt;br&gt;Regards,&lt;br&gt;Edward&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>RFC: A Beginners Guide to Fuzzy Rules-Based Systems (lin0)</title>
    <link>http://prlmnks.org/html/577755.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/577755.html</guid>

    <description>
        &lt;p&gt;Greetings Fellow Monks,&lt;/p&gt;&lt;p&gt;This is a 50-minute seminar I am giving next week as a part of an Image Analysis tutorial I am teaching. In this seminar, I introduce Fuzzy Sets and the AI::FuzzyInference module to the students.