<?xml version="1.0" encoding="UTF-8"?>



<rss version="2.0" xmlns:blogChannel="http://backend.userland.com/blogChannelModule">

    <channel>
        <title>perltutorial</title>
        <link>http://prlmnks.org/list/</link>
        <description>RSS feeds from perlmonks.org</description>
        <language>en</language>
        <ttl>5</ttl>

        

<item>
    <title>Threads: why locking is required when using shared variables (ikegami)</title>
    <link>http://prlmnks.org/html/579444.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/579444.html</guid>

    <description>
        &lt;p&gt;I was recently faced with a thread that used &lt;tt class=&quot;inline_code&quot;&gt;.=&lt;/tt&gt; on a shared variable, and I wondered if that was safe. I figured I&#39;d write up an introductory tutorial on the answer I found. For simplicity, we&#39;ll look at &lt;tt class=&quot;inline_code&quot;&gt;++&lt;/tt&gt; first.&lt;hr&gt;&lt;p&gt;The following code outputs 400,000:&lt;pre class=&quot;block_code&quot;&gt;my $count     = 100_000;
my $num_calls = 4;
my $sum = 0;

sub inc { ++$sum for 1..$count; }

inc() for 1..$num_calls;

print(&quot;$sum\n&quot;);   # 400000
&lt;/pre&gt;&lt;p&gt;If you ran the 4 calls to &lt;tt class=&quot;inline_code&quot;&gt;inc&lt;/tt&gt; in parallel, would the answer still be 400,000? Not likely, if you don&#39;t change &lt;tt class=&quot;inline_code&quot;&gt;inc&lt;/tt&gt;.&lt;pre class=&quot;block_code&quot;&gt;use threads;
use threads::shared;

my $count     = 100_000;
my $num_calls = 4;
my $sum : shared = 0;

sub inc { ++$sum for 1..$count; }

$_-&gt;join for map { threads-&gt;create( \&amp;inc ) } 1..$num_calls;

print(&quot;$sum\n&quot;);   # 314813 (varies from run to run)
&lt;/pre&gt;&lt;p&gt;That&#39;s because there is a [wp://race condition].&lt;pre class=&quot;block_code&quot;&gt;+=======================+
|          CPU          |
+-----------+-----------+
| thread 1  | thread 2  |
+===========+===========+
| ...       |           |   T
| load $sum |           |   i
| inc       |           |   m
+-----------+-----------+   e
|           | ...       |   |
|           | load $sum |   |
|           | inc       |   v
|           | save $sum |
|           | ...       |
+-----------+-----------+
| save $sum |           |
| ...       |           |
+===========+===========+
&lt;/pre&gt;&lt;p&gt;The solution is to protect the [wp://critical section] using a thread synchronization mechanism such as &lt;tt class=&quot;inline_code&quot;&gt;lock&lt;/tt&gt;.&lt;pre class=&quot;block_code&quot;&gt;use threads;
use threads::shared;

my $count     = 100_000;
my $num_calls = 4;
my $sum : shared = 0;

sub inc { for (1..$count) { lock($sum); ++$sum } }

$_-&gt;join for map { threads-&gt;create( \&amp;inc ) } 1..$num_calls;

print(&quot;$sum\n&quot;);   # 400000
&lt;/pre&gt;&lt;p&gt;Whenever a transformation operation (read &amp;#8658; manipulate &amp;#8658; write) is performed on a shared variable, locking is needed. See [mod://threads::shared] for tools to do this.&lt;p&gt;The program behind the &lt;tt class=&quot;inline_code&quot;&gt;&amp;lt;spoiler&amp;gt;&lt;/tt&gt; below outputs results similar to the following:&lt;pre class=&quot;block_code&quot;&gt;++s     sum = 233564 (expecting 400000)
s+=1    sum = 143915 (expecting 400000)
c.=l    length = 248149 (expecting 400000)
c=c.l   length = 123360 (expecting 400000)
&lt;/pre&gt;&lt;p&gt;As you can see, &lt;tt class=&quot;inline_code&quot;&gt;+=&lt;/tt&gt;, &lt;tt class=&quot;inline_code&quot;&gt;.=&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;= .&lt;/tt&gt; are also not atomic. The program can only prove that an operator isn&#39;t atomic (i.e. is interruptible). It cannot prove that an operator is atomic (i.e. is not interruptible).
If you&#39;re getting the &quot;expecting&quot; result, try upping &lt;tt class=&quot;inline_code&quot;&gt;$count&lt;/tt&gt; and/or &lt;tt class=&quot;inline_code&quot;&gt;$threads&lt;/tt&gt;.&lt;spoiler&gt;&lt;pre class=&quot;block_code&quot;&gt;use v5.8.0;
use strict;
use warnings;

use threads;
use threads::shared;

{
   my $count   = 100_000;
   my $threads = 4;
   my $sum : shared = 0;

   sub inc {
      for (1..$count) {
         ++$sum;
      }
   }

   $_-&gt;join
      for map { threads-&gt;create( \&amp;inc ) }
          0..$threads-1;

   print(&quot;++s     sum = $sum (expecting &quot; . ($count*$threads). &quot;)\n&quot;);
}

{
   my $count   = 100_000;
   my $threads = 4;
   my $sum : shared = 0;

   sub inc_assign {
      for (1..$count) {
         $sum += 1;
      }
   }

   $_-&gt;join
      for map { threads-&gt;create( \&amp;inc_assign ) }
          0..$threads-1;

   print(&quot;s+=1    sum = $sum (expecting &quot; . ($count*$threads). &quot;)\n&quot;);
}

{
   my $count   = 100_000;
   my $threads = 4;
   my $content : shared = &#39;&#39;;

   sub append {
      my ($letter) = @_;
      for (1..$count) {
         $content .= $letter;
      }
   }

   $_-&gt;join
      for map { threads-&gt;create( \&amp;append, chr(ord(&#39;a&#39;)+$_) ) }
          0..$threads-1;

   print(&quot;c.=l    length = &quot; . length($content) .
         &quot; (expecting &quot; . ($count*$threads). &quot;)\n&quot;);
}

{
   my $count   = 100_000;
   my $threads = 4;
   my $content : shared = &#39;&#39;;

   sub concatenate {
      my ($letter) = @_;
      for (1..$count) {
         $content = $content . $letter;
      }
   }

   $_-&gt;join
      for map { threads-&gt;create( \&amp;concatenate, chr(ord(&#39;a&#39;)+$_) ) }
          0..$threads-1;

   print(&quot;c=c.l   length = &quot; . length($content) .
         &quot; (expecting &quot; . ($count*$threads). &quot;)\n&quot;);
}
&lt;/pre&gt;&lt;/spoiler&gt;&lt;p&gt;&lt;b&gt;Update&lt;/b&gt;: Added the preface and links to Wikipedia.&lt;p&gt;&lt;small&gt;Added to [Tutorials] by [planetscape]&lt;/small&gt;&lt;/p&gt;
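&lt;p&gt;As a minimal sketch of the same fix applied to &lt;tt class=&quot;inline_code&quot;&gt;.=&lt;/tt&gt;, the operator that prompted this tutorial, the following takes the lock before each append. The helper name &lt;tt class=&quot;inline_code&quot;&gt;locked_append_length&lt;/tt&gt; and its parameters are illustrative assumptions and not part of the original program; it also assumes a perl built with thread support.&lt;/p&gt;

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical helper (not from the original post): start $num_threads
# threads that each append one letter $count times to a shared string,
# then return the final length of that string.
sub locked_append_length {
    my ($count, $num_threads) = @_;
    my $content : shared = '';

    my @workers = map {
        my $letter = chr(ord('a') + $_);
        threads->create(sub {
            for (1 .. $count) {
                lock($content);       # released at the end of each iteration
                $content .= $letter;  # read, append, write: now serialized
            }
        });
    } 0 .. $num_threads - 1;

    $_->join for @workers;
    return length($content);
}

print locked_append_length(10_000, 4), "\n";
```

&lt;p&gt;Because the lock is released when the enclosing block exits, it is held for only one loop iteration at a time, so the threads can still interleave between appends.&lt;/p&gt;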
    </description>
</item>

        

<item>
    <title>Big-O Notation - What is it good for? (Limbic~Region)</title>
    <link>http://prlmnks.org/html/573138.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/573138.html</guid>

    <description>
        [All],&lt;br /&gt;If you have ever felt that people debating the big-oh notation of some algorithm sound like they are speaking a foreign language, then this tutorial is for you. You may have even decided to educate yourself by checking out the [wp://Big_o_notation|Wikipedia entry] and have been convinced it was a foreign language. You are not alone.&lt;p&gt;You may have already read [id://227909] by [dws] which is excellent. This tutorial will repeat much of the same information in a much more elementary manner as well as go into detail about how useful &lt;i&gt;or not&lt;/i&gt; the notation is. It is likely I won&#39;t be completely accurate about a number of things in the tutorial. My hope is that by the end, your understanding of the topic will be sufficient for you to follow the corrections others make (&lt;i&gt;as I am sure they will&lt;/i&gt;).&lt;/p&gt;&lt;READMORE&gt;&lt;h4&gt;What Is The Big-O&lt;/h4&gt;This tutorial covers the Big-O as it relates to computer science. If you were thinking of something else (perhaps Fridays in the Chatterbox), you can stop reading now. Simply put, it describes how the algorithm scales (&lt;i&gt;performs&lt;/i&gt;) in the worst-case scenario as it is run with more input. Since my simple explanation may not be simple enough - let me give an example. If we have a sub that searches an array item by item looking for a given element, the scenario that the Big-O describes is when the target element is last (or not present at all). This particular algorithm is O(N), so the same algorithm working on an array with 25 elements should take approximately 5 times longer than on an array with 5 elements.&lt;p&gt;It is easy to lose sight of the fact that there is more to consider about an algorithm than how fast it runs. The Big-O can also be used to describe other behavior such as memory consumption. We often optimize by trading memory for time.
You may need to choose a slower algorithm because it also consumes less of a resource that you need to be frugal with.&lt;/p&gt;&lt;h4&gt;What The Big-O Is Not&lt;/h4&gt;&lt;b&gt;Constants:&lt;/b&gt; The Big-O is not concerned with factors that do not change as the input increases. Let me give an example that may be surprising. Let&#39;s say we have an algorithm that needs to compare every element in an array to every other element in the array. A simple implementation may look like:&lt;pre class=&quot;block_code&quot;&gt;for my $i (0 .. $#array) {
    for my $j (0 .. $#array) {
        next if $j == $i;
        # Compare $i, $j
    }
}
&lt;/pre&gt;This is O(N^2). After a little bit of testing we decide that this is far too slow, so we make a little optimization.&lt;pre class=&quot;block_code&quot;&gt;for my $i (0 .. $#array - 1) {
    for my $j ($i + 1 .. $#array) {
        # Compare $i, $j
    }
}
&lt;/pre&gt;We have just cut our run time in half - YAY! Guess what, the Big-O has stayed the same: O(N^2). This is because N^2 / 2 only has one variable part. The &lt;i&gt;divided by 2&lt;/i&gt; does not affect how the algorithm scales. Put another way, doubling the size of the input roughly quadruples the run time of both versions. While the second algorithm&#39;s curve will always be about half as high as the first, its shape will be exactly the same.&lt;p&gt;&lt;b&gt;Implementation Details:&lt;/b&gt; The Big-O is an uncaring cold-hearted jerk. It does not care if you can&#39;t afford to buy the extra RAM needed for your problem and have to resort to tying your hash to disk. You are on your own. It also doesn&#39;t care that the data structure you would need to implement to achieve O(Log Log N) is so complex you will never be able to maintain it.
In a nutshell, the Big-O lives in the land of theory and doesn&#39;t care very much about the real world.&lt;/p&gt;&lt;h4&gt;What The Big-O Is Good For&lt;/h4&gt;The good news is that the Big-O belongs to an entire family of notation. This tutorial will not cover them, but other family members describe the average and best cases. It also serves as a good indicator of what algorithm to use once you take your individual circumstances into consideration. Let me give a contrived example:&lt;p&gt;Let&#39;s consider using caching as an optimization. In theory, the Big-O is going to ignore it, saying your input is all different and you will never benefit from it. In reality, you test it and discover that you have a 60% hit rate. You do a little more experimenting and discover that the input size required for a more complex algorithm to be faster is larger than your &lt;i&gt;real&lt;/i&gt; maximum input size. All this despite the more complex algorithm having a more favorable Big-O.&lt;/p&gt;&lt;p&gt;In a nutshell, the Big-O of a given algorithm combined with the specific problem knowledge is a great way to choose the best algorithm for your situation.&lt;/p&gt;&lt;h4&gt;What Do Those Symbols Mean?&lt;/h4&gt;So by this point you should realize that Big-O (theory) without context (real world) is not very useful. You are now armed with the knowledge necessary to start using Big-O as the mercenary it is. Ok Big-O, what exactly do you mean that algorithm is O(N Log N)? I am going to duck at this point and suggest you read the node by [dws] or the Wikipedia entry I linked to earlier. You may now be wondering if Big-O is really inanimate, perhaps even an abstract concept and not at all real as I have made it out to be. If so, how then can you determine the Big-O of a given algorithm?
[http://algo.inria.fr/AofA/|Analysis of algorithms] is not for the faint of heart, so I must once again duck.&lt;/READMORE&gt;&lt;p&gt;I have not really added anything to any of the other links I referenced. I do hope however that I have put it in plain enough English to be understood by even the most extreme novice. I welcome those more knowledgeable than myself to add corrections as well as provide additional content. I would only ask that you do so in the same spirit as this tutorial (&lt;i&gt;understandable by non-CS majors&lt;/i&gt;).&lt;/p&gt;&lt;small&gt;&lt;small&gt;Also see [id://25833] by [jeffa]&lt;/small&gt;&lt;/small&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-180961&quot;&gt;&lt;p&gt;Cheers - [Limbic~Region|L~R]&lt;/p&gt;&lt;/div&gt;&lt;/div&gt;
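&lt;p&gt;To make the point about constants concrete, here is a rough sketch that counts the comparisons made by the two nested-loop versions from the tutorial. The sub names &lt;tt class=&quot;inline_code&quot;&gt;count_all_pairs&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;count_unique_pairs&lt;/tt&gt; are made up for this illustration, and counting comparisons is only a stand-in for measuring real run time.&lt;/p&gt;

```perl
use strict;
use warnings;

# Comparisons made by the full double loop (skipping $j == $i).
sub count_all_pairs {
    my ($n) = @_;
    my $comparisons = 0;
    for my $i (0 .. $n - 1) {
        for my $j (0 .. $n - 1) {
            next if $j == $i;
            $comparisons++;
        }
    }
    return $comparisons;
}

# Comparisons made by the optimized loop (each pair visited once).
sub count_unique_pairs {
    my ($n) = @_;
    my $comparisons = 0;
    for my $i (0 .. $n - 2) {
        for my $j ($i + 1 .. $n - 1) {
            $comparisons++;
        }
    }
    return $comparisons;
}

# Each time N doubles, both counts grow by roughly a factor of four.
for my $n (100, 200, 400) {
    printf "N=%d  all=%d  unique=%d\n",
        $n, count_all_pairs($n), count_unique_pairs($n);
}
```

&lt;p&gt;The second count is always half the first, yet both grow about fourfold whenever N doubles, which is what O(N^2) describes.&lt;/p&gt;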
    </description>
</item>

        

<item>
    <title>Common Causes for &quot;Modification of a read-only value attempted&quot; (imp)</title>
    <link>http://prlmnks.org/html/570712.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/570712.html</guid>

    <description>
        You tried, directly or indirectly, to change the value of a constant. This type of error frequently occurs at a distance and is difficult to trace. The magic of $_ is often involved, but is not responsible.&lt;p&gt;Common causes:&lt;ol&gt;&lt;li&gt;&lt;a href=&quot;#loop_lvalue&quot;&gt;  Treating a loop variable as an lvalue&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#map_grep&quot;&gt;Modifying $_ inside foreach, map or grep&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#param_lvalue&quot;&gt;  Modifying elements of @_ directly&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#sort_modify&quot;&gt;  Modifying $a or $b inside sort&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#sort_vivify&quot;&gt;  Autovivifying $a or $b inside sort&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#default_unlocalized&quot;&gt;  Modifying an unlocalized $_&lt;/a&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;&lt;a name=&quot;loop_lvalue&quot;&gt;  &lt;h4&gt;Treating a loop variable as an lvalue&lt;/h4&gt;&lt;/a&gt;&lt;pre class=&quot;block_code&quot;&gt;for my $x (1,2) {
  $x++;
}
&lt;/pre&gt;In this example $x is aliased to the constant &#39;1&#39;, so when the loop body attempts to increment $x an error is triggered. See &lt;a href=&quot;#lists_with_constants&quot;&gt;Lists With Constant Values&lt;/a&gt; for more details.&lt;a name=&quot;map_grep&quot;&gt;&lt;h4&gt;Modifying $_ inside [doc://perlsyn#Foreach-Loops-for-foreach|foreach], [doc://map] or [doc://grep]&lt;/h4&gt;&lt;/a&gt;&lt;pre class=&quot;block_code&quot;&gt;for (1,2) {
  chomp;
}

for (&quot;foo&quot;, @list) {
  s/foo/bar/;
}

@array = map  { $_++ } (1,2);
@array = grep { $_++ } (1,2);
&lt;/pre&gt;In all of these examples $_ is aliased to a constant, and when the loop body attempts to modify $_ an error is triggered.
See &lt;a href=&quot;#lists_with_constants&quot;&gt;Lists With Constant Values&lt;/a&gt; for more details.&lt;a name=&quot;param_lvalue&quot;&gt;&lt;h4&gt;Modifying elements of @_ directly&lt;/h4&gt;&lt;/a&gt;&lt;pre class=&quot;block_code&quot;&gt;sub incr {
  $_[0]++;
}

my $n = 1;
incr($n); # good
incr(1);  # bad
&lt;/pre&gt;Modifying elements of &lt;tt class=&quot;inline_code&quot;&gt;@_&lt;/tt&gt; directly allows you to modify the variable that was passed to the function. In the example above, &lt;i&gt;$n&lt;/i&gt; is now 2. But an error will occur when a constant is passed, as in the second call.&lt;a name=&quot;sort_modify&quot;&gt;&lt;h4&gt;Modifying $a or $b inside [doc://sort]&lt;/h4&gt;&lt;/a&gt;&lt;pre class=&quot;block_code&quot;&gt;@array = sort { $a++ } (1,2);
&lt;/pre&gt;It is permissible (but ill-advised) to modify $a and $b within sort. However, modifying a constant that is aliased to $a or $b is still an error.&lt;a name=&quot;sort_vivify&quot;&gt;&lt;h4&gt;Autovivifying $a or $b inside [doc://sort]&lt;/h4&gt;&lt;/a&gt;&lt;pre class=&quot;block_code&quot;&gt;my @bad;
$bad[0] = [1];
$bad[2] = [2];

@bad = sort {$a-&gt;[0] &lt;=&gt; $b-&gt;[0]} @bad;
&lt;/pre&gt;The variables $a and $b are aliased to each item in the list being sorted, and as such modifying them is possible - but will cause an error if the current element is unmodifiable. A common cause of this is sorting an array of references when the list has a gap.
In this situation $a will be undef, and autovivification by dereferencing will trigger an error.&lt;a name=&quot;default_unlocalized&quot;&gt;&lt;h4&gt;Modifying an unlocalized $_&lt;/h4&gt;&lt;/a&gt;&lt;pre class=&quot;block_code&quot;&gt;for (1,2) {
  my $data = prompt_user();
}

sub prompt_user {
   print &quot;Enter a number\n&quot;;
   while (&lt;STDIN&gt;) {
     # Do stuff
   }
}
&lt;/pre&gt;This example will cause an error because the [doc://perlsyn#Foreach-Loops-for-foreach|for] loop aliases $_ to the literal &#39;1&#39;, and then calls prompt_user which attempts to read a line from STDIN and store it in $_ - which is still aliased to &#39;1&#39;.&lt;p&gt;The error will also occur in this simplified scenario:&lt;pre class=&quot;block_code&quot;&gt;for (1,2) {
    while (&lt;STDIN&gt;) {
    }
}
&lt;/pre&gt;&lt;h3&gt;Guidelines to avoid read-only errors:&lt;/h3&gt;&lt;ol&gt;&lt;li&gt;Don&#39;t treat loop variables as lvalues if there is any chance a constant value will be included&lt;/li&gt;&lt;li&gt;Don&#39;t modify an unlocalized $_&lt;/li&gt;&lt;li&gt;Don&#39;t modify $_ inside map or grep&lt;/li&gt;&lt;li&gt;Don&#39;t modify $a or $b inside sort&lt;/li&gt;&lt;li&gt;Don&#39;t dereference $a or $b inside sort without checking that they exist and are references&lt;/li&gt;&lt;li&gt;Don&#39;t use $_ for the loop variable unless it is a very trivial loop&lt;/li&gt;&lt;li&gt;Don&#39;t modify elements of @_ directly&lt;/li&gt;&lt;/ol&gt;&lt;hr&gt;&lt;h3&gt;Notes&lt;/h3&gt;&lt;a name=&quot;lists_with_constants&quot;&gt;&lt;h4&gt;Lists With Constant Values&lt;/h4&gt;&lt;/a&gt;Within the context of this document, the important thing to understand is which expressions result in modifiable objects. The following expressions have constants in them:&lt;pre class=&quot;block_code&quot;&gt;$_++ for (1,2);
$_++ for (1,@array);
@array = map {$_++} (1,@array);
&lt;/pre&gt;And the following are safe:&lt;pre class=&quot;block_code&quot;&gt;my @array = (1,2);
for (@array) {
  $_++;
}
&lt;/pre&gt;&lt;pre class=&quot;block_code&quot;&gt;my ($x,$y) = (1,2);
for ($x,$y) {
  $_++;
}
&lt;/pre&gt;For an explanation of lists versus arrays I recommend the following:&lt;ul&gt;&lt;li&gt;[id://451421]&lt;/li&gt;&lt;li&gt;[href://http://japhy.perlmonk.org/articles/pm/2000-02.html|&quot;List&quot; Is a Four-Letter Word]&lt;/li&gt;&lt;/ul&gt;
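&lt;p&gt;The difference between aliasing a literal and aliasing an array element can be observed without crashing the program by trapping the error with &lt;tt class=&quot;inline_code&quot;&gt;eval&lt;/tt&gt;. This is only a sketch: the helper &lt;tt class=&quot;inline_code&quot;&gt;error_from&lt;/tt&gt; and the variable names are invented for the illustration.&lt;/p&gt;

```perl
use strict;
use warnings;

# Hypothetical helper: run a code ref and return the error text,
# or an empty string if it completed without dying.
sub error_from {
    my ($code) = @_;
    my $ok = eval { $code->(); 1 };
    return $ok ? '' : "$@";
}

# $_ is aliased to literal constants here, so the increment dies.
my $literal_error = error_from(sub { $_++ for (1, 2) });

# Copy the values into an array first: $_ is now aliased to
# modifiable array elements, so the increment succeeds.
my @copy = (1, 2);
my $array_error = error_from(sub { $_++ for @copy });

print "literals: $literal_error";
print "array:    ", ($array_error eq '' ? "ok, now (@copy)\n" : $array_error);
```

&lt;p&gt;Copying into a named array (or into &lt;tt class=&quot;inline_code&quot;&gt;my&lt;/tt&gt; variables) before modifying is the same idea as the safe examples above.&lt;/p&gt;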
    </description>
</item>

        

<item>
    <title>mod_perl / Apache::Registry accidental closures (imp)</title>
    <link>http://prlmnks.org/html/562746.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/562746.html</guid>

    <description>
        When working with Apache::Registry it is very easy to create accidental closures.&lt;br&gt;This is due to the way Apache::Registry makes a fake package for your script, as I&#39;ll demonstrate in this tutorial.&lt;p&gt;The only indication that something is amiss (other than the unpredictable behaviour) will be the following line in your error log: &lt;i&gt;Variable &quot;$foo&quot; will not stay shared at ...&lt;/i&gt;&lt;p&gt;The following code demonstrates the problem with accidental closures when using Apache::Registry.&lt;pre class=&quot;block_code&quot;&gt;use strict;
use warnings;

my $foo = 5;
print &quot;Content-type: text/plain\n&quot;;
print &quot;Content-disposition: inline; filename=foo.txt\n\n&quot;;
printf &quot;Package: %s\n&quot;, __PACKAGE__;

printf &quot;[%s] Before: %s\n&quot;, $$, $foo;
badness(5);
printf &quot;[%s] After: %s\n&quot;, $$, $foo;

sub badness {
    my $val = shift;
    printf &quot;[%s] badness: %s\n&quot;, $$, $foo;
    $foo += $val;
}
&lt;/pre&gt;&lt;p&gt;Apache::Registry will take the above code and create a new package for it based on the ServerName and the name of the script, and then wrap the code in a sub handler {} block.&lt;p&gt;If your script is running on &quot;foo.com&quot; and is named &quot;test.pl&quot;, then this is what the above code will look like after Apache::Registry is done:&lt;p&gt;&lt;pre class=&quot;block_code&quot;&gt;package Apache::ROOTfoo_2ecom::test_2epl;
use Apache qw(exit);
sub handler {
#line 1 /www/foo.com/test.pl
use strict;
use warnings;

my $foo = 5;
print &quot;Content-type: text/plain\n&quot;;
print &quot;Content-disposition: inline; filename=foo.txt\n\n&quot;;
printf &quot;Package: %s\n&quot;, __PACKAGE__;

printf &quot;[%s] Before: %s\n&quot;, $$, $foo;
badness(5);
printf &quot;[%s] After: %s\n&quot;, $$, $foo;

sub badness {
    my $val = shift;
    printf &quot;[%s] badness: %s\n&quot;, $$, $foo;
    $foo += $val;
}
}
&lt;/pre&gt;&lt;u&gt;First run:&lt;/u&gt;&lt;pre class=&quot;block_code&quot;&gt;Package: Apache::ROOTfoo_2ecom::test_2epl
[13520] Before: 5
[13520] badness: 5
[13520] After: 10
&lt;/pre&gt;&lt;u&gt;Second:&lt;/u&gt;&lt;pre class=&quot;block_code&quot;&gt;Package: Apache::ROOTfoo_2ecom::test_2epl
[19331] Before: 5
[19331] badness: 5
[19331] After: 10
&lt;/pre&gt;&lt;u&gt;Third:&lt;/u&gt;&lt;pre class=&quot;block_code&quot;&gt;Package: Apache::ROOTfoo_2ecom::test_2epl
[19331] Before: 5
[19331] badness: 10
[19331] After: 5
&lt;/pre&gt;&lt;u&gt;Fourth:&lt;/u&gt;&lt;pre class=&quot;block_code&quot;&gt;Package: Apache::ROOTfoo_2ecom::test_2epl
[19331] Before: 5
[19331] badness: 15
[19331] After: 5
&lt;/pre&gt;Notice how the number seen inside the badness sub keeps increasing within each process, while the $foo seen by the script itself is no longer modified by &#39;badness&#39; after the first execution in that process.&lt;p&gt;This is because the badness function is actually an inner function now, and it keeps a reference to the instance of $foo that was created for the first run.&lt;p&gt;&lt;b&gt;Edit - example of how to avoid this issue added, per rhesa&#39;s suggestion&lt;/b&gt;&lt;br&gt;&lt;p&gt;Thankfully it is easy to avoid these problems once you know why they occur.&lt;br&gt;Tips:&lt;ul&gt;&lt;li&gt;Keep your toplevel script minimal&lt;/li&gt;&lt;li&gt;Subroutines should only use the variables that were passed&lt;/li&gt;&lt;li&gt;Encapsulate behaviour in supporting objects&lt;/li&gt;&lt;/ul&gt;Example of a working alternative:&lt;pre class=&quot;block_code&quot;&gt;use strict;
use warnings;

my $foo = 5;
print &quot;Content-type: text/plain\n&quot;;
print &quot;Content-disposition: inline; filename=foo.txt\n\n&quot;;
printf &quot;Package: %s\n&quot;, __PACKAGE__;

printf &quot;[%s] Before: %s\n&quot;, $$, $foo;
badness(\$foo, 5);
badness(\$foo, 5);
printf &quot;[%s] After: %s\n&quot;, $$, $foo;

sub badness {
    my ($foo,$val) = @_;
    printf &quot;[%s] badness: %s\n&quot;, $$, $$foo;
    $$foo += $val;
}
&lt;/pre&gt;
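&lt;p&gt;The effect can be reproduced without mod_perl. The sketch below assumes nothing about Apache::Registry beyond what is described above: a plain &lt;tt class=&quot;inline_code&quot;&gt;handler&lt;/tt&gt; sub stands in for the generated wrapper, and calling it three times stands in for three requests served by one process. The &lt;tt class=&quot;inline_code&quot;&gt;@log&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;@compile_warnings&lt;/tt&gt; arrays are illustrative names.&lt;/p&gt;

```perl
use strict;
use warnings;

our (@compile_warnings, @log);

# Collect warnings so the compile-time "will not stay shared" message
# can be printed alongside the results.
BEGIN { $SIG{__WARN__} = sub { push @compile_warnings, @_ } }

# Stand-in for the wrapper sub that Apache::Registry generates.
sub handler {
    my $foo = 5;
    badness(5);
    push @log, "after: $foo";

    # A named sub nested inside another sub: it binds to the $foo of
    # the first call only.
    sub badness {
        my $val = shift;
        push @log, "badness: $foo";
        $foo += $val;
    }
}

handler() for 1 .. 3;    # three "requests" in the same process

print "$_\n" for @log;
print "warning: $_" for @compile_warnings;
```

&lt;p&gt;From the second call onward, the value logged by &lt;tt class=&quot;inline_code&quot;&gt;badness&lt;/tt&gt; keeps growing while the value logged by &lt;tt class=&quot;inline_code&quot;&gt;handler&lt;/tt&gt; stays at 5, mirroring the third and fourth runs shown above.&lt;/p&gt;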
    </description>
</item>

        

<item>
    <title>Variable scoping in Perl: the basics (Hue-Bond)</title>
    <link>http://prlmnks.org/html/559011.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/559011.html</guid>

    <description>
        &lt;p&gt;&lt;small&gt;Translated to spanish from [id://23317]&#39;s original [id://66677].&lt;/small&gt;&lt;/p&gt;&lt;h1&gt;mbito&lt;/h1&gt;&lt;p&gt;Una de las cosas necesarias para dominar Perl es cmo manejar los mecanismos de mbito que te ofrece. Que queremos globales? Las tenemos. Que queremos evitar &quot;colisiones&quot; (dos variables con el mismo nombre pisndose una a la otra)? Podemos, y hay ms de una forma de conseguirlo. Pero las reglas de mbito de Perl no son siempre tan claras, y no es slo la diferencia entre &lt;tt&gt;my&lt;/tt&gt; y &lt;tt&gt;local&lt;/tt&gt; lo que hace tropezar a la gente, aunque aclarar eso es uno de mis objetivos.&lt;/p&gt;&lt;p&gt;He aprendido mucho de &lt;a href=&quot;http://perl.plover.com/FAQs/Namespaces.html&quot;&gt;Coping with scoping&lt;/a&gt; y de varios libros de Perl (por ejemplo, &lt;a href=&quot;http://www.perlmonks.org/?node=Effective%20Perl%20Programming&quot;&gt;Effective Perl Programming&lt;/a&gt;), por lo que debo dar crdito a sus autores ([Dominus] por el primero, y Joseph N. Hall y [merlyn] por el segundo). [Dominus] tambin ha hecho varias correcciones a los errores (algunos de ellos notables) de una versin anterior de este tutorial, as que l debera considerarse como mnimo el segundo autor (N del T: aunque [Dominus] no est de acuerdo con esto). Sin embargo la documentacin que viene con tu versin de Perl es la ms actualizada que puedes consultar, as que no dudes en usar &lt;tt&gt;perldoc perlop&lt;/tt&gt; y &lt;tt&gt;perldoc -f foo&lt;/tt&gt; en tu propio sistema.&lt;/p&gt;&lt;h3&gt;Resumen&lt;/h3&gt;&lt;p&gt;S, al principio...&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;&lt;tt&gt;my&lt;/tt&gt; proporciona mbito lxico; una variable declarada con &lt;tt&gt;my&lt;/tt&gt; slo es visible en el bloque en que ha sido declarada.&lt;/li&gt;  &lt;li&gt;Los bloques de cdigo son trozos delimitados por llaves { }. 
Un archivo tambin se considera un bloque.&lt;/li&gt;  &lt;li&gt;Usar &lt;tt&gt;use vars qw(&amp;#91;nombres de variables&amp;#93;)&lt;/tt&gt; o &lt;tt&gt;our (&amp;#91;nombres de variables&amp;#93;)&lt;/tt&gt; para crear globales.&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;local&lt;/tt&gt; guarda el valor de una global y lo sustituye por un valor nuevo a efectos del cdigo que est en el bloqueactual y al que llamemos desde tal bloque.&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Espacios de nombres&lt;/h2&gt;&lt;p&gt;Una de las ideas bsicas, aunque no es necesario dominarla para muchos programas, es la de &lt;i&gt;espacios de nombres&lt;/i&gt;. Las variables globales (las que no se declaran con &lt;tt&gt;my&lt;/tt&gt;) estn en un paquete. Los paquetes proporcionan &lt;i&gt;espacios de nombres&lt;/i&gt;, que voy a explicar usando como metfora los apellidos. En los pases de habla hispana, &quot;Roberto&quot; es un nombre bastante comn, as que es posible que conozcamos a ms de un &quot;Roberto&quot; (asumiendo que vivimos en uno de estos pases). Normalmente, para los humanos, el contexto de la conversacin basta para que nuestra audiencia sepa de qu &quot;Roberto&quot; estamos hablando (en el vestbulo de la piscina, &quot;Roberto&quot; es el que controla de dardos; pero en el trabajo &quot;Roberto&quot; es el director de la empresa).&lt;/p&gt;&lt;p&gt;Por supuesto, estas personas tambin tienen &lt;i&gt;apellidos&lt;/i&gt; (pero existen personas distintas con el mismo apellido, as que despus de todo esta metfora no es perfecta), y si quisiramos ser explcitos podramos aadirlos para que quien nos oye sepa de qu &quot;Roberto&quot; hablamos. $Garcia::Roberto es una cosa distinta de $Gonzalez::Roberto. 
Cuando tenemos dos variables distintas con el mismo &quot;nombre de pila&quot;, podemos referirnos a cualquiera de ellas, sin importar el lugar del cdigo en que nos encontremos, usando el nombre completo de la variable.&lt;/p&gt;&lt;p&gt;Se usa el operador &lt;tt&gt;package&lt;/tt&gt; para cambiar el paquete actual. Cuando usamos &lt;tt&gt;package Garcia&lt;/tt&gt; en el programa, estamos, en efecto, diciendo que todas las variables y funciones no calificadas (es decir, que no tienen &quot;apellido&quot; explcito) deben ser entendidas como si estuvieran en el paquete Garcia. Es como decir &quot;en esta parte del programa, voy a hablar de la familia Garcia&quot;.&lt;/p&gt;&lt;p&gt;De forma implcita, hay un &lt;tt&gt;package main&lt;/tt&gt; al principio de los programas, esto es, excepto que declaremos explcitamente un paquete distinto, todas las variables que se declaren (teniendo en cuenta el uso de &lt;tt&gt;my&lt;/tt&gt;) estarn en &lt;tt&gt;main&lt;/tt&gt;. A las variables que estn en un paquete se les llama, y con razn, &quot;globales de paquete&quot;, porque se puede acceder a ellas sin ms desde todos los operadores y subrutinas que estn en tal paquete (y si somos explcitos con sus nombres, tambin son accesibles desde fuera de l).&lt;/p&gt;&lt;p&gt;Usar paquetes hace que acceder a las variables sea como moverse en distintos crculos. Por ejemplo, en el trabajo, se entiende que &quot;Roberto&quot; es &quot;Roberto Szywiecki&quot;, el jefe. En la piscina, &quot;Roberto&quot; es &quot;Roberto Yamauchi&quot;, el experto en dardos. 
Aqu tenemos un pequeo programa para mostrar el uso de paquetes:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -wpackage Szywiecki;$Robert = &quot;el jefe&quot;;sub terminate {  my $name = shift;  print &quot;$Robert ha despedido a ${name}\n&quot;;}terminate(&quot;arturo&quot;); # muestra &quot;el jefe ha despedido a arturo&quot;package main;# terminate(&quot;arturo&quot;); # produce un error si se descomenta__OUTPUT__el jefe ha despedido a arturo&lt;/pre&gt;&lt;p&gt;El nombre completo de la variable &lt;tt&gt;$Robert&lt;/tt&gt; es &lt;tt&gt;$Szywiecki::Robert&lt;/tt&gt; (ntese que el &lt;tt&gt;$&lt;/tt&gt; se desplaza al principio, antes del nombre del paquete, indicando que este es el escalar  &lt;tt&gt;Robert&lt;/tt&gt; que est en el paquete &lt;tt&gt;Szywiecki&lt;/tt&gt;). Para el cdigo y, ms importante, las subrutinas del paquete &lt;tt&gt;Szywiecki&lt;/tt&gt;, un &lt;tt&gt;$Robert&lt;/tt&gt; sin calificar se refiere a &lt;tt&gt;$Szywiecki::Robert&lt;/tt&gt; -- &lt;i&gt;excepto&lt;/i&gt; que &lt;tt&gt;$Robert&lt;/tt&gt; haya sido &quot;enmascarado&quot; por una declaracin &lt;tt&gt;my&lt;/tt&gt; o &lt;tt&gt;local&lt;/tt&gt; (hablaremos de esto despus).&lt;/p&gt;&lt;p&gt;Ahora, al hacer &lt;tt&gt;use strict&lt;/tt&gt; (y se debera! consulta [strict.pm] por ejemplo), tendremos que declarar todas esas variables globales antes de poder usarlas, EXCEPTO que querramos usar siempre sus nombres completos. Esa es la razn por la que la segunda llamada a &lt;tt&gt;terminate&lt;/tt&gt; fallara si  la descomentramos. Perl espera encontrar una subrutina &lt;tt&gt;terminate&lt;/tt&gt; en el paquete &lt;tt&gt;main&lt;/tt&gt;, pero no la hemos definido. 
Es decir, esto:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -wuse strict;$Robert = &quot;el jefe&quot;;  # error!print &quot;\$Robert = $Robert\n&quot;;&lt;/pre&gt;&lt;p&gt;producir un error, mientras que si ponemos el nombre entero (recordando que existe un &lt;tt&gt;package main&lt;/tt&gt; implcito), no hay problema:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -wuse strict;$main::Robert = &quot;el jefe&quot;;print &quot;\$main::Robert = $main::Robert\n&quot;;&lt;/pre&gt;&lt;p&gt;Para satisfacer a &lt;tt&gt;strict &#39;vars&#39;&lt;/tt&gt; (la parte de &lt;tt&gt;strict&lt;/tt&gt; que se encarga de las declaraciones de variables), tenemos dos opciones; producen resultados distintos y una de ellas slo est disponible en Perl 5.6.0 y ms recientes:&lt;/p&gt;&lt;ol&gt;  &lt;li&gt;El operador &lt;tt&gt;our ($foo, $bar)&lt;/tt&gt; (en Perl 5.6.0 y superiores) declara &lt;tt&gt;$foo&lt;/tt&gt; como una variable en el paquete actual.&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;use vars qw($foo $bar)&lt;/tt&gt; (versiones anteriores, pero todava funciona en 5.6) le dice a &lt;tt&gt;strict &#39;vars&#39;&lt;/tt&gt; que es correcto usar estas variables sin calificarlas del todo.&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Una de las diferencias entre &lt;tt&gt;our&lt;/tt&gt; y el ms antiguo &lt;tt&gt;use vars&lt;/tt&gt; es que &lt;tt&gt;our&lt;/tt&gt; proporciona &lt;i&gt;mbito lxico&lt;/i&gt; (ms acerca de esto en la seccin de &lt;tt&gt;my&lt;/tt&gt;, ms abajo).&lt;/p&gt;&lt;p&gt;Otra diferencia es que con &lt;tt&gt;use vars&lt;/tt&gt;, debemos usar un array de &lt;i&gt;nombres&lt;/i&gt; de variables, no las variables propiamente dichas (tal como con &lt;tt&gt;our&lt;/tt&gt;). Ambos mecanismos nos permiten usar globales al mismo tiempo que mantenemos uno de los principales beneficios de &lt;tt&gt;strict &#39;vars&#39;&lt;/tt&gt;: el estar protegidos de crear variables accidentalmente si nos equivocamos al teclear. 
&lt;tt&gt;strict &#39;vars&#39;&lt;/tt&gt; exige que las variables se declaren explcitamente (como diciendo &quot;estas son las globales que voy a usar&quot;). Los dos mecanismos permiten hacer esto con globales de paquete.&lt;/p&gt;&lt;p&gt;Algo que debemos tener en cuenta (que es potencialmente algo malo, dependiendo de lo fantico que uno sea de la privacidad&quot;) es que las variables globales no son slo globales a ese paquete, sino que son accesibles desde &lt;i&gt;cualquier parte del cdigo&lt;/i&gt;, siempre que se usen sus nombres completos. Podemos hablar de Roberto, el experto de dardos, en el trabajo si decimos &quot;Roberto Yamauchi&quot; (en este cdigo no uso &lt;tt&gt;strict&lt;/tt&gt; por brevedad):&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -wpackage Szywiecki;$Robert = &quot;el jefe&quot;;package PoolHall;$Robert = &quot;el experto en dardos&quot;;package Szywiecki; # a trabajar otra vez!print &quot;Aqu en el trabajo, &#39;Robert&#39; es $Robert, pero en la piscina, &#39;Robert&#39; es $PoolHall::Robert\n&quot;;__OUTPUT__Aqu en el trabajo, &#39;Robert&#39; es el jefe, pero en la piscina, &#39;Robert&#39; es el experto en dardos&lt;/pre&gt;&lt;p&gt;Lo veis? Entender los paquetes no es tan difcil. En trminos generales, un paquete es como una familia de variables (y de subrutinas! el nombre completo de aquel &lt;tt&gt;terminate&lt;/tt&gt; en un ejemplo anterior es &lt;tt&gt;&amp;Szywiecki::terminate&lt;/tt&gt; -- lo mismo sirve para hashes y arrays, por supuesto).&lt;/p&gt;&lt;h2&gt;&lt;tt&gt;my&lt;/tt&gt; (y un poco ms sobre &lt;tt&gt;our&lt;/tt&gt;) &lt;i&gt;a.k.a&lt;/i&gt; mbito lxico&lt;/h2&gt;&lt;p&gt;Las variables declaradas con &lt;tt&gt;my&lt;/tt&gt; no son globales, aunque pueden actuar como tales. Uno de los usos principales de &lt;tt&gt;my&lt;/tt&gt; es operar con una variable que slo sirva en un bucle o subrutina, pero desde luego que hay muchos ms. 
Here are some key points about &lt;tt&gt;my&lt;/tt&gt;:&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;The scope of a &lt;tt&gt;my&lt;/tt&gt; variable is a &lt;i&gt;block&lt;/i&gt; of code.&lt;/li&gt;  &lt;li&gt;A block is usually delimited by braces { }, but as far as Perl is concerned, a file is also a block.&lt;/li&gt;  &lt;li&gt;Variables declared with &lt;tt&gt;my&lt;/tt&gt; &lt;i&gt;&lt;b&gt;do not belong to any package&lt;/b&gt;&lt;/i&gt;; they &quot;belong&quot; only to their block.&lt;/li&gt;  &lt;li&gt;Although blocks can be named (for example, &lt;tt&gt;BEGIN&lt;/tt&gt;), we cannot qualify a variable with the block&#39;s name to reach a &lt;tt&gt;my&lt;/tt&gt; variable.&lt;/li&gt;  &lt;li&gt;File-level &lt;tt&gt;my&lt;/tt&gt; variables are those declared in a file but outside any block of code.&lt;/li&gt;  &lt;li&gt;A file-level &lt;tt&gt;my&lt;/tt&gt; variable cannot be accessed from outside the file in which it is declared (&lt;i&gt;unless&lt;/i&gt; it is, for example, returned from a subroutine).&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;As long as we write only single-file programs (for example, ones that import no modules), some of these points don&#39;t matter much. 
But if we care about privacy and encapsulation (for example, when writing modules), we need to understand all of them.&lt;/p&gt;&lt;p&gt;Here is a commented program that illustrates some of them:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -w
use strict;
# remember that we are in package main
use vars qw($foo);
$foo = &quot;Yo!&quot;; # sets $main::foo
print &quot;\$foo: $foo\n&quot;; # prints &quot;Yo!&quot;
my $foo = &quot;Hey!&quot;; # file-level my variable
print &quot;\$foo: $foo\n&quot;; # prints &quot;Hey!&quot; -- the new variable masks the old one
{ # start a block
  my $foo = &quot;Yacht-Z&quot;;
  print &quot;\$foo: $foo\n&quot;;   # prints &quot;Yacht-Z&quot; -- a new $foo is visible here
  print &quot;\$main::foo: $main::foo\n&quot;;  # we can still see $main::foo
  subroutine();
} # end of block
print &quot;\$foo: $foo\n&quot;; # our file-level $foo is visible again!
print &quot;\$main::foo: $main::foo\n&quot;; # $main::foo is still here
sub subroutine {
  print &quot;\$foo: $foo\n&quot;; # prints &quot;Hey!&quot;
  # Why? Because the variable declared in the bare block is no longer in
  # scope -- we now have a different set of braces around this. But the
  # file-level variable is still in scope, and still masks the
  # declaration of $main::foo.
}
package Bar;
print &quot;\$foo: $foo\n&quot;; # prints &quot;Hey!&quot; -- the my variable is still visible
# if we had not made the declaration above, this would be an error: the
# interpreter would tell us that Bar::foo has not been defined.
__OUTPUT__
$foo: Yo!
$foo: Hey!
$foo: Yacht-Z
$main::foo: Yo!
$foo: Hey!
$foo: Hey!
$main::foo: Yo!
$foo: Hey!&lt;/pre&gt;&lt;p&gt;As the bottom of the example shows, because they do not live in any package, &lt;tt&gt;my&lt;/tt&gt; variables &lt;i&gt;can be&lt;/i&gt; visible even after a new package has been declared, &lt;i&gt;because the enclosing block is the file&lt;/i&gt; (at least in this example).&lt;/p&gt;&lt;p&gt;This example uses a bare block, with no control structure attached (such as &lt;tt&gt;if&lt;/tt&gt; or &lt;tt&gt;while&lt;/tt&gt;). But it would make no difference if there were one.&lt;/p&gt;&lt;p&gt;File-level &lt;tt&gt;my&lt;/tt&gt; variables ARE accessible from blocks defined in that file (as the example shows); this is one way in which they can act like globals. If, however, &lt;tt&gt;subroutine&lt;/tt&gt; had been defined in another file, it could not see this file&#39;s &lt;tt&gt;my $foo&lt;/tt&gt;: its &lt;tt&gt;$foo&lt;/tt&gt; would mean the package variable instead (and, under &lt;tt&gt;strict&lt;/tt&gt;, would be an error unless declared there). Once we know how &lt;tt&gt;my&lt;/tt&gt; works, we can tell, just by looking at the syntax of the file, where a variable will be visible. This is one reason the scope it provides is called &quot;lexical&quot;. In this respect, &lt;tt&gt;use vars&lt;/tt&gt; and the newer &lt;tt&gt;our&lt;/tt&gt; operator differ: if we put &lt;tt&gt;our $foo&lt;/tt&gt; in package &lt;tt&gt;Bar&lt;/tt&gt; but &lt;i&gt;outside any block&lt;/i&gt;, we are saying that (until another scoping operator intervenes) occurrences of &lt;tt&gt;$foo&lt;/tt&gt; are to be understood as referring to &lt;tt&gt;$Bar::foo&lt;/tt&gt;. 
This illustrates the difference between &lt;tt&gt;use vars&lt;/tt&gt; and the newer &lt;tt&gt;our&lt;/tt&gt;:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -w
use strict;
our ($bob);
use vars qw($carol);
$carol = &quot;ted&quot;;
$bob = &quot;alice&quot;;
print &quot;Bob =&gt; $bob, Carol =&gt; $carol\n&quot;;
package Movie;
print &quot;Bob =&gt; $bob, Carol =&gt; $carol\n&quot;;&lt;/pre&gt;&lt;p&gt;The second &lt;tt&gt;print&lt;/tt&gt; is an error under &lt;tt&gt;strict&lt;/tt&gt;, because there &lt;tt&gt;$carol&lt;/tt&gt; is taken to mean &lt;tt&gt;$Movie::carol&lt;/tt&gt;, whereas &lt;tt&gt;$bob&lt;/tt&gt; is still &lt;tt&gt;$main::bob&lt;/tt&gt;.&lt;/p&gt;&lt;p&gt;While this &quot;spanning of packages&quot; (which shows up only with &lt;tt&gt;our&lt;/tt&gt;, not with &lt;tt&gt;use vars&lt;/tt&gt;) is a functional similarity between the two lexically scoped operators, &lt;tt&gt;our&lt;/tt&gt; and &lt;tt&gt;my&lt;/tt&gt;, we must not forget the difference between them: &lt;tt&gt;our&lt;/tt&gt; declares a global, but &lt;tt&gt;my&lt;/tt&gt; does not.&lt;/p&gt;&lt;h2&gt;&lt;tt&gt;local&lt;/tt&gt; &lt;i&gt;a.k.a.&lt;/i&gt; dynamic scope&lt;/h2&gt;&lt;p&gt;Now we come to &lt;tt&gt;local&lt;/tt&gt;, which is somewhat like &lt;tt&gt;my&lt;/tt&gt;, but because of its name its function is often confused with that of &lt;tt&gt;my&lt;/tt&gt;. Here is the deal: &lt;tt&gt;local $foo&lt;/tt&gt; &lt;i&gt;saves&lt;/i&gt; the current value of the &lt;b&gt;global&lt;/b&gt; variable &lt;tt&gt;$foo&lt;/tt&gt;, and makes &lt;tt&gt;$foo&lt;/tt&gt; refer, in the current block and in any code called from the current block, to whatever value we give it in that block (a bare &lt;tt&gt;local $foo&lt;/tt&gt; gives &lt;tt&gt;$foo&lt;/tt&gt; the value &lt;tt&gt;undef&lt;/tt&gt;, just as with &lt;tt&gt;my&lt;/tt&gt;). 
Note that &lt;tt&gt;local&lt;/tt&gt; works only on &lt;b&gt;globals&lt;/b&gt;; it cannot be used on a &lt;tt&gt;my&lt;/tt&gt; variable.&lt;/p&gt;&lt;p&gt;Because &lt;tt&gt;local&lt;/tt&gt; can affect things that happen outside the block in which it is used, the scope it provides is called &lt;i&gt;dynamic&lt;/i&gt;: its effect is determined by what happens when the program runs. That is, the compiler cannot tell, while compiling the program (which happens before it runs), when &lt;tt&gt;local&lt;/tt&gt; will or will not be in effect. This distinguishes dynamic scope from the lexical scope provided by &lt;tt&gt;my&lt;/tt&gt; and &lt;tt&gt;our&lt;/tt&gt;, whose effects are visible at compile time.&lt;/p&gt;&lt;p&gt;The basic upshot of this difference is that if we &lt;tt&gt;local&lt;/tt&gt;ize a variable inside a block and call a subroutine from that block, the subroutine will see the value of the &lt;tt&gt;local&lt;/tt&gt;ized variable. This is an important difference between &lt;tt&gt;my&lt;/tt&gt; and &lt;tt&gt;local&lt;/tt&gt;. 
Compare the previous example with this one:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -w
use strict;
use vars qw ($foo); # &quot;our $foo&quot; if using 5.6
$foo = &quot;global value&quot;;
print &quot;\$foo: $foo\n&quot;; # prints &quot;global value&quot;
print &quot;mysub    result &#39;&quot;, &amp;mysub(),    &quot;&#39;\n&quot;; # &quot;global value&quot;
print &quot;localsub result &#39;&quot;, &amp;localsub(), &quot;&#39;\n&quot;; # &quot;local value&quot;
print &quot;no sub   result &#39;&quot;, &amp;showfoo(),  &quot;&#39;\n&quot;; # &quot;global value&quot;
sub mysub {
  my $foo = &quot;my value&quot;;
  showfoo();
}
sub localsub {
  local $foo = &quot;local value&quot;;
  showfoo(); # ALWAYS returns &quot;local value&quot;
}
sub showfoo {
  return $foo;
}
__OUTPUT__
$foo: global value
mysub    result &#39;global value&#39;
localsub result &#39;local value&#39;
no sub   result &#39;global value&#39;&lt;/pre&gt;&lt;p&gt;Note that &lt;tt&gt;showfoo&lt;/tt&gt; (apparently) ignores the &lt;tt&gt;my&lt;/tt&gt; declaration in &lt;tt&gt;mysub&lt;/tt&gt; (since we have left the block in which the &lt;tt&gt;my&lt;/tt&gt; declaration has effect), but the &lt;tt&gt;local&lt;/tt&gt; declaration in &lt;tt&gt;localsub&lt;/tt&gt; is not ignored. And once we leave that block, the original value of &lt;tt&gt;$foo&lt;/tt&gt; is visible again.&lt;/p&gt;&lt;p&gt;I hope you learned as much from reading this as I did from writing it!&lt;/p&gt;
    </description>
</item>

        

<item>
    <title>POD in 5 minutes (Hue-Bond)</title>
    <link>http://prlmnks.org/html/558831.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/558831.html</guid>

    <description>
        &lt;p&gt;&lt;small&gt;Based on the Spanish translation of [node://Juerd]&#39;s original [id://252477].&lt;/small&gt;&lt;/p&gt;&lt;h1&gt;Plain Old Documentation in 5 minutes&lt;/h1&gt;&lt;h2&gt;Documentation is important&lt;/h2&gt;&lt;p&gt;Everyone knows this, and knows why. I am going to skip this section, because any detailed discussion of why documentation is important would break my promise that you can learn to document in five minutes.&lt;/p&gt;&lt;h2&gt;Documentation in Perl&lt;/h2&gt;&lt;p&gt;Perl source code can contain documentation in POD format. POD stands for &quot;Plain Old Documentation&quot;. You can mix POD with code, put all the POD at the beginning, or put it all at the end. It is purely a matter of taste. You choose.&lt;/p&gt;&lt;h2&gt;Headings in POD&lt;/h2&gt;&lt;p&gt;Logical structure is important, so headings are commonly used. There are four levels, and that should be enough. They are written with the commands &lt;tt&gt;=head1&lt;/tt&gt; .. &lt;tt&gt;=head4&lt;/tt&gt; (officially these are called &lt;i&gt;command paragraphs&lt;/i&gt;; they are paragraphs because they are separated from the rest of the POD by blank lines).&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;=head1 NAME

My::Module - An example module&lt;/pre&gt;&lt;h2&gt;Common sections&lt;/h2&gt;&lt;p&gt;To keep things clear, the same sections are used everywhere. You have already seen the NAME section. Yes, it is customary to write head1 paragraphs in ALL CAPS. If you write modules for CPAN, you should use this style. 
If not, or if you use POD for something other than documentation (it is also a good format for writing articles or reports), it is up to you.&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;&lt;i&gt;NAME&lt;/i&gt; contains the name of the module or script, a dash and a short description.&lt;/li&gt;  &lt;li&gt;&lt;i&gt;SYNOPSIS&lt;/i&gt; means &quot;see together&quot; and shows usage examples.&lt;/li&gt;  &lt;li&gt;&lt;i&gt;DESCRIPTION&lt;/i&gt; contains a long description of what the module does and lists its functions.&lt;/li&gt;  &lt;li&gt;&lt;i&gt;BUGS&lt;/i&gt; or &lt;i&gt;CAVEATS&lt;/i&gt; describes bugs or problems the user should know about.&lt;/li&gt;  &lt;li&gt;&lt;i&gt;ACKNOWLEDGEMENTS&lt;/i&gt; is where the author thanks bug fixers and testers.&lt;/li&gt;  &lt;li&gt;&lt;i&gt;COPYRIGHT&lt;/i&gt; or &lt;i&gt;LICENSE&lt;/i&gt; states the copyright restrictions. There is no need to include the whole GPL, though :).&lt;/li&gt;  &lt;li&gt;&lt;i&gt;AVAILABILITY&lt;/i&gt; says where newer versions can be found.&lt;/li&gt;  &lt;li&gt;&lt;i&gt;AUTHOR&lt;/i&gt; explains who wrote the program, if the COPYRIGHT section does not already.&lt;/li&gt;  &lt;li&gt;&lt;i&gt;SEE ALSO&lt;/i&gt; points the reader to further documentation.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;All of these are &lt;tt&gt;=head1&lt;/tt&gt; headings.&lt;/p&gt;&lt;p&gt;Functions, methods and the like are usually explained in a &lt;tt&gt;=head2&lt;/tt&gt; section under DESCRIPTION.&lt;/p&gt;&lt;p&gt;At the very least, document the arguments each function takes and the values it returns. If something has preconditions, mention them. If a function returns &lt;tt&gt;undef&lt;/tt&gt; on error, let people know.&lt;/p&gt;&lt;p&gt;Short sentences are fine. Long ones are best avoided.&lt;/p&gt;&lt;h2&gt;Code examples&lt;/h2&gt;&lt;p&gt;Indented paragraphs are treated as code, with the indentation left intact. 
It is that easy!&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;=head1 SYNOPSIS

    use My::Module;
    my $object = My::Module-&gt;new();
    print $object-&gt;as_string;&lt;/pre&gt;&lt;p&gt;This is called a &lt;i&gt;verbatim paragraph&lt;/i&gt;.&lt;/p&gt;&lt;h2&gt;Markup&lt;/h2&gt;&lt;p&gt;POD supports a small set of markup elements. To keep my promise, I will just list them:&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;&lt;tt&gt;B&amp;lt;bold text&amp;gt;&lt;/tt&gt;&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;I&amp;lt;italic text&amp;gt;&lt;/tt&gt;&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;U&amp;lt;underlined text&amp;gt;&lt;/tt&gt; (not among the codes listed in perlpod)&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;C&amp;lt;code&amp;gt;&lt;/tt&gt;&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;B&amp;lt;and they can be I&amp;lt;nested&amp;gt;&amp;gt;&lt;/tt&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;There are also F, S, X and Z, but they are rarely used and not worth explaining in a short tutorial like this one.&lt;/p&gt;&lt;p&gt;If you ever need to include a &#39;&amp;gt;&#39; character inside code, there are several options. To render &lt;tt&gt;$foo-&gt;bar&lt;/tt&gt; in a code font, any of these will do:&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;&lt;tt&gt;C&amp;lt;$foo-E&amp;lt;gt&amp;gt;bar&amp;gt;&lt;/tt&gt;&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;C&amp;lt;&amp;lt; $foo-&amp;gt;bar &amp;gt;&amp;gt;&lt;/tt&gt; (the spaces are required!)&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;C&amp;lt;&amp;lt;&amp;lt; $foo-&amp;gt;bar &amp;gt;&amp;gt;&amp;gt;&lt;/tt&gt; (the spaces are required!)&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Entities&lt;/h2&gt;&lt;p&gt;We have seen that E can be used for entities. 
They work like HTML entities, with the addition of:&lt;/p&gt;&lt;ul&gt;  &lt;li&gt;&lt;tt&gt;verbar&lt;/tt&gt; for a vertical bar.&lt;/li&gt;  &lt;li&gt;&lt;tt&gt;sol&lt;/tt&gt; for a slash (solidus).&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Numeric entities can be given in decimal, octal (prefixed with &#39;0&#39;) or hexadecimal (prefixed with &#39;0x&#39;).&lt;/p&gt;&lt;h2&gt;Lists&lt;/h2&gt;&lt;p&gt;Here an example is much clearer than an explanation:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;=head2 Methods

=over 12

=item C&amp;lt;new&amp;gt;

Returns a new My::Module object.

=item C&amp;lt;as_string&amp;gt;

Returns a stringified representation of
the object. This is mainly for debugging
purposes.

=back&lt;/pre&gt;&lt;p&gt;As you can see, the list starts with &lt;tt&gt;=over&lt;/tt&gt; and ends with &lt;tt&gt;=back&lt;/tt&gt;. In between are &lt;tt&gt;=item&lt;/tt&gt;s. The number after &lt;tt&gt;=over&lt;/tt&gt; is the indentation level, used mainly by text renderers to produce a horizontal layout. pod2text turns the example above into:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  Methods
      &quot;new&quot;       Returns a new
                  My::Module object.
      &quot;as_string&quot; Returns a stringified
                  representation of the
                  object. This is mainly
                  for debugging purposes.&lt;/pre&gt;&lt;h2&gt;Other POD bits&lt;/h2&gt;&lt;p&gt;L can be used to link to sections of the same document or to other documents. POD is ended with &lt;tt&gt;=cut&lt;/tt&gt; to return to Perl. There are special commands for the different output formats. 
To read the full POD documentation, type:&lt;/p&gt;&lt;tt class=&quot;inline_code&quot;&gt;perldoc perlpod&lt;/tt&gt;&lt;h2&gt;A complete example&lt;/h2&gt;&lt;pre class=&quot;block_code&quot;&gt;=head1 NAME

My::Module - An example module

=head1 SYNOPSIS

    use My::Module;
    my $object = My::Module-&gt;new();
    print $object-&gt;as_string;

=head1 DESCRIPTION

This module does not really exist, it
was made for the sole purpose of
demonstrating how POD works.

=head2 Methods

=over 12

=item C&amp;lt;new&amp;gt;

Returns a new My::Module object.

=item C&amp;lt;as_string&amp;gt;

Returns a stringified representation of
the object. This is mainly for debugging
purposes.

=back

=head1 AUTHOR

Juerd - &amp;lt;http://juerd.nl/&amp;gt;

=head1 SEE ALSO

L&amp;lt;perlpod&amp;gt;, L&amp;lt;perlpodspec&amp;gt;

=cut&lt;/pre&gt;&lt;h2&gt;Conclusion&lt;/h2&gt;&lt;p&gt;Documenting with POD is easy. Have fun!&lt;/p&gt;
    </description>
</item>

        

<item>
    <title>perlunitut: Unicode in Perl (Juerd)</title>
    <link>http://prlmnks.org/html/551676.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/551676.html</guid>

    <description>
        &lt;p&gt;&lt;a name=&quot;__index__&quot;&gt;&lt;/a&gt;&lt;/p&gt;&lt;!-- INDEX BEGIN --&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;#name&quot;&gt;NAME&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#description&quot;&gt;DESCRIPTION&lt;/a&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;#definitions&quot;&gt;Definitions&lt;/a&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;#unicode&quot;&gt;Unicode&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#utf8&quot;&gt;UTF-8&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#text_strings__character_strings_&quot;&gt;Text strings (character strings)&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#binary_strings__byte_strings_&quot;&gt;Binary strings (byte strings)&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#encoding&quot;&gt;Encoding&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#decoding&quot;&gt;Decoding&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#internal_format&quot;&gt;Internal format&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;&lt;a href=&quot;#your_new_toolkit&quot;&gt;Your new toolkit&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#i_o_flow__the_actual_5_minute_tutorial_&quot;&gt;I/O flow (the actual 5 minute tutorial)&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#q_and_a&quot;&gt;Q and A&lt;/a&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;#this_isn_t_really_a_unicode_tutorial__is_it&quot;&gt;This isn&#39;t really a Unicode tutorial, is it?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_about_binary_data__like_images&quot;&gt;What about binary data, like images?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_about_the_utf8_flag&quot;&gt;What about the UTF-8 flag?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#when_should_i_decode_or_encode&quot;&gt;When should I decode or encode?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_if_i_don_t_decode&quot;&gt;What if I don&#39;t decode?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_if_i_don_t_encode&quot;&gt;What if I don&#39;t 
encode?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#is_there_a_way_to_automatically_decode_or_encode&quot;&gt;Is there a way to automatically decode or encode?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#cheat__tell_me__how_can_i_cheat&quot;&gt;Cheat?! Tell me, how can I cheat?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_if_i_don_t_know_which_encoding_was_used&quot;&gt;What if I don&#39;t know which encoding was used?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#can_i_use_unicode_in_my_perl_sources&quot;&gt;Can I use Unicode in my Perl sources?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#data__dumper_doesn_t_restore_the_utf8_flag__is_it_broken&quot;&gt;Data::Dumper doesn&#39;t restore the UTF-8 flag; is it broken?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#how_can_i_determine_if_a_string_is_a_text_string_or_a_binary_string&quot;&gt;How can I determine if a string is a text string or a binary string?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#how_do_i_convert_from_encoding_foo_to_encoding_bar&quot;&gt;How do I convert from encoding FOO to encoding BAR?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_about_the_use_bytes_pragma&quot;&gt;What about the &lt;tt class=&quot;inline_code&quot;&gt;use bytes&lt;/tt&gt; pragma?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_are_decode_utf8_and_encode_utf8&quot;&gt;What are &lt;tt class=&quot;inline_code&quot;&gt;decode_utf8&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;encode_utf8&lt;/tt&gt;?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_s_the_difference_between_utf8_and_utf8&quot;&gt;What&#39;s the difference between &lt;tt class=&quot;inline_code&quot;&gt;UTF-8&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;utf8&lt;/tt&gt;?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#what_character_encodings_does_perl_support&quot;&gt;What character encodings does Perl support?&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a 
href=&quot;#which_version_of_perl_should_i_use&quot;&gt;Which version of perl should I use?&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;li&gt;&lt;a href=&quot;#summary&quot;&gt;SUMMARY&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#acknowledgements&quot;&gt;ACKNOWLEDGEMENTS&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#author&quot;&gt;AUTHOR&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;#see_also&quot;&gt;SEE ALSO&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;!-- INDEX END --&gt;&lt;hr /&gt;&lt;p&gt;&lt;/p&gt;&lt;h1&gt;&lt;a name=&quot;name&quot;&gt;NAME&lt;/a&gt;&lt;/h1&gt;&lt;p&gt;perlunitut - Perl Unicode Tutorial&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr /&gt;&lt;h1&gt;&lt;a name=&quot;description&quot;&gt;DESCRIPTION&lt;/a&gt;&lt;/h1&gt;&lt;p&gt;The days of just flinging strings around are over. It&#39;s well established that modern programs need to be capable of communicating funny accented letters, and things like euro symbols. This means that programmers need new habits. It&#39;s easy to program Unicode capable software, but it does require discipline to do it right.&lt;/p&gt;&lt;p&gt;There&#39;s a lot to know about character sets, and text encodings. It&#39;s probably best to spend a full day learning all this, but the basics can be learned in minutes.&lt;/p&gt;&lt;p&gt;These are not the very basics, though. It is assumed that you already know the difference between bytes and characters, and realise (and accept!) that there are many different character sets and encodings, and that your program has to be explicit about them. 
Recommended reading is &quot;The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)&quot; by Joel Spolsky, at &lt;a href=&quot;http://joelonsoftware.com/articles/Unicode.html&quot;&gt;http://joelonsoftware.com/articles/Unicode.html&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;This tutorial speaks in rather absolute terms, and provides only a limited view of the wealth of character string related features that Perl has to offer. For most projects, this information will probably suffice.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;a name=&quot;definitions&quot;&gt;Definitions&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;It&#39;s important to set a few things straight first. This is the most important part of this tutorial. This view may conflict with other information that you may have found on the web, but that&#39;s mostly because many sources are wrong.&lt;/p&gt;&lt;p&gt;You may have to re-read this entire section a few times...&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;unicode&quot;&gt;Unicode&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Unicode&lt;/strong&gt; is a character set with room for lots of characters. The ordinal value of a character is called a &lt;strong&gt;code point&lt;/strong&gt;.&lt;/p&gt;&lt;p&gt;There are many, many code points, but computers work with bytes, and a byte can have only 256 values. Unicode has many more characters, so you need a method to make these accessible.&lt;/p&gt;&lt;p&gt;Unicode is encoded using several competing encodings, of which UTF-8 is the most used. In a Unicode encoding, multiple subsequent bytes can be used to store a single code point, or simply: character.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;utf8&quot;&gt;UTF-8&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;UTF-8&lt;/strong&gt; is a Unicode encoding. Many people think that Unicode and UTF-8 are the same thing, but they&#39;re not. 
There are more Unicode encodings, but much of the world has standardized on UTF-8.&lt;/p&gt;&lt;p&gt;UTF-8 treats the first 128 code points, 0..127, the same as ASCII. They take only one byte per character. All other characters are encoded as two or more (up to four, in standard UTF-8) bytes using a complex scheme. Fortunately, Perl handles this for us, so we don&#39;t have to worry about this.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;text_strings__character_strings_&quot;&gt;Text strings (character strings)&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Text strings&lt;/strong&gt;, or &lt;strong&gt;character strings&lt;/strong&gt; are made of characters. Bytes are irrelevant here, and so are encodings. Each character is just that: the character.&lt;/p&gt;&lt;p&gt;On a text string, you would do things like:&lt;/p&gt;&lt;pre&gt;    $text =~ s/foo/bar/;
    if ($string =~ /^\d+$/) { ... }
    $text = ucfirst $text;
    my $character_count = length $text;&lt;/pre&gt;&lt;p&gt;The value of a character (&lt;tt class=&quot;inline_code&quot;&gt;ord&lt;/tt&gt;, &lt;tt class=&quot;inline_code&quot;&gt;chr&lt;/tt&gt;) is the corresponding Unicode code point.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;binary_strings__byte_strings_&quot;&gt;Binary strings (byte strings)&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Binary strings&lt;/strong&gt;, or &lt;strong&gt;byte strings&lt;/strong&gt; are made of bytes. Here, you don&#39;t have characters, just bytes. 
All communication with the outside world (anything outside of your current Perl process) is done in binary.&lt;/p&gt;&lt;p&gt;On a binary string, you would do things like:&lt;/p&gt;&lt;pre&gt;    my (@length_content) = unpack &amp;quot;(V/a)*&amp;quot;, $binary;
    $binary =~ s/\x00\x0F/\xFF\xF0/;  # for the brave :)
    print {$fh} $binary;
    my $byte_count = length $binary;&lt;/pre&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;encoding&quot;&gt;Encoding&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Encoding&lt;/strong&gt; (as a verb) is the conversion from &lt;em&gt;text&lt;/em&gt; to &lt;em&gt;binary&lt;/em&gt;. To encode, you have to supply the target encoding, for example &lt;tt class=&quot;inline_code&quot;&gt;iso-8859-1&lt;/tt&gt; or &lt;tt class=&quot;inline_code&quot;&gt;UTF-8&lt;/tt&gt;. Some encodings, like the &lt;tt class=&quot;inline_code&quot;&gt;iso-8859&lt;/tt&gt; (&quot;latin&quot;) range, do not support the full Unicode standard; characters that can&#39;t be represented are lost in the conversion.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;decoding&quot;&gt;Decoding&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;strong&gt;Decoding&lt;/strong&gt; is the conversion from &lt;em&gt;binary&lt;/em&gt; to &lt;em&gt;text&lt;/em&gt;. To decode, you have to know what encoding was used during the encoding phase. And most of all, it must be something decodable. It doesn&#39;t make much sense to decode a PNG image into a text string.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;internal_format&quot;&gt;Internal format&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Perl has an &lt;strong&gt;internal format&lt;/strong&gt;, an encoding that it uses to encode text strings so it can store them in memory. 
All text strings are in this internal format. In fact, text strings are never in any other format!&lt;/p&gt;&lt;p&gt;You shouldn&#39;t worry about what this format is, because conversion is automatically done when you decode or encode.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;a name=&quot;your_new_toolkit&quot;&gt;Your new toolkit&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;Add to your standard heading the following line:&lt;/p&gt;&lt;pre&gt;    use Encode qw(encode decode);&lt;/pre&gt;&lt;p&gt;Or, if you&#39;re lazy, just:&lt;/p&gt;&lt;pre&gt;    use Encode;&lt;/pre&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;a name=&quot;i_o_flow__the_actual_5_minute_tutorial_&quot;&gt;I/O flow (the actual 5 minute tutorial)&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;The typical input/output flow of a program is:&lt;/p&gt;&lt;pre&gt;    1. Receive and decode
    2. Process
    3. Encode and output&lt;/pre&gt;&lt;p&gt;If your input is binary, and is supposed to remain binary, you shouldn&#39;t decode it to a text string, of course. But in all other cases, you should decode it.&lt;/p&gt;&lt;p&gt;Decoding can&#39;t happen reliably if you don&#39;t know how the data was encoded. If you get to choose, it&#39;s a good idea to standardize on UTF-8.&lt;/p&gt;&lt;pre&gt;    my $foo   = decode(&#39;UTF-8&#39;, get &#39;http://example.com/&#39;);
    my $bar   = decode(&#39;ISO-8859-1&#39;, readline STDIN);
    my $xyzzy = decode(&#39;Windows-1251&#39;, $cgi-&amp;gt;param(&#39;foo&#39;));&lt;/pre&gt;&lt;p&gt;Processing happens as you knew before. The only difference is that you&#39;re now using characters instead of bytes. That&#39;s very useful if you use things like &lt;tt class=&quot;inline_code&quot;&gt;substr&lt;/tt&gt;, or &lt;tt class=&quot;inline_code&quot;&gt;length&lt;/tt&gt;.&lt;/p&gt;&lt;p&gt;It&#39;s important to realize that there are no bytes in a text string. 
Of course, Perl has its internal encoding to store the string in memory, but ignore that. If you have to do anything with the number of bytes, it&#39;s probably best to move that part to step 3, just after you&#39;ve encoded the string. Then you know exactly how many bytes it will be in the destination string.&lt;/p&gt;&lt;p&gt;The syntax for encoding text strings to binary strings is as simple as decoding:&lt;/p&gt;&lt;pre&gt;    $body = encode(&#39;UTF-8&#39;, $body);&lt;/pre&gt;&lt;p&gt;If you needed to know the length of the string in bytes, now&#39;s the perfect time for that. Because &lt;tt class=&quot;inline_code&quot;&gt;$body&lt;/tt&gt; is now a byte string, &lt;tt class=&quot;inline_code&quot;&gt;length&lt;/tt&gt; will report the number of bytes, instead of the number of characters. The number of characters is no longer known, because characters only exist in text strings.&lt;/p&gt;&lt;pre&gt;    my $byte_count = length $body;&lt;/pre&gt;&lt;p&gt;And if the protocol you&#39;re using supports a way of letting the recipient know which character encoding you used, please help the receiving end by using that feature! For example, E-mail and HTTP support MIME headers, so you can use the &lt;tt class=&quot;inline_code&quot;&gt;Content-Type&lt;/tt&gt; header. 
They can also have &lt;tt class=&quot;inline_code&quot;&gt;Content-Length&lt;/tt&gt; to indicate the number of &lt;em&gt;bytes&lt;/em&gt;, which is always a good idea to supply if the number is known.&lt;/p&gt;&lt;pre&gt;    &amp;quot;Content-Type: text/plain; charset=UTF-8&amp;quot;,
    &amp;quot;Content-Length: $byte_count&amp;quot;&lt;/pre&gt;&lt;p&gt;&lt;/p&gt;&lt;h2&gt;&lt;a name=&quot;q_and_a&quot;&gt;Q and A&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;this_isn_t_really_a_unicode_tutorial__is_it&quot;&gt;This isn&#39;t really a Unicode tutorial, is it?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;No, Perl has an abstracted interface for all supported character encodings, so this is actually a generic &lt;tt class=&quot;inline_code&quot;&gt;Encode&lt;/tt&gt; tutorial. But many people think that Unicode is special and magical, and I didn&#39;t want to disappoint them, so I decided to call this document a Unicode tutorial.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_about_binary_data__like_images&quot;&gt;What about binary data, like images?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Well, apart from a bare &lt;tt class=&quot;inline_code&quot;&gt;binmode $fh&lt;/tt&gt;, you shouldn&#39;t treat them specially. (The binmode is needed because otherwise Perl may convert line endings on Win32 systems.)&lt;/p&gt;&lt;p&gt;Be careful, though, to never combine text strings with binary strings. If you need text in a binary stream, encode your text strings first using the appropriate encoding, then join them with binary strings. See also: &quot;What if I don&#39;t encode?&quot;.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_about_the_utf8_flag&quot;&gt;What about the UTF-8 flag?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Please, unless you&#39;re hacking the internals, or debugging weirdness, don&#39;t think about the UTF-8 flag at all. 
That means that you very probably shouldn&#39;t use &lt;tt class=&quot;inline_code&quot;&gt;is_utf8&lt;/tt&gt;, &lt;tt class=&quot;inline_code&quot;&gt;_utf8_on&lt;/tt&gt; or &lt;tt class=&quot;inline_code&quot;&gt;_utf8_off&lt;/tt&gt; at all.&lt;/p&gt;&lt;p&gt;Perl&#39;s internal format happens to be UTF-8. Unfortunately, Perl can&#39;t keep a secret, so everyone knows about this.  That is the source of much confusion. It&#39;s better to pretend that the internal format is some unknown encoding, and that you always have to encode and decode explicitly.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;when_should_i_decode_or_encode&quot;&gt;When should I decode or encode?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Whenever you&#39;re communicating with anything that is external to your perl process, like a database, a text file, a socket, or another program. Even if the thing you&#39;re communicating with is also written in Perl.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_if_i_don_t_decode&quot;&gt;What if I don&#39;t decode?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Whenever your encoded, binary string is used together with a text string, Perl will assume that your binary string was encoded with ISO-8859-1, also known as latin-1. If it wasn&#39;t latin-1, then your data is unpleasantly converted. For example, if it was UTF-8, the individual bytes of multibyte characters are seen as separate characters, and then again converted to UTF-8. Such double encoding can be compared to double HTML encoding (&lt;tt class=&quot;inline_code&quot;&gt;&amp;amp;gt;&lt;/tt&gt;), or double URI encoding (&lt;tt class=&quot;inline_code&quot;&gt;%253E&lt;/tt&gt;).&lt;/p&gt;&lt;p&gt;This silent implicit decoding is known as ``upgrading&#39;&#39;. 
That may sound positive, but it&#39;s best to avoid it.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_if_i_don_t_encode&quot;&gt;What if I don&#39;t encode?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Your text string will be sent using the bytes in Perl&#39;s internal format. In some cases, Perl will warn you that you&#39;re doing something wrong, with a friendly warning:&lt;/p&gt;&lt;pre&gt;    Wide character in print at example.pl line 2.&lt;/pre&gt;&lt;p&gt;Because the internal format is really UTF-8, these bugs are hard to spot, because UTF-8 is usually the encoding you wanted! But don&#39;t be lazy, and don&#39;t use the fact that Perl&#39;s internal format is UTF-8 to your advantage. Encode explicitly to avoid weird bugs, and to show to maintenance programmers that you thought this through.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;is_there_a_way_to_automatically_decode_or_encode&quot;&gt;Is there a way to automatically decode or encode?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;If all data that comes from a certain handle is encoded in exactly the same way, you can tell the PerlIO system to automatically decode everything, with the &lt;tt class=&quot;inline_code&quot;&gt;encoding&lt;/tt&gt; layer.&lt;/p&gt;&lt;p&gt;You can provide this layer when &lt;tt class=&quot;inline_code&quot;&gt;open&lt;/tt&gt;ing the file:&lt;/p&gt;&lt;pre&gt;    open my $fh, &#39;&amp;gt;:encoding(UTF-8)&#39;, $filename;  # auto encoding on write
    open my $fh, &#39;&amp;lt;:encoding(UTF-8)&#39;, $filename;  # auto decoding on read&lt;/pre&gt;&lt;p&gt;Or if you already have an open filehandle:&lt;/p&gt;&lt;pre&gt;    binmode $fh, &#39;:encoding(UTF-8)&#39;;&lt;/pre&gt;&lt;p&gt;Some database drivers for DBI can also automatically encode and decode, but that is typically limited to the UTF-8 encoding, because they cheat.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;cheat__tell_me__how_can_i_cheat&quot;&gt;Cheat?! 
Tell me, how can I cheat?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Well, because Perl&#39;s internal format is UTF-8, you can just skip the encoding or decoding step, and manipulate the UTF-8 flag directly.&lt;/p&gt;&lt;p&gt;Instead of &lt;tt class=&quot;inline_code&quot;&gt;:encoding(UTF-8)&lt;/tt&gt;, you can simply use &lt;tt class=&quot;inline_code&quot;&gt;:utf8&lt;/tt&gt;. This is widely accepted as good behavior.&lt;/p&gt;&lt;p&gt;Instead of &lt;tt class=&quot;inline_code&quot;&gt;decode&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;encode&lt;/tt&gt;, you could use &lt;tt class=&quot;inline_code&quot;&gt;_utf8_on&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;_utf8_off&lt;/tt&gt;. But this is, contrary to &lt;tt class=&quot;inline_code&quot;&gt;:utf8&lt;/tt&gt;, considered bad style.&lt;/p&gt;&lt;p&gt;There are some shortcuts for oneliners; see &lt;tt class=&quot;inline_code&quot;&gt;-C&lt;/tt&gt; in &lt;em&gt;perlrun&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_if_i_don_t_know_which_encoding_was_used&quot;&gt;What if I don&#39;t know which encoding was used?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Do whatever you can to find out, and if you have to: guess. (Don&#39;t forget to document your guess with a comment.)&lt;/p&gt;&lt;p&gt;You could open the document in a web browser, and change the character set or character encoding until you can visually confirm that all characters look the way they should.&lt;/p&gt;&lt;p&gt;There is no way to reliably detect the encoding automatically, so if people keep sending you data without charset indication, you may have to educate them.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;can_i_use_unicode_in_my_perl_sources&quot;&gt;Can I use Unicode in my Perl sources?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Yes, you can! 
If your sources are UTF-8 encoded, you can indicate that with the &lt;tt class=&quot;inline_code&quot;&gt;use utf8&lt;/tt&gt; pragma.&lt;/p&gt;&lt;pre&gt;    use utf8;&lt;/pre&gt;&lt;p&gt;This doesn&#39;t do anything to your input, or to your output. It only influences the way your sources are read. You can use Unicode in string literals, in identifiers (but they still have to be ``word characters&#39;&#39; according to &lt;tt class=&quot;inline_code&quot;&gt;\w&lt;/tt&gt;), and even in custom delimiters.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;data__dumper_doesn_t_restore_the_utf8_flag__is_it_broken&quot;&gt;Data::Dumper doesn&#39;t restore the UTF-8 flag; is it broken?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;No, Data::Dumper&#39;s Unicode abilities are as they should be. There have been some complaints that it should restore the UTF-8 flag when the data is read again with &lt;tt class=&quot;inline_code&quot;&gt;eval&lt;/tt&gt;. However, you should really not look at the flag, and nothing indicates that Data::Dumper should break this rule.&lt;/p&gt;&lt;p&gt;Here&#39;s what happens: when Perl reads in a string literal, it sticks to 8 bit encoding as long as it can. (But perhaps originally it was internally encoded as UTF-8, when you dumped it.) When it has to give that up because other characters are added to the text string, it silently upgrades the string to UTF-8.&lt;/p&gt;&lt;p&gt;If you properly encode your strings for output, none of this is of your concern, and you can just &lt;tt class=&quot;inline_code&quot;&gt;eval&lt;/tt&gt; dumped data as always.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;how_can_i_determine_if_a_string_is_a_text_string_or_a_binary_string&quot;&gt;How can I determine if a string is a text string or a binary string?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;You can&#39;t. Some use the UTF-8 flag for this, but that&#39;s misuse, and makes well behaved modules like Data::Dumper look bad. 
The flag is useless for this purpose, because it&#39;s off when an 8 bit encoding (by default ISO-8859-1) is used to store the string.&lt;/p&gt;&lt;p&gt;This is something you, the programmer, have to keep track of; sorry. You could consider adopting a kind of ``Hungarian notation&#39;&#39; to help with this.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;how_do_i_convert_from_encoding_foo_to_encoding_bar&quot;&gt;How do I convert from encoding FOO to encoding BAR?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;By first converting the FOO-encoded byte string to a text string, and then the text string to a BAR-encoded byte string:&lt;/p&gt;&lt;pre&gt;    my $text_string = decode(&#39;FOO&#39;, $foo_string);
    my $bar_string  = encode(&#39;BAR&#39;, $text_string);&lt;/pre&gt;&lt;p&gt;or by skipping the text string part, and going directly from one binary encoding to the other:&lt;/p&gt;&lt;pre&gt;    use Encode qw(from_to);
    from_to($string, &#39;FOO&#39;, &#39;BAR&#39;);  # changes contents of $string&lt;/pre&gt;&lt;p&gt;or by letting automatic decoding and encoding do all the work:&lt;/p&gt;&lt;pre&gt;    open my $foofh, &#39;&amp;lt;:encoding(FOO)&#39;, &#39;example.foo.txt&#39;;
    open my $barfh, &#39;&amp;gt;:encoding(BAR)&#39;, &#39;example.bar.txt&#39;;
    print { $barfh } $_ while &amp;lt;$foofh&amp;gt;;&lt;/pre&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_about_the_use_bytes_pragma&quot;&gt;What about the &lt;tt class=&quot;inline_code&quot;&gt;use bytes&lt;/tt&gt; pragma?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Don&#39;t use it. It makes no sense to deal with bytes in a text string, and it makes no sense to deal with characters in a byte string. Do the proper conversions (by decoding/encoding), and things will work out well: you get character counts for decoded data, and byte counts for encoded data.&lt;/p&gt;&lt;p&gt;&lt;tt class=&quot;inline_code&quot;&gt;use bytes&lt;/tt&gt; is usually a failed attempt to do something useful. 
Just forget about it.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_are_decode_utf8_and_encode_utf8&quot;&gt;What are &lt;tt class=&quot;inline_code&quot;&gt;decode_utf8&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;encode_utf8&lt;/tt&gt;?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;These are alternate syntaxes for &lt;tt class=&quot;inline_code&quot;&gt;decode(&#39;utf8&#39;, ...)&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;encode(&#39;utf8&#39;, ...)&lt;/tt&gt;.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_s_the_difference_between_utf8_and_utf8&quot;&gt;What&#39;s the difference between &lt;tt class=&quot;inline_code&quot;&gt;UTF-8&lt;/tt&gt; and &lt;tt class=&quot;inline_code&quot;&gt;utf8&lt;/tt&gt;?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;&lt;tt class=&quot;inline_code&quot;&gt;UTF-8&lt;/tt&gt; is the official standard. &lt;tt class=&quot;inline_code&quot;&gt;utf8&lt;/tt&gt; is Perl&#39;s way of being liberal in what it accepts. If you have to communicate with things that aren&#39;t so liberal, you may want to consider using &lt;tt class=&quot;inline_code&quot;&gt;UTF-8&lt;/tt&gt;. If you have to communicate with things that are too liberal, you may have to use &lt;tt class=&quot;inline_code&quot;&gt;utf8&lt;/tt&gt;. The full explanation is in &lt;em&gt;Encode&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;&lt;tt class=&quot;inline_code&quot;&gt;UTF-8&lt;/tt&gt; is internally known as &lt;tt class=&quot;inline_code&quot;&gt;utf-8-strict&lt;/tt&gt;. 
This tutorial uses UTF-8 consistently, even where utf8 is actually used internally, because the distinction can be hard to make, and is mostly irrelevant.&lt;/p&gt;&lt;p&gt;Okay, if you insist: the ``internal format&#39;&#39; is utf8, not UTF-8.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;what_character_encodings_does_perl_support&quot;&gt;What character encodings does Perl support?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;To find out which character encodings your Perl supports, run:&lt;/p&gt;&lt;pre&gt;    perl -MEncode -le &amp;quot;print for Encode-&amp;gt;encodings( q[:all] )&amp;quot;;&lt;/pre&gt;&lt;p&gt;&lt;/p&gt;&lt;h3&gt;&lt;a name=&quot;which_version_of_perl_should_i_use&quot;&gt;Which version of perl should I use?&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Well, if you can, upgrade to the most recent, but certainly &lt;tt class=&quot;inline_code&quot;&gt;5.8.1&lt;/tt&gt; or newer. This tutorial is based on the status quo as of &lt;tt class=&quot;inline_code&quot;&gt;5.8.7&lt;/tt&gt;.&lt;/p&gt;&lt;p&gt;You should also check your modules, and upgrade them if necessary. For example, HTML::Entities requires version &amp;gt;= 1.32 to function correctly, even though the changelog is silent about this.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr /&gt;&lt;h1&gt;&lt;a name=&quot;summary&quot;&gt;SUMMARY&lt;/a&gt;&lt;/h1&gt;&lt;p&gt;Decode everything you receive, encode everything you send out. (If it&#39;s text data.)&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr /&gt;&lt;h1&gt;&lt;a name=&quot;acknowledgements&quot;&gt;ACKNOWLEDGEMENTS&lt;/a&gt;&lt;/h1&gt;&lt;p&gt;Thanks to Johan Vromans from Squirrel Consultancy. His UTF-8 rants during the Amsterdam Perl Mongers meetings got me interested and determined to find out how to use character encodings in Perl in ways that don&#39;t break easily.&lt;/p&gt;&lt;p&gt;Thanks to Gerard Goossen from TTY. 
His presentation ``UTF-8 in the wild&#39;&#39; (Dutch Perl Workshop 2006) inspired me to publish my thoughts and write this tutorial.&lt;/p&gt;&lt;p&gt;Thanks to the people who asked about this kind of stuff in several Perl IRC channels, and have constantly reminded me that a simpler explanation was needed.&lt;/p&gt;&lt;p&gt;Thanks to the people who reviewed this document for me, before it went public. They are: Benjamin Smith, Jan-Pieter Cornet, Johan Vromans, Lukas Mai, Nathan Gray.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr /&gt;&lt;h1&gt;&lt;a name=&quot;author&quot;&gt;AUTHOR&lt;/a&gt;&lt;/h1&gt;&lt;p&gt;Juerd Waalboer &amp;lt;&lt;a href=&quot;mailto:juerd@cpan.org&quot;&gt;juerd@cpan.org&lt;/a&gt;&amp;gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;hr /&gt;&lt;h1&gt;&lt;a name=&quot;see_also&quot;&gt;SEE ALSO&lt;/a&gt;&lt;/h1&gt;&lt;p&gt;&lt;em&gt;perlunicode&lt;/em&gt;, &lt;em&gt;perluniintro&lt;/em&gt;, &lt;em&gt;Encode&lt;/em&gt;&lt;/p&gt;
    </description>
</item>

        

<item>
    <title>A Simple Socket Server Using &#39;inetd&#39; (dwildesnl)</title>
    <link>http://prlmnks.org/html/544341.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/544341.html</guid>

    <description>
        &lt;B&gt;Using &lt;tt class=&quot;inline_code&quot;&gt;inetd&lt;/tt&gt; to serve a socket&lt;/b&gt;&lt;BR&gt;&lt;BR&gt;It is often the case that one needs to test a system before the hardware is available or on-line. In my case, I&#39;m developing an interface which will talk to a socket on a complex piece of Fab Metrology gear called an Applied Materials NanoSEM, using a complex protocol called SEMI SECS-II/GEM HSMS. I needed a quick and dirty handler to act as the NanoSEM while I get the protocol parser working.&lt;BR&gt;&lt;BR&gt;UNIX-like systems such as FreeBSD and Linux have a nifty feature called &lt;tt class=&quot;inline_code&quot;&gt;inetd&lt;/tt&gt;, which comes to our rescue. &lt;tt class=&quot;inline_code&quot;&gt;inetd&lt;/tt&gt; runs a program you specify whenever somebody else tries to connect to the socket you&#39;ve chosen. With a few simple configuration changes, we can send our input to the specified socket, and inetd invokes our program, passing our input to it on STDIN. Our handler then processes it and writes its response to STDOUT, which goes back to our socket. Cool, huh? What&#39;s even more cool is that if another process (or system) also tries to connect to the same socket, &lt;tt class=&quot;inline_code&quot;&gt;inetd&lt;/tt&gt; will invoke another copy of our handler without bothering the first one.&lt;BR&gt;&lt;BR&gt;&lt;READMORE&gt;Interested? Okay, here&#39;s the code...&lt;BR&gt;&lt;BR&gt;&lt;B&gt;The server handler:&lt;/b&gt;&lt;BR&gt;&lt;BR&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl -w -T
# sinet.pl A simple inetd socket server.
use strict;

# Unbuffer STDOUT so that every response is sent immediately.
my $old_fh = select(STDOUT);
$| = 1;
select($old_fh);

while ( my $line = &amp;lt;STDIN&amp;gt; )
{
    $line =~ s/\r?\n$//;
    if ( $line =~ /endit/ )
    {
        die &quot;shutting down\n&quot;;
    }
    # do your processing here!
    print &quot;  $line\n&quot;;
}&lt;/pre&gt;This little program will process anything that comes in on the specified port, clearing carriage returns and line feeds, and (in this simple case) spitting it back with two spaces in front and a newline at the end. An input containing &quot;endit&quot; causes the handler to exit, and, thanks to the flush sequence at the top, output is immediate. Make your program executable:&lt;pre class=&quot;block_code&quot;&gt;# chmod +x /usr/local/bin/sinet.pl&lt;/pre&gt;&lt;BR&gt;&lt;BR&gt;&lt;B&gt;The configuration:&lt;/b&gt;&lt;BR&gt;&lt;BR&gt;Okay, now, here&#39;s the setup. As superuser, edit /etc/services to add your port number to the known services list, making up a unique name for the service. My port is 6100, its service is &#39;secshsms&#39;, and I&#39;ve asked it to handle both stream (tcp) and datagram (udp) packets, although this example will only deal with tcp.&lt;pre class=&quot;block_code&quot;&gt;secshsms    6100/tcp   # DSW Handler for SECS-II/GEM HSMS traffic
secshsms    6100/udp   # DSW&lt;/pre&gt;Next, edit /etc/inetd.conf to attach your program to that service:&lt;pre class=&quot;block_code&quot;&gt;# DSW add-on for SECS-II over HSMS
secshsms  stream  tcp  nowait  nobody  /usr/local/bin/sinet.pl  secshsms&lt;/pre&gt;Okay, find inetd and restart it.&lt;pre class=&quot;block_code&quot;&gt;# ps -ax | grep inetd
  563  ??  Is      0:00.01 /usr/sbin/inetd -wW -C 60
# kill -1 563&lt;/pre&gt;From now on, any process that attempts to talk to my machine&#39;s port 6100 gets its output routed to my handler sinet.pl.&lt;BR&gt;&lt;BR&gt;With that in hand, here&#39;s a sample client, adapted from Perl Cookbook recipe 17.10. It can be installed on any machine within a routable network (i.e., no firewall) and it will talk to my handler.&lt;BR&gt;&lt;BR&gt;&lt;B&gt;A simple client:&lt;/B&gt; &lt;BR&gt;&lt;BR&gt;&lt;pre class=&quot;block_code&quot;&gt;#!/usr/bin/perl
# ncliauto.pl a simple automated client
use warnings;
use strict;
use IO::Socket;

my ( $confstr, $host, $port, $kidpid, $handle, $line, @say );

# the config file contains host name (or IP addr) and port number, with a space between
# examples:  localhost 6100
#            myserv.mynet.com 6100
#            123.456.789.1 6100
open( my $conf, &#39;&amp;lt;&#39;, &#39;./hsms.conf&#39; ) or die &quot;conf file: $!\n&quot;;
$confstr = &amp;lt;$conf&amp;gt;;
close( $conf ) or die &quot;closing conf file: $!\n&quot;;
chomp $confstr;
( $host, $port ) = split( /\s+/, $confstr );

# This is our demo array of outputs sent to the handler
@say = (
    &#39;You are getting sleepy...&#39;,
    &#39;... very sleepy.&#39;,
    &#39;Your eyes are getting very heavy!&#39;,
    &quot;... it&#39;s so hard to hold them open.&quot;,
    &quot;You&#39;re so very sleepy now.&quot;,
    &#39;You just want to go to sleep.&#39;,
    &#39;Sleep feels so good!&#39;,
    &quot;You&#39;re asleep. Sleep!&quot;,
    &quot;You&#39;ve earned it, just relax and sleep!&quot;,
    &quot;... Sleep!&quot;,
    &quot;              Sleep!&quot;,
    &quot;                        Sleep!&quot;,
    &#39;&#39;,
    &#39;... zzz... zzz... ...zzz ...&#39;,
    &#39;endit&#39;,
);

# This creates our client socket
$handle = IO::Socket::INET-&gt;new( Proto =&gt; &quot;tcp&quot;, PeerAddr =&gt; $host, PeerPort =&gt; $port )
    or die &quot;can&#39;t connect to port &#39;$port&#39; on host &#39;$host&#39;: $!\n&quot;;

# make sure it turns around inputs immediately
$handle-&gt;autoflush(1);

# announce our connection
print STDERR &quot;[connected to $host:$port]\n&quot;;

# fork a child to handle sending our data to the socket
die &quot;can&#39;t fork: $!\n&quot; unless defined($kidpid = fork());

if ( $kidpid )
{
    # The parent handles data coming from the socket server to us
    while ( defined( $line = &amp;lt;$handle&amp;gt; ) )
    {
        print STDOUT $line;
    }
    # ... until the connection is broken
    kill( &quot;TERM&quot; =&gt; $kidpid );
}
else
{
    # The child process sends our data to the socket server
    foreach my $item ( @say )
    {
        print $handle $item . &quot;\r\n&quot;;
        sleep 1;
    }
}
exit;&lt;/pre&gt;Working from this skeleton, a more elaborate language can be developed. The program on each end can be made to parse and respond to commands from the other.&lt;BR&gt;&lt;BR&gt;UPDATE 1: changed server user to nobody, thank you [idsfa].&lt;/READMORE&gt;&lt;div class=&quot;pmsig&quot;&gt;&lt;div class=&quot;pmsig-420266&quot;&gt;&lt;BR&gt;&lt;EM&gt;Don Wilde&lt;/EM&gt;&lt;BR&gt;&lt;FONT COLOR=&#39;#4F8EFF&#39;&gt;&quot;&lt;i&gt;There&#39;s more than one level to any answer.&lt;/i&gt;&quot;&lt;/FONT&gt;&lt;/div&gt;&lt;/div&gt;
    </description>
</item>

        

<item>
    <title>Module::Compile (or: what&#39;s this PMC thingy?) (Juerd)</title>
    <link>http://prlmnks.org/html/536132.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/536132.html</guid>

    <description>
        &lt;p&gt;&lt;i&gt;This document is written in response to many questions triggered by [id://536101]. It&#39;s not useful as a tutorial, because Module::Compile is in its infancy and very likely to change further, but posted here because it provides information only.&lt;/i&gt;&lt;/p&gt;&lt;h4&gt;History&lt;/h4&gt;&lt;p&gt;Once upon a time, people thought that maybe Perl would be faster if compiled code could be re-used in a later session. (Actually, many people still do.) So they created support for compiled Perl Modules, and where Python libraries got their &quot;c&quot; after their &quot;py&quot;, compiled Perl modules would get their &quot;c&quot; after their &quot;pm&quot;, and the &quot;pmc&quot; extension was born.&lt;/p&gt;&lt;p&gt;It works somewhat like this: you have a module, say, Test.pm, and the following line gives you the compiled version:&lt;pre class=&quot;block_code&quot;&gt;perl -MO=Bytecode,-H -MTest -e1 &gt; Test.pmc&lt;/pre&gt;Provided that Test.pmc is in the same directory that Test.pm is, perl will automatically load Test.pmc instead of Test.pm the next time someone uses &quot;use Test&quot; or &quot;require Test&quot;. 
This is because in every directory of @INC, perl first sees if a .pmc exists, and if it does, it loads that, and if not, then it tries to load the .pm.&lt;/p&gt;&lt;h4&gt;What&#39;s going on?&lt;/h4&gt;&lt;p&gt;If you strace the process, you can easily see what was going on before:&lt;pre class=&quot;block_code&quot;&gt;stat64(&quot;/usr/local/lib/perl/5.8.7/Test.pmc&quot;, 0x7fe20a50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/local/lib/perl/5.8.7/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/local/share/perl/5.8.7/Test.pmc&quot;, 0x7fe20a50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/local/share/perl/5.8.7/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/lib/perl5/Test.pmc&quot;, 0x7fe20a50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/lib/perl5/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/share/perl5/Test.pmc&quot;, 0x7fe20a50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/share/perl5/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/lib/perl/5.8/Test.pmc&quot;, 0x7fe20a50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/lib/perl/5.8/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/share/perl/5.8/Test.pmc&quot;, 0x7fe20a50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/share/perl/5.8/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = 4&lt;/pre&gt;Before perl finds Test.pm in /usr/share/perl/5.8, it first tries a lot of other directories, each time stat()ing Test.pmc first, and then trying to open Test.pm. 
Now, because I have created a Test.pmc in /usr/share/perl/5.8, perl&#39;s done a little earlier:&lt;pre class=&quot;block_code&quot;&gt;stat64(&quot;/usr/local/lib/perl/5.8.7/Test.pmc&quot;, 0x7f95aa50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/local/lib/perl/5.8.7/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/local/share/perl/5.8.7/Test.pmc&quot;, 0x7f95aa50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/local/share/perl/5.8.7/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/lib/perl5/Test.pmc&quot;, 0x7f95aa50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/lib/perl5/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/share/perl5/Test.pmc&quot;, 0x7f95aa50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/share/perl5/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/lib/perl/5.8/Test.pmc&quot;, 0x7f95aa50) = -1 ENOENT (No such file or directory)
open(&quot;/usr/lib/perl/5.8/Test.pm&quot;, O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
stat64(&quot;/usr/share/perl/5.8/Test.pmc&quot;, {st_mode=S_IFREG|0644, st_size=720, ...}) = 0
stat64(&quot;/usr/share/perl/5.8/Test.pm&quot;, {st_mode=S_IFREG|0644, st_size=28863, ...}) = 0
open(&quot;/usr/share/perl/5.8/Test.pmc&quot;, O_RDONLY|O_LARGEFILE) = 4&lt;/pre&gt;Don&#39;t be fooled by the stat() on Test.pm; it&#39;s ignored. The actual open() is now on Test.pmc instead of Test.pm.&lt;/p&gt;&lt;h4&gt;Sad ending&lt;/h4&gt;&lt;p&gt;So, why did (and does) nobody really use this? Well, it turns out that bytecode loading is indeed faster in some (but not all) cases, but not very reliable: segfaults are quite common, and nobody so far has been able to fix it.&lt;/p&gt;&lt;p&gt;But there is another problem with ByteLoader. 
The PMC file is actually just a normal Perl file: via a source filter, the bytecode is loaded. That&#39;s nasty, and makes using precompiled bytecode even less interesting.&lt;/p&gt;&lt;p&gt;Thus a sad story seemed to end, and the PMC feature was ignored, forgotten, and almost about to be deprecated... (Features aren&#39;t removed lightly in Perl - even if nobody uses a feature, a deprecation cycle is needed.)&lt;/p&gt;&lt;h4&gt;Joyful rebirth of PMC and source filters&lt;/h4&gt;&lt;p&gt;... if it weren&#39;t for the incredible geniuses working on Perl 6, who found a very neat way to use the PMC feature to make Perl 6 acceptance and initial use less painful!&lt;/p&gt;&lt;p&gt;But actually, it&#39;s not even really Perl 6 related. It&#39;s just a very neat thing that you can also put to good use if you don&#39;t want to use Perl 6 yet. You see, we have source filters that can do very nice and spiffy (no pun intended) things for us, but are also known to have certain negative feelings attached to them. Source filters have problems with mod_perl and other embedded perls, and they cannot be combined with other source filters. Besides that, source filters are always runtime overhead, and the result is usually not cached anywhere. You can&#39;t even debug the result easily, because the intermediate code is not available on disk.&lt;/p&gt;&lt;p&gt;The new invention exists in the form of [cpan://Module::Compile], formerly (for less than a day &amp;mdash; that&#39;s how fast things go in Freenode #perl6) known as PMC::Filter.&lt;/p&gt;&lt;p&gt;Instead of using the normal source filters, Module::Compile writes the &quot;compiled&quot; code to a .pmc file, and that&#39;s loaded from that point forward. The nice trick is that the PMC file doesn&#39;t have to be bytecode. 
It&#39;s just a normal pure Perl file.&lt;/p&gt;&lt;h4&gt;But we call them compilers now&lt;/h4&gt;&lt;p&gt;A &quot;compiler&quot; (read: source filter) is built quite easily:&lt;pre class=&quot;block_code&quot;&gt;package Foo;
use Module::Compile -base;

sub pmc_compile {
    my ($class, $source) = @_;
    # Convert $source into (most likely Perl 5) $compiled_output
    return $compiled_output;
}
1;&lt;/pre&gt;and using that is also very straightforward:&lt;pre class=&quot;block_code&quot;&gt;# Unfiltered code here
quux(bar);
use Foo;
# This code is filtered!
# Ehh... I mean compiled!
no Foo;
# Unfiltered code here, again
quux(bar);&lt;/pre&gt;&lt;/p&gt;&lt;p&gt;A nice solution for everyone who has found a good use for source filters, but doesn&#39;t use them. Module::Compile supports using multiple compilers (source filters, for those who still don&#39;t get it) at the same time, the intermediate code is available for simpler debugging, and it works well even with embedded perls.&lt;/p&gt;&lt;h4&gt;Perl 6 again&lt;/h4&gt;&lt;p&gt;Remember that it was Perl 6 people that invented all this? It must have something to do with Perl 6, then, right? Well, but of course! The idea is to compile Perl 6 code into Perl 5 code, so that you can use it without Perl 6. 
You can do this to have a module that has parts in Perl 5 and parts in Perl 6, which is nice if you want to upgrade an existing module with some of the nice new features that Perl 6 has to offer, but don&#39;t want to rewrite everything just yet.&lt;/p&gt;&lt;p&gt;An even nicer thing, though, is that this allows you to write modules that are fully Perl 6, that will work both in Perl 5 and Perl 6.&lt;/p&gt;&lt;h4&gt;Distribution&lt;/h4&gt;&lt;p&gt;There are two problems with the PMC technique, though:&lt;ol&gt;&lt;li&gt; You need to have all the stuff installed that is needed for a successful compilation&lt;li&gt; Users don&#39;t have write access in the system library directories&lt;/ol&gt;But there is one solution that solves both these problems: let Module::Compile generate your PMCs as you develop your software, and when you&#39;re done, distribute the compiled result along with the rest, so that people can use that. Also, if the PMC is installed with the rest, then someone with superuser privileges installs it.&lt;/p&gt;&lt;h4&gt;Perl 6 example&lt;/h4&gt;&lt;p&gt;Oh, you want example code, of course! Here it is, again copied straight from Module::Compile&#39;s documentation:&lt;pre class=&quot;block_code&quot;&gt;# User.pm
use v6-pugs;
module User;
...some p6 code here...
no v6;
...back to p5 land...&lt;/pre&gt;Oh well, I&#39;ll steal their explanation too:&lt;/p&gt;&lt;p&gt;&lt;em&gt;On the first time this module is loaded, it will compile Perl 6 chunks into Perl 5 (as soon as the no v6 line is seen), and merge it with the Perl 5 chunks, saving the result into a User.pmc file.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;The next time around, Perl 5 will automatically load User.pmc when someone says use User. 
On the other hand, Perl 6 can run User.pm as a Perl 6 module just fine, as &quot;use v6-pugs&quot; and &quot;no v6&quot; both work in a Perl 6 setting also.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;The pmc module will imbue v6.pm with the ability to check for User.pmc&#39;s updatedness also, and if it&#39;s up to date, then it&#39;d touch its timestamp so the .pmc is loaded on the next time.&lt;/em&gt;&lt;/p&gt;&lt;h4&gt;Final words&lt;/h4&gt;&lt;p&gt;And that&#39;s why you shouldn&#39;t disable the PMC feature in Perl: although unused since its invention in 1999, there&#39;s now a very powerful use for it! Fortunately, so far the survey shows that only Mandrake/Mandriva users have it disabled. They can fix it soon, by upgrading.&lt;/p&gt;&lt;p&gt;Do note that Perl 6 is very far from production quality, and that Module::Compile is also still very young. You should probably not use them for anything serious yet.&lt;/p&gt;&lt;p&gt;Thank you, ingy++ and audreyt++, for bringing us this wonderful innovation!&lt;/p&gt;
    </description>
</item>

        

<item>
    <title>Using Look-ahead and Look-behind (Roy Johnson)</title>
    <link>http://prlmnks.org/html/518444.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/518444.html</guid>

    <description>
        If you are familiar with Perl&#39;s regular expressions, you are probably already familiar with zero-width assertions: the &lt;tt class=&quot;inline_code&quot;&gt;^&lt;/tt&gt; indicating the beginning of string and the &lt;tt class=&quot;inline_code&quot;&gt;\b&lt;/tt&gt; indicating a word boundary are examples. They do not match any characters, but &quot;look around&quot; to see what comes before and/or after the current position.&lt;p&gt;With the look-ahead and look-behind constructs documented in &lt;a href=&quot;/out/http/?url=perldoc.perl.org%2Fperlre.html%23Extended-Patterns&quot;&gt;perlre.html#Extended-Patterns&lt;/a&gt;, you can &quot;roll your own&quot; zero-width assertions to fit your needs. You can look forward or backward in the string being processed, and you can require that a pattern match succeed (positive assertion) or fail (negative assertion) there.&lt;h3&gt;Syntax&lt;/h3&gt;Every extended pattern is written as a parenthetical group with a question mark as the first character. The notation for the look-arounds is fairly mnemonic, but there are some other, experimental patterns that are similar, so it is important to get all the characters in the right order. 
&lt;dl&gt;&lt;dt&gt;&lt;tt class=&quot;inline_code&quot;&gt;(?=&lt;/tt&gt;&lt;i&gt;pattern&lt;/i&gt;&lt;tt class=&quot;inline_code&quot;&gt;)&lt;/tt&gt;&lt;/dt&gt;&lt;dd&gt;is a positive look-ahead assertion&lt;/dd&gt;&lt;dt&gt;&lt;tt class=&quot;inline_code&quot;&gt;(?!&lt;/tt&gt;&lt;i&gt;pattern&lt;/i&gt;&lt;tt class=&quot;inline_code&quot;&gt;)&lt;/tt&gt;&lt;/dt&gt;&lt;dd&gt;is a negative look-ahead assertion&lt;/dd&gt;&lt;dt&gt;&lt;tt class=&quot;inline_code&quot;&gt;(?&lt;=&lt;/tt&gt;&lt;i&gt;pattern&lt;/i&gt;&lt;tt class=&quot;inline_code&quot;&gt;)&lt;/tt&gt;&lt;/dt&gt;&lt;dd&gt;is a positive look-behind assertion&lt;/dd&gt;&lt;dt&gt;&lt;tt class=&quot;inline_code&quot;&gt;(?&lt;!&lt;/tt&gt;&lt;i&gt;pattern&lt;/i&gt;&lt;tt class=&quot;inline_code&quot;&gt;)&lt;/tt&gt;&lt;/dt&gt;&lt;dd&gt;is a negative look-behind assertion&lt;/dd&gt;&lt;/dl&gt;Notice that the &lt;tt class=&quot;inline_code&quot;&gt;=&lt;/tt&gt; or &lt;tt class=&quot;inline_code&quot;&gt;!&lt;/tt&gt; is always last. The directional indicator is only present in the look-behind, and comes before the positive-negative indicator.&lt;h3&gt;Common tasks&lt;/h3&gt;&lt;h4&gt;Finding the last occurrence&lt;/h4&gt;There are actually a number of ways to get the last occurrence that don&#39;t involve look-around, but if you think of &quot;the last foo&quot; as &quot;foo that isn&#39;t followed by a string containing foo&quot;, you can express that notion like this:&lt;pre class=&quot;block_code&quot;&gt;/foo(?!.*foo)/&lt;/pre&gt;The regular expression engine will do its best to match &lt;tt class=&quot;inline_code&quot;&gt;.*foo&lt;/tt&gt;, starting at the end of the string &quot;foo&quot;. If it is able to match that, then the negative look-ahead will fail, which will force the engine to progress through the string to try the next foo.&lt;h4&gt;Substituting before, after, or between characters&lt;/h4&gt;Many substitutions match a chunk of text and then replace part or all of it. 
You can often avoid that by using look-arounds. For example, if you want to put a comma after every foo:&lt;pre class=&quot;block_code&quot;&gt;s/(?&lt;=foo)/,/g; # Without lookbehind: s/foo/foo,/g or s/(foo)/$1,/g&lt;/pre&gt;or to put a hyphen into &quot;lookahead&quot;:&lt;pre class=&quot;block_code&quot;&gt;s/(?&lt;=look)(?=ahead)/-/g;&lt;/pre&gt;This kind of thing is likely to be the bulk of what you use look-arounds for. It is important to remember that &lt;b&gt;look-behind expressions cannot be of variable length&lt;/b&gt;. That means you cannot use variable-length quantifiers (&lt;tt class=&quot;inline_code&quot;&gt;?, *, +, or {1,5}&lt;/tt&gt;) or alternation of different-length items inside them.&lt;h4&gt;Matching a pattern that doesn&#39;t include another pattern&lt;/h4&gt;You might want to match everything between foo and bar that doesn&#39;t include baz. The technique is to have the regex engine look ahead at every character to ensure that it isn&#39;t the beginning of the undesired pattern:&lt;pre class=&quot;block_code&quot;&gt;/foo       # Match starting at foo
 (?:       # Complex expression:
   (?!baz) #   make sure we&#39;re not at the beginning of baz
   .       #   accept any character
 )*        # any number of times
 bar       # and ending at bar
/x;&lt;/pre&gt;&lt;h3&gt;Nesting&lt;/h3&gt;You can put look-arounds inside of other look-arounds. This has been known to induce a flight response in certain readers (me, for example, the first time I saw it), but it&#39;s really not such a hard concept. A look-around sub-expression inherits a starting position from the enclosing expression, and can walk all around relative to that position without affecting the position of the enclosing expression. They all have independent (though initially inherited) bookkeeping for where they are in the string.&lt;p&gt;The concept is pretty simple, but the notation becomes hairy very quickly, so commented regular expressions are recommended. Let&#39;s look at the real example of [id://319742]. 
The poster wants to put a space after any comma (punctuation, actually, but for simplicity, let&#39;s say comma) that is not nestled between two digits. Building up the s/// expression:&lt;pre class=&quot;block_code&quot;&gt;s/(?&lt;=,        # after a comma,
    (?!        # but not matching
      (?&lt;=\d,) #   digit-comma before, AND
      (?=\d)   #   digit afterward
    )
  )/ /gx;      # substitute a space&lt;/pre&gt;Note that multiple lookarounds can be used to enforce multiple conditions at the same place, like an AND condition that complements the alternation (vertical bar)&#39;s OR. In fact, you can use Boolean algebra ( &lt;tt class=&quot;inline_code&quot;&gt;NOT (a AND b) === (NOT a OR NOT b)&lt;/tt&gt; ) to convert the expression to use OR:&lt;pre class=&quot;block_code&quot;&gt;s/(?&lt;=,        # after a comma, but either
    (?:
      (?&lt;!\d,) #   not matching digit-comma before
      |        #   OR
      (?!\d)   #   not matching digit afterward
    )
  )/ /gx;      # substitute a space&lt;/pre&gt;&lt;h3&gt;Capturing&lt;/h3&gt;It is sometimes useful to use capturing parentheses within a look-around. You might think that you wouldn&#39;t be able to do that, since you&#39;re just browsing, but [478043|you can]. But remember: the capturing parentheses must be within the look-around expression; from the enclosing expression&#39;s point of view, no actual matching was done by the zero-width look-around.
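&lt;p&gt;As a quick check of the idioms above, here is a minimal sketch that is not part of the original discussion. The sample strings and variable names are made up for illustration; only the patterns come from the text. The comma rule is written with the look-behind for the comma and the negated digit test side by side, which should be equivalent because both are zero-width tests at the same position.&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;use strict;
use warnings;

# Last occurrence: a foo that is not followed by another foo.
my ($last) = &#39;foo1 foo2 foo3&#39; =~ /(foo\d)(?!.*foo)/;
print &quot;$last\n&quot;;      # foo3

# Zero-width substitution: nothing is consumed, so nothing is retyped.
(my $commas = &#39;foofoo bar&#39;) =~ s/(?&lt;=foo)/,/g;
print &quot;$commas\n&quot;;    # foo,foo, bar

# Between foo and bar, refusing to step over baz.
for my $s (&#39;foo 1 bar&#39;, &#39;foo baz bar&#39;) {
    my $hit = $s =~ /foo(?:(?!baz).)*bar/ ? &#39;match&#39; : &#39;no match&#39;;
    print &quot;$s: $hit\n&quot;;
}

# Space after a comma unless the comma sits between two digits.
(my $spaced = &#39;1,000 apples,pears,2,3&#39;) =~ s/(?&lt;=,)(?!(?&lt;=\d,)(?=\d))/ /g;
print &quot;$spaced\n&quot;;    # 1,000 apples, pears, 2,3&lt;/pre&gt;&lt;p&gt;The loop reports a match for the first string and no match for the second, because the only path from foo to bar in the second string runs through baz.&lt;/p&gt;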
    </description>
</item>

        

<item>
    <title>Creating COM and DCOM objects with Perl (JamesNC)</title>
    <link>http://prlmnks.org/html/516137.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/516137.html</guid>

    <description>
        OK, you have written a really cool Perl program and you would like to use it from MS Excel or MS Word, or from a WSH or VB script.  Perhaps it is an interface to DBI or some other really great Perl module.  How do you do it?  &lt;br&gt;This short tutorial for Win32 users explains how, and gives examples of how to share Perl $scalars, @arrays and %hashes with local and networked users through a WSC (Windows Scripting Component).  The documentation has much more information. This tutorial is just a quick and dirty guide to getting it working, along with some things I have learned that are NOT in the documentation, such as how to pass hashes (mimicked).&lt;br&gt;This tutorial also assumes that you know some VBA and how to open up the VB Editor in Excel.&lt;br&gt;Helpful links:&lt;li&gt; &lt;a href=&quot;http://www.microsoft.com/downloads/details.aspx?FamilyId=408024ED-FAAD-4835-8E68-773CCC951A6B&amp;displaylang=en&quot;&gt;WSC Wizard&lt;/a&gt; &lt;i&gt;Technically optional, but I would get this free tool, which generates a .wsc framework for you.&lt;/i&gt;&lt;li&gt; &lt;a href=&quot;http://www.microsoft.com/downloads/details.aspx?familyid=C717D943-7E4B-4622-86EB-95A22B832CAA&amp;displaylang=en&quot;&gt;WSH&lt;/a&gt; &lt;i&gt;Optional on W2K/XP&lt;/i&gt;&lt;li&gt; &lt;a href=&quot;http://www.microsoft.com/downloads/details.aspx?familyid=01592C48-207D-4BE1-8A76-1C4099D7BBB9&amp;displaylang=en&quot;&gt;WSC and WSH Documentation&lt;/a&gt;&lt;p&gt;The process is fairly simple.&lt;ol&gt;&lt;li&gt; Create a .WSC file (I recommend using the Wizard)&lt;li&gt; Add Perl and/or PerlScript code to the .WSC file&lt;li&gt; Register the .WSC COM object with Windows&lt;li&gt; In your VB code, use CreateObject to create an instance of the COM object&lt;/ol&gt;Now on to the examples!&lt;/p&gt;&lt;i&gt;Copy and paste the following code into a file named &quot;pcom.WSC&quot; ( Note: do not alter the classid in this file! 
)&lt;/i&gt;&lt;pre class=&quot;block_code&quot;&gt;&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;component&gt;
&lt;?component debug=&quot;true&quot; ?&gt;
&lt;registration
    description=&quot;pcom&quot;
    progid=&quot;pcom.WSC&quot;
    version=&quot;1.00&quot;
    remotable=&quot;true&quot;
    classid=&quot;{e91a2a76-18bc-4ab0-8b1c-06d6d0442287}&quot;
&gt;
&lt;/registration&gt;
&lt;public&gt;
    &lt;method name=&quot;sortList&quot;&gt;
        &lt;parameter name=&quot;vbArray&quot;/&gt;
    &lt;/method&gt;
    &lt;method name=&quot;getHash&quot;&gt;
    &lt;/method&gt;
    &lt;method name=&quot;getScalar&quot;&gt;
        &lt;parameter name=&quot;scalar&quot;/&gt;
    &lt;/method&gt;
&lt;/public&gt;
&lt;script language=&quot;PerlScript&quot;&gt;
&lt;![CDATA[
# Author: James Moosmann
# Copyright 2005
use Win32::OLE::Variant;

# Sort the values passed in from VB and hand them back as a VB array.
sub sortList {
    my @list = @_;
    @list = sort @list;
    my $ary = convertArrayToVBArray( \@list );
    return $ary;
}

# Return a Perl hash to VB as a Scripting.Dictionary.
sub getHash {
    my @input = @_;
    my %dog = ( &quot;buddy&quot; =&gt; &quot;dog&quot; , &quot;georgie&quot;, &quot;dog&quot; );
    my $hash = convertHashToDict(\%dog);
    return $hash;
}

# Numbers come back incremented; anything else comes back tagged.
sub getScalar {
    my $scalar = $_[0];
    if( $scalar =~/^[\d\.]+$/g ){ return $scalar+1; }
    return $scalar = &quot;perl_saw:$scalar&quot;;
}

sub convertHashToDict {
    my $hash_ref = $_[0];
    my $dict = Win32::OLE-&gt;CreateObject(&quot;Scripting.Dictionary&quot;) or die $!;
    foreach my $key ( keys %{$hash_ref} ){
        $dict-&gt;add( $key, ${$hash_ref}{$key} );
    }
    return $dict;
}

sub convertArrayToVBArray {
    my $array_ref = $_[0];
    my $ary = Variant( VT_ARRAY|VT_VARIANT, $#{$array_ref}+1);
    $ary-&gt;Put(\@{$array_ref});
    return $ary;
}
]]&gt;
&lt;/script&gt;
&lt;/component&gt;&lt;/pre&gt;After you create the above file on your system, you just need to register it with Windows, which is simple:&lt;ol&gt;&lt;li&gt; Right click on the file&lt;li&gt; Select -&gt; Register&lt;/ol&gt;Whew!  That was tough.&lt;br&gt;&lt;i&gt;**You may also notice the &quot;Generate Type Library&quot; and wonder what it does.  
If it worked, it would help open up the API to your COM object, and things like autocomplete in the VB editor would work for you. It kind of works, in that if you were to create the typelib and then go to Excel and create a reference to it, things would seem to work.  But at run time, when the COM object gets created, it is always the wrong type (Variant/Object) rather than the Object that you created. So, don&#39;t even bother.  Microsoft knows about this error and acknowledges it, but doesn&#39;t disable the &quot;feature&quot;... bizarre&lt;/i&gt;&lt;p&gt;Ok, we have registered our COM object.  Now, we need to use it!  Here&#39;s how, using Excel:&lt;br&gt;&lt;i&gt;Open a blank Excel sheet and add the following code&lt;/i&gt;&lt;pre class=&quot;block_code&quot;&gt;Sub sortList_test()
    Set obj = CreateObject(&quot;pcom.WSC&quot;)
    Dim l() As Variant
    l = obj.sortList(&quot;fox&quot;, &quot;dog&quot;, &quot;moose&quot;, &quot;cat&quot;, &quot;fish&quot;)
    For Each li In l
        Debug.Print li
    Next li
End Sub

Sub getHash_test()
    Set obj = CreateObject(&quot;pcom.WSC&quot;)
    Dim hash As Scripting.Dictionary
    Set hash = obj.getHash(&quot;dogHash&quot;)
    Debug.Print &quot;Buddy is my &quot; &amp; hash.Item(&quot;buddy&quot;)
End Sub

Sub getScalar_test()
    Set obj = CreateObject(&quot;pcom.WSC&quot;)
    scalar = obj.getScalar(1)
    Debug.Print scalar
    scalar = obj.getScalar(&quot;a string! from VB&quot;)
    Debug.Print scalar
End Sub&lt;/pre&gt;&lt;p&gt;&lt;i&gt;Make sure that you have References to &quot;Microsoft Scriptlet Library&quot; and &quot;Microsoft Scripting Runtime&quot; selected. &lt;/i&gt;Call the subs and that&#39;s it for the fast track intro. I will elaborate more below on DCOM and a few more selected details. &lt;/p&gt;&lt;br&gt;&lt;p&gt;What about DCOM?  Why use it? &lt;br&gt;Your office mate on the same network doesn&#39;t have Perl installed on his box, but he wants to use your nifty Perl object.  DCOM comes to the rescue!  
Here are the steps needed to allow a user on another computer to have access to the object. (Since this is documented in the above links I am only going to give an outline; search the docs for more details.)&lt;ol&gt;&lt;li&gt; You need to modify the pcom.WSC file and add the remotable attribute (be careful of the spelling! it is NOT remoteable) to the &lt;registration ... &gt; section like so:&lt;br&gt;&lt;pre class=&quot;block_code&quot;&gt;&lt;registration
    description=&quot;pcom&quot;
    progid=&quot;pcom.WSC&quot;
    version=&quot;1.00&quot;
    classid=&quot;{e91a2a76-18bc-4ab0-8b1c-06d6d0442287}&quot;
    remotable=&quot;true&quot;
&gt;
&lt;/registration&gt;&lt;/pre&gt;Now, you can create the object on a named machine by using the 2 argument form of CreateObject in your VB like so...&lt;pre class=&quot;block_code&quot;&gt;Set obj = CreateObject(&quot;pcom.WSC&quot;, &quot;127.0.0.1&quot;)&lt;/pre&gt;If you tried this, it would fail on your friend&#39;s computer because you have to first register your object on his machine AND you need to give him permission to use that object from your computer. &lt;br&gt;&lt;p&gt;How to register your COM object on his computer&lt;ol&gt; &lt;li&gt; Open up regedit and search for pcom.WSC&lt;li&gt; Right-click on the &quot;{e91a2a76-18bc-4ab0-8b1c-06d6d0442287}&quot; entry (pcom.WSC&#39;s classid ) and then select Export from the popup menu.&lt;li&gt; Save this registry entry as pcom.reg&lt;/ol&gt;&lt;p&gt;Now you can use this file to add the needed information to his registry.  Copy the file to your friend&#39;s computer and merge the file with his registry. ( You can do that with a Right-click and select Merge )&lt;p&gt;You are 1/2 way there.  Now, you need to give him permission to access the object on your machine.  You do that with &quot;dcomcnfg.exe&quot;. This utility has a different interface for W2K and XP, so I will let you figure this part out.  
Basically, after you add the remotable entry, you will need to re-register the pcom.WSC file, then open dcomcnfg.exe, browse to the pcom object and set the permissions for that object as appropriate. &lt;i&gt;See the documentation for more details on how to run the object with different credentials.  You can set security on the objects and more with the utility!&lt;/i&gt;&lt;p&gt; What about debugging?&lt;br&gt;The docs again.  But my take on it is to debug your Perl separately, or in a VB script where you can see the output more easily.&lt;p&gt;Well, that&#39;s it.  I hope it opens up for you the world of amazing things Perl can do, with great ease and power, for spreadsheets and other Windows applications.  &lt;br&gt;Other options I did not talk about include some solutions from ActiveState. They have some commercially available tools that can turn your scripts into stand-alone controls and services, system tray utilities, apps and more.  I don&#39;t work for them, but their PDK (Perl Dev Kit) is worth the money IMHO if you want more robust ways of doing these types of things.&lt;p&gt;Cheers, &lt;br&gt;JamesNC
    </description>
</item>

        

<item>
    <title>Test adding tutorial (wl69)</title>
    <link>http://prlmnks.org/html/511037.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/511037.html</guid>

    <description>
        I am doing a research project about the web community. I was just testing whether a tutorial can be added without logging into the community. I am so sorry about messing up the tutorial section. If anyone has the authority to delete the node, please go ahead! I appreciate that!
    </description>
</item>

        

<item>
    <title>How A Function Becomes Higher Order (Limbic~Region)</title>
    <link>http://prlmnks.org/html/492651.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/492651.html</guid>

    <description>
        All:&lt;br /&gt;[isbn://1558607013|Higher Order Perl], by [Dominus], has become a very popular book.  It was written to teach programmers how to transform programs with programs.  Many of us who do not have familiarity with Functional Programming are not aware of what [http://en.wikipedia.org/wiki/Higher-order_function|Higher Order] functions are.  A higher order function is one that does at least one of the following two things:&lt;ul&gt;&lt;li&gt;Accepts a function as input&lt;/li&gt;&lt;li&gt;Returns a function as output&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;This tutorial is an illustration of how a simple everyday function may become higher order, increasing its usefulness in the process.  Along the way we will pick up other tricks that can make our code more flexible and useful.&lt;/p&gt;&lt;H4&gt;Problem: We have a file containing a list of scores and we need to determine the highest score.&lt;/H4&gt;&lt;p&gt;Using the principle of code reuse and not reinventing the wheel, we turn to our trusty [cpan://List::Util].&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;use List::Util &#39;max&#39;;
my @scores = &lt;FH&gt;;
my $high_score = max(@scores);&lt;/pre&gt;&lt;p&gt;Unfortunately, this requires all of the scores to be held in memory at one time and our file is really big.  Just this once, we decide to break the rules and roll our own.&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;my $high_score;
while ( &lt;FH&gt; ) {
    chomp;
    $high_score = $_ if ! defined $high_score || $_ &gt; $high_score;
}&lt;/pre&gt;&lt;p&gt;As time goes by, &quot;just this once&quot; has happened many times and we decide to make our version reusable. &lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;sub gen_max {
    my $max;
    $max = $_[0] if defined $_[0];
    return sub {
        for ( @_ ) {
            $max = $_ if ! defined $max || $_ &gt; $max;
        }
        return $max;
    };
}

my $max = gen_max();
while ( &lt;FH&gt; ) {
    chomp;
    $max-&gt;($_);
}
my $high_score = $max-&gt;();&lt;/pre&gt;&lt;p&gt;This is our first step into Higher Order functions, as we have returned a function as the output for the sake of reusability.  We also have several advantages over the original [cpan://List::Util] &lt;i&gt;max&lt;/i&gt; function.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Does not require all values to be present at once&lt;/li&gt;&lt;li&gt;Ability to define a starting value&lt;/li&gt;&lt;li&gt;Ability to process one or more values at a time&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Unfortunately, our function breaks the second we start comparing strings instead of numbers.  We could make &lt;i&gt;max()&lt;/i&gt; and &lt;i&gt;maxstr()&lt;/i&gt; functions like [cpan://List::Util], but we want to use the concept of Higher Order functions to increase the versatility of our single function.&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;sub gen_reduce {
    my $usage = &#39;Usage: gen_reduce(&quot;initial&quot; =&gt; $val, &quot;compare&quot; =&gt; $code_ref)&#39;;
    my $val;
    die $usage if @_ % 2;
    my %opt = @_;
    die $usage if ! defined $opt{compare} || ref $opt{compare} ne &#39;CODE&#39;;
    my $compare = $opt{compare};
    $val = $opt{initial} if defined $opt{initial};
    return sub {
        for ( @_ ) {
            $val = $_ if ! defined $val || $compare-&gt;($_,  $val);
        }
        return $val;
    };
}

my $maxstr = gen_reduce(compare =&gt; sub { length($_[0]) &gt; length($_[1]) } );
while ( &lt;FH&gt; ) {
    chomp;
    $maxstr-&gt;($_);
}
my $long_str = $maxstr-&gt;();&lt;/pre&gt;&lt;p&gt;Now our function takes a function as input and returns a function as output.  
In addition to the previous functionality, we have added a few more features.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Named parameters - allows flexibility in ordering and presence of arguments as well as ease in extensibility&lt;/li&gt;&lt;li&gt;User defined comparator - our max function has now become a reduce function&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;This does not have to be the end of the journey into Higher Order functions, though it is the end of the tutorial.  Whenever you encounter a situation where two programs do nearly identical things but their differences are enough to make using a single function impossible - consider Higher Order functions to bridge the gap.  Remember - it is important to always document your interface and assumptions well!&lt;/p&gt;&lt;p&gt;I open the floor to comments both on the advantages and disadvantages of Higher Order functions.  As they say, there is no such thing as a free lunch and there are always cases in which it makes sense to use distinct routines for distinct problems.&lt;/p&gt;Cheers,[Limbic~Region|L~R]&lt;p&gt;&lt;small&gt;Note:  [cpan://List::Util] is a great module and the limitation of requiring all the values to be present at once is usually made up for by the fact that it also provides a &lt;i&gt;reduce()&lt;/i&gt; function, has both C and Perl implementations, and syntactic sugar.  The limitations were highlighted here for illustration purposes though I recommend using it when and where it does the job you need it to.&lt;/small&gt;&lt;/p&gt;
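&lt;p&gt;To see how one generator can stand in for several hand-written routines, here is a minimal sketch that is not from the original post. It restates gen_reduce in condensed form with simplified argument checking; the word list, the variable names and the three comparators are made up for illustration.&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;use strict;
use warnings;

# Condensed restatement of gen_reduce from the tutorial above.
sub gen_reduce {
    my %opt = @_;
    die &#39;compare must be a code ref&#39; if ref $opt{compare} ne &#39;CODE&#39;;
    my ($compare, $val) = @opt{qw(compare initial)};
    return sub {
        for (@_) {
            $val = $_ if !defined $val || $compare-&gt;($_, $val);
        }
        return $val;
    };
}

my @words = qw(pear fig banana kiwi);   # made-up sample data

# Same generator, three behaviours, chosen by the comparator passed in.
my $longest    = gen_reduce(compare =&gt; sub { length($_[0]) &gt; length($_[1]) });
my $last_alpha = gen_reduce(compare =&gt; sub { $_[0] gt $_[1] });
my $capped     = gen_reduce(initial =&gt; 100, compare =&gt; sub { $_[0] &gt; $_[1] });

$_-&gt;(@words) for $longest, $last_alpha;   # feed several values at once
$capped-&gt;($_) for 3, 42, 7;               # or one value at a time

print $longest-&gt;(), &quot;\n&quot;;      # banana
print $last_alpha-&gt;(), &quot;\n&quot;;   # pear
print $capped-&gt;(), &quot;\n&quot;;       # 100&lt;/pre&gt;&lt;p&gt;Each closure keeps its own private running value, so the three reducers can be fed in any order without interfering with each other.&lt;/p&gt;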
    </description>
</item>

        

<item>
    <title>Stepping up from XML::Simple to XML::LibXML (grantm)</title>
    <link>http://prlmnks.org/html/490846.html</link>
    <guid isPermaLink="true">http://prlmnks.org/html/490846.html</guid>

    <description>
        &lt;p&gt;If your XML parsing requirements can be boiled down to &quot;&lt;i&gt;slurp an XML file into a hash&lt;/i&gt;&quot;, then XML::Simple is very likely all you need. However, many people who start using [http://search.cpan.org/dist/XML-Simple/|XML::Simple] continue to cling to the module even when their requirements have outgrown it. Most often, it&#39;s fear of the unknown that keeps them from &#39;stepping up&#39; to a more capable module. In this article, I&#39;m going to attempt to dispel some of that fear by comparing using [http://search.cpan.org/dist/XML-LibXML/|XML::LibXML] to using XML::Simple.&lt;/p&gt;&lt;h2&gt;Installation&lt;/h2&gt;&lt;p&gt;If you&#39;re running Windows, you can get a binary build of XML::LibXML from Randy Kobes&#39; [http://theoryx5.uwinnipeg.ca/|PPM repositories]. If you&#39;re running Linux then things will be even simpler - just use the package from your distribution (eg: on Debian: apt-get install libxml-libxml-perl).&lt;/p&gt;&lt;p&gt;If for some reason you&#39;re unable to install XML::LibXML, but you have XML::Parser, then you might like to install XML::XPath, which is a Pure Perl module that implements a very similar API to LibXML but uses XML::Parser for the parsing bit.&lt;/p&gt;&lt;h2&gt;Some Sample Data&lt;/h2&gt;&lt;p&gt;Let&#39;s start with a file that lists the details of books in a (very small) library:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  &lt;library&gt;    &lt;book&gt;      &lt;title&gt;Perl Best Practices&lt;/title&gt;      &lt;author&gt;Damian Conway&lt;/author&gt;      &lt;isbn&gt;0596001738&lt;/isbn&gt;      &lt;pages&gt;542&lt;/pages&gt;      &lt;image src=&quot;http://www.oreilly.com/catalog/covers/perlbp.s.gif&quot;             width=&quot;145&quot; height=&quot;190&quot; /&gt;    &lt;/book&gt;    &lt;book&gt;      &lt;title&gt;Perl Cookbook, Second Edition&lt;/title&gt;      &lt;author&gt;Tom Christiansen&lt;/author&gt;      &lt;author&gt;Nathan Torkington&lt;/author&gt;      
&lt;isbn&gt;0596003137&lt;/isbn&gt;      &lt;pages&gt;964&lt;/pages&gt;      &lt;image src=&quot;http://www.oreilly.com/catalog/covers/perlckbk2.s.gif&quot;             width=&quot;145&quot; height=&quot;190&quot; /&gt;    &lt;/book&gt;    &lt;book&gt;      &lt;title&gt;Guitar for Dummies&lt;/title&gt;      &lt;author&gt;Mark Phillips&lt;/author&gt;      &lt;author&gt;John Chappell&lt;/author&gt;      &lt;isbn&gt;076455106X&lt;/isbn&gt;      &lt;pages&gt;392&lt;/pages&gt;      &lt;image src=&quot;http://media.wiley.com/product_data/coverImage/6X/07645510/076455106X.jpg&quot;             width=&quot;100&quot; height=&quot;125&quot; /&gt;    &lt;/book&gt;  &lt;/library&gt;&lt;/pre&gt;&lt;h2&gt;A Simple Problem&lt;/h2&gt;&lt;p&gt;As a warm-up exercise, let&#39;s list the titles of all the books from the XML file. Please assume all the code samples begin as follows:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  #!/usr/bin/perl  use strict;  use warnings;  my $filename = &#39;library.xml&#39;;&lt;/pre&gt;&lt;p&gt;Here&#39;s one solution, using XML::Simple:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  use XML::Simple qw(:strict);  my $library  = XMLin($filename,    ForceArray =&gt; 1,    KeyAttr    =&gt; {},  );  foreach my $book (@{$library-&gt;{book}}) {    print $book-&gt;{title}-&gt;[0], &quot;\n&quot;   }&lt;/pre&gt;&lt;p&gt;And here&#39;s a LibXML solution that works the same way:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  use XML::LibXML;  my $parser = XML::LibXML-&gt;new();  my $doc    = $parser-&gt;parse_file($filename);  foreach my $book ($doc-&gt;findnodes(&#39;/library/book&#39;)) {    my($title) = $book-&gt;findnodes(&#39;./title&#39;);    print $title-&gt;to_literal, &quot;\n&quot;   }&lt;/pre&gt;&lt;p&gt;The &lt;tt class=&quot;inline_code&quot;&gt;&#39;/library/book&#39;&lt;/tt&gt; argument to &lt;tt class=&quot;inline_code&quot;&gt;findnodes&lt;/tt&gt; is called an XPath expression. 
If we substitute a slightly more complex XPath expression, we can factor out one line of code from inside the loop:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  foreach my $title ($doc-&gt;findnodes(&#39;/library/book/title&#39;)) {    print $title-&gt;to_literal, &quot;\n&quot;   }&lt;/pre&gt;&lt;p&gt;And if it&#39;s code brevity we&#39;re looking for, we can take things even further (this is Perl after all):&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  print $_-&gt;data . &quot;\n&quot; foreach ($doc-&gt;findnodes(&#39;//book/title/text()&#39;));&lt;/pre&gt;&lt;h2&gt;A More Complex Query&lt;/h2&gt;&lt;p&gt;Now, let&#39;s select a specific book using its ISBN number and list the authors. Using XML::Simple:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  use XML::Simple qw(:strict);  my $isbn     = &#39;0596003137&#39;;  my $library  = XMLin($filename,     ForceArray =&gt; [ &#39;book&#39;, &#39;author&#39; ],     KeyAttr    =&gt; { book =&gt; &#39;isbn&#39; }  );  my $book = $library-&gt;{book}-&gt;{$isbn};  print &quot;$_\n&quot; foreach(@{$book-&gt;{author}});&lt;/pre&gt;&lt;p&gt;And with LibXML:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  use XML::LibXML;  my $isbn   = &#39;0596003137&#39;;  my $parser = XML::LibXML-&gt;new();  my $doc    = $parser-&gt;parse_file($filename);  my $query  = &quot;//book[isbn/text() = &#39;$isbn&#39;]/author/text()&quot;;  print $_-&gt;data . &quot;\n&quot; foreach ($doc-&gt;findnodes($query));&lt;/pre&gt;&lt;p&gt;This time, we&#39;ve used a more complex XPath expression to identify both the &lt;tt class=&quot;inline_code&quot;&gt;&lt;book&gt;&lt;/tt&gt; element and the &lt;tt class=&quot;inline_code&quot;&gt;&lt;author&gt;&lt;/tt&gt; elements within it, in a single step. 
To understand that XPath expression, let&#39;s first consider a simpler one:&lt;/p&gt;&lt;tt class=&quot;inline_code&quot;&gt;  //book[1]&lt;/tt&gt;&lt;p&gt;This expression selects the first in a sequence of consecutive &lt;tt class=&quot;inline_code&quot;&gt;&lt;book&gt;&lt;/tt&gt; elements. The &lt;tt class=&quot;inline_code&quot;&gt;[1]&lt;/tt&gt; is actually a shorthand version of the more general form:&lt;/p&gt;&lt;tt class=&quot;inline_code&quot;&gt;  //book[position() = 1]&lt;/tt&gt;&lt;p&gt;&lt;i&gt;Note that XPath positions are numbered from 1 - weird, huh?&lt;/i&gt;&lt;/p&gt;&lt;p&gt;As you can see, the square brackets enclose an expression, and the XPath query will match all nodes for which the expression evaluates to true. So to return to the XPath query from our last code sample:&lt;/p&gt;&lt;tt class=&quot;inline_code&quot;&gt;  //book[isbn/text() = &#39;0596003137&#39;]/author/text()&lt;/tt&gt;&lt;p&gt;This will match the text content of any &lt;tt class=&quot;inline_code&quot;&gt;&lt;author&gt;&lt;/tt&gt; elements within a &lt;tt class=&quot;inline_code&quot;&gt;&lt;book&gt;&lt;/tt&gt; element which also contains an &lt;tt class=&quot;inline_code&quot;&gt;&lt;isbn&gt;&lt;/tt&gt; element with the text content &#39;0596003137&#39;. The leading // is kind of a wildcard and will match any number of levels of element nesting. After you&#39;ve re-read that a few times, it might even start to make sense.&lt;/p&gt;&lt;p&gt;The &lt;a href=&quot;/out/http/?url=search.cpan.org%2Fdist%2FXML-XPath%2F&quot;&gt;XML::XPath&lt;/a&gt; distribution includes a command-line tool &#39;xpath&#39; which you can use to test your XPath skills interactively. 
Here&#39;s an example of querying our file to extract the ISBN of any book over 900 pages long:&lt;/p&gt;&lt;tt class=&quot;inline_code&quot;&gt;  xpath -q -e &#39;//book[pages &gt; 900]/isbn/text()&#39; library.xml&lt;/tt&gt;&lt;p&gt;To achieve the same thing with XML::Simple, you&#39;d need to iterate over the elements yourself:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  my $library  = XMLin($filename, ForceArray =&gt; [ &#39;book&#39; ], KeyAttr =&gt; {});  foreach my $book (@{$library-&gt;{book}}) {    print $book-&gt;{isbn}, &quot;\n&quot; if $book-&gt;{pages} &gt; 900;  }&lt;/pre&gt;&lt;h2&gt;Modifying the XML&lt;/h2&gt;&lt;p&gt;One area in which XML::Simple is particularly weak is round-tripping an XML file - reading it, modifying the data and writing it back out as XML.&lt;/p&gt;&lt;p&gt;For this example, we&#39;re going to locate the data for the book with ISBN 076455106X and correct its page count from 392 to 394:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  use XML::Simple qw(:strict);  my $isbn = &#39;076455106X&#39;;  my $xs = XML::Simple-&gt;new(    ForceArray =&gt; 1,    KeyAttr    =&gt; { },    KeepRoot   =&gt; 1,  );  my $ref  = $xs-&gt;XMLin($filename);  my $books = $ref-&gt;{library}-&gt;[0]-&gt;{book};  my($book) = grep($_-&gt;{isbn}-&gt;[0] eq $isbn, @$books);  $book-&gt;{pages}-&gt;[0] = &#39;394&#39;;  print $xs-&gt;XMLout($ref);&lt;/pre&gt;&lt;p&gt;In this example I&#39;ve used a number of tricks to attempt to make the output format resemble the input format as closely as possible:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;an XML::Simple object was used to ensure the exact same options were used both for input and output&lt;/li&gt;&lt;li&gt;the ForceArray option was turned on to ensure that elements didn&#39;t get turned into attributes - unfortunately this necessitates the use of the extra &lt;tt class=&quot;inline_code&quot;&gt;-&gt;[0]&lt;/tt&gt; indexing&lt;/li&gt;&lt;li&gt;the KeyAttr option was used to stop arrays being folded into 
hashes and thus losing the order of the &lt;code &gt;&lt;book&gt;&lt;/code&gt; elements - unfortunately this necessitates iterating through the elements rather than indexing directly by ISBN&lt;/li&gt;&lt;li&gt;the KeepRoot option was used to ensure the root element name was preserved - unfortunately this introduced an extra level of hash nesting&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Even after disabling all the features that make XML::Simple both simple and convenient, the results are not ideal. Although the order of the books was preserved, the order of the child elements within each book was lost.&lt;/p&gt;&lt;p&gt;By contrast, the LibXML code to perform the same update is both simpler and more accurate:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  use XML::LibXML;  my $isbn   = &#39;076455106X&#39;;  my $parser = XML::LibXML-&gt;new();  my $doc    = $parser-&gt;parse_file($filename);  my $query  = &quot;//book[isbn = &#39;$isbn&#39;]/pages/text()&quot;;  my($node)  = $doc-&gt;findnodes($query);  $node-&gt;setData(&#39;394&#39;);  print $doc-&gt;toString;&lt;/pre&gt;&lt;h2&gt;Other Operations&lt;/h2&gt;&lt;p&gt;If you need to remove an element from an XML document using XML::Simple, you&#39;d simply delete the appropriate hash key. With LibXML, you would call the &lt;tt class=&quot;inline_code&quot;&gt;removeChild&lt;/tt&gt; method on the element&#39;s parent. For example:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  my($book)  = $doc-&gt;findnodes(&quot;//book[isbn = &#39;$isbn&#39;]&quot;);  my $library = $book-&gt;parentNode;  $library-&gt;removeChild($book);&lt;/pre&gt;&lt;p&gt;To add an element with XML::Simple you&#39;d add a new key to the hash. With LibXML, you must first create the new element, add any child elements (such as text content) and add it at the right point in the tree. 
For example:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  my $rating = $doc-&gt;createElement(&#39;rating&#39;);  $rating-&gt;appendTextNode(&#39;5&#39;);  $book-&gt;appendChild($rating);&lt;/pre&gt;&lt;p&gt;If that looks a bit too complex, there&#39;s also a convenience method you can use to add one element with text content in a single step:&lt;/p&gt;&lt;pre class=&quot;block_code&quot;&gt;  $book-&gt;appendTextChild(&#39;rating&#39;, &#39;5&#39;);&lt;/