top ten things every Perl hacker should know
apotheon
created: 2006-03-15 19:11:25

I'm feeling a little contemplative today, mostly about very simple subjects. One topic that came to mind is this: What are the top ten things every Perl hacker should know? I encourage you all to offer your suggestions, but I've decided to provide a list of my own as a starting point.

  1. Perl is not an acronym
  2. there is more than one way to do it
  3. how and why to use warnings and use strict
  4. how and why to use taint checking
  5. how and why to use lexical scoping for variables
  6. how saved Perl source code files should be named
  7. how to use CPAN
  8. how and why to use perldoc and Perl Monks
  9. don't reinvent the wheel: how and why to use subroutines, modules, and libraries
  10. how and why to use regexen
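As an illustration of items 3 and 5, here is a minimal sketch of what `use strict`, `use warnings`, and lexical (`my`) variables look like together. The `count_words` helper is hypothetical, made up only for this example.

```perl
use strict;      # undeclared variables become compile-time errors
use warnings;    # flags suspicious constructs, e.g. use of undef values

# Hypothetical helper: counts whitespace-separated words in a list of lines.
sub count_words {
    my (@lines) = @_;
    my $total = 0;                      # lexical: visible only inside this sub
    for my $line (@lines) {             # $line exists only inside the loop
        my @words = split ' ', $line;   # ' ' splits on runs of whitespace
        $total += @words;               # array in numeric context = its length
    }
    return $total;
}

print count_words("just another", "Perl hacker"), "\n";   # prints 4
```

With `use strict` in effect, mistyping `$total` as `$totl` stops the program at compile time instead of silently creating a new global variable.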

I suspect the bit about naming Perl files will be considered less important by others here, but it seems quite important to me to know what file extensions to use for different types of source files, to use names without spaces and special characters in them, and to use at least vaguely descriptive names.

This is meant to be a list of generally important bits of knowledge, like quirks of the language, resources available, and important aspects of Perl culture. It's not intended to include things like ingenious snippets of code or what O'Reilly books about Perl are best (though these, too, are worthy subjects to contemplate).

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re: top ten things every Perl hacker should know
created: 2006-03-15 19:21:30
1. Perl is not an acronym
This is the top item on your list? On my list, that's at number 101,439. Right after "Learn to turn on computer".
Re^2: top ten things every Perl hacker should know
created: 2006-03-15 19:25:43

It seems like an important item if you're going to converse with other Perl programmers. Calling it PERL instead of Perl is a quick way to get marked as a know-nothing newbie — which is doubly problematic if you're an author of programming instructional texts. I've seen an awful lot of Perl books wherein the author kept calling it PERL.

Then again, you're right: this item is pretty frivolous, in the grand scheme of things.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re^3: top ten things every Perl hacker should know
created: 2006-03-15 19:59:25
Not this shit again...:(

thor

The only easy day was yesterday

Re^4: top ten things every Perl hacker should know
created: 2006-03-15 20:24:36

No, not that again. There's a big difference between saying it's important to know Perl is not PERL if you're going to participate in the Perl community and saying I wouldn't hire someone for not knowing Perl is not PERL. I suspect I'd be less likely to hire someone that called it PERL on a resume, but not because of the capitalization — it's probably the case that the qualities I'd look for in a candidate would match up with the likelihood of knowing that Perl is not an acronym, though.

In other words, I wouldn't mark you down for not knowing it's properly Perl instead of PERL, but I have a sneaking suspicion that calling it PERL is more likely to be accompanied by a lack of qualities I'd like (such as involvement in open source development communities).

In any case, the point of this is that it's something people should know to get along in the Perl community, not an indictment of people who don't know them. Everybody starts out ignorant: knowledge and wisdom come with (not necessarily formal) education and experience.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

PERL vs perl vs Perl
created: 2006-03-16 05:14:46
Remember that a lot of CVs that you get from agents will have been mangled beyond all recognition by the agent. That a CV from an agent says PERL doesn't say a thing about the candidate.

That's why, as a candidate, I usually take my own copies of my CV to interviews, to replace the broken one the agent has given them.

Re: PERL vs perl vs Perl
created: 2006-03-16 14:06:04

Good point.

As I said, the spelling of PERL on the resume wouldn't in and of itself cause me to pass over a candidate, but your point about recruiters might be an important point to keep in mind.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re^4: top ten things every Perl hacker should know
created: 2006-03-15 21:51:11

Though I've seen a lot of low-quality "Perl" books, I've never seen a high quality "PERL" one. It's worth remembering.

Cheers,
Ovid

New address of my CGI Course.

Re^5: top ten things every Perl hacker should know
created: 2006-03-16 23:30:16

++, Ovid. This is so far the best reason I've heard for not spelling it PERL.

Re^3: top ten things every Perl hacker should know
created: 2006-03-17 11:42:47
It's minor, sure, but it drives me batty that I have an infrastructure diagram where it is spelled PERL, in spite of how many times I have told them to change it. I think it's a conspiracy to piss me off :-D

no, i'm not paranoid - it's just that they really are out to get me ;-D

Re^3: top ten things every Perl hacker should know
created: 2006-03-20 10:37:52
Calling it PERL instead of Perl is a quick way to get marked as a know-nothing newbie
%man perl
NAME
      perl - Practical Extraction and Report Language

You're right... only an idiot would consider that an acronym. No wait -- only a perl cultist wouldn't consider that an acronym, because that's exactly what it is, and only severe abuse of one's perceptions and/or the English language itself can change that.

However, not everyone is a perl cultist. Some of us don't believe in abusing the honorific case in English to distinguish a language from its implementation, because unlike gods and royalty, a mere programming language doesn't deserve genuflection, and to those speakers of the English language, "Perl" is only correct if capitalized at the beginning of a sentence.

I realize this place is called Perl Monks, and that some people think it grants them license to turn into foaming-at-the-mouth perl fanatics, but seriously -- denying the very truth before your eyes is the mark of a cultist, not a rational human free of the taint of religion.

Perl *is* an acronym, because of the very fact it's been billed as such for well over ten years. Just because Pope Larry has decided to pull back from some of his past decrees doesn't undo the past.

Drop the religious fervour; it's not serving any purpose. Be rational, and focus on writing quality code in the language, and follow the original rules for the English language, not PerlSpeak.

Re^4: top ten things every Perl hacker should know
created: 2006-03-21 02:56:42

I can only assume you're unfamiliar with the term "backronym" — which is what Perl is, rather than a proper acronym. If you don't believe me, ask Mr. Wall.

In fact, if you're interested in getting this all "right", you might wish to refer back to the writings of Mr. Wall, the inventor of Perl and the man who coined the term Perl in the first place. It might also be of interest to note that the term perl, with a lower-case first letter, arose only because that's what the binary executable Perl parser is called in most unix-based implementations.

Perl is a proper name, as are Ruby, Python, Java, Lisp, Prolog, and a slew of other languages — no matter what their various parser binaries are called.

For purposes of knowing the proper terms, it helps to know the history of the terms.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re^5: top ten things every Perl hacker should know
created: 2006-03-21 11:13:22
I can only assume you're unfamiliar with the term "backronym" — which is what Perl is, rather than a proper acronym.

Just because Larry carefully rebranded perl as an acronym for marketing purposes doesn't make it any less an acronym. We don't know all the internal names the makers of lasers, sonar, or scuba gear pitched before the acronyms were popularized: yet those words are and remain acronyms.

Ten years ago, perl was being aggressively and loudly marketed under the acronym "Practical Extraction and Reporting Language" -- it was a selling point to encourage people to switch from shell scripts, which is largely what perl was written to replace. The fact that early versions of perl didn't have the acronym was a tiny sentence buried deep in man pages as a minor historical footnote.

For well over a decade (at *least* since I first read the perl man pages back in 1994, if not before then), perl was billed, loudly, as an acronym for "Practical Extraction and Reporting Language": now that its associations with shell scripting are no longer considered cool, Larry is trying to flip-flop back to the old name. That part is fine, I guess, but claiming that perl is suddenly no longer an acronym is NOT okay.

That's revisionist history, and it's flat out wrong. Perl is still defined as an acronym in the canonical man pages that describe the language, so to claim otherwise is just plain silly.

Perl is a proper name, as are Ruby, Python, Java, Lisp, Prolog

Uh huh. So the concept of Perl gets to be a proper name, but the tangible instantiation doesn't? We don't do that with any other noun in English. I don't drive a "ford taurus", which is then deemed to be a specific instantiation of the platonic ideal of a "Ford Taurus". If anything, a proper noun implies a specific instantiation of a more general concept, not the other way around. We name specific children; but we don't consider 'child' a proper noun. Yet perl zealots keep harping on the distinction between the Holy Abstraction of Perl (which is capitalized, presumably as an honourific), and the lowly, bug-ridden instantiation (which apparently doesn't merit one).

We write popular acronyms in lowercase, without the periods: thus "perl", not "P.E.R.L."; just as we do with laser, maser, sonar, or scuba.

For purposes of knowing the proper terms, it helps to know the history of the terms.

And history says that perl stands for "Practical Extraction and Reporting Language"; that the definition was changed very early in the history of perl; and that the very first description you got, and still get when you look up the meaning of the language, has been, and still remains, an acronym.

Re^6: top ten things every Perl hacker should know
created: 2006-03-21 12:47:28
Sorry, but you're wrong.

In the perl man-page, since Perl 1, it has included both the backronyms Practical Extraction and Report Language and later Pathologically Eclectic Rubbish Lister in the BUGS section. In fact Perl never stood for either, and both backronyms were due to Larry Wall himself, before he ever released his creation on an unsuspecting world.

Larry doesn't care much about the issue either way. If you find the language useful, then great. If you don't, well OK. But he doesn't care to argue about what you call it.

It is other people who have made a big issue over this. I'm not sure how it came to be, but knowing this piece of trivia has become essentially a "secret handshake" to identify who has been a part of online Perl communities, and who has not. This is useful enough that I use it that way myself, even though I don't care how you call it.

As for the fact that Perl gets to be a proper name while the instantiation doesn't, blame Unix. That the Perl executable is called perl is due to Unix convention, and since Unix is case-sensitive, it is very important to capitalize that correctly. The language itself is capitalized because by convention most language names are capitalized. For example we write Python, Visual Basic, Ruby, Lisp (even though that really was an acronym), and so on.

But note that those languages that have Unix implementations have lower-case executables. Therefore at heart Perl/perl is no more a violation of standard English capitalization rules than Python/python, Ruby/ruby, Sendmail/sendmail, Mozilla/mozilla and many other examples.

Re^6: top ten things every Perl hacker should know
created: 2006-03-22 02:56:06

Speaking of Platonic ideals: you seem to have a strange idea about how the English language works. Sure, English actually tends to conform to a (very complex) set of rules, but many of those rules are exceptions to other rules, and even those excepting rules have exceptions much of the time. Thus, your half-baked attempt to retrofit terms that are properly unix jargon to your vision of the Platonic ideal of the English language is not only misguided in its ignorance of subcultural jargon rules but also in its impression of English as being a pure extrapolation from some set of inviolable and unvarying rules. The Ford Taurus is not, in instantiation, called the ford taurus because that's the way the rules of English treat brand names for cars in common parlance, while Perl implementations are called perl in unix systems because that's the way the rules of unix-hacker jargon as a subset of English treat executable binaries.

I find it ironic that you're trying to argue that Perl is properly perl because of your gymnastic feats in an attempt to contort the term's history to fit your Platonic ideal of the English language, all while violating the rules of English every few sentences in use of very simple, undisputed rules of syntax and grammar such as possessives, sentence structure, punctuation, et cetera. Normally, I don't pick on the spelling, grammar, and other errors of English usage when disagreeing with someone, but since you're claiming everyone but you is wrong about how the English language is used it seems not only fair game but a highly relevant point. How can you instruct the rest of us in the use of the English language and application of its rules when you do not even know them yourself?

You furthermore contradict yourself, claiming that Perl (or "perl" as you'd have it) was an acronym first, then go on to say that Larry Wall "flip-flopped back to the old name" (emphasis mine), thus effectively conceding the point that it's properly a backronym rather than strictly an acronym.

While your so-called history of Perl's purpose is essentially irrelevant to the discussion at hand, it's worth noting that Perl was, from day one, apparently far more than merely a replacement for shell scripts. It was a replacement for a great many things, including shell scripts, sed and awk, C for system administration, and probably half a dozen other things besides.

I recommend Wall's State of the Onion addresses if you want to know more about the early history of Perl. It seems like every one of them gives up some new tidbit of information on his early motivations and decisions.

EDIT: While it's not really all that big a deal to me, or even most people, how you choose to spell it or why you make that decision, making wildly inaccurate claims about how it really is spelled according to your own hasty generalizations and other logical fallacies just begs for corrections. I don't care if you call it "perl" rather than "Perl", but telling me I'm wrong for calling it "Perl" because the Ford Taurus isn't called a "ford taurus" when it's sitting in my driveway isn't going to convince me you know what you're talking about.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re^7: top ten things every Perl hacker should know
created: 2006-03-22 12:47:13
misguided in its ignorance of subcultural jargon rules

Subcultural jargon is a corruption of the English language: it HAS no rules. It's an exception to the rules.

Your ad hominem attacks on someone who is patently right just underscore the real truth: you're trying to form an elitist group.

The manual page lists the acronym as a correct usage of perl. You say it's not. You're wrong. The man page is definitive; that's its purpose.

You can invent nonsense words like backronyms; you can bitch about my spelling or grammar or other trivialities, but the truth of the matter is you are wrong. You know it, I know it, everyone knows it.

I'm sick of people like you, who are just looking for a reason to sneer at other people who don't agree to play by stupid rules of an elitist subculture of arrogant twits.

Re^8: top ten things every Perl hacker should know
created: 2006-03-22 13:37:13
The manual page lists the acronym as a correct usage of perl. You say it's not. You're wrong. The man page is definitive; that's its purpose.

If the man page is definitive, then here's the definitive definition which overrides all others:

Perl actually stands for Pathologically Eclectic Rubbish
Lister, but don't tell anyone I said that.              
Re^8: top ten things every Perl hacker should know
created: 2006-03-22 14:35:19

If you want to take the manpage as the authoritative reference, you should learn to read and learn something about context.

  1. manpages refer to programs, not languages: therefore, the spelling with a lower-case P under the "NAME" heading refers to the binary, not the language
  2. every single time the manpage refers to the language, as opposed to the parser, it's spelled with a capital P, including uses in the middle of a sentence

In addition, there's hardly anything "elitist" about me offering advice about how best to appear knowledgeable when speaking with others while explaining that I, for one, don't care whether you capitalize the first letter or not.

Furthermore, I didn't invent the term "backronym". It's a portmanteau of "back" and "acronym" that was coined in 1983, four years before the existence of Perl v1.0, that is a synonym for the term "back formation" as used by linguists to refer to terms that are essentially counterfeit acronyms.

I think your claim that "everyone knows" that I'm "wrong" is patently absurd, but you're welcome to it. I sure wouldn't want to be associated with the unironic conception of such an asinine pronouncement.

In any case, I think I've had enough of this.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re: top ten things every Perl hacker should know
created: 2006-03-15 23:25:44
Okay, I'm intrigued. Please explain your thoughts on how Perl source code files should be named and why,
or perhaps show a link to your writings. I've been debating how to do this since I started downloading
code from this site recently.

Thanks....
Re^2: top ten things every Perl hacker should know (script naming)
tye
created: 2006-03-16 02:54:03

Perl source files (for scripts, not modules) should be named like whatItDoes.pl and checked in to source control. And you should have a script that "installs" your Perl scripts so that you can run them. That install script should do the following:

  • adjust the path in the #! (unless you are lucky enough to have enough control of all of your systems so that /usr/bin/perl is always appropriate as Larry intended)
  • copy the file into the appropriate directory in your PATH
  • rename the file so it no longer ends in ".pl" so that you execute it simply as "whatItDoes", like every other command you use (because a tool getting reimplemented in a different language should not cause you to have to invoke it any differently)
  • set execute permission (for some OSes)
  • run pl2bat (for other OSes) -- Note that some prefer other ways of making it so that, on Win32, you simply invoke a Perl script using an extensionless name (not "perl ..." nor "whatItDoes.pl ...") and I'm not arguing against such alternatives; I'm simply arguing that needing to know what language the whatItDoes tool was written in in order to use it is a mistake. (:

Though certainly not necessarily in that order, or even as that many steps.
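The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not tye's actual install script: the `install_script` name, its arguments, and the 0755 mode are all hypothetical, and the pl2bat step for Win32 is left out.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;

# Hypothetical helper: "install" one script into $bin_dir.
# $perl_path is the interpreter path to put on the #! line.
sub install_script {
    my ($source, $bin_dir, $perl_path) = @_;

    # Drop the ".pl" so the tool is invoked as plain "whatItDoes".
    (my $name = basename($source)) =~ s/\.pl\z//;
    my $target = File::Spec->catfile($bin_dir, $name);

    open my $in,  '<', $source or die "Cannot read $source: $!";
    open my $out, '>', $target or die "Cannot write $target: $!";

    while (my $line = <$in>) {
        # Rewrite only the first line, and only if its #! path ends in
        # "perl"; switches such as -w or -T after the path are kept.
        # A "#!/usr/bin/env perl" line does not match and is left alone.
        $line =~ s/^#!\s*\S*perl\S*/#!$perl_path/ if $. == 1;
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot finish writing $target: $!";

    # Set execute permission (meaningful on Unix-like systems).
    chmod 0755, $target or die "Cannot chmod $target: $!";
    return $target;
}
```

A call such as `install_script('whatItDoes.pl', "$ENV{HOME}/bin", '/usr/local/bin/perl')` (paths chosen only for the example) would leave an executable named `whatItDoes` in the bin directory, while the `.pl` original stays under source control.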

- tye        

Re^3: top ten things every Perl hacker should know (script naming)
created: 2006-03-16 03:33:28

Good list, but for

rename the file so it no longer ends in ".pl"

I think I'd prefer to create a link (hard or soft, whichever you feel more comfortable with) from whatItDoes to whatItDoes.pl, leaving the original file as it is. Only for those OSes that can do links of course.


All dogma is stupid.
Re^4: top ten things every Perl hacker should know (link)
tye
created: 2006-03-16 13:57:49

Are you saying you'd link the executable to some working copy of the *.pl file that you have checked out (so that you can't edit without affecting those using it) or to the *.pl inside the revision control system (may not be possible, prevents you from checking in unfinished changes) or that you'd put *.pl files into your "bin" directory? I wouldn't put *.pl files into a "bin" directory, since it tempts the use of the *.pl file which would break when the script gets reimplemented or wrapped in something and it just adds clutter, IMO.

- tye        

Re^5: top ten things every Perl hacker should know (link)
created: 2006-03-16 16:54:47

Yes, I would put the whatever.pl file in the bin directory, and almost precisely for the reason you give against it :-). This makes it possible for me to use whatever.pl directly in a script or system if I want that particular implementation. If you want to place a wrapper around it or reimplement it as "whatever", then that's fine, go ahead and I'll still be using the specific implementation I wrote my script against (and of course, if you're the admin there's nothing stopping you from linking whatever.pl to whatever, thereby forcing me to use the new implementation). Normal users will use whatever anyway and thus benefit from implementation changes directly.

Yes this can be abused, but it provides maximum flexibility. Polite requests over shotguns and all that.


All dogma is stupid.
Re^6: top ten things every Perl hacker should know (link)
tye
created: 2006-03-16 18:48:56

If there is a need to support multiple versions, then I support multiple versions, each regardless of what it happens to be implemented in. Only supporting multiple versions when changing the language that something is implemented in seems of very limited value. So I find not publishing the *.pl name to be both more flexible and less confusing.

- tye        

Re^4: top ten things every Perl hacker should know (script naming)
created: 2006-03-16 14:26:54

I tend to "install" working copies of Perl programs normally, renamed without a file extension. When I'm working on code, though, I tend to keep it in a /home/username/src directory with a file extension, and softlink to it from a /home/username/bin directory with a shorter name (no file extension) so that I can test-run scripts more easily.

I guess maybe I'm somewhere between the two of you on this one.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re: top ten things every Perl hacker should know
xdg
created: 2006-03-16 00:02:19

(sigh)

How about 10 things culled haphazardly and in no particular order from either my scratchpad or Selected Best Nodes:

-xdg

Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: top ten things every Perl hacker should know
created: 2006-03-16 04:40:47

I think you omit a number of things that are of crucial importance

  • Know what the Phalanx 100 is, and study it. It is an excellent starting place for finding out what CPAN has to offer.
  • Know the name of your local CPAN mirror.
  • Know how to install a CPAN module directly without the assistance of cpan or cpanp.
  • Know what mailing lists are available, and subscribe to the ones that you think are important or interesting.
  • Know where the next, nearest Perl Workshop or YAPC will be held.
  • Know how to file a bug report. Does the author like to receive e-mail or do they mention that they prefer to use Request Tracker? (One prominent author refuses to use rt.cpan.org).
  • Know how to comment out large slabs of code easily.
  • Know how and when barewords are interpreted as strings or globs
  • Know what the indirect calling notation (indirect object syntax) is and why it is discouraged.

  • Know what lexical filehandles are
  • Know that a warning in an if statement may well be related to a chained elsif further down in the code.

Goodness! I'd better stop. I'm up to eleven already.
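To make the lexical-filehandle item concrete, here is a small sketch contrasting it with the older bareword style. The `first_line` helper is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first line of a file, without its newline.
sub first_line {
    my ($path) = @_;

    # Lexical filehandle: $fh is an ordinary "my" variable, so it cannot
    # clash with a handle of the same name elsewhere in the program, and
    # it is closed automatically if it goes out of scope while still open.
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my $line = <$fh>;
    close $fh;

    chomp $line if defined $line;   # an empty file yields undef
    return $line;
}

# The older bareword style uses a package-global handle instead:
#   open FH, '<', $path or die "Cannot open $path: $!";
#   my $line = <FH>;
#   close FH;
```

The bareword form still works, but every `FH` in the same package refers to the same handle, which is exactly the kind of action at a distance that lexical scoping avoids.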

• another intruder with the mooring in the heart of the Perl

Re^2: top ten things every Perl hacker should know
created: 2006-03-16 16:16:07
Know how to comment out large slabs of code easily.
  1. Mark the block to be commented.
  2. Fire up search and replace in regex mode
  3. Search for "^", replace with "#"
Of course you need a decent editor. The other way round should be obvious ;).
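The same search and replace can be written in Perl itself. This is just a sketch of the idea; `comment_out` and `uncomment` are hypothetical names, not functions from any editor or module.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix every line of a block with "#".
sub comment_out {
    my ($code) = @_;
    $code =~ s/^/#/mg;    # /m lets ^ match at the start of each line
    return $code;
}

# Hypothetical helper: the other way round, removing one leading "#" per line.
sub uncomment {
    my ($code) = @_;
    $code =~ s/^#//mg;
    return $code;
}
```

Note that `uncomment` strips only a single `#` from each line, so lines that were already comments before `comment_out` ran come back unchanged.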


holli, /regexed monk/
Re^3: top ten things every Perl hacker should know
xdg
created: 2006-03-16 18:48:22

Or better yet, teach the editor to do 1, 2 and 3 (and the inverse) as a macro.

And for those without a decent editor wondering what to do, I assume the original reference was about using Pod:

=begin comment

# lots of code here

=end comment

=cut

-xdg

Code written by xdg and posted on PerlMonks is public domain (http://creativecommons.org/licenses/publicdomain). It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re^3: top ten things every Perl hacker should know
created: 2006-03-17 09:09:12
At the risk of starting up a whole new discussion, my vote for a decent editor goes to nedit.

Cheers,

JohnGG

My top ten
created: 2006-03-16 05:25:59
  1. All rules are guidelines, including this one;
  2. Always use strict and use warnings;
  3. use constant is your friend;
  4. Regexes are bad, you can often use another way to do it;
  5. While there are many ways to do it, most of them are wrong;
  6. Anyone listing rules for programmers is wrong;
  7. Documentation is for users, comments for developers. You will be a user of your own code, so selfishness compels you to write both;
  8. Tests aren't as necessary as the testing cabal would have you believe;
  9. Tea is the one true source of caffeine;
  10. It's OK to reinvent the wheel sometimes;
  11. There will always be last-minute additions
My second rule is really just a variation on "turn on the fascism options in your compiler; and if the compiler emits warnings that's because your code is broken". My third rule is a special case of general good practice regarding naming conventions.
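For rule 3, a minimal sketch of `use constant` replacing a magic number with a name. The constant `MAX_RETRIES` and the `should_retry` helper are made up for the example.

```perl
use strict;
use warnings;
use constant MAX_RETRIES => 3;   # hypothetical limit, named once

# Hypothetical helper: true while another attempt is still allowed.
sub should_retry {
    my ($attempt) = @_;
    return $attempt < MAX_RETRIES;
}

# Constants are really subroutines, so they do not interpolate in strings;
# concatenate them instead.
my $message = "Giving up after " . MAX_RETRIES . " attempts";
```

Because the constant is declared at compile time, a typo such as `MAX_RETRYS` is caught by `use strict` as a disallowed bareword rather than quietly evaluating to something else.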
Re: My top ten
created: 2006-03-17 12:49:25
Rewrite for #7

Users don't read documentation if they have a phone number. Write documentation like you would a mystery novel, and leave clues to your phone number spread out through the documentation. That way, by the time they've figured out your phone number, they've already got their answer and they don't need to phone you.
Re: My top ten
created: 2006-03-19 10:53:15
Jolly well done, except for the utter nonsense in point 9. Tea is fine as a supplemental source of caffeine, but nothing more.
Re: My top ten
created: 2006-03-22 03:32:58

re: "10. It's OK to reinvent the wheel sometimes;"

I'd say that's generally only true if you can't get at the source for the wheel. Really, wheels should usually be improved rather than reinvented anew. One might also consider whether wheels simply need new tires (wrappers) rather than reinvention, though the outer interface for the wheel might need to be redesigned to accept your new tire design.

There are exceptions to every rule, including the "use strict and warnings" rule — particularly in Perl culture — but I don't think that invalidates the rule particularly.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re^2: My top ten
created: 2006-03-22 12:28:07
I strongly disagree.

Just because someone else has written a piece of crap, doesn't mean that I have to use it.

In fact I'd say that one of the critical programming skills that few develop is being able to properly decide whether to reuse or ignore a particular wheel. It isn't an easy decision. And it certainly isn't as simple as saying, Always do _____.

The advice to always reuse wheels is good advice to give beginners exactly because they are beginners. Wheels that they're likely to hear about from experienced programmers are always going to be better than what they can write for themselves. But good programmers have a harder decision to make. Because if you actually are good, you probably can create better wheels than a lot that are in use out there. The question then becomes whether it is worth the time and energy to do so. Usually it is not, but sometimes it clearly is.

Re^3: My top ten
created: 2006-03-22 14:10:46

Someone writing a "piece of crap" hasn't really invented a wheel, though — he's only invented a "piece of crap". If someone presented a 4x12 rectangle on an axle, I wouldn't call it an invented wheel.

I think, at this point, we're really not disagreeing in principle — only in phrasing.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re^2: My top ten
created: 2006-03-23 06:10:59
Sometimes the guts of a module are so tied to the interface that to adapt it means to rewrite it anyway, in which case it's usually quicker to just write a new module than it would be to pick the old one apart and then hack it up beyond recognition. A good example is GD::Graph, which I am very seriously considering re-writing to add some features and fix others.
Re: top ten things every Perl hacker should know
created: 2006-03-16 08:00:52

"Perl Programmer" sounds better than hacker ;-)

Walking the road to enlightenment... I found a penguin and a camel on the way.....
Fancy a yourname@perl.me.uk? Just ask!!!
Re: top ten things every Perl hacker should know
created: 2006-03-16 08:46:04
A very useful discussion from my point of view. Each of 22609's points makes total sense. 237051's caveat is understandable. Since
different folks' ideas of what is most important are going to vary, it's worthwhile to see the different responses. I have found that
in thinking about a specific subject, whether it be Perl or any other topic, putting together a "top ten list" can be a useful problem-solving exercise.
Re: top ten things every Perl hacker should know
created: 2006-03-16 09:08:59
A non-programmer, beginner's take on this...

1. Always use strict.
2. Always use warnings.
3. Always use diagnostics during development.
4. KISS.
5. Brute Force programming is not always a bad idea. Elegant code is often slower and harder to maintain.
6. White space is your friend.
7. Comments are an even better friend.
8. CPAN is the best friend you'll ever have.
9. When you've programmed yourself into a corner, sometimes the quickest fix is to trash everything and start from scratch.
10. Assume that the person who will have to maintain your code is even more of a NOOB than you are.

Jack

Re^2: top ten things every Perl hacker should know
created: 2006-03-16 14:16:53

When you've programmed yourself into a corner, sometimes the quickest fix is to trash everything and start from scratch.

That's very true — but it's also sometimes instructive to keep doggedly at the problem until you solve it without rewriting, even if scrapping the original and starting over would be quicker. That's my experience, at any rate.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re^2: top ten things every Perl hacker should know
created: 2006-03-16 14:57:18
Brute Force programming is not always a bad idea. Elegant code is often slower and harder to maintain.

I agree with the first premise. The supporting reasons are less good. A complicated Schwartzian transform may well be impossible for a first-year Perl hacker to read but it's 100 times easier to read than the 2 screenfuls of code it replaces if you know what it is. Elegant, to me, means better, concise, more lucid, not too tricky to follow.

Re^3: top ten things every Perl hacker should know
created: 2006-03-16 15:54:49
I couldn't agree more, but there's a difference between elegance for elegance's sake (i.e., doing something tricky because it has a high geek factor) and elegance because it's good programming (e.g., the Schwartzian transform). I was thinking more of high geek factor code. It might parse better if I s/often/sometimes/.

Jack

Re^4: top ten things every Perl hacker should know
created: 2006-03-17 00:28:40

"Tricky" or "clever" is not always elegant. I tend to think of elegance as eschewing the gratuitous. That would mean writing clever code for the sake of being clever does not fit my definition of "elegance".

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re^3: top ten things every Perl hacker should know
created: 2006-03-16 23:14:16
Bad example.

A Schwartzian transform in Perl is always more complex than a straightforward sort block. It is good to know about it because it can be lots faster, but it is more complex.

If you do not understand this, then you don't really understand the Schwartzian transform.

(Note that I have to say "in Perl" because other languages, for instance Python and Ruby, have implemented shortcuts to make Schwartzian transforms simpler than the alternative.)

Re^4: top ten things every Perl hacker should know
created: 2006-03-17 00:51:52

It's all right in front of you, in one sweep of your eye: the filtering, the modification, and the sorting (grep, map, sort, and maybe more). There aren't a couple of sets of temporary variables and three or more blocks or subroutines to jump around and try to keep in your head at once. It's the same code condensed. I find it easier to read and much easier to debug.

So I do think the ST is a good example of elegance. I'm open to counter examples of what you consider Perl (not Python or Ruby of course!) elegance.

Re^5: top ten things every Perl hacker should know
created: 2006-03-17 02:05:46
Let's take a list of files and use a Schwartzian Transform to sort it by file size descending, then alphabetical order ascending.
my @sorted_files
  = map {$_->[0]}
    sort {$b->[1] <=> $a->[1] or $a->[0] cmp $b->[0]}
    map {[$_, -s]} @files;
Here is the same code written as a normal sort.
my @sorted_files = sort {-s $b <=> -s $a or $a cmp $b} @files;
Clearly the Schwartzian Transform is more complex. But if you have a list of 1000 files, it's also about 10 times faster. Which is why we learn it.
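
To see where the speed difference comes from, here is a rough sketch that counts how often the "size" is looked up in each version. It assumes a hypothetical %size_of hash and file_size() sub standing in for -s, so it runs without touching the filesystem; the exact count for the plain sort depends on the data and on Perl's sort implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for -s: sizes come from a hash, and every
# lookup is counted.
my %size_of;
$size_of{"file$_.txt"} = ($_ * 37) % 101 for 1 .. 1000;
my @files   = sort keys %size_of;
my $lookups = 0;
sub file_size { $lookups++; return $size_of{ $_[0] } }

# Plain sort: two lookups for every comparison the sort makes.
my @plain = sort { file_size($b) <=> file_size($a) or $a cmp $b } @files;
my $plain_lookups = $lookups;

# Schwartzian Transform: exactly one lookup per element.
$lookups = 0;
my @st = map  { $_->[0] }
         sort { $b->[1] <=> $a->[1] or $a->[0] cmp $b->[0] }
         map  { [ $_, file_size($_) ] } @files;
my $st_lookups = $lookups;

print "plain sort: $plain_lookups lookups, ST: $st_lookups lookups\n";
```

Both versions produce the same order; only the number of lookups differs.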

Now to explain my Ruby comment. In Ruby, arrays have a sort_by method. So in this example you'd write:

sorted_files = files.sort_by {|f| [- test(?s, f), f]};
and you've written the more efficient sort with less code than the regular sort. This does not work in Perl first of all because we don't have a sort_by method, and furthermore because Perl doesn't do anything useful when you try to sort array references. (Ruby sorts them lexicographically, with each field sorting in its "natural" way.)
Re^6: top ten things every Perl hacker should know
qq
created: 2006-03-18 15:21:09

We sort of have a sort_by method in Sort::Maker. But I've never liked the api. There is no denying that an ST is a scary sight to a new programmer not familiar with the idiom. I'd argue it should be wrapped in a sub for clarity most often.
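
As a sketch of what "wrapped in a sub" might look like: sort_by_key below is a made-up helper, not a core function or the Sort::Maker API. It only handles a single string key, and it assumes the key block reads its element from $_.

```perl
use strict;
use warnings;

# Hypothetical helper: sort a list by a string key that is computed
# once per element. The (&@) prototype allows a bare block as the
# first argument, the way map and grep do.
sub sort_by_key(&@) {
    my ($key_for, @items) = @_;
    return map  { $_->[0] }                    # unwrap the original item
           sort { $a->[1] cmp $b->[1] }        # compare the cached keys
           map  { [ $_, $key_for->($_) ] }     # pair each item with its key
           @items;
}

my @words  = qw(pear Apple fig banana);
my @sorted = sort_by_key { lc $_ } @words;
print "@sorted\n";    # Apple banana fig pear
```

The caller never sees the array references, which is most of what makes the raw idiom look scary.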

I like the PBP argument for using List::Util and List::MoreUtils functions such as any rather than idiomatic uses of grep. It makes your intention clear, and prevents you from thinking in syntax - no matter how comfortable and familiar that syntax is.
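
A small illustration of the difference, assuming a List::Util recent enough to export any (List::MoreUtils offers one as well); the data is made up.

```perl
use strict;
use warnings;
use List::Util qw(any);

my @sizes = (120, 4096, 0, 87);

# Idiomatic grep: works, but it scans the whole list and relies on the
# reader knowing that a match count is being used as a boolean.
my $has_empty_grep = grep { $_ == 0 } @sizes;

# any: states the question directly and can stop at the first match.
my $has_empty_any = any { $_ == 0 } @sizes;

print "grep says yes\n" if $has_empty_grep;
print "any says yes\n"  if $has_empty_any;
```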

update: added some words for clarity

Re^7: top ten things every Perl hacker should know
created: 2006-03-20 10:55:57
We sort of have a sort_by method in Sort::Maker. But I've never liked the api

You should try Sort::Key from CPAN. It has a very simple API, and it's faster and uses less memory than any other Perl sorting technique:

use Sort::Key qw(keysort);
my @sorted = keysort { genkey($_) } @data;
Re^4: top ten things every Perl hacker should know
created: 2006-03-17 09:20:04
The problem with the "straightforward sort block" is that you often have to surround it with lots of other code to achieve the same results as the ST. At the heart of it, the sort complexity is the same and the difference lies in where you mangle the data into a sortable form and then extract it again. If I understand correctly, it is the Guttman-Rosler Transform that has a simpler lexical sort after the sort keys have been carefully packed into a string.

Cheers,

JohnGG

Re^5: top ten things every Perl hacker should know
created: 2006-03-17 11:31:47
Try to come up with an example. With real code. You'll find that the straightforward sort really is more straightforward.

You'll find time and time again that the straightforward sort block is simpler to write in Perl than the fancy sorts. OK, thinking carefully about it, there is one exception. And that exception is where the code to extract "what you want to sort by" is very complex, so that it is more complex to do it both for $a and $b than it is to do it once and have a Schwartzian Transform. But I don't think I've ever encountered that in real life. (Plus one can just move the complex logic into a function and call the function twice. With anonymous functions one can do it inline, and it will still be simpler than a Schwartzian Transform.)
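
A minimal sketch of the "call the function twice" approach, using made-up colon-separated records and an anonymous function named $salary_of purely for illustration:

```perl
use strict;
use warnings;

# Hypothetical records in "name:department:salary" form.
my @records = ('ann:ops:52000', 'bob:dev:61000', 'cy:dev:48000');

# The extraction logic lives in one place...
my $salary_of = sub { (split /:/, $_[0])[2] };

# ...and the sort block simply calls it for both operands.
my @by_salary = sort { $salary_of->($a) <=> $salary_of->($b) } @records;
print "$_\n" for @by_salary;
```

The extraction runs on every comparison, so this trades speed for a sort block that reads like a plain comparison.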

And, of course, someone who hasn't studied sorting tricks is always going to find the straightforward sort block far easier to read.

However the sort block executes more times than mangle/extract blocks do in the Schwartzian Transform or the Guttman-Rosler Transform. So the more work you move from the sort blocks to mangle/extract, the more time you'll save. The GRT is faster than the Schwartzian Transform because it uses a simpler data structure (a string), and so the sort block can be made even faster (in fact it is the default string compare).
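
For comparison, a rough sketch of a Guttman-Rosler Transform for the "size descending, then name ascending" ordering. It uses a hypothetical %size_of hash instead of real files and assumes every size fits in 32 bits.

```perl
use strict;
use warnings;

# Hypothetical file sizes, so the sketch runs without real files.
my %size_of = ('a.txt' => 10, 'b.txt' => 300, 'c.txt' => 300, 'd.txt' => 7);

# Mangle: prefix each name with a 4-byte big-endian key. Subtracting
# the size from the 32-bit maximum makes larger files sort first.
my @packed = map { pack('N', 0xFFFFFFFF - $size_of{$_}) . $_ } keys %size_of;

# Sort with the default string comparison (no sort block at all),
# then extract by stripping the 4-byte key.
my @sorted = map { substr($_, 4) } sort @packed;

print "@sorted\n";    # b.txt c.txt a.txt d.txt
```

All of the work has moved into the mangle and extract steps, which each run once per element.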

People think that this is cool because they are surprised that this change can have such big performance implications. But it is an optimization, and the code you get is more complex (at least in Perl).

Update: hv noted that I'd written GST instead of GRT. Fixed.

Re^6: top ten things every Perl hacker should know
created: 2006-03-17 14:42:09
I'll take one of my ST sorts and translate it to "straightforward sort block" style and see what happens. I'll give you an update when it's done.

Cheers,

JohnGG

Re^7: top ten things every Perl hacker should know
created: 2006-03-17 17:44:38
You can look at Re^5: top ten things every Perl hacker should know for an example of the same sort written as a Schwartzian Transform and as a regular sort. (I intentionally chose one that was similar to the example that gave the Schwartzian transform its name.)
Re^8: top ten things every Perl hacker should know
created: 2006-03-18 07:07:47
No, you are right and I am wrong!

If the data you need for sorting is there in front of you with no need for any "mangling", the "straightforward sort block" is simpler to write than the Schwartzian Transform. I was confusing the transforming done as a performance measure with other transformations to the data that are necessary because it is not yet in a sortable form.

If you do need to change the data in some way before the sort can take place then the Schwartzian Transform starts to gain in the straightforwardness factor, I think. The sort block method gains in readability because you can use meaningful variable names rather than $_; the Schwartzian Transform gains because the flow seems to me to be more obvious, albeit up the page, which is counter-intuitive if you are used to piping commands in the shell. I tend to comment what is going on in the trans