I'm feeling a little contemplative today, mostly about very simple subjects. One topic that came to mind is this: What are the top ten things every Perl hacker should know? I encourage you all to offer your suggestions, but I've decided to provide a list of my own as a starting point.
I suspect the bit about naming Perl files will be considered less important by others here, but it seems quite important to me to know what file extensions to use for different types of source files, to use names without spaces and special characters in them, and to use at least vaguely descriptive names.
This is meant to be a list of generally important bits of knowledge, like quirks of the language, resources available, and important aspects of Perl culture. It's not intended to include things like ingenious snippets of code or what O'Reilly books about Perl are best (though these, too, are worthy subjects to contemplate).
|
- apotheon
CopyWrite Chad Perrin |
1. Perl is not an acronym
This is the top item on your list? On my list, that's at number 101,439. Right after "Learn to turn on computer".
It seems like an important item if you're going to converse with other Perl programmers. Calling it PERL instead of Perl is a quick way to get marked as a know-nothing newbie — which is doubly problematic if you're an author of programming instructional texts. I've seen an awful lot of Perl books wherein the author kept calling it PERL.
Then again, you're right: this item is pretty frivolous, in the grand scheme of things.
|
- apotheon
CopyWrite Chad Perrin |
thor
The only easy day was yesterday
No, not that again. There's a big difference between saying it's important to know Perl is not PERL if you're going to participate in the Perl community and saying I wouldn't hire someone for not knowing Perl is not PERL. I suspect I'd be less likely to hire someone that called it PERL on a resume, but not because of the capitalization — it's probably the case that the qualities I'd look for in a candidate would match up with the likelihood of knowing that Perl is not an acronym, though.
In other words, I wouldn't mark you down for not knowing it's properly Perl instead of PERL, but I have a sneaking suspicion that calling it PERL is more likely to be accompanied by a lack of qualities I'd like (such as involvement in open source development communities).
In any case, the point of this is that it's something people should know to get along in the Perl community, not an indictment of people who don't know them. Everybody starts out ignorant: knowledge and wisdom come with (not necessarily formal) education and experience.
|
- apotheon
CopyWrite Chad Perrin |
That's why, as a candidate, I usually take my own copies of my CV to interviews, to replace the broken one the agent has given them.
Good point.
As I said, the spelling of PERL on the resume wouldn't in and of itself cause me to pass over a candidate, but your point about recruiters might be an important point to keep in mind.
|
- apotheon
CopyWrite Chad Perrin |
Though I've seen a lot of low-quality "Perl" books, I've never seen a high quality "PERL" one. It's worth remembering.
Cheers,
Ovid
New address of my CGI Course.
no, i'm not paranoid - it's just that they really are out to get me ;-D
%man perl
NAME
perl - Practical Extraction and Report Language
You're right... only an idiot would consider that an acronym. No wait -- only a perl cultist wouldn't consider that an acronym, because that's exactly what it is, and only severe abuse of one's perceptions and/or the English language itself can change that.
However, not everyone is a perl cultist. Some of us don't believe in abusing the honorific case in English to distinguish a language from its implementation, because unlike gods and royalty, a mere programming language doesn't deserve genuflection, and to those speakers of the English language, "Perl" is only correct if capitalized at the beginning of a sentence.
I realize this place is called Perl Monks, and that some people think it grants them license to turn into foaming-at-the-mouth perl fanatics, but seriously -- denying the very truth before your eyes is the mark of a cultist, not a rational human free of the taint of religion.
Perl *is* an acronym, because of the very fact it's been billed as such for well over ten years. Just because Pope Larry has decided to pull back from some of his past decrees doesn't undo the past.
Drop the religious fervour; it's not serving any purpose. Be rational, and focus on writing quality code in the language, and follow the original rules for the English language, not PerlSpeak.
I can only assume you're unfamiliar with the term "backronym" — which is what Perl is, rather than a proper acronym. If you don't believe me, ask Mr. Wall.
In fact, if you're interested in getting this all "right", you might wish to refer back to the writings of Mr. Wall, the inventor of Perl and the man who coined the term Perl in the first place. It might also be of interest to note that the term perl, with a lower-case first letter, arose only because that's what the binary executable Perl parser is called in most unix-based implementations.
Perl is a proper name, as are Ruby, Python, Java, Lisp, Prolog, and a slew of other languages — no matter what their various parser binaries are called.
For purposes of knowing the proper terms, it helps to know the history of the terms.
|
- apotheon
CopyWrite Chad Perrin |
Just because Larry carefully rebranded perl as an acronym for marketing purposes doesn't make it any less an acronym. We don't know all the internal names the makers of lasers, sonar, or scuba gear pitched before the acronyms were popularized: yet those words are and remain acronyms.
Ten years ago, perl was being aggressively and loudly marketed under the acronym "Practical Extraction and Reporting Language" -- it was a selling point to encourage people to switch from shell scripts, which is largely what perl was written to replace. The fact that early versions of perl didn't have the acronym was a tiny sentence buried deep in man pages as a minor historical footnote.
For well over a decade (at *least* since I first read the perl man pages back in 1994, if not before then), perl was billed, loudly, as an acronym for "Practical Extraction and Reporting Language": now that its associations with shell scripting are no longer considered cool, Larry is trying to flip-flop back to the old name. That part is fine, I guess, but claiming that perl is suddenly no longer an acronym is NOT okay.
That's revisionist history, and it's flat out wrong. Perl is still defined as an acronym in the canonical man pages that describe the language, so to claim otherwise is just plain silly.
Perl is a proper name, as are Ruby, Python, Java, Lisp, Prolog
Uh huh. So the concept of Perl gets to be a proper name, but the tangible instantiation doesn't? We don't do that with any other noun in English. I don't drive a "ford taurus", which is then deemed to be a specific instantiation of the platonic ideal of a "Ford Taurus". If anything, a proper noun implies a specific instantiation of a more general concept, not the other way around. We name specific children; but we don't consider 'child' a proper noun. Yet perl zealots keep harping on the distinction between the Holy Abstraction of Perl (which is capitalized, presumably as an honourific), and the lowly, bug-ridden instantiation (which apparently doesn't merit one).
We write popular acronyms in lowercase, without the periods: thus "perl", not "P.E.R.L."; just as we do with laser, maser, sonar, or scuba.
For purposes of knowing the proper terms, it helps to know the history of the terms.
And history says that perl stands for "Practical Extraction and Reporting Language"; that the definition was changed very early in the history of perl, and the very first description you got, and still get when you look up the meaning of the language has been, and still remains, an acronym.
In the perl man-page, since Perl 1, it has included both the backronyms Practical Extraction and Report Language and later Pathologically Eclectic Rubbish Lister in the BUGS section. In fact Perl never stood for either, and both backronyms were due to Larry Wall himself, before he ever released his creation on an unsuspecting world.
Larry doesn't care much about the issue either way. If you find the language useful, then great. If you don't, well OK. But he doesn't care to argue about what you call it.
It is other people who have made a big issue over this. I'm not sure how it came to be, but knowing this piece of trivia has become essentially a "secret handshake" to identify who has been a part of online Perl communities, and who has not. This is useful enough that I use it that way myself, even though I don't care how you call it.
As for the fact that Perl gets to be a proper name while the instantiation doesn't, blame Unix. That the Perl executable is called perl is due to Unix convention, and since Unix is case-sensitive, it is very important to capitalize that correctly. The language itself is capitalized because by convention most language names are capitalized. For example we write Python, Visual Basic, Ruby, Lisp (even though that really was an acronym), and so on.
But note that those languages that have Unix implementations have lower-case executables. Therefore at heart Perl/perl is no more a violation of standard English capitalization rules than Python/python, Ruby/ruby, Sendmail/sendmail, Mozilla/mozilla and many other examples.
Speaking of Platonic ideals: you seem to have a strange idea about how the English language works. Sure, English actually tends to conform to a (very complex) set of rules, but many of those rules are exceptions to other rules, and even those excepting rules have exceptions much of the time. Thus, your half-baked attempt to retrofit terms that are properly unix jargon to your vision of the Platonic ideal of the English language is not only misguided in its ignorance of subcultural jargon rules but also in its impression of English as being a pure extrapolation from some set of inviolable and unvarying rules. The Ford Taurus is not, in instantiation, called the ford taurus because that's the way the rules of English treat brand names for cars in common parlance, while Perl implementations are called perl in unix systems because that's the way the rules of unix-hacker jargon as a subset of English treat executable binaries.
I find it ironic that you're trying to argue that Perl is properly perl because of your gymnastic feats in an attempt to contort the term's history to fit your Platonic ideal of the English language, all while violating the rules of English every few sentences in use of very simple, undisputed rules of syntax and grammar such as possessives, sentence structure, punctuation, et cetera. Normally, I don't pick on the spelling, grammar, and other errors of English usage when disagreeing with someone, but since you're claiming everyone but you is wrong about how the English language is used it seems not only fair game but a highly relevant point. How can you instruct the rest of us in the use of the English language and application of its rules when you do not even know them yourself?
You furthermore contradict yourself, claiming that Perl (or "perl" as you'd have it) was an acronym first, then go on to say that Larry Wall "flip-flopped back to the old name" (emphasis mine), thus effectively conceding the point that it's properly a backronym rather than strictly an acronym.
While your so-called history of Perl's purpose is essentially irrelevant to the discussion at hand, it's worth noting that Perl was, from day one, apparently far more than merely a replacement for shell scripts. It was a replacement for a great many things, including shell scripts, sed and awk, C for system administration, and probably half a dozen other things besides.
I recommend Wall's State of the Onion addresses if you want to know more about the early history of Perl. It seems like every one of them gives up some new tidbit of information on his early motivations and decisions.
EDIT: While it's not really all that big a deal to me, or even most people, how you choose to spell it or why you make that decision, making wildly inaccurate claims about how it really is spelled according to your own hasty generalizations and other logical fallacies just begs for corrections. I don't care if you call it "perl" rather than "Perl", but telling me I'm wrong for calling it "Perl" because the Ford Taurus isn't called a "ford taurus" when it's sitting in my driveway isn't going to convince me you know what you're talking about.
|
- apotheon
CopyWrite Chad Perrin |
Subcultural jargon is a corruption of the English language: it HAS no rules. It's an exception to the rules.
Your ad hominem attacks on someone who is patently right just underscore the real truth: you're trying to form an elitist group.
The manual page lists the acronym as a correct usage of perl. You say it's not. You're wrong. The man page is definitive; that's its purpose.
You can invent nonsense words like backronyms; you can bitch about my spelling or grammar or other trivialities, but the truth of the matter is you are wrong. You know it, I know it, everyone knows it.
I'm sick of people like you, who are just looking for a reason to sneer at other people who don't agree to play by stupid rules of an elitist subculture of arrogant twits.
The manual page lists the acronym as a correct usage of perl. You say it's not. You're wrong. The man page is definitive; that's its purpose.
If the man page is definitive, then here's the definitive definition which overrides all others:
Perl actually stands for Pathologically Eclectic Rubbish Lister, but don't tell anyone I said that.
If you want to take the manpage as the authoritative reference, you should learn to read and learn something about context.
In addition, there's hardly anything "elitist" about me offering advice about how best to appear knowledgeable when speaking with others while explaining that I, for one, don't care whether you capitalize the first letter or not.
Furthermore, I didn't invent the term "backronym". It's a portmanteau of "back" and "acronym" that was coined in 1983, four years before the existence of Perl v1.0, that is a synonym for the term "back formation" as used by linguists to refer to terms that are essentially counterfeit acronyms.
I think your claim that "everyone knows" that I'm "wrong" is patently absurd, but you're welcome to it. I sure wouldn't want to be associated with the unironic conception of such an asinine pronouncement.
In any case, I think I've had enough of this.
|
- apotheon
CopyWrite Chad Perrin |
Perl source files (for scripts, not modules) should be named like whatItDoes.pl and checked in to source control. And you should have a script that "installs" your Perl scripts so that you can run them. That install script should do the following:
Though certainly not necessarily in that order, or even as that many steps.
- tye
Good list, but for
rename the file so it no longer ends in ".pl"
I think I'd prefer to create a link (hard or soft, whichever you feel more comfortable with) from whatItDoes to whatItDoes.pl, leaving the original file as it is. Only for those OSes that can do links of course.
Are you saying you'd link the executable to some working copy of the *.pl file that you have checked out (so that you can't edit without affecting those using it) or to the *.pl inside the revision control system (may not be possible, prevents you from checking in unfinished changes) or that you'd put *.pl files into your "bin" directory? I wouldn't put *.pl files into a "bin" directory, since it tempts the use of the *.pl file which would break when the script gets reimplemented or wrapped in something and it just adds clutter, IMO.
- tye
Yes, I would put the whatever.pl file in the bin directory, and almost precisely for the reason you give against it :-). This makes it possible for me to use whatever.pl directly in a script or system if I want that particular implementation. If you want to place a wrapper around it or reimplement it as "whatever", then that's fine, go ahead and I'll still be using the specific implementation I wrote my script against (and of course, if you're the admin there's nothing stopping you from linking whatever.pl to whatever, thereby forcing me to use the new implementation). Normal users will use whatever anyway and thus benefit from implementation changes directly.
Yes this can be abused, but it provides maximum flexibility. Polite requests over shotguns and all that.
If there is a need to support multiple versions, then I support multiple versions, each regardless of what it happens to be implemented in. Only supporting multiple versions when changing the language that something is implemented in seems of very limited value. So I find not publishing the *.pl name to be both more flexible and less confusing.
- tye
I tend to "install" working copies of Perl programs normally, renamed without a file extension. When I'm working on code, though, I tend to keep it in a /home/username/src directory with a file extension, and softlink to it from a /home/username/bin directory with a shorter name (no file extension) so that I can test-run scripts more easily.
I guess maybe I'm somewhere between the two of you on this one.
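For anyone who wants to try the src-plus-softlink arrangement described above, here is a rough sketch in Perl. It is only an illustration under assumptions: a temporary directory stands in for /home/username, whatItDoes.pl is the placeholder name already used in this thread, and it needs an OS where symlink is available.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use File::Basename qw(fileparse);

# A throwaway directory stands in for /home/username in this sketch.
my $home = tempdir(CLEANUP => 1);
my $src  = File::Spec->catdir($home, 'src');
my $bin  = File::Spec->catdir($home, 'bin');
mkdir $_ or die "mkdir $_: $!" for $src, $bin;

# whatItDoes.pl is the placeholder script name from the thread.
my $script = File::Spec->catfile($src, 'whatItDoes.pl');
open my $fh, '>', $script or die "open $script: $!";
print {$fh} "#!/usr/bin/perl\nprint qq{hello\\n};\n";
close $fh or die "close $script: $!";
chmod 0755, $script or die "chmod $script: $!";

# Link bin/whatItDoes -> src/whatItDoes.pl, dropping the extension
# from the link name only; the source file keeps its .pl suffix.
my ($name) = fileparse($script, qr/\.pl\z/);
my $link   = File::Spec->catfile($bin, $name);
symlink $script, $link or die "symlink $link: $!";

print "$link -> ", readlink($link), "\n";
```

The source file keeps its extension for the editor and for revision control, while the name on the PATH stays implementation-neutral.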
|
- apotheon
CopyWrite Chad Perrin |
(sigh)
How about 10 things culled haphazardly and in no particular order from either my scratchpad or Selected Best Nodes:
-xdg
Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
I think you omit a number of things that are of crucial importance
Goodness! I'd better stop. I'm up to eleven already.
• another intruder with the mooring in the heart of the Perl
Know how to comment out large slabs of code easily.
Or better yet, teach the editor to do 1, 2 and 3 (and the inverse) as a macro.
And for those without a decent editor wondering what to do, I assume the original reference was about using Pod:
=begin comment

# lots of code here

=end comment

=cut
-xdg
Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.
re: "10. It's OK to reinvent the wheel sometimes;"
I'd say that's generally only true if you can't get at the source for the wheel. Really, wheels should usually be improved rather than reinvented anew. One might also consider whether wheels simply need new tires (wrappers) rather than reinvention, though the outer interface for the wheel might need to be redesigned to accept your new tire design.
There are exceptions to every rule, including the "use strict and warnings" rule — particularly in Perl culture — but I don't think that invalidates the rule particularly.
|
- apotheon
CopyWrite Chad Perrin |
Just because someone else has written a piece of crap, doesn't mean that I have to use it.
In fact I'd say that one of the critical programming skills that few develop is being able to properly decide whether to reuse or ignore a particular wheel. It isn't an easy decision. And it certainly isn't as simple as saying, Always do _____.
The advice to always reuse wheels is good advice to give beginners exactly because they are beginners. Wheels that they're likely to hear about from experienced programmers are always going to be better than what they can write for themselves. But good programmers have a harder decision to make. Because if you actually are good, you probably can create better wheels than a lot that are in use out there. The question then becomes whether it is worth the time and energy to do so. Usually it is not, but sometimes it clearly is.
Someone writing a "piece of crap" hasn't really invented a wheel, though — he's only invented a "piece of crap". If someone presented a 4x12 rectangle on an axle, I wouldn't call it an invented wheel.
I think, at this point, we're really not disagreeing in principle — only in phrasing.
|
- apotheon
CopyWrite Chad Perrin |
"Perl Programmer" sounds better than hacker ;-)
1. Always use strict.
2. Always use warnings.
3. Always use diagnostics during development.
4. KISS.
5. Brute Force programming is not always a bad idea. Elegant code is often slower and harder to maintain.
6. White space is your friend.
7. Comments are an even better friend.
8. CPAN is the best friend you'll ever have.
9. When you've programmed yourself into a corner, sometimes the quickest fix is to trash everything and start from scratch.
10. Assume that the person who will have to maintain your code is even more of a NOOB than you are.
Jack
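To make items 1 and 2 of the list above concrete, here is a minimal sketch of what the two pragmas buy you. The %size_of hash and the variable names are made up for illustration, and the warning is captured with a __WARN__ handler only so the effect can be counted instead of scrolling past on STDERR.

```perl
use strict;      # undeclared variables become compile-time errors
use warnings;    # flags suspicious constructs, e.g. undef used in arithmetic

# Capture warnings so the effect is visible without cluttering STDERR.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my %size_of = (alpha => 10);    # hypothetical data

# Careful version: supply an explicit default, so no warning is raised.
my $total = 5 + ($size_of{beta} // 0);

# Sloppy version: the missing key yields undef, which warnings reports.
my $missing = $size_of{beta};
my $sloppy  = 5 + $missing;

print "total=$total sloppy=$sloppy warnings=", scalar(@caught), "\n";
```

Both sums come out the same; the difference is that the sloppy one no longer fails silently. During development, adding use diagnostics expands such messages into longer explanations.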
When you've programmed yourself into a corner, sometimes the quickest fix is to trash everything and start from scratch.
|
- apotheon
CopyWrite Chad Perrin |
Brute Force programming is not always a bad idea. Elegant code is often slower and harder to maintain.
I agree with the first premise. The supporting reasons are less good. A complicated Schwartzian transform may well be impossible for a first-year Perl hacker to read but it's 100 times easier to read than the 2 screenfuls of code it replaces if you know what it is. Elegant, to me, means better, concise, more lucid, not too tricky to follow.
Jack
"Tricky" or "clever" is not always elegant. I tend to think of elegance as eschewing the gratuitous. That would mean writing clever code for the sake of being clever does not fit my definition of "elegance".
|
- apotheon
CopyWrite Chad Perrin |
A Schwartzian transform in Perl is always more complex than a straightforward sort block. It is good to know about it because it can be lots faster, but it is more complex.
If you do not understand this, then you don't really understand the Schwartzian transform.
(Note that I have to say "in Perl" because other languages, for instance Python and Ruby, have implemented shortcuts to make Schwartzian transforms simpler than the alternative.)
It's all right in front of you; in one sweep of your eye. The filtering, the modification, and the sorting (grep, map, sort, and maybe more). There aren't a couple sets of temporary variables and three or more blocks or subroutines to jump around and try to keep in your head at once. It's the same code condensed. I find it easier to read and much easier to debug.
So I do think the ST is a good example of elegance. I'm open to counter examples of what you consider Perl (not Python or Ruby of course!) elegance.
my @sorted_files
= map {$_->[0]}
sort {$b->[1] <=> $a->[1] or $a->[0] cmp $b->[0]}
map {[$_, -s]} @files;
Here is the same code written as a normal sort.
my @sorted_files = sort {-s $b <=> -s $a or $a cmp $b} @files;
Clearly the Schwartzian Transform is more complex. But if you have a list of 1000 files, it's also about 10 times faster. Which is why we learn it.
Now to explain my Ruby comment. In Ruby, arrays have a sort_by method. So in this example you'd write:
sorted_files = files.sort_by {|f| [- test(?s, f), f]};
and you've written the more efficient sort with less code than the regular sort. This does not work in Perl first of all because we don't have a sort_by method, and furthermore because Perl doesn't do anything useful when you try to sort array references. (Ruby sorts them lexicographically, with each field sorting in its "natural" way.)
We sort of have a sort_by method in Sort::Maker. But I've never liked the api. There is no denying that an ST is a scary sight to a new programmer not familiar with the idiom. I'd argue it should be wrapped in a sub for clarity most often.
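As a sketch of what wrapping the ST in a sub might look like: sort_by_key_desc below is a hypothetical helper name, not a core function and not the Sort::Maker API, and the %size_of hash stands in for real file sizes so the example runs without touching the filesystem.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a key-extraction code ref and a list,
# computes each key exactly once, sorts numerically descending on the
# key, and breaks ties by comparing the items themselves as strings.
sub sort_by_key_desc {
    my ($key_of, @items) = @_;
    return map  { $_->[0] }
           sort { $b->[1] <=> $a->[1] or $a->[0] cmp $b->[0] }
           map  { [ $_, $key_of->($_) ] } @items;
}

# Stand-in data instead of calling -s on real files.
my %size_of = ('a.txt' => 120, 'b.txt' => 4096, 'c.txt' => 120);

my @sorted_files = sort_by_key_desc(sub { $size_of{ $_[0] } }, keys %size_of);
print "@sorted_files\n";    # b.txt a.txt c.txt
```

The call site now reads as one line of intent, and the map/sort/map stack only has to be understood once, inside the sub.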
I like the PBP argument for using List::Util and List::MoreUtils functions such as any rather than idiomatic uses of grep. It makes your intention clear, and prevents you from thinking in syntax - no matter how comfortable and familiar that syntax is.
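A small side-by-side of the two styles, as an illustration only. The file names are invented, and any is imported from List::Util here (available from version 1.33 on); List::MoreUtils exports a function of the same name.

```perl
use strict;
use warnings;
use List::Util qw(any);

my @files = ('notes.txt', 'build.pl', 'README');    # made-up names

# Idiomatic grep: counts every match, then relies on the count being true.
my $has_script_grep = grep { /\.pl\z/ } @files;

# any(): states the question directly and can stop at the first match.
my $has_script_any = any { /\.pl\z/ } @files;

print "grep: ", ($has_script_grep ? "yes" : "no"),
      ", any: ", ($has_script_any ? "yes" : "no"), "\n";
```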
update: added some words for clarity
You should try Sort::Key from CPAN. It has a very simple API, and it's faster and uses less memory than any other Perl sorting technique:
use Sort::Key qw(keysort);
my @sorted = keysort { genkey($_) } @data;
Cheers,
JohnGG
You'll find time after time again that the straightforward sort block is simpler to write in Perl than the fancy sorts. OK, thinking carefully about it, there is one exception. And that exception is where the code to extract "what you want to sort by" is very complex, so that it is more complex to do it both for $a and $b than it is to do it once and have a Schwartzian Transform. But I don't think I've ever encountered that in real life. (Plus one can just move the complex logic into a function and call the function twice. With anonymous functions one can do it inline, and it will still be simpler than a Schwartzian Transform.)
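Roughly what "move the complex logic into a function and call the function twice" looks like in practice. The log lines and the duration_of name are invented for this sketch.

```perl
use strict;
use warnings;

# Hypothetical log lines; the sort key is the number after "took=".
my @lines = (
    'job=backup took=42s',
    'job=index took=7s',
    'job=report took=19s',
);

# All of the extraction logic lives in one named place...
sub duration_of {
    my ($line) = @_;
    my ($seconds) = $line =~ /took=(\d+)s/;
    return $seconds // 0;
}

# ...so the sort block stays a plain comparison, at the cost of
# calling duration_of() twice for every comparison.
my @by_duration = sort { duration_of($a) <=> duration_of($b) } @lines;

print "$_\n" for @by_duration;
```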
And, of course, someone who hasn't studied sorting tricks is always going to find the straightforward sort block far easier to read.
However the sort block executes more times than mangle/extract blocks do in the Schwartzian Transform or the Guttman-Rosler Transform. So the more work you move from the sort blocks to mangle/extract, the more time you'll save. The GRT is faster than the Schwartzian Transform because it uses a simpler data structure (a string), and so the sort block can be made even faster (in fact it is the default string compare).
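For comparison, a minimal GRT sketch of the size-descending, name-ascending sort. It assumes sizes fit in an unsigned 32-bit integer, and it uses an invented %size_of hash in place of -s so it runs anywhere. The three steps are split into separate statements purely for readability.

```perl
use strict;
use warnings;

# Stand-in for file sizes; assumed to be below 2**32.
my %size_of = ('a.txt' => 120, 'b.txt' => 4096, 'c.txt' => 120);

# Mangle: prefix each name with a 4-byte big-endian key. Subtracting the
# size from 0xFFFFFFFF makes larger files sort first as plain strings.
my @mangled = map { pack('N', 0xFFFFFFFF - $size_of{$_}) . $_ } keys %size_of;

# Sort: no sort block at all, just the default string comparison.
my @in_order = sort @mangled;

# Extract: drop the 4-byte key to recover the original name.
my @sorted_files = map { substr($_, 4) } @in_order;

print "@sorted_files\n";    # b.txt a.txt c.txt
```

Ties on size fall through to the name automatically, because the name is the rest of the string being compared.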
People think that this is cool because they are surprised that this change can have such big performance implications. But it is an optimization, and the code you get is more complex (at least in Perl).
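One way to see where the savings come from is to count how often the key extraction runs under each approach. This is an illustrative sketch with synthetic data and a made-up key_of function. The plain-sort count depends on how many comparisons Perl's sort happens to make, so no exact figure is claimed for it.

```perl
use strict;
use warnings;

# 50 synthetic items with distinct, scrambled numeric suffixes.
my @words = map { 'w' . (($_ * 37) % 101) } 1 .. 50;

# Counts every key extraction so the two approaches can be compared.
my $calls = 0;
sub key_of {
    $calls++;
    my ($n) = $_[0] =~ /(\d+)/;
    return $n;
}

# Straightforward sort: the key is extracted twice per comparison.
my @plain = sort { key_of($a) <=> key_of($b) } @words;
my $plain_calls = $calls;

# Schwartzian Transform: the key is extracted once per item.
$calls = 0;
my @st = map  { $_->[0] }
         sort { $a->[1] <=> $b->[1] }
         map  { [ $_, key_of($_) ] } @words;
my $st_calls = $calls;

print "plain sort: $plain_calls key extractions\n";
print "transform:  $st_calls key extractions\n";
```

The transform always extracts exactly one key per item. The plain sort extracts two per comparison, and the number of comparisons grows faster than the number of items.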
Update: hv noted that I'd written GST instead of GRT. Fixed.
Cheers,
JohnGG
If the data you need for sorting is there in front of you with no need for any "mangling", the "straightforward sort block" is simpler to write than the Schwartzian Transform. I was confusing the transforming done as a performance measure with other transformations to the data that are necessary because it is not yet in a sortable form.
If you do need to change the data in some way before the sort can take place then the Schwartzian Transform starts to gain in the straightforwardness factor, I think. The sort block method gains in readability because you can use meaningful variable names rather than $_; the Schwartzian Transform gains because the flow seems to me to be more obvious, albeit up the page which is counter-intuitive if you are used to piping commands in the shell. I tend to comment what is going on in the trans