TIMTOWTDI Challenge: Open a file
Tanktalus
created: 2006-04-17 20:10:38

I was reading some old posts and came across "TMTOWTDI... and most of them are wrong" again. I saw that I disagreed with it at the time, and it got me thinking: how many ways really are there to do something? Thus, this challenge.

How many ways can we come up with to open a file?

I'm purposefully leaving this somewhat vague because there are many aspects to Perl and the problems we try to solve. Bonus points for explaining the pros and cons of your approach. And bonus points for obscurity. Part of the exercise is to broaden our minds and escape the boxes we've grown accustomed to. We'll assume that $rfile is a filename relative to the current working directory and $afile is a filename with an absolute path - which you use is up to you.

To start things off, here is the trivial, fairly obvious case:

open my $fh, $rfile, 'r';

We get the advantages of a lexical filehandle, in an explicit mode, and should be able to handle spaces in the filename, or other weird characters. We can be pretty assured that it will ignore special characters, such as < or | that may have been passed in. On the other hand, we can't handle those special characters transparently, such as when an advanced user wants to actually have us read from a pipe instead of a file.

Or, a more obscure case:

use File::Basename;
use Net::FTP;

my $ftp = Net::FTP->new('localhost');
$ftp->login($user, $password);
$ftp->cwd(dirname($afile));
$ftp->binary();
my $dataconn = $ftp->retr(basename($afile));

The $dataconn object can be used like this:

my $data;
while ($dataconn->read(my $buffer, 1024))
{
    $data .= $buffer;
}

print $data;

Advantage: none, I'm going for the obscure bonus points here. ;-) Well, that's not entirely true - this was actually quite enlightening to write and test on how to use [cpan://Net::FTP] to grab a file that I do not actually want to save to disk under its current name or even under any name. Disadvantage: really obscure for local files. A lot more code than is needed, plus the overhead of an FTP server that needs to be running.

I haven't covered the vast majority of trivial cases, but will let others do so. Also, if you have comments about pros/cons of any example, especially mine, please add that.

Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-17 20:24:46

The [sysopen] call is sometimes useful for its full range of open modes, and the ability to set file permissions when the file is created.

sysopen FILEHANDLE,FILENAME,MODE,PERMS

If you fool around with that, use Fcntl;. That lets you take your program to another machine without worrying about differing numeric values for the system constants.
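As a minimal sketch of the two features mentioned above (extra open modes and creation permissions), the following creates a file only if it does not already exist. The scratch directory and file name are assumptions made up for the illustration.

```perl
use strict;
use warnings;
use Fcntl;                          # exports O_WRONLY, O_CREAT, O_EXCL, ...
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # hypothetical scratch location
my $file = "$dir/new.txt";          # hypothetical file name

# O_EXCL makes the call fail if the file already exists; 0600 is the
# requested permission, which the process umask may reduce further.
sysopen(my $fh, $file, O_WRONLY | O_CREAT | O_EXCL, 0600)
    or die "Cannot create '$file': $!";
print {$fh} "created\n";
close $fh or die "Cannot close '$file': $!";

# A second attempt fails because the file is now there.
my $second = sysopen(my $fh2, $file, O_WRONLY | O_CREAT | O_EXCL, 0600);
print "second attempt ", ($second ? "succeeded" : "failed: $!"), "\n";
```

The O_CREAT | O_EXCL combination has no direct equivalent among the plain open modes, which is one reason to reach for sysopen.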

After Compline,
Zaxo

Re^2: TIMTOWTDI Challenge: Open a file
created: 2006-04-17 21:05:18
I use [doc://sysopen] when I want my scripts to play nicely with each other, and observe locking. eg:
use Fcntl qw(:DEFAULT :flock);

sysopen(OUT, $file, O_WRONLY | O_CREAT)
   or die "Cannot open $file for writing:$!\n";
flock(OUT, LOCK_EX)
   or die "Cannot get a lock on $file:$!\n";

(That example is pretty much straight out of the book)
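One reason this pairing works well: opening with '>' truncates the file before any lock is held, whereas sysopen without O_TRUNC leaves the content alone until the lock is granted. A sketch of that sequence, assuming a scratch file invented for the example:

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);   # hypothetical scratch location
my $file = "$dir/shared.log";       # hypothetical file name

# Open without truncating, so existing content survives until we hold the lock.
sysopen(my $out, $file, O_WRONLY | O_CREAT)
    or die "Cannot open $file for writing: $!\n";
flock($out, LOCK_EX)
    or die "Cannot get a lock on $file: $!\n";

# Only now is it safe to discard the old content.
truncate($out, 0) or die "Cannot truncate $file: $!\n";
print {$out} "new content\n";

# close flushes the buffer and releases the lock.
close $out or die "Cannot close $file: $!\n";
```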

Cheers,
Darren :)

Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-17 20:36:36
I can mess with both files at once. :-)
system( "$^X -pl -e 's/foo/bar/g' $rfile $afile" );
Alternately, if I absolutely have to provide an explicit iteration over the file, I could
exec "$0 $rfile $afile" unless @ARGV;
while (<>) {
    # Do whatever you want here.
}

My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-18 02:47:51

Here's one:

unshift @ARGV, $rfile;
while( <> ) {
  # ....
}

Or how about this:

use Tie::File;
tie my @array, 'Tie::File', $rfile or die;

Here's another variation on the @ARGV theme...

#!/usr/bin/perl -p
BEGIN{ 
    $rfile = 'myfile.txt';
    unshift @ARGV, $rfile;
}
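The @ARGV variations above change the global array for the rest of the program. A hedged sketch of a more contained version uses local, so the caller's arguments come back when the block ends. The sample file here is an assumption standing in for $rfile.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample file standing in for $rfile.
my ($tmp, $rfile) = tempfile(UNLINK => 1);
print {$tmp} "one\ntwo\n";
close $tmp or die "Cannot close '$rfile': $!";

my @lines;
{
    # local restores the caller's @ARGV when the block exits.
    local @ARGV = ($rfile);
    while (my $line = <>) {
        push @lines, $line;
    }
}

print scalar(@lines), " lines read\n";
```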

Update:
By the way, I won't attempt to justify any of these abominations other than to state the obvious; they're just for fun.


Dave

Re^2: TIMTOWTDI Challenge: Open a file
created: 2006-04-19 02:24:48

I'm pretty fond of the diamond operator and just feeding a filename to the script when it's called.

#!/usr/bin/perl

while (<>) {
    # do stuff
}

and . . .

$ script.pl filename

I guess it's the KISS principle applied to getting data from a file.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-18 02:49:13
To start things off, here is the trivial, fairly obvious case:
open my $fh, $rfile, 'r';
Should read
open my $fh, 'r', $rfile ;
Re^2: TIMTOWTDI Challenge: Open a file
created: 2006-04-18 06:59:59

open my $fh, 'r', $rfile ;

Should read

open my $fh, '<', $rfile;

And shame on those who upvoted the parent node... :-)
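For reference, a minimal sketch of the corrected three-argument form with error checking. The scratch directory and the odd file name are assumptions chosen to show that mode characters in the name are taken literally on a Unix-like system.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical setup: a scratch directory holding a file whose name
# contains '<' and spaces.
my $dir   = tempdir(CLEANUP => 1);
my $rfile = "$dir/< odd name.txt";

# '>' creates or truncates; the mode is a separate argument, so the
# filename is never parsed for mode characters.
open my $out, '>', $rfile or die "Cannot write '$rfile': $!";
print {$out} "hello\n";
close $out or die "Cannot close '$rfile': $!";

# '<' opens read-only.
open my $fh, '<', $rfile or die "Cannot read '$rfile': $!";
my $line = <$fh>;
close $fh;

print $line;
```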
Re: TIMTOWTDI Challenge: Open a file
xdg
created: 2006-04-18 10:09:23

Object-oriented (adapted from [mod://IO::File] synopsis):

use IO::File;

$fh = IO::File->new("< file");
$fh = IO::File->new("file", "r");
$fh = IO::File->new("file", O_WRONLY|O_APPEND);

-xdg

Code written by xdg and posted on PerlMonks is [http://creativecommons.org/licenses/publicdomain|public domain]. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re^2: TIMTOWTDI Challenge: Open a file
created: 2006-04-27 19:30:09

oooo....i like that...that's pretty...and clean.

my question is...what kind of modules are there for operating systems that don't support file locking?

I know there's quite a few methods for checking and adapting for such, but has anyone put together a module encompassing these routines?

meh.
Re^3: TIMTOWTDI Challenge: Open a file
xdg
created: 2006-04-27 20:25:04

According to perlport, flock is available on WinNT (and later). You should be able to check Config for d_flock.

> perl -MConfig -le "print $Config{d_flock}"
define

No guarantees on how good that is for protecting against anything other than other Perl programs using flock. You might have to use Win32API::File for more direct control.
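The same check can be made from inside a script. This is only a sketch: it reports what the perl binary was built with, not how well locking behaves on a particular filesystem.

```perl
use strict;
use warnings;
use Config;

# $Config{d_flock} is 'define' when this perl was built with flock support.
my $has_flock = defined $Config{d_flock} && $Config{d_flock} eq 'define';

my $msg = $has_flock
    ? "flock is available\n"
    : "flock is not available; another locking scheme is needed\n";
print $msg;
```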

-xdg


Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-18 10:32:32
use Win32::OLE;

my $fso = Win32::OLE->new('Scripting.FileSystemObject');
my $ts = $fso->OpenTextFile($rfile);

while (!$ts->AtEndOfStream) {
  print $ts->ReadLine,"\n";
};
Advantage: None. It's just another WTDI, although you can also do things like BuildPath with $fso and get DateCreated, DateLastAccessed, DateLastModified of files.

Disadvantage: Only works with Win32.

Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-18 16:04:06
Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-18 17:07:53
syscall()
Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-19 01:07:36

TIMTOWTDI, true. Most of them may not be *wrong*, necessarily, but many of them aren't great. They may be clumsy, difficult to use, so obscure that they're nearly impossible to maintain (even the author might not remember quite what he/she intended, a few months down the pike), security holes, inefficient, and so on.

I suspect that while TIMTOWTDI, there is usually a very limited number of optimum ways to do it - often that number being one.

Re^2: TIMTOWTDI Challenge: Open a file
created: 2006-04-19 01:39:57

Well, I agree that though there are many ways to do things, there are occasionally only a few (or a single) good way(s) to do it. And on the face of things, it would seem obvious that when you want to open a file, you just open it.

I wouldn't want to advocate difficult to read code, and as I mentioned in my own earlier post in this thread, this really is just for fun. But the mental exercise is valuable, I believe. There is nothing wrong with exploring the corners of the language, as long as you take the time to learn why, and why not. A great orator may command a mastery of spoken language that would make doctors and attorneys blush. And certainly he may go to great lengths to personally explore the corners of his language of communication. But in composing a speech, he is going to select words and constructs from his tool chest of expression that meet the needs of his audience, in level of education, field of specialty, as well as degree of entertainment, appropriate level of formality, and efficiency of communication of important ideas.

In coding, there are many such factors to keep in mind. Maintainability (usually but not always an issue), corporate culture for coding style (sometimes, sometimes not an issue), conciseness, efficiency, and so on. The truth is, there may even be an appropriate time to unshift a filename into @ARGV and tap into the diamond operator's power. That's not usually the clearest way to open and read a file. But nobody should go so far as to say it's never appropriate.

...just some food for thought. Now back to the fun at hand.......


Dave

Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-19 01:58:25
open(FILE, "cat $file |") or die "Cannot read '$file': $!";
The big advantage of this method shows up when you want to deal with large files with a Perl that doesn't understand them, while cat does. (In that case opening the file directly only lets you read the first 2 GB, while opening it this way allows you to fetch through the whole file.)

I helped a number of people with this trick back in the day, but its time is long past now. Today you can compile Perl to understand how to handle large files (which you couldn't easily do with, say, Perl 5.004), and this setting is on by default.

The disadvantages are numerous. You require an extra process, you have to worry about escaping special characters in the filename, error handling is more complex, etc, etc, etc. The only remaining advantage that I can think of for this method is that the additional parallelism might allow you to reduce the effect of disk latency if you were doing simple processing of a large file off of a slow disk. But that should only make a big difference if I/O time was close to processing time, and I'd need to see a benchmark before I'd believe it really was helping.
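The escaping worry, at least, can be sidestepped with the list form of a pipe open, which hands the filename to cat as a single argument without involving a shell. A sketch, assuming a Unix-like system with cat on the PATH and a sample file invented for the example:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample file; its name could contain spaces or shell
# metacharacters without causing trouble here.
my ($tmp, $file) = tempfile(UNLINK => 1);
print {$tmp} "line 1\nline 2\n";
close $tmp or die "Cannot close '$file': $!";

# With '-|' and a list, perl runs cat directly with $file as one argument;
# no shell sees the filename.
open(my $fh, '-|', 'cat', $file) or die "Cannot run cat on '$file': $!";
my @lines = <$fh>;

# For a pipe, close also reports the child's exit status in $?.
close $fh or die "cat failed on '$file': status $?";

print scalar(@lines), " lines read through cat\n";
```

The extra process and the more involved error handling remain, so this only removes one of the disadvantages listed.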

Re: TIMTOWTDI Challenge: Open a file
created: 2006-04-20 11:52:28

What? No one's mentioned the swiss army chainsaw of IO operations, [cpan://IO::All] ?

From the docs:

use IO::All;
my ($io, $contents);
$io = io 'file.txt';
$contents < $io;
And if you want to write to the file, you've got that too:
# manipulate $contents to write it back out:
$contents > $io;
$some_extra_bit >> $io;

Ignoring abominations like that module, there's also this trick:
  my $filename = 'thefile.txt';
  my $pid = open(FH, "-|");
  if (not defined $pid) {
    die "cannot fork: $!; bailing out";
  }
  if (! $pid) {
    exec('/bin/cat', $filename);
  }
  while (<FH>) { ... }
That trick is actually useful if you want to do more than open a file, e.g. if you want to run some external program that needs a special environment set up before you invoke it, like tying STDERR to STDOUT, or ensuring that STDIN is closed, or that the program is disassociated from the controlling terminal via a call to POSIX::setsid. (Java processes that you want running in the background are sometimes picky about being completely disassociated from the controlling terminal.)
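A sketch of the "tying STDERR to STDOUT" case, under the assumption that fork is available. The child program is a hypothetical stand-in: a perl one-liner that writes to both streams.

```perl
use strict;
use warnings;

# Fork with an implicit pipe: the parent reads what the child writes to STDOUT.
my $pid = open(my $from_child, '-|');
die "cannot fork: $!" unless defined $pid;

if (!$pid) {
    # Child: arrange the environment before exec replaces this process.
    open(STDERR, '>&', \*STDOUT) or die "cannot dup STDOUT: $!";
    # Hypothetical stand-in for the real program.
    exec($^X, '-e', 'print "out\n"; print STDERR "err\n";')
        or die "cannot exec: $!";
}

# Parent: both streams arrive on the one handle.
my @seen = <$from_child>;
close $from_child;

print "captured: $_" for @seen;
```

The order in which the two lines arrive depends on buffering in the child, so the parent should not rely on it.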
--
@/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/;
map{y/X_/\n /;print}map{pop@$_}@/for@/

perlmonks.org content © perlmonks.org and ambrus, Anonymous Monk, apotheon, davido, dhoss, dragonchild, fizbin, Juerd, McDarren, spiritway, Tanktalus, tilly, xdg, Zaxo
