(The purpose of the temporary file is to read data from, process it in some way, and then write it back over the top of the original file; i.e. I effectively want to edit the original file "in-place", but not all in memory at once in case it is too large.)
I was intending to use the File::Temp::tempfile() function to get a temporary file, and File::Copy to do the copying, but both of the following obvious ideas have problems:
my $tmpfh = File::Temp::tempfile(); File::Copy::copy($file, $tmpfh);
my($tmpfh, $tmpfile) = File::Temp::tempfile(UNLINK => 1); File::Copy::copy($file, $tmpfile);
For in-place editing, I suggest you look at the -i switch, or [Super Search] for 'inplace edit':
perl -pi.bak -e 's/this stuff/that stuff/g' some files
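The same effect can be had from inside a script by localizing $^I and @ARGV, which is what the -i switch sets up behind the scenes. The following is only a minimal sketch: the helper name edit_in_place and its callback argument are made up for illustration, and it assumes a non-empty backup extension is acceptable (on Win32 one is required).

```perl
use strict;
use warnings;

# Hypothetical helper: edit $file line by line, keeping the original
# as "$file$backup_ext" (for example "test.txt.bak").
sub edit_in_place {
    my ($file, $backup_ext, $transform) = @_;

    # While $^I is set, <> reads from the files in @ARGV and the default
    # output handle points at the replacement file.
    local ($^I, @ARGV) = ($backup_ext, $file);

    while (my $line = <>) {
        my $out = $transform->($line);
        print $out;    # goes to the new version of $file, not STDOUT
    }
    return;
}

# Usage (assumed file name):
# edit_in_place('some_file.txt', '.bak',
#     sub { my $l = shift; $l =~ s/this stuff/that stuff/g; $l });
```

Only one line is held in memory at a time, so this should cope with files that are too large to slurp.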
As for creating a temporary file, there is lots about that on [Super Search] too, but once you have a temporary file handle you can just use [open] and [<>], or [read] and [print].
You will find the guts of File::Temp in a 15 line function here [id://334072]
cheers
tachyon
Is there a reason why you want to copy the original file to process it rather than say, renaming it to some temporary name and outputting the results of your munging to a new file with the original name?
For example: Is it necessary that the original file be available to other processes whilst the munging is in progress?
The nice thing about renaming is that it is (under most circumstances) an atomic operation at the OS level, which closes off many possibilities for problems.
My idea was therefore to use the backup file as the temporary file, where a backup filename is given, or else use File::Temp to create one for me. I then copy to the temporary file, read from it and write back over the original, and then leave File::Temp to clean up the temporary file if one was created. (If a specified backup file was used instead, then it gets left afterwards, of course.)
I could start by renaming the original to the backup/temporary name instead, as you suggest, but where do I get the temporary name from? File::Temp returns an open filehandle - no good for renaming my original file to, hence I was looking to copy to it instead.
Actually, having read Re-runnably editing a file in place, I'm now thinking something along those lines would be better:
I could get a temporary filehandle, read from the original file, process the data and write to the temporary filehandle. Then I'd want to rename the temporary file to the original filename, but I don't know the temporary filename unless I ignore File::Temp's advice and pick up both the handle and the name. Maybe that's safe enough since I wouldn't be doing anything with the temporary filename except renaming it (and I therefore wouldn't want File::Temp to try to delete the temporary file either). (I'd have to create the backup file separately, rather than using it as the temporary file, in this scheme, of course.)
- Steve
I think I'd use something simple like:
my $file = ...;
my $n=0;
if( -e $file ) { ## Stop endless loop if $file doesn't exist
$n++ until rename $file, "$file.bak$n";
}
else {
die "$file doesn't exist";
}
my $backup = "$file.bak$n";
Anyway, as I said, the backup filename is supplied by the caller of this code if a backup file is required. My real concern is what the best way to achieve the in-place edit via a temporary file is, possibly taking advantage of the given backup filename if one is given.
I like the idea of writing the processed data to a temporary file and then moving that back (either (1) by a rename or (2) by copying the contents), rather than my original idea of moving/copying the file to be edited and then writing the processed data back to it, so that the process can be easily re-run if it failed the first time.
However, both options (1) and (2) above have problems:
Option (1) goes something like this (return values obviously need checking, and there are some chmod games that can be played too, but this is the bare bones of it):
use File::Temp qw(tempfile);
my $file = 'test.txt';
my($tmpfh, $tmpfile) = tempfile();
open my $fh, '<', $file;
binmode $fh;
while (<$fh>) {
# Process $_ here
print $tmpfh $_;
}
close $fh;
close $tmpfh;
rename $tmpfile, $file;
I can see two problems with that. Firstly, tempfile() was not called in scalar context, so the temporary file will not be cleaned up if the program is interrupted or killed. (A $SIG{INT} handler could arrange for it to be cleaned up if the program is interrupted, but not if it is killed.) Secondly, while the rename itself is (normally) atomic, there is a race condition between the close and the rename: somebody else could potentially modify the file in between.
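For reference, here is a sketch of option (1) with the return values checked and the temporary file removed if anything fails along the way. The helper name rewrite_via_rename and its callback are hypothetical. It also makes one assumption that is not in the code above: the temporary file is created in the same directory as the target (via DIR), so that the rename does not have to cross filesystems. It does not help if the program is killed outright, and it does nothing about the close/rename window.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Temp qw(tempfile);

# Hypothetical helper: write the processed lines of $file to a temporary
# file next to it, then rename the temporary file over the original.
sub rewrite_via_rename {
    my ($file, $process) = @_;

    # Assumption: same directory as $file, so rename stays on one filesystem.
    my ($tmpfh, $tmpfile) = tempfile(DIR => dirname($file), UNLINK => 0);
    binmode $tmpfh;

    my $ok = eval {
        open my $fh, '<', $file or die "Cannot open $file: $!\n";
        binmode $fh;
        while (my $line = <$fh>) {
            print {$tmpfh} $process->($line)
                or die "Cannot write to $tmpfile: $!\n";
        }
        close $fh;
        close $tmpfh or die "Cannot close $tmpfile: $!\n";
        rename $tmpfile, $file
            or die "Cannot rename $tmpfile to $file: $!\n";
        1;
    };
    unless ($ok) {
        my $err = $@;
        unlink $tmpfile;    # best-effort cleanup; may fail if still open on Win32
        die $err;
    }
    return 1;
}

# Usage (assumed file name):
# rewrite_via_rename('test.txt', sub { my $l = shift; $l =~ s/foo/bar/g; $l });
```

Note that the renamed file takes on the temporary file's permissions rather than the original's, which is where the chmod games mentioned above would come in.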
Option (2) looks like this (with the same caveats as before):
use Fcntl qw(:seek);
use File::Temp qw(tempfile);
my $file = 'test.txt';
my $tmpfh = tempfile();
open my $fh, '<', $file;
binmode $fh;
while (<$fh>) {
# Process $_ here
print $tmpfh $_;
}
close $fh;
seek $tmpfh, 0, SEEK_SET;
open my $fh2, '>', $file;
binmode $fh2;
print $fh2 $_ while <$tmpfh>;
close $fh2;
close $tmpfh;
This time, the temporary file's contents are written back to the original file without the temporary file having been closed, so there is no close/rename race condition. Also, tempfile() was called in scalar context so the temporary file will be cleaned up even if the program is killed (on Win32, at least, via the O_TEMPORARY flag that is used when opening the file). However, the process of copying the temporary file's contents back to the original file is no longer atomic, so if the program is interrupted during the final while loop then the original file will be left partially written.
So neither option is perfect. Which approach is the lesser of the two evils? Is there another approach with none of these pitfalls?
That doesn't look like a great way to choose a backup filename - the rename will succeed even for candidate backup filenames that exist (permissions permitting),...
Really? I'm pretty certain that I have never used a filesystem that, regardless of permissions, would allow you to rename one file on top of an existing one. Which filesystem are you using?
The perlfunc manpage entry for Perl's built-in rename() function says:
Changes the name of a file; an existing file NEWNAME will be clobbered.
and it's quite correct (I just tried it to make sure!).
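Since rename will clobber an existing file, one way to pick an unused backup name without relying on rename failing is to create the candidate with O_EXCL, which fails if the name is already taken. This is only a sketch: the helper name reserve_backup_name and the upper limit of 1000 candidates are arbitrary choices made for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Hypothetical helper: create the first unused "$file.bakN" as an empty
# placeholder and return its name.
sub reserve_backup_name {
    my ($file) = @_;

    for my $n (0 .. 999) {
        my $candidate = "$file.bak$n";

        # O_EXCL makes the create fail if $candidate already exists.
        if (sysopen my $fh, $candidate, O_WRONLY | O_CREAT | O_EXCL) {
            close $fh;
            return $candidate;
        }

        # Any failure other than "already exists" is a real error.
        die "Cannot create $candidate: $!\n" unless $!{EEXIST};
    }
    die "No free backup name found for $file\n";
}

# Usage (assumed file name): the placeholder now belongs to us, so
# clobbering it with the rename is intended.
# my $backup = reserve_backup_name('test.txt');
# rename 'test.txt', $backup or die "Cannot rename: $!";
```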
Any more thoughts on my temporary file issue?
- Steve
I really never knew that. How dumb. Both my assumption in not checking what I knew could never be so and the logic that makes me wrong. You'll have to decide for yourself which is dumber:)
It will be a while before I stop thinking about the logic that allows a [rename] function to become a "delete target and then copy over" command.
You could consider this.
#! perl -slw
use strict;
use Win32::API::Prototype;

ApiLink( 'kernel32', 'UINT GetTempFileName( LPCTSTR lpPathName, LPCTSTR lpPrefixString, UINT uUnique, LPTSTR lpTempFileName )' ) or die $^E;

my $tempFileName = ' ' x 254;
my $path = '.';
my $prefix = 'temp0000';
GetTempFileName( $path, $prefix, 0, $tempFileName ) or die $^E;
print $tempFileName;
After the above code has been run, an empty file with the name returned will have been created. You can then open and use it as you need to.
perlmonks.org content © perlmonks.org and Aragorn, BrowserUk, shay, tachyon