Move matched to top, bottom
quinkan
created: 2006-05-02 18:36:42
I had a need to "float" entries matchable with a regex to the bottom of an array or list, preferably without altering the order of other entries. I played for a while with split and splice, but reached the conclusion that the following works for my purposes...

my @rawentries = (
"1topEndbit",
"2ndbit",
"endBit3",
"4thbit",
"5thbit",
"6OtherEndbit",
"7Bitsecondfromend",
"8Bitoneoffend",
"9Anyoldbit",
);

sub ToTop {
       my $t1 = ($a =~ m/Endbit/i)? 0:1;
       my $t2 = ( $b =~ m/Endbit/i )? 0:1;
       $t1 <=> $t2;
}

sub ToEnd
{
       my $t1 = ($a =~ m/Endbit/i)? 1:0;
       my $t2 = ( $b =~ m/Endbit/i )? 1:0;
       $t1 <=> $t2;
}
my @toendlist = sort ToEnd @rawentries;
my @totoplist = sort ToTop @rawentries;

print (join "\n", @toendlist);
print "\n\n".join ("\n", @totoplist)."\n";
Two questions arise:

  1. Can this sort usage be relied upon?
  2. Am I missing obvious alternative(s) ?
Re: Move matched to top, bottom
created: 2006-05-02 18:51:26

The following may appeal:

use strict;
use warnings;

my @rawentries = (
"1topEndbit",
"2ndbit",
"endBit3",
"4thbit",
"5thbit",
"6OtherEndbit",
"7Bitsecondfromend",
"8Bitoneoffend",
"9Anyoldbit",
);

my @sorted;
my @last;

/Endbit/i ? push @last, $_ : push @sorted, $_ for @rawentries;
push @sorted, @last;
print join "\n", @sorted;

Prints:

2ndbit
4thbit
5thbit
7Bitsecondfromend
8Bitoneoffend
9Anyoldbit
1topEndbit
endBit3
6OtherEndbit

DWIM is Perl's answer to Gödel
Re^2: Move matched to top, bottom
created: 2006-05-02 20:39:20
Thanks for that. Looks a good deal faster, too (which I guess is to be expected when a sort is involved, particularly if we start with initially-sorted or close-to-sorted data -- even with median partition selections).

Oops ! I see 5.8 has changed sort to allow a change from the quicksort to mergesort, so median-of-three need not be the point.. (Thanks for the pointer, Tanktalus.)

Re: Move matched to top, bottom
created: 2006-05-02 20:41:46
Can this sort usage be relied upon?

No, not entirely. You're asking that no other entries change places with respect to each other, except on the criteria of matching some regex (in your example, /Endbit/i). That requires a stable sort, which perl does not entirely guarantee in all circumstances. Look at the [doc://sort] documentation and search for the word 'stable' to see what this means, and how you can get a stable sort at all times.

Am I missing obvious alternative(s) ?

I think the most obvious solution is just to seperate out all the entries into two arrays, and then merge them together afterwards. Something like this:

my (@match, @notmatch);
for (@rawentries)
{
  if (/Endbit/i) {
    push @match, $_;
  } else {
    push @notmatch, $_;
  }
}
my @newlist = (@notmatch, @match); # the ones that match float to the bottom
That's not to say this is the most efficient algorithm - but you asked for obvious ;-)

Re: Move matched to top, bottom
created: 2006-05-02 21:31:10
This should do it .. read the comments backwards.
print join "\n",
@rawentries[   # array slice to turn the indices back to the values (could use map, too)
    sort {
        ($rawentries[$a] =~ /Endbit/i) <=> ($rawentries[$b] =~ m/Endbit/i)  # sort first using the index's array value matching against a regex and sort against the return. non-matches go higher; matches go lower
        || $a <=> $b     # default to just sorting the index values to maintain order w/in bit/not-bit chunks
    } 0 .. $#rawentries   # sort the indices
]
;
(note that this could be a good candidate for a schwartzian transform)
Re: Move matched to top, bottom
created: 2006-05-03 04:58:23
As per davidrw's suggestion, here is a Schwartzian Transform to do the same thing.

use strict;
use warnings;

my @rawentries = (
   "1topEndbit",
   "2ndbit",
   "endBit3",
   "4thbit",
   "5thbit",
   "6OtherEndbit",
   "7Bitsecondfromend",
   "8Bitoneoffend",
   "9Anyoldbit");

print
   map  {"$_->[0]\n"}
   sort
   {
      $a->[1] <=> $b->[1] ||
      $a->[2] <=> $b->[2]
   }
   map
   {
      [
         $rawentries[$_],
         $rawentries[$_] =~ /Endbit/i ? 1 : 0,
         $_
      ]
   }
   0 .. $#rawentries;

When run it produces

2ndbit
4thbit
5thbit
7Bitsecondfromend
8Bitoneoffend
9Anyoldbit
1topEndbit
endBit3
6OtherEndbit

Cheers,

JohnGG

perlmonks.org content © perlmonks.org and davidrw, GrandFather, johngg, quinkan, Tanktalus

prlmnks.org © 2006 edmund von der burg (eccles & toad)

v 0.03