Sorting path like strings
BUU
created: 2006-01-05 03:04:53
I have a series of strings that look much like:
0/1/2/3
0
0/4/5/6
0/4
0/1
0/1/2
0/4/5
0/10/111/145
0/10/111
0/10
Now I'm trying to sort them. The correct order is:
0
0/1
0/1/2
0/1/2/3
0/4
0/4/5
0/4/5/6
0/10
0/10/111
0/10/111/145
Perl's default sort using [cmp] comes very close, but it places 0/10 before 0/4, and it should not. I've found a really ugly solution involving a couple of for loops and such things, but I was really hoping for anything faster and cleaner. Any ideas?
Re: Sorting path like strings
created: 2006-01-05 03:45:10
sub do_the_sort {
  my @aa = split /\//, $a;
  my @bb = split /\//, $b;
  for ( 0 .. $#aa ) {
    return 1 unless defined $bb[$_];
    my $zz = $aa[$_] <=> $bb[$_];
    return $zz if $zz;
  }
  -1;
}

my @y = sort do_the_sort @x;
Boris
Re^2: Sorting path like strings
created: 2006-01-05 11:49:33
[borisz],
I think the following recursive implementation does basically the same thing - don't ask me how I came up with it.
my @data = map {chomp; [ split '/' ] } ;
@data = map { join '/', @$_ } sort my_sort @data;

sub my_sort {
    my $i = shift || 0;
    if ( defined $a->[$i] && defined $b->[$i] ) {
        return $a->[$i] <=> $b->[$i] || my_sort(++$i);
    }
    return defined $a->[$i] ? 1 : -1;
}

Cheers - [Limbic~Region|L~R]

Re: Sorting path like strings
created: 2006-01-05 03:46:15

If your numeric values are all less than 255, then this works. You could swap the ST to a GRT or OM if performance is a concern. If your numbers get larger than 255 you could probably use unicode pack 'U*', but I haven't tested that.

@data = qw[
  0/1/2/3 0 0/4/5/6 0/4 0/1 0/1/2
  0/4/5 0/10/111/145  0/10/111 0/10
];;

print for map{ 
    $_->[0]
} sort{ 
    $a->[1] cmp $b->[1] 
} map{ 
    [ $_, pack 'C*', split '/', ] 
} @data;;

0
0/1
0/1/2
0/1/2/3
0/4
0/4/5
0/4/5/6
0/10
0/10/111
0/10/111/145

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Sorting path like strings
created: 2006-01-05 08:34:45
How do you print newlines ("\n") here?
Re^3: Sorting path like strings
created: 2006-01-05 08:37:54

My perl shell has the -l switch on the shebang line which means amongst other things) that print statements have a newline appended to them automatically. See perlrun for the details.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^4: Sorting path like strings
created: 2006-01-05 08:48:22
Hm, I meant in the program.

And after thinking I found it:

print for map{ 
    $_->[0] . "\n"
}

etc.
I should think a lot more before posting a trivial questions. :-)
Re^2: Sorting path like strings
created: 2006-01-05 08:47:48

Are you sure that non-C locales won’t cause this to sort in orders other than ASCIIbetical (and thus wrongly), particularly if you add Unicode to the mix?

Makeshifts last the longest.

Re^3: Sorting path like strings
created: 2006-01-05 10:46:04
Are you sure that non-C locales won’t cause this to sort in orders other than ASCIIbetical

Um, no. Did I imply that I was?

I assume that anyone using non-C locales will know they are and know that they will have to take special steps to get sorts to operate correctly. This assumption is based upon the fact that many uses of GRT (including yours elsewhere in the thread), and as far as I can tell from looking, many of the various sorting modules on CPAN will be similarly affected.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Sorting path like strings
created: 2006-01-05 03:52:05
Your sort operator needs to impose a lexicographic ordering on the lists defined by your "paths", as in this example:
#!/usr/bin/perl
use strict;
use warnings;

# parse the paths into lists
my @data;
while (  ) {
    chomp;
    push @data, [ split m:/:, $_ ];
}

sub compare_lists_lexicographically {
    # make copies of the input lists
    my @left = @$a;
    my @right = @$b;

    # chew threw the lists until
    #   a) we discover they're different lengths, or
    #   b) the elements are different, or
    #   c) neither
    while ( 1 ) {
        if ( !@left && !@right ) {
            # ran out at the same time: they're equal
            return 0;
        }
        elsif ( !@left ) {
            # left is shorter than right: left is first
            return -1;
        }
        elsif ( !@right ) {
            # right is shorter than left: right is first
            return 1;
        }
        else {
            # check the next element from each list
            my ( $l, $r ) = ( shift @left, shift @right );

            # NB: use < if fields are numeric, lt otherwise
            if ( $l < $r ) {
                # next element of left comes first
                return -1;
            }
            elsif ( $r < $l ) {
                # next element of right comes first
                return 1;
            }
            # else equal, keep checking
        }
    }
}

# display the output
printf "%s\n", join '/', @$_ for sort compare_lists_lexicographically @data;

__DATA__
0/1/2/3
0
0/4/5/6
0/4
0/1
0/1/2
0/4/5
0/10/111/145
0/10/111
0/10
HTH,
--athomason
Re: Sorting path like strings
created: 2006-01-05 03:53:49
Assuming you don't have too many subdirs at any level, does this

perl -le 'print for map { join"/",map ord,split"",$_}
             sort map {chomp;join"",map chr,@x=split"/",$_}<>'

get the right order?

Re: Sorting path like strings
created: 2006-01-05 09:42:11

I’d use a GRT variant.

use List::Util qw( max );

my @padded = do {
	my @copy = @path;
	my @segment_length;
	while( 1 ) {
		my $len = max map { s!([^/]*/|[^/]+)!! ? length( $1 ) : -1 } @copy;
		last if $len == -1;
		push @segment_length, $len;
	}
	my $format = join '', map "%${_}s", @segment_length;
	map {;
		no warnings 'uninitialized';
		sprintf $format, split m!/!, $_, -1;
	} @path;
};

@path = @path[ sort { $padded[ $a ] cmp $padded[ $b ] } 0 .. $#padded ];

Makeshifts last the longest.

Re: Sorting path like strings
created: 2006-01-05 16:37:46
The standard solution to the lexical vs. numerical sorting problem is using leading 0 s. Comparing '04' and '10' works in both cases. It is also reversible:
foreach (@in) {s|(\d+)|sprintf '%10.10d', $1|eg};
@in = sort  @in;
foreach (@in) {s|(\d+)|sprintf '%d',      $1|eg};
This assumes that your numbers have up to 10 digits.

perlmonks.org content © perlmonks.org and Aristotle, arkturuz, athomason, borisz, BrowserUk, BUU, educated_foo, Limbic~Region, NiJo

prlmnks.org © 2006 edmund von der burg (eccles & toad)

v 0.03