printing out a matrix for a data list
Angharad
created: 2006-08-05 09:21:07
Hi there
I have been trying to do the following.
Taking 'ob1' as the 'object of interest', I have a text file which looks like this:
ob1, ob2, 34
ob1, ob3, 56
ob1, ob4, 12
ob1, ob5, 78
ob1, ob6, 23
ob3, ob1, 56
ob7, ob1, 23
ob8, ob1, 12
ob9, ob1, 90
etc ...
and what I need is to create a matrix like this
     ob1   ob2   ob3  ob4  ob5  ob6  ob7  ob8  ob9
ob1   0     34    56   12   78   23   23   12   90
ob2  34
ob3  56
ob4  12
ob5  78
ob6  23
ob7  23
ob8  12
ob9  90
And for any 'duplicate results' like these
ob1, ob3, 56
ob3, ob1, 56
Only take the first instance (as the result is the same, regardless of the direction).
Ob2 then becomes the 'object of interest', and column two and row two are populated in the same way (but potentially with different values) as the first column and first row were when ob1 was the 'object of interest'. Then ob3 becomes the 'object of interest', and so on until the matrix is completely filled. In some cases there may be missing values (for example, there may not be an 'ob4 ob5' value), and I need to take that into account, perhaps by printing 'NULL' or something.
I asked for help from Perl Monks yesterday and this is the code I have so far
#!/usr/local/bin/perl
use strict;

my $data = $ARGV[0];

# open file here

open(DATA, "$data") || die "cant open file for reading\n";

my %table;
my %rows;
my %cols;

for (<DATA>) {
  my($row,$col,$val) = split ',';
  $table{$row}{$col} = $val;
  $rows{$row}++;
  $cols{$col}++;
}

for my $col (sort keys %cols) {
  print "\t$col";
}
print "\n";

for my $row (sort keys %rows) {
  print "$row\t";
  for my $col (sort keys %cols) {
    print $table{$row}{$col} if defined $table{$row}{$col};
    print "\t";
  }
  print "\n";
}
The results I get using the code above are:
               ob1 ob2 ob3 ob4 ob5 ob6


ob1                 34
        56
        12
        78
        23

ob3         56

ob7         23

ob8         12

ob9         90
Which isn't quite what I need, but I don't know how to fix it (due to my blind spot regarding hashes I suspect).
Any suggestions much appreciated.
Re: printing out a matrix for a data list
created: 2006-08-05 09:50:13

Fix the output? Or the matrix filling code?

One problem I see in the output code is that you don't handle missing values. You should probably iterate over all objects.

my %all = ( %rows, %cols );

my @objs = keys %all;
UPDATE: Another (or the?) problem I see is that you don't remove the newline in your data. chomp should fix this.
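The two points above (strip the newline, and iterate over the union of row and column names) can be sketched as follows. This is only an illustration: the @lines array is a hypothetical stand-in for the data file, and the split pattern that also trims spaces around the commas is an assumption, not something from the original code.

```perl
use strict;
use warnings;

# Hypothetical sample lines, shaped like the data file in the question.
my @lines = ("ob1, ob2, 34\n", "ob3, ob1, 56\n");

my (%table, %rows, %cols);
for my $line (@lines) {
    chomp $line;    # drop the trailing newline so it does not end up in $val
    my ($row, $col, $val) = split /\s*,\s*/, $line;    # also drop spaces around commas
    $table{$row}{$col} = $val;
    $rows{$row}++;
    $cols{$col}++;
}

# Merge both key sets so every object shows up as a row and as a column.
my %all  = (%rows, %cols);
my @objs = sort keys %all;

print join("\t", '', @objs), "\n";
for my $row (@objs) {
    # Missing entries become empty cells instead of being skipped.
    my @cells = map { $table{$row}{$_} // '' } @objs;
    print join("\t", $row, @cells), "\n";
}
```

Note that this still stores each value in one direction only, so the printed table is not yet symmetric.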
Re: printing out a matrix for a data list
created: 2006-08-05 10:09:05

Have you investigated CPAN? Specifically, Text::Table? The trick becomes massaging your input data into the format that Text::Table wants, but after that it becomes much easier to get it printing right.

Re: printing out a matrix for a data list
created: 2006-08-05 10:59:12
Try the code below. The changes include
  • adding chomp to get rid of the extra newlines that are left on the $val
  • removing the spaces, along with the commas, on the split, so they don't end up as part of the hash indices
  • using a canonical form for the table index, so that "ob1,ob2" and "ob2,ob1" go in the same bin.
  • just one hash for which rows and columns are present
  • print "-" if there's no entry

You were well on your way, but the first two on this list can confuse things enough that you can't see the rest. That's the way it happens: it's the things you're not looking at that get you.

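use strict;
my %table;      # $table{$row}{$col} = value, stored with $row le $col
my %rows_cols;  # every object name seen, used for both rows and columns

for (<DATA>) {
  chomp;
  # split on a comma plus any following spaces
  my ($row, $col, $val) = split /, */;
  # canonical order (string comparison), so both directions share one slot
  if ($row gt $col) {
    ($row, $col) = ($col, $row);
  }
  $table{$row}{$col} = $val;
  $rows_cols{$row}++;
  $rows_cols{$col}++;
}

# header line
for my $col (sort keys %rows_cols) {
  print "\t$col";
}
print "\n";

for my $row (sort keys %rows_cols) {
  print "$row\t";
  for my $col (sort keys %rows_cols) {
    # the value may be stored under either ordering of the pair
    if (defined $table{$row}{$col}) {
      print $table{$row}{$col};
    }
    elsif (defined $table{$col}{$row}) {
      print $table{$col}{$row};
    }
    else {
      print "-";   # no entry for this pair
    }
    print "\t";
  }
  print "\n";
}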
__END__
ob1, ob2, 34
ob1, ob3, 56
ob1, ob4, 12
ob1, ob5, 78
ob1, ob6, 23
ob3, ob1, 56
ob7, ob1, 23
ob8, ob1, 12
ob9, ob1, 90
ob3, ob2, 87
prints
        ob1     ob2     ob3     ob4     ob5     ob6     ob7     ob8     ob9
ob1     -       34      56      12      78      23      23      12      90
ob2     34      -       87      -       -       -       -       -       -
ob3     56      87      -       -       -       -       -       -       -
ob4     12      -       -       -       -       -       -       -       -
ob5     78      -       -       -       -       -       -       -       -
ob6     23      -       -       -       -       -       -       -       -
ob7     23      -       -       -       -       -       -       -       -
ob8     12      -       -       -       -       -       -       -       -
ob9     90      -       -       -       -       -       -       -       -
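The question also asks that only the first instance of a duplicate pair be kept. A plain assignment lets a later line overwrite an earlier one, so one possible way to handle that, sketched here under assumptions, is to store a value only when the slot is still empty. The build_table name and the sample lines are hypothetical and only for illustration.

```perl
use strict;
use warnings;

# Build a lookup table keyed in canonical order, keeping only the
# first value seen for each pair. build_table is a hypothetical helper.
sub build_table {
    my @lines = @_;
    my %table;
    for my $line (@lines) {
        chomp(my $copy = $line);
        my ($x, $y, $val) = split /\s*,\s*/, $copy;
        ($x, $y) = ($y, $x) if $x gt $y;    # canonical order: smaller name first
        # first instance wins: later duplicates are ignored
        $table{$x}{$y} = $val unless exists $table{$x}{$y};
    }
    return \%table;
}

my $t = build_table("ob1, ob3, 56\n", "ob3, ob1, 99\n", "ob1, ob2, 34\n");
print "ob1/ob3 = $t->{ob1}{ob3}\n";    # prints "ob1/ob3 = 56"; the later 99 is ignored
```

The comparison is a string comparison, so "ob10" sorts before "ob2". That does not matter for picking a canonical slot, but it does affect the order in which sorted rows and columns are printed.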
Re: printing out a matrix for a data list
created: 2006-08-05 11:06:48
Thanks for your comments and suggestions so far. Much appreciated.
Re: printing out a matrix for a data list
created: 2006-08-05 18:46:04

Any reason why you couldn't update "How to print list as matrix", where there are already plenty of answers?

/J\

Re^2: printing out a matrix for a data list
created: 2006-08-07 05:09:06
Yes, actually. It was a slightly different question and I was upfront from the beginning as to it being an update. Any particular reason why you had to be rude?
Re: printing out a matrix for a data list
created: 2006-08-06 18:20:03
Very basic hack:
use strict;
use warnings;

my ($delim, $row, $col, $val, %matrix, %cols, @rows, @cols) = '  ';

while (<DATA>) {
    chomp;
    ($row, $col, $val) = split /, /, $_;
    $matrix{$row}{$col} = $val;
    $cols{$col} = ();
}

@cols = sort keys %cols;
@rows = sort keys %matrix;

print join $delim, '   ', @cols;
for $row (@rows) {
    print "\n", $row;
    for $col (@cols) {
        print $delim, $matrix{$row}{$col} ? sprintf('%3d', $matrix{$row}{$col}) : '  -';
    }
}

__DATA__
ob1, ob2, 34
ob1, ob3, 56
ob1, ob4, 12
ob1, ob5, 78
ob1, ob6, 23
ob3, ob1, 56
ob7, ob1, 23
ob8, ob1, 12
ob9, ob1, 90
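Two things are worth noting about the hack above: it stores each value in one direction only, and its truthiness test would print '-' for a value of 0. A minimal sketch of one way around both, assuming the values are non-negative integers and using a hypothetical render_matrix helper, is to mirror each value when reading and to test with defined when printing:

```perl
use strict;
use warnings;

# Render a square, symmetric matrix from "a, b, value" lines.
# render_matrix is a hypothetical helper; integer values are assumed.
sub render_matrix {
    my @lines = @_;
    my %matrix;
    for my $line (@lines) {
        my ($row, $col, $val) =
            $line =~ /^\s*(\w+)\s*,\s*(\w+)\s*,\s*(\d+)\s*$/
            or next;                     # skip lines that do not match
        $matrix{$row}{$col} //= $val;    # store the value if the slot is empty
        $matrix{$col}{$row} //= $val;    # mirror it for the other direction
    }
    my @objs = sort keys %matrix;
    my $out  = join('  ', '   ', @objs) . "\n";
    for my $row (@objs) {
        my @cells = map {
            defined $matrix{$row}{$_}
                ? sprintf('%3d', $matrix{$row}{$_})    # a value of 0 is still printed
                : '  -'
        } @objs;
        $out .= join('  ', $row, @cells) . "\n";
    }
    return $out;
}

print render_matrix("ob1, ob2, 34", "ob2, ob1, 34", "ob3, ob1, 0");
```

Because every value is mirrored, the row keys already cover every object, so a separate hash of column names is not needed.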
Re^2: printing out a matrix for a data list
created: 2006-08-07 05:19:42
Thank you. Your help is much appreciated :)

perlmonks.org content © perlmonks.org and Angharad, gellyfish, lima1, rodion, Tanktalus, TedPride
