XML Parsing
saaz11
created: 2006-02-01 11:24:08
Hi,
I am trying to find out how I can read a value between the start and end tags in an XML file in Perl.
If I have a Hotel element whose values are stored in id and name attributes,
I can read them using
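#!/usr/bin/perl
use XML::Parser;

$parser = new XML::Parser(Style => 'Stream');
$parser->parsefile('abc.xml');

print "$Hotel_id\n", "$Hotel_name\n";

# Stream style: called for each start tag; the attributes are in %_.
sub StartTag {
    my $Handler = shift(@_);
    my $Name = shift(@_);
    my %Attr = %_;
    if ($Name eq 'Hotel') {
        $Hotel_id = $Attr{'id'};
        $Hotel_name = $Attr{'name'};
    }
}

# Stream style: called with the accumulated text in $_.
sub Text {
    $TheText = $_;
}

# Stream style: called for each end tag.
sub EndTag {
    my $Handler = shift(@_);
    my $Name = shift(@_);
    if ($Name eq 'Hotel') {
        exit(0);    # note: this exits before the print above is reached
    }
}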
But if I have
<Hotel>Marriott</Hotel>
I am not sure how to read "Marriott" in perl. Does anyone have any ideas?

Thanks a lot.
Zeelani.

Re: XML Parsing
created: 2006-02-01 11:25:40
Ever tried looking at what $TheText has?
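
A minimal sketch of one way to check this, assuming the element looks like <Hotel>Marriott</Hotel> inside a hypothetical <Hotels> root. In the Stream style, Text is called with the accumulated text in $_ just before each start or end tag, so a single $TheText can be overwritten by whitespace that follows the element. The sketch below collects the text per Hotel element instead; the names $text and @hotel_names are illustrative only.

```perl
use strict;
use warnings;
use XML::Parser;

my $text = '';       # text accumulated since the last <Hotel> start tag
my @hotel_names;     # one entry per <Hotel> element

# Called for each start tag: begin a fresh buffer for every Hotel.
sub StartTag {
    my ($expat, $name) = @_;
    $text = '' if $name eq 'Hotel';
}

# Called with the accumulated character data in $_.
sub Text {
    $text .= $_;
}

# Called for each end tag: the buffer now holds the Hotel's text.
sub EndTag {
    my ($expat, $name) = @_;
    push @hotel_names, $text if $name eq 'Hotel';
}

my $parser = XML::Parser->new(Style => 'Stream');

# Sample document; the <Hotels> root is an assumption.
$parser->parse('<Hotels><Hotel>Marriott</Hotel></Hotels>');

print "$_\n" for @hotel_names;
```

Saving the text in EndTag, rather than reading $TheText after the parse, avoids picking up whatever text happened to arrive last.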

My criteria for good software:
  1. Does it work?
  2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re^2: XML Parsing
created: 2006-02-01 12:03:45
I tried $TheText but it returns nothing.
Re: XML Parsing
created: 2006-02-01 13:09:29
You picked the wrong tool. XML::Parser is probably the worst XML parsing interface on CPAN. Try switching to XML::Simple, which should be more than adequate for your needs.
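
As a rough sketch of what that could look like, assuming the same <Hotel>Marriott</Hotel> element inside a hypothetical <Hotels> root: XMLin is XML::Simple's documented entry point, and ForceArray and KeyAttr are documented options. The variable names are illustrative only.

```perl
use strict;
use warnings;
use XML::Simple;

# Sample document; the <Hotels> root is an assumption.
my $xml = '<Hotels><Hotel>Marriott</Hotel></Hotels>';

# ForceArray keeps Hotel as a list even when there is only one element;
# KeyAttr => [] turns off folding of lists into hashes keyed on name/key/id.
my $data = XMLin($xml, ForceArray => ['Hotel'], KeyAttr => []);

# Text-only elements come back as plain strings.
print "$_\n" for @{ $data->{Hotel} };
```

An element that carries attributes would come back as a hash reference rather than a string, so real code would need to check for that.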

-sam

Re^2: XML Parsing
created: 2006-02-02 00:57:32
"You picked the wrong tool..."

I would not make that sort of blanket assertion. Note that XML::Simple is actually a layer on top of XML::Parser -- and the man page for XML::Simple is (paradoxically) much longer and more complicated than the one for XML::Parser.

From my perspective, XML::Simple applies a lot of "default" (or maybe even inescapable) assumptions about what you intend to do with XML data, and if what you really want to do is different from that... well, it might not be so "Simple".

Lots of good work can be done with XML::Parser, and it's not that hard to master. Maybe it's "the worst XML parsing interface on CPAN" for your purposes, but I'm sure there are many who find it both useful and appropriate for a variety of jobs.

Re^3: XML Parsing
created: 2006-02-02 14:48:09
"Note that XML::Simple is actually a layer on top of XML::Parser"

Sure, but that says nothing about the quality of the interface. XML::Parser is a high-quality, fast XML parser, but programming for it bites.

"and the man page for XML::Simple is (paradoxically) much longer and more complicated than the one for XML::Parser."

I pity the fool that chooses modules based on the length of their man page! The only reason XML::Parser's man-page is so short is that it's missing a lot of useful stuff, like working examples. I'd much rather have a simple interface with a long manual than a complex interface with a short one.

"it's not that hard to master."

Well, everybody's different, so I guess you might find it easier than most. Hang out on the perl-xml list and I think you'll see a pattern emerge: XML::Parser is pretty hard for most people to use. Almost everything else on CPAN that does the same job is more intuitive.

-sam

Re: XML Parsing
created: 2006-02-01 15:32:05

You could also consider XML::Twig. Like Perl, it makes simple things simple, and most things possible:

use warnings;
use strict;
use XML::Twig;

my $twig = new XML::Twig;

$twig->parse (do {local $/; <DATA>});

for ($twig->descendants ('Hotel'))
  {
  my $Hotel_name = $_->{'att'}{'name'} || $_->text;
  next if ! defined $Hotel_name;
  
  my $Hotel_id = $_->{'att'}{'ID'};
  print "($Hotel_id) " if defined $Hotel_id;
  print "$Hotel_name\n";
  }

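__DATA__
<Hotels> <!-- root element name assumed; the original markup was lost -->
  <Hotel>Marriott</Hotel>
  <Hotel ID="123" name="Savoy" />
</Hotels>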

Prints:

Marriott
(123) Savoy

DWIM is Perl's answer to Gödel
Re: XML Parsing
created: 2006-02-01 19:16:30
And with XML::LibXML...

use warnings;
use strict;
use XML::LibXML;

my $parser = new XML::LibXML;
my $tree   = $parser->parse_file('hotels.xml');
my $root   = $tree->getDocumentElement();

for ($root->getElementsByTagName('hotel')){
  my $Hotel_name = $_->getAttribute('name') || $_->textContent;
  next if ! defined $Hotel_name;
  
  my $Hotel_id = $_->getAttribute('ID');
  print "($Hotel_id) " if defined $Hotel_id;
  print "$Hotel_name\n";
}

hotels.xml:

Marriott


Re: XML Parsing
created: 2006-02-01 22:56:49
Each of the previously mentioned modules is well known throughout the community, so you'll find lots of tutorials, articles and tips. But there is another one that is as simple as talking about directories and their contents: XML::Mini::Document. After studying the others, please have a look at this one.

perlmonks.org content © perlmonks.org and chanio, dragonchild, graff, GrandFather, reneeb, saaz11, samtregar
