Parsing Data from XML Schemas
mosiondz
created: 2004-06-15 19:43:40

Hello Perl Monks!

I am attempting to parse XML data from an HTTP request and load it into a local database. So far I have the XML saved to a local temporary file. The ASP script I'm pulling the data from queries a SQL database and returns XML in Microsoft's "RowsetSchema" format; I'm trying to parse it on a Linux system and put the data into a MySQL database.

My main question: is there a way to parse the XML schema on my Linux system with a Perl script? I looked at various CPAN parsers but can't seem to find one that interprets this schema correctly. Do I need a proprietary solution?

Thanks in advance!

Re: Parsing Data from XML Schemas
created: 2004-06-15 20:20:06
Do you really mean an XML schema (.xsd) document, or just plain XML data? If it's coming from a SQL query, it's probably just XML data and XML::Simple (for small XML files) or XML::Parser (for large ones) would probably work ok.
Re^2: Parsing Data from XML Schemas
created: 2004-06-16 12:21:22

As far as I understand it, the script returns an XML Schema (I might be off on the terminology here). It starts off with the following:

[first lines of the XML not preserved; only the leading "-" markers remain]
Would I be correct in calling this a schema?
Re^3: Parsing Data from XML Schemas
created: 2004-06-16 12:52:59
In this case, it's an XML file with the database schema and rowset data embedded within it. If you look at the XML file, it should have <s:Schema> and <rs:data> sections.

Try something like this. I've tested it and it works. Note that XML::Simple reads the entire XML file into memory, so it is not ideal for large datasets.

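use strict;
use warnings;
use XML::Simple;

my $xs = XML::Simple->new();
my $sqlref = $xs->XMLin('sqldata.xml');

# Assumes z:row comes back as an array ref. With XML::Simple's default
# options that needs more than one row, and rows without a name, key or id attribute.
my $rowsref = $sqlref->{'rs:data'}{'z:row'};

foreach my $row (@$rowsref) {
	print "---------New Row--------\n";
	print $row->{'column1'}, "\n";
	print $row->{'column2'}, "\n";
	print $row->{'column3'}, "\n";
}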
Re^4: Parsing Data from XML Schemas
created: 2004-06-16 19:06:50

Thanks for the code example, meetraz! However, I receive "Not an ARRAY reference" on the following line:

foreach my $row (@$rowsref) {

I'm not sure why I'd be receiving this error, as everything seems to be assigned correctly.
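
One possible cause, offered as a guess since the input file isn't shown: with its default options XML::Simple returns a hash reference instead of an array reference when there is only one <z:row>, and its default KeyAttr setting can fold repeated elements into a hash when they carry an attribute named name, key, or id. The sketch below passes ForceArray and KeyAttr to XMLin so that z:row is an array either way. The inline XML and the column names (id, title) are invented for illustration.

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical sample in the rowset layout; the column names are invented.
my $xml = <<'XML';
<xml xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema">
  <rs:data>
    <z:row id="1" title="first"/>
    <z:row id="2" title="second"/>
  </rs:data>
</xml>
XML

# ForceArray keeps z:row as an array even when there is only one row;
# KeyAttr => [] stops XML::Simple folding rows into a hash keyed on "id".
my $sqlref  = XMLin($xml, ForceArray => ['z:row'], KeyAttr => []);
my $rowsref = $sqlref->{'rs:data'}{'z:row'};

foreach my $row (@$rowsref) {
    print "$row->{id}: $row->{title}\n";
}
```

For a real file, the same two options can be passed along with the filename, e.g. XMLin('sqldata.xml', ForceArray => ['z:row'], KeyAttr => []).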

Re^5: Parsing Data from XML Schemas
created: 2004-06-16 21:25:49
Is there any way you can post a sample of the file you're trying to parse?
Re^6: Parsing Data from XML Schemas
created: 2004-06-17 14:40:35
Sure:
[XML sample not preserved; only the leading "-" markers from the browser's XML view survived]
Re^7: Parsing Data from XML Schemas
created: 2004-06-18 00:53:04
Is this the data you are feeding to Perl? The "-" marks on the left and the embedded quotes are usually added by the browser (IE) and will break the XML. You will need to save the XML data to a file directly from the ASP response, not via the browser.
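
A quick way to tell the two cases apart before parsing might look like the sketch below. It is only a heuristic, and looks_browser_copied is a made-up helper name, not part of any module: it flags text whose lines start with the "-" or "+" expand/collapse markers, or that does not begin with "<" at all.

```perl
use strict;
use warnings;

# Hypothetical helper: returns true if the text looks like it was copied from
# a browser's collapsible XML view rather than saved from the raw response.
sub looks_browser_copied {
    my ($text) = @_;
    # The browser view prefixes expandable elements with "-" or "+".
    return 1 if $text =~ /^\s*[-+]\s*</m;
    # Raw XML starts with "<" (after an optional byte-order mark or whitespace).
    return 1 unless $text =~ /\A(?:\x{FEFF}|\xEF\xBB\xBF)?\s*</;
    return 0;
}

print looks_browser_copied("- <xml>\n- <rs:data>\n") ? "browser copy\n" : "raw XML\n";
print looks_browser_copied("<xml><rs:data/></xml>\n") ? "browser copy\n" : "raw XML\n";
```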
Re: Parsing Data from XML Schemas
created: 2004-06-15 23:08:33

If you're just looking to validate the document against an XML Schema, you could try XML::Validator::Schema.

Re: Parsing Data from XML Schemas
created: 2004-06-16 02:50:41
XML::Parser is often used to read in an XML blob so you can do something interesting with it. It's a good idea to validate the XML, etc., but if you just want the blob un-XML'd, try something like this:
use strict;
use warnings;
use XML::Parser;

my $filename = "justdownloaded.xml";
die "The filename specified in this script ($filename) doesn't exist..?\n" unless -e $filename;

my $parser = XML::Parser->new(ErrorContext => 2);
$parser->setHandlers(Start => \&XMLStartHandler,
                     Char  => \&XMLCharHandler);
$parser->parsefile($filename);

# Called at each opening tag; attributes arrive as a flat name/value list.
sub XMLStartHandler {
  my ($expat, $element, %attrs) = @_;

  foreach my $attr (sort keys %attrs) {
    print "Attribute: $attr\n";
    print "Value: $attrs{$attr}\n";
  }
}

# Called for text between tags (possibly more than once per run of text).
sub XMLCharHandler {
  my ($p, $data) = @_;

  print "Data: $data\n";
}

..or something along those lines.. -Vlad
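
Building on the handler approach, here is a rough sketch of collecting each row's attributes into a list of hashes, which could then be passed to whatever does the database insert. It assumes the rows are <z:row> elements with one attribute per column; the sample XML and column names are invented, and real data would be read from the saved file with parsefile() instead.

```perl
use strict;
use warnings;
use XML::Parser;

# Invented sample standing in for the saved file.
my $xml = <<'XML';
<xml xmlns:rs="urn:schemas-microsoft-com:rowset" xmlns:z="#RowsetSchema">
  <rs:data>
    <z:row id="1" title="first"/>
    <z:row id="2" title="second"/>
  </rs:data>
</xml>
XML

my @rows;
my $parser = XML::Parser->new(
    Handlers => {
        # The Start handler receives the element name followed by a flat
        # list of attribute name/value pairs.
        Start => sub {
            my ($expat, $element, %attrs) = @_;
            push @rows, {%attrs} if $element eq 'z:row';
        },
    },
);
$parser->parse($xml);

printf "%d rows\n", scalar @rows;
print join(', ', map { "$_=$rows[0]{$_}" } sort keys %{ $rows[0] }), "\n";
```

Because XML::Parser is stream-based, this avoids building the whole document tree in memory, though @rows itself still grows with the number of rows.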

perlmonks.org content © perlmonks.org and meetraz, mosiondz, Steve_p, vladdrak
