PerlMonks  

input & output for TB conv

by satheeshbssb
on Nov 23, 2012 at 11:25 UTC (#1005258)
satheeshbssb has asked for the wisdom of the Perl Monks concerning the following question:

How can I convert a text table into an XML table?

<table-wrap position="float" cols="4">Table 6.7 Animals in collections.
<thead>Animal group Secondary group Type of specimens in natural history collections Derived materials found in museum collections</thead>
<tbody>Invertebrates With hard parts (mollusks, corals, etc.) Dry shells, wet-preserved animals Pearls, shell (M-of-p), coral, sponges
Invertebrates Insects Pinned or mounted dry insects, wet Butterfly wings, beetles (scarabs)
Vertebrates Mammals Study skins, pelts, mounted specimens (taxidermy), whole or partial skeletons, teeth, wet-preserved animals, parts, or stomach contents, eggs, nests Ivory, ruminant horn, rhino horn, antler, bone, claws, skin (leather, vellum), hair (bristles, quills, fur) hooves
Vertebrates Fish Whole or partial skeletons, teeth, wet-preserved animals, parts, or stomach contents Scales
Vertebrates Reptiles and amphibians Study skins, mounted specimens (taxidermy), whole or partial skeletons, wet-preserved animals or parts Tortoise shell, teeth, skin
Vertebrates Birds Study skins, mounted specimens (taxidermy), whole or partial skeletons, wet-preserved animals or parts Feathers, down, beaks, feet</tbody></table-wrap>

Re: input & output for TB conv
by moritz (Cardinal) on Nov 23, 2012 at 11:32 UTC
Re: input & output for TB conv
by roboticus (Canon) on Nov 23, 2012 at 14:52 UTC

    satheeshbssb:

    I was able to turn your HTML table into a valid[1] XML table simply by prefixing it with:

    <?xml version="1.0"?>

    On a more serious note: You don't really provide a lot of information about what you're doing, so my joke answer above is an accurate-enough answer to your question.

    However, what I think you want is to break your table up into more than a single cell. For that, you're going to need to figure out how to (a) break your text up into records, and (b) split the records into fields.

    If we ignore your header, it looks like your table data is a single line per record, so you can use a typical loop to read it record by record:

    while (my $record = <$FH>) { chomp $record; ... }   # replace ... with your per-record processing
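    As a rough sketch of that loop, the following reads one record per line from a filehandle and skips the header. The helper name read_records and the sample data are made up for illustration; an in-memory filehandle stands in for your real input file.

```perl
use strict;
use warnings;

# Hypothetical sample input: a header line followed by one record per line.
my $sample = "Animal group Secondary group\n"
           . "Invertebrates Insects\n"
           . "\n"
           . "Vertebrates Fish\n";

# read_records is a made-up helper name. It discards the first (header)
# line and returns the remaining non-blank lines without their newlines.
sub read_records {
    my ($fh) = @_;
    my $header = <$fh>;                  # skip the header line
    my @records;
    while (my $record = <$fh>) {
        chomp $record;
        next unless $record =~ /\S/;     # ignore blank lines
        push @records, $record;
    }
    return @records;
}

# Open the string as if it were a file; with a real file you would
# pass its name instead of the scalar reference.
open my $FH, '<', \$sample or die "Cannot open in-memory file: $!";
my @records = read_records($FH);
close $FH;

print scalar(@records), " records read\n";
```

    Returning the records as a list keeps the reading step separate from whatever splitting or XML writing comes afterwards.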

    And your first two fields appear to be single words. So you could use a regular expression or some other method to split your records up into the individual fields.
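    Two possible ways to do the splitting are sketched here. The first assumes the fields in your real file are tab-separated, which the posted text does not confirm. The second only peels the leading single word off with a regular expression. The helper names split_record and first_word_and_rest are hypothetical.

```perl
use strict;
use warnings;

# ASSUMPTION: fields are separated by tabs in the real source file.
# split_record is a hypothetical helper name.
sub split_record {
    my ($record) = @_;
    chomp $record;
    # A limit of 4 keeps any stray tabs inside the last field.
    my @fields = split /\t/, $record, 4;
    s/^\s+|\s+$//g for @fields;          # trim surrounding whitespace
    return @fields;
}

# Alternative when no reliable separator exists: capture the first
# whitespace-delimited word and leave the rest for further parsing.
sub first_word_and_rest {
    my ($record) = @_;
    my ($first, $rest) = $record =~ /^\s*(\S+)\s+(.*?)\s*$/;
    return ($first, $rest);
}

my @fields = split_record("Vertebrates\tFish\tWhole or partial skeletons, teeth\tScales\n");
print join(' | ', @fields), "\n";

my ($group, $rest) = first_word_and_rest("Vertebrates Fish Whole or partial skeletons");
print "$group => $rest\n";
```

    If the separators really are lost, as they are in the posted text, the second approach only gets you the first column reliably; the remaining boundaries would need rules specific to your data.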

    Finally, to turn your data into XML, you would be well served to go to CPAN and look for some module to write XML, to make sure you don't make "fake" XML files (of which the world sees too many).
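    One such module is XML::Writer, which has to be installed from CPAN. The sketch below assumes your records are already split into fields; the row and entry element names and the sample rows are placeholders, so substitute whatever your target schema requires.

```perl
use strict;
use warnings;
use XML::Writer;   # CPAN module, not part of core Perl

# Hypothetical rows, already split into four fields each.
my @rows = (
    [ 'Invertebrates', 'Insects', 'Pinned or mounted dry insects', 'Butterfly wings, beetles (scarabs)' ],
    [ 'Vertebrates',   'Fish',    'Skeletons & teeth',             'Scales' ],
);

# Collect the output in a string; DATA_MODE and DATA_INDENT make the
# writer add newlines and indentation around elements.
my $xml = '';
my $writer = XML::Writer->new(
    OUTPUT      => \$xml,
    DATA_MODE   => 1,
    DATA_INDENT => 2,
);

$writer->xmlDecl('UTF-8');
$writer->startTag('table-wrap', position => 'float', cols => 4);
$writer->startTag('tbody');
for my $row (@rows) {
    $writer->startTag('row');                      # 'row' is an assumed element name
    $writer->dataElement('entry', $_) for @$row;   # 'entry' is assumed too
    $writer->endTag('row');
}
$writer->endTag('tbody');
$writer->endTag('table-wrap');
$writer->end();

print $xml;
```

    The writer escapes characters such as & for you and complains about mismatched or unclosed tags, which is the main protection against producing "fake" XML by string concatenation.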

    [1] As verified by the w3.org XML validator.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.
