Re: Converting a Text file to XML

by CountZero (Bishop)
on Nov 17, 2011 at 07:24 UTC (#938531)

in reply to Converting a Text file to XML

/^\d\d\d\d$/ looks for a string of 4 digits and nothing more, thanks to the ^ and $ anchors. Just use /\d{4}/ and you will find the year.
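To make the difference concrete, here is a minimal sketch comparing the anchored and unanchored patterns. The helper names `is_exact_year` and `find_year` and the sample strings are hypothetical, chosen only for illustration; the original poster's input is not shown in this node.

```perl
use strict;
use warnings;

# True only when the whole string is exactly four digits.
sub is_exact_year {
    my ($text) = @_;
    return $text =~ /^\d\d\d\d$/ ? 1 : 0;
}

# Returns the first run of four digits found anywhere in the string, or undef.
sub find_year {
    my ($text) = @_;
    my ($year) = $text =~ /(\d{4})/;
    return $year;
}

# Hypothetical sample lines, for illustration only.
for my $line ('2011', 'Released in 2011 by ACME', 'no year here') {
    printf "%-26s exact: %d  found: %s\n",
        "'$line'", is_exact_year($line), find_year($line) // 'none';
}
```

Note that /\d{4}/ also matches the first four digits of a longer number; if that matters for your data, something like /\b\d{4}\b/ may be a better fit.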


"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Replies are listed 'Best First'.
Re^2: Converting a Text file to XML
by Perl300 (Pilgrim) on Jun 26, 2015 at 19:23 UTC
    Hi Monks, I used the suggested code to convert a text file into XML, but I am getting errors for some of the lines in the input file. I am posting them here in case anyone is still following this thread and has a suggestion for correcting it:


    I am trying to convert a text file into XML using the following code:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use XML::Writer;
        use XML::Simple;
        use XML::LibXML;

        my $out;
        my $xml = XML::Writer->new(OUTPUT => \$out, DATA_MODE => 1, DATA_INDENT => ' ');
        $xml->xmlDecl();
        $xml->startTag('doc');
        my $check_1 = 0;

        open(my $fh, "<", "20150625163139.txt") or die "Failed to open file: $!\n";
        while(<$fh>) {
            chomp;
            next if !length;
            my ($string1, $string2, $subscript_name, $subscript_value) = /
                ^(.*?)::
                ([^\s]+)
                \.([^\s]+)\s+=
                \s(.*)
            /x;
            if ( $check_1 == 0 ) {
                $xml->startTag($string1);
                $check_1 += 1;
            }
            $xml->startTag($string2);
            $xml->dataElement($subscript_name => $subscript_value);
            $xml->endTag();
        }
        $xml->endTag();
        $xml->endTag();
        $xml->end();
        print $out;
        close $fh;

    The file 20150625163139.txt contains 366 lines in this format:

        GI-eSTB-MIB-NPH::eSTBGeneralErrorCode.0 = INTEGER: 0
        GI-eSTB-MIB-NPH::eSTBGeneralConnectedState.0 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBGeneralPlatformID.0 = INTEGER: 2076
        GI-eSTB-MIB-NPH::eSTBGeneralFamilyID.0 = INTEGER: 25
        GI-eSTB-MIB-NPH::eSTBGeneralModelID.0 = INTEGER: 60436
        GI-eSTB-MIB-NPH::eSTBGeneralUnitAddressID.0 = STRING: 000-00802-49393-076
        GI-eSTB-MIB-NPH::eSTBGeneralSettopMac.0 = STRING: b8:16:19:28:18:f3
        GI-eSTB-MIB-NPH::eSTBGeneralRemodChan.0 = INTEGER: 3
        GI-eSTB-MIB-NPH::eSTBGeneralSettopTime.0 = INTEGER: 1119302620 GPS
        GI-eSTB-MIB-NPH::eSTBPurchaseStatusUnsentPurchases.0 = INTEGER: 0
        GI-eSTB-MIB-NPH::eSTBPurchaseStatusUnackPurchases.0 = INTEGER: 0
        GI-eSTB-MIB-NPH::eSTBPurchaseStatusLastSeqNumPurchases.0 = INTEGER: 0
        GI-eSTB-MIB-NPH::eSTBPurchaseStatusLastReportBackTimePurchases.0 = INTEGER: 1118516578
        GI-eSTB-MIB-NPH::eSTBPurchaseStatusIppvStatus.0 = INTEGER: false(2)
        GI-eSTB-MIB-NPH::eSTBOobFrequency.0 = INTEGER: 75250000
        GI-eSTB-MIB-NPH::eSTBOobCarrierLock.0 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBOobLostLockCount.0 = Counter32: 0
        GI-eSTB-MIB-NPH::eSTBOobDataPresent.0 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBOobEMMDataPresent.0 = INTEGER: false(2)
        GI-eSTB-MIB-NPH::eSTBOobSNRValue.0 = INTEGER: 24.9
        GI-eSTB-MIB-NPH::eSTBOobSNRState.0 = INTEGER: good(4)
        GI-eSTB-MIB-NPH::eSTBOobAGCValue.0 = INTEGER: 16
        GI-eSTB-MIB-NPH::eSTBOobAGCState.0 = INTEGER: good(4)
        GI-eSTB-MIB-NPH::eSTBOobNetworkPid.0 = INTEGER: 1911
        GI-eSTB-MIB-NPH::eSTBOobEMMPid.0 = INTEGER: 5379
        GI-eSTB-MIB-NPH::eSTBOobEMMProviderID.0 = INTEGER: 1
        GI-eSTB-MIB-NPH::eSTBInBandNumberOfTuners.0 = INTEGER: 2
        GI-eSTB-MIB-NPH::eSTBTunerIndex.1 = INTEGER: 1
        GI-eSTB-MIB-NPH::eSTBTunerIndex.2 = INTEGER: 2
        GI-eSTB-MIB-NPH::eSTBInBandTunerModulationMode.1 = INTEGER: qam256(3)
        GI-eSTB-MIB-NPH::eSTBInBandTunerModulationMode.2 = INTEGER: qam256(3)
        GI-eSTB-MIB-NPH::eSTBInBandTunerCarrierLock.1 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBInBandTunerCarrierLock.2 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBInBandTunerPCRLock.1 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBInBandTunerPCRLock.2 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBInBandTunerDataLock.1 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBInBandTunerDataLock.2 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBInBandTunerEMMDataPresent.1 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBInBandTunerEMMDataPresent.2 = INTEGER: true(1)
        GI-eSTB-MIB-NPH::eSTBInBandTunerFrequency.1 = INTEGER: 195000000
        GI-eSTB-MIB-NPH::eSTBInBandTunerFrequency.2 = INTEGER: 501000000
        GI-eSTB-MIB-NPH::eSTBInBandTunerAGCValue.1 = INTEGER: 0
        GI-eSTB-MIB-NPH::eSTBInBandTunerAGCValue.2 = INTEGER: 0
        GI-eSTB-MIB-NPH::eSTBInBandTunerAGCState.1 = INTEGER: poor(2)
        GI-eSTB-MIB-NPH::eSTBInBandTunerAGCState.2 = INTEGER: poor(2)
        GI-eSTB-MIB-NPH::eSTBInBandTunerSNRValue.1 = INTEGER: 42.0

    When I run the above code on this file, I get this error:

        Code point \u0016 is not a valid character in XML at ./<script_name>.pl line 34

    Where line 34 is

    $xml->dataElement($subscript_name => $subscript_value);

    When I remove all the lines from the file 20150625163139.txt and keep only these two lines

        GI-eSTB-MIB-NPH::eSTBGeneralErrorCode.0 = INTEGER: 0
        GI-eSTB-MIB-NPH::eSTBGeneralConnectedState.0 = INTEGER: true(1)

    The same code runs fine and generates the following XML:

        <?xml version="1.0"?>
        <doc>
          <GI-eSTB-MIB-NPH>
            <eSTBGeneralErrorCode>
              <0>INTEGER: 0</0>
            </eSTBGeneralErrorCode>
            <eSTBGeneralConnectedState>
              <0>INTEGER: true(1)</0>
            </eSTBGeneralConnectedState>
          </GI-eSTB-MIB-NPH>
        </doc>

    I searched for the error "Code point \u0016 is not a valid character in XML at ./ line 32". It seems the error is caused by control characters in the text, which are not allowed in XML. So I have two options:

    1) Remove these control characters from the file before printing. I tried this using:

    perl -pe's/\x08//g' <20150625163139.txt >20150625163139.txt

    But this gives the error: Bad name after g' at <script_name>.pl line 13.

    2) Generate an actual .xml file from the code, write the converted text into it, and then read it back. Does anyone have any suggestions on point 1 or 2?
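For option 1, one possible approach is to strip the offending characters inside the script, just before the value is handed to the writer. The sketch below uses only core Perl. The helper name `strip_xml_invalid` and the sample value are hypothetical. The character ranges follow the XML 1.0 rule that, below 0x20, only tab, line feed, and carriage return are allowed, so this removes 0x16 (the code point named in the error) rather than only 0x08.

```perl
use strict;
use warnings;

# Remove the C0 control characters that XML 1.0 forbids:
# everything below 0x20 except tab (0x09), line feed (0x0A)
# and carriage return (0x0D).
sub strip_xml_invalid {
    my ($text) = @_;
    $text =~ tr/\x00-\x08\x0B\x0C\x0E-\x1F//d;
    return $text;
}

# Hypothetical value with an embedded 0x16, for illustration only.
my $raw   = "STRING: abc\x16def";
my $clean = strip_xml_invalid($raw);
print "$clean\n";
```

In the posted script this would presumably be applied to $subscript_value before the call to dataElement. If you would rather clean the file itself, the same tr/// can be run as a one-liner with perl's -i switch (for example -i.bak to keep a backup). Run it from the shell rather than from inside the script, and avoid redirecting output onto the input file, because the shell truncates that file before perl reads it.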
