PerlMonks  

Re^2: merging a file with a value present in another file

by lakssreedhar (Acolyte)
on Jul 13, 2012 at 10:00 UTC


in reply to Re: merging a file with a value present in another file
in thread merging a file with a value present in another file

My file f1 is tagged data, somewhat in this format: {MCL}why is this{/MCL}. The second file is in a column format where the rows are why, is, and this. I want to add a field to the column: next to the word why the field should be <clause_start="mcl">, and next to the word this the field should be <clause_end="mcl">.

Re^3: merging a file with a value present in another file
by aaron_baugher (Curate) on Jul 13, 2012 at 12:45 UTC

    Please try again: as frozenwithjoy said, give us a sample of file1, a sample of file2, and a sample of what you want the output to be. Wrap each of these three samples in <code></code> tags to preserve their formatting. By a "sample," I mean a few lines that are enough to demonstrate your problem. One line is rarely enough; more than a dozen is usually too many.

    Aaron B.
    Available for small or large Perl jobs; see my home node.

      file1 is

      {RP}makaravilYakkin Sabarimala ayyappanu cArZwwAnulYlYa wiruviwAMkUrZ rAjAvAyirunna SrI ciwwirawirunnAlYZ bAlarAmavarZmma natakk vacca  420 kilogrAM wUkkamulYlYa wafkayafki{/RP}{MCL} sUkRikkunnaw I kRewrawwilAN.{/MCL}

      file2 is

      <Sentence id="1">
      1 (( NP
      1.1 makaravilYakkin NN <fs af='makaravilYakk,n,any,sg,,d,,kk' conj="blank" spec="blank" CASE_NAME="dat" dubi="blank">
      ))
      2 (( NP
      2.1 Sabarimala NNP <fs af='Sabarimala,n,any,sg,,d,,0' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      2.2 ayyappanu NN <fs af='ayyappanu,unkn,,,,,,' poslcat="NM">
      ))
      3 (( VGF
      3.1 cArZwwAnulYlYa VM <fs af='cArZww,v,any,any,any,,AnulYlYa,AnulYlYa'>
      ))
      4 (( NP
      4.1 wiruviwAMkUrZ QF <fs af='wiruviwAMkUrZ,n,any,sg,,d,,0' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank" poslcat="NM">
      4.2 rAjAvAyirunna NN <fs af='rAjAv,n,m,sg,,o,,yAyirunna' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      ))
      5 (( NP
      5.1 SrI UNK <fs af='SrI,n,any,sg,,d,,0' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank" poslcat="NM">
      5.2 ciwwirawirunnAlYZ NN <fs af='ciwwirawirunnAlYZ,unkn,,,,,,' poslcat="NM">
      5.3 bAlarAmavarZmma NNP <fs af='bAlarAmavarZmma,unkn,,,,,,' poslcat="NM">
      5.4 natakk NN <fs af='nata,n,any,sg,,d,,kk' conj="blank" spec="blank" CASE_NAME="dat" dubi="blank">
      ))
      6 (( VGF
      6.1 vacca VM <fs af='vaykk,v,any,any,any,,ta,ta' CASE_NAME="nom">
      ))
      7 (( NP
      7.1 420 QC <fs af='420,num,,,,,,'>
      7.2 kilogrAM NN <fs af='kilogrAM,unkn,,,,,,' poslcat="NM">
      ))
      8 (( NP
      8.1 wUkkamulYlYa NN <fs af='wUkkaM,n,any,sg,,d,,yulYlYa' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      8.2 wafkayafki NNP <fs af='wafkayafki,unkn,,,,,,' poslcat="NM">
      ))
      9 (( VGNF
      9.1 sUkRikkunnaw VM <fs af='sUkRikk,v,any,any,any,,unnaw,unnaw'>
      ))
      10 (( NP
      10.1 I DEM <fs af='I,pn,any,sg,,,,0' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      10.2 kRewrawwilAN NN <fs af='kRewraM,n,any,sg,,d,,yilAN' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      10.3 . SYM <fs af='.,punc,,,,,,' poslcat="NM">
      ))
      </Sentence>

      my output file should be

      <Sentence id="1">
      1 (( NP
      1.1 makaravilYakkin NN <fs af='makaravilYakk,n,any,sg,,d,,kk' conj="blank" spec="blank" CASE_NAME="dat" dubi="blank" clause_start="rp">
      ))
      2 (( NP
      2.1 Sabarimala NNP <fs af='Sabarimala,n,any,sg,,d,,0' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      2.2 ayyappanu NN <fs af='ayyappanu,unkn,,,,,,' poslcat="NM">
      ))
      3 (( VGF
      3.1 cArZwwAnulYlYa VM <fs af='cArZww,v,any,any,any,,AnulYlYa,AnulYlYa'>
      ))
      4 (( NP
      4.1 wiruviwAMkUrZ QF <fs af='wiruviwAMkUrZ,n,any,sg,,d,,0' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank" poslcat="NM">
      4.2 rAjAvAyirunna NN <fs af='rAjAv,n,m,sg,,o,,yAyirunna' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      ))
      5 (( NP
      5.1 SrI UNK <fs af='SrI,n,any,sg,,d,,0' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank" poslcat="NM">
      5.2 ciwwirawirunnAlYZ NN <fs af='ciwwirawirunnAlYZ,unkn,,,,,,' poslcat="NM">
      5.3 bAlarAmavarZmma NNP <fs af='bAlarAmavarZmma,unkn,,,,,,' poslcat="NM">
      5.4 natakk NN <fs af='nata,n,any,sg,,d,,kk' conj="blank" spec="blank" CASE_NAME="dat" dubi="blank">
      ))
      6 (( VGF
      6.1 vacca VM <fs af='vaykk,v,any,any,any,,ta,ta' CASE_NAME="nom">
      ))
      7 (( NP
      7.1 420 QC <fs af='420,num,,,,,,'>
      7.2 kilogrAM NN <fs af='kilogrAM,unkn,,,,,,' poslcat="NM">
      ))
      8 (( NP
      8.1 wUkkamulYlYa NN <fs af='wUkkaM,n,any,sg,,d,,yulYlYa' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      8.2 wafkayafki NNP <fs af='wafkayafki,unkn,,,,,,' poslcat="NM" clause_end="rp">
      ))
      9 (( VGNF
      9.1 sUkRikkunnaw VM <fs af='sUkRikk,v,any,any,any,,unnaw,unnaw'>
      ))
      10 (( NP
      10.1 I DEM <fs af='I,pn,any,sg,,,,0' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      10.2 kRewrawwilAN NN <fs af='kRewraM,n,any,sg,,d,,yilAN' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">
      10.3 . SYM <fs af='.,punc,,,,,,' poslcat="NM">
      ))
      </Sentence>

        I'm not sure why the MCL clause doesn't show up in your output sample. But I'd say you have a two-step process, possibly involving two hashes:

        1. Go through file1, parsing out the beginning and end word of each clause, and put them in a %start hash and an %end hash respectively, keyed by word, with the tag (RP, MCL) as the value.
        2. Go through file2, checking the word on each line to see if it exists in one of these hashes, and if so, add the appropriate tag to the end of the line.

        The rest is just implementation.
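
        A minimal sketch of those two steps, under several assumptions that are not confirmed in the thread: clauses in file1 look like {TAG}...{/TAG}; each word line of file2 has the word as its second whitespace-separated field; the new attribute goes just inside the closing ">" of the <fs ...> tag, lowercased as in the sample output; and punctuation stuck to a clause's first or last word is ignored. The sub names clause_boundaries and tag_line are made up for illustration. Unlike the sample output, this also tags the MCL clause.

```perl
use strict;
use warnings;

# Step 1: record the first and last word of every {TAG}...{/TAG} clause.
# Returns two hash refs: word => tag for clause starts, word => tag for ends.
sub clause_boundaries {
    my ($text) = @_;
    my (%start, %end);
    while ($text =~ /\{(\w+)\}(.*?)\{\/\1\}/gs) {
        my ($tag, $body) = (lc($1), $2);
        my @words;
        for my $w (split ' ', $body) {
            $w =~ s/^\W+//;    # assumption: drop punctuation glued to a word
            $w =~ s/\W+$//;
            push @words, $w if length $w;
        }
        next unless @words;
        $start{ $words[0] } = $tag;
        $end{ $words[-1] }  = $tag;
    }
    return (\%start, \%end);
}

# Step 2: add clause_start / clause_end to one line of file2.
# Lines without a word field or an <fs ...> tag are returned unchanged.
sub tag_line {
    my ($line, $start, $end) = @_;
    my (undef, $word) = split ' ', $line;    # assumed layout: index word POS <fs ...>
    return $line unless defined $word && $line =~ /<fs\b[^>]*>/;
    my $extra = '';
    $extra .= qq{ clause_start="$start->{$word}"} if exists $start->{$word};
    $extra .= qq{ clause_end="$end->{$word}"}     if exists $end->{$word};
    $line =~ s/(<fs\b[^>]*)>/$1$extra>/ if $extra;
    return $line;
}

# Demo with the file1 line and a few file2 lines from this thread.
my $file1 = '{RP}makaravilYakkin Sabarimala ayyappanu cArZwwAnulYlYa wiruviwAMkUrZ rAjAvAyirunna SrI ciwwirawirunnAlYZ bAlarAmavarZmma natakk vacca  420 kilogrAM wUkkamulYlYa wafkayafki{/RP}{MCL} sUkRikkunnaw I kRewrawwilAN.{/MCL}';
my ($start, $end) = clause_boundaries($file1);

my @file2 = (
    q{1 (( NP},
    q{1.1 makaravilYakkin NN <fs af='makaravilYakk,n,any,sg,,d,,kk' conj="blank" spec="blank" CASE_NAME="dat" dubi="blank">},
    q{))},
    q{8.2 wafkayafki NNP <fs af='wafkayafki,unkn,,,,,,' poslcat="NM">},
    q{10.2 kRewrawwilAN NN <fs af='kRewraM,n,any,sg,,d,,yilAN' conj="blank" spec="blank" CASE_NAME="nom" dubi="blank">},
);
print tag_line($_, $start, $end), "\n" for @file2;
```

        Because the hashes are keyed by word alone, a word that appears more than once in file2 would be tagged at every occurrence. If that matters for your data, you would need to walk both files in parallel and match words by position instead.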

        Aaron B.
        Available for small or large Perl jobs; see my home node.
