
it should be simple enough...

by lomSpace (Scribe)
on Feb 20, 2011 at 01:44 UTC (#889140=perlquestion)

lomSpace has asked for the wisdom of the Perl Monks concerning the following question:

This node falls below the community's threshold of quality. You may see it by logging in.

Replies are listed 'Best First'.
Re: it should be simple enough...
by toolic (Bishop) on Feb 20, 2011 at 02:04 UTC
    Any suggestions?
    Here are a few which might help you get some useful feedback:
    • Ask a specific question.
    • Describe what problem you are having.
    • Show a small sample of your input file.
    • Show the desired contents of your array.
    • Change your node title: How do I compose an effective node title?
    Without knowing more details, I would consider using Range Operators as an alternative to $/:
    if (/LOCAL/ .. m{//}) {
        # do something
    }
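A minimal sketch of how the range operator could collect whole records, assuming each record starts on a line beginning with LOCUS (the keyword that actually appears in the sample posted in this thread) and ends on a line beginning with //. The sub name collect_records and the sample data are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Collect records that start at a line matching $start and end at a line
# matching $end, using the range (flip-flop) operator in scalar context.
sub collect_records {
    my ($fh, $start, $end) = @_;
    my @records;
    my $current = '';
    while (my $line = <$fh>) {
        # In scalar context ".." returns a sequence number while inside the
        # range and an empty string outside it.
        if (my $seq = ($line =~ $start) .. ($line =~ $end)) {
            $current .= $line;
            # The sequence number has "E0" appended on the closing line.
            if ($seq =~ /E0\z/) {
                push @records, $current;
                $current = '';
            }
        }
    }
    return @records;
}

# Hypothetical sample data, read through an in-memory filehandle.
my $data = <<'END';
header line that belongs to no record
LOCUS       REC_ONE
ORIGIN
        1 maqwnqlqql
//
LOCUS       REC_TWO
ORIGIN
        1 dtryleqlhq
//
END

open(my $fh, '<', \$data) or die "open in-memory data: $!";
my @records = collect_records($fh, qr/^LOCUS/, qr{^//});
printf "%d records found\n", scalar @records;
```

Lines outside any LOCUS ... // range are skipped, which is the main practical difference from changing $/.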
      Toolic,
      I was trying to use the split function to parse a file with records. I realized
      that was incorrect. I then looked at the record separator and attempted
      to implement it and then use the split function.
      I still had the same output. I used data dumper and I am able to see that only
      the first line is being parsed. Here is my code and example file:
      #!/usr/bin/perl -w
      use strict;
      use Data::Dumper;
      # create scalar variable to define the file that will be
      # parsed.
      my $genpept = "/Users/mgavibrathwaite/Desktop/proteins.gp";
      #Set the global record separator to "//"
      open(my $in,"$genpept");
      undef $/;
      my @genpepts = split(/^\w{5}/,$in);
      print Dumper(@genpepts);
      __DATA__
      LOCUS NP_644805 770 aa linear PRI 06 +-FEB-2011 DEFINITION signal transducer and activator of transcription 3 isoform + 1 [Homo sapiens]. ACCESSION NP_644805 VERSION NP_644805.1 GI:21618340 DBSOURCE REFSEQ: accession NM_139276.2 KEYWORDS . SOURCE Homo sapiens (human) ORGANISM Homo sapiens Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Eutele +ostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhin +i; Catarrhini; Hominidae; Homo. REFERENCE 1 (residues 1 to 770) AUTHORS Davidson,S.I., Liu,Y., Danoy,P.A., Wu,X., Thomas,G.P., Jia +ng,L., Sun,L., Wang,N., Han,J., Han,H., Visscher,P.M., Brown,M.A. + and Xu,H. CONSRTM Australo-Anglo-American Spondyloarthritis Consortium TITLE Association of STAT3 and TNFRSF1A with ankylosing spondyli +tis in Han Chinese JOURNAL Ann. Rheum. Dis. 70 (2), 289-292 (2011) PUBMED 21068102 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) REFERENCE 2 (residues 1 to 770) AUTHORS Budarf,M.L., Goyette,P., Boucher,G., Lian,J., Graham,R.R., Claudio,J.O., Hudson,T., Gladman,D., Clarke,A.E., Pope,J.E +., Peschken,C., Smith,C.D., Hanly,J., Rich,E., Boire,G., Barr +,S.G., Zummer,M., Fortin,P.R., Wither,J. and Rioux,J.D. CONSRTM GenES Investigators TITLE A targeted association study in systemic lupus erythematos +us identifies multiple susceptibility alleles JOURNAL Genes Immun. 12 (1), 51-58 (2011) PUBMED 20962850 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) REFERENCE 3 (residues 1 to 770) AUTHORS Hosur,V. and Loring,R.H. 
TITLE alpha4beta2 nicotinic receptors partially mediate anti-inf +lammatory effects through Janus kinase 2-signal transducer and activ +ator of transcription 3 but not calcium or cAMP signaling JOURNAL Mol. Pharmacol. 79 (1), 167-174 (2011) PUBMED 20943775 REMARK GeneRIF: A role was determined for signal transducer and a +ctivator of transcription 3 and Janus kinase-2 transduction in alph +a4beta2 nicotinic receptor-mediated anti-inflammatory effects. REFERENCE 4 (residues 1 to 770) AUTHORS Laukens,D., Georges,M., Libioulle,C., Sandor,C., Mni,M., V +ander Cruyssen,B., Peeters,H., Elewaut,D. and De Vos,M. TITLE Evidence for significant overlap between common risk varia +nts for Crohn's disease and ankylosing spondylitis JOURNAL PLoS ONE 5 (11), E13795 (2010) PUBMED 21072187 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) Publication Status: Online-Only REFERENCE 5 (residues 1 to 770) AUTHORS Shukla,S., Shishodia,G., Mahata,S., Hedau,S., Pandey,A., Bhambhani,S., Batra,S., Basir,S.F., Das,B.C. and Bharti,A. +C. TITLE Aberrant expression and constitutive activation of STAT3 i +n cervical carcinogenesis: implications in high-risk human papillomavirus infection JOURNAL Mol. Cancer 9, 282 (2010) PUBMED 20977777 REMARK GeneRIF: Data show that in the presence of HPV16, STAT3 is aberrantly-expressed and constitutively-activated in cervi +cal cancer which increases as the lesion progresses thus indic +ating its potential role in progression of HPV16-mediated cervical carcinogenesis. Publication Status: Online-Only REFERENCE 6 (residues 1 to 770) AUTHORS Boulton,T.G., Zhong,Z., Wen,Z., Darnell,J.E. Jr., Stahl,N. + and Yancopoulos,G.D. TITLE STAT3 activation by cytokines utilizing gp130 and related transducers involves a secondary modification requiring an H7-sensitive kinase JOURNAL Proc. Natl. Acad. Sci. U.S.A. 
92 (15), 6915-6919 (1995) PUBMED 7624343 REFERENCE 7 (residues 1 to 770) AUTHORS Lin,J.X., Migone,T.S., Tsang,M., Friedmann,M., Weatherbee, +J.A., Zhou,L., Yamauchi,A., Bloom,E.T., Mietz,J., John,S. et al. TITLE The role of shared receptor motifs and common Stat protein +s in the generation of cytokine pleiotropy and redundancy by IL-2, +IL-4, IL-7, IL-13, and IL-15 JOURNAL Immunity 2 (4), 331-339 (1995) PUBMED 7719938 REFERENCE 8 (residues 1 to 770) AUTHORS Zhang,X., Blenis,J., Li,H.C., Schindler,C. and Chen-Kiang, +S. TITLE Requirement of serine phosphorylation for formation of STAT-promoter complexes JOURNAL Science 267 (5206), 1990-1994 (1995) PUBMED 7701321 REFERENCE 9 (residues 1 to 770) AUTHORS Akira,S., Nishio,Y., Inoue,M., Wang,X.J., Wei,S., Matsusak +a,T., Yoshida,K., Sudo,T., Naruto,M. and Kishimoto,T. TITLE Molecular cloning of APRF, a novel IFN-stimulated gene fac +tor 3 p91-related transcription factor involved in the gp130-med +iated signaling pathway JOURNAL Cell 77 (1), 63-71 (1994) PUBMED 7512451 REFERENCE 10 (residues 1 to 770) AUTHORS DuBridge,R.B., Tang,P., Hsia,H.C., Leong,P.M., Miller,J.H. + and Calos,M.P. TITLE Analysis of mutation in human cells by using an Epstein-Ba +rr virus shuttle system JOURNAL Mol. Cell. Biol. 7 (1), 379-387 (1987) PUBMED 3031469 COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staf +f. The reference sequence was derived from BI461226.1, BC014482.1 +, AK092965.1, CB216860.1, BC008044.2, CF454565.1 and AI63189 +6.1. This sequence is a reference standard in the RefSeqGene pr +oject. On May 7, 2004 this sequence version replaced gi:16596688. Summary: The protein encoded by this gene is a member of t +he STAT protein family. In response to cytokines and growth factor +s, STAT family members are phosphorylated by the receptor associat +ed kinases, and then form homo- or heterodimers that transloc +ate to the cell nucleus where they act as transcription activator +s. 
This protein is activated through phosphorylation in response t +o various cytokines and growth factors including IFNs, EGF, IL5, IL6 +, HGF, LIF and BMP2. This protein mediates the expression of a va +riety of genes in response to cell stimuli, and thus plays a key ro +le in many cellular processes such as cell growth and apoptosis. + The small GTPase Rac1 has been shown to bind and regulate the +activity of this protein. PIAS3 protein is a specific inhibitor of +this protein. Three alternatively spliced transcript variants e +ncoding distinct isoforms have been described. [provided by RefSeq +]. Transcript Variant: This variant (1) represents the longes +t transcript, and encodes the longest isoform (1). Publication Note: This RefSeq record includes a subset of + the publications that are available for this gene. Please see +the Entrez Gene record to access additional publications. FEATURES Location/Qualifiers source 1..770 /organism="Homo sapiens" /db_xref="taxon:9606" /chromosome="17" /map="17q21.31" Protein 1..770 /product="signal transducer and activator of tran +scription 3 isoform 1" /note="acute-phase response factor; DNA-binding p +rotein APRF" /calculated_mol_wt=87937 Region 150..162 /region_name="Essential for nuclear import" /experiment="experimental evidence, no additional + details recorded" /note="propagated from UniProtKB/Swiss-Prot (P407 +63.2)" Site 539 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphotyrosine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 691 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphoserine; propagated from UniProtKB/S +wiss-Prot (P40763.2)" Site 705 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphotyrosine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 714 /site_type="phosphorylation" /experiment="experimental 
evidence, no additional + details recorded" /note="Phosphothreonine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 727 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphoserine, by NLK; propagated from UniProtKB/Swiss-Prot (P40763.2)" CDS 1..770 /gene="STAT3" /gene_synonym="APRF; FLJ20882; HIES; MGC16063" /coded_by="NM_139276.2:241..2553" /note="isoform 1 is encoded by transcript variant + 1" /db_xref="CCDS:CCDS32656.1" /db_xref="GeneID:6774" /db_xref="HGNC:11364" /db_xref="HPRD:00026" /db_xref="MIM:102582" ORIGIN 1 maqwnqlqql dtryleqlhq lysdsfpmel rqflapwies qdwayaaske shatl +vfhnl 61 lgeidqqysr flqesnvlyq hnlrrikqfl qsrylekpme iarivarclw eesrl +lqtaa 121 taaqqggqan hptaavvtek qqmleqhlqd vrkrvqdleq kmkvvenlqd dfdfn +yktlk 181 sqgdmqdlng nnqsvtrqkm qqleqmltal dqmrrsivse lagllsamey vqktl +tdeel 241 adwkrrqqia ciggppnicl drlenwitsl aesqlqtrqq ikkleelqqk vsykg +dpivq 301 hrpmleeriv elfrnlmksa fvverqpcmp mhpdrplvik tgvqfttkvr llvkf +pelny 361 qlkikvcidk dsgdvaalrg srkfnilgtn tkvmnmeesn ngslsaefkh ltlre +qrcgn 421 ggrancdasl ivteelhlit fetevyhqgl kidlethslp vvvisnicqm pnawa +silwy 481 nmltnnpknv nfftkppigt wdqvaevlsw qfssttkrgl sieqlttlae kllgp +gvnys 541 gcqitwakfc kenmagkgfs fwvwldniid lvkkyilalw negyimgfis kerer +ailst 601 kppgtfllrf sesskeggvt ftwvekdisg ktqiqsvepy tkqqlnnmsf aeiim +gykim 661 datnilvspl vylypdipke eafgkycrpe sqehpeadpg saapylktkf icvtp +ttcsn 721 tidlpmsprt ldslmqfgnn gegaepsagg qfesltfdme ltsecatspm // LOCUS NP_003141 769 aa linear PRI 06 +-FEB-2011 DEFINITION signal transducer and activator of transcription 3 isoform + 2 [Homo sapiens]. ACCESSION NP_003141 NP_444275 VERSION NP_003141.2 GI:21618338 DBSOURCE REFSEQ: accession NM_003150.3 KEYWORDS . SOURCE Homo sapiens (human) ORGANISM Homo sapiens Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Eutele +ostomi; Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhin +i; Catarrhini; Hominidae; Homo. 
REFERENCE 1 (residues 1 to 769) AUTHORS Davidson,S.I., Liu,Y., Danoy,P.A., Wu,X., Thomas,G.P., Jia +ng,L., Sun,L., Wang,N., Han,J., Han,H., Visscher,P.M., Brown,M.A. + and Xu,H. CONSRTM Australo-Anglo-American Spondyloarthritis Consortium TITLE Association of STAT3 and TNFRSF1A with ankylosing spondyli +tis in Han Chinese JOURNAL Ann. Rheum. Dis. 70 (2), 289-292 (2011) PUBMED 21068102 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) REFERENCE 2 (residues 1 to 769) AUTHORS Budarf,M.L., Goyette,P., Boucher,G., Lian,J., Graham,R.R., Claudio,J.O., Hudson,T., Gladman,D., Clarke,A.E., Pope,J.E +., Peschken,C., Smith,C.D., Hanly,J., Rich,E., Boire,G., Barr +,S.G., Zummer,M., Fortin,P.R., Wither,J. and Rioux,J.D. CONSRTM GenES Investigators TITLE A targeted association study in systemic lupus erythematos +us identifies multiple susceptibility alleles JOURNAL Genes Immun. 12 (1), 51-58 (2011) PUBMED 20962850 REMARK GeneRIF: Observational study of gene-disease association. +(HuGE Navigator) REFERENCE 3 (residues 1 to 769) AUTHORS Hosur,V. and Loring,R.H. TITLE alpha4beta2 nicotinic receptors partially mediate anti-inf +lammatory effects through Janus kinase 2-signal transducer and activ +ator of transcription 3 but not calcium or cAMP signaling JOURNAL Mol. Pharmacol. 79 (1), 167-174 (2011) PUBMED 20943775 REMARK GeneRIF: A role was determined for signal transducer and a +ctivator of transcription 3 and Janus kinase-2 transduction in alph +a4beta2 nicotinic receptor-mediated anti-inflammatory effects. REFERENCE 4 (residues 1 to 769) AUTHORS Laukens,D., Georges,M., Libioulle,C., Sandor,C., Mni,M., V +ander Cruyssen,B., Peeters,H., Elewaut,D. and De Vos,M. TITLE Evidence for significant overlap between common risk varia +nts for Crohn's disease and ankylosing spondylitis JOURNAL PLoS ONE 5 (11), E13795 (2010) PUBMED 21072187 REMARK GeneRIF: Observational study of gene-disease association. 
+(HuGE Navigator) Publication Status: Online-Only REFERENCE 5 (residues 1 to 769) AUTHORS Shukla,S., Shishodia,G., Mahata,S., Hedau,S., Pandey,A., Bhambhani,S., Batra,S., Basir,S.F., Das,B.C. and Bharti,A. +C. TITLE Aberrant expression and constitutive activation of STAT3 i +n cervical carcinogenesis: implications in high-risk human papillomavirus infection JOURNAL Mol. Cancer 9, 282 (2010) PUBMED 20977777 REMARK GeneRIF: Data show that in the presence of HPV16, STAT3 is aberrantly-expressed and constitutively-activated in cervi +cal cancer which increases as the lesion progresses thus indic +ating its potential role in progression of HPV16-mediated cervical carcinogenesis. Publication Status: Online-Only REFERENCE 6 (residues 1 to 769) AUTHORS Boulton,T.G., Zhong,Z., Wen,Z., Darnell,J.E. Jr., Stahl,N. + and Yancopoulos,G.D. TITLE STAT3 activation by cytokines utilizing gp130 and related transducers involves a secondary modification requiring an H7-sensitive kinase JOURNAL Proc. Natl. Acad. Sci. U.S.A. 92 (15), 6915-6919 (1995) PUBMED 7624343 REFERENCE 7 (residues 1 to 769) AUTHORS Lin,J.X., Migone,T.S., Tsang,M., Friedmann,M., Weatherbee, +J.A., Zhou,L., Yamauchi,A., Bloom,E.T., Mietz,J., John,S. et al. TITLE The role of shared receptor motifs and common Stat protein +s in the generation of cytokine pleiotropy and redundancy by IL-2, +IL-4, IL-7, IL-13, and IL-15 JOURNAL Immunity 2 (4), 331-339 (1995) PUBMED 7719938 REFERENCE 8 (residues 1 to 769) AUTHORS Zhang,X., Blenis,J., Li,H.C., Schindler,C. and Chen-Kiang, +S. TITLE Requirement of serine phosphorylation for formation of STAT-promoter complexes JOURNAL Science 267 (5206), 1990-1994 (1995) PUBMED 7701321 REFERENCE 9 (residues 1 to 769) AUTHORS Akira,S., Nishio,Y., Inoue,M., Wang,X.J., Wei,S., Matsusak +a,T., Yoshida,K., Sudo,T., Naruto,M. and Kishimoto,T. 
TITLE Molecular cloning of APRF, a novel IFN-stimulated gene fac +tor 3 p91-related transcription factor involved in the gp130-med +iated signaling pathway JOURNAL Cell 77 (1), 63-71 (1994) PUBMED 7512451 REFERENCE 10 (residues 1 to 769) AUTHORS DuBridge,R.B., Tang,P., Hsia,H.C., Leong,P.M., Miller,J.H. + and Calos,M.P. TITLE Analysis of mutation in human cells by using an Epstein-Ba +rr virus shuttle system JOURNAL Mol. Cell. Biol. 7 (1), 379-387 (1987) PUBMED 3031469 COMMENT REVIEWED REFSEQ: This record has been curated by NCBI staf +f. The reference sequence was derived from BI461226.1, BC000627.2 +, AK092965.1, CB216860.1, BC008044.2, CF454565.1 and AI63189 +6.1. On Jun 27, 2002 this sequence version replaced gi:4507253. Summary: The protein encoded by this gene is a member of t +he STAT protein family. In response to cytokines and growth factor +s, STAT family members are phosphorylated by the receptor associat +ed kinases, and then form homo- or heterodimers that transloc +ate to the cell nucleus where they act as transcription activator +s. This protein is activated through phosphorylation in response t +o various cytokines and growth factors including IFNs, EGF, IL5, IL6 +, HGF, LIF and BMP2. This protein mediates the expression of a va +riety of genes in response to cell stimuli, and thus plays a key ro +le in many cellular processes such as cell growth and apoptosis. + The small GTPase Rac1 has been shown to bind and regulate the +activity of this protein. PIAS3 protein is a specific inhibitor of +this protein. Three alternatively spliced transcript variants e +ncoding distinct isoforms have been described. [provided by RefSeq +]. Transcript Variant: This variant (2) lacks a segment in th +e 5' UTR and 3 nt within the CDS, as compared to variant 1. The res +ulting isoform (2) lacks an amino acid compared to isoform 1. Publication Note: This RefSeq record includes a subset of + the publications that are available for this gene. 
Please see +the Entrez Gene record to access additional publications. FEATURES Location/Qualifiers source 1..769 /organism="Homo sapiens" /db_xref="taxon:9606" /chromosome="17" /map="17q21.31" Protein 1..769 /product="signal transducer and activator of tran +scription 3 isoform 2" /note="acute-phase response factor; DNA-binding p +rotein APRF" /calculated_mol_wt=87850 Region 150..162 /region_name="Essential for nuclear import" /experiment="experimental evidence, no additional + details recorded" /note="propagated from UniProtKB/Swiss-Prot (P407 +63.2)" Site 539 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphotyrosine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 691 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphoserine; propagated from UniProtKB/S +wiss-Prot (P40763.2)" Site 704 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphotyrosine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 713 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphothreonine; propagated from UniProtKB/Swiss-Prot (P40763.2)" Site 726 /site_type="phosphorylation" /experiment="experimental evidence, no additional + details recorded" /note="Phosphoserine, by NLK; propagated from UniProtKB/Swiss-Prot (P40763.2)" CDS 1..769 /gene="STAT3" /gene_synonym="APRF; FLJ20882; HIES; MGC16063" /coded_by="NM_003150.3:219..2528" /note="isoform 2 is encoded by transcript variant + 2" /db_xref="CCDS:CCDS32657.1" /db_xref="GeneID:6774" /db_xref="HGNC:11364" /db_xref="HPRD:00026" /db_xref="MIM:102582" ORIGIN 1 maqwnqlqql dtryleqlhq lysdsfpmel rqflapwies qdwayaaske shatl +vfhnl 61 lgeidqqysr flqesnvlyq hnlrrikqfl qsrylekpme iarivarclw eesrl +lqtaa 121 taaqqggqan hptaavvtek qqmleqhlqd vrkrvqdleq kmkvvenlqd dfdfn +yktlk 181 sqgdmqdlng nnqsvtrqkm 
qqleqmltal dqmrrsivse lagllsamey vqktl +tdeel 241 adwkrrqqia ciggppnicl drlenwitsl aesqlqtrqq ikkleelqqk vsykg +dpivq 301 hrpmleeriv elfrnlmksa fvverqpcmp mhpdrplvik tgvqfttkvr llvkf +pelny 361 qlkikvcidk dsgdvaalrg srkfnilgtn tkvmnmeesn ngslsaefkh ltlre +qrcgn 421 ggrancdasl ivteelhlit fetevyhqgl kidlethslp vvvisnicqm pnawa +silwy 481 nmltnnpknv nfftkppigt wdqvaevlsw qfssttkrgl sieqlttlae kllgp +gvnys 541 gcqitwakfc kenmagkgfs fwvwldniid lvkkyilalw negyimgfis kerer +ailst 601 kppgtfllrf sesskeggvt ftwvekdisg ktqiqsvepy tkqqlnnmsf aeiim +gykim 661 datnilvspl vylypdipke eafgkycrpe sqehpeadpg aapylktkfi cvtpt +tcsnt 721 idlpmsprtl dslmqfgnng egaepsaggq fesltfdmel tsecatspm //

      The desired contents of my array would be the individual records which are
      separated by LOCAL and //. Thanks for informing me to improve my posting.
      Lom Space
Re: it should be simple enough...
by eyepopslikeamosquito (Bishop) on Feb 20, 2011 at 02:07 UTC

    You'll need to show us some example lines/records from your $genpept file to clarify what you are trying to achieve. Your code looks wrong on a number of levels. For example, you're opening the file but not actually reading from it (or even checking that the open succeeded). Also, the quoting around $genpept in open(my $in,"$genpept") is pointless.

    If you want to slurp the whole file into the $in variable, you need something like:

    open(my $fhin, '<', $genpept) or die "open '$genpept': $!";
    undef $/;
    my $in = <$fhin>;

    Alternatively, if the file consists of multiple lines, it seems better to not slurp the whole file, but to read it line by line:

    while (my $in = <$fhin>) {
        # ... split etc. done once for each line $in.
    }
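To make the two approaches concrete, here is a small sketch that reads the same file both ways. The sub names slurp_file and count_lines are hypothetical, and the temporary file merely stands in for the real $genpept path.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the entire contents of a file as one string.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "open '$path': $!";
    local $/;    # $/ is undefined only until this sub returns
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Count lines while holding only one line in memory at a time.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "open '$path': $!";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    close $fh;
    return $count;
}

# Hypothetical three-line sample file.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "LOCUS one\nORIGIN\n//\n";
close $tmp_fh;

printf "slurped %d characters\n", length slurp_file($tmp_path);
printf "counted %d lines\n", count_lines($tmp_path);
```

Using local $/ rather than undef $/ keeps the change from leaking into later reads elsewhere in the program.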

Re: it should be simple enough...
by ELISHEVA (Prior) on Feb 20, 2011 at 10:31 UTC

    I suspect you are confused because your final print statement emits something like $VAR1 = 'GLOB(0x817f880)'. As for why:

    • You never set the global record separator to "//". You merely undefined it, which puts Perl in slurp mode. To set the record separator to "//", use local $/='//'. That will break your input stream into records ending with '//'.
    • You never read from your file. Your split statement is attempting to split the string representation of the input handle, i.e. 'GLOB(...)', not the contents of your input file. To read a file into an array you need to use <$in>. To remove the '//' from the end of each record you need to use chomp.

    Your code should look something like this if you want to read in everything in one gulp:

    local $/='//';        # define record separator
    my @records = <$in>;  # read in all records
    chomp @records;       # get rid of trailing // from each record

    Or this, which reads in one record at a time and is much more memory efficient:

    local $/='//';
    while (my $record = <$in>) {
        chomp $record;  # remove // from end of record
        # ... process the record ...
    }
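A self-contained sketch of the record-at-a-time loop follows. It assumes the terminator sits on its own line, so it uses "//\n" as the separator instead of '//'; that keeps the newline after the terminator from starting the next record and avoids splitting on a '//' that occurs inside a line. The sub name read_records and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Read records terminated by a "//" line from an open filehandle.
sub read_records {
    my ($fh) = @_;
    local $/ = "//\n";    # record separator, restored when the sub returns
    my @records;
    while (my $record = <$fh>) {
        chomp $record;                  # removes the trailing "//\n"
        next unless $record =~ /\S/;    # skip whitespace-only leftovers
        push @records, $record;
    }
    return @records;
}

# Hypothetical sample data, read through an in-memory filehandle.
my $data = "LOCUS one\nORIGIN\n//\nLOCUS two\nORIGIN\n//\n";
open(my $in, '<', \$data) or die "open in-memory data: $!";
my @records = read_records($in);
printf "%d records read\n", scalar @records;
```

chomp removes whatever $/ currently holds, which is why it must run while the custom separator is still in effect.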
Re: it should be simple enough...
by Marshall (Canon) on Feb 20, 2011 at 02:35 UTC
    I also suggest that you read about the split() function. With no pattern, split uses ' ', which behaves like /\s+/ except that leading whitespace is skipped. Basically that means that when a sequence of one or more "whitespace" characters is found, it is thrown away and the token up to that point is returned. Your split is highly unlikely to return anything of interest.
    my @genpepts = split(/^\w{5}/,$in);
    This is probably not going to do what you want.

    You seem to be describing a file like:

    LOCAL
    asrdf asfd afd asdf aafd qwer qwer qre qwre qrew
    //
    LOCAL
    asrdf asfd afd asdf aafd qwer qwer qre qwre qrew
    //
    I doubt that this is what you actually have.

    Use split() when you know what to throw away.
    Use match global when you know what you want to keep.
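The rule of thumb above can be sketched as follows. The sub names are hypothetical, and the ACCESSION pattern assumes the keyword starts a line and is followed by whitespace and the identifier, as in the sample records posted in this thread.

```perl
use strict;
use warnings;

# split: describe the separator you want to throw away.
sub fields_by_whitespace {
    my ($line) = @_;
    return split ' ', $line;    # ' ' also skips leading whitespace
}

# match global: describe the pieces you want to keep.
sub first_accessions {
    my ($text) = @_;
    # /m lets ^ match at the start of every line; /g returns every capture.
    return $text =~ /^ACCESSION\s+(\S+)/mg;
}

my @fields = fields_by_whitespace("  LOCUS   NP_644805   770 aa ");
# @fields is ('LOCUS', 'NP_644805', '770', 'aa')

my @with_regex = split /\s+/, "  LOCUS   NP_644805";
# @with_regex is ('', 'LOCUS', 'NP_644805'): the leading empty field is kept

my $text = "ACCESSION   NP_644805\nVERSION     NP_644805.1\n//\n"
         . "ACCESSION   NP_003141 NP_444275\n//\n";
my @ids = first_accessions($text);
# @ids is ('NP_644805', 'NP_003141')
print "@ids\n";
```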

Re: it should be simple enough...
by AnomalousMonk (Bishop) on Feb 20, 2011 at 17:04 UTC

    lomSpace: Many thanks for posting an exact copy of your huge example data three separate times. The monastery had some server space it was desperate to kill. Thanks also for making sure that the string 'LOCAL' for which you say you are interested in processing the data never once appears anywhere in said data.
