What is the best way to pattern match an unknown pattern? Allow me to explain. I have a file that contains a series of data values (microarray probe sets, to be specific) that I need to sort through. Technically, there should be 11 "probes" for each target (ex. 154115_at=target name), but that is not always the case.

Since the probes for a target have something in common (the target name), I need the program to take the target name from the first line and compare it to successive lines until one doesn't match. When that occurs, the target name from the mismatched line needs to become the new pattern to compare against. (The matching data needs to be further parsed and put on one tab-delimited line, but I know how to do that.)

I'm familiar with pattern matching. However, I don't know how to designate an "unknown" pattern in Perl, since I can't go and write 22,000-some-odd patterns :-). A sample input file:
>probe:MOE430A:1415670_at(target name):549:177; Interrogation_Position
>probe:MOE430A:1415670_at:549:177; Interrogation_Position=2513; Antise
>probe:MOE430A:1415670_at:467:433; Interrogation_Position=2521; Antise
>probe:MOE430A:1415670_at:254:643; Interrogation_Position=2533; Antise
>probe:MOE430A:1415670_at:54:269; Interrogation_Position=2556; Antisen
>probe:MOE430A:1415670_at:405:339; Interrogation_Position=2583; Antise
>probe:MOE430A:1415670_at:60:395; Interrogation_Position=2597; Antisen
>probe:MOE430A:1415670_at:284:165; Interrogation_Position=2619; Antise
>probe:MOE430A:1415670_at:622:145; Interrogation_Position=2634; Antise
>probe:MOE430A:1415670_at:291:661; Interrogation_Position=2804; Antise
>probe:MOE430A:1415670_at:146:701; Interrogation_Position=2956; Antise
>probe:MOE430A:1415671_at:116:525; Interrogation_Position=1156; Antise
>probe:MOE430A:1415671_at:655:137; Interrogation_Position=1173; Antise
>probe:MOE430A:1415671_at:398:139; Interrogation_Position=1232; Antise
Any help is most appreciated!
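
One possible approach, offered as a sketch rather than a definitive answer: instead of writing a pattern per target, capture the target name from each line with a single regex and compare the captured string to the one from the previous line. The sub name `group_by_target` is hypothetical, and the code assumes the target name is always the third colon-separated field and that the "(target name)" annotation on the first sample line is not present in the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: groups consecutive probe lines by target name.
# Assumption: the target name is the third colon-separated field,
# e.g. "1415670_at" in ">probe:MOE430A:1415670_at:549:177; ...".
sub group_by_target {
    my @lines = @_;
    my @groups;          # each element: [ $target, [ member lines ] ]
    my $current = '';    # target name of the group being built

    for my $line (@lines) {
        chomp $line;

        # Capture the target name; skip lines that do not look like probes.
        my ($target) = $line =~ /^>probe:[^:]+:([^:]+):/
            or next;

        # A different target name starts a new group and becomes
        # the value that later lines are compared against.
        if (!@groups || $target ne $current) {
            push @groups, [ $target, [] ];
            $current = $target;
        }
        push @{ $groups[-1][1] }, $line;
    }
    return @groups;
}

# Usage with a few lines shaped like the sample input.
my @sample = (
    ">probe:MOE430A:1415670_at:549:177; Interrogation_Position=2513; Antise\n",
    ">probe:MOE430A:1415670_at:467:433; Interrogation_Position=2521; Antise\n",
    ">probe:MOE430A:1415671_at:116:525; Interrogation_Position=1156; Antise\n",
);

for my $group (group_by_target(@sample)) {
    my ($target, $members) = @$group;
    # Print the target name and how many probes were found for it.
    print join("\t", $target, scalar @$members), "\n";
}
```

Because the comparison is a plain string `ne` on the captured value, no per-target pattern is needed, and groups with fewer than 11 probes fall out naturally. The further parsing into one tab-delimited line would go where the member lines are pushed.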