PerlMonks  

identifying arbitrary patterns in multiple strings

by westernflame (Sexton)
on Aug 18, 2005 at 16:14 UTC ( #484847=perlquestion )
westernflame has asked for the wisdom of the Perl Monks concerning the following question:

I have a list of strings with an element that I want to extract. However, that element is often preceded by arbitrary text that occurs in each entry. For example:

greater than 32
greater than 26

I would like to be able to cycle through a list and identify recurring patterns. How should I do this in Perl?


Update: Sorry for not being clear enough. The data that I want to process has some parts that will be on each line (such as greater than) and some that are not (such as a number). The point is that the string that repeats itself is not known. What I want to identify is parts of each string that are present in every string.

Re: identifying arbitrary patterns in multiple strings
by JediWizard (Deacon) on Aug 18, 2005 at 16:27 UTC

    Which element do you want to extract? Those strings look about the same to me (only the numbers changed)... what kind of arbitrary text will appear before the pattern (sometimes)?

    You tell me you want to match a pattern... what pattern? Perhaps you should see perlre.


    They say that time changes things, but you actually have to change them yourself.

    —Andy Warhol

Re: identifying arbitrary patterns in multiple strings
by davidrw (Prior) on Aug 18, 2005 at 16:46 UTC
    Ditto on JediWizard's reply. In addition, though, I will take a guess that, going by the two sample lines, you're looking for an integer element and want to track what comes before it. Simply match both parts and then store them accordingly. The code below shows how to keep track of just the counts, or alternatively how to keep the elements themselves too, in which case the count is just the size of the array ref.
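    my %cts;
    # $s holds one input string, e.g. "greater than 32"
    if ( $s =~ /^(.*?) (\d+)/ ) {
        my ($text, $element) = ($1, $2);
        $cts{$text}++;    # count how often each leading text occurs
        # OR, to keep the elements themselves:
        # push @{ $cts{$text} }, $element;
    }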
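    A minimal sketch of how that match could be run over a whole list, assuming the strings are already in memory and that the element of interest is the first run of digits following a space. The sub name group_by_prefix and the %elements hash are made up for illustration; they do not come from the post above.

```perl
use strict;
use warnings;

# Hypothetical helper: group the numeric elements by the text preceding them.
sub group_by_prefix {
    my @strings = @_;
    my %elements;    # leading text => array ref of the numbers seen after it
    for my $s (@strings) {
        # non-greedy leading text, one space, then the first run of digits
        if ( $s =~ /^(.*?) (\d+)/ ) {
            push @{ $elements{$1} }, $2;
        }
    }
    return %elements;
}

my %elements = group_by_prefix( 'greater than 32', 'greater than 26' );
for my $text ( sort keys %elements ) {
    my @nums = @{ $elements{$text} };
    printf "%s: %d occurrence(s), elements %s\n",
        $text, scalar @nums, join( ', ', @nums );
}
```

    Strings without a space followed by digits are simply skipped, so this only helps when the varying part really is a number.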
Re: identifying arbitrary patterns in multiple strings
by si_lence (Deacon) on Aug 18, 2005 at 19:07 UTC
    Maybe not the most elegant solution (and I bet rather slow as well),
    but it finds the longest string that is at the start of every line in the
    input.
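    use warnings;
    use strict;

    my %ch;           # prefix => number of lines that start with it
    my $lines = 0;    # number of input lines read

    while (<DATA>) {
        chomp;
        my $part = "";
        # build every prefix of the line, one character at a time
        for my $char ( split // ) {
            $part .= $char;
            $ch{$part}++;
        }
        $lines++;
    }

    # a prefix counted once per line starts every line; keep the longest
    my $max = "";
    foreach ( keys %ch ) {
        if ( $ch{$_} == $lines and length($_) > length($max) ) {
            $max = $_;
        }
    }
    print "$max\n";

    __DATA__
    abcdefg
    abcd
    abcd grtf
    abcd abd
    abcd daf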


    si_lence
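
The update in the question asks for parts that are present in every string, not only at the start. A rough sketch of one way to do that, under the assumption that a single longest shared substring is what is wanted: try the substrings of the shortest string, longest first, and return the first one that every string contains. The sub name longest_common_substring is hypothetical, and this is brute force, so it will be slow on long strings.

```perl
use strict;
use warnings;

# Hypothetical helper: longest substring that occurs in every input string.
# Brute force: try the substrings of the shortest string, longest first.
sub longest_common_substring {
    my @strings = @_;
    return '' unless @strings;
    my ($shortest) = sort { length($a) <=> length($b) } @strings;
    for ( my $len = length $shortest ; $len > 0 ; $len-- ) {
        for my $start ( 0 .. length($shortest) - $len ) {
            my $candidate = substr $shortest, $start, $len;
            # keep the candidate only if no string lacks it
            return $candidate
                unless grep { index( $_, $candidate ) < 0 } @strings;
        }
    }
    return '';
}

print longest_common_substring( 'greater than 32', 'greater than 26' ), "\n";
```

For the two sample strings from the question this prints "greater than " (with the trailing space), since that is the longest run of characters the two share.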
