Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

finding strings

by duyet (Friar)
on Dec 09, 2012 at 11:51 UTC ( [id://1007973]=perlquestion: print w/replies, xml ) Need Help??

duyet has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks, Someone sent me a link to www.interviewstreet.com and i tried one of those challenges called: Find strings for fun. I have submitted a working code but after 3 tests it fails with "Out of Memory" I have tried a few attempts without real improvement, it gets a bit faster but still fails. Maybe you can have a look at it and let me know which parts can be improved. TIA
Inputs: 2 aab aac 3 3 8 24 Output: aab c INVALID #!/usr/bin/perl use strict; use warnings; $| = 1; # flush output # Get inputs my $n = 0; # number of strings chomp( $n = <STDIN> ); my $i = 0; # line counter my $w = ''; # input string my @S = (); # uniq strings from inputs while ( $i++ < $n ) { chomp ( $w = <STDIN> ); push @S, get_uniq_strings( $w ); } # Get the queries my $q = 0; chomp( $q = <STDIN> ); $i = 0; my $k = ''; my @queries = (); while ( $i++ < $q ) { chomp( $k = <STDIN> ); push @queries, $k - 1; } # Output @S = make_uniq( \@S ); my $uniq_len = scalar @S; foreach ( @queries ) { print (( $_ <= $uniq_len ) ? "$S[ $_ ]\n" : "INVALID\n" ); } exit; sub get_uniq_strings { my $string = shift; my @a = (); my $len = length $string; for ( my $i = 1; $i <= $len; $i++ ) { for ( my $offset = 0; $offset < $len - $i + 1; $offset++ ) { my $sub_str = substr $string, $offset, $i; push @a, $sub_str if $sub_str; } } @a; } sub make_uniq { my $str_array = shift; my $uniq_array = {}; foreach( @{ $str_array } ) { $uniq_array->{ $_ }++ if $_; } sort keys %{ $uniq_array }; }

Replies are listed 'Best First'.
Re: finding strings
by CountZero (Bishop) on Dec 09, 2012 at 14:06 UTC
    I think this does the trick:
    use Modern::Perl; my %uniq_substrings; my $strings = <DATA>; for ( 1 .. $strings ) { my $string = <DATA>; $uniq_substrings{$_}++ for substrings($string); } my $queries = <DATA>; for ( 1 .. $queries ) { my $query = <DATA>; say +( ( $query > scalar( keys %uniq_substrings ) ) ? 'INVALID' : ( sort keys %uniq_substrings )[ $query - 1 ] ); } sub substrings { my $string = shift; chomp($string); my @substrings; my $maxlength = length $string; for my $start ( 0 .. $maxlength ) { for my $length ( 1 .. $maxlength - $start ) { push @substrings, substr $string, $start, $length; } } return @substrings; } __DATA__ 2 aab aac 3 3 8 23
    Output:
    aab c INVALID
    Update: my code fails on test no. 4 with a time-out error. I'll see if I can speed it up.

    Update 2: When I exchange memory for speed (by using an array with the unique substrings, so I would not have to sort the keys again every time), test 4 passes, but now I get (predictably) an "out or memory" error on test 5 and beyond.

    CountZero

    A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
Re: finding strings
by davido (Cardinal) on Dec 09, 2012 at 17:43 UTC

    I don't have time to work through it at the moment, but I think you'll be more successful using a Suffix Tree. There are a couple of CPAN modules: Tree::Suffix, and SuffixTree. The latter is written in C and bound to Perl via SWIG. It's an old module, but might be useful.


    Dave

      Perhaps, but these "programming challenge" sites have a problem with using non core-modules. :(

      CountZero

      A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      My blog: Imperial Deltronics

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1007973]
Approved by Athanasius
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others pondering the Monastery: (7)
As of 2024-04-16 11:41 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found