PerlMonks  

Minimal password check, again

by bronto (Priest)
on Jul 27, 2003 at 16:29 UTC ( [id://278236]=perlquestion )

bronto has asked for the wisdom of the Perl Monks concerning the following question:

Ciao Monks, it's nice to be back again in the monastery!

I have a question about basic password checking; I already took a look at Data::Password and, in the monastery, at this and this other node (I also tried SuperSearch to check the FAQ and Categorized Questions/Answers, but I got too much garbage and gave up).

What I need is to do just a minimal check, in particular:

  • password is at least 5 characters long;
  • password contains alphabetic characters and digits
  • password isn't a repetition of a few characters or patterns (e.g.: "pippo12": too many p's; "cacca11": only c's and a's, and widely repeated; "32ratatata": the pattern "ta" appears 3 times and covers 60% of the password...)

While the first two points are really easy to code:

sub validate_password {
  my ($field,$pass) = @_ ;
  my $minlen = 5 ;

  return "Field $field: Password too short (min. $minlen chars)"
    if length($pass) < $minlen ;

  return "Field $field: Please use alphabetics and digits (at least)"
    unless $pass =~ /[a-z]/i and $pass =~ /\d/ ;
  ...

I'm not sure how to manage the third point in a clear and efficient manner, where clear means that the code is readable enough for me to understand when I get back to it after 3 months :-) and efficient means that it doesn't require years to run.

I gave up on Data::Password because it was too general and UNIX oriented (I'd like my program to be portable). Similarly, Authen::PAM and Crypt::Cracklib don't suit my needs.

Any clues?

Ciao!
--bronto


The very nature of Perl to be like natural language--inconsistent and full of dwim and special cases--makes it impossible to know it all without simply memorizing the documentation (which is not complete or totally correct anyway).
--John M. Dlugosz

Replies are listed 'Best First'.
Re: Minimal password check, again
by Abigail-II (Bishop) on Jul 27, 2003 at 20:16 UTC
    I don't see what's so bad about the password 32ratatata. Sure, it repeats the sequence ta, but the first five characters are 32rat. That's at least five characters (check 1), contains alphabetic characters and digits (check 2), and doesn't have any repetitions (check 3).

    And any password policy that can turn a rejected password into an accepted password purely by deleting characters from the end is IMO flawed.

    Abigail

      I don't see what's so bad about the password 32ratatata.

      That's because you're not looking at the type of cracking techniques that are commonly used (sadly the info's mostly only available through commercial monitoring services, but you can make fairly educated guesses based on what you would try). Adding 2 numbers to the beginning or end of a short (particularly lowercase) string is an extremely common style of password creation. Looking at a 2n3a (2 numbers, 3 lowercase alpha) password we get 10 * 10 * 26 * 26 * 26 or (1 757 600) possible solutions. This sounds like a lot, but it isn't. If someone got access to the password file and tried that structure, it would be trivial to crack.

      Now consider a password that, instead of the 2 numbers, uses 2 extra letters that can be either uppercase or lowercase: (52 * 52 * 52 * 52 * 52) (~380 million). Still not great, but a lot better. Now consider making it alphabetic (upper and lower case), numeric, and throwing in some punctuation. (Problems memorizing passwords can be reduced with a minimal cost to security by generating them in memorable formats such as Til8iB3@pm. It also greatly helps to implement a system where the user has to enter the password at progressively longer intervals (5 min, 1 hour, 6 hours...) after receiving it.)
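The keyspace arithmetic above can be checked with a few lines of Perl. This is only a sketch: `keyspace` is a hypothetical helper name, not something from the thread, and it simply multiplies the number of possible characters at each position.

```perl
use strict;
use warnings;

# Hypothetical helper: the keyspace is the product of the
# alphabet size available at each position of the password.
sub keyspace {
    my @alphabet_sizes = @_;
    my $total = 1;
    $total *= $_ for @alphabet_sizes;
    return $total;
}

# 2 digits followed by 3 lowercase letters (the "2n3a" shape)
print "2n3a:         ", keyspace(10, 10, 26, 26, 26), "\n";

# 5 letters, each either uppercase or lowercase
print "5 mixed-case: ", keyspace((52) x 5), "\n";
```

The first figure is 1 757 600 and the second is 380 204 032, matching the numbers quoted in the post.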

      Now I babbled on and forgot what I was talking about in the first place! Sorry for taking up so much space, hope someone finds it moderately useful! ;-)

      Bet that I'll vote for Abigail-II's comment when I get some points again tomorrow.
Re: Minimal password check, again
by saintbrie (Scribe) on Jul 27, 2003 at 16:54 UTC

    Repetition of a few characters shouldn't be too tough, you can split on a null string.

    my @letters = split //, $pass;   # get the letters into an array
    my %lc = ();                     # initialize a hash for use in a count.
    foreach my $letter (@letters) {
        $lc{$letter}++;              # count the occurrence of each letter
    }
    my @keys = keys %lc;             # get the individual characters used.
    if (($#keys+1)/length($pass) < .6) {
        # if there are less than 60% unique characters
        print "not enough different letters\n";
    }
    my $ok = 1;                      # just a flag
    foreach my $key (@keys) {
        # check the frequency of individual letters.
        # if a character occurs more than 20% of the time, it is
        # no good.
        # an alternate way to do this would be to check the actual
        # occurrences of each character:
        # $ok = 0 if ($lc{$key} >= 2);
        $ok = 0 if ($lc{$key}/length($pass) > .2);
    }
    print "You have to have more different letters.\n" unless $ok;

    The patterns part is a bit tougher, a monk higher up the food chain may want to tackle that, but I'd suggest with low enough ratios for the different letters and individual letters, you will probably trap those problems.
Re: Minimal password check, again
by ihb (Deacon) on Jul 27, 2003 at 17:15 UTC

    Regarding your repetition problem, perhaps this can be useful to you.

    sub check_reps {
        my ($pwd) = @_;

        while ($pwd =~ /(?=((.+)\2+))/g) {
            printf "%s repeated %d times, covers %d%%\n",
                $2,
                length($1) / length($2),
                100 * length($1) / length($pwd)
            ;
        }
    }

    check_reps('xattattttatty');

    __END__
    att repeated 2 times, covers 46%
    ttatt repeated 2 times, covers 76%
    tt repeated 2 times, covers 30%
    t repeated 3 times, covers 23%
    t repeated 2 times, covers 15%
    t repeated 2 times, covers 15%

    Without a better definition of the problem it's hard to give a better answer.
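One possible way to turn the regex above into a yes/no check is sketched below. It assumes the rule is "reject when one consecutive repetition covers too much of the password"; the function name `max_repeat_coverage` and any threshold you compare it against are hypothetical, not part of the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: largest fraction of the password covered by
# a single run of a consecutively repeated pattern (0 if none).
sub max_repeat_coverage {
    my ($pwd) = @_;
    return 0 unless length $pwd;

    my $max = 0;
    # Same lookahead as check_reps: $2 is the pattern, $1 the whole run.
    while ($pwd =~ /(?=((.+)\2+))/g) {
        my $coverage = length($1) / length($pwd);
        $max = $coverage if $coverage > $max;
    }
    return $max;
}

for my $candidate (qw(32ratatata 32rat)) {
    printf "%-10s %.0f%%\n", $candidate, 100 * max_repeat_coverage($candidate);
}
```

A caller could then reject a password when the returned fraction exceeds whatever limit the policy picks. Note that, as pointed out elsewhere in the thread, such a rule accepts 32rat while rejecting 32ratatata.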

    Update: Added an example.

    Hope this'll help,
    ihb
Re: Minimal password check, again
by hossman (Prior) on Jul 28, 2003 at 02:48 UTC

    A lot of comments have been made regarding the usefulness of finding repeating sequences. Without making any comments for or against, I'd just like to point out that, as several recent nodes have shown, a very simple way to find commonalities between strings (or even within a single string) is to use compression, and analyze the size ratio between the compressed/raw versions.

    for example...

    laptop:~> perl -MCompress::LZW -lne 'print " -> ", compress($_), " / $_ == ", length(compress($_))/length($_);'
    a
     -> a / a == 2
    aaa
     -> a / aaa == 1.33333333333333
    aaaaaaaaaaaaaa
     -> a / aaaaaaaaaaaaaa == 0.714285714285714
    qwre1qwre2qwer3
     -> qwre12er3 / qwre1qwre2qwer3 == 1.6
    qwertgfdsazxcv
     -> qwertgfdsazxcv / qwertgfdsazxcv == 2
    ^D
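The same idea can be sketched as a small function. Note the swap: this uses Compress::Zlib (bundled with modern Perl) rather than Compress::LZW, so the ratios will not match the numbers above; zlib adds a fixed header and checksum, which means short strings usually come out larger than the input. `compression_ratio` is a hypothetical name, and only the relative ordering of the ratios is meaningful here.

```perl
use strict;
use warnings;
use Compress::Zlib ();   # assumption: zlib swapped in for Compress::LZW

# Hypothetical helper: compressed size divided by raw size.
# Lower means more internal repetition.
sub compression_ratio {
    my ($text) = @_;
    return length(Compress::Zlib::compress($text)) / length($text);
}

my $repetitive = 'a' x 40;
my $varied     = join '', 'a' .. 'z', 'A' .. 'N';   # 40 distinct characters

printf "%-12s %.2f\n", 'repetitive', compression_ratio($repetitive);
printf "%-12s %.2f\n", 'varied',     compression_ratio($varied);
```

For real passwords of 5 to 10 characters the fixed overhead dominates, so any cut-off would have to be calibrated against sample passwords of similar length.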

      <virtual applaud>never would have thought of that one...</virtual applaud>

      ----
      Zak
      Pluralitas non est ponenda sine necessitate - mysql's philosophy
Re: Minimal password check, again
by Limbic~Region (Chancellor) on Jul 27, 2003 at 17:12 UTC
    bronto,
    I am not sure why eliminating too many repeated characters is a requirement, but here is a node that was a contest to find the most efficient way to get the number of unique characters in a string. From a brute force perspective, it is no easier to crack the password abcde than it is ppppp.

    As far as identifying repeating sequence of characters, I do not know of an efficient method to do this, but:

  • Use substr with various lengths and starting positions
  • Use my $count = $temp_pass =~ s/$substring//g

    To get the count of how often the sequence repeats in the password. Again, I do not understand the point of this.
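A rough sketch of the substr approach described above follows. It is brute force and makes a few assumptions not in the post: the helper name `substring_counts`, a minimum substring length of 2, and counting with a list-context match instead of `s///g` so the password does not need to be copied and destroyed.

```perl
use strict;
use warnings;

# Hypothetical helper: for every substring of length $min_len up to
# half the password, count its non-overlapping occurrences and keep
# those that occur at least twice.
sub substring_counts {
    my ($pass, $min_len) = @_;
    $min_len = 2 unless defined $min_len;

    my %count;
    my $len = length $pass;
    for my $size ($min_len .. int($len / 2)) {
        for my $start (0 .. $len - $size) {
            my $sub = substr($pass, $start, $size);
            next if exists $count{$sub};        # already counted
            # \Q..\E stops punctuation in the password acting as regex syntax
            my $n = () = $pass =~ /\Q$sub\E/g;
            $count{$sub} = $n;
        }
    }
    return map { $_ => $count{$_} } grep { $count{$_} >= 2 } keys %count;
}

my %reps = substring_counts('32ratatata');
printf "%-5s %d\n", $_, $reps{$_} for sort keys %reps;
```

This does quadratic work in the password length, which is harmless for passwords but would not scale to long texts.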

    I hope this helps - L~R

Re: Minimal password check, again
by waswas-fng (Curate) on Jul 27, 2003 at 17:11 UTC
    You could expand the test for alphanumerics: split the string into chars, insert them into a hash and count the keys. If the number of keys is less than your acceptable number of unique chars, reject the password.

    -Waswas

    Edited: I have to agree with L~R about the repeats not affecting the strength of the password; really the only things that affect that are:
  • the types of characters allowed in the password
  • the minimum length
  • the maximum length
  • allowing users to use dictionary words in their password
Re: Minimal password check, again
by eric256 (Parson) on Jul 27, 2003 at 16:59 UTC

    I'm not sure how to manage the patterns but you can do some simple comparisons to catch the too many p's and only c's and a's.

    Basically, just do a kind of statistical analysis. Count all the different letters and store the count for each. Total them and then make sure that no one char is above your threshold for percentage of the whole. Then count the total number of different chars and make sure that is above your threshold. That should in theory eliminate the simple patterns to start with. You could also then require that no two (or three) letters have the same percentage of the whole (since that would be the case with repeating patterns).

    I could include some code in that if you need it.
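As an illustration of the analysis described above, here is a minimal sketch. The helper name `char_stats` and the two thresholds are hypothetical and chosen only for the demo; the right cut-offs depend on the policy.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the share of distinct characters and
# the share taken by the single most frequent character.
sub char_stats {
    my ($pass) = @_;
    my $len = length $pass;
    return (unique_ratio => 0, max_share => 0) unless $len;

    my %freq;
    $freq{$_}++ for split //, $pass;

    my ($top) = sort { $b <=> $a } values %freq;
    return (
        unique_ratio => scalar(keys %freq) / $len,
        max_share    => $top / $len,
    );
}

# Arbitrary demo thresholds (assumptions, not from the thread)
my ($min_unique, $max_share) = (0.6, 0.4);

for my $pass (qw(pippo12 cacca11 bronto27)) {
    my %s = char_stats($pass);
    my $verdict = ($s{unique_ratio} >= $min_unique && $s{max_share} <= $max_share)
                ? 'ok' : 'too repetitive';
    printf "%-9s unique %.2f, top char %.2f: %s\n",
        $pass, $s{unique_ratio}, $s{max_share}, $verdict;
}
```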

    ___________
    Eric Hodges
Re: Minimal password check, again
by SyN/AcK (Scribe) on Jul 28, 2003 at 01:55 UTC

    I'm going to agree with what a few others have said and suggest that you not worry about repeating patterns. Even if a password were ztztztzt, it's no more likely to be cracked than any other string. What you really have to worry about is passwords that would be in a common dictionary file.

    Another thing to consider is every word in a certain dictionary file, plus either appending any two characters, or prepending any two characters. The addition of two characters is not a far stretch for a password cracker to get.

    A good idea might be to get a really good dictionary file, then search the password for any substrings that are one of those dictionary words. Then perhaps you could find out how many characters in the password are not said substring, and have some constant number that you check this against.
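The dictionary idea above might look something like this. It is a sketch under stated assumptions: the three-word list is a stand-in for a real dictionary file, `longest_dictionary_hit` is a hypothetical name, and matching is case-insensitive by choice.

```perl
use strict;
use warnings;

# Hypothetical helper: longest word from @words that appears
# anywhere inside the password ('' if none does).
sub longest_dictionary_hit {
    my ($pass, @words) = @_;
    my $lower = lc $pass;
    my $best  = '';
    for my $word (@words) {
        my $w = lc $word;
        next if length($w) <= length($best);   # cannot beat current best
        $best = $w if index($lower, $w) >= 0;
    }
    return $best;
}

my @words = qw(rat password monk);   # stand-in for a real word list
my $pass  = '32ratatata';

my $hit   = longest_dictionary_hit($pass, @words);
my $extra = length($pass) - length($hit);
print "longest word: '$hit', characters outside it: $extra\n";
```

The count of characters outside the word is what would be compared against the "constant number" mentioned above. With a large dictionary, a linear scan per password is slow; that cost is ignored here.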

    If you need a strong password policy, I suggest forcing users to use at least ten characters, and suggesting that they choose a line from a favorite song, then grab the first or second letter of each word to make up their password. This makes it secure and easy to remember. Easy to remember is important, because if your users are writing their passwords down everywhere, they're not exactly secure.
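The "letter of each word" trick is easy to sketch. The helper name `initials` and the sample phrase are made up for illustration; a real user would pick their own line.

```perl
use strict;
use warnings;

# Hypothetical helper: take the character at position $pos
# (0 = first letter, 1 = second letter) from each word of a line.
# Words too short to have that position are skipped.
sub initials {
    my ($line, $pos) = @_;
    $pos = 0 unless defined $pos;
    return join '',
        map  { substr($_, $pos, 1) }
        grep { length($_) > $pos }
        split ' ', $line;
}

my $line = 'the quick brown fox jumps over the lazy dog tonight';
print initials($line),    "\n";   # first letters
print initials($line, 1), "\n";   # second letters
```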

Re: Minimal password check, again
by matsmats (Monk) on Jul 28, 2003 at 21:55 UTC

    It seems like Text::Ngrams is what you are looking for. n-grams are items of size n in a piece of data, and Text::Ngrams locates and counts all possible combinations.

    A word of warning: the module only seems to have documented functions that return formatted data. You can access the internal data structure directly, though, but you have to look around in the source a little. (Basically it's my %grams = %{$ng->{table}[2]}; to get the 2-grams, where $ng is the ngrams object.)
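For a string as short as a password, character n-grams can also be counted by hand without the module. The sketch below deliberately does not use Text::Ngrams or its internals; `ngram_counts` is a hypothetical helper that slides a window of size n over the string.

```perl
use strict;
use warnings;

# Hypothetical helper: count every character n-gram in $text
# by sliding a window of width $n across it.
sub ngram_counts {
    my ($text, $n) = @_;
    my %count;
    # The range is empty when the text is shorter than $n.
    $count{ substr($text, $_, $n) }++ for 0 .. length($text) - $n;
    return %count;
}

my %bigrams = ngram_counts('32ratatata', 2);
for my $gram (sort { $bigrams{$b} <=> $bigrams{$a} || $a cmp $b } keys %bigrams) {
    print "$gram: $bigrams{$gram}\n";
}
```

Unlike the non-overlapping counts a regex match gives, these windows overlap, so both "at" and "ta" are counted three times in 32ratatata.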

    Hope this is of any help,
    Mats

Re: Minimal password check, again
by EdwardG (Vicar) on Jul 28, 2003 at 15:55 UTC
    I don't know about you, but the usual reason I get stuck when coding is because I didn't think hard enough about what I wanted to achieve in the first place.

Node Type: perlquestion [id://278236]
Approved by fglock
Front-paged by broquaint