Password Validation via Regular Expression

I have recently been working on password validation via regular expressions. After a good deal of research I came across some very good articles, so I am recapping them here.

When it comes to password validation using regular expressions, things can get a bit complicated. Normally, you want people to enter a “good” password that has a mix of numbers and letters. But you may not care where the numbers and letters appear. So you’re not looking for a “pattern” in the string. You just want a letter somewhere and a number somewhere.

In this first example, the password must be at least 8 characters long and start and end with a letter.

^[A-Za-z]\w{6,}[A-Za-z]$

The ^ looks for something at the start of the string. The brackets indicate the valid character set. So it must start with an upper or lower case letter. After that, \w matches any word character (digits 0-9, upper or lower case letters a-z, and the underscore), and {6,} says there must be at least 6 of them (with no upper limit). Then comes another set, and the $ looks for something at the end of the string. So this statement says there must be a letter, then at least 6 word characters, then a letter (making 8 the minimum number of characters).
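As a quick way to try the first expression, here is a minimal Python sketch. The function name is_valid_first and the sample passwords are illustrative assumptions, not part of the original articles. It uses re.fullmatch so that a trailing newline is not silently accepted, and re.ASCII so that \w means only a-z, A-Z, 0-9 and the underscore, as described above.

```python
import re

# Pattern from the first example: a letter, 6 or more word characters, a letter.
# re.ASCII keeps \w limited to [A-Za-z0-9_], matching the description in the text.
FIRST_EXAMPLE = re.compile(r"^[A-Za-z]\w{6,}[A-Za-z]$", re.ASCII)


def is_valid_first(password):
    """Return True when the whole password fits the first example's pattern."""
    return FIRST_EXAMPLE.fullmatch(password) is not None


if __name__ == "__main__":
    # Hypothetical sample inputs, chosen to exercise each rule.
    for candidate in ["abcdefgh", "a123456b", "1abcdefg", "abcdefg", "abc defgh"]:
        print(repr(candidate), is_valid_first(candidate))
```

Note that a password made only of letters passes this check, since the pattern constrains the first and last characters and the length but does not require a digit.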

In this second example, the password length doesn’t matter, but the password must contain at least 1 number, at least 1 lower case letter, and at least 1 upper case letter.

^\w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*$

Again, the ^ and $ anchor the match to the start and end of the string. The “\w*” combination is used at both the start and the end. \w means any alphanumeric character, and * means zero or more. You’ll see why it’s “zero or more” in a bit. Between them are groupings in parentheses. The “(?” combination, followed here by “=”, opens a lookahead group. Basically, it says “check that the following pattern matches from here, but don’t consume any of the string”. In this example, instead of specifying the order in which things should appear, the expression says that each of them must appear somewhere, in any order.

The first grouping (called an “atom” in “regular expression speak”) uses the = sign. This makes it a positive lookahead: the pattern inside must match. The other choice is ! for a negative lookahead (the string must not look like this). There are also lookbehind forms, (?<= and (?<!, which check what precedes the current position. A regular expression syntax reference covers these in further detail.

After the = sign comes “\w*\d”. Again, any alphanumeric character can occur zero or more times, followed by a digit (\d means any digit from 0 to 9). So this checks whether there is at least one digit in the string. Because the leading \w* can skip over any number of characters and the lookahead does not consume the string, that one digit can appear anywhere in the string.

The next atom (grouping) is (?=\w*[a-z]). This is similar to the digit grouping, except it looks for a lower case letter. Again, the lower case letter can appear anywhere, but there has to be at least one.

The third atom is (?=\w*[A-Z]) which looks for an upper case letter somewhere in the string.

At the end is zero or more alphanumeric characters. To match this string, the minimum characters needed is 3 (one upper case letter, one lower case letter, and one number).
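The behaviour described above can be checked with a small Python sketch. The name is_valid_second and the sample strings are assumptions made for illustration. The last lines show that a lookahead matches without consuming anything: the match ends at position 0 even though the digit sits at the end of the string.

```python
import re

# Pattern from the second example: word characters only, with at least
# one digit, one lower case letter and one upper case letter, in any order.
SECOND_EXAMPLE = re.compile(
    r"^\w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*$", re.ASCII
)


def is_valid_second(password):
    """Return True when the whole password satisfies all three lookaheads."""
    return SECOND_EXAMPLE.fullmatch(password) is not None


if __name__ == "__main__":
    for candidate in ["aB3", "3Ba", "abc123", "aB3!"]:
        print(repr(candidate), is_valid_second(candidate))

    # A lookahead is zero-width: it succeeds here but consumes no characters.
    probe = re.match(r"(?=\w*\d)", "abc1")
    print(probe.end())
```

Because the expression is built from \w, any character outside letters, digits and the underscore makes the whole match fail, which is why "aB3!" is rejected.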

In this third example:

  • Must be at least 10 characters
  • Must contain at least one lower case letter, one upper case letter, one digit and one special character
  • Valid special characters are: @#$%^&+=

^.*(?=.{10,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).*$

As you can see in the regex, the list of special characters is configurable.
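To make the “configurable” part concrete, here is a hedged Python sketch that builds the third expression from a list of special characters. The helper name build_password_pattern and its parameters are hypothetical, introduced only for this illustration. re.escape is used so that characters such as ^ or ] stay literal inside the character class.

```python
import re


def build_password_pattern(specials="@#$%^&+=", min_length=10):
    """Build the third example's regex for a given set of special characters.

    Both parameters are illustrative; the defaults mirror the article.
    """
    if not specials:
        raise ValueError("specials must contain at least one character")
    # Escape each character so it is treated literally inside [...].
    special_class = "[" + re.escape(specials) + "]"
    source = (
        r"^.*"
        + r"(?=.{" + str(min_length) + r",})"  # minimum length
        + r"(?=.*\d)"                          # at least one digit
        + r"(?=.*[a-z])"                       # at least one lower case letter
        + r"(?=.*[A-Z])"                       # at least one upper case letter
        + r"(?=.*" + special_class + r")"      # at least one special character
        + r".*$"
    )
    return re.compile(source)


if __name__ == "__main__":
    default_pattern = build_password_pattern()
    print(default_pattern.fullmatch("Passw0rd@1") is not None)
    custom_pattern = build_password_pattern(specials="!?")
    print(custom_pattern.fullmatch("Passw0rd@1") is not None)
```

With the default set, a 10-character password containing @ is accepted; with the custom set "!?" the same password is rejected because @ no longer counts as a special character.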

Reference:
Breaking Par Consulting
Anil John’s Blog

18 Responses to Password Validation via Regular Expression

  1. Literacy_Hooligan says:

    Actually, the third example lets through the URL-encoded byte %00 (null), which in ASCII is the NUL character and is often treated as an end-of-string marker. Many viruses operate on %00 and many hackers also use it to bypass the password system.
    I haven’t found a way to change it, but I’m working on it and I will post it here soon.

  2. fokeerbux says:

    hi i’m doing my project on security
    if u can help in this one:
    be between 8 and 12 characters long
    contain at least three of the following:
    one lower case letter (a, b, c etc)
    one upper case letter (A, B, C etc)
    one numeral (1,2,3 etc)
    one of the following characters: ! # £ $ @

  3. David Rogers says:

    Very nice discussion of regular expression operators used for password validation!

  4. Sosys says:

    How do I require at least one letter and one number, and allow only letters and numbers?

  5. John Smith says:

    Very useful information. I was looking for something like this on the web and I’m glad I found this post. Thanks a lot.

  6. Robert says:

    Not sure what the problem is. This regex does NOT allow the special characters & + =

    could be a problem with the .Net framework

    ^.*(?=.{10,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).*$

    are you aware?

  7. Jared says:

    I used the third example in C# for password validation, and find that it allows the user to enter in spaces as characters for the password. How can I edit the regular expression to not allow spaces?

    Using this regex:
    ^.*(?=.{10,32})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%*^&(){}]).*$

  8. Vivek Shastri says:

    Password Requirement :
    Password should be at least eight (8) characters in length where the system can support it.
    Passwords must include characters from at least two (2) of these groupings: alpha, numeric, and special characters.

    ^.*(?=.{8,})(?=.*\d)(?=.*[a-zA-Z])|(?=.{8,})(?=.*\d)(?=.*[!@#$%^&])|(?=.{8,})(?=.*[a-zA-Z])(?=.*[!@#$%^&]).*$

    I tested it and it works with Struts validation fine.

  9. This topic is quite hot in the net at the moment. What do you pay attention to when choosing what to write about?

  10. Ion Drimba says:

    Hi Nilang,

    My name is Ion Drimba and I’m a Flash developer from Brazil.
    I’m wondering if it’s possible to build a regular expression for this scenario:

    min 1 number max 8 and
    min 1 alphanumeric max 2.
    The order doesn’t matter.

    Is it possible, or am I trying to do something that regular expressions don’t support?

    Thanks Ion Drimba

  11. Alps says:

    easy to understand !
    thnx

  12. mohammed rehan rizvi says:

    Pretty useful. Nice and easy

  13. pam says:

    Very useful ! Thanks

  14. Fozzy says:

    I’m doing this on PHP 5 and it appears that the starting “.*” in expression three is redundant and can be eliminated (at least, I did and it works fine).

    ^(?=.{10,32})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[!@#$%*^&(){}]).*$

    Analyzing this, you basically can pull the “atoms” out and you can see this…

    ^.*$

    Which reads “accept any number of characters”.

    Putting the atoms back states that the string must meet the “atom” criteria. For example, if we just wanted to limit the size we use the size “atom”. Note the $ inside the lookahead: without it, .{10,32} only checks that at least 10 characters are present and does not cap the length at 32.

    ^(?=.{10,32}$).*$

    “Accept between 10 and 32 characters of any type.”

    If you wanted to make sure it also had at least 1 digit:

    ^(?=.{10,32}$)(?=.*\d).*$

    “Accept between 10 and 32 characters, at least one of which must be a digit.”

    Now, we simply don’t care how many digits they use. If you only wanted them to use a specific number of digits, you’d have to change the (.*) portion of the digit “atom” so that it counts digits across the whole string. As an example, let’s limit it to between 2 and 4 digits:

    ^(?=.{10,32}$)(?=(?:\D*\d){2,4}\D*$).*$

    In this case, we can keep building up what your password Regex criteria is.

    If you wanted to allow the user to pick 2 or 3 of the 4 criteria such as:

    “Your password must contain at least 3 of the following criteria: Upper-case, Lower-case, Number, and special character”

    You need to use the or (|) and build up all the possible combinations. Allowing any 3 of the 4 criteria would create:

    ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).*$
    |
    ^(?=.*\d)(?=.*[a-z])(?=.*[@#$%^&+=]).*$
    |
    ^(?=.*\d)(?=.*[A-Z])(?=.*[@#$%^&+=]).*$
    |
    ^(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).*$

    (There’s no limit to size, but clearly the password has to be at least 3 characters long since it has to have at least 3 unique characters in the string)

    Another issue you might see is that it allows ANY character (nulls, newlines, etc.) in the string as long as AT LEAST the criteria characters are met.

    Meaning, you can have:

    “aB3{space}{newline}{etc..}”

    and it will still be accepted. To limit the characters your string can contain to only those listed in your criteria (alpha-numeric and the listed special characters), you have to change the (.*) to the list of characters you want to accept, such as ([a-zA-Z0-9@#$%^+=]*).

    Taking the “atoms” out, you will have a base Regex like this:

    ^[a-zA-Z0-9@#$%^+=]*$

    Simply add in the “Atoms” you want for criteria back in. Here’s what I use:

    “Require 3 of 4 criteria of: upper-case, lower-case, number, or the following special characters (@#$%^&+=). No other characters are allowed.”

    ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z])[a-zA-Z0-9@#$%^&+=]*$
    |
    ^(?=.*\d)(?=.*[a-z])(?=.*[@#$%^&+=])[a-zA-Z0-9@#$%^&+=]*$
    |
    ^(?=.*\d)(?=.*[A-Z])(?=.*[@#$%^&+=])[a-zA-Z0-9@#$%^&+=]*$
    |
    ^(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])[a-zA-Z0-9@#$%^&+=]*$

    Add/remove special characters or “atoms” as you need.
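The “any 3 of the 4 criteria” approach above can be sketched in Python by generating the alternatives instead of typing them out. This is only an illustration under stated assumptions: the names ATOMS, ALLOWED and build_three_of_four are made up here, and itertools.combinations is swapped in for writing the four branches by hand.

```python
import re
from itertools import combinations

# Character class of everything the password may contain (assumed from the
# regex in the comment above).
ALLOWED = r"[a-zA-Z0-9@#$%^&+=]"

# One lookahead "atom" per criterion.
ATOMS = {
    "digit": r"(?=.*\d)",
    "lower": r"(?=.*[a-z])",
    "upper": r"(?=.*[A-Z])",
    "special": r"(?=.*[@#$%^&+=])",
}


def build_three_of_four():
    """Join every 3-atom combination with |, each followed by the allowed set."""
    alternatives = [
        "^" + "".join(ATOMS[name] for name in combo) + ALLOWED + "*$"
        for combo in combinations(ATOMS, 3)
    ]
    return re.compile("|".join(alternatives), re.ASCII)


THREE_OF_FOUR = build_three_of_four()


def meets_three_of_four(password):
    """Return True when at least 3 criteria hold and no other characters appear."""
    return THREE_OF_FOUR.fullmatch(password) is not None


if __name__ == "__main__":
    for candidate in ["aB3", "ab3@", "AB@=", "aB3 "]:
        print(repr(candidate), meets_three_of_four(candidate))
```

Generating the branches keeps the four alternatives in sync when an atom or the allowed character set changes, which is easy to get wrong when editing the long expression by hand.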

  15. Shallu tagra says:

    All expressions work well with the Java regular expression library; however, they do not seem to work with the Apache library. Trying to use the expression below gives me an error: “Syntax error: Missing operand to closure”

    Expression used:
    ^.*(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&]).*$

    • Shallu tagra says:

      Having looked at it again, it seems Apache doesn’t accept constructs like “?” in this position. I’m not sure what the replacement would be. If there is none, the expression will be hard to write and hard for anyone to understand later, because checking four conditions (upper-case letter, lower-case letter, digit and special character) without lookaheads makes the expression too lengthy.

  16. dan says:

    I think I found a little flaw in the last regex. This regex allows you to put !any! character at the end of the string, regardless of what you defined as valid characters.

    Test it against 0123456789aA+* and you will see.

    I also think the leading .* can be dismissed since it is present in all positive lookaheads. The last .* is the source of the flaw since it allows any character to follow the valid characters.
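The test string from this comment can be run against the third expression with a short Python sketch. The RESTRICTED pattern is one possible fix, assumed here for illustration rather than taken from the thread: it drops the leading .* and replaces the trailing .* with an explicit character class.

```python
import re

# Third example from the article, unchanged.
ORIGINAL = re.compile(
    r"^.*(?=.{10,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).*$"
)

# Assumed fix: no leading .*, and only the listed characters may be consumed.
RESTRICTED = re.compile(
    r"^(?=.{10,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=])"
    r"[A-Za-z0-9@#$%^&+=]*$"
)


def accepts(pattern, password):
    """Return True when the compiled pattern matches the whole password."""
    return pattern.fullmatch(password) is not None


if __name__ == "__main__":
    sample = "0123456789aA+*"  # ends with *, which is not a listed special
    print(accepts(ORIGINAL, sample))
    print(accepts(RESTRICTED, sample))
    print(accepts(RESTRICTED, "0123456789aA+"))
```

The original pattern accepts the sample because the lookaheads only prove that the required characters exist, while the final .* is free to consume anything else, including * and spaces.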
