Frequently Used Regular Expressions

June 28, 2007

There are a variety of tasks that regular expressions can be used for in regards to input validation and file parsing. Some of the most common expressions are often hard to remember because of the power and options available. Here is a list of commonly used regular expressions:

Use Expression
Social Security Number \d{3}-\d{2}-\d{4}
US Phone Number ((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}
US Postal Code \d{5}(-\d{4})?
Internet EMail Address [\w-]+@([\w-]+\.)+[\w-]+
Internet URL http://(%5B\w-]\.)+[\w-](/[\w- ./?%=]*)?
Simple Password (digit) ^(?=.*\d).{4,8}$
Advanced Password (upper, lower, digit) ^(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{4,8}$
Common File masks ^(.+)\\(.+)\.(.+)
Major Credit Card \d{4}-?\d{4}-?\d{4}-?\d{4}

Reference:
ASP alliance


Password Validation via Regular Expression

June 26, 2007

I have recently been working on Password validation via regular expression, after so many researches I came across very good quality articles so I recapitulate over here.

When it comes to password validation using regular expressions, things can get a bit complicated. Normally, you want people to enter a “good” password that has a mix of numbers and letters. But you may not care where the numbers and letters appear. So you’re not looking for a “pattern” in the string. You just want a letter somewhere and a number somewhere.

In this first example, the password must be at least 8 characters long and start and end with a letter.

^[A-Za-z]\w{6,}[A-Za-z]$

The ^ looks for something at the start of the string. The brackets indicate the valid character set. So it must start with an upper or lower case letter. After that, the \w means there can be valid alphanumeric characters (numbers 0-9, upper/lower case letters a-z, the underscore) and says there must be at least 6 (but no upper limit). Then comes another set and the $ looks for something at the end of the string. So this statement says there must be a letter, then at least 6 of any alphanumeric characters, then a letter (making 8 the minimum number of characters).

In this second example, the password length doesn’t matter, but the password must contain at least 1 number, at least 1 lower case letter, and at least 1 upper case letter.

^\w*(?=\w*\d)(?=\w*[a-z])(?=\w*[A-Z])\w*$

Again, the ^ and $ are looking for things at the start and end. The “\w*” combination is used at both the start and the end. \w means any alphanumeric character, and * means zero or more. You’ll see why it’s “zero or more” in a bit. Between are groupings in parentheses. The “(?” combination is a flag in regular expressions. Basically, they say “apply the following formula, but don’t consume any of the string”. In this example, instead of specifying the order that things should appear, it’s saying that it must appear but we’re not worried about the order.

The first grouping (called an “atom” in “regular expresion speak”) uses the = sign. This means that there must be a match. Other choices are ! for a negative match (the string must not look like this). There are others (more complicated) for preceeding matches and stuff. We can refer you to a regular expression syntax web site for further details.

After the = sign comes “\w*\d”. Again, any alphanumeric character can happen zero or more times, then any digit (\d means any digit from 0 to 9) can happen. So this checks to see if there is at least one number in the string. But since the string isn’t comsumed, that one digit can appear anywhere in the string.

The next atom (grouping) is (?=\w*[a-z]). This is similar to the digit grouping, except it looks for a lower case letter. Again, the lower case letter can appear anywhere, but there has to be at least one.

The third atom is (?=\w*[A-Z]) which looks for an upper case letter somewhere in the string.

At the end is zero or more alphanumeric characters. To match this string, the minimum characters needed is 3 (one upper case letter, one lower case letter, and one number).

In this third example:

  • Must be at least 10 characters
  • Must contain at least one one lower case letter, one upper case letter, one digit and one special character
  • Valid special characters are –   @#$%^&+=

^.*(?=.{10,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?=.*[@#$%^&+=]).*$

As you can see in the regex, the list of special characters is configurable.

Reference:
Breaking Par Consulting
Anil John’s Blog


UML : A Hands-On Introduction for Developers

June 14, 2007

I came across very good article for UML. This tutorial provides a quick introduction to the Unified Modeling Language.
UML : A Hands-On Introduction for Developers