Sunday, June 5, 2011

Regex to find Email

I wrote a regular expression to find e-mail addresses that satisfy the conditions below.

1. The user name must start with a letter.
2. It can contain digits and the characters _ - . but not two consecutive hyphens (--).
3. The domain name must start with a letter or a digit, and it can have any number of subdomains, e.g. .co.in.

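(\b[a-zA-Z](?:[\w.]|-(?!-))*)@([a-zA-Z0-9](?:\w|-(?!-))*)(\.(?:\w|-(?!-))+)+

My idea is to capture the email in groups: user name, domain name, and the subdomain parts. First let's take the user name. It is matched by (\b[a-zA-Z](?:[\w.]|-(?!-))*).

\b means the match must start at a word boundary. The user name must start with a letter, which the character class [a-zA-Z] enforces. After that it can contain any number of the allowed characters, but not two hyphens in a row. A character class matches one character at a time, so it cannot express "no two consecutive hyphens" by itself. Instead, each following character is either a word character or a dot, [\w.] (\w is a-zA-Z0-9_), or a hyphen that is not followed by another hyphen, -(?!-). The (?!-) part is a negative lookahead, so this needs a regex flavor that supports lookahead.

The domain name works the same way, except that it starts with [a-zA-Z0-9] and a single label cannot contain a dot. An email can have any number of subdomains, e.g. .co.in, so the last group is (\.(?:\w|-(?!-))+)+, i.e. a literal . followed by one or more allowed characters, repeated one or more times.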
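
To try the three conditions out, here is a minimal sketch in Python. It is only one way to write it, under a few assumptions: the "no two consecutive hyphens" rule is expressed with a negative lookahead, re.ASCII is used so that \w means a-zA-Z0-9_ only, and the names EMAIL_RE, parse_email and find_emails are made up for this example.

```python
import re

# One allowed character after the first one: a word character,
# or a hyphen that is not followed by another hyphen.
ALLOWED = r"(?:\w|-(?!-))"

# Group 1: user name, group 2: first domain label,
# group 3: the last ".label" part (the group repeats, so only the last is kept).
EMAIL_RE = re.compile(
    rf"(\b[a-zA-Z](?:[\w.]|-(?!-))*)@([a-zA-Z0-9]{ALLOWED}*)(\.{ALLOWED}+)+",
    re.ASCII,  # assumption: restrict \w to a-zA-Z0-9_
)


def parse_email(text):
    """Return (user name, first domain label) if the whole string is an email, else None."""
    m = EMAIL_RE.fullmatch(text)
    if m is None:
        return None
    return m.group(1), m.group(2)


def find_emails(text):
    """Return every substring of text that matches the pattern."""
    return [m.group(0) for m in EMAIL_RE.finditer(text)]


if __name__ == "__main__":
    for candidate in ["john.doe@example.co.in", "a--b@example.com", "1abc@example.com"]:
        print(candidate, "->", parse_email(candidate))
    print(find_emails("Contact alice_1@mail.example.com or bob@site.co.in."))
```

Note that fullmatch checks a whole string, while finditer searches inside a longer text. When searching, a string such as a--b@example.com still yields the shorter match b@example.com, because the match is free to start after the double hyphen.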
