Regular Expression to cover a range of zip codes

For general questions and discussions specific to the AbleCommerce GOLD ASP.Net shopping cart software.
bkort@web2market.com
AbleCommerce Partner
Posts: 113
Joined: Thu Jan 22, 2004 3:17 pm
Location: Illinois

Regular Expression to cover a range of zip codes

Post by bkort@web2market.com » Mon Aug 26, 2013 1:43 pm

Could someone give me an example of how to create a regular expression for a zip code range? In Gold we've found you can't use a hyphen for a range. I'm not good at regular expressions.

ForumsAdmin
AbleCommerce Moderator
Posts: 399
Joined: Wed Mar 13, 2013 7:19 am

Re: Regular Expression to cover a range of zip codes

Post by ForumsAdmin » Mon Aug 26, 2013 8:04 pm


gjaros
AbleCommerce Partner
Posts: 1717
Joined: Tue Feb 24, 2004 2:20 pm
Location: Illinois

Re: Regular Expression to cover a range of zip codes

Post by gjaros » Wed Aug 28, 2013 9:05 am

AbleCommerce requires the regex to start with @ to indicate that it should look for a regular expression match instead of an exact string match. Don't include the @ when you're testing a regex on a site like http://myregextester.com, but when you enter it into the postal code boxes for regions you need to include it.
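As a rough illustration of that rule, here is a small Python sketch that mimics the behaviour described: an entry beginning with @ is treated as a regular expression, anything else as an exact string. The function name postal_entry_matches is made up for this example and is not part of AbleCommerce, and Python's re module is only standing in for the .NET regex engine that AbleCommerce runs on.

```python
import re

def postal_entry_matches(entry: str, postal_code: str) -> bool:
    """Hypothetical helper: mimic the '@ means regex' rule described in the post."""
    if entry.startswith("@"):
        # Drop the @ marker; what is left is what you would paste into a regex tester.
        pattern = entry[1:]
        return re.search(pattern, postal_code) is not None
    # No @ prefix: plain exact string comparison.
    return entry == postal_code

print(postal_entry_matches("60606", "60606"))         # True: exact match
print(postal_entry_matches("@^6060[6-9]$", "60607"))  # True: regex match
print(postal_entry_matches("^6060[6-9]$", "60607"))   # False: no @, so compared literally
```

The last call shows why a pattern entered without the @ appears to do nothing: it is compared character for character against the postal code.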

For a range of zip codes you need an expression that matches every code in your range and nothing else. For example, the range 60606-60808, but NOT 60803 or 60805, would be written like this:

@^(60(6((0[6-9])|([1-9][0-9]))|7[0-9]{2}|8(0[0-246-8])))(-\d{4})?$

Each regex starts with ^, which anchors the match to the start of the string you are matching (so a phone number entered like 3123560606 won't match the pattern accidentally). At the end of the expression above is (-\d{4})?$, which says you can optionally have -XXXX at the end of the zip code, where the Xs are any digits. This enables matches for both zip and zip+4 formats. The - matches the hyphen in a zip+4 code and the \d{4} matches any digit four times. The ? means the part in the parentheses is optional (it can be matched 0 or 1 time), and the $ indicates that the pattern has to reach the end of the string that you are checking.
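If you would rather check the expression exhaustively than try codes one at a time in an online tester, something like the following Python sketch works. It uses the expression above without the @ (that prefix is AbleCommerce-specific), and it assumes Python's re module treats these constructs the same way the .NET engine does, which holds for the simple anchors, character classes, and quantifiers used here.

```python
import re

# The expression from the post, without the AbleCommerce-specific @ prefix.
ZIP_RANGE = re.compile(
    r"^(60(6((0[6-9])|([1-9][0-9]))|7[0-9]{2}|8(0[0-246-8])))(-\d{4})?$"
)

# Every five-digit code the range should allow: 60606-60808 minus 60803 and 60805.
expected = {f"{n:05d}" for n in range(60606, 60809)} - {"60803", "60805"}

# Every five-digit string the pattern actually accepts.
actual = {f"{n:05d}" for n in range(100000) if ZIP_RANGE.search(f"{n:05d}")}

print(actual == expected)                    # True
print(bool(ZIP_RANGE.search("60700-1234")))  # True: zip+4 form is accepted
print(bool(ZIP_RANGE.search("3123560606")))  # False: ^ rejects the phone-like string
```

Comparing the two sets catches both kinds of mistake at once: codes that should match but don't, and codes that slip through when they shouldn't.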

I've found that it's easier to initially write the expressions indented so it's easy to see the different sections, like this:

@^(
    60
    (
      6
      (
        (
          0[6-9]
        )|
        (
          [1-9][0-9]
        )
      )|
      7[0-9]{2}
      |
      8
      (
        0[0-246-8]
      )
    )
  )
  (-\d{4})?$

You need to break the range up into smaller, complete consecutive ranges. So in my example, I started by just using 60 because everything in the range starts with 60???. The next digit in the range is 6, 7, or 8, but depending on which of those digits is next there is a different possible set of digits after that, so you have to handle each 3rd digit and its following possibilities separately.
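If you test in Python, the indented layout can be kept as it is, because the re.VERBOSE flag ignores whitespace inside a pattern and allows # comments. This is only a convenience for experimenting. .NET has a comparable RegexOptions.IgnorePatternWhitespace option, but nothing in this thread says the AbleCommerce postal code box accepts a multi-line pattern, so assume you need to collapse it back to one line before pasting it in. The comments naming each sub-range are my own annotations.

```python
import re

INDENTED = re.compile(r"""
^(
    60                      # every code in the range starts with 60
    (
      6                     # 606??
      (
        ( 0[6-9] )          #   60606-60609
        |
        ( [1-9][0-9] )      #   60610-60699
      )
      |
      7[0-9]{2}             # 607??: 60700-60799
      |
      8                     # 608??
      (
        0[0-246-8]          #   60800-60808 except 60803 and 60805
      )
    )
)
(-\d{4})?$                  # optional +4 suffix
""", re.VERBOSE)

ONE_LINE = re.compile(
    r"^(60(6((0[6-9])|([1-9][0-9]))|7[0-9]{2}|8(0[0-246-8])))(-\d{4})?$"
)

# Both spellings should accept and reject exactly the same five-digit strings.
codes = [f"{n:05d}" for n in range(100000)]
same = all(bool(INDENTED.search(c)) == bool(ONE_LINE.search(c)) for c in codes)
print(same)  # True
```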

So for 606?? I have a fourth digit that can be 0-9, but if it's 0 then the only options for the fifth digit are 6-9 (60605 falls below the start of the range, so it must not match). So I have to break the fourth digit for 606?? into two choices, 0 and 1-9.

You can define a set of characters in regex by putting them in brackets []. You can define a range within those brackets with a hyphen, or you can list individual characters. So [a-d] is the same as [abcd], and [0-9] is the same as [0123456789]. You can also mix ranges and individual characters, so you can write [0-37-9] to represent [0123789]. Remember, these match individual characters, not multi-digit numbers.
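A quick way to convince yourself of both points (mixed ranges are just shorthand, and a bracket set only ever matches one character) is a few lines of Python. The [10-20] pattern below is my own example of the common mistake, not something from the expression above.

```python
import re

digits = "0123456789"

# A class with mixed ranges is shorthand for listing each character.
mixed = [d for d in digits if re.fullmatch(r"[0-37-9]", d)]
listed = [d for d in digits if re.fullmatch(r"[0123789]", d)]
print(mixed)            # ['0', '1', '2', '3', '7', '8', '9']
print(mixed == listed)  # True

# A class matches ONE character, so [10-20] is not "the numbers 10 to 20".
# It is the characters 1, 0 through 2, and 0 again: just 0, 1 and 2.
print(re.fullmatch(r"[10-20]", "15"))       # None: "15" is two characters
print(bool(re.fullmatch(r"[10-20]", "2")))  # True
```

This is the reason a hyphen between two zip codes can't express a range on its own: the hyphen only has range meaning between two single characters inside brackets.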

So back to the expression we're building, for the fourth digit of 606??, if I choose 0 then the fifth digit can be [6-9]. But if the fourth digit is [1-9] then the fifth digit can be [0-9].

For 607?? any number from 0-9 can be used for both of the last two digits (60700-60799), so I tell my expression to use the range [0-9] and match it twice {2}.

For 608?? the only codes that match are 60800, 60801, 60802, 60804, 60806, 60807, and 60808. 60803, 60805, and anything from 60809 up are excluded. So the fourth digit has to be 0, and the fifth digit can be in the range 0-2, the digit 4, or in the range 6-8, so you enter [0-246-8].

Finally I end the expression with my check for the +4 part of the zip code.

Within the expression, different parts are contained by parentheses, and a pipe means OR.
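To tie the pieces together, here is a sketch that assembles an equivalent expression from one fragment per third digit and joins them with the pipe. The variable names are mine, purely for illustration, and I've dropped a few of the redundant parentheses from the original, which doesn't change what it matches.

```python
import re

branch_606 = r"6(0[6-9]|[1-9][0-9])"  # 60606-60699
branch_607 = r"7[0-9]{2}"             # 60700-60799
branch_608 = r"80[0-246-8]"           # 60800-60808 without 60803 and 60805
plus4 = r"(-\d{4})?"                  # optional +4 suffix

# The parentheses group the alternatives so the leading 60 applies to all three.
pattern = "^60(" + "|".join([branch_606, branch_607, branch_608]) + ")" + plus4 + "$"
print(pattern)

regex = re.compile(pattern)
for code in ["60605", "60606", "60799", "60803", "60804", "60808-1234", "60809"]:
    print(code, bool(regex.search(code)))
```

Without the grouping parentheses the pipe would split the whole expression in two, so the ^60 would only apply to the first alternative and the $ only to the last.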

You can also enter several expressions into a region's postal code field, separated by semicolons. You only enter the @ symbol once though. So the above expression could also be entered like this:

@^(606((0[6-9])|([1-9][0-9])))(-\d{4})?$;^(607[0-9]{2})(-\d{4})?$;^(608(0[0-246-8]))(-\d{4})?$

It's a bit longer, but sometimes it might be easier to break it up into several smaller expressions, especially if the ranges that you need to cover are very scattered.
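If you want to confirm that a split-up entry still covers the same codes as the single expression, a sketch like this can help. How AbleCommerce actually parses the field is not shown in this thread, so split_entry below is a guess at the behaviour described (one leading @, then semicolon-separated patterns, a code passing if any one pattern matches), and it assumes none of the patterns contains a literal semicolon.

```python
import re

ENTRY = (r"@^(606((0[6-9])|([1-9][0-9])))(-\d{4})?$"
         r";^(607[0-9]{2})(-\d{4})?$"
         r";^(608(0[0-246-8]))(-\d{4})?$")

COMBINED = re.compile(
    r"^(60(6((0[6-9])|([1-9][0-9]))|7[0-9]{2}|8(0[0-246-8])))(-\d{4})?$"
)

def split_entry(entry: str) -> list:
    """Hypothetical parser: drop the single leading @, then split on semicolons."""
    body = entry[1:] if entry.startswith("@") else entry
    return [re.compile(part) for part in body.split(";") if part]

def matches_any(patterns, postal_code: str) -> bool:
    """A code passes if at least one of the patterns matches it."""
    return any(p.search(postal_code) for p in patterns)

patterns = split_entry(ENTRY)
codes = [f"{n:05d}" for n in range(100000)]
agree = all(matches_any(patterns, c) == bool(COMBINED.search(c)) for c in codes)
print(len(patterns), agree)  # 3 True
```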