java - Regex not to match a set of Strings


How do I construct a regex that matches only strings that do not contain any of a set of strings?

For example, I want to validate that an "Address Line 1" text box won't contain secondary address parts such as 'apt', 'bldg', 'ste', 'unit', etc.

A regex can be used to verify that a string does not contain a set of words. Here is a tested Java code snippet with a commented regex that does precisely this:

    if (s.matches("(?sxi)" +
        "# Match a string containing no 'bad' words.\n" +
        "^                # Anchor to start of string.\n" +
        "(?:              # Step through string one char at a time.\n" +
        "  (?!            # Negative lookahead to exclude words.\n" +
        "    \\b          # All bad words begin on a word boundary.\n" +
        "    (?:          # List of 'bad' words NOT to be matched.\n" +
        "      apt        # Cannot be 'apt',\n" +
        "    | bldg       # or 'bldg',\n" +
        "    | ste        # or 'ste',\n" +
        "    | unit       # or 'unit'.\n" +
        "    )            # End list of words NOT to be matched.\n" +
        "    \\b          # All bad words end on a word boundary.\n" +
        "  )              # Not at the beginning of a bad word.\n" +
        "  .              # Ok. Safe to match this character.\n" +
        ")*               # Zero or more 'not-start-of-bad-word' chars.\n" +
        "$                # Anchor to end of string.")) {
        // String has no bad words.
        System.out.print("OK: String has no bad words.\n");
    } else {
        // String has bad words.
        System.out.print("ERR: String has bad words.\n");
    }

This assumes that the words must be "whole" words and that the "bad" words should be recognized regardless of case. Note also (as others have correctly stated) that this is not as efficient as simply checking for the presence of bad words and then taking the logical NOT.
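The simpler alternative mentioned above (search for a bad word, then negate the result) might look like the following minimal sketch. The class name `AddressLineCheck` and the method `hasNoBadWords` are hypothetical names chosen for illustration; the word list and the whole-word, case-insensitive assumptions are the same as in the snippet above.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names here are illustrative, not from the original answer.
final class AddressLineCheck {
    // Matches any whole "bad" word, ignoring case (same word list as the example above).
    private static final Pattern BAD_WORDS =
            Pattern.compile("\\b(?:apt|bldg|ste|unit)\\b", Pattern.CASE_INSENSITIVE);

    // Returns true when the string contains none of the bad words:
    // find() looks for a bad word anywhere in the input, and the result is negated.
    static boolean hasNoBadWords(String s) {
        return !BAD_WORDS.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasNoBadWords("123 Main Street"));   // no bad word present
        System.out.println(hasNoBadWords("123 Main St Apt 4")); // contains 'Apt'
        System.out.println(hasNoBadWords("12 Unity Road"));     // 'Unity' is not the whole word 'unit'
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which `String.matches` does each time it is invoked.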

