Matching SSNs with RegEx
I'm working on a project that filters out SSNs. The product's default expression for SSNs is:
(^|\b)(?!9|8|77[3-9]|666|000)(\d{3})( - | |-)(?!00)(\d{2})\3(?!0000)(\d{4})(\b|$)
A modification was made in the past, and the customized script now uses this regex:
(^|\b)(?!9|8|77[3-9]|666|000)(\d{3})( - | |-)(?!00)(\d{2})\3(?!0000)(\d{4})($|[^\d-])
The difference between the two expressions is in the ending. As I understand it:
(\b|$) \b: backspace; $: end of string
and
($|[^\d-]) $: end of string; [^...]: not one of the listed characters; \d-: a digit (0-9) or a hyphen
I guess this does not make sense to me. Why was the change made? Both of the endings of these expressions seem superfluous. Any help is appreciated. Thanks!!
The difference here is:
(\b|$)
vs
($|[^\d-])
\b represents a backspace character only within a character class: [\babc] will match 'a', 'b', 'c', or a backspace character. Outside a character class, as seen here, it is a word boundary, such as the position between a letter and a space.
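A minimal sketch of the two meanings of \b, assuming Python's re module (the post does not say which regex engine the product uses, and the sample strings are made up for illustration):

```python
import re

# Inside a character class, \b is the backspace character (U+0008).
in_class = re.compile(r'[\babc]')

# Outside a character class, \b is a zero-width word boundary.
boundary = re.compile(r'\bcat\b')

# The class matches a literal backspace, but nothing in 'x y'.
class_hits_backspace = in_class.search('\x08') is not None
class_hits_xy = in_class.search('x y') is not None

# The boundary form matches 'cat' as a whole word only.
word_hit = boundary.search('a cat sat') is not None
inner_hit = boundary.search('concatenate') is not None

print(class_hits_backspace, class_hits_xy, word_hit, inner_hit)
```

Most regex flavors treat \b the same way, but it is worth checking the documentation for the engine actually in use.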
Previously, the regex would accept an SSN, matching the pattern up to that point, if it ended at the end of the string ($) or at a word boundary (\b). It would match the SSN in "111-22-3333" or "111-22-3333 garbage data".
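After the change, it is more permissive of what follows the SSN. The SSN can be followed by the end of the string ($) or by any character other than a digit or hyphen ([^\d-]). So, like the old version, it matches the SSN in "111-22-3333#6789", and it now also matches in "111-22-3333garbage", but not in "111-22-33333" or "111-22-3333-123". Because \b also matches between a digit and a hyphen, the old version did match "111-22-3333-123", so the new ending is looser about trailing letters but stricter about a trailing hyphen.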
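The comparison can be checked with a short script. This is a sketch assuming Python's re module, which may differ in details from the product's regex engine; the names PREFIX, OLD, NEW, SAMPLES and found are made up for illustration, while the patterns and sample strings come from the post.

```python
import re

# Shared part of both expressions, copied from the post; only the ending differs.
PREFIX = r'(^|\b)(?!9|8|77[3-9]|666|000)(\d{3})( - | |-)(?!00)(\d{2})\3(?!0000)(\d{4})'
OLD = re.compile(PREFIX + r'(\b|$)')        # product default
NEW = re.compile(PREFIX + r'($|[^\d-])')    # customized version

SAMPLES = [
    '111-22-3333',
    '111-22-3333 garbage data',
    '111-22-3333garbage',
    '111-22-3333#6789',
    '111-22-33333',
    '111-22-3333-123',
]

def found(pattern, text):
    """Return True if the pattern finds an SSN anywhere in text."""
    return pattern.search(text) is not None

# Map each sample to (old result, new result).
results = {s: (found(OLD, s), found(NEW, s)) for s in SAMPLES}

for text, (old_hit, new_hit) in results.items():
    print(f'{text!r:30} old={old_hit!s:5} new={new_hit}')
```

With these samples, the two versions disagree only on "111-22-3333garbage" (new matches, old does not) and "111-22-3333-123" (old matches, new does not). Note also that the new ending consumes the character after the SSN, so the overall match can be one character longer than the SSN itself; the captured groups for the three digit blocks are unaffected.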
Frankly, the old version seems more correct in most cases to me, but it depends on the needs of your application, of course.