[relaxng-user] Latest proposal for smart regexes in RELAX NG

Amelia A Lewis amyzing at talsever.org
Thu May 6 16:44:45 ICT 2004


On Fri, 7 May 2004 00:19:17 +0500 (AMST)
David Tolpin <dvd at davidashen.net> wrote:

> jcowan at reutershealth.com:
> > David Tolpin scripsit:
> > 
> > > A W3C XML Schema regexp for the very same production is 136
> > > characters long.
> > 
> > Show me, please.
> 
> I did a few months ago on this list. And explained in detail.

Your example covered only a subset of the RFC822 address production. 
Jeffrey Friedl's covers all of it.

Nor was his production intended to be "a joke" or an example of
illegibility.  He very carefully builds up to it, so that folks can be
impressed at the fact that they can read it, using the techniques he's
presented.

That doesn't make it any less rococo.  Regexes are wonderful, as John
Cowan points out, for small applications.  But they cannot be broken into
pieces and then put back together, a facility that he's built into his
alternative.  That inability makes learning to read complex ones (much
less writing complex ones that don't have hidden problems) enormously
more difficult than it needs to be.
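
To make the "pieces" point concrete, here is a minimal sketch in Python.
It is not the syntax of John Cowan's proposal, and it is nowhere near the
full RFC822 address production.  The names ATOM, DOT_ATOM, ADDR_SPEC and
is_simple_address are hypothetical, chosen only for illustration.  It shows
a host language doing the composition that the regex notation itself cannot.

```python
import re

# Hypothetical, heavily simplified pieces. They accept far less than
# the real RFC822 address grammar and exist only to show composition.
ATOM = r"[A-Za-z0-9_+-]+"                  # one run of "ordinary" characters
DOT_ATOM = rf"{ATOM}(?:\.{ATOM})*"         # atoms joined by single dots
ADDR_SPEC = rf"{DOT_ATOM}@{DOT_ATOM}"      # local part, "@", domain

ADDRESS = re.compile(ADDR_SPEC)


def is_simple_address(text: str) -> bool:
    """Return True if the whole string matches the simplified pattern."""
    return ADDRESS.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("alewis@tibco.com", "amy..lewis@tibco.com", "no-at-sign"):
        print(sample, is_simple_address(sample))
```

Each piece can be read and tested on its own, which is the property the
single flat expression lacks.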

The Perl whitespace extensions (the /x modifier) alone are worth the price
of entry for Perl regexes.  Using them, any expression can be reformatted
into (nearly) digestible chunks and analyzed piece by piece.
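
As a rough illustration of the whitespace extensions, the sketch below
uses Python's re.VERBOSE flag, which behaves much like Perl's /x: unescaped
whitespace outside character classes is ignored and "#" starts a comment.
The pattern is a hypothetical, simplified address matcher, not the RFC822
production, and the names COMPACT and READABLE are assumptions made for
the example.

```python
import re

# The same hypothetical pattern written twice: once flat, once laid out.
COMPACT = re.compile(
    r"[A-Za-z0-9_+-]+(?:\.[A-Za-z0-9_+-]+)*@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*"
)

READABLE = re.compile(
    r"""
    [A-Za-z0-9_+-]+             # first atom of the local part
    (?: \. [A-Za-z0-9_+-]+ )*   # further dot-separated atoms
    @                           # separator between local part and domain
    [A-Za-z0-9-]+               # first domain label
    (?: \. [A-Za-z0-9-]+ )*     # further dot-separated labels
    """,
    re.VERBOSE,
)


def same_verdict(text: str) -> bool:
    """Check that both spellings agree on whether text matches in full."""
    return bool(COMPACT.fullmatch(text)) == bool(READABLE.fullmatch(text))


if __name__ == "__main__":
    for sample in ("dvd@davidashen.net", "bad address@x", "a@b"):
        print(sample, bool(READABLE.fullmatch(sample)), same_verdict(sample))
```

The two expressions are meant to accept the same strings; only the second
can be read a line at a time.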

Amy!
-- 
Amelia A. Lewis
Architect/Principal Engineer
TIBCO/Extensibility, Inc.
alewis at tibco.com

