[relaxng-user] Latest proposal for smart regexes in RELAX NG
Bob Foster
bob at objfac.com
Wed May 5 01:36:18 ICT 2004
David Tolpin wrote:
> I am asking because I have a feeling that the string-based (as opposed
> to XML) syntax for regular expressions (unix-like, adopted by W3C Schema)
> is the compact syntax. It is easy to write and convenient to use,
> and actually needs just one addition: ability to compose a regular
> expression from parts.
>
> RNV provides this function - through dsl datatypelibrary and s-pattern
> facet. I had written about it on xml-dev, and http://ftp.davidashen.net/PreTI/RNV/readme.txt
> tells about it too (I believe) near the bottom of the page -- search for
> s-pattern.
Ok. I've looked at this before, but I don't know what it is. What class
of grammars do these patterns accept? Does your parser guarantee to
handle any pattern that can be written this way?
> I think that use of XML syntax for string templates (and regular expressions
> are string templates) is plain wrong. XML regular expressions are good
> for XML data, and the regular expression is Relax NG itself, and the
> data is XML.
>
> Strings are not trees. Templates should match instances in structure.
> Instances regular expressions are matched against are strings; templates
> are pretty good as strings too. Just make them structured, that is,
> composable.
I'm sorry (remember, brain on hiatus) but I don't understand what you
are saying. This:
s-pattern="""
comment = "\(([^\(\)\\]|\\.)*\)"
atom = "[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"
atoms = atom "(\." atom ")*"
person = "\"([^\"\\]|\\.)*\""
location = "\[([^\[\]\\]|\\.)*\]"
local-part = "(" atom "|" person ")"
domain = "(" atoms "|" location ")"
start = "(" comment " )?" local-part "@" domain "( " comment ")?"
"""
is not RELAX NG itself. At a glance, it's a context-free grammar. Hence
my questions above.
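To make the question concrete: if no name in such a list of definitions refers back to itself, directly or indirectly, then plain textual substitution reduces the whole list to one ordinary regular expression. If a name is recursive, it does not reduce this way. The Python sketch below illustrates that check only; it is not RNV's implementation, and the names Ref, RULES and expand are hypothetical. It assumes the quoted fragments can be transcribed as Python re patterns and that \" in the original stands for a literal quote character.

```python
import re
from typing import NamedTuple


class Ref(NamedTuple):
    """Reference to another named rule (hypothetical helper)."""
    name: str


# Transcription of the s-pattern above: plain strings are regex fragments,
# Ref(...) items are references to other rules. Assumption: the fragments
# are valid as Python `re` syntax.
RULES = {
    "comment": [r"\(([^\(\)\\]|\\.)*\)"],
    "atom": [r"[a-zA-Z0-9!#$%&'*+\-/=?\^_`{|}~]+"],
    "atoms": [Ref("atom"), r"(\.", Ref("atom"), r")*"],
    "person": [r'"([^"\\]|\\.)*"'],
    "location": [r"\[([^\[\]\\]|\\.)*\]"],
    "local-part": ["(", Ref("atom"), "|", Ref("person"), ")"],
    "domain": ["(", Ref("atoms"), "|", Ref("location"), ")"],
    "start": ["(", Ref("comment"), " )?", Ref("local-part"), "@",
              Ref("domain"), "( ", Ref("comment"), ")?"],
}


def expand(name, rules, active=()):
    """Expand a rule to one regex string by textual substitution.

    Raises ValueError if the rule refers to itself, directly or
    indirectly, because substitution would then never terminate.
    """
    if name in active:
        raise ValueError(f"rule {name!r} is recursive")
    parts = []
    for item in rules[name]:
        if isinstance(item, Ref):
            parts.append(expand(item.name, rules, active + (name,)))
        else:
            parts.append(item)
    return "".join(parts)


EMAIL = re.compile(expand("start", RULES))

print(bool(EMAIL.fullmatch("bob@objfac.com")))
print(bool(EMAIL.fullmatch("bob@@objfac.com")))
```

Note that the substitution is purely textual: the parentheses in local-part and domain are split across string literals, so the parts are not self-contained sub-expressions. Whether RNV restricts s-pattern to non-recursive definitions is exactly what the questions above ask.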
Bob Foster