[relaxng-user] Latest proposal for smart regexes in RELAX NG
jcowan at reutershealth.com
Thu May 6 11:14:23 ICT 2004
David Tolpin scripsit:
> I don't think so. I am confident that one of the most significant
> benefits we get from Unicode is that we can write A-Z instead of [:ascalpha:]
> (and not ucalpha, as these are different things).
The difficulty is that most people do write [A-Z], seduced by the
English-Only side of the Force, rather than [:ucalpha:], which more
usually represents what they actually want (as opposed to what they
believe they want).
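To make the gap concrete, here is a minimal sketch in Python. It assumes that [:ucalpha:] is meant to cover any Unicode uppercase letter (general category Lu); that reading is an assumption, and the helper names below are hypothetical, not part of any proposal.

```python
import unicodedata

def is_ascii_upper(ch):
    # What [A-Z] matches when ranges follow code point order.
    return "A" <= ch <= "Z"

def is_unicode_upper(ch):
    # Assumed meaning of [:ucalpha:]: general category Lu (uppercase letter).
    return unicodedata.category(ch) == "Lu"

# Latin A and Z, E with acute, Greek Omega, Cyrillic Zhe, a lowercase letter, a digit.
samples = ["A", "Z", "\u00c9", "\u03a9", "\u0416", "a", "1"]

ascii_only = [ch for ch in samples if is_ascii_upper(ch)]
any_script = [ch for ch in samples if is_unicode_upper(ch)]
```

Here ascii_only keeps only A and Z, while any_script also keeps the accented, Greek, and Cyrillic capitals that an author validating names or titles most likely meant to allow.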
> are ordered and at fixed places. A-Z is always ABCDEFGHIJKLMNOPQRSTUVWXYZ,
> it was not so in the past, and I am glad it is now.
Well, it is so if the locale is Unicode.
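A rough illustration of why the qualification matters, using a byte encoding as a stand-in for a non-Unicode environment (an assumption on my part, since locale collation would be harder to demonstrate portably). The sketch compares the range A..Z in code point order with the same endpoints in EBCDIC code page 037, using Python's built-in cp037 codec.

```python
# In code point order, the range A..Z is exactly the 26 basic Latin capitals.
unicode_range = "".join(chr(cp) for cp in range(ord("A"), ord("Z") + 1))

# In EBCDIC (code page 037) the letters sit in three separate runs,
# so a byte range from A to Z encloses other characters as well.
first = "A".encode("cp037")[0]
last = "Z".encode("cp037")[0]
ebcdic_range = bytes(range(first, last + 1)).decode("cp037")

# Characters that fall inside the EBCDIC range but are not A..Z.
extras = [ch for ch in ebcdic_range if ch not in unicode_range]
```

The EBCDIC range spans 41 byte values, so extras holds the 15 characters that a naive A-Z range would have admitted alongside the letters.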
> 1) Strings are not trees. XML documents are trees. That's why
> regular expressions (which can be represented in either tree-like
> (XML) form or in the form of a sequence of instructions (traditional
> string regular expressions)) should provide
>
> - XML structured representation (or compact but still structured tree-like)
> for XML documents in whole
> - string representation to match strings.
We don't provide a string-regex-ish formulation for XML, though.
> One can propose XML representation of string regular expressions
> to ease processing, but as well as the XML syntax is the base syntax
> for XML regular expressions, and the compact syntax is designed to
> make life easier in certain environment, string syntax for string
> regular expressions is the base syntax, and XML syntax can be
> provided as a convenience, but must map to the string syntax
> bidirectionally.
I agree about the bidirectional mapping, but I think using string
syntax as the base regex syntax is a hangover from the past that
ought to be discarded.
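For what a bidirectional mapping might look like, here is a small sketch in Python covering only literals, alternation, the star, and parentheses. The element names (char, group, choice, zeroOrMore) are hypothetical, loosely borrowed from RELAX NG pattern names, and are not taken from the proposal; the functions parse and unparse are likewise illustrative.

```python
import xml.etree.ElementTree as ET

# Binding strength of each (hypothetical) element, weakest first.
PREC = {"choice": 1, "group": 2, "zeroOrMore": 3, "char": 4}

def parse(regex):
    """Turn a tiny string regex (literals, |, *, parentheses) into an XML tree."""
    pos = 0

    def peek():
        return regex[pos] if pos < len(regex) else None

    def alternation():
        nonlocal pos
        branches = [sequence()]
        while peek() == "|":
            pos += 1
            branches.append(sequence())
        if len(branches) == 1:
            return branches[0]
        node = ET.Element("choice")
        node.extend(branches)
        return node

    def sequence():
        items = []
        while peek() is not None and peek() not in "|)":
            items.append(repetition())
        if not items:
            raise ValueError("empty sequence at position %d" % pos)
        if len(items) == 1:
            return items[0]
        node = ET.Element("group")
        node.extend(items)
        return node

    def repetition():
        nonlocal pos
        node = atom()
        while peek() == "*":
            pos += 1
            star = ET.Element("zeroOrMore")
            star.append(node)
            node = star
        return node

    def atom():
        nonlocal pos
        ch = peek()
        if ch == "(":
            pos += 1
            node = alternation()
            if peek() != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1
            return node
        if ch is None or ch == "*":
            raise ValueError("unexpected token at position %d" % pos)
        pos += 1
        node = ET.Element("char")
        node.text = ch
        return node

    tree = alternation()
    if pos != len(regex):
        raise ValueError("unexpected character at position %d" % pos)
    return tree

def unparse(node, context=0):
    """Turn the XML tree back into string syntax, adding parentheses as needed."""
    tag = node.tag
    if tag == "char":
        text = node.text
    elif tag == "zeroOrMore":
        text = unparse(node[0], 4) + "*"
    elif tag == "group":
        text = "".join(unparse(child, 2) for child in node)
    elif tag == "choice":
        text = "|".join(unparse(child, 1) for child in node)
    else:
        raise ValueError("unknown element: " + tag)
    return "(" + text + ")" if PREC[tag] < context else text

tree = parse("(a|b)*c")
as_xml = ET.tostring(tree, encoding="unicode")
round_trip = unparse(ET.fromstring(as_xml))
```

Note that in this sketch the tree is the form the code actually works on, and the string form is produced from it; redundant parentheses in the input are not preserved, so the mapping is exact only up to equivalence.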
--
The man that wanders far        jcowan at reutershealth.com
from the walking tree           http://www.reutershealth.com
        --first line of a non-existent poem by:         John Cowan