Date:Mon Aug 2 17:42:06 1999 
Subject:Regular expression string pattern matching: Embedding pop-11 procedures, and more 
From:Luc Beaudoin 
Volume-ID:990802.03 

Someone recently started a thread on enhancing Poplog's regular
expression string pattern matching. Unfortunately I failed to archive
it, which is why I am not replying to it directly.

In 1992 Steve Knight produced a Pop-11 library called lmatches (similar
to matches, but more efficient and with additional functionality). One
of its main benefits is that matching can be constrained by a
procedural requirement: a pattern element can contain an identifier and
a procedure, and the element matches only if the procedure returns a
true result when applied to the item being matched:

E.g.,

  vars x, y;
  ( [1 2 3] lmatches [ %?x : isnumber, ??y% ] ) =>
  x =>
  y =>
 
would yield:
 
  ** <true>
  ** 1
  ** [2 3]
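
For readers without the library to hand, the behaviour above can be
approximated by the following rough sketch. It is written in Python
rather than Pop-11 and is only an analogy: the names Var, match and
is_number are invented for illustration and are not part of lmatches,
and the sketch does not check that a repeated variable name is bound
consistently.

```python
class Var:
    """Pattern variable: ?name (one item) or ??name (a segment of items)."""

    def __init__(self, name, segment=False, restriction=None):
        self.name = name
        self.segment = segment
        self.restriction = restriction  # optional predicate, like ": isnumber"


def match(data, pattern, bindings=None):
    """Return a dict of bindings if data matches pattern, otherwise None."""
    bindings = dict(bindings or {})
    if not pattern:
        return bindings if not data else None
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, Var) and head.segment:
        # A ??variable may take any number of items; try the shortest first.
        for n in range(len(data) + 1):
            segment = data[:n]
            if head.restriction is None or head.restriction(segment):
                result = match(data[n:], rest, {**bindings, head.name: segment})
                if result is not None:
                    return result
        return None
    if not data:
        return None
    if isinstance(head, Var):
        # A ?variable takes one item, and only if the restriction accepts it.
        if head.restriction is not None and not head.restriction(data[0]):
            return None
        return match(data[1:], rest, {**bindings, head.name: data[0]})
    # Anything else is a literal that must equal the next item.
    return match(data[1:], rest, bindings) if head == data[0] else None


def is_number(item):
    return isinstance(item, (int, float)) and not isinstance(item, bool)


pattern = [Var("x", restriction=is_number), Var("y", segment=True)]
print(match([1, 2, 3], pattern))    # {'x': 1, 'y': [2, 3]}
print(match(["a", 2, 3], pattern))  # None
```

The second call fails because the restriction rejects "a", which is the
point of the procedural constraint.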


This kind of thing would be useful in regexp pattern matching on
strings. The regexp syntax would need a pair of new symbols, such as @e
and @E, to mark the beginning and end of an embedded Pop-11 procedural
expression. It would also be useful to do away with regexp's limit of
nine numbered subexpressions (1-9) and allow something more general,
more akin to matches or lmatches. The subexpressions could be
identified by Pop-11 identifiers, or at least by properties associated
with the regular expression procedure, rather than by a numbered
look-up table (regexp_subexp).
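
As a rough illustration of both ideas (named subexpressions and a
procedural constraint on a subexpression), here is a small sketch in
Python rather than Pop-11, using Python's standard re module. It is an
analogy only, not a proposal for the Poplog syntax: constrained_search
is a name invented for this example, and the predicates are applied
after the regexp engine has produced each match, so they cannot
influence backtracking inside a match the way a truly embedded
procedure could.

```python
import re


def constrained_search(pattern, text, constraints):
    """Return the named-group bindings of the first match whose groups
    satisfy every predicate in constraints, or None if there is none."""
    for m in re.finditer(pattern, text):
        groups = m.groupdict()
        if all(pred(groups[name]) for name, pred in constraints.items()):
            return groups
    return None


# Subexpressions are identified by name (key, value) rather than by number.
pattern = r"(?P<key>[a-z]+)=(?P<value>\d+)"

# The "value" subexpression must also satisfy an arbitrary procedure.
result = constrained_search(pattern, "a=5 b=120 c=7",
                            {"value": lambda s: int(s) > 100})
print(result)  # {'key': 'b', 'value': '120'}
```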

(Steve also produced pmatches. Here is a 1992 e-mail thread of ours on
the subject:
sfk> luc> There's a modification of matching that recently occurred to me. It
sfk> luc> would involve doing something like matching a list against a pattern,
sfk> luc> except that the result would not be a boolean, (and there need not be
sfk> luc> side effects on identifiers, local or otherwise, though there could be)
sfk> luc> but the structure elements 'merged' into the pattern.
sfk> 
sfk> Yes -- this seems like a nice variant on the pattern matching theme.  By a
sfk> strange coincidence, -lmatches- can be easily modified to have pretty much
sfk> the effect you want.
sfk> 
sfk> What I am thinking of is adding the idea of a transforming procedure to a
sfk> pattern variable.  In other words a pattern variable now looks like this
sfk>     ? x : restriction # converter
sfk> with the obvious optional bits. i.e.
sfk>     ? x
sfk>     ? x : r
sfk>     ? x # c
sfk>     ? x : r # c
sfk> and the same for ?? variables.  The point of the converter procedure would be
sfk> to arrange for the pattern variables to be converted to different results if
sfk> the pattern match was successful.  Hmmmm not expressing myself well on this.
sfk> 
sfk> Here's an example of what I'm thinking about.
sfk> 
sfk>     instantiate( [1 2 3 4], [% ?? x # length, ?? y # length %] ) =>
sfk>     ** [0 4]
sfk> 
sfk> Anyway, I think I'll experiment & let you know what I come up with.
sfk> 

)
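
To make the converter idea from the thread concrete, here is a rough,
self-contained sketch in Python rather than Pop-11. It is not Steve's
implementation. PVar, _bind and this instantiate are names chosen for
illustration, and two details are my assumptions rather than anything
stated in the thread: segment variables try the shortest segment first,
and a ?? variable with no converter is spliced into the result.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class PVar:
    """Pattern variable in the style of: ? x : restriction # converter."""
    name: str
    segment: bool = False                                # True for ?? variables
    restriction: Optional[Callable[[Any], bool]] = None  # the ": r" part
    converter: Optional[Callable[[Any], Any]] = None     # the "# c" part


def _bind(data, pattern):
    """Return (pattern element, matched value) pairs, or None if no match."""
    if not pattern:
        return [] if not data else None
    head, rest = pattern[0], pattern[1:]
    if isinstance(head, PVar):
        # ?var takes exactly one item; ??var tries every length, shortest first.
        sizes = range(len(data) + 1) if head.segment else [1]
        for n in sizes:
            if n > len(data):
                continue
            value = data[:n] if head.segment else data[0]
            if head.restriction is not None and not head.restriction(value):
                continue
            tail = _bind(data[n:], rest)
            if tail is not None:
                return [(head, value)] + tail
        return None
    if data and data[0] == head:
        tail = _bind(data[1:], rest)
        return None if tail is None else [(head, head)] + tail
    return None


def instantiate(data, pattern):
    """Match data against pattern and return the pattern with each variable
    replaced by its (converted) value, or None if the match fails."""
    pairs = _bind(data, pattern)
    if pairs is None:
        return None
    out = []
    for elem, value in pairs:
        if isinstance(elem, PVar) and elem.converter is not None:
            out.append(elem.converter(value))
        elif isinstance(elem, PVar) and elem.segment:
            out.extend(value)  # assumption: unconverted segments are spliced
        else:
            out.append(value)
    return out


pattern = [PVar("x", segment=True, converter=len),
           PVar("y", segment=True, converter=len)]
print(instantiate([1, 2, 3, 4], pattern))  # [0, 4]
```

Under those assumptions the sketch reproduces the [0 4] result from
Steve's example: the first ?? variable takes the empty segment and the
second takes all four items.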

I have a copy of the library; e-mail me if you want it (something for
the bham archive?).

I'll try to find some time to give this some thought and let the group
know if I come up with something interesting.

-- 
Luc Beaudoin
Abatis Systems Corporation          http://www.abatis-sys.com
4190 Still Creek Drive, Suite 200
Burnaby BC V5C 6C6
E-mail address obfuscated for anti-spam purposes:
e-mail username: lbeaudoin
e-mail network:           @abatis-sys.com