--- cl-lexer-1.orig/README
+++ cl-lexer-1/README
@@ -0,0 +1,74 @@
+LEXER package
+
+The LEXER package implements a lexical-analyzer-generator called DEFLEXER,
+which is built on top of both REGEX and CLAWK. Many of the optimizations in the
+recent rewrite of the regex engine went into optimizing the sorts of patterns
+generated by DEFLEXER.
+
+The default lexer doesn't implement full greediness. If you have a rule for
+ints followed by a rule for floats, the int rule will match on the part before
+the decimal before the float rule gets a chance to look at it. You can fix this
+by specifying :flex-compatible as the first rule. This gives all patterns a
+chance to examine the text and takes the one that matches the longest string
+(first pattern wins in case of a tie). The downside of this option is that it
+slows down the analyser. If you can solve the issue by reordering your rules
+that's the way to do it.
+
+I'm currently writing an AWK->CLAWK translator using this as the lexer, and
+it's working fine. As far as I can tell, the DEFLEXER-generated lexing
+functions should be fast enough for production use.
+
+Currently, the LEX/FLEX/BISON feature of switching productions on and off using
+state variables is not supported, but it's a pretty simple feature to add. If
+you're using LEXER and discover you need this feature, let me know.
+
+It also doesn't yet support prefix and postfix context patterns. This isn't
+quite so trivial to add, but it's planned for a future release of regex, so
+LEXER will be getting it someday.
+
+Anyway, here's a simple DEFLEXER example:
+
+    (deflexer test-lexer
+      ("[0-9]+([.][0-9]+([Ee][0-9]+)?)"
+       (return (values 'flt (num %0))))
+      ("[0-9]+"
+       (return (values 'int (int %0))))
+      ("[:alpha:][:alnum:]*"
+       (return (values 'name %0)))
+      ("[:space:]+") )
+
+    > (setq *lex* (test-lexer "1.0 12 fred 10.23e45"))
+
+
+    > (funcall *lex*)
+    FLT
+    1.0
+
+    > (funcall *lex*)
+    INT
+    12
+
+    > (funcall *lex*)
+    NAME
+    "fred"
+
+    > (funcall *lex*)
+    FLT
+    1.0229999999999997E46
+
+    > (funcall *lex*)
+    NIL
+    NIL
+
+You can also write this lexer using the :flex-compatible option, in which case
+you can write the int and flt rules in any order.
+
+(deflexer test-lexer
+  :flex-compatible
+  ("[0-9]+"
+   (return (values 'int (int %0))))
+  ("[0-9]+([.][0-9]+([Ee][0-9]+)?)"
+   (return (values 'flt (num %0))))
+  ("[:space:]+")
+  )
+
--- cl-lexer-1.orig/README.Debian
+++ cl-lexer-1/README.Debian
@@ -0,0 +1,6 @@
+cl-lexer for Debian
+-------------------
+
+Load the file ``example.lisp'' from the examples directory and
+evaluate ``(test)''.
+
--- cl-lexer-1.orig/debian/changelog
+++ cl-lexer-1/debian/changelog
@@ -0,0 +1,34 @@
+cl-lexer (1-5) unstable; urgency=low
+
+  * Updating debhelper level (Closes: #817396)
+
+ -- Matthew Danish Fri, 11 Mar 2016 13:54:17 +0000
+
+cl-lexer (1-4) unstable; urgency=low
+
+  * Re-adopting. Closes: #377920
+  * Added debian/compat file.
+  * Added documentation from homepage.
+
+ -- Matthew Danish Sat, 02 Jun 2007 21:00:18 -0400
+
+cl-lexer (1-3) unstable; urgency=low
+
+  * QA Upload
+  * Set Maintainer to QA Group, Orphaned: #377920
+  * Move debhelper from B-D-I to B-D
+
+ -- Michael Ablassmeier Mon, 31 Jul 2006 15:21:53 +0200
+
+cl-lexer (1-2) unstable; urgency=low
+
+  * Fixed a bug in the macroexpansion of deflexer that prevented
+    proper compilation of such forms.
+
+ -- Matthew Danish Sat, 7 Dec 2002 07:23:46 -0500
+
+cl-lexer (1-1) unstable; urgency=low
+
+  * Initial release. (Closes: #168334)
+
+ -- Matthew Danish Fri, 8 Nov 2002 13:23:50 -0500
--- cl-lexer-1.orig/debian/compat
+++ cl-lexer-1/debian/compat
@@ -0,0 +1 @@
+9
--- cl-lexer-1.orig/debian/control
+++ cl-lexer-1/debian/control
@@ -0,0 +1,13 @@
+Source: cl-lexer
+Section: devel
+Priority: optional
+Maintainer: Matthew Danish
+Build-Depends: debhelper (>> 9.0.0)
+Standards-Version: 3.9.7
+
+Package: cl-lexer
+Architecture: all
+Depends: common-lisp-controller (>= 3.37), cl-regex, ${misc:Depends}
+Description: Lexical-analyzer-generator package for Common Lisp
+ Implements a lexical-analyzer-generator called DEFLEXER, built on top
+ of REGEX and CLAWK.
--- cl-lexer-1.orig/debian/copyright
+++ cl-lexer-1/debian/copyright
@@ -0,0 +1,34 @@
+Debian Copyright Section
+========================
+
+Upstream Source URL: http://www.geocities.com/mparker762/clawk.html
+Upstream Author: Kenneth Michael Parker
+Debian Maintainer: Matthew Danish
+
+Upstream Copyright Statement
+============================
+
+Copyright (c) 2000,2001,2002 Kenneth Michael Parker
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice, this list of conditions and the following disclaimer.
+2. Redistributions in binary form must reproduce the above copyright
+   notice, this list of conditions and the following disclaimer in the
+   documentation and/or other materials provided with the distribution.
+3. The name of the author may not be used to endorse or promote products
+   derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
+IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
+OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
+INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
+THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--- cl-lexer-1.orig/debian/docs
+++ cl-lexer-1/debian/docs
@@ -0,0 +1 @@
+README
--- cl-lexer-1.orig/debian/postinst
+++ cl-lexer-1/debian/postinst
@@ -0,0 +1,52 @@
+#! /bin/sh
+# postinst script for cl-lexer
+#
+# see: dh_installdeb(1)
+
+set -e
+
+# package name according to lisp
+LISP_PKG=lexer
+
+# summary of how this script can be called:
+#        * `configure'
+#        * `abort-upgrade'
+#        * `abort-remove' `in-favour'
+#
+#        * `abort-deconfigure' `in-favour'
+#          `removing'
+#
+# for details, see http://www.debian.org/doc/debian-policy/ or
+# the debian-policy package
+#
+# quoting from the policy:
+#     Any necessary prompting should almost always be confined to the
+#     post-installation script, and should be protected with a conditional
+#     so that unnecessary prompting doesn't happen if a package's
+#     installation fails and the `postinst' is called with `abort-upgrade',
+#     `abort-remove' or `abort-deconfigure'.
+
+case "$1" in
+    configure)
+        register-common-lisp-source ${LISP_PKG}
+
+    ;;
+
+    abort-upgrade|abort-remove|abort-deconfigure)
+
+    ;;
+
+    *)
+        echo "postinst called with unknown argument \`$1'" >&2
+        exit 1
+    ;;
+esac
+
+# dh_installdeb will replace this with shell code automatically
+# generated by other debhelper scripts.
+
+#DEBHELPER#
+
+exit 0
+
+
--- cl-lexer-1.orig/debian/prerm
+++ cl-lexer-1/debian/prerm
@@ -0,0 +1,42 @@
+#! /bin/sh
+# prerm script for cl-lexer
+#
+# see: dh_installdeb(1)
+
+set -e
+
+# package name according to lisp
+LISP_PKG=lexer
+
+# summary of how this script can be called:
+#        * `remove'
+#        * `upgrade'
+#        * `failed-upgrade'
+#        * `remove' `in-favour'
+#        * `deconfigure' `in-favour'
+#          `removing'
+#
+# for details, see http://www.debian.org/doc/debian-policy/ or
+# the debian-policy package
+
+
+case "$1" in
+    remove|upgrade|deconfigure)
+        unregister-common-lisp-source ${LISP_PKG}
+        ;;
+    failed-upgrade)
+        ;;
+    *)
+        echo "prerm called with unknown argument \`$1'" >&2
+        exit 1
+    ;;
+esac
+
+# dh_installdeb will replace this with shell code automatically
+# generated by other debhelper scripts.
+
+#DEBHELPER#
+
+exit 0
+
+
--- cl-lexer-1.orig/debian/rules
+++ cl-lexer-1/debian/rules
@@ -0,0 +1,89 @@
+#!/usr/bin/make -f
+
+pkg := lexer
+debpkg := cl-lexer
+
+files := packages.lisp lexer.lisp
+examples := example.lisp
+docs := README.Debian
+
+clc-source := usr/share/common-lisp/source
+clc-systems := usr/share/common-lisp/systems
+clc-pkg := $(clc-source)/$(pkg)
+doc-dir := usr/share/doc/$(debpkg)
+
+
+configure: configure-stamp
+configure-stamp:
+	dh_testdir
+	# Add here commands to configure the package.
+
+	touch configure-stamp
+
+
+build-arch: build
+build-indep: build
+build: build-stamp
+
+build-stamp: configure-stamp
+	dh_testdir
+
+	# Add here commands to compile the package.
+	touch build-stamp
+
+clean:
+	dh_testdir
+	dh_testroot
+	rm -f build-stamp configure-stamp
+	# Add here commands to clean up after the build process.
+	rm -f debian/$(debpkg).postinst.* debian/$(debpkg).prerm.*
+	dh_clean
+
+install: build
+	dh_testdir
+	dh_testroot
+	dh_prep
+	# Add here commands to install the package into debian/$(pkg).
+	dh_installdirs $(clc-systems) $(clc-pkg) $(doc-dir)
+	chmod 644 $(files) $(pkg).asd $(examples)
+	dh_install $(pkg).asd $(files) $(clc-pkg)
+	dh_install $(docs) $(doc-dir)
+	dh_link $(clc-pkg)/$(pkg).asd $(clc-systems)/$(pkg).asd
+
+
+# Build architecture-dependent files here.
+binary-arch: build install
+
+# Build architecture-independent files here.
+binary-indep: build install
+	dh_testdir
+	dh_testroot
+#	dh_installdebconf
+	dh_installdocs
+	dh_installexamples $(examples)
+	dh_link
+#	dh_installmenu
+#	dh_installlogrotate
+#	dh_installemacsen
+#	dh_installpam
+#	dh_installmime
+#	dh_installinit
+#	dh_installcron
+#	dh_installman
+#	dh_installinfo
+#	dh_undocumented
+	dh_installchangelogs
+	dh_strip
+	dh_compress
+	dh_fixperms
+#	dh_makeshlibs
+	dh_installdeb
+#	dh_perl
+	dh_shlibdeps
+	dh_gencontrol
+	dh_md5sums
+	dh_builddeb
+
+binary: binary-indep binary-arch
+.PHONY: build clean binary-indep binary-arch binary install configure
+
--- cl-lexer-1.orig/debian/source/format
+++ cl-lexer-1/debian/source/format
@@ -0,0 +1 @@
+1.0
--- cl-lexer-1.orig/example.lisp
+++ cl-lexer-1/example.lisp
@@ -0,0 +1,24 @@
+(eval-when (:compile-toplevel :load-toplevel :execute)
+  (require :lexer))
+
+(defpackage #:test-lexer
+  (:use #:common-lisp #:lexer)
+  (:export #:test))
+
+(in-package #:test-lexer)
+
+(deflexer test-lexer
+  ("[0-9]+([.][0-9]+([Ee][0-9]+)?)"
+   (return (values 'flt (num %0))))
+  ("[0-9]+"
+   (return (values 'int (int %0))))
+  ("[:alpha:][:alnum:]*"
+   (return (values 'name %0)))
+  ("[:space:]+") )
+
+
+(defparameter *lex* (test-lexer "1.0 12 fred 10.23e12"))
+
+(defun test ()
+  (loop repeat 5
+        collect (multiple-value-list (funcall *lex*))))
--- cl-lexer-1.orig/lexer.asd
+++ cl-lexer-1/lexer.asd
@@ -0,0 +1,9 @@
+;;; -*- Mode: Lisp; Syntax: ANSI-Common-lisp; Package: CL-USER; Base: 10 -*-
+
+(in-package "CL-USER")
+
+(asdf:defsystem lexer
+  :depends-on (regex)
+  :components ((:file "packages")
+               (:file "lexer")))
+
--- cl-lexer-1.orig/lexer.lisp
+++ cl-lexer-1/lexer.lisp
@@ -100,7 +100,7 @@
       (multiple-value-bind (patterns actions)
           (extract-patterns-and-actions rules)
         `(progn
-           (defparameter ,matchervar ,(macroexpand-regex-expr (combine-patterns patterns)))
+           (defparameter ,matchervar (macroexpand-regex-expr ',(combine-patterns patterns)))
            (defun ,name (,strvar &key (start 0)
                                       (end (length ,strvar))
                                       (end-token nil)
@@ -146,7 +146,7 @@
                         (error "~A lexing failure (unknown token) in ~S @ ~D, ~S ~S ~S ~S"
                                ',name ,strvar start ,rcvar
                                ,matchstartvar ,matchlenvar ,matchregsvar))))
-                 finally return (values end-token end-value))))))))))
+                 finally (return (values end-token end-value)))))))))))
 
 #+:Lispworks (editor:setup-indent "deflexer" 1 2 10)
 ; Pull out the patterns and actions, with rule numbers so we can keep them associated
@@ -158,7 +158,7 @@
         do (progn
              (push `(,pat ,rulenum) patterns)
              (push `(,action ,rulenum) actions))
-        finally return (values (nreverse patterns) (nreverse actions))))
+        finally (return (values (nreverse patterns) (nreverse actions)))))
 
 ; Combines patterns into one big ALT, with clause extended with a new success node
 ; that returns the pattern number that matched.