pax_global_header00006660000000000000000000000064124705522040014513gustar00rootroot0000000000000052 comment=b9273e28867979385c0f1eb4f68ffb9f45367817 parsley-clojure-0.9.3/000077500000000000000000000000001247055220400146445ustar00rootroot00000000000000parsley-clojure-0.9.3/README.md000066400000000000000000000177461247055220400161420ustar00rootroot00000000000000# Parsley ## Parsnip Parsley has been a test bed and a proof of concept for total incremental parsers. However it suffers from severe limitations (mainly revolving around lookaheads, both at the lexeme and production level) which hinder further development and acceptance. Further development of the concepts and techniques explored in Parsley will occur in [Parsnip](https://github.com/cgrand/parsnip/). ## Introduction Parsley generates *total and truly incremental parsers*. Total: a Parsley parser *yields a parse-tree for any input string*. Truly incremental: a Parsley parser can operate as a text buffer, in best cases recomputing the parse-tree after a sequence of edits happens in *logarithmic time* (worst case: it behaves like a restartable parser). Parsley parsers have *no separate lexer*, this allows for better compositionality of grammars. For now Parsley uses the same technique (for lexer-less parsing) as described in this paper: Context-Aware Scanning for Parsing Extensible Languages http://www.umsec.umn.edu/publications/Context-Aware-Scanning-Parsing-Extensible-Language (I independently rediscovered this technique and dubbed it LR+.) Without a separate lexer, a language is entirely defined by its grammar. A grammar is an alternation of keywords (non-terminal names) and other values. A keyword and another value form a production rule. ## Specifying grammars A simple grammar is: :expr #{"x" ["(" :expr* ")"]} `x` `()` `(xx)` `((x)())` are recognized by this grammar. By default the main production of a grammar is the first one. A production right value is a combination of: * strings and regexes (terminals -- the set of terminal types is broader and even open, more later) * keywords (non-terminals) which can be suffixed by `*`, `+` or `?` to denote repetitions or options. * sets to denote an alternative * vectors to denote a sequence. Inside vectors `:*`, `:+` and `:?` are postfix unary operators. That is `["ab" :+]` denotes a non-empty repetition of the `ab` string A production left value is always a keyword. If this keyword is suffixed by `-`, no node will be generated in the parse-tree for this rule, its child nodes are inlined in the parent node. Rules with such names are called anonymous rules. An anonymous rule must be referred to by its base name (without the `-`). These two grammars specify the same language but the resulting parse-trees will be different (additional `:expr-rep` nodes): :expr #{"x" ["(" :expr* ")"]} :expr #{"x" :expr-rep} :expr-rep ["(" :expr* ")"] These two grammars specify the same language and the same parse-trees: :expr #{"x" ["(" :expr* ")"]} :expr #{"x" :expr-rep} :expr-rep- ["(" :expr* ")"] ## Creating parsers A parser is created using the `parser` or `make-parser` functions. (require '[net.cgrand.parsley :as p]) (def p (p/parser :expr #{"x" ["(" :expr* ")"]})) (pprint (p "(x(x))")) {:tag :net.cgrand.parsley/root, :content [{:tag :expr, :content ["(" {:tag :expr, :content ["x"]} {:tag :expr, :content ["(" {:tag :expr, :content ["x"]} ")"]} ")"]}]} ; running on malformed input with garbage (pprint (p "a(zldxn(dez)")) {:tag :net.cgrand.parsley/unfinished, :content [{:tag :net.cgrand.parsley/unexpected, :content ["a"]} {:tag :net.cgrand.parsley/unfinished, :content ["(" {:tag :net.cgrand.parsley/unexpected, :content ["zld"]} {:tag :expr, :content ["x"]} {:tag :net.cgrand.parsley/unexpected, :content ["n"]} {:tag :expr, :content ["(" {:tag :net.cgrand.parsley/unexpected, :content ["dez"]} ")"]}]}]} ## Creating buffers Creating a buffer, editing it and getting its resulting parse-tree: (-> p p/incremental-buffer (p/edit 0 0 "(") (p/edit 1 0 "(x)") p/parse-tree pprint) {:tag :net.cgrand.parsley/unfinished, :content [{:tag :net.cgrand.parsley/unfinished, :content ["(" {:tag :expr, :content ["(" {:tag :expr, :content ["x"]} ")"]}]}]} Incremental parsing at work: => (def p (p/parser :expr #{"x" "\n" ["(" :expr* ")"]})) #'net.cgrand.parsley/p => (let [line (apply str "\n" (repeat 10 "((x))")) input (str "(" (apply str (repeat 1000 line)) ")") buf (p/incremental-buffer p) buf (p/edit buf 0 0 input)] (time (p/parse-tree buf)) (time (p/parse-tree (-> buf (p/edit 2 0 "(") (p/edit 51002 0 ")")))) nil) "Elapsed time: 508.834 msecs" "Elapsed time: 86.038 msecs" nil Hence, *reparsing the buffer only took a fraction of the original time* despite the buffer having been modified at the start and at the end. ## Incremental parsing The input string is split into _chunks_ (lines by default) and chunks are always reparsed as a whole, so don't experiment with incremental parsing with 1-line inputs! Let's look at a bit more complex example: => (def p (p/parser {:main :expr* :space :ws? :make-node (fn [tag content] {:tag tag :content content :id (gensym)})} :ws #"\s+" :expr #{#"\w+" ["(" :expr* ")"]})) This example introduces the option map: if the first arg to `parser` is a map (instead of a keyword), it's a map of options. See Options for more. The important option here is that we redefine how nodes of the parse-tree are constructed (via the `make-node` option). We add a unique identifier to each node. Now let's create a 3-line input and parse it: => (def buf (-> p incremental-buffer (edit 0 0 "((a)\n(b)\n(c))"))) => (-> buf parse-tree pprint) nil {:tag :net.cgrand.parsley/root, :content [{:tag :expr, :content ["(" {:tag :expr, :content ["(" {:tag :expr, :content ["a"], :id G__1806} ")"], :id G__1807} {:tag :ws, :content ["\n"], :id G__1808} {:tag :expr, :content ["(" {:tag :expr, :content ["b"], :id G__1809} ")"], :id G__1810} {:tag :ws, :content ["\n"], :id G__1811} {:tag :expr, :content ["(" {:tag :expr, :content ["c"], :id G__1812} ")"], :id G__1813} ")"], :id G__1814}], :id G__1815} Now, let's modify this "B" in "BOO" and parse the buffer again: => (-> buf (edit 6 1 "BOO") parse-tree pprint) nil {:tag :net.cgrand.parsley/root, :content [{:tag :expr, :content ["(" {:tag :expr, :content ["(" {:tag :expr, :content ["a"], :id G__1806} ")"], :id G__1807} {:tag :ws, :content ["\n"], :id G__1818} {:tag :expr, :content ["(" {:tag :expr, :content ["BOO"], :id G__1819} ")"], :id G__1820} {:tag :ws, :content ["\n"], :id G__1811} {:tag :expr, :content ["(" {:tag :expr, :content ["c"], :id G__1812} ")"], :id G__1813} ")"], :id G__1821}], :id G__1822} ----- We can spot that 5 out of the 10 nodes are shared with the previous parse-tree. ## Options `:main` specifies the root production, by default this is the the first production of the grammar. `:root-tag` specifies the tag name to use for the root node (`:net.cgrand.parsley/root` by default). `:space` specifies a production which will be interspersed between every symbol (terminal or not) *except in a sequence created with `unspaced`.* `:make-node` specifies a function whose arglist is `[tag children-vec]` which returns a new node. By default create instances the Node record with keys `tag` and `content`. `:make-unexpected` specifies a 1-arg function which converts a string (of unexpected characters) to a node. By defaut delegates to `:make-node`. `:make-leaf` specifies a 1-arg function which converts a string (token) to a node, by default behaves like identity. parsley-clojure-0.9.3/epl-v10.html000066400000000000000000000305601247055220400167220ustar00rootroot00000000000000 Eclipse Public License - Version 1.0

Eclipse Public License - v 1.0

THE ACCOMPANYING PROGRAM IS PROVIDED UNDER THE TERMS OF THIS ECLIPSE PUBLIC LICENSE ("AGREEMENT"). ANY USE, REPRODUCTION OR DISTRIBUTION OF THE PROGRAM CONSTITUTES RECIPIENT'S ACCEPTANCE OF THIS AGREEMENT.

1. DEFINITIONS

"Contribution" means:

a) in the case of the initial Contributor, the initial code and documentation distributed under this Agreement, and

b) in the case of each subsequent Contributor:

i) changes to the Program, and

ii) additions to the Program;

where such changes and/or additions to the Program originate from and are distributed by that particular Contributor. A Contribution 'originates' from a Contributor if it was added to the Program by such Contributor itself or anyone acting on such Contributor's behalf. Contributions do not include additions to the Program which: (i) are separate modules of software distributed in conjunction with the Program under their own license agreement, and (ii) are not derivative works of the Program.

"Contributor" means any person or entity that distributes the Program.

"Licensed Patents" mean patent claims licensable by a Contributor which are necessarily infringed by the use or sale of its Contribution alone or when combined with the Program.

"Program" means the Contributions distributed in accordance with this Agreement.

"Recipient" means anyone who receives the Program under this Agreement, including all Contributors.

2. GRANT OF RIGHTS

a) Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, distribute and sublicense the Contribution of such Contributor, if any, and such derivative works, in source code and object code form.

b) Subject to the terms of this Agreement, each Contributor hereby grants Recipient a non-exclusive, worldwide, royalty-free patent license under Licensed Patents to make, use, sell, offer to sell, import and otherwise transfer the Contribution of such Contributor, if any, in source code and object code form. This patent license shall apply to the combination of the Contribution and the Program if, at the time the Contribution is added by the Contributor, such addition of the Contribution causes such combination to be covered by the Licensed Patents. The patent license shall not apply to any other combinations which include the Contribution. No hardware per se is licensed hereunder.

c) Recipient understands that although each Contributor grants the licenses to its Contributions set forth herein, no assurances are provided by any Contributor that the Program does not infringe the patent or other intellectual property rights of any other entity. Each Contributor disclaims any liability to Recipient for claims brought by any other entity based on infringement of intellectual property rights or otherwise. As a condition to exercising the rights and licenses granted hereunder, each Recipient hereby assumes sole responsibility to secure any other intellectual property rights needed, if any. For example, if a third party patent license is required to allow Recipient to distribute the Program, it is Recipient's responsibility to acquire that license before distributing the Program.

d) Each Contributor represents that to its knowledge it has sufficient copyright rights in its Contribution, if any, to grant the copyright license set forth in this Agreement.

3. REQUIREMENTS

A Contributor may choose to distribute the Program in object code form under its own license agreement, provided that:

a) it complies with the terms and conditions of this Agreement; and

b) its license agreement:

i) effectively disclaims on behalf of all Contributors all warranties and conditions, express and implied, including warranties or conditions of title and non-infringement, and implied warranties or conditions of merchantability and fitness for a particular purpose;

ii) effectively excludes on behalf of all Contributors all liability for damages, including direct, indirect, special, incidental and consequential damages, such as lost profits;

iii) states that any provisions which differ from this Agreement are offered by that Contributor alone and not by any other party; and

iv) states that source code for the Program is available from such Contributor, and informs licensees how to obtain it in a reasonable manner on or through a medium customarily used for software exchange.

When the Program is made available in source code form:

a) it must be made available under this Agreement; and

b) a copy of this Agreement must be included with each copy of the Program.

Contributors may not remove or alter any copyright notices contained within the Program.

Each Contributor must identify itself as the originator of its Contribution, if any, in a manner that reasonably allows subsequent Recipients to identify the originator of the Contribution.

4. COMMERCIAL DISTRIBUTION

Commercial distributors of software may accept certain responsibilities with respect to end users, business partners and the like. While this license is intended to facilitate the commercial use of the Program, the Contributor who includes the Program in a commercial product offering should do so in a manner which does not create potential liability for other Contributors. Therefore, if a Contributor includes the Program in a commercial product offering, such Contributor ("Commercial Contributor") hereby agrees to defend and indemnify every other Contributor ("Indemnified Contributor") against any losses, damages and costs (collectively "Losses") arising from claims, lawsuits and other legal actions brought by a third party against the Indemnified Contributor to the extent caused by the acts or omissions of such Commercial Contributor in connection with its distribution of the Program in a commercial product offering. The obligations in this section do not apply to any claims or Losses relating to any actual or alleged intellectual property infringement. In order to qualify, an Indemnified Contributor must: a) promptly notify the Commercial Contributor in writing of such claim, and b) allow the Commercial Contributor to control, and cooperate with the Commercial Contributor in, the defense and any related settlement negotiations. The Indemnified Contributor may participate in any such claim at its own expense.

For example, a Contributor might include the Program in a commercial product offering, Product X. That Contributor is then a Commercial Contributor. If that Commercial Contributor then makes performance claims, or offers warranties related to Product X, those performance claims and warranties are such Commercial Contributor's responsibility alone. Under this section, the Commercial Contributor would have to defend claims against the other Contributors related to those performance claims and warranties, and if a court requires any other Contributor to pay any damages as a result, the Commercial Contributor must pay those damages.

5. NO WARRANTY

EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OR CONDITIONS OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Each Recipient is solely responsible for determining the appropriateness of using and distributing the Program and assumes all risks associated with its exercise of rights under this Agreement , including but not limited to the risks and costs of program errors, compliance with applicable laws, damage to or loss of data, programs or equipment, and unavailability or interruption of operations.

6. DISCLAIMER OF LIABILITY

EXCEPT AS EXPRESSLY SET FORTH IN THIS AGREEMENT, NEITHER RECIPIENT NOR ANY CONTRIBUTORS SHALL HAVE ANY LIABILITY FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING WITHOUT LIMITATION LOST PROFITS), HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OR DISTRIBUTION OF THE PROGRAM OR THE EXERCISE OF ANY RIGHTS GRANTED HEREUNDER, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

7. GENERAL

If any provision of this Agreement is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this Agreement, and without further action by the parties hereto, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable.

If Recipient institutes patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Program itself (excluding combinations of the Program with other software or hardware) infringes such Recipient's patent(s), then such Recipient's rights granted under Section 2(b) shall terminate as of the date such litigation is filed.

All Recipient's rights under this Agreement shall terminate if it fails to comply with any of the material terms or conditions of this Agreement and does not cure such failure in a reasonable period of time after becoming aware of such noncompliance. If all Recipient's rights under this Agreement terminate, Recipient agrees to cease use and distribution of the Program as soon as reasonably practicable. However, Recipient's obligations under this Agreement and any licenses granted by Recipient relating to the Program shall continue and survive.

Everyone is permitted to copy and distribute copies of this Agreement, but in order to avoid inconsistency the Agreement is copyrighted and may only be modified in the following manner. The Agreement Steward reserves the right to publish new versions (including revisions) of this Agreement from time to time. No one other than the Agreement Steward has the right to modify this Agreement. The Eclipse Foundation is the initial Agreement Steward. The Eclipse Foundation may assign the responsibility to serve as the Agreement Steward to a suitable separate entity. Each new version of the Agreement will be given a distinguishing version number. The Program (including Contributions) may always be distributed subject to the version of the Agreement under which it was received. In addition, after a new version of the Agreement is published, Contributor may elect to distribute the Program (including its Contributions) under the new version. Except as expressly stated in Sections 2(a) and 2(b) above, Recipient receives no rights or licenses to the intellectual property of any Contributor under this Agreement, whether expressly, by implication, estoppel or otherwise. All rights in the Program not expressly granted under this Agreement are reserved.

This Agreement is governed by the laws of the State of New York and the intellectual property laws of the United States of America. No party to this Agreement will bring a legal action under this Agreement more than one year after the cause of action arose. Each party waives its rights to a jury trial in any resulting litigation.

parsley-clojure-0.9.3/pom.xml000066400000000000000000000033531247055220400161650ustar00rootroot00000000000000 4.0.0 net.cgrand parsley 0.9.1 parsley a generator of total and truly incremental parsers http://github.com/cgrand/parsley/ d24a5040fd8417371f17366dd3e16fdc37a41a9d src test resources test-resources central http://repo1.maven.org/maven2 clojars http://clojars.org/repo/ org.clojure clojure [1.2.0,) net.cgrand regex 1.1.0 parsley-clojure-0.9.3/project.clj000066400000000000000000000004441247055220400170060ustar00rootroot00000000000000(defproject net.cgrand/parsley "0.9.3" :description "a generator of total and truly incremental parsers" :url "http://github.com/cgrand/parsley/" :license {:name "Eclipse Public License 1.0"} :dependencies [[org.clojure/clojure "1.4.0"] [net.cgrand/regex "1.1.0"]]) parsley-clojure-0.9.3/src/000077500000000000000000000000001247055220400154335ustar00rootroot00000000000000parsley-clojure-0.9.3/src/net/000077500000000000000000000000001247055220400162215ustar00rootroot00000000000000parsley-clojure-0.9.3/src/net/cgrand/000077500000000000000000000000001247055220400174575ustar00rootroot00000000000000parsley-clojure-0.9.3/src/net/cgrand/parsley.clj000066400000000000000000000077161247055220400216430ustar00rootroot00000000000000; Copyright (c) Christophe Grand. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns net.cgrand.parsley "A total truly incremental parser generator. Grammars are expressed in a value-based DSL." (:require [net.cgrand.parsley.lrplus :as core] [net.cgrand.parsley.fold :as f] [net.cgrand.parsley.tree :as t] [net.cgrand.parsley.util :as u] [net.cgrand.parsley.grammar :as g])) (defrecord Node [tag content]) ; for memory efficiency (defn- stepper [table options-map] (let [options-map (merge {:make-node #(Node. %1 %2) :make-leaf nil} ; nil for identity options-map) options-map (if-not (:make-unexpected options-map) (let [make-node (:make-node options-map) make-leaf (or (:make-leaf options-map) identity)] (assoc options-map :make-unexpected #(make-node ::unexpected [(make-leaf %)]))) options-map) ops (select-keys options-map [:make-node :make-leaf :make-unexpected])] ^{::options options-map} ; feeling dirty, metadata make me uneasy (fn self ([s] (let [a (self core/zero s) b (self a nil)] (-> (f/stitch a b) (nth 2) f/finish))) ([state s] (core/step table ops state s))))) (defn- flatten-rules [rules] (if (map? rules) (apply concat rules) rules)) (defn make-parser [options-map rules] (-> (apply g/grammar options-map (flatten-rules rules)) core/lr-table core/totalize core/number-states (stepper options-map))) (defn parser [options-map & rules] (let [[options-map rules] (if-not (map? options-map) [{} (cons options-map rules)] [options-map rules])] (make-parser options-map rules))) (defn unspaced "Creates an unspaced sequence." [& specs] (apply g/unspaced specs)) (defn- memoize-parser [f] (let [cache (atom nil)] (fn [input] (u/cond [last-result @cache new-result (f/rebase last-result input)] (if (identical? last-result new-result) last-result (reset! cache new-result)) (reset! cache (f input)))))) (defn- memoize1 [parser s] (memoize-parser #(parser % s))) (defn- memoize2 [mpa mpb] (memoize-parser #(let [a (mpa %) b (mpb a)] (f/stitch a b)))) (defn- memoize-1shot [f] (let [cache (atom [(Object.) nil])] (fn [& args] (let [[cargs cr] @cache] (if (= args cargs) cr (let [r (apply f args)] (reset! cache [args r]) r)))))) (defn- memoize-eof [parser] (let [mp (memoize1 parser nil)] (memoize-1shot #(-> (f/stitch % (mp %)) (nth 2) f/finish)))) (defn incremental-buffer "Creates an empty incremental buffer for the specified parser." [parser] {:buffer (t/buffer {:unit #(memoize1 parser %) :plus memoize2 :chunk #(.split ^String % "(?<=\n)") :left #(subs %1 0 %2) :right subs :cat str}) :eof-parser (memoize-eof parser) :options (::options (meta parser))}) (defn edit "Returns a new buffer reflecting the specified edit." [incremental-buffer offset length s] (update-in incremental-buffer [:buffer] t/edit offset length s)) (defn parse-tree "Returns the parse-tree." [incremental-buffer] (let [f (t/value (:buffer incremental-buffer)) a (f core/zero)] ((:eof-parser incremental-buffer) a))) parsley-clojure-0.9.3/src/net/cgrand/parsley/000077500000000000000000000000001247055220400211365ustar00rootroot00000000000000parsley-clojure-0.9.3/src/net/cgrand/parsley/fold.clj000066400000000000000000000102731247055220400225570ustar00rootroot00000000000000; Copyright (c) Christophe Grand. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns net.cgrand.parsley.fold (:require [net.cgrand.parsley.util :as u]) (:refer-clojure :exclude [cat])) (defprotocol EphemeralFolding (unexpected! [this s]) (node! [this tag n]) (leaf! [this s]) (cat! [this pfq])) ; nodes is a collection of nodes (returned by make-node) ; pending is an alternate collection of tag, coll of nodes, integer (deftype FoldingQueue [^java.util.ArrayList pending ^java.util.ArrayList nodes ^java.util.ArrayList offsets make-node make-leaf make-unexpected] EphemeralFolding (unexpected! [this s] (when make-unexpected (.add nodes (make-unexpected s)) this)) (node! [this tag N] (let [n (.size offsets)] (u/cond (or (> N n) (neg? N)) (do (doto pending (.add tag) (.add (vec nodes)) (.add (- N n))) (.clear nodes) (.clear offsets)) :let [m (- n N) offset (.get offsets m) _ (-> offsets (.subList (inc m) n) .clear)] tag (let [tail (.subList nodes offset (.size nodes)) children (vec (.toArray tail))] (.clear tail) (.add nodes (make-node tag children)))) this)) (leaf! [this s] (let [leaf (if make-leaf (make-leaf s) s) offset (.size nodes)] (.add offsets offset) (.add nodes leaf) this)) (cat! [this pfq] (doseq [[tag pnodes n] (partition 3 (:pending pfq))] (.addAll nodes pnodes) (.node! this tag n)) (let [n (.size nodes)] (.addAll nodes (:nodes pfq)) (doseq [offset (:offsets pfq)] (.add offsets (+ n offset)))) this) clojure.lang.IDeref (deref [this] {:pending (vec pending) :nodes (vec nodes) :offsets (vec offsets) :make-node make-node :make-leaf make-leaf :make-unexpected make-unexpected})) (defn folding-queue [{:keys [pending nodes offsets make-node make-leaf make-unexpected] :or {pending [] nodes [] offsets []}}] (FoldingQueue. (java.util.ArrayList. ^java.util.Collection pending) (java.util.ArrayList. ^java.util.Collection nodes) (java.util.ArrayList. ^java.util.Collection offsets) make-node make-leaf make-unexpected)) (defn cat [a b] @(cat! (folding-queue a) b)) (defn finish [pfq] (u/cond :let [{:keys [pending nodes offsets make-node]} pfq] (and (seq nodes) (seq pending)) nil [[x & xs] (seq nodes)] (when-not xs x) [[[tag pnodes n] & xs] (seq (partition 3 pending))] (when (and (not xs) (neg? n) tag) (make-node tag pnodes)))) (defn stitchability "Returns :full, or a number (the number of states on A stack which remains untouched) when rebasing is possible or nil." [a b] (u/cond :let [[a-end a-watermark a-events a-start] a [b-end b-watermark b-events b-start] b] (= a-end b-start) :full :let [[a-stack a-rem] a-end [b-stack b-rem] b-start] :when (= a-rem b-rem) :let [b-tail (subvec b-stack b-watermark) n (- (count a-stack) (count b-tail))] :when (not (neg? n)) (= b-tail (subvec a-stack n)) n)) (defn rebase [b a] (u/cond :when-let [st (stitchability a b)] (= :full st) b ; if it's not full, it's partial :let [[a-end] a [b-end b-watermark b-events] b [a-stack] a-end [b-stack b-rem] b-end b-tail (subvec b-stack b-watermark) a-stub (subvec a-stack 0 st)] [[(into (vec a-stub) b-tail) b-rem] st b-events a-end])) (defn stitch [a b] (when (and a b) (let [[a-end a-watermark a-events a-start] a [b-end b-watermark b-events b-start] b] [b-end (min a-watermark b-watermark) (cat a-events b-events) a-start]))) parsley-clojure-0.9.3/src/net/cgrand/parsley/functional_trees.clj000066400000000000000000000023311247055220400251730ustar00rootroot00000000000000(ns net.cgrand.parsley.functional-trees "Views on functional trees." (:require [net.cgrand.parsley.views :as v])) ;; Functional trees are trees-as-functions (a subset of objects-as-functions) ;; A functional tree is a function of two arguments: leaf and node. ;; leaf and node are two functions: ;; * leaf takes one string and returns a value in a view domain, ;; * node takes a tag (keyword) and a collection of children functional trees ;; and returns a value in the same view domain as leaf's. ;; When a functional tree represent a leaf, leaf is called, when it's a inner ;; node, node is called. ;; ;; The purpose of these functional trees (over maps for example) is that they ;; can be memoizing and the caches are associated with the trees, not with ;; the functions. ;; ;; It's an expedient implementation, better impl welcome. ;; ;; Canonicalizing may be interesting for all keywords, single spaces etc. (defn fleaf "Constructs a functional leaf." [s] (memoize (fn [leaf node] (leaf s)))) (defn fnode "Constructs a functional node." [tag content] (memoize (fn [leaf node] (node tag content)))) (extend-type clojure.lang.Fn v/ViewableNode (compute [f leaf node] (f leaf node))) parsley-clojure-0.9.3/src/net/cgrand/parsley/grammar.clj000066400000000000000000000154071247055220400232650ustar00rootroot00000000000000; Copyright (c) Christophe Grand. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns net.cgrand.parsley.grammar (:require [net.cgrand.parsley.util :as u])) (alias 'p 'net.cgrand.parsley) ; avoid circular dependency ;; A grammar production right-hand side consists of a combination of: ;; * vectors (sequence) ;; * sets (alternatives) ;; * keywords (non-terminal or operators: :*, :+, :?) ;; * antything else is considered as a literal (a "matcher") (defprotocol RuleFragment (unsugar [fragment] "Remove sugarized forms") (collect [fragment unspaced top-rulename] "Collect \"anonymous\" productions") (develop [fragment rewrite space] "normalize as a seq of seqs of keywords and terminals" )) (defrecord Unspaced [item] RuleFragment (unsugar [this] (Unspaced. (unsugar item))) (collect [this unspaced top-rulename] (collect item true top-rulename)) (develop [this rewrite space] (rewrite item #{[]}))) (defn unspaced [& specs] (Unspaced. (vec specs))) (defrecord Root [item] RuleFragment (unsugar [this] (Root. (unsugar item))) (collect [this unspaced top-rulename] (collect item unspaced top-rulename)) (develop [this rewrite space] (for [s1 space x (rewrite item space) s2 (if (empty? x) #{[]} space)] (concat s1 x s2)))) (defrecord Repeat+ [item] RuleFragment (unsugar [this] (Repeat+. (unsugar item))) (collect [this unspaced top-rulename] (let [kw (keyword (gensym (str top-rulename "_repeat+_"))) alt #{[kw item] item}] (cons [this kw (if unspaced (Unspaced. alt) alt)] (collect item unspaced top-rulename))))) ;; 2. collect new rules (extend-protocol RuleFragment Object (unsugar [this] (if (= "" this) [] this)) (collect [this unspaced top-rulename] nil) (develop [this rewrite space] [[this]]) ;; a ref to another rule: add support for + ? or * suffixes clojure.lang.Keyword (unsugar [kw] (if-let [[_ base suffix] (re-matches #"(.*?)([+*?])" (name kw))] (unsugar [(keyword base) (keyword suffix)]) kw)) (collect [this unspaced top-rulename] nil) (develop [this rewrite space] [[this]]) ;; a set denotes an alternative clojure.lang.IPersistentSet (unsugar [this] (set (map unsugar this))) (collect [items unspaced top-rulename] (mapcat #(collect % unspaced top-rulename) items)) (develop [items rewrite space] (mapcat #(rewrite % space) items)) ;; a vector denotes a sequence, supports postfix operators :+ :? and :* clojure.lang.IPersistentVector (unsugar [this] (reduce #(condp = %2 :* (conj (pop %1) #{[] (Repeat+. (peek %1))}) :+ (conj (pop %1) (Repeat+. (peek %1))) :? (conj (pop %1) #{[](peek %1)}) (conj %1 (unsugar %2))) [] this)) (collect [items unspaced top-rulename] (mapcat #(collect % unspaced top-rulename) items)) (develop [items rewrite space] (reduce #(for [x (rewrite %2 space) sp space xs %1] (concat x (and (seq x) (seq xs) sp) xs)) [()] (rseq items)))) (defn collect-new-rules "Collect new rules for new non-terminals corresponding to repeatitions." [grammar] (let [collected-rules (mapcat (fn [[k v]] (collect v false (name k))) grammar) rewrites (into {} (for [[op k] collected-rules] [op k])) new-rules (set (vals rewrites)) grammar (into grammar (for [[_ k v] collected-rules :when (new-rules k)] [k v]))] [rewrites grammar])) ;; 3. develop-alts: (defn normalize "Normalize grammar as a map of non-terminals to set of seqs of terminals and non-terminals" ([grammar] (normalize grammar nil {})) ([grammar space rewrites] (let [helper (fn helper [item space] (if-let [rw (rewrites item)] [[rw]] (develop item helper space))) space (helper space #{[]})] (into {} (for [[k v] grammar] [k (set (helper v space))]))))) ;; 4. remove-empty-prods (defn split-empty-prods [grammar] [(into {} (for [[k prods] grammar] [k (set (remove empty? prods))])) (into {} (for [[k prods] grammar :when (some empty? prods)] [k [() [k]]]))]) (defn- inline-prods [prods replacement-map] (letfn [(inline1 [prod] (if-let [[x & xs] (seq prod)] (for [a (replacement-map x [[x]]) b (inline1 xs)] (concat a b)) [()]))] (mapcat inline1 prods))) (defn- inline-empty-prods* [grammar] (let [[grammar empty-prods] (split-empty-prods grammar)] (into {} (for [[k prods] grammar] [k (-> prods (inline-prods empty-prods) set (disj [k]))])))) (defn inline-empty-prods [grammar] (u/fix-point inline-empty-prods* grammar)) (defn- private? [kw] (.endsWith (name kw) "-")) (defn- basename [kw] (if (private? kw) (let [s (name kw) n (subs s 0 (dec (count s)))] (if-let [ns (namespace kw)] (keyword ns n) (keyword n))) kw)) (defn- unalias [kw] (keyword (namespace kw) (let [n (name kw)] (subs n (inc (.indexOf n ".")))))) ;;;;;;;;;;; (defn grammar [options-map & rules] (let [[options-map rules] (if-not (map? options-map) [{} (cons options-map rules)] [options-map rules]) rules (partition 2 rules) public-rulenames (remove private? (map first rules)) {:keys [main space root-tag] :or {main (first public-rulenames) root-tag ::p/root space #{[]}}} options-map public-rulenames (-> (zipmap public-rulenames (map unalias public-rulenames)) (assoc ::p/S root-tag)) rules (concat rules [[::p/S (Root. main)] [::space (unspaced space)]]) grammar (into {} (for [[name specs] rules] [(basename name) (unsugar specs)])) [rewrites grammar] (collect-new-rules grammar) space (::space grammar) grammar (dissoc grammar ::space) grammar (-> grammar (normalize space rewrites) (assoc ::canary #{[::p/S ::eof]}) inline-empty-prods) matches-empty (contains? (grammar ::canary) [::eof]) grammar (dissoc grammar ::canary)] [grammar public-rulenames matches-empty])) parsley-clojure-0.9.3/src/net/cgrand/parsley/lrplus.clj000066400000000000000000000275201247055220400231570ustar00rootroot00000000000000; Copyright (c) Christophe Grand. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns net.cgrand.parsley.lrplus "LR+ is LR(0) with contextual tokenizing." (:require [net.cgrand.parsley.fold :as f] [net.cgrand.parsley.util :as u] [net.cgrand.parsley.stack :as st] [net.cgrand.regex :as re])) (alias 'p 'net.cgrand.parsley) ; avoid circular dependency ; I independently figured out a technique similar to the one described ; in this paper: Context-Aware Scanning for Parsing Extensible Languages ; http://www.umsec.umn.edu/publications/Context-Aware-Scanning-Parsing-Extensible-Language ;; pushdown automaton (defrecord TableState [token-matcher reduce gotos accept eof]) (defn table-state [shifts reduce goto accept & [eof]] (assoc (TableState. nil reduce goto accept eof) :shifts shifts)) (def incomplete [-1]) (defn complete? [m] (when-not (or (= m incomplete) (neg? (nth m 0))) m)) (defprotocol MatcherFactory (matcher-fn [this id])) (extend-protocol MatcherFactory nil (matcher-fn [this id] (fn [^String s eof] nil)) Character (matcher-fn [this id] (let [cv (int (.charValue this))] (fn [^String s eof] (cond (zero? (.length s)) (when-not eof incomplete) (== (int (.charAt s 0)) cv) [1 id])))) String (matcher-fn [this id] (let [n (.length this)] (fn [^String s eof] (cond (.startsWith s this) [n id] eof nil (.startsWith this s) incomplete)))) java.util.regex.Pattern (matcher-fn [this id] (fn [s eof] (let [m (re-matcher this s) found (.lookingAt m)] (cond (and (not eof) (.hitEnd m)) incomplete found [(.end m) id])))) net.cgrand.regex.Regex (matcher-fn [this id] (matcher-fn (:re this) id)) clojure.lang.IFn ; TODO fix (matcher-fn [this id] (throw (RuntimeException. "TODO")) (fn [s eof] (this s eof)))) (defn array-union [matchers] (let [matchers (to-array matchers)] (fn [s eof] (loop [i (alength matchers) r nil] (u/cond (zero? i) r :let [i (unchecked-dec i) m (aget matchers i) mr (m s eof)] (= incomplete mr) incomplete (and r mr) (throw (Exception. (str "Ambiguous match for " (pr-str s) " by " (pr-str matchers)))) (recur i (or r mr))))))) (defn prefix-matcher [matcher s] (u/cond :when-let [r (matcher s false)] (complete? r) (constantly r) matcher)) (defn- matchers-union [matchers] (let [qtable (to-array (map (fn [cp] (let [s (str (char cp)) ms (keep #(prefix-matcher % s) matchers)] (when (seq ms) (if (next ms) (array-union (to-array ms)) (first ms))))) (range 128))) m (array-union (to-array matchers)) on-eof (m "" true)] (fn [^String s eof] (u/cond (zero? (.length s)) (if eof on-eof incomplete) :let [cp (.codePointAt s 0)] (< cp (int 128)) (when-let [m (aget qtable cp)] (m s eof)) (m s eof))))) (defn matcher [terminals terminals-ids] (u/cond :let [ms (seq (map #(matcher-fn % (terminals-ids %)) (set terminals)))] (next ms) (matchers-union ms) ms (first ms) (constantly nil))) (defn- count-unexpected [tm s eof] (loop [n 1] (let [suf (subs s n)] (if (or (tm suf eof) (empty? suf)) n (recur (inc n)))))) (defn step1 "Returns [stack water-mark buffer events] where stack is the new stack, water-mark the number of items at the bottom of the stack which didn't took part in this step, buffer the remaining string to be tokenized, events the parsing events." [^objects table ops stack rem s] (when (nil? stack) (throw (IllegalStateException. "Can't accept more input past EOF."))) (let [eof (nil? s) s (or s "") s (if (= "" rem) s (str rem s)) fq (f/folding-queue ops) stack (st/stack stack)] (loop [^String s s] (u/cond :when-let [state (st/peek! stack) cs (aget table state)] [[sym n tag] (and (zero? (.length s)) (:accept cs))] (if eof (do (f/node! fq tag n) [(assoc @stack :stack nil) "" @fq]) [@stack "" @fq]) [[sym n tag] (:reduce cs)] (let [cs (aget table (-> stack (st/popN! n) st/peek!))] (f/node! fq tag n) (st/push! stack (aget ^objects (:gotos cs) sym)) (recur s)) :let [tm (:token-matcher cs)] [r (tm s eof)] (if-let [[n state] (complete? r)] (let [token (subs s 0 n) s (subs s n)] (f/leaf! fq token) (st/push! stack state) (recur s)) [@stack s @fq]) (not (empty? s)) ; unexpected input (let [n (count-unexpected tm s eof)] (f/unexpected! fq (subs s 0 n)) (recur (subs s n))) ; unexpected eof (let [[sym n tag] (:eof cs) cs (aget table (-> stack (st/popN! n) st/peek!)) _ (f/node! fq tag n)] (st/push! stack (aget ^objects (:gotos cs) sym)) (recur s)))))) (def zero [[[0] ""] 0 nil nil]) (defn step [table ops state s] (u/when-let [[[stack rem :as start]] state [{:keys [watermark stack]} rem events] (step1 table ops stack rem s)] [[stack rem] watermark events start])) ;; LR+ table construction (defn close [init-states state] (u/fix-point (fn [state] (let [follows (map #(first (nth % 2)) state)] (into state (mapcat init-states follows)))) (set state))) (defn filter-keys [map pred] (into {} (for [kv map :when (pred (key kv))] kv))) (defn follow-map [state] (apply merge-with into {} (for [[k n prod] state] {(first prod) #{[k n (next prod)]}}))) (defn transitions [close tags state] (u/cond :let [follows (u/map-vals (follow-map state) #(close %2)) gotos (filter-keys follows keyword?) shifts (filter-keys (dissoc follows nil) (complement gotos)) reduces (follows nil) accepts (filter (fn [[s _ r]] (= 0 s)) reduces) reduces (reduce disj reduces accepts) reduction (when-let [[sym n] (first reduces)] [sym n (tags sym)]) accept (when-let [[sym n] (first accepts)] [sym -1 (tags sym)])] (next reduces) (throw (Exception. ^String (apply str "at state " state "\n reduce/reduce conflict " (interpose "\n" reduces)))) (and reduction (seq shifts)) (throw (Exception. (str "at state " state "\n shift/reduce conflict " shifts "\n" reduces))) (table-state shifts reduction gotos accept))) (defn to-states [{:keys [gotos shifts]}] (concat (vals gotos) (vals shifts))) (defn lr-table [[grammar tags matches-empty]] (let [grammar (-> grammar (dissoc ::p/S) (assoc 0 (::p/S grammar))) tags (assoc tags 0 (tags ::p/S)) init-states (u/map-vals grammar #(set (for [prod %2] [%1 (count prod) prod]))) close (partial close init-states) state0 (-> 0 init-states close) transitions (partial transitions close tags) table (loop [table {} todo #{state0}] (if-let [state (first todo)] (let [transition (transitions state) table (assoc table state transition) new-states (remove table (to-states transition)) todo (-> todo (disj state) (into new-states))] (recur table todo)) table)) table (assoc table 0 (assoc (table state0) :accept (when matches-empty [0 -1 (tags 0)]))) ; state0 is unreachable by construction table (dissoc table state0)] table)) (defn- eof-reduction [state] (reduce (fn [[mk mn :as best] [k n prod]] (let [n (- n (count prod))] (if (and best (>= mn n)) best [k n]))) nil state)) (defn- unfinished-state [public accept n] (let [state #{[::p/unfinished (inc n) nil]} state-id [::p/unfinished (boolean public) (boolean accept) n] transitions (transitions identity (if public #{::p/unfinished} #{}) state) transitions (if accept (assoc transitions :accept [::p/unfinished -1 ::p/unfinished]) transitions)] [state-id transitions])) (defn totalize [table] ;; I wanted to make table the only input of totalize ;; that's why the tags map is recomputed (let [tags (into {} (for [transition (vals table) :let [[k _ tag] (or (:reduce transition) (:accept transition))] :when tag] [k tag]))] (reduce (fn [table [state transition]] (u/cond (:reduce transition) table (= 0 state) (let [table (if-not (:accept transition) (assoc-in table [state :accept] [::p/unfinished -1 ::p/unfinished]) table) [ustate utransition :as ust] (unfinished-state (tags 0) true 0) table (assoc-in table [state :gotos ::p/unfinished] ustate)] (conj table ust)) :let [[k n] (eof-reduction state) tag (when (tags k) ::p/unfinished) [ustate utransition :as ust] (unfinished-state tag (= 0 k) n)] (conj table ust [state (-> transition (assoc :eof [::p/unfinished n tag]) (assoc-in [:gotos ::p/unfinished] ustate))]))) table table))) (defn number-states [table] (let [; number states table-without-start (dissoc table 0) mapping (zipmap (cons 0 (keys table-without-start)) (range)) renum (fn [m] (reduce #(update-in %1 [%2] mapping) m (keys m))) ; compute matchers which return the shifted state id token-matcher (memoize (fn [shifts] (matcher (keys shifts) (comp mapping shifts)))) syms (set (for [v (vals table) [x] [(:reduce v) (:eof v) (:accept v)] :when x] x)) syms-mapping (zipmap syms (range)) renum-action #(when-let [[sym n tag] %] [(syms-mapping sym) n tag]) empty-goto (vec (repeat (count syms) nil)) renum-gotosyms (fn [goto] (reduce (fn [goto [sym state]] (assoc goto (syms-mapping sym) state)) empty-goto goto))] (to-array (for [{shifts :shifts gotos :gotos :as v} (cons (get table 0) (vals table-without-start))] (-> v (dissoc :shifts) (assoc :reduce (renum-action (:reduce v)) :eof (renum-action (:eof v)) :accept (renum-action (:accept v)) :token-matcher (token-matcher shifts) :gotos (-> gotos renum renum-gotosyms to-array))))))) (comment (def g {:E #{["(" :E+ ")"] [#"\w+"]} :E+ #{[:E+ :E] [:E]}}) (let [t (lr-table [g :E identity])] (step t zero "((hello)")) ) parsley-clojure-0.9.3/src/net/cgrand/parsley/stack.clj000066400000000000000000000022361247055220400227400ustar00rootroot00000000000000; Copyright (c) Christophe Grand. All rights reserved. ; The use and distribution terms for this software are covered by the ; Eclipse Public License 1.0 (http://opensource.org/licenses/eclipse-1.0.php) ; which can be found in the file epl-v10.html at the root of this distribution. ; By using this software in any fashion, you are agreeing to be bound by ; the terms of this license. ; You must not remove this notice, or any other, from this software. (ns net.cgrand.parsley.stack) (defprotocol EphemeralStack (push! [this x]) (popN! [this n]) (peek! [this])) (deftype Stack [^java.util.ArrayList list ^{:unsynchronized-mutable true, :tag long} wm] EphemeralStack (push! [this x] (.add list x) this) (popN! [this n] (let [sz (.size list) sl (.subList list (- sz (long n)) sz)] (.clear sl) this)) (peek! [this] (let [i (unchecked-dec (.size list))] (set! wm (Math/min (long wm) (long i))) (.get list i))) clojure.lang.IDeref (deref [this] {:watermark wm :stack (vec list)})) (defn stack [coll] (Stack. (java.util.ArrayList. ^java.util.Collection coll) (count coll)))parsley-clojure-0.9.3/src/net/cgrand/parsley/tree.clj000066400000000000000000000132211247055220400225660ustar00rootroot00000000000000(ns net.cgrand.parsley.tree "An incremental buffer backed by a 2-3 tree." (:require [net.cgrand.parsley.util :as u])) (defprotocol Node "Protocol for inner nodes and leaves of a 2-3 buffer. Leaves contain collections (or strings or anything sequential and finite). The buffer maintains a reduction over all its leaves contents. The buffer is parametrized by a map of fns, see the ops method." (len [node] "Returns the length of the Node") (left-cut [node offset]) (right-cut [node offset]) (value [node] "The result value for this node.") (children [node] "For inner nodes, return their children.") (ops [node] "Returns a map of fns with keys: :unit [leaf-content] turns a leaf content into a value from the result type, :plus [a b] (associative fn) combines two values from the result type into a value of the result type, :chunk [leaf-content] breaks a leaf content into a seq of leaf contents -- it controls the computational granularity of the buffer. :left [leaf-content offset] returns the part of a leaf content to the left of the offset, :right [leaf-content offset] returns the part of a leaf content to the right of the offset, :cat [& leaf-contents] returns teh concatenation of the seq of leaf-contents, with no args MUST returns the proper identity element (eg \"\" for str, () or [] or nil for concat), :len [leaf-content] returns the length of leaf-content.")) (defrecord Ops [unit plus chunk left right cat len]) (defn as-ops [options] (if (instance? Ops options) options (into (Ops. nil nil nil nil nil nil count) options))) (deftype InnerNode [ops val length a b c] Node (left-cut [this offset] (u/cond :let [la (len a)] (<= offset la) (conj (left-cut a offset) nil) :let [offset (- offset la) lb (len b)] (<= offset lb) (conj (left-cut b offset) [a]) :let [offset (- offset lb)] :else (conj (left-cut c offset) [a b]))) (right-cut [this offset] (u/cond :let [la (len a)] (< offset la) (conj (right-cut a offset) (if c [b c] [b])) :let [offset (- offset la) lb (len b)] (or (< offset lb) (nil? c)) (conj (right-cut b offset) (when c [c])) :let [offset (- offset lb)] :else (conj (right-cut c offset) nil))) (len [this] length) (value [this] val) (children [node] (if c [a b c] [a b])) (ops [this] ops)) (defn node [ops children] (let [[a b c] children plus (:plus ops) val (plus (value a) (value b)) val (if c (plus val (value c)) val)] (InnerNode. ops val (+ (len a) (len b) (if c (len c) 0)) a b c))) (deftype Leaf [ops val s] Node (left-cut [this offset] [((:left ops) s offset)]) (right-cut [this offset] [((:right ops) s offset)]) (len [this] ((:len ops) s)) (value [this] val) (ops [this] ops)) (defn leaf [ops s] (Leaf. ops ((:unit ops) s) s)) (defn group "Groups a sequence of at least two nodes into a sequence of nodes with 2 or 3 children." [nodes] (let [ops (ops (first nodes))] (if (odd? (count nodes)) (cons (node ops (take 3 nodes)) (map #(node ops %) (partition 2 (drop 3 nodes)))) (map #(node ops %) (partition 2 nodes))))) (defn- left-borrow [lefts] (or (u/cond :when-let [[nodes & xs] (seq lefts)] [r (peek nodes)] (list* (children r) (pop nodes) xs) :let [[nodes & xs] (left-borrow xs)] [r (peek nodes)] (list* (children r) (pop nodes) xs)) (cons nil lefts))) (defn- right-borrow [rights] (or (u/cond :when-let [[nodes & xs] (seq rights)] [l (first nodes)] (list* (children l) (next nodes) xs) :let [[nodes & xs] (right-borrow xs)] [l (first nodes)] (list* (children l) (next nodes) xs)) (cons nil rights))) (defn edit "Performs an edit on the buffer. Content from offset to offset+length (excluded) is replaced by s." [tree offset length s] (let [[sl & lefts] (left-cut tree offset) [sr & rights] (right-cut tree (+ offset length)) ops (ops tree) s ((:cat ops) sl s sr) leaves (map #(leaf ops %) ((:chunk ops) s))] (loop [[l & ls :as lefts] lefts [r & rs :as rights] rights nodes leaves] #_(assert (seq nodes)) (u/cond :let [nodes (concat l nodes r)] (next nodes) (recur ls rs (group nodes)) ; it means that nodes has only one item and both l and r are empty :let [lefts (left-borrow ls)] (first lefts) (recur lefts rights nodes) :let [rights (right-borrow rs)] (first rights) (recur lefts rights nodes) ; nothing left, we have a new root! (first nodes))))) (defn buffer ([ops] (let [ops (as-ops ops)] (leaf ops ((:cat ops))))) ([ops s] (-> (buffer ops) (edit 0 0 s)))) ;;;;;;;;;;;;;;; (comment ; demo (def str-buffer (partial buffer {:unit identity :plus str :chunk #(.split ^String % "(?<=\n)") :left #(subs %1 0 %2) :right subs :cat str})) (defprotocol Treeable (tree [treeable])) (extend-protocol Treeable Leaf (tree [leaf] (.s leaf)) InnerNode (tree [nv] (map tree (if (.c nv) [(.a nv) (.b nv) (.c nv)] [(.a nv) (.b nv)])))) ;; repl session => (-> "a\nb" str-buffer (edit 1 0 "c") ((juxt tree value))) ("ac\n" "b") => (-> "a\nb" str-buffer (edit 1 0 "cd") ((juxt tree value))) ("acd\n" "b") => (-> "a\nb" str-buffer (edit 1 0 "cd") (edit 2 0 "\n\n") ((juxt tree value))) (("ac\n" "\n") ("d\n" "b")))parsley-clojure-0.9.3/src/net/cgrand/parsley/util.clj000066400000000000000000000040561247055220400226120ustar00rootroot00000000000000(ns net.cgrand.parsley.util "Some functions and a collection of variations on Clojure's core macros. Let's see which features end up being useful." {:author "Christophe Grand"} (:refer-clojure :exclude [cond when-let if-let])) (defmacro if-let "A variation on if-let where all the exprs in the bindings vector must be true. Also supports :let." ([bindings then] `(if-let ~bindings ~then nil)) ([bindings then else] (if (seq bindings) (if (= :let (bindings 0)) `(let ~(bindings 1) (if-let ~(subvec bindings 2) ~then ~else)) `(let [test# ~(bindings 1)] (if test# (let [~(bindings 0) test#] (if-let ~(subvec bindings 2) ~then ~else)) ~else))) then))) (defmacro when-let "A variation on when-let where all the exprs in the bindings vector must be true. Also supports :let." [bindings & body] `(if-let ~bindings (do ~@body))) (defmacro cond "A variation on cond which sports let bindings and implicit else: (cond (odd? a) 1 :let [a (quot a 2)] (odd? a) 2 3). Also supports :when-let and binding vectors as test expressions." [& clauses] (when-let [[test expr & more-clauses] (seq clauses)] (if (next clauses) (if-let [sym ({:let `let :when `when :when-let `when-let} test)] (list sym expr `(cond ~@more-clauses)) (if (vector? test) `(if-let ~test ~expr (cond ~@more-clauses)) `(if ~test ~expr (cond ~@more-clauses)))) test))) (comment ;if one could define cond with itself it would read: (cond :when-let [[test expr & more-clauses] (seq clauses)] (not (next clauses)) test [sym ({:let `let :when `when :when-let `when-let} test)] (list sym expr `(cond ~@more-clauses)) (vector? test) `(if-let ~test ~expr (cond ~@more-clauses)) `(if ~test ~expr (cond ~@more-clauses)))) (defn map-vals [map f] (into map (for [[k v] map] [k (f k v)]))) (defn fix-point [f init] (let [v (f init)] (if (= v init) init (recur f v)))) parsley-clojure-0.9.3/src/net/cgrand/parsley/views.clj000066400000000000000000000054661247055220400230000ustar00rootroot00000000000000(ns net.cgrand.parsley.views) ; Right now compute is responsible for memoization (defprotocol ViewableNode (compute [n leaf node] "Applies either the leaf or node function to the node at hand. The leaf function expects one argument: a string. The node function expects two arguments: a keyword and a collection of children nodes.")) ; TODO: make Node and String Viewable + memoization (defn view "Creates a simple recursive view function. The leaf function is passed the string stored into a leaf and the node function gets the tag and the sequence of values returned by the view recursively called on the children nodes." [leaf node] (letfn [(node* [tag fs] (node tag (map v fs))) (v [n] (compute n leaf node*))] v)) (defn nonrec-view "Creates a view function. Unlike view, the second arg passed to node is a collection of children functional trees." [leaf node] (fn [n] (compute n leaf node))) (def content "View that returns the nodes (or nil) of a tree. Unlike :content works on all viewable trees" (nonrec-view (constantly nil) (fn [_ nodes] nodes))) (def tag "View that returns the tag (or nil) of a functional tree. Unlike :tag works on all viewable trees." (nonrec-view (constantly nil) (fn [tag _] tag))) (def text (view identity #(apply str %2))) (def length (view count #(reduce + %2))) (def offsets "View which returns a sorted-map of relative end offsets to children nodes." (nonrec-view (constantly nil) (fn [_ nodes] (into (sorted-map) (next (reductions (fn [[offset _] node] [(+ offset (length node)) node]) [0 nil] nodes)))))) ;; zipper on viewable trees #_(defn fzip [root] (z/zipper content content (fn [node children] (fnode (tag node) (vec children))))) ;; we need indexed/measured zippers! #_(defn offset-at "Returns the offset at the *start* of the node pointed by the loc. " [loc] (if-let [ploc (z/up loc)] (let [roffsets (offsets (z/node ploc)) n (count (z/lefts loc)) roffset (if (zero? n) 0 (key (first (drop (dec n) roffsets))))] (+ roffset (offset-at ploc))) 0)) (defn path-to [node offset] (loop [node node offset offset path []] (if-let [offsets (offsets node)] (let [[o n] (first (subseq offsets >= offset)) so (- o (length n)) ro (- offset so)] (recur n ro (conj path [node offset]))) (conj path [node offset])))) #_(defn loc-at [loc offset] (if (and (z/branch? loc) (seq (z/children loc))) (let [cloc (z/down loc) lefts (subseq (offsets loc) < offset)]) ) ) parsley-clojure-0.9.3/test/000077500000000000000000000000001247055220400156235ustar00rootroot00000000000000parsley-clojure-0.9.3/test/net/000077500000000000000000000000001247055220400164115ustar00rootroot00000000000000parsley-clojure-0.9.3/test/net/cgrand/000077500000000000000000000000001247055220400176475ustar00rootroot00000000000000parsley-clojure-0.9.3/test/net/cgrand/parsley/000077500000000000000000000000001247055220400213265ustar00rootroot00000000000000parsley-clojure-0.9.3/test/net/cgrand/parsley/test.clj000066400000000000000000000063511247055220400230040ustar00rootroot00000000000000(ns net.cgrand.parsley.test (:require [net.cgrand.parsley :as p]) (:require [net.cgrand.parsley.lrplus :as core]) (:require [net.cgrand.parsley.util :as u]) (:require [net.cgrand.parsley.views :as v]) (:require [net.cgrand.parsley.functional-trees :as f]) (:use clojure.test)) (defn- unexpected? [x] (when (and (vector? x) (= ::p/unexpected (first x))) (second x))) (defn v [node] (if (map? node) (reduce (fn [v x] (u/if-let [b (unexpected? x) a (unexpected? (peek v))] (conj (pop v) [::p/unexpected (str a b)]) (conj v x))) [(:tag node)] (map v (:content node))) node)) (deftest empty-grammar (let [eg (p/parser {:main []})] (are [s t] (= (v (eg s)) t) "" [::p/root] "abcdef" [::p/root [::p/unexpected "abcdef"]] " " [::p/root [::p/unexpected " "]] " a " [::p/root [::p/unexpected " a "]]))) (deftest empty-whitespaced-grammar (let [eg (p/parser {:main [] :space :ws? :root-tag :root} :ws #" +")] (are [s t] (= (v (eg s)) t) "" [:root] "abcdef" [:root [::p/unexpected "abcdef"]] " " [:root [:ws " "]] " a " [:root [:ws " "] [::p/unexpected "a "]]))) (def sexpr (p/parser {:main :expr* :space :ws? :root-tag :root} :ws #"\s+" :expr- #{:vector :list :map :set :symbol} :symbol #"[a-zA-Z-]+" :vector ["[" :expr* "]"] :list ["(" :expr* ")"] :map ["{" :expr* "}"] :set ["#{" :expr* "}"])) (deftest sexpr-once (are [s t] (= (v (sexpr s)) t) "" [:root] "hello world" [:root [:symbol "hello"] [:ws " "] [:symbol "world"]] " hello " [:root [:ws " "] [:symbol "hello"] [:ws " "]] "(hello #{world kitty})" [:root [:list "(" [:symbol "hello"] [:ws " "] [:set "#{" [:symbol "world"] [:ws " "] [:symbol "kitty"] "}"] ")"]] "(hello #{world kitty])" [::p/unfinished [::p/unfinished "(" [:symbol "hello"] [:ws " "] [::p/unfinished "#{" [:symbol "world"] [:ws " "] [:symbol "kitty"] [::p/unexpected "])"]]]] "hello 123 world" [:root [:symbol "hello"] [:ws " "] [::p/unexpected "123 "] [:symbol "world"]])) ;; views (def fexpr (p/parser {:main :expr* :space :ws? :root-tag :root :make-node f/fnode :make-leaf f/fleaf} :ws #"\s+" :expr- #{:vector :list :map :set :symbol} :symbol #"[a-zA-Z-]+" :vector ["[" :expr* "]"] :list ["(" :expr* ")"] :map ["{" :expr* "}"] :set ["#{" :expr* "}"])) (def input "(hello #{world kitty})") (def ftree (fexpr input)) (deftest views (are [v r] (= (v ftree) r) v/length (count input) v/text input (v/view (constantly 0) (fn [_ xs] (reduce + 1 xs))) 8)) (deftest path-to (is (= (-> (v/path-to ftree 10) peek first v/text) "world")))