MPC is a monadic parser combinators library. It provides a toolbox of predefined parsers, facilities for error handling and parses arrays, streams and lists.
This manual summarizes the exported functions of the packages mpc, mpc.characters and mpc.numerals. Refer to the MPC API for detailed descriptions of all exported symbols. To learn more about MPC's internals and monadic parser combinators in general, read Drew Crampsie's parser combinators tutorial.
Suppose you want to parse an email address of the form <user>@<host>. Let's start by defining our package. Next we restrict the allowed characters in the user and host fields. Finally we use =address-character to implement a simple address parser. We can now apply our grammar using run.
run is the main entry point to MPC and has to be used to run a parser against an input-source. The input-source can be an array, a stream or a list. Because MPC supports non-deterministic parsers, which can return multiple results, run accepts a keyword parameter result, a function used to select the desired return value. By default run returns only the first result's value.
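The email address walkthrough above could be written out roughly as follows. This is a sketch, not the manual's original listing: it assumes the MPC system is already loaded and that the packages mpc and mpc.characters export the combinators used here. The package name simple-address and the set of permitted punctuation characters are illustrative choices.

```lisp
;; Assumption: the MPC system is already loaded, for example with
;; (ql:quickload "mpc") or (asdf:load-system "mpc"), and the packages
;; MPC and MPC.CHARACTERS export the names used below.

(defpackage simple-address              ; hypothetical package name
  (:use :cl :mpc :mpc.characters))

(in-package :simple-address)

(defun =address-character ()
  "Accept one character allowed in the user and host fields."
  ;; The punctuation set is illustrative, not a complete email grammar.
  (=or (=satisfies #'alphanumericp)
       (=one-of '(#\- #\_ #\. #\+))))

(defun =email-address ()
  "Parse <user>@<host> and return both fields as a list."
  (=let* ((user (=string-of (=address-character)))
          (_ (=character #\@))          ; must match, value is ignored
          (host (=string-of (=address-character))))
    (=result (list user host))))

;; By default RUN returns the value of the first result.
(run (=email-address) "foo_bar@example.com")
```

Returning the two fields as a list is only one option; the body of =email-address could just as well build a structure or a cons.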
The core of MPC is made up of primitive parsers and combinators. A parser is a function that accepts an input source and, if it is successful, returns a list of pairs, each containing a result and the remaining input; otherwise it returns nil to signal its failure to parent parsers. A combinator is a function which returns a parser. For consistency, primitive parsers are defined as combinators that always return the primitive parser.
=item is used to pop off an item from the input. It fails if the input is empty.
=result always succeeds with a given value without consuming input. It is used to return arbitrary values from a parser. To check for end of input there is =end-of-input, which succeeds only when the input is empty.
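To make the "list of pairs" model concrete, here is a toy sketch of the three primitives over plain lists. It does not reproduce MPC's implementation: the toy-* names are hypothetical, the input is assumed to be a list, and the value returned at end of input (t) is an arbitrary choice.

```lisp
;; Toy model only. A parser is a function from an input list to a list
;; of (result . remaining-input) pairs; NIL signals failure.

(defun toy-item ()
  "Return a parser that pops one item off the input; fails on empty input."
  (lambda (input)
    (when input
      (list (cons (first input) (rest input))))))

(defun toy-result (value)
  "Return a parser that succeeds with VALUE and consumes nothing."
  (lambda (input)
    (list (cons value input))))

(defun toy-end-of-input ()
  "Return a parser that succeeds only when the input is empty."
  (lambda (input)
    (when (null input)
      (list (cons t input)))))

;; Each combinator returns a parser, which is then called on some input.
(funcall (toy-item) '(a b c))
(funcall (toy-result 42) '(a b c))
(funcall (toy-end-of-input) '())
```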
The primitive combinator =bind permits applying parsers in sequence and offers a way to access their intermediate results. =plus lets us combine parsers in a non-deterministic way, while =or and =and are deterministic alternatives. =if allows for conditional application of parsers.
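The difference between sequencing, non-deterministic choice and deterministic choice can be sketched with toy combinators over lists. The toy-* names are hypothetical and the semantics shown (toy-or returns the results of the first parser that succeeds) are one reading of "deterministic alternative", not a statement about how MPC implements it.

```lisp
;; Toy model only: parsers map an input list to a list of
;; (result . remaining-input) pairs, NIL meaning failure.

(defun toy-item ()
  (lambda (input)
    (when input
      (list (cons (first input) (rest input))))))

(defun toy-result (value)
  (lambda (input)
    (list (cons value input))))

(defun toy-bind (parser make-parser)
  "Run PARSER; feed each result value to MAKE-PARSER and run the parser
it returns on the corresponding remaining input."
  (lambda (input)
    (loop for (value . remaining) in (funcall parser input)
          append (funcall (funcall make-parser value) remaining))))

(defun toy-plus (&rest parsers)
  "Non-deterministic choice: collect the results of every parser."
  (lambda (input)
    (loop for parser in parsers
          append (funcall parser input))))

(defun toy-or (&rest parsers)
  "Deterministic choice: results of the first parser that succeeds."
  (lambda (input)
    (loop for parser in parsers
          thereis (funcall parser input))))

;; Sequence two items and return them as a list.
(defun toy-two-items ()
  (toy-bind (toy-item)
            (lambda (x)
              (toy-bind (toy-item)
                        (lambda (y)
                          (toy-result (list x y)))))))

(funcall (toy-two-items) '(1 2 3))
```

Because toy-bind threads every result of the first parser into the second, a parser built with toy-plus naturally yields several results downstream, while toy-or commits to one alternative.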
The =let* macro offers a lispy syntax for =bind. It binds the results of a sequence of parsers to variables and, unless any parser fails, runs the body parsers. It also understands the special symbol _ (underscore) to signify ignorable bindings, where ignorable means that no symbol is bound to the value of the given parser (=let* nevertheless requires the parser to succeed). The syntax of =let* is as follows:
(=let* ((symbol parser
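To illustrate how a =let*-style macro can be layered on top of a bind operation, here is a toy sketch. It is not MPC's macro: the toy-* names are hypothetical, and as a simplification toy-let* accepts exactly one body parser.

```lisp
;; Toy model only: parsers map an input list to a list of
;; (result . remaining-input) pairs, NIL meaning failure.

(defun toy-item ()
  (lambda (input)
    (when input
      (list (cons (first input) (rest input))))))

(defun toy-result (value)
  (lambda (input)
    (list (cons value input))))

(defun toy-bind (parser make-parser)
  (lambda (input)
    (loop for (value . remaining) in (funcall parser input)
          append (funcall (funcall make-parser value) remaining))))

(defmacro toy-let* (bindings body-parser)
  "Expand a list of (symbol parser) BINDINGS into nested TOY-BIND calls.
A symbol named _ still requires its parser to succeed, but its value is
bound to a fresh, unused variable."
  (if (null bindings)
      body-parser
      (destructuring-bind ((symbol parser) &rest more) bindings
        (let ((var (if (string= (symbol-name symbol) "_")
                       (gensym "IGNORED")
                       symbol)))
          `(toy-bind ,parser
                     (lambda (,var)
                       (declare (ignorable ,var))
                       (toy-let* ,more ,body-parser)))))))

;; Parse three items, ignore the middle one, return a cons of the others.
(defun toy-pair ()
  (toy-let* ((key (toy-item))
             (_ (toy-item))
             (value (toy-item)))
    (toy-result (cons key value))))

(funcall (toy-pair) '(a := 1))
```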
=fail always fails. It optionally accepts expressions to be evaluated at failure. Those expressions are permitted to call get-input-position, which can be used to determine where a failure occurred.
Two other error handling facilities, =handler-case and =restart-case, do what their names suggest. Instead of forms to be evaluated, every case clause accepts parsers to be run.
=when and =unless behave like =if with an implicit progn and no else clause, just like when and unless in Common Lisp.
=not takes one parser and, if it would fail, consumes and returns the next item from input.
=prog1 and =prog2 behave like =and but return the result of the first or second parser respectively.
=maybe applies a parser and succeeds even if the parser fails.
=list also behaves like =and but collects all results in a list.
=satisfies behaves like (=item) but only succeeds when the result returned by (=item) satisfies a given predicate.
=eql, =one-of, =none-of and =range are like =satisfies but require the result returned by (=item) to be eql to a given value, eql to one or none of the values in a given list, or inside a range defined by a predicate and a lower and upper bound, respectively.
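The relationship between these item-level parsers can be sketched by deriving all of them from a single predicate-based parser. This is a toy model over lists with hypothetical toy-* names; in particular the argument order of toy-range is an assumption made for this sketch and may differ from MPC's.

```lisp
;; Toy model only: parsers map an input list to a list of
;; (result . remaining-input) pairs, NIL meaning failure.

(defun toy-satisfies (predicate)
  "Pop one item, succeeding only if PREDICATE accepts it."
  (lambda (input)
    (when (and input (funcall predicate (first input)))
      (list (cons (first input) (rest input))))))

(defun toy-eql (expected)
  "Succeed if the next item is EQL to EXPECTED."
  (toy-satisfies (lambda (item) (eql expected item))))

(defun toy-one-of (candidates)
  "Succeed if the next item is EQL to one of CANDIDATES."
  (toy-satisfies (lambda (item) (member item candidates))))

(defun toy-none-of (candidates)
  "Succeed if the next item is EQL to none of CANDIDATES."
  (toy-satisfies (lambda (item) (not (member item candidates)))))

(defun toy-range (predicate lower upper)
  "Succeed if (PREDICATE LOWER item UPPER) holds, e.g. with #'<= or #'char<=."
  (toy-satisfies (lambda (item) (funcall predicate lower item upper))))

;; A lowercase ASCII letter, expressed as a range over characters.
(funcall (toy-range #'char<= #\a #\z) (coerce "mpc" 'list))
```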
=exactly and the related repetition combinators do as their names suggest and apply parsers multiple times in various variants.
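As a rough illustration of what repeated application means, here is a toy sketch of three repetition combinators over lists. The toy-* names, their argument order and their greedy, first-result-only behaviour are assumptions made for this sketch and may not match MPC.

```lisp
;; Toy model only: parsers map an input list to a list of
;; (result . remaining-input) pairs, NIL meaning failure.

(defun toy-satisfies (predicate)
  (lambda (input)
    (when (and input (funcall predicate (first input)))
      (list (cons (first input) (rest input))))))

(defun toy-zero-or-more (parser)
  "Apply PARSER greedily and collect the values; never fails."
  (lambda (input)
    (let ((collected '())
          (remaining input))
      (loop for hit = (first (funcall parser remaining))
            ;; Stop on failure, or when PARSER succeeded without
            ;; consuming input (which would otherwise loop forever).
            while (and hit (not (eq (cdr hit) remaining)))
            do (push (car hit) collected)
               (setf remaining (cdr hit)))
      (list (cons (nreverse collected) remaining)))))

(defun toy-one-or-more (parser)
  "Like TOY-ZERO-OR-MORE but fails if nothing was collected."
  (lambda (input)
    (let ((results (funcall (toy-zero-or-more parser) input)))
      (when (car (first results))
        results))))

(defun toy-exactly (n parser)
  "Apply PARSER exactly N times; fail if any application fails."
  (lambda (input)
    (let ((collected '())
          (remaining input))
      (dotimes (i n (list (cons (nreverse collected) remaining)))
        (let ((hit (first (funcall parser remaining))))
          (unless hit (return nil))
          (push (car hit) collected)
          (setf remaining (cdr hit)))))))

;; Collect the leading even numbers of a list.
(funcall (toy-zero-or-more (toy-satisfies #'evenp)) '(2 4 5))
```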
=funcall applies a parser and returns the result of a given function called on the parser's result.
=character behaves like =eql but uses char= instead of eql. =string is similar: it parses a given string or fails.
=string-of parses a string of characters parsed by a given parser.
=whitespace and =skip-whitespace parse or skip characters commonly considered whitespace, respectively.
=newline parses the newline character and =line parses a string of characters terminated by newline or end of input.
=digit parses a number from a digit character. =natural-number and =integer-number both parse numbers from numeral strings, while the latter also understands a leading dash for negativity. All three parsers accept an optional radix argument.
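How a radix-aware numeral parser might work can be sketched with the standard function digit-char-p, which returns a digit's weight in a given radix. This is a toy model over lists of characters with hypothetical toy-* names, not MPC's implementation.

```lisp
;; Toy model only: parsers map a list of characters to a list of
;; (result . remaining-input) pairs, NIL meaning failure.

(defun toy-natural-number (&optional (radix 10))
  "Consume leading digit characters in RADIX and return their value."
  (lambda (input)
    (let ((value nil)
          (remaining input))
      (loop for weight = (and remaining
                              (characterp (first remaining))
                              (digit-char-p (first remaining) radix))
            while weight
            do (setf value (+ (* (or value 0) radix) weight)
                     remaining (rest remaining)))
      ;; Fail unless at least one digit was consumed.
      (when value
        (list (cons value remaining))))))

(defun toy-integer-number (&optional (radix 10))
  "Like TOY-NATURAL-NUMBER but accepts a leading dash for negative numbers."
  (lambda (input)
    (if (and input (eql (first input) #\-))
        (loop for (value . remaining)
                in (funcall (toy-natural-number radix) (rest input))
              collect (cons (- value) remaining))
        (funcall (toy-natural-number radix) input))))

;; Parse a hexadecimal numeral by passing a radix of 16.
(funcall (toy-natural-number 16) (coerce "ff" 'list))
```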