Camlp4 is a preprocessor for OCaml. As a preprocessor, it lets you
write syntax extensions for your OCaml programs. But Camlp4 also provides
some other features:
- A system of grammars
- A kind of macros called quotations
- A revised syntax for OCaml
- A pretty printing system of OCaml programs
- and others: extensible functions, functional streams, ...
Camlp4 is syntax, syntax, syntax. It uses its own syntax systems to do
its own syntax extensions: it is highly bootstrapped. Camlp4 stops at
the syntax level: it knows nothing about semantics, typing, or
code generation (for it, a type definition is just a syntactic thing
which starts with ``type'').
The ``p4'' in the name ``camlp4'' stands for the four ``p''s of
``Pre-Processor-Pretty-Printer''.
1.1 Extending the syntax of OCaml
To start at the beginning, we could try to learn how to make simple
syntax extensions to OCaml. If you know the C language, you have probably
used the #define construct, which is very easy to use:
#define FOO xyzzy
and all occurrences of FOO in the rest of the program are replaced by
xyzzy.
In Camlp4, it is not so simple. A syntax extension is not just text
replacement: it extends an entry of the grammar of the language, and
you need to create syntax trees.
It is therefore necessary 1/ to know the grammar system provided by
Camlp4 and 2/ to know how to create syntax trees. That is what we are
going to do in this tutorial. Once these two points have been covered,
we will have the tools needed to write syntax extensions of the language.
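As a rough illustration of what "creating syntax trees" means, here is a minimal sketch in plain OCaml. It does not use Camlp4 at all: the expr type, the repeat_until function and the printer are hypothetical names invented for this sketch, and Camlp4's real syntax tree is richer. The point is only that an extension such as a Pascal-style repeat..until loop has to build a tree from the trees of its sub-expressions, rather than paste text as #define does.

```ocaml
(* A toy syntax tree. Hypothetical: this is NOT Camlp4's real tree type. *)
type expr =
  | Var of string          (* an identifier *)
  | Not of expr            (* boolean negation *)
  | Seq of expr * expr     (* e1; e2 *)
  | While of expr * expr   (* while cond do body done *)

(* What a "repeat body until cond" extension would have to produce:
   run the body once, then loop while the condition is false. *)
let repeat_until body cond =
  Seq (body, While (Not cond, body))

(* A small printer, to display the tree as OCaml-like text. *)
let rec to_string = function
  | Var x -> x
  | Not e -> "not (" ^ to_string e ^ ")"
  | Seq (a, b) -> to_string a ^ "; " ^ to_string b
  | While (c, b) -> "while " ^ to_string c ^ " do " ^ to_string b ^ " done"

let () =
  (* prints: step; while not (finished) do step done *)
  print_endline (to_string (repeat_until (Var "step") (Var "finished")))
```

In a real extension, the grammar system decides when repeat_until is called, and the trees are those that Camlp4 passes on to the compiler.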
If you are impatient, want to create your syntax extension in
the next quarter of an hour, and don't want to learn all that
stuff, you may consider taking the text of an existing syntax
extension and changing it for your own needs. A syntax extension is not
necessarily a long program (for example, adding the repeat..until
construction of Pascal takes 6 lines), and you can guess ``how it
works'' and ask the wizards...
Examples are given in chapter 7.
However, if you are reading this manual, you may be interested in learning
the original system of grammars that Camlp4 provides. It can be used
for purposes other than extending the OCaml language: for your own
grammars. This system of grammars is an alternative to yacc: the
approach is different, but you can describe your language in a
similar way.
But first, the practical matters: what do I type to experiment?
1.2 Using Camlp4 as a command and in the toplevel
You must first know that camlp4 is a command. This chapter does not
explain all the details of the command and its options: they are covered
later (chapter 8; you can also read the man page by typing "man camlp4"
in your shell).
For the moment, here is a magic incantation to compile a file named
foo.ml:
ocamlc -pp "camlp4o pa_extend.cmo" -I +camlp4 -c foo.ml
This command compiles foo.ml as a normal OCaml file, except that the
parsing is done by camlp4. The first examples in this documentation
(grammars in Camlp4) can be compiled using this command. The other
examples are given together with the command needed to compile them.
Another way, and the recommended one, is to use the OCaml toplevel. In
the toplevel, type:
#load "camlp4o.cma";;
#load "pa_extend.cmo";;
You can type the examples of this documentation in the toplevel. You
can also put them in files and use the #use directive to include them.
All the examples in this documentation are written in the normal
syntax of OCaml. If you know and prefer the revised syntax provided by
Camlp4, change camlp4o into camlp4r in the ocamlc command, or load
"camlp4r.cma" instead of "camlp4o.cma" in the toplevel.
1.3 Linking applications using Camlp4 libraries
Many examples in this tutorial use specific Camlp4 libraries. In
the toplevel, you don't need to load them because they are included in
the file camlp4o.cma.
To link a standalone application, you need to add the library
gramlib.cma from the Camlp4 library directory. The command is:
ocamlc -I +camlp4 gramlib.cma <the_files_you_link>
1.4 Differences in parsing behavior
Even if you use the normal syntax, there are some small differences
in parsing behavior between the normal ocamlc parser (bottom-up,
LALR parsing) and the camlp4 parser (top-down, recursive descent
parsing). These differences appear notably when the input is
erroneous. As a trivial example, suppose that you wanted to type:
(* correct intended input *)
type t = Buf of Buffer.t
| Str of string
Suppose that, instead of typing the above, you forgot the second
occurrence of the keyword of, and typed:
(* file wrongsyntax.ml : wrong input - missing keyword *)
type t = Buf of Buffer.t
| Str (*missing "of"*) string
The ocamlc compiler [1] (invoked as ocamlc -c wrongsyntax.ml) finds a
syntax error at the word string: it parses the whole file as a single
type declaration and finds a syntax error inside it.
The camlp4 parser (with ordinary syntax), invoked as
ocamlc -c -pp camlp4o wrongsyntax.ml, does not find any (shallow)
syntactic error, but parses the above input as two items:
type t = Buf of Buffer.t
| Str
which is a correct type declaration (different from the one intended
by the author), followed by a simple expression
string
which is understood as let _ = string, and produces the following
message: Unbound value string
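The way a top-down parser can end up splitting the input like this can be sketched with a toy recursive descent parser. This is only an illustration under simplifying assumptions: the token type, the grammar and the parse_* functions are hypothetical and far smaller than Camlp4's actual grammar, and Buffer.t is simplified to a plain lowercase name. Each function consumes what it recognizes and returns the remaining tokens. When the optional of is absent, the constructor is accepted without an argument and the leftover word is handed to whatever parses the next item.

```ocaml
(* Hypothetical tokens for this sketch; not Camlp4's lexer. *)
type token = TYPE | EQ | BAR | OF | UID of string | LID of string

(* A constructor with an optional argument type. *)
type constr = { name : string; arg : string option }

(* constr ::= UID [ OF LID ] *)
let parse_constr = function
  | UID name :: OF :: LID ty :: rest -> ({ name; arg = Some ty }, rest)
  | UID name :: rest -> ({ name; arg = None }, rest)
  | _ -> failwith "constructor expected"

(* constrs ::= constr { BAR constr } *)
let rec parse_constrs toks =
  let c, rest = parse_constr toks in
  match rest with
  | BAR :: rest' ->
      let cs, rest'' = parse_constrs rest' in
      (c :: cs, rest'')
  | _ -> ([ c ], rest)  (* nothing more matches: the declaration ends here *)

(* type_decl ::= TYPE LID EQ constrs *)
let parse_type_decl = function
  | TYPE :: LID name :: EQ :: rest ->
      let cs, rest' = parse_constrs rest in
      ((name, cs), rest')
  | _ -> failwith "type declaration expected"

let () =
  (* type t = Buf of buffer | Str string     (the "of" is missing) *)
  let toks =
    [ TYPE; LID "t"; EQ; UID "Buf"; OF; LID "buffer";
      BAR; UID "Str"; LID "string" ] in
  let (name, cs), rest = parse_type_decl toks in
  (* prints: type t: 2 constructors, 1 token(s) left for the next item *)
  Printf.printf "type %s: %d constructors, %d token(s) left for the next item\n"
    name (List.length cs) (List.length rest)
```

The leftover token LID "string" plays the role of the stray expression string in the example above: the type declaration itself is accepted, and the error only shows up later, outside the parser.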
[1] The interactive toplevel ocaml has the same behavior, unless you load a different parser.