DSD

Posted in Code, Projects
2 Comments

The Real Theorem Generator: a Context Free Grammar

I should prob­a­bly doc­u­ment the real ori­gin of the The­o­rem of the Day and Phi­los­o­phy of the Day. Cof­fee and Henry David Thoreau are per­haps less involved than orig­i­nally indicated.

nothoreauThe the­o­rem gen­er­a­tor was writ­ten by a good friend of mine, Matt Gline, as a project for CS51: Abstrac­tion and Design in Com­puter Pro­gram­ming, which we took together as freshmen.

The assign­ment was to use LISP to imple­ment a con­text free gram­mar — basi­cally a set of rules for computer-generated mad libs. The sub­ject was what­ever we wanted. Good ones from past years include computer-generated mys­tery novel­las, course-guide reports, and per­for­mance art direc­tions. Every year there’s a con­test, and Matt’s the­o­rem gen­er­a­tor was hys­ter­i­cal enough to win him lunch at the fac­ulty club.

Context-Free Gram­mars

Each rule in a context-free gram­mar con­sists of a term, and a set of choices for how to make that term. The choices can con­tain text, or other terms, or even refer recur­sively to the term being defined. For instance, a gram­mar like

nounphrase ::= <noun> | <adj> <adj> <noun> | super <nounphrase>
noun ::= apple | pear | mailman
adj ::= smelly | chartreuse | enormous

can pro­duce phrases like “apple,” “enor­mous smelly mail­man,” or “super super smelly char­treuse mail­man.” The orig­i­nal gram­mars we wrote for CS51 were hundred-line clumps of LISP code that spit out un-punctuated para­graphs in all-caps. Over the years, my room­mate Mike and I spiffed-up the the­o­rem gen­er­a­tor a bit, result­ing in the cur­rent shiny ajax/latex ver­sion.

The cur­rent the­o­rem gram­mar is defined in a gram­mar file (thm.gram) that looks like this

theorem ::= \begin{theorem}[<thname>]<thmstatement>.
          \end{theorem} \begin{proof} <proof>
          \end{proof}

proof ::= <statement>. <conclusion>. | See <citation>.
    | <statement>. <conclusion>. | <statement>. <conclusion>.

conclusion ::= 
    The result follows by d\'evissage | The theorem follows trivially
    | The conclusion is self-evident | Clearly, the theorem holds
    | We leave the rest as an exercise | This is the desired result
    | The rest follows from <citation> | QED
    | A simple application of <famoustheorem> completes the proof

famoustheorem ::=
    <mathguy>'s theorem
    | Hilbert's problem <nzdigit><zdigit>
    | the Riemann Hypothesis
    | Perelman's theorem (formerly the Poincar\'e Conjecture)
    | horizontal Iwasawa theory | Kummer theory | the different
    | Dynkin diagrams | Gegenbauer polynomials
    | trichotomy
...

A perl script (compile-grammar):

...
print "parsing grammar '$gram' into '$out.ml'...";
foreach (@entries) {
    s/\s* \# .* $//xmg;
    if (/\s*(\S*)\s*::=\s*(.*)/s) {
        $lhs = $1;
        $rhs = $2;

        if (!defined ($mainphrase)) {
            $mainphrase = $lhs;
            print OUT "let rec $lhs = Opts [|\n";
        } else {
            print OUT "and $lhs = Opts [|\n";
        }

        @opts = split(/\s*\|\s*/, $rhs);
        foreach $opt (@opts) {
            print OUT "  [|";
            $opt =~ s/\s*\n\s*/ /g;

            # split just before < and after >
...

processes the gram­mar file and com­piles it into OCaml code (thm.ml).

type phrase = Str of string | Opts of phrase array array

let _ = Random.self_init ()

let randelt a = a.(Random.int (Array.length a))
let rec print phr = match phr with
  Str  s       -> print_string s
| Opts options ->
    let parts = randelt options in
    Array.iter print parts

(* Grammar definitions *)
let rec top = Opts [|
  [| theorem;|];
|]

and theorem = Opts [|
  [| Str "\\begin{theorem}["; thname; Str "]"; thmstatement; Str ". \\end{theorem} \\begin{proof} ";
 proof; Str " \\end{proof}";|];
|]
...

This is almost cer­tainly more elab­o­rate than nec­es­sary, but OCaml is a lovely lan­guage for recur­sive struc­tures, and the code is nice and sim­ple. Run­ning thm.ml spits out a ready-made LaTeX-ed the­o­rem, which gets run through LaTeX to png, and cached, ready for viewing.

The Phi­los­o­phy of the Day

My entry into the CS51 con­test was a phi­los­o­phy gen­er­a­tor, which spits out semi-plausible def­i­n­i­tions of philoso­phies from numer­ous cul­tural and intel­lec­tual tra­di­tions. It was never as clever as the the­o­rem gen­er­a­tor (which still cracks me up every time it pro­duces any­thing attrib­uted to Lip­schitz), and the philoso­phies tend to involve con­cepts that are only of inter­est to nerdy Har­vard fresh­men. It’s still amus­ing, though. Here’s the gram­mar def­i­n­i­tion.

Have Fun!

The sys­tem I set up to cre­ate and run context-free gram­mars is pretty easy to use. Gram­mar files are sim­ple to write, and from there all you need is compile-grammar and an OCaml instal­la­tion. Here, once again, are the rel­e­vant files

  1. Some exam­ple gram­mar files: thm.gram, philo.gram
  2. Parse gram­mar files into ml code: compile-grammar

Anyone’s absolutely wel­come to send exten­sions to thm.gram and philo.gram, or any new gram­mars you might write. I’ll ajax­ify them and post them here, if you like. Have fun!


2 Responses to “The Real Theorem Generator: a Context Free Grammar” Comment Feed, Trackback.

  1. Jiang says: Dec 21, 2009 @ 6:31 am

    Thanks for this post. I love your con­cise yet read­able for­mat for spec­i­fy­ing CFG’s, and have writ­ten a Python CFG class around it that works prop­erly with philo.gram.

    My pri­mary pur­pose was to gen­er­ate words and names, so I needed to be able to spec­ify the “empty” pseudo-character. My solu­tion was to do a gen­eral search-and-replace for pseudo-nonterminals in the form of . Well, thanks again for this enlight­en­ing post. Awe­some the­o­rem gen­er­a­tor ;p

  2. Trackback: Travels in a Mathematical World Jul 02, 2010 @ 5:13 am

    Car­ni­val of Math­e­mat­ics #67…

    Mean­while, the The­o­rem of the Day gen­er­a­tor is cook­ing up real­is­tic look­ing ‘the­o­rems’ and ‘proofs’ using a con­text free grammar.…

Leave a Reply