OmniLexer Features

This page describes how we, as programmers, might use OmniLexer to solve a sample programming task. The generated scanner, the example program’s source code, and the resulting executable are available for download at the end of the tour.

You may also download a printable version of this tour (pdf file, 400 KB).

Part 1: Creating a Mini-Calculator Application

Suppose that we are tasked with creating a simple console-based calculator in ANSI C. The C-language “mini-calculator” application will read a mathematical expression typed by the user, evaluate the expression, and display the answer. The mini-calculator should be versatile enough to let the user enter the same expression in several different ways. For example, the user may enter “2 x 3,” which the program will treat as equivalent to “2 times 3,” “2 * 3,” or “2*3.”

Our mini-calculator will recognize five basic mathematical operations: addition, subtraction, multiplication, division and exponentiation. Our application will also be able to take absolute values and perform modular arithmetic. Of course, we will need to allow the user to use parentheses to specify the order of operations.

This is a simple application, and yet we are already burdened with developing a scanner that accurately recognizes 10 unique lexeme types (signed numbers, seven types of operations, and the beginnings and endings of parenthetical expressions). Readers may ponder for a moment how they would approach the task of coding this scanner by hand. It is a complex task, especially when we consider that an operation may be represented by more than one character or character string, and that a number may or may not begin with a unary (plus or minus) sign and may or may not include a decimal point.
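To get a feel for the hand-coding effort, here is a minimal sketch of just one piece of such a scanner: recognizing a signed number at the start of a string. The function name match_signed_number is our own invention for illustration and is not part of OmniLexer. The sketch accepts both '+' and '-' as unary signs, following the description above.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not OmniLexer output.
   Returns the length of the signed number at the start of s,
   or 0 if s does not begin with a number.
   Accepted forms: [sign] digits  or  [sign] digits '.' digits */
size_t match_signed_number(const char *s)
{
    size_t i = 0;

    /* Optional unary sign. */
    if (s[i] == '+' || s[i] == '-')
        i++;

    /* At least one digit is required. */
    if (!isdigit((unsigned char)s[i]))
        return 0;
    while (isdigit((unsigned char)s[i]))
        i++;

    /* A decimal point counts only if at least one digit follows it. */
    if (s[i] == '.' && isdigit((unsigned char)s[i + 1])) {
        i++;
        while (isdigit((unsigned char)s[i]))
            i++;
    }
    return i;
}
```

Even this small routine has to handle the optional sign, the required digits and the optional fractional part separately, and it covers only one of our lexeme types.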

Luckily, we have OmniLexer. All we need to do is create a simple scanner specification, and our scanner will be automatically generated.

Part 2: Writing the Scanner Specification

We must first decide how we want to represent each token. This table lists some reasonable representations that we may want to use for each of our tokens:

Token                  Representation(s)
Signed integer         One or zero unary signs followed by a digit string.
Signed decimal         One or zero unary signs followed by a digit string, followed by a single decimal point and another digit string.
Addition               “+”, “and”, “plus”
Subtraction            “-”, “minus”
Multiplication         “x”, “*”, “times”
Division               “/”, “\”, “|”, “over”
Exponentiation         “^”, “E”, “exp”
Modulus                “%”, “mod”, “modulo”
Begin sub-expression   “(”
End sub-expression     “)”
Absolute value         “abs”
Quit program           “QUIT”
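For comparison, the fixed spellings in this table could be handled by hand with a simple lookup table in C. This is only a sketch of the idea: the TokenType names and the classify_spelling function are hypothetical and do not come from the OmniLexer-generated code. The “over” spelling for division is taken from the scanner specification shown later in this tour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes for the fixed-spelling tokens. */
typedef enum {
    TOK_ADD, TOK_SUB, TOK_MUL, TOK_DIV, TOK_EXP, TOK_MOD,
    TOK_LPAREN, TOK_RPAREN, TOK_ABS, TOK_QUIT, TOK_UNKNOWN
} TokenType;

typedef struct {
    const char *spelling;
    TokenType   type;
} Spelling;

/* One entry per accepted spelling. */
static const Spelling SPELLINGS[] = {
    {"+", TOK_ADD}, {"and", TOK_ADD}, {"plus", TOK_ADD},
    {"-", TOK_SUB}, {"minus", TOK_SUB},
    {"x", TOK_MUL}, {"*", TOK_MUL}, {"times", TOK_MUL},
    {"/", TOK_DIV}, {"\\", TOK_DIV}, {"|", TOK_DIV}, {"over", TOK_DIV},
    {"^", TOK_EXP}, {"E", TOK_EXP}, {"exp", TOK_EXP},
    {"%", TOK_MOD}, {"mod", TOK_MOD}, {"modulo", TOK_MOD},
    {"(", TOK_LPAREN}, {")", TOK_RPAREN},
    {"abs", TOK_ABS}, {"QUIT", TOK_QUIT}
};

/* Returns the token code for an exact (case-sensitive) spelling,
   or TOK_UNKNOWN if the text is not in the table. */
TokenType classify_spelling(const char *text)
{
    size_t i;

    for (i = 0; i < sizeof SPELLINGS / sizeof SPELLINGS[0]; i++) {
        if (strcmp(text, SPELLINGS[i].spelling) == 0)
            return SPELLINGS[i].type;
    }
    return TOK_UNKNOWN;
}
```

A table like this classifies text that has already been split into lexemes. Finding where each lexeme begins and ends in input such as “2*3” is the harder part, and it is the part the generated scanner takes care of.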

Most tokens will be easy to define: we simply need to specify the character strings that represent each token. The signed integer and decimal tokens, however, have infinitely many representations. To accurately define these tokens, we will first define character classes and then define the tokens themselves as sequences of character classes.

OmniLexer also allows us to set aside reserved words. To demonstrate this, we will declare “abs” and “QUIT” as reserved words. In OmniLexer, we write the following code:

   scanner MINI_CALCULATOR_SCANNER is

      character_classes is
         begin

            unary_signs = { '-' } ;
            decimal_point = '.' ;
            digits = { '0'..'9' } ;

         end ;

      token_definitions is
         begin

            signed_whole = unary_signs*<1> digits+ ;
            signed_decimal = unary_signs*<1> digits+ decimal_point digits+ ;

            multiplication_token = 'x' | '*' | "times" ;
            division_token = '/' | '\' | '|' | "over" ;

            addition_token = '+' | "plus" | "and" ;
            subtraction_token = '-' | "minus" ;
            modulo_token = '%' | "mod" | "modulo" ;
            exponentiation_token = '^' | 'E' | "exp" ;

            begin_expression_token = '(' ;
            end_expression_token = ')' ;

            ABS = reserved( "ABS", instance of identifier ) ;
            QUIT = reserved( "QUIT", instance of identifier ) ;

         end ;

   end MINI_CALCULATOR_SCANNER ;

We can easily set OmniLexer’s preferences to make these tokens case-insensitive, so that “13 times 11” will be read the same as “13 Times 11.”
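To illustrate what case-insensitive matching means in practice, here is a small sketch of a case-insensitive string comparison in ANSI C. It is our own illustration (the name equals_ignore_case is hypothetical), not a description of how OmniLexer implements the preference internally.

```c
#include <ctype.h>

/* Hypothetical helper, not OmniLexer output.
   Returns 1 if a and b are equal when letter case is ignored,
   0 otherwise. */
int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same point. */
    return *a == *b;
}
```

With a comparison like this, “Times”, “TIMES” and “times” would all match the same multiplication spelling.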

Part 3: Adjusting Build Settings

Before we generate our mini-calculator scanner, we should adjust the build settings so that the scanner fits our specific needs. We do this in OmniLexer's build settings dialog.

The most significant aspect of the scanner that we define from the build settings dialog is the target language. OmniLexer is capable of generating scanners in ANSI C, C++, Ada and PL/SQL.

OmniLexer provides the option of generating a scanner for text files or for smaller character strings, a choice that depends on the intended purpose of the scanner. If we want to build a command line interpreter or scan input typed into an edit control, we would want OmniLexer to generate a more compact character-string scanner. If we need a scanner that takes input from a file, OmniLexer can generate the appropriate code instead. A file scanner might be more appropriate for compilers or search programs. Since our mini-calculator will read expressions typed at the command line, this example will use a character-string scanner.

Part 4: Using OmniLexer’s Generated Scanners

Since we are creating a scanner in ANSI C, OmniLexer generates a C-language source code file and a header file, MiniCalculatorScanner.c and MiniCalculatorScanner.h. We will include the header file in our mini-calculator source code. This header file gives us access to the eight subroutines that make up the mini-calculator scanner and allow us to define its behavior within our individual application. These routines allow us to connect to and disconnect from the input stream (in this case, the user input read from the command line), read tokens and lookahead tokens and determine their type, and alter our position within the input stream.

All of the source code for the mini-calculator application is available for download below. The scanner functions are used in two places in this application: once for the evaluation of expressions in the file MiniCalculator.c, and once for the error-checking routines (defined in the header file ErrorChecking.h).

The finished product is a capable calculator. Users may enter the same expression in a variety of ways, and they are shown the syntax errors found when an expression cannot be evaluated.

The OmniLexer-generated scanner is incorporated into the application using the recursive descent parsing method. Keep in mind that the application references two header files in addition to the scanner files.
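For readers unfamiliar with recursive descent, the sketch below shows the general shape of the technique in ANSI C. It is not the code from MiniCalculator.c: all names are hypothetical, and in place of the generated scanner it peeks at single characters and uses the standard strtod function to read numbers (which accepts more forms than our number tokens, such as exponents). Only the four basic operations and parentheses are handled.

```c
#include <ctype.h>
#include <stdlib.h>

/* Parser state: a cursor into the expression text and an error flag. */
typedef struct {
    const char *pos;
    int error;
} Parser;

static double parse_expr(Parser *p);

static void skip_spaces(Parser *p)
{
    while (isspace((unsigned char)*p->pos))
        p->pos++;
}

/* factor := number | '(' expr ')' */
static double parse_factor(Parser *p)
{
    double value;
    char *end;

    skip_spaces(p);
    if (*p->pos == '(') {
        p->pos++;
        value = parse_expr(p);
        skip_spaces(p);
        if (*p->pos == ')')
            p->pos++;
        else
            p->error = 1;          /* missing closing parenthesis */
        return value;
    }
    value = strtod(p->pos, &end);
    if (end == p->pos)
        p->error = 1;              /* no number where one was expected */
    p->pos = end;
    return value;
}

/* term := factor (('*' | '/') factor)* */
static double parse_term(Parser *p)
{
    double value = parse_factor(p);

    for (;;) {
        skip_spaces(p);
        if (*p->pos == '*') { p->pos++; value *= parse_factor(p); }
        else if (*p->pos == '/') { p->pos++; value /= parse_factor(p); }
        else return value;
    }
}

/* expr := term (('+' | '-') term)* */
static double parse_expr(Parser *p)
{
    double value = parse_term(p);

    for (;;) {
        skip_spaces(p);
        if (*p->pos == '+') { p->pos++; value += parse_term(p); }
        else if (*p->pos == '-') { p->pos++; value -= parse_term(p); }
        else return value;
    }
}

/* Evaluates text. Sets *ok to 1 on success, or to 0 on a syntax
   error or if unparsed input remains. */
double evaluate(const char *text, int *ok)
{
    Parser p;
    double value;

    p.pos = text;
    p.error = 0;
    value = parse_expr(&p);
    skip_spaces(&p);
    *ok = !p.error && *p.pos == '\0';
    return value;
}
```

Each grammar rule becomes one function, and operator precedence falls out of which function calls which. In a scanner-based version, the character tests would be replaced by checks on the type of the next (lookahead) token.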

Download all source files together:

MiniCalculator.zip (73 KB)

Download files separately:

MiniCalculator.c

MiniCalculatorScanner.c

MiniCalculatorScanner.h

ErrorChecking.h

StringNumberConversions.h

MiniCalculator.exe (96 KB)

The Original Scanner Specification (text file)




© 2005 Perfect Logic Corporation.

webmaster@perfectlogic.com