
5 Customized command-line processing

Part of the intent of gema is that it can be used as a means of implementing more specialized tools. A utility program is defined by the command line arguments that it uses as well as by how it processes its input files. Therefore, gema provides a way to customize the handling of command line arguments.

The main program of gema just does some initialization, and then processes the command line arguments by translating them with a set of built-in patterns. The rules that handle the command line arguments are defined in a domain named ``ARGV''. The user is free to add rules to this domain, thereby implementing new command line options, or even to undefine existing rules. In the input stream that is translated by the ARGV domain, the command line arguments are separated by newline characters.[Footnote 1] The actions for the ARGV rules are expected to do all their work through side effects and not to return any value. Any value that is returned by the translation (except for the delimiting newlines) will be reported by the main program as undefined arguments.
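To make the shape of that input stream concrete, here is a small model written in Python rather than in gema itself. It is only a sketch: the function names are hypothetical, and it assumes that each argument is followed by exactly one newline, which is what the trailing \n in the built-in rules suggests but is not confirmed here. It also shows the limitation mentioned in Footnote 1, namely that an argument containing a newline cannot be told apart from two arguments.

```python
def argv_stream(args):
    """Join arguments into one newline-delimited string (assumed layout)."""
    return "".join(arg + "\n" for arg in args)


def split_stream(stream):
    """Recover arguments by splitting on the newline delimiter."""
    # The final element is the empty string after the last newline.
    return stream.split("\n")[:-1]


# An ordinary command line: each argument ends up on its own line.
args = ["-idchars", "_$", "-p", "a=b", "input.txt"]
print(repr(argv_stream(args)))

# An argument with an embedded newline reads back as two arguments.
tricky = ["-p", "x=line1\nline2"]
print(split_stream(argv_stream(tricky)))
```

With NUL as the separator instead, the second case would round-trip correctly, since a C argument string cannot contain a NUL character.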

The complete set of built-in ARGV rules can be seen by looking at the source file ``gema.c'' in the variable argv_rules. Here are a few representative examples:

  ARGV:\N-idchars\n*\n=@set-parm{idchars;$1}
  ARGV:\N-literal\n*\n=@set-syntax{L;$1}
  ARGV:\N-p\n*\n=@define{*}
  ARGV:\N\L*\=*\n=@define{$0}
  ARGV:\N-odir\n*\n=@set{.ODIR;*}
  ARGV:\N-<L1>\n=@set-switch{$1;1}
  ARGV:\N-*\n=@err{Unrecognized option\:\ "-*"\n}@exit-status{3}

For an example of extending the command line options, suppose you wanted to emulate a C pre-processor by accepting ``-D'' options to define macros. That could be done by defining rules such as:

  ARGV:\N-D<I>\=*\n=@define{\\I$1\\I\=@quote{$2}}
  ARGV:\N-D<I>\n=@define{\\I$1\\I\=1}

Instead of adding to the built-in rules, it is also possible to suppress the built-in rules and define your own rules from scratch. To do this, start the program with a command line like:

  gema -prim pattern-file ...

The -prim (``primitive mode'') option suppresses loading of the built-in rules and reads patterns from the specified file. Then the remainder of the command line is processed according to whatever ARGV rules were defined in that file. Note that even the default behavior of reading from standard input and writing to standard output is implemented by the ARGV rules. (The -prim option is the only one that is hard-coded instead of being implemented by patterns.)


6 Exit codes

When the program terminates, it will return one of the following status codes to the operating system (unless overridden by the use of function @exit-status):

   0   nothing wrong
   1   (reserved for user via @exit-status{1})
   2   failed match signaled by @fail or @abort
   3   undefined command line argument
   4   syntax error in pattern definitions
   5   use of undefined name during translation (domain, variable, switch, parameter, syntax type, or locale)
   6   invalid numeric operand
   7   can't execute shell command for @shell function
   8   I/O error on input file
   9   I/O error on output file
  10   out of memory
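
A script that invokes gema can use these codes to report what went wrong. The following Python sketch is one way to do that. The names GEMA_EXIT_CODES, describe_exit, and run_gema are hypothetical, and it assumes that an executable named ``gema'' is on the search path. The table simply restates the list above, so a value set by @exit-status may not mean what the table says.

```python
import subprocess

# Status codes as listed in this section; @exit-status can override them.
GEMA_EXIT_CODES = {
    0: "nothing wrong",
    1: "reserved for user via @exit-status{1}",
    2: "failed match signaled by @fail or @abort",
    3: "undefined command line argument",
    4: "syntax error in pattern definitions",
    5: "use of undefined name during translation",
    6: "invalid numeric operand",
    7: "can't execute shell command for @shell function",
    8: "I/O error on input file",
    9: "I/O error on output file",
    10: "out of memory",
}


def describe_exit(code):
    """Return a description for an exit status, or a fallback if unlisted."""
    return GEMA_EXIT_CODES.get(code, "unknown status (possibly set by @exit-status)")


def run_gema(args, executable="gema"):
    """Run gema with the given argument list; return (status, description)."""
    result = subprocess.run([executable, *args])
    return result.returncode, describe_exit(result.returncode)


print(describe_exit(3))
```

A caller would pass the same arguments it would type on the command line, as a list, and compare the returned status against 0 before using the output.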


7 Status and Future development

This program is now at the stage where, in an ideal world, it would be regarded as a completed prototype, and it would be time to start designing the real program to replace it. However, as usually happens in the real world, we ship the prototype because there isn't time to do any more. There is room for improvement in at least the areas of consistency, ease of use, and performance. Also, this documentation was written rather hurriedly and is not nearly as polished as I would like.

Since this was developed by one person as a spare time hobby, it has not had very extensive testing, so there are likely to be bugs. The -w and -t options are the most recently added functionality, and hence the most likely to have inadequacies.

I don't know at this time whether I will be spending any more effort on further development, but I am interested in hearing about any bugs found or other suggestions.

Following, in no particular order, are some assorted ideas for enhancements which remain for the future:

  • Should warn about a domain that is defined but not referenced, since it is easy to mistakenly neglect to quote a colon.
  • It might be useful to have a way to switch (or push and pop) the output file - e.g. to write each chapter of a document to a separate file even though the input might be a single file.
  • A function to construct a unique pathname for a temporary file.
  • A default notation for quoting a long section of literal text, in addition to using the backslash for quoting individual characters.
  • A function to return the pathname of the current directory.
  • Record the file and line that each rule came from, to be used in run-time error messages.
  • Improved trace mode as an aid for debugging pattern files.
  • A template operator for specifying an action to be taken after all input files have been processed.


8 Acknowledgments

This program was conceived as an extension of the concepts embodied in W. M. Waite's ``STAGE2'' processor [Footnote 2], as implemented by Roger Hall.[Footnote 3]

This program has some similarities to awk, but they are generally due more to similarity of purpose than to any deliberate copying. I did copy the $0 notation and adopt the term ``action''.

This program was designed and coded by myself, David N. Gray, except for the regular expression processor, which utilizes public domain code written by Ozan S. Yigit and updated by Craig Durland and Harlan Sexton. David A. Mundie [Footnote 4] supplied modifications to enable use on the Macintosh, and offered some helpful comments.


Table of Contents

1 Introduction
2 Operational Overview
3 Notation
  3.1 Special characters
  3.2 Escape Sequences
  3.3 Recognizer arguments
4 Built-in Functions
  4.1 Numbers
  4.2 String functions
    4.2.1 Output formatting -- padding, filling, and wrapping
    4.2.2 String Comparison
    4.2.3 Case conversion
    4.2.4 Miscellaneous string functions
  4.3 Variables
  4.4 Files
    4.4.1 Pathname manipulation
    4.4.2 Using alternate input and output files
    4.4.3 File context queries
  4.5 Control flow functions
  4.6 Other operating system interfaces
  4.7 Definitions
  4.8 Setting Options
  4.9 Informational functions
5 Customized command-line processing
6 Exit codes
7 Status and Future development
8 Acknowledgments
 
 
 

Footnotes

Footnote 1:
The newline was chosen for convenience, but it would more exactly emulate the C argument semantics if the NUL character were used as the separator, and perhaps that ought to be done in the future.

Footnote 2:
Communications of the ACM, July, 1970, page 415ff

Footnote 3:
Under the name ``TILT'' (``Texas Instruments Language Translator''), it was used on various TI computers during the 70s and 80s.

Footnote 4:
mundie@telerama.lm.com