Munger(1) DragonFly General Commands Manual Munger(1)
NAME
munger - Text Processing Lisp
SYNOPSIS
munger [<script> [<args> ...]]
DESCRIPTION
Munger is a simple, statically-scoped, naive lisp interpreter,
specialized for writing processors for 8-bit text. Munger makes it easy
to write editors, shells, filters, low-demand clients and servers, and
utility scripts.
This document begins with an overview of the language, followed by a
complete reference for every intrinsic function and macro, and library
function and macro included in the Munger package. A list of these
precedes the reference section, immediately after the following overview.
A variety of example programs are included in the source distribution.
They are installed in (libdir), and are described below.
Included Example Programs
cat.munger is a version of the cat utility, the simplest possible
filter.
grep.munger is an egrep-like filter.
options.munger is a module which simplifies the processing of command-
line arguments.
fmt.munger is a version of the fmt utility.
cal.munger prints a calendar of the current month to stdout.
Demonstrates date arithmetic functions.
filter.munger is a simple filter which expands documents with
embedded munger code in them.
transform.munger performs a set of regular expression based
substitutions destructively over a set of files.
view.munger is a file viewer resembling vi when invoked as view.
It demonstrates character-I/O and cursor addressing.
mush.munger is a job control shell. It demonstrates how to control
process groups.
xml2alist.munger is a minimal XML parser which will convert a standalone
XML 1.0 document into an alist which may be queried for
structure and content with the xmlquery.munger module.
The program is a filter, reading XML from stdin and
printing to stdout a lisp representation of the XML.
xml2buffer.munger
This is another version of the above XML parser which
instead of printing the lisp representation of the
converted XML document to stdout, inserts it into the
current buffer, from where it may be read by the lisp
reader with the "eval_buffer" intrinsic. This module
can also read the XML document from a string.
xml2sqlite.munger
is another version of the XML parser, which serializes
the XML document into an SQLite database file.
xmlsqlite.munger is a module providing helper functions to access a
database containing an XML document serialized by
xml2sqlite.munger.
rss.munger prints to stdout an RSS feed which has been converted
into a SQLite database file by xml2sqlite.munger. It
demonstrates accessing a database created by
xml2sqlite.munger.
xmlquery.munger is a module which may be used to extract data from a
document processed with the XML parsers included in the
Munger distribution, which store their results in
lists.
echo.munger is a TCP echo server.
httpd.munger is an HTTP/1.1 server that is not fully conforming.
IMPLEMENTATION NOTES
* The Munger implementation is designed to be easy to understand and
extend. It is a naive implementation, and will perform best when
interpreting programs written in an imperative programming style.
Any general-purpose lisp/Scheme implementation out there will out-
perform Munger.
* Munger's strings are 8-bit clean, but some of the intrinsic functions
in the interpreter cannot handle embedded NULs. This should not be a
problem, as it is unlikely you will want to apply these functions to
binary strings. A full list of 8-bit safe intrinsics is specified in
the section below entitled STRINGS.
* Munger's string-manipulation functions are "functional" in that they
do not modify their arguments, but return new strings.
* There is only one namespace.
* There is a "dynamic_let" to impose dynamic scope on a single global
at a time.
* The function position of applications is fully-evaluated and must
evaluate to an intrinsic function or a closure, as in Scheme. The
order of evaluation of the terms of an application is fixed, from
left to right, and may be relied upon in computations.
* Munger does not recognize tail-calls; however, you may explicitly
request tail-calls with the "tailcall" intrinsic. This mechanism
allows even anonymous functions to be tail-recursive, and can be used
to turn non-tail positions into tail positions. Using function
calls, whether they be tail-calls or not, to perform iteration in
Munger will always be orders of magnitude slower than using one of
the looping intrinsics.
* As in Perl, all values are considered to be boolean "true" values
except for the empty list, the empty string, and zero. There is no T
nor NIL object, nor #t or #f.
* All functions return a value. There are no undefined or unspecified
return values used in this interpreter.
* Improper lists cannot be formed in Munger. The final "cdr" of all
lists is the empty list.
* The empty list is a constant which evaluates to itself, and it is a
list, not an atom. Every empty list is identical to every other
empty list, which is to say they are "eq" to each other.
* "eq" in Munger behaves similarly to "eql" in Common Lisp. A true
"eq", which only returns a true value when comparing an object to
itself, has very limited use, and therefore I have not bothered to
include one. There is an "equal" as well, which does what you think
it does. It is defined in (join "/" (libdir) "library.lsp").
* "eval" does not contain the lisp reader. Therefore applying "eval"
to a string will not parse lisp code in the string, but simply cause
the original string to be returned (strings are constants). There is
a separate "eval_string" intrinsic.
* There are no destructive list operations. Munger forces the
programmer to use simple constructive list manipulations. Munger's
"list" and "append" intrinsics are "functional" in that they make
copies of their arguments before forming them into the final returned
lists.
* Munger's looping constructs are modelled on those of C. The most
efficient way to iterate is to use the "for", "iterate", or "loop"
intrinsics.
* The set of intrinsics is purposefully-limited to a minimal set useful
for text processing. There are many intrinsics with names similar to
those of other lisp dialects, which do not perform similar tasks. Be
forewarned.
* Symbol syntax is similar to symbol syntax in C, including case-
sensitivity. "let*" therefore, is "letn" in Munger. If you are used
to hyphenating symbol names, you will have to get used to using the
underscore in place of the hyphen to be happy here.
* Simple "eval-twice" macros and gensyms are supported. A macro
definition differs from a function definition in that the initial
"lambda" symbol is replaced with the "macro" symbol.
* Besides lists, three other aggregate types are provided by the
language: one-dimensional dynamically-resizable arrays called
"stacks", associative arrays called "tables", and statically-sized
one-dimensional arrays called "records".
* There are no interactive debugging facilities in the interpreter, nor
any editing capabilities at the interpreter prompt, beyond those
provided by the terminal driver (CTRL-H, CTRL-W, CTRL-U). To debug
scripts the `(print "I made it here!")(exit 1)' technique is used by
the author.
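The truth-value rule given in the notes above (every value is true except
the empty list, the empty string, and zero) can be modelled outside the
interpreter. The following is a Python sketch, not Munger code. The name
munger_truthy is hypothetical, and lisp lists are assumed to map onto
Python lists.

```python
def munger_truthy(value):
    """Model of the stated rule: only (), "" and 0 count as false."""
    if isinstance(value, list) and len(value) == 0:
        return False          # the empty list
    if isinstance(value, str) and value == "":
        return False          # the empty string
    if isinstance(value, int) and not isinstance(value, bool) and value == 0:
        return False          # the fixnum zero
    return True               # every other value is true

for sample in ([], "", 0, [0], "a", -1):
    print(repr(sample), munger_truthy(sample))
```

Note that a list containing zero is true under this rule; only the three
values named in the notes are false.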
STARTUP
The interpreter attempts to read three files at startup, in order. The
first, the system library (library.munger), is read from the Munger data
directory (the "libdir" intrinsic will return the fully-qualified path of
the data directory). Then, if the user has a custom lisp library
(.munger) in his or her home directory, it is read, and lastly, any file
specified by the first command-line argument is read.
The succeeding command-line arguments, if present, are considered to be
arguments to the script referenced by the first command-line argument,
and are not accessed by the interpreter. All the command line arguments
may be accessed from lisp programs with the "current", "rewind", "next",
and "prev" intrinsics, described in the reference section at the end of
this document.
COMMENTS
If the parser encounters a semicolon (;) or an octothorpe (#), outside of
a string token, the rest of the line from that character to the next
newline or carriage return is considered to be a comment and discarded.
Recognizing the octothorpe as well as the traditional semicolon comment-
character, allows one to put a "shebang" line at the top of one's
scripts, in order to have the system feed the script to the interpreter,
if the script itself is invoked as a command from the shell.
#!/usr/local/bin/munger
SYMBOLS
The interpreter is case-sensitive when it comes to recognizing all
tokens. Symbol names must consist of only sequences of alphanumeric
characters and the underscore (_), and must not start with numerical
characters.
NUMBERS
The interpreter supports only one numerical type, the fixnum. Fixnums
are fixed-size integers which can be manipulated efficiently by the
interpreter. The binary size of the fixnum is the word size of the
machine the interpreter is running on. Overflow in arithmetical
operations is not detected. The "maxidx" intrinsic will
return the value of the largest fixnum the interpreter can represent.
The lowest supported fixnum will be one more than the value returned by
"maxidx", negated. This can be demonstrated by adding 1 to the value
returned by "maxidx" to cause the two's-complement representation to
"wrap-around" to the negative side. There is an "unsigned" intrinsic
which can be used to display the result of unsigned arithmetic operations
which wrap-around to the negative side.
> (setq max (maxidx))
1073741823
> (+ max 1)
-1073741824
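The wrap-around shown above can be reproduced with ordinary two's-
complement arithmetic. The following Python sketch is a model only, not
the interpreter's code. The width of 31 bits is an assumption inferred
from the "maxidx" value in the transcript above, and wrap_fixnum is a
hypothetical name.

```python
FIXNUM_BITS = 31  # assumption: inferred from the maxidx value 1073741823

def wrap_fixnum(value, bits=FIXNUM_BITS):
    """Reduce an integer to a signed two's-complement value of `bits` bits."""
    modulus = 1 << bits
    value &= modulus - 1           # keep only the low `bits` bits
    if value >= modulus >> 1:      # sign bit set: the value is negative
        value -= modulus
    return value

max_fixnum = (1 << (FIXNUM_BITS - 1)) - 1
print(max_fixnum)                    # 1073741823
print(wrap_fixnum(max_fixnum + 1))   # -1073741824
```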
When an integer value is read by the lisp reader it is represented
internally as a fixnum. If the value is too large or too small to be
represented in that form, the value is silently truncated to fit.
STRINGS
Arbitrary strings of character data can be placed between " marks (ASCII
34). Such tokens are constants and evaluate to themselves. " marks may
be embedded into strings if they are escaped with a backslash:
> "asd\""
"asd""
A backslash occurring at the end of a string will be interpreted as
escaping the closing " character. To make it possible for the lisp
reader to read a string ending with a backslash, it is necessary,
therefore, to have a means of escaping backslashes. Backslashes are
therefore escaped with themselves:
> "asd\\"
"asd\"
Backslashes are interpreted as escapes only when they occur before another
backslash or a " character. Backslashes not followed by a " character or
another backslash are inserted into the string. The following two
strings are therefore identical:
> "\a"
"\a"
> "\\a"
"\a"
It is only necessary to employ these escapes in strings to be parsed by
the lisp reader, that is to say, in the literal strings in your source
code. These two characters may be inserted into strings
programmatically, without escaping them:
> (char 34)
"""
> (char 92)
"\"
The interpreter has a small set of string manipulation intrinsics, some
you would expect in any language, such as the "substring" and "strcmp"
functions, and others like the "split", "join", "chomp" and "chop"
intrinsics, which are inspired by similar functions in Perl. Regular
expressions can be used to find matches on substrings or to transform
strings. There is no character data type. Single-character strings are
used instead. All string operations are documented in the LANGUAGE
REFERENCE section of this document.
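The reader's escape rule described earlier in this section (a backslash
is an escape only before another backslash or a " character) can be
modelled in a few lines. This is a Python sketch of the rule as stated,
not the lisp reader itself. The name read_string_body is hypothetical,
and its argument is the text found between the enclosing " marks.

```python
def read_string_body(source):
    """Decode the characters found between the quotes of a string literal."""
    out = []
    i = 0
    while i < len(source):
        ch = source[i]
        following = source[i + 1] if i + 1 < len(source) else ""
        if ch == "\\" and following in ("\\", '"') and following != "":
            out.append(following)   # escaped backslash or quote
            i += 2
        else:
            out.append(ch)          # any other backslash is kept as-is
            i += 1
    return "".join(out)

# The two literals \a and \\a decode to the same two-character string.
print(read_string_body(r"\a") == read_string_body(r"\\a"))
```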
Munger's strings are 8-bit clean, but not all of the intrinsics which use
strings will function correctly when confronted with strings which have
embedded NULs in them. The following intrinsics will, however:
getline, getchar, getchars, print, cgi_read, cgi_print, code, substring,
stringify, concat, join, chop, chomp, insert, retrieve, slice, strcmp,
child_read, child_write, getline_ub
Due to a change in the regular expression library used by Munger, as of
Munger 4.172, the following regular expression related intrinsics no
longer work with strings with embedded NULs in them:
regcomp, match, matches, substitute, replace
The "split" intrinsic, notably, will work correctly when its first
argument is the empty string and its second argument contains NULs. The
programmer should avoid presenting binary strings to any other intrinsics
than those guaranteed to work with them.
STACKS, TABLES, AND RECORDS
Munger supports an associative array type called a table. Associative
arrays are collections of pairs of lisp atoms and arbitrary lisp objects.
The atom is called the "key" and is used to retrieve the other object,
called the "value" from the table. Tables are implemented internally as
hash tables, hence the name. For more information on using tables, see
the entries for the intrinsics listed under the heading entitled, Tables,
in the LANGUAGE REFERENCE occurring later in this document.
Munger also supports a dynamically-resizable unidimensional array type
called a stack. Stacks may be treated as push-down stacks or indexed as
arrays. Internally, stacks are implemented as unidimensional arrays, but
they are called stacks to emphasize their one-dimensional nature. Any
lisp object may be stored in a stack. The capacity of stacks may be
increased dynamically, simply by "pushing" items onto them, but stacks
never decrease in size. Multi-dimensional arrays may be simulated using
stacks of stacks. For more information on using stacks, see the entries
for the intrinsics listed under the heading entitled, Stacks, in the
LANGUAGE REFERENCE occurring later in this document.
In addition to stacks and tables, an aggregate type called a record is
provided. Records are fixed-size unidimensional arrays. They are a more
time- and space-efficient means of representing fixed-size structures than
lists, tables, or stacks.
Tables, stacks, and records are opaque constant atoms which evaluate to
themselves.
FILES
File I/O may be performed in one of two ways. The content of files may
be read into text buffers, manipulated, then written out again, if random
access to the files' content is desired, or the standard descriptors may
be redirected onto files with the "redirect" intrinsic, if line-oriented,
serial (filter-like) access is sufficient. Three convenience macros
"with_input_file", "with_output_file", and "with_error_file" simplify the
process:
> (with_input_file "README"
>> (for (a 1 4) (print a (char 9) (getline))))
1 Munger
2 ======
3
4 Munger is a simple, statically-scoped, interpreted lisp that has
All redirections are undone upon return to toplevel. Redirections made
with "redirect" may be explicitly undone in programs with the "resume"
intrinsic. Redirections made with the "with_input_file",
"with_output_file", and "with_error_file" macros are undone automatically
when those macros return. Redirections made with these macros are
dynamically-scoped, which is to say their visibility is unlimited, but
their extent is limited to the duration of the macro. Redirections
exhibit stack-like behavior, allowing nested redirections to "shadow"
enclosing redirections:
> (with_input_file "README"
>> (print (getline))
Munger
>> (with_input_file "lisp.h"
>>> (print (getline)))
/*
>> (print (getline)))
======
The "temporary" intrinsic will redirect stdout onto a temporary file
opened for writing, via mkstemp(2). The function returns the name of the
file. A convenience macro is provided, "with_temporary_output_file":
> (setq filename (with_temporary_output_file (print "foobar")))
"/tmp/munger2nvIa8D98M"
> (with_input_file filename
>> (print (getline)))
"foobar"
> (unlink filename)
1
Two macros simplify the writing of filters: "foreach_line" and
"foreach_line_callback". Both accept a monadic function to be applied
successively to each line of input; the second also accepts an additional
function to be called when the input source changes. Both macros
determine whether to read data from the standard input or from files
specified on the command line based on the existence of command line
arguments after the first two, which will always be the name of the
interpreter and the name of the currently-executing script, respectively.
Here is what "cat" looks like in Munger:
(next)
(foreach_line print)
(exit 0)
The example programs further demonstrate the use of these macros. They
are fully documented later in this document.
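The way these macros choose their input source can be modelled as
follows. This is a Python sketch, not the library's implementation. The
name foreach_line is reused only for readability, "paths" stands for the
command-line arguments after the interpreter and script names, and
passing each line to the callback with its trailing newline is an
assumption.

```python
import io
import sys

def foreach_line(callback, paths, stdin=sys.stdin):
    """Apply callback to every line of input."""
    if not paths:
        # No file arguments: behave as a filter on standard input.
        for line in stdin:
            callback(line)
        return
    for path in paths:
        # latin-1 maps every byte to one character, which suits 8-bit text.
        with open(path, encoding="latin-1", newline="") as handle:
            for line in handle:
                callback(line)

# Demonstration with an in-memory stand-in for standard input.
collected = []
foreach_line(collected.append, [], stdin=io.StringIO("one\ntwo\n"))
print(collected)  # ['one\n', 'two\n']
```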
COMMUNICATING WITH CHILD PROCESSES
The interpreter's standard descriptors can be redirected onto processes
with the "pipe" intrinsic. Successive invocations of "pipe" inherit any
already-made redirections, allowing pipelines to be created. The
"with_input_process" and "with_output_process" macros simplify the
process:
> (with_input_process "jot 100"
>> (with_input_process "fmt"
>>> (while (setq l (getline))
>>>> (print l))))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
92 93 94 95 96 97 98 99 100
Upon return to toplevel, all redirections are undone. Redirections are
explicitly undone with the "resume" intrinsic. The first argument to
these macros is passed to the shell (always /bin/sh), and so may be any
expression that program understands. Because of this it is actually more
efficient to let the shell create the pipeline for us:
> (with_input_process "jot 100 | fmt" ...
The curl(1) utility may be used to redirect standard input onto a remote
file via ftp or http:
> (with_input_process "curl -s 'ftp://www.freebsd.org/pub/FreeBSD/README.TXT'"
>> (for (n 1 10) (print n (char 9) (getline))))
1 Welcome to the FreeBSD archive!
2 -------------------------------
3
4 Here you will find the official releases of FreeBSD, along with
5 the ports and packages collection and other FreeBSD-related
6 material. For those who have World Wide Web access, we encourage
7 you to visit the FreeBSD home page at:
8
9 http://www.FreeBSD.org/
10
The munger code below is the equivalent of the shell command:
# cmd < infile 2> errfile | cmd2 > outfile
(redirect 0 "infile")
(redirect 2 "errfile")
(pipe 0 "cmd")
(redirect 1 "outfile")
(resume 2)
(exec "cmd2")
The "getstring" library function will fork a local program and accumulate
all the returned data from it into a single string. Here we run curl
locally to retrieve a remote file as a single string:
> (getstring "curl -s 'http://www.mammothcheese.ca/robots.txt'")
"User-Agent: *
Disallow: /cgi-bin/"
If we use curl(1) as intermediary, it can handle HTTP redirects, remove
the HTTP header and merge chunked responses, leaving us with just the
requested resource's data. To see the HTTP header for ourselves, we can
connect a socket directly to the server with the "child_open" intrinsic.
The presence of a second port number argument to "child_open" tells the
interpreter this is a request to connect to a network server.
> (child_open "www.mammothcheese.ca" 80)
1
> (child_write "GET /robots.txt HTTP/1.0" (char 13) (char 10) (char 13) (char 10))
1
> (while (stringp (setq line (child_read)))
>> (print line))
HTTP/1.0 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 24
Last-Modified: Tue, 12 Jun 2007 16:10:51 GMT
Server: Drood/1.14 (FreeBSD/6.1/i386)
Date: Sun, 02 Sep 2007 01:27:20 GMT
User-agent: *
Disallow:
"child_open" creates a full-duplex communication stream, while "pipe",
"redirect", and the "with_input_*" and "with_output_*" macros create only
unidirectional streams off of the standard descriptors. Only one full-
duplex connection can be active at any time, but it exists independently
of whatever source the standard descriptors are connected to. With a
port number argument of 0, "child_open" attempts to connect to another
process listening on a UNIX domain socket. Without a port number
argument "child_open" forks a local program:
> (child_open "munger")
1
> (child_write "(setq foobar 43)")
1
> (chomp (child_read))
"43"
> (child_close)
1
TEXT BUFFERS
Munger provides line-oriented buffers for storing large amounts of text.
A buffer must be opened before it can be used, with the "open" intrinsic,
and may be closed with the "close" intrinsic. A whole number is returned
by "open" upon success, called the buffer number of the opened buffer.
Only one buffer, out of those currently open, can be active at any time,
and is called the current buffer. Each call to "open" creates a new
buffer and makes the new buffer the current buffer. The buffer number of
an open buffer may be passed as the argument to the "switch" intrinsic to
make that buffer the current buffer.
A number of intrinsics are provided which act upon the current buffer, to
insert, retrieve, and delete lines, to report the number of words and
lines in the buffer, and to write and read buffer lines to and from
files and programs. Lines may be retrieved whole, or in slices with tabs
expanded. A range of buffer lines may be processed through an external
program with the "filter" intrinsic, while the "find" intrinsic may be
used to search through the current buffer to find non-overlapping matches
on regular expressions. A range of lines may be copied from one buffer
to another with the "transfer" intrinsic. There are also intrinsics to
set and find bookmarks in buffers. The full list of buffer intrinsics
can be found later in this document at the top of the LANGUAGE REFERENCE
section, under the subheading, Buffer Operations. Each is fully
described in its own particular entry.
; Loading a buffer from a file:
> (open)
0
> (read 0 "README")
38 ; Number of lines read.
> (for (a 1 5) (print (retrieve a)))
Munger
======
Munger is a simple, statically-scoped, interpreted lisp that has
line-editor-like access to multiple text buffers, for use on the FreeBSD
; Loading a buffer from a process:
> (empty)
1
> (input 0 "ls")
20
> (for (a 1 (lastline)) (print (retrieve a)))
LICENSE
Makefile
README
cat.munger
client.munger
err.munger
cgi.munger
fmt.munger
grep.munger
intrinsics.c
library.munger
lisp.c
lisp.h
options.munger
munger.man
transform.munger
; Filtering buffer content through a process:
> (filter 1 (lastline) "fmt")
3 ; Number of lines received back from filter.
> (for (a 1 (lastline)) (print (retrieve a)))
LICENSE Makefile README cat.munger client.munger err.munger cgi.munger
fmt.munger grep.munger intrinsics.c library.munger lisp.c lisp.h options.munger
munger.man transform.munger
; Loading a buffer from a remote file using curl(1):
> (empty)
1
> (input 0 "curl -s 'http://www.mammothcheese.ca/index.html'")
296 ; Number of lines read.
; Finding the location of a match on a regular expression:
> (setq rx (regcomp "<body[^>]*>"))
<REGEX#1>
> (find 1 1 0 rx 0)
(27 3 6)
> (slice 27 3 6 1 0)
"<body>"
; Filtering a buffer through an HTTP server:
> (empty)
1
> (insert 1 (concat "GET /Slashdot/slashdot HTTP/1.0" (char 13) (char 10)) 0)
1
> (insert 2 (concat (char 13) (char 10)) 0)
1
> (filter_server 1 2 "rss.slashdot.org" 80)
272
; Get rid of the HTTP header and condense a possibly chunked response body:
> (remove_http_stuff)
0
; Filter the XML document left in the buffer through the xml2alist example
; program:
> (filter 1 (lastline) (join "/" (libdir) "xml2alist.munger"))
264
; The buffer now contains a lisp representation of the XML we can use
; the xmlquery.munger example module with:
> (load (join "/" (libdir) "xmlquery.munger"))
<CLOSURE#24>
; Evaluate the buffer content as lisp:
> (eval_buffer)
[ Converted document scrolls by dramatically! ]
; Make a query. Let's see the cdata content of the title elements of the
; RSS feed:
> (dynamic_let (document (get_elements "item" "document" 1 "rdf:RDF" 1))
>> (while document
>>> (println (get_cdata "item" 1 "title" 1))
>>> (setq document (cdr document))))
Unrefined "Musician" Gains a Global Audience
Open Source Laser Business Opens In New York
OpenOffice.org 2.1 Released With New Templates
Texas Lawmaker Wants To Let the Blind Hunt
Designer Glasses With Microdisplay Unveiled
Arctic Ice May Melt By 2040
DIY Service Pack For Windows 2000/XP/2003
Sea Snail Toxin Offers Promise For Pain
A Press Junket To Redmond
The Dutch Kill Analog TV Nationwide
Google Web Toolkit Now 100% Open Source
Novell and Microsoft Claim Customer Support
Wikipedia Founder to Give Away Web Hosting
How Craigslist is Keeping up Internet Ideals
Norman & Spolsky - Simplicity is Out
REGULAR EXPRESSIONS
Munger provides five intrinsic functions for working with extended
regular expressions. All regular expressions must be compiled with the
"regcomp" intrinsic before they may be used by the other regular-
expression wielding intrinsics. The "match" intrinsic returns a list of
two character indices describing a match. The "matches" intrinsic
returns a list of the text matched by the regular expression and up to 20
parenthesized subexpressions. The "substitute" intrinsic is modelled
after the substitute command from the vi/ex editor, and may be used to
perform regular-expression-based substitution operations upon strings.
The "replace" library function allows snippets of lisp to be used for the
replacement expression, allowing replacement text to be dynamically
generated for each match in a string. Lastly, the "find" intrinsic may
be used to find the location of matches against a particular regular
expression in the text in the current buffer.
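For readers more familiar with other regular expression interfaces, the
operations just described have rough analogues in Python's re module.
The sketch below is for comparison only: Python's re is not the library
Munger uses, and decode is a hypothetical helper that mirrors the URL-
decoding example later in this section. It shows a match reported as a
pair of character indices, and replacement text computed per match.

```python
import re

path = "/usr/local/share/munger/library.munger"

# A match reported as a (start, end) pair of character indices.
start, end = re.search("munger", path).span()
print(start, end)          # 17 23
print(path[:start])        # text before the match
print(path[start:end])     # text of the match
print(path[end:])          # text after the match

HEX_ESCAPE = re.compile(r"%([0-9A-Fa-f][0-9A-Fa-f])")

def decode(text):
    """Turn '+' into a space, then each %XX escape into its character."""
    spaced = text.replace("+", " ")
    # The replacement is computed separately for every match.
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), spaced)

print(decode("foobar+tooley%7Efunbag%24%25%26"))  # foobar tooley~funbag$%&
```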
> (set 'rx (regcomp "munger"))
<REGEXP#1>
> (set 's "/usr/local/share/munger/library.munger")
> (match rx s)
(17 23)
; Text before match:
> (substring s 0 17)
"/usr/local/share/"
; Text of match:
> (substring s 17 (- 23 17))
"munger"
; Text after match:
> (substring s 23 0)
"/library.munger"
> (set 'rx (regcomp "^/?([^/]+/)*([^/]+)"))
<REGEXP#2>
; Full match and matched subexpressions:
> (matches rx s)
("/usr/local/share/munger/library.munger" "munger/" "library.munger"
"" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "")
; Ex-like substitution:
> (substitute rx "\U\2" "/usr/local/bin/munger" 1)
"MUNGER"
; Dynamically-generated replacement strings for each match:
(let ((r0 (regcomp "%([0-9A-Fa-f][0-9A-Fa-f])"))
(r1 (regcomp "\+")))
(defun decode (str)
(replace r0
(char (hex2dec m1))
(substitute r1 " " str 0))))
> (decode "foobar+tooley%7Efunbag%24%25%26")
"foobar tooley~funbag$%&"
FUNCTIONS
The elements of function or macro applications are evaluated from left to
right, with the function position being evaluated in the same environment
as the succeeding positions. The function position must be occupied by
an expression that evaluates to an intrinsic function object or a
closure. Lambda- and macro-expressions evaluate to closure objects and
are closed over the local bindings visible at that time:
> (let ((a 10))
>> (defun booger () a))
<CLOSURE#32>
> (booger)
10
With static scoping, the local bindings visible to a function are limited
by the block structure in place at the time the function is created, but
variable extent is unlimited, so the function "booger"'s reference to the
enclosing block's variable "a" is still valid after the enclosing block
has returned. The binding is said to be "closed" within "booger" and is
no longer visible anywhere else.
In the example above, the invocation of "defun" creates a toplevel
binding, and not a new locally-visible binding, as an invocation of
"define" would do in this position in Scheme. Locally-visible bindings
can only be created by function application ("let" is syntactic sugar for
the application of an anonymous function), or by the "extend" intrinsic.
If the variable "booger" were a local variable in an enclosing scope, its
binding would have been modified, creating a locally-visible function,
but in this case, since no local binding exists, a new toplevel binding
is created.
When a lambda-expression is evaluated it is closed over the environment
in which it is embedded to become a closure. This means free variables
visible in containing functions remain accessible. Closures can be used
to create shared lexical environments for functions which need to
communicate with each other, or they can be used to create static
bindings which persist between invocations of a function, much like local
variables in C declared with the "static" storage class. Because
closures perform encapsulation they can be used to simulate objects in
the message-passing style of object programming. Closures can also be
used to capture state for use in the continuation-passing style of
programming. A full discussion of these topics is beyond the scope of
this manual page.
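As a small illustration of a binding that persists between invocations,
here is the idea expressed in Python rather than Munger. It is a sketch
of the general closure technique only, and the names make_counter and
counter are hypothetical.

```python
def make_counter():
    """Return a function with its own private, persistent count."""
    count = 0                 # closed-over binding, invisible elsewhere

    def counter():
        nonlocal count        # refer to the enclosing binding
        count += 1
        return count

    return counter

first = make_counter()
second = make_counter()       # an independent environment
print(first(), first(), second())  # 1 2 1
```

Each call to make_counter creates a fresh environment, so the two
counters do not share state; this is the encapsulation the paragraph
above refers to.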
Functions may be defined to accept variable-length argument lists. To
specify that a function accepts a varying number of arguments, a final
parameter to reference the optional arguments is enclosed in parentheses
in the parameter list of the function definition. For example,
(lambda ((a)) (print a))
defines a function which accepts zero or more arguments. Upon
invocation, all arguments passed to the function will be collected into a
list and bound to "a" inside the function body. If no arguments are
passed to the function, "a" will be bound to the empty list.
The function below accepts two or more arguments, with any arguments
after the second argument collected into a list bound to "c" in the
function body:
(lambda (a b (c)) (print a b c))
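The same calling convention exists in many languages. As a point of
comparison only, the Python sketch below mirrors the two-or-more-
arguments function above; the name show is hypothetical, and the extra
arguments are converted to a list to match the description.

```python
def show(a, b, *c):
    """Accept two or more arguments; extras are collected into a list."""
    return a, b, list(c)

print(show(1, 2))          # (1, 2, [])
print(show(1, 2, 3, 4))    # (1, 2, [3, 4])
```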
The "labels" intrinsic facilitates the creation of local function
bindings, where each binding is visible inside each function definition.
Recursive and mutually-recursive temporary functions therefore may be
created with "labels".
>(labels ((even (lambda (n) (or (eq n 0) (tailcall odd (- n 1)))))
>>> (odd (lambda (n) (and (not (eq n 0)) (tailcall even (- n 1))))))
>> (print (even 11))
>> (newline)
>> (print (even 12))
>> (newline))
0
1
1 ; this is the return value of the last (newline).
There are "let", "letn", and "letf" macros to facilitate the creation of
new lexical bindings. "letn" corresponds to let* in other lisp dialects,
while "letf" corresponds to a named let in Scheme or flet in lisp. It
may not be necessary to use these macros to create new lexical bindings,
depending upon the needs of a particular situation.
If the interpreter is at toplevel, then one of these macros must be used,
or a closure applied, to create a lexical environment. Inside a lexical
environment, however, new bindings may be dynamically created with the
"extend" intrinsic. These bindings have unlimited extent just as all
lexical bindings do, and so may be safely closed-over by closures.
> (lambda (a)
>> (extend 'b (* a a))
>> (lambda () b))
If the programmer wishes to limit the extent of the new binding, the
invocation of "extend" and the expressions which use the new binding may
be wrapped in a "dynamic_extent" intrinsic:
> (lambda (a)
>> (extend 'f (lambda () b))
>> (dynamic_extent
>>> (extend 'b (* a a))
>>> (print (f))) ; Works here inside "dynamic_extent"
; even though f was closed before "extend" was invoked.
; b suddenly "pops-up" into the lexical environment.
>> (f)) ; ERROR: b is no longer extant here.
Tail recursion is not recognized by the interpreter, but must be
requested explicitly by the "tailcall" intrinsic. This mechanism allows
even anonymous functions to perform tail recursion. See the entry for
"tailcall" later in this document for more details.
> (let ((n 10)
>>> (a 1))
>> (if (< n 2)
>>> a
>>> (tailcall 0 (- n 1) (* a n))))
3628800
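One way to picture an explicit tail-call is as a restart of the running
function with new argument values, which amounts to a loop. The Python
sketch below models the computation in the example above under that
assumption; it does not describe how the interpreter implements
"tailcall", and the name fact is hypothetical.

```python
def fact(n, a=1):
    """Factorial with the tail-call written as a restart of the body."""
    while True:
        if n < 2:
            return a
        # Rebind both parameters at once, as a call with new arguments would.
        n, a = n - 1, a * n

print(fact(10))  # 3628800
```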
MACROS
Macros allow us to define syntactic transformations. With macros we can
create new special operators for the interpreter in lisp. A macro is
simply a function which receives its arguments unevaluated, and instead
has its return expression evaluated. In other words, the purpose of a
macro is to compose a new expression out of its arguments, which the
interpreter then evaluates. Despite this simple definition, the use of
macros can become confusingly complex. With macros, we write lisp which
writes the lisp to be ultimately evaluated.
A simple example is the "quit" macro defined in library.munger:
(set 'quit (macro () '(exit 0)))
We can see how a macro will expand with the test intrinsic:
> (test (quit))
(exit 0)
Macros typically consist of a template expression into which we plug sub-
expressions to produce the expression which is finally returned by the
macro. The "qquote" (quasiquote) macro aids in the defining of other
macros, by allowing us to use the template paradigm directly. This
greatly simplifies complex macro definitions. The "qquote" macro is
similar to the "quote" intrinsic in that it prevents evaluation of its
argument, but the quoting can be selectively turned off for sub-
expressions by prefacing them with a comma (",") character. Here is the
definition of the "with_input_file" macro from library.munger:
(set 'with_input_file
(macro (file (code))
(qquote
(when (> (redirect 0 ,file) 0)
(protect ,(cons 'progn code)
(resume 0))))))
The sub-expressions prefaced by commas are evaluated and the results of
the evaluations are inserted into the template expression.
We can see how the macro will expand with the "test" intrinsic:
> (test (with_input_file "library.munger" (getline)))
(when (> (redirect 0 "library.munger") 0)
(protect (progn (getline))
(resume 0)))
There is a "defmac" macro in library.munger which allows macros to be
created with the following syntax:
(defmac with_input_file (file (code))
(qquote
(when (> (redirect 0 ,file) 0)
(protect ,(cons 'progn code)
(resume 0)))))
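As a further illustration of the template paradigm, the following sketch
defines a small macro with "defmac" and inspects its expansion with "test".
The macro name "twice" is hypothetical and is not part of library.munger:
(defmac twice (expr)
(qquote
(progn ,expr ,expr)))
> (test (twice (beep)))
(progn (beep) (beep))
Because a macro receives its arguments unevaluated, the expression (beep) is
copied into the template twice, rather than being evaluated once and having
its value copied.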
PROGRAMMING STYLE
Programs written in an imperative style will always out-perform programs
written in a functional style because Munger is a naive interpreter. A
Scheme programmer might be tempted to write a factorial function in
Munger like this:
(defun fact (n)
(labels ((f (lambda (n a)
(if (< n 2)
a
(f (- n 1) (* n a))))))
(f n 1)))
This function will work as expected in Munger, but there are some
improvements we can make to increase its efficiency. First of all,
the recursive invocation of "f" will not be recognized as a tail-call by
the Munger interpreter, so we will use the "tailcall" intrinsic to
make this call. Secondly, "tailcall" does not need to "see" a binding in
order to make a recursive call. It can restart the currently-running
function internally. This means we can use a "let" to make the binding
to "f":
(defun fact (n)
(let ((f (lambda (n a)
(if (< n 2)
a
(tailcall 0 (- n 1) (* n a))))))
(f n 1)))
We can simplify this further. We don't actually need to use "let" to
create new local bindings, unless the interpreter is at toplevel, where
no local lexical environment exists. The "extend" intrinsic will extend
the current local environment with a new binding more efficiently,
because "let" is a macro internal to the interpreter, which must be
expanded and re-evaluated, resulting in another function application. If
we wish to limit the extent of the bindings introduced by "extend" we can
wrap expressions with the "dynamic_extent" intrinsic. Replacing the
"let" with "extend" gives us:
(defun fact (n)
(extend 'f
(lambda (n a)
(if (< n 2)
a
(tailcall 0 (- n 1) (* n a)))))
(f n 1))
We don't use it here, but the binding to "f" is actually visible inside
the function body. Although the function is closed over the environment
before the environment is extended, the binding will nonetheless be
visible when the closure is applied, because closures do not simply close
over visible bindings, but rather the environments which contain them.
At the time the closure is applied, the environment will have been
dynamically extended to make "f" visible.
Using function calls to iterate will always result in slow programs in
Munger. The only reason we are introducing the helper function is to
create a new binding to "a" to act as an accumulator, but we can use
"extend" to do that directly, and replace the closure with a loop:
(defun fact (n)
(extend 'a 1)
(while (> n 1)
(setq a (* n a))
(dec n))
a)
This function is very different from the factorial we started with and
will run much more efficiently in the Munger interpreter. But there is
one last change we can make to further improve efficiency. The most
efficient way to iterate in Munger is to use the "iterate" intrinsic. It
can be used when the number of iterations can be calculated before
entering the body of the loop:
(defun fact (n)
(extend 'a 1)
(iterate n (setq a (* a n)) (dec n))
a)
This is the fastest form of this function.
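Whichever version is used, the function is called in the same way. Assuming
one of the definitions of "fact" above has been entered at toplevel, a
session might look like this:
> (fact 10)
3628800
> (fact 1)
1
The first result matches the "tailcall" example shown earlier in this
document, which computes the same product.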
RETURN VALUES
Munger returns 1 if the interpreter stopped due to an error, 0 otherwise.
AUTHORS
James Bailie <jimmy@mammothcheese.ca>
http://www.mammothcheese.ca
LANGUAGE REFERENCE
Do not assume you know what an intrinsic does because it has a similar
name to an intrinsic of another Lisp dialect, or you may have some nasty
surprises working in Munger.
The following functions are either built-in to the interpreter, or are
library functions or macros defined in library.munger. To find the entry
for a particular item, search this document for the name of the item
followed by a colon.
List Operations:
cons, car, cdr, list, length,
caar, cdar, cadr, cddr, cdddr,
cddddr, caddr, cadddr, caddddr, append,
alist_lookup, alist_remove, alist_replace, reverse, sort,
sortlist, sortcar, mapcar, foreach, remove,
nthcdr, nth, member, map
Predicates and Conditionals:
eq, atomp, if, when, unless,
and, or, not, nullp, boundp,
pairp, equal, <, <=, >,
>=, stringp, fixnump, symbolp, regexpp,
tablep, stackp, intrinsicp, closurep,
macrop, recordp
Assignment:
set, setq, inc, dec, defun,
defmac
Evaluation and Control Flow:
progn, throw, while, until, do,
catch, continue, main, eval, quote,
load, gensym, version, test, let,
letn, labels, cond, apply, exit,
quit, interact, fatal, nofatal, extract,
qquote, protect, letf, tailcall, prog1,
printer, noprinter, die, dynamic_let,
extend, gc, for, iterate, loop,
dynamic_extent, gc_freq, case, eval_string,
blind_eval_string, eval_buffer
Fixnum Arithmetic:
+, -, *, /, %,
abs, random, negate, unsigned
Type Conversions:
stringify, digitize, intern, char, code,
hex2dec, dec2hex
Buffer Operations:
open, close, insert, delete, retrieve,
lastline, filter, write, read, empty,
slice, find, input, output, words,
maxidx, buffer, buffers, switch, transfer,
setmark, getmark, with_buffer, filter_server,
remove_http_stuff
Regular Expressions:
regcomp, match, matches, substitute, replace
String Operations:
split, join, expand, substring, concat,
chop, chomp, upcase, downcase, length,
reverse, strcmp, split_rx, tokenize, rootname,
suffix, explode, base64_encode, base64_decode,
form_encode, form_decode
Filesystem Operations:
chdir, libdir, directory, unlink, rmdir,
pwd, exists, stat, rename, seek,
mkdir, complete, realpath, access, truncate,
redirect, resume, chown, chmod, basename,
readlock, writelock, unlock, dirname, symlink
Command-Line Arguments:
current, next, prev, rewind
Line-Oriented I/O:
print, println, warn,
with_input_file, with_output_file, with_output_file_appending,
with_input_process, with_output_process, with_error_file,
with_error_file_appending, with_temporary_output_file,
foreach_line, foreach_line_callback, pipe,
newline, redirect, resume,
stderr2stdout, getline, rescan_path,
stdout2stderr, flush_stdout, getline_ub,
reset_history, save_history, load_history
Network daemon-related:
listen, listen_unix, stop_listening, accept, daemonize,
syslog, getpeername, receive_descriptors, send_descriptors,
busymap, nobusymap, busy, notbusy,
get_scgi_header, busyp
System Access:
system, getenv, setenv, block, unblock,
suspend, lines, cols, date, time,
beep, checkpass, crypt, setuid, getuid, setgid,
geteuid, seteuid, getgid, hostname, gecos,
timediff, timethen, date2days, days2date, date2time,
localtime, utctime, week, weekday, month, getpid,
getppid, setpgid, getpgrp, tcsetpgrp, tcgetpgrp,
kill, killpg, fork, glob, wait,
zombies, nozombies, zombiesp, exec, shexec,
forkpipe, command_lookup, unsetenv, getstring,
datethen, chroot, isatty, sle
Tables:
table, hash, unhash, lookup, keys,
values
Stacks:
stack, push, pop, index, store,
used, topidx, assign, flatten, shift,
clear, unshift
Records:
record, getfield, setfield
Communication with a child process:
child_open, child_write, child_read,
child_close, child_running, child_ready,
child_wait, child_eof
SQLite Interface:
sqlite_open, sqlite_close, sqlite_exec, sqlite_prepare, sqlite_step,
sqlite_finalize, sqlite_reset, sqlite_row, sqlite_bind, sqlp,
sqlitep
Character-Oriented I/O:
display, clearscreen, pause, clearline,
getchar, goto, scrolldn, scrollup, hide,
show, pushback, insertln, getchars,
fg_black, fg_red, fg_green,
fg_yellow, fg_blue, fg_magenta,
fg_cyan, fg_white, bg_black,
bg_red, bg_green, bg_yellow,
bg_blue, bg_magenta, bg_cyan,
bg_white, boldface, normal
A description of each follows:
cons: (cons expr1 expr2)
Intrinsic "cons" adds an element to the beginning of a list. Expr2 must
evaluate to a list.
> (cons 'a '(b c))
(a b c)
car: (car expr1)
Intrinsic "car" returns the first element of a list. An error is
generated if expr1 does not evaluate to a list.
> (car '(a b c))
a
cdr: (cdr expr1)
Intrinsic "cdr" returns the sublist of a list, beginning from the second
element of the list. An error is generated if expr1 does not evaluate to
a list.
> (cdr '(a b c))
(b c)
boundp: (boundp expr1)
The "boundp" intrinsic accepts one argument which must evaluate to a
symbol. The function returns 1 if the symbol is currently bound, 0
otherwise.
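For example, assuming the symbol "no_such_symbol" (a name chosen only for
this illustration) has not been bound anywhere in the session:
> (set 'x 1)
1
> (boundp 'x)
1
> (boundp 'no_such_symbol)
0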
caar, cadr, cdar, caddr, cadddr, caddddr, cddr, cdddr, cddddr: (form expr)
These library functions take the same argument as the car and cdr
intrinsics and are built out of nested groupings of those intrinsics.
> (caar '((a) b c)) ; is equivalent to: (car (car '((a) b c)))
a
> (cadr '(a b c)) ; is equivalent to: (car (cdr '(a b c)))
b
> (cdar '((a) b c)) ; is equivalent to: (cdr (car '((a) b c)))
()
> (cddr '(a b c)) ; is equivalent to: (cdr (cdr '(a b c)))
(c)
etc.
eq: (eq expr1 expr2)
Intrinsic "eq" returns 1 if expr1 and expr2 evaluate to the same atom
token, to atoms representing equivalent numbers, or to the exact same
list, otherwise it returns 0.
> (eq 'a 'a)
1
> (eq '(a b c) '(a b c))
0
> (set 'l '(a b c))
(a b c)
> (eq l l)
1
> (eq 0001 1)
1
equal: (equal expr1 expr2)
Library function "equal" returns 1 in all the situations where "eq"
returns 1, and additionally will return 1 if both of its arguments
evaluate to lists having the same structure and content. While "eq" will
fail if its arguments evaluate to different lists with the same structure
and content, "equal" will return 1 for these arguments.
> (equal '(a b c) '(a b c))
1
> (set 'l '(a b c))
(a b c)
> (equal l l)
1
> (equal '(00 01 02) '(0 1 2))
1
atomp: (atomp expr)
Intrinsic "atomp" returns 1 if its argument evaluates to an atom, and
returns 0 otherwise.
> (atomp 'a)
1
> (atomp '(a b c))
0
set: (set expr1 expr2)
Intrinsic "set" accepts two arguments. The result of evaluating the
second argument is bound to the result of evaluating the first argument.
The first argument must evaluate to a symbol. If a local variable with
the syntax of the symbol exists, its binding will be modified, otherwise
"set" will create or modify a toplevel binding. Locals can only be
created by function application, or with the "extend" intrinsic.
> (set 's '(a b c))
(a b c)
> s
(a b c)
> (set (car '(a b c)) ((lambda (x) (* x x)) 4))
16
> a
16
setq: (setq symbol expr)
The "setq" intrinsic works similarly to the "set" intrinsic, except the
first argument to the function is not evaluated and must be a symbol.
(setq a b)
is equivalent to:
(set 'a b)
eval_buffer: (eval_buffer)
The "eval_buffer" intrinsic evaluates Munger lisp in the current buffer.
The function accepts no arguments. The buffer is evaluated in the
current lexical context. Recursive invocations of "eval_buffer" are not
permitted, which is to say the code in the current buffer (or any other
code it invokes) may not itself invoke "eval_buffer" while "eval_buffer"
is running. If the code in the current buffer messes with itself by
altering the content of the current buffer, disaster may result; however,
the code in the current buffer may open new buffers as it likes without
fear. The "eval_buffer" function will continue to parse the code in the
buffer which was current when it was invoked. The function returns the
result of evaluating the last expression in the buffer upon success, or 0
if a recursive invocation is attempted or if the current buffer is empty.
Any errors encountered during evaluation of the code in the buffer will
stop evaluation.
eval_string: (eval_string expr)
blind_eval_string: (blind_eval_string expr)
Both "eval_string" and "blind_eval_string" accept one argument which
must evaluate to a string, and attempt to execute the string as lisp.
Any error encountered will stop evaluation of the string, but the
interpreter will attempt to carry on interpreting the rest of your
program. Your program may be in a "messed-up" state from the badly-
behaved code parsed from the string, which may cause the interpreter to
encounter another error which stops evaluation when it attempts to
continue interpreting your program. Both functions return the result of
evaluating the last successfully-evaluated expression parsed from the
string. If no expressions are successfully-evaluated, then the original
string will be returned.
The difference between the two functions is that with "eval_string" the
code parsed from the string is evaluated in the current lexical context,
while with "blind_eval_string" only the global environment is visible to
the code in the string. The current lexical environment is invisible to
the string.
Care must be taken when specifying recursive invocations of "eval_string"
literally in program text. Consider this example:
(eval_string "(setq foobar (eval_string \"(join \\\":\\\" \\\"b\\\")\"))")
The string-within-a-string, which is the argument to the recursive
invocation, must have its delimiting quotes escaped with backslashes in
order for them to be embedded in the toplevel string, and not end it.
Then the strings-within-a-string-within-a-string, which are the arguments
to the recursive invocation's argument string's invocation of "join",
must be double-escaped with three backslashes. The backslashes closest
to the quotes escape the quotes in the toplevel string, while the double
backslashes before the escaped quotes embed backslashes into the
toplevel string which will be interpreted as escaping the quotes during
the recursive invocation of the lisp reader.
Invoking "eval_string" is similar to invoking "load" except the code is
extracted from a string instead of from a file. If an expression is not
complete within the string, it will be discarded.
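A brief sketch of the behaviour described above. The first expression
evaluates a string of lisp, and the second shows code in the string referring
to a binding in the current lexical context (the parameter "x" of the
enclosing closure, a name chosen for this illustration):
> (eval_string "(+ 1 2)")
3
> ((lambda (x) (eval_string "(* x x)")) 4)
16
With "blind_eval_string" in place of "eval_string" in the second expression,
the binding to "x" would not be visible to the code in the string.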
eval: (eval expr)
Intrinsic "eval" returns the result of evaluating its lone argument
twice, which is to say, the argument is evaluated as usual, and the
result of this evaluation is evaluated again. Note "eval" does not
contain the lisp reader. Calling "eval" with a string argument will only
cause the original string to be returned, since strings are constants in
Munger.
> (set 'a (quote (set 'b 'booger)))
(set (quote b) (quote booger))
> b
evaluate: b has no value.
> (eval a)
booger
> b
booger
quote: (quote expr) or 'expr
Intrinsic "quote" returns its argument unevaluated. It is used so
frequently it has an abbreviated form of a single apostrophe.
> (quote (a b c))
(a b c)
> '(a b c)
(a b c)
protect: (protect expr expr ...)
The "protect" intrinsic is analogous to unwind-protect in other lisp
dialects. The function accepts one or more arguments and evaluates them
in sequence, but returns the value of evaluating the first argument. The
arguments subsequent to the first are evaluated EVEN IF THE EVALUATION OF
THE FIRST IS INTERRUPTED by the interpreter encountering an error.
> (catch
>> (protect (throw 0)
>>> (print 'booger)
>>> (newline)))
booger
0
>
qquote: (qquote expr)
The "qquote" (quasiquote) macro accepts a list and returns it unchanged
except where sub-expressions have been "escaped" with comma characters.
Those escaped sub-expressions are evaluated and the result of the
evaluation inserted into the template expression. "qquote" aids in the
composition of macros.
Some lisps define a read macro "`" as a short form for "qquote", but
Munger does not support this. The commas escaping sub-expressions are
separate tokens in Munger, unlike in other lisps. For example ,token is
actually parsed by Munger as two tokens, "," and "token". This means
',token parses to "(quote ,) token" and not "(quote ,token)".
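A small sketch of the template behaviour, assuming "qquote" is invoked
directly at toplevel with a global binding for the symbol "n" (a name chosen
for this illustration):
> (set 'n 3)
3
> (qquote (a ,n ,(+ n 1)))
(a 3 4)
The unescaped symbol "a" is returned as-is, while the two escaped sub-
expressions are replaced by their values.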
if: (if expr1 expr2 [expr3 ...])
Intrinsic "if" is a conditional. It accepts two or more arguments, the
first being the test condition. The test condition is evaluated, and if
the result is a true value (anything except 0 (fixnum), the empty string,
or the empty list), the second argument is evaluated and the result of
that evaluation is returned. Otherwise, if the test condition evaluated
to a false value, and further expressions are present after the second,
those expressions are evaluated in order and the result of the last
expression evaluated is returned. If only two expressions are present in
the original form, and the test condition evaluates to a false value, the
result of evaluating the test expression is returned.
> (if (> 3 4) 'yes 'no)
no
> (if (> 3 4) 'yes)
0
and: (and expr1 [expr2 ...])
Intrinsic "and" accepts one or more arguments, and evaluates
them from left to right until an argument evaluates to a "false" value
(0, the empty string, or the empty list), or the end of the arguments is
reached. The value of the last evaluation is returned.
> (and 1 'a 0)
0
> (and 1 'a "string")
"string"
or: (or expr1 [expr2 ...])
Intrinsic "or" accepts one or more arguments, and evaluates them from
left to right until an argument evaluates to a true value (anything other
than 0, the empty string, or the empty list), or the end of the arguments
is reached. The value of the last evaluation is returned.
> (or 1 0)
1
> (or 0 0)
0
list: (list expr1 [expr2 ...])
Intrinsic "list" accepts one or more arguments, evaluates them in order,
and returns a newly-constructed list containing the result of each
evaluation, also in order.
> (list 'a 'b 'c) ; is equivalent to (cons 'a (cons 'b (cons 'c ())))
(a b c)
> (list '(a b c) 4 "hello")
((a b c) 4 "hello")
progn: (progn expr...)
Intrinsic "progn" accepts one or more arguments, and evaluates them in
order, returning the result of evaluating the last argument. If no
arguments are passed to "progn" an error is generated which will stop
evaluation.
> (progn
>> (set 'f (lambda (n) (+ n 1)))
>> (set 'x 1)
>> (f x))
2
prog1: (prog1 ...)
Library macro "prog1" accepts zero or more arguments, and evaluates them
in order, returning the result of evaluating the first argument. If no
arguments are passed to the macro, it returns the empty list.
> (prog1
>> (+ 2 2)
>> (+ 3 3))
4
not: (not expr)
Intrinsic "not" returns 1 if its argument is the empty list, the empty
string, or zero, otherwise it returns 0.
> (not 0)
1
> (not "hello")
0
nullp: (nullp expr)
Library function "nullp" returns 1 if its argument is the empty list,
otherwise it returns 0.
> (nullp ())
1
> (nullp 'a)
0
pairp: (pairp expr)
Library function "pairp" returns 1 if its argument is a non-empty list,
otherwise it returns 0.
> (pairp '(a b c))
1
> (pairp ())
0
> (pairp 'a)
0
warn: (warn expr [expr...])
The "warn" intrinsic evaluates its arguments then writes the value of
each evaluation, and one final newline character to the standard error
stream. The "warn" intrinsic always returns 1.
setenv: (setenv expr1 expr2)
The "setenv" intrinsic sets the value of a named environment variable.
The function accepts two arguments, which must both evaluate to strings.
The first argument is the environment variable to set, and the second is
the new value to bind to the variable. Any errors encountered will stop
evaluation. Upon success, "setenv" returns 1.
unsetenv: (unsetenv expr)
The "unsetenv" intrinsic accepts one argument which must evaluate to a
string and removes any environment variable named by the string from the
environment. The function always returns 1.
getenv: (getenv expr)
The "getenv" intrinsic looks up the value of an environment variable. It
accepts one argument which must evaluate to a string. If the environment
variable specified exists, a string is returned representing the
variable's value. If the variable does not exist, 0 is returned.
> (getenv "HOME")
"/usr/home/jbailie"
> (getenv "foobar")
0
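The three environment intrinsics can be combined as in the following sketch.
The variable name "MUNGER_DEMO" is hypothetical and is assumed not to be set
beforehand:
> (setenv "MUNGER_DEMO" "hello")
1
> (getenv "MUNGER_DEMO")
"hello"
> (unsetenv "MUNGER_DEMO")
1
> (getenv "MUNGER_DEMO")
0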
directory: (directory expr)
The "directory" intrinsic returns a list of strings representing the
filenames of the entries in a specified directory, or a string
representing the error encountered by the opendir() system call. The
function does not return the ".." and "." directory entries in its
result set. "directory" accepts one argument, which must evaluate to a
string, specifying the directory to list.
> (directory "/usr/local")
("man" "bin" "share" "include" "lib" "etc" "info" "libexec" "sbin" "libdata")
> (directory "/foobar")
"No such file or directory"
chomp: (chomp expr)
The "chomp" intrinsic removes all contiguous terminating carriage return
and newline characters from a string. The function accepts one argument
which must evaluate to a string, and ALWAYS returns a new string,
regardless of whether any terminators were removed from its argument or
not.
> (chomp (getline))
hello[return]
"hello"
> (getline)
hello[return]
"hello
"
>
chop: (chop expr)
The "chop" intrinsic accepts one argument which must evaluate to a
string, and returns a new string with the same characters as the original
string but with the last character removed. If the argument string is
empty, "chop" returns an empty string.
> (chop "hello")
"hell"
> (chop "")
""
beep: (beep)
The "beep" intrinsic accepts no arguments and will cause the device
connected to standard output to beep, if it is capable of doing so. 1 is
returned on success. Any error encountered will stop evaluation.
suspend: (suspend)
The "suspend" intrinsic accepts no arguments and causes the interpreter
to send a SIGSTOP to itself, suspending the interpreter process. See
your shell's documentation on how to resume stopped jobs.
stderr2stdout: (stderr2stdout)
stdout2stderr: (stdout2stderr)
The "stderr2stdout" intrinsic connects the stderr stream to the stdout
stream. The "stdout2stderr" intrinsic connects the stdout stream to the
stderr stream. Both functions accept no arguments, and always return 1.
The standard stream redirected may be reconnected to the stream it was
previously connected to, with the "resume" intrinsic.
After successful invocation, the stream whose name occurs first in the
name of the intrinsic is connected to the same descriptor as the stream
whose name occurs second in the name of the intrinsic. So, if no
previous redirections have been made, invoking "stderr2stdout" will cause
stderr and stdout to both write to descriptor 1, while invoking
"stdout2stderr" will cause both stdout and stderr to write to descriptor
2.
pipe: (pipe expr1 expr2)
NOTE: There are separate shortcut intrinsics to send or receive buffer
lines to or from child processes, or to filter them through child
processes. See the entries for the "input", "output", and "filter"
intrinsics for more details.
The "pipe" intrinsic forks a specified process, piping one of the
interpreter's standard descriptors to the standard input or standard
output of the new process. The function accepts two arguments, the first
of which must evaluate to 0, 1, or 2, and specifies the interpreter
descriptor to be redirected. 0 is the interpreter's standard input, 1 is
the interpreter's standard output, while 2 is the interpreter's standard
error. If the user wishes to read from the new process, 0 should be
passed as the first argument, but if the user wishes to write to the new
process then 1 or 2 should be passed as the first argument. The second
argument must evaluate to a string specifying a command line to run. The
command is passed to the shell (/bin/sh) for execution, and may contain
any expression that program understands. All errors will stop
evaluation. Upon success "pipe" returns 1.
Note that if the specified command cannot be found, or there is some
other error in the command line, the call to "pipe" will succeed, but the
new shell process will immediately exit. If interpreter descriptor 0
was redirected, then the next call to "getline" will return 0,
indicating EOF, but if descriptor 1 or 2 was redirected, the next
"print" to the pipe will silently fail.
Redirection is undone by the "resume" intrinsic. It is not necessary to
undo a redirection, to perform another redirection. Successive
invocations of "pipe" on the same descriptor may be made to create a
pipeline. If interpretation returns to the toplevel, all redirections
are undone automatically. See the entries for the "with_input_process"
and "with_output_process" macros, for a simple means of performing
temporary redirections.
The following example is equivalent to entering:
"jot 100 1 | sort -n | fmt"
at the shell:
> (progn
>> (pipe 1 "fmt")
>> (pipe 1 "sort -n")
>> (for (a 100 1)
>>> (print a)
>>> (newline)))
1
> 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
92 93 94 95 96 97 98 99 100
Note the last line of output from the pipeline is a short line. This
means that after all the data was written to the pipeline, the "fmt"
utility blocked, reading on its stdin, waiting for enough subsequent data
to fill another entire line. It is only when the "progn" returned to
toplevel and the interpreter did an implicit "resume" on descriptor 1,
that the pipeline processes read EOF on the pipe, and "fmt" in turn, spit
out the last line. The output of the pipeline and the interpreter
process are mixed up because they both have their standard outputs
connected to the terminal. The 1 after the lisp is the return value of
the lisp expression. This is followed by the lisp prompt >, which is
then followed by the output from the pipeline. It is necessary to wrap
the whole example in a "progn" to prevent the interpreter from returning
to toplevel in-between subexpressions, and undoing the redirections. The
child process spawned by the second invocation of "pipe" inherited the
redirection of stdout by the first invocation of "pipe" which results in
the creation of the pipeline. Since the command argument to "pipe" is
passed to the shell, we could have used (pipe 1 "sort -n | fmt") instead
and let the shell create the pipeline for us. Note that we must create
the subprocesses of the pipeline in reverse order if we are going to
write to the pipeline, but in order if we are going to read from the
pipeline.
> (progn
>> (pipe 0 "jot 100 1")
>> (pipe 0 "fmt")
>> (while (set 'l (getline))
>>> (print l)))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91
92 93 94 95 96 97 98 99 100
0
>
redirect: (redirect expr1 expr2 [expr3 [expr4]])
The "redirect" intrinsic redirects one of the three standard file
descriptors to a file. The function accepts two, three or four
arguments, the first of which must evaluate to one of 0, 1, or 2, and
specifies the descriptor to be redirected. The three possible values
correspond to standard input, standard output, and standard error,
respectively. The second argument must evaluate to a string specifying
the filename to be opened. The third and fourth arguments are boolean
flags which indicate whether the file specified by the second argument
should be opened in append mode, and whether an attempt should be made to
obtain a lock on the file, respectively. The third optional argument is
ignored unless argument 1 evaluates to 1 or 2, and itself must evaluate
to a fixnum. The fourth optional argument must also evaluate to a
fixnum.
If argument 3 evaluates to a non-zero value, it indicates the contents of
the file should be appended to. If the third argument is not present or
evaluates to 0, and the first argument evaluates to 1 or 2, then the file
will be created if it does not exist, or overwritten if it does. If the
third argument is present and evaluates to a non-zero fixnum, and the
first argument evaluates to 1 or 2, the file will be created if it does
not exist, or it will be appended to, if it already exists. The third
argument is ignored if the first argument evaluates to 0.
If the fourth argument evaluates to a non-zero value, and the first
argument evaluates to 0, an attempt will be made to acquire a shared lock
on the file. If the fourth argument evaluates to a non-zero value, and
the first argument evaluates to 1 or 2, then an attempt will be made to
acquire an exclusive lock on the file.
If the function is successful, the specified file descriptor will be
redirected onto the specified file, and 1 will be returned. If an
attempt is made to redirect descriptor 0 to a non-existent file, -1 is
returned. If an attempt is made to redirect any descriptor to a file to
which the user lacks the necessary permissions, -2 is returned. If a
lock cannot be acquired, -3 is returned. If the function encounters any
other error from the open() system call, it will return a string
describing the error. Errors from other sources will stop the
interpreter.
Redirection is undone by the "resume" intrinsic. It is not necessary to
undo a redirection, to perform another redirection. Successive
invocations of "redirect" may be made, and each redirection may be undone
by calling "resume" to reconnect the specified descriptor to the stream
it was connected to previously. In other words, to undo all
redirections, it is necessary to call "resume" on a descriptor the same
number of times one has called "redirect" on the same descriptor.
If the thread of execution returns to the toplevel read-eval-print loop
while one or more streams are redirected, all redirections will be undone
by the interpreter, with all three standard streams being reconnected to
the descriptors to which they were connected when the interpreter was
started.
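The following sketch exercises the overwrite, append, and failure cases
described above. The filename "tmp" and the path "/no/such/file" are
placeholders, and the latter is assumed not to exist. Each redirection is
wrapped in a "progn" so that the interpreter does not return to toplevel,
and undo the redirection, between subexpressions:
> (progn
>> (redirect 1 "tmp")
>> (print "first")
>> (newline)
>> (resume 1))
1
> (progn
>> (redirect 1 "tmp" 1)
>> (print "second")
>> (newline)
>> (resume 1))
1
> (with_input_file "tmp"
>> (print (getline))
>> (print (getline)))
first
second
1
> (redirect 0 "/no/such/file")
-1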
temporary: (temporary)
The "temporary" intrinsic creates a temporary file in /tmp opened for
writing, and redirects stdout onto it. The function accepts no arguments
and returns the filename of the temporary file. The programmer should
invoke (resume 1) when finished writing to the file, to reconnect stdout
to the stream it was previously connected to, and "unlink" the file when
completely finished with it.
with_temporary_output_file: (with_temporary_output_file expr...)
The "with_temporary_output_file" macro invokes "temporary" to redirect stdout
onto a temporary file, evaluates a sequence of expressions, then invokes
"resume" to undo the redirection to the temporary file. The macro
accepts one or more arguments to be evaluated when the redirection is in
place, and returns the name of the temporary file, so that further code
in your program may access the temporary file. One should "unlink" the
filename when one is completely finished with the file.
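A sketch of typical usage, written as it might appear in a script. The
symbol "f" is an arbitrary name chosen for this example, and "unlink" is
assumed here to accept the filename as its only argument:
(set 'f (with_temporary_output_file
(print "scratch data")
(newline)))
(with_input_file f
(print (getline)))
(unlink f)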
resume: (resume expr)
The "resume" intrinsic causes a redirected descriptor to be reconnected
to the device it was connected to before the last call to "redirect" or
"pipe". The function accepts one argument which must evaluate to one of
0, 1, or 2, corresponding to the descriptor to be affected. If the
specified descriptor was not redirected, "resume" returns 0, otherwise
it returns 1.
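For example, assuming no redirections are in effect when the first
expression is entered:
> (resume 1)
0
> (progn
>> (redirect 1 "/dev/null")
>> (resume 1))
1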
with_input_file: (with_input_file expr1 expr2 ...)
The "with_input_file" macro temporarily redirects standard input onto a
file. The macro accepts two or more arguments, the first of which must
evaluate to a string specifying the file to read from, while the rest are
expressions to be evaluated while standard input is redirected. The
redirection is undone after all the arguments have been evaluated. No
arguments are evaluated until after the macro expands, in the calling
scope. The arguments following the first argument will not be evaluated
unless redirection has been successful. The macro returns the value of
evaluating the last argument upon success, otherwise it returns the return
value of the failed invocation of the "redirect" intrinsic.
The "resume" intrinsic is called by the macro to reconnect standard input
to the device it was connected to before the macro was invoked, after all
the body expressions have been evaluated.
> (with_input_file "library.munger"
>> (print (getline)))
; This file contains the lisp library code for the Munger interpreter.
1
Note that calls to "with_input_file" may be nested.
> (with_input_file "library.munger"
>> (print (getline))
>> (with_input_file "INSTALL"
>>> (print (getline)))
>> (print (getline)))
; This file contains the lisp library code for the Munger interpreter.
Munger Installation
; Copyright (c) 2001, 2002, 2003, James Bailie <jimmy@mammothcheese.ca>.
1
>
with_output_file: (with_output_file expr1 expr2 ...)
The "with_output_file" macro behaves similarly to the "with_input_file"
macro except that "with_output_file" redirects standard output instead of
standard input. The specified file will be overwritten, if it already
exists.
> (with_output_file "tmp" (print "hello") (newline))
1
> (with_input_file "tmp" (print (getline)))
hello
1
with_output_file_appending: (with_output_file_appending expr1 expr2 ...)
The "with_output_file_appending" macro behaves similarly to the
"with_output_file" macro, except that the specified file will be opened
for appending.
with_error_file: (with_error_file expr1 expr2 ...)
The "with_error_file" macro behaves similarly to the "with_output_file"
macro except that "with_error_file" redirects standard error instead of
standard output. The specified file will be overwritten, if it already
exists.
with_error_file_appending: (with_error_file_appending expr1 expr2 ...)
The "with_error_file_appending" macro behaves similarly to the
"with_error_file" macro, except that the specified file will be opened
for appending.
with_input_process:
with_output_process:
These two macros behave similarly to the "with_input_file" and
"with_output_file" macros, the only difference being the first argument
specifies a program for the appropriate descriptor to be piped to or
from, instead of a file. The first argument is passed to /bin/sh for
processing and may therefore contain any commands that program
understands. Read the entries for with_input_file and with_output_file
for more information.
> (with_input_process "ls"
(while (set 'l (getline))
(print l)))
INSTALL
LICENSE
Makefile
README
cat.munger
cgi.munger
cgi.munger
echo.cgi
err.munger
grep.munger
intrinsics.c
library.munger
lisp.c
lisp.h
munger.man
transform.munger
foreach_line: (foreach_line expr)
The "foreach_line" library macro is used to create filter-like programs.
The macro accepts one argument, which must evaluate to a monadic
function. The monadic function is called repeatedly with each line of
input. The input lines are taken from the files specified on the command
line, or if no files are specified, from the standard input. Note the
function passed to the macro can be an intrinsic which accepts variable
length argument lists, but is happy with just one argument, such as the
"print" intrinsic.
The argument pointer must be positioned so that the next call to (next)
will return the first of the command line arguments the user wishes
processed, if any. A script would normally process any option arguments,
advancing the argument pointer as it did so, and then call foreach_line.
The invocation of foreach_line returns when it has processed all the
specified files, or when it encounters EOF on standard input.
; Skip over the interpreter name, so that the argument pointer is now
; pointing to the script name. The next invocation of (next) will
; either return the next argument, or 0.
(next)
(foreach_line (lambda (line) [do something with line] ))
(exit 0)
foreach_line_callback: (foreach_line_callback expr1 expr2 (expr3))
The "foreach_line_callback" macro functions similarly to the
"foreach_line" macro, but is intended for use by filters which need to
reset state variables when the input source changes, and/or which wish to
output the results of processing the just-completed input source. The
macro accepts one or two more arguments than "foreach_line". The second
argument must evaluate to a function, which itself accepts no arguments,
which will be called at the end of processing each input source. The
third argument is optional, and if present, is treated as a logical flag
which specifies whether the callback should be invoked before or after
the argument pointer is advanced. In Munger, zero (fixnum), the empty
list, and the empty string are all logical "false" values; all other
objects are considered "true". A "true" third argument causes the
callback to be invoked after the argument pointer has been advanced.
Omitting the third argument, or setting it to one of the three "false"
values, causes the callback to be invoked before the argument pointer is
advanced. Within the callback function, invoking (current) will return
the command-line argument the argument pointer is currently pointing at.
This means if you wish to access the name of the file which has just been
processed in the callback, no third argument, or a "false" third argument
should be passed to the macro, but if you wish to access the name of the
new input source about to be processed, a "true" value should be
passed as the third argument. If the currently-running script has not
been presented with any command-line arguments, then
"foreach_line_callback" will read data from the standard input, and then
invoke the callback function once after all data has been processed. The
callback function should be crafted with this possibility in mind.
The "grep.munger" example program in (libdir) demonstrates using the
third argument. The following script does not supply a third argument,
and simulates wc -l:
(next)
(set 'script (current))
(set 'count 0)
(set 'total 0)
(set 'callback
(lambda ()
(let ((name (if (eq script (current)) "stdin" (current))))
(print count " " name)
(newline)
(set 'total (+ total count))
(set 'count 0))))
(foreach_line_callback (lambda (x) (inc count)) callback)
(print total " total")
(newline)
(exit 0)
unlink: (unlink expr)
The "unlink" intrinsic removes a file from a directory. It accepts one
argument which must evaluate to a string specifying the filename to
remove. The function returns 1 upon success, or a string specifying the
error encountered upon failure.
realpath: (realpath expr)
The "realpath" intrinsic resolves all symbolic links, extra / characters,
and references to . and .. in a specified pathname. If the filename
begins with a tilde (~), the function will attempt to expand the tilde as
csh or bash does to reference a home directory. The function accepts one
argument, which must evaluate to a string, and returns a string. If a
directory component of the specified path does not exist, the empty
string is returned.
> (realpath ".")
"/usr/home/jbailie/src/munger-4.81"
> (realpath "~/foobar")
"/usr/home/jbailie/foobar"
> (realpath "/foobar/foobar")
""
; Tilde-expansion does not occur if the tilde-expression does not reference
; a valid, readable, home directory. Here the tilde is considered to be
; the first character of a relative filename, because there is no
; ~nobody home directory on my system.
> (realpath "~nobody/foobar")
"/usr/home/jbailie/src/munger-4.81/~nobody/foobar"
access: (access expr1 expr2)
The "access" intrinsic determines whether or not the real user id the
interpreter is running as has access to a specified file. The function
accepts two arguments, the first of which must evaluate to a string and
specifies the filename to check access to, while the second argument must
evaluate to a number and be one of 0, 1, or 2. A second argument of 0
instructs the function to check for read access to the specified file. A
second argument of 1 instructs the function to check for write access to
the file, and a second argument of 2 instructs the function to check for
executable access to the file. The function will return 1 to indicate
access is allowed, or 0 otherwise.
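As an illustration, a script might test for read access before redirecting standard input. This is a sketch only; the filename is taken from the earlier "with_input_file" example and the variable name is arbitrary.

```lisp
; A second argument of 0 asks about read access.
(set 'file "library.munger")
(if (access file 0)
    (with_input_file file
      (print (getline)))
    (progn
      (print file ": not readable")
      (newline)))
```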
mkdir: (mkdir expr)
The "mkdir" intrinsic accepts one argument, which must evaluate to a
string, and attempts to create a directory with the same name as the
string. Upon success, "mkdir" returns 1; otherwise it returns a string
describing the error encountered. If successful, the newly-created
directory will have permissions 755 (possibly modified by the current
umask).
rmdir: (rmdir expr)
The "rmdir" intrinsic removes a directory from the filesystem. It
accepts one argument which must evaluate to a string specifying the
directory name, and returns 1 upon success, or a string specifying the
error encountered upon failure.
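Because "mkdir", "rmdir", and "unlink" return 1 on success and a string on failure, "stringp" can be used to detect an error. A minimal sketch, in which the directory name "scratch" is hypothetical:

```lisp
; Create a directory, report any error, and remove it again on success.
(set 'result (mkdir "scratch"))
(if (stringp result)
    (progn
      (print "mkdir failed: " result)
      (newline))
    (rmdir "scratch"))
```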
words: (words)
The "words" intrinsic accepts no arguments and returns the number of
words in the buffer.
time: (time)
The "time" intrinsic accepts no arguments and returns a string
representing an integer representing the number of seconds since 00:00
Jan 1, 1970 (the UNIX epoch). It was not possible to return the integer
value directly when Munger's fixnums were 1 bit less in size than the
word-size the interpreter was running on, as time values would overflow
the 31 bits of a fixnum on 32-bit machines. This limitation has been
lifted and fixnums are now the full 32 bits on a 32-bit machine, but the
time intrinsic remains unchanged for backward compatibility. Note that
one may use the "digitize" intrinsic to convert the time value to a
fixnum.
timediff: (timediff expr1 expr2)
The "timediff" intrinsic accepts two arguments which both must evaluate
to strings specifying a time value expressed in seconds from the UNIX
epoch (00:00 Jan 1, 1970), such as returned by the "time" intrinsic, and
returns the difference in seconds between the two times, as a number. If
the time difference cannot be expressed within the word-size of the
machine, minus 1-bit, then fixnum wrap-around will occur and the returned
value will not be accurate. The "unsigned" intrinsic will return a
string representing the correct time value.
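A sketch of measuring elapsed time with "time" and "timediff" follows. The text above does not state which argument is subtracted from which, so the sign of the result shown here is an assumption to be checked against your installation.

```lisp
; Record a start time, do some work, then report the difference.
(set 'start (time))
; ... work to be timed goes here ...
(set 'elapsed (timediff (time) start))
(print elapsed " seconds elapsed")
(newline)
```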
timethen: (timethen expr)
The "timethen" intrinsic accepts one argument which must evaluate to a
fixnum and returns a string representing the current time in seconds from
the UNIX epoch (00:00 Jan 1, 1970), offset by that many seconds. This
function can be used to determine the UNIX time value for a time in the
future or the past. For example, the following code returns the UNIX
time value for a time 1 hour in the past:
> (timethen -3600)
"1124553616"
date2days: (date2days expr1 expr2 expr3)
The "date2days" intrinsic accepts three arguments, each of which must
evaluate to a fixnum, specifying a year, month, and a day of the month,
and returns a fixnum representing the number of days this date is removed
from March 1st, 1 BCE. This value can be useful in performing calendar
arithmetic. The first number indicates the year and must be greater than
or equal to 0, (1 BCE). The second number indicates the month and must
be in the range 1-12. The third number indicates the day of the month
and must be in the range 1-31, or 1-30, depending on the month specified.
days2date: (days2date expr)
The "days2date" intrinsic accepts a day number fixnum returned by
"date2days" and converts it into a list of three fixnums representing
that date. The first number indicates the year and will be greater than
or equal to 0, (1 BCE). The second number indicates the month and will
be in the range 1-12. The third number indicates the day of the month
and will be in the range 1-31.
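Day numbers make calendar arithmetic a matter of ordinary addition. The sketch below, which uses an arbitrary date, finds the date thirty days after 4 October 2002. By the calendar, the printed list should represent 3 November 2002.

```lisp
; Convert a date to a day number, add 30 days, and convert back.
(set 'day (date2days 2002 10 4))
(print (days2date (+ day 30)))
(newline)
```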
week: (week expr)
The "week" intrinsic accepts one argument which must evaluate to a
fixnum, representing a day number as returned by "date2days", and returns
a two-element list of fixnums representing the year and the week number
in that year, respectively, in which the specified day occurs.
weekday: (weekday expr)
The "weekday" intrinsic accepts one argument which must evaluate to a day number,
such as returned by "date2days", and returns a two-element list
describing the day of the week of the specified day. The first returned
element is a fixnum in the range 0-6, each value of which corresponds to
a weekday in the range Sunday-Saturday, respectively. The second element
is a string representing the name of the weekday, in English.
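For example, 4 October 2002 was a Friday, so the sketch below should print a list whose first element is 5 and whose second element is the string Friday.

```lisp
; Look up the weekday of a given date.
(print (weekday (date2days 2002 10 4)))
(newline)
```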
month: (month expr)
The "month" intrinsic accepts a fixnum argument specifying the numerical
value of a month (1-12), as used by the "date2days", "days2date", and
"localtime" intrinsics, and returns a string representing the name of the
month in English.
localtime: (localtime expr)
The "localtime" intrinsic accepts a string representing a UNIX time
value, such as returned by the "time" intrinsic, and returns a six-
element list of fixnums representing the date of the specified time in
the local timezone. The first returned element is the year. The second
is the month. The third is the day of the month. The fourth is the
hour. The fifth is the minutes. The sixth is the seconds.
utctime: (utctime expr)
The "utctime" intrinsic accepts a string representing a UNIX time value,
such as returned by the "time" intrinsic, and returns a six-element list
of fixnums representing the date of the specified time in terms of
coordinated universal time. The first returned element is the year. The
second is the month. The third is the day of the month. The fourth is
the hour. The fifth is the minutes. The sixth is the seconds.
date2time: (date2time expr1 expr2 expr3...)
The "date2time" intrinsic converts a list of fixnums representing a date
on or after midnight January 1st, 1970, and returns a fixnum representing
the number of seconds from the UNIX epoch (midnight, January 1st, 1970).
See the entry for the "time" intrinsic for more information on UNIX time
values. The function accepts three, four, five, or six arguments. The
first three arguments must be present, and represent the year (1970-),
the month (1-12), and the day of the month (1-31), respectively. The
fourth, fifth, and sixth optional arguments represent the hour, minute,
and seconds values of the date, respectively. Omitted optional arguments
default to 0. The presence of an optional argument implies the presence
of all preceding optional arguments. This means if you include a value
for minutes, you must include a value for hours as well, and if you
include a value for seconds, you must include values for hours and
minutes as well.
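The following invocations sketch the rule for optional arguments. The date is arbitrary, and the text above does not say whether the date is interpreted as local time or UTC, so no result values are shown.

```lisp
(date2time 2002 10 4)          ; hour, minute, and seconds default to 0
(date2time 2002 10 4 12)       ; 12:00:00
(date2time 2002 10 4 12 6)     ; 12:06:00
(date2time 2002 10 4 12 6 11)  ; 12:06:11
```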
random: (random expr)
The "random" intrinsic accepts one argument which must evaluate to a
positive number, and returns a random integer where 0 < returned-value <
evaluated-argument. The interpreter calls srandomdev() at startup to
initialize the random number generator, and random() to generate random
numbers. Random numbers are then scaled to be within the requested range
by computing: evaluated-argument * random-number / RAND_MAX.
date: (date [expr [expr ]])
The "date" intrinsic accepts two optional arguments and returns a string
representing the current date and time. The two optional arguments, if
present, must evaluate to fixnums. Invoking "date" with no arguments is
the same as invoking it with a first argument of 0. The second argument
is only significant when the first argument is non-zero. When invoked
with no arguments, or one argument of 0, the function returns a
representation of the time and date expressed in terms of the local
timezone. When invoked with a non-zero first argument, the function
returns a representation of the time and date expressed as coordinated
universal time.
If you are running FreeBSD 4.x, then the timezone in this case will be
represented by the three-letter abbreviation GMT (Greenwich Mean Time),
but on FreeBSD 5.x and later versions the timezone will be represented by
the three-letter abbreviation UTC (Coordinated Universal Time). The
"date" intrinsic was designed to return a properly
formatted date string per HTTP/1.1 if invoked with a non-zero first
argument, but this standard specifies GMT must be used for the timezone
indicator and not UTC, therefore the second optional argument may be
present, and if non-zero changes the UTC abbreviation to GMT.
> (date)
"Fri, 04 Oct 2002 12:06:11 EDT"
> (date 1)
"Fri, 04 Oct 2002 16:06:13 UTC"
> (date 1 1)
"Fri, 04 Oct 2002 16:06:16 GMT"
datethen: (datethen expr1 [expr2 [expr3 ]])
The "datethen" intrinsic may be used to produce human-readable date
strings from UNIX time values. The function accepts one to three
arguments. The first must evaluate to a string specifying a UNIX time
value, such as those generated by the "time" and "timethen"
intrinsics. The optional second and third arguments must be one of
either fixnum 0 or fixnum 1. The function returns a string describing
the time argument in human readable form, formatted identically to the
strings produced by the "date" intrinsic. If the second argument is
present and is 1, the returned time string will be expressed in UTC,
otherwise the time string will be expressed relative to the local time
zone. If the third argument is present and evaluates to 1, then the
timezone abbreviation will be GMT instead of UTC.
Invoking (datethen (time)) is equivalent to invoking (date), while
invoking (datethen (time) 1) is equivalent to invoking (date 1).
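Combined with "timethen", this allows past or future times to be formatted. A brief sketch, with arbitrary offsets:

```lisp
; One hour ago, expressed in the local timezone.
(print (datethen (timethen -3600)))
(newline)
; This time tomorrow, expressed in UTC but labelled GMT.
(print (datethen (timethen 86400) 1 1))
(newline)
```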
print: (print expr...)
println: (println expr...)
Intrinsic "print" accepts one or more arguments, evaluates them, and
prints each evaluated argument, in order, to the standard output
stream. If any of the arguments evaluate to strings, the contents of the
strings are printed without the surrounding quotes. If any of the
arguments evaluate to lists, then any strings in those lists will be
printed with their surrounding quotes. The "print" intrinsic always
returns 1.
Intrinsic "println" functions similarly to "print" but it outputs a
single newline character after printing its arguments.
> (print "hello there")
hello there
1
> (set 'f '(a b c))
(a b c)
1
> (print f)
(a b c)
> (print 'hello)
hello1
> (progn
>> (print 'hello)
>> (newline))
hello
1
The "newline" intrinsic outputs a newline character (ASCII 10) to the
standard output.
load: (load expr)
Intrinsic "load" reads lisp from the file specified by its lone argument,
which must evaluate to a string. The function returns the value
resulting from evaluating the last expression in the file. Any errors
encountered opening the file, or evaluating its content will stop
evaluation.
getline_ub: (getline_ub [expr])
The "getline_ub" (getline unbuffered) intrinsic is intended for use when the user
wants to read lines of text from stdin, but subsequently "fork" or
"forkpipe" the interpreter and "exec" or "shexec" a new program to
continue reading lines from stdin. The "getline" intrinsic does its own
input buffering, and if used in such a situation, would result in
buffered data in the parent being lost to the child. Note that if the
child does not call "exec" or "shexec" this is not true, as the child
will have its own copy of the buffered data.
The "getline_ub" intrinsic reads a line from standard input and returns
it as a string, including the terminating newline. If it does not
encounter a newline after having read 2048 characters, the 2048
characters will be returned without a terminating newline. If EOF is
encountered before a newline is read, the returned string will contain
all the remaining data in the input stream without a terminating newline.
If no characters remain in the input stream the function returns 0. If
the read() system call fails the function also returns 0. The function
accepts one optional argument, which if present, must evaluate to a
fixnum.
The lone optional argument is a timeout value, useful when stdin is
connected to a socket. "getline_ub" reads data a character at a time
from stdin until it reads a newline. If any invocation of the read()
system call blocks for a longer number of seconds than that specified by
the timeout value, it will be interrupted and "getline_ub" will return
any characters successfully read before the timeout, not terminated by a
newline character. If no characters were read before the timeout, then
the empty string will be returned. This is the only circumstance in
which the empty string will be returned. If invoked without a timeout
value, "getline_ub" will block until at least one character can be read
from stdin or EOF is encountered, and return either a non-empty string,
or 0 on EOF. Therefore, if stdin has been connected to a socket via
"accept", a return value of "" from (getline_ub 5) indicates a timeout,
while a return value of 0 indicates EOF. The timeout argument may be
omitted or it may be set to 0, to allow read() to block indefinitely.
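Since both the empty string and 0 are logical "false" values in Munger, the two cases cannot be told apart by a simple truth test. The sketch below uses "stringp" to separate them. It assumes "eq" compares strings as it does in the wc -l script shown earlier, and the five-second timeout is arbitrary.

```lisp
; Distinguish a timeout ("") from end of input (0).
(set 'line (getline_ub 5))
(if (stringp line)
    (if (eq line "")
        (println "timed out")
        (print line))
    (println "end of input"))
```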
When reading from a terminal device in canonical mode, carriage returns
will be converted into newlines by the terminal driver, and be returned
as newlines to "getline_ub". When reading from other sources, or from a
terminal in non-canonical mode, the carriage returns will be passed
through untranslated. Note that if reading from a terminal device in
canonical mode with a timeout, the empty string will always be returned
when a timeout occurs, as the terminal driver will not return any
characters to the interpreter until a carriage return or newline is
input.
getline: (getline [expr1 [expr2]])
Intrinsic "getline" reads a line from standard input and returns it as a
string, including the terminating newline. If it does not encounter a
newline before reading 2048 characters, it will return the 2048
characters without a terminating newline. If the end of stdin is reached
while searching for the next newline, all the remaining characters in the
stream, if any, will be returned as a string without a terminating
newline. If no characters remain in the stream, "getline" returns fixnum
0. Any subsequent invocations of "getline" on the same input source will
continue to return fixnum 0. The function also returns 0 when it
encounters any error condition.
"getline" does its own input buffering to make line-by-line reading of
data more efficient, using a 100K buffer. The function should not be
used when the user wishes to read some lines, then fork and exec another
program to continue reading from the inherited stdin, as the data already
read into the parent's buffer will not be available to the child process
after an "exec", so data will be lost. In this situation, the unbuffered
version, "getline_ub" must be used instead, or the unbuffered "getchars"
intrinsic.
IF THE STANDARD INPUT IS NOT A TERMINAL DEVICE, BOTH ARGUMENTS ARE
IGNORED AND THE BEHAVIOR IN THE REST OF THIS DESCRIPTION DOES NOT OCCUR.
If standard input is a terminal device, then the terminal is put into
"raw" mode, with the interpreter simulating the standard UNIX terminal
line discipline (ctrl-h => backspace; ctrl-w => werase; ctrl-u => kill).
If ctrl-d (EOF) is encountered as the first character of input, "getline"
returns the fixnum zero, otherwise EOF is ignored, and getline continues
to accumulate characters until either a carriage-return or a newline is
input. Carriage returns are converted into newlines, so the strings
returned by "getline" when reading from a terminal device are always
newline-terminated. The function drops the cursor to the last line of
the terminal device and echoes input there. The function will also
automatically perform horizontal scrolling if the input line grows to the
width of the terminal device, scrolling in increments of the terminal
width - 1 character. The last column of the last line of the screen is
not used because some terminals will automatically scroll up a line if a
character is printed there.
The function accepts two optional arguments, the first of which, if
present, must evaluate to a string, and is printed as a prompt to the
user. If the second optional argument is also present, it must evaluate
to a fixnum and specifies the location of tabstops in the onscreen echoed
string. For example a value of 4, will cause tabstops to appear to the
user to be set every four characters. If the second argument is not
present, tabstops default to every 8 characters. Pass the empty string
as the first argument if you wish to submit a second argument to the
function, but do not wish it to print a prompt.
If the second argument is 0, then the input of a tab character will
trigger filename completion. If the second argument is -1, then tab will
trigger command and/or filename completion, depending upon the state of
the input line when the tab is received. If the second argument is -2,
then tab will trigger filename completion, but the filename completion
mechanism will not work recursively. If the second argument is -3, then
tab will trigger command and/or filename completion, but the filename
completion mechanism will not work recursively. Note that in these
cases, tabs cannot be entered by the user. For the cases where the
filename completion mechanism does not work recursively (second argument
of -2 or -3), it will attempt to complete one level of the path provided to
it. If completion is invoked again at this point, it will complete
another level, and so on. For the cases where the filename completion
works recursively, it will complete as much of the path given to it, as
is possible.
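The following invocations sketch how the two optional arguments combine. They have an effect only when standard input is a terminal device, and the prompt strings and variable names are illustrative.

```lisp
(set 'text (getline "> " 4))   ; prompt, tabstops every 4 characters
(set 'file (getline "" 0))     ; no prompt, tab completes filenames
(set 'cmd (getline "$ " -1))   ; tab completes commands and filenames
(set 'step (getline "$ " -3))  ; as above, one path level per tab
```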
When command and filename completion have been requested, and the text
entered so far consists of only non-whitespace characters, then command
completion will be attempted, otherwise filename completion will be
attempted upon the last contiguous segment of non-whitespace characters.
There are two exceptions to this rule. If the first character input is
either '/' or '.' then filename completion will be attempted in the
initial position instead of command completion.
Completion will work with commands or filenames which contain whitespace
only if the portion before the cursor does not contain whitespace.
Quoting mechanisms, such as those provided by shell programs, are not
available. The "getline" function will append a single space character
to the input string after a completion has succeeded
and the completion was unambiguous.
When completions are requested, each string created with "getline" is
placed onto a 500-line history list and may be recalled, edited and
re-entered. Pressing Control-P or Control-N while inputting text via
"getline" will cause the current text to be replaced with the previous or
the next item on the history list, respectively. While in the history
list one may quickly move back to the input line by invoking Control-X.
The history list may be read from or written to a file with the
"load_history" and "save_history" intrinsics.
When completions are requested, pressing Control-R or Control-S causes
the interpreter to enter history search mode. Square brackets appear
above the input line. Text typed by the user appears inside the brackets
and causes the interpreter to search forward (C-s) or backward (C-r) in
the history list for a line containing the bracketed text. Pressing C-s
or C-r again causes the interpreter to search forward or backward for
another matching line. Both Control-R and Control-S wrap around the ends
of the history list. Control-H and Backspace erase the last typed
character inside the brackets.
The history can be cleared with the "reset_history" intrinsic.
When completions are requested, these command-line editing commands are
recognized by getline, in addition to C-h, C-w, and C-u:
M-f - Moves the cursor to the beginning of the next word in the line.
M-b - Moves the cursor to the beginning of the previous word in the line.
C-k - Deletes the text from the cursor location to the end of the line.
C-d - Deletes the character the cursor is on.
M-d - Deletes the word or word-portion the cursor is before.
C-a - Moves the cursor to the beginning of the line.
C-e - Moves the cursor to the end of the line.
C-y - pastes the last deletion into the line before the cursor.
Additionally, C-u will not delete the entire line when completions are
requested, but only the portion before the cursor.
reset_history: (reset_history)
The "reset_history" intrinsic clears the items on the history list the
interpreter maintains for the "getline" intrinsic. The function accepts
no arguments and always returns 1.
load_history: (load_history expr)
save_history: (save_history expr)
The "load_history" and "save_history" intrinsics read and write the
history list maintained by the "getline" intrinsic from and to files.
Both functions accept a single argument that must evaluate to a string
specifying the filename. In both functions, if the open() system call
encounters an error, a string will be returned. In the "save_history"
intrinsic, if the read() system call encounters an error a string will be
returned. Otherwise, both functions return the number of lines read or
written.
rescan_path: (rescan_path)
The "rescan_path" intrinsic causes the interpreter to re-build its
internal list of executable files from the directories defined by the
PATH environment variable. This list is used by "getline" and by
"command_lookup". If the programmer wishes those two intrinsics to
continue to find all executables, it is necessary to invoke "rescan_path"
if the PATH environment variable has changed, or if new executables have
been added to the directories specified by that environment variable,
since the last invocation of either "getline" or "command_lookup". The
function always returns 1.
stringp: (stringp expr)
Intrinsic "stringp" returns 1 if its lone argument evaluates to a string,
otherwise it returns 0.
> (stringp "0")
1
> (stringp 0)
0
fixnump: (fixnump expr)
Intrinsic "fixnump" returns 1 if its lone argument evaluates to a number,
otherwise, it returns 0.
> (fixnump 0)
1
> (fixnump "0")
0
symbolp: (symbolp expr)
Intrinsic "symbolp" returns 1 if its lone argument evaluates to a symbol,
otherwise, it returns 0.
regexpp: (regexpp expr)
Intrinsic "regexpp" returns 1 if its lone argument evaluates to a
compiled regular expression object, otherwise, it returns 0.
tablep: (tablep expr)
Intrinsic "tablep" returns 1 if its lone argument evaluates to a table,
otherwise, it returns 0.
stackp: (stackp expr)
Intrinsic "stackp" returns 1 if its lone argument evaluates to a stack,
otherwise, it returns 0.
intrinsicp: (intrinsicp expr)
Intrinsic "intrinsicp" returns 1 if its lone argument evaluates to an
intrinsic function, otherwise, it returns 0.
closurep: (closurep expr)
Intrinsic "closurep" returns 1 if its lone argument evaluates to a
closure, otherwise, it returns 0.
macrop: (macrop expr)
Intrinsic "macrop" returns 1 if its lone argument evaluates to a macro
closure, otherwise, it returns 0.
recordp: (recordp expr)
Intrinsic "recordp" returns 1 if its lone argument evaluates to a record,
otherwise it returns 0.
sqlitep: (sqlitep expr)
Intrinsic "sqlitep" returns 1 if its lone argument evaluates to a sqlite
database object, otherwise, it returns 0.
regcomp: (regcomp expr1 [expr2 [expr3]])
The "regcomp" intrinsic accepts one, two, or three arguments, the first
of which must evaluate to a string, and which is interpreted as a regular
expression to be compiled into a compiled regular expression object. The
function returns the compiled regular expression object if the
compilation is successful, or a string containing an error message,
otherwise. Regular expression objects are constants which evaluate to
themselves.
The "regcomp" intrinsic also accepts two optional arguments after the
first. If the optional second argument is present, it must evaluate to a
fixnum, and is interpreted as a boolean. If non-zero, it causes
"regcomp" to produce a compiled regular expression which will match text
case-insensitively. Without the second argument, "regcomp" defaults to
case-sensitive matching. If the third optional argument is present, it
must evaluate to a fixnum, and is interpreted as a boolean. If non-zero,
it prevents "regcomp" from recognizing regular expression operators in
the first argument. The resulting compiled expression will match the
literal string value of the first argument.
The "regcomp" intrinsic interprets the two-character escape-sequences \t,
and \b, to represent the tab and space characters respectively, allowing
one to specify regular expressions containing whitespace, without using
whitespace. This can sometimes be convenient. The carriage return and
newline characters can also be specified with escape sequences: \r, and
\n, respectively. The zero width assertions \< and \> match the null
string at the beginning and end of a word, respectively. All the
regular-expression operators may be escaped with a backslash to include
the operator as a literal character in the expression.
Unrecognized escape sequences will be replaced with the escaped
character. The backslash will be consumed. All escape sequences are
ignored and left unchanged in the expression if a non-zero third argument
has been passed to "regcomp".
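The following sketch illustrates the two optional arguments. The symbol
names "ci" and "lit" are arbitrary, and the results in the comments are
what the description above implies, not captured interpreter output.

```lisp
(set 'ci  (regcomp "foobar" 1))   ; second argument non-zero: ignore case
(set 'lit (regcomp "a.c" 0 1))    ; third argument non-zero: no operators

(match ci "---FOOBAR---")   ; case is ignored, so this should yield (3 9)
(match lit "abc")           ; "." is a literal dot here, so no match: ()
(match lit "xa.c")          ; the literal text "a.c" is found: (1 4)
```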
substitute: (substitute expr1 expr2 expr3 [expr4])
Intrinsic "substitute" performs search and replace operations on strings
using regular expressions to specify the matches. The function uses a
template string to specify the replacement text, in a manner similar to
that of the substitute commands of ed and sed, where the template may
refer to the text of the match, and the text matched by parenthesized
subexpressions. There is also a "replace" library function available,
described elsewhere in this document, which allows the replacement
expression to be a snippet of lisp, which is evaluated upon each match to
dynamically generate the replacement text.
The "substitute" function takes three or four arguments, and returns a
new string, incorporating the appropriate replacements. The first
argument must evaluate to a compiled regular expression. See the
"regcomp" intrinsic for more information on compiling regular
expressions. The second and third arguments must evaluate to strings,
and are interpreted as the replacement string, and the string to search
for matches in, respectively. The fourth optional argument, if present,
must evaluate to a number, which indicates the number of matches which
should be replaced. A value of 0 indicates the substitution should
replace all the matches of the pattern in the text of the third argument.
The absence of the fourth argument is the same as having a fourth
argument of 1, which is to say, only the first match will be replaced.
The text matched by the first ten subexpressions in the regular
expression may be inserted into the replacement string by including an
escape sequence of the form \[0-9] in the replacement text where one
wishes the matched text to appear. The first subexpression is referred to
by \1 and the tenth subexpression is referred to by \0. The text matched
by the entire regular expression may be inserted into the replacement
text by inserting \& into the replacement string at the desired location.
In the replacement string unrecognized escape sequences are replaced with
the unescaped character. To escape the backslash, to cause a literal
backslash to appear in the substitution, a total of four backslashes are
necessary in the replacement string. One level of escaping will be
removed by the lisp string parser, leaving two backslashes for
"substitute" to see, the first one escaping the second. Note that
Munger's string parser will not remove unrecognized escape sequences, so
that this double level of escaping is only necessary when the backslash
itself is being escaped.
> (substitute (regcomp "string") "booger" "string string string string" 2)
"booger booger string string"
> (substitute (regcomp "[a-z]+") "{\&}" "one two three" 0)
"{one} {two} {three}"
> (substitute (regcomp "([a-zA-Z]+) ([a-zA-Z]+)") "\2 \1"
>> "is This sentence a." 0)
"This is a sentence."
The "substitute" intrinsic interprets the two-character escape-sequences
\t, and \b, to represent the tab and space characters respectively,
allowing one to specify whitespace in the replacement string, without
using whitespace. This can sometimes be convenient. Five other escape
sequences may be embedded in the replacement string to control the case
of portions of the returned string. These escape sequences work for
strings of ASCII alphanumeric characters, only:
\U Turns on conversion to uppercase.
\u Turns on conversion to uppercase for the next character only.
\L Turns on conversion to lowercase.
\l Turns on conversion to lowercase for the next character only.
\e Turns off \U or \L.
> (substitute (regcomp "foo") "\U\&\e" "foobar")
"FOObar"
The effects of \U and \L extend beyond the replacement string if they are
not terminated:
> (substitute (regcomp "foo") "\U\&" "foobar")
"FOOBAR"
\U and \L override each other:
> (substitute (regcomp "foo") "\U\&\L" "foobar FUNBAG")
"FOObar funbag"
match: (match expr1 expr2)
The "match" intrinsic matches a regular expression against a string, and
returns a two element list if the regular expression finds a match within
the string, or the empty list otherwise. The function accepts two
arguments, the first of which must evaluate to a compiled regular
expression object, while the second must evaluate to the string to match
the regular expression against. The two-element list returned upon
success, consists of the character indices of the starting and ending
locations of the matched text. The second index is the start of the text
following the match. These two indices may be used in conjunction with
the "substring" intrinsic, to extract the text before, after, or
containing the match:
> (set 'rx (regcomp "foobar"))
<REGEXP#3>
> (set 's "---foobar---")
"---foobar---"
> (match rx s)
(3 9)
> (substring s 0 3) ; text before the match
"---"
> (substring s 3 (- 9 3)) ; text of the match
"foobar"
> (substring s 9 0) ; text after the match
"---"
Note that if the first returned index were 0, the first invocation of
"substring" above would have returned the whole string, because passing a
third length argument of 0 to "substring" means "to the end of the
string." This should not be a problem, since a first returned index of 0
indicates there is no text before the match. The programmer can check
for this situation before invoking "substring".
The "matches" intrinsic, described directly below, can be used to extract
the text of a match and any matching parenthesized subexpressions; while
"match" itself, is intended to be used in situations where it is only
necessary to know whether a match occurred or not, or if the text not
matched needs to be accessed, or if the location of the match is
required.
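A minimal sketch of the check suggested above for a match at the very
start of the string. The symbol names are arbitrary, and the sketch
assumes the match succeeds; a real program would first test that "m" is
not the empty list.

```lisp
(set 'rx (regcomp "foobar"))
(set 's "foobar---")
(set 'm (match rx s))        ; (0 6) is expected for this input

; A start index of 0 means nothing precedes the match, and
; (substring s 0 0) would return all of s, so test for it first.
(set 'before (if (eq (car m) 0)
                 ""
                 (substring s 0 (car m))))

(set 'after (substring s (cadr m) 0))   ; text following the match
```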
matches: (matches expr1 expr2)
The "matches" intrinsic accepts the same arguments as the "match"
intrinsic, but returns a list, which if non-empty, describes the text
matched.
If no match is found, the list will be empty. Otherwise, a list of
twenty strings will be returned. The first string of the list will be
the text matched by the entire regular expression, while the subsequent
elements of the list will be the text matched by the first nineteen
parenthesized subexpressions in the regular expression. If nineteen
subexpressions are not present in the regular expression, empty strings
will be returned for the missing subexpressions. If a subexpression
fails to match, an empty string will be returned in the corresponding
position of the list.
> (matches (regcomp "[0-9]+") "I have 22 figurines.")
("22" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "")
> (matches (regcomp "([0-9])+") "12345")
("12345" "5" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "")
> (matches (regcomp "([0-9]+)") "12345")
("12345" "12345" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ")
If one is using a regular expression with a lot of parenthesized
subexpressions, it may be more convenient to assign the results of
invoking "matches" to a stack. The final 1 in the example below is
the return value of the "print" intrinsic.
> (set 'rx (regcomp "(.)(.)(.)(.)(.*)"))
<REGEX1>
> (when (used (set 's (assign (stack) (matches rx "fooobar"))))
>> (for (a 0 19)
>>> (print (index s a) ":")))
fooobar:f:o:o:o:bar:::::::::::::::1
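When only a few subexpressions are of interest, the list returned by
"matches" can be indexed directly with "nth". This is a sketch; the
pattern and input are made up for illustration, and the comments name
the text each call should print according to the description above.

```lisp
(set 'rx (regcomp "([a-z]+)=([0-9]+)"))
(set 'm (matches rx "width=80"))

(when m                    ; the empty list is false, guarding a failed match
  (println (nth m 0))      ; text of the whole match: width=80
  (println (nth m 1))      ; first subexpression: width
  (println (nth m 2)))     ; second subexpression: 80
```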
split: (split expr1 expr2 [expr3])
The "split" intrinsic is used to break up a string into substrings at
delimiter characters. It accepts either two or three arguments. Both of
the first two arguments must evaluate to strings. The first is
interpreted as a set of delimiter characters which specify where the
second argument should be split. Where a character from the delimiter
string occurs in the second argument, it is consumed, breaking up the
second argument into pieces. The pieces are returned as a list of
strings. The optional third argument must evaluate to a number, and
specifies a limit on the number of pieces the second argument is to be
split into. If the first argument is the empty string, then the second
argument is split up character-by-character. The original string is
unchanged. The "split" intrinsic will return empty strings for empty
fields in the original string. Delimiter characters occurring adjacent
to each other, or at the very start or very end of the second argument,
will be recognized as empty fields. If none of the delimiter characters
can be found in the string, then the returned list will contain only the
original string.
> (split " " "Lisp is an applicative language.")
("Lisp" "is" "an" "applicative" "language.")
> (split " " "Lisp is an applicative language." 3)
("Lisp" "is" "an applicative language.")
> (split "" "Lisp is an applicative language" 3)
("L" "i" "sp is an applicative language")
> (split ":" ":a:b:c:")
("" "a" "b" "c" "")
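A few further cases, sketched from the rules stated above rather than
copied from an interpreter session:

```lisp
(split ",;" "a,b;c")   ; any character of the set splits: ("a" "b" "c")
(split "," "a,,b")     ; adjacent delimiters give an empty field: ("a" "" "b")
(split "," "abc")      ; no delimiter found: ("abc")
```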
split_rx: (split_rx expr1 expr2 [expr3])
The "split_rx" library function is used to break up a string into
substrings using a regular expression to specify delimiter substrings.
It works analogously to the "split" intrinsic, except the first argument
must evaluate to a compiled regular expression object. As with "split",
the second argument must evaluate to a string, and the third argument, if
present, limits the number of pieces the second argument will be broken into.
The supplied regular expression is matched against the second argument,
then the text of matches is consumed to split the second argument into
pieces, which are gathered into a list and returned. The function
returns a list containing only the original second argument, if no match
on the regular expression can be found. Empty fields cannot be detected
by split_rx.
> (setq rx (regcomp "[\b\t]+"))
<REGEXP#34>
> (split_rx rx "foobar tooley marzipan loopy ")
("foobar" "tooley" "marzipan" "loopy")
rootname: (rootname expr)
The "rootname" library function accepts one argument which must evaluate
to a string, and returns the portion of that string, without the last
trailing filename suffix. If the argument does not have a suffix the
original string is returned.
> (rootname "foobar.tar.gz")
"foobar.tar"
> (rootname (rootname "foobar.tar.gz"))
"foobar"
suffix: (suffix expr)
The "suffix" library function accepts one argument which must evaluate to
a string and returns the last filename suffix contained in the string.
If the string does not end with a filename suffix, the original string is
returned.
> (suffix "foobar.tar.gz")
".gz"
> (suffix (rootname "foobar.tar.gz"))
".tar"
tokenize: (tokenize expr)
The "tokenize" library function accepts one argument which must evaluate
to a string, and returns a list consisting of all the tokens embedded in
the string composed of contiguous non-whitespace characters. The
function returns a list containing only the original argument string if
it contains no whitespace.
> (tokenize " foobar tooley hot-cha-cha!")
("foobar" "tooley" "hot-cha-cha!")
replace: (replace expr1 expr2 expr3 [expr4])
The "replace" library macro behaves similarly to the "substitute"
intrinsic, except that the replacement expression will be executed once
for every match in the string. This allows the replacement string to be
dynamically created for each match. The function accepts three or four
arguments. The first argument must evaluate to a compiled regular
expression object. The second argument is the expression to create the
replacement string and is evaluated in a lexical context where the
symbols m0...m19 are bound to the text of the full match, and the text
matched by the first nineteen parenthesized subexpressions, respectively.
Note that the resulting string this argument produces when evaluated will
undergo some escape-processing, because it is ultimately used as the
replacement pattern for a call to the "substitute" intrinsic. See the
entry describing that function, elsewhere in this document, for details
on the escape sequences recognized. Note also, one can simply pass a
string as the second argument, but in that case it is much more efficient
to use "substitute" directly. The third argument must evaluate to the
string the regular expression is to be matched against. The fourth
optional argument, if present, is a repeat count, and can be used to limit
the number of replacements made in the string. Unlike the "substitute"
intrinsic, "replace" interprets a repeat count of 0 as a request to
inhibit any substitutions from being performed. If the repeat count is
not present, all matches in the string will undergo replacement. If no
match is found, the function returns the original string, otherwise it
returns a new string containing the specified substitutions.
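Because the repeat count means different things to "substitute" and
"replace", a side-by-side sketch may help. The input string and the
bracketed replacement are invented for illustration; the results in the
comments follow from the two descriptions and are not captured output.

```lisp
(set 'rx (regcomp "[0-9]+"))
(set 's "1 22 333")

(substitute rx "N" s)     ; no count: first match only -> "N 22 333"
(substitute rx "N" s 0)   ; 0 means every match        -> "N N N"

(replace rx (concat "[" m0 "]") s)     ; no count: every match -> "[1] [22] [333]"
(replace rx (concat "[" m0 "]") s 1)   ; one replacement only  -> "[1] 22 333"
(replace rx (concat "[" m0 "]") s 0)   ; 0 inhibits replacement -> "1 22 333"
```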
The following example uses "replace" to decode x-www-form-url-encoded
characters in a specified string.
> (setq str "foobar+tooley%7Efunbag%24%25%26")
"foobar+tooley%7Efunbag%24%25%26"
> (setq r0 (regcomp "%([0-9A-Fa-f][0-9A-Fa-f])"))
<REGEXP#10>
> (setq r1 (regcomp "\+"))
<REGEXP#11>
; Decode encoded spaces.
> (setq str (substitute r1 " " str 0))
"foobar tooley%7Efunbag%24%25%26"
; Decode hex encodings.
> (setq str
>> (replace r0
>>> (char (hex2dec m1))
>>> str))
"foobar tooley~funbag$%&"
concat: (concat expr ...)
The "concat" macro is used to concatenate strings. The macro accepts one
or more arguments which must all evaluate to strings, or to arbitrarily
nested lists containing only strings, and returns a single string
consisting of all the argument strings concatenated.
> (concat "foo" "bar")
"foobar"
> (concat '("foo" "bar"))
"foobar"
explode: (explode expr)
The "explode" macro accepts one argument which must evaluate to a string
and returns a list of one-character strings, corresponding to each
character in the original string.
> (explode "foobar")
("f" "o" "o" "b" "a" "r")
join: (join expr1 expr2 expr3...)
The "join" intrinsic is used to concatenate strings with delimiters. It
accepts three or more arguments, which must all evaluate to strings, or
to arbitrarily-nested lists containing only strings. The first argument
is sandwiched between the strings of the remaining arguments to make a
new string, which is returned.
> (join " " "This" "will" "be" "a" "sentence.")
"This will be a sentence."
> (join "" "this" "will" "be" "run" "together")
"thiswillberuntogether"
> (join ":" '("a" ("b" ("c") "d") "e"))
"a:b:c:d:e"
length: (length expr)
The "length" intrinsic accepts one argument which must evaluate to a
string, a record, a stack, a table, or a list, and returns the number of
characters in the string, the number of key/value pairs in the table, or
the number of elements in the list.
> (length "fiver")
5
> (length '(a b c))
3
sort: (sort [expr...])
The "sort" intrinsic accepts any number of arguments and sorts them in
ascending order if they all evaluate to either strings or numbers.
Invoking "sort" without arguments returns the empty list. When sorting
strings, case is ignored.
> (sort "D" "c" "B" "a")
("a" "B" "c" "D")
> (sort 3 2 1)
(1 2 3)
sortlist: (sortlist expr)
The "sortlist" intrinsic accepts one argument which must evaluate to a
list whose elements must be either all numbers or all strings. The
function returns a new list whose members are the members of the original
list sorted into ascending order. When sorting strings, case is ignored.
> (sortlist '("c" "b" "a"))
("a" "b" "c")
> (sortlist '(0 -23 1))
(-23 0 1)
sortcar: (sortcar expr)
The "sortcar" intrinsic accepts one argument which must evaluate to a
list and whose elements must all be lists themselves, whose first
elements must be either all strings or all numbers. The function returns
a new list containing the elements of the original list, sorted using the
first elements of the sublists as the sort key. An example will make the
meaning of this convoluted description clear:
> (sortcar '(("Z" f f) ("B" f f) ("C" f f)))
(("B" f f) ("C" f f) ("Z" f f))
"sortcar" is useful for sorting aggregate types. The following example
sorts a stack-of-stacks, using the second element of each substack as the
sort key. Perl users may recognize this as an example of an idiom that
community has come to call "The Schwartzian Transform," but of course,
this sort of thing is pure lisp.
> (set 's (stack 3))
<STACK1>
> (store s 0 (assign (stack) '(f 2 f)))
<STACK2>
> (store s 1 (assign (stack) '(f 1 f)))
<STACK3>
> (store s 2 (assign (stack) '(f 0 f)))
<STACK4>
> (flatten s)
(<STACK2> <STACK3> <STACK4>)
> (assign s (mapcar (lambda (x) (cadr x))
(sortcar (mapcar (lambda (x) (list (index x 1) x))
(flatten s)))))
<STACK1>
> (flatten s)
(<STACK4> <STACK3> <STACK2>)
until: (until expr1 [expr2 ...])
while: (while expr1 [expr2 ...])
The "while" intrinsic is a looping construct. It accepts one or more
arguments, the first of which is the test condition. If it evaluates to
a true value (anything other than 0 (fixnum), the empty string, or the
empty list), the rest of the arguments are evaluated in order. This
process is repeated until the first argument no longer evaluates to a
true value, when the loop stops without evaluation of the other
arguments. Therefore a while-loop will execute zero or more times. The
"while" intrinsic returns the value of the failed test expression.
The "until" intrinsic has similar behavior to the "while" intrinsic,
except the logic of the test is inverted, which is to say the body of an
"until" loop is executed as long as the test condition evaluates to a
false value, or, phrased another way, until the test condition evaluates
to a true value.
> (setq n 10)
10
> (while n
>>> (print n)
>>> (newline)
>>> (setq n (- n 1)))
10 9 8 7 6 5 4 3 2 1
1
> (setq n 0)
0
> (setq l ())
()
> (until (eq n 10)
>> (setq l (cons (inc n) l)))
1
> l
(10 9 8 7 6 5 4 3 2 1)
The 1 printed after each loop above is the return value of the loop
expression.
do: (do expr1 [expr2 ...])
The "do" intrinsic introduces a looping construct. It accepts one or
more arguments. All the arguments are evaluated in order. If the last
expression evaluates to a true value (anything other than 0, the empty
string, or the empty list), then all the argument expressions are
evaluated again. This process repeats until the last expression
evaluates to a "false" value, when the loop exits without further
evaluation. A do-loop will therefore evaluate its argument expressions
one or more times. The "do" intrinsic returns the value of the failed
test expression.
> (set 'n 0)
0
> (do
>> (print n)
>> (newline)
>> (set 'n (+ n 1))
>> (< n 10))
0 1 2 3 4 5 6 7 8 9
1
>
The final 1 here is the return value of the "do" expression.
fatal: (fatal)
nofatal: (nofatal)
The "fatal" and "nofatal" intrinsics accept no arguments, and determine
if errors encountered during evaluation will cause the interpreter to
exit. When the interpreter starts it is in the "nofatal" state. If an
error is encountered in the "nofatal" state, the stack will unwind back
to the toplevel interpreter prompt. After "fatal" has been invoked, all
errors which stop evaluation will additionally cause the interpreter to
exit. Invoking "die" when in the "fatal" state will also cause the
interpreter to exit upon return to toplevel.
die: (die [expr...])
Intrinsic "die" forces the interpreter to stop evaluating any lisp it may
be evaluating and return to the toplevel. The optional arguments are
passed to the "warn" intrinsic. Invoking "die" when "fatal" has been
invoked will cause the interpreter to exit, upon return to toplevel.
throw: (throw expr)
catch: (catch [expr1]...)
The "catch" intrinsic together with the "throw" intrinsic provide a means
of performing non-local exits. The arguments to catch are evaluated in
an implicit "progn" and the result of the evaluation of the last
expression is returned. If, however, a "throw" expression is encountered
while evaluating any of the arguments to "catch" then the thread of
execution returns to the catch expression. The "catch" form immediately
returns, with the result of evaluating the argument to the "throw"
expression becoming the return value of the "catch" expression. A
"throw" is caught by its enclosing "catch" expression, if there is one,
or if there is no enclosing "catch" expression, the interpreter returns
to the toplevel, just as if the "die" intrinsic had been invoked. If the
"fatal" intrinsic has been previously invoked, an uncaught "throw" will
cause the interpreter to exit.
> (catch
>> (print (catch 0 (throw 'hello) 2 3 4))
>> (newline)
>> (throw 'goodbye)
>> (print 'not_reached))
hello
goodbye
>
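A common use of this pair is leaving a loop early. The sketch below
searches a list for the first element greater than 10; the list and the
symbol "none" are illustrative only.

```lisp
(set 'l '(3 8 15 4))

(catch
  (for (i 0 (- (length l) 1))
    (when (> (nth l i) 10)
      (throw (nth l i))))   ; leaves the loop; "catch" returns the element
  'none)                    ; reached only if nothing was thrown
```

For the list shown, the "catch" expression should return 15. With a list
containing no element greater than 10, the loop runs to completion and
the final expression supplies the value none.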
stringify: (stringify expr...)
The "stringify" intrinsic accepts one or more arguments which must all
evaluate to some sort of atom, and converts each atom's print syntax into
a string. Then all the strings are concatenated into one string, which
is returned.
> (stringify 12 "," 43)
"12,43"
> (stringify 'hello " " 'there)
"hello there"
digitize: (digitize expr)
Intrinsic "digitize" accepts one argument which must evaluate to an atom
and converts it to a number, if possible. Attempting to apply digitize
to a non-number will cause it to return 0.
> (digitize '42)
42
> (digitize "-3")
-3
mapcar: (mapcar expr1 expr2)
The "mapcar" library function accepts two arguments, the first of which
must evaluate to a monadic closure or a monadic intrinsic function or an
intrinsic which may accept one or more arguments, such as the "print"
intrinsic, while the second argument must evaluate to a list. The function
applies the function to each element of the list, returning a new list
containing the results of each evaluation.
> (set 'l '(a b c))
(a b c)
> (set 'f (lambda (n) (list n n n)))
<CLOSURE#23>
> (mapcar f l)
((a a a) (b b b) (c c c))
map: (map expr1 expr2 ...)
The "map" library function is a generalized version of "mapcar". The
function accepts two or more arguments, the first of which must evaluate
to a closure or an intrinsic function, while the remaining arguments must
all evaluate to lists of the same length. The function passed as the
first argument must accept as many arguments as there are list arguments.
A new list is returned consisting of the result of successively applying
the function to a grouping of elements from each list. The elements of
the lists are successively grouped according to position.
> (map (lambda (x y z) (+ x y z)) '(1 2 3) '(4 5 6) '(7 8 9))
(12 15 18)
remove: (remove expr1 expr2)
Library function "remove" accepts two arguments, an expression of any
type, and a list, respectively, and returns a new list which is a copy of
the argument list, with all occurrences of the first argument removed.
The function uses library function "equal" to test for equivalency.
> (remove 'a '(a a a b))
(b)
> (remove '(a b) '((a b) c (a b) (a b) d e f g))
(c d e f g)
nthcdr: (nthcdr expr1 expr2)
The "nthcdr" intrinsic returns the "nth" cdr of a list. The function
accepts two arguments. The first must evaluate to a list. The second
must evaluate to a positive fixnum. The function returns the list which
would be created if "cdr" were applied to the list the number of times
specified by the second argument. If the list has insufficient elements
to complete the request, the empty list is returned.
> (nthcdr '(a b c) 1)
(b c)
> (nthcdr '(a b c) 2)
(c)
> (nthcdr '(a b c) 3)
()
nth: (nth expr1 expr2)
The "nth" intrinsic returns the "nth" element of a list. Elements are
numbered starting from zero. The function accepts two arguments. The
first must evaluate to a list, while the second must evaluate to a
non-negative fixnum specifying the index position of the desired element.
The function returns the specified element, if it can. If the list has
insufficient elements to satisfy the request, then the function returns
the empty list.
> (setq l '(a b c))
(a b c)
> (for (n (length l) 0)
>> (println (nth l n)))
()
c
b
a
1 ; This is the return value of "for"
member: (member expr1 expr2)
Library function "member" accepts two arguments, the first of which can
evaluate to any lisp object, while the second must evaluate to a list.
If the first argument is "equal" to any element of the list, 1 is
returned, else 0 is returned.
> (member 'a '(a b c))
1
reverse: (reverse expr1)
Library function "reverse" accepts one argument, which must evaluate to a
list or a string. If the argument is a list, the function creates a new list containing
the same elements as the original list, in reverse order. If the
argument is a string the function creates a new string consisting of the
same characters as the original, but in reverse order.
> (reverse '(a b c))
(c b a)
> (reverse "hello")
"olleh"
append: (append expr1 ...)
The "append" intrinsic accepts one or more arguments which all must
evaluate to lists, and returns a new list consisting of the elements of
the argument lists.
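A short sketch, assuming the elements appear in the result in argument
order:

```lisp
(append '(a b) '(c d) '(e))   ; (a b c d e)
(append '(a b) ())            ; an empty list contributes nothing: (a b)
```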
loop: (loop expr1 ...)
The "loop" intrinsic implements an infinite loop. The function accepts
one or more arguments, and evaluates them in order, repeatedly. The
return value of each body expression is discarded. The only way to exit
a "loop" loop is to wrap it in a "catch" and invoke "throw" in the body:
> (catch
>> (loop
>>> (if (setq line (getline))
>>>> (print line)
>>>> (throw line))))
iterate: (iterate expr1 ...)
The "iterate" intrinsic executes expressions a specified number of times.
Unlike the "for" intrinsic it does not make an index variable visible to
the expressions in its body. The function accepts one or more arguments,
the first of which must evaluate to a number and specifies the number of
times the loop will execute. A negative repeat count will be converted
into a positive value by converting it to its absolute value. The
subsequent expressions, if any, are optional. If present, each
subsequent expression will be executed in order on each iteration of the
loop. The result of evaluating the last body expression in the loop, on
the last iteration, will be returned, or if no further arguments were
provided beyond the repeat count, the repeat count will be returned. The
"iterate" intrinsic is the fastest means of iteration available in the
Munger interpreter.
> (iterate 10 (print "A"))
AAAAAAAAAA1
The 1 is the return value of the "print" intrinsic, which is in turn the
return value of "iterate".
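Two more cases, sketched from the description above rather than taken
from an interpreter session:

```lisp
(iterate -3 (print "A"))   ; the count is treated as 3, so this prints AAA
(iterate 5)                ; no body: the repeat count, 5, is returned
```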
for: (for (symbol expr1 expr2 [expr3]) expr ...)
for: (for (([expr...]) (expr [...]) ([expr...])) expr ...)
The "for" intrinsic provides a looping facility similar to the C "for"
keyword. There are two forms of this intrinsic:
The first form of the "for" intrinsic executes loops with an index
variable set to each of a range of specified fixnum values upon each
iteration. It is the most efficient means of iterating a fixed number of
times, if the loop body needs a visible index variable. Two or more
arguments must be provided. The first is expected to be a three or four
element list consisting of a symbol to be used for the loop index
variable, an expression evaluating to the initial number value for the
index variable, an expression evaluating to the number value the index
variable should have on the last iteration of the loop, and an optional
increment value. If the start value is less than or equal to the stop
value, the index variable will be incremented on each iteration,
otherwise it will be decremented on each iteration. Each of the
remaining arguments from the second argument on, will be evaluated during
each iteration of the loop. The "for" function always returns the value
of the last body expression evaluated on the last iteration.
An optional fourth element may be present in the first argument list, and
if present, it must evaluate to a fixnum specifying the increment or
decrement value. The absolute value of this element will be added to or
subtracted from the index, depending upon whether the start value is
less than or equal to, or greater than, the stop value,
respectively. If the fourth element is not present, the increment value
defaults to 1. If incrementing or decrementing the index never produces
the stop value exactly, then the index value on
the last iteration of the loop will be the last index value which was
within the range specified by the start and stop values, inclusive.
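The sketch below shows increments which step past the stop value. The
index values in the comments are derived from the rules just stated and
are not captured output.

```lisp
(for (n 0 10 3)
  (println n))    ; index takes 0 3 6 9; 12 would be out of range

(for (n 10 0 4)
  (println n))    ; start exceeds stop, so the index counts down: 10 6 2

(for (n 0 10 -5)
  (println n))    ; only the absolute value is used: 0 5 10
```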
A "for" loop introduces a new environment for the loop's index variable.
If a closure is formed in the body of a "for" loop, it will close over
the new environment. Any invocations of "extend" or "dynamic_extent"
directly inside of a "for" loop body will affect the environment of the
loop and not the environment in which the loop is embedded.
> (for (n 0 10 2)
>> (println n))
0
2
4
6
8
10
1 ; return value of "println"
> (set 'a 5)
5
> (for (a a 1)
>> (for (a a (+ a a))
>>> (print a " "))
>> (newline))
5 6 7 8 9 10
4 5 6 7 8
3 4 5 6
2 3 4
1 2
1 ; return value of inner "newline"
>
It is possible to invoke "continue" inside a for loop such that no return
value is ever generated by any of the contained expressions. In that case,
"for" returns 0.
The second form of the "for" intrinsic provides a more generic looping
facility. With this form, the first argument must also be a three-
element list, and the subsequent arguments are the loop's body
expressions, but with this form, each sublist of the first argument list,
must also be a list. The first sublist is a list of expressions to be
evaluated once, in order, before any other part of the loop is evaluated,
and may be empty. The second sublist consists of one or more
expressions, the first of which is the test condition, and will be
evaluated before every entry into the loop body, while the remaining
elements are the final expressions which are evaluated as the last act of
the "for" intrinsic. The elements after the test expression may be
omitted. The third sublist contains the update expressions to be
evaluated after evaluation of the loop body expressions, and may be
empty.
Execution of the loop proceeds as follows. The initialization
expressions are evaluated, then the actions described in the next
paragraph are repeated until the test expression evaluates to a false
value:
REPEAT
The test condition is evaluated. If the evaluation of the test condition
results in a false value, the remaining elements of the second sublist,
after the test condition, if present, are evaluated in order, and the
result of the last evaluation becomes the return value of the "for" loop.
If there are no expressions after the test expression, then the result of
the failed test expression becomes the return value of the loop. The
loop is finished. Otherwise, if the evaluation of the test condition
results in a true value, each of the body expressions is evaluated in
order, and then each of the elements of the update sublist is evaluated,
in order. Goto REPEAT.
One might write a factorial function with this version of "for" like
this:
(defun fact (n)
(for (((extend 'a 1)) ((> n 1) a) ((dec n)))
(setq a (* n a))))
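As a further sketch in the same style, the hypothetical function
"sum_list" below adds up a list of numbers. It mirrors the structure of
"fact": an accumulator created with "extend" in the initialization
sublist, the list itself as the test condition (an empty list is false),
and "cdr" in the update sublist. Neither the function nor the name
"total" is part of the Munger library.

```lisp
(defun sum_list (l)
  (for (((extend 'total 0)) (l total) ((setq l (cdr l))))
    (setq total (+ total (car l)))))

(sum_list '(1 2 3 4))   ; 10, following the evaluation order described above
(sum_list ())           ; the test fails at once, so total is returned: 0
```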
downcase: (downcase expr1 expr2)
upcase: (upcase expr1 expr2)
The "upcase" library function accepts two arguments, the first of which
must evaluate to a string, and the second of which is treated as a
logical true or false value after evaluation. If the second argument
evaluates to a true value, then "upcase" will return a new string
consisting of all the characters of the first argument string, but with
any lowercase alphabetic characters replaced with their corresponding
uppercase characters. If the second argument evaluates to a false value,
then only the first lowercase alphabetic character encountered in the
first argument will be so converted. In either case, if no lowercase
alphabetic characters are present in the first argument, the returned
string will be an exact copy of the first argument.
The "downcase" library function behaves similarly to "upcase", except it
converts uppercase alphabetic characters to lowercase alphabetic
characters.
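The following sketch shows what the two functions are expected to return
for each value of the second argument. The results are derived from the
description above rather than copied from a session.

```lisp
> (upcase "hello world" 1)
"HELLO WORLD"
> (upcase "hello world" 0)
"Hello world"
> (downcase "HELLO World" 1)
"hello world"
> (downcase "HELLO World" 0)
"hELLO World"
```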
alist_lookup: (alist_lookup expr1 expr2)
The "alist_lookup" library function accepts two arguments. The second
must evaluate to an alist, while the result of evaluating the first
argument is treated as a key of the alist. The function returns the cdr
of the first sublist of the alist whose car is equal to the key.
> (set 'y '((a b) (b c)))
((a b) (b c))
> (alist_lookup 'b y)
(c)
>
alist_remove: (alist_remove expr1 expr2)
The "alist_remove" library function accepts two arguments. The second
must evaluate to an alist, while the result of evaluating the first
argument is treated as a key of the alist. The function returns a copy
of the alist with the first sublist whose car is the key, removed.
> (set 'y '((a b) (b c)))
((a b) (b c))
> (alist_remove 'b y)
((a b))
> y
((a b) (b c))
>
alist_replace: (alist_replace expr1 expr2 expr3)
The "alist_replace" library function accepts three arguments. The second
must evaluate to an alist, while the result of evaluating the first is
treated as a key of the alist. The third argument must evaluate to a
list. The function returns a copy of the alist with the first sublist
whose car is the key, removed, if such a sublist exists, and containing a
new sublist consisting of the key and the elements of the third list.
The new sublist is added to the front of the alist copy.
> (set 'y '((a b) (b c)))
((a b) (b c))
> (alist_replace 'b y '(d e f))
((b d e f) (a b))
> (set 'y ())
()
> (alist_replace 'b y '(d e f))
((b d e f))
>
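Because "alist_replace" returns a copy, its result is normally bound to a
symbol before being queried. A short sketch combining it with
"alist_lookup", with results derived from the descriptions above (the
symbol "z" is only an illustrative name):

```lisp
> (set 'y '((a b) (b c)))
((a b) (b c))
> (set 'z (alist_replace 'b y '(d e f)))
((b d e f) (a b))
> (alist_lookup 'b z)
(d e f)
> (alist_lookup 'a z)
(b)
> (alist_lookup 'b y)
(c)
```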
when: (when expr1 expr2...)
The "when" intrinsic is a conditional. It accepts one or more arguments,
the first being the test condition. If the test condition evaluates to a
true value (anything but 0 (fixnum), the empty string, or the empty
list), then the succeeding arguments are evaluated in order, and the
result of evaluating the last argument is returned. Otherwise, if the
test condition evaluated to a false value, that value is returned.
> (when (eq 1 1)
>> (print 'hello)
>> (newline))
hello
1
>
unless: (unless expr1 expr2...)
The "unless" intrinsic is a conditional, which operates logically
opposite to the "when" intrinsic. The function accepts one or more
arguments, the first of which is the test condition. If the first
argument evaluates to a false value (0 (fixnum), the empty list, or the
empty string), the succeeding arguments are evaluated in order, and the
result of evaluating the last argument is returned. Otherwise, if the
first argument evaluated to a true value, that value is returned.
> (unless (eq 1 1)
>> (print 'hello)
>> (newline))
1
>
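The return values of "when" and "unless" for false and true test values
can be compared side by side. This is a sketch based on the two
descriptions above: whichever conditional does not run its body returns
the value of the test itself.

```lisp
> (when 0 'ran)
0
> (unless 0 'ran)
ran
> (when '(a) 'ran)
ran
> (unless '(a) 'ran)
(a)
```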
apply: (apply expr1 expr2)
The "apply" macro accepts two arguments, the first of which must evaluate
to a function and the second of which must evaluate to a list. The
function is called with all the elements of the second argument as
arguments. All "apply" does is cons its first argument onto its second
and evaluate the result.
> (apply 'set '('q 2))
2
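Following the same pattern, and assuming "apply" conses and evaluates as
described, applying "+" to a list of numbers should build and evaluate the
expression (+ 1 2 3):

```lisp
> (apply '+ '(1 2 3))
6
```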
inc: (inc symbol [expr])
dec: (dec symbol [expr])
The "inc" and "dec" intrinsics increment and decrement a numerical value
bound to a symbol, and rebind the symbol to the resultant value. Both
functions accept either one or two arguments. The first argument is not
evaluated, and must be a symbol currently bound to a number. The second
optional argument, if present, must evaluate to a number. The value
bound to the symbol is incremented or decremented by the value specified
by the second argument, or if the function was not presented with a
second argument, by 1, and this new value is then bound to the symbol.
The new value is returned.
> (set 'a 1)
1
> (inc a)
2
> (dec a)
1
> (inc a 10)
11
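The optional second argument is evaluated, so it may be any expression
that yields a number. A small sketch (the symbol "total" is illustrative):

```lisp
> (set 'total 10)
10
> (dec total 3)
7
> (inc total (* 2 4))
15
> total
15
```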
test: (test expr1)
The "test" intrinsic is used to test macro expansion. Its lone argument
is NOT evaluated, and must be a list consisting of a macro application,
just as one would enter it at the interpreter prompt. The function
returns the result of expanding the macro.
> (test (let ((f "foobar")) f))
((lambda (f) f) "foobar")
> (test (setq a 2))
(set (quote a) 2)
continue: (continue)
The "continue" intrinsic, when invoked from inside a "while", "do",
"for", "iterate", or "loop" loop will cause the loop to skip any
subsequent expressions in the loop and jump to the top of the loop to
continue with the next iteration. Note that invoking "continue" inside
a "do" loop will prevent the evaluation of the test condition. An
infinite loop may result in certain situations.
If "continue" is invoked outside of the body of a loop, then the stack
will unwind until the interpreter returns to the toplevel prompt.
> (set 'n 10)
10
> (while n
>> (print n)
>> (newline)
>> (set 'n (- n 1))
>> (continue)
>> (set 'n 0))
10
9
8
7
6
5
4
3
2
1
0
>
The last 0 is the return value of the "while" intrinsic.
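A more typical use is to skip part of the body for some iterations only.
The sketch below prints only the even values while counting down. It
assumes that "continue" invoked inside a "when" nested in the loop body
affects the enclosing "while"; the output shown is worked out by hand from
the descriptions in this manual.

```lisp
> (set 'n 6)
6
> (while n
>> (set 'n (- n 1))
>> (when (% n 2) (continue))
>> (print n)
>> (newline))
4
2
0
0
```

As before, the final 0 is the return value of the "while" intrinsic.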
block: (block)
unblock: (unblock)
The "block" and "unblock" intrinsics accept no arguments, and determine
whether SIGINT, SIGQUIT, and SIGHUP will kill the interpreter, and
whether SIGTSTP, SIGTTIN, and SIGTTOU are ignored by the interpreter.
The interpreter starts out with an implicit call to "unblock." When in
the unblocked state, the interpreter will be killed upon receipt of
SIGINT, SIGQUIT, and SIGHUP, and it will be stopped upon receipt of
SIGTSTP, SIGTTIN, or SIGTTOU. It will additionally generate a core dump
upon receipt of SIGQUIT. Invoking "block" renders all of these signals
impotent. Both of these functions always return 1.
SIGTERM is caught by the interpreter. The "sigtermp" intrinsic can be
used to detect the occurrence of SIGTERM.
sigtermp: (sigtermp)
The "sigtermp" intrinsic accepts no arguments and returns either 0 or 1
indicating whether the interpreter has received a SIGTERM since the last
invocation of "sigtermp".
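One way these intrinsics might be combined is to protect a piece of work
from interruption and then check for a pending SIGTERM afterward. This is
only a sketch: the file name is hypothetical, and a non-empty active
buffer is assumed.

```lisp
(block)                              ; the signals above are now impotent
(write 1 (lastline) "state.txt" 1)   ; work that should not be interrupted
(unblock)                            ; restore the startup behavior
(when (sigtermp)                     ; 1 if SIGTERM arrived since last check
  (print "SIGTERM received")
  (newline))
```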
exists: (exists expr)
The "exists" intrinsic accepts one argument which must evaluate to a
string, and attempts to stat() a file system entity with the name
specified by the string. If the call to stat() succeeds, "exists" returns
a value in the range of 1 to 8; otherwise, it returns 0 if the entity
does not exist, or -1 if the interpreter lacks permission to search one
of the directories in the specified path. Note the interpreter does not
need permission to read the file itself in order to stat() it. Possible
successful return values are:
1 == regular file
2 == directory
3 == character device
4 == block device
5 == fifo
6 == symbolic link
7 == socket
8 == unknown type
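The numeric result lends itself to dispatching with the "case" macro. In
the sketch below, "describe_path" is a hypothetical helper name, not part
of Munger, and "/tmp" is only an example path. No output is shown because
it depends on the file system.

```lisp
; Map the result of "exists" to a short description.
(defun describe_path (path)
  (case (exists path)
    (-1 "search permission denied")
    (0 "does not exist")
    (1 "regular file")
    (2 "directory")
    (6 "symbolic link")
    (? "some other type")))

(print (describe_path "/tmp"))
(newline)
```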
stat: (stat expr)
The "stat" intrinsic accepts one argument, which must evaluate to a
string. The argument is interpreted as a filename to be passed to the
stat() system call. The function returns a list. If the specified
entity does not exist, or the interpreter does not have search permission
for one of the directories in the specified path, the returned list will
be empty. Any other error condition will cause the function to return a
string describing the error. Upon successful return of stat(), a
five-element list is returned, containing, in order: the user name
associated with the uid of the file, as a string, or the uid itself as an
integer if it does not map to a user on the system; the group name
associated with the gid of the file, as a string, or the gid itself as an
integer if it does not map to a group on the system; the time of the last
access of the file; the time of the last modification of the file; and the
size of the file. The time values are expressed in seconds elapsed since
the UNIX Epoch (00:00:00 January 1, 1970 UTC).
> (stat (current))
("root" "wheel" 1097210850 1097210836 417214)
rename: (rename expr1 expr2)
The "rename" intrinsic is used to rename entries in the filesystem. Its
semantics are the same as those of the system call of the same name, for
which it is a wrapper. The function accepts two arguments, both of which
must evaluate to strings. The first specifies the current name of the
filesystem entity, and the second is the new name it will have if the
function succeeds. 1 is returned upon success, or in the case of system
call failure, a string describing the error is returned. All other
errors will stop evaluation.
current: (current)
The interpreter maintains a pointer to an element of the argument list it
was started with. The "current" intrinsic accepts no arguments, and
returns a string representing the current argument the argument pointer
points to. At startup "current" will return the name the interpreter was
invoked by, which is always the first argument passed to programs running
under UNIX. See the "prev", "next", and "rewind" intrinsics for accessing
the other command line arguments.
prev: (prev)
The "prev" intrinsic steps the argument pointer back to the previous
command line argument, if one exists. If no previous argument exists,
"prev" returns zero, otherwise the previous argument is made the current
argument and is returned as a string.
next: (next)
The "next" intrinsic steps the argument pointer forward to the next
command line argument, if one exists. If no further arguments exist,
"next" returns zero, otherwise the next argument is made the current
argument, and is returned as a string.
rewind: (rewind)
The "rewind" intrinsic accepts no arguments, and sets the interpreter's
command-line argument pointer to point to the first argument, and returns
it as a string. The first argument is always the name by which the
interpreter program was invoked.
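Taken together, the argument-pointer intrinsics allow a simple loop over
the command line. The following is a minimal sketch, not a complete
argument parser. Note that because the empty string is a false value, the
loop as written would also stop early at an empty argument.

```lisp
; Start from the program name, then print every later argument on its
; own line. "next" returns zero once the arguments are exhausted.
(rewind)
(while (next)
  (print (current))
  (newline))
```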
interact: (interact)
The "interact" intrinsic causes the interpreter to stop running the
current program and work interactively with the user. It is intended for
use in programs which need to make the interpreter toplevel temporarily
available to the user. The function accepts no arguments, and causes the
interpreter to enter a new read-eval-print loop.
To exit the loop and allow the program which invoked "interact" to
continue, Control-D may be input by itself on a line, or the symbol "_"
may be input at toplevel. The function always returns 1. Recursive
invocations will cause a string to be returned with an error message, but
evaluation at the prompt of the original invocation will continue
normally. Note that the "fatal" intrinsic has no effect when lisp errors
are generated inside "interact". This means the interpreter will not exit
when a lisp error is generated by the user working at the "interact"
prompt.
The current local environment is hidden from code run via "interact". If
the programmer also wishes to hide the global environment of the program
which has invoked "interact" from the user, then he or she should wrap the
program in a giant "let" closure, defining initial bindings for all
globals so that "defun", "defmac", "set", and "setq" will modify these
locally-visible bindings and not create global bindings. An example of
this technique can be seen in the mush.munger example program.
pwd: (pwd)
The "pwd" intrinsic accepts no arguments, and returns the current working
directory of the interpreter as a string.
let: (let [list] expr2 ...)
The "let" intrinsic introduces new local lexical bindings and evaluates a
list of expressions with those new bindings in place. The bindings are
removed when the "let" expression returns. The first argument is not
evaluated and must be a parameter list of the form: ((symbol1 expr1)
(symbol2 expr2)...). The result of evaluating each exprN in the current
scope gets bound to its paired symbolN in the new scope. The remaining
arguments, if any, are evaluated in the new scope. The result of the
last evaluation performed is returned.
> (let ((a 1) (b 2))
>> (print "a: " a " b: " b)
>> (newline))
a: 1 b: 2
1 ; This is the return value of (newline)
> (set 'a 2)
2
> (let ((a (* a a)))
>> (while a
>>>> (print "a: " a)
>>>> (newline)
>>>> (set 'a (- a 1))))
a: 4
a: 3
a: 2
a: 1
0
> a
2
letn: (letn [list] expr2 ...)
Intrinsic "letn" introduces new local lexical bindings and evaluates a
list of expressions with those new bindings in place. The bindings are
removed when the "letn" expression returns. The first argument is not
evaluated and must be a parameter list of the form: ((symbol1 expr1)
(symbol2 expr2)...). The result of evaluating each exprN in the current
scope gets bound to its paired symbolN in the new scope. The remaining
arguments are evaluated in the new scope. The value of the last
evaluation performed is returned. Intrinsic "letn" is different from
intrinsic "let" in that each binding made to symbolN is visible in
exprN+1, exprN+2, etc. This form is called "let*" in Scheme and Common
Lisp.
> (letn ((a 1)
>>> (b (+ a 1)))
>> (print "b: " b)
>> (newline))
b: 2
1
>
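The difference between "let" and "letn" shows up when a binding
expression refers to a symbol bound earlier in the same parameter list.
The sketch below follows from the two descriptions: with "let", the
expression (+ a 1) sees the outer binding of "a", while with "letn" it
sees the new one.

```lisp
> (set 'a 5)
5
> (let ((a 1) (b (+ a 1))) b)
6
> (letn ((a 1) (b (+ a 1))) b)
2
> a
5
```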
letf: (letf symbol [list] expr ...)
Macro "letf" is equivalent to a named let in Scheme. It provides a
means of creating and applying a temporary function capable of recursing
on its name. The macro accepts two or more arguments, the first of which
must be a symbol, while the second must be a list of the form: ((symbol1
expr1) (symbol2 expr2) ...). The result of evaluating each exprN in the
current scope gets bound to its paired symbolN in a new local lexical
environment. The symbol passed as the first argument is visible in the
new environment to enable recursion by name. The remaining arguments are
then evaluated in the new environment, with the value of the last
evaluation performed being returned.
> (setq a 10)
10
> (letf fact ((n a))
>> (if (< n 2)
>>> n
>>> (* n (fact (- n 1)))))
3628800
>
tailcall: (tailcall expr1 ...)
The "tailcall" intrinsic accepts one or more arguments, the first of
which must evaluate to a closure or 0, while the remaining arguments, if
any, are the arguments to the closure being tail-called, and must be of
the correct number and type for the particular closure being invoked. A
first argument of 0 indicates the tail-call is a directly recursive call
of the currently-executing function. This allows anonymous functions to
call themselves tail-recursively. Using tail-calls to implement
iteration is inefficient in Munger, and should be avoided. See the
entries for the "for" and "iterate" intrinsics for how to iterate
efficiently.
Tail-calls are used to prevent the control stack from growing
unnecessarily during recursive function calls. Use "tailcall" whenever
you are invoking a function from a tail position in another function.
Some tail-recursive functions do all their computation during the
"descent" stage of recursion, and only return the final computed value
during the "ascent" stage when they "unwind". This means the recursive
call is in tail position because nothing happens after it has returned.
Each invocation simply returns to its caller. Invoking the recursive
calls with "tailcall" will cause each recursive invocation to replace the
current context on the call stack with its own context, so that when the
final recursive invocation returns, it returns to the caller of the first
invocation. No "unwinding" occurs. Each invocation of the function
therefore has a constant continuation, and the whole computation is
effectively turned into a loop, inhibiting potentially explosive growth
of the control stack.
Other tail-recursive functions pass closures to each recursive invocation
of themselves to capture state as they descend, with each new closure
closing over the binding for the previous one. When these functions
bottom-out, the current closure is invoked as the continuation of the
function. It, in turn, performs part of a computation, then invokes the
continuation from the previous invocation, in tail position, to continue
the computation, and so on, until the primordial continuation given to
the toplevel invocation receives the final value of the computation.
Since all invocations of the recursive function, and all invocations of
the dynamically-constructed continuations are in tail position, there is
no need to save their contexts on the call stack. Every one can be
invoked with "tailcall", so that the primordial continuation simply
returns to the original caller of the first invocation of the recursive
function. Note that in this case, the savings in growth of the call
stack are lost due to the growth of the heap from the creation of a new
continuation at each invocation.
Note that invoking "tailcall" from a non-tail position results in that
position becoming a tail position. Any pending computation in the
function is abandoned.
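The first kind of tail-recursive function described above, which does all
its work on the "descent", can be sketched with an accumulator argument.
The names "sum" and "acc" are illustrative only. The recursive call is the
last thing the function does, so it can be made with "tailcall".

```lisp
; Sum the integers from n down to 1, carrying the running total in acc.
(labels ((sum (lambda (n acc)
                (if (eq n 0)
                    acc
                    (tailcall sum (- n 1) (+ acc n))))))
  (print (sum 100 0))
  (newline))
; expected to print 5050
```

As noted above, a loop built with "for" or "iterate" is the more efficient
choice for plain iteration like this; the sketch is only meant to show the
shape of a tail call.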
labels: (labels [list] expr2 ...)
The "labels" intrinsic temporarily binds a set of functions to a set of
symbols, such that each binding is visible in each function, allowing
recursive and mutually-recursive local functions to be defined in a new
environment. The function accepts one or more arguments, the first of
which must be a list of lists where each sublist consists of a symbol
paired with a lambda or macro expression. The remaining arguments are
evaluated in the new environment, with the value of the last evaluation
being returned. An example will make this clear:
> (labels ((even (lambda (n) (or (eq n 0) (tailcall odd (- n 1)))))
>>> (odd (lambda (n) (and (not (eq n 0)) (tailcall even (- n 1))))))
>> (print (even 11))
>> (newline))
0
1 ; This is the return value of the "newline".
Note that it may not be necessary to use "labels" to extend the current
lexical environment, depending on whether or not the programmer wishes to
limit the visibility of the functions created to the body of the "labels"
expression, and whether or not there are any closures formed in the body
which need to see the new bindings after the "labels" has returned. If
both of those behaviors are simultaneously needed, or if the interpreter
is at toplevel, "labels" must be used. Otherwise, the current lexical
environment can be extended with the "extend" intrinsic. One can achieve
the same effect as the above example inside a function body with:
> (defun f ()
>> (extend 'even (lambda (n) (or (eq n 0) (tailcall odd (- n 1)))))
>> (extend 'odd (lambda (n) (and (not (eq n 0)) (tailcall even (- n 1)))))
>> (print (even 11))
>> (newline))
> (f)
0
1
The bindings created by "extend" have unlimited extent, so if closures
were formed in the body of "f" they would close over the new bindings,
even if the closures were closed before the invocations of "extend". See
the entries for the "extend" and "dynamic_extent" intrinsics for more
details on extending the current lexical environment.
extract: (extract expr)
The "extract" intrinsic extracts the lambda-expression from a function
closure. It accepts one argument which must evaluate to a closure, and
returns the closure's lambda-expression.
> mapcar
<CLOSURE#17>
> (extract mapcar)
(lambda (f l) (if (pairp l) (cons (f (car l)) (mapcar f (cdr l))) ()))
cond: (cond [list] ...)
Intrinsic "cond" is a multiple choice conditional. The function accepts
one or more lists as arguments, which must consist of at least two
elements, and attempts to evaluate the first element of each argument
list, in order, until an evaluation returns a true value (anything but
zero, the empty string, or the empty list), when it attempts to evaluate
the remaining elements of the corresponding sublist. The value of the
last evaluation performed in the sublist is returned. If none of the
first elements of any of the argument lists evaluate to a true value,
then the false value returned by the first element of the last argument
list will be returned. By placing a first element of 1 in the final
sublist, a catch-all else clause may be created.
> (set 'a 10)
10
> (cond ((eq a 12) 'no)
>> ((eq a 10) 'yes))
yes
>
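A catch-all clause, as described above, can be sketched by making 1 the
test of the final sublist:

```lisp
> (set 'a 7)
7
> (cond ((eq a 12) 'twelve)
>> ((eq a 10) 'ten)
>> (1 'other))
other
```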
case: (case expr expr2 ...)
The "case" macro is a multiple choice conditional. It accepts two or
more arguments. The first argument may be any expression. The
succeeding elements must be lists. The first argument is evaluated, and
then the resulting value is compared against the result of evaluating the
first elements of each of the succeeding lists, using "eq". If "eq"
returns 1, then the succeeding elements of the list with the "eq" first
element, are evaluated, and the value of evaluating the last element is
the return value of the invocation of "case". No other argument lists
are processed.
If the first element of any of the argument lists is the question mark,
?, then the elements of that list succeeding the question mark will
always be executed, if none of the preceding list arguments are executed.
This can be used to create an "else" clause, which will be executed if
none of the others match.
> (setq n 3)
3
> (case n
>> (1 'one)
>> (2 'two)
>> (3 'three)
>> (4 'four)
>> (5 'five))
three
> (case n
>> (6 'six)
>> (7 'seven)
>> (8 'eight)
>> (9 'nine)
>> (10 'ten)
>> (? "no match"))
"no match"
foreach: (foreach expr1 expr2)
The "foreach" library function applies a monadic function to every item
in a list. Unlike "mapcar", this function does not return a list of the
return values. The results of each function application are discarded.
"foreach" always returns the empty list.
> (set 'print_each
>> (lambda (x) (print x) (newline)))
<CLOSURE#34>
> (foreach print_each '(a b c d))
a
b
c
d
()
>
+: (+ expr...)
Intrinsic "+" accepts any number of arguments, evaluates them, and if
all have evaluated to numbers, adds the numbers together and returns the
total.
> (+ 1 1 1 1)
4
> (+ 1 "hello")
+: argument 2 did not evaluate to a number.
-: (- expr1 expr2)
Intrinsic "-" accepts two arguments, both of which must evaluate to
numbers, and subtracts the second value from the first and returns the
result.
> (- 1 2)
-1
> (- (+ 1 1) 2)
0
*: (* expr...)
Intrinsic "*" accepts one or more arguments which must all evaluate to
numbers, and multiplies all the values together and returns the result.
> (* 1 12 2)
24
> (* (+ 23 3) 4)
104
/: (/ expr1 expr2)
Intrinsic "/" accepts two arguments both of which must evaluate to
numbers, and divides the second value into the first and returns the
integer part of the quotient.
> (/ 4 3)
1
%: (% expr1 expr2)
Intrinsic "%" accepts two arguments, both of which must evaluate to
numbers, and divides the second value into the first, and returns the
remainder.
> (% 4 3)
1
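For positive operands, the results of "/" and "%" fit together in the
usual way: the quotient times the divisor, plus the remainder, gives back
the dividend. A short sketch:

```lisp
> (/ 17 5)
3
> (% 17 5)
2
> (+ (* (/ 17 5) 5) (% 17 5))
17
```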
>: (> expr1 expr2)
Intrinsic ">" accepts two arguments, which both must evaluate to numbers,
and returns 1 if the first value is larger than the second, or 0
otherwise.
> (> 4 3)
1
> (> -4 3)
0
>=: (>= expr1 expr2)
Intrinsic ">=" accepts two arguments which both must evaluate to numbers,
and returns 1 if the first value is greater than or equal to the second
value, or 0 otherwise.
> (>= 4 3)
1
<: (< expr1 expr2)
Intrinsic "<" accepts two arguments, both of which must evaluate to
numbers, and returns 1 if the first value is less than the second, or 0
otherwise.
> (< 4 3)
0
<=: (<= expr1 expr2)
Intrinsic "<=" accepts two arguments, both of which must evaluate to
numbers, and returns 1 if the first value is less than or equal to the
second, or 0 otherwise.
> (<= 3 4)
1
negate: (negate expr)
The "negate" intrinsic accepts one argument which must evaluate to a
fixnum, and returns that fixnum negated.
> (negate 3)
-3
> (negate -3)
3
abs: (abs expr)
Intrinsic "abs" evaluates its lone argument, and if it evaluates to a
number, returns the absolute value of the number. Passing a non-number
to "abs" will stop evaluation with an error.
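A brief sketch of the expected behavior:

```lisp
> (abs -3)
3
> (abs 3)
3
> (abs (- 2 10))
8
```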
intern: (intern expr1)
The "intern" intrinsic accepts one argument, which must evaluate to a
string, and converts it into a symbol token consisting of the same
sequence of characters, without the enclosing quotation marks. Passing a
non-string to "intern" will generate an error which stops evaluation.
> (set 'hello 4)
4
> (eval (intern "hello"))
4
char: (char expr1)
The "char" intrinsic accepts one argument, which must evaluate to a
number between 1 and 255, and creates a one-character string consisting
of the character corresponding to the character code specified by that
number.
code: (code expr1)
The "code" intrinsic accepts one argument which must evaluate to a string
and returns a number representing the character code of the first
character of the string.
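The "char" and "code" intrinsics are inverses of one another for
one-character strings. The sketch below assumes an ASCII-compatible
character set, in which "A" has code 65 and "h" has code 104.

```lisp
> (code "A")
65
> (char 65)
"A"
> (code "hello")
104
> (char (+ (code "a") 1))
"b"
```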
open: (open)
The "open" intrinsic creates a new text buffer and makes the new buffer
the active buffer. The buffer number of the new text buffer is returned
on success. Errors stop evaluation. Buffer numbers are whole numbers,
and can be used by the "switch" intrinsic to change the active buffer.
close: (close)
The "close" intrinsic closes the active text buffer, and makes the most-
recently opened buffer that has not been closed the active buffer. 1 is
returned on success. Errors stop evaluation.
insert: (insert expr1 expr2 expr3)
The "insert" intrinsic inserts a line into the active buffer. It accepts
three arguments. The first argument must evaluate to a positive number
and represents an index position in the buffer. The second argument,
which must evaluate to a string, is the data to be inserted. The third
argument, which must evaluate to a number, specifies whether the data
should be inserted before the specified index position, inserted after
the specified index position, or should overwrite the contents of the
specified index position. A value of 0 indicates the index position
should be overwritten. A positive value indicates the data should be
inserted after the specified index. A negative value indicates the data
should be inserted before the specified index. If the specified index
position does not exist it will be created, and all the index positions
preceding it in the buffer will also be created and initialized to be
empty. 1 is returned on success, 0 on failure.
> (open)
1
> (insert 1 "This is the first line." 0)
1
delete: (delete expr1)
The "delete" intrinsic removes a line from the active buffer. The
function accepts one argument which must evaluate to a positive integer
specifying the index of the line in the buffer to be deleted. 1 is
returned on success, 0 on failure.
retrieve: (retrieve expr1)
The "retrieve" intrinsic returns the contents of a specified index
position in the active buffer, as a string. The function accepts one
argument which must evaluate to a positive integer representing the index
of the desired line. An error is generated if the specified index does
not exist.
lastline: (lastline)
The "lastline" intrinsic accepts no arguments, and returns the index
value of the last line in the active buffer. Any error encountered will
stop the interpreter.
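The line-oriented buffer intrinsics described above can be exercised
together. The following is a sketch in script form; the values in the
";" annotations are what the descriptions above lead one to expect, not
output captured from a session.

```lisp
(open)                   ; new active buffer; returns its buffer number
(insert 1 "second" 0)    ; zero: create line 1 by overwriting index 1
(insert 1 "first" -1)    ; negative: insert before index 1
(insert 2 "third" 1)     ; positive: insert after index 2
(retrieve 1)             ; expected "first"
(retrieve 2)             ; expected "second"
(retrieve 3)             ; expected "third"
(lastline)               ; expected 3
(delete 2)               ; remove the line holding "second"
(retrieve 2)             ; expected "third"
(lastline)               ; expected 2
(close)                  ; discard the buffer
```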
filter: (filter expr1 expr2 expr3)
The "filter" intrinsic sends a range of lines from the active buffer to
an external filter program, and replaces the lines in the buffer with the
output from the filter program. The function accepts three arguments.
The first two arguments must evaluate to numbers and inclusively specify
the range of lines to be sent to the filter program. The first index
does not have to be less than the second index. The lines, however, are
always processed in ascending order, regardless of how the range was
specified. An error is generated if any of the lines in the specified
range do not exist. The third argument must evaluate to a string,
representing the command line used to launch the filter. It is passed to
/bin/sh for interpretation, and so may contain any expression that
program understands.
Upon success, "filter" returns the number of lines read from the stdout
of the child process.
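For instance, assuming a non-empty active buffer and the standard sort(1)
and uniq(1) utilities, the whole buffer might be filtered like this:

```lisp
; Sort every line of the active buffer in place.
(filter 1 (lastline) "sort")
; The command line is interpreted by /bin/sh, so a pipeline works too.
(filter 1 (lastline) "sort | uniq -c")
```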
filter_server: (filter_server expr1 expr2 expr3 expr4)
The "filter_server" library function sends a range of lines in the active
buffer to a TCP or UNIX domain server and replaces the lines in the
buffer with the server's response. This function can be useful in
retrieving data from http servers. It accepts four arguments. The first
two arguments must evaluate to fixnums specifying the range of buffer
lines, inclusive, to be sent to the server. If the first argument is
greater than the second, then the lines will be sent to the server in
reverse order. The third and fourth arguments are passed to the
"child_open" intrinsic. See the entry in this manual for that function
for details on what these two arguments can be. The function returns the
number of lines read back from the server.
> (open)
0
> (insert 1 "GET / HTTP/1.0" 0)
1
> (insert 2 "" 0)
1
> (for (n 1 2) (insert n (concat (retrieve n) (char 13) (char 10)) 0))
1
> (filter_server 1 2 "www.mammothcheese.ca" 80)
306
The data returned will not be processed in any way, so the HTTP response
header will be present, as well as any chunk headers in the response
body. The "remove_http_stuff" library function can be used to remove
these items.
remove_http_stuff: (remove_http_stuff)
The "remove_http_stuff" library function accepts no arguments, and if
invoked after an invocation of "filter_server", will remove the http
header and merge the response body if it has been "chunked" as per
HTTP/1.1, from the data in the current buffer. After invocation, the
current buffer will contain only the data of the requested resource.
write: (write expr1 expr2 expr3 expr4 [expr5])
The "write" intrinsic writes content from the active buffer to a file.
The function accepts four or five arguments. The first two arguments
must evaluate to positive integers, and specify a range of lines,
inclusively, to be written out. The third argument must evaluate to a
string representing the filename to be written to. The fourth argument
must evaluate to a number, and indicates whether the interpreter should
attempt to get an exclusive lock on the file before writing to it. A
non-zero value means to lock the file, while a value of 0 means to not
lock the file. The optional fifth argument, if present, must evaluate to
a number, and specifies whether the lines from the buffer should be
appended to an already-existing file, or if a new file should be created,
overwriting any existing file of the same name. A non-zero value means
to append, while a value of 0 results in a new file being created. If
the fifth argument is not present, an implicit fifth argument of 0 is
assumed. Empty files can be created by passing 0 for both the first and
second arguments. "write" creates files with both read and write
permissions enabled for the owner, and read permission enabled for
everyone else (mode 644).
The first index argument does not have to be less than the second index
argument, but the lines in the region so specified will be written to the
file in ascending order, regardless of how the region was specified.
Whitespace in the filename can either be present literally, or
represented with the \t and \b escapes to represent the tab and the space
character, respectively. This means if you wish to use a literal \b or
\t in the filename, you must escape the backslash itself and use \\\\b or
\\\\t instead. Remember, to embed a single backslash into a string we
must escape it with another backslash, so to embed two backslashes, we
need to escape each of them with another backslash, totaling four
backslashes. If this confuses you, try rereading the section on strings
at the top of this document.
The number of lines written is returned on success. If an error is
encountered opening the file, a string describing the error is returned.
All other errors stop evaluation.
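The argument combinations described above can be sketched as follows. The
file names are hypothetical, and the first two calls assume the active
buffer holds at least the lines requested.

```lisp
; Write the whole buffer to a new file, without locking.
(write 1 (lastline) "copy.txt" 0)
; Append lines 1 through 10 to a file, taking an exclusive lock first.
(write 1 10 "log.txt" 1 1)
; Create an empty file.
(write 0 0 "empty.txt" 0)
```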
read: (read expr1 expr2)
The "read" intrinsic inserts all the lines from a file into the active
buffer, inserting the lines after a specified line. The function accepts
two arguments. The first argument must evaluate to a number representing
the index value of the line after which to insert the newly-read lines.
The second argument must evaluate to a string naming the file to be read.
If the buffer is empty, the first argument must be 0, or an error is
generated. Similarly, to insert lines at the beginning of the buffer,
the first argument should be 0. The function returns the number of lines
read on success, -1 if the file to be read doesn't exist, and -2 if
permission to access the file is denied. Any other failure of the read()
system call will return a string describing the error. Other errors will
stop evaluation.
Whitespace in the filename can either be present literally, or
represented with the \t and \b escapes to represent the tab and the space
character, respectively. This means if you wish to use a literal \b or
\t in the filename, you must escape the backslash itself and use \\\\b or
\\\\t instead. Remember that to embed a single backslash into a string, we
must escape it with another backslash, so to embed two backslashes, we
need to escape each of them with another backslash, totaling four
backslashes. If this confuses you, try rereading the section on strings
at the top of this document.
empty: (empty)
The "empty" intrinsic accepts no arguments, and removes all data from the
active buffer. All data in the active buffer are permanently lost.
slice: (slice expr1 expr2 expr3 expr4 expr5)
The "slice" intrinsic returns part of a line in the active buffer or a
description of part of a line in the buffer. The function accepts five
arguments, which must all evaluate to numbers. The first argument
specifies the index value of the line to be sliced in the buffer. The
second argument specifies the character index where the slice starts.
The third argument specifies the length of the slice in characters. If
the length argument is zero, this is interpreted as meaning, "to the end
of the line." The fourth argument specifies where tabstops occur. Tabs
are expanded before the slice is taken, so argument 2 refers to the
"screen" x-coordinate, which may be different from the actual character
located at that index position in the buffer, due to tab expansion. For
example, a fourth argument of 3 indicates that tabstops are considered to
occur every three columns.
The fifth argument specifies what the function returns. A value of 0
indicates the caller wishes to receive the specified slice as a string.
A value of 1 indicates the caller wishes to receive a two-element list
describing the specified slice. The first element of this list is the
length of the slice, in characters, which may be less than the specified
length, if the specified length extended past the end of the line. This
is the actual length of the slice in the buffer before tab expansion.
The second element is the number of extra characters which would be added
to the line during tab expansion from the beginning of the line to the
end of the specified slice.
find: (find expr1 expr2 expr3 expr4 expr5)
The "find" intrinsic searches a specified range of lines in the active
buffer for a match on a specified regular expression. The function
accepts five arguments.
The first three arguments, and the fifth argument, must all evaluate to
numbers, while the fourth argument must evaluate to a compiled regular
expression object. The first argument specifies the direction of the
search: a positive value causes the search to proceed forward in the
buffer, while a negative value causes the search to proceed backwards in
the buffer. The second argument is interpreted as the line number at
which to start the search. The third argument is interpreted as the
character within the line at which to start the search. Remember that
lines in the buffer are indexed from 1, while characters in lines are
indexed from 0. The fifth argument specifies whether the search should
"wraparound" if it fails. A non-zero value enables wraparound, while 0
disables wraparound.
For example, if a search were in the forward direction and failed, a
fifth argument of 1 would cause the search to begin again at the
beginning of the buffer, looking for matches before the specified
starting position. The fourth argument is the compiled regular
expression object to search for matches with. A match starting exactly
at the specified starting place of the search will be ignored, and the
position of the next non-overlapping match, if any, will be sought.
Newlines are temporarily removed from the end of buffer lines before a
match is attempted. This means ^$ will match empty lines.
The function returns a list of three numbers. If a match is found, the
first element will be the index of the line containing the match, while
the second element will be the character index in the line of the start
of the text that matched, and the third element will be the length, in
characters, of the match. If no match occurred, all three values will be
zero.
The three returned values can be used to pluck out the text of a match
from the buffer:
> (open)
0
> (insert 1 "this is the first line in the buffer." 0)
1
> (set 'f (find 1 1 0 (regcomp "buffer") 0))
(1 30 6)
> (substring (retrieve (car f)) (cadr f) (car (cddr f)))
"buffer"
The following function will count the number of blank lines in the
buffer, assuming the buffer has been opened and loaded with text.
Newlines are removed from the end of lines before the regular expression
is applied, so the regular expression will match blank lines.
(set 'blank
(lambda ()
(let ((idx 1)
(regexp (regcomp "^[\b\t]*$"))
(count 0))
(while (set 'idx (car (find 1 idx 0 regexp 0)))
(inc count))
count)))
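The relationship between the three returned values and the matched text
can also be modeled outside the interpreter. The following Python sketch
is an illustration only, not Munger code and not the interpreter's
implementation. The helper name find_forward is hypothetical, Python's re
module stands in for Munger's regular expressions, and the sketch covers
only forward searches without wraparound. It also does not model the rule
that a match at the exact starting position is skipped.

```python
import re

def find_forward(lines, start_line, start_char, pattern):
    """Return (line, char, length) for the first match, or (0, 0, 0)."""
    regex = re.compile(pattern)
    for number in range(start_line, len(lines) + 1):
        # Lines are numbered from 1, characters within a line from 0.
        # The trailing newline is dropped before matching.
        text = lines[number - 1].rstrip("\n")
        offset = start_char if number == start_line else 0
        match = regex.search(text, offset)
        if match:
            return (number, match.start(), match.end() - match.start())
    return (0, 0, 0)

lines = ["this is the first line in the buffer.\n"]
line, char, length = find_forward(lines, 1, 0, "buffer")
# Pluck the matched text back out using the three returned values.
matched = lines[line - 1][char:char + length]
print(matched)
```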
buffer: (buffer)
The "buffer" intrinsic accepts no arguments, and returns the buffer
number of the active text buffer, or -1 if no buffers have been opened.
buffers: (buffers)
The "buffers" intrinsic accepts no arguments, and returns a list of whole
numbers representing the buffer numbers of all currently open buffers.
If no buffers have been opened, the function returns the empty list.
switch: (switch expr)
The "switch" intrinsic makes a specified buffer the active buffer. The
function accepts one argument which must evaluate to the buffer number of
an open buffer, and makes that buffer the active buffer. Errors stop
evaluation.
setmark: (setmark expr1 expr2)
The "setmark" intrinsic is used to mark a line in the buffer for later
reference. The function accepts two arguments, the first of which must
evaluate to an atom of any type, which names the bookmark. The second
argument must evaluate to a valid line number of a line in the active
buffer. Upon success 1 is returned. The mark will be adjusted
accordingly after buffer insertions and deletions in order to track the
marked line. If the line is subsequently deleted with "delete" then any
bookmarks pointing to that line will be set to -1. Invoking "getmark" on
a subsequent occasion will return the marked line's current line number.
Bookmarks are local to the active buffer, and each buffer may have an
unlimited number of bookmarked lines. The active set of bookmarks is
switched when the "switch" intrinsic changes the active buffer. Only the
bookmarks for the active buffer can be altered or examined.
getmark: (getmark expr1)
The "getmark" intrinsic is used to retrieve the line number of a marked
line in the active buffer. The function accepts one argument which must
evaluate to an atom of any type, which names the desired bookmark. The
current line number of the marked line will be returned if the line has
not been deleted from the buffer. If the desired mark has not been set,
"getmark" returns 0. If the marked line has been deleted, "getmark"
returns -1.
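The bookkeeping described for "setmark" and "getmark" can be pictured
with a small Python model. This is a sketch under stated assumptions, not
the interpreter's implementation: the class name MarkedBuffer and its
insert and delete methods are hypothetical stand-ins for a buffer and its
editing operations.

```python
class MarkedBuffer:
    """Toy model of one buffer with named bookmarks."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.marks = {}

    def setmark(self, name, line):
        if not 1 <= line <= len(self.lines):
            raise IndexError("line out of range")
        self.marks[name] = line
        return 1

    def getmark(self, name):
        # 0 means the mark was never set; -1 means its line was deleted.
        return self.marks.get(name, 0)

    def insert(self, after, text):
        self.lines.insert(after, text)
        for name, line in self.marks.items():
            if line > after:
                self.marks[name] = line + 1  # marked line moved down

    def delete(self, line):
        del self.lines[line - 1]
        for name, marked in self.marks.items():
            if marked == line:
                self.marks[name] = -1        # marked line is gone
            elif marked > line:
                self.marks[name] = marked - 1

buf = MarkedBuffer(["a", "b", "c"])
buf.setmark("here", 2)
buf.insert(0, "new")            # "b" is now line 3
after_insert = buf.getmark("here")
buf.delete(3)                   # delete the marked line
after_delete = buf.getmark("here")
never_set = buf.getmark("nowhere")
```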
transfer: (transfer b1 f1 t1 b2 t2)
The "transfer" intrinsic copies a contiguous range of lines from one
buffer into another already-opened buffer. The function accepts five
arguments which must all evaluate to numbers. The first argument is the
buffer number of the source buffer. The second and third arguments
specify the starting and ending lines, inclusive, of the range to be
copied from the source buffer. If the second argument is greater than
the third, then the lines will be copied in reversed order into the
destination buffer. The fourth argument is the buffer number of the
destination buffer, while the fifth argument is the index in the
destination buffer, after which the copied lines will be inserted. To
copy lines to the front of the destination buffer, a fifth value of 0 is
used. Whatever buffer was active when "transfer" was invoked will be
active when transfer returns. The function returns 1 on success. Any
error will stop evaluation.
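The argument conventions of "transfer" can be illustrated with a short
Python model in which buffers are plain lists of lines held in a
dictionary keyed by buffer number. This is only a sketch of the
documented behavior; the dictionary layout is an assumption made for the
example and says nothing about how the interpreter stores buffers.

```python
def transfer(buffers, b1, f1, t1, b2, t2):
    """Copy lines f1..t1 of buffer b1 into buffer b2 after line t2."""
    source = buffers[b1]
    if f1 <= t1:
        copied = source[f1 - 1:t1]
    else:
        # A descending range is copied in reversed order.
        copied = source[t1 - 1:f1][::-1]
    # Inserting after line 0 places the lines at the front.
    buffers[b2][t2:t2] = copied
    return 1

buffers = {0: ["one", "two", "three"], 1: ["x", "y"]}
transfer(buffers, 0, 1, 2, 1, 0)   # lines 1-2 to the front of buffer 1
transfer(buffers, 0, 3, 2, 1, 4)   # lines 3 down to 2, appended reversed
```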
with_buffer: (with_buffer expr1 expr2 ...)
The "with_buffer" macro temporarily changes the active buffer. The macro
accepts two or more arguments, the first of which must evaluate to the
buffer number of an open buffer. This buffer is made the active buffer,
and then the arguments subsequent to the first are evaluated. When the
last argument has been evaluated the buffer which was active previous to
the invocation of the macro is made the active buffer once again. The result
of evaluating the last argument is returned.
version: (version)
The "version" intrinsic accepts no arguments and returns a list of two
numbers: the first being the major version number of the lisp
interpreter, and the second being the minor version number.
gensym: (gensym)
The "gensym" intrinsic creates and returns a unique anonymous symbol,
called a gensym. Gensyms cannot be named in your code, because the lisp
reader does not recognize the print syntax for a gensym. Gensyms are
useful when it is necessary for a macro to create a working variable in
its returned expression, which must not conflict with other variables
used in the program invoking the macro. The customary way to manipulate
gensyms is to bind them to other symbols, and evaluate these symbols in
macro templates.
The definition of the "with_buffer" macro from library.munger is
presented below as an example.
(let ((buff (gensym)))
(set 'with_buffer
(macro (x (y))
(qquote
(let ((,buff (buffer)))
(switch ,x)
(protect ,(cons 'progn y)
(switch ,buff)))))))
A gensym is bound to the symbol "buff" in a lexical closure surrounding
the macro definition. This symbol's value, the gensym itself, is
inserted into the macro template at appropriate places to act as a
temporary to store the buffer current at the time the macro is invoked.
Because the gensym is anonymous, we can be sure we are not shadowing a
binding used by the program in which the macro is invoked, which the code
passed to the macro as its second argument might modify. Furthermore,
every invocation of the macro uses the same gensym, because it is created
in the outer "let" lexical closure enclosing the macro definition, but
because we insert the gensym into another "let" in the macro expansion,
we can be sure that nested invocations of "with_buffer" will only see
their own lexical binding.
libdir: (libdir)
The "libdir" intrinsic accepts no arguments and returns a string
specifying the location of the interpreter's library files as a fully-
qualified directory name, without the trailing virgule.
> (libdir)
"/usr/local/share/munger"
>
strcmp: (strcmp expr1 expr2)
The "strcmp" intrinsic lexicographically compares two strings. The
function accepts two arguments which must evaluate to strings, and
returns an integer greater than, equal to, or less than zero,
indicating that the first string is either "greater than" (would sort
after), "equal to" (identical), or "less than" (would sort before) the
second string.
> (strcmp "a" "b")
-1
> (strcmp "b" "a")
1
> (strcmp "a" "a")
0
> (strcmp "A" "a")
-32
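The values in this transcript are consistent with a comparison that
returns the difference between the first pair of differing bytes. The
Python sketch below models that behavior. It is an assumption about how
the numbers arise, not a statement about the interpreter's source, and
only the sign of the result should be relied upon.

```python
def strcmp(first, second):
    """Model a bytewise comparison of two 8-bit strings."""
    # A zero byte is appended to each side so that a string which is a
    # prefix of the other compares as the lesser one.
    left = first.encode("latin-1") + b"\0"
    right = second.encode("latin-1") + b"\0"
    for x, y in zip(left, right):
        if x != y:
            return x - y
    return 0

results = [strcmp("a", "b"), strcmp("b", "a"),
           strcmp("a", "a"), strcmp("A", "a")]
print(results)
```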
substring: (substring expr1 expr2 expr3)
The "substring" intrinsic is provided for extracting substrings from
strings, using character indices. The function accepts three arguments.
The first must evaluate to a string, while the second and third must
evaluate to whole numbers. The second argument specifies the character
index (indices start at 0) where the desired substring begins. The third
argument is the number of characters to include in the substring. The
"substring" intrinsic returns the specified substring as a new string.
The third argument may be zero, which is interpreted to mean "to the end of
the string."
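For readers more familiar with slice notation, the rules just described
can be restated as a short Python model. This is a sketch of the
documented semantics only; it is not Munger code, and it does not model
argument checking.

```python
def substring(text, start, length):
    """Model of (substring text start length)."""
    if length == 0:
        # A length of zero means "to the end of the string".
        return text[start:]
    # A length running past the end is silently truncated.
    return text[start:start + length]

whole = substring("foobar", 0, 0)
tail = substring("foobar", 3, 0)
head = substring("foobar", 0, 3)
clipped = substring("foobar", 0, 10)
```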
> (substring "foobar" 0 0)
"foobar"
> (substring "foobar" 3 0)
"bar"
> (substring "foobar" 0 3)
"foo"
> (substring "foobar" 0 10)
"foobar"
expand: (expand expr1 expr2)
The "expand" intrinsic performs tab expansion on arbitrary strings. The
function accepts two arguments, the first of which must evaluate to a
positive integer, while the second must evaluate to a string. The first
value is interpreted as the location of tabstops (i.e., every expr1
characters). A new string is returned, which has the same content as the
original second argument, but with any tab characters expanded into an
appropriate number of spaces. The number of spaces resulting from each
expansion depends upon the position of the specific tab character in its
enclosing line. An expansion will always contain at least one space
character, but may contain up to expr1 space characters. As well, the
expansion of tabs occurring earlier in expr2 will influence the expansion
of tabs occurring later in expr2.
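The column arithmetic behind tab expansion is easy to get wrong, so a
minimal Python model may help. It follows the description above: each tab
advances to the next multiple of the tabstop width, which always yields
between one and expr1 spaces. This is a sketch, not the interpreter's
code, and it assumes the string contains no newlines.

```python
def expand(tabstop, text):
    """Replace each tab with spaces up to the next tabstop column."""
    out = []
    col = 0
    for ch in text:
        if ch == "\t":
            # Distance to the next tabstop: at least 1, at most tabstop.
            n = tabstop - (col % tabstop)
            out.append(" " * n)
            col += n
        else:
            out.append(ch)
            col += 1
    return "".join(out)

short = expand(4, "a\tb")        # tab fills 3 columns
full = expand(4, "abcd\tx")      # tab at a tabstop fills 4 columns
chained = expand(3, "ab\tc\td")  # earlier tabs shift later ones
```

Python's built-in str.expandtabs performs the same computation for
single-line strings and can serve as a cross-check.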
lines: (lines)
The "lines" intrinsic returns the number of lines on the terminal device
on which the interpreter is running. If the interpreter is not running
on a terminal device, the returned value cannot be predicted.
cols: (cols)
The "cols" intrinsic returns the number of columns on the terminal device
on which the interpreter is running. If the interpreter is not running
on a terminal device, then the returned value cannot be predicted.
exit: (exit expr)
The "exit" intrinsic accepts one argument, which must evaluate to a
number, and causes the interpreter to exit with that number as the exit
value returned to the system.
complete: (complete expr)
The "complete" intrinsic performs filename completion. The function
accepts one argument which must evaluate to a string and is the partial
file or directory name to be completed. The function returns a list of
one or more strings. The first element of the list is always the result
of applying the completion algorithm to the argument, and will be the
same as the initial argument if the completion algorithm could not add
more characters to it. If more than one element is present in the list,
more characters may have been added to the initial argument, but it still
did not unambiguously name a file. The subsequent list elements will be
preformatted lines of text containing all the possible completions for
the first element organized into a table the width of the terminal
device, and may be printed as is.
> (complete "/usr/share/pe")
("/usr/share/perl/man/" "./ ../ man3/ whatis cat3/
")
A leading ~ in the argument to "complete" will trigger home directory
abbreviation expansion, similarly to csh. Home directory abbreviations
are expanded, but not completed. If the argument to "complete" is either
"~" or begins with "~/", these characters will be replaced with the path
to the current user's home directory. Arguments starting with text
matching the pattern ~[^/]+ will have those characters replaced with the
path to the specified user's home directory, but only if such a user
exists. The function will not attempt to
complete an incomplete home directory abbreviation.
input: (input expr1 expr2)
The "input" intrinsic reads data from a process into the active buffer.
The function accepts two arguments. The first argument must evaluate to
a whole number and specifies the line number to insert the new data
after, while the second argument must evaluate to a string specifying the
command-line to pass to the shell (/bin/sh) to launch the source process,
and so may be any expression that program understands. The number of
lines read is returned on success. If the child shell cannot find the
specified program or the process exits prematurely for any reason,
"input" returns 0. Any other errors encountered will stop evaluation.
Whitespace in the filename can either be present literally, or
represented with the \t and \b escapes the "substitute" and "match"
intrinsics recognize to represent the tab and the space character,
respectively. This means if you wish to use a literal \b or \t in the
filename, you must escape the backslash itself and use \\b or \\t
instead.
output: (output expr1 expr2 expr3)
The "output" intrinsic writes content from the active buffer to a
process. The function accepts three arguments. The first two arguments
must evaluate to positive integers and specify the range of lines,
inclusively, to be written to the process. The first index does not have
to be less than the second index, but the range of lines so specified
will be written to the child process in ascending order, regardless of
how the range was specified. The third argument must evaluate to a
string, and specifies the command-line to be passed to the shell to
launch the process, and so may be any expression that program
understands. The number of lines written to the process is returned upon
success. If the child shell cannot find the specified program or the
process exits prematurely for some other reason, "output" will return 0.
All other errors will stop evaluation.
Whitespace in the filename can either be present literally, or
represented with the \t and \b escapes the "substitute" and "match"
intrinsics recognize to represent the tab and the space character,
respectively. This means if you wish to use a literal \b or \t in the
filename, you must escape the backslash itself and use \\b or \\t
instead.
system: (system expr1)
The "system" intrinsic is an interface to the system() library function. The
function accepts one argument which must evaluate to a string, and passes
it to the shell for execution. The function returns the exit status of
the shell; 127 if execution of the shell failed; -1 if fork() or
waitpid() fails.
maxidx: (maxidx)
The "maxidx" intrinsic accepts no arguments, and returns the highest
possible index in the buffer that the interpreter will recognize.
chdir: (chdir expr1)
The "chdir" intrinsic changes the current directory. The function
accepts one argument which must evaluate to a string, specifying the new
current working directory. The function returns 1 on success or a string
describing the error condition on failure.
table: (table)
The "table" intrinsic returns a new associative array table, which may be
used to store lisp objects indexed by atomic keys. Table objects are
constants which evaluate to themselves. See the "hash", "unhash",
"keys", and "values" intrinsics for the details on using tables.
hash: (hash expr1 expr2 expr3)
The "hash" intrinsic is used to store data into a table. The function
accepts three arguments, the first of which must evaluate to the table to
be modified. The second argument must evaluate to any atom, while the
third argument can evaluate to an object of any type. The second
argument becomes the "key" associated with the third argument "value."
The "hash" intrinsic always returns the result of evaluating the third
argument. Note that "lookup" returns the empty list if a key has no
association, so there is no way to differentiate between a key associated
with the empty list and a key with no association.
> (set 't (table))
<TABLE#0>
> (hash t 'zero "zero")
"zero"
If you are going to insert a large number of objects (> 1000000) into a
table at once, you might consider using "gc_freq" to turn off garbage
collection before the insertions, and then turn it back on after.
> (setq t (table))
<TABLE#1>
> (for (n 0 999999) (hash t n n))
999999
keys: (keys expr1)
The "keys" intrinsic accepts one argument, which must evaluate to a
table, and returns a list of all the objects used as hash keys, in no
particular order. If the specified table is empty, the empty list is
returned.
> (set 't (table))
<TABLE#0>
> (hash t "0" "zero")
"zero"
> (hash t "1" "one")
"one"
> (keys t)
("1" "0")
values: (values expr1)
The "values" intrinsic accepts one argument, which must evaluate to a
table, and returns a list of all of the values stored in the table object
in no particular order. If the specified table is empty, the empty list
is returned.
> (set 't (table))
<TABLE#0>
> (hash t "0" "zero")
"zero"
> (hash t "1" "one")
"one"
> (hash t "2" "two")
"two"
> (values t)
("two" "zero" "one")
unhash: (unhash expr1 expr2)
The "unhash" intrinsic removes a key/value pair from a table. It accepts
two arguments, the first of which must evaluate to the table to be
modified, while the second of which must evaluate to the key of the
key/value pair to be removed. The "unhash" intrinsic always returns the
result of evaluating the second argument.
> (set 't (table))
<TABLE#0>
> (hash t "0" "zero")
"zero"
> (unhash t "0")
"0"
> (keys t)
()
lookup: (lookup expr1 expr2)
The "lookup" intrinsic is used to retrieve an object associated with an
atom in a table. The function accepts two arguments. The first argument
must evaluate to the table to be searched, while the second argument must
evaluate to an atom. If another lisp object is associated with the
second argument in the specified table, the associated object is
returned, otherwise the empty list is returned. Note that there is no
way to tell the difference between a key with no association and a key
associated with the empty list.
> (set 't (table))
<TABLE#0>
> (hash t "1" "one")
"one"
> (hash t "0" "zero")
"zero"
> (lookup t "1")
"one"
> (lookup t "0")
"zero"
> (lookup t "2")
()
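The ambiguity noted above, and one way around it, can be shown with a
Python dictionary standing in for a table. This is an analogy rather than
Munger code: the lookup helper is hypothetical, and the membership test
plays the role that inspecting the result of the "keys" intrinsic would
play in a Munger program.

```python
def lookup(table, key):
    """Return the value stored under key, or an empty list if absent."""
    return table.get(key, [])

table = {"1": "one", "nothing": []}

missing = lookup(table, "2")
stored_empty = lookup(table, "nothing")
# Both results are empty lists, so the return value alone cannot tell
# an unassociated key from a key associated with the empty list.
ambiguous = (missing == stored_empty)

# Checking the set of keys resolves the ambiguity.
has_nothing = "nothing" in table
has_two = "2" in table
```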
sqlite_open: (sqlite_open expr)
The "sqlite_open" intrinsic opens a SQLite database file. The function
accepts one argument which must evaluate to a string specifying the
filename for the database. It will be created if it does not exist. The
opened database object is returned upon success. Upon failure, a string
describing the error encountered is returned. Database objects are
constants which evaluate to themselves.
sqlite_close: (sqlite_close expr)
The "sqlite_close" intrinsic closes an open SQLite database file. The
function accepts one argument which must evaluate to the opened database
object to close. If the database is currently open the function closes
it and returns 1, otherwise, it returns 0.
sqlite_exec: (sqlite_exec expr1 expr2)
The "sqlite_exec" function executes SQL commands on an opened SQLite
database file. The function accepts two arguments, the first of which
must evaluate to the opened database object to query, and the second of
which must evaluate to a string specifying the SQL command to execute.
If the command is successfully executed, a list is returned. If the
command would not normally return any data, or the command returns the
empty set, an empty list is returned; otherwise, a list of lists is
returned. The first sublist will contain the column keys, while each
subsequent sublist will contain one row of returned table data. If a
null entry in a row is encountered, an empty string will be returned for
that field in its associated list. If an error is encountered during
execution of the SQL query, a string is returned describing the error.
Data returned by the SQLite interface is expressed as strings. Numbers
are returned as the string representation of the appropriate value, and
may be converted back to a number with the "digitize" intrinsic.
"sqlite_exec" provides the simplest interface to the SQLite library, but
another row-by-row interface is provided, which may be more convenient
when working with rows containing large chunks of data, and which is also
more efficient when the user wishes to invoke a SQL statement
multiple times on the same database. The row-oriented interface is
provided by the "sqlite_prepare", "sqlite_bind", "sqlite_step",
"sqlite_row", "sqlite_reset", and "sqlite_finalize" intrinsics, detailed
below. See http://www.sqlite.org for further details.
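The shape of the value returned by "sqlite_exec" can be reproduced with
Python's standard sqlite3 module. The sketch below is a model of the
documented return format only; the helper name exec_like and the table
used are hypothetical, and nothing here reflects the interpreter's
actual code.

```python
import sqlite3

def exec_like(conn, sql):
    """Return [] or [column keys, row, row, ...] with all data as strings."""
    cur = conn.execute(sql)
    rows = cur.fetchall()
    # Commands that return no data, and queries with an empty result
    # set, both yield the empty list.
    if cur.description is None or not rows:
        return []
    header = [col[0] for col in cur.description]
    # NULL entries become empty strings; numbers become their string form.
    body = [["" if v is None else str(v) for v in row] for row in rows]
    return [header] + body

conn = sqlite3.connect(":memory:")
created = exec_like(conn, "CREATE TABLE t (id INTEGER, note TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'first'), (2, NULL)")
result = exec_like(conn, "SELECT id, note FROM t ORDER BY id")
empty = exec_like(conn, "SELECT id FROM t WHERE id > 10")
```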
sqlite_prepare: (sqlite_prepare db sql)
The "sqlite_prepare" intrinsic is used to compile a SQL statement for
multiple uses with the alternative SQLite interface. The function
accepts two arguments, the first of which must evaluate to an opened
SQLite database object, while the second must evaluate to a string
containing the SQL to be compiled against that database. The function
returns a compiled SQL object upon success, or a string describing an
error condition, upon failure. The compiled sql object is an opaque
constant atom and may be passed as argument to "sqlite_bind",
"sqlite_step", "sqlite_row", "sqlite_reset", or "sqlite_finalize".
The SQL statement passed to this function may contain parameters of the
form ?, ?NNN, or :AAA, where NNN is a number and AAA is an alphanumeric
identifier. By using "sqlite_bind" values may be inserted in place of
these parameters. This is documented in the entry for "sqlite_bind".
> (setq db (sqlite_open "document.db"))
<db#1>
> (setq sql (sqlite_prepare db "SELECT path FROM document WHERE parent = 0"))
<sql#1>
sqlite_bind: (sqlite_bind sql index text)
Statements given to "sqlite_prepare" may contain parameter references in
place of SQL literals, of the forms ?, ?NNN, or :AAA where NNN is a
number, and AAA is an alphanumeric identifier. The "sqlite_bind"
intrinsic is used to set or change the values bound to these parameters.
Only SQL literal values (strings and numbers) may be parameterized;
column and table names may not. The function accepts
three arguments. The first argument must evaluate to a compiled SQL
statement returned by "sqlite_prepare". The second argument must
evaluate to the index position of the parameter to be substituted.
Parameter indices start at 1. The third argument must evaluate to a
string containing the replacement text. Upon success, the function
returns 1. Otherwise a string will be returned describing an error
condition.
Parameters are specified by their ordinal position in the SQL query.
Note that parameters with the same name all share the same index value,
that value being the index of the first occurrence of the parameter name
in the SQL statement. Using named parameters, the same value may be
substituted into a SQL statement in different locations.
Note that "sqlite_bind" must be invoked on a SQL statement after a call
to "sqlite_prepare" or "sqlite_reset", and before any call to
"sqlite_step".
> (setq db (sqlite_open "example.db"))
<DB#1>
> (setq sql (sqlite_prepare db "SELECT name FROM employees WHERE job = ?"))
<SQL#1>
> (sqlite_bind sql 1 "supervisor")
1
> (for (((setq more (sqlite_step sql)))
(more)
((setq more (sqlite_step sql))))
>> (print (sqlite_row sql))
>> (newline))
("Bob")
> (sqlite_reset sql)
1
> (sqlite_bind sql 1 "technician")
1
> (for (((setq more (sqlite_step sql)))
(more)
((setq more (sqlite_step sql))))
>> (print (sqlite_row sql))
>> (newline))
("Sally")
("Jeffrey")
("Boodles the cat")
("George")
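The same ideas, one statement text reused with different bound values and
a named parameter substituted in more than one place, can be tried with
Python's standard sqlite3 module. This is an analogy to the Munger
interface, not a description of it: the employees schema, including the
backup_job column, is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, job TEXT, backup_job TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Bob", "supervisor", "technician"),
     ("Sally", "technician", "clerk"),
     ("Jeffrey", "technician", "clerk")],
)

# One statement text, bound to a different value on each use.
query = "SELECT name FROM employees WHERE job = ? ORDER BY name"
supervisors = [row[0] for row in conn.execute(query, ("supervisor",))]
technicians = [row[0] for row in conn.execute(query, ("technician",))]

# A named parameter appearing twice receives the same bound value
# in both places.
named = ("SELECT name FROM employees "
         "WHERE job = :job OR backup_job = :job ORDER BY name")
either = [row[0] for row in conn.execute(named, {"job": "technician"})]
conn.close()
```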
sqlite_step: (sqlite_step sql)
The "sqlite_step" intrinsic is used to apply a compiled SQL object to its
database to generate the returned data for a single row of the result
set. The function accepts one argument which must evaluate to a compiled
SQL object generated by "sqlite_prepare", and returns 1 if a row of data
has been generated, or 0 if the data in the result set has been
exhausted, or a string describing an error condition, upon failure. If
the function returns 1, then "sqlite_row" may be invoked on the compiled
SQL object to retrieve the generated data for the current row. Further
invocations of "sqlite_step" will generate data for successive rows of
the result set, until the result set has been exhausted.
sqlite_row: (sqlite_row sql)
The "sqlite_row" intrinsic may be called after a successful invocation of
"sqlite_step" to retrieve a row of data from the result set of a SQL
query. The function accepts one argument, which must evaluate to a
compiled SQL object returned by "sqlite_prepare" which has had
"sqlite_step" invoked on it, and returns a list of strings upon success.
Each string represents the data for a single column in the current row of
the result set. Upon failure, the function returns a string describing
an error condition.
> (setq db (sqlite_open "document.db"))
<db#1>
> (setq sql (sqlite_prepare db "SELECT * FROM table1"))
<sql#1>
> (for (((setq more (sqlite_step sql))) (more) ((setq more (sqlite_step sql))))
>> (print (sqlite_row sql))
>> (newline))
("first column first row" "second_column first row")
("first column second row" "second_column second row")
sqlite_reset: (sqlite_reset sql)
The "sqlite_reset" intrinsic is used to reset a compiled SQL object after
"sqlite_step" has returned 0 when invoked on it. The function accepts
one argument, which must evaluate to a compiled SQL object. The function
returns 1 upon success, or a string describing an error condition, upon
failure. After a successful invocation of "sqlite_reset", "sqlite_step"
and "sqlite_row" may be invoked on the compiled SQL object to regenerate
the previous result set.
sqlite_finalize: (sqlite_finalize sql)
The "sqlite_finalize" intrinsic frees the resources associated with a
compiled SQL object. The function accepts one argument, which must
evaluate to a compiled SQL object generated by "sqlite_prepare" and
returns 1 upon success, or a string describing an error condition, upon
failure. "sqlite_finalize" MUST BE CALLED ON EVERY COMPILED SQL OBJECT
ASSOCIATED WITH A PARTICULAR DATABASE, BEFORE THAT DATABASE MAY BE CLOSED
BY "sqlite_close". Failure to do this may cause incomplete updates to be
rolled back and transactions to be canceled. The garbage collector will
call "sqlite_finalize" on any compiled SQL objects it deallocates, but
implicit deallocation should not be relied upon, as there is no guarantee
the database object will not be garbage collected before the SQL
statements, which may result in corruption of the database.
sqlp: (sqlp expr)
The "sqlp" intrinsic accepts one argument, which may evaluate to any type
of object, and returns 1 if that object is a compiled SQL object
generated by "sqlite_prepare". Otherwise, it returns 0.
stack: (stack [expr])
The "stack" intrinsic creates a new stack object. The function accepts
one optional argument, which, if present, must evaluate to a positive
integer specifying a number of elements to preallocate on the stack.
Each element will be set to the empty list. Omitting the optional
argument is the same as invoking (stack 0). The newly-created stack
object is returned.
> (set 's (stack))
<STACK1>
> (used s)
0
> (set 's (stack 10))
<STACK2>
> (used s)
10
push: (push expr1 expr2)
The "push" intrinsic pushes an object onto the top of a stack. The
function accepts two arguments, the first of which must evaluate to the
stack to be affected, while the second argument may be any lisp object.
The function returns the result of evaluating the second argument.
> (set 's (stack))
<STACK1>
> (push s 'foo)
foo
> (index s 0)
foo
pop: (pop expr)
The "pop" intrinsic removes an object from the top of a stack. The
function accepts one argument, which must evaluate to the stack to be
affected. The removed object is returned. When the stack is empty, the
empty list is returned. Note the only way to tell the difference between
an empty stack and one which has the empty list stored in its top
element, is to invoke the "used" intrinsic.
> (set 's (stack))
<STACK1>
> (push s 'foo)
foo
> (push s 'bar)
bar
> (pop s)
bar
> (pop s)
foo
> (pop s)
()
> (used s)
0
unshift: (unshift expr1 expr2)
The "unshift" intrinsic adds a new element onto the bottom of a stack.
The function accepts two arguments, the first of which must evaluate to a
stack, while the second may evaluate to any lisp object. The result of
evaluating the second argument is prepended to the stack, and also
returned by the function.
> (set 's (assign (stack) '(1 2 3 4)))
<STACK1>
> (flatten s)
(1 2 3 4)
> (unshift s 0)
0
> (flatten s)
(0 1 2 3 4)
shift: (shift expr)
The "shift" library function removes an element from the bottom of a
stack. The function accepts one argument, which must evaluate to a stack.
If the stack is empty, the empty list is returned, otherwise the removed
element is returned.
> (set 's (assign (stack) '(0 1 2 3 4)))
<STACK1>
> (flatten s)
(0 1 2 3 4)
> (shift s)
0
> (shift s)
1
> (flatten s)
(2 3 4)
index: (index expr1 expr2)
The "index" intrinsic fetches the object stored at a specified index in a
stack. The function accepts two arguments, the first of which must
evaluate to the stack to be accessed, while the second must evaluate to a
whole number specifying the desired element on the stack. The bottom
location on a stack has an index value of zero, and the index value of
the object on the top of the stack is one less than the number of
elements on the stack. Specifying an index value of less than zero, or
more than the index of the last element on the stack, generates an error
which stops evaluation.
> (set 's (assign (stack) '(a b c)))
<STACK1>
> (index s 0)
a
> (index s 2)
c
> (index s 3)
<INDEX>: index 3 out of range.
>
topidx: (topidx expr)
The "topidx" intrinsic accepts one argument which must evaluate to a
stack, and returns the index value of the top element on the stack, or -1
if the stack is empty.
> (topidx (stack 1))
0
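A short sketch of the two remaining cases described above, an empty stack
and a stack holding several elements. It assumes only the "stack" and
"assign" functions documented in this section.
> (topidx (stack))
-1
> (topidx (assign (stack) '(a b c)))
2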
used: (used expr)
The "used" intrinsic obtains the number of elements currently on a
specified stack. The function accepts one argument which must evaluate
to the stack to be queried, and returns a whole number describing the
number of elements currently on that stack.
> (used (stack 10))
10
store: (store expr1 expr2 expr3)
The "store" intrinsic stores an object into a specified element of a
stack. The function accepts three arguments, the first of which must
evaluate to the stack to be affected, while the second argument must
evaluate to a whole number specifying the element of the stack to
overwrite, and the third argument may evaluate to any lisp object. The
result of evaluating the third argument is stored at the specified index
of the specified stack, and is also returned. Specifying an index value
of less than zero, or more than
the index of the last element on the stack, generates an error which
stops evaluation.
> (set 's (stack 3))
<STACK1>
> (store s 0 'foo)
foo
> (store s 1 'bar)
bar
> (store s 2 'wumpus)
wumpus
> (flatten s)
(foo bar wumpus)
> (store s 3 'error)
<STORE>: index 3 out of range.
>
clear: (clear expr1 expr2)
The "clear" intrinsic allows the user to remove and discard multiple
elements from the top of a specified stack. The function accepts two
arguments, the first of which must evaluate to a stack object, while the
second must evaluate to a whole number. The second argument specifies
the number of elements to remove from the top of the stack. "clear"
discards the removed elements and always returns 1.
> (setq s (assign (stack) '(1 2 3 4 5)))
<STACK#1>
> (clear s 4)
1
> (flatten s)
(1)
> (assign s '(1 2 3 4 5))
<STACK#1>
> (clear s (used s))
1
> (flatten s)
()
assign: (assign expr1 expr2)
The "assign" library function allows the user to store all the elements
of a list into a stack at once, starting at index zero. The previous
contents of each affected element are overwritten. If the number of
elements on the stack is too small to hold all the objects in the list,
the function pushes new elements onto the stack until it can hold the
entire list, before performing the assignments. If the number of
elements on the stack is greater than the number of items in the list,
those elements on the stack starting at the index value equal to the
length of the list, and continuing to the top of stack, inclusive, remain
unaffected. The function returns the stack object affected.
> (set 's (assign (stack) '(a b c d e)))
<STACK1>
> (used s)
5
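The case where the list is shorter than the stack can be sketched as
follows. Only the first two elements are overwritten, and the elements
above them are left alone, as described above. The printed stack name is
illustrative.
> (set 's (assign (stack) '(a b c d e)))
<STACK1>
> (assign s '(x y))
<STACK1>
> (flatten s)
(x y c d e)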
flatten: (flatten expr)
The "flatten" library function returns the elements of a stack in
ascending order, as a list. The function accepts one argument which must
evaluate to the stack to be queried.
> (flatten (assign (stack) '(a b c d e)))
(a b c d e)
child_open: (child_open expr [expr])
The "child_open" intrinsic opens a full-duplex connection to another
process which may be communicated with by the "child_read" and
"child_write" intrinsics. Only one child process may be running at any
one time. The function accepts one or two arguments.
With the one-argument form, the lone argument must evaluate to a string
specifying a command line to pass to the shell (/bin/sh) to run. The
function will return 1 if the child can be created, otherwise it will
return a string describing an error condition. If the child process
cannot find the specified program, or is not able to run it, it will exit
and print an error to stderr. The user may check for a successful launch
of the specified program by invoking "child_running" after invoking
"child_open".
The two-argument form of the function is used to communicate with another
process, local or remote, over a TCP socket, or a local process over a
UNIX domain socket. To open a local or remote TCP connection, the first
argument must evaluate to a string specifying the local or remote
hostname or IP address. For UNIX domain connections the first argument
must evaluate to a string representing the pathname in the filesystem to
a UNIX domain socket. For TCP connections, the second argument must
evaluate to either a string specifying a service defined in
/etc/services, such as "http", or a fixnum specifying a port number to
attempt to connect to. For UNIX domain connections, the second argument
must be 0.
If the interpreter successfully opens a connection to the specified
entity, then it returns 1, otherwise it will return a string describing
an error condition.
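The two-argument forms might be used as in the sketch below. The host
name, port number, and socket path are hypothetical, and the return
values of 1 assume each connection succeeds; on failure a string
describing the error would be returned instead. Each connection is
closed before the next is opened, because only one child may be active
at a time.
> (child_open "www.example.com" "http") ; TCP, service name
1
> (child_close)
1
> (child_open "127.0.0.1" 8080) ; TCP, port number
1
> (child_close)
1
> (child_open "/tmp/example.sock" 0) ; UNIX domain socket
1
> (child_close)
1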
child_running: (child_running)
The "child_running" intrinsic accepts no arguments, and returns 1 if a
child process is running; otherwise, the function returns 0.
child_ready: (child_ready)
The "child_ready" intrinsic accepts no arguments, and returns 1 if data
is waiting to be read from a child process; otherwise the function
returns 0. If a child process has not been started with the "child_open"
intrinsic, 0 is returned as well.
child_wait: (child_wait)
The "child_wait" intrinsic accepts no arguments and blocks until data
from a child process is ready for "child_read" to consume, at which point
it returns 1. If no child process is running, it returns 0 immediately.
child_close: (child_close)
The "child_close" intrinsic terminates a child process launched by
"child_open". The function accepts no arguments, and always returns 1.
If there is no child process running, the function does nothing.
child_eof: (child_eof)
The "child_eof" intrinsic closes the writable half of the connection to
a child process opened with "child_open". The function accepts no
arguments and returns 1 upon success. The child process will read EOF on
its next read from its standard input. This can be useful when working
with programs which buffer data. Any subsequent attempt to write to the
child with "child_write" will generate an error which will stop
evaluation. Note that a connection which has had its writable half
closed with "child_eof" still needs to be fully-closed by invoking
"child_close" on it, when one is finished with the child.
For an example of the use of this intrinsic, consider a situation where
one is sending lines of text to the fmt utility. This program will
buffer input text until it has enough to print a full line of output,
unless it encounters a blank line or EOF, when any buffered text will be
output to form a short line. In the situation where one has sent data to
the utility, and it has formatted all the data except for the tail end
which is not long enough to form a complete line, the program will block
forever until it reads more data or EOF from stdin. By invoking
"child_eof" we can cause fmt to read EOF on stdin and print the last of
its buffered output. If we invoked "child_close" we would close both
halves of the full-duplex connection, and so be unable to read that last
short line of data back from the child.
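The fmt scenario above can be sketched as the following session. It
assumes fmt(1) is on the shell's search path, and the string shown for
"child_read" is what fmt would be expected to emit for a single short
line.
> (child_open "fmt")
1
> (child_write "a short line of text" (char 10))
1
> (child_eof) ; fmt now reads EOF and flushes its buffer
1
> (child_read)
"a short line of text
"
> (child_close)
1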
child_write: (child_write expr1 ...)
The "child_write" intrinsic writes a list of strings to a child process
launched by "child_open". The function accepts one or more arguments,
which must all evaluate to strings, and writes them to the child process.
The "child_write" function returns 1 on success. Any errors encountered
will stop evaluation.
> (child_open "/usr/local/bin/munger")
1
> (for (a 1 10)
>> (child_write a (char 10)))
1 ; this is the return value of the "child_write"
> (print (child_read))
1
2
3
4
5
6
7
8
9
10
child_read: (child_read)
The "child_read" intrinsic reads up to 1024 characters of data emitted by
a child process, and returns it as a string. The function accepts no
arguments. If "child_read" is invoked when no child process is running,
an error will be generated which will stop evaluation. If no data can be
read after 30 seconds, "child_read" will return the empty string. Zero
is returned if EOF is read from the child process.
> (child_open "munger")
1
> (child_write "(set 'foo 'bar)" (char 10))
1
> (child_read)
"bar
"
1
clearscreen: (clearscreen)
The "clearscreen" intrinsic accepts no arguments, and if stdout is
connected to a terminal device, clears the screen. 1 is returned on
success. Any error encountered will stop evaluation.
clearline: (clearline expr1 expr2)
The "clearline" intrinsic accepts two arguments, both of which must
evaluate to whole numbers. If stdout is connected to a cursor-
addressable terminal device, it will clear the line specified by the
first argument, starting at the column specified by the second argument,
to the end of the line. The coordinates for screen lines and columns
start at 0. The position 0,0 is the top leftmost position on the screen.
The maximum values for the terminal device can be ascertained with the
"lines" and "cols" intrinsics. 1 is returned on success. Any error
encountered will stop the interpreter.
goto: (goto expr1 expr2)
The "goto" intrinsic accepts two arguments, both of which must evaluate
to whole numbers. If stdout is connected to a cursor-addressable
terminal device, the function places the cursor at the specified screen
coordinates. The coordinates for screen lines and columns start at 0.
The position 0,0 is the top leftmost position on the screen. The maximum
values for the terminal device can be ascertained with the "lines" and
"cols" intrinsics. 1 is returned on success. Any error encountered will
stop the interpreter.
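A script might combine the screen-addressing intrinsics as in this
sketch, which draws a bold title on the top line and leaves the cursor at
the start of the second line. It assumes stdout is a cursor-addressable
terminal, and the title text is arbitrary.
(clearscreen)
(goto 0 0) ; top leftmost position
(boldface)
(print "Status")
(normal)
(clearline 1 0) ; clear all of the second line
(goto 1 0)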
getchar: (getchar)
The "getchar" intrinsic accepts no arguments. It does a blocking read on
stdin until a character can be read, when it returns a whole number in
the range of 0-255, representing the character code of the input
character. If stdin is a terminal device, it will be taken out of
canonical mode, so that unbuffered, uninterpreted data may be read.
If EOF is encountered, "getchar" returns -1. If a SIGWINCH is received
by the interpreter while waiting for data, the function returns -2
immediately. Any other error encountered will cause the function to
return a string describing the error. The "getchar" intrinsic is
intended for use in creating interfaces which use character-I/O. If you
just want to read a character from stdin, and stdin is redirected onto a
file, use (getchars 1) instead.
pushback: (pushback expr1)
The "pushback" intrinsic accepts one argument which must evaluate to a
whole number in the range 0-255. It causes the next subsequent
invocation of "getchar" to return the number pushed back. After
returning this value, subsequent invocations of "getchar" will read from
the terminal again. 1 is returned on success. Any error encountered
will stop evaluation. Note that "pushback" does not work with "getchars",
only "getchar".
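A minimal sketch of the behavior described above, using 65 (the code for
"A") as an arbitrary value. The pushed-back value is returned without
reading from the terminal.
> (pushback 65)
1
> (getchar)
65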
display: (display expr1 expr2 expr3)
The "display" intrinsic aids the implementation of interactive buffer
inspection tools. The function accepts three arguments, all of which
must evaluate to whole numbers. If stdout is connected to a cursor-
addressable terminal device, the function will print buffer lines, one
per screenline, starting with the buffer line whose index value
corresponds to the first argument, and continuing with subsequent buffer
lines, until it has printed one fewer buffer lines than there are
screen lines, or until it runs out of buffer lines, if there are not
enough. If it runs out of lines, the function will print single tilde
characters on each of the remaining screenlines, except for the last. If
the first argument is zero, "display" prints a full screen of tildes, and
returns.
The second argument specifies a buffer column to start printing with. If
non-zero, only the portions of the lines from the specified column onward
will be printed. Note the second argument does not specify a screen
column, but a buffer column, and that tab expansion is performed before
the slice is taken, according to the value of the third argument,
discussed below. Lines will always be printed starting at screen column
0. Lines longer than the terminal width are truncated to the terminal
width.
The third argument specifies the tabstop periodicity. A value of 4, for
example, indicates that tabs occur every 4 characters in a line. Any
tabs found in the specified lines will be expanded according to this
value before truncation and printing. For the details of tab expansion,
see the description of the "expand" intrinsic, elsewhere in this
document.
1 is returned on success. Any errors encountered stop evaluation.
boldface: (boldface)
The "boldface" intrinsic turns on boldface mode of the terminal connected
to stdout. The function accepts no arguments, and always returns 1.
normal: (normal)
The "normal" intrinsic turns off boldface mode and resets the colors of
the terminal connected to stdout to their default values. The function
accepts no arguments and always returns 1.
fg_black:
fg_red:
fg_green:
fg_yellow:
fg_blue:
fg_magenta:
fg_cyan:
fg_white:
bg_black:
bg_red:
bg_green:
bg_yellow:
bg_blue:
bg_magenta:
bg_cyan:
bg_white:
These sixteen functions set the foreground or background color of the
terminal connected to stdout to the specified color. These functions
accept no arguments and always return 1.
hide: (hide)
The "hide" intrinsic accepts no arguments, and if stdout is connected to
a terminal device capable of having its cursor made invisible, the
function hides the cursor.
show: (show)
The "show" intrinsic accepts no arguments, and if stdout is connected to
a terminal device capable of having its cursor made invisible, the
function shows the cursor.
pause: (pause expr1)
The "pause" intrinsic accepts one argument which must evaluate to a whole
number no larger than 999999, specifying a number of microseconds for the
interpreter to sleep. The function returns when data is waiting on
stdin or when the time value has expired. See the "sleep" intrinsic if
you need to sleep for longer periods. "sleep" is also much less
processor intensive.
scrollup: (scrollup)
If the device connected to stdin is cursor-addressable, the "scrollup"
intrinsic scrolls the screen lines upward by one line. The line at the
top of the screen is lost, while the line at the bottom of the screen
becomes blank.
scrolldn: (scrolldn)
If the device connected to stdin is cursor-addressable, the "scrolldn"
intrinsic scrolls the screen lines downward by one line. The last line
is lost, while the first line becomes blank.
insertln: (insertln)
If the device connected to stdin is cursor-addressable, the "insertln"
intrinsic scrolls the lines on the screen from the line the cursor is on,
to the last line on the screen, inclusive, downward by one line. The line
the cursor is on is cleared, while the last line on the screen is lost.
printer: (printer)
The "printer" intrinsic turns on the lisp printer, if it has been turned
off with an invocation of the "noprinter" intrinsic. The result of
evaluating an expression at toplevel is discarded unless the printer is
turned on. The printer is turned on by default.
noprinter: (noprinter)
The "noprinter" intrinsic turns off the lisp printer, if it is turned on.
The result of evaluating an expression at toplevel is discarded unless
the printer is turned on. The printer is turned on by default.
shexec: (shexec expr)
The "shexec" intrinsic overlays a new process overtop of the interpreter
process, similarly to how the command of the same name works in the
shell. The function accepts one argument which must evaluate to a
string, and attempts to use the execv() system call to run the shell
(/bin/sh -c) with the command-line specified in the argument string.
Upon success, the Munger interpreter process is abandoned, and the shell
process starts running with the same process id. It "replaces" the
interpreter. This is useful when a script has finished its work and
wishes to run another program to do some further processing. If the new
process image cannot be exec-ed, then the function returns a string
describing the error.
exec: (exec expr ...)
The "exec" intrinsic behaves similarly to the "shexec" intrinsic. It
overlays a new process image overtop of the interpreter process. The
function accepts one or more arguments, which must all evaluate to
strings, and interprets them as a command followed by its command-line
arguments. It DOES NOT pass its arguments to the shell for
interpretation. Upon success, the Munger interpreter process is
abandoned, and the new process starts running with the same process id.
It "replaces" the interpreter. This is useful when a script has finished
its work and wishes to run another program to do some further processing.
If the new process image cannot be exec-ed, then the function returns a
string describing the error.
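The difference between the two intrinsics can be sketched with the
alternatives below. The commands and paths are only illustrative, and
because a successful call never returns, a script would use one form or
the other, not both.
; The shell interprets the glob and the pipe:
(shexec "ls -l *.munger | wc -l")
; No shell is involved; each argument is passed to the program verbatim:
(exec "/bin/ls" "-l" "/tmp")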
truncate: (truncate expr)
The "truncate" intrinsic alters the length of a file connected to the
standard output of the interpreter. The function accepts one argument
which must evaluate to an integer specifying the new length of the
file. If the specified length is greater than the length of the file,
the file will be extended and the extended portion filled with zeros.
The function returns 1 on success, or if an error is encountered, a
string describing the error.
> (with_output_file_appending "HOW_TO_EXTEND_IT"
>> (when (writelock)
>>> (truncate 100)))
1
> (with_input_file "HOW_TO_EXTEND_IT"
>> (setq line (getchars 1000)))
"ADDING INTRINSICS TO THE INTERPRETER
------------------------------------
Under the hood, Munger is"
> (length line)
100
dynamic_let: (dynamic_let (symbol expr) expr1 [expr2...])
Intrinsic "dynamic_let" allows the creation of bindings with dynamic
scope, which is to say, such a binding is globally visible for the time
the dynamic_let is executing. A dynamically-scoped variable in Munger is
a global which will cease to exist when the "dynamic_let" exits, or if a
global of the same name existed previously, revert to its former value.
The bindings created by "dynamic_let" cannot be captured by closures. A
lexical binding of the same name as a dynamic binding will shadow the
dynamic binding. This means a binding in an exterior "let" expression
can shadow the binding in an interior "dynamic_let". The intrinsic
accepts two or more arguments, the first of which is not evaluated and
must be a two-element list consisting of a symbol followed by any
expression. The expression will be evaluated in the current scope and
the result bound to the symbol for the duration of the "dynamic_let".
The remaining arguments are then evaluated, in order, with the new
binding in effect. When the expression terminates the dynamic binding is
removed. The value of the last expression evaluated is returned by
"dynamic_let".
The purpose of "dynamic_let" is to allow one to make temporary changes to
global variables without having to save the previous value and restore it
afterwards. By definition, a global is a variable one wishes to be
globally visible. One would only knowingly mask it with a local binding
if one wished to have some local code find a different value bound to the
same symbol. A "let" is sufficient for this purpose, but if functions
invoked by the local code must also see the changed value, then
"dynamic_let" must be used.
> (defun f () a)
<CLOSURE#57>
> (dynamic_let (a 10)
>> (f))
10
> (f)
evaluate: symbol a not bound.
f: error evaluating body expression 1.
> (defun a (x) 'foobar)
<CLOSURE#58>
> (defun g (x) (a x))
<CLOSURE#59>
> (dynamic_let (a (lambda (x) x))
>> (g 23))
23
> (g 23)
foobar
> (dynamic_let (a 10)
>> (defun h () a))
<CLOSURE#60>
> (h)
evaluate: symbol a not bound.
h: error evaluating body expression 1.
; Exterior "let" is shadowing interior "dynamic_let":
> (let ((a 12))
>> (dynamic_let (a 10)
>>> (print a)
>>> (newline)))
12
1 ; return value of "newline"
basename: (basename expr)
The "basename" intrinsic returns just the filename portion of a specified
path. The function accepts one argument which must evaluate to a string
and returns a string. If the argument string does not have a filename
component, "basename" returns the empty string. NOTE, the path specified
by the argument string does not have to exist in the filesystem. This is
a string manipulation function only.
dirname: (dirname expr)
The "dirname" intrinsic returns just the directory portion of a specified
path. The function accepts one argument which must evaluate to a string
and returns a string. If the argument does not have a directory
component "." is returned. NOTE, the path specified by the argument
string does not have to exist in the filesystem. This is a string
manipulation function only.
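The two path functions might be used as in this sketch. The paths are
arbitrary and need not exist.
> (basename "/usr/local/bin/munger")
"munger"
> (dirname "/usr/local/bin/munger")
"/usr/local/bin"
> (dirname "munger")
"."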
chmod: (chmod expr1 expr2)
The "chmod" intrinsic may be used to change the permissions associated
with a specified file. The function accepts two arguments, both of which
must evaluate to strings. The first argument must be a new mode
specification of the form accepted by chmod(1), in either symbolic or
octal form. The second argument must evaluate to the filename of the
file to be affected. "chmod" returns 1 on success; otherwise it returns
a string describing an error condition.
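A sketch of both mode forms follows. The file names are hypothetical,
and the return values of 1 assume the files exist and are owned by the
invoking user. Note that the octal mode is passed as a string.
> (chmod "u+x" "script.munger") ; symbolic form
1
> (chmod "0644" "notes.txt") ; octal form
1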
chown: (chown expr1 expr2 expr3)
The "chown" intrinsic may be used to change the owner (if the euid of the
interpreter is the superuser's), and/or the group associated with a
specified file. The function accepts three arguments, which all must
evaluate to strings. The first argument must be the name of the new
owner, or the empty string if the owner is not to be changed. The second
argument must be the name of the new group, or the empty string if the group
is not to be changed. The third argument must be the filename of the
file to be affected. "chown" returns 1 upon success; otherwise, it
returns a string describing an error condition. Only the superuser may
change the ownership of files.
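The use of the empty string can be sketched as below. The user, group,
and file names are hypothetical, and the return values of 1 assume the
caller has the necessary privileges.
> (chown "" "staff" "notes.txt") ; change the group only
1
> (chown "root" "" "notes.txt") ; change the owner only (superuser)
1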
crypt: (crypt expr)
The "crypt" intrinsic is a frontend to the crypt(3) library function. It
accepts one argument which must evaluate to a string and encrypts it
using the default scheme used to encrypt passwords in the user database.
The encrypted string is returned. Any error encountered by crypt(3) will
stop evaluation.
checkpass: (checkpass expr1 expr2)
The "checkpass" intrinsic verifies that a user name and password pair is
correct for the system on which it is running. The function accepts two
arguments which must both evaluate to strings, the first of which
specifies the user name, and the second of which specifies the password
of that user. The function will always return 0 unless the euid of the
interpreter is 0 (the superuser), when it will return 1 if the user name
and password are correct, or 0, if they are not.
setuid: (setuid expr)
The "setuid" intrinsic is used to change the uid and euid of the
interpreter process. It accepts one argument which must evaluate to a
string specifying the user to change to. The function will return a
string describing an error condition, unless the euid of the interpreter
is 0 (the superuser) and the requested user exists, when it will change
to the specified user and return 1. It is not possible to switch back to
the superuser after invoking "setuid" to become a non-privileged user.
seteuid: (seteuid expr)
The "seteuid" intrinsic is used to change the euid of the interpreter
process if it is running set-user-id. It accepts one argument which must
evaluate to a string specifying the user to change to, and returns 1 upon
success or a string describing an error condition upon failure. The euid
may be switched between the real uid and the set-user-id, of a set-user-
id interpreter. If the interpreter is not running setuid, then the uid,
the euid, and the saved set-user-id will all be the same and therefore it
will not be possible to change the euid.
getuid: (getuid)
The "getuid" intrinsic accepts no arguments and returns a two element
list consisting of the name of the real user the interpreter is running
as, followed by the numerical uid of that user. If an unforeseen error
occurs which prevents the interpreter from determining its own uid, the
function returns the empty list.
setgid: (setgid expr)
The "setgid" intrinsic is used to change the gid of the interpreter
process. It accepts one argument which must evaluate to a string
specifying the group to change to. The function returns 1 upon success,
or a string describing an error condition upon failure.
setegid: (setegid expr)
The "setegid" intrinsic is used to change the effective gid of the
interpreter process. It accepts one argument which must evaluate to a
string specifying the group to change to. The function returns 1 upon
success, or a string describing an error condition upon failure. The
egid may be switched between the real gid and the saved set-group-id, of
a set-group-id interpreter. If the interpreter is not running
set-group-id, then both the saved set-group-id and the real gid will be
the same.
geteuid: (geteuid)
The "geteuid" intrinsic accepts no arguments and returns a two element
list consisting of the name of the effective user the interpreter is
running as, followed by the numerical uid of that user. If an unforeseen
error occurs which prevents the interpreter from determining its own
euid, the function returns the empty list.
getgid: (getgid)
The "getgid" intrinsic accepts no arguments, and returns a two element
list consisting of the name of the primary group of the user the
interpreter is running as, followed by the numerical gid of that group.
If an unforeseen error occurs which prevents the interpreter from
determining its gid, then the function returns the empty list.
seek: (seek expr1 expr2 expr3)
The "seek" intrinsic is used to move the file pointer of a file connected
to one of the standard descriptors. The function accepts three
arguments, the first of which must evaluate to 0, 1, or 2, and specifies
whether the seek operation should affect the file pointer of stdin,
stdout, or stderr, respectively. The second argument must evaluate to an
integer specifying the number of characters by which the file pointer
should be adjusted, and the third argument must evaluate to one of the
following three strings: "SEEK_SET", "SEEK_CUR", or "SEEK_END", and
specifies the position which the adjustment is relative to. "SEEK_SET"
indicates the seek operation should seek from the beginning of the file.
"SEEK_CUR" indicates the seek operation should seek from the current
position of the file pointer. "SEEK_END" indicates the seek operation
should seek from the end of the file. Upon success the function returns
the number of characters from the beginning of the file corresponding to
the new location of the file pointer. Further reads or writes to the
stream will happen relative to the new position of the file pointer. Any
errors encountered will stop evaluation. Seeking past the end of a file
connected to stdout will cause the file to be automatically extended,
with the new portion filled with zeroes.
NOTE: calling "getline" on stdin will cause subsequent invocations of
"seek" to return unexpected values, as "getline" does its own buffering.
If the user wishes to intersperse the reading of data from stdin and
calls to "seek" on stdin, the "getchars" intrinsic must be used to
perform the read operations.
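As a sketch of a seek followed by a read, the session below reuses the
file from the "truncate" example, whose text begins with "ADDING
INTRINSICS". It assumes "with_input_file" accepts more than one body
expression and returns the value of the last one. The seek skips the
first seven characters, and "getchars" is used for the read, as the note
above requires.
> (with_input_file "HOW_TO_EXTEND_IT"
>> (seek 0 7 "SEEK_SET")
>> (getchars 10))
"INTRINSICS"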
getchars: (getchars expr [expr])
The "getchars" intrinsic is used to read a specific number of characters
from the stream connected to stdin. It accepts one or two arguments
which both must evaluate to fixnums. The first must evaluate to a whole
number specifying the number of characters to read, while the second, if
present, must evaluate to a positive value specifying a number of seconds
after which a timeout will occur. Upon success, the characters read are
returned as a string. Any errors encountered will stop evaluation. If
"getchars" encounters EOF while reading, or a timeout occurs, fewer than
the desired number of characters may be returned. If EOF was
encountered, any subsequent invocation of "getchars" will return
fixnum 0. Note that invoking "getchars" with a first argument of 0
always causes it to immediately return the empty string without even
attempting to read from stdin.
Mixing calls to "getline" and "getchars" will produce unexpected
results, as "getline" performs its own input buffering. You MAY mix
calls to "getchars" with calls to "getline_ub", however.
The presence of a timeout value causes "getchars" to operate as follows.
The function calls the read() system call until it has either read the
desired number of characters, or it encounters EOF on the input stream.
This is useful when stdin is connected to a socket. If any invocation of
read() takes longer than the number of seconds specified by the timeout
value, it will be interrupted, and "getchars" will return all the
characters read so far. If no characters have been read, then "getchars"
will return the empty string. This is the only other circumstance in
which it will return the empty string. When invoked without a timeout
value, but with a positive first argument, the function will block
indefinitely until it can read at least one character or EOF, and either
return a non-empty string, or 0 on EOF. This means that after stdin has
been connected to a socket with "accept", a return value of "" from
(getchars 100 5) means a timeout occurred before any data was read from
the socket, while a return value of 0 indicates EOF.
When reading from a terminal device in canonical mode, when a timeout
value has been specified, the empty string may be returned even though
the user has typed some characters, because the terminal driver will not
return any character data to the interpreter in canonical mode until a
carriage return or a newline is input.
readlock: (readlock)
The "readlock" intrinsic is used to obtain a shared lock on a file
connected to stdin. The function accepts no arguments and returns 1 upon
success, or 0 if the file is already locked by another process. If the
lock cannot be obtained for any other reason, the function returns a
string describing the error.
writelock: (writelock)
The "writelock" intrinsic is used to obtain an exclusive lock on a file
connected to stdout. The function accepts no arguments and returns 1
upon success, or 0 if the file is already locked by another process. If
the lock cannot be obtained for any other reason, the function returns a
string describing the error.
unlock: (unlock expr)
The "unlock" intrinsic is used to release a lock obtained by the
"readlock" or "writelock" intrinsics. The function accepts one argument
which must evaluate to either 0 or 1, specifying whether a lock on stdin
or stdout is to be released, respectively.
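The locking intrinsics might be combined as in this sketch, which extends
the "truncate" example by releasing the lock once the work is done. It
assumes "when" accepts more than one body expression. The return value
of "unlock" is not specified here, so no output is shown.
(with_output_file_appending "HOW_TO_EXTEND_IT"
(when (writelock) ; exclusive lock on stdout
(truncate 100)
(unlock 1))) ; 1 selects the lock held on stdout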
hostname: (hostname)
The "hostname" intrinsic accepts no arguments and returns the name of
the host it is running on, or a string describing an error condition it
encountered while attempting to retrieve the hostname.
symlink: (symlink expr1 expr2)
The "symlink" intrinsic creates a symbolic link to a pre-existing
filesystem entity. The function accepts two arguments, both of which
must evaluate to strings. The first argument specifies the pre-existing
filesystem entity, while the second argument names the symbolic link.
"symlink" returns 1 upon success, or a string describing an error
condition, if the symlink() system call failed.
gecos: (gecos expr)
The "gecos" intrinsic queries the user database for the value of the
gecos field for a specified account. The function accepts one argument
which must be a string specifying the user name associated with the
account. If such a user exists, the value of the gecos field is returned
as a string, otherwise the empty string is returned. The gecos field is
used to store personal information about the user. Traditionally, it
held 4 comma-separated fields containing the user's full name, office
location, work phone number, and home phone number, but the system does
not care what goes into the gecos field. Most administrators nowadays
simply place the user's full name there.
record: (record expr)
The "record" intrinsic creates a fixed-size unidimensional array.
Records are a more space-efficient means of representing fixed-size
aggregate types. The one argument must evaluate to a positive integer
specifying the size of the array. The items of the record are all preset
to the empty list upon creation.
getfield: (getfield expr1 expr2)
The "getfield" intrinsic is used to retrieve an item from a record. It
accepts two arguments, the first of which must evaluate to a record,
while the second must evaluate to a positive integer specifying the index
of the desired item in the array. The object at that index is returned.
setfield: (setfield expr1 expr2 expr3)
The "setfield" intrinsic is used to insert an object into a record. The
function accepts three arguments. The first argument must evaluate to
the record to be affected. The second argument must evaluate to a
positive integer specifying the index location to overwrite. The third
argument can evaluate to any lisp object and will be inserted into the
first argument at the location specified by the second argument. The
evaluated third argument is returned.
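The three record intrinsics might be combined as in the following
minimal sketch. The variable name "rec" and the stored value are
arbitrary. Index 1 is used because it falls inside a three-item record
whether indices are counted from 0 or from 1, which this entry does not
state.

```lisp
; Sketch only: a three-slot record used as a small fixed-size tuple.
(setq rec (record 3))        ; every slot starts out as the empty list
(setfield rec 1 42)          ; store 42 at index 1; 42 is also returned
(print (getfield rec 1))     ; fetch the object stored at index 1
(newline)
```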
extend: (extend expr1 expr2)
The "extend" intrinsic adds a new binding to the currently-active local
environment. The function accepts two arguments, the first of which must
evaluate to a symbol, while the second may evaluate to any lisp
object. The symbol is added to the local environment and bound to the
result of evaluating the second argument, replacing any pre-existing
local binding to that symbol. Note that if the pre-existing symbol
occurs free in the current lexical environment, then it is shadowed, not
replaced. The result of evaluating the second argument is returned. The
intent behind the inclusion of this intrinsic in the interpreter is to
allow extensions to the current lexical environment without the use of
"let" and friends, for increased efficiency.
> (defun fact (n)
>> (extend 'a 1)
>> (while (> n 1)
>>> (setq a (* n a))
>>> (dec n))
>> a)
<CLOSURE#23>
> (fact 10)
3628800
Any lambda-expressions closing over the current lexical environment will
"see" the new binding, because closures do not simply close over the
lexical bindings visible at the time of their creation, but rather the
lexical environments visible at the time of their creation. The
difference between these two notions is that environments may be
dynamically extended with "extend" to contain a superset of the bindings
visible at creation.
If the closure is applied before an invocation of "extend" it will not
see the binding created, since it will be created in the future, but if
that VERY SAME CLOSURE is applied after an invocation of "extend" it will
see the new binding created. The new binding suddenly appears in the
current lexical environment.
This means a closure bound to a new local via "extend" will "see" its own
binding when it is applied, and therefore may call itself by name. The
extent of bindings created by "extend" is unlimited, but may be limited
by wrapping the expressions in which they occur with an invocation of
"dynamic_extent" described elsewhere in this document.
gc: (gc)
The "gc" intrinsic sets the garbage collector to run on the next
evaluation. The garbage collector normally runs once every 65536
evaluations. Judicious invocations of this function may allow the user
to decrease the memory consumption of his or her program.
dynamic_extent: (dynamic_extent ...)
The "dynamic_extent" intrinsic limits the extent of additions to the
current lexical environment made with the "extend" intrinsic. The
intrinsic accepts zero or more arguments. If no arguments are supplied,
"dynamic_extent" does nothing and returns 1. If arguments are supplied,
they are evaluated in order, and the result of evaluating the last
expression is returned. If no lexical environment exists at the time of
invocation, an error will be generated which will stop evaluation. When
"dynamic_extent" finishes, any additions made to the current lexical
environment by the expressions in its body are removed. Combinations of
"dynamic_extent" and "extend" can replace occurrences of "let", "letn",
or "labels" inside functions where the bodies of those expressions do not
contain closures, or which contain closures which do not need to close
over the new bindings, which will not persist beyond the extent of
"dynamic_extent".
In this example the new bindings introduced for "b" and "a" do not have
their extent limited, and any closures formed in the body of the "let"
would continue to "see" these bindings when the "let" returned, even if
the closures were closed before the invocations of "extend". The binding
to "c" however, has its extent limited to the extent of the invocation of
"dynamic_extent". If no new bindings are introduced via "extend" inside
an instance of "dynamic_extent", then it effectively does nothing.
> (let ((a 10))
>> (extend 'b (* a a))
>> (dynamic_extent
>>> (extend 'c (* b b))
>>> (print c)
>>> (newline))
>> (print (boundp 'c))
>> (newline))
10000 ; first "print"
0 ; second "print"
gc_freq: (gc_freq expr)
The "gc_freq" intrinsic allows the programmer to change the rate at which
garbage collection occurs. The default is once every 1048576 new objects
or internal atoms have been allocated. Internal atoms are not to be
confused with lisp atoms. Each unique syntax has one internal atom
representing all occurrences of that syntax. So, (+ a a) is a list of
three objects, but each of the 'a' objects points to the same internal
atom. You don't need to know this.
This function accepts one argument, which must evaluate to a whole number
fixnum specifying a new value for the GC frequency. Increasing the value
makes collection less frequent, which will cause the interpreter to run
faster up to a point, but consume more memory, while decreasing the value
will cause the interpreter to run more slowly but consume less memory.
The function returns the old frequency value. Setting GC frequency to
zero will disable garbage collection. The programmer can manually invoke
garbage collection with the "gc" intrinsic somewhere else in his or her
program.
There is a point, when garbage collection is turned off or set to happen
very infrequently, where the size of the object and atom pools will grow
to be so large that GC itself will become the performance bottleneck of
your program. This is because these pools are never returned to the
system. Keep this in mind.
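One way to use the returned value is to suspend collection around a
short allocation-heavy step and then restore the previous setting. The
following is a sketch under that assumption. "build_table" is a
hypothetical stand-in for the programmer's own code, and whether this
helps at all depends on the workload.

```lisp
; Sketch only: "build_table" is a hypothetical stand-in.
(defun build_table ()
   (record 1000))

(setq old_freq (gc_freq 0))   ; disable collection, remember the old value
(setq table (build_table))
(gc_freq old_freq)            ; restore the previous setting
(gc)                          ; request a collection on the next evaluation
```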
getpid: (getpid)
getppid: (getppid)
getpgrp: (getpgrp)
tcgetpgrp: (tcgetpgrp)
The "getpid", "getppid", "getpgrp", and "tcgetpgrp" intrinsics accept no
arguments and each returns a fixnum representing the process id of the
interpreter, or the process id of the parent process of the interpreter,
or the process group id of the process group the interpreter belongs to,
or the process group which is currently the foreground process associated
with the terminal device the interpreter is running on, respectively.
The "tcgetpgrp" function will return 0 if the interpreter is not
associated with a terminal device. For any other error it will return a
string describing the error condition.
setpgid: (setpgid expr1 expr2)
The "setpgid" intrinsic accepts two arguments which both must evaluate to
fixnums. The first argument must be a process id of a running process,
while the second must be the process group id of a running process group
or it must be the same as the first argument. The function puts the
process specified by the first argument into the process group specified
by the second argument. Upon success, the function returns 1, otherwise
it returns a string describing an error condition. The processes which
may be affected by this intrinsic are described in the manual page for
the "setpgid" system call.
If the first argument is zero, then the pid of the interpreter process is
used as the first argument. If the second argument is zero, then the
first argument will be used for the second argument, as well. New
process groups are created by setting both arguments to the same value.
If the affected process is not already a process group leader, it will
become the process group leader of a new process group, and the process
group id will be the same as its process id.
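The zero-argument shorthand and the explicit form can be sketched side
by side. In this illustration, which follows the usual job-control
pattern and is not copied from mush.munger, both the child and the
parent ask for the child to lead a new process group, so the outcome
does not depend on which of them runs first.

```lisp
; Sketch only: put a freshly forked child into its own process group.
(setq pid (fork))
(if (eq pid 0)
   (progn
      (setpgid 0 0)         ; child: 0 0 means "me, in a group named after me"
      (exit 0))
   (when (> pid 0)
      (setpgid pid pid)))   ; parent: the same request with explicit ids
```

The parent's call may return an error string if the child has already
exited. The sketch ignores that result.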
tcsetpgrp: (tcsetpgrp expr)
The "tcsetpgrp" intrinsic accepts one argument which must evaluate to a
fixnum specifying a process group id and makes that process group the
foreground process associated with the terminal device the interpreter is
running on. If the interpreter has no associated controlling terminal,
then the function returns 0. Upon success, it returns 1. Otherwise it
returns a string describing an error condition. The processes which may
be affected by this intrinsic are described in the manual page for the
"tcsetpgrp" system call. Note that although it is not mentioned in the
manual page, one cannot call "tcsetpgrp" when one is a background
process, unless one has blocked SIGTTOU, which may be accomplished by
calling the "block" intrinsic.
kill: (kill expr1 expr2)
The "kill" intrinsic accepts two arguments, the first of which must
evaluate to a fixnum representing the process id of a running process,
while the second argument must evaluate to a fixnum representing a signal
number. The function sends the signal specified by the second argument
to the process specified by the first argument. If successful, the
function returns 1, otherwise it returns a string describing an error
condition. The processes which may be affected by this intrinsic are
described in the manual page for the "kill" system call (man 2 kill). A
table mapping signal numbers to signal names may be found in the "signal"
manual page.
killpg: (killpg expr1 expr2)
The "killpg" intrinsic accepts two arguments, the first of which must
evaluate to a fixnum representing the process group id of a running
process group, while the second argument must evaluate to a fixnum
representing a signal number. The function sends the signal specified by
the second argument to every process in the process group specified by
the first argument. If successful, the function returns 1, otherwise it
returns a string describing an error condition. The processes which may
be affected by this intrinsic are described in the manual page for the
"killpg" system call (man 2 killpg). A table mapping signal numbers to
signal names may be found in the "signal" manual page.
fork: (fork)
The "fork" intrinsic is a wrapper for the "fork" system call. The
function accepts no arguments and generates a new interpreter process
which is an exact copy of the current process. The function returns 0 to
the child interpreter, and the process id of the child interpreter to the
parent interpreter. Upon error, -1 is returned. The interpreter will
reap any child processes which exit while it is running, unless the
"zombies" intrinsic has been invoked, in which case, the programmer must
reap them manually using the "wait" intrinsic.
Read the entries for "child_open", "pipe", "with_input_process", and
"with_output_process", and determine if they will do what you need a new
process to do, before using "fork" or "forkpipe", because these other
intrinsics and macros are more convenient to use.
forkpipe: (forkpipe expr)
The "forkpipe" intrinsic behaves similarly to the "fork" intrinsic, but
creates a pipe between the two interpreters. The function accepts one
argument, which must evaluate to one of 0, 1, or 2, and specifies which
of the parent interpreter's descriptors is attached to the pipe. These
values correspond to the values of the file descriptors for the standard
streams: 0 is the standard input, 1 is the standard output, and 2 is the
standard error. The value specified implies which of the child
interpreter's descriptors is attached to the pipe. If parent descriptor
1 or 2 is specified, then the child interpreter will have its standard
input connected to the pipe. If parent descriptor 0 is specified, then
the child will have its standard output connected to the pipe. The
function returns 0 in the child interpreter, and the process id of the
child process in the parent interpreter. If the fork() system call fails, -1
is returned. Any error encountered while creating the pipe or duping
descriptors will stop evaluation.
The interpreter will reap any child processes which exit while it is
running, unless the "zombies" intrinsic has been invoked, in which case,
the programmer must reap them manually using the "wait" intrinsic.
Read the entries for "child_open", "pipe", "with_input_process", and
"with_output_process", and determine if they will do what you need a new
process to do, before using "fork" or "forkpipe", because these other
intrinsics and macros are more convenient to use.
wait: (wait expr1 [expr2])
The "wait" intrinsic is used to reap a zombie process. When a child
process terminates, its process table entry is preserved so that the
interpreter may determine how it exited, and what its exit status was.
When the interpreter starts up it is in the "nozombies" state, which
means it will automatically reap any zombies created by terminated child
processes, discarding their exit statuses.
If the programmer needs his or her program to wait for a child process to
complete before proceeding, he or she may invoke the child with the
"system" intrinsic, but if the programmer wants to the parent and child
to proceed asynchronously, but still needs to know how the child process
exited, or its exit status, then the programmer must invoke the "zombies"
intrinsic before manually launching the child process with "fork" and
"exec". To retrieve the child's termination information, the programmer
invokes "wait" at some time subsequent to forking the child, to reap the
child's process table entry. One should do this even if one
subsequently decides one is not interested in the termination
information. If the process has not yet terminated, "wait" will block
until it does.
The function accepts one or two arguments, the first of which must
evaluate to a fixnum specifying a process id, 0, -1, or a process group
id negated. This argument is passed as the first argument to the
waitpid() system call. If the argument is a process id, "wait" will reap
that process and return its termination information. If the argument is
-1, then "wait" will reap any zombie child process waiting to be reaped,
and return its termination information. If the argument is 0, then
"wait" will reap any zombie child which belongs to the interpreter's
process group. If the argument is a process group id negated, then
"wait" will reap any zombie child whose process group id is equal to the
absolute value of the argument.
If the second argument is present, it can evaluate to any value. It is a
"don't block" boolean flag. If is not present or if it evaluates to a
boolean "false" value, AND there are no stopped or zombie processes which
can satisfy the "wait" request, BUT there is at least one running process
which can satisfy the "wait" request in the future, THEN "wait" will
block until it can reap a process. Otherwise it will return immediately.
The function returns a two or three element list.
If "wait" cannot reap a process it returns a list containing the fixnum 0
or -1, and the symbol ECHILD. If no second argument was supplied to
"wait" or if the second argument evaluated to a boolean "false" value,
then the first element of the returned list will be -1. This means there
is no running or zombie process which can satisfy the "wait" now or in
the future. If a "true" second argument was supplied to "wait", then the
first element of the returned list may be either -1 or 0. A value of 0
indicates there are processes which can satisfy the "wait" but which are
still running.
Otherwise, "wait" returns a list containing a fixnum representing the
process id of the reaped process, followed by a symbol describing how the
process exited, which will be one of EXITED, KILLED, or STOPPED. If the
second element is EXITED, then the process terminated by calling the
"exit" or "_exit" system calls, and a third element will be present which
will be a fixnum representing the child's exit status, which is the
argument it gave to "exit" or "_exit". If the second element is KILLED,
then the process was terminated by a signal, and a third element will be
present which will be a fixnum representing the signal number of the
signal which terminated the process. If the second element is STOPPED,
then the process was stopped by a signal and may be started again, and a
third element will be present which will be a fixnum representing the
signal number which stopped the process. Processes may be stopped by the
job control related signals: SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU. The
manual page for "signal" maps signal numbers to constant names.
If one wishes to simply ensure that all current zombies are reaped, one
may invoke "wait" with an argument of -1, until it returns (-1 ECHILD).
> (until (eq -1 (car (wait -1))))
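To retrieve the exit status of one particular child, the sequence
described above might look like the following sketch. The exit status 3
is an arbitrary value chosen for the illustration, and the variable
names are hypothetical.

```lisp
; Sketch only: fork a child, then reap it and inspect how it exited.
(zombies)                    ; stop automatic reaping of children
(setq pid (fork))
(when (eq pid 0)
   (exit 3))                 ; child: terminate with exit status 3
(when (> pid 0)
   (setq info (wait pid))    ; parent: blocks until the child terminates
   (print info)
   (newline))
(nozombies)                  ; return to the start-up state
```

Going by the description of the return value, the list bound to "info"
should hold the child's process id, the symbol EXITED, and the fixnum 3.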
zombies: (zombies)
The "zombies" intrinsic accepts no arguments, and when invoked causes the
interpreter to stop reaping zombie child processes. They may be manually
reaped with the "wait" intrinsic. Note that after invocation of
"zombies", each invocation of "fork", "child_open", "with_input_process",
"with_output_process", and "pipe" will generate a new process which must
be manually reaped with the "wait" intrinsic. Note that "input",
"output", and "filter" will all reap their own zombies, regardless of the
setting of the zombies state. This is because it is easy for the interpreter
to reap the processes forked by these three, since they are running only
while the associated intrinsic function is running. The function always
returns 1.
nozombies: (nozombies)
The "nozombies" intrinsic accepts no arguments, and when invoked causes
the interpreter to reap zombie child processes. The interpreter starts
up in the "nozombies" state. This function always returns 1.
zombiesp: (zombiesp)
The "zombiesp" intrinsic accepts no arguments and returns the value of
the zombies state. It returns 1 if the interpreter is in the "zombies"
state, and 0 otherwise.
glob: (glob expr)
The "glob" intrinsic accepts one argument, which must evaluate to a
string, and calls the library function "glob" upon it, which interprets
the string as a shell glob pattern and searches for matches in the
filesystem on the pattern. Any matches found will be returned as a list
of strings. If no matches are found, the empty list is returned. Any
errors encountered will stop evaluation. It is not an error for a
pattern to have no matches.
command_lookup: (command_lookup expr)
The "command_lookup" intrinsic accepts one argument which must evaluate
to a string, and attempts to find a file executable by the user the
interpreter is running as, in the directories specified by the PATH
environment variable. If successful, the fully-qualified filename is
returned. If not successful, the empty string is returned.
If the user has changed the value of the PATH environment variable, or
has added new executables to the directories specified by the environment
variable, then the "rescan_path" intrinsic must be invoked to get the
interpreter to update its internal list of executables.
> (command_lookup "munger")
"/usr/local/bin/munger"
> (command_lookup "foobar")
""
getstring: (getstring expr)
The "getstring" library function returns the output of an external
process as a string. The function accepts one argument which must
evaluate to a command line to be passed to the shell (/bin/sh) and
gathers up the data which appears on the program's standard output into a
string, which it then returns. No processing is performed on the program
output. It is returned unaltered.
If the interpreter cannot "forkpipe", the function returns -1. If the
"forkpipe" is successful but the "shexec" is not, then the function will
return the empty string. This means specifying a non-existent program to
run, or a program to which you do not have read and execute permission,
will cause the function to return the empty string.
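A minimal sketch of a call that distinguishes the failure value from
ordinary output follows. The command "uname -s" and the exit code are
arbitrary choices for the illustration.

```lisp
; Sketch only: capture a command's standard output as a string.
(setq out (getstring "uname -s"))
(if (fixnump out)
   (exit 1)         ; -1: the interpreter could not "forkpipe"
   (print out))     ; a string: the output exactly as the program wrote it
```

Because the output is returned unaltered, any trailing newline written
by the command is still present in the string. An empty string may mean
either that the command produced no output or that it could not be run.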
dec2hex: (dec2hex expr)
The "dec2hex" intrinsic converts a fixnum representing a whole number
into a string representing the number in hexadecimal notation. If the
programmer passes a negative number to the function, an error will be
generated.
> (dec2hex 65535)
"FFFF"
hex2dec: (hex2dec expr)
The "hex2dec" intrinsic converts a string representing a whole number in
hexadecimal notation to a fixnum. The letter characters used in
hexadecimal notation may be in either lower or upper case forms.
> (hex2dec "FfFf")
65535
listen: (listen expr [expr])
The "listen" intrinsic is used to start the kernel accepting incoming tcp
connections for the interpreter process. Both IPv4 and IPv6 connections
will be accepted. The function accepts one or two arguments. The first
argument must evaluate to either a fixnum specifying the port number to
accept incoming connections on, or a string naming a service, as listed
in /etc/services. If the port number is 0, then the kernel will choose a
port from the ephemeral ports. The second optional argument, if present,
must evaluate to a string specifying the IP address of the interface to
use. It must be expressed in the presentation format for either IPv4 or
IPv6. If the second argument is not present, the function will accept
connections on all interfaces of both protocol families.
If successful, the function returns the port number the listening socket
is listening on. Otherwise, it returns a string describing an error
condition. The listen() system call is called with a backlog argument of
4096. This determines the number of connections the kernel will queue,
awaiting service. Only one listening socket may be active at any time.
To accept connections on more than one interface simultaneously, the
programmer must "fork" the interpreter, and have each instance call
"listen". The listening socket can be closed with the "stop_listening"
intrinsic. After a successful invocation the kernel will start accepting
and queuing incoming connections on the specified interface. To service
a connection, the programmer invokes the "accept" intrinsic, documented
below.
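As a small sketch of the two-argument form, the following asks the
kernel for an ephemeral port on the IPv4 loopback interface, reports the
result, and closes the socket again. The address is an arbitrary choice
for the illustration.

```lisp
; Sketch only: listen on a kernel-chosen port on the loopback interface.
(setq port (listen 0 "127.0.0.1"))   ; 0 requests an ephemeral port
(print port)                         ; the port number, or an error string
(newline)
(when (fixnump port)
   (stop_listening))                 ; close the listening socket again
```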
listen_unix: (listen_unix expr)
The "listen_unix" intrinsic is used to start the kernel accepting
incoming connections over a UNIX domain socket. The function accepts one
argument, which must evaluate to a string specifying the desired pathname
for the listening socket in the filesystem. If the entity so named
exists, an attempt will be made to unlink it first. If the Munger
interpreter lacks the necessary permissions to do so, the bind() system
call will fail and "listen_unix" will return a string containing an error
message. All other errors will also cause an error string to be
returned. The function returns 1 on success. The second paragraph of
the entry in this manual for the "listen" intrinsic applies to
"listen_unix" as well.
stop_listening: (stop_listening)
The "stop_listening" intrinsic closes a listening socket opened by the
"listen" or "listen_unix" intrinsics. The function accepts no arguments
and returns 1 if a listening socket was active, or 0 otherwise. After
calling "stop_listening" no more incoming connections may be accepted
with "accept" until the programmer calls "listen" again.
accept: (accept)
The "accept" intrinsic accepts an incoming tcp connection. It can only
be invoked after "listen" or "listen_unix" has been invoked, and is
invoked repeatedly to accept successive incoming connections. The
function accepts no arguments and returns 1 if successful, -1 if the
system call was interrupted by a SIGTERM, or a string describing an error
condition otherwise. The function blocks until an incoming connection
has been completed and is ready for communication with the client who
initiated it.
When "accept" returns, the stdin and stdout of the interpreter will have
been redirected onto the incoming connection. Any of the intrinsics
which read and write from those descriptors ("print", "println",
"newline", "getchar", getchars", and "getline") may be used to
communicate with the client. One must keep in mind "accept" works like
"pipe" or "redirect" in that the new streams connected to the affected
descriptors "shadow" the previously connected streams, but the
previously-connected streams are still open in the interpreter. This
means one should invoke both (resume 0) and (resume 1) when one is
finished communicating with a client, before calling "accept" again,
unless one intends to come back to the previously-accepted connection in
the future.
To service more than one client at a time, the programmer may "fork" the
interpreter multiple times and have each child call "accept" for itself
in a loop to accept incoming connections, or the programmer may choose to
have one process call "accept" and then fork a new process as needed to
service each client.
In the example of an echo server below, the parent process "accepts" each
incoming connection, then forks off a child to service it. The child
calls "stop_listening" to close its reference to the listening socket.
This is because the parent could exit before the child, and if the child
still had a valid reference to the listening socket, then the kernel
would still keep queuing incoming connections which would never be
accepted, because the parent process does the accepting. Therefore, we
want incoming connections rejected when the parent exits. The parent
calls "resume" on both stdin and stdout to close its references to the
newly-accepted client. If it did not do this, the parent's connection to
the client could remain open until the next time "resume" was invoked in
the parent, unless the client explicitly closed its end of the
connection. Note that this program must be run as "root" to bind to port
7, and will never exit unless "listen" or "accept" encounters an error.
It must be killed with a signal.
(fatal)
(daemonize "echo1.munger")
(defun service_client ()
   (while (setq line (getline))
      (print line)
      (flush_stdout)))
(when (fixnump (setq err (listen 7)))
   (setuid "nobody")
   (while (fixnump (setq err (accept)))
      (if (not (fork))
         (progn
            (stop_listening)
            (service_client)
            (exit 0))
         (resume 0)
         (resume 1))))
(syslog 'CRITICAL err)
(exit 1)
Another example of an echo server is below. In this server, the parent
calls "listen" then forks off 9 children. All ten processes then call
"accept" in an infinite loop to service clients. All ten accept
connections from the same listening socket, so all ten need to keep their
references to it open. Since each process calls "accept" separately,
none of them will have references to each other's accepted connections to
worry about, but there is another detail related to "accept" we must
worry about in this example. In order to ensure we do not keep
previously accepted connections open, we must explicitly close each tcp
connection when we are finished with it, before calling "accept" again,
hence the calls to "resume" in the process_clients function below. Note
that this program must be run as root to bind to port 7, and will never
exit unless "listen" or "accept" encounters an error. It must be killed
with a signal.
(fatal)
(daemonize "echo2.munger")
(defun service_client ()
   (while (setq line (getline))
      (print line)
      (flush_stdout)))
(defun process_clients ()
   (while (fixnump (setq err (accept)))
      (service_client)
      (resume 0)
      (resume 1)))
(when (fixnump (setq err (listen 7)))
   (setuid "nobody")
   (catch
      (iterate 9
         (unless (fork)
            (throw 0)))))
(process_clients)
(syslog 'CRITICAL err)
(exit 1)
get_scgi_header: (get_scgi_header)
The "get_scgi_header" intrinsic parses an SCGI header netstring from
standard input and returns a list of strings, guaranteed to be a multiple
of 2 in length, or fixnum 0. A 0 return value indicates the function
encountered an error and gave up. A returned list of strings indicates
the SCGI header was read successfully, and each pair of strings in the
returned list will be a name of an SCGI environment variable and its
value. For an example of usage see the scgi.munger example SCGI server.
Standard input is positioned at the beginning of the SCGI body, when the
function returns.
This function should not be used in conjunction with "getline", as that
function does its own buffering, and "get_scgi_header" will not see
already-read data that "getline" has accumulated in its buffer. Instead,
to read data in a server program, use the unbuffered input functions
"getchars" or "getline_ub".
send_descriptors: (send_descriptors)
The "send_descriptors" intrinsic can only be invoked after "child_open"
has been successfully invoked to open a client connection to a server
process over a UNIX domain socket. The function accepts no arguments and
returns 1 on success, or a string describing an error condition,
otherwise.
This function in conjunction with "receive_descriptors", is used to cause
one Munger interpreter to pass its standard input and output to another
Munger interpreter. The server interpreter calls "receive_descriptors"
while the client calls "send_descriptors". After both intrinsics have
returned success to their respective interpreters, the server's standard
input will be connected to the same source the client's standard input is
connected to, and the server's standard output will be connected to the
same source the client's standard output is connected to. The connection
over the UNIX domain socket is unaffected. The client may close it with
"child_close" afterward, if the client has no more communication to
accomplish with the server.
receive_descriptors: (receive_descriptors)
The "receive_descriptors" intrinsic can only be invoked after
"listen_unix" and "accept" have been successfully invoked to accept a
client connection over a UNIX domain socket. The function accepts no
arguments and returns 1 on success, or a string describing an error
condition, otherwise.
In conjunction with "send_descriptors", this function is used to cause
one Munger interpreter to pass its standard input and output to another
Munger interpreter. The client invokes "send_descriptors" and the server
invokes "receive_descriptors". After both intrinsics have returned
success in their respective interpreters, the server's standard input
will be connected to the same source the client's standard input is
connected to, and the server's standard output will be connected to the
same source the client's standard output is connected to. The "resume"
intrinsic can be called by the server interpreter to return either or
both descriptors to the sources they were formerly connected to, which is
the connection to the client over the UNIX domain socket. Invoking
"resume" again on both 0 and 1, will close the connection, and cause the
server interpreters standard input and output to be connected to the
sources they were connected to before the call to "accept".
busymap: (busymap expr)
The "busymap" intrinsic is used to create a byte array of shared memory
between parent and child server processes. The function accepts one
argument which must evaluate to a fixnum specifying the length of the
array in bytes. Only one busymap can exist at any time. If a busymap
already exists, then the function returns -1. Upon successfully creating
a new busymap, the function returns 1. Otherwise, a string describing an
error condition is returned.
There is no locking mechanism provided to arbitrate access to the
busymap. The intended usage is for slave processes of multi-process
servers to write to the busymap with the "busy" and "notbusy" intrinsics,
and the master process manager to read it with "busyp". See the
httpd.munger example web server for usage.
Master server processes request that their children exit by sending them a
SIGTERM (signal number 15) with the "kill" intrinsic. If this signal is
received by the interpreter, the next invocation of "accept" will cause
the interpreter to exit. If the interpreter is blocked in "accept" at
the time of the arrival of the signal, the interpreter will exit
immediately. This allows the slave server process to continue processing
any established client connection to completion before exiting.
nobusymap: (nobusymap)
The "nobusymap" intrinsic frees a busymap which has been created with the
"busymap" intrinsic. The function accepts no arguments and returns -1 if
no busymap exists, 1 if the busymap was successfully unmapped, or a
string describing an error condition. Each process which has access to
the busymap, which is to say all the children of the caller of "busymap"
who have not invoked "exec", must call "nobusymap" in order to
completely remove the shared mapping.
busy: (busy expr)
The "busy" intrinsic sets a byte in a busymap to 1. The function accepts
one argument which must evaluate to a fixnum specifying the index of a
byte (indices start at 0) in the active busymap to affect. If no busymap
exists, the function returns -1. If successful, the function returns 1.
If the index is out of range, an error will be generated which will stop
evaluation.
notbusy: (notbusy expr)
The "notbusy" intrinsic sets a byte in a busymap to 0. The function
accepts one argument which must evaluate to a fixnum specifying the index
of a byte (indices start at 0) in the active busymap to affect. If no
busymap exists, the function returns -1. If successful, the function
returns 1. If the index is out of range, an error will be generated
which will stop evaluation.
busyp: (busyp expr)
The "busyp" intrinsic returns the value of a byte in the active busymap.
The function accepts one argument which must evaluate to a fixnum
specifying the index of a byte (indices start at 0) in the active
busymap. If no busymap exists, the function returns -1. If successful,
the function returns either 0 or 1, 0 indicating the "not busy" state,
and 1 indicating the "busy" state. If the index is out of range, an
error will be generated which will stop evaluation.
chroot: (chroot expr)
The "chroot" intrinsic accepts one argument, which must evaluate to a
string, and calls the chroot(2) system call with that string as argument.
This system call is used to change the root directory for the
interpreter, which must be running as root for the call to succeed.
After a successful invocation, the initial slash (/) in all pathnames
will refer to the specified directory. See the chroot(2) manual page for
more details. Upon success the function returns 1, otherwise it returns
a string describing an error condition.
Once a process has been chroot-ed, it, and its children, may no longer
access any part of the file system above the new root directory. This
means a program needing shared libraries existing outside of the visible
hierarchy will not start in the chroot environment. For those programs a
miniature filesystem with appropriate libraries must be set up before
invoking "chroot". Processes are allowed to keep their open descriptors
in the new environment, allowing access to previously opened items
outside of the new hierarchy, so programs typically chroot after they
have done all of their startup tasks, such as opening log files. For
example, "daemonize" should be called before "chroot" to open the
connection to /dev/log, which will not be visible afterward.
daemonize: (daemonize expr)
The "daemonize" intrinsic turns the interpreter into a daemon process.
The function accepts one argument which must evaluate to the name for the
program to use for itself in the system logfile. The function returns 1
upon success. If it encounters a problem evaluating its argument, it
generates an error which will stop evaluation. If the function encounters
an error condition after it has closed the standard descriptors, it will
cause the interpreter to exit with an exit status of 1, and write an
error message to the system log.
To become a daemon, "daemonize" undoes all redirections of the standard
descriptors, and closes and reopens stdin, stdout, and stderr on
/dev/null, and it closes any open full-duplex connection opened with
"child_open". It then calls the "openlog" library function to establish
a connection with the syslog daemon, so the daemon process may emit
messages to the system log. The function then calls the "block"
intrinsic to block a number of signals. Consult the entry in this
document for the "block" intrinsic for details. It then forks and allows
the parent process to exit, so that the child cannot be a process group
leader. This is required so that it may next call setsid() to detach
itself from its controlling terminal. This prevents users at the
terminal from sending job-control signals to the daemon process.
syslog: (syslog expr expr)
The "syslog" intrinsic is used to send a message to the system logfile.
It is a wrapper around the syslog() library function. It can only be
used after "daemonize" has been successfully invoked. The function
accepts two arguments, the first of which must evaluate to one of the
following set of symbols indicating the priority of the event being
logged: ALERT, CRITICAL, ERROR, WARNING, NOTICE, INFO, DEBUG. The
symbols must be in all upper-case letters. At the time of this writing,
the /etc/syslog.conf that ships with FreeBSD prevents any message with a
priority lower than NOTICE from being logged. The second argument must
evaluate to a string containing the log
message. Unlike the syslog() library function, no % processing (a la
printf) takes place in the message string. Any % occurring in the
message string will be escaped with another %. The "daemonize" intrinsic
will have called openlog() to set the name of the daemon and its process
id to automatically appear in the logfile before the message text. It is
not necessary to include anything in the message text except a
description of the event being logged. The function returns 1 upon
success, or it uses the syslog() library call to send an error message to
the system log, and then causes the interpreter to exit. There is no
point in returning to toplevel in response to an error event because
after "daemonize" has been invoked, the interpreter is no longer capable
of performing terminal I/O.
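The effect of escaping each % with a second % can be shown with any printf-style formatter. The Python sketch below is an illustration only, not Munger's code: escape_percent and render are hypothetical names, and Python's % operator stands in for the format processing done by syslog().

```python
def escape_percent(message):
    """Double every % so a printf-style formatter treats it as a literal."""
    return message.replace("%", "%%")

def render(fmt):
    """Apply printf-style processing with no arguments; %% becomes a single %."""
    return fmt % ()
```

After escaping, the text that reaches the log is identical to the text the caller supplied, whatever % sequences it contains.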
flush_stdout: (flush_stdout)
The "flush_stdout" intrinsic accepts no arguments and calls fflush() on
the standard output stream, returning to the caller whatever that
function returns, either 0 upon success or -1 upon encountering an error.
Any buffered data which has not yet been written to stdout is written to
the stream. This can be useful when writing to a TCP connection, as the
network stack will buffer data to avoid propagating many small TCP
segments over the connection. Invoking "flush_stdout" after each
call-and-response interaction with a client is a good idea, to ensure the
client gets its response immediately.
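The effect of flushing a buffered output stream can be observed with a small Python sketch. It is an illustration of user-space output buffering in general, not of Munger itself: an in-memory sink stands in for the client connection, and respond and demo are hypothetical names.

```python
import io

def respond(stream, reply):
    """Write one reply and flush, so the peer sees it without delay."""
    stream.write(reply)
    stream.flush()

def demo():
    """Return what the sink holds before and after a flushed reply."""
    sink = io.BytesIO()               # stands in for the client connection
    out = io.BufferedWriter(sink)     # buffered stream, like stdio's stdout
    out.write(b"220 ready\r\n")       # small write: stays in the buffer
    before = sink.getvalue()
    respond(out, b"250 ok\r\n")       # flush pushes everything pending
    after = sink.getvalue()
    return before, after
```

Until the flush, nothing written has reached the sink, which is the situation a waiting client would be in.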
getpeername: (getpeername)
The "getpeername" intrinsic is a wrapper around the system call of the
same name. The function accepts no arguments, and if invoked after
"listen" and "accept", returns a string representing the IP address of
the host on the other end of the TCP connection currently connected to
stdin and stdout. The function returns 0 upon encountering any error.
base64_encode: (base64_encode expr)
The "base64_encode" intrinsic accepts one argument which must evaluate to
a string, and returns a new string representing its argument encoded in
the base64 encoding scheme used by MIME messages. It would be
impractical to read a large binary file into memory as a string and feed
it to this intrinsic. One may instead use the "getchars" intrinsic to
read chunks of the file and feed them to this function. As long as each
chunk given to "base64_encode" is a multiple of three bytes in length,
except for the last chunk, which may be any length, the output strings
from each invocation may be concatenated together to form the base64
encoding for the entire file.
> (with_input_file "binary.file"
>> (with_output_file "binary.file.base64"
>>> (println "begin-base64 644 binary.file")
>>> (while (setq line (getchars 57))
>>>> (println (base64_encode line)))
>>> (println "====")))
will produce the same output as the b64encode system utility invoked as:
b64encode -o binary.file.base64 binary.file binary.file
Base 64 encoding represents 3 bytes of data as 4 printable characters, so
using a line size of 57 will cause those lines to expand to 76 characters
after encoding, which is less than the once-customary 80 character limit
on line lengths in an email message.
A more efficient method would read a larger amount of text and cut up the
lines with "substring":
> (with_input_file "binary.file"
>> (with_output_file "binary.file.base64"
>>> (println "begin-base64 644 binary.file")
>>> (setq line "")
>>> (while (setq segment (getchars 100000))
>>>> (setq line (concat line segment))
>>>> (while (> (length line) 57)
>>>>> (println (base64_encode (substring line 0 57)))
>>>>> (setq line (substring line 57 0))))
>>> (when line (println (base64_encode line)))
>>> (println "====")))
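The chunking rule and the 57-to-76 expansion described above can be checked independently of Munger. The following Python sketch uses the standard base64 module as a stand-in for "base64_encode"; encode_in_chunks is a hypothetical helper, and the requirement that the chunk size be a multiple of three is enforced explicitly.

```python
import base64

def encode_in_chunks(data, chunk_size=57):
    """Encode data chunk by chunk; chunk_size must be a multiple of 3."""
    if chunk_size % 3 != 0:
        raise ValueError("chunk_size must be a multiple of 3")
    lines = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        # Each full chunk encodes to chunk_size / 3 * 4 characters, no padding.
        lines.append(base64.b64encode(chunk).decode("ascii"))
    return lines
```

Only the final chunk can need padding, so joining the per-chunk strings yields the same text as encoding the whole input at once.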
base64_decode: (base64_decode expr)
The "base64_decode" intrinsic accepts one argument, which must evaluate
to a string containing base64-encoded data, and returns a new string
consisting of the unencoded data. If the function encounters a character
in the input string which is not part of the base64 vocabulary, or if the
length of the argument string is not a multiple of four characters, it
returns 0.
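The two failure conditions described above can be sketched in Python. This is an approximation built on the standard base64 module, not Munger's decoder; decode_or_zero is a hypothetical name, and returning 0 on failure simply mirrors the behavior the text describes.

```python
import base64

def decode_or_zero(text):
    """Decode base64 text, returning 0 on a bad length or a foreign character."""
    # Valid base64 text always has a length that is a multiple of four.
    if len(text) % 4 != 0:
        return 0
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(text, validate=True)
    except ValueError:
        return 0
```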
isatty: (isatty expr)
The "isatty" intrinsic is a wrapper for the isatty(3) library call. The
function accepts one argument which must evaluate to either fixnum 0, 1,
or 2, specifying stdin, stdout, or stderr, respectively. The function
returns fixnum 0 if the specified descriptor is not connected to a
terminal device, and a non-zero fixnum value, if the specified descriptor
is connected to a terminal device.
sleep: (sleep expr)
The "sleep" intrinsic is a wrapper for the sleep(3) library call. The
function accepts one argument which must evaluate to a fixnum specifying
a number of seconds for the interpreter to go to sleep. The function
returns when the specified number of seconds has elapsed or a signal has been
received by the interpreter. The function returns 0 if the specified
number of seconds has elapsed. If the function is interrupted by a
signal, it returns the remaining, unslept number of seconds. If the
interpreter receives a SIGTERM, this can be discovered by invoking
"sigtermp".
unsigned: (unsigned expr)
The "unsigned" intrinsic accepts one argument which must evaluate to a
fixnum and returns a string representing the value of the fixnum,
expressed as an unsigned value. This is not the same as the absolute
value of the fixnum. It is similar to casting an int to an unsigned int
in C, but getting a string representation back instead of a number.
The intrinsic is designed to be used in those situations where one would
like to do unsigned arithmetic operations, with numbers large enough to
cause the two's-complement representation of the result to wrap around to
the negative side, but not large enough to overflow the fixnum itself.
That is to say, not generating a result greater than
(unsigned (+ 1 (* (maxidx) 2)))
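The reinterpretation can be sketched in Python by masking a signed value to a fixed width. The 32-bit width used here is an assumption for illustration only; the source does not state the size of a Munger fixnum, and this Python function merely borrows the intrinsic's name.

```python
def unsigned(value, bits=32):
    """Return the two's-complement bit pattern of value, read as unsigned, as a string."""
    mask = (1 << bits) - 1   # e.g. 0xFFFFFFFF when bits is 32
    return str(value & mask)
```

Non-negative values pass through unchanged, while negative values map to the upper half of the unsigned range, which matches the C cast the text compares the intrinsic to.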
form_encode: (form_encode expr)
form_decode: (form_decode expr)
These two intrinsics encode and decode strings to and from the
x-www-form-urlencoded format used by web clients to encode the data in
forms. Each intrinsic accepts one string argument, and returns a string.
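For readers unfamiliar with the format, the Python sketch below shows form encoding and decoding with the standard urllib.parse module. It is not Munger code, and details such as whether Munger encodes a space as + or as %20 are not stated in the source, so the exact output of the Munger intrinsics may differ. The function names are borrowed from the intrinsics for readability only.

```python
from urllib.parse import quote_plus, unquote_plus

def form_encode(text):
    """Encode text for use in form data; spaces become +, reserved bytes %XX."""
    return quote_plus(text)

def form_decode(text):
    """Reverse form_encode: + becomes a space and %XX escapes are resolved."""
    return unquote_plus(text)
```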
Mon, Apr 21 2014