The e-TeX Short Reference
NTS team
October 1995
Published as
Phil Taylor, e-TeX: a 100%-compatible successor to TeX
Following humbly in the footsteps of the Grand Wizard
in: Proceedings of the Ninth European TeX Conference EuroTeX'95, September 4-8, 1995, Arnhem, The Netherlands, pp. 359-370.
\beginL
\beginR
\botmarks
\detokenize
\endL
\endR
\eTeXrevision
\eTeXversion
\everyeof
\firstmarks
\grouplevel
\grouptype
\ifcsname
\ifdefined
\interactionmode
\lastnodetype
\marks
\middle
\predisplaydirection
\protected
\readline
\scantokens
\showgroups
\showtokens
\splitfirstmarks
\splitbotmarks
\TeXXeTstate
\topmarks
\tracingassigns
\tracingcommands - additional details
\tracinggroups
\tracingifs
\tracingscantokens
\unexpanded
\unless
The group were very concerned that unless there existed some evolutionary flexibility within which TeX could react to changing needs and environments, it might all too soon become eclipsed by more modern yet less sophisticated systems. Accordingly they agreed to investigate a possible successor or successors to TeX, successors which would enshrine and encapsulate all that was best in TeX whilst being freed from the evolutionary constraints which Knuth had placed on TeX itself. To avoid any suggestion that it was TeX which the group sought to develop against Knuth's wishes, a working title of NTS (for New Typesetting System) was chosen for the project.
During the initial meetings of the NTS group, it became clear
that there were two possible approaches to developments based on TeX:
an evolutionary path which would simply continue where Knuth had left
off, and which would use as its basis the source code of TeX itself
(i.e. TeX.Web); the other a
revolutionary path which would be based on a completely new
implementation of TeX, using a modern rapid-prototyping language which
could allow individual components of the system to be modified or
replaced in a simple and straightforward manner. The group agreed
that the latter (revolutionary) approach had much greater potential,
but were aware that the re-implementation would be non-trivial, and
would require external funding to bring it to fruition in finite time;
accordingly they agreed to concentrate their initial efforts on the
former (evolutionary) path, and set to work to specify and implement a
direct derivative of TeX which became known as e-TeX (the e
of e-TeX may be read as extended, enhanced,
evolutionary or European at will(!), and is also an
acknowledgement of the parallel developments which have led the
The group took as the starting point for the development of e-TeX the many contributions which had been made on NTS-L (the open mailing list on which discussions pertinent to e-TeX & NTS take place), together with the extremely interesting list of ideas which Knuth gives at the end of TeX82.Bug, and which he describes as `Possibly nice ideas that will not be implemented' (and which he contrasts with `Bad ideas that will not be implemented'!). Individual members of the group also contributed ideas of their own which had not necessarily been discussed publicly. All proposals were then subjected to a rigorous vetting procedure to ensure that they conformed to the e-TeX philosophy, which may be summarised as follows:
e-TeX will in all ways demonstrate its affinity to, and derivation from, Knuth's TeX; it will be implemented as a change-file to TeX.Web, and will not exploit features which could only be achieved by using a particular implementation, operating system or language; it will be capable of being used successfully on a machine as small as an 80286-based PC or similar.
At format-generation time, a user will have the option of generating either a TeX-compatible format or an e-TeX format; if the TeX-compatible format is subsequently used in conjunction with e-TeX, the result will be Trip-compatible (i.e. indistinguishable from TeX proper).
If an e-TeX format is generated and used in conjunction with e-TeX, then provided that none of the new e-TeX primitives are used, the results will be identical to those which would be produced using TeX proper.
If an e-TeX format is used in conjunction with e-TeX and if one or more of the new e-TeX primitives are used, then those portions of the document which are affected by the new primitive(s) may be processed in a manner unique to e-TeX; other portions of the document will be processed in a manner identical to that of TeX proper.
Only if an e-TeX format is used in conjunction with e-TeX and if an explicit assignment is made to one of the enhanced-mode variables to enable that particular enhanced mode will e-TeX behave in a manner which may be distinguishable from that of TeX even if no other reference to an e-TeX primitive occurs anywhere in the document. (These modes of operation are referred to as compatibility-mode, extended-mode and enhanced-mode respectively.)
All new e-TeX primitives will be syntactically identical to existing TeX primitives: that is, they will be either control-words or control-symbols within a normal category code regime. Where an analogous primitive exists within TeX, the corresponding e-TeX primitive(s) will occupy the same syntactic niche. Every effort will be made to ensure that new e-TeX primitives fit into the existing set of TeX datatypes; no new datatype will be introduced unless it is absolutely essential.
In brief, this implies that e-TeX will follow the principle of least surprise: an existing TeX user, on using e-TeX for the first time, should not be surprised by e-TeX's behaviour, and should be able to take advantage of new e-TeX features without having either to unlearn some aspects of TeX or to learn some new e-TeX philosophy.
Once a working binary (or binaries, for those systems which have separate executables for IniTeX and VirTeX) has been acquired or produced, the next step will be to generate a suitable format file or files. Whilst e-TeX can be used in conjunction with Plain.TeX to produce a Plain e-format, it is better to use the supplied e-Plain.TeX file which supplements the e-TeX primitives with additional useful control sequences.
When generating the format file, and regardless of the format source used, one fundamental decision must be made: is e-TeX to generate a compatibility mode format, or an extended mode format? If the former, all e-TeX extensions and enhancements will be disabled, the format will contain only the TeX-defined set of primitives, and any subsequent use of the format in conjunction with e-TeX will result in completely TeX-compatible behaviour and semantics, including compatibility at the level of the Trip test. If the latter option, however, is selected, then all extensions present in e-TeX will automatically be activated, and the format file will contain not only the TeX-defined set of primitives but also those defined by e-TeX itself; any subsequent use of such a format in conjunction with e-TeX will result in e-TeX operating in extended mode; documents which contain no references to any of the e-TeX-defined primitives will continue to generate results identical to those which would have been produced had they been processed by TeX, but compatibility at the Trip-test level can no longer be accomplished, and of course any document which makes reference to an e-TeX primitive will generate results which could not have been accomplished using TeX.
It should be noted that neither a compatibility mode format nor an extended mode format may be used in conjunction with TeX itself; they are only suitable for use in conjunction with e-TeX, since formats are not in general portable.
Finally it should be emphasised that even if an extended mode format is generated, any document processed using such a format but not referencing any e-TeX-defined primitive will produce results identical to those which would have been produced had the same document been processed using TeX; only if the document makes an explicit assignment to one of the enhanced mode state variables (\TeXXeTstate is the only instance of these in V1 of e-TeX) will compatibility with TeX be compromised: e-TeX is then said to be operating in enhanced mode rather than extended mode.
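Since a compatibility mode format contains only the TeX-defined primitives, a macro file can make a rough guess at the mode in which it is being processed by testing whether an e-TeX primitive is defined. The following is only a sketch, not part of the e-TeX distribution: it assumes that \undefined has not been given a meaning, and that \eTeXversion is an internal integer which may be displayed with \the.

```tex
% Sketch: detect whether the e-TeX primitives are available.
% Assumption: \undefined really is undefined.
\ifx\eTeXversion\undefined
  \message{TeX proper, or e-TeX in compatibility mode}%
\else
  \message{e-TeX primitives available:
    version \the\eTeXversion\eTeXrevision}%
\fi
```

The test cannot distinguish TeX proper from e-TeX running with a compatibility mode format, which is exactly the point of compatibility mode.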
The choice between generating a compatibility mode format and an extended mode format is made at the point of specifying the format source file: assuming that the operating system supports command-line entry with parameters, then a normal TeX format-generation command would probably resemble:
    IniTeX Plain \dump
or, if the more verbose interactive form is preferred:
    IniTeX
    **Plain
    *\dump
With e-TeX, exactly the same command will achieve exactly the same effect, and the format generated will be a compatibility-mode format; thus assuming that the Ini-version of e-TeX is invoked with the command eIniTeX, the following will both generate compatibility-mode formats:
    eIniTeX Plain \dump
and
    eIniTeX
    **Plain
    *\dump
In order to generate an extended mode format, the file-specification for the format source file must be preceded by an asterisk (*); whilst this may seem an inelegant mechanism, it has the great advantage that it avoids almost all system dependencies (Graphical user interface (GUI) systems excepted, of course), and the asterisk as a component element of a filename is a very remote possibility (most filing systems reserve the asterisk as a `wild card' character, which can therefore not form a part of a real file name per se). Thus to generate an extended mode Plain format, the following dialogue may be used:
    eIniTeX *Plain \dump
or
    eIniTeX
    ***Plain
    *\dump
and to generate an extended mode e-Plain format, the following instead:
    eIniTeX *e-Plain \dump
or
    eIniTeX
    ***e-Plain
    *\dump
Once suitable formats have been generated, they can then be used in conjunction both with e-IniTeX and e-VirTeX without further formality: in particular, no asterisk is needed (nor should be used!) if a format is specified, since the format implicitly defines (depending on its mode of generation) in which mode (compatibility or extended) e-TeX will operate. Thus, for example, if a Plain format had been generated in compatibility mode, and an e-Plain format had been generated in extended mode, then both:
    eIniTeX &Plain
and
    eVirTeX &Plain
will cause e-TeX to process any subsequent commands in compatibility mode. On the other hand, both
    eIniTeX &e-Plain
and
    eVirTeX &e-Plain
will cause e-TeX to process any subsequent commands in extended mode, but only because the e-Plain format was generated in extended mode: it is not the name of the format, nor is it the contents of the source of the format, which determine the mode of operation -- it is the mode of operation which was used when the format was generated. Any format generated in compatibility mode will cause e-TeX to operate in compatibility mode whenever it is used, whilst the same format generated in extended mode will cause e-TeX to operate in extended mode whenever it is used.
Although e-TeX is completely TeX-compatible, and there is therefore no real reason why any system should need both TeX and e-TeX, it is anticipated that until complete confidence exists in the compatibility of e-TeX many sites and users will prefer to retain instances of each. For this reason the supplied change-files and binaries will ensure that both TeX and e-TeX can happily co-exist on any system by a careful choice of non-overlapping name-spaces. This might, for example, be achieved by changing the default extension for e-format files to (say) .efm rather than .fmt, or by referencing a different format directory and/or environment variable (for example, eTeX_formats rather than TeX_formats).
The new features are listed and briefly described below, clustered together to indicate related functionality; it is intended that a full description of each, together with appropriate examples, will be published in The e-TeX Manual, which it is hoped will become the definitive reference manual for e-TeX.
\protected: a prefix analogous to \long, \outer, and \global; it associates with the macro being defined an attribute which inhibits expansion of the macro in expansion-only contexts (for example, within the parameter text of a \write or \edef); if, however, the parser or command processor (TeX's `oesophagus' and `stomach', in Knuth's alimentary paradigm) is currently demanding a command, then the \protected macro will expand in the normal way. This behaviour is identical to that displayed by the explicit expansion of a token-list register through the use of \the; the same model is used elsewhere in e-TeX to achieve a consistent paradigm for partial expansion.
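The behaviour of \protected might be illustrated by the following sketch, which assumes an extended mode Plain format (\bf is Plain's bold font switch); the macro names \strong and \title are invented for the example.

```tex
% \strong is hypothetical; \bf comes from Plain.
\protected\def\strong#1{{\bf #1}}

% In an expansion-only context the protected macro is left alone:
\edef\title{A \strong{bold} claim}
\show\title   % the replacement text still contains \strong

% When the command processor demands a command, it expands as usual:
A \strong{bold} claim.
```

Without the \protected prefix the \edef would attempt to expand \strong, and with it \bf, which is rarely what is wanted.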
\detokenize, when followed by a <general text>, expands to yield a sequence of character tokens of \catcode 10 (space) or 12 (other) corresponding to a decomposition of the tokens of the <balanced text> of the unexpanded <general text>; cf. \showtokens. The effect is rather as if \scantokens (q.v.) were applied to the <general text> within a regime in which only \catcodes 10 and 12 existed. Note that in order to preserve the boundaries between control words and any following letter, a space is yielded after each control word including the last.
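A short sketch of the space inserted after each control word; \foo and \baz are arbitrary names and need not be defined, since nothing within the braces is expanded.

```tex
% Writes the decomposed text to the terminal and the log file.
\message{[\detokenize{\foo bar\baz}]}
% Shows: [\foo bar\baz ]
% Note the space after the final control word \baz.
```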
\unexpanded, when followed by a <general text>, inhibits expansion of the tokens of that text in expansion-only contexts such as \write, \edef, etc., but further expansion will occur if the parser or command processor is currently demanding a command. The effect is as if the <general text> were assigned to a token list register, and the latter were then partially expanded using \the, but no assignment actually takes place; thus \unexpanded can be used in expansion-only contexts.
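As a small sketch of partial expansion within an \edef (the macro names \a, \b and \c are invented for the example):

```tex
\def\a{expanded}
\def\b{untouched}

% \a is expanded by the \edef; \b is protected by \unexpanded.
\edef\c{\a, \unexpanded{\b}}
\show\c   % replacement text: expanded, \b
```

Unlike \noexpand, which protects a single token, \unexpanded protects the whole of the braced text.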
\readline: analogous to \read, but treats each character as if it were currently of \catcode 10 (space) or 12 (other); the text thus read is therefore suitable for being scanned and re-scanned (using \scantokens, q.v.) under different \catcode regimes.
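A minimal sketch of reading one raw line from a file, assuming a Plain-based format (for \newread) and that \readline shares the syntax of \read; the file name data.txt and the names \datafile and \rawline are hypothetical.

```tex
\newread\datafile
\openin\datafile=data.txt    % hypothetical file name
\ifeof\datafile
  \message{data.txt could not be opened}%
\else
  % Every character arrives with \catcode 10 or 12, so backslashes,
  % braces, percent signs etc. in the file are harmless here.
  \readline\datafile to \rawline
  \message{[\rawline]}%
\fi
\closein\datafile
```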
\scantokens, when followed by a <general text>, decomposes the <balanced text> of the <general text> into the corresponding characters, and then uses the \input mechanism to re-process these characters under the current \catcode regime. As the \input mechanism is used, even hex notation (^^xy) will be re-interpreted. Parentheses and a single space representing the pseudo-file will be displayed if \tracingscantokens (q.v.) is positive.
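The following sketch shows an argument being re-read under a different \catcode regime. It assumes an extended mode Plain format (for \active and \star); the macro \stars is invented for the example.

```tex
% Give the active character * a meaning.
\begingroup
  \catcode`\*=\active
  \gdef*{$\star$}
\endgroup

% The argument is first read with * as an ordinary (other) character;
% \scantokens then re-reads it with * active.
\def\stars#1{\begingroup \catcode`\*=\active \scantokens{#1}\endgroup}

\stars{a*b}
```

Because the pseudo-file is read in the same way as a real file, the end of its last line will normally contribute a space; setting \endlinechar to -1 within the group is one way of avoiding this.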
\eTeXrevision: expands to a sequence of character tokens of \catcode 12 (other); these represent the minor component of the combined version/revision number. Pre-release versions will be characterised by an initial minus sign (-), whilst post-release versions will be implicitly positive; both will contain an explicit leading decimal point, which will follow any minus sign present.
\eTeXversion:
\grouplevel:
\grouptype:
\ifcsname: similar in effect to \unless \expandafter \ifx \expandafter \relax \csname, but avoids the side-effect of the cs-name being ascribed the value \relax, and also does not rely on \relax having its canonical meaning. No hash-table entry is used if cs-name does not exist. (\unless is explained below.)
\ifdefined: similar in effect to \unless \ifx \undefined, but does not require \undefined to actually be undefined, since no explicit comparison is made with any particular control sequence.
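The difference between the two e-TeX tests and the traditional \csname idiom can be sketched as follows; the name nosuchname is hypothetical and is assumed to be undefined beforehand.

```tex
% Neither test gives \nosuchname a meaning.
\ifcsname nosuchname\endcsname
  \message{defined}%
\else
  \message{not defined}%
\fi
\ifdefined\nosuchname \message{defined}\else \message{still undefined}\fi

% The traditional idiom defines \nosuchname as \relax as a side-effect:
\expandafter\ifx\csname nosuchname\endcsname\relax \fi
\ifdefined\nosuchname \message{now defined (as \string\relax)}\fi
```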
\lastnodetype:
\marks: whereas TeX has only one \mark, which has to be over-loaded if more than one class of information is to be saved (e.g. over-loading is necessary if separate information for recto and verso pages is to be maintained), e-TeX has a whole class of \marks (16, in the first release); thus rather than writing \mark <general text> as in TeX, in e-TeX one writes \marks <4-bit number> <general text>. For example, \marks 0 could be used to retain information for the verso page, whilst \marks 1 could retain information for the recto. There are equivalent classes for the five \marks variables \botmarks, \firstmarks, \topmarks, \splitfirstmarks and \splitbotmarks.
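The verso/recto example might be sketched as follows, assuming an extended mode Plain format (\headline and \pageno are Plain's); the heading texts are placeholders.

```tex
% In the running text, one mark class per kind of information:
\marks 0 {Verso heading}%
\marks 1 {Recto heading}%

% In the headline, each class is interrogated independently:
\headline={\ifodd\pageno
    \hfil\firstmarks 1 % recto (odd) pages
  \else
    \firstmarks 0 \hfil % verso (even) pages
  \fi}
```

The space after each class number terminates the number, in the usual TeX manner.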
TeX--XeT was developed by Peter Breitenlohner based on the original TeX-XeT of Donald Knuth and Pierre MacKay; whereas TeX-XeT generated non-standard DVI files, TeX--XeT generates perfectly normal DVI files which can therefore be processed by standard DVI drivers (assuming, of course, that the necessary fonts are available). Both systems permit the direction of typesetting (conventionally left-to-right in Western documents) to be reversed for part or all of a document, which is particularly useful when setting languages such as Hebrew or Arabic.
\beginL:
\beginR:
\endL:
\endR:
\TeXXeTstate: \TeXXeTstate defaults to zero, and even if set positive during format creation will be re-set to zero before the format is dumped. Explicit user action therefore is required to enable TeX--XeT semantics, and TeX--XeT is therefore classed as an enhancement, not simply an extension.
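A minimal sketch of enabling TeX--XeT and reversing part of a line; it assumes an extended mode format, and uses Latin text merely to make the reversal visible.

```tex
\TeXXeTstate=1   % explicit assignment: e-TeX now operates in enhanced mode

This text runs left to right, \beginR this part runs right to left\endR\
and this part runs left to right again.
```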
\predisplaydirection:
\interactionmode
\showgroups
\showtokens
\tracingassigns
\tracinggroups
\tracingifs
\tracingscantokens
\tracingcommands
\interactionmode: whereas in TeX the interaction mode can only be set, via \scrollmode, \errorstopmode, etc., in e-TeX read/write access is provided via \interactionmode (an internal integer); assigning a numeric value sets the associated mode, whilst the current mode may be ascertained by interrogating its value. Symbolic definitions of these values may be provided through an associated macro library.
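Read/write access makes it possible to change the interaction mode temporarily and then restore whatever was in force before, as in this sketch (the macro name \savedmode is invented for the example):

```tex
\edef\savedmode{\the\interactionmode}% remember the current mode
\batchmode                           % suppress terminal output for a while
% ... material which would otherwise be noisy ...
\interactionmode=\savedmode\relax    % restore the previous mode
```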
\showgroups: \showgroups causes e-TeX to display the level and type of all active groups from the point within which it was called.
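For instance, the following sketch calls \showgroups from within three nested groups of different types; the exact wording of the resulting display is not reproduced here.

```tex
\begingroup        % a semi-simple group
  \hbox{%          % an hbox group
    {\showgroups}% % a simple group, within which the display is requested
  }%
\endgroup
```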
\showtokens: cf. \detokenize.
\tracinggroups: \tracinggroups (an internal read/write integer) causes e-TeX to trace entry and exit to every group while set to a positive value.
\tracingscantokens: when positive, causes an open-parenthesis and space to be displayed whenever \scantokens is invoked; the matching close-parenthesis will be recorded when the scan is complete. If a traceback occurs during the expansion of \scantokens, the first displayed line number will reflect the logical line number of the pseudo-file created from the parameter to \scantokens; thus enabling \tracingscantokens can assist in identifying why a seemingly irrational line number is shewn as the source of error (the traceback always continues until the line number of the actual source file is displayed).
\tracingcommands: if \tracingcommands is greater than 2, additional information is displayed. [More detail needed here!]
\everyeof: analogous to the other \every... primitives, it takes as parameter a <balanced text>, the tokens of which are inserted when the end of a file (either real or virtual, if \scantokens is used) is reached. This allows \input statements to be used within the replacement text of \edefs, and allows totally arbitrary files to be \input within an e-TeX conditional, since the necessary \fi can be inserted before e-TeX complains that it has fallen off the end of the file.
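One way in which \everyeof might be used to \input a file within an \edef is sketched below. The file name data.txt and the macro name \contents are hypothetical, and the use of \noexpand as the inserted token is a common idiom rather than anything prescribed above.

```tex
\begingroup
  % At end of file a \noexpand is inserted; it lets e-TeX step over
  % the end of the file without complaint and on to the closing brace.
  \everyeof{\noexpand}%
  \xdef\contents{\input data.txt }% hypothetical file name
\endgroup
\show\contents
```

The file's contents are of course fully expanded by the \xdef, so this is suitable only for files whose contents can safely be expanded.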
\middle: analogous to \left and \right, \middle specifies that the following delimiter is to serve both as a right and left delimiter; it will be set with spacing appropriate to a right delimiter w.r.t. the preceding atom(s), and with spacing appropriate to a left delimiter w.r.t. the succeeding atom(s).
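A typical use is a vertical bar within braces, as in this sketch written for an extended mode Plain format; the explicit \; spaces are a matter of taste.

```tex
$$
  \left\{\, x \in X \;\middle|\; {x^2 \over 2} > 1 \,\right\}
$$
```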
\unless: TeX's conditionals \ifeof, \ifodd, \ifvoid, etc., have no complementary counterparts. Whilst this normally poses no problems since each accepts both a \then (implicit) and an \else (explicit) part, they fall down when used as the final \if... of a \loop ... \if ... \repeat construct, since no \else is allowed after the final \if.... \unless allows the sense of all Boolean conditionals to be inverted, and thus (for example) \unless \ifeof yields true iff end-of-file has not yet been reached.
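The \loop case can be sketched as follows, assuming an extended mode Plain format (for \loop, \repeat and \newcount); the counter name \n is invented for the example.

```tex
\newcount\n
\n=0
\loop
  \advance\n by 1
  \message{\the\n}%
\unless\ifnum\n>4 \repeat   % continue until \n exceeds 4
```

The loop body is executed for \n = 1 to 5; without \unless the test would have to be rewritten as \ifnum\n<5, which is easy here but not possible for conditionals such as \ifeof or \ifvoid.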
At the time of writing, e-TeX version 1 is ready to go to TeX-Implementors, although work remains to be done on the eTrip test. The version being prepared for the implementors is termed Version 1beta (the NTS team themselves acted as alpha-testers). Once the implementors have given us the go-ahead and said that in their opinion e-TeX is a viable alternative to TeX (by which I mean that it is completely compatible, and functions according to the accompanying documentation), we will release it to the TeX world as a whole. We will react as quickly as possible to any bug reports (we sincerely hope that there will be few!), and we will then concentrate on new features for version 2. We certainly intend to work as closely as possible with the LaTeX2e team, not because we believe that LaTeX2e is necessarily right for everybody, but because (a) we respect the intellect and knowledge of the members of the LaTeX2e team, and (b) because it might be possible to enable them to achieve things with LaTeX and e-TeX which would either be impossible or extraordinarily difficult with LaTeX and TeX. We have a very long list of suggestions from Nelson Beebe, we still have many of Knuth's `possibly good ideas' to consider, and we have an enormous number of suggestions made on NTS-L: we are unlikely to run out of ideas for many years yet!
(Put on the WWW by Bernd Raichle, Member of the NTS group.)