Synopsis - Code Repository

Diff for /branches/S10/Synopsis/Parsers/Cpp/ucpp/README between version 2017 and 2018

version 2017, Sun Mar 1 01:49:46 2009 UTC version 2018, Wed Apr 15 14:28:50 2009 UTC
Line 1 
Line 1 
 ucpp-1.2 is a C preprocessor mostly compliant to ISO-C99.  ucpp-1.3 is a C preprocessor compliant to ISO-C99.
   
 Author: Thomas Pornin <pornin@bolet.org>  Author: Thomas Pornin <pornin@bolet.org>
 Main site: http://www.di.ens.fr/~pornin/ucpp/  Main site: http://pornin.nerim.net/ucpp/
   
   
   
Line 12 
Line 12 
 replacement, conditional compilation and inclusion of header files.  replacement, conditional compilation and inclusion of header files.
 It is often found as a stand-alone program on Unix systems.  It is often found as a stand-alone program on Unix systems.
   
 Ucpp is such a preprocessor; it is designed to be quick and light,  ucpp is such a preprocessor; it is designed to be quick and light,
 but anyway fully compliant to the ISO standard 9899:1999, also known  but anyway fully compliant to the ISO standard 9899:1999, also known
 as C99. Ucpp can be compiled as a stand-alone program, or linked to  as C99. ucpp can be compiled as a stand-alone program, or linked to
 some other code; in the latter case, ucpp will output tokens, one  some other code; in the latter case, ucpp will output tokens, one
 at a time, on demand, as an integrated lexer.  at a time, on demand, as an integrated lexer.
   
 Ucpp operates in two modes:  ucpp operates in two modes:
 -- lexer mode: ucpp is linked to some other code and outputs a stream of  -- lexer mode: ucpp is linked to some other code and outputs a stream of
 tokens (each call to the lex() function will give one token)  tokens (each call to the lex() function will yield one token)
 -- non-lexer mode: ucpp preprocesses text and outputs the resulting text  -- non-lexer mode: ucpp preprocesses text and outputs the resulting text
 on a file descriptor; if linked to some other code, the cpp() function  to a file descriptor; if linked to some other code, the cpp() function
 must be called repeatedly, otherwise ucpp is a stand-alone binary.  must be called repeatedly, otherwise ucpp is a stand-alone binary.
   
   
Line 40 
Line 40 
   NO_LIBC_BUF    NO_LIBC_BUF
   NO_UCPP_BUF    NO_UCPP_BUF
      Two options used to disable the two bufferings inside ucpp. Define       Two options used to disable the two bufferings inside ucpp. Define
      both options for maximum memory saving but you will probably want       both options for maximum memory savings but you will probably want
      to keep libc buffering if you want decent performance. Define none       to keep libc buffering for decent performance. Define none on large
      on large systems (modern 32 or 64-bit systems).       systems (modern 32 or 64-bit systems).
   UCPP_MMAP    UCPP_MMAP
      With this option, if ucpp internal buffering is active, ucpp will       With this option, if ucpp internal buffering is active, ucpp will
      try to mmap() the input files. This might give a slight performance       try to mmap() the input files. This might yield a slight performance
      improvement, but will work only on a limited set of architectures.       improvement, but will work only on a limited set of architectures.
   PRAGMA_TOKENIZE    PRAGMA_TOKENIZE
      Make ucpp generate tokenized PRAGMA tokens on #pragma and _Pragma();       Make ucpp generate tokenized PRAGMA tokens on #pragma and _Pragma();
Line 63 
Line 63 
      Do not evaluate _Pragma() inside #if, #include, #include_next and #line       Do not evaluate _Pragma() inside #if, #include, #include_next and #line
      directives; instead, emit an error (since the remaining _Pragma will       directives; instead, emit an error (since the remaining _Pragma will
      surely imply a syntax error).       surely imply a syntax error).
     DSHARP_TOKEN_MERGE
        When two tokens are to be merged with the `##' operator, but fail
        because they do not merge into a single valid token, ucpp keeps those
        two tokens separate by adding an extra space between them in text
        output. With this option on, that extra space is not added, which means
        that some tokens may merge partially if the text output is preprocessed
        again. See tune.h for details.
   INMACRO_FLAG    INMACRO_FLAG
      In lexer mode, set the inmacro flag to 1 if the current token comes       In lexer mode, set the inmacro flag to 1 if the current token comes
      from a macro replacement, 0 otherwise. macro_count maintains an       from a macro replacement, 0 otherwise. macro_count maintains an
Line 76 
Line 83 
      Default predefined macros in stand-alone ucpp.       Default predefined macros in stand-alone ucpp.
   STD_ASSERT    STD_ASSERT
      Default assertions in stand-alone ucpp.       Default assertions in stand-alone ucpp.
   NATIVE_INTMAX    NATIVE_SIGNED
   NATIVE_UINTMAX    NATIVE_UNSIGNED
   SIMUL_UINTMAX    NATIVE_UNSIGNED_BITS
     NATIVE_SIGNED_MIN
     NATIVE_SIGNED_MAX
     SIMUL_ARITH_SUBTYPE
     SIMUL_SUBTYPE_BITS
     SIMUL_NUMBITS
   WCHAR_SIGNEDNESS    WCHAR_SIGNEDNESS
      Those options define how #if expressions are evaluated; see the       Those options define how #if expressions are evaluated; see the
      cross-compilation section of this file for more info.       cross-compilation section of this file for more info, and the
        comments in tune.h. Extra info is found in arith.h and arith.c,
        at the possible expense of your mental health.
   DEFAULT_LEXER_FLAGS    DEFAULT_LEXER_FLAGS
   DEFAULT_CPP_FLAGS    DEFAULT_CPP_FLAGS
      Default flags in respectively lexer and non-lexer modes.       Default flags in respectively lexer and non-lexer modes.
Line 90 
Line 104 
      siglongjmp(); it is known to (very slightly) improve performance       siglongjmp(); it is known to (very slightly) improve performance
      on AIX systems.       on AIX systems.
   MAX_CHAR_VAL    MAX_CHAR_VAL
      Ucpp will consider characters whose value is equal or above       ucpp will consider characters whose value is equal or above
      MAX_CHAR_VAL as outside the C source charset (so they will be       MAX_CHAR_VAL as outside the C source charset (so they will be
      treated just like '@', for instance). For ASCII systems, 128       treated just like '@', for instance). For ASCII systems, 128
      is fine. 256 is a safer value, but uses more (static) memory.       is fine. 256 is a safer value, but uses more (static) memory.
Line 107 
Line 121 
      true content; this is intended for reconstruction of the source       true content; this is intended for reconstruction of the source
      line. Beware that some comments may have embedded newlines.       line. Beware that some comments may have embedded newlines.
   COPY_LINE_LENGTH    COPY_LINE_LENGTH
      Ucpp can maintain a copy of the current source line, up to that       ucpp can maintain a copy of the current source line, up to that
      length. Irrelevant to stand-alone version.       length. Irrelevant to stand-alone version.
   *_MEMG    *_MEMG
      Those settings modify ucpp behaviour, wrt memory allocations. With       Those settings modify ucpp behaviour, wrt memory allocations. With
Line 126 
Line 140 
      With this setting, ucpp will check for the return value of malloc()       With this setting, ucpp will check for the return value of malloc()
      and exit with a diagnostic when out of memory. MEM_CHECK is implied       and exit with a diagnostic when out of memory. MEM_CHECK is implied
      by AUDIT.       by AUDIT.
     -DMEM_DEBUG
        Enable memory debug code. This will track memory leaks and several
        occurrences of memory management errors; it will also slow down
        things and increase memory consumption, so you probably do not
        want to use this option.
   -DINLINE=foobar    -DINLINE=foobar
      The ucpp code uses "inline" qualifier for some functions; by       The ucpp code uses "inline" qualifier for some functions; by
      default, that qualifier is macro-replaced with nothing. Define       default, that qualifier is macro-replaced with nothing. Define
Line 140 
Line 159 
    gcc believes some variables might be used prior to their     gcc believes some variables might be used prior to their
    initialization; ignore those messages.     initialization; ignore those messages.
   
 5. Install wherever you want the binary and the man page ucpp.1.  5. Install wherever you want the binary and the man page ucpp.1. I
      have not provided an install sequence because I didn't bother.
   
 6. If you do not have the make utility, compile each file seperately  6. If you do not have the make utility, compile each file separately
    and link them together. The exact details depend on your compiler.     and link them together. The exact details depend on your compiler.
    You must define the macro STAND_ALONE when compiling cpp.c (there     You must define the macro STAND_ALONE when compiling cpp.c (there
    is such a definition, commented out, in cpp.c, line 34).     is such a definition, commented out, in cpp.c, line 34).
   
 There is no "configure" script since:  There is no "configure" script because:
 -- I do not like the very idea of a "configure" script.  -- I do not like the very idea of a "configure" script.
 -- Ucpp is written in ANSI-C and should be fairly portable.  -- ucpp is written in ANSI-C and should be fairly portable.
 -- There is no such thing as "standard" settings for a C preprocessor.  -- There is no such thing as "standard" settings for a C preprocessor.
    The predefined system macros, standard assertions,... must be tuned     The predefined system macros, standard assertions,... must be tuned
    by the sysadmin.     by the sysadmin.
Line 161 
Line 181 
 not C99 (or later), read the cross-compilation section in this README  not C99 (or later), read the cross-compilation section in this README
 file.  file.
   
 The C90 and C99 standards state that external linkage names might  The C90 and C99 standards state that external linkage names might be
 be considered equal or different based upon only their first 6  considered equal or different based upon only their first 6 characters;
 characters; this rule might make ucpp not to compile on a conformant C  this rule might make ucpp not compile on a conformant C implementation.
 implementation. I have yet to see such an implementation, however.  I have yet to see such an implementation, however.
   
 If you want to use ucpp as an integrated preprocessor and lexer, see the  If you want to use ucpp as an integrated preprocessor and lexer, see the
 section REUSE. Compiling ucpp as a library is an exercise left to the  section REUSE. Compiling ucpp as a library is an exercise left to the
 reader.  reader.
   
 With the LOW_MEM code enabled, ucpp can run on a Minix-86 or Msdos  With the LOW_MEM code enabled, ucpp can run on a Minix-i86 or Msdos
 16-bit small-memory-model machine. It will not be fully compliant  16-bit small-memory-model machine. It will not be fully compliant
 on such an architecture to C99, since C99 states that at least one  on such an architecture to C99, since C99 states that at least one
 source code with 4095 simultaneously defined macros must be processed;  source code with 4095 simultaneously defined macros must be processed;
Line 189 
Line 209 
 subclause (which BSD dropped recently anyway) and with no reference to  subclause (which BSD dropped recently anyway) and with no reference to
 Berkeley (since the code is all mine, written from scratch). Informally,  Berkeley (since the code is all mine, written from scratch). Informally,
 this means that you can reuse and redistribute the code as you want,  this means that you can reuse and redistribute the code as you want,
 provided that you states in the documentation (or any substantial part  provided that you state in the documentation (or any substantial part of
 of the software) of redistributed code that I am the original author.  the software) of redistributed code that I am the original author. (If
 (If you press a cdrom with 200 software packages, I do not insist on  you press a cdrom with 200 software packages, I do not insist on having
 having my name on the cover of the cdrom -- just keep a Readme file  my name on the cover of the cdrom -- just keep a Readme file somewhere
 somewhere on the cdrom, with the copyright notice included.)  on the cdrom, with the copyright notice included.)
   
 As a courteous gesture, if you reuse my code, please drop me a mail.  As a courteous gesture, if you reuse my code, please drop me a mail.
 It raises my self-esteem.  It raises my self-esteem.
Line 351 
Line 371 
   
   
 Afterwards:  Afterwards:
   
 -- if you are in lexer mode, call lex(); each call will make the ctok  -- if you are in lexer mode, call lex(); each call will make the ctok
    field point to the next token. A non-zero return value is an error.     field point to the next token. A non-zero return value is an error.
    lex() skips whitespace tokens. The memory used by the string value     lex() skips whitespace tokens. The memory used by the string value
Line 364 
Line 385 
    ignore the error.     ignore the error.
   
 -- otherwise, call cpp(); each call will analyze one or more tokens  -- otherwise, call cpp(); each call will analyze one or more tokens
    (one token if it did not find a cpp directive, or a macro name).     (one token if it did find neither a cpp directive nor a macro name).
    A positive return value is an error.     A positive return value is an error.
   
 For both functions, if the return value is CPPERR_EOF (which is a  For both functions, if the return value is CPPERR_EOF (which is a
Line 411 
Line 432 
 This will add a trailing 0 if the line was not read entirely.  This will add a trailing 0 if the line was not read entirely.
   
   
   ucpp may be configured at runtime to accept alternate characters as
   possible parts of identifiers. Typical intended usage is for the '$'
   and '@' characters. The two relevant functions are set_identifier_char()
   and unset_identifier_char(). When this call is issued:
           set_identifier_char('$');
   then for all the remaining input, the '$' character will be considered
   as just another letter, as far as identifier tokenizing is concerned. This
   is for identifiers only; numeric constants are not modified by that setting.
   This call resets things back:
           unset_identifier_char('$');
   Those two functions modify the static table which is initialized by
   init_cpp(). You may call init_cpp() at any time to restore the table
   to its standard state.
   
   When using this feature, take care of the following points:
   
   -- Do NOT use a character whose numeric value (as an `unsigned char'
   cast into an `int') is greater than or equal to MAX_CHAR_VAL (in tune.h).
   This would lead to unpredictable results, including an abrupt crash of
   ucpp. ucpp makes absolutely no check whatsoever on that matter: this is
   the programmer's responsibility.
   
   -- If you use a standard character such as '+' or '{', tokens which
   begin with those characters cease to exist. This can be troublesome.
   If you use set_identifier_char() on the '<' character, the handling of
   #include directives will be greatly disturbed. Therefore the use of any
   standard C character in set_identifier_char() or unset_identifier_char()
   is declared unsupported, forbidden and altogether unwise.
   
   -- Stricto sensu, when an extra character is declared as part of an
   identifier, ucpp behaviour ceases to conform to C99, which mandates that
   characters such as '$' or '@' must be treated as independent tokens of
   their own. Therefore, if your purpose is to use ucpp in a conformant
   C implementation, the use of set_identifier_char() should be made at
   least a runtime option.
   
   -- When enabling a new character in the middle of a macro replacement,
   the effect of that replacement may be delayed up to the end of that
   macro (but this is a "may" !). If you wish to trigger this feature with
   a custom #pragma or _Pragma(), you should remember it (for instance,
   using _Pragma() in a macro replacement, and then the extra character
   in the same macro replacement, is not reliable).
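
   The table-driven behaviour described above can be illustrated with a
   small stand-alone sketch. It does not use the ucpp sources: every name
   prefixed with demo_ or DEMO_ is hypothetical, the value 128 is an
   assumed setting, and the range check is an addition (ucpp itself makes
   no such check, as noted above).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Assumed value; in ucpp the real setting lives in tune.h. */
#define DEMO_MAX_CHAR_VAL 128

/* Hypothetical stand-in for the static table that init_cpp() resets. */
static unsigned char demo_ident_table[DEMO_MAX_CHAR_VAL];

/* Restore the standard state: letters, digits and underscore only. */
static void demo_init_table(void)
{
    int c;

    memset(demo_ident_table, 0, sizeof demo_ident_table);
    for (c = 0; c < DEMO_MAX_CHAR_VAL; c++)
        if (isalnum(c) || c == '_')
            demo_ident_table[c] = 1;
}

/* Mark c as an identifier character; returns -1 if c is out of range. */
static int demo_set_identifier_char(int c)
{
    if (c < 0 || c >= DEMO_MAX_CHAR_VAL)
        return -1;
    demo_ident_table[c] = 1;
    return 0;
}

/* Undo the previous call; returns -1 if c is out of range. */
static int demo_unset_identifier_char(int c)
{
    if (c < 0 || c >= DEMO_MAX_CHAR_VAL)
        return -1;
    demo_ident_table[c] = 0;
    return 0;
}

/* Length of the identifier at the start of s, or 0 if there is none. */
static size_t demo_ident_len(const char *s)
{
    size_t n = 0;
    unsigned char c = (unsigned char)s[0];

    /* An identifier cannot start with a digit or an unknown character. */
    if (c >= DEMO_MAX_CHAR_VAL || !demo_ident_table[c] || isdigit(c))
        return 0;
    while ((c = (unsigned char)s[n]) != 0
           && c < DEMO_MAX_CHAR_VAL && demo_ident_table[c])
        n++;
    return n;
}
```

   With the default table, scanning "foo$bar" stops after "foo"; after
   demo_set_identifier_char('$') the whole string is one identifier.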
   
   
   
 COMPATIBILITY NOTES  COMPATIBILITY NOTES
 -------------------  -------------------
Line 421 
Line 486 
 -- Traditional C, aka "K&R". This is the language first described by  -- Traditional C, aka "K&R". This is the language first described by
 Brian Kernighan and Dennis Ritchie, and implemented in the first C  Brian Kernighan and Dennis Ritchie, and implemented in the first C
 compiler that was ever coded. There are actually several dialects of  compiler that was ever coded. There are actually several dialects of
 K&R, and all of them are considered as deprecated.  K&R, and all of them are considered deprecated.
   
 -- ISO 9899:1990, aka C90, aka C89, aka ANSI-C. Formalized by ANSI  -- ISO 9899:1990, aka C90, aka C89, aka ANSI-C. Formalized by ANSI
 in 1989 and adopted by ISO the next year, it is the C flavour many C  in 1989 and adopted by ISO the next year, it is the C flavour many C
Line 429 
Line 494 
 with enhancements, clarifications and several new features.  with enhancements, clarifications and several new features.
   
 -- ISO 9899:1999, aka C99. This is an evolution on C90, almost fully  -- ISO 9899:1999, aka C99. This is an evolution on C90, almost fully
 backward compatible with C90 (exhibitting a code that makes a difference  backward compatible with C90. C99 introduces many new and useful
 is a tricky exercise). C99 introduces many new and useful features,  features, however, including in the preprocessor.
 however, including in the preprocessor.  
   
 There was also a normative addendum in 1995, that added a few features  There was also a normative addendum in 1995, that added a few features
 to C90 (for instance, digraphs) that are also present in C99.  to C90 (for instance, digraphs) that are also present in C99. It is
   sometimes referred to as "C95" or "AMD 1".
   
   
 Ucpp implements the C99 standard, but can be used in a stricter mode,  ucpp implements the C99 standard, but can be used in a stricter mode,
 to enforce C90 compatibility (it will, however, still recognize some  to enforce C90 compatibility (it will, however, still recognize some
 constructions that are not in plain C90).  constructions that are not in plain C90).
   
 Ucpp also knows several extensions to C99:  ucpp also knows about several extensions to C99:
   
 -- Assertions: this is an extension to the defined() operator, with  -- Assertions: this is an extension to the defined() operator, with
    its own namespace. Assertions seem to be used in several places,     its own namespace. Assertions seem to be used in several places,
Line 460 
Line 525 
 support is always active.  support is always active.
   
 The ucpp code itself should be compatible with any ISO-C90 compiler.  The ucpp code itself should be compatible with any ISO-C90 compiler.
 The cpp.c file is rather big (~ 53kB), it might confuse old 16-bit C  The cpp.c file is rather big (~ 64kB), it might confuse old 16-bit C
 compilers; the macro.c file is somewhat large also (~ 43kB).  compilers; the macro.c file is somewhat large also (~ 47kB).
   
 The evaluation of #if expressions is subject to some subtleties, see the  The evaluation of #if expressions is subject to some subtleties, see the
 section "cross-compilation".  section "cross-compilation".
Line 473 
Line 538 
 strict positivity is already assured by the C standard, so you just need  strict positivity is already assured by the C standard, so you just need
 to adjust MAX_CHAR_VAL.  to adjust MAX_CHAR_VAL.
   
 Ucpp has been tested successfully on ASCII/ISO-8859-1 and EBCDIC systems.  ucpp has been tested successfully on ASCII/ISO-8859-1 and EBCDIC systems.
 Beware that UTF-8 is NOT compatible with EBCDIC.  Beware that UTF-8 is NOT compatible with EBCDIC.
   
 Pragma handling: when used in non-lexer mode, ucpp tries to output  Pragma handling: when used in non-lexer mode, ucpp tries to output a
 a source text that, read again, will give the exact same stream of  source text that, when read again, will yield the exact same stream of
 tokens. This is not completely true with regards to line numbering in  tokens. This is not completely true with regards to line numbering in
 some tricky macro replacements, but it should work correctly otherwise,  some tricky macro replacements, but it should work correctly otherwise,
 especially with pragma directives if the compile-time option PRAGMA_DUMP  especially with pragma directives if the compile-time option PRAGMA_DUMP
 was set: #pragma are dumped, non-void _Pragma() are converted to the  was set: #pragma are dumped, non-void _Pragma() are converted to the
 corresponding #pragma and dumped also.  corresponding #pragma and dumped also.
   
 Ucpp does not macro-replace the contents of #pragma and _Pragma();  ucpp does not macro-replace the contents of #pragma and _Pragma();
 If you want a macro-replaced pragma, use this:  If you want a macro-replaced pragma, use this:
   
 #define pragma_(x)      _Pragma(#x)  #define pragma_(x)      _Pragma(#x)
Line 494 
Line 559 
 inside a #pragma or another _Pragma).  inside a #pragma or another _Pragma).
   
   
 I wrote ucpp according to what is found in "The Language C" from Brian  I wrote ucpp according to what is found in "The C Programming Language"
 Kernighan and Dennis Ritchie (2nd edition) and the C99 standard; but I  from Brian Kernighan and Dennis Ritchie (2nd edition) and the C99
 could have misinterpreted some points. On some tricky points I got help  standard; but I could have misinterpreted some points. On some tricky
 from the helpful people from the comp.std.c newsgroup. For assertions  points I got help from the helpful people from the comp.std.c newsgroup.
 and #include_next, I mimicked the behaviour of GNU cpp, as is stated  For assertions and #include_next, I mimicked the behaviour of GNU cpp,
 in the GNU cpp info documentation. An open question is related to the  as is stated in the GNU cpp info documentation. An open question is
 following code:  related to the following code:
   
 #define undefined       !  #define undefined       !
 #define makeun(x)       un ## x  #define makeun(x)       un ## x
Line 510 
Line 575 
 bar  bar
 #endif  #endif
   
 Ucpp will replace 'defined foo' with 0 first (since foo is not defined),  ucpp will replace 'defined foo' with 0 first (since foo is not defined),
 then it will replace the macro makeun, and the expression will become  then it will replace the macro makeun, and the expression will become
 'un0', which is replaced by 0 since this is a remaining identifier. The  'un0', which is replaced by 0 since this is a remaining identifier. The
 expression evaluates to false, and 'bar' is emitted.  expression evaluates to false, and 'bar' is emitted.
Line 538 
Line 603 
 behaviour).  behaviour).
   
   
   Another point about macro replacement has been discussed at length in
   on several occasions. It is about the following code:
   
   #define CAT(a, b)    CAT_(a, b)
   #define CAT_(a, b)   a ## b
   #define AB(x, y)     CAT(x, y)
   CAT(A, B)(X, Y)
   
   ucpp will produce `CAT(X,Y)' as replacement for the last line, whereas
   some other preprocessors output `XY'. The answer to the question
   "which behaviour is correct" seems to be "this is not defined by the
   C standard". It is the answer that has been actually given by the C
   standardization committee in 1992, to the defect report #017, question
   23, which asked that very same question. Since the wording of the
   standard has not changed in these parts from the 1990 to the 1999
   version, the preprocessor behaviour on the above-stated code should
   still be considered as undefined.
   
   It seems, however, that there used to be a time (around 1988) when the
   committee members agreed upon a precise macro-replacement algorithm,
   which specified quite clearly the preprocessor behaviour in such a
   situation. ucpp behaviour is occasionally claimed as "incorrect" with
   regards to that algorithm. Since that macro replacement algorithm has
   never been published, and the committee itself backed out from it in
   1992, I decided to disregard those feeble claims.
   
   It is possible, however, that at some point in the future I rewrite the
   ucpp macro replacement code, since that code is a bit messy and might be
   made to use less memory on some occasions. It is then possible that, in
   the aftermath of such a rewrite, the ucpp behaviour for the above stated
   code becomes tunable. Don't hold your breath, though.
   
   
 About _Pragma: the standard is not clear about when this operator is  About _Pragma: the standard is not clear about when this operator is
 evaluated, and if it is allowed inside #if directives and such. For  evaluated, and if it is allowed inside #if directives and such. For
 ucpp, I coded _Pragma as a special macro with lazy replacement: it will  ucpp, I coded _Pragma as a special macro with lazy replacement: it will
Line 594 
Line 692 
 equivalent, except that the types used are intmax_t and uintmax_t, as  equivalent, except that the types used are intmax_t and uintmax_t, as
 defined in <stdint.h>.  defined in <stdint.h>.
   
 Ucpp can use two expression evaluators: one uses native integer types  ucpp can use two expression evaluators: one uses native integer types
 (one signed and one unsigned), the other evaluator emulates big integer  (one signed and one unsigned), the other evaluator emulates big integer
 numbers by representing them with two "unsigned long". By default, it  numbers by representing them with two values of some unsigned type. The
 will use the first evaluator, using (u)intmax_t as native types if the  emulated type handles signed values in two's complement representation,
 compiler is C99-compliant, or (unsigned) long otherwise. If you want  and can be any width ranging from 2 bits to twice the size of the
 another behaviour, modify the relevant section in tune.h. Here are  underlying native unsigned type used. An odd width is allowed. When
 examples of definitions:  right shifting an emulated signed negative value, it is left-padded with
   bits set to 1 (this is sign extension).
 /* evaluate natively with type "long long" */  
 #define NATIVE_UINTMAX  unsigned long long  When the ARITHMETIC_CHECKS macro is defined in tune.h, all occurrences
 #define NATIVE_INTMAX   long long  of implementation-defined or undefined behaviour during arithmetic
   evaluation are reported as errors or warned upon. This includes all
 /* evaluate natively with type "long" (even if bigger is available) */  overflows and underflows on signed quantities, constants too large,
 #define MATIVE_UINTMAX  unsigned long  and so on. Errors (which terminate immediately evaluation) are emitted
 #define MATIVE_INTMAX   long  for division by 0 (on / and % operators) and overflow (on / operator);
   otherwise, warnings are emitted and the faulty evaluation takes place.
 /* evaluate with bignum evaluation */  This prevents ucpp from crashing on typical x86 machines, while still
 #undef NATIVE_UINTMAX  allowing to use some extensions.
 #define SIMUL_UINTMAX  
   
 The bignum evaluation handles signed integers in two's complement  
 representation, whether this is the native integer representation or  
 not. The code makes the non-standard assumption that unsigned long are  
 represented unpadded in memory, that is, unsigned long are made up of  
 exactly sizeof(unsigned long) * CHAR_BIT bits. I have never heard of any  
 architecture where this assumption would be false.  
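
 The version 2018 text above describes an emulated signed type of
 arbitrary width, with sign extension on right shift and errors on
 division by 0 or division overflow. The following is a minimal sketch of
 those three ideas, not the code from arith.c: it stores the value in a
 single uint32_t rather than in two values, assumes a width between 2 and
 32 bits, and all emu_ names are hypothetical.

```c
#include <stdint.h>

/* Mask covering the low `bits` bits; the sketch assumes 2 <= bits <= 32. */
static uint32_t emu_mask(unsigned bits)
{
    return bits >= 32 ? 0xFFFFFFFFu : (((uint32_t)1 << bits) - 1u);
}

/* The top bit of the emulated width is the two's complement sign bit. */
static int emu_is_negative(uint32_t v, unsigned bits)
{
    return (int)((v >> (bits - 1)) & 1u);
}

/* Decode an emulated value into a native signed integer. */
static int64_t emu_to_i64(uint32_t v, unsigned bits)
{
    uint32_t m = emu_mask(bits);

    v &= m;
    if (emu_is_negative(v, bits))
        return -(int64_t)(((~v) & m) + 1u);
    return (int64_t)v;
}

/* Right shift; a negative value is left-padded with bits set to 1. */
static uint32_t emu_sar(uint32_t v, unsigned bits, unsigned n)
{
    uint32_t m = emu_mask(bits);
    uint32_t r;

    v &= m;
    if (n >= bits)
        return emu_is_negative(v, bits) ? m : 0;
    if (n == 0)
        return v;
    r = v >> n;
    if (emu_is_negative(v, bits))
        r |= m & ~(m >> n);     /* set the n vacated high bits */
    return r;
}

/* Checked division: returns -1 on division by 0 or on MIN / -1. */
static int emu_div(uint32_t a, uint32_t b, unsigned bits, uint32_t *out)
{
    uint32_t m = emu_mask(bits);
    int64_t x = emu_to_i64(a, bits);
    int64_t y = emu_to_i64(b, bits);
    int64_t min = -((int64_t)1 << (bits - 1));

    if (y == 0)
        return -1;              /* division by 0 */
    if (x == min && y == -1)
        return -1;              /* quotient does not fit in the width */
    *out = (uint32_t)(uint64_t)(x / y) & m;
    return 0;
}
```

 With a 7-bit width (an odd width, as allowed above), -4 is stored as
 0x7C; shifting it right by one gives 0x7E, which decodes to -2. Rejecting
 MIN / -1 before dividing is what avoids the hardware trap that the text
 alludes to on typical x86 machines.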
   
   
   
 FUTURE EVOLUTIONS  FUTURE EVOLUTIONS
 -----------------  -----------------
   
 Ucpp is quite complete now. There was a longstanding project of  ucpp is quite complete now. There was a longstanding project of
 "traditional" preprocessing, but I dropped it because it would not  "traditional" preprocessing, but I dropped it because it would not
 map cleanly on the token-based ucpp structure. Maybe I will code a  map cleanly on the token-based ucpp structure. Maybe I will code a
 string-based preprocessor one day; it would certainly use some of the  string-based preprocessor one day; it would certainly use some of the
 code from lexer.c, eval.c, mem.c and hash.c. However, making such a tool  code from lexer.c, eval.c, mem.c and nhash.c. However, making such a
 is almost irrelevant nowadays. If one wants to handle such project,  tool is almost irrelevant nowadays. If one wants to handle such project,
 using ucpp as code base, I would happily provide some help, if needed.  using ucpp as code base, I would happily provide some help, if needed.
   
   
Line 639 
Line 729 
 CHANGES  CHANGES
 -------  -------
   
   From 1.2 to 1.3:
   
   * brand new integer evaluation code, with precise evaluation and checks
   * new hash table implementation, with binary trees
   * relaxed attitude on failed `##' operators
   * bugfix on macro definition on command-line wrt nesting macros
   * support for up to 32766 macro arguments in LOW_MEM code
   * support for optional additional "identifier" characters such as '$' or '@'
   * bugfix: memory leak on void #assert
   
 From 1.1 to 1.2:  From 1.1 to 1.2:
   
 * bugfix: numerous memory leaks  * bugfix: numerous memory leaks
Line 764 
Line 864 
 ---------  ---------
   
 Volker Barthelmann, Neil Booth, Stephen Davies, Stéphane Ecolivet,  Volker Barthelmann, Neil Booth, Stephen Davies, Stéphane Ecolivet,
 Marcus Holland-Moritz, Antoine Leca, Cyrille Lefevre, Dave Rivers, Loic  Marc Espie, Marcus Holland-Moritz, Antoine Leca, Cyrille Lefevre,
 Tortay and Laurent Wacrenier, for suggestions and beta-testing.  Dave Rivers, Loic Tortay and Laurent Wacrenier, for suggestions and
   beta-testing.
   
 Paul Eggert, Douglas A. Gwyn, Clive D.W. Feather, and the other guys from  Paul Eggert, Douglas A. Gwyn, Clive D.W. Feather, and the other guys from
 comp.std.c, for explanations about the standard.  comp.std.c, for explanations about the standard.

