This directory contains some examples illustrating techniques for getting
high performance out of flex scanners.  Each program implements a simplified
version of the Unix "wc" tool: read text from stdin and print the number of
characters, words, and lines present in the text.  All programs were compiled
using gcc (version unavailable, sorry) with the -O flag, and run on a
SPARCstation 1+.  The input used was a PostScript file, mainly containing
figures, with the following "wc" counts:

	lines  words  characters
	214217 635954 2592172


The basic principles illustrated by these programs are:

	- match as much text with each rule as possible
	- adding rules does not slow you down!
	- avoid backing up

and the big caveat that comes with them is:

	- you buy performance with decreased maintainability; make
	  sure you really need it before applying the above techniques.

See the "Performance Considerations" section of flexdoc for more
details regarding these principles.


The different versions of "wc":

	mywc.c
		a simple but fairly efficient C version

	wc1.l	a naive flex "wc" implementation

	wc2.l	somewhat faster; adds rules to match multiple tokens at once

	wc3.l	faster still; adds more rules to match longer runs of tokens

	wc4.l	fastest; still more rules added; hard to do much better
		using flex (or, I suspect, hand-coding)

	wc5.l	identical to wc3.l except one rule has been slightly
		shortened, introducing backing-up

Timing results (all times in user CPU seconds):

	program   time   notes
	-------   ----   -----
	wc1       16.4   default flex table compression (= -Cem)
	wc1        6.7   -Cf compression option
	/bin/wc    5.8   Sun's standard "wc" tool
	mywc       4.6   simple but better C implementation!
	wc2        4.6   as good as C implementation; built using -Cf
	wc3        3.8   -Cf
	wc4        3.3   -Cf
	wc5        5.7   -Cf; ouch, backing up is expensive
