This directory contains some examples illustrating techniques for
extracting high performance from flex scanners.  Each program implements
a simplified version of the Unix "wc" tool: read text from stdin and
print the number of characters, words, and lines present in the text.
All programs were compiled using gcc (version unavailable, sorry) with
the -O flag, and run on a SPARCstation 1+.  The input used was a
PostScript file, mainly containing figures, with the following "wc"
counts:

     lines    words    characters
    214217   635954       2592172


The basic principles illustrated by these programs are:

    - match as much text with each rule as possible
    - adding rules does not slow you down!
    - avoid backing up

and the big caveat that comes with them is:

    - you buy performance with decreased maintainability; make
      sure you really need it before applying the above techniques.

See the "Performance Considerations" section of flexdoc for more
details regarding these principles.  A rough sketch of what the first
two principles look like in practice follows the timing results below.


The different versions of "wc":

    mywc.c      a simple but fairly efficient C version

    wc1.l       a naive flex "wc" implementation

    wc2.l       somewhat faster; adds rules to match multiple tokens at once

    wc3.l       faster still; adds more rules to match longer runs of tokens

    wc4.l       fastest; still more rules added; hard to do much better
                using flex (or, I suspect, hand-coding)

    wc5.l       identical to wc3.l except one rule has been slightly
                shortened, introducing backing up

Timing results (all times in user CPU seconds):

    program    time    notes
    -------    ----    -----
    wc1        16.4    default flex table compression (= -Cem)
    wc1         6.7    -Cf compression option
    /bin/wc     5.8    Sun's standard "wc" tool
    mywc        4.6    simple but better C implementation!
    wc2         4.6    as good as the C implementation; built using -Cf
    wc3         3.8    -Cf
    wc4         3.3    -Cf
    wc5         5.7    -Cf; ouch, backing up is expensive
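
As a rough illustration of how "match as much text with each rule as
possible" plays out, the sketch below counts a whole word together with
its trailing whitespace (and, where possible, the following newline) in
a single match, rather than one token at a time.  It is only a sketch
in the spirit of wc1.l/wc2.l, not a copy of them; the definition names
(ws, nonws, word), the counter variables, and the file name are
illustrative choices.

/* wc-sketch.l -- illustrative only, not one of the wc*.l files in this
 * directory.  Matching a word plus its surrounding whitespace in one
 * rule means fewer rule actions run per unit of input.
 */
%{
#include <stdio.h>
int cc = 0, wc = 0, lc = 0;   /* character, word, and line counts */
%}
%option noyywrap

ws      [ \t]
nonws   [^ \t\n]
word    {ws}*{nonws}+

%%

{word}{ws}*     cc += yyleng; ++wc;         /* word plus trailing blanks */
{word}{ws}*\n   cc += yyleng; ++wc; ++lc;   /* same, plus the newline */
{ws}+           cc += yyleng;               /* whitespace not absorbed by a word match */
\n              ++lc; ++cc;                 /* newline not consumed above */

%%

int main(void)
{
    yylex();
    printf("%8d %8d %8d\n", lc, wc, cc);
    return 0;
}

wc3.l and wc4.l push the same idea further by adding rules that match
longer runs of tokens per action.  Building such a scanner with the -Cf
option gives the full-table scanners timed above, and flex's -b option
writes a lex.backup report showing whether any rule forces the scanner
to back up (the problem deliberately introduced in wc5.l).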