#! /usr/bin/env python

"""fixdiv - tool to fix division operators.

To use this tool, first run `python -Qwarnall yourscript.py 2>warnings'.
This runs the script `yourscript.py' while writing warning messages
about all uses of the classic division operator to the file
`warnings'.  The warnings look like this:

  <file>:<line>: DeprecationWarning: classic <type> division

The warnings are written to stderr, so you must use `2>' for the I/O
redirect.  I know of no way to redirect stderr on Windows in a DOS
box, so you will have to modify the script to set sys.stderr to some
kind of log file if you want to do this on Windows.

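Setting sys.stderr to a log file can be sketched as follows.  The
helper name run_with_stderr_log and the default log file name are
assumptions made for illustration; they are not part of this tool:

```python
import sys

def run_with_stderr_log(func, logname="warnings"):
    # Hypothetical helper: while func() runs, everything written to
    # sys.stderr goes to the file `logname` instead of the console.
    saved = sys.stderr
    log = open(logname, "w")
    sys.stderr = log
    try:
        return func()
    finally:
        # Always restore the real stderr, even if func() raised.
        sys.stderr = saved
        log.close()
```
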
The warnings are not limited to the script; modules imported by the
script may also trigger warnings.  In fact a useful technique is to
write a test script specifically intended to exercise all code in a
particular module or set of modules.

Then run `python fixdiv.py warnings'.  This first reads the warnings,
looking for classic division warnings, and sorts them by file name and
line number.  Then, for each file that received at least one warning,
it parses the file and tries to match the warnings up to the division
operators found in the source code.  If it is successful, it writes
its findings to stdout, preceded by a line of dashes and a line of the
form:

  Index: <file>

If the only findings are suggestions to change a / operator into
a // operator, the output is acceptable input for the Unix 'patch'
program.

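The reading and sorting step can be sketched as follows.  The names
WARNING_RE and group_warnings are hypothetical, and the regular
expression is only assumed to mirror the warning format shown above:

```python
import re

# <file>:<line>: DeprecationWarning: classic <type> division
WARNING_RE = re.compile(
    "^(.+?):([0-9]+): DeprecationWarning: "
    "classic (int|long|float|complex) division$")

def group_warnings(lines):
    # Map each file name to a sorted list of (line number, operand type).
    grouped = {}
    for line in lines:
        m = WARNING_RE.match(line.strip())
        if m:
            filename, lineno, what = m.groups()
            grouped.setdefault(filename, []).append((int(lineno), what))
    for entries in grouped.values():
        entries.sort()
    return grouped
```
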
Here are the possible messages on stdout (N stands for a line number):

- A plain-diff-style change ('NcN', a line marked by '<', a line
  containing '---', and a line marked by '>'):

  A / operator was found that should be changed to //.  This is the
  recommendation when only int and/or long arguments were seen.

- 'True division / operator at line N' and a line marked by '=':

  A / operator was found that can remain unchanged.  This is the
  recommendation when only float and/or complex arguments were seen.

- 'Ambiguous / operator (..., ...) at line N', line marked by '?':

  A / operator was found for which int or long as well as float or
  complex arguments were seen.  This is highly unlikely; if it occurs,
  you may have to restructure the code to keep the classic semantics,
  or maybe you don't care about the classic semantics.

- 'No conclusive evidence on line N', line marked by '*':

  A / operator was found for which no warnings were seen.  This could
  be code that was never executed, or code that was only executed
  with user-defined objects as arguments.  You will have to
  investigate further.  Note that // can be overloaded separately from
  /, using __floordiv__.  True division can also be separately
  overloaded, using __truediv__.  Classic division should be the same
  as either of those.  (XXX should I add a warning for division on
  user-defined objects, to disambiguate this case from code that was
  never executed?)

- 'Phantom ... warnings for line N', line marked by '*':

  A warning was seen for a line not containing a / operator.  The most
  likely cause is a warning about code executed by 'exec' or eval()
  (see note below), or an indirect invocation of the / operator, for
  example via the div() function in the operator module.  It could
  also be caused by a change to the file between the time the test
  script was run to collect warnings and the time fixdiv was run.

- 'More than one / operator in line N'; or
  'More than one / operator per statement in lines N-N':

  The scanner found more than one / operator on a single line, or in a
  statement split across multiple lines.  Because the warnings
  framework doesn't (and can't) show the offset within the line, and
  the code generator doesn't always give the correct line number for
  operations in a multi-line statement, we can't be sure whether all
  operators in the statement were executed.  To be on the safe side,
  by default a warning is issued about this case.  In practice, these
  cases are usually safe, and the -m option suppresses this warning.

- 'Can't find the / operator in line N', line marked by '*':

  This really shouldn't happen.  It means that the tokenize module
  reported a '/' operator but the line it returns didn't contain a '/'
  character at the indicated position.

- 'Bad warning for line N: XYZ', line marked by '*':

  This really shouldn't happen.  It means that a 'classic XYZ
  division' warning was read with XYZ being something other than
  'int', 'long', 'float', or 'complex'.

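The choice between these recommendations depends only on the operand
types seen in the warnings for a statement.  A minimal sketch of that
decision and of the suggested rewrite follows; the names recommend and
to_floordiv are hypothetical, and col is assumed to be the 0-based
column of the / operator:

```python
def recommend(kinds):
    # kinds: operand types reported for one statement, e.g. ["int", "long"].
    integral = [k for k in kinds if k in ("int", "long")]
    floating = [k for k in kinds if k in ("float", "complex")]
    if len(integral) + len(floating) != len(kinds):
        return "bad"          # unknown type name in a warning
    if integral and floating:
        return "ambiguous"    # both kinds seen: needs a human decision
    if integral:
        return "floordiv"     # only int/long seen: change / to //
    if floating:
        return "truediv"      # only float/complex seen: leave / alone
    return "no evidence"      # no warnings at all for this operator

def to_floordiv(line, col):
    # Insert a second slash at column col: / becomes //, /= becomes //=.
    return line[:col] + "/" + line[col:]
```
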
Notes:

- The augmented assignment operator /= is handled the same way as the
  / operator.

- This tool never looks at the // operator; no warnings are ever
  generated for use of this operator.

- This tool never looks at the / operator when a future division
  statement is in effect; no warnings are generated in this case, and
  because the tool only looks at files for which at least one classic
  division warning was seen, it will never look at files containing a
  future division statement.

- Warnings may be issued for code not read from a file, but executed
  using an exec statement or the eval() function.  These may have
  <string> in the filename position, in which case the fixdiv script
  will attempt and fail to open a file named '<string>' and issue a
  warning about this failure; or these may be reported as 'Phantom'
  warnings (see above).  You're on your own to deal with these.  You
  could make all recommended changes and add a future division
  statement to all affected files, and then re-run the test script; it
  should not issue any warnings.  If there are any, and you have a
  hard time tracking down where they are generated, you can use the
  -Werror option to force an error instead of a first warning,
  generating a traceback.

- The tool should be run from the same directory as that from which
  the original script was run, otherwise it won't be able to open
  files given by relative pathnames.
"""

import sys
import getopt
import re
import tokenize

# Set to 1 by the -m option: don't warn about multiple / operators
# in a single line or statement.
multi_ok = 0

def main():
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hm")
    except getopt.error, msg:
        usage(msg)
        return 2
    for o, a in opts:
        if o == "-h":
            print __doc__
            return
        if o == "-m":
            global multi_ok
            multi_ok = 1
    if not args:
        usage("at least one file argument is required")
        return 2
    if args[1:]:
        # write() takes a single string, so format the message with %.
        sys.stderr.write("%s: extra file arguments ignored\n" % sys.argv[0])
    warnings = readwarnings(args[0])
    if warnings is None:
        return 1
    files = warnings.keys()
    if not files:
        print "No classic division warnings read from", args[0]
        return
    files.sort()
    exit = None
    for filename in files:
        x = process(filename, warnings[filename])
        exit = exit or x
    return exit

def usage(msg):
    sys.stderr.write("%s: %s\n" % (sys.argv[0], msg))
    sys.stderr.write("Usage: %s [-m] warnings\n" % sys.argv[0])
    sys.stderr.write("Try `%s -h' for more information.\n" % sys.argv[0])

PATTERN = (r"^(.+?):(\d+): DeprecationWarning: "
           r"classic (int|long|float|complex) division$")

def readwarnings(warningsfile):
    prog = re.compile(PATTERN)
    try:
        f = open(warningsfile)
    except IOError, msg:
        sys.stderr.write("can't open: %s\n" % msg)
        return
    warnings = {}
    while 1:
        line = f.readline()
        if not line:
            break
        m = prog.match(line)
        if not m:
            if line.find("division") >= 0:
                sys.stderr.write("Warning: ignored input " + line)
            continue
        filename, lineno, what = m.groups()
        list = warnings.get(filename)
        if list is None:
            warnings[filename] = list = []
        list.append((int(lineno), intern(what)))
    f.close()
    return warnings

def process(filename, list):
    print "-"*70
    assert list # if this fails, readwarnings() is broken
    try:
        fp = open(filename)
    except IOError, msg:
        sys.stderr.write("can't open: %s\n" % msg)
        return 1
    print "Index:", filename
    f = FileContext(fp)
    list.sort()
    index = 0 # list[:index] has been processed, list[index:] is still to do
    g = tokenize.generate_tokens(f.readline)
    while 1:
        startlineno, endlineno, slashes = lineinfo = scanline(g)
        if startlineno is None:
            break
        assert startlineno <= endlineno is not None
        orphans = []
        while index < len(list) and list[index][0] < startlineno:
            orphans.append(list[index])
            index += 1
        if orphans:
            reportphantomwarnings(orphans, f)
        warnings = []
        while index < len(list) and list[index][0] <= endlineno:
            warnings.append(list[index])
            index += 1
        if not slashes and not warnings:
            pass
        elif slashes and not warnings:
            report(slashes, "No conclusive evidence")
        elif warnings and not slashes:
            reportphantomwarnings(warnings, f)
        else:
            if len(slashes) > 1:
                if not multi_ok:
                    rows = []
                    lastrow = None
                    for (row, col), line in slashes:
                        if row == lastrow:
                            continue
                        rows.append(row)
                        lastrow = row
                    assert rows
                    if len(rows) == 1:
                        print "*** More than one / operator in line", rows[0]
                    else:
                        print "*** More than one / operator per statement",
                        print "in lines %d-%d" % (rows[0], rows[-1])
            intlong = []
            floatcomplex = []
            bad = []
            for lineno, what in warnings:
                if what in ("int", "long"):
                    intlong.append(what)
                elif what in ("float", "complex"):
                    floatcomplex.append(what)
                else:
                    bad.append(what)
            lastrow = None
            for (row, col), line in slashes:
                if row == lastrow:
                    continue
                lastrow = row
                line = chop(line)
                if line[col:col+1] != "/":
                    print "*** Can't find the / operator in line %d:" % row
                    print "*", line
                    continue
                if bad:
                    print "*** Bad warning for line %d:" % row, bad
                    print "*", line
                elif intlong and not floatcomplex:
                    print "%dc%d" % (row, row)
                    print "<", line
                    print "---"
                    print ">", line[:col] + "/" + line[col:]
                elif floatcomplex and not intlong:
                    print "True division / operator at line %d:" % row
                    print "=", line
                elif intlong and floatcomplex:
                    print "*** Ambiguous / operator (%s, %s) at line %d:" % (
                        "|".join(intlong), "|".join(floatcomplex), row)
                    print "?", line
    fp.close()

def reportphantomwarnings(warnings, f):
    blocks = []
    lastrow = None
    lastblock = None
    for row, what in warnings:
        if row != lastrow:
            lastblock = [row]
            blocks.append(lastblock)
            # Remember the row so further warnings for it join this block.
            lastrow = row
        lastblock.append(what)
    for block in blocks:
        row = block[0]
        whats = "/".join(block[1:])
        print "*** Phantom %s warnings for line %d:" % (whats, row)
        f.report(row, mark="*")

def report(slashes, message):
    lastrow = None
    for (row, col), line in slashes:
        if row != lastrow:
            print "*** %s on line %d:" % (message, row)
            print "*", chop(line)
            lastrow = row

class FileContext:
    def __init__(self, fp, window=5, lineno=1):
        self.fp = fp
        # Use the constructor arguments instead of hard-coded values.
        self.window = window
        self.lineno = lineno
        self.eoflookahead = 0
        self.lookahead = []
        self.buffer = []
    def fill(self):
        while len(self.lookahead) < self.window and not self.eoflookahead:
            line = self.fp.readline()
            if not line:
                self.eoflookahead = 1
                break
            self.lookahead.append(line)
    def readline(self):
        self.fill()
        if not self.lookahead:
            return ""
        line = self.lookahead.pop(0)
        self.buffer.append(line)
        self.lineno += 1
        return line
    def __getitem__(self, index):
        self.fill()
        bufstart = self.lineno - len(self.buffer)
        lookend = self.lineno + len(self.lookahead)
        if bufstart <= index < self.lineno:
            return self.buffer[index - bufstart]
        if self.lineno <= index < lookend:
            return self.lookahead[index - self.lineno]
        raise KeyError
    def report(self, first, last=None, mark="*"):
        if last is None:
            last = first
        for i in range(first, last+1):
            try:
                # Look up line i, not `first`, so a range prints each line.
                line = self[i]
            except KeyError:
                line = "<missing line>"
            print mark, chop(line)

def scanline(g):
    slashes = []
    startlineno = None
    endlineno = None
    for type, token, start, end, line in g:
        endlineno = end[0]
        if startlineno is None:
            startlineno = endlineno
        if token in ("/", "/="):
            slashes.append((start, line))
        if type == tokenize.NEWLINE:
            break
    return startlineno, endlineno, slashes

def chop(line):
    if line.endswith("\n"):
        return line[:-1]
    else:
        return line

if __name__ == "__main__":
    sys.exit(main())