Source

astoptimizer / TODO

Full commit
TODO
====

bugs:

 - "i=0; while i < 10: print(i); i = i + 1": don't replace print(i) with print('0')
 - "for x in (): try: pass finally: continue" must raise a SyntaxError
 - "type(iter([]))"

major optimizations:

 - move invariant out of the loop. Need type inference.

   * "x=[0]; for ...: x.append(...)"
     => "x=[0]; x_append=x.append; for ...: x_append(...)"

 - drop dead code with a warning: "x=1; 'useless instruction'; print(x)"

   * "x=1; print(x)" => "print('1')" (drop x)
   * "x=1" drops x
   * "return x" keeps x

 - drop unused instructions with a warning: "def f(): x=1; return 2"
 - support variables: x="abc"; print(len(x))

   * rewrite the optimizer to work in two steps:

     a) scan function arguments, variables and builtins; detect usage of globals()
     b) replace constants, remove unused variables, peephole, call builtin functions

 - unroll short loops: "for x in range(4): print(x)"

   * duplicate the body and evaluate the body with x=0, x=1, ...
   * handle continue/break
   * drop x if possible

 - convert naive loop to list comprehension: "x=[]; for item in data: x.append(item.upper())"
   => "x=[item.upper() for item in data]". Same for x=set() and x={}.

 - inline functions: see http://bugs.python.org/issue10399

other:

 - enable pythonbin by default?
 - disable string and math by default?
 - "list([1, 2])" and "dict({...})" should create a copy?
 - operator module:

   * need to add an import, need to ensure that operator name is not used
   * lambda x: x[1] => operator.itemgetter(1)
   * lambda x: x.a => operator.attrgetter('a')
   * lambda x: x.f('a', b=1) => operator.methodcaller('f', 'a', b=1)

 - Python 3.2+: x in (1, 2, 3) => x in {1, 2, 3}, set converted to frozenset
   by the peepholer
 - Python 2: from __future__ import divison
 - [x*2 for x in range(3)]
 - map, itertools.map, filter:

   * [f(x) for x in a] => map(f, a) / list(map(f, a))
   * (f(x) for x in a) => itertools.map(f, a) / map(f, a) ? scope ?
   * (x for x in a if f(x)) => filter(f, a)
   * (x for x in a if not f(x)) => __builtin_filternot__(f, a) ?
   * (2 * x for x in a) => map((2).__mul__, a)
   * (x for x in a if x in 'abc') => filter('abc'.__contains__, a)

 - not not not x
 - 1 < 2 < 3
 - str.format() (str%args already implemented), don't optimize {!n} (locale dependant)
 - write tests for patch_compile(), ex: ast.literal_ast("1+(2+3j)")
 - (x * 2) * 3 => x * 6, "fast-math" option?

later:

 - infer type:

   * "x=int(text)" : x is an int
   * int: "x=1; for ... in ...: x = x + 1; return x + 0"
     =>  "x=1; for ... in ...: x += 1; return x"
   * str: "x='abc'; for ... in ...: x = 'def'; return x == 'abc' or x == 'def'"
     => "x='abc'; for ... in ...: x = 'def'; return x in ('abc', 'def')"
   * str: "s=str(x); return 'x=' + s" => "s=str(x); return 'x=%s' % s"
   * list: "x=list(data); for ...: x.append(...)"
     => "x=[... for item in data]"
   * file: "f=open(...); while True: s=f.readline(); if not s: break; ..."
     => "f=open(...); for s in f: ..."
   * int: "x=1; for i in range(n): x = x + 1; return x * 1" => "...; return x"
   * str: "x="abc"; for i in range(n): x = x + "x"; return str(x)" => "...; return x"
   * file: "f=open(...); for line in data: f.write(line)"
     => "...; f_write = f.write; for line in data: f_write(line)"
   * etc.

 - use enumerate():

   * for i in range(len(a)): x = a[i] ... => for i, x in enumerate(a): ...
   * i = 0; for x in a: i += 1 ... => for i, x in enumerate(a, 1): ...
   * i = 1; for x in a: ... i += 1 => for i, x in enumerate(a, 1): ...

 - use sum():

   * s = 0; for ...: if ...: s += 1 => s = sum(1 for ... if ...)

 - replace builtins with locals: def(x): return len(x) => def f(x, len=len):
   return len(x). (Implement something better using closures.)
 - re module, warning at the locale dependent patterns
 - base64, binascii
 - pow() and a**b for complex numbers
 - optimize mutable types? "x=[1, 2, 3]; print(len(x))",
   {1,2}|{3} => {1, 2, 3}.
 - Python3: print(sep="-"); print(end="")