Source

astoptimizer / TODO

Full commit
Victor Stinner abaaebd 




Victor Stinner 0a2f065 

Victor Stinner 676b58d 
Victor Stinner 531cb10 

Victor Stinner 0fe5737 
Victor Stinner abaaebd 
Victor Stinner 014eaa2 


Victor Stinner d787778 



Victor Stinner 3534a77 
Victor Stinner 014eaa2 

Victor Stinner abaaebd 

Victor Stinner 0fe5737 
Victor Stinner 4b2a705 




Victor Stinner abaaebd 








Victor Stinner 6630cc9 
Victor Stinner 0fe5737 
Victor Stinner ed267c0 
Victor Stinner 6630cc9 

Victor Stinner 0fe5737 
Victor Stinner 3534a77 
Victor Stinner 6630cc9 


Victor Stinner 0fe5737 







Victor Stinner abaaebd 

Victor Stinner 0a2f065 
Victor Stinner 9157483 


Victor Stinner 7a10922 

Victor Stinner 9157483 
Victor Stinner 7a10922 



Victor Stinner f29bd40 

Victor Stinner 4b2a705 
Victor Stinner 7a10922 








Victor Stinner eb8730c 
Victor Stinner 0fe5737 

Victor Stinner f54374c 
Victor Stinner 0a2f065 
Victor Stinner 9145f95 
Victor Stinner abaaebd 


Victor Stinner 9145f95 
Victor Stinner ebb0cfd 

Victor Stinner 9157483 
Victor Stinner 4b2a705 








Victor Stinner ebb0cfd 





Victor Stinner 20380f0 









Victor Stinner d4e30df 

Victor Stinner 0a2f065 

Victor Stinner da8fc7c 
Victor Stinner 9157483 

Victor Stinner 199d0f6 
Victor Stinner 0a2f065 
+++++++++
TODO list
+++++++++

Bugs
====

 - "i=0; while i < 10: print(i); i = i + 1": don't replace print(i) with print('0')
 - "for x in (): try: pass finally: continue" must raise a SyntaxError
 - "type(iter([]))"


Misc
====

 * automatically detect pure functions: see pythran.analysis.PureFunctions of
   pythran project, depend on ArgumentEffects and GlobalEffects analysys
 * replace '(a and b) and c' (2 op) with 'a and b and c' (1 op),
   same for "or" operator
 * unroll: support break/continue


Major Optimizations
===================

 - move invariant out of the loop. Need type inference.

   * "x=[0]; for ...: x.append(...)"
     => "x=[0]; x_append=x.append; for ...: x_append(...)"

 - convert naive loop to list comprehension: "x=[]; for item in data: x.append(item.upper())"
   => "x=[item.upper() for item in data]". Same for x=set() and x={}.

 - inline functions: see http://bugs.python.org/issue10399


Support variables
=================

 - constant folding:

   * "x=1; print(x)" => "print('1')" (drop x)
   * "x=1" => "pass" (drop x)
   * "return x" => "return x" (keep x)

 - drop unused local variables with a warning:

   * "def f(): x=1; return 2"

 - support variables: x="abc"; print(len(x))

   * rewrite the optimizer to work in two steps:

     a) scan function arguments, variables and builtins; detect usage of globals()
     b) replace constants, remove unused variables, peephole, call builtin functions


Other
=====

 - enable pythonbin by default?
 - disable string and math by default?
 - "list([1, 2])" and "dict({...})" should create a copy?
 - operator module:

   * need to add an import, need to ensure that operator name is not used
   * lambda x: x[1] => operator.itemgetter(1)
   * lambda x: x.a => operator.attrgetter('a')
   * lambda x: x.f('a', b=1) => operator.methodcaller('f', 'a', b=1)

 - Python 3.2+: x in (1, 2, 3) => x in {1, 2, 3}, set converted to frozenset
   by the peepholer
 - Python 2: from __future__ import divison
 - map, itertools.map, filter:

   * [f(x) for x in a] => map(f, a) / list(map(f, a))
   * (f(x) for x in a) => itertools.map(f, a) / map(f, a) ? scope ?
   * (x for x in a if f(x)) => filter(f, a)
   * (x for x in a if not f(x)) => __builtin_filternot__(f, a) ?
   * (2 * x for x in a) => map((2).__mul__, a)
   * (x for x in a if x in 'abc') => filter('abc'.__contains__, a)

 - not not not x
 - 1 < 2 < 3
 - str.format() (str%args already implemented), don't optimize {!n} (locale dependant)
 - write tests for patch_compile(), ex: ast.literal_ast("1+(2+3j)")
 - (x * 2) * 3 => x * 6, "fast-math" option?


Later
=====

 - infer type:

   * "x=int(text)" : x is an int
   * int: "x=1; for ... in ...: x = x + 1; return x + 0"
     =>  "x=1; for ... in ...: x += 1; return x"
   * str: "x='abc'; for ... in ...: x = 'def'; return x == 'abc' or x == 'def'"
     => "x='abc'; for ... in ...: x = 'def'; return x in ('abc', 'def')"
   * str: "s=str(x); return 'x=' + s" => "s=str(x); return 'x=%s' % s"
   * list: "x=list(data); for ...: x.append(...)"
     => "x=[... for item in data]"
   * file: "f=open(...); while True: s=f.readline(); if not s: break; ..."
     => "f=open(...); for s in f: ..."
   * int: "x=1; for i in range(n): x = x + 1; return x * 1" => "...; return x"
   * str: "x="abc"; for i in range(n): x = x + "x"; return str(x)" => "...; return x"
   * file: "f=open(...); for line in data: f.write(line)"
     => "...; f_write = f.write; for line in data: f_write(line)"
   * etc.

 - use enumerate():

   * for i in range(len(a)): x = a[i] ... => for i, x in enumerate(a): ...
   * i = 0; for x in a: i += 1 ... => for i, x in enumerate(a, 1): ...
   * i = 1; for x in a: ... i += 1 => for i, x in enumerate(a, 1): ...

 - use sum():

   * s = 0; for ...: if ...: s += 1 => s = sum(1 for ... if ...)

 - replace builtins with locals: def(x): return len(x) => def f(x, len=len):
   return len(x). (Implement something better using closures.)
 - re module, warning at the locale dependent patterns
 - base64, binascii
 - pow() and a**b for complex numbers
 - optimize mutable types? "x=[1, 2, 3]; print(len(x))",
   {1,2}|{3} => {1, 2, 3}.
 - Python3: print(sep="-"); print(end="")