Date: 2022-05-25 22:12:01 | Source: 网络营销 (Internet Marketing)
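"If you choose a scripting language, you have to put up with its speed." To some extent this saying points to a weakness of Python as a scripting language: its execution efficiency and performance are not ideal, especially on low-performance machines, so some code optimization is needed to make programs run faster. How, then, should we go about optimizing Python performance? Here on 亿企邦 I will explore this question with you. This article covers common code optimization techniques, the use of performance tools, and how to diagnose performance bottlenecks in code. I hope it serves as a useful reference for Python developers.

In order of increasing time complexity, algorithms rank as follows:

O(1) -> O(lg n) -> O(n lg n) -> O(n^2) -> O(n^3) -> O(n^k) -> O(k^n) -> O(n!)

Improving an algorithm's time complexity therefore clearly improves performance. Improving specific algorithms is beyond the scope of this article, and readers can consult other material on that subject. The rest of this part concentrates on the choice of data structures.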
from time import time
t = time()
list = ['a','b','is','python','jason','hello','hill','with','phone','test',
        'dfdf','apple','pddf','ind','basic','none','baecr','var','bana','dd','wrd']
# Uncomment the next line to look words up in a dict instead of a list.
#list = dict.fromkeys(list,True)
print list
filter = []
for i in range (1000000):
    for find in ['is','hat','new','list','old','.']:
        if find not in list:
            filter.append(find)
print "total run time:"
print time()-t

The code above takes about 16.09 seconds to run. If you uncomment the line #list = dict.fromkeys(list,True), so that list is converted to a dict before the loop, the run time drops to about 8.375 seconds, roughly half. When many members have to be looked up or accessed frequently, a dict is therefore a better choice than a list.
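The lookup comparison can be reproduced with a small sketch along the following lines. It uses Python 3 syntax, unlike the article's Python 2 listings, and the names `missing`, `as_list`, `as_dict` and `as_set` are illustrative only. Timings depend on the machine, so none are quoted here.

```python
from timeit import timeit

words = ['a', 'b', 'is', 'python', 'jason', 'hello', 'hill', 'with', 'phone', 'test']
probes = ['is', 'hat', 'new', 'list', 'old', '.']

as_list = list(words)                  # membership test scans the elements
as_dict = dict.fromkeys(words, True)   # membership test hashes the key
as_set = set(words)                    # hash lookup, no values stored


def missing(container):
    """Return the probes that are not present in container."""
    return [p for p in probes if p not in container]


# All three containers give the same answer; only the lookup cost differs.
result_list = missing(as_list)
result_dict = missing(as_dict)
result_set = missing(as_set)

for name, container in (("list", as_list), ("dict", as_dict), ("set", as_set)):
    seconds = timeit(lambda: missing(container), number=100000)
    print(name, seconds)
```

With a list this short the gap is modest. It widens as the container grows, because a list scan is linear in the number of elements while a hash lookup is constant time on average.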
from time import time
t = time()
lista=[1,2,3,4,5,6,7,8,9,13,34,53,42,44]
listb=[2,4,6,9,23]
intersection=[]
for i in range (1000000):
    # Compare every element of lista with every element of listb.
    for a in lista:
        for b in listb:
            if a == b:
                intersection.append(a)
print "total run time:"
print time()-t

The run time of the program above is approximately:

total run time:
38.4070000648

Computing the intersection with a set:
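from time import time
t = time()
lista=[1,2,3,4,5,6,7,8,9,13,34,53,42,44]
listb=[2,4,6,9,23]
intersection=[]
for i in range (1000000):
    # The & operator on two sets returns their intersection.
    intersection = list(set(lista)&set(listb))
print "total run time:"
print time()-t

After switching to set, the program's run time drops to 8.75 seconds, more than 4 times faster. Readers can try the other operations in Table 1 for themselves.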
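As a rough check that the two approaches agree, the sketch below (Python 3 syntax; the function names are made up for illustration) computes the intersection both ways and also tries two other set operators. Note that the set version discards duplicates and ordering, which the nested-loop version preserves.

```python
lista = [1, 2, 3, 4, 5, 6, 7, 8, 9, 13, 34, 53, 42, 44]
listb = [2, 4, 6, 9, 23]


def intersect_loops(xs, ys):
    """Nested-loop intersection: one comparison per (x, y) pair."""
    return [x for x in xs for y in ys if x == y]


def intersect_sets(xs, ys):
    """Set intersection: hash-based, duplicates and order are dropped."""
    return set(xs) & set(ys)


loops = intersect_loops(lista, listb)
sets = intersect_sets(lista, listb)

# Other set operators follow the same pattern.
union = set(lista) | set(listb)
difference = set(lista) - set(listb)

print(loops, sorted(sets))
```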
from time import time
t = time()
lista = [1,2,3,4,5,6,7,8,9,10]
listb =[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.01]
for i in range (1000000):
    for a in range(len(lista)):
        for b in range(len(listb)):
            x=lista[a]+listb[b]
print "total run time:"
print time()-t

Now apply the following optimizations: move the length calculations out of the loop, replace range with xrange, and lift the lookup lista[a] from the third loop level up to the second.
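from time import time
t = time()
lista = [1,2,3,4,5,6,7,8,9,10]
listb =[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,0.01]
# Lengths are computed once, outside the loops.
len1=len(lista)
len2=len(listb)
for i in xrange (1000000):
    for a in xrange(len1):
        temp=lista[a]   # looked up once per second-level iteration
        for b in xrange(len2):
            x=temp+listb[b]
print "total run time:"
print time()-t

The optimized program above runs in 102.171999931 seconds. In Listing 4, lista[a] is evaluated 1000000*10*10 times, whereas in the optimized code it is evaluated 1000000*10 times. The number of evaluations drops sharply, so performance improves.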
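The same idea can be sketched in Python 3 syntax, where iterating over the lists directly removes the index lookups altogether. The function names are hypothetical, and the sketch only checks that both forms produce identical results; it makes no timing claim.

```python
lista = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
listb = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.01]


def sums_naive(xs, ys):
    """Index both lists inside the innermost loop."""
    out = []
    for a in range(len(xs)):
        for b in range(len(ys)):
            out.append(xs[a] + ys[b])   # xs[a] is re-indexed on every inner pass
    return out


def sums_hoisted(xs, ys):
    """Fetch each outer element once, then reuse it in the inner loop."""
    out = []
    for x in xs:          # x is bound once per outer iteration
        for y in ys:
            out.append(x + y)
    return out


naive = sums_naive(lista, listb)
hoisted = sums_hoisted(lista, listb)
print(len(naive), naive == hoisted)
```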
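from time import time
t = time()
abbreviations = ['cf.', 'e.g.', 'ex.', 'etc.', 'fig.', 'i.e.', 'Mr.', 'vs.']
for i in range (1000000):
    for w in ('Mr.', 'Hat', 'is', 'chasing', 'the', 'black', 'cat', '.'):
        if w in abbreviations:
        #if w[-1] == '.' and w in abbreviations:
            pass
print "total run time:"
print time()-t

Before optimization the program takes about 8.84 seconds. Using the commented-out line in place of the first if brings the run time down to about 6.17 seconds. Because and short-circuits, the cheap test w[-1] == '.' rejects most words before the more expensive list lookup is attempted.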
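To make the saving visible without relying on timings, the sketch below (Python 3 syntax) counts how often the list lookup actually runs. `CountingList` is a helper invented for this illustration and is not part of the original program.

```python
class CountingList(list):
    """A list that counts how many times `in` is evaluated against it."""

    def __init__(self, *args):
        super().__init__(*args)
        self.lookups = 0

    def __contains__(self, item):
        self.lookups += 1
        return super().__contains__(item)


words = ('Mr.', 'Hat', 'is', 'chasing', 'the', 'black', 'cat', '.')
abbreviations = ['cf.', 'e.g.', 'ex.', 'etc.', 'fig.', 'i.e.', 'Mr.', 'vs.']

plain = CountingList(abbreviations)
guarded = CountingList(abbreviations)

# Without the guard, every word triggers a list lookup.
hits_plain = [w for w in words if w in plain]

# With the guard, `and` short-circuits: only words ending in '.' reach the lookup.
hits_guarded = [w for w in words if w[-1] == '.' and w in guarded]

print(plain.lookups, guarded.lookups)
```

Both versions find the same match, but the guarded one performs the lookup only for the two words that end in a period.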
from time import time
t = time()
s = ""
list = ['a','b','b','d','e','f','g','h','i','j','k','l','m','n']
for i in range (10000):
    # Repeated += builds the string piece by piece.
    for substr in list:
        s+= substr
print "total run time:"
print time()-t

Likewise, avoid:

s = ""
for x in list:
    s += func(x)

and use this instead:

slist = [func(elt) for elt in somelist]
s = "".join(slist)

(2) When a string can be processed with either a regular expression or a built-in function, choose the built-in function, for example str.isalpha(), str.isdigit(), str.startswith(('x', 'yz')), str.endswith(('x', 'yz')).
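A minimal sketch of both string tips, in Python 3 syntax with hypothetical function names: it checks that join() gives the same result as repeated +=, and that str.startswith() with a tuple agrees with an equivalent regular expression. It does not measure speed.

```python
import re

parts = ['a', 'b', 'b', 'd', 'e', 'f', 'g']


def concat_plus(items):
    """Build the string with repeated +=."""
    s = ""
    for item in items:
        s += item
    return s


def concat_join(items):
    """Build the string in one pass with str.join."""
    return "".join(items)


# Built-in prefix test versus an equivalent regular expression.
_prefix = re.compile(r'(?:x|yz)')


def starts_builtin(s):
    return s.startswith(('x', 'yz'))


def starts_regex(s):
    return _prefix.match(s) is not None


samples = ['xa', 'yzb', 'ya', '', 'zx']
builtin_results = [starts_builtin(s) for s in samples]
regex_results = [starts_regex(s) for s in samples]

print(concat_join(parts), builtin_results)
```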
For string formatting, use:

out = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)

and avoid:

out = "<html>" + head + prologue + query + tail + "</html>"

8. Using list comprehensions and generator expressions
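from time import time
t = time()
list = ['a','b','is','python','jason','hello','hill','with','phone','test',
        'dfdf','apple','pddf','ind','basic','none','baecr','var','bana','dd','wrd']
total=[]
for i in range (1000000):
    for w in list:
        total.append(w)
print "total run time:"
print time()-t

Using a list comprehension:

for i in range (1000000):
    a = [w for w in list]

Run as is, the original code takes about 17 s. With the list comprehension the run time drops to 9.29 s, close to half. Generator expressions were introduced in Python 2.4. Their syntax resembles that of list comprehensions, but their advantage shows when processing large amounts of data: a generator expression does not build a list, it only returns a generator, so it avoids the cost of creating and storing the whole list. Changing a = [w for w in list] to a = (w for w in list) in the example above reduces the run time further, to about 2.98 s. Note that this loop never iterates over the generator, so that figure mainly reflects the list that is no longer being built.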
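The difference between the two forms is easier to see when the generator is actually consumed. The following sketch (Python 3 syntax, illustrative names) shows that a generator expression yields the same items as the list comprehension, one at a time, and can be iterated only once.

```python
words = ['a', 'b', 'is', 'python']

# A list comprehension builds the whole list immediately.
as_list = [w.upper() for w in words]

# A generator expression produces items lazily, on demand.
as_gen = (w.upper() for w in words)
first = next(as_gen)      # computes only the first item
rest = list(as_gen)       # consumes the remaining items
leftover = list(as_gen)   # the generator is now exhausted

# Generators suit one-pass aggregation, where no intermediate list is needed.
total_len = sum(len(w) for w in words)

print(first, rest, leftover, total_len)
```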
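>>> from timeit import Timer
>>> Timer("t=a;a=b;b=t","a=1;b=2").timeit()
0.25154118749729365
>>> Timer("a,b=b,a","a=1;b=2").timeit()
0.17156677734181258

(2) Use xrange rather than range in loops. xrange saves a great deal of memory, because xrange() produces only one integer at a time as the sequence is consumed, whereas range() returns the complete list of elements up front, which is unnecessary overhead in a loop. In Python 3 xrange no longer exists. There, range itself returns a lazy object that produces its values on demand over a range of any length.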
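Both tips can be tried in Python 3 with a sketch like the one below, where range already behaves lazily. The variable names are illustrative, and the reported object size is implementation-specific, so it is printed rather than assumed.

```python
import sys

# Swap two values with tuple assignment instead of a temporary variable.
a, b = 1, 2
a, b = b, a

# In Python 3, range is lazy: no billion-element list is ever created.
big = range(10**9)
size_in_bytes = sys.getsizeof(big)   # size of the range object itself
first_three = list(big[:3])          # slicing a range gives another range

print(a, b, len(big), size_in_bytes, first_three)
```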
import profile
def profileTest():
    Total = 1
    for i in range(10):
        Total=Total*(i+1)
        print Total
    return Total
if __name__ == "__main__":
    profile.run("profileTest()")

The profiling results for this program are shown in the figure below:
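import pstats
# 'testprof' is a stats file saved beforehand, e.g. profile.run("profileTest()", "testprof")
p = pstats.Stats('testprof')
p.sort_stats("name").print_stats()

The sort_stats() method sorts the profile data and accepts several sort keys. For example, sort_stats('name', 'file') sorts first by function name and then by file name. Common sort keys include calls (number of times called), time (time spent inside the function itself) and cumulative (total running time). pstats also provides an interactive command-line tool: run python -m pstats and type help to learn more.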
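The profile-then-inspect workflow can be sketched end to end as follows. This version uses Python 3 and cProfile, the C implementation of the same interface as profile. The temporary stats path and the name `profile_test` are choices made for this sketch.

```python
import cProfile
import io
import os
import pstats
import tempfile


def profile_test():
    """Compute 10! with a simple loop, as in the article's example."""
    total = 1
    for i in range(10):
        total *= i + 1
    return total


# Profile one call and save the raw statistics to a file.
stats_path = os.path.join(tempfile.mkdtemp(), "testprof")
profiler = cProfile.Profile()
result = profiler.runcall(profile_test)
profiler.dump_stats(stats_path)

# Load the file, sort by cumulative time and then by name, print the top rows.
buffer = io.StringIO()
stats = pstats.Stats(stats_path, stream=buffer)
stats.strip_dirs().sort_stats("cumulative", "name").print_stats(5)
report = buffer.getvalue()
print(report)
```

Writing the statistics to a file keeps collection separate from analysis, so the same run can be re-sorted by different keys without profiling again.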
C:\Documents and Settings\Administrator>pypy
Python 2.7.2 (0e28b379d8b3, Feb 09 2012, 18:31:47)
[PyPy 1.8.0 with MSC v.1500 32 bit] on win32
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``PyPy is vast, and contains
multitudes''
>>>>

Taking the loop from item 5 (the version after loop optimization) as the example, running it with python and with pypy gives the following results:

C:\Documents and Settings\Administrator\桌面\doc\python>pypy loop.py
total run time:
8.42199993134
C:\Documents and Settings\Administrator\桌面\doc\python>python loop.py
total run time:
106.391000032

As the results show, running the program under PyPy is far more efficient.
def sum(int a,int b):
    print a+b

On Linux, compile it into a .so file with gcc:

[root@v5254085f259 test]# cython sum.pyx
[root@v5254085f259 test]# ls
total 76
4 drwxr-xr-x 2 root root 4096 Apr 17 02:45 .
4 drwxr-xr-x 4 root root 4096 Apr 16 22:20 ..
4 -rw-r--r-- 1 root root 35 Apr 17 02:45 1
60 -rw-r--r-- 1 root root 55169 Apr 17 02:45 sum.c
4 -rw-r--r-- 1 root root 35 Apr 17 02:45 sum.pyx
[root@v5254085f259 test]# gcc -shared -pthread -fPIC -fwrapv -O2 \
    -Wall -fno-strict-aliasing -I/usr/include/python2.4 -o sum.so sum.c
[root@v5254085f259 test]# ls
total 96
4 drwxr-xr-x 2 root root 4096 Apr 17 02:47 .
4 drwxr-xr-x 4 root root 4096 Apr 16 22:20 ..
4 -rw-r--r-- 1 root root 35 Apr 17 02:45 1
60 -rw-r--r-- 1 root root 55169 Apr 17 02:45 sum.c
4 -rw-r--r-- 1 root root 35 Apr 17 02:45 sum.pyx
20 -rwxr-xr-x 1 root root 20307 Apr 17 02:47 sum.so

B. Compiling with distutils
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
# Build sum.pyx into an extension module named "sum".
ext_modules = [Extension("sum", ["sum.pyx"])]
setup(
    name = 'sum app',
    cmdclass = {'build_ext': build_ext},
    ext_modules = ext_modules
)

[root@v5254085f259 test]# python setup.py build_ext --inplace
running build_ext
cythoning sum.pyx to sum.c
building 'sum' extension
gcc -pthread -fno-strict-aliasing -fPIC -g -O2 -DNDEBUG -g -fwrapv -O3
    -Wall -Wstrict-prototypes -fPIC -I/opt/ActivePython-2.7/include/python2.7
    -c sum.c -o build/temp.linux-x86_64-2.7/sum.o
gcc -pthread -shared build/temp.linux-x86_64-2.7/sum.o
    -o /root/cpython/test/sum.so

Once the build completes, the module can be imported and used in Python:
[root@v5254085f259 test]# python
ActivePython 2.7.2.5 (ActiveState Software Inc.) based on
Python 2.7.2 (default, Jun 24 2011, 11:24:26)
[GCC 4.0.2 20051125 (Red Hat 4.0.2-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyximport; pyximport.install()
>>> import sum
>>> sum.sum(1,3)

亿企邦 commentary:
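# Cython test code (imported below as ctest)
from time import time
def test(int n):
    # Declaring C integer types lets Cython compile the loop to plain C.
    cdef int a =0
    cdef int i
    for i in xrange(n):
        a+= i
    return a
t = time()
test(10000000)
print "total run time:"
print time()-t

Test result: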
[GCC 4.0.2 20051125 (Red Hat 4.0.2-8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyximport; pyximport.install()
>>> import ctest
total run time:
0.00714015960693

Python test code:
from time import time
def test(n):
    a = 0
    for i in xrange(n):
        a+= i
    return a
t = time()
test(10000000)
print "total run time:"
print time()-t

[root@v5254085f259 test]# python test.py
total run time:
0.971596002579

The comparison above shows that the Cython version is more than 100 times faster.
Keywords: methods, performance, language