An Accidental Encounter with Coroutines

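While reading the tornado source code I stumbled on coroutines and found them fascinating. I asked 端神 and July, two gurus I know, and learned that the high concurrency of Go and Erlang is built on coroutine-like lightweight tasks (goroutines in Go, lightweight processes in Erlang), and that coroutines are an everyday concept in Lua as well. A search on Baidu afterwards suggested that the two most popular concurrency models right now are the nodejs callback model and the coroutine model. (Reportedly node's callback model even outperforms coroutines, which would explain why 死月 and friends all went off to learn node.)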

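Python (2.x, which this post uses) has no native coroutine support, although its built-in generators come quite close. The catch is that a generator cannot choose who receives control when it pauses: yield always returns to the caller. That makes plain generators of limited use for asynchronous programming. Fortunately there is gevent, a coroutine framework for Python.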
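To make the generator limitation concrete, here is a minimal sketch using only the standard library. The names counter, a, b and steps are made up for illustration; nothing here comes from gevent.

```python
def counter(name, n):
    """A generator: pauses at each yield and resumes where it left off."""
    for i in range(n):
        # yield always hands control back to whoever called next();
        # the generator cannot name another generator to run instead.
        yield (name, i)


a = counter("a", 2)
b = counter("b", 2)

# The caller decides the interleaving; the generators themselves cannot.
steps = [next(a), next(b), next(a), next(b)]
print(steps)
```

The interleaving lives entirely in the calling code, so something outside the generators has to play scheduler. That is roughly the job a framework like gevent takes over.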

from gevent import monkey
import gevent

monkey.patch_socket()

def f(n):
    for i in range(n):
        print gevent.getcurrent(), i

g1 = gevent.spawn(f, 5)
g2 = gevent.spawn(f, 5)
g3 = gevent.spawn(f, 5)

g1.join()
g2.join()
g3.join()

# The output looks like this:
<Greenlet at 0x1068cf0f0: f(5)> 0
<Greenlet at 0x1068cf0f0: f(5)> 1
<Greenlet at 0x1068cf0f0: f(5)> 2
<Greenlet at 0x1068cf0f0: f(5)> 3
<Greenlet at 0x1068cf0f0: f(5)> 4
<Greenlet at 0x1068cf230: f(5)> 0
<Greenlet at 0x1068cf230: f(5)> 1
<Greenlet at 0x1068cf230: f(5)> 2
<Greenlet at 0x1068cf230: f(5)> 3
<Greenlet at 0x1068cf230: f(5)> 4
<Greenlet at 0x1068cf2d0: f(5)> 0
<Greenlet at 0x1068cf2d0: f(5)> 1
<Greenlet at 0x1068cf2d0: f(5)> 2
<Greenlet at 0x1068cf2d0: f(5)> 3
<Greenlet at 0x1068cf2d0: f(5)> 4
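# This shows that the greenlets actually ran one after another, in order.

Now the same program with gevent.sleep(0) added inside the loop: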
from gevent import monkey
import gevent

monkey.patch_socket()

def f(n):
    for i in range(n):
        print gevent.getcurrent(), i
        gevent.sleep(0)  # explicitly yield to the other greenlets

g1 = gevent.spawn(f, 5)
g2 = gevent.spawn(f, 5)
g3 = gevent.spawn(f, 5)

g1.join()
g2.join()
g3.join()

# The output is as follows:
<Greenlet at 0x10c4d20f0: f(5)> 0
<Greenlet at 0x10c4d2230: f(5)> 0
<Greenlet at 0x10c4d22d0: f(5)> 0
<Greenlet at 0x10c4d20f0: f(5)> 1
<Greenlet at 0x10c4d2230: f(5)> 1
<Greenlet at 0x10c4d22d0: f(5)> 1
<Greenlet at 0x10c4d20f0: f(5)> 2
<Greenlet at 0x10c4d2230: f(5)> 2
<Greenlet at 0x10c4d22d0: f(5)> 2
<Greenlet at 0x10c4d20f0: f(5)> 3
<Greenlet at 0x10c4d2230: f(5)> 3
<Greenlet at 0x10c4d22d0: f(5)> 3
<Greenlet at 0x10c4d20f0: f(5)> 4
<Greenlet at 0x10c4d2230: f(5)> 4
<Greenlet at 0x10c4d22d0: f(5)> 4
# This shows that the greenlets now take turns.

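Here is what is going on. Coroutines are not parallel: they do not create extra processes or threads, and they always run inside a single thread. Whenever a coroutine hits IO (or yields explicitly, as gevent.sleep(0) does above), it gives up control, and gevent hands control to another coroutine that is not waiting on IO. If every coroutine is waiting, gevent's event loop waits until one of them becomes ready, much as an epoll loop does. And precisely because coroutines always run in one thread, they cannot take advantage of multiple cores.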
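The turn-taking described above can be imitated with a toy round-robin scheduler built on generators. This is only a sketch of the idea, not how gevent is implemented (gevent is built on greenlets and an event loop). The names task, run_round_robin and log are hypothetical, and each bare yield stands in for gevent.sleep(0).

```python
from collections import deque


def task(name, n, log):
    """Hypothetical stand-in for a greenlet; each yield mimics gevent.sleep(0)."""
    for i in range(n):
        log.append((name, i))
        yield  # give control back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks in one thread, switching at every yield."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)      # run until the task's next yield
        except StopIteration:
            continue           # task finished; drop it
        ready.append(current)  # otherwise queue it behind the others


log = []
run_round_robin([task("g1", 2, log), task("g2", 2, log), task("g3", 2, log)])
print(log)
```

Everything runs in one thread, and a task that never yields would hold up all the others, which is the same reason the first gevent example ran strictly in order.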

import random

import gevent

products = []

def consume(count):
    while count:
        # Drain everything currently in the list, printing after each pop.
        for index in xrange(0, len(products)):
            products.pop(0)
            print products
        count = count - 1
        gevent.sleep(0)  # hand control to the producer

def product(count):
    while count:
        # Fill the list up to five items, printing after each append.
        while len(products) < 5:
            p = random.randint(0, 9)
            products.append(p)
            print products
        count = count - 1
        gevent.sleep(0)  # hand control to the consumer

print products
gevent.joinall([
    gevent.spawn(product, 3),
    gevent.spawn(consume, 3),
])

# The output is as follows:
[]
[6]
[6, 2]
[6, 2, 6]
[6, 2, 6, 7]
[6, 2, 6, 7, 8]
[2, 6, 7, 8]
[6, 7, 8]
[7, 8]
[8]
[]
[8]
[8, 5]
[8, 5, 3]
[8, 5, 3, 1]
[8, 5, 3, 1, 4]
[5, 3, 1, 4]
[3, 1, 4]
[1, 4]
[4]
[]
[5]
[5, 4]
[5, 4, 8]
[5, 4, 8, 8]
[5, 4, 8, 8, 4]
[4, 8, 8, 4]
[8, 8, 4]
[8, 4]
[4]
[]
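# Unlike a multithreaded program, the interleaving is the same on every run
# (fill to five, then drain to empty); only the random values differ.

The last example fetches three pages concurrently; monkey.patch_all() makes the blocking urllib2 calls cooperative: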
from gevent import monkey
# Patch the standard library first so the imports below use cooperative sockets.
monkey.patch_all()

import gevent
import urllib2


def f(url):
    print("GET: %s" % url)
    resp = urllib2.urlopen(url)  # blocks only this greenlet, not the whole thread
    data = resp.read()
    print("%d bytes received from %s." % (len(data), url))

gevent.joinall([
    gevent.spawn(f, "http://www.python.org/"),
    gevent.spawn(f, "http://eleven.name/"),
    gevent.spawn(f, "http://github.com/"),
])

# Output:
GET: http://www.python.org/
GET: http://eleven.name/
GET: http://github.com/
28866 bytes received from http://eleven.name/.
47108 bytes received from http://www.python.org/.
17424 bytes received from http://github.com/.

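After watching the QCon talk by 清风 from Douban, I suddenly feel I understand a bit better how GitHub is meant to be used. From now on, please send plenty of code review my way.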