Benutzer:Dirk Huenniger/threading

Aus Wikibooks

On this page I will discuss a simple yet powerful interface to threading. I happend to write a script that needed to download a lot of websites, and do something with their content. In this example I just calculate the total size in bytes.

a=getURL("http://www.google.de")
b=getURL("http://www.python.org")
c=getURL("http://www.haskell.org")

print (len(a)+len(b)+len(c) )

It works fine, but its slow. This is because IO is slow, and we speed it up but running the IO operations in parallel. But that usually quite some work. Start the GET request in separate threads and synchronize to get the information back to your main thread. Can we avoid this work. Yes we can

from multiproc import *

a=muliproc(getURL,"http://www.google.de")
b=muliproc(getURL,"http://www.python.org")
c=muliproc(getURL,"http://www.haskell.org")

print (len(a)+len(b)+len(c) )

This is enough to make it threaded and to synchronize it. You don't believe that. Well look at the output:

Getting URL http://www.google.de
Getting URL http://www.python.org
Getting URL http://www.haskell.org
Got URL http://www.google.de
Got URL http://www.haskell.org
Got URL http://www.python.org
49690

Works in Haskell and Python. In both cases I use the fact that functions are first class citizens. In Haskell I use the fact that the language supports laziness. And I python I use automated delegates overriding getattr and such things. Don't think you can do it in C++ or Java.