pp-1.6.4/doc/ppserver.1
.\" It was generated by help2man 1.36.
.TH PPSERVER "1" "February 2010" "Parallel Python Network Server" "User Commands"
.SH NAME
ppserver \- manual page for Parallel Python Network Server
.SH SYNOPSIS
.B ppserver
[\fI-hdar\fR] [\fI-f format\fR] [\fI-n proto\fR] [\fI-c config_path\fR] [\fI-i interface\fR] [\fI-b broadcast\fR] [\fI-p port\fR] [\fI-w nworkers\fR] [\fI-s secret\fR] [\fI-t seconds\fR] [\fI-k seconds\fR] [\fI-P pid_file\fR]
.SH DESCRIPTION
Parallel Python Network Server
.SH OPTIONS
.TP
\fB\-h\fR
this help message
.TP
\fB\-d\fR
set log level to debug
.TP
\fB\-f\fR
log format
.TP
\fB\-a\fR
enable auto\-discovery service
.TP
\fB\-r\fR
restart worker process after each task completion
.TP
\fB\-n\fR
protocol number for pickle module
.TP
\fB\-c\fR
path to config file
.TP
\fB\-i\fR
interface to listen on
.TP
\fB\-b\fR
broadcast address for auto\-discovery service
.TP
\fB\-p\fR
port to listen on
.TP
\fB\-w\fR
number of workers to start
.TP
\fB\-s\fR
secret for authentication
.TP
\fB\-t\fR
timeout to exit if no connections with clients exist
.TP
\fB\-k\fR
socket timeout in seconds, which is also the maximum time a remote job is
allowed to run. Increase this value if you have long-running jobs, or
decrease it if connectivity to remote ppservers is often lost.
.TP
\fB\-P\fR pid_file
file to write PID to
.PP
Please visit http://www.parallelpython.com for extended up\-to\-date
documentation, examples and support forums
.br
.SH SECURITY
Due to security concerns, it is highly recommended to run ppserver.py with a
non-trivial secret key (-s command line argument), which should be paired with
the matching secret keyword of the PP Server class constructor. An alternative
way to set the secret key is to assign the
.B pp_secret
variable in the configuration file
.B .pythonrc.py
which should be located in the user's home directory (please make this file
readable and writable only by its owner). The secret key set in .pythonrc.py
can be overridden by the command line argument (for ppserver.py) and the
secret keyword (for the PP Server class constructor).
.SH AUTHOR
This manual page was written by Sandro Tosi and Vitalii Vanovschi
(support@parallelpython.com).

pp-1.6.4/doc/example.config
[general]
debug = True
workers = 2
secret = epo20pdosl;dksldkmm
proto = 0
restart = False

[network]
autodiscovery = False
interface = 0.0.0.0
broadcast = 255.255.255.255
port = 60000
timeout = 10

pp-1.6.4/doc/ppdoc.html
Parallel Python documentation

 Module API
 Quick start guide, SMP
 Quick start guide, clusters
 
Quick start guide, clusters with auto-discovery
 Advanced guide, clusters
 Command line options, ppserver.py
 Security and secret key
 PP FAQ


 

  pp 1.6.4 module API

 
class Server
    Parallel Python SMP execution server class
 
  Methods defined here:
__init__(self, ncpus='autodetect', ppservers=(), secret=None, restart=False, proto=2, socket_timeout=3600)
Creates Server instance
 
 
 
ncpus - the number of worker processes to start on the local
        computer; if the parameter is omitted it will be set to
        the number of processors in the system
ppservers - list of active parallel python execution servers 
        to connect with
secret - passphrase for network connections, if omitted a default
        passphrase will be used. It's highly recommended to use a 
        custom passphrase for all network connections.
restart - whether to restart worker process after each task completion 
proto - protocol number for pickle module
socket_timeout - socket timeout in seconds which is also the maximum
        time a remote job could be executed. Increase this value
        if you have long running jobs or decrease if connectivity
        to remote ppservers is often lost.
 
With ncpus = 1 all tasks are executed sequentially
For the best performance either use the default "autodetect" value
or set ncpus to the total number of processors in the system
destroy(self)
Kills ppworkers and closes open files
get_active_nodes(self)
Returns active nodes as a dictionary 
[keys - nodes, values - ncpus]
get_ncpus(self)
Returns the number of local worker processes (ppworkers)
get_stats(self)
Returns job execution statistics as a dictionary
print_stats(self)
Prints job execution statistics. Useful for benchmarking on 
clusters
set_ncpus(self, ncpus='autodetect')
Sets the number of local worker processes (ppworkers)
 
ncpus - the number of worker processes; if the parameter is omitted
        it will be set to the number of processors in the system
submit(self, func, args=(), depfuncs=(), modules=(), callback=None, callbackargs=(), group='default', globals=None)
Submits function to the execution queue
 
func - function to be executed
args - tuple with arguments of the 'func'
depfuncs - tuple with functions which might be called from 'func'
modules - tuple with module names to import
callback - callback function which will be called with argument 
        list equal to callbackargs+(result,) 
        as soon as calculation is done
callbackargs - additional arguments for callback function
group - job group, is used when wait(group) is called to wait for
jobs in a given group to finish
globals - dictionary from which all modules, functions and classes
will be imported, for instance: globals=globals()
wait(self, group=None)
Waits for all jobs in a given group to finish.
If group is omitted waits for all jobs to finish
default_port = 60000
default_secret = 'epo20pdosl;dksldkmm'
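
As a minimal sketch of how submit, callbacks, groups and wait fit together (the square and on_done functions are illustrative, not part of the API):

    import pp

    def square(x):
        return x * x

    def on_done(label, result):
        # the callback receives callbackargs + (result,)
        print label, result

    job_server = pp.Server()  # one worker per detected processor by default
    jobs = [job_server.submit(square, (i,), callback=on_done,
                              callbackargs=("square:",), group="squares")
            for i in range(4)]
    job_server.wait("squares")     # block until every job in the group is done
    print [job() for job in jobs]  # calling a job object returns its result
    job_server.destroy()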

 
class Template
    Template class
 
  Methods defined here:
__init__(self, job_server, func, depfuncs=(), modules=(), callback=None, callbackargs=(), group='default', globals=None)
Creates Template instance
 
job_server - pp server for submitting jobs
func - function to be executed
depfuncs - tuple with functions which might be called from 'func'
modules - tuple with module names to import
callback - callback function which will be called with argument 
        list equal to callbackargs+(result,) 
        as soon as calculation is done
callbackargs - additional arguments for callback function
group - job group, is used when wait(group) is called to wait for
jobs in a given group to finish
globals - dictionary from which all modules, functions and classes
will be imported, for instance: globals=globals()
submit(self, *args)
Submits function with *arg arguments to the execution queue
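
As a minimal sketch of how Template reduces repetition (the power function is illustrative):

    import pp

    def power(base, exponent):
        return base ** exponent

    job_server = pp.Server()
    # the Template freezes everything except the call arguments,
    # so repeated submissions stay concise
    template = pp.Template(job_server, power)
    jobs = [template.submit(2, n) for n in range(8)]
    print [job() for job in jobs]   # [1, 2, 4, ..., 128]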

 
Data
        copyright = 'Copyright (c) 2005-2012 Vitalii Vanovschi. All rights reserved'
version = '1.6.4'


  Quick start guide, SMP

1) Import pp module:

    import pp

2) Start pp execution server with the number of workers set to the number of processors in the system:

    job_server = pp.Server() 

3) Submit all the tasks for parallel execution:

    f1 = job_server.submit(func1, args1, depfuncs1, modules1)

    f2 = job_server.submit(func1, args2, depfuncs1, modules1)

    f3 = job_server.submit(func2, args3, depfuncs2, modules2)

   ...etc...

4) Retrieve the results as needed:

    r1 = f1()

    r2 = f2()

    r3 = f3() 

    ...etc...

 To find out how to achieve efficient parallelization with pp, please take a look at the examples bundled with the distribution
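
Put together, a complete SMP script might look like the sketch below (the prime-counting functions are illustrative; note that count_primes calls is_prime, so is_prime is passed via depfuncs):

    import pp

    def is_prime(n):
        # naive primality test, good enough for a demo
        if n < 2:
            return False
        for i in range(2, int(n ** 0.5) + 1):
            if n % i == 0:
                return False
        return True

    def count_primes(start, end):
        return sum(1 for n in range(start, end) if is_prime(n))

    job_server = pp.Server()  # one worker per detected processor
    step = 10000
    jobs = [job_server.submit(count_primes, (lo, lo + step),
                              depfuncs=(is_prime,))
            for lo in range(0, 100000, step)]
    print "primes below 100000:", sum(job() for job in jobs)
    job_server.print_stats()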


  Quick start guide, clusters 

On the nodes

1) Start parallel python execution server on all your remote computational nodes:

    node-1> ./ppserver.py

    node-2> ./ppserver.py

    node-3> ./ppserver.py

On the client

2) Import pp module:

    import pp

3) Create a list of all the nodes in your cluster (computers where you've run ppserver.py):

    ppservers=("node-1", "node-2", "node-3")

4) Start pp execution server with the number of workers set to the number of processors in the system and a list of ppservers to connect with:

    job_server = pp.Server(ppservers=ppservers)

5) Submit all the tasks for parallel execution:

    f1 = job_server.submit(func1, args1, depfuncs1, modules1)

    f2 = job_server.submit(func1, args2, depfuncs1, modules1)

    f3 = job_server.submit(func2, args3, depfuncs2, modules2)

   ...etc...

6) Retrieve the results as needed:

    r1 = f1()

    r2 = f2()

    r3 = f3() 

    ...etc...

 To find out how to achieve efficient parallelization with pp, please take a look at the examples bundled with the distribution
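
A complete cluster client, put together from the steps above, might look like this sketch (node names follow the example; the hostname function is illustrative and shows the modules parameter):

    import pp

    def hostname():
        return socket.gethostname()

    ppservers = ("node-1", "node-2", "node-3")  # hosts running ppserver.py
    job_server = pp.Server(ppservers=ppservers)
    # modules=("socket",) makes the socket module available on the workers
    jobs = [job_server.submit(hostname, (), modules=("socket",))
            for _ in range(6)]
    print [job() for job in jobs]        # shows which nodes ran the jobs
    print job_server.get_active_nodes()  # {node: ncpus} for every live node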


  Quick start guide, clusters with auto-discovery

On the nodes 

1) Start parallel python execution server on all your remote computational nodes:

    node-1> ./ppserver.py -a

    node-2> ./ppserver.py -a

    node-3> ./ppserver.py -a

On the client

2) Import pp module:

    import pp

3) Set the ppservers list to auto-discovery:

    ppservers=("*",)

4) Start pp execution server with the number of workers set to the number of processors in the system and a list of ppservers to connect with:

    job_server = pp.Server(ppservers=ppservers)

5) Submit all the tasks for parallel execution:

    f1 = job_server.submit(func1, args1, depfuncs1, modules1)

    f2 = job_server.submit(func1, args2, depfuncs1, modules1)

    f3 = job_server.submit(func2, args3, depfuncs2, modules2)

   ...etc...

6) Retrieve the results as needed:

    r1 = f1()

    r2 = f2()

    r3 = f3() 

    ...etc...

 To find out how to achieve efficient parallelization with pp, please take a look at the examples bundled with the distribution
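
Put together, a minimal auto-discovery client looks like this sketch (it assumes ppserver.py -a is already running on the local network; the short sleep is only there to give the asynchronous discovery broadcast time to find nodes):

    import pp
    import time

    ppservers = ("*",)  # "*" requests discovery of ppservers via broadcast
    job_server = pp.Server(ppservers=ppservers)
    time.sleep(2)       # give auto-discovery a moment to locate the nodes
    print "nodes found:", job_server.get_active_nodes()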


    Advanced guide, clusters 

On the nodes  

1) Start the parallel python execution server on all your remote computational nodes (listen on port 35000
and on the local network interface only; accept only connections which present the correct secret):

    node-1> ./ppserver.py -p 35000 -i 192.168.0.101 -s "mysecret"

    node-2> ./ppserver.py -p 35000 -i 192.168.0.102 -s "mysecret"

    node-3> ./ppserver.py -p 35000 -i 192.168.0.103 -s "mysecret"

On the client

2) Import pp module:

    import pp

3) Create a list of all the nodes in your cluster (computers where you've run ppserver.py):

    ppservers=("node-1:35000", "node-2:35000", "node-3:35000")

4) Start pp execution server with the number of workers set to the number of processors in the system,
a list of ppservers to connect with, and a secret key to authorize the connection:

    job_server = pp.Server(ppservers=ppservers, secret="mysecret") 

5) Submit all the tasks for parallel execution:

    f1 = job_server.submit(func1, args1, depfuncs1, modules1)

    f2 = job_server.submit(func1, args2, depfuncs1, modules1)

    f3 = job_server.submit(func2, args3, depfuncs2, modules2)

   ...etc...

6) Retrieve the results as needed:

    r1 = f1()

    r2 = f2()

    r3 = f3() 

    ...etc...

 7) Print the execution statistics:

    job_server.print_stats()

To find out how to achieve efficient parallelization with pp, please take a look at the examples bundled with the distribution
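
Put together, the whole client side of this guide fits in a short script; a minimal sketch reusing the node addresses and secret from above (the work function is illustrative):

    import pp

    def work(x):
        return x * x

    ppservers = ("node-1:35000", "node-2:35000", "node-3:35000")
    job_server = pp.Server(ppservers=ppservers, secret="mysecret")
    jobs = [job_server.submit(work, (i,)) for i in range(10)]
    print [job() for job in jobs]
    job_server.print_stats()   # per-node job counts and timings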


  Command line options, ppserver.py

Usage: ppserver.py [-hdar] [-f format] [-n proto] [-c config_path] [-i interface] [-b broadcast] [-p port] [-w nworkers] [-s secret] [-t seconds] [-k seconds] [-P pid_file]
Options:
-h : this help message
-d : set log level to debug
-f format : log format
-a : enable auto-discovery service
-r : restart worker process after each task completion
-n proto : protocol number for pickle module
-c path : path to config file
-i interface : interface to listen on
-b broadcast : broadcast address for auto-discovery service
-p port : port to listen on
-w nworkers : number of workers to start
-s secret : secret for authentication
-t seconds : timeout to exit if no connections with clients exist
-k seconds : socket timeout in seconds
-P pid_file : file to write PID to

  Security and secret key

 Due to security concerns, it is highly recommended to run ppserver.py with a non-trivial secret key (-s command line argument), which should be paired with the matching secret keyword of the PP Server class constructor. Since PP 1.5.3 it is possible to set the secret key by assigning the pp_secret variable in the configuration file .pythonrc.py, which should be located in the user's home directory (please make this file readable and writable only by its owner). The key set in .pythonrc.py can be overridden by the command line argument (for ppserver.py) and the secret keyword (for the PP Server class constructor).
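
For example, a .pythonrc.py in the home directory that sets the shared secret could look like this (the key value is a placeholder; pick your own non-trivial one):

    # ~/.pythonrc.py -- keep this file readable and writable only by its owner,
    # e.g. chmod 600 ~/.pythonrc.py
    pp_secret = "replace-with-a-long-random-string"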

 


  ppserver.py stats and PID file example

To print job execution statistics for ppserver.py, send a SIGUSR1 signal to its main process.
For instance, on a UNIX platform the following commands will start a server and then print its stats:
ppserver.py  -P /tmp/ppserver.pid

kill -s SIGUSR1 `cat /tmp/ppserver.pid`

 
pp-1.6.4/pp.py0000644000175000017500000007517712112562007012551 0ustar vitalyvitaly# Parallel Python Software: http://www.parallelpython.com # Copyright (c) 2005-2012, Vitalii Vanovschi # All rights reserved. # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright notice, # this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of the author nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF # THE POSSIBILITY OF SUCH DAMAGE. """ Parallel Python Software, Execution Server http://www.parallelpython.com - updates, documentation, examples and support forums """ import os import threading import logging import inspect import sys import types import time import atexit import user import cPickle as pickle import pptransport import ppauto import ppcommon copyright = "Copyright (c) 2005-2012 Vitalii Vanovschi. All rights reserved" version = "1.6.4" # Reconnect persistent rworkers in seconds. RECONNECT_WAIT_TIME = 5 # If set to true prints out the exceptions which are expected. SHOW_EXPECTED_EXCEPTIONS = False # we need to have set even in Python 2.3 try: set except NameError: from sets import Set as set _USE_SUBPROCESS = False try: import subprocess _USE_SUBPROCESS = True except ImportError: import popen2 class _Task(object): """Class describing single task (job) """ def __init__(self, server, tid, callback=None, callbackargs=(), group='default'): """Initializes the task""" self.lock = threading.Lock() self.lock.acquire() self.tid = tid self.server = server self.callback = callback self.callbackargs = callbackargs self.group = group self.finished = False self.unpickled = False def finalize(self, sresult): """Finalizes the task. 
For internal use only""" self.sresult = sresult if self.callback: self.__unpickle() self.lock.release() self.finished = True def __call__(self, raw_result=False): """Retrieves result of the task""" if not self.finished and self.server._exiting: raise DestroyedServerError("Server was destroyed before the job completion") self.wait() if not self.unpickled and not raw_result: self.__unpickle() if raw_result: return self.sresult else: return self.result def wait(self): """Waits for the task""" if not self.finished: self.lock.acquire() self.lock.release() def __unpickle(self): """Unpickles the result of the task""" self.result, sout = pickle.loads(self.sresult) self.unpickled = True if len(sout) > 0: print sout, if self.callback: args = self.callbackargs + (self.result, ) self.callback(*args) class _Worker(object): """Local worker class """ command = [sys.executable, "-u", "-m", "ppworker"] command.append("2>/dev/null") def __init__(self, restart_on_free, pickle_proto): """Initializes local worker""" self.restart_on_free = restart_on_free self.pickle_proto = pickle_proto self.start() def start(self): """Starts local worker""" if _USE_SUBPROCESS: proc = subprocess.Popen(self.command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) self.t = pptransport.CPipeTransport(proc.stdout, proc.stdin) else: self.t = pptransport.CPipeTransport( *popen2.popen3(self.command)[:2]) self.pid = int(self.t.receive()) self.t.send(str(self.pickle_proto)) self.is_free = True def stop(self): """Stops local worker""" self.is_free = False self.t.send('EXIT') # can send any string - it will exit self.t.close() def restart(self): """Restarts local worker""" self.stop() self.start() def free(self): """Frees local worker""" if self.restart_on_free: self.restart() else: self.is_free = True class _RWorker(pptransport.CSocketTransport): """Remote worker class """ def __init__(self, host, port, secret, server, message, persistent, socket_timeout): """Initializes remote worker""" self.server = server self.persistent = persistent self.host = host self.port = port self.secret = secret self.address = (host, port) self.id = host + ":" + str(port) self.server.logger.debug("Creating Rworker id=%s persistent=%s" % (self.id, persistent)) self.socket_timeout = socket_timeout self.connect(message) def __del__(self): """Closes connection with remote server""" self.close() def connect(self, message=None): """Connects to a remote server""" while True and not self.server._exiting: try: pptransport.SocketTransport.__init__(self, None, self.socket_timeout) self._connect(self.host, self.port) if not self.authenticate(self.secret): self.server.logger.error("Authentication failed for host=%s, port=%s" % (self.host, self.port)) return False if message: self.send(message) self.is_free = True return True except: if SHOW_EXPECTED_EXCEPTIONS: self.server.logger.debug("Exception in connect method " "(possibly expected)", exc_info=True) if not self.persistent: self.server.logger.debug("Deleting from queue Rworker %s" % (self.id, )) return False self.server.logger.info("Failed to reconnect with " \ "(host=%s, port=%i), will try again in %i s" % (self.host, self.port, RECONNECT_WAIT_TIME)) time.sleep(RECONNECT_WAIT_TIME) class _Statistics(object): """Class to hold execution statisitcs for a single node """ def __init__(self, ncpus, rworker=None): """Initializes statistics for a node""" self.ncpus = ncpus self.time = 0.0 self.njobs = 0 self.rworker = rworker class Template(object): """Template class """ def __init__(self, job_server, 
func, depfuncs=(), modules=(), callback=None, callbackargs=(), group='default', globals=None): """Creates Template instance jobs_server - pp server for submitting jobs func - function to be executed depfuncs - tuple with functions which might be called from 'func' modules - tuple with module names to import callback - callback function which will be called with argument list equal to callbackargs+(result,) as soon as calculation is done callbackargs - additional arguments for callback function group - job group, is used when wait(group) is called to wait for jobs in a given group to finish globals - dictionary from which all modules, functions and classes will be imported, for instance: globals=globals()""" self.job_server = job_server self.func = func self.depfuncs = depfuncs self.modules = modules self.callback = callback self.callbackargs = callbackargs self.group = group self.globals = globals def submit(self, *args): """Submits function with *arg arguments to the execution queue """ return self.job_server.submit(self.func, args, self.depfuncs, self.modules, self.callback, self.callbackargs, self.group, self.globals) class Server(object): """Parallel Python SMP execution server class """ default_port = 60000 default_secret = "epo20pdosl;dksldkmm" def __init__(self, ncpus="autodetect", ppservers=(), secret=None, restart=False, proto=2, socket_timeout=3600): """Creates Server instance ncpus - the number of worker processes to start on the local computer, if parameter is omitted it will be set to the number of processors in the system ppservers - list of active parallel python execution servers to connect with secret - passphrase for network connections, if omitted a default passphrase will be used. It's highly recommended to use a custom passphrase for all network connections. restart - whether to restart worker process after each task completion proto - protocol number for pickle module socket_timeout - socket timeout in seconds which is also the maximum time a remote job could be executed. Increase this value if you have long running jobs or decrease if connectivity to remote ppservers is often lost. 
With ncpus = 1 all tasks are executed consequently For the best performance either use the default "autodetect" value or set ncpus to the total number of processors in the system """ if not isinstance(ppservers, tuple): raise TypeError("ppservers argument must be a tuple") self.logger = logging.getLogger('pp') self.logger.info("Creating server instance (pp-" + version+")") self.logger.info("Running on Python %s %s", sys.version.split(" ")[0], sys.platform) self.__tid = 0 self.__active_tasks = 0 self.__active_tasks_lock = threading.Lock() self.__queue = [] self.__queue_lock = threading.Lock() self.__workers = [] self.__rworkers = [] self.__rworkers_reserved = [] self.__sourcesHM = {} self.__sfuncHM = {} self.__waittasks = [] self.__waittasks_lock = threading.Lock() self._exiting = False self.__accurate_stats = True self.autopp_list = {} self.__active_rworkers_list_lock = threading.Lock() self.__restart_on_free = restart self.__pickle_proto = proto self.__connect_locks = {} # add local directory and sys.path to PYTHONPATH pythondirs = [os.getcwd()] + sys.path if "PYTHONPATH" in os.environ and os.environ["PYTHONPATH"]: pythondirs += os.environ["PYTHONPATH"].split(os.pathsep) os.environ["PYTHONPATH"] = os.pathsep.join(set(pythondirs)) atexit.register(self.destroy) self.__stats = {"local": _Statistics(0)} self.set_ncpus(ncpus) self.ppservers = [] self.auto_ppservers = [] self.socket_timeout = socket_timeout for ppserver in ppservers: ppserver = ppserver.split(":") host = ppserver[0] if len(ppserver)>1: port = int(ppserver[1]) else: port = Server.default_port if host.find("*") == -1: self.ppservers.append((host, port)) else: if host == "*": host = "*.*.*.*" interface = host.replace("*", "0") broadcast = host.replace("*", "255") self.auto_ppservers.append(((interface, port), (broadcast, port))) self.__stats_lock = threading.Lock() if secret is not None: if not isinstance(secret, types.StringType): raise TypeError("secret must be of a string type") self.secret = str(secret) elif hasattr(user, "pp_secret"): secret = getattr(user, "pp_secret") if not isinstance(secret, types.StringType): raise TypeError("secret must be of a string type") self.secret = str(secret) else: self.secret = Server.default_secret self.__connect() self.__creation_time = time.time() self.logger.info("pp local server started with %d workers" % (self.__ncpus, )) def submit(self, func, args=(), depfuncs=(), modules=(), callback=None, callbackargs=(), group='default', globals=None): """Submits function to the execution queue func - function to be executed args - tuple with arguments of the 'func' depfuncs - tuple with functions which might be called from 'func' modules - tuple with module names to import callback - callback function which will be called with argument list equal to callbackargs+(result,) as soon as calculation is done callbackargs - additional arguments for callback function group - job group, is used when wait(group) is called to wait for jobs in a given group to finish globals - dictionary from which all modules, functions and classes will be imported, for instance: globals=globals() """ # perform some checks for frequent mistakes if self._exiting: raise DestroyedServerError("Cannot submit jobs: server"\ " instance has been destroyed") if not isinstance(args, tuple): raise TypeError("args argument must be a tuple") if not isinstance(depfuncs, tuple): raise TypeError("depfuncs argument must be a tuple") if not isinstance(modules, tuple): raise TypeError("modules argument must be a tuple") if not 
isinstance(callbackargs, tuple): raise TypeError("callbackargs argument must be a tuple") if globals is not None and not isinstance(globals, dict): raise TypeError("globals argument must be a dictionary") for module in modules: if not isinstance(module, types.StringType): raise TypeError("modules argument must be a list of strings") tid = self.__gentid() if globals: modules += tuple(self.__find_modules("", globals)) modules = tuple(set(modules)) self.logger.debug("Task %i will autoimport next modules: %s" % (tid, str(modules))) for object1 in globals.values(): if isinstance(object1, types.FunctionType) \ or isinstance(object1, types.ClassType): depfuncs += (object1, ) task = _Task(self, tid, callback, callbackargs, group) self.__waittasks_lock.acquire() self.__waittasks.append(task) self.__waittasks_lock.release() # if the function is a method of a class add self to the arguments list if isinstance(func, types.MethodType) and func.im_self is not None: args = (func.im_self, ) + args # if there is an instance of a user deined class in the arguments add # whole class to dependancies for arg in args: # Checks for both classic or new class instances if isinstance(arg, types.InstanceType) \ or str(type(arg))[:6] == " 0") if ncpus > len(self.__workers): self.__workers.extend([_Worker(self.__restart_on_free, self.__pickle_proto) for x in\ range(ncpus - len(self.__workers))]) self.__stats["local"].ncpus = ncpus self.__ncpus = ncpus def get_active_nodes(self): """Returns active nodes as a dictionary [keys - nodes, values - ncpus]""" active_nodes = {} for node, stat in self.__stats.items(): if node == "local" or node in self.autopp_list \ and self.autopp_list[node]: active_nodes[node] = stat.ncpus return active_nodes def get_stats(self): """Returns job execution statistics as a dictionary""" for node, stat in self.__stats.items(): if stat.rworker: try: stat.rworker.send("TIME") stat.time = float(stat.rworker.receive()) except: self.__accurate_stats = False stat.time = 0.0 return self.__stats def print_stats(self): """Prints job execution statistics. Useful for benchmarking on clusters""" print "Job execution statistics:" walltime = time.time() - self.__creation_time statistics = self.get_stats().items() totaljobs = 0.0 for ppserver, stat in statistics: totaljobs += stat.njobs print " job count | % of all jobs | job time sum | " \ "time per job | job server" for ppserver, stat in statistics: if stat.njobs: print " %6i | %6.2f | %8.4f | %11.6f | %s" \ % (stat.njobs, 100.0*stat.njobs/totaljobs, stat.time, stat.time/stat.njobs, ppserver, ) print "Time elapsed since server creation", walltime print self.__active_tasks, "active tasks,", self.get_ncpus(), "cores" if not self.__accurate_stats: print "WARNING: statistics provided above is not accurate" \ " due to job rescheduling" print # all methods below are for internal use only def insert(self, sfunc, sargs, task=None): """Inserts function into the execution queue. It's intended for internal use only (ppserver.py). 
""" if not task: tid = self.__gentid() task = _Task(self, tid) self.__queue_lock.acquire() self.__queue.append((task, sfunc, sargs)) self.__queue_lock.release() self.logger.debug("Task %i inserted" % (task.tid, )) self.__scheduler() return task def connect1(self, host, port, persistent=True): """Conects to a remote ppserver specified by host and port""" hostid = host+":"+str(port) lock = self.__connect_locks.setdefault(hostid, threading.Lock()) lock.acquire() try: if hostid in self.autopp_list: return rworker = _RWorker(host, port, self.secret, self, "STAT", persistent, self.socket_timeout) ncpus = int(rworker.receive()) self.__stats[hostid] = _Statistics(ncpus, rworker) for x in range(ncpus): rworker = _RWorker(host, port, self.secret, self, "EXEC", persistent, self.socket_timeout) self.__update_active_rworkers(rworker.id, 1) # append is atomic - no need to lock self.__rworkers self.__rworkers.append(rworker) #creating reserved rworkers for x in range(ncpus): rworker = _RWorker(host, port, self.secret, self, "EXEC", persistent, self.socket_timeout) self.__update_active_rworkers(rworker.id, 1) self.__rworkers_reserved.append(rworker) self.logger.debug("Connected to ppserver (host=%s, port=%i) \ with %i workers" % (host, port, ncpus)) self.__scheduler() except: if SHOW_EXPECTED_EXCEPTIONS: self.logger.debug("Exception in connect1 method (possibly expected)", exc_info=True) finally: lock.release() def __connect(self): """Connects to all remote ppservers""" for ppserver in self.ppservers: ppcommon.start_thread("connect1", self.connect1, ppserver) self.discover = ppauto.Discover(self, True) for ppserver in self.auto_ppservers: ppcommon.start_thread("discover.run", self.discover.run, ppserver) def __detect_ncpus(self): """Detects the number of effective CPUs in the system""" #for Linux, Unix and MacOS if hasattr(os, "sysconf"): if "SC_NPROCESSORS_ONLN" in os.sysconf_names: #Linux and Unix ncpus = os.sysconf("SC_NPROCESSORS_ONLN") if isinstance(ncpus, int) and ncpus > 0: return ncpus else: #MacOS X return int(os.popen2("sysctl -n hw.ncpu")[1].read()) #for Windows if "NUMBER_OF_PROCESSORS" in os.environ: ncpus = int(os.environ["NUMBER_OF_PROCESSORS"]) if ncpus > 0: return ncpus #return the default value return 1 def __dumpsfunc(self, funcs, modules): """Serializes functions and modules""" hashs = hash(funcs + modules) if hashs not in self.__sfuncHM: sources = [self.__get_source(func) for func in funcs] self.__sfuncHM[hashs] = pickle.dumps( (funcs[0].func_name, sources, modules), self.__pickle_proto) return self.__sfuncHM[hashs] def __find_modules(self, prefix, dict): """recursively finds all the modules in dict""" modules = [] for name, object in dict.items(): if isinstance(object, types.ModuleType) \ and name not in ("__builtins__", "pp"): if object.__name__ == prefix+name or prefix == "": modules.append(object.__name__) modules.extend(self.__find_modules( object.__name__+".", object.__dict__)) return modules def __scheduler(self): """Schedules jobs for execution""" self.__queue_lock.acquire() while self.__queue: if self.__active_tasks < self.__ncpus: #TODO: select a job number on the basis of heuristic task = self.__queue.pop(0) for worker in self.__workers: if worker.is_free: worker.is_free = False break else: self.logger.error("There are no free workers left") raise RuntimeError("Error: No free workers") self.__add_to_active_tasks(1) try: self.__stats["local"].njobs += 1 ppcommon.start_thread("run_local", self._run_local, task+(worker, )) except: pass else: for rworker in self.__rworkers: if 
rworker.is_free: rworker.is_free = False task = self.__queue.pop(0) self.__stats[rworker.id].njobs += 1 ppcommon.start_thread("run_remote", self._run_remote, task+(rworker, )) break else: if len(self.__queue) > self.__ncpus: for rworker in self.__rworkers_reserved: if rworker.is_free: rworker.is_free = False task = self.__queue.pop(0) self.__stats[rworker.id].njobs += 1 ppcommon.start_thread("run_remote", self._run_remote, task+(rworker, )) break else: break else: break self.__queue_lock.release() def __get_source(self, func): """Fetches source of the function""" hashf = hash(func) if hashf not in self.__sourcesHM: #get lines of the source and adjust indent sourcelines = inspect.getsourcelines(func)[0] #remove indentation from the first line sourcelines[0] = sourcelines[0].lstrip() self.__sourcesHM[hashf] = "".join(sourcelines) return self.__sourcesHM[hashf] def _run_local(self, job, sfunc, sargs, worker): """Runs a job locally""" if self._exiting: return self.logger.info("Task %i started", job.tid) start_time = time.time() try: worker.t.csend(sfunc) worker.t.send(sargs) sresult = worker.t.receive() job.finalize(sresult) except: if self._exiting: return if SHOW_EXPECTED_EXCEPTIONS: self.logger.debug("Exception in _run_local (possibly expected)", exc_info=True) # remove the job from the waiting list if self.__waittasks: self.__waittasks_lock.acquire() self.__waittasks.remove(job) self.__waittasks_lock.release() worker.free() self.__add_to_active_tasks(-1) if not self._exiting: self.__stat_add_time("local", time.time()-start_time) self.logger.debug("Task %i ended", job.tid) self.__scheduler() def _run_remote(self, job, sfunc, sargs, rworker): """Runs a job remotelly""" self.logger.debug("Task (remote) %i started", job.tid) try: rworker.csend(sfunc) rworker.send(sargs) sresult = rworker.receive() rworker.is_free = True job.finalize(sresult) except: self.logger.debug("Task %i failed due to broken network " \ "connection - rescheduling", job.tid) self.insert(sfunc, sargs, job) self.__scheduler() self.__update_active_rworkers(rworker.id, -1) if rworker.connect("EXEC"): self.__update_active_rworkers(rworker.id, 1) self.__scheduler() return # remove the job from the waiting list if self.__waittasks: self.__waittasks_lock.acquire() self.__waittasks.remove(job) self.__waittasks_lock.release() self.logger.debug("Task (remote) %i ended", job.tid) self.__scheduler() def __add_to_active_tasks(self, num): """Updates the number of active tasks""" self.__active_tasks_lock.acquire() self.__active_tasks += num self.__active_tasks_lock.release() def __stat_add_time(self, node, time_add): """Updates total runtime on the node""" self.__stats_lock.acquire() self.__stats[node].time += time_add self.__stats_lock.release() def __stat_add_job(self, node): """Increments job count on the node""" self.__stats_lock.acquire() self.__stats[node].njobs += 1 self.__stats_lock.release() def __update_active_rworkers(self, id, count): """Updates list of active rworkers""" self.__active_rworkers_list_lock.acquire() if id not in self.autopp_list: self.autopp_list[id] = 0 self.autopp_list[id] += count self.__active_rworkers_list_lock.release() def __gentid(self): """Generates a unique job ID number""" self.__tid += 1 return self.__tid - 1 def __del__(self): self._exiting = True def destroy(self): """Kills ppworkers and closes open files""" self._exiting = True self.__queue_lock.acquire() self.__queue = [] self.__queue_lock.release() for worker in self.__workers: try: worker.t.close() if sys.platform.startswith("win"): 
os.popen('TASKKILL /PID '+str(worker.pid)+' /F') else: os.kill(worker.pid, 9) os.waitpid(worker.pid, 0) except: pass class DestroyedServerError(RuntimeError): pass # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/README0000644000175000017500000000027311337422171012426 0ustar vitalyvitalyVisit http://www.parallelpython.com for up-to-date documentation, examples and support forums INSTALATION: python setup.py install LOCAL DOCUMENTATION: pydoc.html pp-1.6.4/pptransport.py0000644000175000017500000001541412112562007014512 0ustar vitalyvitaly# Parallel Python Software: http://www.parallelpython.com # Copyright (c) 2005-2012, Vitalii Vanovschi # All rights reserved. # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright notice, # this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of the author nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF # THE POSSIBILITY OF SUCH DAMAGE. """ Parallel Python Software, PP Transport http://www.parallelpython.com - updates, documentation, examples and support forums """ import logging import os import socket import struct import types copyright = "Copyright (c) 2005-2012 Vitalii Vanovschi. 
All rights reserved" version = "1.6.4" # compartibility with Python 2.6 try: import hashlib sha_new = hashlib.sha1 md5_new = hashlib.md5 except ImportError: import sha import md5 sha_new = sha.new md5_new = md5.new class Transport(object): def send(self, msg): raise NotImplemented("abstact function 'send' must be implemented "\ "in a subclass") def receive(self, preprocess=None): raise NotImplemented("abstact function 'receive' must be implemented "\ "in a subclass") def authenticate(self, secret): remote_version = self.receive() if version != remote_version: logging.error("PP version mismatch (local: pp-%s, remote: pp-%s)" % (version, remote_version)) logging.error("Please install the same version of PP on all nodes") return False srandom = self.receive() answer = sha_new(srandom+secret).hexdigest() self.send(answer) response = self.receive() if response == "OK": return True else: return False def close(self): pass def _connect(self, host, port): pass class CTransport(Transport): """Cached transport """ rcache = {} def hash(self, msg): return md5_new(msg).hexdigest() def csend(self, msg): hash1 = self.hash(msg) if hash1 in self.scache: self.send("H" + hash1) else: self.send("N" + msg) self.scache[hash1] = True def creceive(self, preprocess=None): msg = self.receive() if msg[0] == 'H': hash1 = msg[1:] else: msg = msg[1:] hash1 = self.hash(msg) self.rcache[hash1] = map(preprocess, (msg, ))[0] return self.rcache[hash1] class PipeTransport(Transport): def __init__(self, r, w): self.scache = {} self.exiting = False if isinstance(r, types.FileType) and isinstance(w, types.FileType): self.r = r self.w = w else: raise TypeError("Both arguments of PipeTransport constructor " \ "must be file objects") def send(self, msg): self.w.write(struct.pack("!Q", len(msg))) self.w.flush() self.w.write(msg) self.w.flush() def receive(self, preprocess=None): e_size = struct.calcsize("!Q") r_size = 0 data = "" while r_size < e_size: msg = self.r.read(e_size-r_size) if msg == "": raise RuntimeError("Communication pipe read error") r_size += len(msg) data += msg e_size = struct.unpack("!Q", data)[0] r_size = 0 data = "" while r_size < e_size: msg = self.r.read(e_size-r_size) if msg == "": raise RuntimeError("Communication pipe read error") r_size += len(msg) data += msg return map(preprocess, (data, ))[0] def close(self): self.w.close() self.r.close() class SocketTransport(Transport): def __init__(self, socket1, socket_timeout): if socket1: self.socket = socket1 else: self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM) self.socket.settimeout(socket_timeout) self.scache = {} def send(self, data): size = struct.pack("!Q", len(data)) t_size = struct.calcsize("!Q") s_size = 0L while s_size < t_size: p_size = self.socket.send(size[s_size:]) if p_size == 0: raise RuntimeError("Socket connection is broken") s_size += p_size t_size = len(data) s_size = 0L while s_size < t_size: p_size = self.socket.send(data[s_size:]) if p_size == 0: raise RuntimeError("Socket connection is broken") s_size += p_size def receive(self, preprocess=None): e_size = struct.calcsize("!Q") r_size = 0 data = "" while r_size < e_size: msg = self.socket.recv(e_size-r_size) if msg == "": raise RuntimeError("Socket connection is broken") r_size += len(msg) data += msg e_size = struct.unpack("!Q", data)[0] r_size = 0 data = "" while r_size < e_size: msg = self.socket.recv(e_size-r_size) if msg == "": raise RuntimeError("Socket connection is broken") r_size += len(msg) data += msg return data def close(self): self.socket.close() def 
_connect(self, host, port): self.socket.connect((host, port)) class CPipeTransport(PipeTransport, CTransport): pass class CSocketTransport(SocketTransport, CTransport): pass # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/PKG-INFO0000644000175000017500000000227112112562207012640 0ustar vitalyvitalyMetadata-Version: 1.0 Name: pp Version: 1.6.4 Summary: Parallel and distributed programming for Python Home-page: http://www.parallelpython.com Author: Vitalii Vanovschi Author-email: support@parallelpython.com License: BSD-like Download-URL: http://www.parallelpython.com/downloads/pp/pp-1.6.4.zip Description: Parallel Python module (PP) provides an easy and efficient way to create parallel-enabled applications for SMP computers and clusters. PP module features cross-platform portability and dynamic load balancing. Thus application written with PP will parallelize efficiently even on heterogeneous and multi-platform clusters (including clusters running other application with variable CPU loads). Visit http://www.parallelpython.com for further information. Platform: Windows Platform: Linux Platform: Unix Classifier: Topic :: Software Development Classifier: Topic :: System :: Distributed Computing Classifier: Programming Language :: Python Classifier: Operating System :: OS Independent Classifier: License :: OSI Approved :: BSD License Classifier: Natural Language :: English Classifier: Intended Audience :: Developers Classifier: Development Status :: 5 - Production/Stable pp-1.6.4/ppcommon.py0000644000175000017500000000510612112562007013743 0ustar vitalyvitaly# Parallel Python Software: http://www.parallelpython.com # Copyright (c) 2005-2012, Vitalii Vanovschi # All rights reserved. # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright notice, # this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of the author nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF # THE POSSIBILITY OF SUCH DAMAGE. """ Parallel Python Software, Execution Server http://www.parallelpython.com - updates, documentation, examples and support forums """ import threading copyright = "Copyright (c) 2005-2012 Vitalii Vanovschi. 
All rights reserved" version = "1.6.4" def start_thread(name, target, args=(), kwargs={}, daemon=True): """Starts a thread""" thread = threading.Thread(name=name, target=target, args=args, kwargs=kwargs) thread.daemon = daemon thread.start() return thread def get_class_hierarchy(clazz): classes = [] if clazz is type(object()): return classes for base_class in clazz.__bases__: classes.extend(get_class_hierarchy(base_class)) classes.append(clazz) return classes def is_not_imported(arg, modules): args_module = str(arg.__module__) for module in modules: if args_module == module or args_module.startswith(module + "."): return False return True # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/ppserver.py0000755000175000017500000003336312112562152013773 0ustar vitalyvitaly#!/usr/bin/env python # Parallel Python Software: http://www.parallelpython.com # Copyright (c) 2005-2012, Vitalii Vanovschi # All rights reserved. # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright notice, # this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of the author nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF # THE POSSIBILITY OF SUCH DAMAGE. """ Parallel Python Software, Network Server http://www.parallelpython.com - updates, documentation, examples and support forums """ import atexit import logging import errno import getopt import sys import socket import threading import random import string import signal import time import os import pp import ppauto import ppcommon import pptransport copyright = "Copyright (c) 2005-2012 Vitalii Vanovschi. 
All rights reserved" version = "1.6.4" LISTEN_SOCKET_TIMEOUT = 20 # compatibility with Python 2.6 try: import hashlib sha_new = hashlib.sha1 except ImportError: import sha sha_new = sha.new class _NetworkServer(pp.Server): """Network Server Class """ def __init__(self, ncpus="autodetect", interface="0.0.0.0", broadcast="255.255.255.255", port=None, secret=None, timeout=None, restart=False, proto=2, socket_timeout=3600, pid_file=None): pp.Server.__init__(self, ncpus, (), secret, restart, proto, socket_timeout) if pid_file: with open(pid_file, 'w') as pfile: print >>pfile, os.getpid() atexit.register(os.remove, pid_file) self.host = interface self.bcast = broadcast if port is not None: self.port = port else: self.port = self.default_port self.timeout = timeout self.ncon = 0 self.last_con_time = time.time() self.ncon_lock = threading.Lock() self.logger.debug("Strarting network server interface=%s port=%i" % (self.host, self.port)) if self.timeout is not None: self.logger.debug("ppserver will exit in %i seconds if no "\ "connections with clients exist" % (self.timeout)) ppcommon.start_thread("timeout_check", self.check_timeout) def ncon_add(self, val): """Keeps track of the number of connections and time of the last one""" self.ncon_lock.acquire() self.ncon += val self.last_con_time = time.time() self.ncon_lock.release() def check_timeout(self): """Checks if timeout happened and shutdowns server if it did""" while True: if self.ncon == 0: idle_time = time.time() - self.last_con_time if idle_time < self.timeout: time.sleep(self.timeout - idle_time) else: self.logger.debug("exiting ppserver due to timeout (no client"\ " connections in last %i sec)", self.timeout) os._exit(0) else: time.sleep(self.timeout) def listen(self): """Initiates listenting to incoming connections""" try: self.ssocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM) # following allows ppserver to restart faster on the same port self.ssocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) self.ssocket.settimeout(LISTEN_SOCKET_TIMEOUT) self.ssocket.bind((self.host, self.port)) self.ssocket.listen(5) except socket.error, e: self.logger.error("Cannot create socket for %s:%s, %s", self.host, self.port, e) try: while 1: csocket = None # accept connections from outside try: (csocket, address) = self.ssocket.accept() except socket.timeout: pass # don't exit on an interupt due to a signal except socket.error as e: if e.errno == errno.EINTR: pass if self._exiting: return # now do something with the clientsocket # in this case, we'll pretend this is a threaded server if csocket: ppcommon.start_thread("client_socket", self.crun, (csocket, )) except KeyboardInterrupt: pass except: self.logger.debug("Exception in listen method (possibly expected)", exc_info=True) finally: self.logger.debug("Closing server socket") self.ssocket.close() def crun(self, csocket): """Authenticates client and handles its jobs""" mysocket = pptransport.CSocketTransport(csocket, self.socket_timeout) #send PP version mysocket.send(version) #generate a random string srandom = "".join([random.choice(string.ascii_letters) for i in xrange(16)]) mysocket.send(srandom) answer = sha_new(srandom+self.secret).hexdigest() clientanswer = mysocket.receive() if answer != clientanswer: self.logger.warning("Authentication failed, client host=%s, port=%i" % csocket.getpeername()) mysocket.send("FAILED") csocket.close() return else: mysocket.send("OK") ctype = mysocket.receive() self.logger.debug("Control message received: " + ctype) self.ncon_add(1) try: if ctype == 
"STAT": #reset time at each new connection self.get_stats()["local"].time = 0.0 mysocket.send(str(self.get_ncpus())) while 1: mysocket.receive() mysocket.send(str(self.get_stats()["local"].time)) elif ctype=="EXEC": while 1: sfunc = mysocket.creceive() sargs = mysocket.receive() fun = self.insert(sfunc, sargs) sresult = fun(True) mysocket.send(sresult) except: if self._exiting: return if pp.SHOW_EXPECTED_EXCEPTIONS: self.logger.debug("Exception in crun method (possibly expected)", exc_info=True) self.logger.debug("Closing client socket") csocket.close() self.ncon_add(-1) def broadcast(self): """Initiaates auto-discovery mechanism""" discover = ppauto.Discover(self) ppcommon.start_thread("server_broadcast", discover.run, ((self.host, self.port), (self.bcast, self.port))) def parse_config(file_loc): """ Parses a config file in a very forgiving way. """ # If we don't have configobj installed then let the user know and exit try: from configobj import ConfigObj except ImportError, ie: print >> sys.stderr, ("ERROR: You must have config obj installed to use" "configuration files. You can still use command line switches.") sys.exit(1) if not os.access(file_loc, os.F_OK): print >> sys.stderr, "ERROR: Can not access %s." % arg sys.exit(1) # Load the configuration file config = ConfigObj(file_loc) # try each config item and use the result if it exists. If it doesn't # then simply pass and move along try: args['secret'] = config['general'].get('secret') except: pass try: autodiscovery = config['network'].as_bool('autodiscovery') except: pass try: args['interface'] = config['network'].get('interface', default="0.0.0.0") except: pass try: args['broadcast'] = config['network'].get('broadcast') except: pass try: args['port'] = config['network'].as_int('port') except: pass try: args['loglevel'] = config['general'].as_bool('debug') except: pass try: args['ncpus'] = config['general'].as_int('workers') except: pass try: args['proto'] = config['general'].as_int('proto') except: pass try: args['restart'] = config['general'].as_bool('restart') except: pass try: args['timeout'] = config['network'].as_int('timeout') except: pass try: args['socket_timeout'] = config['network'].as_int('socket_timeout') except: pass try: args['pid_file'] = config['general'].get('pid_file') except: pass # Return a tuple of the args dict and autodiscovery variable return args, autodiscovery def print_usage(): """Prints help""" print "Parallel Python Network Server (pp-" + version + ")" print "Usage: ppserver.py [-hdar] [-f format] [-n proto]"\ " [-c config_path] [-i interface] [-b broadcast]"\ " [-p port] [-w nworkers] [-s secret] [-t seconds]"\ " [-k seconds] [-P pid_file]" print print "Options: " print "-h : this help message" print "-d : set log level to debug" print "-f format : log format" print "-a : enable auto-discovery service" print "-r : restart worker process after each"\ " task completion" print "-n proto : protocol number for pickle module" print "-c path : path to config file" print "-i interface : interface to listen" print "-b broadcast : broadcast address for auto-discovery service" print "-p port : port to listen" print "-w nworkers : number of workers to start" print "-s secret : secret for authentication" print "-t seconds : timeout to exit if no connections with "\ "clients exist" print "-k seconds : socket timeout in seconds" print "-P pid_file : file to write PID to" print print "To print server stats send SIGUSR1 to its main process (unix only). 
" print print "Due to the security concerns always use a non-trivial secret key." print "Secret key set by -s switch will override secret key assigned by" print "pp_secret variable in .pythonrc.py" print print "Please visit http://www.parallelpython.com for extended up-to-date" print "documentation, examples and support forums" def create_network_server(argv): try: opts, args = getopt.getopt(argv, "hdarn:c:b:i:p:w:s:t:f:k:P:", ["help"]) except getopt.GetoptError: print_usage() raise args = {} autodiscovery = False log_level = logging.WARNING log_format = "%(asctime)s - %(name)s - %(levelname)s - %(message)s" for opt, arg in opts: if opt in ("-h", "--help"): print_usage() sys.exit() elif opt == "-c": args, autodiscovery = parse_config(arg) elif opt == "-d": log_level = logging.DEBUG pp.SHOW_EXPECTED_EXCEPTIONS = True elif opt == "-f": log_format = arg elif opt == "-i": args["interface"] = arg elif opt == "-s": args["secret"] = arg elif opt == "-p": args["port"] = int(arg) elif opt == "-w": args["ncpus"] = int(arg) elif opt == "-a": autodiscovery = True elif opt == "-r": args["restart"] = True elif opt == "-b": args["broadcast"] = arg elif opt == "-n": args["proto"] = int(arg) elif opt == "-t": args["timeout"] = int(arg) elif opt == "-k": args["socket_timeout"] = int(arg) elif opt == "-P": args["pid_file"] = arg log_handler = logging.StreamHandler() log_handler.setFormatter(logging.Formatter(log_format)) logging.getLogger("pp").setLevel(log_level) logging.getLogger("pp").addHandler(log_handler) server = _NetworkServer(**args) if autodiscovery: server.broadcast() return server def signal_handler(signum, stack): """Prints server stats when SIGUSR1 is received (unix only). """ server.print_stats() if __name__ == "__main__": server = create_network_server(sys.argv[1:]) if hasattr(signal, "SIGUSR1"): signal.signal(signal.SIGUSR1, signal_handler) server.listen() #have to destroy it here explicitly otherwise an exception #comes out in Python 2.4 del server # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/ppworker.py0000644000175000017500000001066412112562007013771 0ustar vitalyvitaly# Parallel Python Software: http://www.parallelpython.com # Copyright (c) 2005-2012, Vitalii Vanovschi # All rights reserved. # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright notice, # this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of the author nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF # THE POSSIBILITY OF SUCH DAMAGE. """ Parallel Python Software, PP Worker http://www.parallelpython.com - updates, documentation, examples and support forums """ import sys import os import StringIO import cPickle as pickle import pptransport copyright = "Copyright (c) 2005-2012 Vitalii Vanovschi. All rights reserved" version = "1.6.4" def preprocess(msg): fname, fsources, imports = pickle.loads(msg) fobjs = [compile(fsource, '<string>', 'exec') for fsource in fsources] for module in imports: try: if not module.startswith("from ") and not module.startswith("import "): module = "import " + module exec module globals().update(locals()) except: print "An error has occurred during the module import" sys.excepthook(*sys.exc_info()) return fname, fobjs class _WorkerProcess(object): def __init__(self): self.hashmap = {} self.e = sys.__stderr__ self.sout = StringIO.StringIO() # self.sout = open("/tmp/pp.debug","a+") sys.stdout = self.sout sys.stderr = self.sout self.t = pptransport.CPipeTransport(sys.stdin, sys.__stdout__) self.t.send(str(os.getpid())) self.pickle_proto = int(self.t.receive()) def run(self): try: #execution cycle while 1: __fname, __fobjs = self.t.creceive(preprocess) __sargs = self.t.receive() for __fobj in __fobjs: try: exec __fobj globals().update(locals()) except: print "An error has occurred during the " + \ "function import" sys.excepthook(*sys.exc_info()) __args = pickle.loads(__sargs) __f = locals()[__fname] try: __result = __f(*__args) except: print "An error has occurred during the function execution" sys.excepthook(*sys.exc_info()) __result = None __sresult = pickle.dumps((__result, self.sout.getvalue()), self.pickle_proto) self.t.send(__sresult) self.sout.truncate(0) except: print "A fatal error has occurred during the function execution" sys.excepthook(*sys.exc_info()) __result = None __sresult = pickle.dumps((__result, self.sout.getvalue()), self.pickle_proto) self.t.send(__sresult) if __name__ == "__main__": # add the directory with ppworker.py to the path sys.path.append(os.path.dirname(__file__)) wp = _WorkerProcess() wp.run() # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/CHANGELOG0000644000175000017500000000710312112562107012753 0ustar vitalyvitalypp-1.6.4: 1) Start ppworker using -m 2) Fixed Windows compatibility issue. pp-1.6.3: 1) Added -P pid_file command line argument to ppserver.py 2) Modified print_stats() to output the number of active tasks. 3) Added SIGUSR1 handler to ppserver.py to print stats when signal is received. pp-1.6.2: 1) Made socket timeout configurable via constructor argument socket_timeout and command line argument of ppserver.py -k. 2) Fixed sresult referenced before assignment bug 3) Removed shell from subprocess.Popen call. 4) Fixed race condition in autodiscovery. pp-1.6.1: 1) Fixed struct.unpack("!Q", size_packed) bug which started to happen with Python 2.7 on certain platforms. 2) Fixed bug with auto-discovery not working after ppserver is restarted. 3) Added full support of python import statements. 
For instance "from numpy.linalg import det as determinant" is now supported. For compatibility, old module name imports will continue to work. 4) Exposed more detailed network error messages in ppserver.py. pp-1.6.0: 1) Changed logging mechanism. Now logger is obtained as logging.getLogger('pp'). 2) Modified ppworker to use exec instead of eval. 3) Modified exception handling on destruction. Now if server was destroyed, uncompleted jobs throw DestroyedServerError exception on call. 4) Fixed issue with submitting a method of an instance of a class inherited from another. 5) Added timeouts to all socket operations. 6) Changed default proto type to 2. 7) Moved from thread module to threading. Made all pp threads daemons. 8) Refactored ppserver.py to improve testability 9) Fixed bug with ppsecret in user module Changes w.r.t RC1: 10) Fixed issue with argument which is an instance of an imported class Changes w.r.t RC2: 11) Fixed DEBUG logging in ppserver. 12) Added a flag (-f) to ppserver to set a custom log format. Changed default log format. 13) Made printing of the expected exceptions optional and improved the way they are handled. 14) Removed default logging handler from pp module (to improve logging flexibility). Changes w.r.t RC3: 15) Created a common module ppcommon.py and moved common functions there. 16) Fixed issue with pipes not being closed. Changes w.r.t. RC4: 17) Fixed issues with ppserver exiting on first connection. 18) Fixed deadlock when using ppworker restart option. 19) Enabled support for submodule importing. pp-1.5.7: 1) Added ppworker restart after task completion functionality 2) Added pickle protocol option 3) Merged patch for Python 2.6 compatibility (contributed by mrtss) 4) Merged patch for config file support (contributed by stevemilner) 5) Documentation has been moved to doc folder pp-1.5.6 1) Fixed problem with autodiscovery service on Windows XP and Vista 2) Merged new code quality improvement patches (contributed by stevemilner) pp-1.5.5 1) Fixed bug which caused segmentation fault when calling destroy() method. 2) Merged performance and quality improvement patches (contributed by stevemilner) pp-1.5.4 1) Fixed bug with unindented comments 2) easy_install functionality repaired 3) Code quality improved (many small changes) pp-1.5.3 1) Added support for methods of new-style classes. 2) Added ability to read secret key from pp_secret variable of .pythonrc.py 3) ppdoc.html and ppserver.1 are included in the distribution 4) examples bundled with the distribution CHANGELOG started * - nicknames of the contributors refer to the PP forum profile login names. pp-1.6.4/MANIFEST.in0000644000175000017500000000057311550767242013313 0ustar vitalyvitalyinclude AUTHORS include COPYING include MANIFEST.in include CHANGELOG include PKG-INFO include README include python-restlib.spec include examples/auto_diff.py include examples/callback.py include examples/dynamic_ncpus.py include examples/quicksort.py include examples/reverse_md5.py include examples/sum_primes.py include examples/sum_primes_functor.py recursive-include doc * pp-1.6.4/AUTHORS0000644000175000017500000000005712064277010012614 0ustar vitalyvitalyVitalii Vanovschi - support@parallelpython.com pp-1.6.4/ppauto.py0000644000175000017500000001166512112562007013432 0ustar vitalyvitaly# Parallel Python Software: http://www.parallelpython.com # Copyright (c) 2005-2012, Vitalii Vanovschi # All rights reserved. 
# Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions are met: # * Redistributions of source code must retain the above copyright notice, # this list of conditions and the following disclaimer. # * Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. # * Neither the name of the author nor the names of its contributors # may be used to endorse or promote products derived from this software # without specific prior written permission. # # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF # THE POSSIBILITY OF SUCH DAMAGE. """Parallel Python Software, Auto-Discovery Service http://www.parallelpython.com - updates, documentation, examples and support forums """ import socket import sys import time import threading import ppcommon copyright = "Copyright (c) 2005-2012 Vitalii Vanovschi. All rights reserved" version = "1.6.4" # broadcast every 10 sec BROADCAST_INTERVAL = 10 class Discover(object): """Auto-discovery service class""" def __init__(self, base, isclient=False): self.base = base self.hosts = [] self.isclient = isclient def run(self, interface_addr, broadcast_addr): """Starts auto-discovery""" self.interface_addr = interface_addr self.broadcast_addr = broadcast_addr self.bsocket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) self.bsocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) self.bsocket.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1) try: self.listen() except: sys.excepthook(*sys.exc_info()) def broadcast(self): """Sends a broadcast""" if self.isclient: self.base.logger.debug("Client sends initial broadcast to (%s, %i)" % self.broadcast_addr) self.bsocket.sendto("C", self.broadcast_addr) else: while True: if self.base._exiting: return self.base.logger.debug("Server sends broadcast to (%s, %i)" % self.broadcast_addr) self.bsocket.sendto("S", self.broadcast_addr) time.sleep(BROADCAST_INTERVAL) def listen(self): """Listens for broadcasts from other clients/servers""" self.base.logger.debug("Listening (%s, %i)" % self.interface_addr) self.socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM) self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1) self.socket.settimeout(5) self.socket.bind(self.interface_addr) ppcommon.start_thread("broadcast", self.broadcast) while True: try: if self.base._exiting: return message, (host, port) = self.socket.recvfrom(1024) remote_address = (host, self.broadcast_addr[1]) hostid = host + ":" + str(self.broadcast_addr[1]) self.base.logger.debug("Discovered host (%s, %i) message=%c" % (remote_address + (message[0], ))) if not 
self.base.autopp_list.get(hostid, 0) and self.isclient \ and message[0] == 'S': self.base.logger.debug("Connecting to host %s" % (hostid, )) ppcommon.start_thread("ppauto_connect1", self.base.connect1, remote_address+(False, )) if not self.isclient and message[0] == 'C': self.base.logger.debug("Replying to host %s" % (hostid, )) self.bsocket.sendto("S", self.broadcast_addr) except socket.timeout: pass except: self.base.logger.error("An error has occurred during execution of " "Discover.listen") sys.excepthook(*sys.exc_info()) pp-1.6.4/setup.py0000755000175000017500000000334712112562007013263 0ustar vitalyvitaly#!/usr/bin/env python # Parallel Python setup script # For the latest version of the Parallel Python # software visit: http://www.parallelpython.com """ Standard build tool for python libraries. """ import os.path from distutils.core import setup from pp import version as VERSION LONG_DESCRIPTION = """ Parallel Python module (PP) provides an easy and efficient way to create \ parallel-enabled applications for SMP computers and clusters. PP module \ features cross-platform portability and dynamic load balancing. Thus an \ application written with PP will parallelize efficiently even on \ heterogeneous and multi-platform clusters (including clusters running other \ applications with variable CPU loads). Visit http://www.parallelpython.com \ for further information. """ setup( name="pp", url="http://www.parallelpython.com", version=VERSION, download_url="http://www.parallelpython.com/downloads/pp/pp-%s.zip" % ( VERSION), author="Vitalii Vanovschi", author_email="support@parallelpython.com", py_modules=["pp", "ppauto", "ppcommon", "pptransport", "ppworker"], scripts=["ppserver.py"], description="Parallel and distributed programming for Python", platforms=["Windows", "Linux", "Unix"], long_description=LONG_DESCRIPTION, license="BSD-like", classifiers=[ "Topic :: Software Development", "Topic :: System :: Distributed Computing", "Programming Language :: Python", "Operating System :: OS Independent", "License :: OSI Approved :: BSD License", "Natural Language :: English", "Intended Audience :: Developers", "Development Status :: 5 - Production/Stable", ], ) pp-1.6.4/examples/0000755000175000017500000000000012112562207013357 5ustar vitalyvitalypp-1.6.4/examples/reverse_md5.py0000755000175000017500000000510411525631364016164 0ustar vitalyvitaly#!/usr/bin/env python # File: reverse_md5.py # Author: Vitalii Vanovschi # Desc: This program demonstrates parallel computations with pp module # It tries to reverse an md5 hash in parallel # Parallel Python Software: http://www.parallelpython.com import math import sys import md5 import pp def md5test(hash, start, end): """Calculates md5 of the integers between 'start' and 'end' and compares it with 'hash'""" for x in xrange(start, end): if md5.new(str(x)).hexdigest() == hash: return x print """Usage: python reverse_md5.py [ncpus] [ncpus] - the number of workers to run in parallel, if omitted it will be set to the number of processors in the system """ # tuple of all parallel python servers to connect with #ppservers = ("*",) # auto-discover #ppservers = ("10.0.0.1","10.0.0.2") # list of static IPs ppservers = () if len(sys.argv) > 1: ncpus = int(sys.argv[1]) # Creates jobserver with ncpus workers job_server = pp.Server(ncpus, ppservers=ppservers) else: # Creates jobserver with automatically detected number of workers job_server = pp.Server(ppservers=ppservers) print "Starting pp with", job_server.get_ncpus(), "workers" #Calculates md5 hash from the given 
number hash = md5.new("1829182").hexdigest() print "hash =", hash #Now we will try to find the number with this hash value start = 1 end = 2000000 # Since jobs are not equal in the execution time, division of the problem # into 128 small subproblems leads to better load balancing parts = 128 step = (end - start) / parts + 1 jobs = [] for index in xrange(parts): starti = start+index*step endi = min(start+(index+1)*step, end) # Submit a job which will test if a number in the range starti-endi # has the given md5 hash # md5test - the function # (hash, starti, endi) - tuple with arguments for md5test # () - tuple with functions on which function md5test depends # ("md5",) - tuple with module names which must be imported before # md5test execution # jobs.append(job_server.submit(md5test, (hash, starti, endi), # globals=globals())) jobs.append(job_server.submit(md5test, (hash, starti, endi), (), ("md5", ))) # Retrieve results of all submitted jobs for job in jobs: result = job() if result: break # Print the results if result: print "Reverse md5 for", hash, "is", result else: print "Reverse md5 for", hash, "has not been found" job_server.print_stats() # Properly finalize all tasks (not necessary) job_server.wait() # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/examples/sum_primes_functor.py0000755000175000017500000000521611525631364017673 0ustar vitalyvitaly#!/usr/bin/env python # File: sum_primes_functor.py # Author: Vitalii Vanovschi # Desc: This program demonstrates using pp template class # It calculates the sum of prime numbers below a given integer in parallel # Parallel Python Software: http://www.parallelpython.com import math, sys import pp def isprime(n): """Returns True if n is prime and False otherwise""" if not isinstance(n, int): raise TypeError("argument passed to is_prime is not of 'int' type") if n < 2: return False if n == 2: return True max = int(math.ceil(math.sqrt(n))) i = 2 while i <= max: if n % i == 0: return False i += 1 return True def sum_primes(n): """Calculates sum of all primes below given integer n""" return sum([x for x in xrange(2,n) if isprime(x)]) print """Usage: python sum_primes_functor.py [ncpus] [ncpus] - the number of workers to run in parallel, if omitted it will be set to the number of processors in the system """ # tuple of all parallel python servers to connect with #ppservers = ("*",) # auto-discover #ppservers = ("10.0.0.1","10.0.0.2") # list of static IPs ppservers = () if len(sys.argv) > 1: ncpus = int(sys.argv[1]) # Creates jobserver with ncpus workers job_server = pp.Server(ncpus, ppservers=ppservers) else: # Creates jobserver with automatically detected number of workers job_server = pp.Server(ppservers=ppservers) print "Starting pp with", job_server.get_ncpus(), "workers" # Creates a template # Template is created using all the parameters of the jobs except # the arguments of the function # sum_primes - the function # (isprime,) - tuple with functions on which function sum_primes depends # ("math",) - tuple with module names which must be imported # before sum_primes execution fn = pp.Template(job_server, sum_primes, (isprime,), ("math",)) # Submit a job of calculating sum_primes(100) for execution using # the previously created template # Execution starts as soon as one of the workers becomes available job1 = fn.submit(100) # Retrieves the result calculated by job1 # The value of job1() is the same as sum_primes(100) # If the job has not been finished yet, # execution will wait here until result is available result = job1() print 
"Sum of primes below 100 is", result # The following submits 8 jobs and then retrieves the results inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700) jobs = [(input, fn.submit(input)) for input in inputs] for input, job in jobs: print "Sum of primes below", input, "is", job() job_server.print_stats() # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/examples/auto_diff.py0000755000175000017500000000672611525631364015717 0ustar vitalyvitaly#!/usr/bin/env python # File: auto_diff.py # Author: Vitalii Vanovschi # Desc: This program demonstrates parallel computations with pp module # using class methods as parallel functions (available since pp 1.4). # Program calculates the partial sums of f(x) = x-x**2/2+x**3/3-x**4/4+... # and first derivatives f'(x) using automatic differentiation technique. # In the limit f(x) = ln(x+1) and f'(x) = 1/(x+1). # Parallel Python Software: http://www.parallelpython.com import math import sys import pp # Partial implemenmtation of automatic differentiation class class AD(object): def __init__(self, x, dx=0.0): self.x = float(x) self.dx = float(dx) def __pow__(self, val): if isinstance(val, int): p = self.x**val return AD(self.x**val, val*self.x**(val-1)*self.dx) else: raise TypeError("Second argumnet must be an integer") def __add__(self, val): if isinstance(val, AD): return AD(self.x+val.x, self.dx+val.dx) else: return AD(self.x+val, self.dx) def __radd__(self, val): return self+val def __mul__(self, val): if isinstance(val, AD): return AD(self.x*val.x, self.x*val.dx+val.x*self.dx) else: return AD(self.x*val, val*self.dx) def __rmul__(self, val): return self*val def __div__(self, val): if isinstance(val, AD): return self*AD(1/val.x, -val.dx/val.x**2) else: return self*(1/float(val)) def __rdiv__(self, val): return AD(val)/self def __sub__(self, val): if isinstance(val, AD): return AD(self.x-val.x, self.dx-val.dx) else: return AD(self.x-val, self.dx) def __repr__(self): return str((self.x, self.dx)) class PartialSum(object): def __init__(self, n): """ This class contains methods which will be executed in parallel """ self.n = n def t_log(self, x): """ truncated natural logarithm """ return self.partial_sum(x-1) def partial_sum(self, x): """ partial sum for truncated natural logarithm """ return sum([float(i%2 and 1 or -1)*x**i/i for i in xrange(1, self.n)]) print """Usage: python auto_diff.py [ncpus] [ncpus] - the number of workers to run in parallel, if omitted it will be set to the number of processors in the system """ # tuple of all parallel python servers to connect with #ppservers = ("*",) # auto-discover #ppservers = ("10.0.0.1","10.0.0.2") # list of static IPs ppservers = () if len(sys.argv) > 1: ncpus = int(sys.argv[1]) # Creates jobserver with ncpus workers job_server = pp.Server(ncpus, ppservers=ppservers) else: # Creates jobserver with automatically detected number of workers job_server = pp.Server(ppservers=ppservers) print "Starting pp with", job_server.get_ncpus(), "workers" proc = PartialSum(20000) results = [] for i in range(32): # Creates an object with x = float(i)/32+1 and dx = 1.0 ad_x = AD(float(i)/32+1, 1.0) # Submits a job of calulating proc.t_log(x). 
f = job_server.submit(proc.t_log, (ad_x, )) results.append((ad_x.x, f)) for x, f in results: # Retrieves the result of the calculation val = f() print "t_log(%lf) = %lf, t_log'(%lf) = %lf" % (x, val.x, x, val.dx) # Print execution statistics job_server.print_stats() # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/examples/callback.py0000755000175000017500000000472511550776321015505 0ustar vitalyvitaly#!/usr/bin/env python # File: callback.py # Author: Vitalii Vanovschi # Desc: This program demonstrates parallel computations with pp module # using callbacks (available since pp 1.3). # Program calculates the partial sum 1-1/2+1/3-1/4+1/5-1/6+... # (in the limit it is ln(2)) # Parallel Python Software: http://www.parallelpython.com import math import time import thread import sys import pp class Sum(object): """Class for callbacks """ def __init__(self): self.value = 0.0 self.lock = thread.allocate_lock() def add(self, value): """ the callback function """ # we must use lock here because += is not atomic self.lock.acquire() self.value += value self.lock.release() def part_sum(start, end): """Calculates partial sum""" sum = 0 for x in xrange(start, end): if x % 2 == 0: sum -= 1.0 / x else: sum += 1.0 / x return sum print """Usage: python callback.py [ncpus] [ncpus] - the number of workers to run in parallel, if omitted it will be set to the number of processors in the system """ start = 1 end = 20000000 # Divide the task into 128 subtasks parts = 128 step = (end - start) / parts + 1 # tuple of all parallel python servers to connect with #ppservers = ("*",) # auto-discover #ppservers = ("10.0.0.1","10.0.0.2") # list of static IPs ppservers = () if len(sys.argv) > 1: ncpus = int(sys.argv[1]) # Creates jobserver with ncpus workers job_server = pp.Server(ncpus, ppservers=ppservers) else: # Creates jobserver with automatically detected number of workers job_server = pp.Server(ppservers=ppservers) print "Starting pp with", job_server.get_ncpus(), "workers" # Create an instance of the callback class sum = Sum() # Submit all the subtasks and measure the total computation time start_time = time.time() for index in xrange(parts): starti = start+index*step endi = min(start+(index+1)*step, end) # Submit a job which will calculate partial sum # part_sum - the function # (starti, endi) - tuple with arguments for part_sum # callback=sum.add - callback function job_server.submit(part_sum, (starti, endi), callback=sum.add) #wait for jobs in all groups to finish job_server.wait() # Print the partial sum print "Partial sum is", sum.value, "| diff =", math.log(2) - sum.value job_server.print_stats() # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/examples/quicksort.py0000755000175000017500000000323711525631364015775 0ustar vitalyvitaly#!/usr/bin/env python # File: quicksort.py # Author: Vitalii Vanovschi # Desc: This program demonstrates a parallel version of the quicksort # algorithm implemented using the pp module # Parallel Python Software: http://www.parallelpython.com import sys, random import pp def quicksort(a, n=-1, srv=None): if len(a) <= 1: return a if n: return quicksort([x for x in a if x < a[0]], n-1, srv) \ + [a[0]] + quicksort([x for x in a[1:] if x >= a[0]], n-1, srv) else: return [srv.submit(quicksort, (a,))] print """Usage: python quicksort.py [ncpus] [ncpus] - the number of workers to run in parallel, if omitted it will be set to the number of processors in the system """ # tuple of all parallel python servers to connect with #ppservers = ("*",) 
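# "*" enables auto-discovery: the client finds ppservers broadcasting on # the local network (start them with "ppserver.py -a"), while the line # below shows the alternative of listing static addresses instead. 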
#ppservers = ("10.0.0.1",) ppservers = () if len(sys.argv) > 1: ncpus = int(sys.argv[1]) # Creates jobserver with ncpus workers job_server = pp.Server(ncpus, ppservers=ppservers) else: # Creates jobserver with automatically detected number of workers job_server = pp.Server(ppservers=ppservers) print "Starting pp with", job_server.get_ncpus(), "workers" n = 1000000 input = [] for i in xrange(n): input.append(random.randint(0,100000)) # set n to a positive integer to create 2^n PP jobs # or to -1 to avoid using PP # 32 PP jobs n = 5 # no PP #n = -1 outputraw = quicksort(input, n, job_server) output = [] for x in outputraw: if callable(x): output.extend(x()) else: output.append(x) print "first 30 numbers in increasing order:", output[:30] job_server.print_stats() # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/examples/dynamic_ncpus.py0000755000175000017500000000356311525631364016603 0ustar vitalyvitaly#!/usr/bin/env python # File: dynamic_ncpus.py # Author: Vitalii Vanovschi # Desc: This program demonstrates parallel computations with pp module # and dynamic cpu allocation feature. # Program calculates the partial sum 1-1/2+1/3-1/4+1/5-1/6+... # (in the limit it is ln(2)) # Parallel Python Software: http://www.parallelpython.com import math import sys import time import pp def part_sum(start, end): """Calculates partial sum""" sum = 0 for x in xrange(start, end): if x % 2 == 0: sum -= 1.0 / x else: sum += 1.0 / x return sum print """Usage: python dynamic_ncpus.py""" print start = 1 end = 20000000 # Divide the task into 64 subtasks parts = 64 step = (end - start) / parts + 1 # Create jobserver job_server = pp.Server() # Execute the same task with different numbers of active workers # and measure the time for ncpus in (1, 2, 4, 8, 16, 1): job_server.set_ncpus(ncpus) jobs = [] start_time = time.time() print "Starting ", job_server.get_ncpus(), " workers" for index in xrange(parts): starti = start+index*step endi = min(start+(index+1)*step, end) # Submit a job which will calculate partial sum # part_sum - the function # (starti, endi) - tuple with arguments for part_sum # () - tuple with functions on which function part_sum depends # () - tuple with module names which must be # imported before part_sum execution jobs.append(job_server.submit(part_sum, (starti, endi))) # Retrieve all the results and calculate their sum part_sum1 = sum([job() for job in jobs]) # Print the partial sum print "Partial sum is", part_sum1, "| diff =", math.log(2) - part_sum1 print "Time elapsed: ", time.time() - start_time, "s" print job_server.print_stats() # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/examples/sum_primes.py0000755000175000017500000000502611525631364016132 0ustar vitalyvitaly#!/usr/bin/env python # File: sum_primes.py # Author: Vitalii Vanovschi # Desc: This program demonstrates parallel computations with pp module # It calculates the sum of prime numbers below a given integer in parallel # Parallel Python Software: http://www.parallelpython.com import math import sys import pp def isprime(n): """Returns True if n is prime and False otherwise""" if not isinstance(n, int): raise TypeError("argument passed to is_prime is not of 'int' type") if n < 2: return False if n == 2: return True max = int(math.ceil(math.sqrt(n))) i = 2 while i <= max: if n % i == 0: return False i += 1 return True def sum_primes(n): """Calculates sum of all primes below given integer n""" return sum([x for x in xrange(2, n) if isprime(x)]) print """Usage: python sum_primes.py [ncpus] [ncpus] 
- the number of workers to run in parallel, if omitted it will be set to the number of processors in the system""" # tuple of all parallel python servers to connect with ppservers = () #ppservers = ("127.0.0.1:60000", ) if len(sys.argv) > 1: ncpus = int(sys.argv[1]) # Creates jobserver with ncpus workers job_server = pp.Server(ncpus, ppservers=ppservers) else: # Creates jobserver with automatically detected number of workers job_server = pp.Server(ppservers=ppservers) print "Starting pp with", job_server.get_ncpus(), "workers" # Submit a job of calculating sum_primes(100) for execution. # sum_primes - the function # (100,) - tuple with arguments for sum_primes # (isprime,) - tuple with functions on which function sum_primes depends # ("math",) - tuple with module names which must be imported before # sum_primes execution # Execution starts as soon as one of the workers becomes available job1 = job_server.submit(sum_primes, (100, ), (isprime, ), ("math", )) # Retrieves the result calculated by job1 # The value of job1() is the same as sum_primes(100) # If the job has not been finished yet, execution will # wait here until result is available result = job1() print "Sum of primes below 100 is", result # The following submits 8 jobs and then retrieves the results inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700) jobs = [(input, job_server.submit(sum_primes, (input, ), (isprime, ), ("math", ))) for input in inputs] for input, job in jobs: print "Sum of primes below", input, "is", job() job_server.print_stats() # Parallel Python Software: http://www.parallelpython.com pp-1.6.4/COPYING0000644000175000017500000000305411762603063012604 0ustar vitalyvitalyParallel Python Software: http://www.parallelpython.com Copyright (c) 2005-2012, Vitalii Vanovschi All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of the author nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.