Better examples of parallel processing in Python
I hope this doesn't get downvoted. Thanks for your time. I have been struggling with parallel processing in Python for a while (2 days, exactly). I have been checking these resources (a partial list is shown here):
(a) http://eli.thegreenplace.net/2013/01/16/python-paralellizing-cpu-bound-tasks-with-concurrent-futures
(b) https://pythonadventures.wordpress.com/tag/processpoolexecutor/
and I came unstuck. What I want is this:
master:
  - break the file into chunks (strings or numbers)
  - broadcast the pattern to be searched
  - receive from the workers the offsets in the file where the pattern was found

workers:
  - receive the pattern and a chunk of text from the master
  - compute()
  - send the offsets back to the master
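The master/worker flow above can be sketched with `multiprocessing.Pool`: the master freezes the pattern, farms each chunk out to a worker, and collects the offset lists. This is a minimal illustration under my own assumptions (the in-memory `chunks` list and the `find_offsets` helper are placeholders, not the code from the question):

```python
import multiprocessing
from functools import partial

def find_offsets(pat, chunk):
    # worker: return every offset in `chunk` where `pat` begins
    return [i for i in range(len(chunk) - len(pat) + 1)
            if chunk[i:i + len(pat)] == pat]

def master(chunks, pat, nprocs=2):
    # master: bind the pattern once, map one chunk per task,
    # and gather the per-chunk offset lists in order
    with multiprocessing.Pool(processes=nprocs) as pool:
        return pool.map(partial(find_offsets, pat), chunks)

if __name__ == "__main__":
    chunks = ["afow here and afowafow", "no match", "afow"]
    print(master(chunks, "afow"))  # -> [[0, 14, 18], [], [0]]
```

Note that `pool.map` already preserves input order, so the master does not need to track which worker answered first.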
I tried to implement this using MPI / concurrent.futures / multiprocessing and came unstuck.
My naive implementation using the multiprocessing module:
import multiprocessing

filename = "file1.txt"
pat = "afow"
N = 1000

def search(pat, txt):
    """Naive string search algorithm."""
    patLen = len(pat)
    txtLen = len(txt)
    offsets = []
    # loop to slide the pattern along one position at a time;
    # range generates numbers up to but not including the last number
    for i in range((txtLen - patLen) + 1):
        # cannot use a for loop here:
        # C-style loops with && conditions must
        # be converted to while statements in Python
        counter = 0
        while counter < patLen and pat[counter] == txt[counter + i]:
            counter += 1
        if counter >= patLen:
            offsets.append(i)
    return str(offsets).strip('[]')

""" what I want:
if __name__ == "__main__":
    tasks = []
    pool_outputs = []
    pool = multiprocessing.Pool(processes=5)
    with open(filename, 'r') as infile:
        lines = []
        for line in infile:
            lines.append(line.rstrip())
            if len(lines) > N:
                pool_output = pool.map(search, tasks)
                pool_outputs.append(pool_output)
                lines = []
        if len(lines) > 0:
            pool_output = pool.map(search, tasks)
            pool_outputs.append(pool_output)
    pool.close()
    pool.join()
    print('Pool:', pool_outputs)
"""

with open(filename, 'r') as infile:
    for line in infile:
        print(search(pat, line))
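For what it's worth, note that `search` returns the offsets as a comma-separated string rather than a list (because of the `str(offsets).strip('[]')` at the end). A self-contained copy just to show the return format (the sample text is my own):

```python
def search(pat, txt):
    # naive sliding-window match; returns offsets as a string like "2, 8"
    patLen, txtLen = len(pat), len(txt)
    offsets = []
    for i in range((txtLen - patLen) + 1):
        counter = 0
        while counter < patLen and pat[counter] == txt[counter + i]:
            counter += 1
        if counter >= patLen:
            offsets.append(i)
    return str(offsets).strip('[]')

print(search("afow", "xxafowyyafow"))  # -> 2, 8
```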
I would be grateful for guidance on concurrent.futures. Thanks for your time. Valeriy helped me with his addition, and I thank him for that.
But if you will indulge me a moment, here is the code I was working on with concurrent.futures (working off an example I saw somewhere):
from concurrent.futures import ProcessPoolExecutor, as_completed
import math

def search(pat, txt):
    """Naive string search algorithm."""
    patLen = len(pat)
    txtLen = len(txt)
    offsets = []
    # loop to slide the pattern along one position at a time;
    # range generates numbers up to but not including the last number
    for i in range((txtLen - patLen) + 1):
        # cannot use a for loop here:
        # C-style loops with && conditions must
        # be converted to while statements in Python
        counter = 0
        while counter < patLen and pat[counter] == txt[counter + i]:
            counter += 1
        if counter >= patLen:
            offsets.append(i)
    return str(offsets).strip('[]')

# check a list of strings
def chunked_worker(lines):
    return {0: search("fmo", line) for line in lines}

def pool_bruteforce(filename, nprocs):
    lines = []
    with open(filename) as f:
        lines = [line.rstrip('\n') for line in f]
    chunksize = int(math.ceil(len(lines) / float(nprocs)))
    futures = []
    with ProcessPoolExecutor() as executor:
        for i in range(nprocs):
            chunk = lines[(chunksize * i): (chunksize * (i + 1))]
            futures.append(executor.submit(chunked_worker, chunk))
    resultdict = {}
    for f in as_completed(futures):
        resultdict.update(f.result())
    return resultdict

filename = "file1.txt"
pool_bruteforce(filename, 5)
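One thing worth flagging in `chunked_worker` above: the dict comprehension keys every line with the constant `0`, so each line's result overwrites the previous one and only one result per chunk survives. A sketch of one possible fix (my own variant; for brevity this helper returns offset lists instead of strings, and the chunk's start index is passed in so results can be keyed by global line number):

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def search(pat, txt):
    # plain substring scan returning a list of match offsets
    return [i for i in range(len(txt) - len(pat) + 1)
            if txt[i:i + len(pat)] == pat]

def chunked_worker(start, lines, pat="fmo"):
    # key each result by the line's global index, not a constant 0
    return {start + j: search(pat, line) for j, line in enumerate(lines)}

def pool_bruteforce(lines, nprocs=2, chunksize=2):
    futures, results = [], {}
    with ProcessPoolExecutor(max_workers=nprocs) as executor:
        for i in range(0, len(lines), chunksize):
            futures.append(executor.submit(chunked_worker, i, lines[i:i + chunksize]))
        for f in as_completed(futures):
            results.update(f.result())
    return results

if __name__ == "__main__":
    # offsets keyed by line number, e.g. line 0 -> [0], line 1 -> [1], line 2 -> []
    print(pool_bruteforce(["fmo", "xfmoy", "none"]))
```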
Thanks again to Valeriy, and to everyone who attempted to help me solve my riddle.

You are using several arguments, so:
import multiprocessing
from functools import partial

filename = "file1.txt"
pat = "afow"
N = 1000

def search(pat, txt):
    """Naive string search algorithm."""
    patLen = len(pat)
    txtLen = len(txt)
    offsets = []
    # loop to slide the pattern along one position at a time;
    # range generates numbers up to but not including the last number
    for i in range((txtLen - patLen) + 1):
        # cannot use a for loop here:
        # C-style loops with && conditions must
        # be converted to while statements in Python
        counter = 0
        while counter < patLen and pat[counter] == txt[counter + i]:
            counter += 1
        if counter >= patLen:
            offsets.append(i)
    return str(offsets).strip('[]')

if __name__ == "__main__":
    tasks = []
    pool_outputs = []
    pool = multiprocessing.Pool(processes=5)
    lines = []
    with open(filename, 'r') as infile:
        for line in infile:
            lines.append(line.rstrip())
    tasks = lines
    func = partial(search, pat)
    if len(lines) > N:
        pool_output = pool.map(func, lines)
        pool_outputs.append(pool_output)
    elif len(lines) > 0:
        pool_output = pool.map(func, lines)
        pool_outputs.append(pool_output)
    pool.close()
    pool.join()
    print('Pool:', pool_outputs)
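The key move in this answer is `functools.partial`: `pool.map` only supplies one argument per call, so `partial` freezes `pat` in advance and the mapped function needs just the text. In isolation (toy `search` stand-in, my own example):

```python
from functools import partial

def search(pat, txt):
    # toy stand-in: does pat occur anywhere in txt?
    return pat in txt

func = partial(search, "afow")   # binds pat; func(txt) == search("afow", txt)
print(func("xxafowyy"))          # -> True
print(list(map(func, ["afow", "zz"])))  # -> [True, False]
```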