Skip to content

Pypeln Overview

cgarciae/pypeln



pl.process

The process module lets you create pipelines using objects from python's multiprocessing module according to Pypeln's general architecture. Use this module when you are in need of true parallelism for CPU heavy operations but be aware of its implications.

Most functions in this module return a pl.process.Stage object which implement the Iterable interface which enables you to combine it seamlessly with regular Python code.

Iterable

You can iterate over any p.process.Stage to get back the results of its computation:

import pypeln as pl
import time
from random import random

def slow_add1(x):
    time.sleep(random()) # <= some slow computation
    return x + 1

def slow_gt3(x):
    time.sleep(random()) # <= some slow computation
    return x > 3

data = range(10) # [0, 1, 2, ..., 9] 

stage = pl.process.map(slow_add1, data, workers=3, maxsize=4)
stage = pl.process.filter(slow_gt3, stage, workers=2)

for x in stage:
    print(x) # e.g. 5, 6, 9, 4, 8, 10, 7

At each stage the you can specify the numbers of workers. The maxsize parameter limits the maximum amount of elements that the stage can hold simultaneously.

Members