parallel_map_funcs

sherpa.utils.parallel.parallel_map_funcs(funcs, datasets, numcores=None)

Run a sequence of functions on a sequence of inputs in parallel.

Sherpa’s parallel_map applies a single function to an iterable sequence of inputs. parallel_map_funcs is a generalized version of parallel_map: each element of the ordered iterable funcs operates on the corresponding element of datasets.

Parameters:
  • funcs (a list or tuple of functions) – An ordered sequence of functions, each of which accepts an element of the datasets and returns a value. The number of elements in funcs must match the number of elements in datasets.

  • datasets (a list or tuple of array_like) – The data to be passed to funcs. The number of elements in datasets must match the number of elements in funcs.

  • numcores (int or None, optional) – The number of calls to funcs to run in parallel. When set to None, all the available CPUs on the machine - as set either by the ‘numcores’ setting of the ‘parallel’ section of Sherpa’s preferences or by multiprocessing.cpu_count - are used.
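
A minimal sketch of how the None default for numcores might be resolved. It assumes the preference setting takes precedence over multiprocessing.cpu_count, which the description above does not spell out. resolve_numcores and its preference argument are hypothetical names used for illustration and are not part of Sherpa.

```python
import multiprocessing


def resolve_numcores(numcores=None, preference=None):
    """Hypothetical helper: choose how many calls to run in parallel.

    `preference` stands in for the 'numcores' setting of the 'parallel'
    section of Sherpa's preferences; None means the setting is absent.
    """
    if numcores is not None:
        # An explicit argument always wins.
        return int(numcores)
    if preference is not None:
        # Assumption: the preference is consulted before the CPU count.
        return int(preference)
    # Fall back to the number of CPUs reported by the standard library.
    return multiprocessing.cpu_count()


print(resolve_numcores(2))
print(resolve_numcores(None, preference=4))
```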

Returns:

ans – The return values from the calls, in the same order as the inputs in datasets.

Return type:

array

Notes

Due to the overhead involved in passing the functions and datasets to the different cores, the functions should be expensive to compute (of order 0.1-1s per call) for the parallel run to be worthwhile. The same consideration applies to the parallel_map function.
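
One rough way to check whether a function is expensive enough is to time a single call and compare it with the lower end of the range quoted above. This is only a sketch: seconds_per_call, worth_parallelising, and slow are hypothetical helpers, and the 0.1s threshold is taken from the note rather than from any Sherpa setting.

```python
import time


def seconds_per_call(func, sample):
    """Return the wall-clock time taken by one call of func(sample)."""
    start = time.perf_counter()
    func(sample)
    return time.perf_counter() - start


def worth_parallelising(func, sample, threshold=0.1):
    """Hypothetical rule of thumb: is one call at least `threshold` seconds?"""
    return seconds_per_call(func, sample) >= threshold


def slow(x):
    # Stand-in for an expensive computation.
    time.sleep(0.15)
    return x


print(worth_parallelising(slow, 1))   # expensive call
print(worth_parallelising(abs, -1))   # cheap call
```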

An ordered iterable (i.e. a tuple or list) should be used to pass the multiple values to the multiple functions. The lengths of funcs and datasets must be equal. Each function and its corresponding dataset are passed to one of the cores to distribute the work in parallel. There is no guarantee on the order in which the tasks are run.
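
The pairing and ordering described here can be sketched with the standard library. This is not Sherpa's implementation: map_funcs_sketch, square, and negate are hypothetical names, threads are used instead of separate processes to keep the sketch short, and it assumes (as the example output in this section suggests) that each function is applied to every element of its dataset and that the results are concatenated.

```python
from concurrent.futures import ThreadPoolExecutor


def map_funcs_sketch(funcs, datasets, numcores=2):
    """Hypothetical stand-in that pairs funcs[i] with datasets[i]."""
    if len(funcs) != len(datasets):
        raise TypeError("funcs and datasets must have the same length")

    def run_pair(func, data):
        # Assumption: the function is applied to each element of its dataset.
        return [func(x) for x in data]

    with ThreadPoolExecutor(max_workers=numcores) as pool:
        # The tasks may start and finish in any order ...
        futures = [pool.submit(run_pair, f, d)
                   for f, d in zip(funcs, datasets)]
        # ... but the results are collected in submission order.
        per_pair = [fut.result() for fut in futures]

    # Flatten the per-dataset results into a single list.
    return [value for chunk in per_pair for value in chunk]


def square(x):
    return x * x


def negate(x):
    return -x


result = map_funcs_sketch([square, negate], [[1, 2, 3], [4, 5]])
print(result)  # [1, 4, 9, -4, -5]
```

Collecting the results by position rather than by completion time is what keeps the output order stable even though the execution order is not guaranteed.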

Examples

In the following example a simple set of computations, a sum and a standard deviation, is used; in reality the function is expected to be run on computations that take a significant amount of time.

Run each computation (summing up each element of the first input array and calculating the standard deviation of the second input array) on a separate core and return the results (unless the machine only has a single core or the parallel.numcores setting is set to 1).

>>> import numpy as np
>>> from sherpa.utils.parallel import parallel_map_funcs
>>> funcs = [np.sum, np.std]
>>> datasets = [np.arange(3), np.arange(4)]
>>> parallel_map_funcs(funcs, datasets, numcores=2)
[0, 1, 2, 0.0, 0.0, 0.0, 0.0]