# Convolution

A convolution model requires both the evaluation grid *and* the
data to convolve. Examples include using a point-spread function
(PSF) model to modify a two-dimensional model to account
for blurring due to the instrument (or other sources, such as
the atmosphere for ground-based Astronomical data sets),
or the redistribution of the counts as a function of energy
as modelled by the RMF when analyzing astronomical
X-ray spectra.

Note

In the following documentation we are going to use the term “convolution” to refer to any model that requires the data from another model, such as a model that reprocesses emission into another passband or that renormalises the data, even if it does not match the meaning of convolution.

## Creating a convolution kernel

For convolution-style models, there are normally two steps to create a model component:

create an instance of the convolution kernel

apply the instance to the model components

The convolution kernel is not an instance of
`sherpa.models.model.ArithmeticModel`

, and so
can not be evaluated by passing it the data grid. Instead,
the kernel is used to wrap another model expression,
normally done by passing the model to be wrapped as the
argument to the model, and the result of this is then
a model that can be evaluated on the data grid.

Note

Evaluating the kernel - such as `psf`

in the example
below - with a grid will raise an error like:

```
ModelErr: attempted to create ArithmeticFunctionModel from non-callable object of type ndarray
```

## One-dimensional convolution with a PSF

In this example we want to convolve a model by the “kernel”
[5, 10, 3, 2], which requires creating a
`sherpa.data.Data1D`

instance to store the
data, and then pass it to `sherpa.instrument.PSFModel`

:

```
>>> import numpy as np
>>> from sherpa.data import Data1D
>>> from sherpa.instrument import PSFModel
>>> k = np.asarray([5, 10, 3, 2])
>>> x = np.arange(k.size)
>>> kernel = Data1D('kdata1', x, k)
>>> psf1 = PSFModel('psf1', kernel=kernel)
>>> print(psf1)
psf1
Param Type Value Min Max Units
----- ---- ----- --- --- -----
psf1.kernel frozen kdata1
psf1.radial frozen 0 0 1
psf1.norm frozen 1 0 1
```

To show how this model works we will use it to convolve
a `Box1D`

model (this is chosen
since it has sharp edges and so the convolution is more obvious).

```
>>> from sherpa.models.basic import Box1D
>>> point = Box1D('pt')
>>> point.xlow = 9.5
>>> point.xhi = 10.5
>>> print(point)
pt
Param Type Value Min Max Units
----- ---- ----- --- --- -----
pt.xlow thawed 9.5 -3.40282e+38 3.40282e+38
pt.xhi thawed 10.5 -3.40282e+38 3.40282e+38
pt.ampl thawed 1 -3.40282e+38 3.40282e+38
```

The convolution case is created by applying the `psf1`

model
to the `point`

model (the 2D example
below shows an example of applying a kernel to a
composite model):

```
>>> convolved = psf1(point)
>>> print(convolved)
psf1(pt)
Param Type Value Min Max Units
----- ---- ----- --- --- -----
pt.xlow thawed 9.5 -3.40282e+38 3.40282e+38
pt.xhi thawed 10.5 -3.40282e+38 3.40282e+38
pt.ampl thawed 1 -3.40282e+38 3.40282e+38
```

The model can be evaluated both before and after convolution, by
passing it the data grid. Unlike normal model evaluation the
`PSFModel`

class requires that its
`fold()`

model be called before
evaluation. This method pre-calculates terms needed for the
convolution (which is done using a fourier transform), and so needs
the grid over which it is to be applied. This is done by passing in a
`Data`

instance (in this case the Y data in the
`Data1D`

instance is not used so is set to
zero):

```
>>> x = np.arange(6, 15)
>>> blank = Data1D('blank', x, np.zeros(x.size))
>>> psf1.fold(blank)
```

With this out of the way, we can compare the convolved to un-convolved results:

```
>>> y1 = point(x)
>>> y2 = convolved(x)
>>> for z in zip(x, y1, y2):
... print("x: {:2d} y: {:.0f} convolved: {:.2f}".format(*z))
x: 6 y: 0 convolved: 0.00
x: 7 y: 0 convolved: 0.00
x: 8 y: 0 convolved: 0.00
x: 9 y: 0 convolved: 0.25
x: 10 y: 1 convolved: 0.50
x: 11 y: 0 convolved: 0.15
x: 12 y: 0 convolved: 0.10
x: 13 y: 0 convolved: 0.00
x: 14 y: 0 convolved: 0.00
```

The `PSFModel`

instance has automatically
selected the largest pixel in the kernel as the center (in this case
the second element, 10), and has automatically re-normalized the
kernel. The parameters of the convolution kernel (in this case
`psf1`

) can be changed to control the behavior.

```
>>> print(psf1)
psf1
Param Type Value Min Max Units
----- ---- ----- --- --- -----
psf1.kernel frozen kdata1
psf1.size frozen 4 4 4
psf1.center frozen 2 2 2
psf1.radial frozen 0 0 1
psf1.norm frozen 1 0 1
```

Note

The model parameters for the convolution kernel have changed
since earlier, as the use of the
`fold()`

method has added
new parameters (in this case the `size`

and `center`

parameters).

## Two-dimensional convolution with a PSF

The `sherpa.astro.instrument.PSFModel`

class augments the
behavior of `sherpa.instrument.PSFModel`

by supporting
images with a World Coordinate System (WCS). For this example
we do not need this capability and so use the
`sherpa.instrument.PSFModel`

class directly.

### Including a PSF in a model expression

The “kernel” of the PSF is the actual data used to represent the
blurring, and can be given as a numeric array or as a Sherpa model.
In the following example a simple 3 by 3 array is used to represent
the PSF, but it first has to be converted into a
`Data2D`

object (this is similar to
the steps needed in the
1D case above):

```
>>> from sherpa.data import Data2D
>>> from sherpa.instrument import PSFModel
>>> k = np.asarray([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
>>> yg, xg = np.mgrid[:3, :3]
>>> kernel = Data2D('kdata', xg.flatten(), yg.flatten(), k.flatten(),
... shape=k.shape)
>>> psf = PSFModel(kernel=kernel)
>>> print(psf)
psfmodel
Param Type Value Min Max Units
----- ---- ----- --- --- -----
psfmodel.kernel frozen kdata
psfmodel.radial frozen 0 0 1
psfmodel.norm frozen 1 0 1
```

As shown below, the data in the
PSF is renormalized so that its sum matches the `norm`

parameter,
which here is set to 1.

The following example sets up a simple model expression which represents
the sum of a single pixel and a line of pixels, using
`Box2D`

for both.

```
>>> from sherpa.models.basic import Box2D
>>> pt = Box2D('pt')
>>> pt.xlow, pt.xhi = 1.5, 2.5
>>> pt.ylow, pt.yhi = 2.5, 3.5
>>> pt.ampl = 8
>>> box = Box2D('box')
>>> box.xlow, box.xhi = 4, 10
>>> box.ylow, box.yhi = 6.5, 7.5
>>> box.ampl = 10
>>> unconvolved_mdl = pt + box
>>> print(unconvolved_mdl)
pt + box
Param Type Value Min Max Units
----- ---- ----- --- --- -----
pt.xlow thawed 1.5 -3.40282e+38 3.40282e+38
pt.xhi thawed 2.5 -3.40282e+38 3.40282e+38
pt.ylow thawed 2.5 -3.40282e+38 3.40282e+38
pt.yhi thawed 3.5 -3.40282e+38 3.40282e+38
pt.ampl thawed 8 -3.40282e+38 3.40282e+38
box.xlow thawed 4 -3.40282e+38 3.40282e+38
box.xhi thawed 10 -3.40282e+38 3.40282e+38
box.ylow thawed 6.5 -3.40282e+38 3.40282e+38
box.yhi thawed 7.5 -3.40282e+38 3.40282e+38
box.ampl thawed 10 -3.40282e+38 3.40282e+38
```

Note

Although Sherpa provides the `Delta2D`

class, it is suggested that alternatives such as
`Box2D`

be used instead, since a
delta function is **very** sensitive to the location at which it
is evaluated. However, including a `Box2D`

component in a fit can still
be problematic since the output of the model does not vary smoothly
as any of the bin edges change, which is a challenge for the
optimisers provided with Sherpa.

Rather than being another term in the model expression - that is, an item that is added, subtracted, multiplied, or divided into an existing expression - the PSF model “wraps” the model it is to convolve. This can be a single model or - as in this case - a composite one:

```
>>> convolved_mdl = psf(unconvolved_mdl)
>>> print(convolved_mdl)
psfmodel(pt + box)
Param Type Value Min Max Units
----- ---- ----- --- --- -----
pt.xlow thawed 1.5 -3.40282e+38 3.40282e+38
pt.xhi thawed 2.5 -3.40282e+38 3.40282e+38
pt.ylow thawed 2.5 -3.40282e+38 3.40282e+38
pt.yhi thawed 3.5 -3.40282e+38 3.40282e+38
pt.ampl thawed 8 -3.40282e+38 3.40282e+38
box.xlow thawed 4 -3.40282e+38 3.40282e+38
box.xhi thawed 10 -3.40282e+38 3.40282e+38
box.ylow thawed 6.5 -3.40282e+38 3.40282e+38
box.yhi thawed 7.5 -3.40282e+38 3.40282e+38
box.ampl thawed 10 -3.40282e+38 3.40282e+38
```

This new expression can be treated as any other Sherpa model, which means that we can apply extra terms to it, such as adding a background component that is not affected by the PSF:

```
>>> from sherpa.models.basic import Const2D
>>> bgnd = Const2D('bgnd')
>>> bgnd.c0 = 0.25
>>> print(convolved_mdl + bgnd)
psfmodel(pt + box) + bgnd
Param Type Value Min Max Units
----- ---- ----- --- --- -----
pt.xlow thawed 1.5 -3.40282e+38 3.40282e+38
pt.xhi thawed 2.5 -3.40282e+38 3.40282e+38
pt.ylow thawed 2.5 -3.40282e+38 3.40282e+38
pt.yhi thawed 3.5 -3.40282e+38 3.40282e+38
pt.ampl thawed 8 -3.40282e+38 3.40282e+38
box.xlow thawed 4 -3.40282e+38 3.40282e+38
box.xhi thawed 10 -3.40282e+38 3.40282e+38
box.ylow thawed 6.5 -3.40282e+38 3.40282e+38
box.yhi thawed 7.5 -3.40282e+38 3.40282e+38
box.ampl thawed 10 -3.40282e+38 3.40282e+38
bgnd.c0 thawed 0.25 -3.40282e+38 3.40282e+38
```

In the following this extra term (`bgnd`

) is not included to simplify
the comparison between the unconvolved and convolved versions.

### Evaluating a model including a PSF

The PSF-convolved model can be evaluated - in *most cases* - just as
is done for ordinary models. That is by supplying it with the grid
coordinates to use. However, the need to convolve the data with a
fixed grid does limit this somewhat.

For this example, a grid covering the points 0 to 9 inclusive is used for each axis (with a unit pixel size), which means that the unconvolved model can be evaluated with the following:

```
>>> yg, xg = np.mgrid[:10, :10]
>>> xg1d, yg1d = xg.flatten(), yg.flatten()
>>> m1 = unconvolved_mdl(xg1d, yg1d).reshape(xg.shape)
```

An easier alternative, once the PSF is included, is to create an
empty dataset with the given grid (that is, a dataset for which we
do not care about the dependent axis), and use the
`eval_model()`

method to
evaluate the model (the result for `m1`

is the same whichever
approach is used):

```
>>> blank = Data2D('blank', xg1d, yg1d, np.ones(xg1d.shape), xg.shape)
>>> m1 = blank.eval_model(unconvolved_mdl).reshape(xg.shape)
```

The “point source” is located at `x = 2, y = 3`

and the line
starts at `x=5`

and extends to the end of the grid (at `y=7`

).

Note

In this example the image coordinates were chosen to be the same
as those drawn by matplotlib. The `extent`

parameter of the
`imshow`

call can be used when this correspondence does not
hold.

As with the 1D case, the
`fold()`

method must be called
before evaluation.

```
>>> psf.fold(blank)
>>> m2 = blank.eval_model(convolved_mdl).reshape(xg.shape)
```

The kernel used redistributes flux from the central pixel to its four
immediate neighbors equally, which is what has happened to the point
source at `(2, 2)`

. The result for the line is to blur the line
slightly, but note that the convolution has “wrapped around”, so that
the flux that should have been placed into the pixel at `(10, 7)`

,
which is off the grid, has been moved to `(0, 7)`

.

Note

If the `fold`

method is not called then evaluating the model will
raise the following exception:

```
PSFErr: PSF model has not been folded
```

Care must be taken to ensure that `fold`

is called whenever the grid
has changed. This suggests that the same PSF model should not be used
in simultaneous fits, unless it is known that the grid is the same
in the multiple datasets.

### The PSF Normalization

Since the `norm`

parameter of the PSF model was set to 1, the PSF
convolution is flux preserving, even at the edges thanks to the
wrap-around behavior of the fourier transform. This can be seen by
comparing the signal in the unconvolved and convolved images, which
are (to numerical precision) the same:

```
>>> m1.sum()
58.0
>>> m2.sum()
58.0
```

The use of a fourier transform means that low-level signal will be
found in many pixels which would expect to be 0. For example,
looking at the row of pixels at `y = 7`

gives:

```
>>> m2[7]
array([2.50000000e+00, 1.73472348e-16, 5.20417043e-16, 4.33680869e-16,
2.50000000e+00, 2.50000000e+00, 5.00000000e+00, 5.00000000e+00,
5.00000000e+00, 2.50000000e+00])
```