Convolution¶
A convolution model requires both the evaluation grid and the data to convolve. Examples include using a point-spread function (PSF) model to modify a two-dimensional model to account for blurring due to the instrument (or other sources, such as the atmosphere for ground-based Astronomical data sets), or the redistribution of the counts as a function of energy as modelled by the RMF when analyzing astronomical X-ray spectra.
Two-dimensional convolution with a PSF¶
The sherpa.astro.instrument.PSFModel
class augments the
behavior of
sherpa.instrument.PSFModel
by supporting images with
a World Coordinate System (WCS).
Including a PSF in a model expression¶
There are two steps to including a PSF:
- create an instance
- apply the instance to the model components
The “kernel” of the PSF is the actual data used to represent the
blurring, and can be given as a numeric array or as a Sherpa model.
In the following example a simple 3 by 3 array is used to represent
the PSF, but it first has to be converted into a
Data
object:
>>> from sherpa.data import Data2D
>>> from sherpa.instrument import PSFModel
>>> k = np.asarray([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
>>> yg, xg = np.mgrid[:3, :3]
>>> kernel = Data2D('kdata', xg.flatten(), yg.flatten(), k.flatten(),
shape=k.shape)
>>> psf = PSFModel(kernel=kernel)
>>> print(psf)
psfmodel
Param Type Value Min Max Units
----- ---- ----- --- --- -----
psfmodel.kernel frozen kdata
psfmodel.radial frozen 0 0 1
psfmodel.norm frozen 1 0 1
As shown below, the data in the
PSF is renormalized so that its sum matches the norm
parameter,
which here is set to 1.
The following example sets up a simple model expression which represents
the sum of a single pixel and a line of pixels, using
Box2D
for both.
>>> from sherpa.models.basic import Box2D
>>> pt = Box2D('pt')
>>> pt.xlow, pt.xhi = 1.5, 2.5
>>> pt.ylow, pt.yhi = 2.5, 3.5
>>> pt.ampl = 8
>>> box = Box2D('box')
>>> box.xlow, box.xhi = 4, 10
>>> box.ylow, box.yhi = 6.5, 7.5
>>> box.ampl = 10
>>> unconvolved_mdl = pt + box
>>> print(unconvolved_mdl)
(pt + box)
Param Type Value Min Max Units
----- ---- ----- --- --- -----
pt.xlow thawed 1.5 -3.40282e+38 3.40282e+38
pt.xhi thawed 2.5 -3.40282e+38 3.40282e+38
pt.ylow thawed 2.5 -3.40282e+38 3.40282e+38
pt.yhi thawed 3.5 -3.40282e+38 3.40282e+38
pt.ampl thawed 10 -3.40282e+38 3.40282e+38
box.xlow thawed 4 -3.40282e+38 3.40282e+38
box.xhi thawed 10 -3.40282e+38 3.40282e+38
box.ylow thawed 6.5 -3.40282e+38 3.40282e+38
box.yhi thawed 7.5 -3.40282e+38 3.40282e+38
box.ampl thawed 10 -3.40282e+38 3.40282e+38
Note
Although Sherpa provides the Delta2D
class, it is suggested that alternatives such as
Box2D
be used instead, since a
delta function is very sensitive to the location at which it
is evaluated. However, including a Box2D
component in a fit can still
be problematic since the output of the model does not vary smoothly
as any of the bin edges change, which is a challenge for the
optimisers provided with Sherpa.
Rather than being another term in the model expression - that is, an item that is added, subtracted, multiplied, or divided into an existing expression - the PSF model “wraps” the model it is to convolve. This can be a single model or - as in this case - a composite one:
>>> convolved_mdl = psf(unconvolved_mdl)
>>> print(convolved_mdl)
psfmodel((pt + box))
Param Type Value Min Max Units
----- ---- ----- --- --- -----
pt.xlow thawed 1.5 -3.40282e+38 3.40282e+38
pt.xhi thawed 2.5 -3.40282e+38 3.40282e+38
pt.ylow thawed 2.5 -3.40282e+38 3.40282e+38
pt.yhi thawed 3.5 -3.40282e+38 3.40282e+38
pt.ampl thawed 10 -3.40282e+38 3.40282e+38
box.xlow thawed 4 -3.40282e+38 3.40282e+38
box.xhi thawed 10 -3.40282e+38 3.40282e+38
box.ylow thawed 6.5 -3.40282e+38 3.40282e+38
box.yhi thawed 7.5 -3.40282e+38 3.40282e+38
box.ampl thawed 10 -3.40282e+38 3.40282e+38
This new expression can be treated as any other Sherpa model, which means that we can apply extra terms to it, such as adding a background component that is not affected by the PSF:
>>> from sherpa.models.basic import Const2D
>>> bgnd = Const2D('bgnd')
>>> bgnd.c0 = 0.25
>>> print(convolved_mdl + bgnd)
(psfmodel((pt + box)) + bgnd)
Param Type Value Min Max Units
----- ---- ----- --- --- -----
pt.xlow thawed 1.5 -3.40282e+38 3.40282e+38
pt.xhi thawed 2.5 -3.40282e+38 3.40282e+38
pt.ylow thawed 2.5 -3.40282e+38 3.40282e+38
pt.yhi thawed 3.5 -3.40282e+38 3.40282e+38
pt.ampl thawed 10 -3.40282e+38 3.40282e+38
box.xlow thawed 4 -3.40282e+38 3.40282e+38
box.xhi thawed 10 -3.40282e+38 3.40282e+38
box.ylow thawed 6.5 -3.40282e+38 3.40282e+38
box.yhi thawed 7.5 -3.40282e+38 3.40282e+38
box.ampl thawed 10 -3.40282e+38 3.40282e+38
bgnd.c0 thawed 0.25 -3.40282e+38 3.40282e+38
In the following this extra term (bgnd
) is not included to simplify
the comparison between the unconvolved and convolved versions.
Evaluating a model including a PSF¶
The PSF-convolved model can be evaluated - in most cases - just as is done for ordinary models. That is by supplying it with the grid coordinates to use. However, the need to convolve the data with a fixed grid does limit this somewhat.
For this example, a grid covering the points 0 to 9 inclusive is used for each axis (with a unit pixel size), which means that the unconvolved model can be evaluated with the following:
>>> yg, xg = np.mgrid[:10, :10]
>>> xg1d, yg1d = xg.flatten(), yg.flatten()
>>> m1 = unconvolved_mdl(xg1d, yg1d).reshape(xg.shape)
An easier alternative, once the PSF is included, is to create an
empty dataset with the given grid (that is, a dataset for which we
do not care about the dependent axis), and use the
eval_model()
method to
evaluate the model (the result for m1
is the same whichever
approach is used):
>>> blank = Data2D('blank', xg1d, yg1d, np.ones(xg1d.shape), xg.shape)
>>> m1 = blank.eval_model(unconvolved_mdl).reshape(xg.shape)
The “point source” is located at x = 2, y = 3
and the line
starts at x=5
and extends to the end of the grid (at y=7
).
Note
In this example the image coordinates were chosen to be the same
as those drawn by matplotlib. The extent
parameter of the
imshow
call can be used when this correspondance does not
hold.
The PSF model includes a
fold()
method, which is used to
pre-calculate terms needed for the convolution (which is done using
a fourier transform), and so needs the grid over which it is to be
applied. This is done by passing in a Sherpa dataset, such as the
blank
example we have just created:
>>> psf.fold(blank)
>>> m2 = blank.eval_model(convolved_mdl).reshape(xg.shape)
The kernel used redistributes flux from the central pixel to its four
immediate neighbors equally, which is what has happened to the point
source at (2, 2)
. The result for the line is to blur the line
slightly, but note that the convolution has “wrapped around”, so that
the flux that should have been placed into the pixel at (10, 7)
,
which is off the grid, has been moved to (0, 7)
.
Note
If the fold method is not called then evaluating the model will raise the following exception:
PSFErr: PSF model has not been folded
Care must be taken to ensure that fold is called whenever the grid has changed. This suggests that the same PSF model should not be used in simultaneous fits, unless it is known that the grid is the same in the multiple datasets.
The PSF Normalization¶
Since the norm
parameter of the PSF model was set to 1, the PSF
convolution is flux preserving, even at the edges thanks to the
wrap-around behavior of the fourier transform. This can be seen by
comparing the signal in the unconvolved and convolved images, which
are (to numerical precision) the same:
>>> m1.sum()
60.0
>>> m2.sum()
60.0
The use of a fourier transform means that low-level signal will be
found in many pixels which would expect to be 0. For example,
looking at the row of pixels at y = 7
gives:
>>> m2[7]
array([2.50000000e+00, 1.73472348e-16, 5.20417043e-16, 4.33680869e-16,
2.50000000e+00, 2.50000000e+00, 5.00000000e+00, 5.00000000e+00,
5.00000000e+00, 2.50000000e+00])