DataPHA

class sherpa.astro.data.DataPHA(name, channel, counts, staterror=None, syserror=None, bin_lo=None, bin_hi=None, grouping=None, quality=None, exposure=None, backscal=None, areascal=None, header=None)

Bases: Data1D

PHA data set, including any associated instrument and background data.

The PHA format is described in the OGIP documents [OGIP_92_007] and [OGIP_92_007a].

Parameters:

name (str) – The name of the data set; often set to the name of the file containing the data.
channel (array of int) – The channel numbers of the PHA data.
counts (array of int) – The counts in each channel.
staterror (scalar or array or None, optional) – The statistical error for the data, if defined.
syserror (scalar or array or None, optional) – The systematic error for the data, if defined.
bin_lo (array or None, optional) – The wavelength ranges for the channels. This is intended to support Chandra grating spectra.
bin_hi (array or None, optional) – The wavelength ranges for the channels. This is intended to support Chandra grating spectra.
grouping (array of int or None, optional)
quality (array of int or None, optional)
exposure (number or None, optional) – The exposure time for the PHA data set, in seconds.
backscal (scalar or array or None, optional)
areascal (scalar or array or None, optional)
header (dict or None, optional) – If None the header will be pre-populated with a minimal set of keywords that would be found in an OGIP compliant PHA I file.

name

Used to store the file name, for data read from a file.

Type:: str

exposure

Notes

The original data is stored in the attributes - e.g. counts - and the data-access methods, such as get_dep and get_staterror, provide any necessary data manipulation to handle cases such as: background subtraction, filtering, and grouping.

There is additional complexity compared to the Data1D case when filtering data because:

although the data uses channel numbers, users will often want to filter the data using derived values (in energy or wavelength units, such as 0.5 to 7.0 keV or 16 to 18 Angstroms);
although derived from the Data1D case, PHA data is more properly thought of as an integrated data set, so each channel maps to a range of energy or wavelength values;
the data is often grouped to improve the signal-to-noise, and so requests for values need to determine whether to filter the data or not, whether to group the data or not, and how to combine the data within each group;
and there is also the quality array, which indicates whether a channel is trustworthy (and so acts as an additional filtering term).
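The first two points can be illustrated with a small, self-contained sketch. It is not Sherpa's implementation: the channel-to-energy grid is made up (a real one would come from the response), and `channels_in_range` is a hypothetical helper that keeps every channel whose energy bin overlaps the requested range.

```python
def channels_in_range(e_lo, e_hi, lo, hi):
    """Return the 1-based channel numbers whose bins overlap [lo, hi].

    Hypothetical helper: e_lo and e_hi hold the lower and upper
    energy edges (keV) of each channel, in channel order.
    """
    selected = []
    for idx, (a, b) in enumerate(zip(e_lo, e_hi)):
        # A bin is kept when any part of it lies inside the range.
        if a < hi and b > lo:
            selected.append(idx + 1)
    return selected


# Assumed grid: 10 channels, each 1 keV wide, starting at 0 keV.
e_lo = [float(i) for i in range(10)]
e_hi = [float(i + 1) for i in range(10)]

noticed = channels_in_range(e_lo, e_hi, 0.5, 7.0)
print(noticed)
```

Because each channel covers a range, the first selected channel here starts below the requested lower limit of 0.5 keV.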

The handling of the AREASCAL value - whether it is a scalar or array - is currently in flux. It is a value that is stored with the PHA file, and the OGIP PHA standard ([OGIP_92_007], [OGIP_92_007a]) describes the observed counts being divided by the area scaling before comparison to the model. However, this is not valid for Poisson-based statistics, and is also not how XSPEC handles AREASCAL ([PRIVATE_KA]); the AREASCAL values are used to scale the exposure times instead. The aim is to add this logic to the instrument models in sherpa.astro.instrument, such as sherpa.astro.instrument.RMFModelPHA. The area scaling still has to be applied when calculating the background contribution to a spectrum, as well as when calculating the data and model values used for plots (following XSPEC so as to avoid sharp discontinuities where the area-scaling factor changes strongly).

Attributes Summary

`areascal`	The area scaling value (can be a scalar or array).
`background_ids`	IDs of defined background data sets.
`backscal`	The background scaling value (can be a scalar or array).
`bin_hi`	The upper edge of each channel, in Angstroms, or None.
`bin_lo`	The lower edge of each channel, in Angstroms, or None.
`channel`	The channel array.
`counts`	The counts array.
`default_background_id`	The identifier for the background component when not set.
`dep`	Left for compatibility with older versions
`grouped`	Are the data grouped?
`grouping`	The grouping data.
`indep`	The grid of the data space associated with this data set.
`mask`	Mask array for dependent variable
`ndim`	The dimensionality of the dataset, if defined, or None.
`plot_fac`	How the X axis is used to create the Y axis when plotting data.
`primary_response_id`	The identifier for the response component when not set.
`quality`	The quality data.
`rate`	Is the Y axis displayed as a rate when plotting data?
`response_ids`	IDs of defined instrument responses (ARF/RMF pairs).
`size`	The number of elements in the data set.
`staterror`	The statistical error on the dependent axis, if set.
`subtracted`	Are the background data subtracted?
`syserror`	The systematic error on the dependent axis, if set.
`units`	The units of the independent axis: one of 'channel', 'energy', 'wavelength'.
`x`	Used for compatibility, in particular for __str__ and __repr__
`y`	The dependent axis.

Methods Summary

`apply_filter`(data[, groupfunc])	Group and filter the supplied data to match the data set.
`apply_grouping`(data[, groupfunc])	Apply the grouping scheme of the data set to the supplied data.
`delete_background`([id])	Remove the background component.
`delete_response`([id])	Remove the response component.
`eval_model`(modelfunc)	Evaluate the model on the independent axis.
`eval_model_to_fit`(modelfunc)	Evaluate the model on the independent axis after filtering.
`get_analysis`()	Return the units used when fitting spectral data.
`get_areascal`([group, filter])	Return the fractional area factor of the PHA data set.
`get_arf`([id])	Return the ARF from the response.
`get_background`([id])	Return the background component.
`get_background_scale`([bkg_id, units, group, ...])	Return the correction factor for the background dataset.
`get_backscal`([group, filter])	Return the background scaling of the PHA data set.
`get_bounding_mask`()
`get_dep`([filter])	Return the dependent axis of a data set.
`get_dims`([filter])	Return the dimensions of this data space as a tuple of tuples.
`get_error`([filter, staterrfunc])	Return the total error on the dependent variable.
`get_evaluation_indep`([filter, model, ...])
`get_filter`([group, format, delim])	Return the data filter as a string.
`get_filter_expr`()	Return the data filter as a string along with the units.
`get_full_response`([pileup_model])	Calculate the response for the dataset.
`get_img`([yfunc])	Return 1D dependent variable as a 1 x N image.
`get_imgerr`()
`get_indep`([filter])	Return the independent axes of a data set.
`get_mask`()	Returns the (ungrouped) mask.
`get_noticed_channels`()	Return the noticed channels.
`get_noticed_expr`()	Returns the current set of noticed channels.
`get_response`([id])	Return the response component.
`get_rmf`([id])	Return the RMF from the response.
`get_specresp`([filter])	Return the effective area values for the data set.
`get_staterror`([filter, staterrfunc])	Return the statistical error.
`get_syserror`([filter])	Return any systematic error.
`get_x`([filter, response_id])
`get_xerr`([filter, response_id])	Returns an X "error".
`get_xlabel`()	Return label for linear view of independent axis/axes
`get_y`([filter, yfunc, response_id, ...])	Return dependent axis in N-D view of dependent variable.
`get_yerr`([filter, staterrfunc, response_id])	Return errors in dependent axis in N-D view of dependent variable.
`get_ylabel`()	Return label for dependent axis in N-D view of dependent variable.
`group`()	Group the data according to the data set's grouping scheme.
`group_adapt`(minimum[, maxLength, tabStops])	Adaptively group to a minimum number of counts.
`group_adapt_snr`(minimum[, maxLength, ...])	Adaptively group to a minimum signal-to-noise ratio.
`group_bins`(num[, tabStops])	Group into a fixed number of bins.
`group_counts`(num[, maxLength, tabStops])	Group into a minimum number of counts per bin.
`group_snr`(snr[, maxLength, tabStops, errorCol])	Group into a minimum signal-to-noise ratio.
`group_width`(val[, tabStops])	Group into a fixed bin width.
`ignore`(*args, **kwargs)
`ignore_bad`()	Exclude channels marked as bad.
`notice`([lo, hi, ignore, bkg_id])	Notice or ignore the given range.
`notice_response`([notice_resp, noticed_chans])
`set_analysis`(quantity[, type, factor])	Set the units used when fitting and plotting spectral data.
`set_arf`(arf[, id])	Add or replace the ARF in a response component.
`set_background`(bkg[, id])	Add or replace a background component.
`set_dep`(val)	Set the dependent variable values.
`set_indep`(val)
`set_response`([arf, rmf, id])	Add or replace a response component.
`set_rmf`(rmf[, id])	Add or replace the RMF in a response component.
`subtract`()	Subtract the background data.
`sum_background_data`([get_bdata_func])	Sum up data, applying the background correction value.
`to_component_plot`([yfunc, staterrfunc])
`to_fit`([staterrfunc])
`to_guess`()
`to_plot`([yfunc, staterrfunc, response_id])
`ungroup`()	Remove any data grouping.
`unsubtract`()	Remove background subtraction.

Attributes Documentation

areascal

The area scaling value (can be a scalar or array).

If this is an array then it must match the length of channel.

background_ids

IDs of defined background data sets.

If set, the identifiers must already exist, and any other backgrounds will be removed. The identifiers can be integers or strings.

backscal

The background scaling value (can be a scalar or array).

If this is an array then it must match the length of channel.

bin_hi

The upper edge of each channel, in Angstroms, or None.

The values are expected to be in descending order, with the bin_hi value larger than the corresponding bin_lo element. This is only expected to be set for Chandra grating data.

bin_lo

The lower edge of each channel, in Angstroms, or None.

The values are expected to be in descending order. This is only expected to be set for Chandra grating data.
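The ordering rules for bin_lo and bin_hi can be checked with a short sketch. The helpers `check_wavelength_bins` and `to_energy` are hypothetical and are not part of Sherpa; the conversion uses the approximate value h*c = 12.39842 keV Angstrom.

```python
HC_KEV_ANGSTROM = 12.39842  # h*c in keV Angstrom (approximate)


def check_wavelength_bins(bin_lo, bin_hi):
    """Return True if the edges follow the ordering described above."""
    # Each upper edge must exceed its lower edge.
    pairs_ok = all(hi > lo for lo, hi in zip(bin_lo, bin_hi))
    # Both arrays must be in descending order.
    lo_desc = all(a > b for a, b in zip(bin_lo, bin_lo[1:]))
    hi_desc = all(a > b for a, b in zip(bin_hi, bin_hi[1:]))
    return pairs_ok and lo_desc and hi_desc


def to_energy(bin_lo, bin_hi):
    """Convert wavelength edges (Angstrom) to energy edges (keV)."""
    # The upper wavelength edge maps to the lower energy edge.
    e_lo = [HC_KEV_ANGSTROM / w for w in bin_hi]
    e_hi = [HC_KEV_ANGSTROM / w for w in bin_lo]
    return e_lo, e_hi


# Made-up grating-style grid with 1 Angstrom wide channels.
bin_lo = [20.0, 19.0, 18.0]
bin_hi = [21.0, 20.0, 19.0]

ok = check_wavelength_bins(bin_lo, bin_hi)
e_lo, e_hi = to_energy(bin_lo, bin_hi)
```

With wavelengths in descending order the corresponding energies increase with channel number.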

channel

The channel array.

This is the first, and only, element of the indep attribute.

counts

The counts array.

This is an alias for the y attribute.

default_background_id = 1

The identifier for the background component when not set.

It is an integer or string.

dep: Left for compatibility with older versions

grouped: Are the data grouped?

grouping

The grouping data.

A group is indicated by a sequence of flag values starting with 1 and then -1 for all the channels in the group, following [OGIP_92_007]. The grouping array must match the number of channels and it will be converted to an integer type if necessary.

Changed in version 4.15.1: The filter is now re-calculated when the grouping is changed. Check the filter with get_filter afterwards to confirm that it is still sensible. If set to None then the group flag is cleared.

Returns:: grouping
Return type:: numpy.ndarray or None

See also

group, grouped, quality

indep

The grid of the data space associated with this data set.

When set, the field must be set to a tuple, even for a one-dimensional data set. The “related” fields such as the dependent axis and the error fields are set to None if their size does not match.

Changed in version 4.14.1: The filter created by notice and ignore is now cleared when the independent axis is changed.

Return type:: tuple of array_like

mask

Mask array for dependent variable

Returns:: mask
Return type:: bool or numpy.ndarray

ndim = 1: The dimensionality of the dataset, if defined, or None.

plot_fac

How the X axis is used to create the Y axis when plotting data.

The Y axis values are multiplied by X^plot_fac. The default value of 0 means the X axis is not used in plots. The value must be an integer.
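A minimal sketch of the scaling, using a hypothetical `apply_plot_fac` helper on plain lists rather than anything from Sherpa:

```python
def apply_plot_fac(x, y, plot_fac=0):
    """Multiply each y value by x ** plot_fac (illustrative only)."""
    if not isinstance(plot_fac, int):
        raise TypeError("plot_fac must be an integer")
    return [yi * xi ** plot_fac for xi, yi in zip(x, y)]


x = [1.0, 2.0, 4.0]
y = [10.0, 5.0, 2.0]

unchanged = apply_plot_fac(x, y)         # plot_fac=0 leaves y as is
times_x = apply_plot_fac(x, y, 1)        # y * x
times_x2 = apply_plot_fac(x, y, 2)       # y * x^2
```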

primary_response_id = 1: The identifier for the response component when not set.

quality

The quality data.

A quality value of 0 indicates a good channel, otherwise (values >=1) the channel is considered bad and can be excluded using the ignore_bad method, as discussed in [OGIP_92_007]. The quality array must match the number of channels and it will be converted to an integer type if necessary.

Returns:: quality
Return type:: numpy.ndarray or None

See also

group, grouping
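The quality convention can be sketched as a boolean mask. This is an illustration on made-up values; `good_channel_mask` is a hypothetical helper and is not how ignore_bad is implemented.

```python
def good_channel_mask(quality):
    """Return True for channels whose quality flag is 0 (good)."""
    return [q == 0 for q in quality]


# Made-up quality flags and counts for six channels.
quality = [0, 0, 2, 0, 5, 0]
counts = [4, 7, 3, 9, 1, 6]

mask = good_channel_mask(quality)
kept = [c for c, ok in zip(counts, mask) if ok]
print(kept)
```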

rate

Is the Y axis displayed as a rate when plotting data?

When True the Y axis is normalised by the exposure time to display a rate.

response_ids

IDs of defined instrument responses (ARF/RMF pairs).

If set, the identifiers must already exist, and any other responses will be removed. The identifiers can be integers or strings.

size

The number of elements in the data set.

Returns:: size – If the size has not been set then None is returned.
Return type:: int or None

staterror

The statistical error on the dependent axis, if set.

This must match the size of the independent axis.

subtracted: Are the background data subtracted?

syserror

The systematic error on the dependent axis, if set.

This must match the size of the independent axis.

units

The units of the independent axis: one of ‘channel’, ‘energy’, or ‘wavelength’.

x: Used for compatibility, in particular for __str__ and __repr__

y

The dependent axis.

If set, it must match the size of the independent axes.

Methods Documentation

apply_filter(data, groupfunc=<function sum>)

Group and filter the supplied data to match the data set.

Parameters:

data (ndarray or None) – The data to group, which must match either the number of channels of the data set or the number of filtered channels.
groupfunc (function reference) – The grouping function. See apply_grouping for the supported values.

Returns:

result – The grouped and filtered data, or None if the input was None.

Return type:

ndarray or None

Raises:

sherpa.utils.err.DataErr – If the data size does not match the number of channels.
ValueError – If the name of groupfunc is not supported or the data does not match the filtered data.

Examples

Group and filter the counts array with no filter and then with a filter:

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.grouped
True
>>> pha.notice()
>>> print(pha.apply_filter(pha.counts))
[17.  15.  16.  15. ...
>>> pha.notice(0.5, 7)
>>> print(pha.apply_filter(pha.counts))
[15.  16.  15.  18.  ...

As the previous example but with no grouping:

>>> pha.ungroup()
>>> pha.notice()
>>> pha.apply_filter(pha.counts)[0:5]
array([0., 0., 0., 0., 0.])
>>> pha.notice(0.5, 7)
>>> pha.apply_filter(pha.counts)[0:5]
array([4., 3., 0., 1., 1.])

Rather than group the counts, use the channel numbers and return the first and last channel number in each of the filtered groups (for the first five groups):

>>> pha.group()
>>> pha.notice(0.5, 7.0)
>>> pha.apply_filter(pha.channel, pha._min)[0:5]
array([33., 40., 45., 49., 52.])
>>> pha.apply_filter(pha.channel, pha._max)[0:5]
array([39., 44., 48., 51., 54.])

Find the approximate energy range of each selected group from the RMF EBOUNDS extension:

>>> rmf = pha.get_rmf()
>>> elo = pha.apply_filter(rmf.e_min, pha._min)
>>> ehi = pha.apply_filter(rmf.e_max, pha._max)

Calculate the grouped data, after filtering, if the counts were increased by 2 per channel. Note that in this case the data to apply_filter contains the channel counts after applying the current filter:

>>> orig = pha.counts[pha.get_mask()]
>>> new = orig + 2
>>> cts = pha.apply_filter(new)

apply_grouping(data, groupfunc=<function sum>)

Apply the grouping scheme of the data set to the supplied data.

Parameters:

data (ndarray or None) – The data to group, which must match the number of channels of the data set.
groupfunc (function reference) – The grouping function. Note that what matters is the name of the function, not its code. The supported function names are: “sum”, “_sum_sq”, “_min”, “_max”, “_middle”, and “_make_groups”.

Returns:

grouped – The grouped data, unless the data set is not grouped or the input array was None, when the input data is returned.

Return type:

ndarray or None

Raises:

sherpa.utils.err.DataErr – If the data size does not match the number of channels.
ValueError – If the name of groupfunc is not supported.

See also

apply_filter, ignore_bad

Notes

The supported grouping schemes are:

Name	Description
sum	Sum all the values in the group.
_min	The minimum value in the group.
_max	The maximum value in the group.
_middle	The average of the minimum and maximum values.
_sum_sq	The square root of the sum of the squared values.
_make_groups	The group number, starting at the first value of data.

Each name other than “sum” (the default) matches a method of the DataPHA class, such as pha._min, which can be passed as the groupfunc argument.

The grouped data is not filtered unless a quality filter has been applied (e.g. by ignore_bad) in which case the quality filter will be applied to the result. In general apply_filter should be used if the data is to be filtered as well as grouped.
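The schemes in the table can be reproduced with a pure-Python sketch. It assumes OGIP-style grouping flags (1 starts a group, -1 continues it) and uses hypothetical helpers, `split_groups` and `apply_scheme`, that are not Sherpa functions; "_make_groups" is left out.

```python
import math


def split_groups(data, grouping):
    """Split data into lists, one per group, using 1 / -1 flags."""
    groups = []
    for value, flag in zip(data, grouping):
        if flag == 1 or not groups:
            groups.append([value])      # start a new group
        else:
            groups[-1].append(value)    # continue the current group
    return groups


# One reducer per scheme name, following the descriptions in the table.
REDUCERS = {
    "sum": sum,
    "_min": min,
    "_max": max,
    "_middle": lambda g: (min(g) + max(g)) / 2,
    "_sum_sq": lambda g: math.sqrt(sum(v * v for v in g)),
}


def apply_scheme(data, grouping, name="sum"):
    """Reduce each group of data with the named scheme."""
    reducer = REDUCERS[name]
    return [reducer(g) for g in split_groups(data, grouping)]


# Three groups containing 17, 4 and 11 channels.
grouping = [1] + [-1] * 16 + [1] + [-1] * 3 + [1] + [-1] * 10
channels = list(range(1, 33))

sizes = apply_scheme([1] * 32, grouping)
first = apply_scheme(channels, grouping, "_min")
last = apply_scheme(channels, grouping, "_max")
middle = apply_scheme(channels, grouping, "_middle")
```

For these three groups the sketch gives sizes of 17, 4 and 11, first channels 1, 18 and 22, last channels 17, 21 and 32, and mid-points 9, 19.5 and 27.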

Examples

Sum up the counts in each group (note that the data has not been filtered so using get_dep with the filter argument set to True is generally preferred to using this method):

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> gcounts = pha.apply_grouping(pha.counts)

The grouping for an unfiltered PHA data set with 1024 channels is used to calculate the number of channels in each group, the lowest channel number in each group, the highest channel number in each group, and the mid-point between the two:

>>> pha.grouped
True
>>> pha.mask
True
>>> len(pha.channel)
1024
>>> import numpy as np
>>> dvals = np.arange(1, 1025)
>>> print(pha.apply_grouping(np.ones(1024)))
[ 17.   4.  11.  ...
>>> print(pha.apply_grouping(dvals, pha._min))
[  1.  18.  22.  ...
>>> print(pha.apply_grouping(dvals, pha._max))
[  17.   21.   32.   ...
>>> print(pha.apply_grouping(dvals, pha._middle))
[  9.   19.5  27.   ...

The grouped data is not filtered (unless ignore_bad has been used):

>>> pha.notice()
>>> v1 = pha.apply_grouping(dvals)
>>> pha.notice(1.2, 4.5)
>>> v2 = pha.apply_grouping(dvals)
>>> np.all(v1 == v2)
True

delete_background(id=None)

Remove the background component.

If the background component does not exist then the method does nothing.

Parameters:: id (int or str, optional) – The identifier of the background component. If it is None then the default background identifier is used.

See also

set_background

Notes

If this call removes the last of the background components then the subtracted flag is cleared (if set).

delete_response(id=None)

Remove the response component.

If the response component does not exist then the method does nothing.

Parameters:: id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

See also

set_response

eval_model(modelfunc): Evaluate the model on the independent axis.

eval_model_to_fit(modelfunc): Evaluate the model on the independent axis after filtering.

get_analysis()

Return the units used when fitting spectral data.

Returns:

setting – The analysis setting.

Return type:

{ ‘channel’, ‘energy’, ‘wavelength’ }

Raises:

sherpa.utils.err.ArgumentErr – If the data set does not contain PHA data.
sherpa.utils.err.IdentifierErr – If the id argument is not recognized.

See also

set_analysis

Examples

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.set_analysis("wave")
>>> is_wave = pha.get_analysis() == 'wavelength'

get_areascal(group=True, filter=False)

Return the fractional area factor of the PHA data set.

Return the AREASCAL setting [OGIP_92_007] for the PHA data set.

Parameters:

group (bool, optional) – Should the values be grouped to match the data?
filter (bool, optional) – Should the values be filtered to match the data?

Returns:

areascal – The AREASCAL value, which can be a scalar or a 1D array.

Return type:

number or ndarray

See also

get_backscal, get_background_scale

Notes

The fractional area scale is normally set to 1, with the ARF used to scale the model.

Examples

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.get_areascal()
1.0

get_arf(id=None)

Return the ARF from the response.

Parameters:: id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.
Returns:: arf – The ARF, if set.
Return type:: sherpa.astro.data.DataARF instance or None

See also

get_response, get_rmf, get_full_response

get_background(id=None)

Return the background component.

Parameters:: id (int or str, optional) – The identifier of the background component. If it is None then the default background identifier is used.
Returns:: bkg – The background dataset. If there is no component then None is returned.
Return type:: sherpa.astro.data.DataPHA instance or None

See also

delete_background, set_background

get_background_scale(bkg_id=1, units='counts', group=True, filter=False)

Return the correction factor for the background dataset.

Changed in version 4.12.2: The bkg_id, units, group, and filter parameters have been added and the routine no-longer calculates the average scaling for all the background components but just for the given component.

Parameters:

bkg_id (int or str, optional) – The background component to use (the default is 1).
units ({'counts', 'rate'}, optional) – The correction is applied to a model defined as counts, the default, or a rate. The latter should be used when calculating the correction factor for adding the background data to the source aperture.
group (bool, optional) – Should the values be grouped to match the data?
filter (bool, optional) – Should the values be filtered to match the data?

Returns:

scale – The scaling factor to correct the background data onto the source data set. If bkg_id is not valid then None is returned.

Return type:

None, number, or NumPy array

Notes

The correction factor when units is ‘counts’ is:

scale_exposure * scale_backscal * scale_areascal / nbkg

where nbkg is the number of background components and scale_x is the source value divided by the background value for the field x.

When units is ‘rate’ the correction is:

scale_backscal / nbkg

and it is currently uncertain whether it should include the AREASCAL scaling.
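The two formulas can be written out as a sketch for scalar values. The `background_scale` function and the dictionary layout are hypothetical, chosen only to mirror the expressions above; Sherpa's own routine also handles array values, grouping, and filtering.

```python
def background_scale(src, bkg, nbkg=1, units="counts"):
    """Evaluate the correction factor from the formulas above.

    src and bkg are dicts holding scalar 'exposure', 'backscal'
    and 'areascal' values (an assumed layout for this sketch).
    """
    scale_backscal = src["backscal"] / bkg["backscal"]
    if units == "rate":
        return scale_backscal / nbkg
    if units != "counts":
        raise ValueError("units must be 'counts' or 'rate'")
    scale_exposure = src["exposure"] / bkg["exposure"]
    scale_areascal = src["areascal"] / bkg["areascal"]
    return scale_exposure * scale_backscal * scale_areascal / nbkg


# Made-up source and background settings.
src = {"exposure": 1000.0, "backscal": 2.0e-6, "areascal": 1.0}
bkg = {"exposure": 2000.0, "backscal": 8.0e-6, "areascal": 1.0}

counts_scale = background_scale(src, bkg)
rate_scale = background_scale(src, bkg, units="rate")
```

Here the background has twice the exposure and four times the BACKSCAL of the source, so the counts correction is 0.5 * 0.25 = 0.125 and the rate correction is 0.25.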

get_backscal(group=True, filter=False)

Return the background scaling of the PHA data set.

Return the BACKSCAL setting [OGIP_92_007] for the PHA data set.

Parameters:

group (bool, optional) – Should the values be grouped to match the data?
filter (bool, optional) – Should the values be filtered to match the data?

Returns:

backscal – The BACKSCAL value, which can be a scalar or a 1D array.

Return type:

number or ndarray

See also

get_areascal, get_background_scale

Notes

The BACKSCAL value can be defined as the ratio of the area of the source (or background) extraction region in image pixels to the total number of image pixels. The fact that there is no ironclad definition for this quantity does not matter so long as the value for a source dataset and its associated background dataset are defined in the same manner, because only the ratio of source and background BACKSCAL values is used. It can be a scalar or an array.

Examples

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.get_backscal()
2.5264364698914e-06

get_bounding_mask()

get_dep(filter=False)

Return the dependent axis of a data set.

Parameters:: filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.
Returns:: axis – The dependent axis values for the data set. This gives the value of each point in the data set.
Return type:: array

See also

get_indep: Return the independent axis of a data set.
get_error: Return the errors on the dependent axis of a data set.
get_staterror: Return the statistical errors on the dependent axis of a data set.
get_syserror: Return the systematic errors on the dependent axis of a data set.

get_dims(filter=False)

Return the dimensions of this data space as a tuple of tuples. The first element in the tuple is a tuple with the dimensions of the data space, while the second element provides the size of the dependent array.

Return type:: tuple

get_error(filter=False, staterrfunc=None)

Return the total error on the dependent variable.

Parameters:

filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.
staterrfunc (function) – If no statistical error has been set, the errors will be calculated by applying this function to the dependent axis of the data set.

Returns:

axis – The error for each data point, formed by adding the statistical and systematic errors in quadrature.

Return type:

array or None

See also

get_dep: Return the dependent axis of a data set.
get_staterror: Return the statistical errors on the dependent axis of a data set.
get_syserror: Return the systematic errors on the dependent axis of a data set.
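Adding the errors in quadrature means taking the square root of the sum of the squared statistical and systematic terms for each point. A minimal sketch on plain lists, using a hypothetical `total_error` helper:

```python
import math


def total_error(staterror, syserror=None):
    """Combine per-point errors in quadrature (illustrative only)."""
    if syserror is None:
        # With no systematic term the total is the statistical error.
        return list(staterror)
    return [math.sqrt(s * s + t * t) for s, t in zip(staterror, syserror)]


stat = [3.0, 6.0]
sys_err = [4.0, 8.0]

combined = total_error(stat, sys_err)
stat_only = total_error(stat)
```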

get_evaluation_indep(filter=False, model=None, use_evaluation_space=False)

get_filter(group=True, format='%.12f', delim=':')

Return the data filter as a string.

The filter expression depends on the analysis setting.

Changed in version 4.14.0: Prior to 4.14.0 the filter used the mid-point of the bin, not its low or high value.

Parameters:

group (bool, optional) – Should the filter reflect the grouped data?
format (str, optional) – The formatting of the numeric values (this is ignored for channel units, which uses format="%i").
delim (str, optional) – The string used to mark the low-to-high range.

Returns:

expr – The noticed channel range as a string of comma-separated ranges, where the low and high values are separated by the delim string. The units of the ranges are controlled by the analysis setting. If all bins have been filtered out then “No noticed bins” is returned.

Return type:

str

Examples

For a Chandra non-grating dataset which has been grouped:

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.set_analysis('energy')
>>> pha.notice(0.5, 7)
>>> pha.get_filter(format='%.4f')
'0.4672:9.8696'
>>> pha.set_analysis('channel')
>>> pha.get_filter()
'33:676'

The filter expression shows the first selected channel to the last one, and so is independent of whether the data is grouped or not:

>>> pha.set_analysis('energy')
>>> pha.get_filter(format='%.4f')
'0.4672:9.8696'
>>> pha.get_filter(group=False, format='%.4f')
'0.4672:9.8696'

Although the group argument does not change the output of get_filter, the selected range does depend on whether the data is grouped or not (unless the groups align with the filter edges):

>>> pha.ungroup()
>>> pha.notice()
>>> pha.notice(0.5, 7)
>>> pha.get_filter(format='%.3f')
'0.496:7.008'
>>> pha.group()
>>> pha.get_filter(format='%.3f')
'0.467:9.870'

>>> pha.notice()
>>> pha.notice(0.5, 6)
>>> pha.ignore(2.1, 2.2)
>>> pha.get_filter(format='%.2f', delim='-')
'0.47-2.09,2.28-6.57'
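The shape of the returned string can be mimicked with a short sketch. It is not the Sherpa code: `filter_expr` is a hypothetical helper that turns a boolean mask and made-up bin edges into comma-separated ranges, honouring format and delim arguments in the way described above.

```python
def filter_expr(mask, lo_edges, hi_edges, fmt="%.12f", delim=":"):
    """Describe each run of True values in mask as a lo/hi range."""
    ranges = []
    start = None
    for idx, keep in enumerate(mask):
        if keep and start is None:
            start = idx                      # a new run begins
        if start is not None and (not keep or idx == len(mask) - 1):
            end = idx if keep else idx - 1   # last bin of the run
            ranges.append((lo_edges[start], hi_edges[end]))
            start = None
    if not ranges:
        return "No noticed bins"
    return ",".join(fmt % lo + delim + fmt % hi for lo, hi in ranges)


# Made-up grid of five bins, each 0.5 wide.
lo_edges = [0.0, 0.5, 1.0, 1.5, 2.0]
hi_edges = [0.5, 1.0, 1.5, 2.0, 2.5]
mask = [False, True, True, False, True]

expr = filter_expr(mask, lo_edges, hi_edges, fmt="%.2f", delim="-")
empty = filter_expr([False] * 5, lo_edges, hi_edges)
```

Each range runs from the low edge of the first selected bin to the high edge of the last, which is why an ignored interval shows up as a gap between two ranges.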

get_filter_expr()

Return the data filter as a string along with the units.

This is a specialised version of get_filter which adds the axis units.

Returns:: filter – The filter, represented as a collection of single values or ranges, separated by commas.
Return type:: str

See also

get_filter

Examples

>>> d = Data1D('example', [1., 2., 3., 5., 6., 7.], [0, .4, .5, .6, .7, .8])
>>> d.notice(1., 6.)
>>> d.ignore(2.5, 4.)
>>> d.get_filter_expr()
'1.0000-2.0000,5.0000-6.0000 x'

Note that the expression lists the valid data points. While we ignore only the range 2.5-4.0, there is no data point between 4. and 5., so the second part of the valid range is 5.0 to 6.0.

get_full_response(pileup_model=None)

Calculate the response for the dataset.

Unlike get_response, which returns a single response, this function returns all responses for datasets that have multiple responses set and it offers the possibility to include a pile-up model.

Parameters:: pileup_model (None or a sherpa.astro.models.JDPileup instance) – The pile-up model to include in the returned response, if any.
Returns:: The return value depends on whether an ARF, RMF, or pile up model has been associated with the data set.
Return type:: response

See also

get_response, get_arf, get_rmf

get_img(yfunc=None)

Return 1D dependent variable as a 1 x N image.

Parameters:: yfunc

get_imgerr()

get_indep(filter=True)

Return the independent axes of a data set.

Parameters:: filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is True.
Returns:: axis – The independent axis values for the data set. This gives the coordinates of each point in the data set.
Return type:: tuple of arrays

See also

get_dep: Return the dependent axis of a data set.

get_mask()

Returns the (ungrouped) mask.

Returns:: mask – The mask, in channels, or None.
Return type:: ndarray or None

get_noticed_channels()

Return the noticed channels.

Returns:: channels – The noticed channels (this is independent of the analysis setting).
Return type:: ndarray

get_noticed_expr()

Returns the current set of noticed channels.

The values returned are always in channels, no matter the current analysis setting.

Returns:: expr – The noticed channel range as a string of comma-separated “low-high” values. As these are channel filters the low and high values are inclusive. If all channels have been filtered out then “No noticed channels” is returned.
Return type:: str

See also

get_filter, get_noticed_channels

get_response(id=None)

Return the response component.

Parameters:: id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.
Returns:: arf, rmf – The response, as an ARF and RMF. Either, or both, components can be None.
Return type:: sherpa.astro.data.DataARF,sherpa.astro.data.DataRMF instances or None

get_rmf(id=None)

Return the RMF from the response.

Parameters:: id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.
Returns:: rmf – The RMF, if set.
Return type:: sherpa.astro.data.DataRMF instance or None

See also

get_arf, get_response, get_full_response

get_specresp(filter=False)

Return the effective area values for the data set.

Parameters:: filter (bool, optional) – Should the filter attached to the data set be applied to the ARF or not. The default is False.
Returns:: arf – The effective area values for the data set (or background component) if set.
Return type:: array or None

Notes

This will return None when an RSP file (a combined ARF and RMF) is used, rather than separate responses. The relationship between RSP, ARF, and RMF is described in OGIP Calibration Memo CAL/GEN/92-002 and OGIP Calibration Memo CAL/GEN/92-002a.

get_staterror(filter=False, staterrfunc=None)

Return the statistical error.

The staterror column is used if defined, otherwise the function provided by the staterrfunc argument is used to calculate the values.

Parameters:

filter (bool, optional) – Should the channel filter be applied to the return values?
staterrfunc (function reference, optional) – The function to use to calculate the errors if the staterror field is None. The function takes one argument, the counts (after grouping and filtering), and returns an array of values which represents the one-sigma error for each element of the input array. This argument is designed to work with implementations of the sherpa.stats.Stat.calc_staterror method.

Returns:

staterror – The statistical error. It will be grouped and, if filter=True, filtered. The contribution from any associated background components will be included if the background-subtraction flag is set.

Return type:

array or None

Notes

There is no scaling by the AREASCAL setting, but background values are scaled by their AREASCAL settings. It is not at all obvious that the current code is doing the right thing, or that this is the right approach.

Examples

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi', use_errors=True)
>>> dy = pha.get_staterror()

Ensure that there is no pre-defined statistical-error column and then use the Chi2DataVar statistic to calculate the errors:

>>> from sherpa.stats import Chi2DataVar
>>> stat = Chi2DataVar()
>>> pha.staterror = None
>>> dy = pha.get_staterror(staterrfunc=stat.calc_staterror)
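As an illustration of the staterrfunc contract described above, the following sketch defines a stand-in error function that takes the counts and returns a one-sigma error for each element. The name sqrt_counts_error is hypothetical, and the square-root estimate is an assumption chosen for simplicity; it is not claimed to match what any particular Sherpa statistic computes.

```python
import math


def sqrt_counts_error(counts):
    """Hypothetical staterrfunc: return a one-sigma error per element.

    Assumes a simple sqrt(N) estimate; negative values are clipped to 0
    before taking the square root.
    """
    return [math.sqrt(max(c, 0)) for c in counts]


errors = sqrt_counts_error([0, 4, 9, 25])
```

In practice such a function would receive and return NumPy arrays rather than lists.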

get_syserror(filter=False)

Return any systematic error.

Parameters:: filter (bool, optional) – Should the channel filter be applied to the return values?
Returns:: syserror – The systematic error, if set. It will be grouped and, if filter=True, filtered.
Return type:: array or None

Notes

There is no scaling by the AREASCAL setting.

get_x(filter=False, response_id=None)

get_xerr(filter=False, response_id=None)

Returns an X “error”.

The error value for the independent axis is not well defined in Sherpa.

Changed in version 4.16.1: The return value is now half the bin width instead of the full bin width and is now calculated correctly when the analysis is set to “wavelength”.

Parameters:

filter (bool, optional) – Should the values be filtered to the current notice range?
response_id (int or None, optional) – What response should be used?

Returns:

xerr – The half-width of each bin (or group) in the current analysis units.

Return type:

ndarray
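The half-width calculation can be sketched as follows, assuming the low and high edges of each bin (or group) are already available in the current analysis units. The helper name half_widths and the edge values are illustrative only.

```python
def half_widths(bin_lo, bin_hi):
    """Half the width of each bin, given matching low and high edges."""
    return [abs(hi - lo) / 2 for lo, hi in zip(bin_lo, bin_hi)]


# Three energy bins (keV) with edges chosen for illustration.
xerr = half_widths([0.5, 1.0, 2.0], [1.0, 2.0, 4.0])
```

The absolute value keeps the result positive when the edges are given in descending order, as can happen with wavelength grids.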

get_xlabel()

Return label for linear view of independent axis/axes.

Returns:: label
Return type:: str

get_y(filter=False, yfunc=None, response_id=None, use_evaluation_space=False)

Return dependent axis in N-D view of dependent variable.

Parameters:

filter
yfunc
use_evaluation_space

Returns:

y

Return type:

array or None

get_yerr(filter=False, staterrfunc=None, response_id=None)

Return errors in dependent axis in N-D view of dependent variable.

Parameters:

filter
staterrfunc

get_ylabel()

Return label for dependent axis in N-D view of dependent variable.

Parameters:: yfunc

group()

Group the data according to the data set’s grouping scheme.

This sets the grouping flag which means that the value of the grouping attribute will be used when accessing data values. This can be called even if the grouping attribute is empty.

Changed in version 4.15.1: The grouping status of any background component is now also changed.
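To make the effect of the grouping flag concrete, here is a minimal sketch of how a grouping array can be applied to per-channel counts. It assumes the OGIP convention in which 1 marks the first channel of a group and -1 marks a continuation; the helper apply_grouping is hypothetical and is not the routine Sherpa uses internally.

```python
def apply_grouping(counts, grouping):
    """Sum counts per group; 1 starts a group, -1 continues it."""
    grouped = []
    for value, flag in zip(counts, grouping):
        if flag == 1 or not grouped:
            # Start a new group with this channel.
            grouped.append(value)
        else:
            # Continuation: add to the current group.
            grouped[-1] += value
    return grouped


counts = [3, 1, 4, 1, 5, 9]
grouping = [1, -1, -1, 1, -1, 1]
grouped_counts = apply_grouping(counts, grouping)
```

When the grouping flag is unset (see ungroup) the per-channel values are used directly instead.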

See also

ungroup

group_adapt(minimum, maxLength=None, tabStops=None)

Adaptively group to a minimum number of counts.

Combine the data so that each bin contains minimum or more counts. The difference from group_counts is that this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features, rather than at the first channel of the data. The adaptive nature means that low-count regions between bright features may not end up in groups with the minimum number of counts. The binning scheme is, by default, applied to only the noticed data range. It is suggested that filtering is done before calling group_adapt.

Changed in version 4.16.0: Grouping now defaults to only using the noticed channel range.

Parameters:

minimum (int) – The minimum number of counts that each group should contain.
maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
tabStops (array of int or bool, optional) – If not set then it will be based on the filtering of the data set, so that the grouping only uses the filtered data. If set, it should be an array of booleans where True indicates that the channel should not be used in the grouping (this array must match the number of channels in the data set).

See also

group_adapt_snr: Adaptively group to a minimum signal-to-noise ratio.
group_bins: Group into a fixed number of bins.
group_counts: Group into a minimum number of counts per bin.
group_snr: Group into a minimum signal-to-noise ratio.
group_width: Group into a fixed bin width.

Notes

If channels cannot be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

group_adapt_snr(minimum, maxLength=None, tabStops=None, errorCol=None)

Adaptively group to a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds minimum. The difference from group_snr is that this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features, rather than at the first channel of the data. The adaptive nature means that low-count regions between bright features may not end up in groups with the minimum signal-to-noise ratio. The binning scheme is, by default, applied to only the noticed data range. It is suggested that filtering is done before calling group_adapt_snr.

Changed in version 4.16.0: Grouping now defaults to only using the noticed channel range.

Parameters:

minimum (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.
maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
tabStops (array of int or bool, optional) – If not set then it will be based on the filtering of the data set, so that the grouping only uses the filtered data. If set, it should be an array of booleans where True indicates that the channel should not be used in the grouping (this array must match the number of channels in the data set).
errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt: Adaptively group to a minimum number of counts.
group_bins: Group into a fixed number of bins.
group_counts: Group into a minimum number of counts per bin.
group_snr: Group into a minimum signal-to-noise ratio.
group_width: Group into a fixed bin width.

Notes

If channels cannot be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

group_bins(num, tabStops=None)

Group into a fixed number of bins.

Combine the data so that there are num equal-width bins (or groups). The binning scheme is, by default, applied to only the noticed data range. It is suggested that filtering is done before calling group_bins.

Changed in version 4.16.0: Grouping now defaults to only using the noticed channel range.

Parameters:

num (int) – The number of bins in the grouped data set. Each bin will contain the same number of channels.
tabStops (array of int or bool, optional) – If not set then it will be based on the filtering of the data set, so that the grouping only uses the filtered data. If set, it should be an array of booleans where True indicates that the channel should not be used in the grouping (this array must match the number of channels in the data set).

See also

group_adapt: Adaptively group to a minimum number of counts.
group_adapt_snr: Adaptively group to a minimum signal-to-noise ratio.
group_counts: Group into a minimum number of counts per bin.
group_snr: Group into a minimum signal-to-noise ratio.
group_width: Group into a fixed bin width.

Notes

Since the bin width is an integer number of channels, it is likely that some channels will be “left over”. This is even more likely when the tabStops parameter is set. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
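The “left over” arithmetic can be sketched as below. This assumes the bin width is the integer quotient of the channel count and num; the helper bins_layout is hypothetical and the grouping library may place or handle the remainder differently.

```python
def bins_layout(n_channels, num):
    """Return (channels per bin, number of left-over channels)."""
    if num < 1 or num > n_channels:
        raise ValueError("num must be between 1 and n_channels")
    width = n_channels // num           # integer bin width (assumed)
    leftover = n_channels - width * num
    return width, leftover


layout = bins_layout(1024, 10)
```

With 1024 channels, asking for 8 bins leaves nothing over, whereas asking for 10 bins cannot divide the channels evenly.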

group_counts(num, maxLength=None, tabStops=None)

Group into a minimum number of counts per bin.

Combine the data so that each bin contains num or more counts. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set. The binning scheme is, by default, applied to only the noticed data range. It is suggested that filtering is done before calling group_counts.

Changed in version 4.16.0: Grouping now defaults to only using the noticed channel range.

Parameters:

num (int) – The minimum number of counts that each group should contain.
maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
tabStops (array of int or bool, optional) – If not set then it will be based on the filtering of the data set, so that the grouping only uses the filtered data. If set, it should be an array of booleans where True indicates that the channel should not be used in the grouping (this array must match the number of channels in the data set).

See also

group_adapt: Adaptively group to a minimum number of counts.
group_adapt_snr: Adaptively group to a minimum signal-to-noise ratio.
group_bins: Group into a fixed number of bins.
group_snr: Group into a minimum signal-to-noise ratio.
group_width: Group into a fixed bin width.

Notes

If channels cannot be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

Examples

Group by 20 counts within the range 0.5 to 7 keV (this is the default behavior for 4.16 and later):

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.set_analysis("energy")
>>> pha.notice()
>>> pha.notice(0.5, 7)
>>> pha.group_counts(20)

Group by 20 counts over the whole channel range, and then filter to the noticed range of 0.5 to 7 keV (this was the default behavior before 4.16):

>>> pha.group_counts(20, tabStops=[0] * pha.size)
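For readers who want to see the idea behind grouping by counts, the following is a simplified sketch of a left-to-right scheme that closes a group once it holds at least num counts. It assumes the OGIP grouping convention (1 starts a group, -1 continues it), ignores maxLength and tabStops, and uses a hypothetical helper name; it is not the implementation used by Sherpa.

```python
def group_min_counts(counts, num):
    """Greedy left-to-right grouping to at least `num` counts per group."""
    grouping = []
    quality = [0] * len(counts)
    total = 0
    start = 0  # index of the first channel in the current group
    for i, c in enumerate(counts):
        grouping.append(1 if i == start else -1)
        total += c
        if total >= num:
            # The group is complete; the next channel starts a new one.
            total = 0
            start = i + 1
    # Channels in a trailing, incomplete group are flagged with quality 2.
    for i in range(start, len(counts)):
        quality[i] = 2
    return grouping, quality


grouping, quality = group_min_counts([5, 10, 8, 20, 3, 4], 20)
```

Here the last two channels only sum to 7 counts, so they cannot form a valid group and are marked with a quality of 2.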

group_snr(snr, maxLength=None, tabStops=None, errorCol=None)

Group into a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds snr. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set. The binning scheme is, by default, applied to only the noticed data range. It is suggested that filtering is done before calling group_snr.

Changed in version 4.16.0: Grouping now defaults to only using the noticed channel range.

Parameters:

snr (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.
maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
tabStops (array of int or bool, optional) – If not set then it will be based on the filtering of the data set, so that the grouping only uses the filtered data. If set, it should be an array of booleans where True indicates that the channel should not be used in the grouping (this array must match the number of channels in the data set).
errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt: Adaptively group to a minimum number of counts.
group_adapt_snr: Adaptively group to a minimum signal-to-noise ratio.
group_bins: Group into a fixed number of bins.
group_counts: Group into a minimum number of counts per bin.
group_width: Group into a fixed bin width.

Notes

If channels cannot be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
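The quantity being thresholded can be sketched as follows. This assumes the signal of a group is the sum of its counts and the noise is the per-channel errors added in quadrature, with the variance equal to the counts when no errors are supplied (Poisson statistics). The helper group_snr_value is hypothetical and is only meant to show the role of errorCol.

```python
import math


def group_snr_value(counts, errors=None):
    """Signal-to-noise ratio of a group of channels (simplified)."""
    signal = sum(counts)
    if errors is None:
        # Poisson assumption: the variance equals the summed counts.
        variance = signal
    else:
        variance = sum(e * e for e in errors)
    if variance <= 0:
        return 0.0
    return signal / math.sqrt(variance)


poisson_snr = group_snr_value([9, 7])         # 16 / sqrt(16)
custom_snr = group_snr_value([9, 7], [3, 4])  # 16 / sqrt(9 + 16)
```

Under the Poisson assumption a group's ratio is the square root of its total counts, so a threshold of snr corresponds to more than snr squared counts.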

group_width(val, tabStops=None)

Group into a fixed bin width.

Combine the data so that each bin contains val channels. The binning scheme is, by default, applied to only the noticed data range. It is suggested that filtering is done before calling group_width.

Changed in version 4.16.0: Grouping now defaults to only using the noticed channel range.

Parameters:

val (int) – The number of channels to combine into a group.
tabStops (array of int or bool, optional) – If not set then it will be based on the filtering of the data set, so that the grouping only uses the filtered data. If set, it should be an array of booleans where True indicates that the channel should not be used in the grouping (this array must match the number of channels in the data set).

See also

group_adapt: Adaptively group to a minimum number of counts.
group_adapt_snr: Adaptively group to a minimum signal-to-noise ratio.
group_bins: Group into a fixed number of bins.
group_counts: Group into a minimum number of counts per bin.
group_snr: Group into a minimum signal-to-noise ratio.

Notes

Unless the requested bin width is a factor of the number of channels (and no tabStops parameter is given), some channels will be “left over”. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
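A fixed-width scheme with left-over channels might look like the sketch below. It assumes the OGIP grouping convention (1 starts a group, -1 continues it) and that the incomplete group falls at the end of the channel range; the helper group_by_width is hypothetical and tabStops is not handled.

```python
def group_by_width(n_channels, val):
    """Build grouping and quality arrays for groups of `val` channels."""
    grouping, quality = [], []
    n_full = (n_channels // val) * val  # channels covered by full groups
    for i in range(n_channels):
        grouping.append(1 if i % val == 0 else -1)
        # Channels beyond the last full group are flagged with quality 2.
        quality.append(0 if i < n_full else 2)
    return grouping, quality


grouping, quality = group_by_width(10, 4)
```

Ten channels with a width of 4 give two complete groups and two left-over channels.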

ignore(*args, **kwargs)

ignore_bad()

Exclude channels marked as bad.

Ignore any bin in the PHA data set which has a quality value that is not equal to zero.

Raises:: sherpa.utils.err.DataErr – If the data set has no quality array.

See also

ignore, notice

Notes

Bins with a non-zero quality setting are not automatically excluded when a data set is created.

If the data set has been grouped, then calling ignore_bad will remove any filter applied to the data set. If this happens a warning message will be displayed.
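The selection rule is simple enough to sketch directly: a channel is kept only when its quality value is zero. The helper good_channel_mask is hypothetical and works on a plain list for illustration.

```python
def good_channel_mask(quality):
    """True for channels whose quality flag is zero (not marked bad)."""
    return [q == 0 for q in quality]


mask = good_channel_mask([0, 0, 2, 0, 5])
```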

notice(lo=None, hi=None, ignore=False, bkg_id=None)

Notice or ignore the given range.

Changed in version 4.14.0: PHA filtering has been improved to fix a number of corner cases, which can result in the same filter now selecting one or two fewer channels than was done in earlier versions of Sherpa. The lo and hi arguments are now restricted based on the units setting.

Parameters:

lo (number or None, optional) – The lower limit of the range to change. A value of None means the minimum permitted value. The units of lo and hi are set by the units field.
hi (number or None, optional) – The upper limit of the range to change. A value of None means the maximum permitted value. The units of lo and hi are set by the units field.
ignore (bool, optional) – Set to True if the range should be ignored. The default is to notice the range.
bkg_id (int or str, or sequence of int or str, optional) – If not None then apply the filter to the given background dataset or datasets, otherwise change the object and all its background datasets.

Notes

Calling notice with no arguments selects all points in the dataset (or, if ignore=True, it will remove all points).

If no channels have been ignored then a call to notice with ignore=False will select just the lo to hi range, and exclude any channels outside this range. If there has been a filter applied then the range lo to hi will be added to the range of noticed data (when ignore=False). One consequence of the above is that if the first call to notice (with ignore=False) selects a range outside the data set - such as a channel range of 2000 to 3000 when the valid range is 1 to 1024 - then all points will be ignored.
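The way successive calls combine can be sketched with a plain boolean mask. This is a simplified model in channel units: the function below is a stand-in for the method, the value None is used (as an assumption) to represent “no filter applied”, and background handling is omitted.

```python
def notice(mask, channels, lo=None, hi=None, ignore=False):
    """Return the updated mask; mask=None means no filter is applied."""
    if lo is None and hi is None:
        # No range: select everything, or nothing when ignoring.
        return None if not ignore else [False] * len(channels)
    lo = min(channels) if lo is None else lo
    hi = max(channels) if hi is None else hi
    in_range = [lo <= c <= hi for c in channels]
    if ignore:
        current = [True] * len(channels) if mask is None else mask
        return [m and not r for m, r in zip(current, in_range)]
    if mask is None:
        # First filter: keep only the requested range.
        return in_range
    # Later filters: add the requested range to what is noticed.
    return [m or r for m, r in zip(mask, in_range)]


channels = list(range(1, 11))
mask = notice(None, channels, 2, 4)
mask = notice(mask, channels, 7, 8)
selected = [c for c, m in zip(channels, mask) if m]
```

In this model a first call with a range that lies entirely outside the data produces a mask with no channels selected, matching the consequence described above.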

When filtering with channel units then:

the lo and hi arguments, if set, must be integers,
and the lo and hi values are inclusive.

For energy and wavelength filters:

the lo and hi arguments, if set, must be >= 0,
and the lo limit is inclusive but the hi limit is exclusive.
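The difference between the two sets of rules can be sketched as follows. The energy or wavelength case assumes that a bin is selected when any part of it overlaps the half-open range from lo (inclusive) to hi (exclusive); that overlap rule, and both helper names, are assumptions made for illustration.

```python
def channels_in_range(channels, lo, hi):
    """Channel units: both limits are inclusive."""
    return [c for c in channels if lo <= c <= hi]


def bins_in_range(edges_lo, edges_hi, lo, hi):
    """Energy or wavelength units: indices of bins overlapping [lo, hi)."""
    return [i for i, (a, b) in enumerate(zip(edges_lo, edges_hi))
            if a < hi and b > lo]


by_channel = channels_in_range([1, 2, 3, 4, 5], 2, 4)

edges_lo = [0.0, 1.0, 2.0, 3.0]
edges_hi = [1.0, 2.0, 3.0, 4.0]
on_edges = bins_in_range(edges_lo, edges_hi, 1.0, 3.0)
inside_bins = bins_in_range(edges_lo, edges_hi, 1.5, 2.5)
```

The last call selects the same two bins as the one before it, which is why the reported filter can be wider than the requested range.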

Examples

So, for an ungrouped PHA file with 1024 channels:

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.ungroup()
>>> pha.units = 'channel'
>>> pha.get_filter()
'1:1024'
>>> pha.notice(20, 200)
>>> pha.get_filter()
'20:200'
>>> pha.notice(300, 500)
>>> pha.get_filter()
'20:200,300:500'

Calling notice with no arguments removes all the filters:

>>> pha.notice()
>>> pha.get_filter()
'1:1024'

Ignore the first 30 channels (this is the same as calling pha.ignore(hi=30)):

>>> pha.notice(hi=30, ignore=True)
>>> pha.get_filter()
'31:1024'

When using wavelength or energy units the noticed (or ignored) range will not always match the requested range because each channel has a finite width in these spaces:

>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.grouped
True
>>> pha.get_analysis()
'energy'
>>> pha.notice()
>>> pha.notice(0.5, 7)
>>> pha.get_filter(format='%.3f')
'0.467:9.870'

notice_response(notice_resp=True, noticed_chans=None)

set_analysis(quantity, type='rate', factor=0)

Set the units used when fitting and plotting spectral data.

Parameters:

quantity ({'channel', 'energy', 'wavelength'}) – The analysis setting.
type ({'rate', 'counts'}, optional) – Do plots display a rate or show counts?
factor (int, optional) – The Y axis of plots is multiplied by Energy^factor or Wavelength^factor before display. The default is 0.

Raises:

sherpa.utils.err.DataErr – If the type argument is invalid, the RMF or ARF has the wrong size, or there is no response.

See also

get_analysis

Examples

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> pha.set_analysis('energy')

>>> pha.set_analysis('wave', type='counts', factor=1)
>>> pha.units
'wavelength'
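The effect of the factor argument on plotted values can be sketched as below. The helper scale_for_plot is hypothetical, and the choice of x value for each bin (for instance its mid-point) is left open here since it is not stated above.

```python
def scale_for_plot(x, y, factor=0):
    """Multiply each y value by x**factor (x in energy or wavelength units)."""
    return [yi * xi ** factor for xi, yi in zip(x, y)]


energies = [1.0, 2.0, 4.0]
rates = [8.0, 4.0, 1.0]
unscaled = scale_for_plot(energies, rates)          # factor=0: y is unchanged
scaled = scale_for_plot(energies, rates, factor=2)  # y * E^2
```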

set_arf(arf, id=None)

Add or replace the ARF in a response component.

This replaces the existing ARF of the response, keeping the previous RMF (if set). Use the delete_response method to remove the response, rather than setting arf to None.

Parameters:

arf (sherpa.astro.data.DataARF instance) – The ARF to add.
id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

set_background(bkg, id=None)

Add or replace a background component.

If the background has no grouping or quality arrays then they are copied from the source region. If the background has no response information (ARF or RMF) then the response is copied from the source region.

Parameters:

bkg (sherpa.astro.data.DataPHA instance) – The background dataset to add. This object may be changed by this method.
id (int or str, optional) – The identifier of the background component. If it is None then the default background identifier is used.

See also

delete_background, get_background

Notes

If the PHA header does not have the TELESCOP, INSTRUME, or FILTER header keywords set (or they are set to “none”), then they are taken from the background, if they are not set to “none”. This is to allow simulated data sets to be used with external programs, such as XSPEC.

set_dep(val)

Set the dependent variable values.

Parameters:: val (sequence or number) – If a number then it is used for each element.

set_indep(val)

set_response(arf=None, rmf=None, id=None)

Add or replace a response component.

To remove a response use delete_response(), as setting arf and rmf to None here does nothing.

Parameters:

arf (sherpa.astro.data.DataARF instance or None, optional) – The ARF to add if any.
rmf (sherpa.astro.data.DataRMF instance or None, optional) – The RMF to add, if any.
id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

Notes

If the PHA header does not have the TELESCOP, INSTRUME, or FILTER header keywords set (or they are set to “none”), then they are taken from the ARF or RMF, if they are not set to “none”. This is to allow simulated data sets to be used with external programs, such as XSPEC.
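The keyword hand-over described in this note can be sketched with plain dictionaries. The helper names are hypothetical, and treating “none” case-insensitively is an assumption of this sketch.

```python
KEYS = ("TELESCOP", "INSTRUME", "FILTER")


def is_unset(value):
    """A keyword counts as unset when missing or set to 'none'."""
    return value is None or str(value).strip().lower() == "none"


def fill_header(pha_header, response_header):
    """Copy unset instrument keywords from the response header."""
    merged = dict(pha_header)
    for key in KEYS:
        if is_unset(merged.get(key)) and not is_unset(response_header.get(key)):
            merged[key] = response_header[key]
    return merged


header = fill_header(
    {"TELESCOP": "none", "INSTRUME": "ACIS"},
    {"TELESCOP": "CHANDRA", "INSTRUME": "HRC", "FILTER": "none"},
)
```

Only TELESCOP is copied in this example: INSTRUME is already set in the PHA header, and FILTER is “none” in the response.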

set_rmf(rmf, id=None)

Add or replace the RMF in a response component.

This replaces the existing RMF of the response, keeping the previous ARF (if set). Use the delete_response method to remove the response, rather than setting rmf to None.

Parameters:

rmf (sherpa.astro.data.DataRMF instance) – The RMF to add.
id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

subtract()

Subtract the background data.

See also

unsubtract

sum_background_data(get_bdata_func=<function DataPHA.<lambda>>)

Sum up data, applying the background correction value.

Parameters:: get_bdata_func (function, optional) – What data should be used for each background dataset. The function takes the background identifier and background DataPHA object and returns the data to use. The default is to use the counts array of the background dataset.
Returns:: value – The sum of the data, including any area, background, and exposure-time corrections.
Return type:: scalar or NumPy array

Notes

For each associated background, the data is retrieved (via the get_bdata_func parameter), and then

divided by its BACKSCAL value (if set)

divided by its AREASCAL value (if set)

divided by its exposure time (if set)

The individual background components are then summed together, and then multiplied by the source BACKSCAL (if set), multiplied by the source AREASCAL (if set), and multiplied by the source exposure time (if set). The final step is to divide by the number of background files used.

Examples

Calculate the background counts, per channel, scaled to match the source:

>>> from sherpa.astro.io import read_pha
>>> pha = read_pha(data_3c273 + '3c273.pi')
>>> bcounts = pha.sum_background_data()

Calculate the scaling factor by which the background data must be multiplied to match the source data. In this case the background data has been replaced by the value 1 (rather than the per-channel values used with the default argument):

>>> bscale = pha.sum_background_data(lambda k, d: 1)
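The scaling steps listed in the Notes can be written out as a small stand-alone sketch. It models the source and each background as a dictionary with optional backscal, areascal, and exposure entries and works on scalar values only; the function sum_background and the numbers used are illustrative, not Sherpa's code.

```python
def sum_background(source, backgrounds, get_data=lambda bkg: bkg["counts"]):
    """Scale and average background data to match the source (scalar case)."""

    def factor(d):
        # Product of whichever of the three scaling terms are set.
        scale = 1.0
        for key in ("backscal", "areascal", "exposure"):
            if d.get(key) is not None:
                scale *= d[key]
        return scale

    total = 0.0
    for bkg in backgrounds:
        # Divide each background by its own scaling terms.
        total += get_data(bkg) / factor(bkg)
    # Multiply by the source terms and average over the backgrounds.
    return total * factor(source) / len(backgrounds)


source = {"backscal": 0.25, "exposure": 1000.0}
backgrounds = [{"counts": 50.0, "backscal": 0.5, "exposure": 2000.0}]

scaled_counts = sum_background(source, backgrounds)
scale = sum_background(source, backgrounds, get_data=lambda bkg: 1)
```

Replacing the data by 1, as in the second call, returns just the scaling factor, mirroring the lambda example above.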

to_component_plot(yfunc=None, staterrfunc=None)

to_fit(staterrfunc=None)

to_guess()

to_plot(yfunc=None, staterrfunc=None, response_id=None)

ungroup()

Remove any data grouping.

This un-sets the grouping flag which means that the grouping attribute will not be used when accessing data values.

Changed in version 4.15.1: The grouping status of any background component is now also changed.

See also

group

unsubtract()

Remove background subtraction.

See also

subtract