DataPHA

class sherpa.astro.data.DataPHA(name, channel, counts, staterror=None, syserror=None, bin_lo=None, bin_hi=None, grouping=None, quality=None, exposure=None, backscal=None, areascal=None, header=None)[source]

Bases: sherpa.data.Data1DInt

PHA data set, including any associated instrument and background data.

The PHA format is described in an OGIP document [1].

Parameters:
  • name (str) – The name of the data set; often set to the name of the file containing the data.
  • channel, counts – The PHA data.
  • staterror, syserror – The statistical and systematic errors for the data, if defined.
  • bin_lo, bin_hi –
  • grouping (array of int or None, optional) –
  • quality (array of int or None, optional) –
  • exposure (number or None, optional) – The exposure time for the PHA data set, in seconds.
  • backscal (scalar or array or None, optional) –
  • areascal (scalar or array or None, optional) –
  • header (dict or None, optional) –
name

Used to store the file name, for data read from a file.

Type: str
channel
counts
staterror
syserror
bin_lo
bin_hi
grouping
quality
exposure
backscal
areascal

Notes

The original data is stored in the attributes - e.g. counts - and the data-access methods, such as get_dep and get_staterror, provide any necessary data manipulation to handle cases such as: background subtraction, filtering, and grouping.

The handling of the AREASCAL value, whether it is a scalar or an array, is currently in flux. It is a value that is stored with the PHA file, and the OGIP PHA standard ([1]) describes the observed counts being divided by the area scaling before comparison to the model. However, this is not valid for Poisson-based statistics, and is also not how XSPEC handles AREASCAL ([2]); there the AREASCAL values are used to scale the exposure times instead. The aim is to add this logic to the instrument models in sherpa.astro.instrument, such as sherpa.astro.instrument.RMFModelPHA. The area scaling still has to be applied when calculating the background contribution to a spectrum, as well as when calculating the data and model values used for plots (following XSPEC so as to avoid sharp discontinuities where the area-scaling factor changes strongly).
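The difference between the two AREASCAL conventions can be illustrated with a small sketch. The function names and numbers below are invented for illustration and are not part of Sherpa; the sketch only shows that folding the area scaling into the exposure leaves the observed counts as integers, which Poisson-based statistics need.

```python
def scale_counts(counts, areascal):
    """OGIP-style view: divide the observed counts by the area scaling."""
    return [c / a for c, a in zip(counts, areascal)]


def predicted_counts(model_rate, exposure, areascal):
    """XSPEC-style view: fold the area scaling into the effective exposure."""
    return [r * exposure * a for r, a in zip(model_rate, areascal)]


counts = [12, 7, 3]                 # observed counts (integers)
areascal = [1.0, 0.5, 0.5]          # hypothetical per-channel AREASCAL values
model_rate = [0.012, 0.014, 0.006]  # hypothetical model rate, counts/s
exposure = 1000.0                   # seconds

# Scaling the data produces values that are no longer Poisson counts.
scaled = scale_counts(counts, areascal)

# Scaling the exposure lets the model be compared with the raw counts.
predicted = predicted_counts(model_rate, exposure, areascal)
```

In the second form the statistic is evaluated against `counts` directly, so no rescaling of the data is needed.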

References

[1] “The OGIP Spectral File Format”, https://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007/ogip_92_007.html
[2] Private communication with Keith Arnaud

Attributes Summary

background_ids IDs of defined background data sets
default_background_id
filter Filter for dependent variable
grouped Are the data grouped?
mask Mask array for dependent variable
plot_fac Number of times to multiply the y-axis quantity by x-axis bin size
primary_response_id
rate counts or counts/sec
response_ids IDs of defined instrument responses (ARF/RMF pairs)
subtracted Are the background data subtracted?
units Units of the independent axis

Methods Summary

apply_filter(data[, groupfunc]) Filter the array data, first passing it through apply_grouping() (using groupfunc) and then applying the general filters
apply_grouping(data[, groupfunc]) Apply the data set’s grouping scheme to the array data, combining the grouped data points with groupfunc, and return the grouped array.
delete_background([id])
delete_response([id])
eval_model(modelfunc)
eval_model_to_fit(modelfunc)
get_analysis()
get_areascal([group, filter])
get_arf([id])
get_background([id])
get_background_scale()
get_backscal([group, filter])
get_bounding_mask()
get_dep([filter]) Return the dependent axis of a data set.
get_dims([filter])
get_error([filter, staterrfunc]) Return the total error on the dependent variable.
get_filter([group, format, delim]) Integrated values returned are measured from center of bin
get_filter_expr()
get_img([yfunc]) Return 1D dependent variable as a 1 x N image
get_imgerr() Return total error in dependent variable as an image
get_indep([filter]) Return the independent axes of a data set.
get_mask()
get_noticed_channels()
get_noticed_expr()
get_response([id])
get_rmf([id])
get_specresp([filter]) Return the effective area values for the data set.
get_staterror([filter, staterrfunc]) Return the statistical error.
get_syserror([filter]) Return any systematic error.
get_x([filter, response_id]) Return linear view of independent axis/axes
get_x0([filter]) Return first dimension in 2-D view of independent axis/axes
get_x0label() Return label for first dimension in 2-D view of independent axis/axes
get_x1([filter]) Return second dimension in 2-D view of independent axis/axes
get_x1label() Return label for second dimension in 2-D view of independent axis/axes
get_xerr([filter, response_id]) Return linear view of bin size in independent axis/axes
get_xlabel() Return label for linear view of independent axis/axes
get_y([filter, yfunc, response_id, …]) Return dependent axis in N-D view of dependent variable
get_yerr([filter, staterrfunc, response_id]) Return errors in dependent axis in N-D view of dependent variable
get_ylabel() Return label for dependent axis in N-D view of dependent variable
group() Group the data according to the data set’s grouping scheme
group_adapt(minimum[, maxLength, tabStops]) Adaptively group to a minimum number of counts.
group_adapt_snr(minimum[, maxLength, …]) Adaptively group to a minimum signal-to-noise ratio.
group_bins(num[, tabStops]) Group into a fixed number of bins.
group_counts(num[, maxLength, tabStops]) Group into a minimum number of counts per bin.
group_snr(snr[, maxLength, tabStops, errorCol]) Group into a minimum signal-to-noise ratio.
group_width(val[, tabStops]) Group into a fixed bin width.
ignore(*args, **kwargs)
ignore_bad() Exclude channels marked as bad.
notice([lo, hi, ignore, bkg_id])
notice_response([notice_resp, noticed_chans])
set_analysis(quantity[, type, factor])
set_arf(arf[, id])
set_background(bkg[, id])
set_dep(val) Set the dependent variable values
set_response([arf, rmf, id])
set_rmf(rmf[, id])
subtract() Subtract the background data
sum_background_data([get_bdata_func])
to_component_plot([yfunc, staterrfunc])
to_contour([yfunc])
to_fit([staterrfunc])
to_guess()
to_plot([yfunc, staterrfunc, response_id])
ungroup() Ungroup the data
unsubtract() Remove background subtraction

Attributes Documentation

background_ids

IDs of defined background data sets

default_background_id = 1
filter

Filter for dependent variable

grouped

Are the data grouped?

mask

Mask array for dependent variable

plot_fac

Number of times to multiply the y-axis quantity by x-axis bin size

primary_response_id = 1
rate

Quantity of y-axis: counts or counts/sec
response_ids

IDs of defined instrument responses (ARF/RMF pairs)

subtracted

Are the background data subtracted?

units

Units of the independent axis

Methods Documentation

apply_filter(data, groupfunc=<function sum>)[source]

Filter the array data, first passing it through apply_grouping() (using groupfunc) and then applying the general filters

apply_grouping(data, groupfunc=<function sum>)[source]

Apply the data set’s grouping scheme to the array data, combining the grouped data points with groupfunc, and return the grouped array. If the data set has no associated grouping scheme or the data are ungrouped, data is returned unaltered.
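As a rough illustration of what applying a grouping scheme means, the sketch below combines values using OGIP-style grouping flags, where 1 marks the first channel of a group and -1 marks a continuation. The helper `apply_grouping_sketch` is hypothetical and works on plain lists; it is not the Sherpa implementation, which also has to deal with quality flags and filters.

```python
def apply_grouping_sketch(data, grouping, groupfunc=sum):
    """Combine values per group: flag 1 starts a group, -1 continues it."""
    groups = []
    for value, flag in zip(data, grouping):
        if flag == -1 and groups:
            # Continuation channel: add it to the current group.
            groups[-1].append(value)
        else:
            # Any other flag starts a new group.
            groups.append([value])
    return [groupfunc(g) for g in groups]


counts = [1, 2, 3, 4, 5, 6]
grouping = [1, -1, -1, 1, -1, 1]

# Default: sum the counts in each group.
grouped = apply_grouping_sketch(counts, grouping)

# A different groupfunc, here the per-group maximum.
largest = apply_grouping_sketch(counts, grouping, groupfunc=max)
```

Passing a different `groupfunc` is useful for arrays that should not be summed, such as bin edges, where the minimum or maximum of each group is the meaningful value.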

delete_background(id=None)[source]
delete_response(id=None)[source]
eval_model(modelfunc)[source]
eval_model_to_fit(modelfunc)[source]
get_analysis()[source]
get_areascal(group=True, filter=False)[source]
get_arf(id=None)[source]
get_background(id=None)[source]
get_background_scale()[source]
get_backscal(group=True, filter=False)[source]
get_bounding_mask()
get_dep(filter=False)[source]

Return the dependent axis of a data set.

Parameters: filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.
Returns: axis – The dependent axis values for the data set. This gives the value of each point in the data set.
Return type: array

See also

get_indep()
Return the independent axis of a data set.
get_error()
Return the errors on the dependent axis of a data set.
get_staterror()
Return the statistical errors on the dependent axis of a data set.
get_syserror()
Return the systematic errors on the dependent axis of a data set.
get_dims(filter=False)
get_error(filter=False, staterrfunc=None)

Return the total error on the dependent variable.

Parameters:
  • filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.
  • staterrfunc (function) – If no statistical error has been set, the errors will be calculated by applying this function to the dependent axis of the data set.
Returns:

axis – The error for each data point, formed by adding the statistical and systematic errors in quadrature.

Return type:

array or None

See also

get_dep()
Return the dependent axis of a data set.
get_staterror()
Return the statistical errors on the dependent axis of a data set.
get_syserror()
Return the systematic errors on the dependent axis of a data set.
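Adding errors in quadrature means taking the square root of the sum of the squared statistical and systematic errors for each point. A minimal sketch, using a hypothetical helper on plain lists rather than Sherpa's own code:

```python
import math


def total_error(staterror, syserror=None):
    """Combine statistical and systematic errors in quadrature, per point."""
    if syserror is None:
        # Without a systematic term the total is just the statistical error.
        return list(staterror)
    return [math.hypot(stat, sys) for stat, sys in zip(staterror, syserror)]


combined = total_error([3.0, 6.0], [4.0, 8.0])
stat_only = total_error([3.0, 6.0])
```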
get_filter(group=True, format='%.12f', delim=':')[source]

Integrated values returned are measured from center of bin

get_filter_expr()[source]
get_img(yfunc=None)

Return 1D dependent variable as a 1 x N image

get_imgerr()

Return total error in dependent variable as an image

get_indep(filter=True)[source]

Return the independent axes of a data set.

Parameters: filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is True.
Returns: axis – The independent axis values for the data set. This gives the coordinates of each point in the data set.
Return type: tuple of arrays

See also

get_dep()
Return the dependent axis of a data set.
get_mask()[source]
get_noticed_channels()[source]
get_noticed_expr()[source]
get_response(id=None)[source]
get_rmf(id=None)[source]
get_specresp(filter=False)[source]

Return the effective area values for the data set.

Parameters: filter (bool, optional) – Should the filter attached to the data set be applied to the ARF or not. The default is False.
Returns: arf – The effective area values for the data set (or background component).
Return type: array
get_staterror(filter=False, staterrfunc=None)[source]

Return the statistical error.

The staterror column is used if defined, otherwise the function provided by the staterrfunc argument is used to calculate the values.

Parameters:
  • filter (bool, optional) – Should the channel filter be applied to the return values?
  • staterrfunc (function reference, optional) – The function to use to calculate the errors if the staterror field is None. The function takes one argument, the counts (after grouping and filtering), and returns an array of values which represents the one-sigma error for each element of the input array. This argument is designed to work with implementations of the sherpa.stats.Stat.calc_staterror method.
Returns:

staterror – The statistical error. It will be grouped and, if filter=True, filtered. The contribution from any associated background components will be included if the background-subtraction flag is set.

Return type:

array or None

Notes

There is no scaling by the AREASCAL setting, but background values are scaled by their AREASCAL settings. It is not at all obvious that the current code is doing the right thing, or that this is the right approach.

Examples

>>> dy = dset.get_staterror()

Ensure that there is no pre-defined statistical-error column and then use the Chi2DataVar statistic to calculate the errors:

>>> stat = sherpa.stats.Chi2DataVar()
>>> dset.set_staterror(None)
>>> dy = dset.get_staterror(staterrfunc=stat.calc_staterror)
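The staterrfunc argument is any callable that maps counts to one-sigma errors. The sketch below shows two such functions written for plain Python sequences: a data-variance estimate, sqrt(N), and the Gehrels approximation, 1 + sqrt(N + 0.75). The function names are hypothetical, and a version intended for use with a real data set would need to accept and return NumPy arrays.

```python
import math


def datavar_errors(counts):
    """Data-variance estimate: the error is the square root of the counts."""
    return [math.sqrt(n) for n in counts]


def gehrels_errors(counts):
    """Gehrels approximation, which stays non-zero for empty bins."""
    return [1.0 + math.sqrt(n + 0.75) for n in counts]


counts = [0, 4, 9]
dy_datavar = datavar_errors(counts)
dy_gehrels = gehrels_errors(counts)
```

The Gehrels form is often preferred for low-count data because bins with zero counts still receive a finite error.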
get_syserror(filter=False)[source]

Return any systematic error.

Parameters: filter (bool, optional) – Should the channel filter be applied to the return values?
Returns: syserror – The systematic error, if set. It will be grouped and, if filter=True, filtered.
Return type: array or None

Notes

There is no scaling by the AREASCAL setting.

get_x(filter=False, response_id=None)[source]

Return linear view of independent axis/axes

get_x0(filter=False)

Return first dimension in 2-D view of independent axis/axes

get_x0label()

Return label for first dimension in 2-D view of independent axis/axes

get_x1(filter=False)

Return second dimension in 2-D view of independent axis/axes

get_x1label()

Return label for second dimension in 2-D view of independent axis/axes

get_xerr(filter=False, response_id=None)[source]

Return linear view of bin size in independent axis/axes

get_xlabel()[source]

Return label for linear view of independent axis/axes

get_y(filter=False, yfunc=None, response_id=None, use_evaluation_space=False)[source]

Return dependent axis in N-D view of dependent variable

get_yerr(filter=False, staterrfunc=None, response_id=None)[source]

Return errors in dependent axis in N-D view of dependent variable

get_ylabel()[source]

Return label for dependent axis in N-D view of dependent variable

group()[source]

Group the data according to the data set’s grouping scheme

group_adapt(minimum, maxLength=None, tabStops=None)[source]

Adaptively group to a minimum number of counts.

Combine the data so that each bin contains minimum or more counts. Unlike group_counts, which starts at the first channel of the data, this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features. The adaptive nature means that low-count regions between bright features may not end up in groups with the minimum number of counts. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters:
  • minimum (int) – The minimum number of counts that each group should contain.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_bins()
Group into a fixed number of bins.
group_counts()
Group into a minimum number of counts per bin.
group_snr()
Group into a minimum signal-to-noise ratio.
group_width()
Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

group_adapt_snr(minimum, maxLength=None, tabStops=None, errorCol=None)[source]

Adaptively group to a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds minimum. Unlike group_snr, which starts at the first channel of the data, this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features. The adaptive nature means that low-count regions between bright features may not end up in groups that reach the minimum signal-to-noise ratio. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters:
  • minimum (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).
  • errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_bins()
Group into a fixed number of bins.
group_counts()
Group into a minimum number of counts per bin.
group_snr()
Group into a minimum signal-to-noise ratio.
group_width()
Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

group_bins(num, tabStops=None)[source]

Group into a fixed number of bins.

Combine the data so that there are num equal-width bins (or groups). The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters:
  • num (int) – The number of bins in the grouped data set. Each bin will contain the same number of channels.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_counts()
Group into a minimum number of counts per bin.
group_snr()
Group into a minimum signal-to-noise ratio.
group_width()
Group into a fixed bin width.

Notes

Since the bin width is an integer number of channels, it is likely that some channels will be “left over”. This is even more likely when the tabStops parameter is set. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
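The arithmetic behind the "left over" channels can be sketched as follows, assuming the bin width is the integer quotient of the channel count and the requested number of bins. The helper name is hypothetical, and the exact rule used by the underlying grouping code is not stated here.

```python
def bins_layout(nchan, num):
    """Return (channels per bin, left-over channels) for num equal-width bins."""
    width = nchan // num
    leftover = nchan - width * num
    return width, leftover


# 1024 channels split into 10 bins leaves a few channels ungrouped.
width, leftover = bins_layout(1024, 10)

# A bin count that divides the channel count leaves nothing over.
exact = bins_layout(1024, 16)
```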

group_counts(num, maxLength=None, tabStops=None)[source]

Group into a minimum number of counts per bin.

Combine the data so that each bin contains num or more counts. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set.

Parameters:
  • num (int) – The minimum number of counts that each group should contain.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_bins()
Group into a fixed number of bins.
group_snr()
Group into a minimum signal-to-noise ratio.
group_width()
Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
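A simplified picture of minimum-counts grouping: walk through the channels from the first one, closing a group as soon as its summed counts reach num, and flag whatever remains at the end with quality 2. The function below is a hypothetical sketch on plain lists that ignores maxLength and tabStops; it is not the routine Sherpa calls.

```python
def group_counts_sketch(counts, num):
    """Return (grouping, quality) for a left-to-right minimum-counts scheme."""
    grouping = [1] * len(counts)
    quality = [0] * len(counts)
    start = 0   # index of the first channel in the current group
    total = 0   # counts accumulated in the current group
    for i, c in enumerate(counts):
        if i > start:
            grouping[i] = -1      # continuation of the current group
        total += c
        if total >= num:
            start = i + 1         # group is complete; begin a new one
            total = 0
    # Channels in an incomplete final group are marked as bad.
    for i in range(start, len(counts)):
        quality[i] = 2
    return grouping, quality


grouping, quality = group_counts_sketch([5, 3, 4, 10, 1, 2], num=10)
```

Here the last two channels hold only 3 counts between them, so they cannot form a valid group and receive a quality value of 2.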

group_snr(snr, maxLength=None, tabStops=None, errorCol=None)[source]

Group into a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds snr. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set.

Parameters:
  • snr (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).
  • errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_bins()
Group into a fixed number of bins.
group_counts()
Group into a minimum number of counts per bin.
group_width()
Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
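The signal-to-noise ratio of a candidate group can be sketched as the summed counts divided by the combined error. The version below assumes that, without errorCol, the variance of the group equals its summed counts (Poisson), and that explicit per-channel errors combine in quadrature. Both points are assumptions made for illustration, and the helper name is hypothetical.

```python
import math


def group_snr_value(counts, errors=None):
    """Signal-to-noise ratio of one candidate group of channels."""
    signal = sum(counts)
    if errors is None:
        # Poisson assumption: the variance equals the summed counts.
        variance = signal
    else:
        # Explicit per-channel errors are combined in quadrature.
        variance = sum(e * e for e in errors)
    if variance <= 0:
        return 0.0
    return signal / math.sqrt(variance)


poisson_snr = group_snr_value([9, 16])
custom_snr = group_snr_value([9, 16], errors=[6.0, 8.0])
```

The zero-variance guard mirrors the reason a warning is issued for zero-valued error estimates: such channels give no usable noise estimate.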

group_width(val, tabStops=None)[source]

Group into a fixed bin width.

Combine the data so that each bin contains val channels. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters:
  • val (int) – The number of channels to combine into a group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_bins()
Group into a fixed number of bins.
group_counts()
Group into a minimum number of counts per bin.
group_snr()
Group into a minimum signal-to-noise ratio.

Notes

Unless the requested bin width is a factor of the number of channels (and no tabStops parameter is given), then some channels will be “left over”. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
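The interplay between the bin width, tabStops, and left-over channels can be sketched as below. Each run of channels between tab stops is grouped independently, and channels that do not fill a complete group get quality 2. The helper is hypothetical, and giving tab-stopped channels a grouping value of 0 is an assumption made for this illustration.

```python
def group_width_sketch(nchan, val, tab_stops=None):
    """Return (grouping, quality) for fixed-width groups of val channels."""
    if val < 1:
        raise ValueError("val must be a positive integer")
    if tab_stops is None:
        tab_stops = [False] * nchan
    grouping = [0] * nchan
    quality = [0] * nchan

    def close_run(indices):
        # Number of channels in this run that fall in complete groups.
        complete = len(indices) - len(indices) % val
        for k, idx in enumerate(indices):
            grouping[idx] = 1 if k % val == 0 else -1
            if k >= complete:
                quality[idx] = 2   # left-over channel

    run = []
    for i in range(nchan):
        if tab_stops[i]:
            close_run(run)         # a tab stop ends the current run
            run = []
        else:
            run.append(i)
    close_run(run)
    return grouping, quality


# Ten channels in groups of four: the last two are left over.
g1, q1 = group_width_sketch(10, 4)

# Six channels in pairs, but excluding the first channel leaves one over.
g2, q2 = group_width_sketch(6, 2, tab_stops=[True] + [False] * 5)
```

The second call shows why tabStops can create left-over channels even when val divides the total number of channels.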

ignore(*args, **kwargs)
ignore_bad()[source]

Exclude channels marked as bad.

Ignore any bin in the PHA data set which has a quality value that is larger than zero.

Raises: sherpa.utils.err.DataErr – If the data set has no quality array.

See also

ignore()
Exclude data from the fit.
notice()
Include data in the fit.

Notes

Bins with a non-zero quality setting are not automatically excluded when a data set is created.

If the data set has been grouped, then calling ignore_bad will remove any filter applied to the data set. If this happens a warning message will be displayed.
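The selection rule, keeping only bins whose quality value is zero, amounts to building a boolean mask from the quality array. A minimal sketch with a hypothetical helper, in which ValueError stands in for the DataErr that Sherpa raises:

```python
def good_channel_mask(quality):
    """Return True for each bin that should be kept (quality of zero)."""
    if quality is None:
        # Stand-in for sherpa.utils.err.DataErr in this sketch.
        raise ValueError("data set has no quality array")
    return [q == 0 for q in quality]


mask = good_channel_mask([0, 2, 0, 5])
```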

notice(lo=None, hi=None, ignore=False, bkg_id=None)[source]
notice_response(notice_resp=True, noticed_chans=None)[source]
set_analysis(quantity, type='rate', factor=0)[source]
set_arf(arf, id=None)[source]
set_background(bkg, id=None)[source]
set_dep(val)[source]

Set the dependent variable values

set_response(arf=None, rmf=None, id=None)[source]
set_rmf(rmf, id=None)[source]
subtract()[source]

Subtract the background data
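Background subtraction is commonly done by scaling the background counts to the source extraction before subtracting them. The sketch below uses the conventional ratio of exposure times and BACKSCAL values. Whether get_background_scale computes exactly this quantity is an assumption, the function names are hypothetical, and AREASCAL is left out for brevity.

```python
def background_scale(src_exposure, src_backscal, bkg_exposure, bkg_backscal):
    """Factor that rescales background counts to the source extraction."""
    return (src_exposure * src_backscal) / (bkg_exposure * bkg_backscal)


def subtract_background(src_counts, bkg_counts, scale):
    """Per-channel source counts minus the scaled background counts."""
    return [s - scale * b for s, b in zip(src_counts, bkg_counts)]


# Hypothetical values: the background covers twice the area and was
# observed for twice as long as the source.
scale = background_scale(1000.0, 0.25, 2000.0, 0.5)
net = subtract_background([20, 15], [16, 8], scale)
```

The subtracted values are no longer integer counts, which is one reason background-subtracted data should not be used with Poisson-based statistics.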

sum_background_data(get_bdata_func=<function DataPHA.<lambda>>)[source]
to_component_plot(yfunc=None, staterrfunc=None)
to_contour(yfunc=None)
to_fit(staterrfunc=None)[source]
to_guess()[source]
to_plot(yfunc=None, staterrfunc=None, response_id=None)[source]
ungroup()[source]

Ungroup the data

unsubtract()[source]

Remove background subtraction