class DataPHA(name, channel, counts, staterror=None, syserror=None, bin_lo=None, bin_hi=None, grouping=None, quality=None, exposure=None, backscal=None, areascal=None, header=None)


PHA data set, including any associated instrument and background data.

The PHA format is described in an OGIP document [1].

  • name (str) – The name of the data set; often set to the name of the file containing the data.
  • channel, counts – The PHA data.
  • staterror, syserror – The statistical and systematic errors for the data, if defined.
  • bin_lo, bin_hi –
  • grouping (array of int or None, optional) –
  • quality (array of int or None, optional) –
  • exposure (number or None, optional) – The exposure time for the PHA data set, in seconds.
  • backscal (scalar or array or None, optional) –
  • areascal (scalar or array or None, optional) –
  • header (dict or None, optional) –

name (str) – Used to store the file name, for data read from a file.



The original data is stored in the attributes - e.g. counts - and the data-access methods, such as get_dep and get_staterror, provide any necessary data manipulation to handle cases such as: background subtraction, filtering, and grouping.

The handling of the AREASCAL value - whether it is a scalar or array - is currently in flux. It is a value that is stored with the PHA file, and the OGIP PHA standard [1] describes the observed counts being divided by the area scaling before comparison to the model. However, this is not valid for Poisson-based statistics, and is also not how XSPEC handles AREASCAL [2]; the AREASCAL values are used to scale the exposure times instead. The aim is to add this logic to the instrument models in sherpa.astro.instrument, such as sherpa.astro.instrument.RMFModelPHA. The area scaling still has to be applied when calculating the background contribution to a spectrum, as well as when calculating the data and model values used for plots (following XSPEC so as to avoid sharp discontinuities where the area-scaling factor changes strongly).


[1] “The OGIP Spectral File Format”,
[2] Private communication with Keith Arnaud
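The two AREASCAL conventions discussed above can be contrasted with a small sketch. The function names and numbers are hypothetical, and this is not Sherpa's implementation; it only shows why dividing the data by the area scaling sits badly with Poisson statistics, while scaling the model's exposure leaves the observed counts as integers.

```python
def scale_data(counts, areascal):
    """OGIP-style convention: divide the observed counts by the area scaling."""
    return [c / areascal for c in counts]


def scale_model(model_rate, exposure, areascal):
    """XSPEC-style convention: treat AREASCAL as a scaling of the exposure time."""
    return [r * exposure * areascal for r in model_rate]


counts = [3, 0, 7]                # observed counts (integers)
model_rate = [0.05, 0.01, 0.12]   # hypothetical model rate in counts/s
exposure = 100.0                  # seconds
areascal = 0.8                    # hypothetical scalar AREASCAL

# Dividing the data produces non-integer "counts", which a Poisson
# likelihood is not designed for.
scaled_counts = scale_data(counts, areascal)

# Scaling the model leaves the observed counts untouched.
predicted = scale_model(model_rate, exposure, areascal)
```

With an array-valued AREASCAL the same idea would apply element by element.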

Attributes Summary

background_ids IDs of defined background data sets
filter Filter for dependent variable
grouped Are the data grouped?
mask Mask array for dependent variable
plot_fac Number of times to multiply the y-axis quantity by x-axis bin size
rate counts or counts/sec
response_ids IDs of defined instrument responses (ARF/RMF pairs)
subtracted Are the background data subtracted?
units Units of the independent axis

Methods Summary

apply_filter(data[, groupfunc]) Filter the array data, first passing it through apply_grouping() (using groupfunc) and then applying the general filters
apply_grouping(data[, groupfunc]) Apply the data set’s grouping scheme to the array data, combining the grouped data points with groupfunc, and return the grouped array.
get_analysis() Return the units used when fitting spectral data.
get_areascal([group, filter]) Return the fractional area factor of the PHA data set.
get_backscal([group, filter]) Return the area scaling of the PHA data set.
get_dep([filter]) Return the dependent axis of a data set.
get_error([filter, staterrfunc]) Return the total error on the dependent variable.
get_filter([group, format, delim]) Integrated values returned are measured from center of bin
get_img([yfunc]) Return 1D dependent variable as a 1 x N image
get_imgerr() Return total error in dependent variable as an image
get_indep([filter]) Return the independent axes of a data set.
get_specresp([filter]) Return the effective area values for the data set.
get_staterror([filter, staterrfunc]) Return the statistical error.
get_syserror([filter]) Return any systematic error.
get_x([filter, response_id]) Return linear view of independent axis/axes
get_x0([filter]) Return first dimension in 2-D view of independent axis/axes
get_x0label() Return label for first dimension in 2-D view of independent axis/axes
get_x1([filter]) Return second dimension in 2-D view of independent axis/axes
get_x1label() Return label for second dimension in 2-D view of independent axis/axes
get_xerr([filter, response_id]) Return linear view of bin size in independent axis/axes
get_xlabel() Return label for linear view of independent axis/axes
get_y([filter, yfunc, response_id, …]) Return dependent axis in N-D view of dependent variable
get_yerr([filter, staterrfunc, response_id]) Return errors in dependent axis in N-D view of dependent variable
get_ylabel() Return label for dependent axis in N-D view of dependent variable
group() Group the data according to the data set’s grouping scheme
group_adapt(minimum[, maxLength, tabStops]) Adaptively group to a minimum number of counts.
group_adapt_snr(minimum[, maxLength, …]) Adaptively group to a minimum signal-to-noise ratio.
group_bins(num[, tabStops]) Group into a fixed number of bins.
group_counts(num[, maxLength, tabStops]) Group into a minimum number of counts per bin.
group_snr(snr[, maxLength, tabStops, errorCol]) Group into a minimum signal-to-noise ratio.
group_width(val[, tabStops]) Group into a fixed bin width.
ignore(*args, **kwargs)
ignore_bad() Exclude channels marked as bad.
notice([lo, hi, ignore, bkg_id])
notice_response([notice_resp, noticed_chans])
set_analysis(quantity[, type, factor])
set_arf(arf[, id])
set_background(bkg[, id])
set_dep(val) Set the dependent variable values
set_response([arf, rmf, id])
set_rmf(rmf[, id])
subtract() Subtract the background data
to_component_plot([yfunc, staterrfunc])
to_plot([yfunc, staterrfunc, response_id])
ungroup() Ungroup the data
unsubtract() Remove background subtraction

Attributes Documentation


background_ids

IDs of defined background data sets

default_background_id = 1

filter

Filter for dependent variable

grouped

Are the data grouped?

mask

Mask array for dependent variable

plot_fac

Number of times to multiply the y-axis quantity by x-axis bin size

primary_response_id = 1

rate

Quantity of y-axis: counts or counts/sec

response_ids

IDs of defined instrument responses (ARF/RMF pairs)

subtracted

Are the background data subtracted?

units

Units of the independent axis

Methods Documentation

apply_filter(data, groupfunc=<function sum>)

Filter the array data, first passing it through apply_grouping() (using groupfunc) and then applying the general filters

apply_grouping(data, groupfunc=<function sum>)

Apply the data set’s grouping scheme to the array data, combining the grouped data points with groupfunc, and return the grouped array. If the data set has no associated grouping scheme or the data are ungrouped, data is returned unaltered.
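The behaviour described for apply_grouping can be sketched in plain Python. This is an illustration rather than Sherpa's code: the helper name is hypothetical, and it assumes the OGIP convention in which a grouping flag of 1 starts a new group and -1 continues the current one.

```python
def apply_grouping_sketch(data, grouping, groupfunc=sum):
    """Combine elements of data according to OGIP-style grouping flags.

    A flag of 1 starts a new group; -1 adds the element to the
    current group. With no grouping the data is returned unaltered.
    """
    if grouping is None:
        return list(data)

    groups = []
    for value, flag in zip(data, grouping):
        if flag == 1 or not groups:
            groups.append([value])      # start a new group
        else:
            groups[-1].append(value)    # continue the current group
    return [groupfunc(g) for g in groups]


counts = [1, 2, 3, 4, 5, 6]
grouping = [1, -1, -1, 1, -1, 1]

summed = apply_grouping_sketch(counts, grouping)        # [6, 9, 6]
peaks = apply_grouping_sketch(counts, grouping, max)    # [3, 5, 6]
```

Passing a different groupfunc, such as max, changes how the elements within each group are combined without changing the group boundaries.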

delete_background(id=None)
delete_response(id=None)
eval_model(modelfunc)
eval_model_to_fit(modelfunc)
get_analysis()

Return the units used when fitting spectral data.


Returns: setting – The analysis setting.
Return type: { ‘channel’, ‘energy’, ‘wavelength’ }



>>> is_wave = pha.get_analysis() == 'wavelength'
get_areascal(group=True, filter=False)

Return the fractional area factor of the PHA data set.

Return the AREASCAL setting [1] for the PHA data set.

  • group (bool, optional) – Should the values be grouped to match the data?
  • filter (bool, optional) – Should the values be filtered to match the data?

Returns: areascal – The AREASCAL value, which can be a scalar or a 1D array.
Return type: number or ndarray


The fractional area scale is normally set to 1, with the ARF used to scale the model.


[1] “The OGIP Spectral File Format”, Arnaud, K. & George, I.


>>> pha.get_areascal()
get_arf(id=None)
get_background(id=None)
get_background_scale()
get_backscal(group=True, filter=False)

Return the area scaling of the PHA data set.

Return the BACKSCAL setting [1] for the PHA data set.

  • group (bool, optional) – Should the values be grouped to match the data?
  • filter (bool, optional) – Should the values be filtered to match the data?

Returns: backscal – The BACKSCAL value, which can be a scalar or a 1D array.
Return type: number or ndarray


The BACKSCAL value can be defined as the ratio of the area of the source (or background) extraction region in image pixels to the total number of image pixels. The fact that there is no ironclad definition for this quantity does not matter so long as the values for a source dataset and its associated background dataset are defined in a similar manner, because only the ratio of source and background BACKSCAL values is used. It can be a scalar or an array.


[1] “The OGIP Spectral File Format”, Arnaud, K. & George, I.


>>> pha.get_backscal()
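
As a rough illustration of why only the ratio of the source and background BACKSCAL values matters, the sketch below scales a background spectrum before subtracting it. The helper names and numbers are hypothetical, and the exposure times are assumed to be equal so that they drop out; it is not the calculation Sherpa performs.

```python
def backscal_ratio(src_backscal, bkg_backscal):
    """Ratio used to scale the background to the source extraction region."""
    return src_backscal / bkg_backscal


def subtract_scaled_background(src_counts, bkg_counts, ratio):
    """Subtract the scaled background (equal exposure times assumed)."""
    return [s - ratio * b for s, b in zip(src_counts, bkg_counts)]


# The same regions expressed as a fraction of the image...
ratio_fraction = backscal_ratio(0.001, 0.004)
# ...or as pixel counts: the definition differs but the ratio does not.
ratio_pixels = backscal_ratio(1000, 4000)

net = subtract_scaled_background([10, 20], [4, 8], ratio_pixels)
```

Both definitions give a ratio of 0.25, so the background is scaled by the same amount either way.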
get_bounding_mask()
get_dep(filter=False)

Return the dependent axis of a data set.

Parameters: filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.
Returns: axis – The dependent axis values for the data set. This gives the value of each point in the data set.
Return type: array

See also

get_indep – Return the independent axis of a data set.
get_error – Return the errors on the dependent axis of a data set.
get_staterror – Return the statistical errors on the dependent axis of a data set.
get_syserror – Return the systematic errors on the dependent axis of a data set.
get_dims(filter=False)
get_error(filter=False, staterrfunc=None)

Return the total error on the dependent variable.

  • filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.
  • staterrfunc (function) – If no statistical error has been set, the errors will be calculated by applying this function to the dependent axis of the data set.

Returns: axis – The error for each data point, formed by adding the statistical and systematic errors in quadrature.
Return type: array or None

See also

get_indep – Return the independent axis of a data set.
get_staterror – Return the statistical errors on the dependent axis of a data set.
get_syserror – Return the systematic errors on the dependent axis of a data set.
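
Adding the statistical and systematic errors in quadrature, as get_error is described as doing, amounts to the following. The helper is hypothetical, and its handling of missing (None) inputs is an assumption made for this sketch rather than a statement of Sherpa's behaviour.

```python
import math


def total_error_sketch(staterror, syserror=None):
    """Combine per-point errors in quadrature: sqrt(stat**2 + sys**2).

    Assumption for this sketch: a missing component is simply
    skipped, and None is returned when both are missing.
    """
    if staterror is None and syserror is None:
        return None
    if syserror is None:
        return list(staterror)
    if staterror is None:
        return list(syserror)
    return [math.hypot(s, y) for s, y in zip(staterror, syserror)]


combined = total_error_sketch([3.0, 4.0], [4.0, 3.0])   # [5.0, 5.0]
```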
get_filter(group=True, format='%.12f', delim=':')

Integrated values returned are measured from center of bin

get_filter_expr()
get_img(yfunc=None)

Return 1D dependent variable as a 1 x N image

get_imgerr()

Return total error in dependent variable as an image

get_indep(filter=True)

Return the independent axes of a data set.

Parameters: filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is True.
Returns: axis – The independent axis values for the data set. This gives the coordinates of each point in the data set.
Return type: tuple of arrays

See also

get_dep – Return the dependent axis of a data set.
get_mask()
get_noticed_channels()
get_noticed_expr()
get_response(id=None)
get_rmf(id=None)
get_specresp(filter=False)

Return the effective area values for the data set.

Parameters: filter (bool, optional) – Should the filter attached to the data set be applied to the ARF or not. The default is False.
Returns: arf – The effective area values for the data set (or background component).
Return type: array
get_staterror(filter=False, staterrfunc=None)

Return the statistical error.

The staterror column is used if defined, otherwise the function provided by the staterrfunc argument is used to calculate the values.

  • filter (bool, optional) – Should the channel filter be applied to the return values?
  • staterrfunc (function reference, optional) – The function to use to calculate the errors if the staterror field is None. The function takes one argument, the counts (after grouping and filtering), and returns an array of values which represents the one-sigma error for each element of the input array. This argument is designed to work with implementations of the sherpa.stats.Stat.calc_staterror method.

Returns: staterror – The statistical error. It will be grouped and, if filter=True, filtered. The contribution from any associated background components will be included if the background-subtraction flag is set.
Return type: array or None


There is no scaling by the AREASCAL setting, but background values are scaled by their AREASCAL settings. It is not at all obvious that the current code is doing the right thing, or that this is the right approach.


>>> dy = dset.get_staterror()

Ensure that there is no pre-defined statistical-error column and then use the Chi2DataVar statistic to calculate the errors:

>>> import sherpa.stats
>>> stat = sherpa.stats.Chi2DataVar()
>>> dset.set_staterror(None)
>>> dy = dset.get_staterror(staterrfunc=stat.calc_staterror)
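
The contract for staterrfunc (one argument, the counts, returning a one-sigma error per element) can be illustrated without Sherpa. Both functions below are hypothetical stand-ins: the square-root rule is just one possible choice of error estimate, and background contributions, grouping and filtering are left out.

```python
import math


def sqrt_counts_error(counts):
    """Hypothetical staterrfunc: one-sigma error of sqrt(N) per element."""
    return [math.sqrt(c) for c in counts]


def get_staterror_sketch(counts, staterror=None, staterrfunc=None):
    """Use the stored errors if present, otherwise fall back to staterrfunc."""
    if staterror is not None:
        return list(staterror)
    if staterrfunc is None:
        return None
    return staterrfunc(counts)


from_func = get_staterror_sketch([4, 9, 16], staterrfunc=sqrt_counts_error)
from_column = get_staterror_sketch([4, 9, 16], staterror=[1.0, 1.0, 1.0])
```

The stored column takes precedence, so the function is only called when no statistical error has been set.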
get_syserror(filter=False)

Return any systematic error.

Parameters: filter (bool, optional) – Should the channel filter be applied to the return values?
Returns: syserror – The systematic error, if set. It will be grouped and, if filter=True, filtered.
Return type: array or None


There is no scaling by the AREASCAL setting.

get_x(filter=False, response_id=None)

Return linear view of independent axis/axes

get_x0(filter=False)

Return first dimension in 2-D view of independent axis/axes

get_x0label()

Return label for first dimension in 2-D view of independent axis/axes

get_x1(filter=False)

Return second dimension in 2-D view of independent axis/axes

get_x1label()

Return label for second dimension in 2-D view of independent axis/axes

get_xerr(filter=False, response_id=None)

Return linear view of bin size in independent axis/axes

get_xlabel()

Return label for linear view of independent axis/axes

get_y(filter=False, yfunc=None, response_id=None, use_evaluation_space=False)

Return dependent axis in N-D view of dependent variable

get_yerr(filter=False, staterrfunc=None, response_id=None)

Return errors in dependent axis in N-D view of dependent variable

get_ylabel()

Return label for dependent axis in N-D view of dependent variable

group()

Group the data according to the data set’s grouping scheme

group_adapt(minimum, maxLength=None, tabStops=None)

Adaptively group to a minimum number of counts.

Combine the data so that each bin contains minimum or more counts. Unlike group_counts, which starts at the first channel of the data, this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features. The adaptive nature means that low-count regions between bright features may not end up in groups with the minimum number of counts. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

  • minimum (int) – The minimum number of counts required in each group.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt_snr – Adaptively group to a minimum signal-to-noise ratio.
group_bins – Group into a fixed number of bins.
group_counts – Group into a minimum number of counts per bin.
group_snr – Group into a minimum signal-to-noise ratio.
group_width – Group into a fixed bin width.


If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

group_adapt_snr(minimum, maxLength=None, tabStops=None, errorCol=None)

Adaptively group to a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds minimum. Unlike group_snr, which starts at the first channel of the data, this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features. The adaptive nature means that low-count regions between bright features may not end up in groups with the minimum signal-to-noise ratio. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

  • minimum (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).
  • errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt – Adaptively group to a minimum number of counts.
group_bins – Group into a fixed number of bins.
group_counts – Group into a minimum number of counts per bin.
group_snr – Group into a minimum signal-to-noise ratio.
group_width – Group into a fixed bin width.


If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

group_bins(num, tabStops=None)

Group into a fixed number of bins.

Combine the data so that there are num equal-width bins (or groups). The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

  • num (int) – The number of bins in the grouped data set. Each bin will contain the same number of channels.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt – Adaptively group to a minimum number of counts.
group_adapt_snr – Adaptively group to a minimum signal-to-noise ratio.
group_counts – Group into a minimum number of counts per bin.
group_snr – Group into a minimum signal-to-noise ratio.
group_width – Group into a fixed bin width.


Since the bin width is an integer number of channels, it is likely that some channels will be “left over”. This is even more likely when the tabStops parameter is set. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
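
The "left over" channels follow from integer arithmetic, as this small sketch shows. The helper is hypothetical, ignores tabStops, and assumes the bin width is the floor of the channel count divided by num; the grouping code itself may place or flag the channels differently.

```python
def group_bins_layout(nchan, num):
    """Return (channels per bin, number of left-over channels)."""
    if num < 1 or num > nchan:
        raise ValueError("num must lie between 1 and the number of channels")
    width = nchan // num              # integer number of channels per bin
    leftover = nchan - width * num    # channels that fit in no bin
    return width, leftover


width, leftover = group_bins_layout(1024, 10)
```

Here 1024 channels split into 10 bins gives a width of 102 channels with 4 channels left over, whereas 8 bins would divide the channels exactly.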

group_counts(num, maxLength=None, tabStops=None)

Group into a minimum number of counts per bin.

Combine the data so that each bin contains num or more counts. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set.

  • num (int) – The minimum number of counts in each bin.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt – Adaptively group to a minimum number of counts.
group_adapt_snr – Adaptively group to a minimum signal-to-noise ratio.
group_bins – Group into a fixed number of bins.
group_snr – Group into a minimum signal-to-noise ratio.
group_width – Group into a fixed bin width.


If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
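
A minimal sketch of grouping to a minimum number of counts, starting from the first channel, is given below. It is not the algorithm Sherpa uses: the function name is hypothetical, maxLength and tabStops are ignored, and it assumes the OGIP flag convention (1 starts a group, -1 continues it) with a quality of 2 for channels that end up in an incomplete group.

```python
def group_counts_sketch(counts, num):
    """Return (grouping, quality) lists for a minimum of num counts per group."""
    grouping = []
    quality = [0] * len(counts)
    start = 0      # index of the first channel in the current group
    total = 0      # counts accumulated in the current group

    for i, c in enumerate(counts):
        grouping.append(1 if i == start else -1)
        total += c
        if total >= num:
            # The group is complete; the next channel starts a new one.
            start = i + 1
            total = 0

    # Channels in a trailing, incomplete group are flagged with quality 2.
    for i in range(start, len(counts)):
        quality[i] = 2
    return grouping, quality


grouping, quality = group_counts_sketch([5, 3, 4, 10, 1, 2], 8)
```

The last two channels only hold 3 counts between them, so they cannot form a valid group and are marked with a quality of 2.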

group_snr(snr, maxLength=None, tabStops=None, errorCol=None)

Group into a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds snr. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set.

  • snr (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).
  • errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt – Adaptively group to a minimum number of counts.
group_adapt_snr – Adaptively group to a minimum signal-to-noise ratio.
group_bins – Group into a fixed number of bins.
group_counts – Group into a minimum number of counts per bin.
group_width – Group into a fixed bin width.


If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
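
The signal-to-noise criterion can be sketched in the same spirit. This is an illustration under stated assumptions, not Sherpa's algorithm: the function name is hypothetical, maxLength and tabStops are ignored, the Poisson case takes the variance of each channel to be its counts, and a supplied error column is combined in quadrature.

```python
import math


def group_snr_sketch(counts, snr, error_col=None):
    """Return (grouping, quality) so that each group's S/N exceeds snr."""
    if error_col is None:
        variances = list(counts)               # Poisson: variance = counts
    else:
        variances = [e * e for e in error_col]

    grouping = []
    quality = [0] * len(counts)
    start, signal, variance = 0, 0.0, 0.0

    for i, (c, v) in enumerate(zip(counts, variances)):
        grouping.append(1 if i == start else -1)
        signal += c
        variance += v
        if variance > 0 and signal / math.sqrt(variance) > snr:
            start, signal, variance = i + 1, 0.0, 0.0

    # Channels left in an incomplete trailing group get quality 2.
    for i in range(start, len(counts)):
        quality[i] = 2
    return grouping, quality


grouping, quality = group_snr_sketch([4, 6, 20, 3, 1], 3)
```

In the Poisson case the signal-to-noise ratio of a group is the square root of its summed counts, so the first two channels (10 counts) and the third channel (20 counts) each exceed a ratio of 3, while the final two do not.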

group_width(val, tabStops=None)

Group into a fixed bin width.

Combine the data so that each bin contains val channels. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

  • val (int) – The number of channels to combine into a group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt – Adaptively group to a minimum number of counts.
group_adapt_snr – Adaptively group to a minimum signal-to-noise ratio.
group_bins – Group into a fixed number of bins.
group_counts – Group into a minimum number of counts per bin.
group_snr – Group into a minimum signal-to-noise ratio.


Unless the requested bin width is a factor of the number of channels (and no tabStops parameter is given), some channels will be “left over”. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
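
Fixed-width grouping and its left-over channels can be sketched as follows. The helper is hypothetical and ignores tabStops; it assumes the OGIP flag convention (1 starts a group, -1 continues it) and that the incomplete group falls at the end of the channel range.

```python
def group_width_sketch(nchan, val):
    """Return (grouping, quality) lists for groups of val channels each."""
    if val < 1:
        raise ValueError("val must be a positive integer")
    nfull = nchan // val                  # number of complete groups
    grouping = []
    quality = []
    for i in range(nchan):
        grouping.append(1 if i % val == 0 else -1)
        # Channels beyond the last complete group are flagged with 2.
        quality.append(0 if i < nfull * val else 2)
    return grouping, quality


grouping, quality = group_width_sketch(7, 3)
```

Seven channels grouped by three give two complete groups, with the seventh channel left over and flagged.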

ignore(*args, **kwargs)
ignore_bad()

Exclude channels marked as bad.

Ignore any bin in the PHA data set which has a quality value that is larger than zero.

Raises: sherpa.utils.err.DataErr – If the data set has no quality array.

See also

ignore – Exclude data from the fit.
notice – Include data in the fit.


Bins with a non-zero quality setting are not automatically excluded when a data set is created.

If the data set has been grouped, then calling ignore_bad will remove any filter applied to the data set. If this happens a warning message will be displayed.
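
The effect of ignore_bad on an ungrouped data set can be pictured as a mask update. The sketch below is hypothetical: it uses ValueError as a stand-in for sherpa.utils.err.DataErr and does not model the grouped case, where any existing filter is removed.

```python
def ignore_bad_sketch(quality, mask=None):
    """Return a boolean mask that excludes channels with quality > 0."""
    if quality is None:
        # Stand-in for the DataErr raised when there is no quality array.
        raise ValueError("data set has no quality array")
    good = [q <= 0 for q in quality]
    if mask is None:
        return good
    # Keep a channel only if it was already noticed and is not bad.
    return [m and g for m, g in zip(mask, good)]


quality = [0, 2, 0, 5]
fresh_mask = ignore_bad_sketch(quality)
combined_mask = ignore_bad_sketch(quality, [True, True, False, True])
```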

notice(lo=None, hi=None, ignore=False, bkg_id=None)
notice_response(notice_resp=True, noticed_chans=None)
set_analysis(quantity, type='rate', factor=0)
set_arf(arf, id=None)
set_background(bkg, id=None)
set_dep(val)

Set the dependent variable values

set_response(arf=None, rmf=None, id=None)
set_rmf(rmf, id=None)
subtract()

Subtract the background data

sum_background_data(get_bdata_func=<function DataPHA.<lambda>>)
to_component_plot(yfunc=None, staterrfunc=None)
to_contour(yfunc=None)
to_fit(staterrfunc=None)
to_guess()
to_plot(yfunc=None, staterrfunc=None, response_id=None)
ungroup()

Ungroup the data

unsubtract()

Remove background subtraction