DataPHA

class sherpa.astro.data.DataPHA(name, channel, counts, staterror=None, syserror=None, bin_lo=None, bin_hi=None, grouping=None, quality=None, exposure=None, backscal=None, areascal=None, header=None)[source] [edit on github]

Bases: sherpa.data.Data1D

PHA data set, including any associated instrument and background data.

The PHA format is described in an OGIP document [1].

Parameters:
  • name (str) – The name of the data set; often set to the name of the file containing the data.
  • channel, counts (array) – The PHA data.
  • staterror, syserror (array or None, optional) – The statistical and systematic errors for the data, if defined.
  • bin_lo, bin_hi (array or None, optional) –
  • grouping (array of int or None, optional) –
  • quality (array of int or None, optional) –
  • exposure (number or None, optional) – The exposure time for the PHA data set, in seconds.
  • backscal (scalar or array or None, optional) –
  • areascal (scalar or array or None, optional) –
  • header (dict or None, optional) –
name

Used to store the file name, for data read from a file.

Type:str
channel
counts
staterror
syserror
bin_lo
bin_hi
grouping
quality
exposure
backscal
areascal

Notes

The original data is stored in the attributes - e.g. counts - and the data-access methods, such as get_dep and get_staterror, provide any necessary data manipulation to handle cases such as: background subtraction, filtering, and grouping.

The handling of the AREASCAL value - whether it is a scalar or array - is currently in flux. It is a value that is stored with the PHA file, and the OGIP PHA standard [1] describes the observed counts being divided by the area scaling before comparison to the model. However, this is not valid for Poisson-based statistics, and it is also not how XSPEC handles AREASCAL [2]; the AREASCAL values are used to scale the exposure times instead. The aim is to add this logic to the instrument models in sherpa.astro.instrument, such as sherpa.astro.instrument.RMFModelPHA. The area scaling still has to be applied when calculating the background contribution to a spectrum, as well as when calculating the data and model values used for plots (following XSPEC so as to avoid sharp discontinuities where the area-scaling factor changes strongly).

References

[1] “The OGIP Spectral File Format”, https://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007/ogip_92_007.html
[2] Private communication with Keith Arnaud
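
Examples

A minimal, illustrative construction from arrays (the file name, channel range, count values, and exposure time here are arbitrary); in practice the data are normally read from a PHA file:

>>> import numpy as np
>>> from sherpa.astro.data import DataPHA
>>> chans = np.arange(1, 1025, dtype=np.int16)
>>> counts = np.ones(1024, dtype=np.int16)
>>> pha = DataPHA('example.pha', chans, counts, exposure=1000.0)
>>> pha.exposure
1000.0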

Attributes Summary

background_ids IDs of defined background data sets
default_background_id
dep Left for compatibility with older versions
grouped Are the data grouped?
indep Return the grid of the data space associated with this data set.
mask Mask array for dependent variable
plot_fac Number of times to multiply the y-axis quantity by x-axis bin size
primary_response_id
rate counts or counts/sec
response_ids IDs of defined instrument responses (ARF/RMF pairs)
subtracted Are the background data subtracted?
units Units of the independent axis
x Used for compatibility, in particular for __str__ and __repr__

Methods Summary

apply_filter(data[, groupfunc]) Filter the array data, first passing it through apply_grouping() (using groupfunc) and then applying the general filters
apply_grouping(data[, groupfunc]) Apply the data set’s grouping scheme to the array data, combining the grouped data points with groupfunc, and return the grouped array.
delete_background([id])
delete_response([id])
eval_model(modelfunc)
eval_model_to_fit(modelfunc)
get_analysis() Return the units used when fitting spectral data.
get_areascal([group, filter]) Return the fractional area factor of the PHA data set.
get_arf([id])
get_background([id])
get_background_scale()
get_backscal([group, filter]) Return the area scaling of the PHA data set.
get_bounding_mask()
get_dep([filter]) Return the dependent axis of a data set.
get_dims([filter]) Return the dimensions of this data space as a tuple of tuples.
get_error([filter, staterrfunc]) Return the total error on the dependent variable.
get_evaluation_indep([filter, model, …])
get_filter([group, format, delim]) Integrated values returned are measured from center of bin
get_filter_expr()
get_img([yfunc]) Return 1D dependent variable as a 1 x N image
get_imgerr()
get_indep([filter]) Return the independent axes of a data set.
get_mask()
get_noticed_channels()
get_noticed_expr()
get_response([id])
get_rmf([id])
get_specresp([filter]) Return the effective area values for the data set.
get_staterror([filter, staterrfunc]) Return the statistical error.
get_syserror([filter]) Return any systematic error.
get_x([filter, response_id])
get_xerr([filter, response_id]) Return linear view of bin size in independent axis/axes
get_xlabel() Return label for linear view of independent axis/axes
get_y([filter, yfunc, response_id, …]) Return dependent axis in N-D view of dependent variable
get_yerr([filter, staterrfunc, response_id]) Return errors in dependent axis in N-D view of dependent variable
get_ylabel() Return label for dependent axis in N-D view of dependent variable
group() Group the data according to the data set’s grouping scheme
group_adapt(minimum[, maxLength, tabStops]) Adaptively group to a minimum number of counts.
group_adapt_snr(minimum[, maxLength, …]) Adaptively group to a minimum signal-to-noise ratio.
group_bins(num[, tabStops]) Group into a fixed number of bins.
group_counts(num[, maxLength, tabStops]) Group into a minimum number of counts per bin.
group_snr(snr[, maxLength, tabStops, errorCol]) Group into a minimum signal-to-noise ratio.
group_width(val[, tabStops]) Group into a fixed bin width.
ignore(*args, **kwargs)
ignore_bad() Exclude channels marked as bad.
notice([lo, hi, ignore, bkg_id])
notice_response([notice_resp, noticed_chans])
set_analysis(quantity[, type, factor])
set_arf(arf[, id])
set_background(bkg[, id])
set_dep(val) Set the dependent variable values
set_indep(val)
set_response([arf, rmf, id])
set_rmf(rmf[, id])
subtract() Subtract the background data
sum_background_data([get_bdata_func])
to_component_plot([yfunc, staterrfunc])
to_fit([staterrfunc])
to_guess()
to_plot([yfunc, staterrfunc, response_id])
ungroup() Ungroup the data
unsubtract() Remove background subtraction

Attributes Documentation

background_ids

IDs of defined background data sets

default_background_id = 1
dep

Left for compatibility with older versions

grouped

Are the data grouped?

indep

Return the grid of the data space associated with this data set.

Return type:tuple of array_like

mask

Mask array for dependent variable

Returns:mask
Return type:bool or numpy.ndarray
plot_fac

Number of times to multiply the y-axis quantity by x-axis bin size

primary_response_id = 1
rate

Quantity of the y-axis: counts or counts/sec
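
For example (illustrative settings, assuming pha is a DataPHA instance), the quantities used by the plotting methods can be changed by assigning to these attributes directly:

>>> pha.rate = False   # show counts rather than a count rate
>>> pha.plot_fac = 1   # multiply the y-axis values by the bin width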
response_ids

IDs of defined instrument responses (ARF/RMF pairs)

subtracted

Are the background data subtracted?

units

Units of the independent axis

x

Used for compatibility, in particular for __str__ and __repr__

Methods Documentation

apply_filter(data, groupfunc=<function sum>)[source] [edit on github]

Filter the array data, first passing it through apply_grouping() (using groupfunc) and then applying the general filters

apply_grouping(data, groupfunc=<function sum>)[source] [edit on github]

Apply the data set’s grouping scheme to the array data, combining the grouped data points with groupfunc, and return the grouped array. If the data set has no associated grouping scheme or the data are ungrouped, data is returned unaltered.
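
Examples

An illustrative call, assuming pha is a DataPHA object with a grouping scheme set and grouping enabled; the same scheme can be applied to any array with one element per channel:

>>> grouped = pha.apply_grouping(pha.counts)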

delete_background(id=None)[source] [edit on github]
delete_response(id=None)[source] [edit on github]
eval_model(modelfunc)[source] [edit on github]
eval_model_to_fit(modelfunc)[source] [edit on github]
get_analysis()[source] [edit on github]

Return the units used when fitting spectral data.

Returns:

setting – The analysis setting.

Return type:

{ ‘channel’, ‘energy’, ‘wavelength’ }

Raises:

See also

set_analysis()

Examples

>>> is_wave = pha.get_analysis() == 'wavelength'
get_areascal(group=True, filter=False)[source] [edit on github]

Return the fractional area factor of the PHA data set.

Return the AREASCAL setting [1] for the PHA data set.

Parameters:
  • group (bool, optional) – Should the values be grouped to match the data?
  • filter (bool, optional) – Should the values be filtered to match the data?
Returns:

areascal – The AREASCAL value, which can be a scalar or a 1D array.

Return type:

number or ndarray

Notes

The fractional area scale is normally set to 1, with the ARF used to scale the model.

References

[1] “The OGIP Spectral File Format”, Arnaud, K. & George, I., http://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007/ogip_92_007.html

Examples

>>> pha.get_areascal()
1.0
get_arf(id=None)[source] [edit on github]
get_background(id=None)[source] [edit on github]
get_background_scale()[source] [edit on github]
get_backscal(group=True, filter=False)[source] [edit on github]

Return the area scaling of the PHA data set.

Return the BACKSCAL setting [1] for the PHA data set.

Parameters:
  • group (bool, optional) – Should the values be grouped to match the data?
  • filter (bool, optional) – Should the values be filtered to match the data?
Returns:

backscal – The BACKSCAL value, which can be a scalar or a 1D array.

Return type:

number or ndarray

Notes

The BACKSCAL value can be defined as the ratio of the area of the source (or background) extraction region in image pixels to the total number of image pixels. The fact that there is no ironclad definition for this quantity does not matter so long as the values for a source dataset and its associated background dataset are defined in a similar manner, because only the ratio of source and background BACKSCAL values is used. It can be a scalar or an array.

References

[1] “The OGIP Spectral File Format”, Arnaud, K. & George, I., http://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007/ogip_92_007.html

Examples

>>> pha.get_backscal()
7.8504301607718007e-06
get_bounding_mask() [edit on github]
get_dep(filter=False)[source] [edit on github]

Return the dependent axis of a data set.

Parameters:filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.
Returns:axis – The dependent axis values for the data set. This gives the value of each point in the data set.
Return type:array

See also

get_indep()
Return the independent axis of a data set.
get_error()
Return the errors on the dependent axis of a data set.
get_staterror()
Return the statistical errors on the dependent axis of a data set.
get_syserror()
Return the systematic errors on the dependent axis of a data set.
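
Examples

An illustrative comparison, assuming pha is a grouped and filtered DataPHA object:

>>> yall = pha.get_dep()              # one value per channel
>>> yfilt = pha.get_dep(filter=True)  # grouped and filtered values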
get_dims(filter=False) [edit on github]

Return the dimensions of this data space as a tuple of tuples. The first element in the tuple is a tuple with the dimensions of the data space, while the second element provides the size of the dependent array.

Return type:tuple

get_error(filter=False, staterrfunc=None) [edit on github]

Return the total error on the dependent variable.

Parameters:
  • filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.
  • staterrfunc (function) – If no statistical error has been set, the errors will be calculated by applying this function to the dependent axis of the data set.
Returns:

axis – The error for each data point, formed by adding the statistical and systematic errors in quadrature.

Return type:

array or None

See also

get_dep()
Return the dependent axis of a data set.
get_staterror()
Return the statistical errors on the dependent axis of a data set.
get_syserror()
Return the systematic errors on the dependent axis of a data set.
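
Examples

An illustrative call, assuming pha is a DataPHA object; a simple square-root estimate is used here when no statistical-error column has been set (numpy is assumed to be available):

>>> import numpy as np
>>> err = pha.get_error(filter=True, staterrfunc=np.sqrt)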
get_evaluation_indep(filter=False, model=None, use_evaluation_space=False) [edit on github]
get_filter(group=True, format='%.12f', delim=':')[source] [edit on github]

Return the filter expression for the data set. Integrated values returned are measured from the center of each bin.

get_filter_expr()[source] [edit on github]
get_img(yfunc=None) [edit on github]

Return 1D dependent variable as a 1 x N image

Parameters:yfunc
get_imgerr() [edit on github]
get_indep(filter=True)[source] [edit on github]

Return the independent axes of a data set.

Parameters:filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is True.
Returns:axis – The independent axis values for the data set. This gives the coordinates of each point in the data set.
Return type:tuple of arrays

See also

get_dep()
Return the dependent axis of a data set.
get_mask()[source] [edit on github]
get_noticed_channels()[source] [edit on github]
get_noticed_expr()[source] [edit on github]
get_response(id=None)[source] [edit on github]
get_rmf(id=None)[source] [edit on github]
get_specresp(filter=False)[source] [edit on github]

Return the effective area values for the data set.

Parameters:filter (bool, optional) – Should the filter attached to the data set be applied to the ARF or not. The default is False.
Returns:arf – The effective area values for the data set (or background component).
Return type:array
get_staterror(filter=False, staterrfunc=None)[source] [edit on github]

Return the statistical error.

The staterror column is used if defined; otherwise, the function provided by the staterrfunc argument is used to calculate the values.

Parameters:
  • filter (bool, optional) – Should the channel filter be applied to the return values?
  • staterrfunc (function reference, optional) – The function to use to calculate the errors if the staterror field is None. The function takes one argument, the counts (after grouping and filtering), and returns an array of values which represents the one-sigma error for each element of the input array. This argument is designed to work with implementations of the sherpa.stats.Stat.calc_staterror method.
Returns:

staterror – The statistical error. It will be grouped and, if filter=True, filtered. The contribution from any associated background components will be included if the background-subtraction flag is set.

Return type:

array or None

Notes

There is no scaling by the AREASCAL setting, but background values are scaled by their AREASCAL settings. It is not at all obvious that the current code is doing the right thing, or that this is the right approach.

Examples

>>> dy = dset.get_staterror()

Ensure that there is no pre-defined statistical-error column and then use the Chi2DataVar statistic to calculate the errors:

>>> import sherpa.stats
>>> stat = sherpa.stats.Chi2DataVar()
>>> dset.staterror = None
>>> dy = dset.get_staterror(staterrfunc=stat.calc_staterror)
get_syserror(filter=False)[source] [edit on github]

Return any systematic error.

Parameters:filter (bool, optional) – Should the channel filter be applied to the return values?
Returns:syserror – The systematic error, if set. It will be grouped and, if filter=True, filtered.
Return type:array or None

Notes

There is no scaling by the AREASCAL setting.
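
Examples

An illustrative call, assuming pha is a DataPHA object; the return value is None when no systematic-error column has been set:

>>> syserr = pha.get_syserror(filter=True)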

get_x(filter=False, response_id=None)[source] [edit on github]
get_xerr(filter=False, response_id=None)[source] [edit on github]

Return the linear view of the bin sizes of the independent axis/axes.

Parameters:
  • filter
  • response_id
get_xlabel()[source] [edit on github]

Return label for linear view of independent axis/axes

get_y(filter=False, yfunc=None, response_id=None, use_evaluation_space=False)[source] [edit on github]

Return the dependent axis in an N-D view of the dependent variable.

Parameters:
  • filter
  • yfunc
  • response_id
  • use_evaluation_space
get_yerr(filter=False, staterrfunc=None, response_id=None)[source] [edit on github]

Return errors in dependent axis in N-D view of dependent variable

Parameters:
  • filter
  • staterrfunc
get_ylabel()[source] [edit on github]

Return the label for the dependent axis in an N-D view of the dependent variable.
group()[source] [edit on github]

Group the data according to the data set’s grouping scheme

group_adapt(minimum, maxLength=None, tabStops=None)[source] [edit on github]

Adaptively group to a minimum number of counts.

Combine the data so that each bin contains minimum or more counts. The difference from group_counts is that this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features, rather than at the first channel of the data. The adaptive nature means that low-count regions between bright features may not end up in groups with the minimum number of counts. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters:
  • minimum (int) – The minimum number of counts required in each group.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_bins()
Group into a fixed number of bins.
group_counts()
Group into a minimum number of counts per bin.
group_snr()
Group into a minimum signal-to-noise ratio.
group_width()
Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
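
Examples

An illustrative call, adaptively grouping so that each group has at least 20 counts (the value is arbitrary):

>>> pha.group_adapt(20)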

group_adapt_snr(minimum, maxLength=None, tabStops=None, errorCol=None)[source] [edit on github]

Adaptively group to a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds minimum. The difference from group_snr is that this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features, rather than at the first channel of the data. The adaptive nature means that low-count regions between bright features may not end up in groups that exceed the requested signal-to-noise ratio. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters:
  • minimum (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).
  • errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_bins()
Group into a fixed number of bins.
group_counts()
Group into a minimum number of counts per bin.
group_snr()
Group into a minimum signal-to-noise ratio.
group_width()
Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
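
Examples

An illustrative call, adaptively grouping to a signal-to-noise ratio of at least 5 per group (the value is arbitrary):

>>> pha.group_adapt_snr(5)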

group_bins(num, tabStops=None)[source] [edit on github]

Group into a fixed number of bins.

Combine the data so that there are num equal-width bins (or groups). The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters:
  • num (int) – The number of bins in the grouped data set. Each bin will contain the same number of channels.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_counts()
Group into a minimum number of counts per bin.
group_snr()
Group into a minimum signal-to-noise ratio.
group_width()
Group into a fixed bin width.

Notes

Since the bin width is an integer number of channels, it is likely that some channels will be “left over”. This is even more likely when the tabStops parameter is set. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
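
Examples

An illustrative call, grouping the data into 256 bins (the value is arbitrary):

>>> pha.group_bins(256)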

group_counts(num, maxLength=None, tabStops=None)[source] [edit on github]

Group into a minimum number of counts per bin.

Combine the data so that each bin contains num or more counts. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set.

Parameters:
  • num (int) – The minimum number of counts required in each group.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_bins()
Group into a fixed number of bins.
group_snr()
Group into a minimum signal-to-noise ratio.
group_width()
Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
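
Examples

An illustrative call, requiring at least 20 counts per group and then retrieving the grouped values (the value is arbitrary):

>>> pha.group_counts(20)
>>> y = pha.get_dep(filter=True)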

group_snr(snr, maxLength=None, tabStops=None, errorCol=None)[source] [edit on github]

Group into a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds snr. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set.

Parameters:
  • snr (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.
  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).
  • errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_bins()
Group into a fixed number of bins.
group_counts()
Group into a minimum number of counts per bin.
group_width()
Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
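
Examples

An illustrative call, requiring each group to exceed a signal-to-noise ratio of 3 (the value is arbitrary):

>>> pha.group_snr(3)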

group_width(val, tabStops=None)[source] [edit on github]

Group into a fixed bin width.

Combine the data so that each bin contains val channels. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters:
  • val (int) – The number of channels to combine into a group.
  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt()
Adaptively group to a minimum number of counts.
group_adapt_snr()
Adaptively group to a minimum signal-to-noise ratio.
group_bins()
Group into a fixed number of bins.
group_counts()
Group into a minimum number of counts per bin.
group_snr()
Group into a minimum signal-to-noise ratio.

Notes

Unless the requested bin width is a factor of the number of channels (and no tabStops parameter is given), some channels will be “left over”. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
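
Examples

An illustrative call, grouping every 16 channels (the value is arbitrary):

>>> pha.group_width(16)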

ignore(*args, **kwargs) [edit on github]
ignore_bad()[source] [edit on github]

Exclude channels marked as bad.

Ignore any bin in the PHA data set which has a quality value that is larger than zero.

Raises:sherpa.utils.err.DataErr – If the data set has no quality array.

See also

ignore()
Exclude data from the fit.
notice()
Include data in the fit.

Notes

Bins with a non-zero quality setting are not automatically excluded when a data set is created.

If the data set has been grouped, then calling ignore_bad will remove any filter applied to the data set. If this happens a warning message will be displayed.
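
Examples

An illustrative use, assuming the data set contains a quality array:

>>> pha.ignore_bad()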

notice(lo=None, hi=None, ignore=False, bkg_id=None)[source] [edit on github]
notice_response(notice_resp=True, noticed_chans=None)[source] [edit on github]
set_analysis(quantity, type='rate', factor=0)[source] [edit on github]
set_arf(arf, id=None)[source] [edit on github]
set_background(bkg, id=None)[source] [edit on github]
set_dep(val)[source] [edit on github]

Set the dependent variable values

Parameters:val
set_indep(val) [edit on github]
set_response(arf=None, rmf=None, id=None)[source] [edit on github]
set_rmf(rmf, id=None)[source] [edit on github]
subtract()[source] [edit on github]

Subtract the background data
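
Examples

A minimal illustration, assuming a background data set has already been associated with this data set via set_background:

>>> pha.subtract()
>>> pha.subtracted
True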

sum_background_data(get_bdata_func=<function DataPHA.<lambda>>)[source] [edit on github]
to_component_plot(yfunc=None, staterrfunc=None) [edit on github]
to_fit(staterrfunc=None)[source] [edit on github]
to_guess()[source] [edit on github]
to_plot(yfunc=None, staterrfunc=None, response_id=None)[source] [edit on github]
ungroup()[source] [edit on github]

Ungroup the data

unsubtract()[source] [edit on github]

Remove background subtraction