DataPHA

class sherpa.astro.data.DataPHA(name, channel, counts, staterror=None, syserror=None, bin_lo=None, bin_hi=None, grouping=None, quality=None, exposure=None, backscal=None, areascal=None, header=None)

Bases: sherpa.data.Data1D

PHA data set, including any associated instrument and background data.

The PHA format is described in the OGIP documents [1] and [2].

Parameters
  • name (str) – The name of the data set; often set to the name of the file containing the data.

  • channel (array of int) – The channel numbers of the PHA data.

  • counts (array of int) – The counts in each channel of the PHA data.

  • staterror (scalar or array or None, optional) – The statistical errors for the data, if defined.

  • syserror (scalar or array or None, optional) – The systematic errors for the data, if defined.

  • bin_lo (array or None, optional) –

  • bin_hi (array or None, optional) –

  • grouping (array of int or None, optional) –

  • quality (array of int or None, optional) –

  • exposure (number or None, optional) – The exposure time for the PHA data set, in seconds.

  • backscal (scalar or array or None, optional) –

  • areascal (scalar or array or None, optional) –

  • header (dict or None, optional) – If None the header will be pre-populated with a minimal set of keywords that would be found in an OGIP compliant PHA I file.

name

Used to store the file name, for data read from a file.

Type

str

staterror
syserror
bin_lo
bin_hi
grouping
quality
exposure
backscal
areascal

Notes

The original data is stored in the attributes - e.g. counts - and the data-access methods, such as get_dep and get_staterror, provide any necessary data manipulation to handle cases such as: background subtraction, filtering, and grouping.

There is additional complexity compared to the Data1D case when filtering data because:

  • although the data uses channel numbers, users will often want to filter the data using derived values (in energy or wavelength units, such as 0.5 to 7.0 keV or 16 to 18 Angstroms);

  • although derived from the Data1D case, PHA data is more properly thought of as an integrated data set, so each channel maps to a range of energy or wavelength values;

  • the data is often grouped to improve the signal-to-noise, and so requests for values need to determine whether to filter the data or not, whether to group the data or not, and how to combine the data within each group;

  • and there is also the quality array, which indicates whether or not a channel is trustworthy (and so acts as an additional filtering term).
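The mapping from a filter in derived units to channel numbers can be sketched as follows. This is only an illustration and not Sherpa's implementation: the helper names are hypothetical, the per-channel energy bounds stand in for the approximate values an RMF EBOUNDS extension provides, and the rule that a channel is selected when its bin overlaps the requested range is an assumption.

```python
# Hypothetical helpers for illustration; these are not part of Sherpa.
HC_KEV_ANGSTROM = 12.39841875  # h*c expressed in keV * Angstrom


def energy_to_wavelength(energy_kev):
    """Convert a photon energy in keV to a wavelength in Angstrom."""
    return HC_KEV_ANGSTROM / energy_kev


def channels_in_energy_range(e_min, e_max, lo, hi):
    """Return the 1-based channel numbers whose energy bin overlaps lo to hi.

    e_min and e_max hold the approximate lower and upper energy (keV)
    of each channel. Overlap-based selection is an assumption here.
    """
    return [i + 1
            for i, (a, b) in enumerate(zip(e_min, e_max))
            if b > lo and a < hi]


# Five channels, each 2 keV wide, starting at 0.1 keV.
e_min = [0.1, 2.1, 4.1, 6.1, 8.1]
e_max = [2.1, 4.1, 6.1, 8.1, 10.1]

# Channels 1 to 4 overlap the 0.5 to 7.0 keV range; channel 5 does not.
selected = channels_in_energy_range(e_min, e_max, 0.5, 7.0)
print(selected)
```

Because each channel covers a range of energies, the edges of the selected region generally extend slightly beyond the requested limits.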

The handling of the AREASCAL value - whether it is a scalar or array - is currently in flux. It is a value that is stored with the PHA file, and the OGIP PHA standard [1, 2] describes the observed counts being divided by the area scaling before comparison to the model. However, this is not valid for Poisson-based statistics, and is also not how XSPEC handles AREASCAL [3]; the AREASCAL values are used to scale the exposure times instead. The aim is to add this logic to the instrument models in sherpa.astro.instrument, such as sherpa.astro.instrument.RMFModelPHA. The area scaling still has to be applied when calculating the background contribution to a spectrum, as well as when calculating the data and model values used for plots (following XSPEC so as to avoid sharp discontinuities where the area-scaling factor changes strongly).

References

[1] “The OGIP Spectral File Format”, https://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007/ogip_92_007.html

[2] “The OGIP Spectral File Format Addendum: Changes log”, https://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007a/ogip_92_007a.html

[3] Private communication with Keith Arnaud

Attributes Summary

background_ids

IDs of defined background data sets

channel

The channel array.

counts

The counts array.

default_background_id

The identifier for the background component when not set.

dep

Left for compatibility with older versions

grouped

Are the data grouped?

indep

Return the grid of the data space associated with this data set.

mask

Mask array for dependent variable

plot_fac

How the X axis is used to create the Y axis when plotting data.

primary_response_id

The identifier for the response component when not set.

rate

Is the Y axis displayed as a rate when plotting data?

response_ids

IDs of defined instrument responses (ARF/RMF pairs)

subtracted

Are the background data subtracted?

units

Units of the independent axis: one of 'channel', 'energy', 'wavelength'.

x

Used for compatibility, in particular for __str__ and __repr__

Methods Summary

apply_filter(data[, groupfunc])

Group and filter the supplied data to match the data set.

apply_grouping(data[, groupfunc])

Apply the grouping scheme of the data set to the supplied data.

delete_background([id])

Remove the background component.

delete_response([id])

Remove the response component.

eval_model(modelfunc)

eval_model_to_fit(modelfunc)

get_analysis()

Return the units used when fitting spectral data.

get_areascal([group, filter])

Return the fractional area factor of the PHA data set.

get_arf([id])

Return the ARF from the response.

get_background([id])

Return the background component.

get_background_scale([bkg_id, units, group, ...])

Return the correction factor for the background dataset.

get_backscal([group, filter])

Return the background scaling of the PHA data set.

get_bounding_mask()

get_dep([filter])

Return the dependent axis of a data set.

get_dims([filter])

Return the dimensions of this data space as a tuple of tuples.

get_error([filter, staterrfunc])

Return the total error on the dependent variable.

get_evaluation_indep([filter, model, ...])

get_filter([group, format, delim])

Return the data filter as a string.

get_filter_expr()

Return the data filter as a string along with the units.

get_full_response([pileup_model])

Calculate the response for the dataset.

get_img([yfunc])

Return 1D dependent variable as a 1 x N image

get_imgerr()

get_indep([filter])

Return the independent axes of a data set.

get_mask()

Returns the (ungrouped) mask.

get_noticed_channels()

Return the noticed channels.

get_noticed_expr()

Returns the current set of noticed channels.

get_response([id])

Return the response component.

get_rmf([id])

Return the RMF from the response.

get_specresp([filter])

Return the effective area values for the data set.

get_staterror([filter, staterrfunc])

Return the statistical error.

get_syserror([filter])

Return any systematic error.

get_x([filter, response_id])

get_xerr([filter, response_id])

Return linear view of bin size in independent axis/axes

get_xlabel()

Return label for linear view of independent axis/axes

get_y([filter, yfunc, response_id, ...])

Return dependent axis in N-D view of dependent variable

get_yerr([filter, staterrfunc, response_id])

Return errors in dependent axis in N-D view of dependent variable

get_ylabel()

Return label for dependent axis in N-D view of dependent variable

group()

Group the data according to the data set's grouping scheme.

group_adapt(minimum[, maxLength, tabStops])

Adaptively group to a minimum number of counts.

group_adapt_snr(minimum[, maxLength, ...])

Adaptively group to a minimum signal-to-noise ratio.

group_bins(num[, tabStops])

Group into a fixed number of bins.

group_counts(num[, maxLength, tabStops])

Group into a minimum number of counts per bin.

group_snr(snr[, maxLength, tabStops, errorCol])

Group into a minimum signal-to-noise ratio.

group_width(val[, tabStops])

Group into a fixed bin width.

ignore(*args, **kwargs)

ignore_bad()

Exclude channels marked as bad.

notice([lo, hi, ignore, bkg_id])

Notice or ignore the given range.

notice_response([notice_resp, noticed_chans])

set_analysis(quantity[, type, factor])

Set the units used when fitting and plotting spectral data.

set_arf(arf[, id])

Add or replace the ARF in a response component.

set_background(bkg[, id])

Add or replace a background component.

set_dep(val)

Set the dependent variable values.

set_indep(val)

set_response([arf, rmf, id])

Add or replace a response component.

set_rmf(rmf[, id])

Add or replace the RMF in a response component.

subtract()

Subtract the background data.

sum_background_data([get_bdata_func])

Sum up data, applying the background correction value.

to_component_plot([yfunc, staterrfunc])

to_fit([staterrfunc])

to_guess()

to_plot([yfunc, staterrfunc, response_id])

ungroup()

Remove any data grouping.

unsubtract()

Remove background subtraction.

Attributes Documentation

background_ids

IDs of defined background data sets

channel

The channel array.

This is the first, and only, element of the indep attribute.

counts

The counts array.

This is an alias for the y attribute.

default_background_id = 1

The identifier for the background component when not set.

dep

Left for compatibility with older versions

grouped

Are the data grouped?

indep

Return the grid of the data space associated with this data set.

Return type

tuple of array_like

mask

Mask array for dependent variable

Returns

mask

Return type

bool or numpy.ndarray

plot_fac

How the X axis is used to create the Y axis when plotting data.

The Y axis values are multiplied by X^plot_fac. The default value of 0 means the X axis is not used in plots. The value must be an integer.
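A minimal sketch of the scaling described above, using a hypothetical helper rather than Sherpa's plotting code:

```python
def apply_plot_fac(x, y, plot_fac=0):
    """Multiply each Y value by X ** plot_fac (hypothetical helper)."""
    if not isinstance(plot_fac, int):
        raise TypeError("plot_fac must be an integer")
    return [yi * xi ** plot_fac for xi, yi in zip(x, y)]


x = [1.0, 2.0, 4.0]
y = [10.0, 5.0, 2.5]

unchanged = apply_plot_fac(x, y)       # plot_fac=0 leaves Y as it is
scaled = apply_plot_fac(x, y, 1)       # Y * X
squared = apply_plot_fac(x, y, 2)      # Y * X^2
```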

primary_response_id = 1

The identifier for the response component when not set.

rate

Is the Y axis displayed as a rate when plotting data?

When True the y axis is normalised by the exposure time to display a rate.

response_ids

IDs of defined instrument responses (ARF/RMF pairs)

subtracted

Are the background data subtracted?

units

Units of the independent axis: one of ‘channel’, ‘energy’, ‘wavelength’.

x

Used for compatibility, in particular for __str__ and __repr__

Methods Documentation

apply_filter(data, groupfunc=<function sum>)

Group and filter the supplied data to match the data set.

Parameters
  • data (ndarray or None) – The data to group, which must match either the number of channels of the data set or the number of filtered channels.

  • groupfunc (function reference) – The grouping function. See apply_grouping for the supported values.

Returns

result – The grouped and filtered data, or None if the input was None.

Return type

ndarray or None

Raises
  • TypeError – If the data size does not match the number of channels.

  • ValueError – If the name of groupfunc is not supported or the data does not match the filtered data.

Examples

Group and filter the counts array with no filter and then with a filter:

>>> pha.grouped
True
>>> pha.notice()
>>> pha.apply_filter(pha.counts)
array([17., 15., 16., 15., ...
>>> pha.notice(0.5, 7)
>>> pha.apply_filter(pha.counts)
array([15., 16., 15., 18., ...

As the previous example but with no grouping:

>>> pha.ungroup()
>>> pha.notice()
>>> pha.apply_filter(pha.counts)[0:5]
array([0., 0., 0., 0., 0.])
>>> pha.notice(0.5, 7)
>>> pha.apply_filter(pha.counts)[0:5]
array([4., 3., 0., 1., 1.])

Rather than group the counts, use the channel numbers and return the first and last channel number in each of the filtered groups (for the first five groups):

>>> pha.group()
>>> pha.notice(0.5, 7.0)
>>> pha.apply_filter(pha.channel, pha._min)[0:5]
array([33., 40., 45., 49., 52.])
>>> pha.apply_filter(pha.channel, pha._max)[0:5]
array([39., 44., 48., 51., 54.])

Find the approximate energy range of each selected group from the RMF EBOUNDS extension:

>>> rmf = pha.get_rmf()
>>> elo = pha.apply_filter(rmf.e_min, pha._min)
>>> ehi = pha.apply_filter(rmf.e_max, pha._max)

Calculate the grouped data, after filtering, if the counts were increased by 2 per channel. Note that in this case the data to apply_filter contains the channel counts after applying the current filter:

>>> orig = pha.counts[pha.get_mask()]
>>> new = orig + 2
>>> cts = pha.apply_filter(new)
apply_grouping(data, groupfunc=<function sum>)

Apply the grouping scheme of the data set to the supplied data.

Parameters
  • data (ndarray or None) – The data to group, which must match the number of channels of the data set.

  • groupfunc (function reference) – The grouping function. Note that what matters is the name of the function, not its code. The supported function names are: “sum”, “_sum_sq”, “_min”, “_max”, “_middle”, and “_make_groups”.

Returns

grouped – The grouped data, unless the data set is not grouped or the input array was None, when the input data is returned.

Return type

ndarray or None

Raises
  • TypeError – If the data size does not match the number of channels.

  • ValueError – If the name of groupfunc is not supported.

Notes

The supported grouping schemes are:

Name

Description

sum

Sum all the values in the group.

_min

The minimum value in the group.

_max

The maximum value in the group.

_middle

The average of the minimum and maximum values.

_sum_sq

The square root of the sum of the squared values.

_make_groups

The group number, starting at the first value of data.

The DataPHA class provides a method for each of these names other than “sum” (the default value), for example pha._min and pha._max.

The grouped data is not filtered unless a quality filter has been applied (e.g. by ignore_bad) in which case the quality filter will be applied to the result. In general apply_filter should be used if the data is to be filtered as well as grouped.
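The grouping schemes in the table can be illustrated with a small pure-Python sketch. It assumes the OGIP convention for the grouping array (1 starts a new group, -1 continues the current one); the function names are hypothetical, quality filtering is ignored, and the “_make_groups” scheme is left out.

```python
import math


def split_groups(data, grouping):
    """Split data into groups; any flag other than -1 starts a new group."""
    groups = []
    for value, flag in zip(data, grouping):
        if flag != -1 or not groups:
            groups.append([value])
        else:
            groups[-1].append(value)
    return groups


# One reducer per scheme name (a sketch, not the Sherpa code).
REDUCERS = {
    "sum": sum,
    "_min": min,
    "_max": max,
    "_middle": lambda vals: (min(vals) + max(vals)) / 2,
    "_sum_sq": lambda vals: math.sqrt(sum(v * v for v in vals)),
}


def apply_grouping_sketch(data, grouping, name="sum"):
    """Apply the named scheme to each group of data."""
    if name not in REDUCERS:
        raise ValueError(f"unsupported grouping scheme: {name}")
    return [REDUCERS[name](g) for g in split_groups(data, grouping)]


channels = [1, 2, 3, 4, 5, 6]
grouping = [1, -1, -1, 1, -1, 1]   # groups: 1-3, 4-5, 6

first = apply_grouping_sketch(channels, grouping, "_min")
last = apply_grouping_sketch(channels, grouping, "_max")
middle = apply_grouping_sketch(channels, grouping, "_middle")
summed = apply_grouping_sketch(channels, grouping)
errors = apply_grouping_sketch([3, 4, 0, 5, 12, 2], grouping, "_sum_sq")
```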

Examples

Sum up the counts in each group (note that the data has not been filtered so using get_dep with the filter argument set to True is generally preferred to using this method):

>>> gcounts = pha.apply_grouping(pha.counts)

The grouping for an unfiltered PHA data set with 1024 channels is used to calculate the number of channels in each group, the lowest channel number in each group, the highest channel number in each group, and the mid-point between the two:

>>> pha.grouped
True
>>> pha.mask
True
>>> len(pha.channel)
1024
>>> pha.apply_grouping(np.ones(1024))
array([ 17.,   4.,  11.,   ...
>>> pha.apply_grouping(np.arange(1, 1025), pha._min)
array([  1.,  18.,  22.,  ...
>>> pha.apply_grouping(np.arange(1, 1025), pha._max)
array([  17.,   21.,   32.,   ...
>>> pha.apply_grouping(np.arange(1, 1025), pha._middle)
array([  9. ,  19.5,  27. ,  ...

The grouped data is not filtered (unless ignore_bad has been used):

>>> pha.notice()
>>> v1 = pha.apply_grouping(dvals)
>>> pha.notice(1.2, 4.5)
>>> v2 = pha.apply_grouping(dvals)
>>> np.all(v1 == v2)
True
delete_background(id=None)

Remove the background component.

If the background component does not exist then the method does nothing.

Parameters

id (int or str, optional) – The identifier of the background component. If it is None then the default background identifier is used.

See also

set_background

Notes

If this call removes the last of the background components then the subtracted flag is cleared (if set).

delete_response(id=None)

Remove the response component.

If the response component does not exist then the method does nothing.

Parameters

id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

See also

set_response

eval_model(modelfunc)
eval_model_to_fit(modelfunc)
get_analysis()

Return the units used when fitting spectral data.

Returns

setting – The analysis setting.

Return type

{ ‘channel’, ‘energy’, ‘wavelength’ }

Raises

See also

set_analysis

Examples

>>> is_wave = pha.get_analysis() == 'wavelength'
get_areascal(group=True, filter=False)

Return the fractional area factor of the PHA data set.

Return the AREASCAL setting [ASCAL] for the PHA data set.

Parameters
  • group (bool, optional) – Should the values be grouped to match the data?

  • filter (bool, optional) – Should the values be filtered to match the data?

Returns

areascal – The AREASCAL value, which can be a scalar or a 1D array.

Return type

number or ndarray

Notes

The fractional area scale is normally set to 1, with the ARF used to scale the model.

References

ASCAL

“The OGIP Spectral File Format”, Arnaud, K. & George, I. http://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007/ogip_92_007.html

Examples

>>> pha.get_areascal()
1.0
get_arf(id=None)

Return the ARF from the response.

Parameters

id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

Returns

arf – The ARF, if set.

Return type

sherpa.astro.data.DataARF instance or None

See also

get_response, get_rmf, get_full_response

get_background(id=None)

Return the background component.

Parameters

id (int or str, optional) – The identifier of the background component. If it is None then the default background identifier is used.

Returns

bkg – The background dataset. If there is no component then None is returned.

Return type

sherpa.astro.data.DataPHA instance or None

get_background_scale(bkg_id=1, units='counts', group=True, filter=False)

Return the correction factor for the background dataset.

Changed in version 4.12.2: The bkg_id, units, group, and filter parameters have been added and the routine no longer calculates the average scaling for all the background components but just for the given component.

Parameters
  • bkg_id (int or str, optional) – The background component to use (the default is 1).

  • units ({'counts', 'rate'}, optional) – The correction is applied to a model defined as counts, the default, or a rate. The latter should be used when calculating the correction factor for adding the background data to the source aperture.

  • group (bool, optional) – Should the values be grouped to match the data?

  • filter (bool, optional) – Should the values be filtered to match the data?

Returns

scale – The scaling factor to correct the background data onto the source data set. If bkg_id is not valid then None is returned.

Return type

None, number, or NumPy array

Notes

The correction factor when units is ‘counts’ is:

scale_exposure * scale_backscal * scale_areascal / nbkg

where nbkg is the number of background components and scale_x is the source value divided by the background value for the field x.

When units is ‘rate’ the correction is:

scale_backscal / nbkg

and it is currently uncertain whether it should include the AREASCAL scaling.
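The two formulas above can be sketched for scalar values as follows. The dictionary layout and function name are hypothetical, and the real values can be arrays rather than scalars.

```python
def background_scale(src, bkg, nbkg=1, units="counts"):
    """Scale factor that maps background values onto the source data set.

    src and bkg are dicts with 'exposure', 'backscal', and 'areascal'
    entries (a hypothetical layout used only for this sketch).
    """
    if units not in ("counts", "rate"):
        raise ValueError("units must be 'counts' or 'rate'")

    # scale_backscal / nbkg is common to both cases.
    scale = src["backscal"] / bkg["backscal"] / nbkg

    if units == "counts":
        # Counts also need the exposure and AREASCAL ratios.
        scale *= src["exposure"] / bkg["exposure"]
        scale *= src["areascal"] / bkg["areascal"]

    return scale


src = {"exposure": 1000.0, "backscal": 0.5, "areascal": 1.0}
bkg = {"exposure": 2000.0, "backscal": 2.0, "areascal": 1.0}

scale_counts = background_scale(src, bkg)                 # 0.5 * 0.25 * 1
scale_rate = background_scale(src, bkg, units="rate")     # 0.25
scale_two_bkg = background_scale(src, bkg, nbkg=2)        # halved
```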

get_backscal(group=True, filter=False)

Return the background scaling of the PHA data set.

Return the BACKSCAL setting [BSCAL] for the PHA data set.

Parameters
  • group (bool, optional) – Should the values be grouped to match the data?

  • filter (bool, optional) – Should the values be filtered to match the data?

Returns

backscal – The BACKSCAL value, which can be a scalar or a 1D array.

Return type

number or ndarray

Notes

The BACKSCAL value can be defined as the ratio of the area of the source (or background) extraction region in image pixels to the total number of image pixels. The fact that there is no ironclad definition for this quantity does not matter so long as the value for a source dataset and its associated background dataset are defined in the same manner, because only the ratio of source and background BACKSCAL values is used. It can be a scalar or an array.

References

BSCAL

“The OGIP Spectral File Format”, Arnaud, K. & George, I. http://heasarc.gsfc.nasa.gov/docs/heasarc/ofwg/docs/spectra/ogip_92_007/ogip_92_007.html

Examples

>>> pha.get_backscal()
7.8504301607718007e-06
get_bounding_mask()
get_dep(filter=False)

Return the dependent axis of a data set.

Parameters

filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.

Returns

axis – The dependent axis values for the data set. This gives the value of each point in the data set.

Return type

array

See also

get_indep

Return the independent axis of a data set.

get_error

Return the errors on the dependent axis of a data set.

get_staterror

Return the statistical errors on the dependent axis of a data set.

get_syserror

Return the systematic errors on the dependent axis of a data set.

get_dims(filter=False)

Return the dimensions of this data space as a tuple of tuples. The first element in the tuple is a tuple with the dimensions of the data space, while the second element provides the size of the dependent array.

Return type

tuple

get_error(filter=False, staterrfunc=None)

Return the total error on the dependent variable.

Parameters
  • filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is False.

  • staterrfunc (function) – If no statistical error has been set, the errors will be calculated by applying this function to the dependent axis of the data set.

Returns

axis – The error for each data point, formed by adding the statistical and systematic errors in quadrature.

Return type

array or None

See also

get_dep

Return the dependent axis of a data set.

get_staterror

Return the statistical errors on the dependent axis of a data set.

get_syserror

Return the systematic errors on the dependent axis of a data set.
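Adding the statistical and systematic errors in quadrature can be sketched as below. The helper is hypothetical and assumes both inputs are per-point absolute errors of the same length.

```python
import math


def total_error(staterror, syserror=None):
    """Combine statistical and systematic errors in quadrature."""
    if staterror is None:
        return None
    if syserror is None:
        return list(staterror)
    return [math.hypot(stat, sys) for stat, sys in zip(staterror, syserror)]


stat = [3.0, 6.0]
syst = [4.0, 8.0]

combined = total_error(stat, syst)     # sqrt(3^2 + 4^2), sqrt(6^2 + 8^2)
stat_only = total_error(stat)          # no systematic term
```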

get_evaluation_indep(filter=False, model=None, use_evaluation_space=False)
get_filter(group=True, format='%.12f', delim=':')

Return the data filter as a string.

The filter expression depends on the analysis setting.

Changed in version 4.14.0: Prior to 4.14.0 the filter used the mid-point of the bin, not its low or high value.

Parameters
  • group (bool, optional) – Should the filter reflect the grouped data?

  • format (str, optional) – The formatting of the numeric values (this is ignored for channel units, which uses format="%i").

  • delim (str, optional) – The string used to mark the low-to-high range.

Returns

expr – The noticed channel range as a string of comma-separated ranges, where the low and high values are separated by the delim string. The units of the ranges are controlled by the analysis setting. If all bins have been filtered out then “No noticed bins” is returned.

Return type

str

Examples

For a Chandra non-grating dataset which has been grouped:

>>> pha.set_analysis('energy')
>>> pha.notice(0.5, 7)
>>> pha.get_filter(format='%.4f')
'0.4672:9.8696'
>>> pha.set_analysis('channel')
>>> pha.get_filter()
'33:676'

The filter expression shows the first selected channel to the last one, and so is independent of whether the data is grouped or not:

>>> pha.set_analysis('energy')
>>> pha.get_filter(format='%.4f')
'0.4672:9.8696'
>>> pha.get_filter(group=False, format='%.4f')
'0.4672:9.8696'

Although the group argument does not change the output of get_filter, the selected range does depend on whether the data is grouped or not (unless the groups align with the filter edges):

>>> d.ungroup()
>>> d.notice()
>>> d.notice(0.5, 7)
>>> d.get_filter(format='%.3f')
'0.496:7.008'
>>> d.group()
>>> d.get_filter(format='%.3f')
'0.467:9.870'
>>> d.notice()
>>> d.notice(0.5, 6)
>>> d.ignore(2.1, 2.2)
>>> d.get_filter(format='%.2f', delim='-')
'0.47-2.09,2.28-6.57'
get_filter_expr()

Return the data filter as a string along with the units.

This is a specialised version of get_filter which adds the axis units.

Returns

filter – The filter, represented as a collection of single values or ranges, separated by commas.

Return type

str

See also

get_filter

Examples

>>> d.get_filter_expr()
'1.0000-2.0000,5.0000-6.0000 x'
get_full_response(pileup_model=None)

Calculate the response for the dataset.

Unlike get_response, which returns a single response, this function returns all responses for datasets that have multiple responses set and it offers the possibility to include a pile-up model.

Parameters

pileup_model (None or a sherpa.astro.models.JDPileup instance) – The pile-up model to include in the returned response, if any.

Returns

response – The return value depends on whether an ARF, RMF, or pile-up model has been associated with the data set.

get_img(yfunc=None)

Return 1D dependent variable as a 1 x N image

Parameters

yfunc

get_imgerr()
get_indep(filter=True)

Return the independent axes of a data set.

Parameters

filter (bool, optional) – Should the filter attached to the data set be applied to the return value or not. The default is True.

Returns

axis – The independent axis values for the data set. This gives the coordinates of each point in the data set.

Return type

tuple of arrays

See also

get_dep

Return the dependent axis of a data set.

get_mask()

Returns the (ungrouped) mask.

Returns

mask – The mask, in channels, or None.

Return type

ndarray or None

get_noticed_channels()

Return the noticed channels.

Returns

channels – The noticed channels (this is independent of the analysis setting).

Return type

ndarray

get_noticed_expr()

Returns the current set of noticed channels.

The values returned are always in channels, no matter the current analysis setting.

Returns

expr – The noticed channel range as a string of comma-separated “low-high” values. As these are channel filters the low and high values are inclusive. If all channels have been filtered out then “No noticed channels” is returned.

Return type

str
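How a per-channel mask turns into a string of inclusive channel ranges can be sketched as below. The function is a hypothetical stand-in, and writing an isolated channel as a single value rather than a range is an assumption.

```python
def noticed_expr_sketch(channels, mask, delim="-"):
    """Describe the noticed channels as comma-separated inclusive ranges."""
    noticed = [chan for chan, keep in zip(channels, mask) if keep]
    if not noticed:
        return "No noticed channels"

    ranges = []
    start = prev = noticed[0]
    for chan in noticed[1:]:
        if chan != prev + 1:
            # A gap closes the current range and starts a new one.
            ranges.append((start, prev))
            start = chan
        prev = chan
    ranges.append((start, prev))

    return ",".join(str(lo) if lo == hi else f"{lo}{delim}{hi}"
                    for lo, hi in ranges)


channels = list(range(1, 11))
mask = [False, True, True, True, False, False, True, True, False, True]

expr = noticed_expr_sketch(channels, mask)
empty = noticed_expr_sketch(channels, [False] * 10)
```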

get_response(id=None)

Return the response component.

Parameters

id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

Returns

arf, rmf – The response, as an ARF and RMF. Either, or both, components can be None.

Return type

sherpa.astro.data.DataARF,sherpa.astro.data.DataRMF instances or None

get_rmf(id=None)

Return the RMF from the response.

Parameters

id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

Returns

rmf – The RMF, if set.

Return type

sherpa.astro.data.DataRMF instance or None

See also

get_arf, get_response, get_full_response

get_specresp(filter=False)

Return the effective area values for the data set.

Parameters

filter (bool, optional) – Should the filter attached to the data set be applied to the ARF or not. The default is False.

Returns

arf – The effective area values for the data set (or background component).

Return type

array

get_staterror(filter=False, staterrfunc=None)

Return the statistical error.

The staterror column is used if defined, otherwise the function provided by the staterrfunc argument is used to calculate the values.

Parameters
  • filter (bool, optional) – Should the channel filter be applied to the return values?

  • staterrfunc (function reference, optional) – The function to use to calculate the errors if the staterror field is None. The function takes one argument, the counts (after grouping and filtering), and returns an array of values which represents the one-sigma error for each element of the input array. This argument is designed to work with implementations of the sherpa.stats.Stat.calc_staterror method.

Returns

staterror – The statistical error. It will be grouped and, if filter=True, filtered. The contribution from any associated background components will be included if the background-subtraction flag is set.

Return type

array or None

Notes

There is no scaling by the AREASCAL setting, but background values are scaled by their AREASCAL settings. It is not at all obvious that the current code is doing the right thing, or that this is the right approach.

Examples

>>> dy = dset.get_staterror()

Ensure that there is no pre-defined statistical-error column and then use the Chi2DataVar statistic to calculate the errors:

>>> stat = sherpa.stats.Chi2DataVar()
>>> dset.set_staterror(None)
>>> dy = dset.get_staterror(staterrfunc=stat.calc_staterror)
get_syserror(filter=False)

Return any systematic error.

Parameters

filter (bool, optional) – Should the channel filter be applied to the return values?

Returns

syserror – The systematic error, if set. It will be grouped and, if filter=True, filtered.

Return type

array or None

Notes

There is no scaling by the AREASCAL setting.

get_x(filter=False, response_id=None)
get_xerr(filter=False, response_id=None)

Return linear view of bin size in independent axis/axes

Parameters
  • filter

  • yfunc

get_xlabel()

Return label for linear view of independent axis/axes

get_y(filter=False, yfunc=None, response_id=None, use_evaluation_space=False)

Return dependent axis in N-D view of dependent variable

Parameters
  • filter

  • yfunc

  • use_evaluation_space

get_yerr(filter=False, staterrfunc=None, response_id=None)

Return errors in dependent axis in N-D view of dependent variable

Parameters
  • filter

  • staterrfunc

get_ylabel()

Return label for dependent axis in N-D view of dependent variable

Parameters

yfunc

group()

Group the data according to the data set’s grouping scheme.

This sets the grouping flag which means that the value of the grouping attribute will be used when accessing data values. This can be called even if the grouping attribute is empty.

See also

ungroup

group_adapt(minimum, maxLength=None, tabStops=None)

Adaptively group to a minimum number of counts.

Combine the data so that each bin contains minimum or more counts. Unlike group_counts, this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features, rather than at the first channel of the data. The adaptive nature means that low-count regions between bright features may not end up in groups with the minimum number of counts. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters
  • minimum (int) – The minimum number of counts required in each group.

  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.

  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt_snr

Adaptively group to a minimum signal-to-noise ratio.

group_bins

Group into a fixed number of bins.

group_counts

Group into a minimum number of counts per bin.

group_snr

Group into a minimum signal-to-noise ratio.

group_width

Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

group_adapt_snr(minimum, maxLength=None, tabStops=None, errorCol=None)

Adaptively group to a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds minimum. Unlike group_snr, this algorithm starts with the bins with the largest signal, in order to avoid over-grouping bright features, rather than at the first channel of the data. The adaptive nature means that low-count regions between bright features may not end up in groups with the minimum signal-to-noise ratio. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters
  • minimum (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.

  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.

  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

  • errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt

Adaptively group to a minimum number of counts.

group_bins

Group into a fixed number of bins.

group_counts

Group into a minimum number of counts per bin.

group_snr

Group into a minimum signal-to-noise ratio.

group_width

Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.

group_bins(num, tabStops=None)

Group into a fixed number of bins.

Combine the data so that there are num equal-width bins (or groups). The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters
  • num (int) – The number of bins in the grouped data set. Each bin will contain the same number of channels.

  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt

Adaptively group to a minimum number of counts.

group_adapt_snr

Adaptively group to a minimum signal-to-noise ratio.

group_counts

Group into a minimum number of counts per bin.

group_snr

Group into a minimum signal-to-noise ratio.

group_width

Group into a fixed bin width.

Notes

Since the bin width is an integer number of channels, it is likely that some channels will be “left over”. This is even more likely when the tabStops parameter is set. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
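The "left over" arithmetic described above can be sketched as follows. This is an illustration only, not the Sherpa implementation: the helper name bins_layout is hypothetical, and it assumes the bin width is obtained by integer (floor) division with no tabStops set.

```python
def bins_layout(nchannels, num):
    """Return (channels per bin, number of left-over channels).

    Hypothetical helper: assumes the width is nchannels // num.
    """
    if num < 1 or num > nchannels:
        raise ValueError("num must be between 1 and nchannels")
    width = nchannels // num            # integer bin width, in channels
    leftover = nchannels - width * num  # channels that fit in no bin
    return width, leftover


print(bins_layout(1024, 10))  # (102, 4)
print(bins_layout(1024, 8))   # (128, 0)
```

Under this assumption, asking for 10 bins from 1024 channels leaves 4 channels over, and those are the channels that would be flagged with a quality value of 2.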

group_counts(num, maxLength=None, tabStops=None)

Group into a minimum number of counts per bin.

Combine the data so that each bin contains num or more counts. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set.

Parameters
  • num (int) – The minimum number of counts that each group should contain.

  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.

  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt

Adaptively group to a minimum number of counts.

group_adapt_snr

Adaptively group to a minimum signal-to-noise ratio.

group_bins

Group into a fixed number of bins.

group_snr

Group into a minimum signal-to-noise ratio.

group_width

Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
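A minimal sketch of grouping to a minimum number of counts, starting at the first channel, is given below. It is not the code Sherpa uses: the function name group_min_counts is hypothetical, maxLength and tabStops are ignored, and it assumes the OGIP convention in which the grouping array is 1 at the start of a group and -1 for a continuation channel.

```python
def group_min_counts(counts, num):
    """Group channels left to right until each group holds >= num counts.

    Returns (grouping, quality). Channels in an incomplete trailing
    group are given a quality of 2, as described in the Notes above.
    """
    n = len(counts)
    grouping = [0] * n
    quality = [0] * n
    start = 0   # index of the first channel in the current group
    total = 0   # counts accumulated in the current group
    for i, c in enumerate(counts):
        grouping[i] = 1 if i == start else -1
        total += c
        if total >= num:
            # The group is complete, so the next channel starts a new one.
            start = i + 1
            total = 0
    # Channels after the last complete group could not be placed.
    for i in range(start, n):
        quality[i] = 2
    return grouping, quality


grouping, quality = group_min_counts([3, 4, 5, 12, 1, 2], 10)
print(grouping)  # [1, -1, -1, 1, 1, -1]
print(quality)   # [0, 0, 0, 0, 2, 2]
```

The last two channels only hold 3 counts between them, so they form an incomplete group and are flagged rather than silently merged.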

group_snr(snr, maxLength=None, tabStops=None, errorCol=None)

Group into a minimum signal-to-noise ratio.

Combine the data so that each bin has a signal-to-noise ratio which exceeds snr. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped. The background is not included in this calculation; the calculation is done on the raw data even if subtract has been called on this data set.

Parameters
  • snr (number) – The minimum signal-to-noise ratio that must be exceeded to form a group of channels.

  • maxLength (int, optional) – The maximum number of channels that can be combined into a single group.

  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

  • errorCol (array of num, optional) – If set, the error to use for each channel when calculating the signal-to-noise ratio. If not given then Poisson statistics is assumed. A warning is displayed for each zero-valued error estimate.

See also

group_adapt

Adaptively group to a minimum number of counts.

group_adapt_snr

Adaptively group to a minimum signal-to-noise ratio.

group_bins

Group into a fixed number of bins.

group_counts

Group into a minimum number of counts per bin.

group_width

Group into a fixed bin width.

Notes

If channels can not be placed into a “valid” group, then a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
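To make the signal-to-noise test concrete, the sketch below computes the ratio for a candidate group of channels. The function name group_snr_value is hypothetical, and combining the per-channel errors in quadrature is an assumption made for illustration rather than a statement of how Sherpa calculates it. When no errors are given, Poisson statistics are assumed (the error on each channel is the square root of its counts).

```python
import math


def group_snr_value(counts, errors=None):
    """Signal-to-noise ratio of a group of channels.

    Assumption: the noise is the quadrature sum of the channel errors.
    """
    if errors is None:
        # Poisson statistics: variance equals the number of counts.
        errors = [math.sqrt(c) for c in counts]
    signal = sum(counts)
    noise = math.sqrt(sum(e * e for e in errors))
    if noise == 0:
        return 0.0  # avoid dividing by zero for empty or zero-error groups
    return signal / noise


print(group_snr_value([9, 16]))          # Poisson case: 25 / sqrt(25)
print(group_snr_value([9, 16], [6, 8]))  # explicit errors: 25 / 10
```

With Poisson errors the ratio reduces to the square root of the summed counts, so under this assumption a group exceeds a given snr once it holds more than snr squared counts.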

group_width(val, tabStops=None)

Group into a fixed bin width.

Combine the data so that each bin contains val channels. The binning scheme is applied to all the channels, but any existing filter - created by the ignore or notice set of functions - is re-applied after the data has been grouped.

Parameters
  • val (int) – The number of channels to combine into a group.

  • tabStops (array of int or bool, optional) – If set, indicate one or more ranges of channels that should not be included in the grouped output. The array should match the number of channels in the data set and non-zero or True means that the channel should be ignored from the grouping (use 0 or False otherwise).

See also

group_adapt

Adaptively group to a minimum number of counts.

group_adapt_snr

Adaptively group to a minimum signal-to-noise ratio.

group_bins

Group into a fixed number of bins.

group_counts

Group into a minimum number of counts per bin.

group_snr

Group into a minimum signal-to-noise ratio.

Notes

Unless the requested bin width is a factor of the number of channels (and no tabStops parameter is given), some channels will be “left over”. If this happens, a warning message will be displayed to the screen and the quality value for these channels will be set to 2.
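The effect of a fixed bin width on the grouping and quality arrays can be sketched as below. This is a simplified illustration, not Sherpa's implementation: the function name group_fixed_width is hypothetical, tabStops is not handled, and the grouping values follow the OGIP convention (1 for the first channel of a group, -1 for the rest). How the left-over channels are labelled in the grouping array is also an assumption.

```python
def group_fixed_width(nchannels, val):
    """Return (grouping, quality) for groups of val channels each."""
    if val < 1:
        raise ValueError("val must be a positive integer")
    nfull = (nchannels // val) * val  # channels covered by complete groups
    grouping = []
    quality = []
    for i in range(nchannels):
        grouping.append(1 if i % val == 0 else -1)
        # Channels past the last complete group are flagged as left over.
        quality.append(0 if i < nfull else 2)
    return grouping, quality


grouping, quality = group_fixed_width(10, 4)
print(grouping)  # [1, -1, -1, -1, 1, -1, -1, -1, 1, -1]
print(quality)   # [0, 0, 0, 0, 0, 0, 0, 0, 2, 2]
```

Here 4 is not a factor of 10, so the final two channels cannot form a complete group and receive a quality value of 2.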

ignore(*args, **kwargs)
ignore_bad()

Exclude channels marked as bad.

Ignore any bin in the PHA data set which has a quality value that is not equal to zero.

Raises

sherpa.utils.err.DataErr – If the data set has no quality array.

See also

ignore, notice

Notes

Bins with a non-zero quality setting are not automatically excluded when a data set is created.

If the data set has been grouped, then calling ignore_bad will remove any filter applied to the data set. If this happens a warning message will be displayed.
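The selection rule can be sketched in a few lines. The helper good_channel_mask is hypothetical and uses ValueError as a stand-in for the DataErr raised by Sherpa, so that the example has no dependencies.

```python
def good_channel_mask(quality):
    """Return True for each channel whose quality value is zero."""
    if quality is None:
        # Stand-in for sherpa.utils.err.DataErr in this sketch.
        raise ValueError("data set has no quality array")
    return [q == 0 for q in quality]


print(good_channel_mask([0, 0, 2, 5, 0]))
# [True, True, False, False, True]
```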

notice(lo=None, hi=None, ignore=False, bkg_id=None)

Notice or ignore the given range.

Changed in version 4.14.0: PHA filtering has been improved to fix a number of corner cases, which can result in the same filter now selecting one or two fewer channels than in earlier versions of Sherpa. The lo and hi arguments are now restricted based on the units setting.

Parameters
  • lo (number or None, optional) – The range to change. A value of None means the minimum or maximum permitted value. The units of lo and hi are set by the units field.

  • hi (number or None, optional) – The range to change. A value of None means the minimum or maximum permitted value. The units of lo and hi are set by the units field.

  • ignore (bool, optional) – Set to True if the range should be ignored. The default is to notice the range.

  • bkg_id (int or sequence of int or None, optional) – If not None then apply the filter to the given background dataset or datasets, otherwise change the object and all its background datasets.

Notes

Calling notice with no arguments selects all points in the dataset (or, if ignore=True, it will remove all points).

If no channels have been ignored then a call to notice with ignore=False will select just the lo to hi range, and exclude any channels outside this range. If a filter has already been applied then the range lo to hi will be added to the range of noticed data (when ignore=False). One consequence of the above is that if the first call to notice (with ignore=False) selects a range outside the data set - such as a channel range of 2000 to 3000 when the valid range is 1 to 1024 - then all points will be ignored.

When filtering with channel units then:

  • the lo and hi arguments, if set, must be integers,

  • and the lo and hi values are inclusive.

For energy and wavelength filters:

  • the lo and hi arguments, if set, must be >= 0,

  • and the lo limit is inclusive but the hi limit is exclusive.
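The rules for channel units can be modelled with a small stand-alone class. This is a toy sketch, not Sherpa code: the class name ChannelFilter is hypothetical, only channel units with inclusive limits are handled, and grouping and backgrounds are ignored.

```python
class ChannelFilter:
    """Toy model of notice for channel units (inclusive limits)."""

    def __init__(self, nchannels):
        self.channels = list(range(1, nchannels + 1))
        self.mask = [True] * nchannels  # True means the channel is noticed

    def notice(self, lo=None, hi=None, ignore=False):
        if lo is None and hi is None:
            # No range given: select everything, or nothing when ignoring.
            self.mask = [not ignore] * len(self.channels)
            return
        lo = self.channels[0] if lo is None else lo
        hi = self.channels[-1] if hi is None else hi
        in_range = [lo <= c <= hi for c in self.channels]
        if ignore:
            self.mask = [m and not r for m, r in zip(self.mask, in_range)]
        elif all(self.mask):
            # Nothing has been ignored yet: keep only the requested range.
            self.mask = in_range
        else:
            # A filter already exists: add the range to the noticed data.
            self.mask = [m or r for m, r in zip(self.mask, in_range)]

    def get_filter(self):
        """Return the noticed channels as comma-separated lo:hi ranges."""
        ranges = []
        start = prev = None
        for c, m in zip(self.channels, self.mask):
            if m:
                if start is None:
                    start = c
                prev = c
            elif start is not None:
                ranges.append(f"{start}:{prev}")
                start = None
        if start is not None:
            ranges.append(f"{start}:{prev}")
        return ",".join(ranges)


flt = ChannelFilter(1024)
flt.notice(20, 200)
flt.notice(300, 500)
print(flt.get_filter())  # 20:200,300:500

empty = ChannelFilter(1024)
empty.notice(2000, 3000)   # first filter lies outside the data
print(repr(empty.get_filter()))  # ''
```

The second case shows the consequence noted above: a first call that selects a range outside the data set leaves no channels noticed.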

Examples

So, for an ungrouped PHA file with 1024 channels:

>>> pha.units = 'channel'
>>> pha.get_filter()
'1:1024'
>>> pha.notice(20, 200)
>>> pha.get_filter()
'20:200'
>>> pha.notice(300, 500)
>>> pha.get_filter()
'20:200,300:500'

Calling notice with no arguments removes all the filters:

>>> pha.notice()
>>> pha.get_filter()
'1:1024'

Ignore the first 30 channels (this is the same as calling pha.ignore(hi=30)):

>>> pha.notice(hi=30, ignore=True)
>>> pha.get_filter()
'31:1024'

When using wavelength or energy units the noticed (or ignored) range will not always match the requested range because each channel has a finite width in these spaces:

>>> pha.grouped
True
>>> pha.get_analysis()
'energy'
>>> pha.notice()
>>> pha.notice(0.5, 7)
>>> pha.get_filter(format='%.3f')
'0.467:9.870'

notice_response(notice_resp=True, noticed_chans=None)
set_analysis(quantity, type='rate', factor=0)

Set the units used when fitting and plotting spectral data.

Parameters
  • quantity ({'channel', 'energy', 'wavelength'}) – The analysis setting.

  • type ({'rate', 'counts'}, optional) – Do plots display a rate or show counts?

  • factor (int, optional) – The Y axis of plots is multiplied by Energy^factor or Wavelength^factor before display. The default is 0.

Raises

sherpa.utils.err.DataErr – If the type argument is invalid, the RMF or ARF has the wrong size, or there is no response.

See also

get_analysis

Examples

>>> pha.set_analysis('energy')
>>> pha.set_analysis('wave', type='counts', factor=1)
>>> pha.units
'wavelength'

set_arf(arf, id=None)

Add or replace the ARF in a response component.

This replaces the existing ARF of the response, keeping the previous RMF (if set). Use the delete_response method to remove the response, rather than setting arf to None.

Parameters
  • arf (sherpa.astro.data.DataARF instance) – The ARF to add.

  • id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

set_background(bkg, id=None)

Add or replace a background component.

If the background has no grouping or quality arrays then they are copied from the source region. If the background has no response information (ARF or RMF) then the response is copied from the source region.

Parameters
  • bkg (sherpa.astro.data.DataPHA instance) – The background dataset to add. This object may be changed by this method.

  • id (int or str, optional) – The identifier of the background component. If it is None then the default background identifier is used.

Notes

If the PHA header does not have the TELESCOP, INSTRUME, or FILTER header keywords set (or they are set to “none”), then they are taken from the background, if they are not set to “none”. This is to allow simulated data sets to be used with external programs, such as XSPEC.

set_dep(val)

Set the dependent variable values.

Parameters

val

set_indep(val)
set_response(arf=None, rmf=None, id=None)

Add or replace a response component.

To remove a response use delete_response(), as setting arf and rmf to None here does nothing.

Parameters
  • arf (sherpa.astro.data.DataARF instance or None, optional) – The ARF to add if any.

  • rmf (sherpa.astro.data.DataRMF instance or None, optional) – The RMF to add, if any.

  • id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

Notes

If the PHA header does not have the TELESCOP, INSTRUME, or FILTER header keywords set (or they are set to “none”), then they are taken from the ARF or RMF, if they are not set to “none”. This is to allow simulated data sets to be used with external programs, such as XSPEC.
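The keyword rule in this note can be sketched with plain dictionaries standing in for the headers. The function fill_instrument_keywords is hypothetical, and treating "none" case-insensitively is an assumption made for this example.

```python
def fill_instrument_keywords(pha_header, other_header):
    """Copy TELESCOP, INSTRUME and FILTER when the PHA lacks them."""

    def is_unset(value):
        # Assumption: a missing key and any casing of "none" are unset.
        return value is None or str(value).lower() == "none"

    for key in ("TELESCOP", "INSTRUME", "FILTER"):
        if not is_unset(pha_header.get(key)):
            continue  # the PHA already has a real value, so keep it
        value = other_header.get(key)
        if not is_unset(value):
            pha_header[key] = value
    return pha_header


pha = {"TELESCOP": "none", "INSTRUME": "ACIS"}
arf = {"TELESCOP": "CHANDRA", "INSTRUME": "HRC", "FILTER": "none"}
print(fill_instrument_keywords(pha, arf))
# {'TELESCOP': 'CHANDRA', 'INSTRUME': 'ACIS'}
```

Only TELESCOP is copied here: INSTRUME is already set in the PHA header, and FILTER is "none" in the response, so there is nothing useful to take.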

set_rmf(rmf, id=None)

Add or replace the RMF in a response component.

This replaces the existing RMF of the response, keeping the previous ARF (if set). Use the delete_response method to remove the response, rather than setting rmf to None.

Parameters
  • rmf (sherpa.astro.data.DataRMF instance) – The RMF to add.

  • id (int or str, optional) – The identifier of the response component. If it is None then the default response identifier is used.

subtract()

Subtract the background data.

See also

unsubtract

sum_background_data(get_bdata_func=<function DataPHA.<lambda>>)

Sum up data, applying the background correction value.

Parameters

get_bdata_func (function, optional) – What data should be used for each background dataset. The function takes the background identifier and background DataPHA object and returns the data to use. The default is to use the counts array of the background dataset.

Returns

value – The sum of the data, including any area, background, and exposure-time corrections.

Return type

scalar or NumPy array

Notes

For each associated background, the data is retrieved (via the get_bdata_func parameter), and then

  • divided by its BACKSCAL value (if set)

  • divided by its AREASCAL value (if set)

  • divided by its exposure time (if set)

The individual background components are then summed together, and then multiplied by the source BACKSCAL (if set), multiplied by the source AREASCAL (if set), and multiplied by the source exposure time (if set). The final step is to divide by the number of background files used.
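The sequence of corrections described above can be written out as a short sketch. It follows the steps in these notes but is not the Sherpa source: the function name scaled_background and the dictionary layout are hypothetical, and scalar values are used for brevity where the real method can also work with per-channel arrays.

```python
def scaled_background(backgrounds, src_backscal=None,
                      src_areascal=None, src_exposure=None):
    """Scale and average background data to match the source.

    backgrounds is a list of dicts with a "data" value and optional
    "backscal", "areascal" and "exposure" entries (None means not set).
    """
    total = 0.0
    for bkg in backgrounds:
        value = bkg["data"]
        # Divide by each background scaling that is set.
        for key in ("backscal", "areascal", "exposure"):
            scale = bkg.get(key)
            if scale is not None:
                value = value / scale
        total += value
    # Multiply by each source scaling that is set.
    for scale in (src_backscal, src_areascal, src_exposure):
        if scale is not None:
            total *= scale
    # Average over the number of background components.
    return total / len(backgrounds)


bkgs = [{"data": 1, "backscal": 4.0, "exposure": 2000.0}]
print(scaled_background(bkgs, src_backscal=1.0, src_exposure=1000.0))
```

In this example the background region is four times the size of the source region and was observed for twice as long, so each background count corresponds to roughly 0.125 counts in the source.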

Example

Calculate the background counts, per channel, scaled to match the source:

>>> bcounts = src.sum_background_data()

Calculate the scaling factor by which the background data must be multiplied to match the source data. In this case the background data has been replaced by the value 1 (rather than the per-channel values used with the default argument):

>>> bscale = src.sum_background_data(lambda k, d: 1)

to_component_plot(yfunc=None, staterrfunc=None)
to_fit(staterrfunc=None)
to_guess()
to_plot(yfunc=None, staterrfunc=None, response_id=None)
ungroup()

Remove any data grouping.

This un-sets the grouping flag which means that the grouping attribute will not be used when accessing data values.

See also

group

unsubtract()

Remove background subtraction.

See also

subtract