get_ascii_data

sherpa.io.get_ascii_data(filename: str, ncols: int = 1, colkeys: typing.Sequence[str] | None = None, sep: str = ' ', dstype: type = <class 'sherpa.data.Data1D'>, comment: str = '#', require_floats: bool = True) → tuple[list[str], list[ndarray], str]

Read in columns from an ASCII file.

Parameters:
  • filename (str) – The name of the ASCII file to read in.

  • ncols (int, optional) – The number of columns to read in (the first ncols columns in the file). This is ignored if colkeys is given.

  • colkeys (array of str, optional) – The names of the columns to read in. The default is None.

  • sep (str, optional) – The separator character. The default is ' '.

  • dstype (data class to use, optional) – Used to check that the data file contains enough columns.

  • comment (str, optional) – The comment character. The default is '#'.

  • require_floats (bool, optional) – If True (the default), non-numeric data values will raise a ValueError.

Returns:

The column names read in, the data for the columns as a list of arrays (one array per column, in the same order as colnames), and the name of the file.

Return type:

(colnames, coldata, filename)

Raises:
  • sherpa.utils.err.IOErr – Raised if a requested column is missing or the file appears to be a binary file.

  • ValueError – Raised if a column value cannot be converted into a numeric value and the require_floats parameter is True.

Notes

The file is processed by reading in each line, stripping out any unsupported characters (replacing them by the sep argument), skipping empty lines, and then identifying comment and data lines.

The unsupported characters are: \t, \n, \r, comma, semi-colon, colon, space, and |.
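The cleaning step described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not Sherpa's implementation: the helper name clean_lines, the UNSUPPORTED constant, and the choice to trim leading and trailing separators are assumptions made for this example.

```python
# Illustrative sketch only: clean_lines is a hypothetical helper, not part of Sherpa.
UNSUPPORTED = "\t\n\r,;: |"  # the characters listed in the Notes


def clean_lines(text, sep=" ", comment="#"):
    """Return (comment_lines, data_lines) after normalising separators."""
    # Map every unsupported character onto the separator.
    table = str.maketrans({ch: sep for ch in UNSUPPORTED})
    comments, data = [], []
    for raw in text.splitlines():
        # Assumption: leading and trailing separators carry no information.
        line = raw.translate(table).strip(sep)
        if not line:
            continue  # skip empty lines
        if line.startswith(comment):
            comments.append(line)
        else:
            data.append(line)
    return comments, data


comments, data = clean_lines("# X Y\n\n1,2\n3;4\n")
```

With the default arguments, the comma and semi-colon in the two data lines are both replaced by a space, so each line can then be split on sep.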

The last comment line before the data is used to define the column names, splitting the line by the sep argument. If there are no comment lines then the columns are named col1, col2, and so on, up to the number of columns.
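The naming rule can be illustrated with a short sketch. The function column_names below is hypothetical and is not Sherpa code; it assumes the comment lines have already been collected, and that empty tokens produced by the split are dropped.

```python
# Illustrative sketch only: column_names is a hypothetical helper, not part of Sherpa.
def column_names(comment_lines, ncols, sep=" ", comment="#"):
    """Derive column names from the last comment line, or fall back to colN."""
    if not comment_lines:
        # No header information: col1, col2, ..., up to the number of columns.
        return [f"col{i}" for i in range(1, ncols + 1)]
    # Drop the leading comment marker, then split on the separator.
    header = comment_lines[-1][len(comment):]
    # Assumption: empty tokens (from repeated separators) are ignored.
    return [name for name in header.split(sep) if name]


named = column_names(["# made by a script", "# XLO XHI Y"], ncols=3)
unnamed = column_names([], ncols=3)
```

Only the final comment line contributes names here, so earlier comment lines can hold free-form text.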

Data lines are separated into columns - splitting by the sep argument - and then converted to NumPy arrays. If the require_floats argument is True then each column is converted to a floating-point type, with an error raised if this fails.

An error is raised if the number of columns per row is not constant.
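A minimal sketch of the row-to-column step follows, under stated assumptions: parse_columns is a hypothetical name, plain Python lists stand in for the NumPy arrays that the real function returns, and ValueError is used for the inconsistent-row case because the text above does not say which exception type Sherpa raises there.

```python
# Illustrative sketch only: parse_columns is a hypothetical helper, not part of Sherpa.
def parse_columns(data_lines, sep=" ", require_floats=True):
    """Turn cleaned data lines into a list of columns."""
    rows = [[tok for tok in line.split(sep) if tok] for line in data_lines]
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        # Assumption: the exception type is chosen for this sketch only.
        raise ValueError(f"rows have differing column counts: {sorted(widths)}")
    # Transpose the rows so that each element holds one column.
    columns = [list(col) for col in zip(*rows)]
    if require_floats:
        # float() raises ValueError for a non-numeric token.
        columns = [[float(value) for value in col] for col in columns]
    return columns


numeric = parse_columns(["1 2", "3 4"])
mixed = parse_columns(["a 1", "b 2"], require_floats=False)
```

With require_floats=False the values are left as strings in this sketch; how Sherpa types such columns is not covered here.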

If the colkeys argument is used then a case-sensitive match is used to determine what columns to return.
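The case-sensitive lookup might look like the following sketch. The name select_columns is hypothetical, KeyError stands in for the sherpa.utils.err.IOErr that the real function raises for a missing column, and returning the columns in the requested order is an assumption of this example.

```python
# Illustrative sketch only: select_columns is a hypothetical helper, not part of Sherpa.
def select_columns(colnames, coldata, colkeys):
    """Pick the requested columns by exact (case-sensitive) name."""
    lookup = dict(zip(colnames, coldata))
    missing = [key for key in colkeys if key not in lookup]
    if missing:
        # Stand-in for the IOErr raised by get_ascii_data.
        raise KeyError(f"missing columns: {missing}")
    return list(colkeys), [lookup[key] for key in colkeys]


names, data = select_columns(
    ["XLO", "XHI", "Y"], [[0.0, 1.0], [1.0, 2.0], [5.0, 7.0]], ["XLO", "Y"]
)
```

Because the match is exact, asking for 'xlo' when the file defines XLO fails rather than silently matching.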

Examples

Read in the first column from the file:

>>> (colnames, coldata, fname) = get_ascii_data('src.dat')

Read in the first three columns from the file:

>>> colinfo = get_ascii_data('src.dat', ncols=3)

Read in a histogram data set, using the columns XLO, XHI, and Y:

>>> cols = ['XLO', 'XHI', 'Y']
>>> res = get_ascii_data('hist.dat', colkeys=cols,
...                      dstype=sherpa.data.Data1DInt)

Read in the first and third column from the file cols.dat, where the file has no header information:

>>> res = get_ascii_data('cols.dat', colkeys=['col1', 'col3'])