desidatamodel API

desidatamodel

This package provides support for the DESI Data Model.

exception desidatamodel.DataModelError[source]

Errors related to missing or malformed data model files, etc.

exception desidatamodel.DataModelWarning[source]

Warnings related to missing or malformed data model files, etc.

desidatamodel.check

Check actual files against the data model for validity.

class desidatamodel.check.DataModel(filename, section)[source]

Simple object to store data model data and metadata.

Parameters:
  • filename (str or pathlib.Path) – The full path of the data model file.

  • section (str or pathlib.Path) – The full path to the section of the data model containing the file.

Raises:

TypeError – If filename or section has an unexpected type.

_cross_reference(line)[source]

Obtain the path to a file referred to in another file.

Parameters:

line (str) – Line from original file that is the cross-reference.

Returns:

The path to the referenced file.

Return type:

str

_extract_columns(row, columns)[source]

Given column sizes, extract the data in each column.

Assumes a reStructuredText-compatible table.

Parameters:
  • row (str) – A table row.

  • columns (list) – The sizes of the columns.

Returns:

A tuple containing the extracted data.

Return type:

tuple()
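
As a rough illustration of the idea (not the package's actual implementation), a row of a reStructuredText simple table can be sliced using the column widths. The function name `extract_columns` and the single-space separator between columns are assumptions made for this sketch.

```python
def extract_columns(row, columns):
    """Slice ``row`` into fields using the widths listed in ``columns``.

    Assumption: columns are separated by exactly one space, as in a
    reStructuredText simple table. The last column takes the rest of the row.
    """
    fields = []
    start = 0
    last = len(columns) - 1
    for i, width in enumerate(columns):
        if i == last:
            # The final column may run past its nominal width.
            fields.append(row[start:].strip())
        else:
            fields.append(row[start:start + width].strip())
        start += width + 1  # skip the one-space separator
    return tuple(fields)


row = "blat int  s     biz bat bar"
print(extract_columns(row, [4, 4, 5, 11]))
# ('blat', 'int', 's', 'biz bat bar')
```

Blank cells come back as empty strings, so a row with no units still yields a tuple of the same length.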

_type_size(line)[source]

Obtain file type and size from a matching line.

Parameters:

line (str) – Line from file that contains the type and size.

Returns:

A tuple containing the type and size.

Return type:

tuple

extract_metadata(error=False)[source]

Extract metadata from a data model file.

Parameters:

error (bool, optional) – If True, failure to extract certain required metadata raises an exception.

Returns:

Metadata in a form similar to Stub metadata. The keys are the EXTNAME header values.

Return type:

dict

Raises:

DataModelError – If error is set and the HDU has no EXTNAME keyword.

get_regexp(root, error=False)[source]

Obtain the regular expression used to match files on disk.

Also internally updates the file type, if detected.

Parameters:
  • root (str) – Path to real files on disk.

  • error (bool, optional) – If True, failure to find a regular expression raises an exception instead of just a warning.

Returns:

The regular expression found, or None if not found. The regular expression is also stored internally.

Return type:

regular expression

Raises:

DataModelError – If error is set and problems with the data model file are detected.

validate_prototype(error=False, skip_keywords=False)[source]

Compares a model’s prototype data file to the data models.

Parameters:
  • error (bool, optional) – If True, failure to extract certain required metadata raises an exception.

  • skip_keywords (bool, optional) – If True, don’t check FITS header keywords.

Notes

  • Use set theory to compare the data headers to model headers. This should automatically find missing headers, extraneous headers, etc.
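
The set-based comparison mentioned in the note can be sketched roughly as follows. This is an illustration only, not the method's real code; the function name `compare_keywords` and the result keys are hypothetical.

```python
def compare_keywords(data_keywords, model_keywords):
    """Compare header keywords from a data file with those in a data model.

    Hypothetical helper: set differences expose keywords present on one
    side only, and the intersection gives the keywords both sides share.
    """
    data = set(data_keywords)
    model = set(model_keywords)
    return {
        'only_in_model': sorted(model - data),  # documented but absent from the file
        'only_in_data': sorted(data - model),   # in the file but undocumented
        'common': sorted(data & model),
    }


result = compare_keywords(['NAXIS1', 'EXPID', 'NIGHT'],
                          ['EXPID', 'NIGHT', 'TILEID'])
print(result['only_in_model'])  # ['TILEID']
print(result['only_in_data'])   # ['NAXIS1']
```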

desidatamodel.check._options()[source]

Parse command-line options.

Returns:

The parsed options.

Return type:

Namespace

desidatamodel.check.collect_files(root, files, n_prototypes=5)[source]

Scan a directory tree for files that correspond to data model files.

Parameters:
  • root (str) – Path to real files on disk.

  • files (list) – A list of data model files.

  • n_prototypes (int, optional) – Save up to n_prototypes possible prototype files, in case the first one is bad. Defaults to 5.

Notes

Files are analyzed using this algorithm:

  • The first n_prototypes files that match a regexp become the ‘prototype candidates’ for that data model file. The first candidate that can be opened cleanly is the ‘prototype’.

  • If no files match a data model file, then files of that type are ‘missing’.

  • If a file does not match any regular expression, it is ‘extraneous’.

  • If a file matches a regular expression that already has a prototype, it is ‘ignored’.
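
The classification rules above can be sketched roughly like this. It is a simplified illustration, not the function's real code: `classify_files`, the `regexps` mapping and the `can_open` callback are hypothetical, and a file is treated as 'ignored' once the candidate list for its regular expression is full.

```python
import re


def classify_files(filenames, regexps, n_prototypes=5, can_open=lambda f: True):
    """Sort files into prototypes, missing, extraneous and ignored.

    ``regexps`` maps a data model name to a compiled regular expression;
    ``can_open`` stands in for "the file can be opened cleanly".
    """
    candidates = {name: [] for name in regexps}
    extraneous, ignored = [], []
    for filename in filenames:
        for name, regexp in regexps.items():
            if regexp.match(filename) is not None:
                if len(candidates[name]) < n_prototypes:
                    candidates[name].append(filename)
                else:
                    ignored.append(filename)
                break
        else:
            # No regular expression matched this file.
            extraneous.append(filename)
    prototypes, missing = {}, []
    for name, files in candidates.items():
        if not files:
            missing.append(name)
        else:
            # The first candidate that opens cleanly becomes the prototype.
            prototypes[name] = next((f for f in files if can_open(f)), None)
    return prototypes, missing, extraneous, ignored


regexps = {'a': re.compile(r'a-\d+\.fits$'), 'b': re.compile(r'b-\d+\.fits$')}
files = ['a-1.fits', 'a-2.fits', 'a-3.fits', 'README']
print(classify_files(files, regexps, n_prototypes=2,
                     can_open=lambda f: f != 'a-1.fits'))
```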

desidatamodel.check.files_to_regexp(root, files, error=False)[source]

Convert a list of data model files into a list of regular expressions.

Parameters:
  • root (str) – Path to real files on disk.

  • files (list) – List of files obtained from the data model.

  • error (bool, optional) – If True, failure to find a regular expression raises an exception instead of just a warning.

Raises:

DataModelError – If error is set and data model files with malformed regular expressions are detected.

desidatamodel.check.main()[source]

Entry point for the check_model script.

Returns:

An integer suitable for passing to sys.exit().

Return type:

int

desidatamodel.check.scan_model(section)[source]

Find all data model files in a top-level directory.

Parameters:

section (str) – Full path to a section of the data model.

Returns:

The data model files found.

Return type:

list

desidatamodel.check.validate_prototypes(files, error=False, skip_keywords=False)[source]

Compares a set of prototype data files to their data models.

Parameters:
  • files (list) – A list of data model files.

  • error (bool, optional) – If True, failure to extract certain required metadata raises an exception.

  • skip_keywords (bool, optional) – If True, don’t check FITS header keywords.

Notes

  • Use set theory to compare the data headers to model headers. This should automatically find missing headers, extraneous headers, etc.

desidatamodel.columns

Render the standard column descriptions file.

desidatamodel.columns.format_columns(rows)[source]

Compute the formatting needed to render rows as a table.

Parameters:

rows (iterable) – An iterable containing rows with any number of columns.

Returns:

A tuple containing a format string, and an RST-style table separator.

Return type:

tuple

desidatamodel.columns.main()[source]

Entry point for command-line scripts.

Returns:

An integer suitable for passing to sys.exit().

Return type:

int

desidatamodel.scan

Deep scan available files to obtain a comprehensive set of metadata.

class desidatamodel.scan.UnionStub(model, count, error=False)[source]

Container for unified metadata for both existing models and data files.

Initialize the metadata with a DataModel object, then add additional Stub metadata.

Parameters:
  • model (DataModel) – A data model file object.

  • count (int) – Number of files that will be examined. This is used to determine whether a keyword or column is mandatory, optional or unused.

  • error (bool, optional) – If True, failure to extract certain required metadata raises an exception.

mark_optional()[source]

Mark the keywords and columns that do not appear in every file as optional.

update(hdu, data, columns=False)[source]

Search for missing keywords or columns in hdu and add them if necessary.

Parameters:
  • hdu (int) – The HDU number.

  • data (list) – List of keywords or columns to compare to the internal set.

  • columns (bool, optional) – If True, data represents BINTABLE columns, rather than keywords.
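
To illustrate how a file count can separate mandatory, optional and unused entries, here is a minimal sketch. It is not the class's real logic; `classify_keywords` and its inputs are hypothetical, and the same idea would apply to BINTABLE columns.

```python
from collections import Counter


def classify_keywords(model_keywords, file_keywords):
    """Label each keyword as 'mandatory', 'optional' or 'unused'.

    ``file_keywords`` holds one list of keywords per examined file.
    A keyword seen in every file is mandatory, one seen in only some
    files is optional, and one seen in none of them is unused.
    """
    n_files = len(file_keywords)
    seen = Counter()
    for keywords in file_keywords:
        seen.update(set(keywords))  # count each keyword once per file
    status = {}
    for key in set(model_keywords) | set(seen):
        if n_files > 0 and seen[key] == n_files:
            status[key] = 'mandatory'
        elif seen[key] > 0:
            status[key] = 'optional'
        else:
            status[key] = 'unused'
    return status


status = classify_keywords(['EXPID', 'NIGHT', 'OLDKEY'],
                           [['EXPID', 'NIGHT'], ['EXPID', 'EXTRA']])
print(status['EXPID'], status['NIGHT'], status['OLDKEY'], status['EXTRA'])
# mandatory optional unused optional
```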

desidatamodel.scan._options()[source]

Parse command-line options.

Returns:

The parsed options.

Return type:

Namespace

desidatamodel.scan.collect_files(root, model)[source]

Scan a directory tree for all files that correspond to a data model file.

Parameters:
  • root (str) – Path to real files on disk.

  • model (DataModel) – A data model file object.

Returns:

All files in root that match model.

Return type:

list

desidatamodel.scan.main()[source]

Entry point for the deep_scan_metadata script.

Returns:

An integer suitable for passing to sys.exit().

Return type:

int

desidatamodel.scan.union_metadata(model, stubs, error=False)[source]

Combine all HDU metadata from model and stubs.

Parameters:
  • model (DataModel) – The initial data model.

  • stubs (list) – A list of Stub objects.

  • error (bool, optional) – If True, failure to extract certain required metadata raises an exception.

Returns:

A new Stub object containing the unified metadata of all the inputs.

Return type:

Stub

desidatamodel.stub

Generate data model files from FITS files.

class desidatamodel.stub.Stub(filename, description_file=None, error=False)[source]

This object contains metadata about a file and methods to print that metadata.

Parameters:
  • filename (file path, file-like object or HDUList) – Data file to convert to a data model file.

  • error (bool, optional) – If True, failure to extract certain required metadata raises an exception.

columns_header

The header of a table summarizing the columns of a BINTABLE HDU.

Type:

tuple()

contents_header

The header of a table summarizing the HDUs.

Type:

tuple()

filename

Name of the file.

Type:

str

headers

The HDUs read from the file.

Type:

list

keywords_header

The header of a table listing interesting FITS keywords.

Type:

tuple()

nhdr

Number of HDUs.

Type:

int

property basef

Base name of the file.

colformat(sizes)[source]

Return a string ready to be formatted.

Parameters:

sizes (list) – The width of each column.

Returns:

A string with format characters.

Return type:

str

colsizes(table)[source]

Compute the size (number of characters) of each column in a table.

Parameters:

table (list) – A list representing a table.

Returns:

The size of each column in the table.

Return type:

list

columns(hdu, error=False)[source]

Describe the columns of a BINTABLE HDU.

Parameters:
  • hdu (int) – The HDU number (zero-indexed).

  • error (bool, optional) – If True, failure to extract certain required metadata raises an exception.

Returns:

The rows of the table.

Return type:

list

Raises:
  • DataModelError – If the BINTABLE is actually a compressed image.

  • ValueError – If error is set and a TUNIT value does not have FITS-standard units.

property contents

A table summarizing the HDUs.

property filesize

Size of the file in human-readable format.

property filetype

Type of file. Assumes FITS (for now) unless overridden in a subclass.

format_table(table, indent=False)[source]

Convert tabular data into reStructuredText-compatible string.

This function assumes that table already has a header as the first row.

Parameters:
  • table (list) – A data table.

  • indent (bool) – If True, indent the table for compatibility with collapsible tables.

Returns:

A list of strings that can be joined.

Return type:

list
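
The way column sizes, highlights and row formatting fit together can be sketched as below. These are stand-alone functions written for illustration, not the methods of Stub; the four-space indent is an assumption.

```python
def colsizes(table):
    """Return the width in characters of the widest cell in each column."""
    ncol = len(table[0])
    return [max(len(str(row[i])) for row in table) for i in range(ncol)]


def format_table(table, indent=False):
    """Render ``table`` (header row first) as an RST simple table."""
    sizes = colsizes(table)
    highlight = ' '.join('=' * size for size in sizes)
    lines = [highlight]
    for i, row in enumerate(table):
        cells = [str(cell).ljust(size) for cell, size in zip(row, sizes)]
        lines.append(' '.join(cells).rstrip())
        if i == 0:
            lines.append(highlight)  # separates the header from the body
    lines.append(highlight)
    prefix = '    ' if indent else ''  # assumed indent width
    return [prefix + line for line in lines]


table = [('Name', 'Type', 'Units', 'Description'),
         ('blat', 'int', 's', 'biz bat bar'),
         ('foo', 'int', '', 'bing bang boom')]
print('\n'.join(format_table(table)))
```

The result is a list of strings, so the caller decides how to join or write them.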

property hdumeta

Metadata associated with each HDU.

property hduname

Format of HDU names.

highlight(sizes)[source]

Return reStructuredText-compatible table highlights.

Parameters:

sizes (list) – The width of each column.

Returns:

A highlight string.

Return type:

str

image_format(hdr)[source]

Obtain format of an image HDU.

Parameters:

hdr (Header) – The header to parse.

Returns:

A string describing the image format.

Return type:

str

Raises:

DataModelError – If self.error is set and a BUNIT header with units that do not follow the FITS standard is detected.

keywords(hdu)[source]

A table summarizing the interesting keywords in a particular HDU.

Parameters:

hdu (int) – The HDU number (zero-indexed).

Returns:

The rows of the table.

Return type:

list

property modelname

Name to use for the data model file.

section(hdu)[source]

A string describing an HDU.

Parameters:

hdu (int) – The HDU number (zero-indexed).

Returns:

A list of strings that can be joined.

Return type:

list

desidatamodel.stub.extract_keywords(hdr)[source]

Extract interesting keywords from a FITS header.

Parameters:

hdr (Header) – The header to parse.

Returns:

A list of tuples containing the metadata of interesting keywords.

Return type:

list

desidatamodel.stub.extrakey(key)[source]

Return True if key is not a boring standard FITS keyword.

To make the data model more human-readable, we don’t overwhelm the output with keywords that are required by the FITS standard anyway, or with cases where the number of headers might change over time.

This list isn’t exhaustive.

Parameters:

key (str) – A FITS keyword.

Returns:

True if the keyword is not boring.

Return type:

bool

Examples

>>> extrakey('SIMPLE')
False
>>> extrakey('DEPNAM01')
False
>>> extrakey('BZERO')
True

desidatamodel.stub.file_size(filename)[source]

Determine file size and return string with human readable size format.

Adapted from stackoverflow answers for human readable size formatting.

Parameters:

filename (str) – A string containing a filename.

Returns:

A human-readable file size.

Return type:

str

Examples

>>> file_size('one-gb-file.dat')
'1 GB'

desidatamodel.stub.fits_column_format(format)[source]

Convert a FITS column format to a human-readable form.

Parameters:

format (str) – A FITS-style format string.

Returns:

A human-readable version of the format string.

Return type:

str

Examples

>>> fits_column_format('A')
'char[1]'
>>> fits_column_format('J')
'int32'
>>> fits_column_format('12E')
'float32[12]'

desidatamodel.stub.main()[source]

Entry point for the generate_model script.

Returns:

An integer suitable for passing to sys.exit().

Return type:

int

desidatamodel.stub.read_column_descriptions(filename)[source]

Read a column descriptions CSV file and return a dictionary.

Parameters:

filename (str) – Name of a CSV file with columns NAME,TYPE,UNITS,DESCRIPTION.

Returns:

coldesc_dict[NAME] = dict with keys TYPE, UNITS, DESCRIPTION
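
A minimal sketch of building such a dictionary with the standard csv module is shown below. It is not the package's implementation: `parse_column_descriptions` is a hypothetical name, it reads from an open text stream rather than a filename, and the two sample rows are made up for illustration.

```python
import csv
import io


def parse_column_descriptions(stream):
    """Map each NAME to a dict with keys TYPE, UNITS and DESCRIPTION."""
    coldesc = {}
    for row in csv.DictReader(stream):
        coldesc[row['NAME']] = {
            'TYPE': row['TYPE'],
            'UNITS': row['UNITS'],
            'DESCRIPTION': row['DESCRIPTION'],
        }
    return coldesc


# Made-up sample content standing in for a real CSV file.
example = io.StringIO(
    "NAME,TYPE,UNITS,DESCRIPTION\n"
    "TARGETID,int64,,Unique target identifier\n"
    "FLUX_G,float32,nanomaggy,Flux in the g band\n"
)
coldesc = parse_column_descriptions(example)
print(coldesc['FLUX_G']['UNITS'])  # nanomaggy
```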

desidatamodel.unit

Shared code for dealing with units in files and data models.

exception desidatamodel.unit._FITSUnitWarning[source]

Warnings related to invalid FITS units.

desidatamodel.unit._validate_unit(unit, error=False)[source]

Check units for consistency with FITS standard, while allowing some special exceptions.

Parameters:
  • unit (str) – The unit to parse.

  • error (bool, optional) – If True, failure to interpret the unit raises an exception.

Returns:

If a special exception is detected, the name of the unit is returned. Otherwise, None.

Return type:

str

Raises:

ValueError – If error is set and the unit can’t be parsed.

desidatamodel.update

Tools to update column units and descriptions in a pre-existing datamodel file.

desidatamodel.update.format_rst_table(table)[source]

Format an astropy Table in left-aligned RST format.

Parameters:

table (astropy.table.Table) – The table to format.

Returns:

A list of strings to print or write for the RST-format table.

Note: this doesn’t use astropy.io.ascii.rst because that generates right-aligned columns.

desidatamodel.update.main()[source]

Update a datamodel file with standard units and descriptions.

Returns:

An integer suitable for passing to sys.exit().

Return type:

int

desidatamodel.update.read_table_rows(lines, i)[source]

Read an RST-format table from a set of lines.

Parameters:
  • lines (list of str) – Lines from a data model file.

  • i (int) – The line number at which to start reading.

Returns:

None, or a list of dict with keys Name, Type, Units, Description.

Looks for data table description of the form:

==== ==== ===== ===========
Name Type Units Description
==== ==== ===== ===========
blat int  s     biz bat bar
foo  int        bing bang boom
==== ==== ===== ===========

while allowing the columns to have arbitrary widths or possibly be blank.

Returns None if table starting at line i doesn’t match that form.
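
One way to parse a table of this form is sketched below. It is an illustration under stated assumptions, not the function's real code: `parse_table_rows` is a hypothetical name, column boundaries are taken from the runs of `=` in the border line, and lines are assumed to carry no leading indentation.

```python
def parse_table_rows(lines, i):
    """Parse a Name/Type/Units/Description table starting at ``lines[i]``.

    Returns a list of dicts, or None if the lines do not have that form.
    """
    expected = ['Name', 'Type', 'Units', 'Description']
    if i + 2 >= len(lines):
        return None
    border = lines[i].rstrip()
    if not border.startswith('=') or set(border) - {'=', ' '}:
        return None
    # Locate each column as a (start, stop) span of '=' characters.
    spans, start = [], None
    for pos, char in enumerate(border + ' '):
        if char == '=' and start is None:
            start = pos
        elif char != '=' and start is not None:
            spans.append((start, pos))
            start = None

    def split(line):
        cells = [line[a:b].strip() for a, b in spans[:-1]]
        cells.append(line[spans[-1][0]:].strip())  # last column may overflow
        return cells

    if split(lines[i + 1]) != expected or lines[i + 2].rstrip() != border:
        return None
    rows = []
    for line in lines[i + 3:]:
        if line.rstrip() == border:
            return rows
        rows.append(dict(zip(expected, split(line))))
    return None  # the closing border was never found


lines = ["==== ==== ===== ===========",
         "Name Type Units Description",
         "==== ==== ===== ===========",
         "blat int  s     biz bat bar",
         "foo  int        bing bang boom",
         "==== ==== ===== ==========="]
print(parse_table_rows(lines, 0))
```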

desidatamodel.update.update(lines, force=False)[source]

Update units and descriptions for data tables in datamodel lines.

Parameters:
  • lines (list of str) – Lines read from an input datamodel file.

  • force (bool, optional) – If True, update non-blank input entries too.

Returns:

A list of str lines with updated units and descriptions.

This function is separated from main primarily to facilitate testing of updating input lines into output lines without having to actually read and write files every time.
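
The effect of force on individual table rows might look like the following sketch. It rests on assumptions: `update_rows` is a hypothetical helper, rows are dicts with keys Name, Type, Units, Description, and the standard values come from a dictionary keyed by column name with UNITS and DESCRIPTION entries.

```python
def update_rows(rows, coldesc, force=False):
    """Fill in Units and Description from ``coldesc``.

    Blank entries are always filled; non-blank entries are replaced
    only when ``force`` is True. The input rows are not modified.
    """
    pairs = (('Units', 'UNITS'), ('Description', 'DESCRIPTION'))
    updated = []
    for row in rows:
        new = dict(row)
        standard = coldesc.get(row['Name'])
        if standard is not None:
            for table_key, standard_key in pairs:
                if force or not new[table_key]:
                    new[table_key] = standard[standard_key]
        updated.append(new)
    return updated


rows = [{'Name': 'foo', 'Type': 'int', 'Units': '', 'Description': 'old text'}]
coldesc = {'foo': {'TYPE': 'int', 'UNITS': 's', 'DESCRIPTION': 'new text'}}
print(update_rows(rows, coldesc))              # Units filled, Description kept
print(update_rows(rows, coldesc, force=True))  # both replaced
```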