dataset module

This module defines tools for importing data from a dataset.

Data path functions

pystem.dataset.read_data_path()

Reads the saved data folder path.

The pystem library suggests storing all data in a single directory alongside the associated configuration files. The path to this directory is saved by pystem. Use this function to retrieve it.

If no data path is saved, the function returns None. Otherwise, it returns the path.

Returns

None if no path is saved; otherwise, the data path.

Return type

None, str

pystem.dataset.set_data_path(path)

Sets the saved data folder path.

The pystem library suggests storing all data in a single directory alongside the associated configuration files. The path to this directory is saved by pystem. Use this function to set it.

A boolean is returned to confirm whether the change took effect.

Parameters

path (str) – The desired data path.

Returns

True if the data path was actually changed, False otherwise.

Return type

bool
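The two functions above can be combined so that a data path is saved only when none exists yet. The following is a minimal sketch, not part of pystem: the helper name ensure_data_path is hypothetical, and the reader and setter are passed in as arguments so that the logic can be tried without pystem installed.

```python
from typing import Callable, Optional


def ensure_data_path(
    desired: str,
    read_path: Callable[[], Optional[str]],
    set_path: Callable[[str], bool],
) -> str:
    """Return the saved data path, saving `desired` first if none is stored.

    Hypothetical helper: `read_path` and `set_path` are expected to behave
    like read_data_path() and set_data_path(path) as documented above.
    """
    current = read_path()
    if current is not None:
        # A path is already saved: leave it untouched.
        return current
    if not set_path(desired):
        # The setter is documented to return False when nothing changed.
        raise RuntimeError(f"could not save data path {desired!r}")
    return desired
```

With pystem installed, the call would presumably be ensure_data_path(some_path, pystem.dataset.read_data_path, pystem.dataset.set_data_path).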

Load functions

pystem.dataset.load_file(file, ndim, scan_ratio=1, scan_seed=None, dev=None, verbose=True)

This function loads a STEM acquisition based on a configuration .conf file path.

The number of dimensions ndim should also be given.

The Path is generated from a scan file given in the configuration file, or is randomly drawn. In either case, the Scan object ratio property can be set through the scan_ratio argument. Additionally, when no file is provided for the scan pattern, use the scan_seed argument to obtain reproducible data.
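The role of a seed in a randomly drawn scan can be illustrated with a short standalone sketch. It does not reproduce pystem's own scan generation: the function random_scan_indices is hypothetical, and it assumes that the ratio is the fraction of pixel positions kept from a random ordering.

```python
import random
from typing import List, Optional


def random_scan_indices(
    n_pixels: int, ratio: float = 1.0, seed: Optional[int] = None
) -> List[int]:
    """Draw a random ordering of pixel indices and keep a fraction of it.

    Assumption: `ratio` is the fraction of positions kept (0 < ratio <= 1).
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio should be in (0, 1]")
    rng = random.Random(seed)  # seed=None gives a different draw each run
    order = list(range(n_pixels))
    rng.shuffle(order)
    keep = round(ratio * n_pixels)
    return order[:keep]


# The same seed gives the same scan; this is what makes data reproducible.
first = random_scan_indices(16, ratio=0.25, seed=0)
second = random_scan_indices(16, ratio=0.25, seed=0)
print(first == second)  # True
```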

The function allows the user to request development data by setting the dev argument. If dev is None, the usual Stem2D and Stem3D classes are returned. If dev is a dictionary, Dev2D and Dev3D classes are returned. This dictionary can contain additional class arguments such as:

  • snr, seed and normalized for Dev2D,

  • snr, seed, normalized, PCA_transformed and PCA_th for Dev3D.
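The way ndim and dev together determine the returned class can be summarized in a few lines. This is a sketch of the rule described above, not pystem code: the helper expected_class_name is hypothetical, and the values in the example dictionary are placeholders whose types are assumed.

```python
from typing import Optional


def expected_class_name(ndim: int, dev: Optional[dict] = None) -> str:
    """Name of the class the loader is documented to return."""
    if ndim not in (2, 3):
        raise ValueError("ndim should be 2 or 3")
    if dev is not None and not isinstance(dev, dict):
        raise TypeError("dev should be None or a dictionary")
    # None selects the usual classes, a dictionary the development ones.
    prefix = "Stem" if dev is None else "Dev"
    return f"{prefix}{ndim}D"


# Placeholder values: the key names come from the list above,
# the values and their types are assumptions.
dev_3d = {"snr": 20, "seed": 0, "normalized": True}

print(expected_class_name(2))          # Stem2D
print(expected_class_name(3, dev_3d))  # Dev3D
```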

Parameters
  • file (str) – The configuration file path.

  • ndim (int) – The data dimension. Should be 2 or 3.

  • scan_ratio (optional, float) – The Path object ratio. Default is 1.

  • scan_seed (optional, int) – The seed used when the scan is randomly initialized. Default is None, which gives a random seed.

  • dev (optional, None, dictionary) – This argument allows the user to request development data. If it is None, usual data is returned. If it is a dictionary, development data is returned and the dictionary is passed to the data constructors. Default is None for usual data.

  • verbose (optional, bool) – If True, information will be sent to standard output. Default is True.

Returns

The pystem data.

Return type

Stem2D, Stem3D, Dev2D, Dev3D

Todo

Maybe enable PCA_th in config file for 3D data.

pystem.dataset.load_key(key, ndim, scan_ratio=1, scan_seed=None, dev=None, verbose=True)

This function loads a STEM acquisition based on a key.

A key is a string: it should always be the name of the configuration file without the suffix (.conf). For example, if a configuration file located in the data folder is named my-sample.conf, then its data can be loaded with the my-sample key.

The number of dimensions ndim should also be given.

The Path is generated from a scan file given in the configuration file, or is randomly drawn. In either case, the Scan object ratio property can be set through the scan_ratio argument. Additionally, when no file is provided for the scan pattern, use the scan_seed argument to obtain reproducible data.

The function allows the user to request development data by setting the dev argument. If dev is None, the usual Stem2D and Stem3D classes are returned. If dev is a dictionary, Dev2D and Dev3D classes are returned. This dictionary can contain additional class arguments such as:

  • snr, seed, normalized and verbose for Dev2D,

  • snr, seed, normalized, PCA_transformed, PCA_th and verbose for Dev3D.

This function only locates the configuration file and then calls the load_file function with it.
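The lookup from a key to a configuration file can be sketched as follows. This is an illustration only: the helper find_config_file is hypothetical, and searching subfolders of the data folder is an assumption, since the search strategy used by pystem is not described here.

```python
import tempfile
from pathlib import Path
from typing import Optional


def find_config_file(data_dir: str, key: str) -> Optional[Path]:
    """Return the first `<key>.conf` file found under data_dir, or None.

    Assumption: subfolders of the data folder are searched as well.
    """
    for candidate in sorted(Path(data_dir).rglob("*.conf")):
        # Path.stem is the file name without its final suffix.
        if candidate.stem == key:
            return candidate
    return None


# Try it on a throwaway folder holding an empty my-sample.conf file.
with tempfile.TemporaryDirectory() as data_dir:
    (Path(data_dir) / "my-sample.conf").write_text("")
    found = find_config_file(data_dir, "my-sample")
    missing = find_config_file(data_dir, "other-sample")

print(found.name)  # my-sample.conf
print(missing)     # None
```

The path found this way is what would then be handed to load_file together with the remaining arguments.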

Parameters
  • key (str) – The data key.

  • ndim (int) – The data dimension. Should be 2 or 3.

  • scan_ratio (optional, float) – The Path object ratio. Default is 1.

  • scan_seed (optional, int) – The seed used when the scan is randomly initialized. Default is None, which gives a random seed.

  • dev (optional, None, dictionary) – This argument allows the user to request development data. If it is None, usual data is returned. If it is a dictionary, development data is returned and the dictionary is passed to the data constructors. Default is None for usual data.

  • verbose (optional, bool) – If True, information will be sent to standard output. Default is True.

Returns

The pystem data.

Return type

Stem2D, Stem3D, Dev2D, Dev3D