Readers

Parsers for obtaining system and target information from files. Currently, metatrain supports the following libraries for reading data:

Library | Supported targets | Linked file formats
ase | system, energy, forces, stress, virials | .xyz, .extxyz
metatensor | system, energy, forces, stress, virials | .mts

If the reader parameter is not set, the library is determined from the file extension. Overriding this behavior is particularly useful if a file format is not listed here but might be supported by a library.
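The extension-based fallback described above can be sketched as follows. This is a minimal illustration, not metatrain's implementation: the `DEFAULT_READER` mapping and the `resolve_reader` helper are hypothetical names, and the mapping only mirrors the table in this section.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping that mirrors the table above; not metatrain's own code.
DEFAULT_READER = {".xyz": "ase", ".extxyz": "ase", ".mts": "metatensor"}


def resolve_reader(filename: str, reader: Optional[str] = None) -> str:
    """Return the name of the reader library to use for ``filename``."""
    if reader is not None:
        # An explicit choice always wins, even for unlisted extensions.
        return reader

    suffix = Path(filename).suffix
    try:
        return DEFAULT_READER[suffix]
    except KeyError:
        raise ValueError(
            f"cannot infer a reader from extension {suffix!r}; "
            "set the reader explicitly"
        ) from None


# Listed extension: the library is inferred.
print(resolve_reader("dataset.xyz"))
# Unlisted extension: the caller names the library to try.
print(resolve_reader("dataset.traj", reader="ase"))
```

Raising an error for an unknown extension, rather than guessing, is a design choice of this sketch; it makes the need for an explicit reader visible to the user.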

The reader functions are described in detail below.

System and target data readers

The main entry points for reading system and target information are the reader functions.

metatrain.utils.data.read_systems(filename: str, reader: str | None = None) → List[System]

Read system information from a file.

Parameters:
  • filename (str) – name of the file to read

  • reader (str | None) – reader library for parsing the file. If None, the library is determined from the file extension.

  • dtype – desired data type of returned tensor

Returns:

list of systems, stored in double precision

Return type:

List[System]

metatrain.utils.data.read_targets(conf: DictConfig) → Tuple[Dict[str, List[TensorMap]], Dict[str, TargetInfo]]

Read all target information from a fully expanded config.

To get such a config you can use expand_dataset_config. All targets are stored in double precision.

This function uses subfunctions like read_energy() to parse the requested target quantity. Currently, energy is the only supported target property. Within the energy section, gradients such as forces, the stress, or the virial can be added; other gradients are silently ignored.

Parameters:

conf (DictConfig) – config containing the keys for what should be read.

Returns:

Dictionary containing a list of TensorMaps for each target section in the config as well as a Dict[str, TargetInfo] object containing the metadata of the targets.

Raises:

ValueError – if the target name is not valid. Valid target names are those that either start with mtt:: or those that are in the list of standard outputs of metatensor.torch.atomistic (see https://docs.metatensor.org/latest/atomistic/outputs.html)

Return type:

Tuple[Dict[str, List[TensorMap]], Dict[str, TargetInfo]]
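The naming rule behind the ValueError above can be illustrated with a small check. This is a sketch under stated assumptions: `check_target_name` is a hypothetical helper, and `STANDARD_OUTPUTS` holds only an illustrative subset; the authoritative list of standard outputs is the one in the metatensor documentation linked above.

```python
# Illustrative subset only; the authoritative list is in the metatensor docs.
STANDARD_OUTPUTS = {"energy"}


def check_target_name(name: str) -> None:
    """Raise ValueError unless ``name`` is a standard output or starts with mtt::."""
    if name in STANDARD_OUTPUTS or name.startswith("mtt::"):
        return
    raise ValueError(
        f"invalid target name {name!r}: use a standard output name "
        "or prefix a custom name with 'mtt::'"
    )


check_target_name("energy")       # standard output, accepted
check_target_name("mtt::dipole")  # custom target with the mtt:: prefix, accepted

try:
    check_target_name("dipole")   # neither standard nor prefixed
except ValueError as err:
    print(err)
```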

These functions dispatch the reading of the system and target information to the appropriate readers, based on the file extension or the user-provided library.

In addition, the read_targets function uses the user-provided information about the targets to call the appropriate target reader function (for energy targets or generic targets).

ASE

This section describes the parsers for the ASE library.

metatrain.utils.data.readers.ase.read(filename: str | PurePath | IO, *args, **kwargs) → List[Atoms]

Wrapper around the ase.io.read() function.

The wrapper provides a more informative error message in case of failure. Additionally, it will make the keys "energy", "forces" and "stress" available from the calculator and the info/arrays dictionary.

Warning

Lists of atoms read with this function can NOT be written back to a file with ase.io.write() because of the duplicated keys.

Parameters:
  • filename (str | PurePath | IO) – Name of the file to read from or a file descriptor.

  • args – additional positional arguments for ase.io.read()

  • kwargs – additional keyword arguments for ase.io.read()

Returns:

A list of atoms

Return type:

List[Atoms]
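The idea of wrapping a reader to produce a more informative error message can be sketched generically. This is not the wrapper's actual code: `read_with_context` and `fake_reader` are hypothetical names, and the stand-in reader replaces ase.io.read() so the example runs without ASE installed.

```python
from typing import Any, Callable


def read_with_context(read_fn: Callable[..., Any], filename: str, *args, **kwargs):
    """Call ``read_fn`` and re-raise any failure with the file name attached."""
    try:
        return read_fn(filename, *args, **kwargs)
    except Exception as err:
        # Chain the original exception so the full traceback stays available.
        raise ValueError(f"failed to read {filename!r}: {err}") from err


def fake_reader(filename, index=":"):
    # Stand-in for a real parser such as ase.io.read(); it always fails here.
    raise KeyError("unknown file format")


try:
    read_with_context(fake_reader, "broken.xyz", index=":")
except ValueError as err:
    print(err)
```

Positional and keyword arguments are forwarded unchanged, which matches the way the documented wrapper passes args and kwargs through to ase.io.read().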

metatrain.utils.data.readers.ase.read_systems(filename: str) → List[System]

Read system information using ase.

Parameters:

filename (str) – name of the file to read

Returns:

A list of systems

Return type:

List[System]

metatrain.utils.data.readers.ase.read_energy(target: DictConfig) → Tuple[List[TensorMap], TargetInfo]

Parameters:

target (DictConfig)

Return type:

Tuple[List[TensorMap], TargetInfo]

metatrain.utils.data.readers.ase.read_generic(target: DictConfig) → Tuple[List[TensorMap], TargetInfo]

Parameters:

target (DictConfig)

Return type:

Tuple[List[TensorMap], TargetInfo]

Note that metatrain.utils.data.readers.ase.read_energy() currently uses sub-functions to parse the energy and its gradients, such as forces, virial, and stress.

Metatensor

This section describes the parsers for the metatensor library. As the systems and/or targets are already stored in the metatensor format, these reader functions mainly perform checks and return the data.

metatrain.utils.data.readers.metatensor.read_systems(filename: str) → List[System]

Read system information using metatensor.

Parameters:

filename (str) – name of the file to read

Raises:

NotImplementedError – Serialization of systems is not yet available in metatensor.

Return type:

List[System]

metatrain.utils.data.readers.metatensor.read_energy(target: DictConfig) → Tuple[TensorMap, TargetInfo]

Parameters:

target (DictConfig)

Return type:

Tuple[TensorMap, TargetInfo]

metatrain.utils.data.readers.metatensor.read_generic(target: DictConfig) → Tuple[List[TensorMap], TargetInfo]

Parameters:

target (DictConfig)

Return type:

Tuple[List[TensorMap], TargetInfo]