Fitting generic targets¶
Besides energy-like targets, the library also supports reading (and training on) more generic targets.
Support for generic targets¶
Not all architectures can train on all types of target. Here you can find the capabilities of the architectures in metatrain.
Target type |
Energy and its gradients |
Scalars |
Spherical tensors |
Cartesian tensors |
---|---|---|---|---|
SOAP-BPNN |
Energy, forces, stress/virial |
Yes |
Yes |
No |
GAP |
Energy, forces |
No |
No |
No |
PET |
Energy, forces |
No |
No |
No |
NanoPET |
Energy, forces, stress/virial |
Yes |
Yes |
Only with |
Preparing generic targets for reading by metatrain¶
Only a few steps are required to fit arbitrary targets in metatrain.
Input file¶
In order to read a generic target, you will have to specify its layout in the input
file. Suppose you want to learn a target named mtt::my_target
, which is
represented as a set of 10 independent per-atom 3D Cartesian vector (we need to
learn 3x10 values for each atom). The target
section in the input file
should look
like this:
targets:
mtt::my_target:
read_from: dataset.xyz
key: my_target
quantity: ""
unit: ""
per_atom: True
type:
cartesian:
rank: 1
num_subtargets: 10
The crucial fields here are:
per_atom
: This field should be set toTrue
if the target is a per-atomproperty. Otherwise, it should be set to
False
.
type
: This field specifies the type of the target. In this case, the target isa Cartesian vector. The
rank
field specifies the rank of the target. For Cartesian vectors, the rank is 1. Other possibilities for thetype
arescalar
(for a scalar target) andspherical
(for a spherical tensor).
num_subtargets
: This field specifies the number of sub-targets that need to belearned as part of this target. They are treated as entirely equivalent by models in metatrain and will often be represented as outputs of the same neural network layer. A common use case for this field is when you are learning a discretization of a continuous target, such as the grid points of a band structure. In the example above, there are 10 sub-targets. In
metatensor
, these correspond to the number ofproperties
of the target.
A few more words should be spent on spherical
targets. These should be made of a
certain number of irreducible spherical tensors. For example, if you are learning a
property that can be decomposed into two proper spherical tensors with L=0 and L=2,
the target section should would look like this:
targets:
mtt::my_target:
quantity: ""
read_from: dataset.xyz
key: energy
unit: ""
per_atom: True
type:
spherical:
irreps:
- {o3_lambda: 0, o3_sigma: 1}
- {o3_lambda: 2, o3_sigma: 1}
num_subtargets: 10
where o3_lambda
specifies the L value of the spherical tensor and o3_sigma
its
parity with respect to inversion (1 for proper tensors, -1 for pseudo-tensors).
Preparing your targets – ASE¶
If you are using the ASE readers to read your targets, you will have to save them
either in the .info
(if the target is per structure, i.e. not per atom) or in the
.arrays
(if the target is per atom) attributes of the ASE atoms object. Then you can
dump the atoms object to a file using ase.io.write
.
The ASE reader will automatically try to reshape the target data to the format expected given the target section in the input file. In case your target data is invalid, an error will be raised.
Reading targets with more than one spherical tensor is not supported by the ASE reader. In that case, you should use the metatensor reader.
Preparing your targets – metatensor¶
If you are using the metatensor readers to read your targets, you will have to save them
as a metatensor.torch.TensorMap
object with metatensor.torch.TensorMap.save()
into a file with the .mts
extension.
The metatensor reader will verify that the target data in the input files corresponds to
the metadata in the provided TensorMap
objects. In case of a mismatch, errors will
be raised.
In particular:
if the target is per atom, the samples should have the [
system
,atom
] names, otherwise the [system
] name. In order to produce correct results, per-atom targets should be prepared in the same order as the atoms in the system. This means that you should use e.g.metatensor.torch.sort
to sort the atoms in your target.if the target is a
scalar
, only oneTensorBlock
should be present, the keys of theTensorMap
should be aLabels.single()
object, and there should be no components.if the target is a
cartesian
tensor, only oneTensorBlock
should be present, the keys of theTensorMap
should be aLabels.single()
object, and there should be one components, with names [xyz
] for a rank-1 tensor, [xyz_1
,xyz_2
, etc.] for higher rank tensors.if the target is a
spherical
tensor, theTensorMap
can contain multipleTensorBlock
, each corresponding to one irreducible spherical tensor. The keys of theTensorMap
should have theo3_lambda
ando3_sigma
names, corresponding to the values provided in the input file, and eachTensorBlock
should be one component label, with nameo3_mu
and values going from -L to L.