Topic 3: Uncertainty quantification with models

Overview

Uncertainty in linear (and non-linear) models derived from data can be quantified via a Bayesian approach. In this case, instead of simply “optimal” parameters found via error minimization, a “family” of parameters can be described with by their likelihood:

\[\log L(\theta) = -\frac{1}{2W}\sum_{i\in\text{data}} w_i(f(\theta; x_i) - y_i)^2\]

for data pairs \((x_i, y_i)\) with weights \(w_i\), and a “scaling” factor \(W\). In this general situation, the parameters \(\theta\) are sampled, by treating the likelihood as a probability distribution. Then, mean values of predictions, standard deviations, etc., are all computed in a standard statistical manner.

In the case of a linear model, the likelihood is a multivariate normal distribution, which allows for direct sampling of the parameters. In the more general nonlinear model, alternate approaches such as Markov chain Monte Carlo are necessary to sample the parameters.

Background: S. L. Frederiksen, K. W. Jacobsen, K. S. Brown, and J. P. Sethna; “Bayesian ensemble approach to error estimation of interatomic potentials,” Phys. Rev. Lett. 93, 165501 (2004).

Thermodynamic models

Classical interatomic potential models describe the energy of a given system in terms of the chemical identity and positions of all of the species in the lattice; these models are computationally less expensive than more accurate density-functional theory calculations, but require optimization of parameters to be predictive. A related model are “cluster expansions” which map crystal structures onto simple lattice models with discrete interactions; these have been very effective at both reproducing energies and other properties when fit to density-functional theory data, and extending to finite temperature statistical mechanics models. We will consider a simplified version of this to explore the prediction of structural energies for binary systems.

To understand a cluster expansion, consider the simple case of atoms on a lattice of sites at positions \(\mathbf{x}\). For a two-component (binary) system, we can describe any configuration using an (infinite) integer “occupancy vector” \(n(\mathbf{x})\) that is 0 if a site \(\mathbf{x}\) has species A on it, or 1 if the site has species B on it. Then, we can write down the energy of a crystal (note: not the energy per atom, but the total!) as an expansion of “clusters”:

\[E(n) := E^{(0)} + \sum_{\mathbf{x}} E^{(A)}(\mathbf{x})(1-n(\mathbf{x})) + E^{(B)}n(\mathbf{x})\] \[+ \sum_{\mathbf{x}\mathbf{y}} E^{(AA)}(\mathbf{x},\mathbf{y})(1-n(\mathbf{x}))(1-n(\mathbf{y})) + E^{(BB)}(\mathbf{x},\mathbf{y})n(\mathbf{x})n(\mathbf{y}) + E^{(AB)}(\mathbf{x},\mathbf{y})[(1-n(\mathbf{x}))n(\mathbf{y}) + n(\mathbf{x})(1-n(\mathbf{y}))]\] \[+ \cdots\]

This general version can be simplified greatly. First, for many lattices, the site energies \(E^{(A)}(\mathbf{x})\) and \(^{(B)}(\mathbf{x})\) are independent of position; the summation really depends on the difference in the sites, and so the \(\sum_{\mathbf{x}} E^{(A)}(\mathbf{x})(1)\) term can be absorbed into the leading constant term. The quadratic terms can be similarly expanded into constant, linear, and terms of order \(n^2\); moreover, the interactions should only be a function of the separation vector \(\mathbf{x}-\mathbf{u}\). Thus, the model can be reduced to the simpler form:

\[E(n) := E^{(0)} + \sum_{\mathbf{x}} E^{(1)}n(\mathbf{x}) + \frac{1}{2}\sum_{\mathbf{x}\mathbf{y}} E^{(2)}(\mathbf{x}-\mathbf{y})n(\mathbf{x})n(\mathbf{y}) + \frac{1}{3!}\sum_{\mathbf{x}\mathbf{y}\mathbf{z}} E^{(3)}(\mathbf{x}-\mathbf{y}, \mathbf{x}-\mathbf{z}) n(\mathbf{x})n(\mathbf{y})n(\mathbf{z}) + \cdots\]

Now this model begins to look like a classical interatomic potential. Further simplifications are possible based on symmetry; one could, for example, treat interaction parameters \(E^{(2)}\) as functions only of distance \(|\mathbf{x}-\mathbf{y}|\).

Consider the binary Ni-Al system; we wish to make a model of the energy based purely on Al-Al, Ni-Ni, and Al-Ni interactions that are functions of the distances between the atoms. We can use the pymatgen functions get_data to get entries in a binary system (including energies), and get_structure_by_material_id to find structures, with the get_all_neighbors function to construct neighbor lists.

Building on your previous thermodynamic model, quantify the uncertainty in your model predictions. In this case, you should approach the problem similarly to Fredericksen et al., where the weights \(w_i\) are all unity, and the scaling factor \(W\) is the magnitude of the smallest error (i.e., the total error in your optimal parameters).

Discussion: March 31-April 2, 2020