Sunday, December 03, 2017

Neural networks and the neutron transport equation

The Monte Carlo simulation technique was conceived at Los Alamos in the late 1940s by Stanislaw Ulam and John von Neumann with a specific application in mind: neutron transport and nuclear fission chain-reactions. Perhaps, however, other simulation techniques are now available. I'd like to propose one such alternative below, but first we need to understand what the neutron transport problem is.

The neutron transport equation is a special case of the Boltzmann transport equation, the primary equation of non-equilibrium statistical mechanics. In the general case of the Boltzmann equation, the quantity of interest is a distribution function on phase-space, which specifies the expected number of particles per unit of phase-space volume. (At low particle-number densities, one might refer to the probability of particle occupancy per unit of phase-space volume).

In the case of the neutron transport equation, the quantity of interest can be taken to be $n(\mathbf{r},\mathbf{v},t)$, the expected number of neutrons per unit of physical space about $\mathbf{r}$, per unit of velocity space about $\mathbf{v}$, at time $t$. (Whilst, strictly speaking, phase space deals with position and momentum, it is often interchangeable with position and velocity).

The Boltzmann equation is of some philosophical interest because it can be used to describe the approach of a system towards equilibrium. Certainly, a population of high-energy neutrons in a region of space occupied by non-fissile atomic nuclei at room temperature, would transfer energy to the atomic nuclei by means of elastic and inelastic scattering interactions, and approach a state of equilibrium. The neutrons would 'thermalise', their energy spectrum approaching that of a Maxwell-Boltzmann distribution, with a temperature equal to that of the background nuclei.

However, the neutron transport equation can be deployed to describe the evolution of the neutron population in a nuclear reactor, and such a system remains, for a reasonable period of time, and with the appropriate control systems, in a stable state far-from-equilibrium.

A fissile chain reaction occurs when neutrons are absorbed by heavy atomic nuclei, which fission into smaller, more stable nuclei, and emit high-energy neutrons in the process. By means of this fissile chain reaction, the neutron temperature, flux and number density can be maintained at a constant level, despite intensive interactions with a background population of atomic nuclei at a much lower temperature. 

There are two populations of particles here, which remain at different temperatures despite being in contact with each other. The system therefore remains out of thermal equilibrium. Nevertheless, the entropy of the system is still increasing because the fission reactions release the free energy of the heavy fissile isotopes, and transform it into heat. A nuclear reactor is, of course, far from being a closed system, and stability far-from-equilibrium is only possible in this case if the heat generated by the fissile reactions is transported away, typically by conduction into a liquid or gas, which convectively transports the heat to a location outside the reactor vessel.

Before we define the neutron transport equation, let's begin with some definitions. Let's start with the neutron flux, $\phi(\mathbf{r},\mathbf{v},t) =  v \;n(\mathbf{r},\mathbf{v},t)$. For each neutron energy level $E = 1/2 m_n v^2$, (where $m_n$ is the rest-mass of a neutron), this is the number of neutrons at a point $\mathbf{r}$ passing through a unit surface area, per unit time. For each energy level $E$, it is equivalent to the total path length travelled by the energy-$E$ neutrons per unit volume at $\mathbf{r}$, in a unit of time.

To calculate the rate at which neutron reactions take place, the flux is multiplied by a 'macroscopic cross-section' $\Sigma_j(\mathbf{r},\mathbf{v},t)$. The macroscopic cross-sections are the product of the microscopic cross-sections with the number density of the target nuclei $\Sigma_j(\mathbf{r},\mathbf{v},t) = \Sigma_i \sigma_{ij}(\mathbf{r},\mathbf{v},t) N_i(\mathbf{r},t)$. 

The microscopic cross-sections $\sigma_{ij}(\mathbf{r},\mathbf{v},t)$ express the probability of a reaction of type $j$ between a neutron and a target nucleus of species $i$. There are microscopic cross-sections for neutron absorption, fission, elastic scattering and inelastic scattering. The microscopic cross-sections have units of area per atom, and depend upon the energy (velocity) of the incoming neutron, as well as the temperature of the target nuclei.

The macroscopic cross-sections have units of inverse distance. As such, they define the probability of a reaction per unit of neutron path-length. When multiplied with a neutron flux (equivalent to the total path-length travelled by the neutrons per unit time per unit volume), this yields the reaction rate per unit volume.

One other definition to note is the distinction between 'prompt' and 'delayed' neutrons. The prompt neutrons are released in the fission event itself, whilst the delayed neutrons are emitted after the beta decay of certain fission product fragments, and typically occur some time after the fission event. 

With those definitions in place, let's proceed to the neutron transport equation itself. (The equations which follow are adapted from Mathematical Methods in Nuclear Reactor Dynamics, Z.Akcasu, G.S.Lellouche, and L.M.Shotkin, Academic Press, 1971). We will assume for simplicity that there is a single type of fissile isotope. 

The time evolution of the neutron distribution function is governed by the following equation:

$$\eqalign{\partial n(\mathbf{r},&\mathbf{v},t)/\partial t = - \mathbf{v} \cdot \nabla n(\mathbf{r},\mathbf{v},t) -  \Sigma(\mathbf{r},\mathbf{v},t) \;v \;n(\mathbf{r},\mathbf{v},t) \cr &+ f_0(\mathbf{v}) (1-\beta)\int \nu(\mathbf{r},\mathbf{v}',t) \Sigma_f(\mathbf{r},\mathbf{v}',t) \;v' n(\mathbf{r},\mathbf{v}',t) d^3 \mathbf{v}' \cr &+ \Sigma_{i=1}^{6} f_i(\mathbf{v}) \lambda_i C_i(\mathbf{r},t)\cr &+ \int \Sigma_s(\mathbf{r},\mathbf{v}' \rightarrow \mathbf{v},t) \;v' n(\mathbf{r},\mathbf{v}',t)d^3 \mathbf{v}'}$$ Let's consider the various terms and factors in this equation one-by-one.

$\mathbf{v} \cdot \nabla n(\mathbf{r},\mathbf{v},t)$ is the loss of neutrons of velocity $\mathbf{v}$ from the volume about a point $\mathbf{r}$ due to the flow of neutrons down a spatial concentration gradient $\nabla n(\mathbf{r},\mathbf{v},t)$. The loss is only non-zero if the concentration gradient has a non-zero component in the direction of the velocity vector $\mathbf{v}$.

$\Sigma(\mathbf{r},\mathbf{v},t) \;v \;n(\mathbf{r},\mathbf{v},t)$ is the loss of neutrons of velocity $\mathbf{v}$ from the volume about a point $\mathbf{r}$ due to \emph{any} reaction with the atomic nuclei in that volume. $\Sigma$ is the sum of the macroscopic cross-sections for neutron capture $\Sigma_c$, fission $\Sigma_f$, and scattering $\Sigma_s$.

$\Sigma_f(\mathbf{r},\mathbf{v}',t) \;v' n(\mathbf{r},\mathbf{v}',t)$ is the rate at which fission events are triggered by incoming neutrons of velocity $\mathbf{v}'$. 

$\nu(\mathbf{r},\mathbf{v}',t)$ is the mean number of neutrons output from a fission event triggered by an incoming neutron of velocity $\mathbf{v}'$. 

$\beta$ is the fraction of fission neutrons which are delayed, hence $1-\beta$ is the fraction of fission neutrons which are prompt.

$ f_0(v)$ is the probability density function for prompt fission neutrons. i.e., it specifies the probability that a prompt fission neutron will have a speed $v$, (and a kinetic energy $E = 1/2 m_n v^2$).  It is equivalent to the energy spectrum of prompt fission neutrons.

Assuming the outgoing prompt fission neutrons are emitted isotropically, the probability of a prompt fission neutron having a velocity $\mathbf{v}$ is $f_0(\mathbf{v}) = f_0(v)/4 \pi$.

Hence $f_0(\mathbf{v})(1-\beta) \int \nu(\mathbf{r},\mathbf{v}',t) \Sigma_f(\mathbf{r},\mathbf{v}',t) \;v' n(\mathbf{r},\mathbf{v}',t) d^3 \mathbf{v}'$ is the rate at which prompt neutrons of velocity $\mathbf{v}$ are created at position $\mathbf{r}$ in the reactor, by incoming neutrons of any velocity $\mathbf{v}'$.

$C_i(\mathbf{r},t)$ is the concentration of species $i$ delayed neutron precursors, and $\lambda_i$ is the decay constant of that species, hence $\lambda_i C_i(\mathbf{r},t)$ is the decay-rate of the $i$-th delayed neutron precursor. This is equivalent to the production-rate of delayed neutrons from the $i$-th precursor. 

$f_i(\mathbf{v})$ is the probability density function over velocity for delayed neutrons produced by the $i$-th precursor species, hence $\Sigma_{i=1}^{6} f_i(\mathbf{v}) \lambda_i C_i(\mathbf{r},t)$ specifies the rate at which delayed neutrons of velocity $\mathbf{v}$ are created at position $\mathbf{r}$ in the reactor.

$\Sigma_s(\mathbf{r},\mathbf{v}' \rightarrow \mathbf{v},t)$ is the macroscopic cross-section for elastic or inelastic scattering events in which an incoming neutron of velocity $\mathbf{v}'$ transitions to an outgoing neutron of velocity $\mathbf{v}$ by colliding with a target nucleus. Hence $\int \Sigma_s(\mathbf{r},\mathbf{v}' \rightarrow \mathbf{v},t) \;v' n(\mathbf{r},\mathbf{v}',t) d^3 \mathbf{v}'$ specifies the rate at which neutrons of velocity $\mathbf{v}$ are created at position $\mathbf{r}$ in the reactor by incoming neutrons of any velocity $\mathbf{v}'$ scattering with target nuclei.

The concentration of each delayed neutron precursor satisfies the following equation:
$$\eqalign{\partial C_i(\mathbf{r},t)/\partial t &= \beta_i \int \nu(\mathbf{r},\mathbf{v}',t) \Sigma_f(\mathbf{r},\mathbf{v}',t) \;v' n(\mathbf{r},\mathbf{v}',t) d^3 \mathbf{v}' \cr &- \lambda_i C_i(\mathbf{r},t)} $$ $\beta_i$ is the fraction of fission neutrons produced by the delayed neutron precursor of species $i$. Hence, $\beta = \Sigma_i \beta_i$.

The rate at which delayed neutron precursors of species $i$ are produced by fission events is given by $\beta_i \int \nu(\mathbf{r},\mathbf{v}',t) \Sigma_f(\mathbf{r},\mathbf{v}',t) \;v' n(\mathbf{r},\mathbf{v}',t) d^3 \mathbf{v}'$, and the rate at which they decay is given by $\lambda_i C_i(\mathbf{r},t)$.  

Note that, unlike the general Boltzmann equation, there is no term for neutron-neutron interactions. Neutron densities in a reactor are only of the order of $10^9/cm^3$, compared to the number density of atomic nuclei, which is of the order $10^{22}-10^{23}/cm^3$. Hence, neutron-neutron interactions can be neglected.

Now, the interesting point about the neutron transport equation is that the most important terms have the form of integral transforms:
$$(Tf)(x) = \int K(x,y)f(y) dy \, $$ where $K(x,y)$ is the 'kernel' of the transform. In discrete form, this becomes:
$$ (Tf)(x_i) = \Sigma_{j} K(x_i,y_j)f(y_j)  \,.$$
The neutron transport equation takes neutron fluxes as the input, and calculates reaction rates, and thence heat production, by integrating those fluxes with macroscopic cross-sections. The macroscopic cross-sections provide the kernels of the integral transforms. In general schematic terms:
$$\text{Output}(\mathbf{r},t) = g\left(\int \Sigma(\mathbf{r},\mathbf{v'},t) \;\text{Input}(\mathbf{r},\mathbf{v}',t)d^3 \mathbf{v}' \right)\, ,$$ where $g$ is some function.

For example, given the heat produced per fission event, $W_f$, the rate of fission-product heat-production throughout the reactor is given by: $$H_f(\mathbf{r},t) = W_f \int \Sigma_f(\mathbf{r},\mathbf{v}',t) \;v' n(\mathbf{r},\mathbf{v}',t) d^3 \mathbf{v}'\, .$$ Neural networks which implement convolutions have become immensely powerful tools for pattern recognition in recent years. Convolutions are a particular type of integral transform, so let's briefly recall how such neural networks are defined.

On an abstract level, a neural network consists of a set of nodes, and a set of connections between the nodes. The nodes possess activation levels; the connections between nodes possess weights; and the nodes have numerical rules for calculating their next activation level from a combination of the previous activation level, and the weighted inputs from other nodes.

The nodes are generally divided into three classes: input nodes, hidden/intermediate nodes, and output nodes. There is a directionality to a neural network in the sense that patterns of activation propagate through it from the input nodes to the output nodes, and in a feedforward network there is a partial ordering relationship defined on the nodes, which prevents downstream nodes from signalling those upstream.

For the implementation of integral transforms, feedforward neural networks with strictly defined layers are used. The activation levels of the input layer of nodes represents the values of the input function at discrete positions; the weights represent the values of the discretized kernel; and the values of the nodes in the layer beneath represent the discretized version of the transformed function.

Thus, the activation levels $x^l_i$ of the neurons in layer $l$ are given by weighted sums over the activation levels of the neurons in layer $l-1$:
$$x^l_i =  f \left(\sum_{j=1}^{n}W^l_{ij}x^{l-1}_j \right), \,  \text{for } \, i = 1,\ldots,n$$ $W^l$ is the matrix of weights connecting layer $l-1$ to layer $l$, with $W^l_{ij}$ representing the strength of the connection from the $j$-th neuron in layer $l-1$ to the $i$-th neuron in layer $l$. $f$ is a non-linear threshold function.

Now, the empirical cross-section data required for nuclear reactor kinetics is far from complete. The cross-sections are functions of neutron energy, and can oscillate wildly in some energy ranges, (see diagram below). The cross-section curves are therefore not defined to arbitrary levels of resolution.

Perhaps, however, neural networks can offer a solution to this problem. If the input layer of a neural network is used to represent the neutron flux at various positions throughout a reactor, and the output layer is used to represent, say, the temperature levels measured throughout the reactor, such a network could be trained to predict the correct temperature distribution for any given pattern of neutron flux. It would do so by adjusting its weights, and because the weights would represent the nuclear reaction cross-sections, this would provide a means of filling in the gaps in the nuclear physics datasets.

Questions one might pose immediately include the following: (i) are the neutron flux measurements in a reactor sufficiently accurate to facilitate this technique; and (ii) how many neurons would one need in a layer of the neural network to provide the necessary degree of spectral resolution?

These are questions I don't yet know the answer to...