We formulate an equivalence between machine learning and statistical data assimilation as used widely in the physical and biological sciences. The correspondence is that layer number in a feedforward artificial network setting is the analog of time in the data assimilation setting. This connection has been noted in the machine learning literature. We add a perspective that expands on how methods from statistical physics and aspects of Lagrangian and Hamiltonian dynamics play a role in how networks can be trained and designed. Within the discussion of this equivalence, we show that adding more layers (making the network deeper) is analogous to adding temporal resolution in a data assimilation framework. Extending this equivalence to recurrent networks is also discussed. We explore how one can find a candidate for the global minimum of the cost functions in the machine learning context using a method from data assimilation. Calculations on simple models from both sides of the equivalence are reported. Also discussed is a framework in which the time or layer label is taken to be continuous, providing a differential equation, the Euler-Lagrange equation and its boundary conditions, as a necessary condition for a minimum of the cost function. This shows that the problem being solved is a two-point boundary value problem familiar in the discussion of variational methods. The use of continuous layers is denoted "deepest learning." These problems respect a symplectic symmetry in continuous-layer phase space. Both Lagrangian versions and Hamiltonian versions of these problems are presented. Their well-studied implementation in discrete time/layer, while respecting the symplectic structure, is addressed. The Hamiltonian version provides a direct rationale for backpropagation as a solution method for a certain two-point boundary value problem.
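The layer-as-time correspondence can be sketched in a few lines. The following is a minimal illustration, not the paper's model: the scalar activation rule dx/dt = tanh(w·x), the function name, and all parameter values are assumptions chosen only to show that each layer of a feedforward network can be read as one Euler step of a continuous-layer differential equation, so that deepening the network refines the time grid.

```python
import math

def forward(x0, w, n_layers, T=1.0):
    """Propagate an input through a 'network' whose layers are Euler
    steps of the continuous-layer ODE dx/dt = tanh(w * x)."""
    dt = T / n_layers          # layer spacing = temporal resolution
    x = x0
    for _ in range(n_layers):  # one loop iteration = one layer
        x = x + dt * math.tanh(w * x)
    return x

# Adding layers refines the discretization of the same continuous map:
coarse = forward(0.5, w=1.2, n_layers=10)
fine = forward(0.5, w=1.2, n_layers=1000)
```

As the layer count grows, the output converges to the solution of the underlying ODE, which is the sense in which "deepest learning" treats the layer label as continuous.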
Networks of nonlinear systems contain unknown parameters and dynamical degrees of freedom that may not be observable with existing instruments. From observable state variables, we want to estimate the connectivity of a model of such a network and determine the full state of the model at the termination of a temporal observation window during which measurements transfer information to a model of the network. The model state at the termination of a measurement window acts as an initial condition for predicting the future behavior of the network. This allows the validation (or invalidation) of the model as a representation of the dynamical processes producing the observations. Once the model has been tested against new data, it may be utilized as a predictor of responses to innovative stimuli or forcing. We describe a general framework for the tasks involved in the "inverse" problem of determining properties of a model built to represent measured output from physical, biological, or other processes when the measurements are noisy, the model has errors, and the state of the model is unknown when measurements begin. This framework is called statistical data assimilation and is the best one can do in estimating model properties through the use of the conditional probability distributions of the model state variables, conditioned on observations. There is a very broad arena of applications of the methods described. These include numerical weather prediction, properties of nonlinear electrical circuitry, and determining the biophysical properties of functional networks of neurons. Illustrative examples will be given of (1) estimating the connectivity among neurons with known dynamics in a network of unknown connectivity, and (2) estimating the biophysical properties of individual neurons in vitro taken from a functional network underlying vocalization in songbirds. Published by AIP Publishing.
We assess the utility of an optimization-based data assimilation (D.A.) technique for treating the problem of nonlinear neutrino flavor transformation in core-collapse supernovae. D.A. uses measurements obtained from a physical system to estimate the state variable evolution and parameter values of the associated model. Formulated as an optimization procedure, D.A. can provide an integration-blind approach to predicting model evolution, which offers an advantage for models that thwart solution via traditional numerical integration techniques. Further, D.A. performs best for models whose equations of motion are nonlinearly coupled. In this exploratory work, we consider a simple steady-state model with two monoenergetic neutrino beams coherently interacting with each other and a background medium. As this model can be solved via numerical integration, we have an independent consistency check for D.A. solutions. We find that the procedure can capture key features of flavor evolution over the entire trajectory, even given measurements of neutrino flavor only at the endpoint, and with an assumed known initial flavor distribution. Further, the procedure permits an examination of the sensitivity of flavor evolution to estimates of unknown model parameters, locates degeneracies in parameter space, and can identify the specific measurements required to break those degeneracies.
The problem of forecasting the behavior of a complex dynamical system through analysis of observational time-series data becomes difficult when the system expresses chaotic behavior and the measurements are sparse in space and/or time. Although this situation is quite typical across many fields, including numerical weather prediction, the issue of whether the available observations are "sufficient" for generating successful forecasts is still not well understood. An analysis by Whartenby et al. (2013) found that in the context of the nonlinear shallow water equations on a beta plane, standard nudging techniques require observing approximately 70% of the full set of state variables. Here we examine the same system using a method introduced by Rey et al. (2014a), which generalizes standard nudging methods to utilize time delayed measurements. We show that in certain circumstances, it provides a sizable reduction in the number of observations required to construct accurate estimates and high-quality predictions. In particular, we find that this estimate of 70% can be reduced to about 33% using time delays, and even further if Lagrangian drifter locations are also used as measurements.
Data assimilation variational principles (4D-Var) exhibit a natural symplectic structure among the state variables x(t) and their time derivatives ẋ(t). We explore the implications of this structure in both Lagrangian coordinates {x(t), ẋ(t)} and Hamiltonian canonical coordinates {x(t), p(t)} through a numerical examination of the chaotic Lorenz 1996 model in ten dimensions. We find that there are a number of subtleties associated with discretization, boundary conditions, and symplecticity, suggesting differing approaches when working in the Lagrangian versus the Hamiltonian description. We investigate these differences in detail, and accordingly develop a protocol for searching for optimal trajectories in a Hamiltonian space. We find that casting the problem into canonical coordinates can, in some situations, considerably improve the quality of predictions.
With the goal of building a model of the HVC nucleus in the avian song system, we discuss in detail a model of HVCRA projection neurons composed of a somatic compartment with fast Na+ and K+ currents and a dendritic compartment with slower Ca2+ dynamics. We show this model qualitatively exhibits many observed electrophysiological behaviors. We then show in numerical procedures how one can design and analyze feasible laboratory experiments that allow the estimation of all of the many parameters and unmeasured dynamical variables, given observations of the somatic voltage V_s(t) alone. A key to this procedure is to initially estimate the slow dynamics associated with Ca, blocking the fast Na and K variations, and then, with the Ca parameters fixed, estimate the fast Na and K dynamics. This separation of time scales provides a numerically robust method for completing the full neuron model, and the efficacy of the method is tested by prediction when observations are complete. The simulation provides a framework for the slice preparation experiments and illustrates the use of data assimilation methods for the design of those experiments.
We propose a functional architecture of the adult songbird nucleus HVC in which the core element is a "functional syllable unit" (FSU). In this model, HVC is organized into FSUs, each of which provides the basis for the production of one syllable in vocalization. Within each FSU, the inhibitory neuron population takes one of two operational states: 1) simultaneous firing wherein all inhibitory neurons fire simultaneously, and 2) competitive firing of the inhibitory neurons. Switching between these basic modes of activity is accomplished via changes in the synaptic strengths among the inhibitory neurons. The inhibitory neurons connect to excitatory projection neurons such that during state 1 the activity of projection neurons is suppressed, while during state 2 patterns of sequential firing of projection neurons can occur. The latter state is stabilized by feedback from the projection to the inhibitory neurons. Song composition for specific species is distinguished by the manner in which different FSUs are functionally connected to each other. Ours is a computational model built with biophysically based neurons. We illustrate that many observations of HVC activity are explained by the dynamics of the proposed population of FSUs, and we identify aspects of the model that are currently testable experimentally. In addition, and standing apart from the core features of an FSU, we propose that the transition between modes may be governed by the biophysical mechanism of neuromodulation.
We report on the construction of neuron models by assimilating electrophysiological data with large-scale constrained nonlinear optimization. The method implements an interior-point line search to determine parameters from the responses to intracellular current injections of zebra finch HVC neurons. We incorporated these parameters into a nine ionic channel conductance model to obtain completed models, which we then used to predict the state of the neuron under arbitrary current stimulation. Each model was validated by successfully predicting the dynamics of the membrane potential induced by 20-50 different current protocols. The dispersion of parameters extracted from different assimilation windows was studied. Differences in constraints from current protocols, stochastic variability in neuron output, and noise behave as a residual temperature which broadens the global minimum of the objective function to an ellipsoid domain whose principal axes follow an exponentially decaying distribution. The maximum likelihood expectation of extracted parameters was found to provide an excellent approximation of the global minimum and yields highly consistent kinetics for both neurons studied. Large-scale assimilation absorbs the intrinsic variability of electrophysiological data over wide assimilation windows. It builds models in an automatic manner, treating all data as equal quantities and requiring minimal additional insight.
In statistical data assimilation one evaluates the conditional expected values, conditioned on measurements, of interesting quantities on the path of a model through observation and prediction windows. This often requires working with very high dimensional integrals in the discrete time descriptions of the observations and model dynamics, which become functional integrals in the continuous-time limit. Two familiar methods for performing these integrals include (1) Monte Carlo calculations and (2) variational approximations using the method of Laplace plus perturbative corrections to the dominant contributions. We attend here to aspects of the Laplace approximation and develop an annealing method for locating the variational path satisfying the Euler-Lagrange equations that comprises the major contribution to the integrals. This begins with the identification of the minimum action path starting with a situation where the model dynamics is totally unresolved in state space, and the consistent minimum of the variational problem is known. We then proceed to slowly increase the model resolution, seeking to remain in the basin of the minimum action path, until a path that gives the dominant contribution to the integral is identified. After a discussion of some general issues, we give examples of the assimilation process for some simple, instructive models from the geophysical literature. Then we explore a slightly richer model of the same type with two distinct time scales. This is followed by a model characterizing the biophysics of individual neurons.
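The annealing idea described above can be illustrated on a toy problem. Everything below is an illustrative assumption rather than the paper's procedure: a linear scalar model x(n+1) = a·x(n), a discrete action with measurement weight Rm and model weight Rf, and a crude finite-difference gradient descent. The search starts at Rf near zero, where the model dynamics is unresolved and the minimum is trivially the data themselves, and Rf is then raised geometrically so the search stays in the basin of the minimum-action path.

```python
import random

def action(path, obs, a, Rm, Rf):
    """Discrete-time action: measurement error plus model error."""
    meas = sum((x - y) ** 2 for x, y in zip(path, obs))
    model = sum((path[n + 1] - a * path[n]) ** 2 for n in range(len(path) - 1))
    return Rm * meas + Rf * model

def descend(path, obs, a, Rm, Rf, steps=200, lr=0.05, eps=1e-6):
    """Crude finite-difference gradient descent on the action."""
    for _ in range(steps):
        base = action(path, obs, a, Rm, Rf)
        grad = []
        for i in range(len(path)):
            p = list(path)
            p[i] += eps
            grad.append((action(p, obs, a, Rm, Rf) - base) / eps)
        path = [x - lr * g for x, g in zip(path, grad)]
    return path

random.seed(0)
a, Rm = 0.9, 1.0
truth = [1.0]
for _ in range(9):
    truth.append(a * truth[-1])
obs = [x + 0.05 * random.gauss(0, 1) for x in truth]

path = list(obs)             # the Rf -> 0 minimum: the data themselves
for beta in range(8):        # anneal the model-error weight upward
    path = descend(path, obs, a, Rm, Rf=0.01 * 2 ** beta)
```

As Rf grows, the estimated path trades fidelity to the noisy data for consistency with the model dynamics, which is the mechanism the annealing method exploits to locate the dominant contribution to the path integral.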
Most data-based state and parameter estimation methods require suitable initial values or guesses to achieve convergence to the desired solution, which typically is a global minimum of some cost function. Unfortunately, however, other stable solutions (e.g., local minima) may exist and provide suboptimal or even wrong estimates. Here, we demonstrate for a 9-dimensional Lorenz-96 model how to characterize the basin size of the global minimum when applying a particular optimization-based estimation algorithm. We compare three different strategies for generating suitable initial guesses, and we investigate the dependence of the solution on the given trajectory segment (underlying the measured time series). To address the question of how many state variables have to be measured for optimal performance, different types of multivariate time series are considered, consisting of 1, 2, or 3 variables. Based on these time series, the local observability of state variables and parameters of the Lorenz-96 model is investigated and confirmed using delay coordinates. This result is in good agreement with the observation that correct state and parameter estimation results are obtained if the optimization algorithm is initialized with initial guesses close to the true solution. In contrast, initialization with other exact solutions of the model equations (different from the true solution used to generate the time series) typically fails, i.e., the optimization procedure ends up in local minima different from the true solution. Initialization using random values in a box around the attractor exhibits success rates depending on the number of observables and the available time series (trajectory segment). (C) 2015 AIP Publishing LLC.
Cardiac rhythm management devices provide therapies for both arrhythmias and resynchronisation but not heart failure, which affects millions of patients worldwide. This paper reviews recent advances in biophysics and mathematical engineering that provide a novel technological platform for addressing heart disease and enabling beat-to-beat adaptation of cardiac pacing in response to physiological feedback. The technology consists of silicon hardware central pattern generators (hCPGs) that may be trained to emulate accurately the dynamical response of biological central pattern generators (bCPGs). We discuss the limitations of present CPGs and appraise the advantages of analog over digital circuits for application in bioelectronic medicine. To test the system, we have focused on the cardio-respiratory oscillators in the medulla oblongata that modulate heart rate in phase with respiration to induce respiratory sinus arrhythmia (RSA). We describe here a novel, scalable hCPG comprising physiologically realistic (Hodgkin-Huxley type) neurones and synapses. Our hCPG comprises two neurones that antagonise each other to provide rhythmic motor drive to the vagus nerve to slow the heart. We show how recent advances in modelling allow the motor output to adapt to physiological feedback such as respiration. In rats, we report on the restoration of RSA using an hCPG that receives diaphragmatic electromyography input and uses it to stimulate the vagus nerve at specific time points of the respiratory cycle to slow the heart rate. We have validated the adaptation of stimulation to alterations in respiratory rate. We demonstrate that the hCPG is tuneable in terms of the depth and timing of the RSA relative to respiratory phase. These pioneering studies will now permit an analysis of the physiological role of RSA as well as any potential therapeutic use in cardiac disease.
Data assimilation transfers information from an observed system to a physically based model system with state variables x(t). The observations are typically noisy, the model has errors, and the initial state x(t(0)) is uncertain: the data assimilation is statistical. One can ask about expected values of functions G(X), denoted <G(X)>, on the path X = {x(t(0)), ..., x(t(m))} of the model state through the observation window {t(0), ..., t(m)}. The conditional (on the measurements) probability distribution P(X) = exp[-A_0(X)] determines these expected values. Variational methods using saddle points of the "action" A_0(X), known as 4DVar (Talagrand and Courtier, 1987; Evensen, 2009), are utilized for estimating <G(X)>. In a path integral formulation of statistical data assimilation, we consider variational approximations in a realization of the action where measurement errors and model errors are Gaussian. We (a) discuss an annealing method for locating the path X^0 giving a consistent minimum of the action A_0(X^0), (b) consider the explicit role of the number of measurements at each t(n) in determining A_0(X^0), and (c) identify a parameter regime for the scale of model errors which allows X^0 to give a precise estimate of <G(X^0)> with computable, small higher-order corrections.
Information in measurements of a nonlinear dynamical system can be transferred to a quantitative model of the observed system to establish its fixed parameters and unobserved state variables. After this learning period is complete, one may predict the model response to new forces and, when successful, these predictions will match additional observations. This adjustment process encounters problems when the model is nonlinear and chaotic because dynamical instability impedes the transfer of information from the data to the model when the number of measurements at each observation time is insufficient. We discuss the use of information in the waveform of the data, realized through a time delayed collection of measurements, to provide additional stability and accuracy to this search procedure. Several examples are explored, including a few familiar nonlinear dynamical systems and small networks of Colpitts oscillators.
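The delay-coordinate construction behind this use of waveform information can be sketched minimally. The function name, the sine test signal, and the delay choice below are assumptions for illustration only: a scalar measurement s(t) is augmented with its own past values, so each observation time carries information about unobserved directions in state space.

```python
import math

def delay_vectors(series, dim, tau):
    """Return delay vectors [s(t), s(t - tau), ..., s(t - (dim-1)*tau)]."""
    return [[series[i - k * tau] for k in range(dim)]
            for i in range((dim - 1) * tau, len(series))]

# A scalar projection of a two-dimensional oscillation:
s = [math.sin(0.1 * n) for n in range(400)]
vecs = delay_vectors(s, dim=2, tau=16)   # tau near a quarter period
```

With tau near a quarter period, the two delay coordinates are roughly the sine and cosine of the phase, so the one-dimensional measurement unfolds into a loop that resolves the full two-dimensional state rather than collapsing it onto a line.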
Recent results demonstrate techniques for fully quantitative, statistical inference of the dynamics of individual neurons under the Hodgkin-Huxley framework of voltage-gated conductances. Using a variational approximation, this approach has been successfully applied to simulated data from model neurons. Here, we use this method to analyze a population of real neurons recorded in a slice preparation of the zebra finch forebrain nucleus HVC. Our results demonstrate that using only 1,500 ms of voltage recorded while injecting a complex current waveform, we can estimate the values of 12 state variables and 72 parameters in a dynamical model, such that the model accurately predicts the responses of the neuron to novel injected currents. A less complex model produced consistently worse predictions, indicating that the additional currents contribute significantly to the dynamics of these neurons. Preliminary results indicate some differences in the channel complement of the models for different classes of HVC neurons, which accords with expectations from the biology. While the model for each cell is incomplete (representing only the somatic compartment, and likely to be missing classes of channels that the real neurons possess), our approach opens the possibility of investigating, through modeling, the plausibility of additional classes of channels the cell might possess, thus improving the models over time. These results provide an important foundational basis for building biologically realistic network models, such as the one in HVC that contributes to the process of song production and developmental vocal learning in songbirds.
Estimating the behavior of a network of neurons requires accurate models of the individual neurons along with accurate characterizations of the connections among them. Whereas for a single cell, measurements of the intracellular voltage are technically feasible and sufficient to characterize a useful model of its behavior, making sufficient numbers of simultaneous intracellular measurements to characterize even small networks is infeasible. This paper builds on prior work on single neurons to explore whether knowledge of the time of spiking of neurons in a network, once the nodes (neurons) have been characterized biophysically, can provide enough information to usefully constrain the functional architecture of the network: the existence of synaptic links among neurons and their strength. Using standardized voltage and synaptic gating variable waveforms associated with a spike, we demonstrate that the functional architecture of a small network of model neurons can be established.
We investigate the dynamics of a conductance-based neuron model coupled to a model of intracellular calcium uptake and release by the endoplasmic reticulum. The intracellular calcium dynamics occur on a time scale that is orders of magnitude slower than voltage spiking behavior. Coupling these mechanisms sets the stage for the appearance of chaotic dynamics, which we observe within certain ranges of model parameter values. We then explore the question of whether one can, using observed voltage data alone, estimate the states and parameters of the voltage plus calcium (V+Ca) dynamics model. We find the answer is negative. Indeed, we show that voltage plus another observed quantity must be known to allow the estimation to be accurate. We show that observing both the voltage time course V(t) and the intracellular Ca time course will permit accurate estimation, and from the estimated model state, accurate prediction after observations are completed. This sets the stage for how one will be able to use a more detailed model of V+Ca dynamics in neuron activity in the analysis of experimental data on individual neurons as well as functional networks in which the nodes (neurons) have these biophysical properties.
Transferring information from observations to models of complex systems may meet impediments when the number of observations at any observation time is not sufficient. This is especially so when chaotic behavior is expressed. We show how to use time-delay embedding, familiar from nonlinear dynamics, to provide the information required to obtain accurate state and parameter estimates. Good estimates of parameters and unobserved states are necessary for good predictions of the future state of a model system. This method may be critical in allowing the understanding of prediction in complex systems as varied as nervous systems and weather prediction where insufficient measurements are typical. (C) 2014 Elsevier B.V. All rights reserved.
The authors consider statistical ensemble data assimilation for a one-layer shallow-water equation in a twin experiment: data are generated by an N x N enstrophy-conserving grid integration scheme along with an Ekman vertical velocity at the bottom of an Ekman layer driving the flow and Rayleigh and eddy viscosity dissipation damping the flow. Data are generated for N = 16 and the chaotic flow that results is analyzed. This analysis is performed in a path-integral formulation of the data assimilation problem. These path integrals are estimated by a Monte Carlo method using a Metropolis-Hastings algorithm. The authors' concentration is on the number of measurements L_c that must be assimilated by the model to allow accurate estimation of the full state of the model at the end of an observation window. It is found that for this shallow-water flow approximately 70% of the full set of state variables must be observed to accomplish accurate state estimation and prediction. The number of required observations is determined by examining the number needed to synchronize the observed data and the model output when L data streams are assimilated by the model. Synchronization occurs when L >= L_c and the correct selection of which L_c data are observed is made. If the number of observations is too small, so synchronization does not occur, or the selection of observations does not lead to synchronization of the data with the model output, state estimates during and at the end of the observation window and predictions beyond the observation window are inaccurate.
Hodgkin-Huxley (HH) models of neuronal membrane dynamics consist of a set of nonlinear differential equations that describe the time-varying conductance of various ion channels. Using observations of voltage alone we show how to estimate the unknown parameters and unobserved state variables of an HH model in the expected circumstance that the measurements are noisy, the model has errors, and the state of the neuron is not known when observations commence. The joint probability distribution of the observed membrane voltage and the unobserved state variables and parameters of these models is a path integral through the model state space. The solution to this integral allows estimation of the parameters and thus a characterization of many biological properties of interest, including channel complement and density, that give rise to a neuron's electrophysiological behavior. This paper describes a method for directly evaluating the path integral using a Monte Carlo numerical approach. This provides estimates not only of the expected values of model parameters but also of their posterior uncertainty. Using test data simulated from neuronal models comprising several common channels, we show that short (< 50 ms) intracellular recordings from neurons stimulated with a complex time-varying current yield accurate and precise estimates of the model parameters as well as accurate predictions of the future behavior of the neuron. We also show that this method is robust to errors in model specification, supporting model development for biological preparations in which the channel expression and other biophysical properties of the neurons are not fully known.
Neuroscientists often propose detailed computational models to probe the properties of the neural systems they study. With the advent of neuromorphic engineering, there is an increasing number of hardware electronic analogs of biological neural systems being proposed as well. However, for both biological and hardware systems, it is often difficult to estimate the parameters of the model so that they are meaningful to the experimental system under study, especially when these models involve a large number of states and parameters that cannot be simultaneously measured. We have developed a procedure to solve this problem in the context of interacting neural populations using a recently developed dynamic state and parameter estimation (DSPE) technique. This technique uses synchronization as a tool for dynamically coupling experimentally measured data to its corresponding model to determine its parameters and internal state variables. Typically experimental data are obtained from the biological neural system and the model is simulated in software; here we show that this technique is also efficient in validating proposed network models for neuromorphic spike-based very large-scale integration (VLSI) chips and that it is able to systematically extract network parameters such as synaptic weights, time constants, and other variables that are not accessible by direct observation. Our results suggest that this method can become a very useful tool for model-based identification and configuration of neuromorphic multichip VLSI systems.
The answers to data assimilation questions can be expressed as path integrals over all possible state and parameter histories. We show how these path integrals can be evaluated numerically using a Markov Chain Monte Carlo method designed to run in parallel on a graphics processing unit (GPU). We demonstrate the application of the method to an example with a transmembrane voltage time series of a simulated neuron as an input, and using a Hodgkin-Huxley neuron model. By taking advantage of GPU computing, we gain a parallel speedup factor of up to about 300, compared to an equivalent serial computation on a CPU, with performance increasing as the length of the observation time used for data assimilation increases. (C) 2011 Elsevier Inc. All rights reserved.
We present a method for using measurements of membrane voltage in individual neurons to estimate the parameters and states of the voltage-gated ion channels underlying the dynamics of the neuron's behavior. Short injections of a complex time-varying current provide sufficient data to determine the reversal potentials, maximal conductances, and kinetic parameters of a diverse range of channels, representing tens of unknown parameters and many gating variables in a model of the neuron's behavior. These estimates are used to predict the response of the model at times beyond the observation window. This method of data assimilation extends to the general problem of determining model parameters and unobserved state variables from a sparse set of observations, and may be applicable to networks of neurons. We describe an exact formulation of the tasks in nonlinear data assimilation when one has noisy data, errors in the models, and incomplete information about the state of the system when observations commence. This is a high dimensional integral along the path of the model state through the observation window. In this article, a stationary path approximation to this integral, using a variational method, is described and tested employing data generated using neuronal models comprising several common channels with Hodgkin-Huxley dynamics. These numerical experiments reveal a number of practical considerations in designing stimulus currents and in determining model consistency. The tools explored here are computationally efficient and have paths to parallelization that should allow large individual neuron and network problems to be addressed.
The process of transferring information from observations of a dynamical system to estimate the fixed parameters and unobserved states of a system model can be formulated as the evaluation of a discrete-time path integral in model state space. The observations serve as a guiding 'potential' working with the dynamical rules of the model to direct system orbits in state space. The path-integral representation permits direct numerical evaluation of the conditional mean path through the state space as well as conditional moments about this mean. Using a Monte Carlo method for selecting paths through state space, we show how these moments can be evaluated and demonstrate in an interesting model system the explicit influence of the role of transfer of information from the observations. We address the question of how many observations are required to estimate the unobserved state variables, and we examine the assumptions of Gaussianity of the underlying conditional probability. Copyright (C) 2010 Royal Meteorological Society
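A toy version of such a path sampler can make the construction concrete. The linear model x(n+1) = a·x(n), the Gaussian error weights, and the proposal scale below are all illustrative assumptions, not the systems studied in these papers: Metropolis-Hastings moves perturb one path component at a time, and the retained post-burn-in samples estimate the conditional mean path through state space.

```python
import math
import random

def action(path, obs, a, Rm, Rf):
    """Negative log of the conditional path probability, up to a constant."""
    meas = sum((x - y) ** 2 for x, y in zip(path, obs))
    model = sum((path[n + 1] - a * path[n]) ** 2 for n in range(len(path) - 1))
    return Rm * meas + Rf * model

random.seed(1)
a, Rm, Rf = 0.9, 100.0, 100.0        # Gaussian errors with sigma = 0.1
truth = [1.0]
for _ in range(9):
    truth.append(a * truth[-1])
obs = [x + 0.1 * random.gauss(0, 1) for x in truth]

path = list(obs)
cur = action(path, obs, a, Rm, Rf)
samples = []
for step in range(30000):
    i = random.randrange(len(path))       # perturb one path component
    prop = list(path)
    prop[i] += random.gauss(0, 0.05)
    new = action(prop, obs, a, Rm, Rf)
    if new < cur or random.random() < math.exp(cur - new):
        path, cur = prop, new             # Metropolis accept/reject
    if step > 5000 and step % 10 == 0:    # keep post-burn-in samples
        samples.append(list(path))

mean_path = [sum(s[i] for s in samples) / len(samples)
             for i in range(len(obs))]
```

Higher conditional moments about the mean follow from the same sample set, which is how the Monte Carlo approach also yields uncertainty estimates and a check on the Gaussianity assumptions.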
Throughout the brain, neurons encode information in fundamental units of spikes. Each spike represents the combined thresholding of synaptic inputs and intrinsic neuronal dynamics. Here, we address a basic question of spike train formation: how do perithreshold synaptic inputs perturb the output of a spiking neuron? We recorded from single entorhinal principal cells in vitro and drove them to spike steadily at ~5 Hz (theta range) with direct current injection, then used a dynamic clamp to superimpose strong excitatory conductance inputs at varying rates. Neurons spiked most reliably when the input rate matched the intrinsic neuronal firing rate. We also found a striking tendency of neurons to preserve their rates and coefficients of variation, independently of input rates. As mechanisms for this rate maintenance, we show that the efficacy of the conductance inputs varied with the relationship of input rate to neuronal firing rate, and with the arrival time of the input within the natural period. Using a novel method of spike classification, we developed a minimal Markov model that reproduced the measured statistics of the output spike trains and thus allowed us to identify and compare contributions to the rate maintenance and resonance. We suggest that the strength of rate maintenance may be used as a new categorization scheme for neuronal response and note that individual intrinsic spiking mechanisms may play a significant role in forming the rhythmic spike trains of activated neurons; in the entorhinal cortex, individual pacemakers may dominate production of the regional theta rhythm.
In variational formulations of data assimilation, the estimation of parameters or initial state values by a search for a minimum of a cost function can be hindered by the numerous local minima in the dependence of the cost function on those quantities. We argue that this is a result of instability on the synchronization manifold where the observations are required to match the model outputs in the situation where the data and the model are chaotic. The solution to this impediment to estimation is given as controls moving the positive conditional Lyapunov exponents on the synchronization manifold to negative values and adding to the cost function a penalty that drives those controls to zero as a result of the optimization process implementing the assimilation. This is seen as the solution to the proper size of 'nudging' terms: they are zero once the estimation has been completed, leaving only the physics of the problem to govern forecasts after the assimilation window. We show how this procedure, called Dynamical State and Parameter Estimation (DSPE), works in the case of the Lorenz96 model with nine dynamical variables. Using DSPE, we are able to accurately estimate the fixed parameter of this model and all of the state variables, observed and unobserved, over an assimilation time interval [0, T]. Using the state variables at T and the estimated fixed parameter, we are able to accurately forecast the state of the model for t > T up to those times where the chaotic behaviour of the system interferes with forecast accuracy. Copyright (C) 2010 Royal Meteorological Society
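The control idea can be sketched on a nine-variable Lorenz-96 system. The sketch below is not the DSPE algorithm itself: the Euler integration, the fixed gain, the forcing value, and the choice of observed components are illustrative assumptions. A nudging term gain*(data - model) is added to the observed components, pushing the conditional Lyapunov exponents negative so the model state converges toward the data-generating trajectory, while an unnudged copy of the model diverges.

```python
def l96(x, F=8.0):
    """Lorenz-96 vector field on a ring of len(x) sites."""
    n = len(x)
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + F
            for i in range(n)]

def step(x, dt, drive=None, gain=0.0, obs=()):
    """One Euler step, optionally nudged toward driving data on obs indices."""
    dx = l96(x)
    if drive is not None:
        for i in obs:
            dx[i] += gain * (drive[i] - x[i])   # control ('nudging') term
    return [xi + dt * di for xi, di in zip(x, dx)]

dt = 0.001
truth = [8.0] * 9
truth[0] += 0.01                  # perturb off the fixed point
for _ in range(5000):             # spin up onto the attractor
    truth = step(truth, dt)

free = [t + 0.5 for t in truth]   # both model copies start off the truth
nudged = list(free)
for _ in range(5000):
    free = step(free, dt)
    nudged = step(nudged, dt, drive=truth, gain=10.0, obs=(0, 2, 4, 6, 8))
    truth = step(truth, dt)

err = lambda m: max(abs(p - q) for p, q in zip(m, truth))
```

In DSPE proper, the gain would itself be a state-dependent control penalized in the cost function, so that it vanishes once the estimation is complete; the fixed gain here only demonstrates the stabilization of the synchronization manifold.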