Data Science Seminar

Welcome to the data seminar. The topics of the talks will vary among multiple topics in applied analysis, probability, applied mathematics related to data and dynamical systems, statistical and machine learning, signal processing, and computation.
To the attending graduate students: together with each talk we include a list of papers that are relevant to that talk, and we strongly encourage the graduate students to read those papers both in advance of the talk and after it.
You may view and subscribe to the calendar for the seminar series (this is being tested, please let us know if it does not work for you).
Organizers: Mauro Maggioni, James Murphy, Stefano Vigogna, Fei Lu. Contacts us if you would like to meet with any of the speakers.

The data seminar will take place on Wednesdays at 3pm, in Hodson Hall 203, during the semester, on Johns Hopkins Homewood campus this map).
September 13th
Yannis Kevrekidis
Bloomberg distinguished Professor, ChemBE, Johns Hopkins University

Title: No equations, no variables, no parameters, no space, no time: Data and the computational modeling of complex/multiscale systems

Abstract: Obtaining predictive dynamical equations from data lies at the heart of science and engineering modeling, and is the linchpin of our technology. In mathematical modeling one typically progresses from observations of the world (and some serious thinking!) first to equations for a model, and then to the analysis of the model to make predictions. Good mathematical models give good predictions (and inaccurate ones do not) - but the computational tools for analyzing them are the same: algorithms that are typically based on closed form equations. While the skeleton of the process remains the same, today we witness the development of mathematical techniques that operate directly on observations -data-, and appear to circumvent the serious thinking that goes into selecting variables and parameters and deriving accurate equations. The process then may appear to the user a little like making predictions by "looking in a crystal ball". Yet the "serious thinking" is still there and uses the same -and some new- mathematics: it goes into building algorithms that "jump directly" from data to the analysis of the model (which is now not available in closed form) so as to make predictions. Our work here presents a couple of efforts that illustrate this ``new” path from data to predictions. It really is the same old path, but it is travelled by new means.

September 20th
Carey Priebe
Professor, Applied Mathematics and Statistics, Johns Hopkins University

September 27
John Benedetto
Professor, Department of Mathematics, University of Maryland, College Park and Norbert Wiener Center

Frames -- two case studies: ambiguity and uncertainty

The theory of frames is an essential concept for dealing with signal representation in noisy environments. We shall examine the theory in the settings of the narrow band ambiguity function and of quantum information theory. For the ambiguity function, best possible estimates are derived for applicable constant amplitude zero autocorrelation (CAZAC) sequences using Weil's solution of the Riemann hypothesis for finite fields. In extending the theory to the vector-valued case modelling multi-sensor environments, the definition of the ambiguity function is characterized by means of group frames. For the uncertainty principle, Andrew Gleason's measure theoretic theorem, establishing the transition from the lattice interpretation of quantum mechanics to Born's probabilistic interpretation, is generalized in terms of frames to deal with uncertainty principle inequalities beyond Heisenberg's. My collaborators are Travis Andrews, Robert Benedetto, Jeffrey Donatelli, Paul Koprowski, and Joseph Woodworth.

October 4th
Nathan Kutz
Robert Bolles and Yasuko Endo Professor, Applied Mathematics, University of Washington

Data-driven discovery of governing equations and physical laws

The emergence of data methods for the sciences in the last decade has been enabled by the plummeting costs of sensors, computational power, and data storage. Such vast quantities of data afford us new opportunities for data-driven discovery, which has been referred to as the 4th paradigm of scientific discovery. We demonstrate that we can use emerging, large-scale time-series data from modern sensors to directly construct, in an adaptive manner, governing equations, even nonlinear dynamics, that best model the system measured using modern regression techniques. Recent innovations also allow for handling multi-scale physics phenomenon and control protocols in an adaptive and robust way. The overall architecture is equation-free in that the dynamics and control protocols are discovered directly from data acquired from sensors. The theory developed is demonstrated on a number of canonical example problems from physics, biology and engineering.

October 11th
Fei Lu
Assistant Professor, Department of Mathematics, Johns Hopkins University

Data assimilation with stochastic model reduction

In weather and climate prediction, data assimilation combines data with dynamical models to make prediction, using ensemble of solutions to represent the uncertainty. Due to limited computational resources, coarse approximations of the dynamical models are often used and subgrid scales are often not taken into account. We propose to quantify the effects of the subgrid scales from simulated data by stochastic parametrization, therefore obtaining a stochastic reduced model that captures main statistical and dynamical properties. We demonstrate by an example of a chaotic systems that the stochastic reduced model improves the performance of ensemble data assimilation.

October 18th
Roy Lederman
Postdoc, Program in Applied and Computational Mathematics, Princeton University


Related papers:
October 25th
John Harlim
Professor, Department of Mathematics, The Pennsylvania State University

Title: TBA

Abstract: TBA

Related papers:

November 1st
Andrew Sommese
Vincent J. and Annamarie Micus Duncan Professor of Mathematics, Department of Applied and Computational Mathematics and Statistics, University of Notre Dame

Title: A Brief Survey of Numerical Algebraic Geometry

Abstract: This talk will survey numerical algebraic geometry with an emphasis on the parts that have been useful in applications and some issues that have arisen in recent years.

Related papers:
Numerical solution of systems of polynomials arising in engineering and science, (2005), World Scientific Press.
Applications to kinematics up to 2011: C.W. Wampler and A.J. Sommese, Numerical Algebraic Geometry and Algebraic Kinematics, Acta Numerica 20 (2011).
Work up to 2013 with a software emphasis: D.J. Bates, J.D. Hauenstein, A.J. Sommese, and C.W. Wampler, Numerically solving polynomial systems with Bertini, SIAM (2013).
Some more recent articles: A homotopy method based on WENO schemes for solving steady state problems of hyperbolic conservation laws, Cell cycle control and bifurcation for a free boundary problem modeling tissue growth, On the completeness of solutions of Bethe's equations, Homotopy techniques for tensor decomposition and perfect identifiability, What is numerical algebraic geometry? Foreword, Journal of Symbolic Computation, Numerical Decomposition of Real Algebraic Curves and Surfaces
November 8th
Eitan Tadmor
Distinguished University Professor, Department of Mathematics, Institute for Physical Science & Technology, Center for Scientific Computation and Mathematical Modeling (CSCAMM), University of Maryland

Title: TBA

Abstract: TBA

Related papers:

November 15th
Tyrus Berry
Research Associate, Department of Mathematical Science, George Mason University

Title: TBA

Abstract: TBA

Related papers:

December 6th
Hau-Tieng Wu
Associate Professor, Department of Mathematics, Duke University

Title: TBA

Abstract: TBA

Related papers:


The data seminar will take place on Wednesdays at 3pm, in Whitehead Hall 304, during the semester, on Johns Hopkins Homewood campus this map).
February 8th
Kasso Okoudjou
Professor and Associate Chair, Department of Mathematics, University of Maryland, College Park

Inductive and numerical approaches to the HRT conjecture

Given a non-zero square integrable function $g$ and $\Lambda=\{(a_k, b_k)\}_{k=1}^N \subset \Br^2$ let $G(g, \Lambda)=\{e^{2\pi i b_k \cdot}g(\cdot - a_k)\}_{k=1}^N.$ The Heil-Ramanathan-Topiwala (HRT) Conjecture is the question of whether $G(g, \Lambda)$ is linearly independent. For the last two decades, very little progress has been made in settling the conjecture. In the first part of the talk, I will give an overview of the state of the conjecture. I will then describe recent inductive and numerical approaches to attack the conjecture. If time permits, I will present some new positive results in the special case where $g$ is real-valued.

February 15th
Jerome Darbon
Assistant Professor, Applied Mathematics, Brown University

On convex finite-dimensional variational methods in imaging sciences, and Hamilton-Jacobi equations

We consider standard finite-dimensional variational models used in signal/image processing that consist in minimizing an energy involving a data fidelity term and a regularization term. We propose new remarks from a theoretical perspective which give a precise description on how the solutions of the optimization problem depend on the amount of smoothing effects and the data itself. The dependence of the minimal values of the energy is shown to be ruled by Hamilton-Jacobi equations, while the minimizers $u(x,t)$ for the observed images x and smoothing parameters $t$ are given by $ u(x,t)=x -\nabla H(\nabla E(x,t))$ where $E(x,t)$ is the minimal value of the energy and $H$ is a Hamiltonian related to the data fidelity term. Various vanishing smoothing parameter results are derived illustrating the role played by the prior in such limits. Finally, we briefly present an efficient numerical numerical method for solving certain Hamilton-Jacobi equations in high dimension and some applications in optimal control.

March 8th
Matthew Hirn
Assistant Professor, Department of Mathematics, Michigan State University

Learning many body physics via multiscale, multilayer machine learnining architectures

Deep learning algorithms are making their mark in machine learning, obtaining state of the art results across computer vision, natural language processing, auditory signal processing and more. A wavelet scattering transform has the general architecture of a convolutional neural network, but leverages structure within data by encoding multiscale, stable invariants relevant to the task at hand. This approach is particularly relevant to data generated by physical systems, as such data must respect underlying physical laws. We illustrate this point through many body physics, in which scattering transforms can be loosely interpreted as the machine learning version of a fast multipole method (FMM). Unlike FMMs, which efficiently simulate the physical system, the scattering transform learns the underlying physical kernel from given states of the system. The resulting learning algorithm obtains state of the art numerical results for the regression of molecular energies in quantum chemistry, obtaining errors on the order of more costly quantum mechanical approaches.

March 29th
Jason Eisner
Professor, Department of Computer Science, Johns Hopkins University

Probabilistic Modeling of Natural Language

Natural language is a particular kind of time-series data. By way of introduction, I will informally sketch some of the phenomena that occur in natural language data and the kinds of probabilistic models that are traditionally used to describe them (e.g., context free grammars, Chinese restaurant processes, graphical models, finite-state transducers, recurrent neural networks augmented with memory). Many of these are covered in my JHU fall course, EN.600.465 Natural Language Processing. As an illustrative example, I will then describe a new conditional probability model that combines LSTMs with finite-state transducers to predict one string from another. For example, such a model can convert between the past and present tenses of an unfamiliar verb. Such pairwise conditional distributions can be combined into graphical models that model the relationships among many strings.

April 5th
Afonso Bandeira
Assistant Professor, Courant Institute of Mathematical Sciences, NYU

On Phase Transitions for Spiked Random Matrix and Tensor Models

A central problem of random matrix theory is to understand the eigenvalues of spiked random matrix models, in which a prominent eigenvector (or low rank structure) is planted into a random matrix. These distributions form natural statistical models for principal component analysis (PCA) problems throughout the sciences, where the goal is often to recover or detect the planted low rank structured. In this talk we discuss fundamental limitations of statistical methods to perform these tasks and methods that outperform PCA at it. Emphasis will be given to low rank structures arising in Synchronization problems. Time permitting, analogous results for spiked tensor models will also be discussed. Joint work with: Amelia Perry, Alex Wein, and Ankur Moitra.

April 12th
Andrew Christlieb
MSU Foundation Professor (Department of Mathematics) and Department Chair (Department of Computational Mathematics, Science and Engineering), Michigan State University

A sub-linear deterministic FFT for sparse high dimensional signals

In this talk we investigate the problems of efficient recover of sparse signals (sparsity=k) in a high dimensional setting. In particular, we are going to investigate efficient recovery of the k largest Fourier modes of a signal of size N^d, where N is the bandwidth and d is the dimension. Our objective is the development of a high dimensional sub-linear FFT, d=100 or 1000, that can recover the signal in O(d k log k) time. The methodology is based on our one dimensional deterministic sparse FFT that is O(k log k). The original method is recursive and based on ratios of short FFTs of pares of sub-sampled signals. The same ratio test allows us to identify when there is a collision due to aliasing the sub-sampled signals. The recursive nature allows us to separate and identify frequencies that have collided. Key in the high dimensional setting is the introduction of a partial unwrapping method and a tilting method that can ensure that we avoid collisions in the high dimensional setting on sub-sampled grids. We present the method, some analysis and results for a range of tests in both the noisy and noiseless cases.

April 26th
Wojciech Czaja
Professor, Department of Mathematics, University of Maryland, College Park

Solving Fredholm integrals from incomplete measurements

We present an algorithm to solve Fredholm integrals of the first kind with tensor product structures, from a limited number of measurements with the goal of using this method to accelerate Nuclear Magnetic Resonance (NMR) acquisition. This is done by incorporating compressive sampling type arguments to fill in the missing measurements using a priori knowledge of the structure of the data. In the first step, we recover a compressed data matrix from measurements that form a tight frame, and establish that these measurements satisfy the restricted isometry property (RIP). In the second step, we solve the zeroth-order regularization minimization problem using the Venkataramanan-Song-Huerlimann algorithm. We demonstrate the performance of this algorithm on simulated and real data and we compare it with other sampling techniques. Our theory applied to both 2D and multidimensional NMR.


The data seminar will take place on Wednesdays at 3pm, in Krieger Hall 309, during the semester, on Johns Hopkins Homewood campus (building 39 at location F2/3 on this map).
September 7th
Radu Balan
Professor, Department of Mathematics, University of Maryland, College Park

Statistics of the Stability Bounds in the Phase Retrieval Problem

In this talk we present a local-global Lipschitz analysis of the phase retrieval problem. Additionally we present tentative estimates of the tail-bound for the distribution of the global Lipschitz constants. Specifically it is known that if the frame $\{f_1,\ldots,f_m\}$ for $C^n$ is phase retrievable then there are constants $a_0$ and $b_0$ so that for every $x,y\in C^n$: $a_0 ||xx^*-yy^*||_1^2 \leq \sum_{k=1}^m ||\langle x,f_k\rangle|^2-|\langle y,f_k\rangle|^2|^2 \leq b_0 ||xx^*-yy^*||_1^2$. Assume $f_1,\ldots,f_m$ are independent realizations with entries from $CN(0,1)$. In this talk we establish estimates for the probability $P(a_0>a)$.

September 14th
Nate Strawn
Assistant Professor, Department of Mathematics and Statistics, Georgetown University

Finite-Sample Bounds for Geometric Multiresolution Analysis

Geometric Multiresolution Analysis takes a multiscale partition of a Euclidean dataset and fits affine approximations to data inside of each partition element at every scale. We prove finite-sample approximation error bounds for Geometric Multiresolution Analysis of "noisy" manifold models. First, we develop bounds on approximation error based on compatibility conditions between a multiscale partitions and the underlying generating distribution of the dataset. We then show partitions obtained from Cover Trees of random data drawn from "noisy" manifolds will satisfy these compatibility conditions with high probability. We conclude with open questions and future directions. This is joint work with Mauro Maggioni (Johns Hopkins University) and Stanislav Minsker (University of Southern California).

September 21st
Charles Meneveau
Louis M. Sardella Professor of Mechanical Engineering, Johns Hopkins University

Hydrodynamic turbulence in the era of big data: simulation, data, and analysis

In this talk, we review the classic problem of Navier-Stokes turbulence, the role numerical simulations have played in advancing the field, and the data challenges posed by these simulations. We describe the Johns Hopkins Turbulence Databases (JHTDB) and present some sample applications from the areas of velocity increment statistics and finite time Lyapunov exponents in isotropic turbulence and wall modeling for Large Eddy Simulations of wall-bounded flows.

Note date and place: September 23rd - 2pm - Gilman 132
Ben Leimkuhler
Professor of Mathematics, University of Edinburgh

From Molecular Dynamics to Large Scale Inference

Molecular models and data analytics problems give rise to very large systems of stochastic differential equations (SDEs) whose paths are designed to ergodically sample multimodal probability distributions. An important challenge for the numerical analyst (or the data scientist, for that matter) is the design of numerical procedures to generate these paths. One of the interesting ideas is to construct stochastic numerical methods with close attention to the error in the invariant measure. Another is to redesign the underlying stochastic dynamics to reduce bias or locally transform variables to enhance sampling efficiency. I will illustrate these ideas with various examples, including a geodesic integrator for constrained Langevin dynamics [1] and an ensemble sampling strategy for distributed inference [2].

September 28th
Rene Vidal
Professor of Biomedical Engineering, Computer Science, Mechanical Engineering, and Electrical and Computer Engineering, Johns Hopkins University

Global Optimality in Matrix and Tensor Factorization, Deep Learning, and Beyond

Matrix, tensor, and other factorization techniques are used in many applications and have enjoyed significant empirical success in many fields. However, common to a vast majority of these problems is the significant disadvantage that the associated optimization problems are typically non-convex due to a multilinear form or other convexity destroying transformation. Building on ideas from convex relaxations of matrix factorizations, in this talk I will present a very general framework which allows for the analysis of a wide range of non-convex factorization problems - including matrix factorization, tensor factorization, and deep neural network training formulations. In particular, I will present sufficient conditions under which a local minimum of the non-convex optimization problem is a global minimum and show that if the size of the factorized variables is large enough then from any initialization it is possible to find a global minimizer using a local descent algorithm.

October 5th
Mauro Maggioni
Bloomberg Distinguished Professor of Mathematics and Applied Mathematics and Statistics, Johns Hopkins University

Geometric Methods for the Approximation of High-dimensional Dynamical Systems

I will discuss a geometry-based statistical learning framework for performing model reduction and modeling of stochastic high-dimensional dynamical systems. I will consider two complementary settings. In the first one, I am given long trajectories of a system, e.g. from molecular dynamics, and I discuss techniques for estimating, in a robust fashion, an effective number of degrees of freedom of the system, which may vary in the state space of then system, and a local scale where the dynamics is well-approximated by a reduced dynamics with a small number of degrees of freedom. I will then use these ideas to produce an approximation to the generator of the system and obtain, via eigenfunctions of an empirical Fokker-Planck question, reaction coordinates for the system that capture the large time behavior of the dynamics. I will present various examples from molecular dynamics illustrating these ideas. In the second setting I assume I only have access to a (large number of expensive) simulators that can return short simulations of high-dimensional stochastic system, and introduce a novel statistical learning framework for learning automatically a family of local approximations to the system, that can be (automatically) pieced together to form a fast global reduced model for the system, called ATLAS. ATLAS is guaranteed to be accurate (in the sense of producing stochastic paths whose distribution is close to that of paths generated by the original system) not only at small time scales, but also at large time scales, under suitable assumptions on the dynamics. I discuss applications to homogenization of rough diffusions in low and high dimensions, as well as relatively simple systems with separations of time scales, and deterministic chaotic systems in high-dimensions, that are well-approximated by stochastic differential equations.

No knowledge of molecular dynamics is required, and the techniques above are quite universal. Ideas in the first part of the talk are based on what is called Diffusion Geometry, and have been used widely in data analysis; ideas in the second part are applicable to MCMC. The talk will be accessible to students with a wide variety of backgrounds and interests.

October 12th
Maria Cameron
Assistant Professor, Department of Mathematics, University of Maryland College Park

Modeling the dynamics of interacting particles by means of stochastic networks

Material science have been rapidly developing in recent years. A variety of particles interacting according to different kinds of pair potentials has been produced in experimental works. Looking into the future, one can imagine controlled self-assembly of particles into clusters of desired structures leading to the creation of new types of materials. Analytical studies of the self-assembly involve coping with difficulties associated with the huge numbers configurations, high dimensionality, complex geometry, and unacceptably large CPU times. A feasible approach to the study of self-assembly consists of mapping the collections of clusters onto stochastic networks (continuous-time Markov chains) and analyzing their dynamics. Vertices of the networks represent local minima of the potential energy of the clusters, while arcs connect only those pairs of vertices that correspond to local minima between which direct transitions are physically possible. Transition rates along the arcs are the transition rates between the corresponding pairs of local minima. Such networks are mathematically tractable and, at the same time, preserve important features of the underlying dynamics. Nevertheless, their huge size and complexity render their analysis challenging and invoke the development of new mathematical techniques. I will discuss some approaches to construction and analysis of such networks.

October 26th
Robert Pego
Professor, Department of Mathematical Sciences, Carnegie Mellon University

Microdroplet instablity for incompressible distance between shapes

AbstractThe least-action problem for geodesic distance on the 'manifold' of fluid-blob shapes exhibits instability due to microdroplet formation.This reflects a striking connection between Arnold's least-action principle for incompressible Euler flows and geodesic paths for Wasserstein distance. A connection with fluid mixture models via a variant of Brenier's relaxed least-action principle for generalized Euler flows will be outlined also. This is joint work with Jian-Guo Liu and Dejan Slepcev.

November 2nd
Dejan Slepcev
Associate Professor, Department of Mathematical Sciences, Carnegie Mellon University

Variational problems on graphs and their continuum limits

We will discuss variational problems arising in machine learning and their limits as the number of data points goes to infinity. Consider point clouds obtained as random samples of an underlying "ground-truth" measure. Graph representing the point cloud is obtained by assigning weights to edges based on the distance between the points. Many machine learning tasks, such as clustering and classification, can be posed as minimizing functionals on such graphs. We consider functionals involving graph cuts and graph laplacians and their limits as the number of data points goes to infinity. In particular we establish under what conditions the minimizers of discrete problems have a well defined continuum limit, and characterize the limit. The talk is primarily based on joint work with Nicolas Garcia Trillos, as well as on works with Xavier Bresson, Moritz Gerlach, Matthias Hein, Thomas Laurent, James von Brecht and Matt Thorpe.

November 9th
Markos Katsoulakis
Professor, Department of Mathematics and Statistics

Scalable Information Inequalities for Uncertainty Quantification in high dimensional probabilistic models

In this this talk we discuss new scalable information bounds for quantities of interest of complex stochastic models. The scalability of inequalities allows us to (a) obtain uncertainty quantification bounds for quantities of interest in high-dimensional systems and/or for long time stochastic dynamics; (b) assess the impact of large model perturbations such as in nonlinear response regimes in statistical mechanics; (c) address model-form uncertainty, i.e. compare different extended probabilistic models and corresponding quantities of interest. We demonstrate these tools in fast sensitivity screening of chemical reaction networks with a very large number of parameters, and towards obtaining robust and tight uncertainty quantification bounds for phase diagrams in statistical mechanics models.

November 30th
Youssef Marzouk
Associate Professor of Aeronautics and Astronautics, Massachusetts Institute of Technology

Measure transport approaches for Bayesian computation

We will discuss how transport maps, i.e., deterministic couplings between probability measures, can enable useful new approaches to Bayesian computation. A first use involves a combination of optimal transport and Metropolis correction; here, we use continuous transportation to transform typical MCMC proposals into adapted non-Gaussian proposals, both local and global. Second, we discuss a variational approach to Bayesian inference that constructs a deterministic transport map from a reference distribution to the posterior, without resorting to MCMC. Independent and unweighted samples can then be obtained by pushing forward reference samples through the map. Making either approach efficient in high dimensions, however, requires identifying and exploiting low-dimensional structure. We present new results relating the sparsity and decomposability of transports to the conditional independence structure of the target distribution. We also describe conditions, common in inverse problems, under which transport maps have a particular low-rank or near-identity structure. In general, these properties of transports can yield more efficient algorithms. As a particular example, we derive new deterministic "online" algorithms for Bayesian inference in nonlinear and non-Gaussian state-space models with static parameters. This is joint work with Daniele Bigoni, Matthew Parno, and Alessio Spantini.

Note date and place: December 2nd - 10am - Gilman 132
Alex Cloninger
Gibbs Assistant Professor and NSF Postdoctoral Fellow, Yale University

Incorporation of Geometry into Learning Algorithms and Medicine
This talk focuses on two instances in which scientific fields outside mathematics benefit from incorporating the geometry of the data. In each instance, the applications area motivates the need for new mathematical approaches and algorithms, and leads to interesting new questions. (1) A method to determine and predict drug treatment effectiveness for patients based off their baseline information. This motivates building a function adapted diffusion operator high dimensional data X when the function F can only be evaluated on large subsets of X, and defining a localized filtration of F and estimation values of F at a finer scale than it is reliable naively. (2) The current empirical success of deep learning in imaging and medical applications, in which theory and understanding is lagging far behind.. By assuming the data lies near low dimensional manifolds and building local wavelet frames, we improve on existing theory that breaks down when the ambient dimension is large (the regime in which deep learning has seen the most success)

December 7th
Ben Adcock
Assistant Professor, Simons Fraser University

Sparse polynomial approximation of high-dimensional functions

Many problems in scientific computing require the approximation of smooth, high-dimensional functions from limited amounts of data. For instance, a common problem in uncertainty quantification involves identifying the parameter dependence of the output of a computational model. Complex physical systems require computational models with many parameters, resulting in multivariate functions of many variables. Although the amount of data may be large, the curse of dimensionality essentially prohibits collecting or processing enough data to reconstruct the unknown function using classical approximation techniques. In this talk, I will give an overview of the approximation of smooth, high-dimensional functions by sparse polynomial expansions. I will focus on the recent application of techniques from compressed sensing to this problem, and demonstrate how such approaches theoretically overcome the curse of dimensionality. If time, I will also discuss a number of extensions, including dealing with corrupted and/or unstructured data, the effect of model error and incorporating additional information such as gradient data. I will also highlight several challenges and open problems. This is joint work with Casie Bao, Simone Brugiapaglia and Yi Sui (SFU).