Inverse Problems, estimation and calibration

Inverse problems are of utmost importance in science and technology: they appear in the form of maximum likelihood or calibration problems and are, in their most fundamental version, the question of how to calculate the inverse of a map $$ \theta \mapsto F(\theta) = x \, . $$ Often the range space is high-dimensional and no actual inverse exists, but even if it exists it is usually hard to calculate.

Let $ \mathbf{l} $ be a loss function on the range space; then a relaxed mathematical formulation of calculating the inverse is to calculate $$ \theta^*(x) \in \operatorname{argmin}_{\theta} \, \mathbf{l}\big(F(\theta)-x\big) $$ for any given $ x $ in the range space. This is now a minimization problem which can be analyzed by classical methods from analysis. Due to non-convexity of the optimization problem and due to non-existence of the inverse map in a continuous form, the map $ x \mapsto \theta^*(x) $ often does not depend in a continuous way on $ x $. There are several ways to deal with this problem, but in its most general form we perturb the actual optimization problem and solve $$ \theta_\lambda^*(x) \in \operatorname{argmin}_{\theta} \Big( \mathbf{l}\big(F(\theta)-x\big)+\lambda h(\theta) \Big) $$ instead, where $ h $ is a real-valued convex function in $ \theta $. Often this problem is considerably more regular and is called a regularized inverse problem. Of course the limit $ \lambda \to 0 $ is important from the original point of view.
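
As a minimal numerical sketch of the regularized problem (all concrete choices here, the toy forward map `F`, the squared loss and the Tikhonov penalty $ h(\theta) = \|\theta\|^2 $, are assumptions made purely for illustration), plain gradient descent already works in low dimensions:

```python
import numpy as np

# Toy forward map F: R^2 -> R^3 (an assumption made only for this sketch).
A = np.array([[1.0, 2.0], [0.5, -1.0], [2.0, 0.3]])

def F(theta):
    return A @ theta + 0.1 * np.sin(theta).sum()

def objective(theta, x, lam):
    # squared loss l(y) = ||y||^2 plus Tikhonov penalty h(theta) = ||theta||^2
    return np.sum((F(theta) - x) ** 2) + lam * np.sum(theta ** 2)

def num_grad(theta, x, lam, eps=1e-6):
    # central finite-difference gradient; sufficient for a two-dimensional toy example
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (objective(theta + e, x, lam) - objective(theta - e, x, lam)) / (2 * eps)
    return g

def solve_regularized(x, lam, steps=5000, lr=1e-2):
    # gradient descent on the regularized objective, started at theta = 0
    theta = np.zeros(2)
    for _ in range(steps):
        theta -= lr * num_grad(theta, x, lam)
    return theta

x = np.array([1.0, 0.5, -0.2])
print(solve_regularized(x, lam=1e-2))   # theta_lambda^*(x)
```

Decreasing `lam` towards zero recovers the original (possibly ill-posed) problem, which is exactly where the lack of continuity in $ x \mapsto \theta^*(x) $ would show up.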

An alternative viewpoint comes from Bayesian statistics. Let $ \pi_0 $ denote a prior distribution on the parameter space $ \Theta $ from which $ \theta \in \Theta $ is chosen, and consider loss functions such that the Bayesian formula for the posterior distribution $$ \pi_1(d \theta \, | \, x) = \frac{\exp \big(-\mathbf{l}(F(\theta) - x ) \big)\pi_0(d \theta)}{\int_{\Theta} \exp \big( -\mathbf{l}(F(\theta) - x ) \big)\pi_0(d \theta)} $$ makes sense. Under mild continuity conditions on $ \mathbf{l} $ the solution $ \pi_1(\cdot \, | \, x) $ depends continuously on the 'data' $ x $. Of course, in order to evaluate $ \pi_1 $ one has to sample from the posterior distribution, which can be very difficult and actually leads back towards the first approach.
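
For the Bayesian viewpoint, a random-walk Metropolis sampler is the simplest way to draw from $ \pi_1(\cdot \, | \, x) $; the sketch below reuses the toy forward map `F`, the squared loss and the data `x` from the previous sketch and assumes a centred Gaussian prior $ \pi_0 $ (again, illustrative choices rather than prescriptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_posterior(theta, x, prior_std=1.0):
    # unnormalized log pi_1(theta | x): squared loss as above plus an
    # (assumed) centred Gaussian prior pi_0 with standard deviation prior_std
    return -np.sum((F(theta) - x) ** 2) - 0.5 * np.sum(theta ** 2) / prior_std ** 2

def metropolis(x, dim=2, n_samples=5000, step=0.1):
    # random-walk Metropolis: propose a Gaussian perturbation, accept or reject
    theta = np.zeros(dim)
    lp = log_posterior(theta, x)
    samples = []
    for _ in range(n_samples):
        proposal = theta + step * rng.standard_normal(dim)
        lp_prop = log_posterior(proposal, x)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        samples.append(theta)
    return np.array(samples)

samples = metropolis(x)        # x and F as defined in the previous sketch
print(samples.mean(axis=0))    # posterior mean as a point estimate
```

Each step of the sampler requires an evaluation of $ F $, which is precisely why sampling becomes difficult when the forward map is expensive.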

Let now $ \Theta $ be a pool of models, each encoded by an element $ \theta \in \Theta $, and let $ F $ be the map to model prices. Calibration in mathematical finance is then the identification of the $ \theta \in \Theta $ which explains the market data $ x $ best. Three approaches can be considered:

  1. Try to learn the map $ x \mapsto \theta^*(x) $ from many offline calculations which serve as a training data set.
  2. Try to learn the map $ \theta \mapsto F(\theta) $ by a neural network and invert the network (see the sketch after this list).
  3. If $ \theta $ is an infinite dimensional quantity of function type itself (e.g. leverage functions or local volatilities), then parametrize $ \theta $ as a neural network and try to learn the network weights which approximate $ \theta^*(x) $ for a given data point $ x $.
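
To illustrate the second approach, the following sketch (PyTorch, with an arbitrary toy forward map and network architecture standing in for the actual pricing map) first learns a surrogate for $ \theta \mapsto F(\theta) $ from offline samples and then inverts the trained network by gradient descent over its input:

```python
import torch

torch.manual_seed(0)

def F_true(theta):
    # stand-in for the expensive model-price map theta -> F(theta) (an assumption)
    A = torch.tensor([[1.0, 2.0], [0.5, -1.0], [2.0, 0.3]])
    return theta @ A.T + 0.1 * torch.sin(theta).sum(dim=-1, keepdim=True)

# Step 1: learn a surrogate network for theta -> F(theta) from offline samples.
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 3),
)
thetas = torch.randn(10_000, 2)          # offline parameter draws (chosen prior)
prices = F_true(thetas)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    torch.mean((net(thetas) - prices) ** 2).backward()
    opt.step()

# Step 2: invert the trained network for a given data point x by
# gradient descent over its input theta.
x = torch.tensor([1.0, 0.5, -0.2])
theta = torch.zeros(2, requires_grad=True)
opt_theta = torch.optim.Adam([theta], lr=1e-2)
for _ in range(1000):
    opt_theta.zero_grad()
    torch.sum((net(theta) - x) ** 2).backward()
    opt_theta.step()
print(theta.detach())                    # approximate theta^*(x)
```

The expensive map $ F $ is only evaluated offline in step 1; the calibration in step 2 runs entirely on the cheap surrogate, which is the main practical appeal of this approach.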