Jakob Heiss


My current research

My research mainly focuses on understanding the inductive bias of overparametrized deep neural networks, especially in the context of multi-task learning, representation learning, and transfer learning. For the case of L2-regularized parameters, we have proven a theorem that helps to understand the inductive bias of deep, infinitely wide ReLU NNs towards multi-task learning (https://arxiv.org/abs/2112.15577). The proof of this theorem led us to discover a fast, almost lossless compression method for ReLU NNs, which we have not yet tested well enough in practice (https://openreview.net/pdf?id=9GUTgHZgKCH).

Further, we developed a method to estimate the epistemic uncertainty of NN predictions (https://proceedings.mlr.press/v162/heiss22a.html). We twice improved the state of the art on multiple market design (combinatorial auction) benchmarks: first by developing a NN architecture that enforces monotonicity constraints and implementing it in an auction mechanism (https://www.ijcai.org/proceedings/2022/0077.pdf), and second by integrating exploration into our auction mechanism in a Bayesian-optimization fashion using our estimate of epistemic uncertainty (https://doi.org/10.1609/aaai.v37i5.25726). For the latter, our simulations suggest that revenue could be increased by more than 200 million USD for an auction comparable to the Canadian 4G spectrum auction, but there is still a very long way to go before this mechanism could be implemented in such large auctions. Recently, we modified our mechanism to use more practical demand queries instead of value queries (https://www.researchgate.net/publication/373262611_Machine_Learning-powered_Combinatorial_Clock_Auction, accepted to AAAI'24).
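
To make the idea of a monotonicity constraint concrete, here is a minimal toy sketch in pure Python. It is a generic construction (non-negative weights combined with a non-decreasing activation) and is not the architecture from the papers linked above; the class name MonotoneMLP and all parameters are hypothetical and chosen only for illustration. In an auction setting, such a constraint reflects the assumption that receiving additional items never lowers a bidder's value.

```python
import random


def relu(z):
    # ReLU is non-decreasing, which is what the construction needs.
    return z if z > 0.0 else 0.0


class MonotoneMLP:
    """Toy one-hidden-layer network that is non-decreasing in every input.

    Illustrative only: monotonicity is obtained by using the absolute value
    of each raw weight, so all effective weights are non-negative.
    """

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        # Raw parameters are unconstrained; abs() is applied in forward().
        self.w1 = [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
        self.w2 = [rng.gauss(0, 1) for _ in range(n_hidden)]
        self.b2 = rng.gauss(0, 1)

    def forward(self, x):
        # Hidden layer: non-negative weights, arbitrary biases, ReLU.
        hidden = [
            relu(sum(abs(w) * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        # Output layer: non-negative weights again, so the composition
        # stays non-decreasing in every coordinate of x.
        return sum(abs(w) * h for w, h in zip(self.w2, hidden)) + self.b2


net = MonotoneMLP(n_in=3, n_hidden=8)
smaller_bundle = [1.0, 0.0, 0.0]  # indicator vector: only item 0
larger_bundle = [1.0, 1.0, 0.0]   # indicator vector: items 0 and 1
print(net.forward(smaller_bundle) <= net.forward(larger_bundle))
```

Because every effective weight is non-negative and ReLU is non-decreasing, increasing any input coordinate can never decrease the output, whatever the raw parameters are.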

In another project, we extended the theory and methodology of Path-Dependent Neural Jump ODEs to deal with noisy, irregularly observed time series (https://openreview.net/forum?id=0T2OTVCCC1).

Unpublished projects I am working on right now: outlier-robust NNs; deep probabilistic calibration of financial models; a better theoretical understanding of the cold posterior phenomenon of Bayesian neural networks; further improving the theoretical understanding of the inductive bias of various ML methods (e.g. extending https://arxiv.org/abs/1911.02903); extending our techniques for combinatorial auctions.

My opinion on the generalization strengths of deep neural networks

I am very excited about the empirical fact that deep learning methods can generalize surprisingly well to unseen data points. I am extremely curious to understand the inductive bias that allows them to do so, and in particular how this inductive bias is influenced by the choice of architecture and hyper-parameters in the context of multi-task learning, transfer learning, representation learning, and feature learning. For some specific design choices, I derived a theory to understand the inductive bias of NNs. In general, I see the following 4 main strengths in the inductive bias of neural networks:

  1. Deep Learning can strongly benefit from multi-task learning, transfer learning, representation learning, and feature learning.
  2. NNs with standard activation functions (such as ReLU) have an inductive bias towards flat/simple/smooth/non-oscillating functions because of implicit (and explicit) regularization.
  3. Some architectures (such as transformers, CNNs, RNNs, GNNs) have (soft) invariances/symmetries that are helpful for certain domains.
  4. The flexibility of architectures and training algorithms allows for many ways to manipulate the inductive bias by hand-crafting tricks such as specific forms of data augmentation.


My research topics include:

Theory of the inductive bias of infinitely wide deep neural networks (including multi-task learning and transfer learning). Uncertainty and generalization of neural networks. Bayesian optimization with the help of neural networks. Deep learning in market design. Compression of neural networks. Monotonic neural networks. Bayesian neural networks. Outlier-robust neural networks. Irregularly observed time series.