List of publications and preprints

My publications are listed here (Google Scholar)

Inductive Bias of Neural Networks and Selected Applications
J Heiss. PhD Thesis, ETH Zürich, 2024.
This thesis summarized our results during my PhD at ETH Zürich supervised by Josef Teichmann. The main focus is on discussing the difference between "deep inductive biases" and "shallow inductive biases". While most deep learning methods have "deep inductive biases", certain deep learning methods with specific hyperparameters only have "shallow inductive biases". We provide mathematical theory and intuition for these and other phenomena.

How Infinitely Wide Neural Networks Can Benefit from Multi-task Learning – an Exact Macroscopic Characterization
J Heiss, J Teichmann, H Wutte. arxiv 2022.
We provide an exact quantitative characterization of the inductive bias of infinitely wide L2-regularized ReLU-neural networks (NNs) in function space. This provides insights into the multi-task learning ability due to representation learning, while many other infinite width limits in the literature (such as NTK) only study settings where no benefit from multi-task learning is possible.

Reducing the number of neurons of deep ReLU networks based on the current theory of Regularization
J Heiss, A Stockinger, J Teichmann. OpenReview 2020.
Work in progress: The theory in the paper above shows that wide L2-regularized neural networks exhibit sparsity in function space. Our algorithm utilizes this sparsity in function space to compress the neural network to a smaller number of neurons after training by applying specific transformations on the weight matrices. Further experimental evaluation is still required.

Extending Path-Dependent NJ-ODEs to Noisy Observations and a Dependent Observation Framework
W. Andersson, J. Heiss, F. Krach, J Teichmann. TMLR 2024.
The Path-Dependent Neural Jump ODE (PD-NJ-ODE) is a model to learn optimal forecasts given irregularly sampled time series of incomplete past observations. So far the process itself and the coordinate-wise observation times were assumed to be independent and observations were assumed to be noiseless. This work discusses two extensions to lift these restrictions and provides theoretical guarantees, as well as empirical examples for them. (Intuitive video-summary: https://www.youtube.com/watch?v=PSglx3a3bBI)

Nonparametric Filtering, Estimation and Classification using Neural Jump ODEs
J Heiss, F Krach, T Schmidt, F Tambe-Ndonfack. arxiv 2024
This work extends PD-NJ-ODEs to input-output systems, enabling direct applications in online filtering and classification. We establish theoretical convergence guarantees for this approach, providing a robust solution to L2-optimal filtering. Empirical experiments highlight the model's superior performance over classical parametric methods, particularly in scenarios with complex underlying distributions.

NOMU: Neural Optimization-based Model Uncertainty
J Heiss, J Weissteiner, H Wutte, S Seuken, J Teichmann. ICML 2022.
We study methods for estimating model uncertainty for neural networks (NNs) in regression. We introduce five important desiderata regarding model uncertainty that any method should satisfy. However, we find that established benchmarks often fail to reliably capture some of these desiderata. We introduce a new approach for capturing model uncertainty for NNs, which we call NOMU.

How Implicit Regularization of ReLU Neural Networks Characterizes the Learned Function – Part I: the 1-D Case of Two Layers with Random First Layer
J Heiss, J Teichmann, H Wutte. arxiv 2019.
We consider one-dimensional (shallow) ReLU neural networks in which weights are chosen randomly and only the terminal layer is trained. We rigorously show that early stopping or L2 regularization on parameter space both correspond to regularizing the second derivative in function space (similar to smoothing splines) as the number of hidden nodes tends to infinity

How (Implicit) Regularization of ReLU Neural Networks Characterizes the Learned Function – Part II: the Multi-D Case of Two Layers with Random First Layer
J Heiss, J Teichmann, H Wutte. arxiv 2023.
We extend the results of part I to the multi-dimensional case. We show that (shallow) ReLU neural networks in which weights are chosen randomly and only the terminal layer is trained correspond in function space to a generalized additive model (GAM)-typed regression in which infinitely many directions are considered: the infinite generalized additive model (IGAM).

Monotone-Value Neural Networks: Exploiting Preference Monotonicity in Combinatorial Assignment
J Weissteiner, J Heiss, J Siems, S Seuken. IJCAI 2022.
We outperform the previous state-of-the-art in machine learning-based combinatorial assignment problems by introducing monotone-value neural networks (MVNNs). MVNNs are capturing prior knowledge on combinatorial valuations by enforcing monotonicity and normality, while solving the corresponding winner determination problem is still practically feasible via our MILP formulation.

Bayesian Optimization-based Combinatorial Assignment
J Weissteiner, J Heiss, J Siems, S Seuken. AAAI 2023.
We further improve the performance of machine learning-based combinatorial assignment problems by combining MVNNs and NOMU. We use the uncertainty obtained by an adapted version of NOMU to promote exploration by using the upper confidence bounds as acquisition function. (Short and simple video-summary: https://youtu.be/6YH9K6LDHPY)

Machine Learning-powered Combinatorial Clock Auction
E Soumalias, J Weissteiner, J Heiss, S Seuken. AAAI 2024.
For real-world combinatorial auctions (CAs) the combinatorial clock auction (CCA) is the most popular method in practice. While we have previously introduced ML-powered CAs that ask value queries (i.e., "What is your value for the bundle {A,B}?"), the CCA asks demand queries (i.e., "At prices p, what is your most preferred bundle of items?"). We introduce a machine learning-powered CCA that only asks demand queries and still outperforms the classical CCA significantly, while not changing the interaction-paradigm compared to the classical CCA.

Prices, Bids, Values: One ML-Powered Combinatorial Auction to Rule Them All
E Soumalias, J Heiss, J Weissteiner, S Seuken. axiv 2024.
We massively improve the efficiency of combinatorial auctions by smartly combining ML-powered demand queries and value queries.