I am an Academy Research Fellow at Aalto University and the Finnish Center for Artificial Intelligence (FCAI), associated with research groups:
Office: room B360, CS-building, Aalto, Konemiehentie 2, 02150 Espoo, Finland
email: markus.o.heinonen@aalto.fi
mobile: +358 44 294 2600
My guidelines for PhD students on doing machine learning research.
48. Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging
47. AbODE: Ab initio antibody design using conjoined ODEs
46. Learning Energy Conserving Dynamics Efficiently with Hamiltonian Gaussian Processes
45. Incorporating functional summary information in Bayesian neural networks using a Dirichlet process likelihood approach
44. Latent Neural ODEs with Sparse Bayesian Multiple Shooting
43. Generative Modelling With Inverse Heat Dissipation
42. Human-in-the-loop assisted de novo molecular design
41. TCRconv: predicting recognition between T cell receptors and epitopes using contextualized motifs
40. Modular Flows: Differential Molecular Generation
39. Tackling covariate shift with node-based Bayesian neural networks
38. Variational multiple shooting for Bayesian ODEs with Gaussian processes
37. Likelihood-Free Inference with Deep Gaussian Processes
36. Modeling binding specificities of transcription factor pairs with random forests
35. Prediction and impact of personalized donation intervals
34. De-randomizing MCMC dynamics with the diffusion Stein operator
33. Bayesian Inference for Optimal Transport with Stochastic Cost
32. Continuous-Time Model-Based Reinforcement Learning
We propose a continuous-time model-based reinforcement learning setting and derive a continuous-time actor-critic algorithm.
31. Predicting recognition between T cell receptors and epitopes with TCRGP
We propose GP classifiers to determine T cell receptor epitope specificity from human and mouse subjects. We employ multiple kernel learning to determine the relevance of sequence positions. The classifier achieves better accuracy than earlier kernel-based and random-forest-based methods.
30. Learning continuous-time PDEs from sparse data with graph neural networks
29. Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations
28. Substrate specificity of 2-Deoxy-D-ribose 5-phosphate aldolase (DERA) assessed by different protein engineering and machine learning methods
Sample-efficient reinforcement learning using deep Gaussian processes
Scalable Bayesian neural networks by layer-wise input augmentation
27. Learning spectrograms with convolutional spectral kernels
We propose convolutional kernel structures with well-defined spectrograms to shed light on kernel learning and deep Gaussian processes.
26. ODE2VAE: Deep generative second order ODEs with Bayesian neural networks
We propose VAEs for dynamical systems where the latent dynamics are a second-order ODE modelled with Bayesian neural networks.
25. Deep Convolutional Gaussian Processes
We propose deep convolutional GPs that repeatedly convolve the image signal with Gaussian processes. We achieve state-of-the-art accuracy on CIFAR-10 and MNIST, with an improvement of over 10 percentage points on CIFAR-10 compared to deep GPs and convolutional GPs.
24. Bayesian Metabolic Flux Analysis reveals intracellular flux couplings
We propose Bayesian flux analysis, where full flux distributions are modeled and sampled, in contrast to point-wise FBA estimates.
23. Deep learning with differential Gaussian process flows
We propose a novel machine learning paradigm based on SDE-GP flows that warp the inputs before a final classification or regression function. We achieve better performance than deep GPs.
22. Harmonizable mixture kernels with variational Fourier features
We propose theoretical foundations for non-stationary kernels through harmonizable covariances, and present a practical harmonizable mixture kernel that admits variational Fourier features. We propose Wigner distributions to visualise and interpret spectral kernels.
21. Learning Stochastic Differential Equations With Gaussian Processes Without Gradient Matching
We learn SDEs by fitting the drift and diffusion, modelled as arbitrary sparse Gaussian processes, to match trajectories. We extend the sensitivity equations to the Euler-Maruyama approximation.
20. Learning unknown ODE models with Gaussian processes
We propose fitting black-box ODE models to arbitrary data. We represent the ODE differentials as Gaussian processes and propose efficient sensitivity equations to optimize the models.
19. Variational zero-inflated Gaussian processes with sparse kernels
We propose a sparse kernel that can learn zero covariances and predict exact zeros. We also derive sparse GPRN latent mixing models and their SVI bounds.
Neural Non-Stationary Spectral Kernel
We propose a neural network parameterisation of the Generalised Spectral Mixture (GSM) kernel. The DNN parameterisation improves the scalability and efficiency of the spectral kernel.
18. Temporal clustering analysis of endothelial cell gene expression following exposure to a conventional radiotherapy dose fraction using Gaussian process clustering
We propose temporally clustering endothelial cell gene expression profiles, which reveals clusters of similarly expressed genes. We propose a new temporal clustering technique over Bayesian expression models.
17. Learning with multiple pairwise kernels for drug bioactivity prediction
16. mGPfusion: Predicting protein stability changes with Gaussian process kernel learning and data fusion
We combine experimental and simulated data to learn a Gaussian process based protein stability predictor. We propose a Bayesian data transformation that calibrates the simulated data against the experimental data. Our method requires fewer experimental measurements thanks to the inclusion of simulated data.
15. Flex ddG: Rosetta Ensemble-Based Estimation of Changes in Protein-Protein Binding Affinity Upon Mutation
14. A Mutually-Dependent Hadamard Kernel for Modelling Latent Variable Couplings
We introduce a new non-stationary kernel between inputs and signals, which allows non-stationary couplings between latent variables. The new kernel is based on the Gibbs kernel and the Generalised Wishart Process.
13. Non-Stationary Spectral Kernels
We introduce non-stationary spectral kernels, which can learn covariances based on input-dependent frequencies (e.g. wavelets). We model the input-dependent frequencies as Gaussian process mixtures, and can learn signals with varying frequencies.
12. Random Fourier Features for operator-valued kernels
We introduce random Fourier features for vector-valued function learning, i.e. RFFs for operator-valued kernels.
11. Non-Stationary Gaussian Process Regression with Hamiltonian Monte Carlo
We model all kernel parameters and the noise as separate Gaussian processes that are smoothly input-dependent. HMC sampling reveals the full posteriors of the parameter functions.
10. Genome wide analysis of protein production load in Trichoderma reesei
Transcriptomics and metabolic analysis of Trichoderma reesei protein production.
9. Detecting time periods of differential gene expression using Gaussian processes: An application to endothelial cells exposed to radiotherapy dose fraction
We propose a two-sample differential testing model on Gaussian processes. We introduce a new two-sample test that is continuous in time and yields differential confidences along time.
Learning nonparametric differential equations with operator-valued kernels and gradient matching
We propose learning ODEs as operator-valued kernel functions.
Time-dependent Gaussian process regression and significance analysis for sparse time-series
We propose non-stationary kernels for Gaussian processes and a new Gaussian process optimization criterion suitable for sparse data. We propose new likelihood ratio tests for significance analysis using GPs.
8. Metabolite Identification through Machine Learning -- Tackling CASMI Challenge using FingerID
Our experiences in the CASMI metabolite identification challenge.
Full waveform forward seismic modeling of geologically complex environment: Comparison of simulated and field seismic data
7. Metabolite identification and fingerprint prediction via machine learning
First application of machine learning to identify metabolites based on MS/MS data. We use a probability product kernel over mass spectral features to learn a mapping between the mass spectrum and binary structural properties of the unknown metabolite. We show that the properties can be used to query the unknown structure from e.g. PubChem.
6. Efficient path kernels for reaction function prediction
We introduce the first feasible path-based graph kernel. The main contribution is to apply a compressed string index to store millions of paths efficiently. We use the path kernel to predict chemical reaction function (EC class) over reaction graphs.
5. Computing atom mappings for biochemical reactions without subgraph isomorphism
We study the problem of mapping the atoms between reactants and products in a chemical reaction. We introduce the first definition of optimality of such mappings through graph edit distance. An A* algorithm is applied to compute the optimal mappings of KEGG reactions. We also introduce atom-level descriptors through a message passing algorithm.
4. Structured output prediction of anti-cancer drug activity
We use MMCRF for structured output prediction of the effectiveness of small molecules against 59 cancer cell lines. Structured prediction clearly outperforms individual SVMs. However, the structure of the outputs seems to have little effect on performance.
3. Multilabel Classification of Drug-like Molecules via Max-margin Conditional Random Fields
2. FiD: a software for ab initio structural identification of product ions from tandem mass spectrometric data
We introduce software for identifying product ions from MS/MS data. The method outperforms rule-based methods on our dataset of amino acids and sugar phosphates.
1. Ab initio prediction of molecular fragments from tandem mass spectrometry data
We present a combinatorial algorithm for searching for plausible fragment structures for product ion peaks, based on a bond energy scoring function. We also introduce a mixed integer linear programming algorithm for choosing an optimal fragmentation tree.