Tübingen AI Center
University, Tübingen, Germany
Research output, citation impact, and the most-cited recent papers from Tübingen AI Center. Aggregated across the NobleBlocks index of 300M+ scholarly works.
Top-cited papers from Tübingen AI Center
Neural radiance fields enable state-of-the-art photorealistic view synthesis. However, existing radiance field representations are either too compute-intensive for real-time rendering or require too much memory to scale to large scenes. We present a Memory-Efficient Radiance Field (MERF) representation that achieves real-time rendering of large-scale scenes in a browser. MERF reduces the memory consumption of prior sparse volumetric radiance fields using a combination of a sparse feature grid and high-resolution 2D feature planes. To support large-scale unbounded scenes, we introduce a novel contraction function that maps scene coordinates into a bounded volume while still allowing for efficient ray-box intersection. We design a lossless procedure for baking the parameterization used during training into a model that achieves real-time rendering while still preserving the photorealistic view synthesis quality of a volumetric radiance field.
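The abstract above says only that the contraction maps scene coordinates into a bounded volume while keeping ray-box intersection efficient; it does not give the formula. The following is a minimal sketch of one contraction of that kind, assuming a per-coordinate, infinity-norm-based piecewise form. The function name `contract` and the target cube [-2, 2]^3 are illustrative assumptions, not details taken from the text.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def contract(x: Vec3) -> Vec3:
    """Map an unbounded 3D point into the cube [-2, 2]^3.

    Assumed piecewise form (not stated in the abstract):
    points with infinity norm <= 1 are left unchanged; outside that
    region the largest-magnitude coordinate is squashed into (1, 2)
    and the remaining coordinates are divided by the infinity norm.
    """
    m = max(abs(c) for c in x)  # infinity norm of the point
    if m <= 1.0:
        return x  # interior region is left untouched
    out = []
    for c in x:
        if abs(c) == m:
            # Dominant coordinate: magnitude 2 - 1/|c| lies in (1, 2), sign kept.
            sign = 1.0 if c > 0 else -1.0
            out.append((2.0 - 1.0 / abs(c)) * sign)
        else:
            # Other coordinates: projected onto the unit cube's surface scale.
            out.append(c / m)
    return (out[0], out[1], out[2])


if __name__ == "__main__":
    print(contract((0.5, -0.2, 0.1)))   # inside the unit cube: unchanged
    print(contract((4.0, 2.0, -1.0)))   # far point: pulled inside [-2, 2]^3
```

Under this assumed form the map is continuous at the unit-cube boundary (a dominant coordinate of magnitude 1 stays at 1), and arbitrarily distant points approach, but never reach, the faces of the outer cube.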
Recently, 3D Gaussian Splatting (3DGS) has demonstrated impressive novel view synthesis results, while allowing the rendering of high-resolution images in real-time. However, leveraging 3D Gaussians for surface reconstruction poses significant challenges due to the explicit and disconnected nature of 3D Gaussians. In this work, we present Gaussian Opacity Fields (GOF), a novel approach for efficient, high-quality, and adaptive surface reconstruction in unbounded scenes. Our GOF is derived from ray-tracing-based volume rendering of 3D Gaussians, enabling direct geometry extraction from 3D Gaussians by identifying its levelset, without resorting to Poisson reconstruction or TSDF fusion as in previous work. We approximate the surface normal of Gaussians as the normal of the ray-Gaussian intersection plane, enabling the application of regularization that significantly enhances geometry. Furthermore, we develop an efficient geometry extraction method utilizing Marching Tetrahedra, where the tetrahedral grids are induced from 3D Gaussians and thus adapt to the scene's complexity. Our evaluations reveal that GOF surpasses existing 3DGS-based methods in surface reconstruction and novel view synthesis. Further, it compares favorably to, or even outperforms, neural implicit methods in both quality and speed.
We can now measure the connectivity of every neuron in a neural circuit [1–9], but we cannot measure other biological details, including the dynamical characteristics of each neuron. The degree to which measurements of connectivity alone can inform the understanding of neural computation is an open question [10]. Here we show that with experimental measurements of only the connectivity of a biological neural network, we can predict the neural activity underlying a specified neural computation. We constructed a model neural network with the experimentally determined connectivity for 64 cell types in the motion pathways of the fruit fly optic lobe [1–5] but with unknown parameters for the single-neuron and single-synapse properties. We then optimized the values of these unknown parameters using techniques from deep learning [11], to allow the model network to detect visual motion [12]. Our mechanistic model makes detailed, experimentally testable predictions for each neuron in the connectome. We found that model predictions agreed with experimental measurements of neural activity across 26 studies. Our work demonstrates a strategy for generating detailed hypotheses about the mechanisms of neural circuit function from connectivity measurements. We show that this strategy is more likely to be successful when neurons are sparsely connected, a universally observed feature of biological neural networks across species and brain regions.
The number of publications in biomedicine and life sciences has grown so much that it is difficult to keep track of new scientific works and to have an overview of the evolution of the field as a whole. Here, we present a two-dimensional (2D) map of the entire corpus of biomedical literature, based on the abstract texts of 21 million English articles from the PubMed database. To embed the abstracts into 2D, we used the large language model PubMedBERT, combined with t-SNE tailored to handle samples of this size. We used our map to study the emergence of the COVID-19 literature, the evolution of the neuroscience discipline, the uptake of machine learning, the distribution of gender imbalance in academic authorship, and the distribution of retracted paper mill articles. Furthermore, we present an interactive website that allows easy exploration and will enable further insights and facilitate future research.
We can now measure the connectivity of every neuron in a neural circuit, but we are still blind to other biological details, including the dynamical characteristics of each neuron. The degree to which connectivity measurements alone can inform understanding of neural computation is an open question. Here we show that with only measurements of the connectivity of a biological neural network, we can predict the neural activity underlying neural computation. We constructed a model neural network with the experimentally determined connectivity for 64 cell types in the motion pathways of the fruit fly optic lobe but with unknown parameters for the single-neuron and single-synapse properties. We then optimized the values of these unknown parameters using techniques from deep learning, to allow the model network to detect visual motion. Our mechanistic model makes detailed, experimentally testable predictions for each neuron in the connectome. We found that model predictions agreed with experimental measurements of neural activity across 24 studies. Our work demonstrates a strategy for generating detailed hypotheses about the mechanisms of neural circuit function from connectivity measurements. We show that this strategy is more likely to be successful when neurons are sparsely connected, a universally observed feature of biological neural networks across species and brain regions.
The body of an animal influences how its nervous system generates behaviour [1]. Accurately modelling the neural control of sensorimotor behaviour requires an anatomically detailed biomechanical representation of the body. Here we introduce a whole-body model of the fruit fly Drosophila melanogaster in a physics simulator [2]. Designed as a general-purpose framework, our model enables the simulation of diverse fly behaviours, including both terrestrial and aerial locomotion. We validate its versatility by replicating realistic walking and flight behaviours. To support these behaviours, we develop phenomenological models for fluid and adhesion forces. Using data-driven, end-to-end reinforcement learning [3,4], we train neural network controllers capable of generating naturalistic locomotion [5–7] along complex trajectories in response to high-level steering commands. Furthermore, we show the use of visual sensors and hierarchical motor control [8], training a high-level controller to reuse a pretrained low-level flight controller to perform visually guided flight tasks. Our model serves as an open-source platform for studying the neural control of sensorimotor behaviour in an embodied context.
Distribution shifts remain a problem for the safe application of regulated medical AI systems, and may impact their real-world performance if undetected. Postmarket shifts can occur, for example, if algorithms developed on data from various acquisition settings and a heterogeneous population are predominantly applied in hospitals with lower quality data acquisition or other centre-specific acquisition factors, or where some ethnicities are over-represented. Therefore, distribution shift detection could be important for monitoring AI-based medical products during postmarket surveillance. We implemented and evaluated three deep-learning-based shift detection techniques (classifier-based, deep kernel, and multiple univariate Kolmogorov-Smirnov tests) on simulated shifts in a dataset of 130’486 retinal images. We trained a deep learning classifier for diabetic retinopathy grading. We then simulated population shifts by changing the prevalence of patients’ sex, ethnicity, and co-morbidities, and example acquisition shifts by changes in image quality. We observed classification subgroup performance disparities w.r.t. image quality, patient sex, ethnicity and co-morbidity presence. The sensitivity at detecting referable diabetic retinopathy ranged from 0.50 to 0.79 for different ethnicities. This motivates the need for detecting shifts after deployment. Classifier-based tests performed best overall, with perfect detection rates for quality and co-morbidity subgroup shifts at a sample size of 1000. It was the only method to detect shifts in patient sex, but required large sample sizes (> 30’000). All methods identified easier-to-detect out-of-distribution shifts with small (≤300) sample sizes.
We conclude that effective tools exist for detecting clinically relevant distribution shifts. In particular, classifier-based tests can be easily implemented as components of the post-market surveillance strategy of medical device manufacturers.
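As an illustration of the "multiple univariate Kolmogorov-Smirnov tests" named above, here is a minimal pure-Python sketch. It assumes that each sample is already a short feature vector (for example, a low-dimensional network representation), that the per-feature tests are combined with a Bonferroni correction, and that the asymptotic Kolmogorov series is an acceptable p-value approximation. None of these choices is stated in the abstract, and the names `ks_statistic`, `ks_pvalue`, and `detect_shift` are hypothetical.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def ks_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for v in sa + sb:  # the gap can only change at observed values
        fa = bisect_right(sa, v) / len(sa)
        fb = bisect_right(sb, v) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue(d: float, n: int, m: int) -> float:
    """Asymptotic p-value from the Kolmogorov series (an approximation)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return 1.0  # the survival function is numerically 1 this close to zero
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def detect_shift(reference: List[List[float]],
                 incoming: List[List[float]],
                 alpha: float = 0.05) -> bool:
    """Flag a shift if any feature's KS test rejects at the Bonferroni level."""
    n_features = len(reference[0])
    threshold = alpha / n_features  # Bonferroni correction (assumed here)
    for j in range(n_features):
        ref_col = [row[j] for row in reference]
        new_col = [row[j] for row in incoming]
        d = ks_statistic(ref_col, new_col)
        if ks_pvalue(d, len(ref_col), len(new_col)) < threshold:
            return True
    return False


if __name__ == "__main__":
    reference = [[i / 100, (i * 7 % 100) / 100] for i in range(100)]
    shifted = [[row[0] + 0.5, row[1]] for row in reference]
    print(detect_shift(reference, shifted))
    print(detect_shift(reference, reference))
```

A test of this kind looks at one feature at a time, so it can miss shifts that only appear in the joint distribution; the classifier-based and deep kernel tests mentioned in the abstract are different techniques and are not sketched here.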
We present Dictionary Fields, a novel neural representation which decomposes a signal into a product of factors, each represented by a classical or neural field representation, operating on transformed input coordinates. More specifically, we factorize a signal into a coefficient field and a basis field, and exploit periodic coordinate transformations to apply the same basis functions across multiple locations and scales. Our experiments show that Dictionary Fields lead to improvements in approximation quality, compactness, and training time when compared to previous fast reconstruction methods. Experimentally, our representation achieves better image approximation quality on 2D image regression tasks, higher geometric quality when reconstructing 3D signed distance fields, and higher compactness for radiance field reconstruction tasks. Furthermore, Dictionary Fields enable generalization to unseen images/3D scenes by sharing bases across signals during training which greatly benefits use cases such as image regression from partial observations and few-shot radiance field reconstruction.
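The factorization described above can be illustrated in one dimension. This is a minimal sketch, assuming a single-channel product of a coarse coefficient field and a small basis field, both stored as linearly interpolated grids, with a sawtooth function as the periodic coordinate transformation. The grid sizes, the sawtooth choice, and all function names are assumptions made for illustration and are not taken from the abstract.

```python
from typing import Sequence


def lerp_grid(values: Sequence[float], u: float) -> float:
    """Linearly interpolate a 1D grid at normalized coordinate u in [0, 1]."""
    pos = u * (len(values) - 1)
    i = min(int(pos), len(values) - 2)  # clamp so that i + 1 stays in range
    t = pos - i
    return (1.0 - t) * values[i] + t * values[i + 1]


def sawtooth(x: float, freq: float) -> float:
    """Periodic coordinate transform: repeats the [0, 1) range freq times."""
    return (x * freq) % 1.0


def dictionary_field(x: float,
                     coeff_grid: Sequence[float],
                     basis_grid: Sequence[float],
                     freq: float) -> float:
    """Signal value = coefficient field at x times basis field at sawtooth(x)."""
    coeff = lerp_grid(coeff_grid, x)                  # varies over the whole domain
    basis = lerp_grid(basis_grid, sawtooth(x, freq))  # reused in every period
    return coeff * basis


if __name__ == "__main__":
    coeff_grid = [1.0, 3.0]   # hypothetical coarse coefficients
    basis_grid = [0.0, 1.0]   # hypothetical shared basis function
    for x in (0.125, 0.375):
        print(x, dictionary_field(x, coeff_grid, basis_grid, freq=4.0))
```

With `freq=4.0`, the points 0.125 and 0.375 land on the same basis location (0.5), so the same basis value is reused and only the coefficient differs between them. That reuse of one basis function across locations is the property the abstract attributes to periodic coordinate transformations.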
Andrea Santilli, Silvio Severino, Emilian Postolache, Valentino Maiorca, Michele Mancusi, Riccardo Marin, Emanuele Rodola. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2023.
The number of publications in biomedicine and life sciences has rapidly grown over the last decades, with over 1.5 million papers now being published every year. This makes it difficult to keep track of new scientific works and to have an overview of the evolution of the field as a whole. Here we present a 2D map of the entire corpus of biomedical literature, and argue that it provides a unique and useful overview of the life sciences research. We based our atlas on the abstract texts of 21 million English articles from the PubMed database. To embed the abstracts into 2D, we used the large language model PubMedBERT, combined with t-SNE tailored to handle samples of our size. We used our atlas to study the emergence of the Covid-19 literature, the evolution of the neuroscience discipline, the uptake of machine learning, the distribution of gender imbalance in academic authorship, and the distribution of retracted paper mill articles. Furthermore, we present an interactive web version of our atlas that allows easy exploration and will enable further insights and facilitate future research.
Many studies claim that visual regularities can be learned unconsciously and without explicit awareness. For example, in the contextual cueing paradigm, studies often make claims using a standard reasoning based on two results: (1) a reliable response time (RT) difference between repeated vs. new stimulus displays and (2) a close-to-chance sensitivity when participants are asked to explicitly recognize repeated stimulus displays. From this pattern of results, studies routinely conclude that the sensitivity of RT responses is higher than that of explicit responses, an empirical situation we call Indirect Task Advantage (ITA). Many studies further infer from an ITA that RT effects were driven by a form of recognition that exceeds explicit memory: implicit recognition. However, this reasoning is flawed because the sensitivity underlying RT effects is never computed. To properly establish a difference, a sensitivity comparison is required. We apply this sensitivity comparison in a reanalysis of 20 contextual cueing studies showing that not a single study provides consistent evidence for ITAs. Responding to recent correlation-based arguments, we also demonstrate the absence of evidence for ITAs at the level of individual participants. This lack of ITAs has serious consequences for the field: If RT effects can be fully explained by weak but above-chance explicit recognition sensitivity, what is the empirical content of the label "implicit"? Thus, theoretical discussions in this paradigm, and likely in other paradigms using this standard reasoning, require serious reassessment because the current data from contextual cueing studies is insufficient to consider recognition as implicit.
After a wave of breakthroughs in image-based medical diagnostics and risk prediction models, machine learning (ML) has turned into a normal science. However, prominent researchers are claiming that another paradigm shift in medical ML is imminent, owing to the most recent staggering successes of large language models: a shift from single-purpose applications toward generalist models driven by natural language. This article investigates the implications of this paradigm shift for the ethical debate. Focusing on issues like trust, transparency, threats to patient autonomy, responsibility issues in the collaboration of clinicians and ML models, fairness, and privacy, it will be argued that the main problems will be continuous with the current debate. However, due to the functioning of large language models, the complexity of all these problems increases. In addition, the article discusses some profound challenges for the clinical evaluation of large language models and threats to the reproducibility and replicability of studies about large language models in medicine due to corporate interests.
Scientists and engineers use simulators to model empirically observed phenomena. However, tuning the parameters of a simulator to ensure its outputs match observed data presents a significant challenge. Simulation-based inference (SBI) addresses this by enabling Bayesian inference for simulators, identifying parameters that match observed data and align with prior knowledge. Unlike traditional Bayesian inference, SBI only needs access to simulations from the model and does not require evaluations of the likelihood function. In addition, SBI algorithms do not require gradients through the simulator, allow for massive parallelization of simulations, and can perform inference for different observations without further simulations or training, thereby amortizing inference. Over the past years, we have developed, maintained, and extended sbi, a PyTorch-based package that implements Bayesian SBI algorithms based on neural networks. The sbi toolkit implements a wide range of inference methods, neural network architectures, sampling methods, and diagnostic tools. In addition, it provides well-tested default settings but also offers flexibility to fully customize every step of the simulation-based inference workflow. Taken together, the sbi toolkit enables scientists and engineers to apply state-of-the-art SBI methods to black-box simulators, opening up new possibilities for aligning simulations with empirically observed data.
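To make the likelihood-free idea concrete, the sketch below performs inference using only forward simulations. It does not use the sbi package or its neural-network-based algorithms. It uses plain rejection sampling (approximate Bayesian computation), a much simpler scheme that is not amortized. The toy simulator, the uniform prior on [-5, 5], the tolerance, and all names are assumptions made for illustration.

```python
import random
from typing import List


def simulator(theta: float, rng: random.Random, n: int = 20) -> float:
    """Toy black-box simulator: the mean of n noisy draws centred on theta."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def rejection_abc(x_obs: float,
                  n_draws: int,
                  tol: float,
                  rng: random.Random) -> List[float]:
    """Keep prior draws whose simulated output lands within tol of x_obs.

    No likelihood evaluations and no gradients through the simulator
    are needed; only calls to simulator() are made.
    """
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)          # draw from the (assumed) prior
        if abs(simulator(theta, rng) - x_obs) < tol:
            accepted.append(theta)              # approximate posterior sample
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    samples = rejection_abc(x_obs=2.0, n_draws=5000, tol=0.1, rng=rng)
    print(len(samples), sum(samples) / len(samples))
```

Every new observation requires rerunning the whole loop here, which is the cost that the amortized, neural-network-based methods described in the abstract are meant to avoid.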
The temporal order of a sequence of events has been thought to be reflected in the ordered firing of neurons at different phases of theta oscillations. Here we assess this by measuring single neuron activity (1,420 neurons) and local field potentials (921 channels) in the medial temporal lobe of 16 patients with epilepsy performing a working-memory task for temporal order. During memory maintenance, we observe theta oscillations, preferential firing of single neurons to theta phase and a close relationship between phase of firing and item position. However, the firing order did not match item order. Training recurrent neural networks to perform an analogous task, we also show the generation of theta oscillations, theta phase-dependent firing related to item position and, again, no match between firing and item order. Rather, our results suggest a mechanistic link between phase order, stimulus timing and oscillation frequency. In both biological and artificial neural networks, we provide evidence supporting the role of phase of firing in working-memory processing.
Artificial neural networks (ANNs) have proven to be a useful tool for complex questions that involve large amounts of data. Our use case of predicting soil maps with ANNs is in high demand by government agencies, construction companies, or farmers, given cost and time intensive field work. However, there are two main challenges when applying ANNs. In their most common form, deep learning algorithms do not provide interpretable predictive uncertainty. This means that properties of an ANN such as the certainty and plausibility of the predicted variables, rely on the interpretation by experts rather than being quantified by evaluation metrics validating the ANNs. Further, these algorithms have shown a high confidence in their predictions in areas geographically distant from the training area or areas sparsely covered by training data. To tackle these challenges, we use the Bayesian deep learning approach "last-layer Laplace approximation", which is specifically designed to quantify uncertainty into deep networks, in our explorative study on soil classification. It corrects the overconfident areas without reducing the accuracy of the predictions, giving us a more realistic uncertainty expression of the model's prediction. In our study area in southern Germany, we subdivide the soils into soil regions and as a test case we explicitly exclude two soil regions in the training area but include these regions in the prediction. Our results emphasize the need for uncertainty measurement to obtain more reliable and interpretable results of ANNs, especially for regions far away from the training area. Moreover, the knowledge gained from this research addresses the problem of overconfidence of ANNs and provides valuable information on the predictability of soil types and the identification of knowledge gaps. 
By analyzing regions where the model has limited data support and, consequently, high uncertainty, stakeholders can recognize the areas that require more data collection efforts.
Purpose: The purpose of this study was to provide a comparison of performance and explainability of a multitask convolutional deep neuronal network to single-task networks for activity detection in neovascular age-related macular degeneration (nAMD). Methods: From 70 patients (46 women and 24 men) who attended the University Eye Hospital Tübingen, 3762 optical coherence tomography B-scans (right eye = 2011 and left eye = 1751) were acquired with Heidelberg Spectralis, Heidelberg, Germany. B-scans were graded by a retina specialist and an ophthalmology resident, and then used to develop a multitask deep learning model to predict disease activity in neovascular age-related macular degeneration along with the presence of sub- and intraretinal fluid. We used performance metrics for comparison to single-task networks and visualized the deep neural network (DNN)-based decision with t-distributed stochastic neighbor embedding and clinically validated saliency mapping techniques. Results: The multitask model surpassed single-task networks in accuracy for activity detection (94.2% vs. 91.2%). The area under the curve of the receiver operating curve was 0.984 for the multitask model versus 0.974 for the single-task model. Furthermore, compared to single-task networks, visualizations via t-distributed stochastic neighbor embedding and saliency maps highlighted that multitask networks' decisions for activity detection in neovascular age-related macular degeneration were highly consistent with the presence of both sub- and intraretinal fluid. Conclusions: Multitask learning increases the performance of neuronal networks for predicting disease activity, while providing clinicians with an easily accessible decision control, which resembles human reasoning. 
Translational Relevance: By improving nAMD activity detection performance and transparency of automated decisions, multitask DNNs can support the translation of machine learning research into clinical decision support systems for nAMD activity detection.
This study aimed to automatically detect epiretinal membranes (ERM) in various OCT-scans of the central and paracentral macula region and classify them by size using deep-neural-networks (DNNs). To this end, 11,061 OCT-images were included and graded according to the presence of an ERM and its size (small 100-1000 µm, large > 1000 µm). The data set was divided into training, validation and test sets (75%, 10%, 15% of the data, respectively). An ensemble of DNNs was trained and saliency maps were generated using Guided Backprop. OCT-scans were also transformed into a one-dimensional value using t-SNE analysis. The DNNs' receiver-operating-characteristics on the test set showed a high performance for no-ERM, small-ERM and large-ERM cases (AUC: 0.99, 0.92, 0.99, respectively; 3-way accuracy: 89%), with small-ERMs being the most difficult ones to detect. t-SNE analysis sorted cases by size and, in particular, revealed increased classification uncertainty at the transitions between groups. Saliency maps reliably highlighted ERM, regardless of the presence of other OCT features (i.e. retinal-thickening, intraretinal pseudo-cysts, epiretinal-proliferation) and entities such as ERM-retinoschisis, macular-pseudohole and lamellar-macular-hole. This study therefore showed that DNNs can reliably detect and grade ERMs according to their size not only in the fovea but also in the paracentral region. This is also achieved in cases of hard-to-detect, small-ERMs. In addition, the generated saliency maps can be used to highlight small-ERMs that might otherwise be missed. The proposed model could be used for screening-programs or decision-support-systems in the future.
Neurons in the neocortex exhibit astonishing morphological diversity, which is critical for properly wiring neural circuits and giving neurons their functional properties. However, the organizational principles underlying this morphological diversity remain an open question. Here, we took a data-driven approach using graph-based machine learning methods to obtain a low-dimensional morphological "bar code" describing more than 30,000 excitatory neurons in mouse visual areas V1, AL, and RL that were reconstructed from the millimeter scale MICrONS serial-section electron microscopy volume. Contrary to previous classifications into discrete morphological types (m-types), our data-driven approach suggests that the morphological landscape of cortical excitatory neurons is better described as a continuum, with a few notable exceptions in layers 5 and 6. Dendritic morphologies in layers 2-3 exhibited a trend towards a decreasing width of the dendritic arbor and a smaller tuft with increasing cortical depth. Inter-area differences were most evident in layer 4, where V1 contained more atufted neurons than higher visual areas. Moreover, we discovered neurons in V1 on the border to layer 5, which avoided deeper layers with their dendrites. In summary, we suggest that excitatory neurons' morphological diversity is better understood by considering axes of variation than using distinct m-types.
Neural cell types have classically been characterized by their anatomy and electrophysiology. More recently, single-cell transcriptomics has enabled an increasingly fine genetically defined taxonomy of cortical cell types, but the link between the gene expression of individual cell types and their physiological and anatomical properties remains poorly understood. Here, we develop a hybrid modeling approach to bridge this gap. Our approach combines statistical and mechanistic models to predict cells' electrophysiological activity from their gene expression pattern. To this end, we fit biophysical Hodgkin-Huxley-based models for a wide variety of cortical cell types using simulation-based inference, while overcoming the challenge posed by the mismatch between the mathematical model and the data. Using multimodal Patch-seq data, we link the estimated model parameters to gene expression using an interpretable sparse linear regression model. Our approach recovers specific ion channel gene expressions as predictive of biophysical model parameters including ion channel densities, directly implicating their mechanistic role in determining neural firing.
Biophysical neuron models provide insights into cellular mechanisms underlying neural computations. However, a central challenge has been the question of how to identify the parameters of detailed biophysical models such that they match physiological measurements at scale or such that they perform computational tasks. Here, we describe Jaxley, a framework for simulation of detailed biophysical models in neuroscience, which addresses this challenge. By making use of automatic differentiation and GPU acceleration, Jaxley opens up the possibility to efficiently optimize large-scale biophysical models with gradient descent. We show that Jaxley can learn parameters of biophysical neuron models with several hundreds of parameters to match voltage or two-photon calcium recordings, sometimes orders of magnitude more efficiently than previous methods. We then demonstrate that Jaxley makes it possible to train biophysical neuron models to perform computational tasks. We train a recurrent neural network to perform working memory tasks, and a feedforward network of morphologically detailed neurons with 100,000 parameters to solve a computer vision task. Our analyses show that Jaxley dramatically improves the ability to build large-scale data- or task-constrained biophysical models, creating unprecedented opportunities for investigating the mechanisms underlying neural computations across multiple scales.