The Chair of Epidemiology, led by Prof. Dr. Stefanie Klug, has received more than €380,000 in funding from the German Research Foundation (DFG). The funding supports the project “Machine Learning (ML) Methods in the Application of Logistic Regression in Statistical Data Analysis.” The project focuses on developing modern machine learning methods for matched case-control studies, a field that has so far relied mostly on classical regression techniques.
“In the past, we used conditional logistic regression to analyze matched case-control studies. This method is standard, but quite simple, like all basic regression models,” explains Dr. Gunther Schauberger, research associate at the Chair of Epidemiology and initiator of the project. “Linear models always assume that the effects of variables are linear and additive. However, interactions between variables are hardly captured this way.” Schauberger sees great potential here for machine learning methods, which can automatically recognize more complex patterns. Initial approaches such as decision trees and the random forest algorithm have already been tested. Now the methodological “toolbox” is to be expanded further.
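To illustrate what “linear and additive” means in practice, here is a minimal sketch that is not taken from the project itself. The function names and coefficient values are hypothetical and chosen purely for illustration. In an additive linear predictor, the effect of one variable is the same regardless of the value of the other; once an interaction term is added, that effect depends on the second variable.

```python
# Illustrative sketch only: coefficients b1, b2, b12 are made-up values.

def additive_score(x1, x2, b1=0.5, b2=0.25):
    """Linear predictor (log-odds scale) with purely additive effects."""
    return b1 * x1 + b2 * x2


def interaction_score(x1, x2, b1=0.5, b2=0.25, b12=1.0):
    """Linear predictor with an additional x1 * x2 interaction term."""
    return b1 * x1 + b2 * x2 + b12 * x1 * x2


def effect_of_x1(score, x2):
    """Change in the score when x1 goes from 0 to 1, holding x2 fixed."""
    return score(1, x2) - score(0, x2)


if __name__ == "__main__":
    for x2 in (0, 1):
        print("x2 =", x2,
              "| additive effect of x1:", effect_of_x1(additive_score, x2),
              "| effect with interaction:", effect_of_x1(interaction_score, x2))
```

In the additive version the effect of `x1` is identical for both values of `x2`, whereas in the second version it changes with `x2`. A standard regression model captures this only if the analyst specifies the interaction term by hand, which is the gap that tree-based methods are meant to close.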
“The use of machine learning methods in epidemiology has increased significantly in recent years. They have proven especially useful as prediction tools,” states Prof. Dr. Klug. “With this project, we aim to help make these methods applicable to more complex types of data from matched case-control studies.”
According to Schauberger, it is particularly exciting that the developed methods can be applied not only to matched case-control studies but also to so-called discrete choice data. Such data arise in situations where individuals choose between several alternatives, such as a mode of transportation or a medical treatment. “At first, this seems quite different, but mathematically the structures are very similar. Conditional logistic regression is also suitable here, with the same methodological limitations,” explains Schauberger. The goal of the project is therefore to develop a methodological framework that works for both data types and goes beyond the capabilities of classical methods.
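The structural similarity mentioned above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's method: it uses the textbook conditional logit form, in which the probability of the case (or the chosen alternative) is a softmax of linear scores within one group. The function names, covariates, and coefficients below are hypothetical.

```python
import math

# Illustrative sketch only: data and coefficients are made up.

def conditional_logit_probs(rows, beta):
    """Within one group, return the probability that each row is the
    case (matched set) or the chosen alternative (choice set)."""
    scores = [sum(b * x for b, x in zip(beta, row)) for row in rows]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def group_log_likelihood(rows, selected, beta):
    """Log-likelihood contribution of one group, given the index of the
    case or chosen alternative."""
    return math.log(conditional_logit_probs(rows, beta)[selected])


if __name__ == "__main__":
    beta = [0.8, -0.3]  # hypothetical coefficients

    # Matched set: one case (row 0) and two matched controls.
    matched_set = [[1.0, 0.5], [0.0, 0.2], [0.0, 0.9]]
    print(group_log_likelihood(matched_set, 0, beta))

    # Choice set: three alternatives, of which row 2 was chosen.
    choice_set = [[0.3, 1.0], [0.6, 0.4], [1.0, 0.1]]
    print(group_log_likelihood(choice_set, 2, beta))
```

The same function handles both inputs: only the interpretation of a "group" changes. Because the probabilities are normalized within each group, anything that is constant across the group, such as a stratum-specific intercept, cancels out of the likelihood.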
One concrete application of the project is a study on the early detection of cervical cancer that examines regular participation in screening programs. “The regularity of screening visits is our exposure variable here. But there are many other influencing factors, such as socioeconomic background, nutrition, BMI, or physical activity. Various interactions may exist between these, and this is exactly where machine learning comes into play.”
The DFG funding runs for three years. Most of the funds will be used for personnel costs: starting in October, a doctoral student will be hired to work full-time on the methodological development. In the long term, the project aims to do more than advance epidemiological research: Schauberger’s goal is to expand the toolbox for health scientists and medical professionals. “As a statistician, I want to improve the methods others use to achieve new results,” he emphasizes. “With better tools, the results become more reliable and closer to reality.” This is especially important wherever complex relationships play a role: “The world is complex; it rarely follows a simple linear model.”
The fact that these methods can also be applied beyond health data is an additional incentive for Schauberger: “For me personally, it is methodologically exciting that we can transfer the models to completely different types of data—such as political decision-making processes. With very similar methods, entirely different scientific questions can be answered. I will collaborate with a political scientist from Spain on this, because voting decisions are also discrete choice data.”
Contact:
Prof. Dr. Stefanie J. Klug, MPH
Full Professor (Ordinaria)
Chair of Epidemiology
TUM Campus im Olympiapark
Am Olympiacampus 11
80809 München
Phone: 089 289 24951
e-mail: stefanie.klug(at)tum.de
Dr. Gunther Schauberger
Chair of Epidemiology
TUM Campus im Olympiapark
Am Olympiacampus 11
80809 München
Phone: 089 289 24955
e-mail: gunther.schauberger(at)tum.de
Text: Jasmin Schol
Photos: private