Jakob Sturm
M. Sc. (he/him) (external PhD student with BMW)
Address: TUM - Fakultät für Informatik, Boltzmannstr. 3, 85748 Garching
Research Interests
The focus of my research is on Retrieval-Augmented Generation (RAG) as an approach for domain adaptation. While pretrained large language models (LLMs) offer great potential, they need to be adapted to handle domain-specific tasks and incorporate relevant knowledge. RAG addresses this need, but it is not yet a perfect solution. I aim to contribute to the identification and resolution of existing challenges.
Current research topics include:
- Evaluation methods for RAG, both end-to-end and component-wise
- Domain adaptation strategies for the retrieval component of RAG
- Comparison and integration of RAG with fine-tuning approaches
- Identification and analysis of key pain points in RAG systems
Supervision of Theses
Important: Currently, I am not available to supervise any additional topics, so please only reach out if you are interested in one of the listed open projects!
If you are looking for a thesis or guided research topic and are motivated to work on a project related to my research interests, feel free to contact me via email. Please attach your CV, Transcript of Records, and a short introduction (< 400 words) about yourself and your motivation. As my capacity is very limited, please understand that I cannot offer a thesis to every interested student. I will nevertheless try to answer every enquiry; if you haven't heard from me, I am still deciding whether I can offer you a topic.
(M = master thesis; B = bachelor thesis) (TUM = Internal TUM thesis; BMW = external thesis, linked to a thesis position at BMW)
Open Topics
- (M) (TUM) Assessing the potential of hybrid retrieval methods for special domain items with a focus on large text elements
- (M) (TUM) Comparing RAG and fine-tuning with respect to output quality and costs
- (M) (TUM) Assessing the potential of black-box fine-tuning of embedding models for domain adaptation
Ongoing Theses
- (M) S3EK: boosting task-Specific Semantic Search with Expert Knowledge data
- (B) Assessing the role of generator and retriever within a RAG pipeline on the Question Answering task with the help of NLP metrics
- (M) Agentic RAG, relevance feedback and keyword identification from search histories
Finished Theses
- (2025) (M) Advanced Methods for Finding Related Tickets Based on Semantic Search (continuous pretraining and fine-tuning for IR; hybrid retrieval)
- (2024) (M) Evaluation of Retrieval-Augmented Generation Architectures
- (2024) (M) Boosting Quality Control in the Automotive Industry using LLMs and Contrastive Learning (black-box fine-tuning for embedding models)