Automatic improvement of instructions in classification tasks

Objective

Large language models (LLMs) have proven highly effective on a wide range of tasks, including text classification driven by natural-language instructions. However, the quality of these instructions, known as prompts, directly influences system performance. Designing effective instructions manually is time-consuming and requires expertise, which makes it difficult to scale to multiple tasks or domains.

This project presents a fully automated, iterative approach to improving instructions for classification tasks. Unlike other solutions that rely on large models and complex infrastructure, the developed system uses only small language models, demonstrating that significant improvements can be achieved without resorting to large-scale architectures.

The proposed architecture is structured in three functional modules: generation, evaluation and selection. In each iteration, a first model generates new instructions from templates and examples. A second model evaluates these instructions by simulating their execution and comparing the results with the ground-truth labels. Finally, a selector component, based on exploration-exploitation strategies and Bayesian inference, chooses the most promising instructions and guides the next iteration.
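
The generate, evaluate and select loop described above can be illustrated with a minimal sketch. The text does not say which exploration-exploitation strategy or Bayesian model the selector uses, so this example assumes Thompson sampling with a Beta-Bernoulli posterior over each instruction's accuracy. The names `Candidate`, `optimize`, `toy_generate` and `toy_classify` are hypothetical, and the two toy functions are simple stand-ins for the small language models, not the thesis implementation.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    prompt: str
    successes: int = 0  # evaluator predictions that matched the true label
    failures: int = 0   # evaluator predictions that did not

    def sample(self, rng: random.Random) -> float:
        # One draw from the Beta(1 + s, 1 + f) posterior (uniform prior).
        return rng.betavariate(1 + self.successes, 1 + self.failures)

    def mean(self) -> float:
        # Posterior mean accuracy of this instruction.
        return (1 + self.successes) / (2 + self.successes + self.failures)


def optimize(seed_prompt, data, generate, classify, rounds=5, batch=4, seed=0):
    rng = random.Random(seed)
    pool = [Candidate(seed_prompt)]
    for step in range(rounds):
        # Selection (assumed: Thompson sampling): the candidate with the
        # highest posterior draw becomes the parent of the next instruction.
        parent = max(pool, key=lambda c: c.sample(rng))
        # Generation: propose a new instruction derived from the parent.
        pool.append(Candidate(generate(parent.prompt, step)))
        # Evaluation: score every candidate on a small random labeled batch.
        for cand in pool:
            for text, label in rng.sample(data, batch):
                if classify(cand.prompt, text) == label:
                    cand.successes += 1
                else:
                    cand.failures += 1
    return max(pool, key=Candidate.mean), pool


def toy_generate(parent, step):
    # Hypothetical generator: appends one of two fixed hints to the parent.
    hints = ["Answer with pos or neg.", "Focus on sentiment words."]
    return f"{parent} {hints[step % len(hints)]}"


def toy_classify(prompt, text):
    # Hypothetical evaluator: only solves the task once the prompt
    # mentions sentiment; otherwise it always answers "neg".
    if "sentiment" in prompt:
        return "pos" if "good" in text else "neg"
    return "neg"


DATA = [
    ("good film", "pos"), ("bad film", "neg"), ("good food", "pos"),
    ("bad service", "neg"), ("good value", "pos"), ("bad plot", "neg"),
]

best, pool = optimize("Classify the review.", DATA, toy_generate, toy_classify)
print(best.prompt, round(best.mean(), 2))
```

Sampling from the posterior instead of always picking the highest observed accuracy gives rarely evaluated instructions a chance to be selected, which is one common way to balance exploration and exploitation. In a real system, `generate` and `classify` would wrap calls to the generator and evaluator models.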

This approach seeks to maximize system performance with minimal human intervention by dynamically adapting to each dataset. Preliminary results show consistent improvements over manually written prompts, highlighting the feasibility of automatic methods even in environments with limited computational resources. Furthermore, the system is presented as an adaptable basis for future applications in other natural language processing tasks, such as text generation or question answering.

BACHELOR’S THESIS BY:

ÓSCAR HONTORIA HERRADOR

Academic Experience

Computer Science and Engineering, Universidad Carlos III de Madrid (September 2021 – September 2025)

Work Experience

Machine Learning Researcher – Universidad Carlos III de Madrid in collaboration with Grupo MasOrange (September 2024 – June 2025)

Technical skills

Programming languages: Python, C/C++, Go, C#, SQL, HTML/CSS, JavaScript.
Development libraries: Pandas, NumPy, PyTorch, Keras, scikit-learn.
Cloud Platforms: Google Cloud.
Tools: GitHub, GitLab.

LinkedIn