Objective
This project explores mobile phone data to study tourist mobility in the city of Madrid during Easter Week 2022. The data first went through a careful cleaning process that addressed the scarcity of tourist records as well as labeling errors and duplicate entries among the mobile phone antennas.
Once the data was prepared, its composition and structure were analyzed and a new dataset in event log format was generated. Process mining techniques were then applied to identify the main routes taken by visitors in Madrid, using the Disco tool to filter the information and build the process maps.
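As an illustration of the log generation step, the following is a minimal sketch of how antenna records might be turned into an event log and summarized as directly-follows counts, the relation on which a process map is typically built. The record layout, the zone names, and the helper functions build_event_log and directly_follows are hypothetical and are not taken from the project; the sketch only assumes that each record carries a tourist identifier, a timestamp, and a zone.

```python
from collections import Counter
from itertools import groupby

# Hypothetical antenna records: (tourist_id, ISO timestamp, zone).
# The values are invented for illustration only.
records = [
    ("t1", "2022-04-14T09:00", "Sol"),
    ("t1", "2022-04-14T09:30", "Sol"),
    ("t1", "2022-04-14T11:00", "Retiro"),
    ("t1", "2022-04-14T13:00", "Prado"),
    ("t2", "2022-04-14T10:00", "Sol"),
    ("t2", "2022-04-14T12:00", "Retiro"),
]


def build_event_log(records):
    """Return events sorted by case and time, dropping consecutive
    repeats of the same zone within a case."""
    log = []
    ordered = sorted(records, key=lambda r: (r[0], r[1]))
    for case_id, events in groupby(ordered, key=lambda r: r[0]):
        previous_zone = None
        for _, timestamp, zone in events:
            if zone != previous_zone:
                log.append(
                    {"case_id": case_id, "activity": zone, "timestamp": timestamp}
                )
            previous_zone = zone
    return log


def directly_follows(log):
    """Count how often one activity is immediately followed by another
    within the same case."""
    counts = Counter()
    for _, events in groupby(log, key=lambda e: e["case_id"]):
        trace = [e["activity"] for e in events]
        counts.update(zip(trace, trace[1:]))
    return counts


event_log = build_event_log(records)
dfg = directly_follows(event_log)
for (source, target), n in sorted(dfg.items()):
    print(f"{source} -> {target}: {n}")
```

The three columns used here (case, activity, timestamp) are the usual minimum for an event log; the columns actually used in the project are not specified in this summary.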
However, extracting more valuable information, such as the movement patterns of tourists by place of accommodation and nationality, required a largely manual approach and relied on the analyst’s expert knowledge. This limitation motivated the use of machine learning techniques to automate the detection of movement patterns and the extraction of relevant information.
Based on this premise, an architecture centered on an autoencoder was proposed that compresses the mobility information of a tourist’s daily activity into a three-dimensional vector. Clustering algorithms (k-means, DBSCAN, and hierarchical clustering) were then applied to these vectors to identify groups of tourists with similar movement profiles. The methodology was validated using synthetic data, and neural network models based on convolutional autoencoder and U-Net architectures were built.
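The clustering step can be sketched as follows, assuming the autoencoder has already produced one three-dimensional vector per tourist and day. The vectors below are synthetic stand-ins drawn around three invented centers, not outputs of the project's models, and the kmeans function is a plain standard-library implementation written for illustration (with a deterministic farthest-point initialization), not the implementation used in the project.

```python
import random
from math import dist


def kmeans(points, k, max_iters=100):
    """Basic Lloyd's k-means over tuples of floats.

    Initial centers are chosen by farthest-point selection, which is
    deterministic; this is an illustrative choice, not the project's.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(dist(p, c) for c in centers))
        )

    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centers


# Synthetic 3-D "latent vectors": 20 points around each hypothetical center.
rng = random.Random(0)
blob_centers = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
latent_vectors = [
    tuple(rng.gauss(c, 0.2) for c in center)
    for center in blob_centers
    for _ in range(20)
]

labels, centers = kmeans(latent_vectors, k=3)
print(sorted(set(labels)))
```

With real latent vectors the groups are unlikely to be this well separated, which is one reason to compare k-means against DBSCAN and hierarchical clustering, as the project does.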
The results showed that both designs were effective, but the convolutional autoencoder learned the data structure better and built a better latent space. This was attributed to the “skip connections” in the U-Net architecture, which make reconstruction easier by letting information bypass the bottleneck, at the expense of more accurate learning in the latent space. Clustering identified between 4 and 6 groups of tourists, and the differences among these groups in their hourly, spatial, and nationality profiles were analyzed.
In conclusion, this project contributes to the study of tourist mobility through the application of process mining and machine learning techniques. Future research should apply similar methods to more precise and less aggregated datasets. The approach lays the groundwork for a deeper and more specific analysis of tourist mobility in Madrid, with the potential to improve the understanding of visitor movement patterns and provide useful insights for the tourism sector.
FINAL DEGREE PROJECT OF:
RICARDO GRANDE CROS
Academic Experience
Double Degree in Computer Science and Engineering, and Business Administration, Universidad Carlos III de Madrid
(September 2017 – June 2023)
ERASMUS+ at Karl Franzens Universität, Austria (2019 – 2020)
Work Experience
Backend Developer – Grupo MasMóvil (August 2023 – present)
Machine Learning Researcher – Universidad Carlos III de Madrid in collaboration with Grupo MasMóvil (September 2022 – May 2023)
DevOps Developer Trainee – CBRE (January 2023 – July 2023)
Software Development Intern – Tecnatom (June 2021 – August 2021)
Private Tutor (Mathematics, Physics and Programming, 2018 – 2020)
Awards and Certifications
Beca de Excelencia de la Comunidad de Madrid 2022/23
Beca de Excelencia de la Comunidad de Madrid 2021/22
Beca de Excelencia de la Comunidad de Madrid 2020/21
Best Creative Idea in Digital Transformation in the field of Occupational Health and Safety (2022)
Azure AI Fundamentals (AI-900)
Technical Skills
Programming languages: Python, Java, C/C++, SQL, HTML/CSS/JS
Libraries: Pandas (and GeoPandas), NumPy, TensorFlow, Keras, Plotly and Scikit-Learn
Cloud platforms: Amazon Web Services, Microsoft Azure, Google Cloud
Technologies and frameworks: Git, Jenkins, Terraform, Angular