Objective
Currently, companies of all kinds collect data about how customers interact with their products and services. This data is valuable: it helps companies both generate revenue and reduce costs. In this work, we use quality data from the MasMovil mobile network, which lets us monitor numerous key indicators of network performance for each customer.
The data contains very few complaints, so the complaint class is heavily outnumbered by the majority class of no complaints. This work focuses only on the “Slow Internet” complaint reason, because it is one of the least ambiguous categories and because concentrating on a single complaint type allows more precise results.
With this data, the work has two objectives. The first is to predict complaints before they occur, which has a direct impact on the image that MasMovil offers to the customer, with the consequent economic benefits. The second is to detect anomalous behavior in the quality indicators of the customer’s mobile network, in order to study the antennas that customers pass through and that show this anomalous behavior, using the complaint data as input for these methods.
The methods used to predict complaints before they occur are classic supervised learning models such as Random Forests, SVCs and Decision Trees. Results are evaluated mainly with the Area Under the ROC Curve (AUC), because it accounts for the success rate on each class and is better suited to cases as imbalanced as this one. The best results come from Random Forest models using time windows that include the day of the complaint and the previous 6 days, reaching AUC = 0.80. If the data from the complaint date is discarded, the AUC is 0.76. For the second objective, however, only 4 attributes and 5-day time windows are used, since little information is lost.
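To illustrate why AUC is preferred over plain accuracy under heavy class imbalance, the following is a minimal sketch, not the code used in this work. It computes AUC with the rank-based formulation (the probability that a randomly chosen complaint case receives a higher score than a randomly chosen non-complaint case). The function names and the toy data (2 complaints among 100 customers) are assumptions made only for this example.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: P(score of a random positive > score of a random negative).

    Ties receive the average rank, so a constant score yields exactly 0.5.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one example of each class")
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    return sum(int(y == p) for y, p in zip(labels, predictions)) / len(labels)


# Hypothetical toy data: 2 complaints (label 1) among 100 customers.
labels = [1, 1] + [0] * 98

# A useless model that never predicts a complaint.
trivial_scores = [0.0] * 100
trivial_predictions = [0] * 100

print("accuracy:", accuracy(labels, trivial_predictions))
print("AUC:", roc_auc(labels, trivial_scores))
```

On this toy data the useless model is right for 98 of 100 customers, yet its AUC is 0.5, the same as random guessing. Accuracy rewards ignoring the minority class, while AUC does not.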
Regarding the second objective, techniques such as Autoencoders, Variational Autoencoders and GANs have been used to detect anomalies in the customer’s 5-day time window. GANs give the best results, reaching an AUC of 0.66 when the anomaly classification threshold is adjusted to maximize this metric. However, it can be more useful not to tune the threshold in this way and to flag as an anomaly only what the GAN clearly identifies as such. This yields AUC = 0.58, with practically no false positives, at the cost of false negatives. This matters because it gives more certainty that a flagged case really is an anomaly that must be studied.
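The trade-off between the two thresholding strategies can be sketched as follows. This is an illustrative example under stated assumptions, not the procedure used in this work. It assumes the AUC is computed on hard anomaly labels, in which case it equals the mean of the true positive rate and the true negative rate. The anomaly scores, labels, the conservative threshold of 0.8 and all function names are hypothetical.

```python
def confusion(labels, scores, threshold):
    """Count (tp, fp, tn, fn) when a score >= threshold is flagged as an anomaly."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        flagged = s >= threshold
        if flagged and y == 1:
            tp += 1
        elif flagged and y == 0:
            fp += 1
        elif not flagged and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def binary_auc(labels, scores, threshold):
    """AUC of a hard classifier: the area under the ROC curve through the single
    point (FPR, TPR), which equals (TPR + TNR) / 2."""
    tp, fp, tn, fn = confusion(labels, scores, threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    return (tpr + tnr) / 2


def best_threshold(labels, scores):
    """Pick the observed score that maximizes the binary AUC."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: binary_auc(labels, scores, t))


# Hypothetical anomaly scores: label 1 marks a window tied to a complaint.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.35, 0.5, 0.65, 0.9]

tuned = best_threshold(labels, scores)
conservative = 0.8  # assumed cut-off: flag only clear-cut anomalies

for name, t in [("tuned", tuned), ("conservative", conservative)]:
    tp, fp, tn, fn = confusion(labels, scores, t)
    print(name, "threshold:", t, "tp:", tp, "fp:", fp, "fn:", fn,
          "AUC:", binary_auc(labels, scores, t))
```

In this toy example the tuned threshold reaches a higher AUC but raises false alarms, while the conservative threshold produces no false positives and misses most anomalies. Under the same assumption, an AUC of 0.58 with practically no false positives would correspond to a true positive rate of roughly 0.16.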
These two objectives and their results complement each other: on the one hand, we can anticipate the customer’s complaint, although without having prevented the poor performance of the mobile network; on the other hand, we can study cases of abnormal behavior on the network to prevent this from happening again.
FINAL DEGREE PROJECT OF:
DAVID CAVADA BUENASMAÑANAS
Degree
Double degree in Computer Engineering and Business Administration at Universidad Carlos III de Madrid. Mention in Computer Science in the Computer Engineering degree.
One year of studies at University of California Riverside. Degrees in Business Administration & Computer Science.
Work Experience
Front End Developer at Varadero Software Factory S.L. (June 2021 – September 2021)
Specialist Technician (research staff) at Universidad Carlos III de Madrid. Chairs at MásMóvil Group (September 2021 – May 2022)
Back End Developer at Grupo MásMóvil (June 2022 – present)
Technical skills
Advanced Java and Python programming skills. Relational databases.
Data analysis and Machine Learning techniques. HTML, CSS and JavaScript.
Orchestration of distributed scalable applications using Cadence. Vert.x and RxJava. Kubernetes.
Knowledge of Business Administration and Management, such as Strategic Management, Accounting, Economics or Marketing.