{"id":1551,"date":"2024-06-26T11:01:41","date_gmt":"2024-06-26T11:01:41","guid":{"rendered":"https:\/\/catedramasmovil.uc3m.es\/2024\/06\/26\/development-of-a-system-for-telephone-call-processing-transcription-diarization-and-classification-by-llm\/"},"modified":"2024-06-27T16:54:54","modified_gmt":"2024-06-27T16:54:54","slug":"development-of-a-system-for-telephone-call-processing-transcription-diarization-and-classification-by-llm","status":"publish","type":"post","link":"https:\/\/catedramasmovil.uc3m.es\/en\/2024\/06\/26\/development-of-a-system-for-telephone-call-processing-transcription-diarization-and-classification-by-llm\/","title":{"rendered":"Development of a system for telephone call processing: transcription, diarization and classification by LLM"},"content":{"rendered":"<p>[et_pb_section fb_built=&#8221;1&#8243; _builder_version=&#8221;4.17.0&#8243; custom_padding=&#8221;0px||||false|false&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_row _builder_version=&#8221;4.17.0&#8243; _module_preset=&#8221;default&#8221; custom_padding=&#8221;0px||||false|false&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.17.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_gallery gallery_ids=&#8221;1418,1420,1422,1424,1426,1408,1410,1412&#8243; fullwidth=&#8221;on&#8221; _builder_version=&#8221;4.25.1&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][\/et_pb_gallery][\/et_pb_column][\/et_pb_row][et_pb_row _builder_version=&#8221;4.16&#8243; background_size=&#8221;initial&#8221; background_position=&#8221;top_left&#8221; background_repeat=&#8221;repeat&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_column 
type=&#8221;4_4&#8243; _builder_version=&#8221;4.16&#8243; custom_padding=&#8221;|||&#8221; global_colors_info=&#8221;{}&#8221; custom_padding__hover=&#8221;|||&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_text _builder_version=&#8221;4.18.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;]<\/p>\n<h2>Objective<\/h2>\n<p><span style=\"font-weight: 400\"><\/span><\/p>\n<p>The massive volume of daily calls handled by telcos presents significant challenges for data management and analysis. Transcribing these calls manually is a laborious and costly process that leads to delays and errors. In addition, when customer interactions go unanalyzed, trends and recurring problems that may affect service quality cannot be identified.<\/p>\n<p>Therefore, an automated call processing system has been developed. It consists of transcription and diarization models, which together convert audio to text, and an LLM that classifies each call into predefined categories.<\/p>\n<p>The transcription model used is OpenAI&#8217;s Whisper. After several tests, its Large-v2 model was found to perform best, with a word error rate (WER) of about 10% on average. Although the results are good, the model is sensitive to audio quality, background noise, accents and technical terms. It takes the audio as input and outputs the words spoken in the audio with their timestamps.<\/p>\n<p>The diarization model is Nvidia&#8217;s NeMo, which extracts the characteristics of each speaker&#8217;s voice through its Multi-scale Diarization Decoder. In addition, Nvidia provides specific parameters for the telephone call domain. The results obtained with this model are good, but its accuracy decreases when there are interruptions and overlapping speech. 
It takes audio as input and returns speech segments as output, indicating the start time, duration and speaker of each segment.<\/p>\n<p>Once the timestamps of the words and of the speech segments are available, each spoken word is assigned to a speaker, which completes the conversion from audio to text.<\/p>\n<p>Finally, the text is classified into pre-established categories within the general subject of customer retention. Calls are classified according to the reason and sub-reason why the customer called to unsubscribe. Several language models, including GPT, Gemini, LLaMA and Gemma, were compared, and gpt-3.5-turbo gave the best results.<\/p>\n<p>It is essential to write a good prompt, so that the model understands the task to be performed, and to define clear, concrete and distinct categories that do not give rise to confusion. In this way, good results are achieved, supporting the use of this type of tool for text classification.<\/p>\n<p>In conclusion, this project provides value by creating a system that converts calls from audio to text and then classifies them into defined categories. 
It also lays the groundwork for possible future work in search of improved accuracy and performance, or enhanced functionality.<\/p>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][\/et_pb_section][et_pb_section fb_built=&#8221;1&#8243; _builder_version=&#8221;4.18.0&#8243; _module_preset=&#8221;default&#8221; custom_margin=&#8221;20px||||false|false&#8221; custom_padding=&#8221;0px||||false|false&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_row column_structure=&#8221;1_2,1_2&#8243; _builder_version=&#8221;4.18.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_column type=&#8221;1_2&#8243; _builder_version=&#8221;4.18.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_image src=&#8221;https:\/\/storage.googleapis.com\/wp-uploads.bucket.wp.uc3m.es\/wp-content\/uploads\/sites\/70\/2024\/06\/26101247\/FotoCarlos.jpeg&#8221; title_text=&#8221;PhotoCarlos&#8221; _builder_version=&#8221;4.25.1&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][\/et_pb_image][\/et_pb_column][et_pb_column type=&#8221;1_2&#8243; _builder_version=&#8221;4.18.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_text _builder_version=&#8221;4.25.1&#8243; _module_preset=&#8221;default&#8221; custom_padding=&#8221;||0px|||&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;]<\/p>\n<p><span style=\"color: #003366\"><strong>BACHELOR&#8217;S THESIS BY:<\/strong><\/span><\/p>\n<p><span style=\"color: #003366\"><strong>CARLOS CAMARERO FUENTE<\/strong><\/span><\/p>\n<p>[\/et_pb_text][et_pb_text _builder_version=&#8221;4.25.1&#8243; _module_preset=&#8221;default&#8221; 
global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;]<\/p>\n<p><strong><\/strong><\/p>\n<p><strong><\/strong><\/p>\n<p><strong><\/strong><\/p>\n<p><strong>Academic Experience<\/strong><strong><\/strong><\/p>\n<ul>\n<li>Double Degree in Computer Engineering and Business Administration, Universidad Carlos III de Madrid (September 2018 &#8211; June 2024)<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][et_pb_row _builder_version=&#8221;4.20.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.20.2&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_text _builder_version=&#8221;4.25.1&#8243; _module_preset=&#8221;default&#8221; custom_padding=&#8221;||9px|||&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;]<\/p>\n<p><strong>Work Experience<\/strong><strong><\/strong><\/p>\n<ul>\n<li>AI Tech Specialist &#8211; Grupo M\u00e1sM\u00f3vil (September 2024 &#8211; Present)<\/li>\n<li>Machine Learning Researcher &#8211; Universidad Carlos III de Madrid in collaboration with Grupo M\u00e1sM\u00f3vil (September 2023 &#8211; May 2024)<\/li>\n<\/ul>\n<p><strong><\/strong><\/p>\n<p><strong>Awards and Certifications<br \/><\/strong><\/p>\n<ul>\n<li>Community of Madrid Excellence Scholarship 2022\/23<\/li>\n<li>Best Creative Idea in Digital Transformation in the field of Occupational Health and Safety (2022)<\/li>\n<li>Community of Madrid Excellence Scholarship 2021\/22<\/li>\n<li>Community of Madrid Excellence Scholarship 2020\/21<\/li>\n<li>Community of Madrid Excellence Scholarship 2019\/20<\/li>\n<li>Community of Madrid Excellence Scholarship 2018\/19<\/li>\n<li>Community of Madrid Excellence Scholarship 2018<\/li>\n<\/ul>\n<p><strong><br \/>Technical 
skills<\/strong><\/p>\n<ul>\n<li>Programming languages: Python, C\/C++, SQL, HTML\/CSS\/JS.<\/li>\n<li>Development libraries: Pandas, NumPy, TensorFlow, scikit-learn.<\/li>\n<li>Cloud Platforms: Google Cloud.<\/li>\n<li>Frameworks: Git, Docker.<\/li>\n<\/ul>\n<blockquote>\n<p><span style=\"color: #666699\"><span style=\"text-decoration: underline\"><a href=\"https:\/\/www.linkedin.com\/in\/carlos-camarero-fuente\" target=\"_blank\" rel=\"noopener\" style=\"color: #666699;text-decoration: underline\">LinkedIn<\/a><\/span><\/span><\/p>\n<\/blockquote>\n<p>[\/et_pb_text][\/et_pb_column][\/et_pb_row][\/et_pb_section]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>[et_pb_section fb_built=&#8221;1&#8243; _builder_version=&#8221;4.17.0&#8243; custom_padding=&#8221;0px||||false|false&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_row _builder_version=&#8221;4.17.0&#8243; _module_preset=&#8221;default&#8221; custom_padding=&#8221;0px||||false|false&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.17.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_gallery gallery_ids=&#8221;1418,1420,1422,1424,1426,1408,1410,1412&#8243; fullwidth=&#8221;on&#8221; _builder_version=&#8221;4.25.1&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][\/et_pb_gallery][\/et_pb_column][\/et_pb_row][et_pb_row _builder_version=&#8221;4.16&#8243; background_size=&#8221;initial&#8221; background_position=&#8221;top_left&#8221; background_repeat=&#8221;repeat&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_column type=&#8221;4_4&#8243; _builder_version=&#8221;4.16&#8243; custom_padding=&#8221;|||&#8221; 
global_colors_info=&#8221;{}&#8221; custom_padding__hover=&#8221;|||&#8221; theme_builder_area=&#8221;et_body_layout&#8221;][et_pb_text _builder_version=&#8221;4.18.0&#8243; _module_preset=&#8221;default&#8221; global_colors_info=&#8221;{}&#8221; theme_builder_area=&#8221;et_body_layout&#8221;] Objective The massive volume of daily calls in telcos presents significant challenges in terms of data management [&hellip;]<\/p>\n","protected":false},"author":172,"featured_media":1421,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"on","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[69],"tags":[],"class_list":["post-1551","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-projects-2023-2024"],"_links":{"self":[{"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/posts\/1551","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/users\/172"}],"replies":[{"embeddable":true,"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/comments?post=1551"}],"version-history":[{"count":6,"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/posts\/1551\/revisions"}],"predecessor-version":[{"id":1693,"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/posts\/1551\/revisions\/1693"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/media\/1421"}],"wp:attachment":[{"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/media?parent=1551"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/categories?post=1551"},{"taxonomy":"p
ost_tag","embeddable":true,"href":"https:\/\/catedramasmovil.uc3m.es\/en\/wp-json\/wp\/v2\/tags?post=1551"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}