I-BiDaaS - Telefonica Research Online Hackathon

Ioannis Arapakis (TID), Jordi Luque Serrano (TID)




TID organised an online Hackathon from 23 to 25 October 2020 on “Quality of Service in Call Centers”, a high-value use case for any company that wants to maintain a close relationship with its customers. In this Hackathon challenge, motivated by the I-BiDaaS EU project, we proposed analysing Call Center transcripts together with the corresponding acoustic voice features to predict the customer satisfaction index (CSI). Such services can support the Call Center by screening phone calls automatically and identifying problematic cases efficiently. Hence, the main task of this Hackathon was to provide an automatic solution for analysing the calls and predicting customer satisfaction.

The Hackathon was designed to challenge curious and analytical minds, data experts, designers and developers with (but not limited to) competences in:

  • Analysing big and complex data with scalable methods (e.g., Deep Learning, statistical analysis)
  • NLP and speech processing technologies

Considering the conditions and limitations under which the event was launched, the attention it received exceeded our initial expectations. Students, PhDs, researchers and employees of startups and SMEs from all over Europe teamed up to address the challenge of developing algorithms that take the output of speech-to-text technologies (e.g., prosodic and linguistic features) and convert it into relevant information for Call Center operations.


The participating teams were the following:

  • Team 1 - Qbeast Analytics (2 persons)
  • Team 2 - Algomo (3 persons)
  • Team 3 (1 person)
  • Team 4 - Forest Labs (1 person)
  • Team 5 - Bharat Eco Solutions and Technologies (1 person)
  • Team 6 - ZZ Data Labs (2 persons)
  • Team 7 - ElArbustoDeLaDecision (2 persons)
  • Team 8 - OEG-UPM (1 person)




To assist the teams and to improve communication and collaboration, we assigned a mentor to each of them. The mentors, who were recruited from the I-BiDaaS consortium technical partners, were the first “line of defence” for answering questions and resolving technical issues:

  • Ioannis Arapakis (TID)
  • Jordi Luque (TID)
  • Gerald Ristow (SAG)
  • Leonidas Kallipolitis (AEGIS)
  • Andreas Alexopoulos (AEGIS)
  • Raül Sirvent (BSC)
  • Omer Boehm (IBM)
  • Lidija Fodor (UNSPMF)


For the CSI prediction, we considered deep neural architectures that perform early feature fusion of prosodic and textual information. Convolutional Neural Networks were trained on a combination of word embeddings and acoustic features. We addressed the task as binary classification, i.e., predicting either “low” or “high” satisfaction. We further investigated whether fully anonymised transcripts affect performance. For the purposes of the Hackathon, we made an anonymised Call Center dataset available.
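The early-fusion idea described above can be illustrated with a minimal, dependency-free sketch. It is not the organisers' baseline: the dimensions, the per-token alignment of acoustic features, the single convolution layer with max pooling, and the random, untrained weights are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import random


def early_fusion(word_vectors, acoustic_vectors):
    """Concatenate each token's word embedding with its acoustic features."""
    if len(word_vectors) != len(acoustic_vectors):
        raise ValueError("one acoustic vector is needed per token")
    return [list(w) + list(a) for w, a in zip(word_vectors, acoustic_vectors)]


def conv1d_max_pool(sequence, kernel, bias=0.0):
    """Slide one filter over the token axis, apply ReLU, then max-pool."""
    width = len(kernel)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the kernel width")
    activations = []
    for start in range(len(sequence) - width + 1):
        total = bias
        for offset in range(width):
            frame = sequence[start + offset]
            total += sum(x * k for x, k in zip(frame, kernel[offset]))
        activations.append(max(0.0, total))  # ReLU
    return max(activations)  # global max pooling over time


def predict_high_satisfaction(sequence, kernels, out_weights, out_bias=0.0):
    """Return the probability of the "high" satisfaction class."""
    pooled = [conv1d_max_pool(sequence, k) for k in kernels]
    logit = out_bias + sum(p * w for p, w in zip(pooled, out_weights))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


# Toy input (assumed sizes): 6 tokens, 4-dim embeddings, 2 prosodic features.
rng = random.Random(0)
EMBED_DIM, ACOUSTIC_DIM, N_TOKENS, KERNEL_WIDTH, N_FILTERS = 4, 2, 6, 3, 5
words = [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(N_TOKENS)]
prosody = [[rng.uniform(-1, 1) for _ in range(ACOUSTIC_DIM)] for _ in range(N_TOKENS)]
fused = early_fusion(words, prosody)

# Untrained random weights; a real model would learn these from labelled calls.
kernels = [
    [[rng.uniform(-0.5, 0.5) for _ in range(EMBED_DIM + ACOUSTIC_DIM)]
     for _ in range(KERNEL_WIDTH)]
    for _ in range(N_FILTERS)
]
out_weights = [rng.uniform(-0.5, 0.5) for _ in range(N_FILTERS)]

p_high = predict_high_satisfaction(fused, kernels, out_weights)
label = "high" if p_high >= 0.5 else "low"
print(label, round(p_high, 3))
```

Fusing before the convolution lets each filter respond to joint patterns of wording and prosody, which is what distinguishes early fusion from combining separate text and audio models at the decision stage.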


The official track of the challenge was the following:

  • CSI detection, as self-reported by the customer, on anonymised data from text-based and/or prosodic features


We also encouraged the participants to submit their proposals to the following unofficial tracks:

  • CSI detection on non-anonymised data from text-based and prosodic features
  • CSI detection from only text
  • CSI detection from only prosodic features


Each team was given access to a dedicated AWS server (Ubuntu, instance type “g3s.xlarge”, Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, 31 GB memory, SSD 90 GB, Tesla M60 GPU card) with up-to-date versions of popular ML/DL libraries pre-installed, along with a training dataset and example Python scripts in Jupyter notebook format for benchmarking their final submission.




The submitted solutions were evaluated on an unseen test set of approximately 9,000 instances, using a scoring function defined as the geometric mean (the square root of the product) of two metrics:

  • F-score: a way of combining the precision and recall of the model, defined as the harmonic mean of the model’s precision and recall
  • Normalised Runtime (NR): the total number of milliseconds that the submitted system requires to perform classification on the final anonymised test set, averaged over 10 trials and normalised by the corresponding baseline runtime, as computed on the same test set
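The two metrics can be sketched in a few lines of Python. This is an illustration under assumptions, not the organisers' evaluation script: the function names are hypothetical, the value of beta is not stated here (beta = 1 gives the harmonic mean described above), and normalising by a simple ratio to the baseline runtime is assumed from the description.

```python
import time


def f_beta(y_true, y_pred, beta=1.0, positive=1):
    """F-beta score; with beta=1 it is the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def mean_runtime_ms(classify, inputs, trials=10):
    """Average wall-clock time, in milliseconds, of classify(inputs) over several trials."""
    elapsed = []
    for _ in range(trials):
        start = time.perf_counter()
        classify(inputs)
        elapsed.append((time.perf_counter() - start) * 1000.0)
    return sum(elapsed) / len(elapsed)


def normalised_runtime(system_ms, baseline_ms):
    """Runtime of the submitted system relative to the baseline (assumed to be a plain ratio)."""
    return system_ms / baseline_ms


# Toy labels: 2 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
score = f_beta(y_true, y_pred)  # precision = recall = 2/3, so F1 = 2/3
```

Timing with `time.perf_counter` and averaging over repeated trials reduces the effect of one-off delays such as cache warm-up on the measured runtime.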


After a careful assessment of the submitted solutions, we determined the winning team: “Team 7 – ElArbustoDeLaDecision” (Dennis Doerrich, Yaroslav Marchuk). Their model outperformed our baseline solution in terms of the F-beta score and achieved an impressive final score of 0.47164.


Runtime (average of 10 rounds): 115.84413
Runtime score (compared to baseline): 0.50000
F-beta score (average of 10 rounds): 0.44488
Final score: 0.47164
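
As a quick consistency check, the reported final score can be reproduced as the geometric mean of the reported F-beta score and runtime score. The sketch below takes both values as given; how the runtime score of 0.50000 is derived from the measured runtime is not stated here, and the function name is hypothetical.

```python
import math


def geometric_mean_score(f_beta_score, runtime_score):
    """Geometric mean of two metrics: the square root of their product."""
    return math.sqrt(f_beta_score * runtime_score)


# Values reported for the winning team.
final = geometric_mean_score(0.44488, 0.50000)
print(f"{final:.5f}")  # prints 0.47164
```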


Through this Hackathon event, TID achieved the following goals:

  • Assisted in the synthesis of realistic data that mimic real datasets and facilitated the early exploration and development phases in I-BiDaaS
  • Broke down inter- and intra-sectorial data silos, and provided in-house access to real-life datasets
  • Involved different business units and external companies for interfacing and exploring novel data analytic technologies
  • Raised awareness about the challenges and research output produced by the I-BiDaaS project


In summary, this Hackathon addressed the challenge of developing speech technologies that transform audio calls into relevant information for the Call Center. By working synergistically on this use case, we were able to deliver technologies that can increase the number of audio calls processed per unit of time and significantly reduce the manual effort allocated to this task.


Last but not least, the winning team was awarded free entrance to the Wayra pitch day!