Network Data Analytics
The data analytics group collects and analyzes data sets from various business areas. We develop novel machine learning approaches to turn data into knowledge and better products. The group's focus is on scalable machine learning on time series, sequence, and graph data. We collaborate closely with NEC's business units on internal R&D projects and participate in European R&D projects with both industrial and academic partners.
Machine Learning and Network Data Analytics
This research area focuses on the development of state-of-the-art machine learning models and algorithms. The primary application domains are those that arise in the operation of communication networks. For instance, we have developed machine learning algorithms that predict the popularity of video content to improve the caching performance of mobile networks. We are also working on novel approaches to monetizing network data. For instance, we are developing deep learning models that analyze the request traffic in communication networks, turning raw traffic patterns into valuable insights, user personalization, and optimized operation of the network. Communication networks require real-time, low-latency predictions; the machine learning methods we develop are therefore highly scalable and often run on high-performance GPUs and distributed systems.
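To make the link between popularity prediction and caching concrete, here is a minimal sketch. It is not the group's algorithm: it swaps in a simple exponentially decayed request count as the popularity predictor, and the function names, the decay parameter, and the toy request log are all assumptions made for illustration.

```python
from collections import defaultdict


def predict_popularity(request_log, decay=0.5):
    """Return an exponentially decayed request count per video.

    request_log is a list of time windows (oldest first); each window is a
    list of requested video ids. Recent windows weigh more than old ones.
    This is an illustrative heuristic, not a learned model.
    """
    scores = defaultdict(float)
    for window in request_log:
        # Age all existing scores by one window, then add the new requests.
        for video in scores:
            scores[video] *= decay
        for video in window:
            scores[video] += 1.0
    return dict(scores)


def choose_cache_contents(scores, capacity):
    """Pick the `capacity` videos with the highest predicted popularity."""
    # Ties are broken by video id so the result is deterministic.
    ranked = sorted(scores, key=lambda v: (-scores[v], v))
    return ranked[:capacity]


# Hypothetical toy log: three time windows of video requests.
log = [["a", "a", "b"], ["b", "c"], ["c", "c", "b"]]
scores = predict_popularity(log)
cached = choose_cache_contents(scores, capacity=2)
print(cached)
```

In this toy log, video "a" was requested most in the oldest window but not since, so its score decays and the two cache slots go to the more recently requested videos. A learned predictor would replace predict_popularity while the cache selection step stays the same.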
The current methodological focus of the group is on scalable neural learning with an emphasis on time series and graph data. Similar to communication networks, data collected from several of NEC's business domains is representable as graphs. Examples include transportation networks and large knowledge bases powering novel AI applications. We have developed and implemented neural machine learning algorithms that learn low-dimensional embeddings for complex systems represented as graphs. These novel embedding approaches improve prediction accuracy while being highly scalable and suitable for low-latency applications. The results of these research projects will be presented at the top-tier conferences ICML 2016 and NIPS 2016.
Processing, Analyzing, and Understanding Big Web Data
This research area focuses on the design and implementation of algorithms and methodologies that allow a better understanding of the Web and its relationship with Internet users. We analyze terabyte-scale network traces, Web logs, and CDRs to find and solve problems that may affect the user experience. Moreover, we design and execute measurement experiments when the problem requires it. For example, we analyze the fraud protection mechanisms of video content portals such as YouTube. The results have led to important insights into the current practices of these portals, which may have worrisome consequences for advertisers and for users who generate and publish video content.
The current line of research focuses on improving the online privacy of Internet users. To this end, we use our in-house Hadoop/Spark cluster to study the traffic of thousands of users, looking for possible privacy violations. Moreover, we use the knowledge acquired with the big data analysis tools to develop both in-network and user-side privacy safeguarding tools. Some results from this line of research have been published in top-tier international conferences such as WWW 2016 and IMC 2016 and have received broad media coverage (Financial Times, BBC, The Guardian, etc.).
- 20 February 2017: Revenues for communications service providers to be boosted by NEC's AI
- Several research papers have been accepted at WWW, ICML, NIPS, ECML/PKDD
- Our research on “Understanding the detection of fake view fraud in Video Content Portals” (http://arxiv.org/abs/1507.08874) has received broad media coverage (Financial Times, BBC, The Guardian, etc.). The article appeared on the front page of the Financial Times on 23.09.2015 (http://www.ft.com/intl/cms/s/0/53ac3fd0-604e-11e5-a28b-50226830d644.html#axzz3okJ2G3Ad)
- Miriam Marciel, Ruben Cuevas (UC3M), Albert Banchs (UC3M, IMDEA), Roberto Gonzalez, Stefano Traverso (Polito), Mohamed Ahmed and Arturo Azcorra (UC3M, IMDEA). Understanding the Detection of View Fraud in Video Content Portals. WWW
- Mathias Niepert. Discriminative Gaifman Models. NIPS
- Roberto Gonzalez, Claudio Soriente (Telefonica Research), Nikolaos Laoutaris (Telefonica Research). User profiling in the time of HTTPS. IMC
- Mathias Niepert, Mohamed Ahmed, Konstantin Kutzkov. Learning Convolutional Neural Networks for Graphs. ICML
- Miriam Marciel, Jose Gonzalez (UC3M), Yonas Mitike Kassa (IMDEA/UC3M), Roberto Gonzalez, Mohamed Ahmed. The Value of Online Users: Empirical Evaluation of the Price of Personalized Ads. FASES
- Tian Guo (EPFL), Konstantin Kutzkov, Mohamed Ahmed, Jean-Paul Calbimonte (EPFL), and Karl Aberer (EPFL). Efficient Distributed Decision Trees for Robust Regression. ECML/PKDD
- Alberto Garcia-Duran, Antoine Bordes (Facebook), Nicolas Usunier (Facebook), Yves Grandvalet (Universite de technologie de Compiegne). Combining Two and Three-Way Embedding Models for Link Prediction in Knowledge Bases. JAIR
- Juan Miguel Carrascosa, Ruben Cuevas (UC3M), Roberto Gonzalez, Arturo Azcorra (UC3M/IMDEA), David Garcia (ETHZ). Quantifying the Economic and Cultural Biases of Social Media through Trending Topics. PLOS ONE
- Mathias Niepert and Pedro Domingos (University of Washington). Learning and Inference in Tractable Probabilistic Knowledge Bases. UAI
- Stefano Traverso (Polito), Mohamed Ahmed, Michele Garetto (Universita di Torino), Paolo Giaccone (Polito), Emilio Leonardi (Polito), Saverio Niccolini. Unravelling the Impact of Temporal and Geographical Locality in Today’s Content Caching Systems. TOMS
- Konstantin Kutzkov, Mohamed Ahmed, Sofia Nikitaki. Weighted similarity estimation in data streams. CIKM
- Laurent Bulteau (TU Berlin), Vincent Froese (TU Berlin), Konstantin Kutzkov and Rasmus Pagh (ITU Copenhagen). Triangle counting in dynamic graph streams. Algorithmica
- Sofia Nikitaki, Maurizio Dusi. Inferring spatio-temporal changes in urban areas from mobile phone data. NetMob 2015