Carnegie Mellon University, USA
Graph-Based Anomaly Detection: Problems, Algorithms and Applications
Leman Akoglu joined the Heinz College faculty as an Assistant Professor in Fall 2016. She also holds courtesy appointments in the Computer Science Department (CSD) and the Machine Learning Department (MLD) of the School of Computer Science (SCS). Akoglu is the Heinz College Dean’s Associate Professor of Information Systems. Prior to joining Heinz College, she had been an Assistant Professor in the Department of Computer Science at Stony Brook University since receiving her Ph.D. from CSD/SCS of Carnegie Mellon University in 2012.
Dr. Akoglu’s research interests span a wide range of data mining and machine learning topics with a focus on algorithmic problems arising in graph mining, pattern discovery, social and information networks, and especially anomaly mining: outlier, fraud, and event detection. At Heinz, Dr. Akoglu directs the Data Analytics Techniques Algorithms (DATA) Lab. Dr. Akoglu’s research has won 7 publication awards: Best Research Paper at SIAM SDM 2019, Best Student Machine Learning Paper Runner-up at ECML PKDD 2018, Best Paper Runner-up at SIAM SDM 2016, Best Research Paper at SIAM SDM 2015, Best Paper at ADC 2014, Best Paper at PAKDD 2010, and Best Knowledge Discovery Paper at ECML PKDD 2009. She also holds 3 U.S. patents filed by IBM T. J. Watson Research Labs.
Graphs provide a powerful abstraction for representing non-iid data, capturing immediate as well as long-range dependencies between entities. The study of the structure and dynamics of real-world graphs has been a central theme of research across various communities. Graph-based anomaly detection focuses broadly on identifying those ‘constructs’ that do not ‘fit’ the expected relational patterns.
This talk presents vignettes from my decade-long research on anomaly detection using graph-based techniques. I will introduce various scenarios in which graphs can be used in a natural way, both to formalize concrete anomaly detection problems and to develop algorithmic anomaly detection methods. These will be motivated by real-world applications of anomaly detection in the wild, including opinion fraud, accounting anomalies, and host-level intrusion.
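As a rough illustration of how a graph can formalize an anomaly detection problem, the sketch below extracts two simple features from each node's egonet (the node, its neighbors, and the edges among them) and scales them into a density score. The toy edge list, the function names, and the density score itself are assumptions made for this example; they are not taken from the talk and are not a description of the speaker's methods.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def egonet_features(adj):
    """Return {node: (num_neighbors, num_edges_in_egonet)}."""
    features = {}
    for node, nbrs in adj.items():
        # Edges from the ego to each of its neighbors.
        n_edges = len(nbrs)
        # Edges among the neighbors themselves.
        n_edges += sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        features[node] = (len(nbrs), n_edges)
    return features


def egonet_density(n_neighbors, n_edges):
    """Scale the egonet edge count to [0, 1]: 0 for a pure star, 1 for a clique."""
    if n_neighbors < 2:
        return 0.0  # a single edge is both a star and a clique; report 0 by convention
    star = n_neighbors                               # ego-to-neighbor edges only
    clique = n_neighbors * (n_neighbors + 1) // 2    # all pairs in the egonet
    return (n_edges - star) / (clique - star)


# Hypothetical toy graph: a hub "h" with four leaves, plus a separate triangle.
edges = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"),
         ("x", "y"), ("y", "z"), ("x", "z")]
adj = build_adjacency(edges)
features = egonet_features(adj)
scores = {node: egonet_density(*feat) for node, feat in features.items()}

for node in sorted(scores):
    print(node, features[node], round(scores[node], 2))
```

In a setting like this, nodes whose egonets sit at either extreme (near-stars or near-cliques) relative to the rest of the graph are natural candidates for closer inspection; what counts as "expected" depends on the data.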
Florence University, Italy
Synchronization in Complex Networks, Hypergraphs and Simplicial Complexes
Stefano Boccaletti received his PhD in Physics from the University of Florence in 1995. In October 1998 he was awarded the individual EU grant “Marie Curie” n. ERBFMBICT983466. He is Senior Researcher at the CNR-Institute for Complex Systems, and Honorary Professor of the Weizmann Institute of Science, the Tel Aviv University, the University of Bar Ilan, the University of Navarre, and the Technical University of Madrid. In 2015, he was awarded the PhD honoris causa by the University Rey Juan Carlos of Madrid. Currently, he is the Scientific Attaché at the Italian Embassy in Israel.
Stefano Boccaletti is the author of publications in physics journals that have been cited more than 14,000 times, editor of 4 books and author of 3 others, Editor in Chief of the Elsevier journal Chaos, Solitons and Fractals, and a member of the editorial boards of several other international journals of physics and applied mathematics. He has been invited to about 85 international conferences and seminars as a plenary lecturer or keynote speaker, and he has directly organized 15 workshops.
All interesting and fascinating collective properties of a complex system arise from the intricate way in which its components interact. Various systems in physics, biology, social sciences and engineering have been successfully modelled as networks of coupled dynamical systems, where the graph links stand for pairwise interactions. This is, however, too strong a limitation, as recent studies have revealed that higher-order many-body interactions are present in social groups, ecosystems and in the human brain, and they actually affect the emergent dynamics of all these systems. I will discuss a general framework that allows one to study coupled dynamical systems while accounting for the precise microscopic structure of their interactions at any possible order. Namely, I will consider an ensemble of identical dynamical systems, organized on the nodes of a simplicial complex, and interacting through synchronization-noninvasive coupling functions. The simplicial complex can be of any dimension, meaning that it can account, at the same time, for pairwise interactions (networks), three-body interactions and so on. In such a broad context, a recent collaboration of mine has shown that complete synchronization, a circumstance where all the dynamical units arrange their evolution in unison, exists as an invariant solution, and has given the necessary condition for it to be observed as a stable state in terms of a Master Stability Function. This generalizes the existing results valid for pairwise interactions (i.e. graphs) to the case of complex systems with the most general possible architecture. Moreover, we show how the approach can be simplified for specific, yet frequently occurring, instances, and we verify all our theoretical predictions in synthetic and real-world systems.
Given the completely general character of the method proposed, our results contribute to the theory of dynamical systems with many-body interactions and can find applications in an extremely wide range of practical cases.
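To make the setting of the abstract concrete, one way to write an ensemble of identical units coupled through pairwise and three-body interactions is sketched below. The notation is an assumption introduced for illustration and is not taken from the abstract: \(\mathbf{f}\) is the local dynamics, \(\sigma_d\) are coupling strengths, \(a^{(1)}_{ij}\) and \(a^{(2)}_{ijk}\) are the adjacency entries for links and triangles, and \(\mathbf{g}^{(d)}\) are the coupling functions.

```latex
% Sketch under assumed notation: N identical units on a simplicial complex,
% truncated here at three-body (2-simplex) interactions.
\begin{align}
\dot{\mathbf{x}}_i &= \mathbf{f}(\mathbf{x}_i)
  + \sigma_1 \sum_{j=1}^{N} a^{(1)}_{ij}\, \mathbf{g}^{(1)}(\mathbf{x}_i, \mathbf{x}_j)
  + \sigma_2 \sum_{j=1}^{N} \sum_{k=1}^{N} a^{(2)}_{ijk}\,
      \mathbf{g}^{(2)}(\mathbf{x}_i, \mathbf{x}_j, \mathbf{x}_k)
  + \dots \\
% Non-invasiveness: every coupling term vanishes when its arguments coincide.
\mathbf{g}^{(d)}(\mathbf{x}, \mathbf{x}, \dots, \mathbf{x}) &= \mathbf{0}
  \quad \text{for every order } d \\
% Hence the synchronous state is a solution for any coupling strengths:
\mathbf{x}_1(t) = \dots = \mathbf{x}_N(t) = \mathbf{s}(t),
  &\qquad \dot{\mathbf{s}} = \mathbf{f}(\mathbf{s})
\end{align}
```

Under these assumptions, substituting the synchronous state into the first equation makes every coupling term vanish, which is why complete synchronization exists as an invariant solution; whether it is stable is the separate question addressed by the Master Stability Function.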
KDD Lab, Pisa, Italy
This keynote is jointly sponsored by Applied Network Science, EPJ Data Science, and Social Network Analysis and Mining
Explainable Machine Learning for Trustworthy AI
Fosca Giannotti is a director of research in computer science at the Information Science and Technology Institute “A. Faedo” of the National Research Council, Pisa, Italy. Fosca Giannotti is a pioneering scientist in mobility data mining, social network analysis and privacy-preserving data mining. Fosca leads the Pisa KDD Lab – Knowledge Discovery and Data Mining Laboratory, a joint research initiative of the University of Pisa and ISTI-CNR, founded in 1994 as one of the earliest research labs on data mining. Fosca's research focus is on social mining from big data: smart cities, human dynamics, social and economic networks, ethics and trust, diffusion of innovations.
Fosca has coordinated tens of European projects and industrial collaborations. She is currently the coordinator of SoBigData, the European research infrastructure on Big Data Analytics and Social Mining, an ecosystem of ten cutting-edge European research centres providing an open platform for interdisciplinary data science and data-driven innovation. She has recently become the PI of the ERC Advanced Grant entitled XAI – Science and technology for the explanation of AI decision making. She is a member of the steering board of the CINI-AIIS lab. On March 8, 2019 she was featured as one of the 19 Inspiring women in AI, BigData, Data Science, Machine Learning by KDnuggets.com, the leading site on AI, Data Mining and Machine Learning.
Black box AI systems for automated decision making, often based on machine learning over (big) data, map a user’s features into a class or a score without exposing the reasons why. This is problematic not only for the lack of transparency, but also for possible biases inherited by the algorithms from human prejudices and collection artifacts hidden in the training data, which may lead to unfair or wrong decisions. The future of AI lies in enabling people to collaborate with machines to solve complex problems. Like any efficient collaboration, this requires good communication, trust, clarity and understanding. Explainable AI addresses such challenges, and for years different AI communities have studied this topic, leading to different definitions, evaluation protocols, motivations, and results. This lecture provides a reasoned introduction to the work on Explainable AI (XAI) to date, and surveys the literature with a focus on machine learning and symbolic AI related approaches. We motivate the need for XAI in real-world and large-scale applications, while presenting state-of-the-art techniques and best practices, as well as discussing the many open challenges.
Central European University, Hungary
This keynote is jointly sponsored by Applied Network Science, EPJ Data Science, and Social Network Analysis and Mining
Possibilities and Limitations of using mobile phone data in exploring human behavior
János Kertész obtained his PhD in Physics in 1980 from Eötvös University. He worked at the Research Institute of Technical Physics of the Hungarian Academy of Sciences, at the Cologne University and at Technical University Munich. He has been professor since 1992 at the Budapest University of Technology and Economics, and since 2012 at the Department of Network and Data Science of the Central European University. He was visiting scientist in Germany, US, France, Italy and Finland.
János Kertész is interested in statistical physics and its applications, including percolation theory, phase transitions, fractal growth, granular materials and simulation methods. During the last 15 years his research has focused on multidisciplinary topics, mainly on complex networks as well as on financial analysis and modeling. He has published more than 200 scientific papers. He has been on the editorial boards of Journal of Physics A, Physica A, Fluctuation and Noise Letters, Fractals, and New Journal of Physics. His work has received several recognitions, including the Hungarian Academy Award, the Szent-Györgyi Award of the Ministry of Education and Culture, the Széchenyi Prize and the title of Finland Distinguished Professor.
Big Data, as provided by modern communication systems, offers unprecedented opportunities for research. Mobile phones have become almost like a new organ in addition to our biological ones, and we are practically never without them; hence the analysis of CDRs (Call Detail Records) is particularly important in gaining information about the whereabouts, contacts and activity patterns of people. We will review some of the results from such analyses, including the large-scale structure of society, mobility patterns, the gender and age dependence of interactions, and the bursty character of the activity. We will show that sometimes extremely precise information can be obtained and applied to support theories of social anthropology, e.g., about family relationships. However, CDR data should be used with care, as bias can occur because information from only one communication channel is considered. We analyze this aspect and suggest a general description of such biases.
Queen Mary, University of London, UK
The keynote is sponsored by Entropy
Simplicial model of social contagion
I am Professor of Applied Mathematics, Chair of Complex Systems, and Head of the Complex Systems and Networks Research Group in the School of Mathematical Sciences of QMUL. I am editor of the Journal of Complex Networks, Fellow of the Turing Institute, and External Faculty of the Complexity Hub Vienna. I study the structure and the dynamics of complex systems, using my background as a theoretical physicist and some of the methods proper to statistical physics and non-linear dynamics, to look into biological problems, to model social systems, and to find new solutions for the design of man-made networks. I have coauthored more than 150 scientific publications, including papers in PRL, PNAS, Nature Comm, Science and Physics Reports. My recent grants include: EU LASAGNE (2012-15), EPSRC GALE (2013-16) and EPSRC LoBaNet (2016-2019). I currently hold a Research Fellowship from the Leverhulme Trust to work on the network components of creativity and success.
Complex networks have been successfully used to describe the spread of diseases in populations of interacting individuals. However, pairwise interactions are often not enough to characterize social contagion processes such as opinion formation or the adoption of novelties, where complex mechanisms of influence and reinforcement are at work. I will first discuss a higher-order model of social contagion in which a social system is represented by a simplicial complex and contagion can occur through interactions in groups of different sizes. Numerical simulations of the model on both empirical and synthetic simplicial complexes highlight the emergence of novel phenomena such as a discontinuous transition induced by higher-order interactions. I will show analytically that the transition is discontinuous and that a bistable region appears where healthy and endemic states co-exist. This result can help explain why critical masses are required to initiate social changes. I will then show how the presence of higher-order interactions can affect the stability of a synchronised state in a simplicial complex of coupled dynamical systems.
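The bistability described above can be illustrated with a small numerical sketch. It assumes one commonly used mean-field form for a contagion process with a pairwise channel and a three-body (triangle) channel, dρ/dt = -ρ + λρ(1-ρ) + λ_Δρ²(1-ρ), where ρ is the infected fraction and λ, λ_Δ are rescaled pairwise and three-body infection rates. The equation, the parameter names, and the values chosen are assumptions for this example; the abstract itself does not state them.

```python
import math


def drho_dt(rho, lam, lam_delta):
    """Assumed mean-field rate of change of the infected fraction rho."""
    recovery = -rho
    pairwise = lam * rho * (1 - rho)            # infection through links
    triangle = lam_delta * rho**2 * (1 - rho)   # infection through 2-simplices
    return recovery + pairwise + triangle


def integrate(rho0, lam, lam_delta, dt=0.01, steps=20000):
    """Forward-Euler integration of the mean-field equation from rho0."""
    rho = rho0
    for _ in range(steps):
        rho += dt * drho_dt(rho, lam, lam_delta)
    return rho


def nonzero_fixed_points(lam, lam_delta):
    """Roots of lam_delta*rho^2 + (lam - lam_delta)*rho + (1 - lam) = 0."""
    disc = (lam - lam_delta) ** 2 - 4 * lam_delta * (1 - lam)
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    return ((lam_delta - lam - root) / (2 * lam_delta),
            (lam_delta - lam + root) / (2 * lam_delta))


# Hypothetical parameters: pairwise rate below the usual threshold (lam < 1),
# but with a strong three-body contribution.
lam, lam_delta = 0.8, 2.5
rho_unstable, rho_endemic = nonzero_fixed_points(lam, lam_delta)

low = integrate(0.05, lam, lam_delta)    # small initial seed
high = integrate(0.30, lam, lam_delta)   # seed above the unstable fixed point

print(f"unstable fixed point: {rho_unstable:.4f}")
print(f"endemic fixed point:  {rho_endemic:.4f}")
print(f"from rho0=0.05 -> {low:.6f}")
print(f"from rho0=0.30 -> {high:.6f}")
```

With these assumed parameters the same system ends in either the healthy state or the endemic state depending only on the initial seed, which is the critical-mass effect the abstract refers to.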
Simplicial models of social contagion and The Master Stability Function for Synchronization in Simplicial Complexes
Alex 'Sandy' PENTLAND
MIT Media Lab, USA
Human and Optimal Networked Decision Making in Long-Tailed and Non-stationary Environments
Professor Alex 'Sandy' Pentland directs MIT Connection Science, an MIT-wide initiative, and previously helped create and direct the MIT Media Lab and the Media Lab Asia in India. He is one of the most-cited computational scientists in the world, and Forbes recently declared him one of the "7 most powerful data scientists in the world" along with Google founders and the Chief Technical Officer of the United States. He is on the Board of the UN Foundation's Global Partnership for Sustainable Development Data, co-led the World Economic Forum discussion in Davos that led to the EU privacy regulation GDPR, and was central in forging the transparency and accountability mechanisms in the UN's Sustainable Development Goals. He has received numerous awards and prizes such as the McKinsey Award from Harvard Business Review, the 40th Anniversary of the Internet from DARPA, and the Brandeis Award for work in privacy.
He is a member of advisory boards for the UN Secretary General, the UN Foundation, and the American Bar Association, and previously for Google, AT&T, and Nissan. He is a serial entrepreneur who has co-founded more than a dozen companies including social enterprises such as the Harvard-ODI-MIT DataPop Alliance. He is a member of the U.S. National Academy of Engineering and a leader within the World Economic Forum.
Over the years Sandy has advised more than 70 PhD students. Almost half are now tenured faculty at leading institutions, with another one-quarter leading industry research groups and a final quarter founders of their own companies. Together Sandy and his students have pioneered computational social science, organizational engineering, wearable computing (Google Glass), image understanding, and modern biometrics. His most recent books are Social Physics, published by Penguin Press, and Honest Signals, published by MIT Press.
Human social networks frequently give rise to long-tailed and non-stationary information spreading, but most methods of analysis and decision making assume stationary, concentrated distributions. Similarly, wisdom-of-the-crowd phenomena are usually analyzed as a single trial with a fixed information sharing network, whereas dynamic networks and the importance of long-term repeated-trial performance are major features of human societies. I will discuss new theoretical results on optimal tuning of information sharing networks while accounting for long-tailed distributions. Finally, I will show that these new theoretical results provide a good model for how humans tune their social networks for better performance in non-stationary and long-tailed environments.
Barcelona Supercomputing Center, Spain
The keynote is sponsored by Entropy
Untangling biological complexity: From omics network data to new biomedical knowledge and Data-Integrated Medicine
Prof. Przulj initiated extraction of biomedical knowledge from the wiring patterns (topology, structure) of "Big Data" real-world molecular (omics) and other networks. That is, she views the wiring patterns of large and complex omics networks, disease ontologies, clinical patient data, drug-drug and drug-target interaction networks etc., as a new source of information that complements the genetic sequence data and needs to be mined and meaningfully integrated to gain deeper biomedical understanding. Her recent work includes designing machine learning methods for integration of heterogeneous biomedical and molecular data, applied to advancing biological and medical knowledge. She also applies her methods to economics.
She is a member of the Editorial Boards of Bioinformatics (Oxford Journals), Scientific Reports (Nature Publishing Group) and Frontiers in Genetics (Frontiers), and an Associate Editor of BMC Bioinformatics (BioMed Central). Prof. Przulj is a member of the Scientific Advisory Board of the Helmholtz Centre for Infection Research (HZI / Braunschweig, Germany) and GSK. She was a Proceedings / Area Chair of the Protein Interactions, Molecular Networks and Network Biology tracks at ISMB/ECCB 2015, ISMB 2016 and ISMB/ECCB 2017, and has been the elected Chair of NetBio COSI (ISCB, ISMB) since 2019.
We are faced with a flood of molecular and clinical data. We are measuring interactions between various bio-molecules in a cell that form large, complex systems. Patient omics datasets are also increasingly becoming available. These systems-level network data provide heterogeneous, but complementary information about cells, tissues and diseases. The challenge is how to mine them collectively to answer fundamental biological and medical questions. This is nontrivial because of the computational intractability of many underlying problems on networks (also called graphs), which necessitates the development of heuristic methods for finding approximate solutions.
We develop methods for extracting new biomedical knowledge from the wiring patterns of systems-level, heterogeneous biomedical networks. Our methods uncover the patterns in molecular networks and in the multi-scale network organization indicative of biological function, translating the information hidden in the network topology into domain-specific knowledge. We also introduce a versatile data fusion (integration) framework to address key challenges in precision medicine from biomedical network data: better stratification of patients, prediction of driver genes in cancer, and re-purposing of approved drugs for particular patients and patient groups, including Covid-19 patients. Our new methods stem from novel network science algorithms coupled with graph-regularized non-negative matrix tri-factorization, a machine learning technique for dimensionality reduction and co-clustering of heterogeneous datasets. We utilize our new framework to develop methodologies for performing other related tasks, including disease re-classification from modern, heterogeneous molecular level data, inferring new Gene Ontology relationships, aligning multiple molecular networks, and uncovering new cancer mechanisms.
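For readers unfamiliar with graph-regularized non-negative matrix tri-factorization, a generic form of the objective is sketched below. The symbols are assumptions introduced for illustration and the exact formulation used in the speaker's framework may differ: \(R\) is a relation matrix between two entity types (for example patients and genes), \(G_1\) and \(G_2\) are non-negative cluster-indicator matrices, \(S\) is a compressed representation of \(R\), \(L_i\) is the graph Laplacian of a network over entity type \(i\), and \(\gamma_i\) is a regularization weight.

```latex
% Generic sketch of a graph-regularized tri-factorization objective (assumed notation).
% Sign constraints on S vary between formulations and are left unspecified here.
\min_{G_1 \ge 0,\; G_2 \ge 0,\; S}\;
  \bigl\lVert R - G_1 S G_2^{\top} \bigr\rVert_F^{2}
  \;+\; \sum_{i=1}^{2} \gamma_i \,
  \operatorname{tr}\!\bigl( G_i^{\top} L_i \, G_i \bigr)
```

The first term asks the three factors to reconstruct the relation matrix, which yields the dimensionality reduction and co-clustering mentioned above; the trace terms encourage entities that are connected in the corresponding network to receive similar cluster assignments.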