Graphs naturally represent a host of processes, including interactions between people on social or communication networks, links between webpages on the World Wide Web, protein interactions in biological networks, movement in transportation networks, electricity delivery in smart energy grids, relations in bibliographic data, and many others. In such scenarios, graphs that model real-world networks are typically heterogeneous, multi-modal, and multi-relational.
In the era of big data, as more varieties of interconnected structured and semi-structured data become available, it is increasingly evident that effectively mining and learning from such data requires leveraging the heterogeneous and multi-relational nature of networks.
The objective of this workshop is to bring together researchers from a variety of related areas to discuss commonalities and differences in the challenges faced, survey some of the different approaches, and provide a forum to present and learn about some of the most cutting-edge research in this area. As an outcome, we expect participants to walk away with a better sense of the variety of methods and tools available for heterogeneous network mining and analysis, and an appreciation for some of the interesting emerging applications, as well as the challenges that accompany these applications.
There are many challenges involved in effectively mining and learning from this kind of data, including:
Traditionally, a number of subareas have contributed to this space: communities in graph mining, learning from structured data, statistical relational learning, and, moving beyond subdisciplines in computer science, social network analysis and, more broadly, network science.
Hyper Edge-Based Embedding in Heterogeneous Information Networks
- From Homogeneous to Heterogeneous Network Alignment (Shawn Gu)
The Paradoxes of Social Data
- Robust Overlapping Community Detection via Constrained Egonet Tensor Decomposition (Fatemeh Sheikholeslami)
- SMACD: Semi-supervised Multi-Aspect Community Detection (Ekta Gujral)
- Discovering Hidden Structure in High Dimensional Human Behavioral Data via Tensor Factorization (Homa Hosseinmardi)
Structured Output Models of Recommendations, Activities, and Behavior
Representation Learning on Heterogeneous Networks
- Multifaceted Event Analysis on Cross-Media Network Data (Nitesh Chawla)
Modeling and Measuring Signed Networks
Mining Rich Graphs: Ranking, Classification, and Anomaly Detection
- Highly Accurate Link Prediction in Networks Using Stacked Generalization (Amir Ghasemian)
- Enhancing Spatial Query Results Using Semantics and Multiplex Networks (Manesh Pillai)
- Biological Systems as Heterogeneous Information Networks: A Mini-review and Perspectives (Koki Tsuyuzaki)
- StackSeq2Seq: Dual Encoder Seq2Seq Recurrent Networks (Alessandro Bay)
Carnegie Mellon University
Graph mining is a large research area with various fundamental problems, such as importance ranking, clustering and partitioning, relational classification, link prediction, influence and propagation, etc. When rich data is represented as a graph, in which nodes and/or edges may exhibit different types or properties, how do we think about these fundamental problems? In this talk, I will introduce example work from our group on the ranking, classification, and anomaly mining problems for heterogeneous graphs, motivated by applications from a number of different fields.
Leman Akoglu is an assistant professor of Information Systems at the Heinz College of Carnegie Mellon University. She received her PhD from the Computer Science Department at Carnegie Mellon University in 2012. Her research interests involve algorithmic problems in data mining and applied machine learning, focusing on patterns and anomalies, with applications to fraud and event detection. Dr. Akoglu's research has won 5 publication awards: Best Paper Runner-up at SIAM SDM 2016, Best Paper at SIAM SDM 2015, Best Paper at ADC 2014, Best Paper at PAKDD 2010, and Best Knowledge Discovery Paper at ECML/PKDD 2009. She is also a recipient of the NSF CAREER award (2015) and the Army Research Office Young Investigator award (2013).
University of Notre Dame
Representation learning on networks provides an alternative to feature engineering for designing the feature vectors used by learning algorithms. The goal of representation learning is to embed nodes or (sub-)graphs by learning a mapping to a lower dimensional vector space. However, heterogeneous networks present their own set of challenges for representation learning given the multi-typed nodes and/or links. In addition to the heterogeneity in node and link types, the content associated with the nodes presents yet another challenge. In this talk, I'll discuss our work on representation learning in heterogeneous networks that leverages the network structure, as well as content-aware representation learning that incorporates the content of the nodes in addition to the network structure.
Nitesh Chawla is the Frank M. Freimann Professor of Computer Science and Engineering, and director of the research center on network and data sciences (iCeNSA) at the University of Notre Dame. He started his tenure-track career at Notre Dame in 2007, and quickly advanced from assistant professor to a chaired full professor position in nine years. He is the recipient of several awards, including the 2015 IEEE CIS Outstanding Early Career Award, the IBM Watson Faculty Award, the IBM Big Data and Analytics Faculty Award, the National Academy of Engineering New Faculty Fellowship, and the 1st Source Bank Technology Commercialization Award. He is a two-time recipient of the Outstanding Teaching Award at Notre Dame. His papers have received several outstanding paper nominations and awards at top conferences and journals, including being featured on a journal cover page. In addition, his students have received several honors; recent ones include a runner-up for the Outstanding Dissertation Award at KDD’17 and the second best research award at the ACM Student Research Competition at the Grace Hopper Conference, 2017. In recognition of the societal impact of his research, he was recognized with the Rodney Ganey Award and Michiana 40 Under 40. He is the founder of Aunalytics, a data science software and solutions company.
University of Illinois Urbana-Champaign
In real-world applications, objects of multiple types are interconnected, forming Heterogeneous Information Networks. In such heterogeneous information networks, we make a key observation: many interactions happen due to some event, and the objects in each event form a complete semantic unit. By taking advantage of this property, we propose a generic framework called Hyper Edge-Based Embedding (HEBE) to learn object embeddings with events in heterogeneous information networks, where a hyperedge encompasses the objects participating in one event. The HEBE framework models the proximity among objects in each event with two methods: (1) predicting a target object given the other participating objects in the event, and (2) predicting whether the event can be observed given all the participating objects. Since each hyperedge encapsulates more information about a given event, HEBE is robust to data sparseness and noise. In addition, HEBE scales as the data size grows. Extensive experiments on large-scale real-world datasets show the efficacy and robustness of the proposed framework.
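To make the hyperedge idea concrete, here is a minimal sketch of how one event could be turned into examples for the two prediction tasks named in the abstract. It is an illustration only, not the HEBE implementation: the `(type, id)` object encoding, the function names, the sample bibliographic event, and the use of a corrupted event as an unobserved example are all assumptions that do not come from the abstract.

```python
from typing import FrozenSet, List, Tuple

# An object is a (type, id) pair; an event is the set of objects taking part in it.
Obj = Tuple[str, str]
Hyperedge = FrozenSet[Obj]


def target_prediction_examples(event: Hyperedge) -> List[Tuple[Hyperedge, Obj]]:
    """Task (1): hold out each participant in turn and pair it with the rest."""
    return [(event - {target}, target) for target in sorted(event)]


def corrupt(event: Hyperedge, old: Obj, new: Obj) -> Hyperedge:
    """Swap one participant for another object of the same type.

    Assumption: corrupted events stand in for unobserved events in task (2);
    the abstract does not say how such examples are obtained.
    """
    if old not in event or old[0] != new[0]:
        raise ValueError("must swap a participant for an object of the same type")
    return (event - {old}) | {new}


# Hypothetical bibliographic event: one paper links an author, a venue, and a term.
event: Hyperedge = frozenset({("author", "a1"), ("venue", "v1"), ("term", "t1")})

pairs = target_prediction_examples(event)
observed = (event, 1)  # label 1: the event was seen in the data
unobserved = (corrupt(event, ("venue", "v1"), ("venue", "v2")), 0)  # label 0: assumed unseen

for context, target in pairs:
    print(sorted(context), "->", target)
```

Keeping each event as a single set, rather than splitting it into pairwise edges, preserves the fact that all of its objects occurred together, which is the property the abstract relies on.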
Jiawei Han is Abel Bliss Professor in the Department of Computer Science, University of Illinois at Urbana-Champaign. His research covers data mining, information network analysis, database systems, and data warehousing, with over 900 journal and conference publications. He has chaired or served on many program committees of international data mining and database conferences. He also served as the founding Editor-in-Chief of ACM Transactions on Knowledge Discovery from Data and as the Director of the Information Network Academic Research Center supported by the U.S. Army Research Lab (2009-2016), and has been the co-Director of KnowEnG, an NIH-funded Center of Excellence in Big Data Computing, since 2014. He is a Fellow of ACM and a Fellow of IEEE, and received the 2004 ACM SIGKDD Innovations Award, the 2005 IEEE Computer Society Technical Achievement Award, and the 2009 M. Wallace McDowell Award from the IEEE Computer Society. His co-authored book "Data Mining: Concepts and Techniques" has been widely adopted as a textbook worldwide.
University of Southern California (ISI)
Macro and micro views of social data are often incompatible. In this talk, I will discuss two paradoxes arising from this discrepancy and show how they can bias analysis of social data. First, I discuss Simpson’s paradox, which occurs when an association observed in the data at the macro (population) level disappears or even reverses when the data is disaggregated at the micro level (into its underlying subgroups). I illustrate with several examples showing how the paradox can distort the conclusions of a study, and describe recent algorithmic efforts to address this problem. Second, I discuss counter-intuitive effects in social networks that may significantly distort the observations people make of their friends. These effects include the “strong friendship paradox,” which states that most of your friends have more friends than you do, and its generalizations. As a result of these paradoxes, an opinion that is macroscopically (globally) rare may be dramatically over-represented in the micro (local) neighborhoods of many individuals. This effect, which I call the “majority illusion,” leads individuals to systematically overestimate the prevalence of a minority opinion, and may accelerate the spread of social contagions and adoption of social norms.
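The two network effects described above can be reproduced on a toy graph. The sketch below is an assumption-laden illustration rather than the speaker's analysis: the star-shaped network, the function names, and the choice of the hub as the only holder of the opinion are hypothetical. It counts the nodes for which most friends have more friends than they do, and the nodes that see a globally rare opinion as the majority view among their own friends.

```python
from fractions import Fraction
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]


def build_star(n_leaves: int) -> Graph:
    """Undirected star: node 0 is the hub, nodes 1..n_leaves are leaves."""
    adj: Graph = {0: set(range(1, n_leaves + 1))}
    for leaf in range(1, n_leaves + 1):
        adj[leaf] = {0}
    return adj


def strong_friendship_paradox_nodes(adj: Graph) -> List[int]:
    """Nodes for which more than half of the friends have strictly more friends."""
    result = []
    for node, friends in adj.items():
        more = sum(1 for f in friends if len(adj[f]) > len(friends))
        if friends and 2 * more > len(friends):
            result.append(node)
    return result


def majority_illusion_nodes(adj: Graph, active: Set[int]) -> List[int]:
    """Nodes that see the opinion held by more than half of their friends."""
    result = []
    for node, friends in adj.items():
        seen = sum(1 for f in friends if f in active)
        if friends and 2 * seen > len(friends):
            result.append(node)
    return result


adj = build_star(5)
active = {0}  # hypothetical: only the hub holds the opinion

global_share = Fraction(len(active), len(adj))
paradox = strong_friendship_paradox_nodes(adj)
illusion = majority_illusion_nodes(adj, active)

print("global share of the opinion:", global_share)
print("nodes whose friends mostly have more friends:", paradox)
print("nodes that see the opinion as a local majority:", illusion)
```

In this toy case one node in six holds the opinion, yet every leaf sees it held by all of its friends, because the single well-connected node appears in every leaf's neighborhood.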
Kristina Lerman is Research Team Lead at the University of Southern California Information Sciences Institute and holds a joint appointment as a Research Associate Professor in the USC Computer Science Department. Trained as a physicist, she now applies network analysis and machine learning to problems in computational social science, including crowdsourcing, social network and social media analysis. Her recent work on modeling and understanding cognitive biases in social networks has been covered by the Washington Post, Wall Street Journal, and MIT Tech Review.
University of California San Diego
Predictive models of human behavior--and in particular recommender systems--learn patterns from large volumes of historical activity data, in order to make personalized predictions that adapt to the needs, nuances, and preferences of individuals. Models may take incredibly complex data as *input*, including text, images, social networks, and sequence data. However, the *outputs* they are trained to predict--clicks, purchases, transactions, etc.--are typically simple, numerical quantities, so that the problem can be cast in terms of traditional supervised learning frameworks.
In this talk, we discuss possible extensions to such personalized, predictive models of human behavior so that they are capable of predicting complex structured *outputs*. For example, rather than training a model to predict what content a user might interact with, we could predict how they would react to unseen content, in the form of text they might write. Or, rather than predicting whether a user would purchase an existing product, we could predict the characteristics or attributes of the types of products that *should* be created.
Julian McAuley has been an Assistant Professor in the Computer Science Department at the University of California, San Diego since 2014. Previously he was a postdoctoral scholar at Stanford University after receiving his PhD from the Australian National University in 2011. His research is concerned with developing predictive models of human behavior using large volumes of online activity data.
Michigan State University
Network modeling and measuring are two fundamental tasks for social network analysis. The majority of existing efforts have focused on unsigned networks (or networks with only positive links). However, in many real-world social systems, relations between nodes are better represented as signed networks with positive and negative links, where negative links can denote foes, distrust, "unfriended" friends, or blocked users. It is evident that the key properties and principles of signed networks are distinct from those of unsigned networks. Hence, dedicated efforts are needed for signed networks. In this talk, we will discuss our recent work on signed generative network models and signed node relevance measurements. Both works suggest great opportunities in this new research subfield.
Jiliang Tang has been an assistant professor in the computer science and engineering department at Michigan State University since Fall 2016. Before that, he was a research scientist at Yahoo Research, and he received his PhD from Arizona State University in 2015. He has broad interests in social computing, data mining, and machine learning, and directs the Data Science and Engineering Lab. He received the Best Paper Award at KDD2016, was the runner-up for the Best KDD Dissertation Award in 2015, and was the best paper runner-up at WSDM2013. He has served as an editor and organizer for prestigious journals and conferences. He has published his research in highly ranked journals and top conference proceedings, and this work has received thousands of citations and extensive media coverage.
Robust Overlapping Community Detection via Constrained Egonet Tensor Decomposition
Fatemeh Sheikholeslami and Georgios B. Giannakis
SMACD: Semi-supervised Multi-Aspect Community Detection
Ekta Gujral and Evangelos Papalexakis
From homogeneous to heterogeneous network alignment
Shawn Gu, John Johnson, Fazle Faisal and Tijana Milenkovic
Discovering Hidden Structure in High Dimensional Human Behavioral Data via Tensor Factorization
Homa Hosseinmardi, Hsien-Te Kao, Kristina Lerman and Emilio Ferrara
Multifaceted Event Analysis on Cross-Media Network Data
Daheng Wang, Meng Jiang, Xueying Wang, Nitesh Chawla and Paul Brunts
Enhancing spatial query results using semantics and multiplex networks
Manesh Pillai and George Karabatis
Biological Systems as Heterogeneous Information Networks: A Mini-review and Perspectives
Koki Tsuyuzaki and Itoshi Nikaido
StackSeq2Seq: Dual Encoder Seq2Seq Recurrent Networks
Alessandro Bay and Biswa Sengupta
Highly Accurate Link Prediction in Networks Using Stacked Generalization
Amir Ghasemian, Aram Galstyan and Aaron Clauset
This workshop is a forum for exchanging ideas and methods for heterogeneous networks analysis and mining, developing new common understandings of the problems at hand, sharing of data sets where applicable, and leveraging existing knowledge from different disciplines. The goal is to bring together researchers from academia, industry, and government, to create a forum for discussing recent advances in this area. In doing so, we aim to better understand the overarching principles and the limitations of our current methods and to inspire research on new algorithms and techniques for heterogeneous networks analysis and mining.
To reflect the broad scope of work on heterogeneous networks analysis and mining, we encourage submissions that span the spectrum from theoretical analysis to algorithms and implementation, to applications and empirical studies in various domains. The need for analysis and learning methods that go beyond mining simple graphs is emerging in many disciplines, and these methods are referred to by different names depending on the type of data augmenting the simple graph.
Topics of interest include, but are not limited to:
Heterogeneous networks are becoming a key component of many emerging applications and of data-mining and graph-mining tasks. Related research areas and tasks include:
All papers will undergo single-blind peer review. We welcome many kinds of papers, such as, but not limited to:
Authors should clearly indicate in their abstracts which kind of submission their paper is, to help reviewers better understand its contributions.
Submissions must be in PDF, no more than 8 pages long (shorter papers are welcome), and formatted according to the standard double-column ACM Proceedings Style.
The accepted papers will be published on the workshop’s website and will not be considered archival for resubmission purposes.
Authors whose papers are accepted to the workshop will have the opportunity to participate in a spotlight and poster session, and some may also be chosen for oral presentation. For paper submission, please proceed to the submission website.
Please send enquiries to firstname.lastname@example.org. To receive updates about current and future workshops and other related news, please join the Mailing List, or follow the Twitter Account.
Paper Submission Open: Sep 1, 2017
Paper Submission Deadline: Nov 23, 2017
Dec 20, 2017
Jan 20, 2018
Workshop: Feb 9, 2018
University of Southern California (ISI)
University of California, Los Angeles
University of Notre Dame
Nesreen Ahmed (Intel Research Labs)
Yuxiao Dong (Microsoft Research)
Amir Ghasemian (University of Colorado Boulder)
Srijan Kumar (Stanford University)
Julian McAuley (University of California, San Diego)
Fred Morstatter (University of Southern California)
Maximilian Nickel (Facebook AI Research)
Evangelos Papalexakis (University of California Riverside)
Ali Pinar (Sandia National Laboratories)
Arti Ramesh (Binghamton University)
Neil Shah (Carnegie Mellon University)
Chuan Shi (Beijing University of Posts and Telecommunications)
Jiliang Tang (Michigan State University)
Elena Zheleva (University of Illinois at Chicago)