PERSEUS-HUB: Interactive and Collective Exploration of Large-Scale Graphs
Journal Title: Informatics - Year 2017, Vol 4, Issue 3
Abstract
Graphs emerge naturally in many domains, such as social science, neuroscience, transportation engineering, and more. In many cases, such graphs have millions or billions of nodes and edges, and their sizes increase daily at a fast pace. How can researchers from various domains explore large graphs interactively and efficiently to find out what is "important"? How can multiple researchers explore a new graph dataset collectively and "help" each other with their findings? In this article, we present PERSEUS-HUB, a large-scale graph mining tool that computes a set of graph properties in a distributed manner, performs ensemble, multi-view anomaly detection to highlight regions that are worth investigating, and provides users with uncluttered visualization and easy interaction with complex graph statistics. PERSEUS-HUB uses a Spark cluster to calculate various statistics of large-scale graphs efficiently, and aggregates the results in a summary on the master node to support interactive user exploration. In PERSEUS-HUB, the visualized distributions of graph statistics provide a preliminary analysis of a graph. To perform a deeper analysis, users with little prior knowledge can leverage patterns (e.g., spikes in the power-law degree distribution) marked by other users or experts. Moreover, PERSEUS-HUB guides users to regions of interest by highlighting anomalous nodes, and helps users establish a more comprehensive understanding of the graph at hand. We demonstrate our system through a case study on real, large-scale networks.
Authors and Affiliations
Di Jin, Aristotelis Leventidis, Haoming Shen, Ruowang Zhang, Junyue Wu and Danai Koutra
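The abstract mentions degree distributions and the highlighting of anomalous nodes. As a rough illustration only, the following sketch computes a degree distribution from an edge list and flags nodes whose degree is unusually high using a simple z-score. It is not PERSEUS-HUB's method: the system described above runs on Spark and uses ensemble, multi-view detection, whereas this is a single-machine, single-statistic stand-in. The function names and the `threshold` parameter are assumptions made for the example.

```python
from collections import Counter
from statistics import mean, pstdev


def degree_counts(edges):
    """Return {node: degree} for an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def degree_distribution(degrees):
    """Return {degree: number of nodes with that degree}."""
    return dict(Counter(degrees.values()))


def flag_outliers(degrees, threshold=2.0):
    """Flag nodes whose degree z-score exceeds `threshold`.

    Illustrative stand-in for anomaly detection (an assumption,
    not the ensemble method the abstract describes).
    """
    values = list(degrees.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every node has the same degree, so nothing stands out.
        return []
    return [n for n, d in degrees.items() if (d - mu) / sigma > threshold]


if __name__ == "__main__":
    # Toy star graph: one hub connected to ten leaves.
    star = [("hub", i) for i in range(10)]
    degrees = degree_counts(star)
    print(degree_distribution(degrees))
    print(flag_outliers(degrees))
```

On a real graph, the distribution returned by `degree_distribution` is what would be plotted (typically on log-log axes), and flagged nodes would be the starting points for closer inspection.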