A Parallel Workflow for Online Correlation and Clique-finding -- with applications to finance
This thesis investigates how a state-of-the-art Stochastic Local Search (SLS) algorithm for the maximum clique problem can be adapted for and employed within a fully distributed parallel workflow environment. First we present parallel variants of Dynamic Local Search (DLS-MC) and Phased Local Search (PLS), demonstrating how a simple yet effective multiple independent runs strategy can offer superior speedup performance with minimal communication overhead. We then extend PLS into an online algorithm so that it can operate in a dynamic environment where the input graph is constantly changing, and show that in most cases trajectory continuation is more efficient than restarting the search from scratch. Finally, we embed our new algorithm within a data processing pipeline that performs high-throughput correlation and clique-based clustering of thousands of variables from a high-frequency data stream. For communication within and between system components, we use MPI, the de facto standard API for message passing in high-performance computing. We present algorithmic and system performance results using synthetically generated data streams, as well as a preliminary investigation into the applicability of our system for processing high-frequency, real-life intra-day stock market data in order to determine clusters of stocks exhibiting highly correlated short-term trading patterns.
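The correlation-then-clique idea in the abstract above can be illustrated with a minimal sequential sketch. It is not the thesis's DLS-MC or PLS algorithm and involves no MPI: it assumes a plain Pearson correlation, a hypothetical fixed threshold, and a simple greedy clique heuristic swapped in for the stochastic local search. The names `pearson`, `correlation_graph`, and `greedy_clique` are illustrative only.

```python
from itertools import combinations
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


def correlation_graph(series, threshold):
    """Connect two variables when their correlation reaches the (assumed) threshold."""
    adj = {name: set() for name in series}
    for a, b in combinations(series, 2):
        if pearson(series[a], series[b]) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def greedy_clique(adj):
    """Greedy heuristic: repeatedly add the candidate with most neighbours among candidates."""
    clique = []
    candidates = set(adj)
    while candidates:
        v = max(sorted(candidates), key=lambda u: len(adj[u] & candidates))
        clique.append(v)
        candidates &= adj[v]  # keep only vertices adjacent to every clique member
    return clique


series = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [1.1, 2.0, 3.2, 3.9],
    "D": [4, 3, 2, 1],
}
cluster = greedy_clique(correlation_graph(series, threshold=0.9))
```

A greedy pass like this returns a maximal clique, which need not be a maximum one; that gap is what local search methods such as those named in the abstract aim to close.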
The Evolution of the Commuting Network in Germany: Spatial and Connectivity Patterns
Co-authored with A. Reggiani, P. Nijkamp and F.-J. Bade, published in the 'Journal of Transport and Land Use', 2010
The analysis of the structure and evolution of (complex) networks has recently received considerable attention. Although research on networks originated in mathematics studies dating back to the 19th century (or earlier), and developed further in the mid-1900s with contributions to graph theory, there is nowadays growing interest in its application to the social sciences, particularly in regional science and transportation, because of the spatial relevance of networks.
This paper presents a dynamic outlook for the German commuting network from the perspective of the German labour market districts. The focus of this paper is to explore how the German commuting network evolves, from two perspectives: space and connectivity. We consider home-to-work commuters moving between 439 German districts for the years 1995 and 2005. The results of the present analysis allow us to identify, amongst the main German districts, the most ‘open’ and connected ones. These emerging districts can be considered as candidates for a position as ‘hubs’ in the German commuting system, that is, attractors in the spatial economic perspective, as well as interconnectors in a network perspective.
Spatial and Commuting Networks: A Unifying Perspective
Co-authored with A. Reggiani, P. Nijkamp and F.-J. Bade, published in A. Reggiani and P. Nijkamp (eds) 'Complexity, Evolution and Learning: In Search of Simplicity', Springer, 2009
An Application of Complex Network Theory to German Commuting Patterns
Co-authored with S.P. Gorman, A. Reggiani, P. Nijkamp, R. Kulkarni and F.-J. Bade, published in T. Friesz (ed.), 'Network Science, Nonlinear Science and Infrastructure Systems', Springer, 2007
Simulating the structure and evolution of complex networks is an area that has recently received considerable attention. Most of this research has grown out of the physical sciences, but there is growing interest in their application to the social sciences, especially regional science and transportation. This paper presents a network structure simulation experiment utilizing a gravity model to identify interactions embodied in socio-economic processes. In our empirical case, we consider home-to-work commuting patterns among 439 German labour market districts. Specifically, the paper examines first the connectivity distribution of the German commuting network. The paper next develops a spatial interaction model to estimate the structure and flows in the network concerned. The focus of this paper is to examine how well the spatial interaction model replicates the structure of the German commuting network as compared to complex network models. Finally, the structure of the physical German road network is compared to the spatial flows of commuters across it for a tentative supply-demand comparison.
Network Analysis of Commuting Flows: A Comparative Static Approach to German Data
Co-authored with A. Reggiani, S.P. Gorman, P. Nijkamp and F.-J. Bade, published in 'Networks and Spatial Economics', 2007. ISI impact factor (2010): 0.940
The analysis of complex networks has recently received considerable attention. The work by Albert and Barabási presented a research challenge to network analysis, that is, growth of the network. The present paper offers a network analysis of the spatial commuting network in Germany. First, we study the spatial evolution of the commuting network over time. Secondly, we compare two spatial interaction model (SIM) specifications, in order to replicate the actual network structure. Our findings suggest that the commuting network appeared to become more dense and clustered, while the SIMs seem to require more sophisticated specifications, in order to replicate such a connectivity structure.
Small-World Phenomena in Communications Networks: A Cross-Atlantic Comparison
Co-authored with L.A. Schintler, S.P. Gorman, A. Reggiani and P. Nijkamp, published in A. Reggiani and L.A. Schintler (eds), 'Methods and Models in Transport and Communications. Cross Atlantic Perspectives', Springer, 2005
Complex Network Phenomena in Telecommunication Networks
Co-authored with L.A. Schintler, S.P. Gorman, A. Reggiani and P. Nijkamp, published in 'Networks and Spatial Economics', 2005. ISI impact factor (2010): 0.940
Many networks such as the Internet have been found to possess scale-free and small-world network properties reflected by power-law distributions. Scale-free properties evolve in large complex networks through self-organizing processes and, more specifically, preferential attachment. New nodes in a network tend to attach to other vertices that are already well-connected. Because traffic is routed mainly through a few highly connected and concentrated vertices, the diameter of the network is small in comparison to other network structures, and movement through the network is therefore efficient. At the same time, this efficiency feature puts scale-free networks at risk for becoming disconnected or significantly disrupted when super-connected nodes are removed, either unintentionally or through a targeted attack or external force.
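The preferential attachment mechanism described above (new nodes tend to attach to already well-connected vertices) can be sketched as a small growth simulation. This is a generic Barabási–Albert style illustration, not the procedure used in the paper; the function name and the parameters `n`, `m`, and `seed` are assumptions made for the example.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Grow a graph to n nodes; each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Seed graph: a complete graph on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears here once per incident edge, so a uniform draw
    # from this list is a degree-proportional draw over nodes.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        for target in chosen:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


edges = preferential_attachment(n=200, m=2)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
```

In runs of this kind a few early nodes usually accumulate far more links than the rest, which is the hub structure that makes such networks efficient to traverse and sensitive to the removal of highly connected nodes.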
The present paper will examine and compare properties of telecommunication networks for both the United States and Europe. Both types of networks will be examined in terms of their network topology and specifically, whether or not they are scale-free networks to be further explored by identifying and plotting power-law
A Spatial Query & Analysis Tool for Architects
by Ben Doherty
Ben Doherty, Dan Rumery, Ben Barnes, Bin Zhou, Presented at SimAUD 2012
There exists a lack of 'off the shelf' and user-friendly
computational tools that allow architects and other design
consultants to quickly analyse and simulate circulation
patterns of buildings. Other developments of such tools
have so far failed to penetrate the mainstream market for
architectural design software. The research presented in this
paper focuses on the development of a graph-based building
navigation and distance measurement tool; the Spatial
Analysis and Query Tool (SQ&AT), that plugs into the most
common design documentation software (Autodesk Revit)
with the ability for extension into other tools.
The resulting software allows users to test point-to-point
shortest distances, produces several grid based metrics and
allows scenarios to be built. These results may expose areas
where the designer’s intuition and bias have led them to make
inaccurate assumptions about quantitative design aspects.
On the privacy and utility of anonymized social networks
Proc. of the 13th International Conference on Information Integration and Web-based Applications and Services.
You are on Facebook or you are out. Of course, this assessment is controversial and its rationale arguable. It is nevertheless not far, for many of us, from the reason behind our joining social media and publishing and sharing details of our professional and private lives. Not only the personal details we may reveal but also the very structure of the networks themselves are sources of invaluable information for any organization wanting to understand and learn about social groups, their dynamics and their members. These organizations may or may not be benevolent. It is therefore important to devise, design and evaluate solutions that guarantee some privacy. One approach that attempts to reconcile the different stakeholders' requirements is the publication of a modified graph. The perturbation is hoped to be sufficient to protect members' privacy while it maintains sufficient utility for analysts wanting to study the social media as a whole. It is necessarily a compromise. In this paper we try to empirically quantify the inevitable trade-off between utility and privacy. We do so for one state-of-the-art graph anonymization algorithm that protects against most structural attacks, the k-automorphism algorithm. We measure several metrics for a series of real graphs from various social media before and after their anonymization under various settings.
A Data Parallel Minimum Spanning Tree Algorithm for Most Graphics Processing Units
Proc. of the Annual Intl Conf. on Advances in Distributed and Parallel Computing (ADPC'10), Singapore.
We propose a fast data parallel minimum spanning tree algorithm designed for general purpose computation graphical processing units (GPU). Our algorithm is a data parallel version of Borůvka's minimum spanning tree algorithm. Its gist is a synchronization on the central processing unit after each of the parallel iterations computing the components and their outgoing edge minimum weight. Our implementation uses both BrookGPU and CUDA from NVIDIA as programming environments and the performance of our algorithm was evaluated in comparison with the state-of-the-art algorithms on different types of datasets. The experimental results show that our algorithm outperforms other algorithms substantially in terms of execution time, with up to tenfold speedup.
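For readers unfamiliar with Borůvka's algorithm, the following is a minimal sequential sketch of the iteration the abstract refers to: in each round every component finds its cheapest outgoing edge, then components are merged. It is not the authors' GPU implementation; the edge format `(weight, u, v)` and the index-based tie-breaking are assumptions made so the example stays correct with equal weights.

```python
def boruvka_mst(n, edges):
    """Return the edges of a minimum spanning forest.

    n     -- number of vertices, labelled 0..n-1
    edges -- list of (weight, u, v) tuples for an undirected graph
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    components = n
    while components > 1:
        cheapest = {}  # component root -> ((weight, edge index), u, v)
        for idx, (w, u, v) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge lies inside one component
            key = (w, idx)  # edge index breaks ties between equal weights
            for root in (ru, rv):
                if root not in cheapest or key < cheapest[root][0]:
                    cheapest[root] = (key, u, v)
        if not cheapest:
            break  # graph is disconnected; nothing left to merge
        for (w, _), u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                mst.append((w, u, v))
                components -= 1
    return mst


tree = boruvka_mst(4, [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)])
```

The per-component search for a cheapest edge is independent across components, which is what makes each round a natural candidate for data parallel execution.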
Fast Random Graph Generation
Proc. of the 14th Intl Conf. on Extending Database Technology (EDBT'11), Uppsala, Sweden.
Today, several database applications call for the generation of random graphs. A fundamental, versatile random graph model adopted for that purpose is the Erdős–Rényi Γv,p model. This model can be used for directed, undirected, and multipartite graphs, with and without self-loops; it induces algorithms for both graph generation and sampling, hence is useful not only in applications necessitating the generation of random structures but also for simulation, sampling and in randomized algorithms. However, the commonly advocated algorithm for random graph generation under this model performs poorly when generating large graphs, and fails to make use of the parallel processing capabilities of modern hardware. In this paper, we propose PPreZER, an alternative, data parallel algorithm for random graph generation under the Erdős–Rényi model, designed and implemented in a graphics processing unit (GPU). We are led to this chief contribution of ours via a succession of seven intermediary algorithms, both sequential and parallel. Our extensive experimental study shows an average speedup of 19 for PPreZER with respect to the baseline algorithm.
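To make the performance issue concrete, here is a hedged sequential sketch contrasting a coin-flip-per-edge generator with a generator that jumps over non-edges using geometrically distributed skips. It is not PPreZER or any of the paper's intermediary algorithms; it assumes a directed graph with self-loops on `v` vertices so that candidate edges can be indexed 0 to v*v - 1, and the function names are hypothetical.

```python
import math
import random


def er_baseline(v, p, rng):
    """One random draw per candidate edge: v*v draws regardless of p."""
    return [(i // v, i % v) for i in range(v * v) if rng.random() < p]


def er_skip(v, p, rng):
    """Draw the gap to the next selected edge instead of testing every edge.

    The number of rejected candidates before the next accepted one is
    geometric: P(skip = k) = (1 - p)**k * p.
    """
    total = v * v
    if p <= 0:
        return []
    if p >= 1:
        return [(i // v, i % v) for i in range(total)]
    log_q = math.log(1.0 - p)
    edges = []
    i = -1
    while True:
        r = rng.random()  # uniform in [0, 1), so 1 - r is in (0, 1]
        skip = int(math.log(1.0 - r) / log_q)
        i += skip + 1
        if i >= total:
            break
        edges.append((i // v, i % v))
    return edges


rng = random.Random(42)
sample = er_skip(100, 0.05, rng)
```

With the skipping variant the number of random draws is proportional to the number of edges produced rather than to the number of candidate edges, which matters most for large sparse graphs.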
Scalable parallel minimum spanning forest computation
Proc. of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming (PPoPP'12), New Orleans, LA, USA
The proliferation of data in graph form calls for the development of scalable graph algorithms that exploit parallel processing environments. One such problem is the computation of a graph's minimum spanning forest (MSF). Past research has proposed several parallel algorithms for this problem, yet none of them scales to large, high-density graphs. In this paper we propose a novel, scalable, Parallel MSF Algorithm (PMA) for undirected weighted graphs. Our algorithm leverages Prim's algorithm in a parallel fashion, concurrently expanding several subsets of the computed MSF. Our effort focuses on minimizing the communication among different processors without constraining the local growth of a processor's computed subtree. In effect, we achieve a scalability that previous approaches lacked. We implement our algorithm in CUDA, running on a GPU, and study its performance using real and synthetic, sparse as well as dense, structured and unstructured graph data. Our experimental study demonstrates that our algorithm outperforms the previous state-of-the-art GPU-based MSF algorithm, while being several orders of magnitude faster than sequential CPU-based algorithms.
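As background for the abstract above, this is a minimal sequential sketch of Prim's algorithm extended to a forest by restarting from every unvisited vertex. It is not the PMA described in the paper, which grows several subtrees concurrently on a GPU; the adjacency format `adj[u] = [(weight, v), ...]` and the helper `build_adj` are assumptions for the example.

```python
import heapq


def build_adj(n, edges):
    """Hypothetical helper: undirected adjacency lists from (weight, u, v) tuples."""
    adj = [[] for _ in range(n)]
    for w, u, v in edges:
        adj[u].append((w, v))
        adj[v].append((w, u))
    return adj


def prim_msf(n, adj):
    """Return minimum spanning forest edges as (weight, u, v) tuples."""
    visited = [False] * n
    forest = []
    for root in range(n):
        if visited[root]:
            continue  # already covered by an earlier tree
        visited[root] = True
        heap = [(w, root, v) for w, v in adj[root]]
        heapq.heapify(heap)
        while heap:
            w, u, v = heapq.heappop(heap)
            if visited[v]:
                continue  # stale entry: v joined the tree via a cheaper edge
            visited[v] = True
            forest.append((w, u, v))
            for w2, x in adj[v]:
                if not visited[x]:
                    heapq.heappush(heap, (w2, v, x))
    return forest


adj = build_adj(5, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (5, 3, 4)])
forest = prim_msf(5, adj)
```

Each tree here grows from a single frontier held in one heap; expanding several such frontiers at once is where coordination between processors becomes the central design question.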
An Efficient Hierarchical Parallel Genetic Algorithm for Graph Coloring Problem
! NOMINATED FOR BEST PAPER AWARD AT GECCO 2011 !
R. Abbasian and M. Mouhoub. An efficient hierarchical parallel genetic algorithm for graph coloring problem, 13th Annual Genetic and Evolutionary Computation Conference (GECCO 2011), ACM, pages 521-528, Dublin, Ireland, July 12-16, 2011. Also presented at the International Joint Conferences on Artificial Intelligence (IJCAI 2011), RCRA, July 2011.
Graph coloring problems (GCPs) are constraint optimization problems with various applications including scheduling, timetabling, and frequency allocation. The GCP consists in finding the minimum number of colors for coloring the graph vertices such that adjacent vertices have distinct colors. We propose a parallel approach based on Hierarchical Parallel Genetic Algorithms (HPGAs) to solve the GCP. We also propose a new extension to PGA, namely a Genetic Modification (GM) operator designed for solving constraint optimization problems by taking advantage of the properties between variables and their relations. Our proposed GM for solving the GCP is based on a novel Variable Ordering Algorithm (VOA). In order to evaluate the performance of our new approach, we have conducted several experiments on GCP instances taken from the well-known DIMACS website. The results show that the proposed approach performs well, in both running time and solution quality, on these instances. The quality of the solution is measured here by comparing the returned solution with the optimal one.
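The constraint stated above (adjacent vertices must receive distinct colors) and the role a vertex ordering can play are easy to show with a small greedy baseline. This is not the HPGA, the GM operator, or the paper's VOA; the largest-degree-first ordering and the function names are assumptions chosen for illustration.

```python
def greedy_coloring(adj, order=None):
    """Assign each vertex the smallest color not used by its colored neighbours.

    adj   -- dict mapping vertex -> set of neighbouring vertices
    order -- optional vertex ordering; defaults to largest degree first
    """
    if order is None:
        order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color


def is_proper(adj, color):
    """Check the GCP constraint: no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


# A 5-cycle needs three colors because it is an odd cycle.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
coloring = greedy_coloring(cycle)
```

A greedy pass gives a valid coloring and an upper bound on the number of colors, but its result depends on the ordering, which is why search-based methods are used when the minimum is wanted.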
Complexity of Social Network Anonymization
by Sean Chester
Social network privacy paper co-authored with Bruce M. Kapron, Gautam Srivastava, and Venkatesh Srinivasan. To appear in an upcoming issue of the Springer journal Social Network Analysis and Mining (SNAM).
With an abundance of social network data being released, the need to protect sensitive information within these networks has become an important concern of data publishers. To achieve this objective, various notions of k-anonymization have been proposed for social network graphs. In this paper we focus on the complexity of optimization problems that arise from trying to anonymize graphs, establishing that optimally k-anonymizing the label sequences of edge-labeled graphs is intractable. We show how this result implies intractability for other notions of k-anonymization in literature.
We also consider the case of bipartite social network graphs which arise from the representation of distinct entities, such as movies and viewers, patients and drugs, or products and customers. Within this setting we demonstrate that, although k-anonymizing edge-labeled graphs is intractable for k ≥ 3, polynomial time algorithms exist for arbitrary bipartite graphs when k = 2 and for unlabeled bipartite graphs irrespective of the value of k.
Finally, in this paper we extend the study of attribute disclosure within the context of social networks by defining t-closeness, a measure of how effectively an adversary can determine sensitive information about members of a k-anonymous social network.
Jump Liars and Jourdain's Card via the Relativized T-scheme
by Ming Xiong
Studia Logica, Vol. 91 (2), pp. 239-271, December 2009.
A relativized version of Tarski’s T-scheme is introduced as a new principle of the truth predicate. Under the relativized T-scheme, the paradoxical objects, such as the Liar sentence and Jourdain’s card sequence, are found to have certain relative contradictoriness. That is, they are contradictory only in some frames in the sense that any valuation admissible for them in these frames will lead to a contradiction. It is proved that for any positive integer n, the n-jump liar sentence is contradictory in and only in those frames containing at least an n-jump odd cycle. In particular, the Liar sentence is contradictory in and only in those frames containing at least an odd cycle. The Liar sentence is also proved to be less contradictory than Jourdain’s card sequence: the latter must be contradictory in those frames where the former is so, but not vice versa. Generally, the relative contradictoriness is the common characteristic of the paradoxical objects, but different paradoxical objects may have different relative contradictoriness.
Comparative Cognition: Politics of International Control of the Oceans
by Jeffrey Hart
published in Robert Axelrod (ed.), Structure of Decision (Princeton, N.J.: Princeton University Press, 1976).
Symmetry and polarization in the European international system, 1870-1879: a methodological study
by Jeffrey Hart
Journal of Peace Research, no. 3 (1974), 229-44.
