Network science, big data analytics, and deep learning: An interdisciplinary approach to the study of citation, social and collaboration networks

Yifan Qian

September 2021

Abstract

Over the last few decades, networks have played an increasingly important role in multiple scientific domains, ranging from social science to physics and computer science. This thesis focuses on three types of networks (citation networks, social networks, and collaboration networks) and combines theories and methods from network science, sociology, machine learning, and data science. Specifically, I present four projects organized into two research clusters: social capital and deep learning. In the first project, I develop new measures of network effective size, namely intra- and inter-brokerage, based on non-topological properties of nodes in directed and weighted networks; these measures provide finer-grained perspectives on social capital. In the second project, I explore the social capital of cities, extracted from the collaboration patterns of their resident scientists and their external collaborators, by combining four large-scale bibliometric data sets. The results suggest that the relationship between the (internal or external) brokerage and the scientific performance of cities is moderated by internal or external strong ties and by the cities’ geographical diversity. In the third project, I show that the classification performance of Graph Convolutional Networks (GCNs) is related to the alignment among features, graph, and ground truth, which I quantify using a subspace alignment measure corresponding to the Frobenius norm of the matrix of pairwise chordal distances between three subspaces associated with the three ingredients. The proposed measure is based on the principal angles between subspaces and has both spectral and geometrical interpretations. In the fourth project, I show that, if additional relational information is not available in the data set, one can improve classification by constructing geometric graphs from the features themselves and using them within a GCN. I also show that such feature-derived graphs increase the alignment of the data to the ground truth while improving class separation.
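
As an illustration of the subspace alignment measure described for the third project, the following is a minimal sketch, not the thesis implementation. It assumes each of the three ingredients (features, graph, ground truth) has already been reduced to a matrix whose columns span the associated subspace; how those subspaces are chosen is not specified in this abstract. The function names `orthonormal_basis`, `chordal_distance`, and `subspace_alignment` are hypothetical, and the chordal distance is taken in its common form, the square root of the sum of squared sines of the principal angles.

```python
import numpy as np


def orthonormal_basis(M):
    """Orthonormal basis for the column space of M (assumes full column rank)."""
    Q, _ = np.linalg.qr(M)  # reduced QR: Q has the same shape as M
    return Q


def chordal_distance(A, B):
    """Chordal distance between the column spaces of A and B.

    The cosines of the principal angles are the singular values of Qa^T Qb,
    and the distance is sqrt(sum_i sin^2(theta_i)).
    """
    Qa, Qb = orthonormal_basis(A), orthonormal_basis(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)  # guard against round-off
    return float(np.sqrt(np.sum(1.0 - cosines**2)))


def subspace_alignment(X, A, Y):
    """Frobenius norm of the 3x3 matrix of pairwise chordal distances.

    X, A, Y are assumed to be matrices whose columns span the subspaces
    associated with the features, the graph, and the ground truth.
    """
    subspaces = [X, A, Y]
    D = np.zeros((3, 3))
    for i in range(3):
        for j in range(i + 1, 3):
            D[i, j] = D[j, i] = chordal_distance(subspaces[i], subspaces[j])
    return float(np.linalg.norm(D, "fro"))
```

Under this sketch, a smaller value indicates that the three subspaces are closer to one another, and the value is zero when all three coincide.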

Type

Thesis

Publication

Queen Mary University of London
