Full Record

New Search | Similar Records

Title Link Prediction and Denoising in Networks
Publication Date
Date Accessioned
Degree PhD
Discipline/Department Statistics
Degree Level doctoral
University/Publisher University of Michigan
Abstract Network data represent connections between units of interests, but are often noisy and/or include missing values. This thesis focuses on denoising network data via inferring underlying network structure from an observed noisy realization. The observed network data can be viewed as a single random realization of an unobserved latent structure, and our general approach to estimating this latent structure is based factorizing it into a product of interpretable components, with structural assumptions on the components determined by the nature of the problem. We first study the problem of predicting links when edge features are available, or node features that can be converted into edge features. We propose a regression-type model to combine information from network structure and edge features. We show that estimating parameters in this model is straightforward and the estimator enjoys excellent theoretical performance guarantees. Another direction we study is predicting links in time-stamped dynamic networks. A common approach to modeling networks observed over time is aggregating the networks to a few snapshots, which reduces computational complexity, but also loses information. We address this limitation through a dynamic network model based on tensor factorization, which simultaneously captures time trends and the graph structure of dynamic networks without aggregating over time. We develop an efficient algorithm to fit this model and demonstrate the method performs well numerically. The last contribution of this thesis is link prediction for ego-networks. Ego-networks are constructed by recording all friends of a particular user, or several users, which is widely used in survey-based social data collection. There are many methods for filling in missing data in a matrix when entries are missing independently at random, but here it is more appropriate to assume that whole rows of the matrix are missing (corresponding to users), whereas other rows are observed completely. We develop an approach to estimate missing links in this scenario via subspace estimation, exploiting potential low-rank structure common in networks. We obtain theoretical bounds on the estimator's performance and demonstrate it significantly outperforms many widely used benchmarks in both simulated and real networks.
Subjects/Keywords Network data; Link prediction; Low-rank approximation; Statistics and Numeric Data; Science
Contributors Levina, Elizaveta (committee member); Zhu, Ji (committee member); Scott, Clayton D (committee member); Tewari, Ambuj (committee member)
Language en
Rights Unrestricted
Country of Publication us
Record ID handle:2027.42/138596
Repository umich
Date Retrieved
Date Indexed 2019-03-07
Grantor University of Michigan, Horace H. Rackham School of Graduate Studies
Issued Date 2017-01-01 00:00:00
Note [thesisdegreename] PHD; [thesisdegreediscipline] Statistics; [thesisdegreegrantor] University of Michigan, Horace H. Rackham School of Graduate Studies; [bitstreamurl] https://deepblue.lib.umich.edu/bitstream/2027.42/138596/1/yjwu_1.pdf;

Sample Images | Cited Works