Advanced search options

You searched for `+publisher:"University of Texas – Austin" +contributor:("Negahban, Sahand")`

. One record found.

▼ Search Limiters

University of Texas – Austin

1. -3269-6167. Graph analytics and subset selection problems in machine learning.

Degree: Electrical and Computer Engineering, 2018, University of Texas – Austin

URL: http://hdl.handle.net/2152/68499

In this dissertation we examine two topics relevant to modern machine learning research: 1) Subgraph counting and 2) High-dimensional subset selection. The former can be used to construct features for performing graph analytics. The latter has applications in sparse modeling such as feature selection, sparse regression, and interpretable machine learning. Since these problems become intractable for large datasets, we design efficient approximation algorithms for both tasks with data-dependent performance guarantees.
In the first part of the dissertation, we study the problem of approximating all three and four node induced subgraphs in a large graph. These counts are called the 3 and 4-profile, respectively, and describe a graph's connectivity properties. This problem generalizes graphlet counting and has found applications ranging from bioinformatics to spam detection. Our algorithms use the novel concept of graph profile sparsifiers: sparse graphs that can be used to approximate the full profile counts for a given large graph. We obtain novel concentration results showing that graphs can be substantially sparsified and still retain good approximation quality for the global graph profile. We also study the problem of counting local and ego profiles centered at each vertex of the graph. These quantities embed every vertex into a low-dimensional space that characterizes the local geometry of its neighborhood. We introduce the concept of edge pivots and show that all local 3 and 4-profiles can be computed as vertex programs using compressed two-hop information. Our algorithms are local, distributed message-passing schemes and compute all graph profiles in parallel. We empirically evaluate these algorithms with a distributed GraphLab implementation, and show improvements over previous state-of-the-art in experiments scaling up to 640 cores on Amazon EC2.
In the second part we shift to the problem of subset selection: for example, selecting a few features from a large feature set.
Motivated by the need for interpretable, nonlinear regression models for high-dimensional data, we draw a novel connection between this and submodular maximization. We extend an earlier concept of weak submodularity from the setting of sparse linear regression to a broad class of objective functions, including generalized linear model likelihoods. We then show that three greedy algorithms (Oblivous, Forward Stepwise, and Orthogonal Matching Pursuit) perform within a constant factor from the best possible subset. Our methods do not require any statistical modeling assumptions and allow direct control over the number of obtained features. This contrasts with other work that uses regularization parameters to control sparsity only implicitly. Our proof technique connecting convex analysis and submodular set function theory may be of independent interest for other statistical learning applications that have combinatorial structure.
In the third part, we consider the problem of explaining the predictions of a given black-box classifier. For…
*Advisors/Committee Members: Vishwanath, Sriram (advisor), Dimakis, Alexandros G. (advisor), Shakkottai, Sanjay (committee member), Caramanis, Constantine (committee member), Negahban, Sahand (committee member).*

Subjects/Keywords: Machine learning; Approximation algorithms; Graph analytics; Graph algorithms; Subset selection; Submodular optimization; Weak submodularity; Restricted strong convexity; Streaming algorithms; Interpretability

Record Details Similar Records

❌

APA · Chicago · MLA · Vancouver · CSE | Export to Zotero / EndNote / Reference Manager

APA (6^{th} Edition):

-3269-6167. (2018). Graph analytics and subset selection problems in machine learning. (Thesis). University of Texas – Austin. Retrieved from http://hdl.handle.net/2152/68499

Note: this citation may be lacking information needed for this citation format:

Author name may be incomplete

Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16^{th} Edition):

-3269-6167. “Graph analytics and subset selection problems in machine learning.” 2018. Thesis, University of Texas – Austin. Accessed June 16, 2019. http://hdl.handle.net/2152/68499.

Note: this citation may be lacking information needed for this citation format:

Author name may be incomplete

Not specified: Masters Thesis or Doctoral Dissertation

MLA Handbook (7^{th} Edition):

-3269-6167. “Graph analytics and subset selection problems in machine learning.” 2018. Web. 16 Jun 2019.

Note: this citation may be lacking information needed for this citation format:

Author name may be incomplete

Vancouver:

-3269-6167. Graph analytics and subset selection problems in machine learning. [Internet] [Thesis]. University of Texas – Austin; 2018. [cited 2019 Jun 16]. Available from: http://hdl.handle.net/2152/68499.

Note: this citation may be lacking information needed for this citation format:

Author name may be incomplete

Not specified: Masters Thesis or Doctoral Dissertation

Council of Science Editors:

-3269-6167. Graph analytics and subset selection problems in machine learning. [Thesis]. University of Texas – Austin; 2018. Available from: http://hdl.handle.net/2152/68499

Author name may be incomplete

Not specified: Masters Thesis or Doctoral Dissertation