
You searched for +publisher:"University of Texas – Austin" +contributor:("Sanghavi, Sujay"). One record found.


1. -7585-6925. Distributed and dynamic factor modeling of online data.

Degree: Electrical and Computer Engineering, 2017, University of Texas – Austin

The domain of data mining and machine learning has expanded rapidly in recent years to include both large-scale distributed and streaming computation. Although many open-source and cloud-based frameworks are available for these tasks, many of which are used in production by industry, this is a rapidly evolving technology landscape, and the gap between academic algorithm development and discovery and the code available for use with real-world data has grown. In addition, although there is a rich history of mathematical models for streaming data on continuous vector spaces, there has been significantly less work on streaming data in discrete spaces. However, much if not most of the data available online is composed of high-dimensional sparse counts, such as text corpora and interaction networks. We attempt to help bridge this gap by extending promising Bayesian Poisson factorization and co-factorization models that can be used, for example, to model not only text corpora but also related user interactions in a social network. We construct a dependent process prior that enables dynamic latent factor modeling in the natural probability space of the factors, rather than in the raw data. These models are then scaled to and implemented for distributed compute systems and streaming data. We develop an adaptive hashing method (AdaHash) for lambda architectures that can use latent factors calculated during periodic batch mode updates as a similarity metric for hierarchical grouping, or for finding similar factors to reconcile parameters in a distributed compute scenario. In addition, we develop a novel Hidden Markov variant using particle filters to update prior factors and probabilistically group with new factors in a dynamic inference model (D-GaPS). We show experimentally that the distributed model converges to factors similar to those of single-process inference, and that the dynamic model yields higher-quality topics than batch mode alternatives. Empirical studies are presented on a U.S. Senate voting and bill summary data set that is readily interpretable with regard to latent factors.

Advisors/Committee Members: Ghosh, Joydeep (advisor), Khurshid, Sarfraz (committee member), Julien, Christine (committee member), Sanghavi, Sujay (committee member), Chakrabarti, Deepayan (committee member).

Subjects/Keywords: Distributed clustering; Dynamic clustering; Matrix factorization; Co-factorization


APA (6th Edition):

-7585-6925. (2017). Distributed and dynamic factor modeling of online data. (Thesis). University of Texas – Austin. Retrieved from http://hdl.handle.net/2152/62065

Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

-7585-6925. “Distributed and dynamic factor modeling of online data.” 2017. Thesis, University of Texas – Austin. Accessed March 25, 2019. http://hdl.handle.net/2152/62065.

Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Not specified: Masters Thesis or Doctoral Dissertation

MLA Handbook (7th Edition):

-7585-6925. “Distributed and dynamic factor modeling of online data.” 2017. Web. 25 Mar 2019.

Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete

Vancouver:

-7585-6925. Distributed and dynamic factor modeling of online data. [Internet] [Thesis]. University of Texas – Austin; 2017. [cited 2019 Mar 25]. Available from: http://hdl.handle.net/2152/62065.

Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Not specified: Masters Thesis or Doctoral Dissertation

Council of Science Editors:

-7585-6925. Distributed and dynamic factor modeling of online data. [Thesis]. University of Texas – Austin; 2017. Available from: http://hdl.handle.net/2152/62065

Note: this citation may be lacking information needed for this citation format:
Author name may be incomplete
Not specified: Masters Thesis or Doctoral Dissertation
