Event-related Collections Understanding and Services.
Event-related collections, including both tweets and webpages, contain valuable information and are worth exploring in interdisciplinary research and education. Unfortunately, such data is noisy: many tweets and webpages are not relevant to the events. This makes the datasets difficult to analyze and the results difficult to explain. Further, a better understanding of events requires unearthing more of the knowledge hidden behind them. Different groups of people may also have different requirements for these collections. Some need relatively clean datasets for data exploration. Some require preprocessed information so they can conduct analyses, e.g., based on tweeter type or content topic. The general public is interested in overall descriptions of events. However, few systems, tools, or methods exist to support the flexible use of event-related collections.
Accordingly, we describe our new framework and integrated system to process and analyze event-related collections. The system provides varied services and covers the most important stages in a system pipeline, with sub-systems to clean, manage, analyze, integrate, and visualize event-related collections. It takes an event-related tweet collection as input and generates an event-related webpage corpus by leveraging Wikipedia and the URLs embedded in tweets. It also combines and enriches original tweets with webpages. As an application of data management, we conduct an empirical study of tweets and their embedded URLs. We develop TwiRole for 3-way user classification on Twitter. It detects brand-related, female-related, and male-related tweeters through their profiles, tweets, and images. To aid user-centered social research, we combine TwiRole with an existing emotion detection tool and carry out tweeting pattern analyses on disaster-related collections. Finally, we propose a tweet-guided multi-document summarization (TMDS) model and service, which generates summaries of event-related collections by using the tweets associated with those events. It extracts important sentences across different topics from webpages and organizes them in a proper order.
The entire system is realized using many technologies, such as collection development, natural language processing, machine learning, and deep learning. For each part, comprehensive evaluations help confirm the effectiveness and accuracy of our proposed approaches. Regarding broader impact, our methods and system can be easily adopted or extended for further event analyses and service development.
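As a rough illustration of the first pipeline step described above (collecting the URLs embedded in tweets as seeds for a webpage corpus), the following is a minimal sketch and not the dissertation's implementation. The function name `extract_seed_urls`, the `min_count` threshold, the regular expression, and the sample tweets are all hypothetical, and the sketch assumes the tweet text already contains expanded URLs.

```python
import re
from collections import Counter

# Hypothetical pattern: an http(s) URL runs until the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_seed_urls(tweets, min_count=1):
    """Return URLs embedded in tweets, most frequently shared first.

    A URL is counted at most once per tweet. URLs shared in fewer than
    `min_count` tweets are dropped as a crude noise filter (an assumption
    made for this sketch, not a rule taken from the source).
    """
    counts = Counter()
    for text in tweets:
        # Strip trailing punctuation that often follows a URL in a sentence.
        urls = {u.rstrip(".,;:!?)\"'") for u in URL_PATTERN.findall(text)}
        counts.update(urls)
    return [url for url, n in counts.most_common() if n >= min_count]


# Made-up sample tweets for illustration only.
tweets = [
    "Flooding downtown, details: https://example.org/flood-report",
    "Stay safe everyone! https://example.org/flood-report.",
    "Unrelated link https://example.com/ad",
]
seeds = extract_seed_urls(tweets, min_count=2)
print(seeds)
```

In this toy run only the URL shared by two tweets survives the threshold. A real pipeline would still need to fetch the pages and judge their relevance to the event, which this sketch does not attempt.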
Advisors/Committee Members: Fox, Edward A. (committee chair), Kavanaugh, Andrea L. (committee member), Xie, Zhiwu (committee member), Deng, Zhi-Hong (committee member), Reddy, Chandan K. (committee member).
APA (6th Edition):
Li, L. (2020). Event-related Collections Understanding and Services. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/97365
Chicago Manual of Style (16th Edition):
Li, Liuqing. “Event-related Collections Understanding and Services.” 2020. Doctoral Dissertation, Virginia Tech. Accessed April 09, 2020.
MLA Handbook (7th Edition):
Li, Liuqing. “Event-related Collections Understanding and Services.” 2020. Web. 09 Apr 2020.
Vancouver:
Li L. Event-related Collections Understanding and Services. [Internet] [Doctoral dissertation]. Virginia Tech; 2020. [cited 2020 Apr 09]. Available from: http://hdl.handle.net/10919/97365.
Council of Science Editors:
Li L. Event-related Collections Understanding and Services. [Doctoral Dissertation]. Virginia Tech; 2020. Available from: http://hdl.handle.net/10919/97365