You searched for subject:(multiple clusterings). Showing records 1 – 2 of 2 total matches.

University of Melbourne

1. Lei, Yang. Cluster validation and discovery of multiple clusterings.

Degree: 2016, University of Melbourne

Cluster analysis is an important unsupervised learning process in data analysis. It aims to group data objects into clusters, so that the data objects in the same group are more similar and the data objects in different groups are more dissimilar. There are many open challenges in this area. In this thesis, we focus on two: discovery of multiple clusterings and cluster validation.

Many clustering methods focus on discovering one single ‘best’ solution from the data. However, data can be multi-faceted in nature. Particularly when datasets are large and complex, there may be several useful clusterings existing in the data. In addition, users may be seeking different perspectives on the same dataset, requiring multiple clustering solutions. Multiple clustering analysis has attracted considerable attention in recent years and aims to discover multiple reasonable and distinctive clustering solutions from the data. Many methods have been proposed on this topic and one popular technique is meta-clustering. Meta-clustering explores multiple reasonable and distinctive clusterings by analyzing a large set of base clusterings. However, there may exist poor-quality and redundant base clusterings, which will affect the generation of high-quality and diverse clustering views. In addition, the generated clustering views may not all be relevant. It is time- and energy-consuming for users to check all the returned solutions. To tackle these problems, we propose a filtering method and a ranking method to achieve higher quality and more distinctive clustering solutions.

Cluster validation refers to the procedure of evaluating the quality of clusterings, which is critical for clustering applications. Cluster validity indices (CVIs) are often used to quantify the quality of clusterings. They can generally be classified into two categories: external measures and internal measures, which are distinguished in terms of whether or not external information is used during the validation procedure. In this thesis, we focus on external cluster validity indices. There are many open challenges in this area. We focus on two of them: (a) CVIs for fuzzy clusterings and (b) bias issues for CVIs. External CVIs are often used to quantify the quality of a clustering by comparing it against the ground truth. Most external CVIs are designed for crisp clusterings (one data object only belongs to one single cluster). How to evaluate the quality of soft clusterings (one data object can belong to more than one cluster) is a challenging problem. One common way to achieve this is by hardening a soft clustering to a crisp clustering and then evaluating it using a crisp CVI. However, hardening may cause information loss. To address this problem, we generalize a class of popular information-theoretic crisp external CVIs to directly evaluate the quality of soft clusterings, without the need for a hardening step. An implicit assumption when using external CVIs to evaluate the quality of a clustering is that they work correctly.…
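The hardening baseline mentioned in the abstract can be illustrated with a small sketch. It is not the generalized soft measure the thesis proposes; it only shows the common workaround (harden, then score with a crisp information-theoretic index) and why that workaround can lose information. The function names, the choice of normalized mutual information with a geometric-mean normalizer, and the toy membership matrices are all assumptions made for illustration.

```python
import math
from collections import Counter

def harden(memberships):
    """Assign each object to the cluster with its largest membership degree.
    Ties go to the lowest cluster index."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]

def entropy(labels):
    """Shannon entropy (natural log) of a crisp labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """Mutual information between two crisp labelings of the same objects."""
    n = len(a)
    joint, ca, cb = Counter(zip(a, b)), Counter(a), Counter(b)
    return sum((nij / n) * math.log(nij * n / (ca[i] * cb[j]))
               for (i, j), nij in joint.items())

def nmi(a, b):
    """Normalized mutual information, using sqrt(H(a) * H(b)) as normalizer
    (one of several normalizations in use; chosen here as an assumption)."""
    ha, hb = entropy(a), entropy(b)
    if ha == 0 and hb == 0:
        return 1.0  # both labelings put everything in one cluster
    denom = math.sqrt(ha * hb)
    return mutual_information(a, b) / denom if denom > 0 else 0.0

# Hypothetical ground truth and two soft clusterings of four objects.
truth = [0, 0, 1, 1]
confident = [[0.95, 0.05], [0.90, 0.10], [0.10, 0.90], [0.05, 0.95]]
uncertain = [[0.55, 0.45], [0.51, 0.49], [0.49, 0.51], [0.45, 0.55]]

score_confident = nmi(harden(confident), truth)
score_uncertain = nmi(harden(uncertain), truth)
```

Both soft clusterings harden to the same crisp labeling, so the crisp index gives them the same score even though one is far less certain than the other. That is the information loss the abstract refers to.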

Subjects/Keywords: cluster analysis; cluster validation; multiple clusterings; data mining; machine learning

APA (6th Edition):

Lei, Y. (2016). Cluster validation and discovery of multiple clusterings. (Doctoral Dissertation). University of Melbourne. Retrieved from http://hdl.handle.net/11343/121995

Chicago Manual of Style (16th Edition):

Lei, Yang. “Cluster validation and discovery of multiple clusterings.” 2016. Doctoral Dissertation, University of Melbourne. Accessed January 18, 2020. http://hdl.handle.net/11343/121995.

MLA Handbook (7th Edition):

Lei, Yang. “Cluster validation and discovery of multiple clusterings.” 2016. Web. 18 Jan 2020.

Vancouver:

Lei Y. Cluster validation and discovery of multiple clusterings. [Internet] [Doctoral dissertation]. University of Melbourne; 2016. [cited 2020 Jan 18]. Available from: http://hdl.handle.net/11343/121995.

Council of Science Editors:

Lei Y. Cluster validation and discovery of multiple clusterings. [Doctoral Dissertation]. University of Melbourne; 2016. Available from: http://hdl.handle.net/11343/121995


Halmstad University

2. Sweidan, Dirar. A General Framework for Discovering Multiple Data Groupings.

Degree: Information Technology, 2018, Halmstad University

Clustering helps users gain insights from their data by discovering hidden structures in an unsupervised way. Unlike classification tasks that are evaluated using well-defined target labels, clustering is an intrinsically subjective task as it depends on the interpretation, need and interest of users. In many real-world applications, multiple meaningful clusterings can be hidden in the data, and different users are interested in exploring different perspectives and use cases of this same data. Despite this, most existing clustering techniques only attempt to produce a single clustering of the data, which can be too strict. In this thesis, a general method is proposed to discover multiple alternative clusterings of the data, and let users select the clustering(s) they are most interested in. In order to cover a large set of possible clustering solutions, a diverse set of clusterings is first generated based on various projections of the data. Then, similar clusterings are found, filtered, and aggregated into one representative clustering, allowing the user to only explore a small set of non-redundant representative clusterings. We compare the proposed method against others and analyze its advantages and disadvantages, based on artificial and real-world datasets, as well as on images enabling a visual assessment of the meaningfulness of the discovered clustering solutions. In addition, extensive studies and analyses of the various techniques used in the method are carried out. Results show that the proposed method is able to discover multiple interesting and meaningful clustering solutions.
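The "find similar clusterings and reduce them to one representative" step can be sketched roughly as follows. The abstract does not say which similarity measure, grouping rule, or aggregation the thesis uses, so the Rand index, the greedy threshold grouping, and the medoid as representative are all stand-in assumptions. The base clusterings are hard-coded toy labelings rather than the output of clustering projected data.

```python
from itertools import combinations

def rand_index(a, b):
    """Fraction of object pairs on which two labelings agree about
    'same cluster' vs 'different cluster'. Assumes at least two objects."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def group_similar(clusterings, threshold):
    """Greedy grouping: a clustering joins the first group whose first
    member is at least `threshold` similar, otherwise it starts a new group."""
    groups = []  # each group is a list of indices into `clusterings`
    for idx, labels in enumerate(clusterings):
        for group in groups:
            if rand_index(clusterings[group[0]], labels) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups

def representative(clusterings, group):
    """Medoid of a group: the member with the highest total similarity
    to all members of that group."""
    return max(group, key=lambda i: sum(
        rand_index(clusterings[i], clusterings[j]) for j in group))

# Hypothetical base clusterings of six objects.
base = [
    [0, 0, 0, 1, 1, 1],  # split: first half / second half
    [1, 1, 1, 0, 0, 0],  # same partition, labels swapped
    [0, 1, 0, 1, 0, 1],  # split: alternating objects
    [1, 0, 1, 0, 1, 0],  # same alternating partition, labels swapped
    [0, 0, 0, 1, 1, 0],  # near-duplicate of the first split
]

groups = group_similar(base, threshold=0.6)
views = [base[representative(base, g)] for g in groups]
```

With this toy input the five base clusterings collapse into two groups, so a user would inspect two representative views instead of five. The threshold of 0.6 is arbitrary and chosen only to make the example work.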

Subjects/Keywords: machine learning; unsupervised learning; data mining; clustering; multiple-clusterings; clustering algorithm; Engineering and Technology; Teknik och teknologier; Computer Systems; Datorsystem

APA (6th Edition):

Sweidan, D. (2018). A General Framework for Discovering Multiple Data Groupings. (Thesis). Halmstad University. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-38047

Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation

Chicago Manual of Style (16th Edition):

Sweidan, Dirar. “A General Framework for Discovering Multiple Data Groupings.” 2018. Thesis, Halmstad University. Accessed January 18, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-38047.

MLA Handbook (7th Edition):

Sweidan, Dirar. “A General Framework for Discovering Multiple Data Groupings.” 2018. Web. 18 Jan 2020.

Vancouver:

Sweidan D. A General Framework for Discovering Multiple Data Groupings. [Internet] [Thesis]. Halmstad University; 2018. [cited 2020 Jan 18]. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-38047.

Council of Science Editors:

Sweidan D. A General Framework for Discovering Multiple Data Groupings. [Thesis]. Halmstad University; 2018. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-38047
