Full Record

Author | Bellala, Gowtham |

Title | Information and Decision Theoretic Approaches to Problems in Active Diagnosis. |

URL | http://hdl.handle.net/2027.42/91606 |

Publication Date | 2012 |

Date Accessioned | 2012-06-15 17:33:26 |

Degree | PhD |

Discipline/Department | Electrical Engineering: Systems |

Degree Level | doctoral |

University/Publisher | University of Michigan |

Abstract | In applications such as active learning or disease/fault diagnosis, one often encounters the problem of identifying an unknown object while minimizing the number of ``yes" or ``no" questions (queries) posed about that object. This problem has been commonly referred to as object/entity identification or active diagnosis in the literature. In this thesis, we consider several extensions of this fundamental problem that are motivated by practical considerations in real-world, time-critical identification tasks such as emergency response. First, we consider the problem where the objects are partitioned into groups, and the goal is to identify only the group to which the object belongs. We then consider the case where the cost of identifying an object grows exponentially in the number of queries. To address these problems we show that a standard algorithm for object identification, known as the splitting algorithm or generalized binary search (GBS), may be viewed as a generalization of Shannon-Fano coding. We then extend this result to the group-based and the exponential cost settings, leading to new, improved algorithms. We then study the problem of active diagnosis under persistent query noise. Previous work in this area either assumed that the noise is independent or that the underlying query noise distribution is completely known. We make no such assumptions, and introduce an algorithm that returns a ranked list of objects, such that the expected rank of the true object is optimized. Finally, we study the problem of active diagnosis where multiple objects are present, such as in disease/fault diagnosis. Current algorithms in this area have an exponential time complexity making them slow and intractable. We address this issue by proposing an extension of our rank-based approach to the multiple object scenario, where we optimize the area under the ROC curve of the rank-based output. The AUC criterion allows us to make a simplifying assumption that significantly reduces the complexity of active diagnosis (from exponential to near quadratic), with little or no compromise on the performance quality. Further, we demonstrate the performance of the proposed algorithms through extensive experiments on both synthetic and real world datasets. |

Subjects/Keywords | Machine Learning; Active Diagnosis; Active Learning; Object/Entity Identification; Bayesian Networks; Generalized Binary Search; Computer Science; Electrical Engineering; Engineering |

Contributors | Scott, Clayton D. (committee member); Hero Iii, Alfred O. (committee member); Murphy, Susan A. (committee member); Sadanandarao, Sandeep P. (committee member) |

Language | en |

Country of Publication | us |

Record ID | handle:2027.42/91606 |

Repository | umich |

Date Indexed | 2020-09-09 |

Grantor | University of Michigan, Horace H. Rackham School of Graduate Studies |

Issued Date | 2012-01-01 00:00:00 |

Note | [thesisdegreename] Ph.D.; [thesisdegreediscipline] Electrical Engineering: Systems; [thesisdegreegrantor] University of Michigan, Horace H. Rackham School of Graduate Studies; [bitstreamurl] http://deepblue.lib.umich.edu/bitstream/2027.42/91606/1/gowtham_1.pdf; |

Sample Search Hits | Sample Images

…algorithm for object
identification, known as the splitting algorithm or *generalized* *binary* *search* (GBS),
may be viewed as a generalization of Shannon-Fano coding. We then extend this
result to the group-based and the exponential cost settings…

…identification algorithm known as the splitting algorithm, or *generalized* *binary* *search*
(GBS) to these settings. The proposed algorithms are derived in a common codingtheoretic framework, and are based on reinterpretation of GBS as a *generalized* form
of…

…the most widely studied algorithm is known
as the splitting algorithm (Loveland , 1985) or *generalized* *binary* *search* (GBS) (Dasgupta, 2004; Nowak , 2008). This algorithm grows a *binary* decision tree in a top down
greedy…

…common framework, and are based on reinterpretation of a standard object identification algorithm (the splitting algorithm, or
*generalized* *binary* *search*) as a *generalized* form of Shannon-Fano coding. We first
establish an exact formula for the…

…learning (with membership queries) (Angluin,
2004), active learning (Dasgupta, 2004), active/adaptive diagnosis (Rish et al., 2005)
object/entity identification (Garey, 1970, 1972), and *binary* testing (…

…θ1 , θ2 , θ3 } and
Q = {q1 , q2 , q3 , q4 , q5 } with q1 = q3 = {θ1 }, q2 = q4 = {θ2 , θ3 }, and
q5 = {θ1 , θ3 }.
6
*binary* matrix B, where the rows correspond to different objects and columns to
queries…

…with the *binary* entries in the matrix corresponding to the presence/absence of
edges. The *binary* matrix corresponding to the bipartite diagnosis graph in Figure 2.1
is given by,
1 0 1 0 1
B=
0 1 0 1 0
0 1 0 1 1
.
In the problem…

…*binary* row vector
associated with the unknown object in the matrix B.
As mentioned earlier, problems of this nature arise in several applications such as
job scheduling (Kosaraju et al., 1999), image processing (Korostelev and Kim, 2000…