Jain, Suyog Dutt.
Human machine collaboration for foreground segmentation in images and videos.
Degree: PhD, Computer Science, 2018, University of Texas – Austin
Foreground segmentation is the problem of generating pixel-level foreground masks for all the objects in a given image or video. Accurate foreground segmentations in images and videos have several potential applications, such as improving search, training richer object detectors, image synthesis and re-targeting, scene and activity understanding, video summarization, and post-production video editing.
One effective way to solve this problem is human-machine collaboration. The main idea is to let humans guide the segmentation process through some partial supervision. As humans, we are extremely good at perception and can easily identify the foreground regions. Computers, on the other hand, lack this capability, but are extremely good at continuously processing large volumes of data at the lowest level of detail with great efficiency. Bringing these complementary strengths together can lead to systems that are both accurate and cost-effective. However, in any such human-machine collaboration system, cost-effectiveness and higher accuracy are competing goals. While more involvement from humans can certainly lead to higher accuracy, it also increases cost in terms of both time and money. On the other hand, relying more on machines is cost-effective, but algorithms are still nowhere near human-level performance. Balancing this cost-versus-accuracy trade-off is the key to the success of such a hybrid system.
In this thesis, I develop foreground segmentation algorithms that make effective and efficient use of human guidance to accurately segment foreground objects in images and videos. The algorithms developed in this thesis actively reason about the best modalities or interactions through which a user can provide guidance to the system for generating accurate segmentations. At the same time, these algorithms are also capable of prioritizing human guidance on instances where it is most needed. Finally, when structural similarity exists within data (e.g., adjacent frames in a video or similar images in a collection), the algorithms developed in this thesis are capable of propagating information from instances that have received human guidance to those that have not. Together, these characteristics result in substantial savings in human annotation cost while generating high-quality foreground segmentations in images and videos.
In this thesis, I consider three categories of segmentation problems, all of which can greatly benefit from human-machine collaboration. First, I consider the problem of interactive image segmentation. In traditional interactive methods, a human annotator provides a coarse spatial annotation (e.g., a bounding box or freehand outline) around the object of interest to obtain a segmentation. The mode of manual annotation used affects both its accuracy and its ease of use. Whereas existing methods assume a fixed form of input no matter the image, in this thesis I propose a data-driven algorithm which learns whether an interactive segmentation method will…
Advisors/Committee Members: Grauman, Kristen Lorraine, 1979- (advisor), Mooney, Raymond (committee member), Corso, Jason (committee member), Niekum, Scott (committee member), Vouga, Paul Etienne (committee member).
Subjects/Keywords: Computer vision; Crowdsourcing; Human machine collaboration; Image and video segmentation; Image segmentation; Video segmentation; Foreground segmentation; Object segmentation
APA (6th Edition):
Jain, S. D. (2018). Human machine collaboration for foreground segmentation in images and videos. (Doctoral Dissertation). University of Texas – Austin. Retrieved from http://hdl.handle.net/2152/63453
Chicago Manual of Style (16th Edition):
Jain, Suyog Dutt. “Human machine collaboration for foreground segmentation in images and videos.” 2018. Doctoral Dissertation, University of Texas – Austin. Accessed September 26, 2020.
MLA Handbook (7th Edition):
Jain, Suyog Dutt. “Human machine collaboration for foreground segmentation in images and videos.” 2018. Web. 26 Sep 2020.
Jain SD. Human machine collaboration for foreground segmentation in images and videos. [Internet] [Doctoral dissertation]. University of Texas – Austin; 2018. [cited 2020 Sep 26]. Available from: http://hdl.handle.net/2152/63453.
Council of Science Editors:
Jain SD. Human machine collaboration for foreground segmentation in images and videos. [Doctoral Dissertation]. University of Texas – Austin; 2018. Available from: http://hdl.handle.net/2152/63453