You searched for subject:(instance segmentation). Showing records 1–18 of 18 total matches. No search limiters apply to these results.

University of Ottawa
1.
Kolhatkar, Dhanvin.
Real-Time Instance and Semantic Segmentation Using Deep Learning.
Degree: 2020, University of Ottawa
URL: http://hdl.handle.net/10393/40616
In this thesis, we explore the use of Convolutional Neural Networks for semantic and instance segmentation, with a focus on studying the application of existing methods with cheaper neural networks. We modify a fast object detection architecture for the instance segmentation task, and study the concepts behind these modifications both in the simpler context of semantic segmentation and the more difficult context of instance segmentation. Various instance segmentation branch architectures are implemented in parallel with a box prediction branch, using its results to crop each instance's features. We negate the imprecision of the final box predictions and eliminate the need for bounding box alignment by using an enlarged bounding box for cropping. We report and study the performance, advantages, and disadvantages of each. We achieve fast speeds with all of our methods.
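The enlarged-bounding-box cropping idea described in the abstract can be sketched as follows; this is a minimal illustration, not the thesis code, and the function name and `enlarge` factor are hypothetical:

```python
import numpy as np

def crop_instance_features(features, box, enlarge=1.4):
    """Crop a feature map using an enlarged bounding box.

    Enlarging the box makes the crop tolerant to imprecise final
    box predictions, so no exact bounding-box alignment is needed.
    features: (H, W, C) array; box: (x1, y1, x2, y2) in pixels.
    """
    h, w = features.shape[:2]
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    bw, bh = (x2 - x1) * enlarge, (y2 - y1) * enlarge
    # Clamp the enlarged box to the feature-map bounds.
    nx1 = max(int(round(cx - bw / 2)), 0)
    ny1 = max(int(round(cy - bh / 2)), 0)
    nx2 = min(int(round(cx + bw / 2)), w)
    ny2 = min(int(round(cy + bh / 2)), h)
    return features[ny1:ny2, nx1:nx2, :]
```

The mask branch would then predict on this crop; any mask pixels outside the true box simply fall in the enlarged margin.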
Subjects/Keywords: Instance segmentation; Semantic segmentation; Deep learning; Real-time; Mask prediction
APA (6th Edition):
Kolhatkar, D. (2020). Real-Time Instance and Semantic Segmentation Using Deep Learning. (Thesis). University of Ottawa. Retrieved from http://hdl.handle.net/10393/40616
2.
Seguin, Guillaume.
Analyse des personnes dans les films stéréoscopiques : Person analysis in stereoscopic movies.
Degree: Docteur es, Informatique, 2016, Paris Sciences et Lettres (ComUE)
URL: http://www.theses.fr/2016PSLEE021
People are at the heart of many computer vision problems, such as surveillance systems or driverless cars. They are also at the center of most visual content, potentially providing very large datasets for training models and algorithms. Moreover, while stereoscopic data has long been studied, it is only recently that 3D movies have become a commercial success. In this thesis, we study how to exploit the additional data provided by 3D movies for person analysis tasks. We first explore how to extract a notion of depth from stereoscopic movies, in the form of disparity maps. We then evaluate how much person detection and pose estimation methods can benefit from this additional information. Building on the relative ease of person detection in 3D movies, we develop a method to automatically collect examples of people in 3D movies in order to train a person detector for non-3D movies. We then focus on the segmentation of multiple people in videos. We first propose a method for segmenting multiple people in 3D movies by combining cues derived from depth maps with cues derived from pose estimates. We formulate this problem as a multi-label graph labeling problem, and our method integrates an occlusion model to produce a per-shot multi-instance segmentation. After demonstrating the effectiveness and limitations of this method, we propose a second model that relies only on person detections across the video, not on pose estimates. We formulate this problem as the minimization of a quadratic cost under linear constraints. These constraints encode the localization information provided by the person detections. This method requires neither pose information nor disparity maps, but can easily integrate these additional signals. It can also be applied to other object classes. We evaluate all of these aspects and demonstrate the performance of this new method.
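The second model's formulation, minimizing a quadratic cost under linear constraints, can be illustrated with a small sketch. For simplicity this uses equality constraints solved via the KKT system; the actual constraints in the thesis come from person detections, and this is not the thesis implementation:

```python
import numpy as np

def solve_eq_constrained_qp(Q, c, A, b):
    """Minimise 0.5 * x^T Q x + c^T x  subject to  A x = b.

    Solved via the KKT linear system:
        [Q  A^T] [x     ]   [-c]
        [A  0  ] [lambda] = [ b]
    Q must be positive definite on the null space of A.
    """
    n, m = Q.shape[0], A.shape[0]
    kkt = np.block([[Q, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([-c, b])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n]  # primal solution; sol[n:] are the multipliers
```

In the thesis's setting, x would encode per-pixel or per-superpixel label assignments and the linear constraints would pin regions covered by confident person detections.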
Advisors/Committee Members: Laptev, Ivan (thesis director), Sivic, Josef (thesis director).
Subjects/Keywords: Vision par ordinateur; Films 3D; Détection de personne; Estimation de pose; Segmentation vidéo; Segmentation multi-instance; Computer vision; 3D movies; Person detection; Pose estimation; Video segmentation; Instance-level segmentation; 004
APA (6th Edition):
Seguin, G. (2016). Analyse des personnes dans les films stéréoscopiques : Person analysis in stereoscopic movies. (Doctoral Dissertation). Paris Sciences et Lettres (ComUE). Retrieved from http://www.theses.fr/2016PSLEE021

University of Cambridge
3.
Agapaki, Evangelia.
Automated Object Segmentation in Existing Industrial Facilities.
Degree: PhD, 2020, University of Cambridge
URL: https://www.repository.cam.ac.uk/handle/1810/305021
Shape segmentation from point cloud data is a core step of the digital twinning process for industrial facilities. However, this process is labour-intensive, with 90% of the cost spent on converting point cloud data to a model. This counteracts the perceived value of the resulting model in managing and retrofitting the facilities and motivates the use of automation to reduce this cost. In the US alone, unplanned factory shutdowns due to maintenance cost $50 billion per year. Better documenting the existing conditions can significantly circumvent irreversible damage and decrease the frequency of shutdowns, thus boosting the productivity of industrial assets. This explains why there is a huge market demand for less labour-intensive industrial documentation.

Shape segmentation in the literature has so far mostly focused on cylinders, with state-of-the-art methods achieving 60-70% precision and recall for cylinder detection. Such performance is promising, but far from drastically eliminating the manual labour cost, as all other shapes have to be segmented manually. Yet the search space is massive; industrial facilities contain thousands of object types, making fully automated detection intractable. Hence, there is a direct need to prioritise the objects that are most tedious to model.

The objective of this PhD research is to devise, implement and benchmark a novel framework that can accurately generate individual labelled point clusters of the most important shapes of existing industrial facilities, with minimal manual effort, in a generic point-level format. This is addressed by first identifying the most important shapes to be modelled and then developing algorithms to efficiently detect those shapes. The former is achieved by answering three general research questions: (a) what are the most frequent industrial object types? (b) how long does it take to model the most frequent object types in state-of-the-art commercial software? and (c) how well do state-of-the-art tools perform at automated object detection? The proposed methodology employs a statistical analysis to identify the most frequent industrial object types and then manually models those to estimate the average man-hours needed for each type. It then evaluates the state-of-the-art automated cylinder extraction tool and finds a 64% reduction in the manual modelling time of cylinders. This motivates a focus on reducing the remaining man-hours for cylinder modelling, as well as for manual modelling of the remaining industrial objects, which are still substantial. This is achieved by answering the following technical research questions: (1) how can the most important industrial shapes be automatically segmented from point cloud data with varying point densities and occlusions, without relying on prior knowledge? (2) how can the time for manually assigning class labels to points be minimised? and (3) how can instance point clusters be segmented automatically with less manual labour than the state-of-the-art? The proposed framework employs a…
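The man-hour accounting described above can be illustrated with toy numbers. All figures below are hypothetical except the 64% cylinder reduction quoted in the abstract:

```python
def remaining_hours(hours_by_type, automated_savings):
    """Estimate modelling man-hours left after partial automation.

    hours_by_type: dict of object type -> manual modelling hours.
    automated_savings: dict of type -> fraction of hours removed
    by an automated tool (e.g. 0.64 for cylinders, per the text).
    """
    total = 0.0
    for obj_type, hours in hours_by_type.items():
        saving = automated_savings.get(obj_type, 0.0)
        total += hours * (1.0 - saving)
    return total

# Hypothetical mix: cylinders dominate, 64% of their time automated.
hours = {"cylinder": 100.0, "valve": 30.0, "flange": 20.0}
print(remaining_hours(hours, {"cylinder": 0.64}))  # prints 86.0
```

This makes the thesis's point concrete: even a large saving on one shape class leaves the other object types' hours untouched, which is why the remaining shapes are prioritised next.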
Subjects/Keywords: Digital Twin; Industrial Factory; Point Cloud Data; Deep Learning; Class Segmentation; Instance Segmentation
APA (6th Edition):
Agapaki, E. (2020). Automated Object Segmentation in Existing Industrial Facilities. (Doctoral Dissertation). University of Cambridge. Retrieved from https://www.repository.cam.ac.uk/handle/1810/305021

Delft University of Technology
4.
Wang, Ziqi.
Depth-aware Instance Segmentation with a Discriminative Loss Function.
Degree: 2018, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:02bd3582-3304-4595-baa6-c6fcca755418
This work explores the possibility of incorporating depth information into a deep neural network to improve the accuracy of RGB instance segmentation. The baseline of this work is semantic instance segmentation with a discriminative loss function. The baseline work proposes a novel discriminative loss function with which the semantic network can learn an n-D embedding for all pixels belonging to instances. Embeddings of the same instance are attracted to their own centre, while the centres of different instances repulse each other. Two margins limit the attraction and repulsion, namely the in-margin and out-margin. A post-processing procedure (clustering) is required to infer instance indices from the embeddings, with an important parameter, the bandwidth, which acts as the clustering threshold. The contributions of this thesis are several new methods for incorporating depth information into the baseline work. One simple method, named scaling, adds scaled depth directly to the RGB embeddings. Through analysis and experiments, this work also shows that depth pixels can be encoded into 1-D embeddings with the same discriminative loss function and combined with RGB embeddings; the combination methods explored are fusion and concatenation. Additionally, two depth pre-processing methods are proposed, replication and coloring. The experimental results show that both scaling and fusion lead to significant improvements over the baseline, while concatenation contributes more to classes with many similarities.
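The attraction/repulsion behaviour of the discriminative loss can be sketched in NumPy. This is a simplified single-image version of the variance and distance terms; the margin values are illustrative, not taken from the thesis:

```python
import numpy as np

def discriminative_loss(embeddings, labels, d_in=0.5, d_out=1.5):
    """Simplified discriminative loss for instance embeddings.

    embeddings: (N, D) pixel embeddings; labels: (N,) instance ids.
    Pixels are pulled to within d_in of their instance centre
    (variance term) and instance centres are pushed at least
    2 * d_out apart (distance term).
    """
    centres, var_term = [], 0.0
    for inst in np.unique(labels):
        emb = embeddings[labels == inst]
        centre = emb.mean(axis=0)
        centres.append(centre)
        dist = np.linalg.norm(emb - centre, axis=1)
        var_term += np.mean(np.maximum(dist - d_in, 0.0) ** 2)
    var_term /= len(centres)
    dist_term, pairs = 0.0, 0
    for i in range(len(centres)):
        for j in range(i + 1, len(centres)):
            d = np.linalg.norm(centres[i] - centres[j])
            dist_term += max(2 * d_out - d, 0.0) ** 2
            pairs += 1
    if pairs:
        dist_term /= pairs
    return var_term + dist_term
```

Once trained, instances are recovered by clustering embeddings with a bandwidth between d_in and d_out, which is the post-processing step the abstract refers to.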
Cognitive Robotics Lab
Advisors/Committee Members: Pool, Ewoud (mentor), Kooij, Julian (mentor), Gavrila, Dariu (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: Deep Learning; Computer Vision; instance segmentation; Intelligent Vehicles
APA (6th Edition):
Wang, Z. (2018). Depth-aware Instance Segmentation with a Discriminative Loss Function. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:02bd3582-3304-4595-baa6-c6fcca755418

Carnegie Mellon University
5.
Le, Ngan Thi Hoang.
Contextual Recurrent Level Set Networks and Recurrent Residual Networks for Semantic Labeling.
Degree: 2018, Carnegie Mellon University
URL: http://repository.cmu.edu/dissertations/1166
Semantic labeling is becoming more and more popular among researchers in computer vision and machine learning. Many applications, such as autonomous driving, tracking, indoor navigation, augmented reality systems, semantic searching and medical imaging, are on the rise, requiring more accurate and efficient segmentation mechanisms. In recent years, deep learning approaches based on Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have emerged as the dominant paradigm for solving many problems in computer vision and machine learning. The main focus of this thesis is to investigate robust approaches that can tackle challenging semantic labeling tasks, including semantic instance segmentation and scene understanding. In the first approach, we convert the classic variational Level Set method into a learnable deep framework by proposing a novel definition of contour evolution named Recurrent Level Set (RLS). The proposed RLS employs Gated Recurrent Units to solve the energy minimization of a variational Level Set functional. The curve deformation process in RLS is formulated as a hidden-state evolution procedure and is updated by minimizing an energy functional composed of fitting forces and contour length. We show that by sharing the convolutional features in a fully end-to-end trainable framework, RLS can be extended to Contextual Recurrent Level Set (CRLS) Networks to address semantic segmentation in the wild. The experimental results show that our proposed RLS improves both computational time and segmentation accuracy over classic variational Level-Set-based methods, while the fully end-to-end system CRLS achieves competitive performance compared to state-of-the-art semantic segmentation approaches on the PASCAL VOC 2012 and MS COCO 2014 databases.

The second proposed approach, Contextual Recurrent Residual Networks (CRRN), inherits the merits of both sequence learning and residual learning in order to simultaneously model long-range contextual information and learn a powerful visual representation within a single deep network. Our proposed CRRN deep network consists of three parts corresponding to sequential input data, sequential output data and a hidden state, as in a recurrent network. Each unit in the hidden state is designed as a combination of two components: a context-based component via sequence learning and a visual-based component via residual learning. That is, each hidden unit in our proposed CRRN simultaneously (1) learns long-range contextual dependencies via the context-based component, where the relationship between the current unit and the previous units is modeled as sequential information under an undirected cyclic graph (UCG), and (2) provides a powerful encoded visual representation via the residual component, which contains blocks of convolution and/or batch normalization layers equipped with an identity skip connection. Furthermore, unlike previous scene labeling approaches [1, 2, 3], our method is not only able to exploit the…
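The Gated Recurrent Unit driving the RLS contour evolution follows the standard GRU update rule; a minimal NumPy cell for one step is sketched below (illustrative only, not the thesis implementation, with randomly named weight matrices):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One standard GRU update. The hidden state h evolves under
    input x, analogous to one step of curve evolution in RLS."""
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde           # gated interpolation
```

In RLS the hidden state plays the role of the evolving level-set representation, with the gates learned so that repeated updates minimize the energy functional.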
Subjects/Keywords: Gated Recurrent Unit; Level Set; Recurrent Neural Networks; Residual Network; Scene Labeling; Semantic Instance Segmentation
APA (6th Edition):
Le, N. T. H. (2018). Contextual Recurrent Level Set Networks and Recurrent Residual Networks for Semantic Labeling. (Thesis). Carnegie Mellon University. Retrieved from http://repository.cmu.edu/dissertations/1166

Delft University of Technology
6.
van Hilten, Arno.
Segmenting and Detecting Carotid Plaque Components in MRI.
Degree: 2018, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:9bce7f8a-8d69-4b48-98c4-fc4e6600b63d
Cardiovascular diseases and stroke are currently the leading causes of death worldwide. Atherosclerotic plaque is a mostly asymptomatic vascular disease, but rupture of an atherosclerotic plaque in the carotid artery can lead to stroke. Automated segmentation of plaque components could help improve risk assessment by producing fast and reliable results while saving costs. In this thesis, two extensive comparisons are made. First, supervised classifiers are compared on the pixel-wise segmentation task of plaque components. In this comparison, five conventional machine learning techniques and one deep learning architecture are evaluated: linear and quadratic Bayes normal classifiers, the linear logistic classifier, random forest and a U-net architecture. In the second comparison, classifiers are evaluated on a detection task for their ability to learn from weakly labelled data. This is done within the multiple instance learning (MIL) framework. In addition to conventional multiple instance learning algorithms, a new MIL adaptation of the deep learning architecture, MIL U-net, is proposed and evaluated. On the pixel-wise segmentation task, the U-net architecture was the best overall classifier after the addition of 93 extra training patients to the original 20 training patients. Good inter-rater agreement was found for the haemorrhage class (ICC = 0.684) and the calcification class (ICC = 0.627). On the detection task, the supervised methods, trained with one-sided noise, outperformed multiple instance classifiers such as MIL-Boost and the proposed MIL U-net. On this task, both the random forest and the linear logistic classifier obtained a fair Cohen's kappa (0.419 and 0.445 respectively) for detection of calcification per slice. The same classifiers obtained good agreement (Cohen's kappa 0.717 and 0.666 respectively) for haemorrhage detection per slice.
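Cohen's kappa, the per-slice agreement score reported above, can be computed directly from paired labels. This is the standard definition, not code from the thesis:

```python
from collections import Counter

def cohens_kappa(y1, y2):
    """Cohen's kappa: agreement between two raters beyond chance.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    agreement and p_e the agreement expected by chance from each
    rater's marginal label frequencies (assumes p_e < 1).
    """
    n = len(y1)
    p_o = sum(a == b for a, b in zip(y1, y2)) / n
    c1, c2 = Counter(y1), Counter(y2)
    p_e = sum(c1[k] * c2.get(k, 0) for k in c1) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

On this scale, the 0.419 and 0.445 values quoted above fall in the range commonly labelled "fair to moderate", while 0.666 and 0.717 are "substantial".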
Mechanical Engineering
Advisors/Committee Members: de Bruijne, Marleen (mentor), Sedghi Gamechi, Zahra (mentor), Niessen, Wiro (graduation committee), Kooij, Julian (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: Machine Learning; Deep Learning; Multiple Instance Learning; Segmentation; Detection; Plaque Components; Carotid Artery; MRI
APA (6th Edition):
van Hilten, A. (2018). Segmenting and Detecting Carotid Plaque Components in MRI. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:9bce7f8a-8d69-4b48-98c4-fc4e6600b63d

Australian National University
7.
Zhang, Haoyang.
Learning to Generate and Refine Object Proposals.
Degree: 2018, Australian National University
URL: http://hdl.handle.net/1885/143520
Visual object recognition is a fundamental and challenging problem in computer vision. To build a practical recognition system, one is first confronted with high computational complexity due to an enormous search space per image, which is caused by large variations in object appearance, pose and mutual occlusion, as well as other environmental factors. To reduce the search complexity, modern object recognition subsystems usually first generate a moderate set of image regions that are likely to contain an object, regardless of its category. These possible object regions are called object proposals, object hypotheses or object candidates, and can be used for downstream classification or global reasoning in many different vision tasks such as object detection, segmentation and tracking.

This thesis addresses the problem of object proposal generation, including bounding-box and segment proposal generation, in real-world scenarios. In particular, we investigate representation learning in object proposal generation with 3D cues and contextual information, aiming to propose higher-quality object candidates with higher object recall, better boundary coverage and fewer candidates. We focus on three main issues: 1) how can we incorporate additional geometric and high-level semantic context information into proposal generation for stereo images? 2) how do we generate object segment proposals for stereo images with learned representations and a learned grouping process? and 3) how can we learn a context-driven representation to refine segment proposals efficiently?

In this thesis, we propose a series of solutions to each of these problems. We first propose a semantic-context- and depth-aware object proposal generation method. We design a set of new cues to encode objectness, and then train an efficient random forest classifier to re-rank the initial proposals and linear regressors to fine-tune their locations. Next, we extend the task to segment proposal generation in the same setting and develop a learning-based segment proposal generation method for stereo images. Our method makes use of learned deep features and designed geometric features to represent a region, and learns a similarity network to guide the superpixel grouping process. We also learn a ranking network to predict the objectness score for each segment proposal. To address the third problem, we take a transformation-based approach to improve the quality of a given segment candidate pool based on context information. We propose an efficient deep network that learns affine transformations to warp an initial object mask towards a nearby object region, based on a novel feature pooling strategy. Finally, we extend our affine warping approach to address the object-mask alignment problem…
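Object recall, the proposal-quality measure emphasised above, is conventionally the fraction of ground-truth boxes matched by at least one proposal at a given IoU threshold. A minimal sketch of that metric (standard definition, illustrative code):

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def object_recall(gt_boxes, proposals, thresh=0.5):
    """Fraction of ground-truth boxes covered by at least one
    proposal with IoU >= thresh."""
    hits = sum(
        any(iou(g, p) >= thresh for p in proposals) for g in gt_boxes
    )
    return hits / float(len(gt_boxes))
```

A proposal generator is then judged by how high this recall gets for how few proposals, which is exactly the "higher recall, lower number" trade-off the abstract describes.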
Subjects/Keywords: object proposal; object candidate; object detection; object instance segmentation; convolutional neural network (CNN)
APA (6th Edition):
Zhang, H. (2018). Learning to Generate and Refine Object Proposals. (Thesis). Australian National University. Retrieved from http://hdl.handle.net/1885/143520

Australian National University
8.
Hayder, Zeeshan.
Deep Structured Models for Large Scale Object Co-detection and Segmentation.
Degree: 2017, Australian National University
URL: http://hdl.handle.net/1885/142816
▼ Structured decisions are often required for a large variety of image and scene understanding tasks in computer vision, a few of them being object detection, localization, and semantic segmentation. Structured prediction deals with learning the inherent structure by incorporating contextual information from several images and multiple tasks. However, it is very challenging when dealing with large-scale image datasets, where performance is limited by high computational costs and by the expressive power of the underlying representation-learning techniques. In this thesis, we present efficient and effective deep structured models for context-aware object detection, co-localization, and instance-level semantic segmentation.
First, we introduce a principled formulation for object co-detection using a fully-connected conditional random field (CRF). We build an explicit graph whose vertices represent object candidates (instead of pixel values) and whose edges encode object similarity via simple yet effective pairwise potentials. More specifically, we design a weighted mixture of Gaussian kernels for class-specific object similarity, and formulate kernel-weight estimation as a least-squares regression problem, whose solution can therefore be obtained in closed form. Furthermore, in contrast with traditional co-detection approaches, inference in such fully-connected CRFs can be performed efficiently using an approximate mean-field method with high-dimensional Gaussian filtering. This lets us effectively leverage information in multiple images.
Next, we extend our class-specific co-detection framework to multiple object categories. We model object candidates with rich, high-dimensional features learned using a deep convolutional neural network. In particular, our max-margin and direct-loss structural boosting algorithms enable us to learn the features that best encode pairwise similarity relationships within our CRF framework. Furthermore, the framework guarantees a time and space complexity of O(nt), where n is the total number of candidate boxes in the pool and t the number of mean-field iterations.
Moreover, our experiments evidence the importance of learning rich similarity measures to account for the contextual relations across object classes and instances. However, all these methods are based on precomputed object candidates (or proposals), so localization performance is limited by the quality of the bounding boxes.
To address this, we present an efficient object-proposal co-generation technique that leverages the collective power of multiple images. In particular, we design a deep neural network layer that takes unary and pairwise features as input, builds a fully-connected CRF, and produces mean-field marginals as output. It also lets us…
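The mean-field inference over object candidates described above can be sketched as follows. This is an illustrative, naive O(n²) version with a single Gaussian kernel over candidate features; the thesis uses a learned mixture of kernels and fast high-dimensional Gaussian filtering, and all names here are our own:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mean_field_codetection(unary, feats, n_iters=10, bandwidth=1.0):
    """Approximate mean-field inference for a fully-connected CRF whose
    vertices are object candidates (not pixels). `unary` is an (n, k) array
    of unary potentials; `feats` is an (n, d) array of candidate features
    feeding a single Gaussian similarity kernel."""
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))   # pairwise similarity kernel
    np.fill_diagonal(w, 0.0)                   # no self-messages
    q = softmax(-unary)                        # initialise marginals from unaries
    for _ in range(n_iters):
        msg = w @ q                            # similar candidates vote together
        q = softmax(-unary + msg)              # reward label agreement
    return q                                   # (n, k) per-candidate marginals
```

With high-dimensional Gaussian filtering, the `w @ q` message-passing step drops from O(n²) to approximately linear time, which is what makes fully-connected CRFs practical at scale.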
Subjects/Keywords: Deep structured models;
Context modeling;
Object (co-)detection;
Instance-level semantic segmentation
APA (6th Edition):
Hayder, Z. (2017). Deep Structured Models for Large Scale Object Co-detection and Segmentation. (Thesis). Australian National University. Retrieved from http://hdl.handle.net/1885/142816
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Hayder, Zeeshan. “Deep Structured Models for Large Scale Object Co-detection and Segmentation.” 2017. Thesis, Australian National University. Accessed January 19, 2021. http://hdl.handle.net/1885/142816.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Hayder, Zeeshan. “Deep Structured Models for Large Scale Object Co-detection and Segmentation.” 2017. Web. 19 Jan 2021.
Vancouver:
Hayder Z. Deep Structured Models for Large Scale Object Co-detection and Segmentation. [Internet] [Thesis]. Australian National University; 2017. [cited 2021 Jan 19]. Available from: http://hdl.handle.net/1885/142816.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Hayder Z. Deep Structured Models for Large Scale Object Co-detection and Segmentation. [Thesis]. Australian National University; 2017. Available from: http://hdl.handle.net/1885/142816
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
9.
Wang, Qiong.
Salient object detection and segmentation in videos : Détection d'objets saillants et segmentation dans des vidéos.
Degree: Docteur es, Signal, Image, Vision, 2019, Rennes, INSA
URL: http://www.theses.fr/2019ISAR0003
▼ This thesis focuses on the problem of salient object detection and segmentation in videos, with the aim of detecting the most attractive objects or assigning consistent object identities to each pixel of a video sequence. For video salient object detection, besides a review of existing techniques, a new approach and the extension of a model are proposed; in addition, an approach is proposed for video object instance segmentation. For video salient object detection, we propose: (1) a traditional approach that detects the whole salient object via the notion of "virtual borders". A guided filter is applied to the temporal output to integrate spatial edge information for a better detection of the salient object's edges. A global spatio-temporal saliency map is obtained by combining the spatial saliency map and the temporal saliency map according to their entropy. (2) An overview of recent developments in deep-learning-based methods, including a classification of state-of-the-art methods and their architectures, together with a comparative experimental study of their performance. (3) An extension of the proposed traditional model that integrates a deep-learning-based image salient object detection method, further improving performance. For video object instance segmentation, we propose a deep-learning approach in which a warping-confidence computation first determines the confidence of the masked map, and a semantic selection is then optimized to improve the warped map, the object being re-identified using the semantic label of the target object.
The proposed approaches were evaluated on large, complex, publicly available datasets, and the experimental results show that they outperform state-of-the-art methods.
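The entropy-guided fusion of the spatial and temporal saliency maps can be sketched as below. The specific weighting rule here is our assumption for illustration (the thesis's exact rule may differ); the idea is that a lower-entropy, more concentrated map is treated as more reliable:

```python
import numpy as np

def map_entropy(sal, bins=32):
    """Shannon entropy of a saliency map's intensity histogram (values in [0, 1])."""
    hist, _ = np.histogram(sal, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def fuse_saliency(spatial, temporal, eps=1e-12):
    """Convex combination of the two maps, weighting each map by the entropy
    of the *other*: a noisy (high-entropy) temporal map shifts weight toward
    the spatial map, and vice versa."""
    hs, ht = map_entropy(spatial), map_entropy(temporal)
    ws = ht / (hs + ht + eps)          # spatial weight grows with temporal entropy
    return ws * spatial + (1.0 - ws) * temporal
```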
Advisors/Committee Members: Kpalma, Kidiyo (thesis director).
Subjects/Keywords: Vidéo; Détection d'objet saillant; Segmentation d'instance d'objet; Apprentissage en profondeur; Video; Salient object detection; Object instance segmentation; Deep-learning; 006.7
APA (6th Edition):
Wang, Q. (2019). Salient object detection and segmentation in videos : Détection d'objets saillants et segmentation dans des vidéos. (Doctoral Dissertation). Rennes, INSA. Retrieved from http://www.theses.fr/2019ISAR0003
Chicago Manual of Style (16th Edition):
Wang, Qiong. “Salient object detection and segmentation in videos : Détection d'objets saillants et segmentation dans des vidéos.” 2019. Doctoral Dissertation, Rennes, INSA. Accessed January 19, 2021.
http://www.theses.fr/2019ISAR0003.
MLA Handbook (7th Edition):
Wang, Qiong. “Salient object detection and segmentation in videos : Détection d'objets saillants et segmentation dans des vidéos.” 2019. Web. 19 Jan 2021.
Vancouver:
Wang Q. Salient object detection and segmentation in videos : Détection d'objets saillants et segmentation dans des vidéos. [Internet] [Doctoral dissertation]. Rennes, INSA; 2019. [cited 2021 Jan 19].
Available from: http://www.theses.fr/2019ISAR0003.
Council of Science Editors:
Wang Q. Salient object detection and segmentation in videos : Détection d'objets saillants et segmentation dans des vidéos. [Doctoral Dissertation]. Rennes, INSA; 2019. Available from: http://www.theses.fr/2019ISAR0003
10.
Luc, Pauline.
Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos : Self-supervised learning of predictive segmentation models from video.
Degree: Docteur es, Mathématiques et informatique, 2019, Université Grenoble Alpes (ComUE)
URL: http://www.theses.fr/2019GREAM024
▼ Predictive models hold promise for transferring recent reinforcement learning successes to many real-world tasks, by decreasing the number of interactions needed with the environment. Video prediction has attracted growing interest from the community in recent years, as a particular case of predictive learning with broad applications in robotics and navigation systems. While RGB frames are easy to acquire and carry a lot of information, they are extremely challenging to predict and cannot be directly interpreted by downstream applications. We therefore introduce the novel task of predicting the semantic or instance segmentation of future frames. The feature spaces we consider are better suited to recursive prediction and allow us to develop predictive segmentation models that perform well up to half a second into the future. The predictions are interpretable by downstream applications and remain information-rich, spatially detailed, and easy to obtain by relying on state-of-the-art segmentation methods. In this thesis, we first propose a discriminative approach to semantic segmentation based on adversarial training. We then introduce the novel task of future semantic segmentation prediction, for which we develop an autoregressive convolutional model. Finally, we extend our method to the harder task of future instance segmentation prediction, which distinguishes between different objects. Since the number of classes varies from image to image, we propose a predictive model in the space of the high-level convolutional image features of the Mask R-CNN instance segmentation network. This allows us to produce visually pleasing, high-resolution segmentations for complex scenes with many objects, with satisfactory performance up to half a second into the future.
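Recursive prediction in feature space, as described above, amounts to feeding the model's own outputs back as inputs. A minimal sketch (the single-step model, context length, and all names here are our assumptions, not the thesis's architecture):

```python
import numpy as np

def predict_future_features(features, step_model, n_steps, context_len=4):
    """Autoregressively predict future frame features: each new prediction
    is appended to the context so the model can reach further ahead."""
    context = list(features)
    preds = []
    for _ in range(n_steps):
        nxt = step_model(np.stack(context[-context_len:]))
        preds.append(nxt)
        context.append(nxt)
    return preds

# Toy single-step model standing in for the convolutional predictor:
# "the next feature is the last one plus one".
step = lambda ctx: ctx[-1] + 1.0
out = predict_future_features([np.array([0.0]), np.array([1.0]),
                               np.array([2.0]), np.array([3.0])], step, n_steps=2)
```

Because errors compound at each recursive step, prediction quality degrades the further ahead the model reaches, which is why the abstract qualifies its results as holding "up to half a second into the future".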
Advisors/Committee Members: Verbeek, Jakob (thesis director), Couprie, Camille (thesis director).
Subjects/Keywords: Apprentissage profond; Segmentation sémantique; Segmentation d’instance; Modèles génératifs; Apprentissage prédictif; Compréhension vidéo; Deep learning; Semantic segmentation; Instance segmentation; Generative modeling; Predictive learning; Video understanding; 004; 510
APA (6th Edition):
Luc, P. (2019). Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos : Self-supervised learning of predictive segmentation models from video. (Doctoral Dissertation). Université Grenoble Alpes (ComUE). Retrieved from http://www.theses.fr/2019GREAM024
Chicago Manual of Style (16th Edition):
Luc, Pauline. “Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos : Self-supervised learning of predictive segmentation models from video.” 2019. Doctoral Dissertation, Université Grenoble Alpes (ComUE). Accessed January 19, 2021.
http://www.theses.fr/2019GREAM024.
MLA Handbook (7th Edition):
Luc, Pauline. “Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos : Self-supervised learning of predictive segmentation models from video.” 2019. Web. 19 Jan 2021.
Vancouver:
Luc P. Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos : Self-supervised learning of predictive segmentation models from video. [Internet] [Doctoral dissertation]. Université Grenoble Alpes (ComUE); 2019. [cited 2021 Jan 19].
Available from: http://www.theses.fr/2019GREAM024.
Council of Science Editors:
Luc P. Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos : Self-supervised learning of predictive segmentation models from video. [Doctoral Dissertation]. Université Grenoble Alpes (ComUE); 2019. Available from: http://www.theses.fr/2019GREAM024

Georgia Tech
11.
Hsu, Yen-Chang.
Learning from pairwise similarity for visual categorization.
Degree: PhD, Electrical and Computer Engineering, 2020, Georgia Tech
URL: http://hdl.handle.net/1853/62814
▼ Learning high-capacity machine learning models for perception, especially for high-dimensional inputs such as in computer vision, requires a large amount of human-annotated data. Many efforts have been made to construct such large-scale, annotated datasets. However, there are few options for transferring knowledge from those datasets to tasks with different categories, which limits the value of these efforts. While one common option for transfer is reusing a learned feature representation, other options for reusing supervision across tasks are generally not considered, due to the tight association between labels and tasks. This thesis proposes an intermediate form of supervision, pairwise similarity, to enable the transfer of supervision across categorization tasks that have different sets of classes. We show that pairwise similarity, defined as whether two pieces of data have the same semantic meaning or not, is sufficient as the primary supervision for learning categorization problems such as clustering and classification. We investigate this idea by answering two transfer learning questions: how and when to transfer. We develop two loss functions answering how to transfer, and show that the same framework supports supervised, unsupervised, and semi-supervised learning paradigms, demonstrating better performance than previous methods. This result makes it possible to discover unseen categories in unlabeled data by transferring a learned pairwise-similarity prediction function. Additionally, we provide a decomposed confidence strategy answering when to transfer, achieving state-of-the-art results on out-of-distribution data detection. Lastly, we apply our loss function to instance segmentation, demonstrating the scalability of our method within a real-world problem.
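A pairwise-similarity loss of the kind described can be written compactly: treat the inner product of two samples' class-probability vectors as the predicted probability that they share a class, and apply binary cross-entropy against the pairwise label. This is a simplified sketch of the idea, not necessarily the exact loss functions developed in the thesis:

```python
import numpy as np

def pairwise_similarity_loss(p, q, same, eps=1e-12):
    """Binary cross-entropy on the inner product of two class-probability
    vectors p and q; the inner product estimates P(same class)."""
    s = float(np.dot(p, q))
    s = min(max(s, eps), 1.0 - eps)          # numerical safety
    return -(same * np.log(s) + (1 - same) * np.log(1.0 - s))

# Two confident, agreeing predictions: the loss is low if the pair is
# labelled "similar" and high if it is labelled "dissimilar".
p = np.array([0.90, 0.05, 0.05])
q = np.array([0.85, 0.10, 0.05])
print(pairwise_similarity_loss(p, q, same=1) <
      pairwise_similarity_loss(p, q, same=0))
```

Note that the loss never needs the class labels themselves, only the binary pairwise label, which is what makes the supervision transferable across tasks with different class sets.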
Advisors/Committee Members: Kira, Zsolt (advisor), Vela, Patricio (committee member), Batra, Dhruv (committee member), Hoffman, Judy (committee member), Odom, Phillip (committee member).
Subjects/Keywords: Transfer learning; Pairwise similarity; Clustering; Deep learning; Neural networks; Classification; Out-of-distribution detection; Instance segmentation; Lane detection
APA (6th Edition):
Hsu, Y. (2020). Learning from pairwise similarity for visual categorization. (Doctoral Dissertation). Georgia Tech. Retrieved from http://hdl.handle.net/1853/62814
Chicago Manual of Style (16th Edition):
Hsu, Yen-Chang. “Learning from pairwise similarity for visual categorization.” 2020. Doctoral Dissertation, Georgia Tech. Accessed January 19, 2021.
http://hdl.handle.net/1853/62814.
MLA Handbook (7th Edition):
Hsu, Yen-Chang. “Learning from pairwise similarity for visual categorization.” 2020. Web. 19 Jan 2021.
Vancouver:
Hsu Y. Learning from pairwise similarity for visual categorization. [Internet] [Doctoral dissertation]. Georgia Tech; 2020. [cited 2021 Jan 19].
Available from: http://hdl.handle.net/1853/62814.
Council of Science Editors:
Hsu Y. Learning from pairwise similarity for visual categorization. [Doctoral Dissertation]. Georgia Tech; 2020. Available from: http://hdl.handle.net/1853/62814
12.
Cheng, Hsien Ting.
Unsupervised video segmentation and its application to activity recognition.
Degree: PhD, 1200, 2015, University of Illinois – Urbana-Champaign
URL: http://hdl.handle.net/2142/72891
▼ We addressed the fundamental problems of computer vision, segmentation and recognition, in the space-time domain. Knowing that generic image segmentation introduces unstable regions due to illumination, compression, etc., we utilized temporal information to achieve consistent 3D video segmentation. By exploiting non-local structure in both the spatial and temporal domains, the instabilities of the segmented regions were alleviated. A segmentation tree was built within every frame, and label consistency was enforced within each subtree (i.e., spatial clique). By roughly tracking 2D regions across frames, a temporal clique was built in which label consistency was enforced as well. A high-order (more than binary) Conditional Random Field (CRF) was designed and solved efficiently. Experimental results demonstrate high-quality segmentation both quantitatively and qualitatively.
Taking segmented 3D regions, called tubes, as input, we developed an activity recognition framework not only to determine which activity exists in a video but also to locate where it happens. A robust tube feature was extracted with photometric and shape-dynamics information. Activity was described with a Parts Activity Model (PAM), with a root template and a four-part template under the root. Given the nature of the activity recognition problem, in which only some parts of the video determine the activity label, we used Multiple Instance Learning (MIL) to formulate the problem. Latent variables included the tube index and the part locations under the root template. Experiments were conducted on three well-known datasets and state-of-the-art results were achieved.
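The MIL formulation above can be reduced to its core: a video is a bag of tube instances, and the bag-level decision is driven by the best-scoring instance, which simultaneously localises the activity. A minimal sketch (function names are ours, not the thesis's):

```python
import numpy as np

def mil_bag_score(instance_scores):
    """Multiple Instance Learning bag score: a bag (video) is positive if
    at least one instance (tube) is positive, so the bag score is the max
    over instance scores."""
    return float(np.max(instance_scores))

def mil_bag_label(instance_scores, threshold=0.5):
    """Predicted bag label plus the index of the instance that decided it,
    which localises where the activity happens in the video."""
    i = int(np.argmax(instance_scores))
    return mil_bag_score(instance_scores) > threshold, i
```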
Advisors/Committee Members: Ahuja, Narendra (advisor), Ahuja, Narendra (Committee Chair), Forsyth, David A. (committee member), Hasegawa-Johnson, Mark A. (committee member), Huang, Thomas (committee member).
Subjects/Keywords: segmentation; Video segmentation; Unsupervised clustering; Activity recognition; Multiple instance learning
APA (6th Edition):
Cheng, H. T. (2015). Unsupervised video segmentation and its application to activity recognition. (Doctoral Dissertation). University of Illinois – Urbana-Champaign. Retrieved from http://hdl.handle.net/2142/72891
Chicago Manual of Style (16th Edition):
Cheng, Hsien Ting. “Unsupervised video segmentation and its application to activity recognition.” 2015. Doctoral Dissertation, University of Illinois – Urbana-Champaign. Accessed January 19, 2021.
http://hdl.handle.net/2142/72891.
MLA Handbook (7th Edition):
Cheng, Hsien Ting. “Unsupervised video segmentation and its application to activity recognition.” 2015. Web. 19 Jan 2021.
Vancouver:
Cheng HT. Unsupervised video segmentation and its application to activity recognition. [Internet] [Doctoral dissertation]. University of Illinois – Urbana-Champaign; 2015. [cited 2021 Jan 19].
Available from: http://hdl.handle.net/2142/72891.
Council of Science Editors:
Cheng HT. Unsupervised video segmentation and its application to activity recognition. [Doctoral Dissertation]. University of Illinois – Urbana-Champaign; 2015. Available from: http://hdl.handle.net/2142/72891

Linköping University
13.
Fritz, Karin.
Instance Segmentation of Buildings in Satellite Images.
Degree: Computer Vision, 2020, Linköping University
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-164537
▼ When creating a photo-realistic 3D model of the world from satellite imagery, image classification is an important part of the process. In this thesis, the specific sub-task of automated building extraction is investigated, by comparing the performance of instance segmentation and semantic segmentation for the extraction of building footprints in orthorectified imagery. Semantic segmentation is solved using U-net, a fully convolutional network that outputs a pixel-wise segmentation of the image. Instance segmentation is done by a network called Mask R-CNN. The performance of the models is measured using precision, recall, and the F1 score, the harmonic mean of precision and recall. The resulting F1 scores of the two methods are similar: U-net achieves an F1 score of 0.684 without any post-processing, and Mask R-CNN achieves 0.676 without post-processing.
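The F1 score referenced above is the harmonic mean of precision and recall; for example (the precision/recall inputs here are illustrative, not values from the thesis):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

print(round(f1_score(0.7, 0.6), 4))  # -> 0.6462
```

The harmonic mean punishes imbalance: a model with high precision but low recall (or vice versa) scores well below the arithmetic mean of the two.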
Subjects/Keywords: cnn; convolutional neural networks; instance segmentation; semantic segmentation; Signal Processing; Signalbehandling; Computer Vision and Robotics (Autonomous Systems); Datorseende och robotik (autonoma system)
APA (6th Edition):
Fritz, K. (2020). Instance Segmentation of Buildings in Satellite Images. (Thesis). Linköping University. Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-164537
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Fritz, Karin. “Instance Segmentation of Buildings in Satellite Images.” 2020. Thesis, Linköping University. Accessed January 19, 2021.
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-164537.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Fritz, Karin. “Instance Segmentation of Buildings in Satellite Images.” 2020. Web. 19 Jan 2021.
Vancouver:
Fritz K. Instance Segmentation of Buildings in Satellite Images. [Internet] [Thesis]. Linköping University; 2020. [cited 2021 Jan 19].
Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-164537.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Fritz K. Instance Segmentation of Buildings in Satellite Images. [Thesis]. Linköping University; 2020. Available from: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-164537
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
14.
Tsogkas, Stavros.
Mid-level representations for modeling objects : Représentations de niveau intermédiaire pour la modélisation d'objets.
Degree: Docteur es, Mathématiques et informatique, 2016, Université Paris-Saclay (ComUE)
URL: http://www.theses.fr/2016SACLC012
▼ In this thesis we propose the use of mid-level representations, and in particular i) medial axes, ii) object parts, and iii) convolutional features, for modelling objects. The first part of the thesis deals with detecting medial axes in natural colour images. We adopt a learning approach, utilizing colour, texture, and spectral-clustering features to build a classifier that produces a dense probability map for symmetry. Multiple Instance Learning (MIL) allows us to treat scale and orientation as latent variables during training, while a variant based on random forests offers significant gains in computation time. In the second part of the thesis we address object modelling using deformable part models (DPMs). We develop a hierarchical coarse-to-fine approach that uses probabilistic bounds to reduce the computational cost of HOG-based models with a large number of components. These efficiently computed probabilistic bounds allow us to quickly discard large parts of the image and to evaluate the convolutional filters precisely only at promising locations. Our approach achieves a 4-5x speed-up over the naive approach, with minimal loss in performance. We also employ convolutional neural networks (CNNs) to improve object detection: using a commonly adopted CNN architecture, we extract the responses of the final convolutional layer, integrate these responses into the classical DPM architecture in place of the hand-crafted HOG descriptors, and observe a significant increase in detection performance (~14.5% mAP). In the last part of the thesis we experiment with fully convolutional neural networks for object-part segmentation. We adapt a state-of-the-art CNN to perform fine semantic segmentation of object parts and use a fully-connected CRF as a post-processing step to obtain sharp boundaries. We also introduce a shape prior, learned from ground-truth segmentations, using a Restricted Boltzmann Machine (RBM). Finally, we design a new fully convolutional architecture and train it on brain magnetic resonance imaging data to segment the different parts of the human brain. Our approach achieves state-of-the-art results on both types of data.
Advisors/Committee Members: Kokkinos, Iasonas (thesis director).
Subjects/Keywords: Axes médians; Parties d' objets; Réseaux de neurones convolutionnels; Modèles de parties déformables; Segmentation semantique; Multiple instance learning; Medial axis; Object parts; Convolutional Neural Networks; Deformable part models; Semantic segmentation; Multiple instance learning
APA (6th Edition):
Tsogkas, S. (2016). Mid-level representations for modeling objects : Représentations de niveau intermédiaire pour la modélisation d'objets. (Doctoral Dissertation). Université Paris-Saclay (ComUE). Retrieved from http://www.theses.fr/2016SACLC012
Chicago Manual of Style (16th Edition):
Tsogkas, Stavros. “Mid-level representations for modeling objects : Représentations de niveau intermédiaire pour la modélisation d'objets.” 2016. Doctoral Dissertation, Université Paris-Saclay (ComUE). Accessed January 19, 2021.
http://www.theses.fr/2016SACLC012.
MLA Handbook (7th Edition):
Tsogkas, Stavros. “Mid-level representations for modeling objects : Représentations de niveau intermédiaire pour la modélisation d'objets.” 2016. Web. 19 Jan 2021.
Vancouver:
Tsogkas S. Mid-level representations for modeling objects : Représentations de niveau intermédiaire pour la modélisation d'objets. [Internet] [Doctoral dissertation]. Université Paris-Saclay (ComUE); 2016. [cited 2021 Jan 19].
Available from: http://www.theses.fr/2016SACLC012.
Council of Science Editors:
Tsogkas S. Mid-level representations for modeling objects : Représentations de niveau intermédiaire pour la modélisation d'objets. [Doctoral Dissertation]. Université Paris-Saclay (ComUE); 2016. Available from: http://www.theses.fr/2016SACLC012
15.
Grard, Matthieu.
Generic instance segmentation for object-oriented bin-picking : Segmentation en instances génériques pour le dévracage orienté objet.
Degree: Docteur es, Informatique, 2019, Lyon
URL: http://www.theses.fr/2019LYSEC015
► Robotic bin-picking is a fast-growing industrial task that aims to automate the unit-by-unit unloading of a pile of bulk object instances in order to ease downstream…
(more)
▼ Robotic bin-picking is a fast-growing industrial task that aims to automate the unit-by-unit unloading of a pile of bulk object instances in order to ease downstream processing such as kitting or component assembly. However, an explicit object model is often unavailable in many industrial sectors, notably the food and automotive industries, and object instances can exhibit intra-class variations, for example due to elastic deformations. Pose estimation techniques, which require an explicit model and assume rigid transformations, are therefore not applicable in such contexts. The alternative approach consists in detecting grasps without an explicit notion of object, which severely penalizes bin-picking when the instances are heavily entangled. These approaches also rely on a multi-view reconstruction of the scene, which is difficult, for example, with shiny or transparent food packaging, or which critically reduces the remaining cycle time in high-throughput applications.
In collaboration with Siléane, a French industrial robotics company, the goal of this work is therefore to develop a learning-based solution for localizing the most graspable instances of a pile from a single image, in open loop, without explicit object models. In the context of industrial bin-picking, our contribution is twofold.
First, we propose a new fully convolutional network (FCN) to delineate instances and to infer a spatial order at their boundaries. Indeed, state-of-the-art methods for this task rely on two independent streams, for boundaries and occlusions respectively, whereas occlusions are often a source of boundaries. More precisely, the current approach, which isolates instances in boxes before detecting boundaries and occlusions, proves ill-suited to bin-picking scenarios insofar as a rectangular region often contains several instances. By contrast, our architecture, which requires no prior region detection, finely detects the boundaries between instances, as well as the corresponding occluding edge, from a unified representation of the scene.
Second, since FCNs require large training sets that are not available in bin-picking applications, we propose a simulation-based procedure for generating training images from physics and rendering engines. More precisely, piles of instances are simulated and rendered, with the corresponding annotations, from sets of texture images and meshes to which multiple random deformations are applied. We show that the proposed synthetic data are plausible for real-world applications in the sense that they allow the learning of deep representations that transfer to real data. Through…
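As a toy illustration of the kind of supervision such a simulator can produce (this is not the thesis's actual pipeline, and all names below are ours), the sketch composites binary instance masks with a known stacking order into a visible-instance label map, then marks instance boundaries and the subset of boundary pixels where one instance occludes another:

```python
import numpy as np

def render_label_map(masks):
    """Composite binary instance masks, given top-of-pile first, into a
    visible-instance label map (background = 0, instance i = i + 1).
    In simulation the stacking order is known, so this labeling is free."""
    label = np.zeros(masks[0].shape, dtype=int)
    for i, mask in enumerate(masks):
        label[(label == 0) & mask] = i + 1
    return label

def boundary_and_occlusion(label):
    """Mark pixels where the label changes between 4-neighbors (instance
    boundaries), and the subset where both sides are foreground, i.e.
    where one instance occludes another rather than meeting background."""
    boundary = np.zeros(label.shape, dtype=bool)
    occluding = np.zeros(label.shape, dtype=bool)
    diff_x = label[:, 1:] != label[:, :-1]   # label changes along rows
    diff_y = label[1:, :] != label[:-1, :]   # label changes along columns
    boundary[:, 1:] |= diff_x
    boundary[1:, :] |= diff_y
    fg_x = (label[:, 1:] > 0) & (label[:, :-1] > 0)
    fg_y = (label[1:, :] > 0) & (label[:-1, :] > 0)
    occluding[:, 1:] |= diff_x & fg_x
    occluding[1:, :] |= diff_y & fg_y
    return boundary, occluding
```

With two overlapping squares, the overlap keeps the top instance's label, and only the border shared by the two instances is flagged as occluding; borders against the background are plain boundaries.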
Advisors/Committee Members: Chen, Liming (thesis director), Dellandréa, Emmanuel (thesis director).
Subjects/Keywords: Computer vision; Robotic bin-picking; Deep learning; Instance segmentation; Occlusion detection; Fully convolutional networks; Synthetic training data
APA (6th Edition):
Grard, M. (2019). Generic instance segmentation for object-oriented bin-picking : Segmentation en instances génériques pour le dévracage orienté objet. (Doctoral Dissertation). Lyon. Retrieved from http://www.theses.fr/2019LYSEC015
Chicago Manual of Style (16th Edition):
Grard, Matthieu. “Generic instance segmentation for object-oriented bin-picking : Segmentation en instances génériques pour le dévracage orienté objet.” 2019. Doctoral Dissertation, Lyon. Accessed January 19, 2021.
http://www.theses.fr/2019LYSEC015.
MLA Handbook (7th Edition):
Grard, Matthieu. “Generic instance segmentation for object-oriented bin-picking : Segmentation en instances génériques pour le dévracage orienté objet.” 2019. Web. 19 Jan 2021.
Vancouver:
Grard M. Generic instance segmentation for object-oriented bin-picking : Segmentation en instances génériques pour le dévracage orienté objet. [Internet] [Doctoral dissertation]. Lyon; 2019. [cited 2021 Jan 19].
Available from: http://www.theses.fr/2019LYSEC015.
Council of Science Editors:
Grard M. Generic instance segmentation for object-oriented bin-picking : Segmentation en instances génériques pour le dévracage orienté objet. [Doctoral Dissertation]. Lyon; 2019. Available from: http://www.theses.fr/2019LYSEC015

University of Illinois – Urbana-Champaign
16.
Akbas, Emre.
Generation and analysis of segmentation trees for natural images.
Degree: PhD, 1200, 2011, University of Illinois – Urbana-Champaign
URL: http://hdl.handle.net/2142/26317
► This dissertation is about extracting as well as making use of the structure and hierarchy present in images. We develop a new low-level, multiscale, hierarchical…
(more)
▼ This dissertation is about extracting as well as making use of the structure and hierarchy present in images. We develop a new low-level, multiscale, hierarchical image segmentation algorithm designed to detect image regions regardless of their shapes, sizes, and levels of interior homogeneity. We model a region as a connected set of pixels that is surrounded by ramp edge discontinuities where the magnitude of these discontinuities is large compared to the variation inside the region. Each region is associated with a scale depending on the magnitude of the weakest part of its boundary. Traversing through the range of all possible scales, we obtain all regions present in the image. Regions strictly merge as the scale increases; hence a tree is formed where the root node corresponds to the whole image, and nodes close to the root along a path are large, while their children nodes are smaller and capture embedded details.
To evaluate the accuracy and precision of our algorithm, as well as to compare it to the existing algorithms, we develop a new benchmark dataset for low-level image segmentation. In this benchmark, small patches of many images are hand-segmented by human subjects. We provide evaluation methods for both boundary-based and region-based performance of algorithms. We show that our proposed algorithm performs better than the existing low-level segmentation algorithms on this benchmark.
Next, we investigate the segmentation-based statistics of natural images. Such statistics capture geometric and topological properties of images, which is not possible to obtain using pixel-, patch-, or subband-based methods. We compile and use segmentation statistics from a large number of images, and propose a Markov random field based model for estimating them. Our estimates confirm some of the previous statistical properties of natural images as well as yield new ones. To demonstrate the value of the statistics, we successfully use them as priors in image classification and semantic image segmentation.
We also investigate the importance of different visual cues to describe image regions for solving the region correspondence problem. We design and develop psychophysical experiments to learn the weights of different cues by evaluating their impact on binocular fusibility by human subjects. Using a head-mounted display, we show a set of elliptical regions to one eye and slightly different versions of the same set of regions to the other eye of human subjects. We then ask them whether the ellipses fuse or not. By systematically varying the parameters of the elliptical shapes, and testing for fusion, we learn a perceptual distance function between two elliptical regions. We evaluate this function on ground-truth stereo image pairs.
Finally, we propose a novel multiple instance learning (MIL) method. In MIL, in contrast to classical supervised learning, the entities to be classified are called bags, each of which contains an arbitrary number of elements called
…
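The scale-indexed merging described in the abstract behaves like a single-linkage dendrogram over regions: merging the weakest boundaries first yields a strict hierarchy ending at a root that covers the whole image. A toy reconstruction with a union-find structure (not the dissertation's algorithm; names are ours) might look like this:

```python
class SegmentationTree:
    """Toy scale-indexed merge tree: leaf regions merge in order of
    increasing boundary strength, each merge creating a parent node,
    until a single root covers everything. Regions strictly merge as
    scale increases, so the result is a tree."""

    def __init__(self, n_leaves):
        self.parent = list(range(n_leaves))  # union-find over leaf ids
        self.node = list(range(n_leaves))    # tree node for each set root
        self.children = {}                   # internal node -> (left, right)
        self.scale = {}                      # internal node -> merge scale
        self.next_id = n_leaves

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def merge(self, a, b, scale):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return None                      # already the same region
        new = self.next_id
        self.next_id += 1
        self.children[new] = (self.node[ra], self.node[rb])
        self.scale[new] = scale
        self.parent[rb] = ra
        self.node[ra] = new
        return new

def build_tree(n_regions, boundary_edges):
    """boundary_edges: (strength, region_a, region_b) triples. Merging
    the weakest boundary first gives the hierarchy, as in single linkage."""
    tree = SegmentationTree(n_regions)
    for strength, a, b in sorted(boundary_edges):
        tree.merge(a, b, strength)
    return tree
```

Cutting this tree at any scale threshold recovers the segmentation at that scale, which is the sense in which one traversal yields all regions present in the image.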
Advisors/Committee Members: Ahuja, Narendra (advisor), Ahuja, Narendra (Committee Chair), Huang, Thomas S. (committee member), Forsyth, David A. (committee member), Hasegawa-Johnson, Mark A. (committee member), Hoiem, Derek W. (committee member).
Subjects/Keywords: computer vision; image processing; image segmentation; machine learning; segmentation benchmark; natural image statistics; image classification; scene classification; binocular fusion; region matching; multiple instance learning (mil); mis-boost; pascal voc; Pattern Analysis Statistical Modeling and Computational Learning (PASCAL); Visual Object Classes (VOC)
APA (6th Edition):
Akbas, E. (2011). Generation and analysis of segmentation trees for natural images. (Doctoral Dissertation). University of Illinois – Urbana-Champaign. Retrieved from http://hdl.handle.net/2142/26317
Chicago Manual of Style (16th Edition):
Akbas, Emre. “Generation and analysis of segmentation trees for natural images.” 2011. Doctoral Dissertation, University of Illinois – Urbana-Champaign. Accessed January 19, 2021.
http://hdl.handle.net/2142/26317.
MLA Handbook (7th Edition):
Akbas, Emre. “Generation and analysis of segmentation trees for natural images.” 2011. Web. 19 Jan 2021.
Vancouver:
Akbas E. Generation and analysis of segmentation trees for natural images. [Internet] [Doctoral dissertation]. University of Illinois – Urbana-Champaign; 2011. [cited 2021 Jan 19].
Available from: http://hdl.handle.net/2142/26317.
Council of Science Editors:
Akbas E. Generation and analysis of segmentation trees for natural images. [Doctoral Dissertation]. University of Illinois – Urbana-Champaign; 2011. Available from: http://hdl.handle.net/2142/26317

Penn State University
17.
Zhao, Qi.
Mixture Model Learning with Instance-level Constraints.
Degree: 2008, Penn State University
URL: https://submit-etda.libraries.psu.edu/catalog/6605
► Machine learning traditionally includes two categories of methods: supervised learning and unsupervised learning. In recent years, one paradigm, semi-supervised learning, has attracted much more interest…
(more)
▼ Machine learning traditionally includes two categories of methods: supervised learning and unsupervised learning. In recent years, one paradigm, semi-supervised learning, has attracted much more interest due to large data sets with only partial label information from different domains such as web search, text classification, and machine vision. However, much prior knowledge or problem-specific information cannot be expressed with labels and hence cannot be used by existing semi-supervised learning methods. In other words, we require extensions of semi-supervised learning that can encode these types of auxiliary information; of particular interest is information that can be encoded as instance-level constraints.
This dissertation presents a mixture model-based method able to make use of domain-specific information to improve clustering. The domain-specific information is cast in the form of instance-level constraints, i.e., pairwise sample constraints. Most prior work on semi-supervised clustering with constraints assumes the number of classes is known, with each learned cluster assumed to be a class and hence subject to the given class constraints. When the number of classes is unknown or when the "one-cluster-per-class" assumption is not valid, the use of constraints may actually be deleterious to learning the ground-truth data groups. The proposed method addresses this by 1) allowing allocation of multiple mixture components to individual classes and 2) estimating both the number of components and the number of classes. This method also addresses new class discovery, with components void of constraints treated as putative unknown classes. The improvement in clustering performance by this method for the case of partially labeled data sets is also illustrated. We also explore discriminative learning with instance-level constraints in this dissertation. Our proposed discriminative method assumes a family of discriminant functions specified with a set of parameters, and uses the minimum relative entropy principle to find a distribution over these parameters. The final discriminant decision rule is obtained by averaging over all classifiers.
The second major contribution of this dissertation is to image segmentation. The domain-specific information in images is spatial continuity, which can also be converted into instance-level constraints. After applying the proposed mixture-model-based method with constraints to image segmentation, we obtain a standard Markov random field potential objective function. Due to the structure of the constraints, a sequence-based forward/backward algorithm, i.e., a novel structured mean-field method, is presented. Better performance is obtained than with a standard mean-field annealing algorithm.
An investigation of model selection techniques is another contribution in this dissertation. In the proposed mixture-model-based method, the cluster number is determined with model selection. We provide an integrated learning and model selection framework, which performs batch optimization over…
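For intuition, pairwise instance-level constraints can be enforced with a simple COP-KMeans-style feasibility check: a much cruder device than the thesis's mixture-model formulation, shown here only to make "must-link" and "cannot-link" constraints concrete (function names are ours):

```python
def violates(point, cluster, assignment, must_link, cannot_link):
    """True if assigning `point` to `cluster` breaks a pairwise
    constraint, given the partial assignment built so far."""
    for a, b in must_link:
        other = b if a == point else a if b == point else None
        if other is not None and other in assignment and assignment[other] != cluster:
            return True          # must-link partner sits in another cluster
    for a, b in cannot_link:
        other = b if a == point else a if b == point else None
        if other is not None and assignment.get(other) == cluster:
            return True          # cannot-link partner already in this cluster
    return False

def assign_with_constraints(dists, must_link, cannot_link):
    """Greedy single pass: each point goes to its nearest cluster that
    respects the constraints. dists[i][k] is the distance of point i to
    cluster k; a point stays unassigned if every cluster violates."""
    assignment = {}
    for i, row in enumerate(dists):
        for k in sorted(range(len(row)), key=row.__getitem__):
            if not violates(i, k, assignment, must_link, cannot_link):
                assignment[i] = k
                break
    return assignment
```

In a small example, a must-link pair pulls a point into a cluster other than its nearest one, while a cannot-link pair pushes a point away from an otherwise-closer cluster, which is exactly the behavior that can become harmful when the one-cluster-per-class assumption fails.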
Advisors/Committee Members: David Jonathan Miller, Committee Chair/Co-Chair, Constantino Manuel Lagoa, Committee Member, George Kesidis, Committee Member, Jia Li, Committee Member.
Subjects/Keywords: semi-supervised learning with instance-level const; mixture modeling with constraints; constrained clustering; image segmentation with background information
APA (6th Edition):
Zhao, Q. (2008). Mixture Model Learning with Instance-level Constraints. (Thesis). Penn State University. Retrieved from https://submit-etda.libraries.psu.edu/catalog/6605
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Zhao, Qi. “Mixture Model Learning with Instance-level Constraints.” 2008. Thesis, Penn State University. Accessed January 19, 2021.
https://submit-etda.libraries.psu.edu/catalog/6605.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Zhao, Qi. “Mixture Model Learning with Instance-level Constraints.” 2008. Web. 19 Jan 2021.
Vancouver:
Zhao Q. Mixture Model Learning with Instance-level Constraints. [Internet] [Thesis]. Penn State University; 2008. [cited 2021 Jan 19].
Available from: https://submit-etda.libraries.psu.edu/catalog/6605.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Zhao Q. Mixture Model Learning with Instance-level Constraints. [Thesis]. Penn State University; 2008. Available from: https://submit-etda.libraries.psu.edu/catalog/6605
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
18.
Zhuo, Wei.
2D+3D Indoor Scene Understanding from a Single Monocular Image
.
Degree: 2018, Australian National University
URL: http://hdl.handle.net/1885/144616
► Scene understanding, as a broad field encompassing many subtopics, has gained great interest in recent years. Among these subtopics, indoor scene understanding, having its own…
(more)
▼ Scene understanding, as a broad field encompassing many subtopics, has gained great interest in recent years. Among these subtopics, indoor scene understanding, having its own specific attributes and challenges compared to outdoor scene understanding, has drawn a lot of attention. It has potential applications in a wide variety of domains, such as robotic navigation, object grasping for personal robotics, and augmented reality. To our knowledge, existing research for indoor scenes typically makes use of depth sensors, such as the Kinect, which are, however, not always available.
In this thesis, we focused on addressing indoor scene understanding tasks in the general case, where only a monocular color image of the scene is available. Specifically, we first studied the problem of estimating a detailed depth map from a monocular image. Then, benefiting from deep-learning-based depth estimation, we tackled the higher-level tasks of 3D box proposal generation, and scene parsing with instance segmentation, semantic labeling and support relationship inference from a monocular image. Our research on indoor scene understanding provides a comprehensive scene interpretation at various perspectives and scales.
For monocular image depth estimation, previous approaches are limited in that they only reason about depth locally on a single scale, and do not utilize the important information of geometric scene structures. Here, we developed a novel graphical model, which reasons about detailed depth while leveraging geometric scene structures at multiple scales.
For 3D box proposals, to our best knowledge, our approach constitutes the first attempt to reason about class-independent 3D box proposals from a single monocular image. To this end, we developed a novel integrated, differentiable framework that estimates depth, extracts a volumetric scene representation and generates 3D proposals. At the core of this framework lies a novel residual, differentiable truncated signed distance function module, which is able to handle the relatively low accuracy of the predicted depth map.
For scene parsing, we tackled its three subtasks of instance segmentation, semantic labeling, and the support relationship inference on instances. Existing work typically reasons about these individual subtasks independently. Here, we leverage the fact that they bear strong connections, which can facilitate addressing these subtasks if modeled properly. To this end, we developed an integrated graphical model that reasons about the mutual relationships of the above subtasks.
In summary, in this thesis, we introduced novel and effective methodologies for each of three indoor scene understanding tasks, i.e., depth estimation, 3D box proposal generation, and scene parsing, and exploited the dependencies on depth estimates of the
…
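A plain, non-learned version of a truncated signed distance function computed from a depth map might look like the sketch below; the thesis's actual module is residual and differentiable, and the names here are ours:

```python
import numpy as np

def tsdf_from_depth(depth, sample_depths, trunc=0.5):
    """Per-pixel truncated signed distance along each camera ray:
    positive in front of the observed surface, negative behind it,
    clipped to [-trunc, trunc]. `depth` is an HxW depth map and
    `sample_depths` lists the depths at which the volume is sampled."""
    z = np.asarray(sample_depths, dtype=float)[:, None, None]
    sdf = depth[None, :, :] - z           # signed distance to the surface
    return np.clip(sdf, -trunc, trunc)    # truncation bounds the values
```

Truncation is what makes the representation tolerant of an imprecise depth map: far from the surface, all values saturate at ±trunc, so only a narrow band around the predicted surface carries fine-grained (and error-sensitive) distances.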
Subjects/Keywords: Scene Understanding; Monocular Image Processing; Depth Estimation; 3D Box Proposal; Semantic Labeling; Instance Segmentation; Support Relationship Inference
APA (6th Edition):
Zhuo, W. (2018). 2D+3D Indoor Scene Understanding from a Single Monocular Image
. (Thesis). Australian National University. Retrieved from http://hdl.handle.net/1885/144616
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Zhuo, Wei. “2D+3D Indoor Scene Understanding from a Single Monocular Image
.” 2018. Thesis, Australian National University. Accessed January 19, 2021.
http://hdl.handle.net/1885/144616.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
MLA Handbook (7th Edition):
Zhuo, Wei. “2D+3D Indoor Scene Understanding from a Single Monocular Image
.” 2018. Web. 19 Jan 2021.
Vancouver:
Zhuo W. 2D+3D Indoor Scene Understanding from a Single Monocular Image
. [Internet] [Thesis]. Australian National University; 2018. [cited 2021 Jan 19].
Available from: http://hdl.handle.net/1885/144616.
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Council of Science Editors:
Zhuo W. 2D+3D Indoor Scene Understanding from a Single Monocular Image
. [Thesis]. Australian National University; 2018. Available from: http://hdl.handle.net/1885/144616
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation