You searched for subject:(Semantic Segmentation)
Showing records 1 – 30 of 174 total matches.

University of Adelaide
1.
Shen, Tong.
Context Learning and Weakly Supervised Learning for Semantic Segmentation.
Degree: 2018, University of Adelaide
URL: http://hdl.handle.net/2440/120354
This thesis focuses on one of the fundamental problems in computer vision, semantic segmentation, whose task is to predict a semantic label for each pixel of an image. Although semantic segmentation models have improved greatly thanks to the representative power of deep learning techniques, open questions remain. In this thesis, we discuss two problems in semantic segmentation: scene consistency and weakly supervised segmentation.

In the first part of the thesis, we address scene consistency. Trained models sometimes produce noisy and implausible predictions that are not semantically consistent with the scene or context. By explicitly enforcing scene consistency both locally and globally, we can narrow down the possible categories for each pixel and generate the desired prediction more easily. We address this issue by introducing a dense multi-label module. In general, multi-label classification refers to the task of assigning multiple labels to a given image; we extend the idea to different levels of the image and assign multiple labels to its regions. Dense multi-label then acts as a constraint that encourages scene consistency locally and globally.

For dense prediction problems such as semantic segmentation, training a model requires densely annotated ground truth, which demands a great amount of human annotation effort and is very time-consuming. It is therefore worth investigating semi- or weakly supervised methods that require much less supervision. In particular, weakly supervised segmentation refers to training the model using only image-level labels, while semi-supervised segmentation uses partially annotated data or a small portion of fully annotated data. In this thesis, two weakly supervised methods are proposed that require only image-level labels. The two methods share similar motivations. First, since pixel-level masks are missing in this setting, both methods estimate the missing ground truth and use it as pseudo ground truth for training. Second, both use data retrieved from the internet as auxiliary data, because web data are cheap to obtain and abundant. Despite these similarities, the methods are designed from different perspectives. The first method is motivated by the observation that, given a group of internet images belonging to the same semantic category, co-segmentation can extract their masks, yielding almost free pixel-wise training samples; these internet images and their extracted masks are used to train a mask generator that estimates the pseudo ground truth for the training images. The second method is designed as a bi-directional framework between the target domain and the web domain. The term “bi-directional” refers to the concept that the…
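The dense multi-label idea above can be illustrated as a post-hoc constraint: a region-level multi-label prediction restricts which classes each pixel may take. The grid, scores, and threshold below are hypothetical toy values, not taken from the thesis:

```python
# Sketch: dense multi-label as a local consistency constraint.
# A region-level label set is predicted, and per-pixel argmax is
# restricted to classes inside that set.

def region_label_set(scores, threshold=0.5):
    """Multi-label prediction for a region: max-pool per-pixel class
    scores and keep every class whose pooled score clears the threshold."""
    n_classes = len(scores[0])
    pooled = [max(p[c] for p in scores) for c in range(n_classes)]
    return {c for c, s in enumerate(pooled) if s >= threshold}

def constrained_argmax(pixel_scores, allowed):
    """Pick the best class for a pixel among the allowed set only."""
    return max(allowed, key=lambda c: pixel_scores[c])

# Toy region of four pixels over three classes (e.g. road, car, sky).
region = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.7, 0.1],
    [0.45, 0.2, 0.35],  # noisy pixel: class 2 is implausible in this region
]
allowed = region_label_set(region)                      # class 2 never dominates
labels = [constrained_argmax(p, allowed) for p in region]
```

The noisy fourth pixel cannot be assigned the implausible class, because the region-level label set has already excluded it.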
Advisors/Committee Members: Shen, Chunhua (advisor), School of Computer Science (school).
Subjects/Keywords: weakly supervised learning; semantic segmentation

Delft University of Technology
2.
Bai, Qian (author).
Semantic Segmentation of AHN3 Point Clouds with DGCNN.
Degree: 2020, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:492d2981-35ea-4cff-bc5a-eb75d06fc2dc
Semantic segmentation of aerial point clouds with high accuracy is significant for many geographical applications, but it is not trivial, since the data are massive and unstructured. In the past few years, deep learning approaches designed for 3D point cloud data have made great progress. Pointwise neural networks, such as PointNet and its extensions, have shown their ability to process 3D point clouds, especially for classification and semantic segmentation. In this work, we implement DGCNN (Dynamic Graph CNN), which combines PointNet with graph CNNs, and extend its semantic segmentation application from indoor scenes to an aerial point cloud dataset: the Current Elevation File of the Netherlands (AHN), produced by airborne laser scanning for the whole of the Netherlands. Point clouds from the third iteration, AHN3, are classified into four classes: ground, building, water, and others (including vegetation, railways, etc.). Moreover, DGCNN splits the input point cloud into regular blocks before operating on it and processes each block independently, which limits the effective range (receptive field) of the network. Thus, the second aim of this work is to investigate the impact of the effective range on the performance of DGCNN by adjusting two crucial parameters: the block size and the neighborhood size k in the k-NN graphs. It turns out that enlarging the block size or k helps to improve the overall accuracy of DGCNN, but does not guarantee better segmentation for every individual class. With a block size of 50 m and k=20, the most balanced F1 scores across all classes and an overall accuracy of 93.28% are achieved. Based on the evaluation of each setting of block size and k, we further improve the overall accuracy to 93.51% by combining smaller-scale (block size 30 m) and larger-scale (block size 50 m) segmentation results, with k=20.
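The block splitting and k-NN graph construction that determine DGCNN's effective range can be sketched as follows; the coordinates, block size, and k below are illustrative only, not AHN3 data:

```python
from collections import defaultdict

def split_into_blocks(points, block_size):
    """Partition a point cloud into regular square blocks in the XY plane,
    as DGCNN-style pipelines do before processing each block independently."""
    blocks = defaultdict(list)
    for p in points:
        key = (int(p[0] // block_size), int(p[1] // block_size))
        blocks[key].append(p)
    return dict(blocks)

def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i] (brute force)."""
    d2 = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    order = sorted((j for j in range(len(points)) if j != i),
                   key=lambda j: d2(points[i], points[j]))
    return order[:k]

# Four toy (x, y, z) points; a 50 m block size puts them in two blocks.
pts = [(0.0, 0.0, 1.0), (1.0, 0.5, 1.2), (60.0, 2.0, 0.9), (61.0, 1.0, 1.1)]
blocks = split_into_blocks(pts, 50.0)
nbrs = knn(pts, 0, 1)   # nearest neighbour of the first point
```

Enlarging `block_size` or `k` widens the neighborhood each point can draw context from, which is exactly the effective-range trade-off the thesis investigates.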
Geoscience and Remote Sensing
Advisors/Committee Members: Lindenbergh, R.C. (mentor), Nan, L. (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: Point Cloud; Semantic segmentation; DGCNN

University of Arizona
3.
Peng, Kuo-Shiuan.
Toward Joint Scene Understanding Using Deep Convolutional Neural Network: Object State, Depth, and Segmentation.
Degree: PhD, 2019, University of Arizona
URL: http://hdl.handle.net/10150/636671
Semantic understanding is the foundation of an intelligent system in the field of computer vision. In particular, real-time automation systems, such as robotic vision, autonomous driving, and surgical training applications, are in high demand. Models must capture the variability of scenes and their constituents (e.g., objects or depth) given limited memory and computation resources. To achieve real-time semantic understanding, we propose a series of novel methods for object state, depth, and segmentation in this dissertation. We first present a semantic object model that simplifies the object state detection process. We then propose a novel method of monocular depth estimation to retrieve 3D information effectively. Lastly, this dissertation presents a multi-task model for semantic segmentation and depth estimation. We train and verify the proposed method on two public datasets of outdoor scenes intended for autonomous driving applications. Our method achieves 60 frames per second with performance competitive with the current state of the art on the benchmark. In empirical experiments, we applied our method to a simulated laparoscopic surgical training system, the Computer Assisted Surgical Trainer (CAST), selecting one of its training tasks, the Peg Transfer Task, as the evaluation platform. In this experiment, our method demonstrated promising results for supporting a real-world application in medicine.
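A joint objective of the kind described (segmentation plus depth estimation) is typically a weighted sum of a per-pixel classification loss and a depth regression loss. The sketch below is a generic formulation with toy values, not the dissertation's exact loss:

```python
import math

def multitask_loss(seg_probs, seg_labels, depth_pred, depth_true, w_depth=0.5):
    """Generic joint loss: mean cross-entropy over pixels for segmentation
    plus a weighted mean absolute error for depth regression."""
    ce = -sum(math.log(p[y]) for p, y in zip(seg_probs, seg_labels)) / len(seg_labels)
    l1 = sum(abs(d - t) for d, t in zip(depth_pred, depth_true)) / len(depth_true)
    return ce + w_depth * l1

# Two toy pixels over three classes; depth in metres. The weight w_depth
# balancing the two tasks is a hypothetical choice.
loss = multitask_loss(
    seg_probs=[[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
    seg_labels=[0, 1],
    depth_pred=[10.0, 21.0],
    depth_true=[10.0, 20.0],
)
```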
Advisors/Committee Members: Rozenblit, Jerzy (advisor), Ditzler, Gregory (advisor), Roveda, Janet (committeemember).
Subjects/Keywords: depth estimation; semantic object; semantic segmentation; semantic understanding
4.
Zou, Wenbin.
Semantic-oriented Object Segmentation : Segmentation d'objet pour l'interprétation sémantique.
Degree: Doctorate (Docteur ès), Signal and Image Processing, 2014, Rennes, INSA
URL: http://www.theses.fr/2014ISAR0007
This thesis addresses the problems of object segmentation and semantic segmentation, which aim either to separate objects from the background or to assign a specific semantic label to each pixel of the image. We propose two approaches for object segmentation and one approach for semantic segmentation. The first approach is based on saliency detection. Motivated by our goal of object segmentation, a new saliency detection model is proposed. The approach is formulated as a low-rank matrix recovery model that exploits image structure information from a bottom-up segmentation as an important constraint. Using an iterative, joint optimization scheme, it simultaneously performs object segmentation based on the saliency map resulting from its detection and improves the saliency quality using the segmentation. An optimal saliency map and the final segmentation are obtained after several iterations. The second proposed approach to object segmentation is based on exemplar images. The underlying idea is to transfer the segmentation labels of globally and locally similar exemplars to the query image. To retrieve the best-matching exemplars, we propose a new high-level image representation, the object-oriented descriptor, which reflects both global and local image information. A discriminative predictor is then learned online from the retrieved exemplars to assign each region of the query image a foreground-membership score. These scores are integrated into an iterative Markov random field (MRF) segmentation scheme that minimizes the energy.

The semantic segmentation approach is based on a region bank and sparse representation. The region bank is a set of regions generated by multi-level segmentations, motivated by the observation that some objects can only be captured at certain levels of a hierarchical segmentation. For region description, we propose a sparse coding method that represents each local feature with several basis vectors of a learned visual dictionary and describes all the local features of a region with a single sparse histogram. A support vector machine (SVM) with multiple kernel learning is used for semantic inference. The proposed approaches are extensively evaluated on several datasets. Experiments show that they outperform state-of-the-art methods; compared to the best result in the literature, the proposed object segmentation approach improves the F-score from 63% to 68.7% on the Pascal VOC 2011 dataset.
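The sparse-histogram region descriptor described above (each local feature coded over a few visual-dictionary bases, then pooled over the region) can be sketched as follows; a simple top-m similarity assignment stands in for true sparse coding, and the dictionary and features are toy values:

```python
def sparse_code(feature, dictionary, m=2):
    """Approximate sparse coding: activate only the m dictionary bases most
    similar to the feature (dot product), weights normalized over the m bases."""
    sims = [sum(f * d for f, d in zip(feature, basis)) for basis in dictionary]
    top = sorted(range(len(dictionary)), key=lambda i: -sims[i])[:m]
    code = [0.0] * len(dictionary)
    total = sum(sims[i] for i in top) or 1.0
    for i in top:
        code[i] = sims[i] / total
    return code

def region_histogram(features, dictionary):
    """Max-pool the sparse codes of all local features in a region into a
    single sparse histogram describing the whole region."""
    codes = [sparse_code(f, dictionary) for f in features]
    return [max(c[i] for c in codes) for i in range(len(dictionary))]

# Toy 2D features and a three-atom visual dictionary.
dictionary = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hist = region_histogram([[1.0, 0.0], [0.9, 0.1]], dictionary)
```

Each region thus yields one fixed-length, mostly-zero vector, which is what the SVM with multiple kernel learning would consume.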
Advisors/Committee Members: Senhadji, Lotfi (thesis director).
Subjects/Keywords: Segmentation d’objets; Segmentation sémantique; Détection de saillance; Object segmentation; Semantic segmentation; Saliency detection; 621.367

Oregon State University
5.
Roy, Anirban.
Semantic Image Segmentation Using Domain Constraints.
Degree: PhD, 2017, Oregon State University
URL: http://hdl.handle.net/1957/61703
This dissertation addresses the problem of semantic labeling of image pixels. In the course of our work, we considered different types of semantic labels, including object classes (e.g., car, person), 3D depth values (in the range 0 to 80 meters), and affordance classes (e.g., walkable, sittable). Semantic pixel labeling is challenging because objects may appear in various poses, under partial occlusion, and against cluttered backgrounds. To address these challenges, we developed approaches under a unified research theme: incorporating domain knowledge in learning and inference. As our results show, domain knowledge helps resolve various ambiguities in semantic segmentation. We addressed the problem in supervised and weakly supervised settings, where the former provides pixel-wise ground-truth annotations in training and the latter provides ground truth only as image-level tags. Our approaches range from beam-search-based inference to deep convolutional neural networks (CNNs), and achieved state-of-the-art performance on the benchmark datasets for all types of semantic segmentation problems considered.
Our main contributions include:
1. An efficient beam-search-based inference that is guaranteed to respect domain constraints.
2. A novel deep neural architecture, the neural regression forest, which integrates decision forests with CNNs.
3. A multi-scale CNN architecture for extracting and fusing diverse mid-level visual cues, including depth maps, surface normals, and object localization.
4. Constraint-based regularized learning of a CNN, where constraints are defined as spatial relationships between objects in the domain.
5. Weakly supervised learning of CNNs using neural attention cues.
6. The first manually annotated dataset for evaluating affordance segmentation.
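The constraint-respecting beam search of contribution 1 can be sketched generically: partial label sequences are extended step by step, extensions that violate a domain constraint are pruned, and only the top-scoring candidates survive. The scores and the ordering constraint below are illustrative, not from the dissertation:

```python
def beam_search(n_steps, labels, score, constraint, beam_width=2):
    """Beam search over label sequences that keeps only extensions
    satisfying the domain constraint, so the result is always feasible."""
    beam = [((), 0.0)]
    for _ in range(n_steps):
        candidates = []
        for seq, s in beam:
            for lab in labels:
                new = seq + (lab,)
                if constraint(new):            # prune infeasible extensions
                    candidates.append((new, s + score(new)))
        beam = sorted(candidates, key=lambda x: -x[1])[:beam_width]
    return beam[0][0]

# Toy domain constraint: reading top to bottom, 'sky' may never appear
# directly below 'road'.
score = lambda seq: {"sky": 0.5, "road": 0.6}[seq[-1]]
constraint = lambda seq: not any(a == "road" and b == "sky"
                                 for a, b in zip(seq, seq[1:]))
best = beam_search(3, ["sky", "road"], score, constraint)
```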
Advisors/Committee Members: Todorovic, Sinisa (advisor), Fern, Xiaoli (committee member).
Subjects/Keywords: Computer Vision; Semantic Image Segmentation; Deep Learning

Delft University of Technology
6.
Ai, Zhiwei (author).
Semantic Segmentation of Large-scale Urban Scenes from Point Clouds.
Degree: Masters, 2019, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:a9cedaac-42ae-4cb0-9c14-67bab8e96a6d
Deep learning methods have been demonstrated to be promising for semantic segmentation of point clouds. Existing works focus on extracting informative local features from individual points and their local neighborhoods, but they neglect the general structures and latent contextual relations of the underlying shapes among points. To this end, we design geometric priors that encode the contextual relations of underlying shapes between corresponding point pairs. A geometric prior convolution operator is proposed to incorporate these contextual relations explicitly into the computation. We then construct GP-net, which combines geometric prior convolution with a backbone network. Our experiments show that geometric prior convolution improves the performance of our backbone network by up to 6.9 percent in terms of mean Intersection over Union (mIoU). We also analyze different design options for geometric prior convolution and GP-net. GP-net has been tested on the Paris-Lille-3D benchmark, where it achieves state-of-the-art performance of 74.7% mIoU.
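Mean Intersection over Union (mIoU), the metric reported above, averages the per-class ratio of intersection to union between predicted and ground-truth labels. A minimal sketch with toy labels:

```python
def mean_iou(pred, truth, n_classes):
    """mIoU over per-point labels: for each class, intersection size over
    union size, averaged over classes present in prediction or ground truth."""
    ious = []
    for c in range(n_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Toy labelling of six points over two classes (e.g. 0: ground, 1: building).
m = mean_iou(pred=[0, 0, 1, 1, 1, 0], truth=[0, 0, 1, 1, 0, 0], n_classes=2)
```

Because each class contributes equally regardless of its point count, mIoU penalizes poor performance on rare classes more than overall accuracy does.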
Mechanical Engineering
Advisors/Committee Members: Nan, Liangliang (mentor), Gavrila, Dariu (graduation committee), Lindenbergh, Roderik (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: Deep Learning; Point Clouds; Semantic Segmentation

Delft University of Technology
7.
van Ramshorst, Arjan (author).
Automatic Segmentation of Ships in Digital Images: A Deep Learning Approach.
Degree: Masters, 2018, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:55de4322-8552-4a2c-84d0-427b2891015b
Knowledge of adversaries during military missions at sea heavily influences decision making, making identification of unknown vessels an important task. Identification of surrounding vessels from visual data offers an alternative to the Automatic Identification System (AIS), the current standard in vessel identification, which can be spoofed. One visual approach relies on human expertise, identifying vessels manually with the help of a ship catalog. To minimize or potentially eliminate human error and performance limitations, there is strong interest in developing an automated vessel classification pipeline. One such pipeline, capable of classifying over 500 separate classes, is currently being developed at TNO. A crucial part of the classification pipeline is retrieving an accurate contour of a vessel from a digital image. To address this challenge, this thesis proposes a deep learning pipeline that automatically segments the vessel image into background (e.g., sky and sea) and the object of interest (a vessel). Deep learning models based on Fully Convolutional Neural Networks (FCNs) have achieved high performance on the task of semantic segmentation. Several networks, including CRF-RNN, PSPNet, DeepLab, and Mask R-CNN, are employed to establish a baseline. We focus on identifying the causes of poor or failing segmentations and aim to construct a robust network capable of handling these challenges. By sampling disturbances caused by ship distance and camera noise, augmented data sets are built to tune the networks to on-site imagery. Additionally, experiments evaluate the influence of different levels of disturbance. Previous approaches implementing the CRF-RNN network achieved top-1 and top-5 classification accuracies of 31.1% and 44.0%, respectively. Employing the DeepLab network, trained to convergence on data augmented with artificial noise, we report top-1 and top-5 accuracies of 68.9% and 88.8%, respectively. Implementing an ensemble of classifiers increases performance to 73.0% and 91.7% for top-1 and top-5 accuracy. This best result is comparable to classification with human-annotated ship silhouettes, for which accuracy is 73.4% at top 1 and 91.3% at top 5. Finally, we show that training on a collection of different levels of image disturbance yields a network that is robust to increasing disturbance while retaining performance on clean images.
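The top-1 and top-5 accuracies reported above count a sample as correct when its true class is, respectively, the single highest-scoring class or among the five highest. A minimal sketch (toy scores, with top-2 standing in for top-5 for brevity):

```python
def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, y in zip(scores, labels):
        topk = sorted(range(len(row)), key=lambda c: -row[c])[:k]
        hits += y in topk
    return hits / len(labels)

# Toy scores for three samples over four classes.
scores = [
    [0.1, 0.7, 0.1, 0.1],    # true class 1: top-1 hit
    [0.4, 0.3, 0.2, 0.1],    # true class 1: only a top-2 hit
    [0.25, 0.25, 0.3, 0.2],  # true class 3: miss even at top-2
]
labels = [1, 1, 3]
top1 = topk_accuracy(scores, labels, 1)
top2 = topk_accuracy(scores, labels, 2)
```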
Systems and Control
Advisors/Committee Members: van de Plas, Raf (mentor), Schutte, Klamer (mentor), Hellendoorn, Hans (graduation committee), Kober, Jens (graduation committee), Delft University of Technology (degree granting institution).
Subjects/Keywords: Semantic Segmentation; Deep Learning; Data Augmentation
8.
Wagh, Ameya Yatindra.
A Deep 3D Object Pose Estimation Framework for Robots with RGB-D Sensors.
Degree: MS, 2019, Worcester Polytechnic Institute
URL: etd-042419-143553 ; https://digitalcommons.wpi.edu/etd-theses/1287
The task of object detection and pose estimation has widely been addressed using template matching techniques. However, these algorithms are sensitive to outliers and occlusions, and have high latency due to their iterative nature. Recent research in computer vision and deep learning has greatly improved the robustness of these algorithms, but a major drawback remains: the models are specific to particular objects, and pose estimation depends significantly on RGB image features. Because these algorithms are trained on meticulously labeled large datasets with ground-truth object poses, they are difficult to re-train for real-world applications. To overcome this problem, we propose a two-stage pipeline of convolutional neural networks that uses RGB images to localize objects in 2D space and depth images to estimate a 6DoF pose. The pose estimation network thus learns only the geometric features of the object and is not biased by its color features. We evaluate the framework on the LINEMOD dataset, which is widely used to benchmark object pose estimation, and find the results comparable with state-of-the-art algorithms using RGB-D images. Secondly, to show the transferability of the proposed pipeline, we implement it on the ATLAS robot for a pick-and-place experiment. As the distribution of images in the LINEMOD dataset differs from that of images captured by the MultiSense sensor on ATLAS, we generate a synthetic dataset from very few real-world images captured with the MultiSense sensor and use it to train just the object detection networks used in the ATLAS robot experiment.
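Combining a 2D localization with a depth image to obtain a 3D position, as in the two-stage pipeline described, reduces to standard pinhole back-projection; the intrinsics and pixel values below are hypothetical, not those of the MultiSense sensor:

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into camera coordinates
    using the pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    z = depth
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)

# Hypothetical intrinsics; a detected object centre at pixel (400, 300)
# with 2.0 m depth read from the depth image.
x, y, z = backproject(400, 300, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

The 2D detector supplies (u, v), the depth image supplies z, and the resulting (x, y, z) feeds the pose estimate.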
Advisors/Committee Members: Michael Gennert, Advisor, Emmanuel Agu, Committee Member, Berk Calli, Committee Member.
Subjects/Keywords: Atlas robots; pose estimation; semantic segmentation
APA (6th Edition):
Wagh, A. Y. (2019). A Deep 3D Object Pose Estimation Framework for Robots with RGB-D Sensors. (Thesis). Worcester Polytechnic Institute. Retrieved from etd-042419-143553 ; https://digitalcommons.wpi.edu/etd-theses/1287

University of Ottawa
9.
Kolhatkar, Dhanvin.
Real-Time Instance and Semantic Segmentation Using Deep Learning.
Degree: 2020, University of Ottawa
URL: http://hdl.handle.net/10393/40616
▼ In this thesis, we explore the use of Convolutional Neural Networks for semantic and instance segmentation, with a focus on applying existing methods with cheaper neural networks. We modify a fast object detection architecture for the instance segmentation task, and study the concepts behind these modifications both in the simpler context of semantic segmentation and in the more difficult context of instance segmentation. Various instance segmentation branch architectures are implemented in parallel with a box prediction branch, whose results are used to crop each instance's features. We counteract the imprecision of the final box predictions and eliminate the need for bounding-box alignment by cropping with an enlarged bounding box. We report and study the performance, advantages, and disadvantages of each approach. All of our methods achieve fast speeds.
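The enlarged-box cropping mentioned above can be sketched as a single geometric step: grow the predicted box about its center before cropping, so small box errors do not cut off parts of the instance. A minimal sketch; the function name and the 1.5x factor are illustrative assumptions, not values from the thesis.

```python
def enlarge_box(box, scale, width, height):
    """Scale a box about its center and clamp it to the image bounds."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    hw = (x1 - x0) * scale / 2.0  # half-width after enlargement
    hh = (y1 - y0) * scale / 2.0  # half-height after enlargement
    return (max(0, int(cx - hw)), max(0, int(cy - hh)),
            min(width, int(cx + hw)), min(height, int(cy + hh)))

box = (40, 40, 60, 60)              # box from the detection branch
roomy = enlarge_box(box, 1.5, 128, 128)
print(roomy)  # (35, 35, 65, 65): the mask branch crops this larger region
```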
Subjects/Keywords: Instance segmentation; Semantic segmentation; Deep learning; Real-time; Mask prediction
10.
Coimbra, Danilo Barbosa.
Segmentação de cenas em telejornais: uma abordagem multimodal.
Degree: Mestrado, Ciências de Computação e Matemática Computacional, 2011, University of São Paulo
URL: http://www.teses.usp.br/teses/disponiveis/55/55134/tde-28062011-103714/
▼ This work develops a method for scene segmentation in digital video that handles semantically complex segments. As a proof of concept, we present a multimodal approach that uses a more general definition of TV news scenes, covering both scenes in which anchors appear and scenes in which no anchor appears. The results of the multimodal technique were significantly better than those of the monomodal techniques applied separately. The tests were performed on four groups of Brazilian news programs obtained from two different television stations, each with five editions, for a total of twenty newscasts.
Advisors/Committee Members: Goularte, Rudinei.
Subjects/Keywords: Multimodal scene segmentation; Multimodal video segmentation; Segmentação de cena multimodal; Segmentação de vídeo multimodal; Segmentação semântica; Semantic segmentation

Linköping University
11.
Tranell, Victor.
Semantic Segmentation of Oblique Views in a 3D-Environment.
Degree: Computer Vision, 2019, Linköping University
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-153866
▼ This thesis presents and evaluates methods to semantically segment 3D models via rendered 2D views. The 2D views are segmented separately and then merged. The thesis evaluates three merge strategies, two classification architectures, how many views should be rendered, and how these rendered views should be arranged. The results are evaluated both quantitatively and qualitatively, and compared with the current classifier at Vricon presented in [30]. The conclusion of this thesis is that there is a performance gain to be had using this method. The best model, using two views, attains an accuracy of 90.89%, compared with 84.52% achieved by the single-view network from [30]. The best nine-view system achieved 87.72%. The difference in accuracy between the two- and nine-view systems is attributed to the higher-quality mesh on the sunny side of objects, which is typically the south side. The thesis provides a proof of concept, and there are still many areas where the system can be improved, one of them being the extraction of training data, which would seemingly have a large impact on performance.
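One natural merge strategy for per-view segmentations of the kind evaluated above is a per-primitive majority vote: every rendered view labels the 3D faces it sees, and votes accumulate across views. This is an illustrative reconstruction of the idea, not the specific strategies or the Vricon pipeline from the thesis.

```python
from collections import Counter

def merge_views(view_labels):
    """Majority-vote merge.

    view_labels: list of {face_id: label} dicts, one per rendered 2D view.
    Returns {face_id: winning_label} over all views.
    """
    votes = {}
    for labels in view_labels:
        for face, label in labels.items():
            votes.setdefault(face, Counter())[label] += 1
    # Majority label per face; ties fall back to first-seen label.
    return {face: c.most_common(1)[0][0] for face, c in votes.items()}

views = [{1: "roof", 2: "road"},   # each dict: what one view saw
         {1: "roof", 2: "grass"},
         {2: "road"}]
merged = merge_views(views)
print(merged)  # {1: 'roof', 2: 'road'}
```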
Subjects/Keywords: Semantic segmentation; 3D segmentation; oblique views; multiview segmentation; satellite imagery; convolutional neural networks; Signal Processing; Signalbehandling

UCLA
12.
Xia, Fangting.
Pose-Guided Human Semantic Part Segmentation.
Degree: Statistics, 2016, UCLA
URL: http://www.escholarship.org/uc/item/34r7t3d3
▼ Human semantic part segmentation and human pose estimation are two fundamental and complementary tasks in computer vision. The localization of joints in pose estimation can be much more accurate with the support of part-segment consistency, while local confusions in part segmentation can be greatly reduced with the support of top-down pose information. In natural scenes containing multiple people, human pose estimation and human part segmentation remain challenging due to multi-instance confusion and large variations in pose, scale, appearance, and occlusion. Current state-of-the-art methods for both tasks rely on deep neural networks to extract data-dependent features and combine them with a carefully designed graphical model. However, these methods have no efficient mechanism to handle multi-person overlap or to adapt to the scale of human instances, and thus remain limited in the face of large variability in human pose and scale. To improve the performance of both tasks over current methods, we propose three models that tackle pose and scale variation in two major directions: (1) introducing top-down pose consistency into semantic part segmentation and part-segment consistency into human pose estimation, letting the two tasks benefit each other; and (2) handling scale variation with a mechanism that adapts to the size of human instances and their corresponding parts. Our first model incorporates pose cues into a graphical-model-based part segmentation framework, while our third model combines pose information within a framework made up of fully convolutional networks (FCNs). Our second model is a hierarchical FCN framework that performs object/part scale estimation and part segmentation jointly, adapting to the size of objects and parts. We show that all three models achieve state-of-the-art performance on challenging datasets.
Subjects/Keywords: Statistics; Computer science; multi-scale modeling; pose estimation; semantic part segmentation

Rochester Institute of Technology
13.
Karnam, Srivallabha.
Self-Supervised Learning for Segmentation using Image Reconstruction.
Degree: MS, Computer Engineering, 2020, Rochester Institute of Technology
URL: https://scholarworks.rit.edu/theses/10532
▼ Deep learning is the engine piloting tremendous growth in various segments of industry by consuming a valuable fuel called data. Many businesses are adopting this technology, be it healthcare, transportation, defense, semiconductors, or retail. But most of the accomplishments we see now rely on supervised learning. Supervised learning needs a substantial volume of labeled data, usually annotated by humans, an arduous and expensive task that often leads to datasets that are insufficient in size or contain human labeling errors. The performance of deep learning models is only as good as the data. Self-supervised learning minimizes the need for labeled data by extracting the pertinent context and inherent content of the data. Inspired by image interpolation, where an image is resized from one pixel grid to another, we introduce a novel self-supervised learning method specialized for semantic segmentation tasks. We use image reconstruction as a pretext task: pixels, or a single pixel channel (R, G, or B), in the input images are dropped in a defined or random manner, and the original image serves as ground truth. We use the ImageNet dataset for the pretext learning task and PASCAL VOC to evaluate the efficacy of the proposed methods. In segmentation tasks the decoder is as important as the encoder; since our proposed method learns both the encoder and the decoder as part of the pretext task, it outperforms existing self-supervised segmentation methods.
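The corruption step of the pretext task described above can be sketched directly: drop pixels of one channel at random and keep the original image as the reconstruction target. Plain nested lists stand in for image tensors here; the function names are illustrative, not from the thesis.

```python
import random

def corrupt(image, channel, drop_prob, rng):
    """Zero out pixels of one channel with probability drop_prob.

    image: H x W x 3 nested lists. Returns (corrupted_input, target),
    where target is the untouched original, i.e. the reconstruction
    ground truth for the pretext loss.
    """
    target = image
    corrupted = [[list(px) for px in row] for row in image]  # deep copy
    for row in corrupted:
        for px in row:
            if rng.random() < drop_prob:
                px[channel] = 0
    return corrupted, target

rng = random.Random(0)  # seeded for a reproducible sketch
img = [[[1, 2, 3] for _ in range(4)] for _ in range(4)]
x, y = corrupt(img, channel=0, drop_prob=0.5, rng=rng)
dropped = sum(1 for row in x for px in row if px[0] == 0)
print(dropped > 0 and y is img)  # some red values dropped; target unchanged
```

A network trained to map `x` back to `y` must learn enough structure (edges, regions, object boundaries) to inpaint the missing channel, which is what makes the learned encoder and decoder useful for segmentation.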
Advisors/Committee Members: Raymond Ptucha.
Subjects/Keywords: Classification; Computer vision; Self-supervised learning; Semantic segmentation; Unsupervised learning

University of Waterloo
14.
Angus, Matt.
Towards Pixel-Level OOD Detection for Semantic Segmentation.
Degree: 2019, University of Waterloo
URL: http://hdl.handle.net/10012/15004
▼ There exists a wide body of research on detecting out-of-distribution samples for image classification. Safety-critical applications, such as autonomous driving, would benefit from the ability to localise the unusual objects that cause an image to be out of distribution. This thesis adapts state-of-the-art methods for detecting out-of-distribution images in image classification to the new task of detecting out-of-distribution pixels, which can localise those unusual objects. It further compares the adapted methods experimentally on a new dataset derived from existing semantic segmentation datasets, proposing a new metric for the task. The evaluation shows that the performance ranking of the compared methods transfers successfully to the new task.
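One standard image-level detector that work of this kind adapts, the maximum softmax probability (MSP) score, becomes a per-pixel test when moved to segmentation: a pixel whose top softmax probability falls below a threshold is flagged as out of distribution. A minimal sketch of that adaptation; the threshold value and names are illustrative assumptions, and the thesis's own methods and metric are not reproduced here.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ood_mask(logit_map, threshold=0.7):
    """Per-pixel MSP test: H x W lists of class logits -> H x W booleans,
    True where the pixel's top softmax probability is below threshold."""
    return [[max(softmax(px)) < threshold for px in row] for row in logit_map]

# One confident pixel and one near-uniform (uncertain) pixel.
logits = [[[4.0, 0.0, 0.0], [1.0, 1.0, 1.0]]]
mask = ood_mask(logits)
print(mask)  # [[False, True]]
```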
Subjects/Keywords: Semantic Segmentation; Out of Distribution Detection; Deep Learning; Convolutional Neural Networks
15.
Muruganandham, Shivaprakash.
Semantic Segmentation of Satellite Images using Deep Learning.
Degree: Electrical and Space Engineering, 2016, Luleå University of Technology
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-38558
▼ A stark increase in the amount of satellite imagery available in recent years has made interpreting this data a challenging problem at scale. Deriving useful insights from such images requires a rich understanding of the information they contain. This thesis explores the problem by designing an automated framework for extracting semantic maps of roads and highways to track the urban growth of cities in satellite images. Framing it as a supervised machine learning problem, a deep neural network is designed, implemented, and experimentally evaluated, using publicly available datasets and frameworks. The resulting pipeline includes image pre-processing algorithms that allow it to cope with input images of varying quality, resolution, and channels. Additionally, we review a computational-graph approach to building a neural network with the TensorFlow framework.
Subjects/Keywords: Satellite Imagery; Deep Learning; Semantic Segmentation; Machine Learning; Urban Growth

George Mason University
16.
Singh, Gautam.
Visual Scene Understanding through Semantic Segmentation.
Degree: 2014, George Mason University
URL: http://hdl.handle.net/1920/9193
▼ The problem of visual scene understanding entails recognizing the semantic constituents of a scene and the complex interactions that occur between them. Development of algorithms for semantic segmentation, which requires the simultaneous segmentation of an image into regions and the classification of these regions into semantic categories, is at the heart of this problem. This dissertation presents methods that improve on the state of the art in semantic segmentation of images and investigates the use of the obtained semantic segmentation output for related image retrieval and classification tasks. We present a method for non-parametric semantic segmentation of images which can work effectively on image datasets with a large number of categories. The method exploits query-time feature-channel relevance and introduces the semantic label descriptor, which improves the semantic segmentation output by retrieving images that share semantically similar spatial layouts. We further demonstrate how to associate accurate confidences with the resulting semantic segmentation through the use of the strangeness measure. We show how this measure can be applied to confidence ranking of unlabeled images and associates high uncertainty scores with images containing unfamiliar semantic categories. We then demonstrate the use of semantic segmentation output for additional tasks such as scene categorization, learning related semantic concepts, and content-based image retrieval.
Advisors/Committee Members: Košecká, Jana (advisor).
Subjects/Keywords: Computer science; Computer Vision; Machine Learning; Scene Understanding; Semantic Segmentation
17.
BAYOMI, MOSTAFA MOHAMED.
Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems.
Degree: School of Computer Science & Statistics. Discipline of Computer Science, 2019, Trinity College Dublin
URL: http://hdl.handle.net/2262/86062
▼ The volume of digital content resources written as text documents is growing every day, at an unprecedented rate. Because this content is generally not structured as easy-to-handle units, it can be very difficult for users to find the information they are interested in, or for that content to help them accomplish their tasks. This in turn has increased the need for producing tailored content that can be adapted to the needs of individual users. A key challenge in producing such tailored content lies in the ability to understand how this content is structured. Hence, the efficient analysis and understanding of unstructured text content has become increasingly important, leading to the increasing use of Natural Language Processing (NLP) techniques for processing unstructured text documents. Amongst the different NLP techniques, Text Segmentation is specifically used to understand the structure of textual documents. However, current approaches to text segmentation typically rely on lexical and/or syntactic representations to build a structure from unstructured text documents, whereas the relationship between segments may be semantic, rather than lexical or syntactic.
Furthermore, text segmentation research has primarily focused on techniques for processing text documents, not on how these techniques can be utilised to produce tailored content adapted to the needs of individual users. In contrast, the field of Adaptive Systems has inherently focused on the challenges associated with dynamically adapting and delivering content to individual users. However, adaptive systems have primarily focused on techniques for adapting content, not on how to understand and structure it. Even systems that do structure content are limited in that they rely on the original structure of the content resource, which reflects the perspective of its author. Such systems therefore do not deeply 'understand' the structure of the content, which in turn limits their capability to discover and supply appropriate content for use in defined contexts, and limits the content's amenability for reuse within various independent adaptive systems.
To utilise the strength of NLP techniques in overcoming the challenges of understanding unstructured text content, this thesis investigates how NLP techniques can enhance the supply of content to adaptive systems. Specifically, the contribution of this thesis addresses the challenges associated with hierarchical text segmentation techniques, and with content discoverability and reusability for adaptive systems.
Firstly, this research proposes a novel hierarchical text segmentation approach, named C-HTS, that builds a structure from text documents based on the semantic representation of text. Semantic representation replaces keyword-based text representation with concept-based features, where the meaning of a piece of text is represented as a vector of…
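The concept-based representation described above can be illustrated with a toy sketch: each block of text becomes a vector of concept weights, and a hierarchy is built by repeatedly merging the most semantically related adjacent blocks. Everything concrete here (the vectors, the helper names, the merge rule) is an illustrative assumption, not C-HTS itself.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two concept vectors.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def merge_most_related(blocks):
    """One agglomerative step: merge the pair of adjacent blocks whose
    concept vectors are most similar, summing their vectors."""
    sims = [cosine(blocks[i], blocks[i + 1]) for i in range(len(blocks) - 1)]
    i = int(np.argmax(sims))
    merged = blocks[:i] + [blocks[i] + blocks[i + 1]] + blocks[i + 2:]
    return merged, i

# Toy "concept space" with 4 concepts; each block of text is a vector of
# concept weights (in C-HTS these would come from a semantic analysis of the
# text against a knowledge base; the vectors here are made up).
blocks = [np.array([3.0, 1.0, 0.0, 0.0]),   # about concept 0
          np.array([2.0, 2.0, 0.0, 0.0]),   # related to the first block
          np.array([0.0, 0.0, 4.0, 1.0])]   # about a different topic
merged, i = merge_most_related(blocks)
print(i)  # 0: the first two blocks are merged, not the unrelated third
```

Repeating the merge step until one block remains yields a binary-tree-like structure over the text, which is the general shape of a hierarchical segmentation.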
Advisors/Committee Members: Lawless, Seamus.
Subjects/Keywords: Natural Language Processing; Text Segmentation; Semantic Analysis; Adaptive Systems
APA (6th Edition):
BAYOMI, M. M. (2019). Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems. (Thesis). Trinity College Dublin. Retrieved from http://hdl.handle.net/2262/86062
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
BAYOMI, MOSTAFA MOHAMED. “Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems.” 2019. Thesis, Trinity College Dublin. Accessed January 28, 2021.
http://hdl.handle.net/2262/86062.
MLA Handbook (7th Edition):
BAYOMI, MOSTAFA MOHAMED. “Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems.” 2019. Web. 28 Jan 2021.
Vancouver:
BAYOMI MM. Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems. [Internet] [Thesis]. Trinity College Dublin; 2019. [cited 2021 Jan 28].
Available from: http://hdl.handle.net/2262/86062.
Council of Science Editors:
BAYOMI MM. Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems. [Thesis]. Trinity College Dublin; 2019. Available from: http://hdl.handle.net/2262/86062
18.
BAYOMI, MOSTAFA.
Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems.
Degree: School of Computer Science & Statistics. Discipline of Computer Science, 2019, Trinity College Dublin
URL: http://hdl.handle.net/2262/86075
▼ The volume of digital content resources written as text documents is growing every day, at an unprecedented rate. Because this content is generally not structured as easy-to-handle units, it can be very difficult for users to find the information they are interested in, or for that content to help them accomplish their tasks. This in turn has increased the need for producing tailored content that can be adapted to the needs of individual users. A key challenge in producing such tailored content lies in the ability to understand how this content is structured. Hence, the efficient analysis and understanding of unstructured text content has become increasingly important, leading to the increasing use of Natural Language Processing (NLP) techniques for processing unstructured text documents. Amongst the different NLP techniques, Text Segmentation is specifically used to understand the structure of textual documents. However, current approaches to text segmentation typically rely on lexical and/or syntactic representations to build a structure from unstructured text documents, whereas the relationship between segments may be semantic, rather than lexical or syntactic.
Furthermore, text segmentation research has primarily focused on techniques for processing text documents, not on how these techniques can be utilised to produce tailored content adapted to the needs of individual users. In contrast, the field of Adaptive Systems has inherently focused on the challenges associated with dynamically adapting and delivering content to individual users. However, adaptive systems have primarily focused on techniques for adapting content, not on how to understand and structure it. Even systems that do structure content are limited in that they rely on the original structure of the content resource, which reflects the perspective of its author. Such systems therefore do not deeply 'understand' the structure of the content, which in turn limits their capability to discover and supply appropriate content for use in defined contexts, and limits the content's amenability for reuse within various independent adaptive systems.
To utilise the strength of NLP techniques in overcoming the challenges of understanding unstructured text content, this thesis investigates how NLP techniques can enhance the supply of content to adaptive systems. Specifically, the contribution of this thesis addresses the challenges associated with hierarchical text segmentation techniques, and with content discoverability and reusability for adaptive systems.
Firstly, this research proposes a novel hierarchical text segmentation approach, named C-HTS, that builds a structure from text documents based on the semantic representation of text. Semantic representation replaces keyword-based text representation with concept-based features, where the meaning of a piece of text is represented as a vector of…
Advisors/Committee Members: Lawless, Seamus.
Subjects/Keywords: Natural Language Processing; Text Segmentation; Semantic Analysis; Adaptive Systems
APA (6th Edition):
BAYOMI, M. (2019). Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems. (Thesis). Trinity College Dublin. Retrieved from http://hdl.handle.net/2262/86075
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
BAYOMI, MOSTAFA. “Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems.” 2019. Thesis, Trinity College Dublin. Accessed January 28, 2021.
http://hdl.handle.net/2262/86075.
MLA Handbook (7th Edition):
BAYOMI, MOSTAFA. “Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems.” 2019. Web. 28 Jan 2021.
Vancouver:
BAYOMI M. Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems. [Internet] [Thesis]. Trinity College Dublin; 2019. [cited 2021 Jan 28].
Available from: http://hdl.handle.net/2262/86075.
Council of Science Editors:
BAYOMI M. Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems. [Thesis]. Trinity College Dublin; 2019. Available from: http://hdl.handle.net/2262/86075

Delft University of Technology
19.
Lengyel, Attila (author).
Addressing Illumination-Based Domain Shifts in Deep Learning: A Physics-Based Approach.
Degree: 2019, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:f8619273-0e7e-42e3-990b-67e2f6edc78a
▼ This work investigates how prior knowledge from physics-based reflection models can be used to improve the performance of semantic segmentation models under an illumination-based domain shift. We implement various color invariants as a preprocessing step and find that CNNs trained on these color invariants get stuck in worse local minima compared to RGB inputs, but can achieve comparable or even superior performance when applying knowledge transfer from RGB. We also find Batch Normalization to severely affect the performance of neural networks under an illumination-based domain shift and demonstrate that Instance Normalization offers a simple remedy to this issue. Additionally, we investigate different fusion models for combining color invariants with RGB. Using a combination of these methods we achieve a 14.5% performance increase on nighttime semantic segmentation without any additional training data.
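The Batch vs Instance Normalization finding lends itself to a small NumPy sketch (a simplified stand-in without learnable scale/shift, not the thesis's code): normalizing each image with its own per-channel statistics removes a global per-image brightness/contrast shift, while batch statistics let that shift leak into every normalized feature.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize over (batch, height, width) per channel: statistics are
    # shared across the whole batch, so changing the illumination of one
    # image perturbs the normalized output of every image.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    # Normalize over (height, width) per image and per channel: each image
    # is standardized on its own, discarding per-image brightness and
    # contrast, hence (near-)invariance to a global illumination shift.
    mean = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 8, 8))          # (batch, channels, H, W)
x_shifted = x.copy()
x_shifted[0] = 2.0 * x_shifted[0] + 5.0    # simulate an illumination change
print(np.allclose(instance_norm(x)[0], instance_norm(x_shifted)[0]))  # True
```

The same contrast holds for the real layers: swapping `BatchNorm2d` for `InstanceNorm2d` in a segmentation backbone changes only which axes the statistics are computed over.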
Advisors/Committee Members: van Gemert, Jan (mentor), Reinders, Marcel (graduation committee), Hildebrandt, Klaus (graduation committee), Milford, Michael (mentor), Delft University of Technology (degree granting institution).
Subjects/Keywords: Semantic segmentation; color invariants; deep learning; computer vision; domain adaptation
APA (6th Edition):
Lengyel, A. (2019). Addressing Illumination-Based Domain Shifts in Deep Learning: A Physics-Based Approach. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:f8619273-0e7e-42e3-990b-67e2f6edc78a
Chicago Manual of Style (16th Edition):
Lengyel, Attila (author). “Addressing Illumination-Based Domain Shifts in Deep Learning: A Physics-Based Approach.” 2019. Masters Thesis, Delft University of Technology. Accessed January 28, 2021.
http://resolver.tudelft.nl/uuid:f8619273-0e7e-42e3-990b-67e2f6edc78a.
MLA Handbook (7th Edition):
Lengyel, Attila (author). “Addressing Illumination-Based Domain Shifts in Deep Learning: A Physics-Based Approach.” 2019. Web. 28 Jan 2021.
Vancouver:
Lengyel A. Addressing Illumination-Based Domain Shifts in Deep Learning: A Physics-Based Approach. [Internet] [Masters thesis]. Delft University of Technology; 2019. [cited 2021 Jan 28].
Available from: http://resolver.tudelft.nl/uuid:f8619273-0e7e-42e3-990b-67e2f6edc78a.
Council of Science Editors:
Lengyel A. Addressing Illumination-Based Domain Shifts in Deep Learning: A Physics-Based Approach. [Masters Thesis]. Delft University of Technology; 2019. Available from: http://resolver.tudelft.nl/uuid:f8619273-0e7e-42e3-990b-67e2f6edc78a

Delft University of Technology
20.
Zhou, Zequn (author).
Automated classification of satellite data of informal urban settlements.
Degree: 2019, Delft University of Technology
URL: http://resolver.tudelft.nl/uuid:c7c9c170-eb70-4cf2-adfb-1d08bc1b74d7
▼ Urban areas are rapidly expanding in developing countries. One of the goals of the United Nations Human Settlements Programme (UN-Habitat) is to understand and guide urban development in some developing regions. Currently, the approaches UN-Habitat uses require a great deal of workforce, material, and time. UN-Habitat is therefore interested in exploring new approaches that drive down costs and time, which would not only allow faster responses but also expand its analysis. Since UN-Habitat is already using satellite imagery for urban mapping, our research question is formulated as: can we develop an automated system that provides valuable information about urban development for UN-Habitat from satellite image data (e.g. building detection)? After examining the satellite imagery provided by UN-Habitat and the publicly available datasets (crowdAI and Inria Aerial), we define the main task as building segmentation. In this research, we study deep learning techniques for building segmentation on satellite image data. Because the number and quality of images available for UN-Habitat's region of interest (the Middle East) are insufficient to rely on solely for training, we use the public datasets (crowdAI and Inria Aerial) for training and evaluation, even though they cover different regions and construction practices. We start by testing several classic segmentation algorithms (FCN8s, SegNet, DeepLab and U-Net) and find from the experimental results that performance can still be improved. We then propose two novel data reweighing methods, named border weight and inter-building distance weight, to improve detection performance. By increasing the weights of the pixels outside but close to the borders of the buildings, the model is encouraged to learn that information and thus performs better. Inspired by the idea of reweighing the non-building pixels, we investigate whether modifying building pixels can achieve further improvement. We propose a new label representation, the multi-level boundary label, which does help to improve the segmentation results: based on the distance to the building boundary, we divide building pixels into multiple classes, as their pixel values can be affected by factors such as trees and shadows. The experimental results show that performance improves because the model captures more information about the buildings. Next, we propose a new neural network architecture that utilizes the two pixel weights and the multi-level boundary label explained above. Our proposed model achieves state-of-the-art building segmentation performance compared with several classic segmentation methods; for example, its mean intersection over union on the test dataset is 3% higher than that of FCN8s. Our model also uses fewer parameters (~16 million in total) because we use only the first 13 layers of VGG16 as the encoder and no convolutional layers in the decoder.…
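The border-weight idea (upweighting background pixels just outside building outlines so the model attends to boundaries) can be sketched in NumPy. The Gaussian falloff, the parameter names `w0` and `sigma`, and the brute-force distance computation are illustrative assumptions; the thesis's exact weighting scheme may differ.

```python
import numpy as np

def border_weight_map(mask, w0=2.0, sigma=2.0):
    """Upweight background pixels close to building borders.

    mask: binary array, 1 = building, 0 = background.
    Returns a per-pixel weight map: 1 everywhere, plus a bonus
    w0 * exp(-(d/sigma)**2) on background pixels at distance d from the
    nearest building pixel.
    """
    h, w = mask.shape
    building = np.argwhere(mask == 1)
    weights = np.ones((h, w), dtype=float)
    if len(building) == 0:
        return weights
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1)
    # Brute-force distance from every pixel to the nearest building pixel
    # (fine for a demo; a distance transform would be used in practice).
    d = np.sqrt(((coords[:, None, :] - building[None, :, :]) ** 2).sum(-1)).min(1)
    d = d.reshape(h, w)
    bonus = w0 * np.exp(-((d / sigma) ** 2))
    weights[mask == 0] += bonus[mask == 0]
    return weights

mask = np.zeros((8, 8), dtype=int)
mask[2:6, 2:6] = 1  # one square "building"
wm = border_weight_map(mask)
# Background pixels touching the border get larger weight than far ones.
print(wm[1, 3] > wm[0, 0])  # True
```

Multiplying such a map into a per-pixel cross-entropy loss is the standard way to make the network pay extra attention to the upweighted region.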
Advisors/Committee Members: Yorke-Smith, Neil (mentor), Rózsás, Árpád (mentor), Delft University of Technology (degree granting institution).
Subjects/Keywords: deep learning; Semantic segmentation; Satellite Imagery; building detection
APA (6th Edition):
Zhou, Z. (2019). Automated classification of satellite data of informal urban settlements. (Masters Thesis). Delft University of Technology. Retrieved from http://resolver.tudelft.nl/uuid:c7c9c170-eb70-4cf2-adfb-1d08bc1b74d7
Chicago Manual of Style (16th Edition):
Zhou, Zequn (author). “Automated classification of satellite data of informal urban settlements.” 2019. Masters Thesis, Delft University of Technology. Accessed January 28, 2021.
http://resolver.tudelft.nl/uuid:c7c9c170-eb70-4cf2-adfb-1d08bc1b74d7.
MLA Handbook (7th Edition):
Zhou, Zequn (author). “Automated classification of satellite data of informal urban settlements.” 2019. Web. 28 Jan 2021.
Vancouver:
Zhou Z. Automated classification of satellite data of informal urban settlements. [Internet] [Masters thesis]. Delft University of Technology; 2019. [cited 2021 Jan 28].
Available from: http://resolver.tudelft.nl/uuid:c7c9c170-eb70-4cf2-adfb-1d08bc1b74d7.
Council of Science Editors:
Zhou Z. Automated classification of satellite data of informal urban settlements. [Masters Thesis]. Delft University of Technology; 2019. Available from: http://resolver.tudelft.nl/uuid:c7c9c170-eb70-4cf2-adfb-1d08bc1b74d7

University of Waterloo
21.
Li, Ying.
Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds.
Degree: 2021, University of Waterloo
URL: http://hdl.handle.net/10012/16620
▼ This thesis focuses on the challenges and opportunities that come with deep learning in the extraction of 3D information from point clouds. To achieve this, 3D information such as point-based or object-based attributes needs to be extracted from highly accurate, information-rich 3D data, which are commonly collected by LiDAR or RGB-D cameras from real-world environments. Driven by the breakthroughs brought by deep learning techniques and the accessibility of reliable 3D datasets, 3D deep learning frameworks have been investigated with a string of empirical successes. However, two main challenges complicate deep learning-based per-point labeling and object detection in real scenes. First, the variation of sensing conditions and unconstrained environments results in unevenly distributed point clouds with various geometric patterns and incomplete shapes. Second, the irregular data format and the requirements for both accurate and efficient algorithms pose problems for deep learning models.
To deal with these two challenges, this doctoral dissertation mainly considers the following four features when constructing 3D deep models for point-based or object-based information extraction: (1) the exploration of geometric correlations between local points when defining convolution kernels, (2) hierarchical local and global feature learning within an end-to-end trainable framework, (3) relation feature learning from nearby objects, and (4) leveraging 2D images for 3D object detection from point clouds. Correspondingly, this doctoral thesis proposes a set of deep learning frameworks for 3D information extraction, specifically for scene segmentation and object detection from indoor and outdoor point clouds.
Firstly, an end-to-end geometric graph convolution architecture on the graph representation of a point cloud is proposed for semantic scene segmentation. Secondly, a 3D proposal-based object detection framework is constructed to extract the geometric information of objects and relation features among proposals for bounding box reasoning. Thirdly, a 2D-driven approach is proposed to detect 3D objects from point clouds in indoor and outdoor scenes; both semantic features from 2D images and context information in 3D space are explicitly exploited to enhance 3D detection performance. Qualitative and quantitative experiments compared with existing state-of-the-art models on indoor and outdoor datasets demonstrate the effectiveness of the proposed frameworks. Remaining challenges and future research issues that could help advance the development of deep learning approaches for the extraction of 3D information from point clouds are addressed at the end of this thesis.
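The "graph convolution on a point cloud" ingredient can be given a minimal flavor in NumPy: build a k-nearest-neighbour graph over the points and aggregate relative (edge) features from each neighbourhood. This is an edge-convolution-style sketch under our own assumptions, certainly far simpler than the architecture proposed in the thesis.

```python
import numpy as np

def knn_graph_conv(points, feats, k=3):
    """One naive 'graph convolution' step on a point cloud.

    For each point, gather its k nearest neighbours and aggregate edge
    features (neighbour feature minus centre feature) with a max, in the
    spirit of edge-based point convolutions. Purely illustrative: no
    learned weights, just the neighbourhood-aggregation pattern.
    """
    n = len(points)
    # Pairwise squared distances between all points.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    out = np.empty_like(feats)
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]       # skip the point itself
        edges = feats[nbrs] - feats[i]          # relative (edge) features
        out[i] = feats[i] + edges.max(axis=0)   # aggregate + residual
    return out

rng = np.random.default_rng(0)
pts = rng.normal(size=(10, 3))   # 10 points in 3D
f = rng.normal(size=(10, 4))     # a 4-dim feature per point
print(knn_graph_conv(pts, f).shape)  # (10, 4)
```

Stacking such neighbourhood-aggregation steps, with learned transforms on the edge features, is the general recipe behind point-cloud graph convolutions.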
Subjects/Keywords: deep learning; semantic segmentation; object detection; point cloud
APA (6th Edition):
Li, Y. (2021). Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds. (Thesis). University of Waterloo. Retrieved from http://hdl.handle.net/10012/16620
Note: this citation may be lacking information needed for this citation format:
Not specified: Masters Thesis or Doctoral Dissertation
Chicago Manual of Style (16th Edition):
Li, Ying. “Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds.” 2021. Thesis, University of Waterloo. Accessed January 28, 2021.
http://hdl.handle.net/10012/16620.
MLA Handbook (7th Edition):
Li, Ying. “Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds.” 2021. Web. 28 Jan 2021.
Vancouver:
Li Y. Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds. [Internet] [Thesis]. University of Waterloo; 2021. [cited 2021 Jan 28].
Available from: http://hdl.handle.net/10012/16620.
Council of Science Editors:
Li Y. Deep Learning for 3D Information Extraction from Indoor and Outdoor Point Clouds. [Thesis]. University of Waterloo; 2021. Available from: http://hdl.handle.net/10012/16620

Georgia Tech
22.
Raza, Syed H.
Temporally consistent semantic segmentation in videos.
Degree: PhD, Electrical and Computer Engineering, 2014, Georgia Tech
URL: http://hdl.handle.net/1853/53455
▼ The objective of this thesis research is to develop algorithms for temporally consistent semantic segmentation in videos. Though many different forms of semantic segmentation exist, this research focuses on the problem of temporally consistent holistic scene understanding in outdoor videos. Holistic scene understanding requires an understanding of many individual aspects of the scene, including 3D layout, objects present, occlusion boundaries, and depth. Such a description of a dynamic scene would be useful for many robotic applications, including object reasoning, 3D perception, video analysis, video coding, segmentation, navigation and activity recognition.
Scene understanding has been studied with great success for still images. However, scene understanding in videos requires additional approaches to account for temporal variation and dynamic information, and to exploit causality. As a first step, image-based scene understanding methods can be applied directly to individual video frames to generate a description of the scene. However, these methods do not exploit temporal information across neighboring frames. Further, lacking temporal consistency, image-based methods can produce temporally inconsistent labels across frames. This inconsistency can impact performance, as scene labels suddenly change between frames.
The objective of this study is to develop temporally consistent scene-description algorithms that process videos efficiently, exploit causality and data redundancy, and cater for scene dynamics. Specifically, we achieve our research objectives by (1) extracting geometric context from videos to give the broad 3D structure of the scene with all objects present, (2) detecting occlusion boundaries in videos due to depth discontinuity, and (3) estimating depth in videos by combining monocular and motion features with semantic features and occlusion boundaries.
Advisors/Committee Members: Essa, Irfan (advisor), Anderson, David (advisor), Yezzi, Anthony (committee member), Barnes, Christopher (committee member), Dellaert, Frank (committee member), Sukthankar, Rahul (committee member).
Subjects/Keywords: Semantic segmentation; Temporal consistency; Causality; Videos; Occlusion boundaries; Depth estimation
APA (6th Edition):
Raza, S. H. (2014). Temporally consistent semantic segmentation in videos. (Doctoral Dissertation). Georgia Tech. Retrieved from http://hdl.handle.net/1853/53455
Chicago Manual of Style (16th Edition):
Raza, Syed H. “Temporally consistent semantic segmentation in videos.” 2014. Doctoral Dissertation, Georgia Tech. Accessed January 28, 2021.
http://hdl.handle.net/1853/53455.
MLA Handbook (7th Edition):
Raza, Syed H. “Temporally consistent semantic segmentation in videos.” 2014. Web. 28 Jan 2021.
Vancouver:
Raza SH. Temporally consistent semantic segmentation in videos. [Internet] [Doctoral dissertation]. Georgia Tech; 2014. [cited 2021 Jan 28].
Available from: http://hdl.handle.net/1853/53455.
Council of Science Editors:
Raza SH. Temporally consistent semantic segmentation in videos. [Doctoral Dissertation]. Georgia Tech; 2014. Available from: http://hdl.handle.net/1853/53455

University of South Carolina
23.
Guo, Dazhou.
Semantic Segmentation Considering Image Degradation, Global Context, and Data Balancing.
Degree: PhD, Computer Science and Engineering, 2019, University of South Carolina
URL: https://scholarcommons.sc.edu/etd/5600
▼ Recently, semantic segmentation – assigning a categorical label to each pixel in an image – has come to play an important role in image understanding applications, e.g., autonomous driving, human-machine interaction and medical imaging. Semantic segmentation has made progress by using deep convolutional neural networks, which surpass the traditional methods by a large margin. Despite the success of deep convolutional neural networks (CNNs), three major challenges remain.
The first challenge is how to segment degraded images semantically, i.e., degraded image semantic segmentation. In general, image degradations increase the difficulty of semantic segmentation, usually leading to decreased segmentation accuracy. While the use of supervised deep learning has substantially improved the state of the art in semantic segmentation, the gap between the feature distribution learned from clean images and that learned from degraded images poses a major obstacle to degraded image semantic segmentation. We propose a novel Dense-Gram Network that reduces this gap more effectively than conventional strategies when segmenting degraded images. Extensive experiments demonstrate that the proposed Dense-Gram Network yields state-of-the-art semantic segmentation performance on degraded images synthesized using the PASCAL VOC 2012, SUN RGB-D, CamVid, and CityScapes datasets.
The second challenge is how to embed global context into the segmentation network. As existing semantic segmentation networks usually exploit local context information to infer the label of a single pixel or patch, without global context the CNNs can misclassify objects with similar colors and shapes. In this thesis, we propose to embed the global context into the segmentation network using objects' spatial relationships. In particular, we introduce a boundary-based metric that measures the level of spatial adjacency between each pair of object classes and find that this metric is robust against biases induced by object size. By enforcing this metric in the segmentation loss, we propose a new network that starts with a segmentation network, followed by a new encoder that computes the proposed boundary-based metric, and train this network end-to-end for semantic image segmentation. We evaluate the proposed method on the CamVid and CityScapes datasets and achieve favorable overall performance and a substantial improvement in segmenting small objects.
The third challenge for existing semantic segmentation networks is the performance decrease induced by data imbalance. At the image level, one semantic class may occur in more images than another. At the pixel level, one semantic class may cover larger regions than another. Classic strategies such as class re-sampling or cost-sensitive training cannot address these data imbalances for multi-label segmentation. Here, we propose a selective-weighting strategy that considers the image- and pixel-level data…
Advisors/Committee Members: Song Wang.
Subjects/Keywords: Computer Sciences; semantic segmentation; convolutional neural networks; Convolutional Neural Networks Backbones
APA (6th Edition):
Guo, D. (2019). Semantic Segmentation Considering Image Degradation, Global Context, and Data Balancing. (Doctoral Dissertation). University of South Carolina. Retrieved from https://scholarcommons.sc.edu/etd/5600
Chicago Manual of Style (16th Edition):
Guo, Dazhou. “Semantic Segmentation Considering Image Degradation, Global Context, and Data Balancing.” 2019. Doctoral Dissertation, University of South Carolina. Accessed January 28, 2021.
https://scholarcommons.sc.edu/etd/5600.
MLA Handbook (7th Edition):
Guo, Dazhou. “Semantic Segmentation Considering Image Degradation, Global Context, and Data Balancing.” 2019. Web. 28 Jan 2021.
Vancouver:
Guo D. Semantic Segmentation Considering Image Degradation, Global Context, and Data Balancing. [Internet] [Doctoral dissertation]. University of South Carolina; 2019. [cited 2021 Jan 28].
Available from: https://scholarcommons.sc.edu/etd/5600.
Council of Science Editors:
Guo D. Semantic Segmentation Considering Image Degradation, Global Context, and Data Balancing. [Doctoral Dissertation]. University of South Carolina; 2019. Available from: https://scholarcommons.sc.edu/etd/5600

Virginia Tech
24.
Christie, Gordon A.
Collaborative Unmanned Air and Ground Vehicle Perception for Scene Understanding, Planning and GPS-denied Localization.
Degree: PhD, Computer Engineering, 2017, Virginia Tech
URL: http://hdl.handle.net/10919/83807
▼ Autonomous robot missions in unknown environments are challenging. In many cases, the systems involved are unable to use a priori information about the scene (e.g. road maps). This is especially true in disaster response scenarios, where existing maps are now out of date. Areas without GPS are another concern, especially when the involved systems are tasked with navigating a path planned by a remote base station. Scene understanding via robots' perception data (e.g. images) can greatly assist in overcoming these challenges. This dissertation makes three contributions that help overcome these challenges, with a focus on the application of autonomously searching for radiation sources with unmanned aerial vehicles (UAVs) and unmanned ground vehicles (UGVs) in unknown and unstructured environments. The three main contributions of this dissertation are: (1) an approach to overcome the challenges associated with simultaneously trying to understand 2D and 3D information about the environment; (2) algorithms and experiments involving scene understanding for real-world autonomous search tasks, in which a UAV and a UGV search for potentially hazardous sources of radiation in an unknown environment; and (3) an approach to the registration of a UGV in areas without GPS using 2D image data and 3D data, where localization is performed in an overhead map generated from imagery captured in the air.
Advisors/Committee Members: Kochersberger, Kevin B. (committeechair), Batra, Dhruv (committeechair), Parikh, Devi (committee member), Tokekar, Pratap (committee member), Ben-Tzvi, Pinhas (committee member).
Subjects/Keywords: Scene Understanding; Semantic Segmentation; Unmanned Systems; UAV; UGV; Path Planning
APA (6th Edition):
Christie, G. A. (2017). Collaborative Unmanned Air and Ground Vehicle Perception for Scene Understanding, Planning and GPS-denied Localization. (Doctoral Dissertation). Virginia Tech. Retrieved from http://hdl.handle.net/10919/83807

University of Victoria
25.
Rose, Spencer.
An evaluation of deep learning semantic segmentation for land cover classification of oblique ground-based photography.
Degree: Department of Computer Science, 2020, University of Victoria
URL: http://hdl.handle.net/1828/12156
► This thesis presents a case study on the application of deep learning methods for the dense prediction of land cover types in oblique ground-based photography.…
(more)
▼ This thesis presents a case study on the application of deep learning methods for the dense prediction of land cover types in oblique ground-based photography. While deep learning approaches are widely used in land cover classification of remote-sensing data (i.e., aerial and satellite orthoimagery) for change detection analysis, dense classification of oblique landscape imagery used in repeat photography remains undeveloped. A performance evaluation was carried out to test two state-of-the-art architectures, U-Net and DeepLabv3+, as well as a fully connected conditional random field model used to boost segmentation accuracy. The evaluation focuses on the use of a novel threshold-based data augmentation technique, and three multi-loss functions selected to mitigate class imbalance and input noise. The dataset used for this study was sampled from the Mountain Legacy Project (MLP) collection, comprised of high-resolution historic (grayscale) survey photographs of Canada’s Western mountains captured from the 1880s through the 1950s and their corresponding modern (colour) repeat images. Land cover segmentations manually created by MLP researchers were used as ground truth labels. Experimental results showed top overall F1 scores of 0.841 for historic models, and 0.909 for repeat models. Data augmentation showed modest improvements to overall accuracy (+3.0% historic / +1.0% repeat), but much larger gains for under-represented classes.
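Per-class F1 scores like those reported above are conventionally derived from a confusion matrix. The sketch below shows one standard way to compute them; the 3×3 matrix and its counts are invented for illustration and are not taken from the thesis.

```python
import numpy as np

def f1_per_class(conf):
    """Per-class F1 from a confusion matrix (rows: ground truth, cols: prediction)."""
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                  # correct predictions per class
    fp = conf.sum(axis=0) - tp          # predicted as class c, but wrong
    fn = conf.sum(axis=1) - tp          # class c pixels that were missed
    precision = tp / np.maximum(tp + fp, 1e-12)
    recall = tp / np.maximum(tp + fn, 1e-12)
    return 2 * precision * recall / np.maximum(precision + recall, 1e-12)

# Toy 3-class example (hypothetical land cover classes, made-up counts)
cm = [[50, 2, 3],
      [4, 40, 6],
      [1, 5, 44]]
print(f1_per_class(cm).round(3))   # → [0.909 0.825 0.854]
```

An "overall" F1 can then be taken as either the unweighted mean of these per-class values (macro-F1) or a mean weighted by class frequency; the thesis does not specify which variant it uses here.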
Advisors/Committee Members: Coady, Yvonne (supervisor).
Subjects/Keywords: landscape classification; semantic segmentation; change detection; deep learning
APA (6th Edition):
Rose, S. (2020). An evaluation of deep learning semantic segmentation for land cover classification of oblique ground-based photography. (Masters Thesis). University of Victoria. Retrieved from http://hdl.handle.net/1828/12156

University of Sydney
26.
Weerasiriwardhane, Charika.
Multi-Modal Learning For Adaptive Scene Understanding.
Degree: 2017, University of Sydney
URL: http://hdl.handle.net/2123/17191
► Modern robotics systems typically possess sensors of different modalities. Segmenting scenes observed by the robot into a discrete set of classes is a central requirement…
(more)
▼ Modern robotics systems typically possess sensors of different modalities. Segmenting scenes observed by the robot into a discrete set of classes is a central requirement for autonomy. Equally, when a robot navigates through an unknown environment, it is often necessary to adjust the parameters of the scene segmentation model to maintain the same level of accuracy in changing situations. This thesis explores efficient means of adaptive semantic scene segmentation in an online setting with the use of multiple sensor modalities. First, we devise a novel conditional random field (CRF) inference method for scene segmentation that incorporates global constraints, enforcing particular sets of nodes to be assigned the same class label. To do this efficiently, the CRF is formulated as a relaxed quadratic program whose maximum a posteriori (MAP) solution is found using a gradient-based optimization approach. These global constraints are useful, since they can encode "a priori" information about the final labeling. This new formulation also reduces the dimensionality of the original image-labeling problem. The proposed model is employed in an urban street scene understanding task. Camera data is used for the CRF-based semantic segmentation while global constraints are derived from 3D laser point clouds. Second, an approach to learn CRF parameters without the need for manually labeled training data is proposed. The model parameters are estimated by optimizing a novel loss function using self-supervised reference labels, obtained from camera and laser information with a minimal amount of human supervision. Third, an approach is proposed that conducts the parameter optimization while increasing the model's robustness to non-stationary data distributions over long trajectories. We adopted stochastic gradient descent to achieve this goal, using a learning rate that can appropriately grow or diminish to gain adaptability to changes in the data distribution.
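The general idea of relaxed-QP MAP inference can be sketched in a few lines: the hard labelling is relaxed so that each node carries a distribution over labels (a row on the probability simplex), and a quadratic objective that rewards neighbour agreement is maximised with a gradient-style update. Everything below (the multiplicative update, the Potts-style pairwise reward, the toy chain) is an illustrative assumption, not the thesis's actual formulation or constraint set.

```python
import numpy as np

def map_relaxed_qp(unary, edges, lam=1.0, steps=200, lr=0.5):
    """Relaxed MAP inference sketch: maximise
        f(X) = <U, X> + lam * sum_{(i,j) in E} <X_i, X_j>
    over row-stochastic X by exponentiated (mirror) gradient ascent.
    The pairwise term rewards neighbouring nodes agreeing on a label."""
    n, k = unary.shape
    X = np.full((n, k), 1.0 / k)           # start from the uniform relaxation
    for _ in range(steps):
        grad = unary.copy()
        for i, j in edges:                 # gradient of the agreement term
            grad[i] += lam * X[j]
            grad[j] += lam * X[i]
        X = X * np.exp(lr * grad)          # multiplicative update ...
        X /= X.sum(axis=1, keepdims=True)  # ... keeps rows on the simplex
    return X.argmax(axis=1)                # round the relaxation to labels

# Toy chain of 4 nodes, 2 labels: unaries weakly prefer label 1 at node 2 only
U = np.array([[1.0, 0.0], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1]])
E = [(0, 1), (1, 2), (2, 3)]
print(map_relaxed_qp(U, E))   # → [0 0 0 0]: node 2 is pulled to agree
```

A hard global constraint of the kind the abstract describes (a set of nodes forced to share one label) could be imposed in this relaxation by tying those rows of X to a single shared distribution.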
Subjects/Keywords: conditional random fields;
semantic segmentation;
adaptive learning;
multi-modal scene understanding
APA (6th Edition):
Weerasiriwardhane, C. (2017). Multi-Modal Learning For Adaptive Scene Understanding. (Thesis). University of Sydney. Retrieved from http://hdl.handle.net/2123/17191

King Abdullah University of Science and Technology
27.
Itani, Hani.
A Closer Look at Neighborhoods in Graph Based Point Cloud Scene Semantic Segmentation Networks.
Degree: 2020, King Abdullah University of Science and Technology
URL: http://hdl.handle.net/10754/665898
► Large scale semantic segmentation is considered as one of the fundamental tasks in 3D scene understanding. Point clouds provide a basic and rich geometric rep-…
(more)
▼ Large scale semantic segmentation is considered one of the fundamental tasks in 3D scene understanding. Point clouds provide a basic and rich geometric representation of scenes and tangible objects. Convolutional Neural Networks (CNNs) have demonstrated impressive success in processing regular discrete data such as 2D images and 1D audio. However, CNNs do not directly generalize to point cloud processing due to its irregular and unordered nature. One way to extend CNNs to point cloud understanding is to derive an intermediate Euclidean representation of a point cloud by projecting onto the image domain, voxelizing, or treating points as vertices of an undirected graph. Graph CNNs (GCNs) have proven to be a very promising solution for deep learning on irregular data such as social networks, biological systems, and recently point clouds. Early works in the literature on graph-based point networks relied on constructing dynamic graphs in the node feature space to define a convolution kernel. Later works constructed hierarchical static graphs in 3D space for an encoder-decoder framework inspired by image segmentation. This thesis takes a closer look at both dynamic and static graph neighborhoods of graph-based point networks for the task of semantic segmentation in order to: 1) discuss a potential cause for why going deep in dynamic GCNs does not necessarily lead to improved performance, and 2) propose a new approach to treating points in a static graph neighborhood for improved information aggregation. The proposed method leads to an efficient graph-based 3D semantic segmentation network that is on par with current state-of-the-art methods on both indoor and outdoor scene semantic segmentation benchmarks such as S3DIS and Semantic3D.
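The dynamic-neighborhood construction the thesis examines can be sketched briefly: neighbors are the k nearest points in the current feature space, recomputed at each layer, and an EdgeConv-style aggregation max-pools over (center, offset) pairs. This brute-force O(n²) NumPy version is only an illustration of the idea (real networks use learned per-edge MLPs and efficient neighbor search); the function names and toy points are assumptions.

```python
import numpy as np

def knn_graph(feats, k):
    """Dynamic graph construction: the k nearest neighbours of every point,
    computed in the current feature space (recomputed per layer in dynamic
    GCNs such as DGCNN). Returns an (n, k) index array."""
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self-loops
    return np.argsort(d, axis=1)[:, :k]

def edge_features(feats, nbrs):
    """EdgeConv-style aggregation input: concatenate each centre feature with
    the offset to every neighbour, then max-pool over the neighbourhood."""
    centre = feats[:, None, :].repeat(nbrs.shape[1], axis=1)   # (n, k, d)
    offset = feats[nbrs] - centre                              # (n, k, d)
    return np.concatenate([centre, offset], axis=-1).max(axis=1)  # (n, 2d)

pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
nbrs = knn_graph(pts, k=2)
print(nbrs[0])   # indices of the two points nearest point 0
```

Because the graph is rebuilt from features rather than 3D coordinates, deeper layers connect points that are semantically, not spatially, close; this is exactly the behavior whose effect on depth the thesis analyzes.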
Subjects/Keywords: Deep Learning on point clouds; Local Aggregation Function; Semantic Segmentation
APA (6th Edition):
Itani, H. (2020). A Closer Look at Neighborhoods in Graph Based Point Cloud Scene Semantic Segmentation Networks. (Thesis). King Abdullah University of Science and Technology. Retrieved from http://hdl.handle.net/10754/665898

University of Manchester
28.
Gupta, Ananya.
Deep Learning for Semantic Feature Extraction in Aerial Imagery.
Degree: 2020, University of Manchester
URL: http://www.manchester.ac.uk/escholar/uk-ac-man-scw:326952
► Remote sensing provides image and LiDAR data that can be useful for a number of tasks such as disaster mapping and surveying. Deep learning (DL)…
(more)
▼ Remote sensing provides image and LiDAR data that can be useful for a number of tasks such as disaster mapping and surveying. Deep learning (DL) has been shown to provide good results in extracting knowledge from input data sources by means of learning intermediate representation features. However, popular DL methods require large-scale datasets for training, which are costly and time-consuming to obtain. This thesis investigates semantic knowledge extraction from remote sensing data using DL methods in regimes with limited labelled data. Firstly, semantic segmentation methods are compared and analysed on the task of aerial image segmentation. It is shown that pretraining on ImageNet improves the segmentation results despite the domain shift between ImageNet images and aerial images. A framework for mapping road networks in disaster-struck areas is proposed. It uses pre- and post-disaster imagery and labels from OpenStreetMap (OSM), forgoing the need for costly manually labelled data. Graph-based methods are used to update the pre-existing road maps from OSM. Experiments on a disaster dataset from Palu, Indonesia show the efficacy of the proposed method. A method for semantic feature extraction from aerial imagery is proposed which is shown to work well for multitemporal high-resolution image registration. These features are able to deal with temporal variations caused by seasonal changes. Methods for tree identification in LiDAR data have been proposed to overcome the need for manually labelled data. The first method works on high-density point clouds and uses certain LiDAR data attributes for tree identification, achieving almost 90% accuracy. The second uses a voxel-based 3D Convolutional Neural Network on low-density LiDAR datasets and is able to identify most large trees. The third method is a scaled version of PointNet++ and achieves an F-score of 82.1 on the ISPRS benchmark, comparable to the state-of-the-art methods but with increased efficiency. Finally, saliency methods used for explainability in image analysis are extended to work on 3D point clouds and voxel-based networks to help aid explainability in this area. It is shown that edge and corner features are deemed important by these networks for classification. These features are also demonstrated to be inherently sparse and easily pruned.
Advisors/Committee Members: Watson, Simon, Yin, Hujun.
Subjects/Keywords: Deep Learning; Semantic Segmentation; LiDAR; Point Cloud; Satellite Imagery; Aerial Imagery
APA (6th Edition):
Gupta, A. (2020). Deep Learning for Semantic Feature Extraction in Aerial Imagery. (Doctoral Dissertation). University of Manchester. Retrieved from http://www.manchester.ac.uk/escholar/uk-ac-man-scw:326952
29.
Gadde, Raghu Deep.
Segmentation sémantique d'images fortement structurées et faiblement structurées : Semantic Segmentation of Highly Structured and Weakly Structured Images.
Degree: Docteur es, Signal, Image, Automatique, 2017, Université Paris-Est
URL: http://www.theses.fr/2017PESC1083
► Cette thèse pour but de développer des méthodes de segmentation pour des scènes fortement structurées (ex. bâtiments et environnements urbains) ou faiblement structurées (ex. paysages…
(more)
▼ The aim of this thesis is to develop techniques for segmenting strongly structured scenes (e.g. buildings and urban environments) and weakly structured scenes (e.g. landscapes and natural objects). Building images can naturally be expressed in terms of a shape grammar, and a derivation of that grammar can be inferred to obtain an image segmentation. However, it is difficult and time-consuming to write such grammars. To alleviate this problem, a novel method is developed to automatically learn grammars from a given training set of images and their associated ground-truth segmentations. Experiments suggest that such learned grammars allow faster inference and produce better segmentations. Next, the effect of using grammars for strongly structured scenes is explored. To this end, a very simple technique based on Auto-Context is used to segment building images. Surprisingly, even without using any domain-specific knowledge about the particular scene type observed, we obtain significant gains in segmentation quality on several benchmark datasets. Lastly, a novel technique based on convolutional neural networks (CNNs) is developed to segment images without any high-level structure. Image-adaptive filtering is performed within the CNN architecture itself to facilitate long-range connections between distant image regions. Experiments on several large-scale benchmarks again show significant improvements in segmentation quality.
Advisors/Committee Members: Marlet, Renaud (thesis director), Paragios, Nikos (thesis director).
Subjects/Keywords: Grammar learning; Facade parsing; Facade segmentation; Semantic segmentation; CNN; Auto-Context
APA (6th Edition):
Gadde, R. D. (2017). Segmentation sémantique d'images fortement structurées et faiblement structurées : Semantic Segmentation of Highly Structured and Weakly Structured Images. (Doctoral Dissertation). Université Paris-Est. Retrieved from http://www.theses.fr/2017PESC1083
30.
Luc, Pauline.
Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos : Self-supervised learning of predictive segmentation models from video.
Degree: Docteur es, Mathématiques et informatique, 2019, Université Grenoble Alpes (ComUE)
URL: http://www.theses.fr/2019GREAM024
► Les modèles prédictifs ont le potentiel de permettre le transfert des succès récents en apprentissage par renforcement à de nombreuses tâches du monde réel, en…
(more)
▼ Predictive models hold the potential to transfer recent successes in reinforcement learning to many real-world tasks by reducing the number of interactions needed with the environment. Video prediction has attracted growing interest from the community in recent years, as a particular case of predictive learning with broad applications in robotics and navigation systems. While RGB frames are easy to acquire and hold a lot of information, they are extremely challenging to predict and cannot be directly interpreted by downstream applications. We therefore introduce the novel task of predicting the semantic or instance segmentation of future frames. The feature spaces we consider are better suited to recursive prediction and allow us to develop predictive segmentation models that perform well up to half a second into the future. The predictions are interpretable by downstream applications and remain rich in information, spatially detailed, and easy to obtain by building on state-of-the-art segmentation methods. In this thesis, we first propose, for the semantic segmentation task, a discriminative approach based on adversarial training. Next, we introduce the novel task of future semantic segmentation prediction, for which we develop an autoregressive convolutional model. Finally, we extend our method to the more difficult task of future instance segmentation prediction, which distinguishes between different objects. Because the number of classes varies across images, we propose a predictive model in the space of high-level convolutional image features of the Mask R-CNN instance segmentation network. This allows us to produce visually pleasing segmentations at high resolution, for complex scenes with a large number of objects, with satisfactory performance up to half a second into the future.
Advisors/Committee Members: Verbeek, Jakob (thesis director), Couprie, Camille (thesis director).
Subjects/Keywords: Deep learning; Semantic segmentation; Instance segmentation; Generative modeling; Predictive learning; Video understanding; 004; 510
APA (6th Edition):
Luc, P. (2019). Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos : Self-supervised learning of predictive segmentation models from video. (Doctoral Dissertation). Université Grenoble Alpes (ComUE). Retrieved from http://www.theses.fr/2019GREAM024