Full Record

New Search | Similar Records

Title Deep Convolutional Neural Networks for Real-Time Single Frame Monocular Depth Estimation
Publication Date
Discipline/Department Division of Systems and Control
University/Publisher Uppsala University
Abstract Vision based active safety systems have become more frequently occurring in modern vehicles to estimate depth of the objects ahead and for autonomous driving (AD) and advanced driver-assistance systems (ADAS). In this thesis a lightweight deep convolutional neural network performing real-time depth estimation on single monocular images is implemented and evaluated. Many of the vision based automatic brake systems in modern vehicles only detect pre-trained object types such as pedestrians and vehicles. These systems fail to detect general objects such as road debris and roadside obstacles. In stereo vision systems the problem is resolved by calculating a disparity image from the stereo image pair to extract depth information. The distance to an object can also be determined using radar and LiDAR systems. By using this depth information the system performs necessary actions to avoid collisions with objects that are determined to be too close. However, these systems are also more expensive than a regular mono camera system and are therefore not very common in the average consumer car. By implementing robust depth estimation in mono vision systems the benefits from active safety systems could be utilized by a larger segment of the vehicle fleet. This could drastically reduce human error related traffic accidents and possibly save many lives. The network architecture evaluated in this thesis is more lightweight than other CNN architectures previously used for monocular depth estimation. The proposed architecture is therefore preferable to use on computationally lightweight systems. The network solves a supervised regression problem during the training procedure in order to produce a pixel-wise depth estimation map. The network was trained using a sparse ground truth image with spatially incoherent and discontinuous data and output a dense spatially coherent and continuous depth map prediction. The spatially incoherent ground truth posed a problem of discontinuity that was addressed by a masked loss function with regularization. The network was able to predict a dense depth estimation on the KITTI dataset with close to state-of-the-art performance. 
Subjects/Keywords deep learning; machine learning; mono vision system; lightweight; CNN; convolutional neural network; depth estimation; lidar; kitti; vehicle camera; mono camera; camera; real-time; real time; ad; autonomous driving; adas; advanced driver assistance systems; mono depth; computer vision; regression; pixel-wise; pixel wise; object detection; general object detection; pedestrian detection; vehicle detection; supervised learning; supervised; tensorflow; python; keras; opencv; autoliv; Computer Vision and Robotics (Autonomous Systems); Datorseende och robotik (autonoma system)
Language en
Country of Publication se
Record ID oai:DiVA.org:uu-336923
Repository diva
Date Indexed 2020-01-03

Sample Search Hits | Sample Images | Cited Works

…vehicle safety development is to prevent the traffic accident from ever happening by deploying active safety systems. By using more sophisticated monitoring and detection systems, such as several cameras, radar, LiDAR and infrared cameras, active safety…

…skilled in reading relative scenic depth information and determine how an object is related to its surroundings. The problem gets increasingly more difficult for a human if the absolute depth is to be determined, especially if only monocular cues are…

…determine scenic relative depth. When stereopsis is unavailable, e.g. when observing an image or having a monocular vision impairment, the visual cortex uses di↵erent monocular cues to interpret depth from scenes [22]. The general perspective of a…

…skyscraper (Figure 2.1a). The perceived distance of familiar objects, e.g. a car, can be determined by using the object size in conjunction with the visual angle covered by the object on the retina. The vision system also uses relative size of…