CN110781717A - Cab scene semantic and visual depth combined analysis method - Google Patents

Cab scene semantic and visual depth combined analysis method

Info

Publication number
CN110781717A
Authority
CN
China
Prior art keywords
scene
semantic
output
cab
depth
Prior art date
Legal status
Pending
Application number
CN201910734881.2A
Other languages
Chinese (zh)
Inventor
缪其恒
苏志杰
王江明
许炜
Current Assignee
Zhejiang Leapmotor Technology Co Ltd
Zhejiang Zero Run Technology Co Ltd
Original Assignee
Zhejiang Zero Run Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Zero Run Technology Co Ltd filed Critical Zhejiang Zero Run Technology Co Ltd
Priority to CN201910734881.2A
Publication of CN110781717A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/50: Context or environment of the image
    • G06V 20/59: Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods

Abstract

The invention relates to a cab scene semantic and visual depth combined analysis method, which comprises the following steps: establishing a neural network model comprising a scene semantic output branch and a scene visual depth output branch; training the scene semantic output branch; performing joint training on the scene semantic output branch and the scene visual depth output branch; collecting and preprocessing images; post-processing the model outputs; and integrating the outputs of the neural network model into scene structured data. The invention has the advantage that, unlike existing vision systems that can only perceive driver behavior, it extracts cab scene structured data through joint analysis of cab semantic and depth information and can therefore perceive the whole cab, including the driver seat and the passenger seats, providing the prior information needed for analyzing the various behaviors and occupant attributes in the cab scene area.

Description

Cab scene semantic and visual depth combined analysis method
Technical Field
The invention relates to the field of visual perception, in particular to a cab scene semantic and visual depth combined analysis method.
Background
Intelligentization is one of the important trends in the development of today's automobile industry. Vision systems play an important role in existing automatic driving/driver assistance systems and are mainly used to perceive the driving scene data required by related applications. Existing mass-production vision systems mainly sense driving scene data around the vehicle (drivable area, traffic participants, traffic sign recognition and the like) and the driving state of the driver inside the vehicle (fatigue, inattentive driving behavior and the like). Compared with perception of the scene around the vehicle, perception of the scene inside the cab, and the applications built on it, remain limited.
Existing in-cab vision applications use only a region of interest around the driver's face (or upper body), which cannot support subsequent cab analysis applications such as driving state monitoring and cab state analysis. To enable vision applications with a higher level of intelligence, the cab analysis system needs the capability to automatically measure more cab prior information, including driver seat area information, passenger seat area information, steering wheel area information and prior information for other customized functions.
Disclosure of Invention
The invention mainly solves the problem that existing vision systems cannot obtain driver seat area information, passenger seat area information, steering wheel area information and prior information for other customized functions, and provides a cab scene semantic and visual depth combined analysis method which uses an infrared system to collect images inside the cab and analyzes them with a deep convolutional neural network to finally obtain the cab prior information.
The invention solves the technical problem by adopting the technical scheme that a cab scene semantic and visual depth combined analysis method comprises the following steps:
s1: establishing a neural network model comprising a scene semantic output branch and a scene visual depth output branch;
s2: training the scene semantic output branch;
s3: performing joint training on the scene semantic output branch and the scene visual depth output branch;
s4: collecting and preprocessing an image;
s5: model output post-processing;
s6: the outputs of the neural network model are integrated into scene structured data.
The scene semantic output branch takes the cab infrared image as input and uses a deep convolutional neural network, trained on samples with manually labeled scene semantic information, to perform cab scene semantic analysis and output a cab scene semantic layer containing information such as the seats and the steering wheel. The scene visual depth output branch also takes the cab infrared image as input and shares the shallow feature part with the scene semantic analysis network; it uses a deep convolutional neural network, trained on labels generated with a lidar (laser radar), to perform cab scene visual depth analysis and outputs a normalized scene visual depth layer with the same resolution as the input picture. From the cab scene semantic layer and the normalized scene visual depth layer, scene structured data including the seats and the steering wheel can be obtained.
As a preferable scheme of the above scheme, the neural network model includes an input layer, a shared feature coding layer, and two branch output layers, and the branch output layers include a semantic output layer and a depth-of-view output layer.
As a preferred scheme of the above scheme, the input of the neural network model is a single-channel infrared image, the output of the semantic output layer is a scene semantic layer containing foreground semantic outputs representing the people and objects in the cab, and the output of the visual depth output layer is a normalized scene visual depth layer in which the intensity of each pixel represents the actual distance between the camera and the object or person corresponding to that pixel.
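By way of illustration only, the two-branch topology described above can be sketched as follows (PyTorch is assumed; the encoder depth, channel counts and upsampling factors are illustrative choices, not taken from the patent):

```python
import torch
import torch.nn as nn

class CabSceneNet(nn.Module):
    """Shared feature encoder with a semantic branch and a visual-depth branch."""
    def __init__(self, num_classes=7):
        super().__init__()
        # Shared feature coding layer: a small convolutional encoder (illustrative).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        # Semantic output layer: per-pixel class scores, upsampled to input resolution.
        self.semantic_head = nn.Sequential(
            nn.Conv2d(64, num_classes, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
        )
        # Visual-depth output layer: one normalized depth value per pixel in [0, 1].
        self.depth_head = nn.Sequential(
            nn.Conv2d(64, 1, 1),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (N, 1, H, W) single-channel infrared image
        features = self.encoder(x)
        return self.semantic_head(features), self.depth_head(features)
```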
As a preferable scheme of the above scheme, the scene semantic output branch training in step S2 includes the following steps:
s21: collecting cab scene data of different visual angles, vehicle types and illumination conditions;
s22: labeling various foreground semantic outputs of the semantic layer to generate a training data set;
s23: expanding the training data set;
s24: randomly initializing the scene semantic output branch parameters, optimizing the pixel-level semantic loss function L_Sem by a batch stochastic gradient descent method, and updating the scene semantic output branch parameters

L_{Sem} = -\frac{1}{W \cdot H}\sum_{u=1}^{W}\sum_{v=1}^{H} s_{u,v}\,\log(p_{u,v})

wherein [u, v] are the pixel coordinates of each point in the image coordinate system, [W, H] are the width and height of the input image, s_{u,v} is the semantic label at the corresponding coordinates, and p_{u,v} is the predicted value of the scene semantic output branch at the corresponding coordinates.
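The exact form of the pixel-level loss is reconstructed above from its symbol definitions; the sketch below therefore assumes a per-pixel cross-entropy and uses illustrative optimizer settings for the batch stochastic gradient descent named in S24:

```python
import torch
import torch.nn as nn

model = CabSceneNet(num_classes=7)           # two-branch model from the earlier sketch
criterion = nn.CrossEntropyLoss()            # assumed form of the pixel-level semantic loss L_Sem
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)

def semantic_training_step(images, labels):
    """One batch SGD step for the scene semantic output branch.
    images: (N, 1, H, W) infrared batch; labels: (N, H, W) class indices."""
    optimizer.zero_grad()
    sem_logits, _ = model(images)            # the depth branch output is ignored at this stage
    loss = criterion(sem_logits, labels)     # averaged over all W x H pixels
    loss.backward()
    optimizer.step()
    return loss.item()
```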
As a preferable scheme of the above scheme, the joint training of the scene semantic output branch and the scene depth output branch in step S3 includes the following steps:
s31: acquiring a joint training set;
s32: freezing the shared feature layer parameters obtained from the scene semantic output branch training;
s33: expanding the joint training set;
s34: randomly initializing the neural network model parameters, optimizing the visual depth loss function L_Dep at a high learning rate by a batch stochastic gradient descent method, and updating the scene visual depth output branch parameters

L_{Dep} = \frac{1}{W \cdot H}\sum_{x=1}^{W}\sum_{y=1}^{H}\left(d(x,y) - d_{T}(x,y)\right)^{2}

wherein d(x, y) is the visual depth at each pixel (x, y) of the output depth map, d_T(x, y) is the visual depth label of the training sample, and W and H are the image width and height;
s35: reducing the training learning rate, unfreezing the shared feature layer parameters, and updating the parameters of each branch network according to the following combined semantic and visual depth loss function L

L = k_1 \cdot L_{Sem} + k_2 \cdot L_{Dep}

wherein k_1 and k_2 are configurable weight coefficients of the semantic and visual depth loss functions in the combined loss function.
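A minimal sketch of the two-phase procedure of S32 to S35, under the same PyTorch assumption; the depth loss is assumed to be a mean squared error, and the learning rates and the weights k1, k2 are placeholders:

```python
import torch

def joint_training(model, loader, k1=0.5, k2=0.5):
    """model: CabSceneNet instance from the earlier sketch; loader yields
    (images, labels, depth_gt) with depth_gt as the normalized depth label."""
    ce = torch.nn.CrossEntropyLoss()          # semantic loss L_Sem (assumed form)
    mse = torch.nn.MSELoss()                  # visual-depth loss L_Dep (assumed form)

    # Phase 1 (S32/S34): freeze the shared feature layers, train the depth head at a high learning rate.
    for p in model.encoder.parameters():
        p.requires_grad = False
    opt = torch.optim.SGD(model.depth_head.parameters(), lr=1e-2, momentum=0.9)
    for images, labels, depth_gt in loader:
        opt.zero_grad()
        _, depth = model(images)
        mse(depth.squeeze(1), depth_gt).backward()
        opt.step()

    # Phase 2 (S35): unfreeze everything and optimize L = k1*L_Sem + k2*L_Dep at a lower rate.
    for p in model.parameters():
        p.requires_grad = True
    opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    for images, labels, depth_gt in loader:
        opt.zero_grad()
        sem_logits, depth = model(images)
        loss = k1 * ce(sem_logits, labels) + k2 * mse(depth.squeeze(1), depth_gt)
        loss.backward()
        opt.step()
```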
As a preferable scheme of the above scheme, the joint training set in S31 is obtained as follows: the scene semantic output branch training data set in step S2 is annotated with lidar point cloud output data; the camera coordinate system and the point cloud coordinates are aligned according to the camera calibration parameters and the lidar system calibration parameters; the point cloud coordinates are then transformed into the image coordinate system using the pinhole imaging principle; and the parts of the visual depth map without valid data are completed by bilinear interpolation.
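The lidar-to-image depth label generation described here can be sketched as follows (numpy/scipy assumed; K and T_cam_lidar stand for the camera intrinsics and the lidar-to-camera extrinsic calibration and are hypothetical inputs):

```python
import numpy as np
from scipy.interpolate import griddata

def lidar_to_depth_label(points_lidar, K, T_cam_lidar, width=640, height=320):
    """Project lidar points into the image and densify them into a per-pixel depth label.
    points_lidar: (N, 3) xyz in the lidar frame; K: 3x3 camera intrinsics;
    T_cam_lidar: 4x4 lidar-to-camera extrinsic transform (from system calibration)."""
    # Align the point cloud with the camera coordinate system.
    pts_h = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    pts_cam = (T_cam_lidar @ pts_h.T).T[:, :3]
    pts_cam = pts_cam[pts_cam[:, 2] > 0]                   # keep points in front of the camera

    # Pinhole projection into pixel coordinates.
    uv = (K @ pts_cam.T).T
    u, v = uv[:, 0] / uv[:, 2], uv[:, 1] / uv[:, 2]
    mask = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[mask], v[mask], pts_cam[mask, 2]

    # Densify by linear interpolation over the sparse samples
    # (a stand-in for the bilinear completion step described in the text).
    grid_u, grid_v = np.meshgrid(np.arange(width), np.arange(height))
    depth = griddata((u, v), z, (grid_u, grid_v), method="linear")
    return depth                                           # (height, width), NaN where not interpolatable
```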
As a preferable scheme of the above scheme, the image preprocessing in step S4 includes adaptive adjustment of the camera shutter, aperture and gain parameters, and extraction and scaling of the image Y channel.
As a preferred scheme of the above scheme, in the model output post-processing of step S5, the multi-channel confidences of each semantic class output by the scene semantic output branch are processed according to the following formula to obtain the cab scene semantic layer

Sem(u,v) = \begin{cases} \arg\max_{i} ch_{i}(u,v), & \text{if } \max_{i} ch_{i}(u,v) \geq T_{min} \\ 0, & \text{otherwise} \end{cases}

wherein ch_i(u, v) is the confidence output of channel i of the scene semantic network, and T_min is the minimum credibility threshold, a configurable parameter.
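A numpy sketch of this thresholded per-pixel class selection (the channel layout and the default threshold follow the embodiment described below):

```python
import numpy as np

def semantic_layer_from_confidences(conf, t_min=0.5):
    """conf: (H, W, C) per-class confidences from the scene semantic branch.
    Returns an (H, W) semantic layer; pixels whose best confidence falls below
    the minimum-credibility threshold t_min are assigned class 0 ("other")."""
    best_class = np.argmax(conf, axis=-1)
    best_conf = np.max(conf, axis=-1)
    return np.where(best_conf >= t_min, best_class, 0)
```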
As a preferable scheme of the above scheme, the output integration of the neural network model in step S6 includes the following steps:
s61: establishing a coordinate system of a cockpit;
s62: based on the neural network output result, clustering the same semantic individuals by using scene depth information;
s63: and constructing cab scene structured data for describing the analysis result of the cab scene network.
As a preferable scheme of the above scheme, the cab scene structured data includes the cab class, the number of seats and the attributes of each seat. The cab class is input through a configuration parameter interface.
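One possible container for the structured output described here is sketched below; the field names are illustrative and are not defined by the patent:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SeatInfo:
    seat_id: int                           # index within the cab coordinate system
    occupied: bool = False                 # whether a person was clustered onto this seat
    occupant_class: Optional[int] = None   # e.g. 2 = person, 5 = child seat (per the semantic definition)
    seatbelt_on: Optional[bool] = None

@dataclass
class CabSceneStructuredData:
    cab_class: str                         # passed in via the configuration parameter interface
    seat_count: int
    seats: List[SeatInfo] = field(default_factory=list)
```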
The invention has the advantage that, unlike existing vision systems that can only perceive driver behavior, it extracts cab scene structured data through joint analysis of cab semantic and depth information and can therefore perceive the whole cab, including the driver seat and the passenger seats, providing the prior information needed for analyzing the various behaviors and occupant attributes in the cab scene area.
Drawings
FIG. 1 is a schematic flow chart of the present invention.
FIG. 2 is a schematic diagram of a topology of a neural network model according to the present invention.
FIG. 3 is a schematic flow chart of scene semantic output branch training according to the present invention.
FIG. 4 is a schematic flow chart of the joint training of the present invention.
FIG. 5 is a flow chart illustrating the integration of the output of the neural network model into scene structured data according to the present invention.
Fig. 6 is a schematic structural diagram of scene structured data according to the present invention.
Detailed Description
The technical solution of the present invention is further described below by way of examples with reference to the accompanying drawings.
Example 1:
In this embodiment, a cab scene semantic and visual depth combined analysis method, as shown in Fig. 1, includes two stages, namely an offline training stage and an online application stage. The offline training stage includes:
s1: establishing a neural network model comprising a scene semantic output branch and a scene visual depth output branch. As shown in Fig. 2, the neural network model comprises an input layer, a shared feature coding layer and two branch output layers, and the branch output layers comprise a semantic output layer and a visual depth output layer. In this embodiment the model input is a 640 x 320 single-channel infrared image. The output of the semantic output layer is processed into a 640 x 320 scene semantic layer containing foreground semantic outputs representing the people and objects in the cab (definition: 0 - other, 1 - seat, 2 - person, 3 - seat belt, 4 - steering wheel, 5 - child seat, 6 - driver seat). The output of the visual depth output layer is a normalized scene visual depth layer (actual measurement range 0-5 m, normalized output range 0-1, floating point), in which the intensity of each pixel represents the actual distance between the camera and the object or person corresponding to that pixel (a small sketch of the class table and depth normalization follows);
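A sketch of the class definition table and the linear depth normalization implied by the 0-5 m measurement range of this embodiment:

```python
# Semantic class definition of this embodiment.
SEMANTIC_CLASSES = {0: "other", 1: "seat", 2: "person", 3: "seat belt",
                    4: "steering wheel", 5: "child seat", 6: "driver seat"}

DEPTH_MAX_METERS = 5.0   # measurement range of this embodiment: 0-5 m

def normalize_depth(depth_m: float) -> float:
    """Map a measured depth in metres to the normalized [0, 1] visual depth layer value."""
    return min(max(depth_m / DEPTH_MAX_METERS, 0.0), 1.0)

def denormalize_depth(depth_norm: float) -> float:
    """Recover an approximate metric depth from a normalized layer value."""
    return depth_norm * DEPTH_MAX_METERS
```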
s2: training the scene semantic output branch, as shown in fig. 3, includes the following steps:
s21: cab scene data (approximately 100,000 images) with different viewing angles, vehicle types and illumination conditions are collected using a driver behavior analysis camera (horizontal field of view about 50 degrees) and a cab panoramic camera (horizontal field of view about 180 degrees);
s22: the foreground semantic outputs of the semantic layer (i.e. the semantic outputs of the non-0 categories) are labeled with polylines (multi-segment lines) to generate a training data set;
s23: the training data set is expanded on-line through random geometric and color transformations (a sketch of such augmentation follows this step list);
s24: randomly initializing the neural network model parameters, optimizing the pixel-level semantic loss function L_Sem by a batch stochastic gradient descent method, and updating the scene semantic output branch parameters with the semantic loss function L_Sem

L_{Sem} = -\frac{1}{W \cdot H}\sum_{u=1}^{W}\sum_{v=1}^{H} s_{u,v}\,\log(p_{u,v})

wherein [u, v] are the pixel coordinates of each point in the image coordinate system, [W, H] are the width and height of the input image, s_{u,v} is the semantic label at the corresponding coordinates, and p_{u,v} is the predicted value of the scene semantic output branch at the corresponding coordinates;
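A sketch of the on-line geometric and colour augmentation referred to in S23 (the transform set and ranges are illustrative; the semantic label map must receive the same geometric transform as the image):

```python
import numpy as np

def augment(image, label):
    """Random geometric and colour transforms applied on-line to an infrared
    training sample; image and label are (H, W) arrays."""
    # Random horizontal flip (geometric), applied to image and label alike.
    if np.random.rand() < 0.5:
        image, label = image[:, ::-1], label[:, ::-1]
    # Random brightness/contrast jitter (colour), applied to the image only.
    gain = np.random.uniform(0.8, 1.2)
    bias = np.random.uniform(-10, 10)
    image = np.clip(image.astype(np.float32) * gain + bias, 0, 255).astype(np.uint8)
    return image, label
```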
s3: the joint training is performed on the scene semantic output branch and the scene depth output branch, as shown in fig. 4, and includes the following steps:
s31: acquiring a joint training set. Specifically, the scene semantic output branch training data set in step S2 is annotated with lidar point cloud output data; the camera coordinate system and the point cloud coordinates are aligned according to the camera calibration parameters and the lidar system calibration parameters; the point cloud coordinates are then transformed into the image coordinate system using the pinhole imaging principle; and, because the lidar outputs a discrete point cloud, the parts of the visual depth map without valid data are completed by bilinear interpolation;
s32: freezing the shared feature layer parameters obtained from the scene semantic output branch training;
s33: the joint training set is expanded on-line through random geometric and color transformations;
s34: randomly initializing the neural network model parameters, optimizing the visual depth loss function L_Dep at a high learning rate by a batch stochastic gradient descent method, and updating the scene visual depth output branch parameters with the visual depth loss function L_Dep

L_{Dep} = \frac{1}{W \cdot H}\sum_{x=1}^{W}\sum_{y=1}^{H}\left(d(x,y) - d_{T}(x,y)\right)^{2}

wherein d(x, y) is the visual depth at each pixel (x, y) of the output depth map, d_T(x, y) is the visual depth label of the training sample, and W and H are the image width and height;
s35: reducing the training learning rate, unfreezing the shared feature layer parameters, and updating the parameters of each branch network, namely the scene semantic output branch parameters and the scene visual depth output branch parameters, according to the following combined semantic and visual depth loss function L

L = k_1 \cdot L_{Sem} + k_2 \cdot L_{Dep}

wherein k_1 and k_2 are configurable weight coefficients of the semantic and visual depth loss functions in the combined loss function, both 0.5 by default.
After offline training of the semantic output branch and the scene visual depth output branch of the neural network model is completed, the model is deployed: the trained model parameters are compressed by pruning (channel cutting and sparsification) and quantization (8-bit or 16-bit floating-point or fixed-point data types) and then deployed on a front-end embedded platform (as a data file and a configuration file). During forward operation, the model file is initialized through a predefined function interface.
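The quantization step can be illustrated schematically with a symmetric 8-bit weight quantization in numpy; the actual compression toolchain is not specified by the patent, so this is only a sketch of the idea:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor 8-bit quantization of a weight array.
    Returns the int8 tensor plus the scale needed to dequantize at inference time."""
    scale = max(np.abs(weights).max(), 1e-8) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale
```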
Entering an online application stage, wherein the online application comprises the following steps:
s4: image acquisition and preprocessing (a preprocessing sketch follows this step). The cab scene vision system operates under infrared fill light or natural illumination and mainly consists of a driver behavior analysis camera and a cab panoramic camera. Image preprocessing mainly includes adaptive adjustment of parameters such as the camera shutter, aperture and gain, and extraction and scaling of the image Y channel. The adaptive adjustment of the shutter, aperture and gain parameters can be completed by off-line image quality tuning; image Y-channel extraction (necessary preprocessing for infrared fill-light images, optional for input from a non-infrared camera) and scaling can be realized by writing the corresponding algorithm configuration parameters into an initialization function and reading them through the corresponding function interface. Based on this preprocessing, an image ROI (region of interest) is obtained and then input into the trained neural network module;
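A sketch of the Y-channel extraction and scaling described above (OpenCV assumed; the 640 x 320 input size follows this embodiment, while the ROI argument is a hypothetical parameter):

```python
import cv2
import numpy as np

def preprocess(frame_bgr, roi=None, size=(640, 320)):
    """Extract the Y (luma) channel, optionally crop a region of interest,
    and scale it to the network input resolution."""
    y = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2YUV)[:, :, 0]    # Y channel only
    if roi is not None:                                         # roi = (x, y, w, h), hypothetical
        x, y0, w, h = roi
        y = y[y0:y0 + h, x:x + w]
    y = cv2.resize(y, size, interpolation=cv2.INTER_LINEAR)
    return y[np.newaxis, np.newaxis, :, :].astype(np.float32)   # (1, 1, 320, 640) network input
```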
s5: model output post-processing. The scene visual depth output branch directly outputs a 640 x 320 normalized scene depth layer. The raw output of the scene semantic output branch is a 640 x 320 x 7 set of per-class confidences; the multi-channel confidences of each semantic class are processed according to the following formula to obtain the cab scene semantic layer

Sem(u,v) = \begin{cases} \arg\max_{i} ch_{i}(u,v), & \text{if } \max_{i} ch_{i}(u,v) \geq T_{min} \\ 0, & \text{otherwise} \end{cases}

wherein ch_i(u, v) is the confidence output of channel i of the scene semantic network, i = 0, 1, 2, ..., 6, and T_min is the minimum credibility threshold, a configurable parameter with a default of 0.5.
S6: integrating the output of the neural network model into scene structured data, as shown in fig. 5, comprises the following steps:
s61: establishing a cab coordinate system, the origin of which may be placed at the driver seat;
s62: based on the neural network output result, clustering the same semantic individuals by using scene depth information;
s63: constructing the cab scene structured data describing the analysis result of the cab scene network, as shown in Fig. 6 (a clustering sketch follows this step). The cab category can be passed in through a configuration parameter interface and includes small passenger vehicles, heavy commercial trucks, heavy commercial buses and the like; for heavy commercial buses the structured data does not contain passenger seat information. The obtained cab scene structured data is the prior information necessary for analyzing various behaviors and occupant attributes in the cab scene area.
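A sketch of steps S62 and S63: pixels of the same semantic class are grouped into individuals by connected-component labelling, and the mean normalized depth of each component is recorded so that individuals of the same class can later be distinguished by distance when filling the structured data (scipy assumed; field names are illustrative):

```python
import numpy as np
from scipy import ndimage

def cluster_semantic_individuals(sem_layer, depth_layer, class_id):
    """Group pixels of one semantic class into individual instances via
    connected-component labelling, attaching a mean normalized depth to each."""
    components, n = ndimage.label(sem_layer == class_id)
    instances = []
    for i in range(1, n + 1):
        mask = components == i
        instances.append({
            "class_id": class_id,
            "pixel_count": int(mask.sum()),
            "mean_depth": float(depth_layer[mask].mean()),   # normalized depth in [0, 1]
        })
    return instances

# Example: collect person instances (class 2) and seat instances (class 1)
# before filling the cab scene structured data for each seat.
```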
The specific embodiments described herein are merely illustrative of the spirit of the invention. Various modifications or additions may be made to the described embodiments or alternatives may be employed by those skilled in the art without departing from the spirit or ambit of the invention as defined in the appended claims.

Claims (10)

1. A cab scene semantic and visual depth combined analysis method, characterized by comprising the following steps:
s1: establishing a neural network model comprising a scene semantic output branch and a scene visual depth output branch;
s2: training the scene semantic output branch;
s3: performing joint training on the scene semantic output branch and the scene visual depth output branch;
s4: collecting and preprocessing an image;
s5: model output post-processing;
s6: the outputs of the neural network model are integrated into scene structured data.
2. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 1, wherein the method comprises the following steps: the neural network model comprises an input layer, a shared feature coding layer and two branch output layers, wherein the branch output layers comprise a semantic output layer and an apparent depth output layer.
3. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 2, wherein the method comprises the following steps: the input of the neural network model is a single-channel infrared image, the output of the semantic output layer is a scene semantic layer containing foreground semantic output representing people and objects in a cab, the output of the visual depth output layer is a normalized scene visual depth layer, and the brightness of each pixel point in the normalized scene visual depth layer represents the distance between the object corresponding to the pixel point and the people and the camera in practice.
4. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 1, wherein the method comprises the following steps: the scene semantic output branch training in the step S2 includes the following steps:
s21: collecting cab scene data of different visual angles, vehicle types and illumination conditions;
s22: labeling various foreground semantic outputs of the semantic layer to generate a training data set;
s23: expanding the training data set;
s24: randomly initializing the neural network model parameters, optimizing the pixel-level semantic loss function L_Sem by a batch stochastic gradient descent method, and updating the scene semantic output branch parameters

L_{Sem} = -\frac{1}{W \cdot H}\sum_{u=1}^{W}\sum_{v=1}^{H} s_{u,v}\,\log(p_{u,v})

wherein [u, v] are the pixel coordinates of each point in the image coordinate system, [W, H] are the width and height of the input image, s_{u,v} is the semantic label at the corresponding coordinates, and p_{u,v} is the predicted value of the scene semantic output branch at the corresponding coordinates.
5. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 1, wherein the method comprises the following steps: in step S3, the joint training of the scene semantic output branch and the scene depth output branch includes the following steps:
s31: acquiring a joint training set;
s32: freezing the shared feature layer parameters obtained from the scene semantic output branch training;
s33: expanding the joint training set;
s34: randomly initializing the neural network model parameters, optimizing the visual depth loss function L_Dep at a high learning rate by a batch stochastic gradient descent method, and updating the scene visual depth output branch parameters

L_{Dep} = \frac{1}{W \cdot H}\sum_{x=1}^{W}\sum_{y=1}^{H}\left(d(x,y) - d_{T}(x,y)\right)^{2}

wherein d(x, y) is the visual depth at each pixel (x, y) of the output depth map, d_T(x, y) is the visual depth label of the training sample, and W and H are the image width and height;
s35: reducing the training learning rate, unfreezing the shared feature layer parameters, and updating the parameters of each branch network according to the following combined semantic and visual depth loss function L

L = k_1 \cdot L_{Sem} + k_2 \cdot L_{Dep}

wherein k_1 and k_2 are configurable weight coefficients of the semantic and visual depth loss functions in the combined loss function.
6. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 5, wherein the method comprises the following steps: the joint training set in the S31 is obtained by the following steps: calibrating the scene semantic output branch training data set in the step S2 by using the laser radar point cloud output data, aligning a camera coordinate system and point cloud data coordinates according to camera calibration parameters and laser radar system calibration parameters, transforming the point cloud data coordinates into an image coordinate system by using a pinhole imaging principle, and completing the part of the depth-of-view map without effective data by using a bilinear interpolation method.
7. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 1, wherein the method comprises the following steps: the image preprocessing in step S4 includes adaptive adjustment of camera shutter, aperture and gain parameters, and image Y-channel clipping and scaling.
8. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 1, wherein: in the model output post-processing of step S5, the multi-channel confidences of each semantic class output by the scene semantic output branch are processed according to the following formula to obtain the cab scene semantic layer

Sem(u,v) = \begin{cases} \arg\max_{i} ch_{i}(u,v), & \text{if } \max_{i} ch_{i}(u,v) \geq T_{min} \\ 0, & \text{otherwise} \end{cases}

wherein ch_i(u, v) is the confidence output of channel i of the scene semantic network, and T_min is the minimum credibility threshold, a configurable parameter.
9. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 1, wherein the method comprises the following steps: the output integration of the neural network model in step S6 includes the following steps:
s61: establishing a coordinate system of a cockpit;
s62: based on the neural network output result, clustering the same semantic individuals by using scene depth information;
s63: and constructing cab scene structured data for describing the analysis result of the cab scene network.
10. The method for jointly analyzing the semantic meaning and the visual depth of the cab scene as claimed in claim 1 or 9, wherein: the cab scene structured data comprises the cab category, the number of seats and the attributes of each seat.
Application CN201910734881.2A, "Cab scene semantic and visual depth combined analysis method", filed 2019-08-09 (priority date 2019-08-09), published as CN110781717A on 2020-02-11. Family ID: 69383990. Legal status: Pending.


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
CB02: Change of applicant information. Applicant changed from ZHEJIANG LEAPMOTOR TECHNOLOGY Co.,Ltd. to Zhejiang Zero run Technology Co.,Ltd.; address (before and after): 1st and 6th floors, no.451 Internet of things street, Binjiang District, Hangzhou City, Zhejiang Province, 310051.
RJ01: Rejection of invention patent application after publication (application publication date: 2020-02-11)