WO2019228450A1 - Image processing method, device, and equipment, and readable medium - Google Patents

Image processing method, device, and equipment, and readable medium

Publication number
WO2019228450A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
target
image data
image
position information
Prior art date
Application number
PCT/CN2019/089249
Other languages
French (fr)
Chinese (zh)
Inventor
徐跃书
肖飞
范蒙
俞海
Original Assignee
杭州海康威视数字技术股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杭州海康威视数字技术股份有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Provided by the present disclosure are an image processing method, device, and equipment, and a storage medium, the image processing method comprising: acquiring, from collected first image data of a first data format, position information of a specific target in the first image data; intercepting target data corresponding to the position information from the first image data; converting the data format of the target data from the first data format to a second data format, the second data format being suitable for displaying and/or transmitting the target data. The present disclosure may improve the image quality of a detection target.

Description

Image processing method, apparatus and device, and readable medium
Cross-reference to related applications
This patent application claims priority to Chinese patent application No. 201810571964.X, filed on May 31, 2018 and entitled "Image processing method, apparatus and device, and readable medium", the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of image processing technologies, and in particular to an image processing method, apparatus and device, and a readable medium.
Background
The main purpose of target detection technology is to detect and locate specific targets in a single image frame or in video. Target detection is now widely used in many fields, for example: text detection for goods handling in logistics, detection of vehicles committing traffic violations in road traffic, and passenger flow detection and counting in shopping malls and stations.
Target detection algorithms are mainly applied to low-bit-width images that have already undergone ISP processing. After a target of interest is detected, the corresponding target image is cropped from the image for display or for subsequent processing such as recognition. In systems using this detection technique, the quality of the target images finally obtained varies widely: some images are of good quality, but many suffer from poor quality such as blur, insufficient brightness, or insufficient contrast.
Patent application publication No. CN104463103A, published by the Chinese Patent Office, proposes an image processing method and apparatus in which, when the detected target is text, the text in the target image is sharpened. The main flow of that solution is as follows: targets of interest in the image are first detected; each detected target is classified using a preset classifier; and when the classification result is text, the text is sharpened.
Owing to design flaws in ISP processing algorithms and the accumulation of losses across the individual processing modules, some of the original image information is ultimately lost. The technical solution of CN104463103A processes text using an image in a data format that has already undergone ISP processing, by which point the information may have been severely lost and cannot be recovered afterwards. Moreover, that solution processes only text, which is generally a very small part of what people are interested in; when other targets of interest such as faces, vehicles, or buildings are detected, it performs no subsequent processing to improve the quality of those key targets. On the whole, the existing solution is rather limited and cannot comprehensively improve the image quality of detected targets.
Summary of the invention
In view of this, the present disclosure provides an image processing method, apparatus and device, and a readable medium, which can improve the image quality of a detected target.
A first aspect of the present disclosure provides an image processing method, including:
acquiring, from collected first image data in a first data format, position information of a specified target in the first image data;
cropping target data corresponding to the position information from the first image data;
converting the data format of the target data from the first data format to a second data format, the second data format being suitable for display and/or transmission of the target data.
According to an embodiment of the present disclosure, acquiring, from the collected first image data in the first data format, the position information of the specified target in the first image data includes:
converting the first image data into second image data on which target detection can be performed;
detecting position information of the specified target in the second image data, and determining the detected position information as the position information of the specified target in the first image data.
According to an embodiment of the present disclosure, detecting the position information of the specified target in the second image data and determining the detected position information as the position information of the specified target in the first image data includes:
inputting the second image data into a trained first neural network, the first neural network locating and outputting the position information of the specified target by means of at least a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation;
determining the result output by the first neural network as the position information of the specified target in the first image data.
According to an embodiment of the present disclosure, converting the first image data into second image data on which target detection can be performed includes: converting the first image data into the second image data using at least one of the following image processing operations: black level correction, white balance correction, color interpolation, contrast enhancement, and bit-width compression.
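As an illustration of this conversion, the following is a minimal sketch that chains three of the listed operations: black level correction, white balance correction, and bit-width compression. It rests on assumptions that the disclosure does not state: the first image data is taken to be a 12-bit RGGB Bayer mosaic, and the function name, black level, and gain values are hypothetical. Color interpolation and contrast enhancement are omitted, which the "at least one of" wording permits.

```python
def to_detection_image(raw, black_level=64, gains=(2.0, 1.0, 1.5),
                       in_bits=12, out_bits=8):
    """Convert an RGGB mosaic (list of rows) to low-bit-width data for detection.

    All default values are hypothetical and chosen for illustration only.
    """
    r_gain, g_gain, b_gain = gains
    in_max = (1 << in_bits) - 1 - black_level   # usable range after black level removal
    out_max = (1 << out_bits) - 1
    out = []
    for y, row in enumerate(raw):
        out_row = []
        for x, value in enumerate(row):
            value = max(value - black_level, 0)            # black level correction
            if y % 2 == 0 and x % 2 == 0:
                gain = r_gain                              # red site (assumed RGGB layout)
            elif y % 2 == 1 and x % 2 == 1:
                gain = b_gain                              # blue site
            else:
                gain = g_gain                              # green sites
            value = min(value * gain, in_max)              # white balance, clipped
            out_row.append(round(value * out_max / in_max))  # bit-width compression
        out.append(out_row)
    return out


mosaic = [[64, 4095],
          [30, 4095]]
detection_input = to_detection_image(mosaic)
```

The output is used only to locate the target; the data that is later cropped and converted is still the unmodified first image data.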
According to an embodiment of the present disclosure, acquiring, from the collected first image data in the first data format, the position information of the specified target in the first image data includes:
inputting the first image data into a trained second neural network, the second neural network, by means of at least a grayscale layer for performing grayscale processing, a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation, converting the first image data into second image data on which target detection can be performed and detecting position information of the specified target in the second image data;
determining the result output by the second neural network as the position information of the specified target in the first image data.
According to an embodiment of the present disclosure, converting the data format of the target data from the first data format to the second data format includes: inputting the target data into a trained third neural network, the third neural network converting the data format of the target data from the first data format to the second data format by means of at least a convolutional layer for performing convolution.
According to an embodiment of the present disclosure, converting the data format of the target data from the first data format to the second data format includes:
performing ISP processing on the target data, wherein the ISP processing is used to convert the data format of the target data from the first data format to the second data format and includes at least color interpolation.
According to an embodiment of the present disclosure, the ISP processing further includes at least one of the following: white balance correction and curve mapping.
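The ISP-based conversion can be sketched as follows. This is an illustrative example under stated assumptions, not the processing defined by the disclosure: the target data is assumed to be a 10-bit RGGB Bayer mosaic, color interpolation is replaced by a deliberately simple half-resolution scheme (one RGB pixel per 2x2 cell), and curve mapping is assumed to be a power-law (gamma) curve. The function names and the gamma value are hypothetical.

```python
def demosaic_rggb_2x2(raw):
    """Simplified color interpolation: one RGB pixel per 2x2 RGGB cell."""
    rgb = []
    for y in range(0, len(raw) - 1, 2):
        row = []
        for x in range(0, len(raw[y]) - 1, 2):
            r = raw[y][x]
            g = (raw[y][x + 1] + raw[y + 1][x]) / 2   # average the two green sites
            b = raw[y + 1][x + 1]
            row.append((r, g, b))
        rgb.append(row)
    return rgb


def curve_map(rgb, in_max=1023, gamma=2.2):
    """Map linear values to 8-bit output through a power-law curve."""
    def tone(value):
        clipped = min(max(value, 0), in_max)
        return round(255 * (clipped / in_max) ** (1 / gamma))
    return [[tuple(tone(c) for c in pixel) for pixel in row] for row in rgb]


target_data = [[1023, 0],
               [0, 1023]]
displayable = curve_map(demosaic_rggb_2x2(target_data))
```

Because only the cropped target data passes through these steps, the parameters can in principle be chosen per target rather than for the whole frame.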
A second aspect of the present disclosure provides an image processing apparatus, including:
a first processing module, configured to acquire, from collected first image data in a first data format, position information of a specified target in the first image data;
a second processing module, configured to crop target data corresponding to the position information from the first image data;
a third processing module, configured to convert the data format of the target data from the first data format to a second data format, the second data format being suitable for display and/or transmission of the target data.
According to an embodiment of the present disclosure, the first processing module includes a first processing unit and a second processing unit; the first processing unit is configured to convert the first image data into second image data on which target detection can be performed; the second processing unit is configured to detect position information of the specified target in the second image data and determine the detected position information as the position information of the specified target in the first image data.
According to an embodiment of the present disclosure, the second processing unit is specifically configured to: input the second image data into a trained first neural network, and determine the result output by the first neural network as the position information of the specified target in the first image data; the first neural network locates and outputs the position information of the specified target by means of at least a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation.
According to an embodiment of the present disclosure, the first processing unit is specifically configured to convert the first image data into second image data on which target detection can be performed, using at least one of the following image processing operations: black level correction, white balance correction, color interpolation, contrast enhancement, and bit-width compression.
According to an embodiment of the present disclosure, the first processing module includes a third processing unit; the third processing unit is configured to input the first image data into a trained second neural network and to determine the result output by the second neural network as the position information of the specified target in the first image data; the second neural network, by means of at least a grayscale layer for performing grayscale processing, a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation, converts the first image data into second image data on which target detection can be performed and detects position information of the specified target in the second image data.
According to an embodiment of the present disclosure, the third processing module includes a fourth processing unit; the fourth processing unit is configured to input the target data into a trained third neural network; the third neural network converts the data format of the target data from the first data format to the second data format by means of at least a convolutional layer for performing convolution.
According to an embodiment of the present disclosure, the third processing module includes a fifth processing unit; the fifth processing unit is configured to perform ISP processing on the target data, wherein the ISP processing is used to convert the data format of the target data from the first data format to the second data format and includes at least color interpolation.
A third aspect of the present disclosure provides an electronic device, including a processor and a memory; the memory stores a program that can be invoked by the processor; when the processor executes the program, the image processing method according to any one of the foregoing embodiments is implemented.
A fourth aspect of the present disclosure provides a machine-readable storage medium storing a program which, when executed by a processor, implements the image processing method according to any one of the foregoing embodiments.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects:
In the embodiments of the present invention, the collected first image data in the first data format is used to detect the specified target and obtain its position information, and the target data corresponding to that position information is then cropped from the first image data in the first data format. Because the target data is cropped from the first image data, its image format and quality are unchanged. The target data is then converted to a data format suitable for display and/or transmission. Compared with existing approaches, which detect targets in an image that has already undergone image processing and then post-process the result, this improves the image quality of the detected target.
Brief description of the drawings
FIG. 1 is a schematic flowchart of an image processing method according to an exemplary embodiment of the present disclosure.
FIG. 2 is a structural block diagram of an image processing apparatus according to an exemplary embodiment of the present disclosure.
FIG. 3 is a structural block diagram of an embodiment of the first processing module provided by the present disclosure.
FIG. 4 is a schematic flowchart of an embodiment of converting first image data into second image data provided by the present disclosure.
FIG. 5 is a schematic diagram of an embodiment of color interpolation provided by the present disclosure.
FIG. 6 is a structural block diagram of an embodiment of the first neural network provided by the present disclosure.
FIG. 7 is a structural block diagram of another embodiment of the first neural network provided by the present disclosure.
FIG. 8 is a structural block diagram of another embodiment of the first processing module provided by the present disclosure.
FIG. 9 is a structural block diagram of an embodiment of the second neural network provided by the present disclosure.
FIG. 10 is a structural block diagram of another embodiment of the second neural network provided by the present disclosure.
FIG. 11 is a schematic diagram of an embodiment of grayscale processing provided by the present disclosure.
FIG. 12 is a structural block diagram of an embodiment of the image processing apparatus provided by the present disclosure.
FIG. 13 is a structural block diagram of an embodiment of the third neural network provided by the present disclosure.
FIG. 14 is a structural block diagram of another embodiment of the image processing apparatus provided by the present disclosure.
FIG. 15 is a structural block diagram of an embodiment of the ISP processing, provided by the present disclosure, that converts target data from the first data format to the second data format.
FIG. 16 is a structural block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed description
Exemplary embodiments are described in detail here, with examples illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure, as detailed in the appended claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to limit the present disclosure. The singular forms "a", "said", and "the" used in the present disclosure and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, third, and so on may be used in the present disclosure to describe various information, such information should not be limited by these terms. These terms are used only to distinguish information of the same type from one another. For example, without departing from the scope of the present disclosure, first information may also be referred to as second information and, similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
To make the description of the present disclosure clearer and more concise, some technical terms used in the present disclosure are explained below:
ISP (Image Signal Processor) processing: processing of the image signal collected by the image sensor of a front-end imaging device, with functions such as defective pixel correction, black level correction, white balance correction, color interpolation, gamma correction, color correction, sharpening, and denoising; one or more of these may be selected according to the actual application.
Deep learning: a method that uses neural networks to simulate the analysis and learning of the human brain and to build corresponding data representations.
Neural network: composed mainly of neurons; may include convolutional layers, pooling layers, and the like.
The image processing method of the embodiments of the present disclosure is described in more detail below, without being limited thereto.
In one embodiment, referring to FIG. 1, an image processing method according to an embodiment of the present disclosure is shown. The method may include the following steps:
S1: acquiring, from collected first image data in a first data format, position information of a specified target in the first image data;
S2: cropping target data corresponding to the position information from the first image data;
S3: converting the data format of the target data from the first data format to a second data format, the second data format being suitable for display and/or transmission of the target data.
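The flow of steps S1 to S3 can be sketched as follows. This is an illustrative outline only: the function and type names are hypothetical, the position information is assumed to be an end-exclusive box `(x0, y0, x1, y1)`, and the detector and format converter are passed in as callables because the disclosure allows several implementations of each. The stand-in detector and converter at the bottom are toys used to make the sketch runnable.

```python
from typing import Callable, List, Tuple

Raw = List[List[int]]            # first-format data: one sensor value per pixel
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive (assumed convention)


def process_frame(raw: Raw,
                  locate: Callable[[Raw], Box],
                  convert: Callable[[Raw], object]) -> object:
    """Run steps S1 to S3 on one frame of first-format image data."""
    x0, y0, x1, y1 = locate(raw)                 # S1: position of the specified target
    target = [row[x0:x1] for row in raw[y0:y1]]  # S2: crop from the raw frame itself
    return convert(target)                       # S3: first format -> second format


# Toy stand-ins, for illustration only.
def fixed_box(raw: Raw) -> Box:
    return (2, 1, 5, 3)


def to_8bit(target: Raw) -> Raw:
    # Drop the two least significant bits of 10-bit values.
    return [[v >> 2 for v in row] for row in target]


frame = [[16 * (6 * r + c) for c in range(6)] for r in range(4)]
result = process_frame(frame, fixed_box, to_8bit)
```

Note that `locate` may internally convert the frame to whatever it needs for detection, but the crop in S2 always reads from the original `raw` frame.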
In the embodiments of the present disclosure, the image processing method of FIG. 1 may be applied to an image device. The image device may be a device with an imaging function, such as a camera, or a device capable of image post-processing, without specific limitation. The first image data in the first data format may be image data collected by the image device itself or image data obtained from another device, without specific limitation.
The image data collected by the image device is the first image data, and its image data format is the first data format. The first data format is the raw image format collected by the image device; for example, the raw image format is the format, without image preprocessing, that the image sensor in the image device generates after sensing one or more spectral bands. An image in the raw image format may contain data from one or more spectral bands, for example spectral sampling signals in the wavelength range of 380 nm to 780 nm and/or spectral sampling signals in the wavelength range of 780 nm to 2500 nm. Generally speaking, an image in the first data format is difficult to display or transmit directly.
In step S1, position information of a specified target in the first image data is acquired from the collected first image data in the first data format.
The first image data contains a specified target, which is an object expected to undergo ISP processing so as to improve its image quality. The specified target can be detected and located in the first image data.
The position information of the specified target in the first image data may include: the coordinates of a feature point of the specified target in the first image data together with the size of the image region of the specified target; or the coordinates of the start point and end point of the image region of the specified target; and so on. No specific limitation is imposed, as long as the position of the specified target in the first image data can be located.
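The two representations of position information mentioned above carry the same information and can be converted into each other. The sketch below illustrates this under assumptions not fixed by the disclosure: the feature point is taken to be the center of the target region, coordinates are integer pixel indices, and the end point is exclusive. The type and function names are hypothetical.

```python
from typing import NamedTuple, Tuple


class CornerBox(NamedTuple):
    """Start point (x0, y0) and end point (x1, y1); the end point is exclusive."""
    x0: int
    y0: int
    x1: int
    y1: int


def from_center_and_size(cx: int, cy: int, w: int, h: int) -> CornerBox:
    """Feature point (assumed to be the center) plus region size -> corners."""
    x0 = cx - w // 2
    y0 = cy - h // 2
    return CornerBox(x0, y0, x0 + w, y0 + h)


def to_center_and_size(box: CornerBox) -> Tuple[int, int, int, int]:
    """Corners -> (center x, center y, width, height)."""
    w = box.x1 - box.x0
    h = box.y1 - box.y0
    return (box.x0 + w // 2, box.y0 + h // 2, w, h)


box = from_center_and_size(10, 8, 6, 4)
round_trip = to_center_and_size(box)
```

Either form is sufficient for the later cropping step, which only needs the row and column ranges of the region.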
Step S2 is then performed: target data corresponding to the position information is cropped from the first image data.
The first image data in step S2 is the collected first image data in the first data format, that is, the raw image as collected by the device. It is not image data obtained by processing the first image data in order to acquire the position information of the specified target, so no image information has been lost. In other words, the first image data used in step S1 and in step S2 comes from the same data source: it may be the same first image data, or different first image data collected in the same scene, for example two consecutive frames, provided that the specified target does not move or otherwise change between the two frames. Preferably, the same first image data is used in step S1 and step S2; this first image data may be stored in the image device and retrieved when needed.
Because the position information is detected in the first image data, the image region to which the position information corresponds in the first image data is the specified target. The region of the first image data indicated by the position information can be cropped to obtain the target data corresponding to the specified target. Because the target data is cropped from the first image data, its data format is still the first data format, the same as that of the first image data.
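A minimal sketch of the cropping in step S2 is given below. One detail in it is an assumption rather than something the text specifies: if the first image data is a color-filter-array mosaic with a repeating 2x2 pattern (for example Bayer), expanding the crop to multiples of that period keeps the pattern phase of the target data identical to that of the full frame, so later color interpolation can treat it the same way. The function name and the `period` parameter are hypothetical.

```python
def crop_raw(raw, box, period=2):
    """Crop box = (x0, y0, x1, y1), end-exclusive, from first-format data.

    The box is expanded outward to multiples of `period` (an assumed
    mosaic period) and then clamped to the frame.
    """
    x0, y0, x1, y1 = box
    height, width = len(raw), len(raw[0])
    x0 = max(0, x0 - x0 % period)           # round start down
    y0 = max(0, y0 - y0 % period)
    x1 = min(width, x1 + (-x1) % period)    # round end up
    y1 = min(height, y1 + (-y1) % period)
    # Slicing copies the sensor values unchanged, so the target data
    # stays in the first data format.
    return [row[x0:x1] for row in raw[y0:y1]]


full_frame = [[10 * r + c for c in range(8)] for r in range(6)]
target_data = crop_raw(full_frame, (3, 1, 6, 4))
```

No pixel value is modified by the crop; only the extent of the data changes.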
接着执行步骤S3,将所述目标数据的数据格式从所述第一数据格式转换为第二数据格式,所述第二数据格式适于所述目标数据显示和/或传输。Step S3 is then executed to convert the data format of the target data from the first data format to a second data format, and the second data format is suitable for displaying and / or transmitting the target data.
步骤S3中,是对第一数据格式的目标数据进行图像处理,将其数据格式转换为第二数据格式,第二数据格式是适于目标数据进行显示和/或传输的一种数据格式,第一数据格式和第二数据格式都是图像格式。图像处理的过程可以不仅仅进行数据格式转换,还可以包括其他的图像处理,提高目标数据的图像质量。In step S3, image processing is performed on the target data in the first data format, and the data format is converted into the second data format. The second data format is a data format suitable for displaying and / or transmitting the target data. Both the first data format and the second data format are image formats. The image processing process may not only perform data format conversion, but may also include other image processing to improve the image quality of the target data.
本公开实施例利用采集得到的第一数据格式的第一图像数据,进行指定目标的检测而获取其位置信息,再利用第一数据格式的第一图像数据截取所得位置信息对应的目标数据,该目标数据由于是从第一图像数据中截取的,因而未发生图像格式或质量的改变,再对该目标数据进行格式转换而将其转换至适于显示和/或传输的数据格式,相比于对已经经过图像处理的图像在检测后进行后处理的方式而言,提升了所检测目标的图像质量。In the embodiment of the present disclosure, the first image data in the first data format acquired is used to detect the specified target to obtain its position information, and then the first data in the first data format is used to intercept the target data corresponding to the obtained position information. Because the target data is intercepted from the first image data, there is no change in the image format or quality, and the target data is then format converted to convert it to a data format suitable for display and / or transmission, compared to As for the manner in which the image that has undergone image processing is post-processed after detection, the image quality of the detected target is improved.
Step S1 obtains the position information: the specified target of interest is detected and, once detected, located to obtain its position information. The type of the specified target is not limited; it may be, for example, text, a person, a vehicle, a license plate or a building, and its shape and size are likewise not limited. The input first image data in the first data format may first be preprocessed into commonly used data on which target detection can be performed, with target detection carried out afterwards; alternatively, target detection may be performed directly on the first image data in the first data format to output the target position information. The specific implementation is not limited.
In one embodiment, the above method may be performed by an image processing apparatus 100. As shown in FIG. 2, the image processing apparatus 100 mainly comprises three modules: a first processing module 101, a second processing module 102 and a third processing module 103. The first processing module 101 performs step S1, the second processing module 102 performs step S2, and the third processing module 103 performs step S3.
As shown in FIG. 2, the first processing module 101 detects a target or object of interest in the first image data in the first data format and outputs the position information of the detected target. The second processing module 102, based on the position information output by the first processing module 101 and the originally input first image data in the first data format, obtains from that first image data the target data in the first data format corresponding to the target of interest. The third processing module 103 applies adaptive ISP processing to the target data in the first data format output by the second processing module 102 to obtain higher-quality target data in the second data format.
In one embodiment, as shown in FIG. 3, the first processing module 101 includes a first processing unit 1011 and a second processing unit 1012. The first processing unit 1011 may perform step S101 and the second processing unit 1012 may perform step S102, which together implement step S1. Step S1 specifically includes the following steps:
S101: converting the first image data into second image data on which target detection can be performed;
S102: detecting position information of the specified target in the second image data, and determining the detected position information as the position information of the specified target in the first image data.
Because the specified target needs to be detected and the first image data is not convenient for detecting it directly, in step S101 the first image data is first converted into second image data that can be used for target detection. The specific conversion method is not limited, as long as the first image data can be converted into second image data in which the target can be detected.
Because the second image data has been converted, its data format may no longer be the first data format; if the target were extracted from it by post-processing, image quality could not be guaranteed. In this embodiment, therefore, the second image data is not used to extract the specified target but only to detect the position information of the specified target.
Step S102 is performed after step S101 to detect the position information of the specified target in the second image data. Recognizing and locating the specified target in the second image data determines its position information in the second image data. The position of the specified target generally does not change between the first image data and the second image data. Scaling between the first and second image data, or translation of the specified target, is not excluded, but any such scaling or translation is known from the processing. Once the position information in the second image data is known, the position information in the first image data can therefore be derived, and the detected position information is determined as the position information of the specified target in the first image data.
Further, converting the first image data into second image data on which target detection can be performed may include at least color interpolation of the first image data. On this basis, at least one of the following may also be performed: black level correction, white balance correction, contrast enhancement and bit-width compression. The processing is not limited to these.
In a possible implementation, the first processing unit 1011 may implement step S101 by performing steps S1011 to S1015. Referring to FIG. 4, steps S1011 to S1015 are:
S1011: black level correction;
S1012: white balance correction;
S1013: color interpolation;
S1014: contrast enhancement;
S1015: bit-width compression.
It can be understood that the way of converting the first image data into the second image data is not limited to steps S1011 to S1015, nor is the processing order limited. For example, the conversion may consist of color interpolation only, as long as target detection can be performed on the resulting second image data.
In step S1011, let the first image data in the first data format be denoted $\mathrm{imgR}$. Black level correction removes the influence of the black level from the first image data in the first data format and outputs $\mathrm{imgR}_{blc}$:
$$\mathrm{imgR}_{blc} = \mathrm{imgR} - V_{blc}$$
where $V_{blc}$ is the black level value. The "-" here is not a mathematical operation; it denotes "removal".
In step S1012, white balance correction removes the color cast that ambient lighting introduces during imaging, restoring the original color information of the image. Two coefficients, $R_{gain}$ and $B_{gain}$, control the adjustment of the corresponding R1 and B1 components:
$$R1' = R1 \cdot R_{gain}$$
$$B1' = B1 \cdot B_{gain}$$
where $R1$ and $B1$ are the red and blue channel color components of the image data after black level correction, and $R1'$ and $B1'$ are the red and blue channel color components of the image output by the white balance correction. The output image is denoted $\mathrm{imgR}_{wb}$.
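Steps S1011 and S1012 can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the mosaic is taken to be RGGB (red at even row and even column, blue at odd row and odd column, consistent with R11 and B22 in the figures), and "removal" of the black level is interpreted as per-sample subtraction clamped at zero, which the text does not prescribe. The function names are hypothetical.

```python
def black_level_correct(raw, v_blc):
    """Remove the black level from every sample (assumed: subtract, clamp at 0)."""
    return [[max(p - v_blc, 0) for p in row] for row in raw]


def white_balance(raw, r_gain, b_gain):
    """Scale R and B sites of an RGGB mosaic; G sites are left unchanged."""
    out = []
    for i, row in enumerate(raw):
        new_row = []
        for j, p in enumerate(row):
            if i % 2 == 0 and j % 2 == 0:    # R site (assumed RGGB layout)
                new_row.append(p * r_gain)
            elif i % 2 == 1 and j % 2 == 1:  # B site
                new_row.append(p * b_gain)
            else:                            # G site
                new_row.append(p)
        out.append(new_row)
    return out


raw = [[80, 100],
       [100, 70]]
img_blc = black_level_correct(raw, v_blc=64)
img_wb = white_balance(img_blc, r_gain=2.0, b_gain=1.5)
```

In practice the gains would come from a white balance estimate of the scene; here they are arbitrary illustrative values.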
In step S1013, color interpolation operates on the data after white balance correction. It can be implemented by nearest-neighbor interpolation, which expands the single-channel first image data in the first data format into multi-channel data. For first image data in a Bayer format, each pixel's missing color components are filled directly from the nearest pixel of the corresponding color, so that every pixel contains all three RGB color components. The interpolation is shown in FIG. 5: the value R11 is filled into its three neighboring pixels, and which neighbors are filled is configurable. The other color pixels are handled in the same way and are not described again here. The interpolated image is denoted $\mathrm{imgC}$.
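A minimal sketch of nearest-neighbor color interpolation follows. It assumes an RGGB mosaic with even height and width and works on 2×2 cells: the red and blue samples of a cell are copied to all four pixels of the cell, and each pixel takes the green sample from its own row. This is one possible choice of "which neighbors are filled", not necessarily the one drawn in FIG. 5; the function name is hypothetical.

```python
def demosaic_nearest(raw):
    """Expand an RGGB mosaic (even height and width) to per-pixel (R, G, B)."""
    h, w = len(raw), len(raw[0])
    out = [[None] * w for _ in range(h)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            r = raw[i][j]             # red sample of this 2x2 cell
            g_top = raw[i][j + 1]     # green sample in the top row
            g_bottom = raw[i + 1][j]  # green sample in the bottom row
            b = raw[i + 1][j + 1]     # blue sample of this cell
            for di in (0, 1):
                g = g_top if di == 0 else g_bottom
                for dj in (0, 1):
                    out[i + di][j + dj] = (r, g, b)
    return out


img_c = demosaic_nearest([[10, 20],
                          [30, 40]])
```

Every output pixel now carries three components, which is the only property the later steps rely on.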
In step S1014, contrast enhancement operates on the data after color interpolation and increases the contrast of the interpolated image. A Gamma curve may be used for the mapping. Let the mapping function of the Gamma curve be $f()$ and the mapped image be $\mathrm{imgC}_{gm}$:
$$\mathrm{imgC}_{gm}(i,j) = f(\mathrm{imgC}(i,j))$$
where $(i,j)$ are the coordinates of a pixel.
In step S1015, bit-width compression operates on the data after contrast enhancement. The high-bit-width data $\mathrm{imgC}_{gm}$ obtained from contrast enhancement is compressed, for example to the bit width of the second data format. Linear compression may be used directly, and the compressed image is denoted $\mathrm{imgC}_{1b}$:
$$\mathrm{imgC}_{1b}(i,j) = \mathrm{imgC}_{gm}(i,j) / M$$
where $M$ is the compression ratio from the first data format to the second data format.
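Steps S1014 and S1015 can be illustrated as follows. The sketch assumes a power-law Gamma curve with exponent 1/2.2 for $f()$ and takes $M = 2^{(\text{input bits} - \text{output bits})}$, for example 16 when going from 12-bit to 8-bit data; neither value is specified in the text. For brevity the functions operate on a single channel; for RGB data they would be applied per component.

```python
def gamma_map(value, max_in, gamma=2.2):
    """f(): power-law mapping that keeps the range [0, max_in] (assumed form)."""
    return max_in * (value / max_in) ** (1.0 / gamma)


def contrast_enhance(img, max_in, gamma=2.2):
    """Apply the Gamma mapping to every sample of a single-channel image."""
    return [[gamma_map(p, max_in, gamma) for p in row] for row in img]


def compress_bit_width(img, bits_in, bits_out):
    """Linear compression: divide every sample by M = 2 ** (bits_in - bits_out)."""
    m = 2 ** (bits_in - bits_out)
    return [[int(p // m) for p in row] for row in img]


img = [[0, 1024, 4095]]                      # 12-bit samples
img_gm = contrast_enhance(img, max_in=4095)
img_1b = compress_bit_width(img_gm, bits_in=12, bits_out=8)
```

With these assumptions, black stays at 0, full scale maps to 255, and mid-tones are brightened before the division by $M$.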
In a possible implementation, the second processing unit 1012 may implement step S102 by performing steps S1021 and S1022.
S1021: inputting the second image data into a trained first neural network, the first neural network performing localization through at least a convolutional layer, a pooling layer, a fully connected layer and a bounding-box regression layer;
S1022: determining the result output by the first neural network as the position information of the specified target in the first image data.
In step S1021, the first neural network is an already trained network. Inputting the second image data into the first neural network locates the specified target in the second image data and yields the position information of the specified target.
The first neural network may be integrated into the second processing unit 1012 as part of the first processing module 101, or it may be located outside the first processing module 101 and called by the second processing unit 1012.
Referring to FIG. 6, the first neural network 200 may include at least one convolutional layer 201 for performing convolution, at least one pooling layer 202 for performing downsampling, at least one fully connected layer 203 for combining features, and at least one bounding-box regression layer 204 for performing coordinate transformation.
As one embodiment of the first neural network, referring to FIG. 7, the first neural network 200 may include, connected in sequence, convolutional layer 205, convolutional layer 206, pooling layer 207, ..., convolutional layer 208, pooling layer 209, fully connected layer 210 and bounding-box regression layer 211. The second image data is input into the first neural network 200, which outputs position information; this position information serves as the position information of the specified target in the first image data. The function performed by each layer of the first neural network has been described above; each layer may vary as appropriate, for example different convolutional layers may use different convolution kernels, and this is not described again here. It can be understood that the first neural network shown in FIG. 7 is only an example and is not limiting; for example, convolutional layers, pooling layers and/or other layers may be removed or added.
The specific functions of the layers of the first neural network are described below, without limitation.
The convolutional layer (Conv) performs a convolution operation and may also carry a ReLU activation function that is applied to the convolution result. The operation of a convolutional layer can therefore be expressed as:
$$YC_i(I) = g(W_i * YC_{i-1}(I) + B_i)$$
where $YC_i(I)$ is the output of the $i$-th convolutional layer, $YC_{i-1}(I)$ is the input of the $i$-th convolutional layer, $*$ denotes the convolution operation, $W_i$ and $B_i$ are the weight and bias coefficients of the convolution filter of the $i$-th convolutional layer, and $g()$ is the activation function. When the activation function is ReLU, $g(x) = \max(0, x)$, where $x$ is the argument of $g$, that is, $W_i * YC_{i-1}(I) + B_i$.
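The convolution-plus-ReLU formula can be made concrete with a small single-channel example. The sketch below is an assumption-laden illustration: it uses one input channel, one filter, no padding and stride 1, and computes the sliding dot product without flipping the kernel (cross-correlation), as most deep learning frameworks do. The function name is hypothetical.

```python
def conv2d_relu(x, w, b):
    """Compute g(W * X + B) with g = ReLU for one channel, stride 1, no padding."""
    kh, kw = len(w), len(w[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = b
            for u in range(kh):
                for v in range(kw):
                    s += w[u][v] * x[i + u][j + v]
            row.append(max(0, s))  # ReLU: g(x) = max(0, x)
        out.append(row)
    return out


x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
w = [[1, 0],
     [0, -1]]                      # responds to x[i][j] - x[i+1][j+1]
y_no_bias = conv2d_relu(x, w, b=0)
y_bias = conv2d_relu(x, w, b=5)
```

Every window of this input gives a pre-activation of -4, so the ReLU clips the output to 0 unless the bias lifts it above zero.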
The pooling layer (Pool) is a special downsampling layer that shrinks the feature map obtained by convolution. The pooling window is, for example, N×N; with max pooling, the maximum within each N×N window is taken as the value of the corresponding point in the new image:
$$YP_j(I) = \mathrm{maxpool}(YP_{j-1}(I))$$
where $YP_{j-1}(I)$ is the input of the $j$-th pooling layer and $YP_j(I)$ is the output of the $j$-th pooling layer.
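Max pooling over N×N windows can be sketched as below. The example assumes non-overlapping windows (stride equal to N) and drops any incomplete window at the border; the text does not fix either choice.

```python
def max_pool(x, n):
    """Max pooling with an n x n window and stride n (assumed non-overlapping)."""
    h, w = len(x), len(x[0])
    return [[max(x[i + u][j + v] for u in range(n) for v in range(n))
             for j in range(0, w - n + 1, n)]
            for i in range(0, h - n + 1, n)]


feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
pooled = max_pool(feature_map, 2)
```

A 4×4 feature map pooled with N = 2 becomes a 2×2 map holding the maximum of each quadrant.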
The fully connected layer (FC) can be regarded as a convolutional layer with a 1×1 filter window. Each node of the fully connected layer is connected to all nodes of the previous layer and serves to combine the features extracted earlier. Its implementation is similar to convolutional filtering:
[Equation image PCTCN2019089249-appb-000001, not reproduced in the text]
where [the symbols introduced with the equation image are not recoverable] denote the width and height, $W_{ij}$ and $B_{ij}$ are the connection weight coefficients and bias coefficients of the fully connected layer, $g()$ is the activation function, and $I$ is $(i,j)$.
The bounding-box regression layer (BBR) seeks a mapping such that the window P output by the fully connected layer is mapped to a window G′ that is closer to the ground-truth window G. The regression is generally implemented as a coordinate transformation of window P, for example a translation and/or a scaling. Let the coordinates of the window P output by the fully connected layer be $(x_1, x_2, y_1, y_2)$ and the coordinates of the transformed window be $(x_3, x_4, y_3, y_4)$.
If the transformation is a translation by $(\Delta x, \Delta y)$, the coordinates before and after the translation are related by:
$$x_3 = x_1 + \Delta x$$
$$x_4 = x_2 + \Delta x$$
$$y_3 = y_1 + \Delta y$$
$$y_4 = y_2 + \Delta y$$
If the transformation is a scaling with factors $dx$ and $dy$ in the X and Y directions, the coordinates before and after the transformation are related by:
$$x_4 - x_3 = (x_2 - x_1) \cdot dx$$
$$y_4 - y_3 = (y_2 - y_1) \cdot dy$$
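The two window transformations can be written out directly. The translation follows the four equations above. For the scaling, the equations only constrain the new width and height, so the sketch assumes the window is scaled about its center; that anchor choice, like the function names, is an assumption made for illustration.

```python
def translate_window(win, delta_x, delta_y):
    """Shift a window (x1, x2, y1, y2) by (delta_x, delta_y)."""
    x1, x2, y1, y2 = win
    return (x1 + delta_x, x2 + delta_x, y1 + delta_y, y2 + delta_y)


def scale_window(win, dx, dy):
    """Scale width by dx and height by dy, keeping the center fixed (assumed)."""
    x1, x2, y1, y2 = win
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half_w = (x2 - x1) * dx / 2
    half_h = (y2 - y1) * dy / 2
    return (cx - half_w, cx + half_w, cy - half_h, cy + half_h)


p = (10, 30, 20, 60)                 # window P as (x1, x2, y1, y2)
p_shifted = translate_window(p, 5, -2)
p_scaled = scale_window(p, 2, 0.5)
```

In a trained network the offsets and scale factors would be regressed from features rather than supplied by hand.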
In step S1022, the position information of the specified target in the first image data is determined from the result output by the first neural network. The output of the first neural network may be used directly as the position information of the specified target in the first image data, or the output may be converted, using the known change in the position of the specified target between the first image data and the second image data, to obtain the position information of the specified target in the first image data.
To train the first neural network, second image data samples and corresponding position information samples may be obtained as a training sample set; the training model of the first neural network is trained with the second image data samples as input and the corresponding position information samples as output. The position information samples may be obtained by processing the second image data samples with an image processing method capable of recognizing and detecting the target.
In another embodiment, referring to FIG. 8, the first processing module 101 includes a third processing unit 1013, which may perform steps S111 and S112 to implement step S1. Steps S111 and S112 are:
S111: inputting the first image data into a trained second neural network, the second neural network, through at least a graying layer, a convolutional layer, a pooling layer, a fully connected layer and a bounding-box regression layer, converting the first image data into second image data on which target detection can be performed and detecting the position information of the specified target in the second image data;
S112: determining the result output by the second neural network as the position information of the specified target in the first image data.
The second neural network may be integrated into the third processing unit 1013 as part of the first processing module 101, or it may be located outside the first processing module 101 and called by the third processing unit 1013.
Referring to FIG. 9, the second neural network 300 includes at least one graying layer 301 for performing grayscale processing, one convolutional layer 302 for performing convolution, one pooling layer 303 for performing downsampling, one fully connected layer 304 for combining features, and one bounding-box regression layer 305 for performing coordinate transformation. The second neural network can both convert the first image data into second image data on which target detection can be performed and detect the position information of the specified target in the second image data, without any other ISP processing. Depending on requirements, some additional information processing may be performed on top of the second neural network processing; this is not limited.
As one embodiment of the second neural network, referring to FIG. 10, the second neural network 300 may include graying layer 306, convolutional layer 307, convolutional layer 308, pooling layer 309, ..., convolutional layer 310, pooling layer 311, fully connected layer 312 and bounding-box regression layer 313. The first image data is input into the second neural network, whose layers process the first image data and output position information; this position information serves as the position information of the specified target in the first image data. The function performed by each layer of the second neural network is the same as that of the corresponding layer of the first neural network, described above; each layer may vary as appropriate, for example different convolutional layers may use different convolution kernels, and this is not described again here. It can be understood that the second neural network 300 shown in FIG. 10 is only an example and is not limiting; for example, convolutional layers, pooling layers and/or other layers may be removed or added.
The graying layer of the second neural network converts the multi-channel information in the first data format into single-channel grayscale information. This can be done by weighting the components that represent different colors around the current pixel. Referring to FIG. 11, the graying layer weights the R, G and B components and converts them into single-channel grayscale information Y. For Y22, for example, the calculation is:
$$Y_{22} = \frac{B_{22} + (G_{12} + G_{32} + G_{21} + G_{23})/4 + (R_{11} + R_{13} + R_{31} + R_{33})/4}{3}$$
The other components are computed in the same way and are not described again here.
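The Y22 formula can be checked with a short function that evaluates it at an interior blue site of the mosaic. The sketch covers only this one case, exactly as the formula states it; how red and green sites or border pixels are handled is not specified in the text and is left out. Indices are zero-based, so B22 in the figure corresponds to `raw[1][1]`.

```python
def gray_at_blue_site(raw, i, j):
    """Grayscale value at an interior blue site (i, j) of an RGGB mosaic."""
    b = raw[i][j]
    # Four green neighbors: above, below, left, right.
    g = (raw[i - 1][j] + raw[i + 1][j] + raw[i][j - 1] + raw[i][j + 1]) / 4
    # Four red neighbors on the diagonals.
    r = (raw[i - 1][j - 1] + raw[i - 1][j + 1]
         + raw[i + 1][j - 1] + raw[i + 1][j + 1]) / 4
    return (b + g + r) / 3


raw = [[30, 20, 30],   # R11 G12 R13
       [20, 40, 20],   # G21 B22 G23
       [30, 20, 30]]   # R31 G32 R33
y22 = gray_at_blue_site(raw, 1, 1)
```

Here the blue, averaged green and averaged red values are 40, 20 and 30, so the equally weighted result is 30.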
The functions performed by the convolutional layer, pooling layer, fully connected layer and bounding-box regression layer of the second neural network may be the same as those of the corresponding layers of the first neural network. Each layer may vary as appropriate, for example different convolutional layers may use different convolution kernels, and this is not described again here.
To train the second neural network, first image data samples and corresponding position information samples may be obtained as a training sample set; the training model of the second neural network is trained with the first image data samples as input and the corresponding position information samples as output. To obtain the position information samples, the first image data samples may first undergo image processing that makes targets detectable, after which the target is detected by an image processing method capable of recognizing and detecting it.
In step S2, according to the position information of the specified target in the first image data obtained in step S1, data is cropped from the corresponding position in the originally input first image data in the first data format. The cropped data is the target data, in the first data format, of the corresponding target.
In one embodiment, suppose the position information of the specified target in the first image data obtained in step S1 is $[x_1, x_2, y_1, y_2]$, where $x_1, y_1$ are the start position and $x_2, y_2$ are the end position. If the first image data in the first data format for the whole image is denoted $\mathrm{imgR}$, the target data $\mathrm{imgT}$ of the specified target in the first data format is:
$$\mathrm{imgT} = \mathrm{imgR}(x_1{:}x_2,\ y_1{:}y_2)$$
In step S3, the target data in the first data format obtained in step S2 for the specified target is processed to convert it from the first data format to the second data format. Step S3 is in effect image processing applied to small target data; it may be implemented by ISP processing that does not use a neural network, or by a neural network.
In one embodiment, as shown in FIG. 12, the third processing module 103 includes a fourth processing unit 1031, which may perform the following step to implement step S3.
The target data in the first data format is input into a trained third neural network, which converts the data format of the target data from the first data format to the second data format through at least a convolutional layer.
The third neural network may be integrated into the fourth processing unit 1031 as part of the third processing module 103, or it may be located outside the third processing module 103 and called by the fourth processing unit 1031.
The third neural network may include at least one convolutional layer for performing convolution, which converts the data format of the target data from the first data format to the second data format. The layer structure of the third neural network is not limited to this; for example, it may further include at least one ReLU layer for performing activation, or other layers. The number of layers is also not limited.
Implementing the image processing with the third neural network reduces the error propagation that can arise in conventional image processing when each step is processed separately.
The operations performed by the layers of the third neural network are described below, without limitation.
For the convolutional layers of the third neural network, let the input of a convolutional layer be $FC_i$ and its output be $FC_{i+1}$. Then:
$$FC_{i+1} = g(w_{ik} * FC_i + b_{ik})$$
where $w_{ik}$ and $b_{ik}$ are the parameters of the $k$-th convolution in the current convolutional layer, and $g(x)$ is a linear weighting function, that is, the convolution outputs of each convolutional layer are linearly weighted. The convolutional layers of the third neural network and of the first neural network both perform convolution and so function similarly; see also the description of the convolutional layer of the first neural network.
For the ReLU layers of the third neural network, let the input of a ReLU layer be $FR_i$ and its output be $FR_{i+1}$. Then:
$$FR_{i+1} = \max(FR_i, 0)$$
that is, the larger of 0 and $FR_i$ is taken.
As one embodiment of the third neural network, referring to FIG. 13, the third neural network 400 may include, connected in sequence, convolutional layer 401, convolutional layer 402, ReLU layer 403, convolutional layer 404 and convolutional layer 405. The input of the third neural network 400 is the target data in the first data format, and its output is the target data in the second data format. The function performed by each layer of the third neural network is the same as that of the corresponding layer of the first neural network, described above; each layer may vary as appropriate, for example different convolutional layers may use different convolution kernels, and this is not described again here. It can be understood that the third neural network shown in FIG. 13 is only an example and is not limiting; for example, convolutional layers, pooling layers and/or other layers may be removed or added.
To train the third neural network, that is, to optimize the deep neural network in advance, a large number of target data samples in the first data format and the corresponding ideal target data samples in the second data format can be used as sample pairs. The network parameters are trained iteratively until, given target data in the first data format, the network outputs the ideal target data in the second data format. The network parameters at that point are output for actual testing and use of the third neural network.
对第三神经网络进行训练的训练流程可以包括以下步骤:The training process for training the third neural network may include the following steps:
S311:收集训练样本:收集感兴趣目标对应的第一数据格式信息和对应的理想第二数据格式信息。假设已获得n个训练样本对{(x 1,y 1),(x 2,y 2),...,(x n,y n)},其中,x i表示输入的第一数据格式信息,y i表示对应的理想第二数据格式信息。 S311: Collect training samples: collect first data format information corresponding to the target of interest and corresponding ideal second data format information. Assume that n training sample pairs {(x 1 , y 1 ), (x 2 , y 2 ), ..., (x n , y n )} have been obtained, where x i represents the input first data format information , Y i represents corresponding ideal second data format information.
S312:设计第三神经网络的结构;网络训练使用的网络结构和测试时使用的网络结构为同一网络结构;S312: design the structure of the third neural network; the network structure used in network training and the network structure used in testing are the same network structure;
S313:初始化训练参数;对第三神经网络的结构的网络参数进行初始化,可采取随机值初始化、固定值初始化等;设置训练相关参数,如学习率、迭代次数等;S313: Initialize training parameters; initialize network parameters of the structure of the third neural network, which can be random value initialization, fixed value initialization, etc .; set training related parameters, such as learning rate, number of iterations, etc .;
S314:前向传播;基于当前网络参数,采用训练样本x i在第三神经网络上进行前向传播,获得第三神经网络的输出F(x i),计算损失函数Loss: S314: forward propagation; based on the current network parameters, the training sample x i is used for forward propagation on the third neural network to obtain the output F (x i ) of the third neural network, and the loss function Loss is calculated:
Loss=(F(x i)-y i) 2Loss = (F (x i ) -y i ) 2 ;
S315:后向传播:利用后向传播,调整第三神经网络的网络参数;S315: Backward propagation: use backward propagation to adjust the network parameters of the third neural network;
S316:反复迭代:重复迭代步骤S314和S315,直至网络收敛,输出此时的网络参数。S316: Repeated iteration: Repeat iteration steps S314 and S315 until the network converges, and output network parameters at this time.
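Steps S313 to S316 can be illustrated with a deliberately small stand-in. The sketch below replaces the third neural network with a hypothetical scalar model F(x) = w*x + b so that the gradients of the squared loss can be written by hand; the model, the function name `train`, the learning rate, and the convergence tolerance are all assumptions for illustration, not values from the disclosure.

```python
def train(samples, learning_rate=0.05, max_iters=5000, tol=1e-10):
    """Gradient descent on Loss = (F(x_i) - y_i)^2 averaged over the samples.

    samples: list of (x_i, y_i) pairs (S311).
    F(x) = w * x + b is a scalar stand-in for the network (S312).
    """
    w, b = 0.0, 0.0  # S313: fixed-value initialization
    n = len(samples)
    for _ in range(max_iters):
        mean_loss = 0.0
        grad_w = 0.0
        grad_b = 0.0
        for x, y in samples:
            err = (w * x + b) - y          # S314: forward pass, F(x_i) - y_i
            mean_loss += err * err / n     # S314: squared loss
            grad_w += 2.0 * err * x / n    # S315: dLoss/dw
            grad_b += 2.0 * err / n        # S315: dLoss/db
        w -= learning_rate * grad_w        # S315: adjust parameters
        b -= learning_rate * grad_b
        if mean_loss < tol:                # S316: stop once converged
            break
    return w, b


# Usage: sample pairs generated from y = 2x + 1.
pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(pairs)
```

In a real network the hand-written gradients would be replaced by backpropagation through all layers, but the loop structure (forward pass, loss, parameter update, convergence check) is the same.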
Of course, the training process of the third neural network is not limited to the above; other training methods may be used, provided that the trained third neural network, given target data in the first data format as input, produces the corresponding target data in the second data format.
In another embodiment, as shown in FIG. 14, the third processing module 103 includes a fifth processing unit 1032. The fifth processing unit 1032 may perform ISP processing on the target data to convert the target data from the first data format to the second data format, the ISP processing including at least color interpolation, so as to implement step S3 described above.
Further, the ISP processing also includes at least one of the following: white balance correction and curve mapping, which can further improve image quality.
Computing the ISP processing parameters using only the target data in the first data format can improve the accuracy of those parameters and thereby improve the image quality of the processed target data.
In one embodiment of performing ISP processing on the target data, referring to FIG. 15, the ISP processing may include the following steps in order:
S301: White balance correction; the input is the target data in the first data format;
S302: Color interpolation;
S303: Curve mapping; the output is the target data in the second data format.
It can be understood that the ISP processing that converts the target data from the first data format to the second data format is not limited to the above; for example, only color interpolation may be performed, or other ISP processing methods may be included.
White balance correction, color interpolation, and curve mapping are described in more detail below, but the ISP processing is not limited to these.
White balance correction removes the color cast caused by ambient illumination during imaging, so as to restore the original color information of the image. The adjustment of the R and B components is generally controlled by two coefficients, $R_{gain}$ and $B_{gain}$:
$R2' = R2 \cdot R_{gain}$
$B2' = B2 \cdot B_{gain}$
where $R2$ and $B2$ are the red and blue channel color components of the input image of the white balance correction, and $R2'$ and $B2'$ are the red and blue channel color components of its output image. Compared with white balance correction of the full image, here $R_{gain}$ and $B_{gain}$ are computed from statistics of the R, G, and B channel color components of the target of interest only;
To compute $R_{gain}$ and $B_{gain}$, the mean values $R_{avg}$, $G_{avg}$, and $B_{avg}$ of the R, G, and B channel color components are first computed; then:
[Equation image: Figure PCTCN2019089249-appb-000002]
[Equation image: Figure PCTCN2019089249-appb-000003]
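The gain expressions themselves are present in this text only as equation images. One common rule that is consistent with the channel means described above is the gray-world rule, in which the red and blue means are scaled to match the green mean. The sketch below assumes that rule (R_gain = G_avg / R_avg, B_gain = G_avg / B_avg); it is an assumption and may differ from the equations in the figures. The function names are hypothetical.

```python
def gray_world_gains(r_values, g_values, b_values):
    """Return (r_gain, b_gain) from the samples of the target region only.

    Assumed gray-world rule: scale R and B so their means match the G mean.
    """
    r_avg = sum(r_values) / len(r_values)
    g_avg = sum(g_values) / len(g_values)
    b_avg = sum(b_values) / len(b_values)
    return g_avg / r_avg, g_avg / b_avg


def apply_white_balance(r_values, b_values, r_gain, b_gain):
    """R2' = R2 * R_gain and B2' = B2 * B_gain; G is left unchanged."""
    return [r * r_gain for r in r_values], [b * b_gain for b in b_values]


# Usage: a region whose red is too weak and whose blue is too strong.
r_gain, b_gain = gray_world_gains([50.0, 150.0], [200.0, 200.0], [400.0, 400.0])
r_out, b_out = apply_white_balance([50.0, 150.0], [400.0, 400.0], r_gain, b_gain)
```

Because the statistics come only from the cropped target data, the gains reflect the illumination on the target rather than on the whole frame.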
Color interpolation expands the white-balance-corrected target data in the first data format from a single-channel format into a multi-channel data format in which each channel represents one color component. It may be implemented by nearest-neighbor interpolation, which expands the single-channel target data in the first data format into multi-channel target data. For example, for image data whose first data format is the Bayer format, each pixel missing a given color may be filled directly with the nearest pixel of that color, so that every pixel contains all three RGB color components. The specific interpolation process may be the same as or similar to that of the embodiment corresponding to FIG. 4 described above and is not repeated here.
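A minimal sketch of nearest-neighbor color interpolation follows. It assumes an RGGB Bayer layout (R at the top-left of each 2x2 cell, B at the bottom-right) and even image dimensions; the disclosure does not fix the layout in this passage, and the function name `demosaic_nearest_rggb` is hypothetical. Within each 2x2 cell, every pixel takes the cell's R and B samples, and the G sample from its own row.

```python
def demosaic_nearest_rggb(raw):
    """Expand a single-channel RGGB Bayer plane into per-pixel (R, G, B) tuples.

    raw: [H][W] with even H and W. Assumed cell layout:
        R G
        G B
    """
    height, width = len(raw), len(raw[0])
    if height % 2 or width % 2:
        raise ValueError("this sketch expects even image dimensions")
    rgb = [[None] * width for _ in range(height)]
    for i in range(0, height, 2):
        for j in range(0, width, 2):
            r = raw[i][j]
            g_top = raw[i][j + 1]
            g_bottom = raw[i + 1][j]
            b = raw[i + 1][j + 1]
            # Each pixel copies the nearest sample of every missing color.
            rgb[i][j] = (r, g_top, b)
            rgb[i][j + 1] = (r, g_top, b)
            rgb[i + 1][j] = (r, g_bottom, b)
            rgb[i + 1][j + 1] = (r, g_bottom, b)
    return rgb
```

Nearest-neighbor filling is the simplest option; bilinear or edge-aware interpolation would reduce color artifacts at the cost of more computation.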
Curve mapping adjusts the brightness and contrast of the image data according to the visual characteristics of the human eye, commonly by mapping with Gamma curves of different parameters. Assuming the mapping function of the Gamma curve is $g$, the image after mapping is denoted $img_{gm}$, and the image before mapping is denoted $img$, then:
$img_{gm}(i, j) = g(img(i, j))$.
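The per-pixel mapping above can be sketched as follows. The power-law curve with exponent 1/2.2 and the 8-bit range are assumed example parameters, not values given in the disclosure, and the names `gamma_curve` and `curve_map` are hypothetical.

```python
def gamma_curve(value, gamma=2.2, max_value=255.0):
    """Example mapping function g: normalize, apply a power law, rescale."""
    normalized = min(max(value / max_value, 0.0), 1.0)
    return max_value * normalized ** (1.0 / gamma)


def curve_map(img, g=gamma_curve):
    """img_gm(i, j) = g(img(i, j)) for every pixel of a single-channel image."""
    return [[g(pixel) for pixel in row] for row in img]


# Usage: dark values are lifted, while black and white stay fixed.
mapped = curve_map([[0.0, 64.0, 255.0]])
```

For multi-channel data the same function would be applied to each color channel independently.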
The embodiments of the present disclosure use the acquired first image data in the first data format to detect a specified target and obtain its position information, and then crop the target data corresponding to that position information from the first image data in the first data format. Because the target data is cropped from the first image data, its image format and quality are unchanged. The target data is then converted into a data format suitable for display and/or transmission. Compared with detecting the target in an image that has already undergone image processing and then post-processing it, this improves the image quality of the detected target.
The image processing apparatus of the embodiments of the present disclosure is described below, but is not limited to this description.
In one embodiment, referring to FIG. 2, an image processing apparatus 100 may include:
a first processing module 101, configured to acquire, from captured first image data in a first data format, position information of a specified target in the first image data;
a second processing module 102, configured to crop target data corresponding to the position information from the first image data;
a third processing module 103, configured to convert the data format of the target data from the first data format to a second data format, the second data format being suitable for display and/or transmission of the target data.
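The cooperation of the three modules can be sketched as a small pipeline. In this sketch the `detect` and `convert` callables are hypothetical stand-ins for the first and third processing modules, and the position information is assumed to be a (left, top, right, bottom) box with exclusive right and bottom edges. Snapping the box outward to even coordinates, so that a crop of Bayer data keeps the same color-filter phase, is an added assumption and is not stated in the disclosure.

```python
def snap_box_to_even(box, width, height):
    """Expand a (left, top, right, bottom) box outward to even coordinates."""
    left, top, right, bottom = box
    left = max(0, left - left % 2)
    top = max(0, top - top % 2)
    right = min(width, right + right % 2)
    bottom = min(height, bottom + bottom % 2)
    return left, top, right, bottom


def crop_target(raw, box):
    """Second module: cut the target data out of the unprocessed first image data."""
    left, top, right, bottom = snap_box_to_even(box, len(raw[0]), len(raw))
    return [row[left:right] for row in raw[top:bottom]]


def process(raw, detect, convert):
    """detect: raw -> box (first module); convert: crop -> output (third module)."""
    box = detect(raw)
    target = crop_target(raw, box)
    return convert(target)


# Usage with trivial stand-ins for detection and format conversion.
raw = [[i * 10 + j for j in range(6)] for i in range(6)]
result = process(raw, detect=lambda image: (1, 1, 3, 4), convert=lambda crop: crop)
```

The point of the structure is that the conversion step operates on samples taken directly from the raw data, not on an already processed image.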
In the embodiments of the present disclosure, the image processing apparatus 100 may be applied to an image device. The image device may be a device with an imaging function, such as a camera, or a device capable of image post-processing, without specific limitation. The first image data in the first data format may be image data captured by the image device itself or image data obtained from another device, without specific limitation.
In one embodiment, referring to FIG. 3, the first processing module 101 includes a first processing unit 1011 and a second processing unit 1012. The first processing unit 1011 is configured to convert the first image data into second image data on which target detection can be performed. The second processing unit 1012 is configured to detect position information of the specified target in the second image data and to determine the detected position information as the position information of the specified target in the first image data.
In one embodiment, the second processing unit 1012 is specifically configured to input the second image data into a trained first neural network and to determine the result output by the first neural network as the position information of the specified target in the first image data. The first neural network includes at least a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation, so as to locate and output the position information of the specified target.
In one embodiment, the first processing unit 1011 is specifically configured to convert the first image data into the second image data on which target detection can be performed by using at least one of black level correction, white balance correction, color interpolation, contrast enhancement, and bit-width compression.
In one embodiment, referring to FIG. 8, the first processing module 101 includes a third processing unit 1013, configured to input the first image data into a trained second neural network. The second neural network includes at least a grayscale layer for performing grayscale processing, a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation, so as to convert the first image data into second image data on which target detection can be performed and to detect position information of the specified target in the second image data. In this way, the position information of the specified target in the first image data can be determined according to the result output by the second neural network.
In one embodiment, referring to FIG. 12, the third processing module 103 includes a fourth processing unit 1031, configured to input the target data into a trained third neural network. The third neural network includes at least a convolutional layer for performing convolution, so as to convert the target data from the first data format to the second data format.
In one embodiment, referring to FIG. 14, the third processing module 103 includes a fifth processing unit 1032, configured to perform ISP processing on the target data. The ISP processing is used to convert the target data from the first data format to the second data format and includes at least color interpolation.
For the implementation of the functions and roles of each unit in the above apparatus, refer to the implementation of the corresponding steps in the above method; details are not repeated here.
Since the apparatus embodiments substantially correspond to the method embodiments, reference may be made to the description of the method embodiments for relevant details. The apparatus embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units.
The present disclosure further provides an electronic device including a processor and a memory. The memory stores a program that can be called by the processor; when the processor executes the program, the image processing method according to any one of the foregoing embodiments is implemented.
Embodiments of the image processing apparatus of the present disclosure may be applied to an electronic device. Taking a software implementation as an example, the apparatus in a logical sense is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from the non-volatile memory into memory and running them. At the hardware level, FIG. 16 is a hardware structure diagram of the electronic device in which the image processing apparatus 100 is located, according to an exemplary embodiment of the present disclosure. In addition to the processor 510, memory 530, interface 520, and non-volatile memory 540 shown in FIG. 16, the electronic device in which the apparatus 100 is located may also include other hardware according to the actual functions of the electronic device, which is not described further here.
The present disclosure further provides a machine-readable storage medium on which a program is stored; when the program is executed by a processor, it causes an image device to implement the image processing method according to any one of the foregoing embodiments.
The present disclosure may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing program code. Machine-readable storage media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of machine-readable storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The above are merely embodiments of the present disclosure and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims (17)

  1. An image processing method, comprising:
    acquiring, from captured first image data in a first data format, position information of a specified target in the first image data;
    cropping target data corresponding to the position information from the first image data;
    converting the data format of the target data from the first data format to a second data format, the second data format being suitable for display and/or transmission of the target data.
  2. The image processing method according to claim 1, wherein acquiring, from the captured first image data in the first data format, the position information of the specified target in the first image data comprises:
    converting the first image data into second image data on which target detection can be performed;
    detecting position information of the specified target in the second image data, and determining the detected position information as the position information of the specified target in the first image data.
  3. The image processing method according to claim 2, wherein detecting the position information of the specified target in the second image data and determining the detected position information as the position information of the specified target in the first image data comprises:
    inputting the second image data into a trained first neural network, the first neural network locating and outputting the position information of the specified target by means of at least a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation;
    determining the position information of the specified target in the first image data according to the output of the first neural network.
  4. The image processing method according to claim 2, wherein converting the first image data into the second image data on which target detection can be performed comprises:
    converting the first image data into the second image data on which target detection can be performed by using at least one image processing method among black level correction, white balance correction, color interpolation, contrast enhancement, and bit-width compression.
  5. The image processing method according to claim 1, wherein acquiring, from the captured first image data in the first data format, the position information of the specified target in the first image data comprises:
    inputting the first image data into a trained second neural network, the second neural network, by means of at least a grayscale layer for performing grayscale processing, a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation, converting the first image data into second image data on which target detection can be performed and detecting position information of the specified target in the second image data;
    determining the position information of the specified target in the first image data according to the output of the second neural network.
  6. The image processing method according to any one of claims 1 to 5, wherein converting the data format of the target data from the first data format to the second data format comprises:
    inputting the target data into a trained third neural network, the third neural network converting the data format of the target data from the first data format to the second data format by means of at least a convolutional layer for performing convolution.
  7. The image processing method according to any one of claims 1 to 5, wherein converting the data format of the target data from the first data format to the second data format comprises:
    performing image signal processor (ISP) processing on the target data, wherein the ISP processing is used to convert the data format of the target data from the first data format to the second data format, and the ISP processing includes at least color interpolation.
  8. The image processing method according to claim 7, wherein the ISP processing further includes at least one of the following: white balance correction and curve mapping.
  9. An image processing apparatus, comprising:
    a first processing module, configured to acquire, from captured first image data in a first data format, position information of a specified target in the first image data;
    a second processing module, configured to crop target data corresponding to the position information from the first image data;
    a third processing module, configured to convert the data format of the target data from the first data format to a second data format, the second data format being suitable for display and/or transmission of the target data.
  10. The image processing apparatus according to claim 9, wherein the first processing module comprises a first processing unit and a second processing unit;
    the first processing unit is configured to convert the first image data into second image data on which target detection can be performed;
    the second processing unit is configured to detect position information of the specified target in the second image data and to determine the detected position information as the position information of the specified target in the first image data.
  11. The image processing apparatus according to claim 10, wherein the second processing unit is specifically configured to:
    input the second image data into a trained first neural network, and determine the result output by the first neural network as the position information of the specified target in the first image data, the first neural network locating and outputting the position information of the specified target by means of at least a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation.
  12. The image processing apparatus according to claim 10, wherein the first processing unit is specifically configured to convert the first image data into the second image data on which target detection can be performed by using at least one of black level correction, white balance correction, color interpolation, contrast enhancement, and bit-width compression.
  13. The image processing apparatus according to claim 9, wherein the first processing module comprises a third processing unit;
    the third processing unit is configured to input the first image data into a trained second neural network, the second neural network, by means of at least a grayscale layer for performing grayscale processing, a convolutional layer for performing convolution, a pooling layer for performing downsampling, a fully connected layer for performing feature synthesis, and a bounding-box regression layer for performing coordinate transformation, converting the first image data into second image data on which target detection can be performed and detecting position information of the specified target in the second image data;
    and to determine the position information of the specified target in the first image data according to the result output by the second neural network.
  14. The image processing apparatus according to claim 9, wherein the third processing module comprises a fourth processing unit;
    the fourth processing unit is configured to input the target data into a trained third neural network, the third neural network converting the data format of the target data from the first data format to the second data format by means of at least a convolutional layer for performing convolution.
  15. The image processing apparatus according to claim 9, wherein the third processing module comprises a fifth processing unit;
    the fifth processing unit is configured to perform ISP processing on the target data, wherein the ISP processing is used to convert the data format of the target data from the first data format to the second data format, and the ISP processing includes at least color interpolation.
  16. An electronic device, comprising a processor and a memory, wherein the memory stores a program that can be called by the processor, and when the processor executes the program, the image processing method according to any one of claims 1 to 8 is implemented.
  17. A machine-readable storage medium on which a program is stored, wherein when the program is executed by a processor, the image processing method according to any one of claims 1 to 8 is implemented.
PCT/CN2019/089249 2018-05-31 2019-05-30 Image processing method, device, and equipment, and readable medium WO2019228450A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810571964.X 2018-05-31
CN201810571964.XA CN110555877B (en) 2018-05-31 2018-05-31 Image processing method, device and equipment and readable medium

Publications (1)

Publication Number Publication Date
WO2019228450A1 true WO2019228450A1 (en) 2019-12-05

Family

ID=68698712

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/089249 WO2019228450A1 (en) 2018-05-31 2019-05-30 Image processing method, device, and equipment, and readable medium

Country Status (2)

Country Link
CN (1) CN110555877B (en)
WO (1) WO2019228450A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077516A (en) * 2021-04-28 2021-07-06 深圳市人工智能与机器人研究院 Pose determination method and related equipment

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111110272B (en) * 2019-12-31 2022-12-23 深圳开立生物医疗科技股份有限公司 Ultrasonic image measurement information display method, device and equipment and readable storage medium
RU2764395C1 (en) 2020-11-23 2022-01-17 Самсунг Электроникс Ко., Лтд. Method and apparatus for joint debayering and image noise elimination using a neural network

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170083752A1 (en) * 2015-09-18 2017-03-23 Yahoo! Inc. Face detection
CN107886074A (en) * 2017-11-13 2018-04-06 苏州科达科技股份有限公司 A kind of method for detecting human face and face detection system
CN108009524A (en) * 2017-12-25 2018-05-08 西北工业大学 A kind of method for detecting lane lines based on full convolutional network

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103873781B (en) * 2014-03-27 2017-03-29 成都动力视讯科技股份有限公司 A kind of wide dynamic camera implementation method and device
US9881234B2 (en) * 2015-11-25 2018-01-30 Baidu Usa Llc. Systems and methods for end-to-end object detection
CN106529446A (en) * 2016-10-27 2017-03-22 桂林电子科技大学 Vehicle type identification method and system based on multi-block deep convolutional neural network
CN107301383B (en) * 2017-06-07 2020-11-24 华南理工大学 Road traffic sign identification method based on Fast R-CNN
CN107895378A (en) * 2017-10-12 2018-04-10 西安天和防务技术股份有限公司 Object detection method and device, storage medium, electronic equipment
CN107808139B (en) * 2017-11-01 2021-08-06 电子科技大学 Real-time monitoring threat analysis method and system based on deep learning
CN107871126A (en) * 2017-11-22 2018-04-03 西安翔迅科技有限责任公司 Model recognizing method and system based on deep-neural-network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077516A (en) * 2021-04-28 2021-07-06 深圳市人工智能与机器人研究院 Pose determination method and related equipment
CN113077516B (en) * 2021-04-28 2024-02-23 深圳市人工智能与机器人研究院 Pose determining method and related equipment

Also Published As

Publication number Publication date
CN110555877B (en) 2022-05-31
CN110555877A (en) 2019-12-10

Similar Documents

Publication Publication Date Title
US11882357B2 (en) Image display method and device
CN110738697B (en) Monocular depth estimation method based on deep learning
US20200234414A1 (en) Systems and methods for transforming raw sensor data captured in low-light conditions to well-exposed images using neural network architectures
EP4109392A1 (en) Image processing method and image processing device
CN112446383B (en) License plate recognition method and device, storage medium and terminal
CN110059728B (en) RGB-D image visual saliency detection method based on attention model
WO2019228450A1 (en) Image processing method, device, and equipment, and readable medium
CN112966635B (en) Low-resolution time sequence remote sensing image-oriented moving ship detection method and device
CN112581379A (en) Image enhancement method and device
US11170470B1 (en) Content-adaptive non-uniform image downsampling using predictive auxiliary convolutional neural network
CN112037129A (en) Image super-resolution reconstruction method, device, equipment and storage medium
US20240007600A1 (en) Spatially Varying Reduction of Haze in Images
US20220398698A1 (en) Image processing model generation method, processing method, storage medium, and terminal
CN111784624A (en) Target detection method, device, equipment and computer readable storage medium
CN113409355A (en) Moving target identification system and method based on FPGA
CN109272014B (en) Image classification method based on distortion adaptive convolutional neural network
CN112241982A (en) Image processing method and device and machine-readable storage medium
CN111815529B (en) Low-quality image classification enhancement method based on model fusion and data enhancement
CN116403200A (en) License plate real-time identification system based on hardware acceleration
CN114463379A (en) Dynamic capturing method and device for video key points
CN112241936B (en) Image processing method, device and equipment and storage medium
CN110738225B (en) Image recognition method and device
CN112288031A (en) Traffic signal lamp detection method and device, electronic equipment and storage medium
CN115170456A (en) Detection method and related equipment
CN112241670B (en) Image processing method and device

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19810790; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase. Ref country code: DE

122 EP: PCT application non-entry in European phase. Ref document number: 19810790; Country of ref document: EP; Kind code of ref document: A1
