WO2021174834A1 - YUV image recognition method, system and computer device - Google Patents

YUV image recognition method, system and computer device

Info

Publication number
WO2021174834A1
PCT/CN2020/118686 · CN2020118686W
Authority
WO
WIPO (PCT)
Prior art keywords
component
yuv
input
image
combined
Prior art date
Application number
PCT/CN2020/118686
Other languages
English (en)
French (fr)
Inventor
朱禹萌
陆进
陈斌
宋晨
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021174834A1 publication Critical patent/WO2021174834A1/zh

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • the embodiments of the present application relate to the field of image recognition, and in particular, to a YUV image recognition method, system, computer equipment, and computer-readable storage medium.
  • as a color coding method, YUV is widely used in various video processing components. Taking the perception characteristics of the human eye into account, YUV images use subsampling to reduce the chrominance bandwidth, thereby reducing the resource requirements placed on devices during image storage and transmission.
  • the deep neural network model has become the most important modeling method in the field of artificial intelligence, and many dedicated acceleration chips have even appeared to more effectively meet production's dual pursuit of precision and speed. It should be noted that deep neural networks have standardized requirements for input data, and YUV images stored in a planar format cannot be directly applied to model training.
  • the existing YUV image recognition methods are mainly divided into two types. One is a staged, step-by-step solution, which first uses a traditional or deep model to locate the target, and then uses the YUYV component of the partial image to perform color recognition; this method cannot realize end-to-end training, and because it only uses the UV chrominance components, it does not give full play to the value of the YUV image. The other is to use prior knowledge to convert the YUV format into the RGB format, eliminating the difference between academia and production in an offline manner; however, this increases computing power consumption and cannot make full use of the acceleration function of dedicated chips.
  • an embodiment of the present application provides a YUV image recognition method, and the method steps include:
  • the YUV image is input to a YUV image recognition model, which includes an input layer, an image processing layer, and an output layer; wherein the input layer includes a component extraction branch, a first input branch, a second input branch, and a component combination branch, which are used to provide the YUV combined component of the YUV image to the image processing layer; and
  • the recognition result is output through the YUV image recognition model.
  • an embodiment of the present application also provides a YUV image recognition system, including:
  • the acquisition module is used to acquire the YUV image to be identified
  • the input module is used to input the YUV image to the YUV image recognition model.
  • the YUV image recognition model includes an input layer, an image processing layer, and an output layer; wherein the input layer includes a component extraction branch, a first input branch, a second input branch, and a component combination branch, which are used to provide the YUV combined component of the YUV image to the image processing layer; and
  • the output module is used to output the recognition result through the YUV image recognition model.
  • an embodiment of the present application also provides a computer device, the computer device including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; when the computer-readable instructions are executed by the processor, the following steps are implemented:
  • the YUV image is input to a YUV image recognition model, which includes an input layer, an image processing layer, and an output layer; wherein the input layer includes a component extraction branch, a first input branch, a second input branch, and a component combination branch, which are used to provide the YUV combined component of the YUV image to the image processing layer; and
  • the recognition result is output through the YUV image recognition model.
  • embodiments of the present application also provide a computer-readable storage medium having computer-readable instructions stored therein, the computer-readable instructions being executable by at least one processor, so that the at least one processor executes the following steps:
  • the YUV image is input to a YUV image recognition model, which includes an input layer, an image processing layer, and an output layer; wherein the input layer includes a component extraction branch, a first input branch, a second input branch, and a component combination branch, which are used to provide the YUV combined component of the YUV image to the image processing layer; and
  • the recognition result is output through the YUV image recognition model.
  • the YUV image recognition method, system, computer device, and computer-readable storage medium provided by the embodiments of this application can directly recognize YUV-format images without converting them to other image formats, which reduces the computing power the image recognition model consumes on YUV-format images and improves its recognition efficiency for YUV-format images.
  • FIG. 1 is a schematic flowchart of a YUV image recognition method according to an embodiment of this application.
  • FIG. 2 is a schematic diagram of program modules of Embodiment 2 of the YUV image recognition system of this application.
  • FIG. 3 is a schematic diagram of the hardware structure of the third embodiment of the computer equipment of this application.
  • the computer device 2 will be used as the execution subject for exemplary description.
  • FIG. 1 shows a flowchart of the steps of a YUV image recognition method according to an embodiment of the present application. It can be understood that the flowchart in this method embodiment is not used to limit the order of execution of the steps.
  • the following exemplarily describes the computer device 2 as the execution subject. details as follows.
  • Step S100 Acquire a YUV image to be identified.
  • YUV can be widely used in various video processing components.
  • the code stream output by the video capture chip may include a YUV data stream, and the YUV data stream can be divided into frames to obtain multiple YUV images.
  • These YUV images use sampling to reduce the chroma bandwidth, which can reduce the resource requirements of the equipment during the image storage and transmission process.
  • a general image recognition model cannot directly recognize a YUV image. Therefore, if the image recognition model is to directly recognize YUV images, the input layer of the image recognition model needs to be reconstructed.
  • Step S102 Input the YUV image into a YUV image recognition model.
  • the YUV image recognition model includes an input layer, an image processing layer, and an output layer; wherein the input layer includes a component extraction branch, a first input branch, a second input branch, and a component combination branch, which are used to provide the YUV combined component of the YUV image to the image processing layer.
  • the YUV image recognition model is an improved neural network model that can directly recognize YUV images. Compared with a traditional neural network model, the step of converting the YUV image into an RGB image during recognition is removed, which reduces the resource consumption of the image recognition process.
  • the input layer of the YUV image recognition model is configured with a component extraction branch, a first input branch, a second input branch, and a component combination branch.
  • the YUV image recognition model can directly recognize a YUV image.
  • the specific functions of each branch are as follows:
  • step S102a is executed through the component extraction branch:
  • Step S102a Perform component extraction on the YUV image to obtain the first component Y, the second component U, and the third component V of the YUV image.
  • the component extraction branch is used to extract the first component Y, the second component U, and the third component V in the YUV format image; wherein, the first component Y is used to represent the brightness Y component of the image, The second component U and the third component V are used to represent the chrominance U component and the V component of the image.
  • for example, for a 4*4 image stored in planar format, the sizes of the Y component, U component, and V component are 16, 4, and 4, respectively: values 1 to 16 can be extracted as the Y component, values 17 to 20 as the U component, and values 21 to 24 as the V component. A reorganization (Reshape) operation can then be performed on the Y, U, and V components to obtain the Y(4*4) component with a (W*H) data structure, the U(2*2) component with a (W/2)*(H/2) data structure, and the V(2*2) component with a (W/2)*(H/2) data structure.
  • the reorganization operation also includes sequentially rearranging the UV components to match the corresponding Y component, that is, the U(1*4) component is reorganized into the U(2*2) component and the V(1*4) component is reorganized into the V(2*2) component; since the resolution of the Y(4*4) component is consistent with the original image, the Y(4*4) component does not need to be reshaped.
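The 4*4 example above can be sketched in a few lines of NumPy. Note that `extract_yuv420_components` is a hypothetical helper name, and a planar (Y plane, then U plane, then V plane) layout is assumed; the patent does not prescribe an implementation.

```python
import numpy as np

def extract_yuv420_components(buf, w, h):
    """Split a planar YUV420 buffer into Y, U and V planes (illustrative
    sketch of the component-extraction branch, not the patent's code).

    The first w*h values are Y, the next (w//2)*(h//2) are U, the rest V.
    """
    buf = np.asarray(buf).ravel()
    n = w * h
    y = buf[:n].reshape(h, w)                      # Y keeps the full resolution
    u = buf[n:n + n // 4].reshape(h // 2, w // 2)  # U is subsampled 2x per axis
    v = buf[n + n // 4:].reshape(h // 2, w // 2)   # V likewise
    return y, u, v

# The 4*4 example from the text: values 1-16 are Y, 17-20 U, 21-24 V.
y, u, v = extract_yuv420_components(np.arange(1, 25), 4, 4)
```

After this step, `y` has the (W*H) structure Y(4*4), and `u`, `v` have the (W/2)*(H/2) structures U(2*2) and V(2*2) described in the text.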
  • steps S102b1 to S102b2 are executed through the first input branch:
  • Step S102b1 Perform an add dimension operation on the first component Y to obtain a first initial component Y1;
  • Step S102b2 Perform a pooling operation on the first initial component Y1 to obtain a pooled first input component Y2 .
  • the first input branch is used to receive the first component Y. Because the data structure of the first component Y is a two-dimensional array W*H, which does not conform to the standard input format, an add-dimension operation is performed on the two-dimensional array W*H to obtain the first initial component Y1 with a data structure of 1*1*W*H, which conforms to the standard input format.
  • this embodiment also performs a pooling operation on the first initial component Y1 to obtain the pooled first input component Y2 with a data structure of 1*1*(W/2)*(H/2), which likewise conforms to the standard input format.
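A minimal sketch of this branch in NumPy, assuming average pooling with a 2*2 window (the text does not specify the pooling type, so this is an assumption):

```python
import numpy as np

def first_input_branch(y):
    """Sketch of the first input branch: add dimensions to the Y plane,
    then 2x2-pool it. Average pooling is assumed here."""
    h, w = y.shape
    y1 = y[np.newaxis, np.newaxis, :, :].astype(float)  # W*H -> 1*1*W*H
    # 2x2 pooling halves each spatial axis: 1*1*W*H -> 1*1*(W/2)*(H/2)
    y2 = y1.reshape(1, 1, h // 2, 2, w // 2, 2).mean(axis=(3, 5))
    return y1, y2

y = np.arange(16, dtype=float).reshape(4, 4)
y1, y2 = first_input_branch(y)
```

`y1` is the first initial component (1*1*4*4) and `y2` the pooled first input component (1*1*2*2), matching the shapes stated in the text.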
  • steps S102c1 to S102c4 are executed through the second input branch:
  • Step S102c1 Perform an add-dimension operation on the second component U and the third component V to obtain a second initial component U1 and a third initial component V1; Step S102c2 Obtain the first UV combined component according to the second initial component U1 and the third initial component V1; Step S102c3 Perform an up-sampling operation on the first UV combined component to obtain the second input component U2 and the third input component V2; Step S102c4 Obtain the second UV combined component according to the second input component U2 and the third input component V2.
  • the second input branch is used to receive the reorganized second component U and third component V, which are used to represent the chrominance of the image. Because the data structures of the second component U and the third component V are both two-dimensional arrays (W/2)*(H/2), which do not conform to the standard input format, the two-dimensional arrays need to be processed; and because the second component U and the third component V have the same resolution, the add-dimension operations on them can be performed at the same time to obtain the second initial component U1 and the third initial component V1 with the shape 1*1*(W/2)*(H/2), which conforms to the standard input format.
  • since the data structures of the second initial component U1 and the third initial component V1 are both 1*1*(W/2)*(H/2), a cascade operation can be performed on them to obtain the first UV combined component, whose shape is 1*2*(W/2)*(H/2).
  • after the up-sampling operation, the data structures of the second input component U2 and the third input component V2 are both 1*1*W*H, so a cascade operation can be performed on them to obtain the second UV combined component, whose data structure is 1*2*W*H.
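Steps S102c1 to S102c4 can be sketched as follows. Nearest-neighbour upsampling is assumed for the x2 up-sampling step, which the text leaves unspecified, and `second_input_branch` is a hypothetical name:

```python
import numpy as np

def second_input_branch(u, v):
    """Sketch of the second input branch: add-dimension, cascade,
    upsample x2, cascade again (nearest-neighbour upsampling assumed)."""
    u1 = u[np.newaxis, np.newaxis].astype(float)       # 1*1*(W/2)*(H/2)
    v1 = v[np.newaxis, np.newaxis].astype(float)
    uv_first = np.concatenate([u1, v1], axis=1)        # 1*2*(W/2)*(H/2)
    # Nearest-neighbour x2 upsampling of both chroma channels.
    up = uv_first.repeat(2, axis=2).repeat(2, axis=3)  # 1*2*W*H
    u2, v2 = up[:, :1], up[:, 1:]                      # each 1*1*W*H
    uv_second = np.concatenate([u2, v2], axis=1)       # 1*2*W*H
    return uv_first, uv_second

u = np.full((2, 2), 17.0)  # toy U plane from the 4*4 example
v = np.full((2, 2), 21.0)  # toy V plane
uv_first, uv_second = second_input_branch(u, v)
```

The two returned arrays have the shapes 1*2*(W/2)*(H/2) and 1*2*W*H given in the text for the first and second UV combined components.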
  • Step S102d1 Perform a cascade operation on the first initial component Y1 and the second UV combined component to obtain the first YUV combined component;
  • Step S102d2 Perform a cascade operation on the first input component Y2 and the first UV combined component to obtain the second YUV combined component.
  • the component combination branch is used to receive the first initial component Y1, the first input component Y2, the first UV combined component, and the second UV combined component.
  • the data structures of the first initial component Y1 and the second UV combined component are the same, so a cascade operation can be performed on them to obtain the first YUV combined component, whose data structure is 1*3*W*H.
  • the data structures of the first input component Y2 and the first UV combined component are the same, so a cascade operation can be performed on them to obtain the second YUV combined component, whose data structure is 1*3*(W/2)*(H/2).
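The two cascade operations reduce to channel-axis concatenations. A shape-only NumPy sketch (variable names mirror the text, not any actual implementation):

```python
import numpy as np

def component_combination_branch(y1, uv_second, y2, uv_first):
    """Sketch of the component-combination branch: two channel-wise
    cascades produce the two YUV combined components."""
    yuv_first = np.concatenate([y1, uv_second], axis=1)   # 1*3*W*H
    yuv_second = np.concatenate([y2, uv_first], axis=1)   # 1*3*(W/2)*(H/2)
    return yuv_first, yuv_second

# Toy shapes for a 4*4 image (W = H = 4).
y1 = np.zeros((1, 1, 4, 4)); uv_second = np.zeros((1, 2, 4, 4))
y2 = np.zeros((1, 1, 2, 2)); uv_first = np.zeros((1, 2, 2, 2))
yuv_first, yuv_second = component_combination_branch(y1, uv_second, y2, uv_first)
```

Both outputs are three-channel tensors, at full and half resolution respectively, ready to be passed to the image processing layer.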
  • Step S104 Output the recognition result through the YUV image recognition model.
  • the step S104 may further include: using the first YUV combined component and the second YUV combined component as the output data of the input layer, and inputting the output data into the image processing layer to perform a YUV image recognition operation to obtain the recognition result, where the image processing layer includes an image recognition network, the image recognition network being a pre-trained convolutional neural network used to recognize YUV images; and outputting the recognition result through the output layer.
  • in this way, the format of the YUV image input to the input layer is adapted through the component extraction branch, the first input branch, the second input branch, and the component combination branch of the input layer, so that an image recognition model that cannot directly recognize YUV images is modified into an image recognition model that can.
  • the use of cascading operations between channels enhances the correlation between images of different resolutions, and makes up for the lack of information learning caused by multiple inputs.
  • training data is then used to train the image recognition model that can directly recognize YUV images, so as to obtain the YUV image recognition model.
  • the YUV image recognition model is a deep neural network model after training, and the training steps are as follows:
  • Step 1 Acquire multiple YUV images in advance;
  • Step 2 Use the multiple YUV images as the training set of the pre-training model;
  • Step 3 Train the pre-training model with the training set to obtain the YUV image recognition model, wherein the pre-training model is a deep neural network model whose input layer has been reconstructed.
  • the training process of the model continuously approximates the learnable parameters w and b in the model structure toward their ideal values.
  • the input x and the target value y are known; after the hierarchical operations, the model output f(x; w, b) can be obtained, and by minimizing the loss between f(x; w, b) and y, the optimal parameters can be found, thereby obtaining the optimal solution of the objective function.
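Read literally, this passage describes the standard supervised-learning objective. One plausible reconstruction (the loss function L and the exact form of f are not given in the text, so both are assumptions) is:

```latex
\hat{y} = f(x;\, w, b), \qquad
(w^{*},\, b^{*}) = \arg\min_{w,\, b}\; L\bigl(f(x;\, w, b),\, y\bigr)
```

Here f denotes the hierarchical (layer-by-layer) operations of the network, and the optimal parameters w*, b* minimize the discrepancy between the model output and the target value y.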
  • the network structure used in this proposal contains three parts: L1, D, and L_res, where L1 is the input layer for the Y component, D is the deconvolution part for the UV components, and L_res is the feature-extraction structure following the input layer in the YUV image recognition model.
  • FIG. 2 is a schematic diagram of program modules of Embodiment 2 of the YUV image recognition system of this application.
  • the YUV image recognition system 20 may include or be divided into one or more program modules, and the one or more program modules are stored in a storage medium and executed by one or more processors to complete this application and implement the above YUV image recognition method.
  • the program module referred to in the embodiments of the present application refers to a series of computer program instruction segments capable of completing specific functions, and is more suitable for describing the execution process of the YUV image recognition system 20 in the storage medium than the program itself. The following description will specifically introduce the functions of each program module in this embodiment:
  • the acquiring module 200 is used to acquire the YUV image to be identified.
  • the YUV image to be identified can be obtained by framing the video data in the YUV format.
  • the code stream output by the video capture chip is basically in the form of the YUV data stream;
  • YUV is a color coding method, It is widely used in various video processing components.
  • YUV images use sampling to reduce the chroma bandwidth, which can reduce the resource requirements of the device during the image storage and transmission process.
  • the input module 202 is configured to input the YUV image into a YUV image recognition model, the YUV image recognition model includes an input layer, an image processing layer, and an output layer; wherein, the input layer includes a component extraction branch and a first input branch , The second input branch and the component combination branch are used to provide the YUV combination component of the YUV image to the image processing layer.
  • the input module 202 is further configured to extract components of the YUV image to obtain the first component Y, the second component U, and the third component V of the YUV image.
  • the component extraction branch is used to extract the first component Y, the second component U, and the third component V in the YUV format image; wherein, the first component Y is used to represent the brightness Y component of the image, The second component U and the third component V are used to represent the chrominance U component and the V component of the image.
  • for example, for a 4*4 image stored in planar format, the sizes of the Y component, U component, and V component are 16, 4, and 4, respectively: values 1 to 16 can be extracted as the Y component, values 17 to 20 as the U component, and values 21 to 24 as the V component. A reorganization (Reshape) operation can then be performed on the Y, U, and V components to obtain the Y(4*4) component with a (W*H) data structure, the U(2*2) component with a (W/2)*(H/2) data structure, and the V(2*2) component with a (W/2)*(H/2) data structure.
  • the reorganization operation also includes sequentially rearranging the UV components to match the corresponding Y component, that is, the U(1*4) component is reorganized into the U(2*2) component and the V(1*4) component is reorganized into the V(2*2) component; since the resolution of the Y(4*4) component is consistent with the original image, the Y(4*4) component does not need to be reshaped.
  • the input module 202 is further configured to: perform an add-dimension operation on the first component Y to obtain a first initial component Y1; and perform a pooling operation on the first initial component Y1 to obtain the pooled first input component Y2.
  • the first input branch of the input layer is used to receive the first component Y. Because the data structure of the first component Y is a two-dimensional array W*H, which does not conform to the standard input format, an add-dimension operation is performed on the two-dimensional array W*H to obtain the first initial component Y1 with a data structure of 1*1*W*H, which conforms to the standard input format.
  • this solution also performs a pooling operation on the first initial component Y1 to obtain the pooled first input component Y2 with a data structure of 1*1*(W/2)*(H/2), which likewise conforms to the standard input format.
  • the input module 202 is further configured to: perform an add-dimension operation on the second component U and the third component V to obtain a second initial component U1 and a third initial component V1; obtain the first UV combined component according to the second initial component U1 and the third initial component V1; perform an up-sampling operation on the first UV combined component to obtain the second input component U2 and the third input component V2; and obtain the second UV combined component according to the second input component U2 and the third input component V2.
  • the second input branch of the input layer receives the reorganized second component U and third component V, which are used to represent the chrominance of the image. Because the resolutions of the second component U and the third component V are the same, the add-dimension operations on them can be performed at the same time to obtain the second initial component U1 and the third initial component V1 with the shape 1*1*(W/2)*(H/2), which conforms to the standard input format.
  • the data structures of the second initial component U1 and the third initial component V1 are both 1*1*(W/2)*(H/2), so the second initial component U1 and the first The three initial components V1 are cascaded to obtain a first UV combined component, and the shape of the first UV combined component is 1*2*(W/2)*(H/2).
  • the data structures of the second input component U2 and the third input component V2 are both 1*1*W*H, so a cascade operation can be performed on them to obtain the second UV combined component, whose data structure is 1*2*W*H.
  • the input module 202 is further configured to: cascade the first initial component Y1 and the second UV combined component to obtain a first YUV combined component; combine the first input component Y2 with The first UV combined component is cascaded to obtain the second YUV combined component.
  • the data structure of the first initial component Y1 and the second UV combined component are the same, so the first initial component Y1 and the second UV combined component may be cascaded to obtain the first A YUV combined component, the data structure of the first YUV combined component is 1*3*W*H.
  • the data structure of the first input component Y2 and the first UV combined component are the same, so the first input component Y2 and the first UV combined component may be cascaded to obtain the first Two YUV combined components, the data structure of the second YUV combined component is 1*3*(W/2)*(H/2).
  • the new network structure is adaptively transformed at the input layer: the non-standard format adopts the form of multiple inputs to adapt the resolutions, while the cascade operation between channels enhances the correlation between images of different resolutions and makes up for the information loss caused by the multiple inputs.
  • the output module 204 is configured to output the recognition result through the YUV image recognition model.
  • the output module 204 is further configured to: use the first YUV combined component and the second YUV combined component as the output data of the input layer, input the output data into the image processing layer to perform a YUV image recognition operation to obtain the recognition result, where the image processing layer includes an image recognition network, the image recognition network being a pre-trained convolutional neural network used to recognize YUV images; and output the recognition result through the output layer.
  • the computer device 2 is a device that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • the computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers).
  • the computer device 2 at least includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and a YUV image recognition system 20 that can communicate with each other through a system bus.
  • the memory 21 includes at least one type of computer-readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory ( RAM), static random access memory (SRAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2.
  • the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a smart media card (SMC), and a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc.
  • the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device.
  • the memory 21 is generally used to store the operating system and various application software installed in the computer device 2, such as the program code of the YUV image recognition system 20 in the second embodiment.
  • the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips.
  • the processor 22 is generally used to control the overall operation of the computer device 2.
  • the processor 22 is used to run the program code or process data stored in the memory 21, for example, to run the YUV image recognition system 20, so as to implement the YUV image recognition method of the first embodiment.
  • the network interface 23 may include a wireless network interface or a wired network interface, and the network interface 23 is generally used to establish a communication connection between the computer device 2 and other electronic devices.
  • the network interface 23 is used to connect the computer device 2 with an external terminal through a network, and establish a data transmission channel and a communication connection between the computer device 2 and the external terminal.
  • the network may be an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
  • FIG. 3 only shows the computer device 2 with components 20-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • the YUV image recognition system 20 stored in the memory 21 can also be divided into one or more program modules.
  • the one or more program modules are stored in the memory 21 and are composed of one or more
  • the processor (the processor 22 in this embodiment) is executed to complete the application.
  • FIG. 2 shows a schematic diagram of program modules for implementing the YUV image recognition system 20 according to the second embodiment of the present application.
  • the YUV image recognition system 20 can be divided into an acquisition module 200, an input module 202, And output module 204.
  • the program module referred to in this application refers to a series of computer program instruction segments that can complete specific functions, and is more suitable than a program to describe the execution process of the YUV image recognition system 20 in the computer device 2.
  • the specific functions of the program modules 200-204 have been described in detail in the second embodiment, and will not be repeated here.
  • the computer-readable storage medium may be non-volatile or volatile, such as flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, servers, application stores, etc., on which computer-readable instructions are stored; the computer-readable instructions realize the corresponding functions when executed by the processor.
  • the computer-readable storage medium of this embodiment is used in the YUV image recognition system 20, and the processor executes the following steps:
  • the YUV image is input to a YUV image recognition model, which includes an input layer, an image processing layer, and an output layer; wherein the input layer includes a component extraction branch, a first input branch, a second input branch, and a component combination branch, which are used to provide the YUV combined component of the YUV image to the image processing layer; and
  • the recognition result is output through the YUV image recognition model.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

A YUV image recognition method, the method comprising: acquiring a YUV image to be recognized (S100); inputting the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer (S102); and outputting a recognition result through the YUV image recognition model (S104). The method can recognize YUV-format images directly, without converting them into another image format, which reduces the computing power the image recognition model consumes on YUV-format images and improves the model's recognition efficiency for them.

Description

YUV image recognition method, system and computer device
This application claims priority to the Chinese patent application No. 202010138367.5, filed on March 3, 2020 and entitled "YUV image recognition method, system and computer device", the entire content of which is incorporated herein by reference.
Technical field
The embodiments of the present application relate to the field of image recognition, and in particular to a YUV image recognition method, system, computer device and computer-readable storage medium.
Background
As a colour encoding scheme, YUV is widely used in video processing components. Taking the perceptual characteristics of the human eye into account, YUV images subsample the chroma channels to reduce their bandwidth, thereby lowering the resource demands that image storage and transmission place on a device. However, the inventors realized that deep neural network models have become the most important modelling approach in artificial intelligence, with many dedicated accelerator chips emerging to better satisfy production's dual pursuit of accuracy and speed. It should be noted that deep neural networks impose normalization requirements on their input data, and YUV images stored in planar format cannot be applied directly to model training. Existing YUV image recognition methods fall into two categories. The first is a staged, step-by-step scheme: a traditional or deep model first locates the target, and the YUV components of the local image are then used for colour recognition; this not only rules out end-to-end training, but also uses only the UV chroma components, failing to exploit the full value of the YUV image. The second uses prior knowledge to convert the YUV format into RGB, eliminating the gap between research and production offline, but this increases computing power consumption and cannot fully exploit the acceleration features of dedicated chips.
Therefore, how to let a model recognize YUV-format images directly while keeping the image recognition model's computing power consumption low, and thereby further improve the flexibility of image recognition models, has become one of the technical problems to be solved.
Summary of the invention
In view of this, it is necessary to provide a YUV image recognition method, system, computer device and computer-readable storage medium, to solve the technical problem that current image recognition models cannot recognize YUV-format images directly.
To achieve the above objective, an embodiment of the present application provides a YUV image recognition method, the method comprising the steps of:
acquiring a YUV image to be recognized;
inputting the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
outputting a recognition result through the YUV image recognition model.
To achieve the above objective, an embodiment of the present application further provides a YUV image recognition system, comprising:
an acquisition module, used to acquire a YUV image to be recognized;
an input module, used to input the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
an output module, used to output the recognition result through the YUV image recognition model.
To achieve the above objective, an embodiment of the present application further provides a computer device, the computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, the computer-readable instructions, when executed by the processor, implementing the following steps:
acquiring a YUV image to be recognized;
inputting the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
outputting a recognition result through the YUV image recognition model.
To achieve the above objective, an embodiment of the present application further provides a computer-readable storage medium storing computer-readable instructions, the computer-readable instructions being executable by at least one processor to cause the at least one processor to perform the following steps:
acquiring a YUV image to be recognized;
inputting the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
outputting a recognition result through the YUV image recognition model.
The YUV image recognition method, system, computer device and computer-readable storage medium provided by the embodiments of the present application can recognize YUV-format images directly, without converting them into another image format, which reduces the computing power the image recognition model consumes on YUV-format images and improves the model's recognition efficiency for them.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the YUV image recognition method according to an embodiment of the present application.
Fig. 2 is a schematic diagram of the program modules of Embodiment 2 of the YUV image recognition system of the present application.
Fig. 3 is a schematic diagram of the hardware structure of Embodiment 3 of the computer device of the present application.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present application clearer, the present application is further described in detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application, not to limit it. Based on the embodiments of the present application, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present application.
It should be noted that descriptions involving "first", "second" and the like in the present application are for descriptive purposes only and shall not be understood as indicating or implying relative importance, or implicitly specifying the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, provided the combination can be realized by a person of ordinary skill in the art; when a combination of technical solutions is contradictory or unrealizable, the combination shall be deemed not to exist, and it falls outside the scope of protection claimed by the present application.
In the following embodiments, the computer device 2 is taken as the execution subject for exemplary description.
Embodiment 1
Referring to Fig. 1, a flowchart of the steps of the YUV image recognition method according to an embodiment of the present application is shown. It should be understood that the flowcharts in the method embodiments are not intended to limit the order in which the steps are executed. The following exemplary description takes the computer device 2 as the execution subject. The details are as follows.
Step S100: acquire a YUV image to be recognized.
As a colour encoding scheme, YUV can be widely used in video processing components. The bitstream output by a video capture chip may include a YUV data stream, and splitting this YUV data stream into frames yields multiple YUV images. These YUV images subsample the chroma channels to reduce their bandwidth, which lowers the resource demands that image storage and transmission place on a device.
Exemplarily, since the YUV image format does not conform to the input format of an image recognition model, a typical image recognition model cannot recognize YUV images directly; therefore, for an image recognition model to recognize YUV images directly, the input layer of the image recognition model needs to be rebuilt.
Step S102: input the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer.
The YUV image recognition model is an improved neural network model that can recognize YUV images directly. Compared with a traditional neural network model, it removes the intermediate step of converting the YUV image into an RGB image during recognition, so as to reduce resource consumption in the image recognition process.
The input layer of the YUV image recognition model is configured with a component extraction branch, a first input branch, a second input branch and a component combination branch; through these branches the YUV image recognition model can recognize YUV images directly. The specific functions of each branch are as follows:
Exemplarily, the component extraction branch performs the following step S102a:
Step S102a: perform component extraction on the YUV image to obtain the first component Y, the second component U and the third component V of the YUV image.
Exemplarily, the component extraction branch is used to extract the first component Y, the second component U and the third component V from the YUV-format image, where the first component Y represents the luminance (Y) of the image, and the second component U and the third component V represent the chrominance (U and V) of the image. The YUV-format image includes a YUV420P-format image, in which every 4 Y samples share one pair of UV samples. Given an image of size 4*4=16, the Y, U and V components have sizes 16, 4 and 4 respectively: samples 1 to 16 can be extracted as the Y component, 17 to 20 as the U component and 21 to 24 as the V component. A reshape operation is then performed on the Y, U and V components, yielding three groups of component data: Y(4*4) with a (W*H) data structure, U(2*2) with a (W/2)*(H/2) data structure, and V(2*2) with a (W/2)*(H/2) data structure. The reshape operation also rearranges the UV components into the order of their corresponding Y samples, i.e. the U(1*4) component is reshaped into a U(2*2) component and the V(1*4) component into a V(2*2) component; since the resolution of the Y(4*4) component matches the original image, the Y(4*4) component does not need a reshape operation.
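The extraction-and-reshape step above can be sketched in code. This is an illustrative NumPy sketch, not the patent's implementation; the function name `split_yuv420p` and the use of NumPy are assumptions for demonstration, and the (H, W) row-major plane order follows NumPy convention:

```python
import numpy as np

def split_yuv420p(frame: bytes, w: int, h: int):
    """Split a planar YUV420P frame into its Y, U and V planes.

    In YUV420P the Y plane has full resolution (w*h samples) and the
    U and V planes are subsampled to (w/2)*(h/2) samples each, so a
    whole frame occupies w*h*3/2 bytes.
    """
    assert len(frame) == w * h * 3 // 2
    buf = np.frombuffer(frame, dtype=np.uint8)
    y = buf[: w * h].reshape(h, w)                        # luma, full resolution
    u = buf[w * h : w * h + (w // 2) * (h // 2)].reshape(h // 2, w // 2)
    v = buf[w * h + (w // 2) * (h // 2) :].reshape(h // 2, w // 2)
    return y, u, v

# The 4*4 example from the text: samples 1..16 are Y, 17..20 U, 21..24 V.
frame = bytes(range(1, 25))
y, u, v = split_yuv420p(frame, 4, 4)
print(y.shape, u.shape, v.shape)   # (4, 4) (2, 2) (2, 2)
```

For a general W*H frame the same slicing applies: the first W*H bytes are the Y plane, the next (W/2)*(H/2) bytes the U plane, and the remainder the V plane.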
Exemplarily, the first input branch performs the following steps S102b1 to S102b2:
Step S102b1: perform a dimension-expansion operation on the first component Y to obtain the first initial component Y1. Step S102b2: perform a pooling operation on the first initial component Y1 to obtain the pooled first input component Y2.
Exemplarily, the first input branch is used to receive the first component Y. Since the data structure of the first component Y is a two-dimensional array W*H, which does not conform to the standard input format, the two-dimensional array W*H needs to be dimension-expanded to obtain the first initial component Y1 with a data structure of 1*1*W*H, which conforms to the standard input format.
To further improve the recognition rate for YUV images, this embodiment also performs a pooling operation on the first initial component Y1 to obtain the pooled first input component Y2 with a data structure of 1*1*(W/2)*(H/2). The structure 1*1*(W/2)*(H/2) conforms to the standard input format.
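A minimal sketch of the first input branch, assuming NumPy arrays in (batch, channel, H, W) order and 2x2 average pooling — the embodiment does not name the pooling type, so average pooling is an assumption:

```python
import numpy as np

def first_input_branch(y: np.ndarray):
    """Expand the W*H luma plane to the 1*1*W*H layout expected by the
    network, then pool it down to 1*1*(W/2)*(H/2)."""
    y1 = y[np.newaxis, np.newaxis, :, :].astype(np.float32)   # 1*1*H*W
    h, w = y.shape
    # 2x2 average pooling with stride 2, done by reshaping into 2x2 blocks
    y2 = y1.reshape(1, 1, h // 2, 2, w // 2, 2).mean(axis=(3, 5))
    return y1, y2

y = np.arange(16, dtype=np.float32).reshape(4, 4)
y1, y2 = first_input_branch(y)
print(y1.shape, y2.shape)   # (1, 1, 4, 4) (1, 1, 2, 2)
```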
Exemplarily, the second input branch performs the following steps S102c1 to S102c4:
Step S102c1: perform a dimension-expansion operation on the second component U and the third component V to obtain the second initial component U1 and the third initial component V1. Step S102c2: obtain the first UV combined component from the second initial component U1 and the third initial component V1. Step S102c3: perform an upsampling operation on the first UV combined component to obtain the second input component U2 and the third input component V2. Step S102c4: obtain the second UV combined component from the second input component U2 and the third input component V2.
Exemplarily, the second input branch is used to receive the reshaped second component U and third component V, which represent the chrominance of the image. It is easy to see that, since the data structures of the second component U and the third component V are both two-dimensional arrays (W/2)*(H/2), which do not conform to the standard input format, the two-dimensional arrays (W/2)*(H/2) need to be dimension-expanded; and since the second component U and the third component V have the same resolution, their dimension-expansion operations can be performed simultaneously, yielding the second initial component U1 and the third initial component V1 with shape 1*1*(W/2)*(H/2), which conforms to the standard input format.
The data structures of the second initial component U1 and the third initial component V1 are both 1*1*(W/2)*(H/2), so a concatenation operation can be performed on them to obtain the first UV combined component, whose shape is 1*2*(W/2)*(H/2).
To keep the format consistent with the first initial component Y1, whose data structure is 1*1*W*H, an upsampling operation also needs to be performed on the first UV combined component, yielding the second input component U2 and the third input component V2 with data structures of 1*1*W*H.
The data structures of the second input component U2 and the third input component V2 are both 1*1*W*H, so a concatenation operation can be performed on them to obtain the second UV combined component, whose data structure is 1*2*W*H.
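The four steps of the second input branch can be sketched as follows. Nearest-neighbour repetition stands in for the unspecified upsampling operation, and (batch, channel, H, W) ordering is assumed; both are illustrative choices, not stated in the patent:

```python
import numpy as np

def second_input_branch(u: np.ndarray, v: np.ndarray):
    """Dimension-expand the (W/2)*(H/2) chroma planes, concatenate them
    along the channel axis, then upsample by 2x to full W*H and
    concatenate again."""
    u1 = u[np.newaxis, np.newaxis].astype(np.float32)   # 1*1*(H/2)*(W/2)
    v1 = v[np.newaxis, np.newaxis].astype(np.float32)
    uv1 = np.concatenate([u1, v1], axis=1)              # 1*2*(H/2)*(W/2)
    # 2x nearest-neighbour upsampling of each chroma channel
    u2 = u1.repeat(2, axis=2).repeat(2, axis=3)         # 1*1*H*W
    v2 = v1.repeat(2, axis=2).repeat(2, axis=3)
    uv2 = np.concatenate([u2, v2], axis=1)              # 1*2*H*W
    return uv1, uv2

u = np.full((2, 2), 17.0)
v = np.full((2, 2), 21.0)
uv1, uv2 = second_input_branch(u, v)
print(uv1.shape, uv2.shape)   # (1, 2, 2, 2) (1, 2, 4, 4)
```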
Exemplarily, the component combination branch performs the following steps S102d1 to S102d2:
Step S102d1: perform a concatenation operation on the first initial component Y1 and the second UV combined component to obtain the first YUV combined component. Step S102d2: perform a concatenation operation on the first input component Y2 and the first UV combined component to obtain the second YUV combined component.
Exemplarily, the component combination branch is used to receive the first initial component Y1, the first input component Y2, the first UV combined component and the second UV combined component.
The first initial component Y1 and the second UV combined component have the same data structure, so the first initial component Y1 and the second UV combined component can be concatenated to obtain the first YUV combined component, whose data structure is 1*3*W*H.
The first input component Y2 and the first UV combined component have the same data structure, so the first input component Y2 and the first UV combined component can be concatenated to obtain the second YUV combined component, whose data structure is 1*3*(W/2)*(H/2).
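The two channel-wise concatenations of the component combination branch, sketched with placeholder arrays of the shapes given above (the `combine` helper is hypothetical, for illustration only):

```python
import numpy as np

def combine(y1, y2, uv1, uv2):
    """Concatenate luma and chroma along the channel axis:
    Y1 (1*1*H*W)      + UV2 (1*2*H*W)          -> first  YUV combo, 1*3*H*W
    Y2 (1*1*H/2*W/2)  + UV1 (1*2*H/2*W/2)      -> second YUV combo, 1*3*(H/2)*(W/2)
    """
    yuv_full = np.concatenate([y1, uv2], axis=1)
    yuv_half = np.concatenate([y2, uv1], axis=1)
    return yuv_full, yuv_half

y1 = np.zeros((1, 1, 4, 4))
y2 = np.zeros((1, 1, 2, 2))
uv1 = np.zeros((1, 2, 2, 2))
uv2 = np.zeros((1, 2, 4, 4))
full, half = combine(y1, y2, uv1, uv2)
print(full.shape, half.shape)   # (1, 3, 4, 4) (1, 3, 2, 2)
```

The two combined tensors are what the input layer hands to the image processing layer, one at full resolution and one at half resolution.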
Step S104: output the recognition result through the YUV image recognition model.
Exemplarily, step S104 may further include: taking the first YUV combined component and the second YUV combined component as the output data of the input layer, and inputting this output data into the image processing layer to perform the YUV image recognition operation and obtain the recognition result, the image processing layer including an image recognition network, the image recognition network being a pre-trained convolutional neural network used to recognize YUV images; and outputting the recognition result through the output layer.
Through the component extraction branch, first input branch, second input branch and component combination branch of the input layer, the YUV image recognition model of this embodiment modifies the picture format of the YUV image fed into the input layer, turning an image recognition model that cannot recognize YUV images directly into one that can. At the same time, the concatenation operations across channels strengthen the correlation between images of different resolutions, compensating for the under-learning of information that multiple inputs would otherwise cause. Training data is then used to train the model that can recognize YUV images directly, yielding the YUV image recognition model.
In an exemplary embodiment, the YUV image recognition model is a deep neural network model obtained after training; the training steps are as follows:
Step 1: acquire multiple YUV images in advance. Step 2: use the multiple YUV images as the training set of a pre-training model. Step 3: train the pre-training model with the training set to obtain the YUV image recognition model, where the pre-training model is a deep neural network model whose input layer has been rebuilt.
Exemplarily, the training process of the model is the process of driving the learnable parameters w and b in the model structure ever closer to their ideal values. For the training data in the training set, with the input x and the target value y known, after the layer-by-layer operations the model output is
$\hat{y} = f(x; w, b)$,
and the optimal parameters can then be found by solving
$(w^{*}, b^{*}) = \arg\min_{w,b} \operatorname{Loss}(\hat{y}, y)$.
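The parameter-fitting idea can be illustrated with a toy one-dimensional model standing in for the full network; the linear model, the squared loss and the learning rate are illustrative assumptions, not the patent's training setup:

```python
import numpy as np

# Drive the learnable parameters w, b toward the values minimising the
# loss between model output w*x + b and target y. Here the "ideal"
# values are w=2, b=0.5 by construction.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = 2.0 * x + 0.5

w, b = 0.0, 0.0
lr = 0.1
for _ in range(200):
    y_hat = w * x + b
    grad_w = 2 * np.mean((y_hat - y) * x)   # d/dw of mean squared error
    grad_b = 2 * np.mean(y_hat - y)         # d/db of mean squared error
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))   # 2.0 0.5
```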
It is easy to see that the input layer of the pre-training model has already been adjusted according to the structural characteristics of the YUV components, giving
$\hat{y} = L_{res}(L_1(Y), D(U, V))$,
where $D$ is the deconvolution operation, and $L_1$, $L_{res}$ and $D$ are all functions of w and b, so the objective equation of the new model becomes
$(w^{*}, b^{*}) = \arg\min_{w,b} \operatorname{Loss}(L_{res}(L_1(Y), D(U, V)), y)$.
That is, the three YUV channel components are treated equivalently: after the convolution-and-pooling operation $L_1$ of the convolutional layer, followed by the subsequent operations $L_{res}$, the model output $\hat{y}$ is obtained, from which the optimal solution of the objective function is found. It should be noted that the network structure used in this proposal contains three parts, $L_1$, $D$ and $L_{res}$: $L_1$ is the input layer for the Y component, $D$ is the deconvolution part for the UV components, and $L_{res}$ is the feature extraction structure that follows the input layer in the YUV image recognition model.
Embodiment 2
Fig. 2 is a schematic diagram of the program modules of Embodiment 2 of the YUV image recognition system of the present application. The YUV image recognition system 20 may include, or be divided into, one or more program modules, which are stored in a storage medium and executed by one or more processors, so as to complete the present application and implement the above YUV image recognition method. A program module in the embodiments of the present application refers to a series of computer program instruction segments capable of completing a specific function, and is better suited than the program itself to describing the execution process of the YUV image recognition system 20 in the storage medium. The following description specifically introduces the functions of the program modules of this embodiment:
Acquisition module 200, used to acquire a YUV image to be recognized.
Exemplarily, the YUV image to be recognized can be obtained by splitting YUV-format video data into frames; the bitstream output by a video capture chip is normally in the form of a YUV data stream. As a colour encoding scheme, YUV is widely used in video processing components. YUV images subsample the chroma channels to reduce their bandwidth, which lowers the resource demands that image storage and transmission place on a device.
Input module 202, used to input the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer.
Exemplarily, the input module 202 is further used to: perform component extraction on the YUV image to obtain the first component Y, the second component U and the third component V of the YUV image.
Exemplarily, the component extraction branch is used to extract the first component Y, the second component U and the third component V from the YUV-format image, where the first component Y represents the luminance (Y) of the image, and the second component U and the third component V represent the chrominance (U and V) of the image. The YUV-format image includes a YUV420P-format image, in which every 4 Y samples share one pair of UV samples. Given an image of size 4*4=16, the Y, U and V components have sizes 16, 4 and 4 respectively: samples 1 to 16 can be extracted as the Y component, 17 to 20 as the U component and 21 to 24 as the V component. A reshape operation is then performed on the Y, U and V components, yielding three groups of component data: Y(4*4) with a (W*H) data structure, U(2*2) with a (W/2)*(H/2) data structure, and V(2*2) with a (W/2)*(H/2) data structure. The reshape operation also rearranges the UV components into the order of their corresponding Y samples, i.e. the U(1*4) component is reshaped into a U(2*2) component and the V(1*4) component into a V(2*2) component; since the resolution of the Y(4*4) component matches the original image, the Y(4*4) component does not need a reshape operation.
Exemplarily, the input module 202 is further used to: perform a dimension-expansion operation on the first component Y to obtain the first initial component Y1; and perform a pooling operation on the first initial component Y1 to obtain the pooled first input component Y2.
Exemplarily, the first input branch of the input layer is used to receive the first component Y. Since the data structure of the first component Y is a two-dimensional array W*H, which does not conform to the standard input format, the two-dimensional array W*H needs to be dimension-expanded to obtain the first initial component Y1 with a data structure of 1*1*W*H, which conforms to the standard input format. To further improve the recognition rate for YUV images, this solution also performs a pooling operation on the first initial component Y1 to obtain the pooled first input component Y2 with a data structure of 1*1*(W/2)*(H/2). The structure 1*1*(W/2)*(H/2) conforms to the standard input format.
Exemplarily, the input module 202 is further used to: perform a dimension-expansion operation on the second component U and the third component V to obtain the second initial component U1 and the third initial component V1; obtain the first UV combined component from the second initial component U1 and the third initial component V1; perform an upsampling operation on the first UV combined component to obtain the second input component U2 and the third input component V2; and obtain the second UV combined component from the second input component U2 and the third input component V2.
Exemplarily, the second input branch of the input layer receives the reshaped second component U and third component V, which represent the chrominance of the image. It is easy to see that, since the data structures of the second component U and the third component V are both two-dimensional arrays (W/2)*(H/2), which do not conform to the standard input format, the two-dimensional arrays (W/2)*(H/2) need to be dimension-expanded; and since the second component U and the third component V have the same resolution, their dimension-expansion operations can be performed simultaneously, yielding the second initial component U1 and the third initial component V1 with shape 1*1*(W/2)*(H/2), which conforms to the standard input format.
Exemplarily, the data structures of the second initial component U1 and the third initial component V1 are both 1*1*(W/2)*(H/2), so a concatenation operation can be performed on them to obtain the first UV combined component, whose shape is 1*2*(W/2)*(H/2).
Exemplarily, to keep the format consistent with the first initial component Y1, whose data structure is 1*1*W*H, an upsampling operation also needs to be performed on the first UV combined component, yielding the second input component U2 and the third input component V2 with data structures of 1*1*W*H.
Exemplarily, the data structures of the second input component U2 and the third input component V2 are both 1*1*W*H, so a concatenation operation can be performed on them to obtain the second UV combined component, whose data structure is 1*2*W*H.
Exemplarily, the input module 202 is further used to: perform a concatenation operation on the first initial component Y1 and the second UV combined component to obtain the first YUV combined component; and perform a concatenation operation on the first input component Y2 and the first UV combined component to obtain the second YUV combined component.
Exemplarily, the first initial component Y1 and the second UV combined component have the same data structure, so they can be concatenated to obtain the first YUV combined component, whose data structure is 1*3*W*H.
Exemplarily, the first input component Y2 and the first UV combined component have the same data structure, so they can be concatenated to obtain the second YUV combined component, whose data structure is 1*3*(W/2)*(H/2).
Exemplarily, compared with the model structure for traditional RGB images, the new network structure adapts the input layer: the non-standard format is resolution-adapted through multiple inputs, while the concatenation operations across channels strengthen the correlation between images of different resolutions, compensating for the under-learning of information that multiple inputs would otherwise cause.
Output module 204, used to output the recognition result through the YUV image recognition model.
Exemplarily, the output module 204 is further used to: take the first YUV combined component and the second YUV combined component as the output data of the input layer, and input this output data into the image processing layer to perform the YUV image recognition operation and obtain the recognition result, the image processing layer including an image recognition network, the image recognition network being a pre-trained convolutional neural network used to recognize YUV images; and output the recognition result through the output layer.
Embodiment 3
Referring to Fig. 3, a schematic diagram of the hardware architecture of the computer device of Embodiment 3 of the present application is shown. In this embodiment, the computer device 2 is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server or a cabinet server (including an independent server, or a server cluster composed of multiple servers), etc. As shown in the figure, the computer device 2 at least includes, but is not limited to, a memory 21, a processor 22, a network interface 23 and the YUV image recognition system 20, which can be communicatively connected to each other through a system bus.
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, such as the hard disk or main memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card or flash card equipped on the computer device 2. Of course, the memory 21 may also include both the internal storage unit and the external storage device of the computer device 2. In this embodiment, the memory 21 is generally used to store the operating system and various application software installed on the computer device 2, such as the program code of the YUV image recognition system 20 of Embodiment 2. In addition, the memory 21 may also be used to temporarily store various data that has been output or is to be output.
In some embodiments, the processor 22 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor or another data processing chip. The processor 22 is generally used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is used to run the program code or process the data stored in the memory 21, for example to run the YUV image recognition system 20, so as to implement the YUV image recognition method of Embodiment 1.
The network interface 23 may include a wireless network interface or a wired network interface, and is generally used to establish communication connections between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 with an external terminal through a network, and to establish data transmission channels and communication connections between the computer device 2 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile communication (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth or Wi-Fi.
It should be pointed out that Fig. 3 only shows the computer device 2 with components 20-23, but it should be understood that not all of the shown components are required to be implemented; more or fewer components may be implemented instead.
In this embodiment, the YUV image recognition system 20 stored in the memory 21 may also be divided into one or more program modules, which are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment), so as to complete the present application.
For example, Fig. 2 shows a schematic diagram of the program modules implementing the YUV image recognition system 20 according to Embodiment 2 of the present application. In that embodiment, the YUV image recognition system 20 may be divided into an acquisition module 200, an input module 202 and an output module 204. A program module referred to in the present application is a series of computer program instruction segments capable of completing a specific function, better suited than a program to describing the execution process of the YUV image recognition system 20 in the computer device 2. The specific functions of the program modules 200-204 have been described in detail in Embodiment 2 and will not be repeated here.
Embodiment 4
This embodiment further provides a computer-readable storage medium, which may be non-volatile or volatile, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, server, app store, etc., on which computer-readable instructions are stored; the computer-readable instructions realize the corresponding functions when executed by a processor. The computer-readable storage medium of this embodiment is used for the YUV image recognition system 20, and when executed by the processor performs the following steps:
acquiring a YUV image to be recognized;
inputting the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
outputting a recognition result through the YUV image recognition model.
The serial numbers of the above embodiments of the present application are for description only and do not represent the superiority or inferiority of the embodiments.
Through the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, though in many cases the former is the better implementation.
The above are only preferred embodiments of the present application and do not thereby limit the patent scope of the present application; any equivalent structural or equivalent process transformation made using the contents of the specification and drawings of the present application, applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (20)

  1. A YUV image recognition method, wherein the method comprises:
    acquiring a YUV image to be recognized;
    inputting the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
    outputting a recognition result through the YUV image recognition model.
  2. The YUV image recognition method according to claim 1, wherein the component extraction branch is used to perform the following step:
    performing component extraction on the YUV image to obtain the first component Y, the second component U and the third component V of the YUV image.
  3. The YUV image recognition method according to claim 2, wherein the first input branch is used to perform the following steps:
    performing a dimension-expansion operation on the first component Y to obtain a first initial component Y1; and
    performing a pooling operation on the first initial component Y1 to obtain a pooled first input component Y2.
  4. The YUV image recognition method according to claim 3, wherein the second input branch is used to perform the following steps:
    performing a dimension-expansion operation on the second component U and the third component V to obtain a second initial component U1 and a third initial component V1;
    obtaining a first UV combined component from the second initial component U1 and the third initial component V1;
    performing an upsampling operation on the first UV combined component to obtain a second input component U2 and a third input component V2; and
    obtaining a second UV combined component from the second input component U2 and the third input component V2.
  5. The YUV image recognition method according to claim 4, wherein the component combination branch is used to perform the following steps:
    performing a concatenation operation on the first initial component Y1 and the second UV combined component to obtain a first YUV combined component;
    performing a concatenation operation on the first input component Y2 and the first UV combined component to obtain a second YUV combined component.
  6. The YUV image recognition method according to claim 5, wherein outputting the recognition result through the YUV image recognition model comprises:
    taking the first YUV combined component and the second YUV combined component as the output data of the input layer, and inputting the output data into the image processing layer to perform a YUV image recognition operation to obtain the recognition result, the image processing layer comprising an image recognition network, the image recognition network being a pre-trained convolutional neural network used to recognize YUV images; and
    outputting the recognition result through the output layer.
  7. A YUV image recognition system, wherein it comprises:
    an acquisition module, used to acquire a YUV image to be recognized;
    an input module, used to input the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
    an output module, used to output the recognition result through the YUV image recognition model.
  8. The YUV image recognition system according to claim 7, wherein the input module is further used to:
    perform component extraction on the YUV image to obtain the first component Y, the second component U and the third component V of the YUV image.
  9. A computer device, the computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    acquiring a YUV image to be recognized;
    inputting the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
    outputting a recognition result through the YUV image recognition model.
  10. The computer device according to claim 9, wherein the computer-readable instructions, when executed by the processor, further implement the following step:
    performing component extraction on the YUV image to obtain the first component Y, the second component U and the third component V of the YUV image.
  11. The computer device according to claim 10, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    performing a dimension-expansion operation on the first component Y to obtain a first initial component Y1; and
    performing a pooling operation on the first initial component Y1 to obtain a pooled first input component Y2.
  12. The computer device according to claim 11, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    performing a dimension-expansion operation on the second component U and the third component V to obtain a second initial component U1 and a third initial component V1;
    obtaining a first UV combined component from the second initial component U1 and the third initial component V1;
    performing an upsampling operation on the first UV combined component to obtain a second input component U2 and a third input component V2; and
    obtaining a second UV combined component from the second input component U2 and the third input component V2.
  13. The computer device according to claim 12, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    performing a concatenation operation on the first initial component Y1 and the second UV combined component to obtain a first YUV combined component;
    performing a concatenation operation on the first input component Y2 and the first UV combined component to obtain a second YUV combined component.
  14. The computer device according to claim 13, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    taking the first YUV combined component and the second YUV combined component as the output data of the input layer, and inputting the output data into the image processing layer to perform a YUV image recognition operation to obtain the recognition result, the image processing layer comprising an image recognition network, the image recognition network being a pre-trained convolutional neural network used to recognize YUV images; and
    outputting the recognition result through the output layer.
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-readable instructions, the computer-readable instructions being executable by at least one processor to cause the at least one processor to perform the following steps:
    acquiring a YUV image to be recognized;
    inputting the YUV image into a YUV image recognition model, the YUV image recognition model comprising an input layer, an image processing layer and an output layer, wherein the input layer comprises a component extraction branch, a first input branch, a second input branch and a component combination branch, and is used to provide the YUV combined components of the YUV image to the image processing layer; and
    outputting a recognition result through the YUV image recognition model.
  16. The computer-readable storage medium according to claim 15, wherein the computer-readable instructions are further executable by at least one processor to cause the at least one processor to perform the following step:
    performing component extraction on the YUV image to obtain the first component Y, the second component U and the third component V of the YUV image.
  17. The computer-readable storage medium according to claim 16, wherein the computer-readable instructions are further executable by at least one processor to cause the at least one processor to perform the following steps:
    performing a dimension-expansion operation on the first component Y to obtain a first initial component Y1; and
    performing a pooling operation on the first initial component Y1 to obtain a pooled first input component Y2.
  18. The computer-readable storage medium according to claim 17, wherein the computer-readable instructions are further executable by at least one processor to cause the at least one processor to perform the following steps:
    performing a dimension-expansion operation on the second component U and the third component V to obtain a second initial component U1 and a third initial component V1;
    obtaining a first UV combined component from the second initial component U1 and the third initial component V1;
    performing an upsampling operation on the first UV combined component to obtain a second input component U2 and a third input component V2; and
    obtaining a second UV combined component from the second input component U2 and the third input component V2.
  19. The computer-readable storage medium according to claim 18, wherein the computer-readable instructions are further executable by at least one processor to cause the at least one processor to perform the following steps:
    performing a concatenation operation on the first initial component Y1 and the second UV combined component to obtain a first YUV combined component;
    performing a concatenation operation on the first input component Y2 and the first UV combined component to obtain a second YUV combined component.
  20. The computer-readable storage medium according to claim 19, wherein the computer-readable instructions are further executable by at least one processor to cause the at least one processor to perform the following steps:
    taking the first YUV combined component and the second YUV combined component as the output data of the input layer, and inputting the output data into the image processing layer to perform a YUV image recognition operation to obtain the recognition result, the image processing layer comprising an image recognition network, the image recognition network being a pre-trained convolutional neural network used to recognize YUV images; and
    outputting the recognition result through the output layer.
PCT/CN2020/118686 2020-03-03 2020-09-29 YUV image recognition method, system and computer device WO2021174834A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010138367.5A CN111428732B (zh) 2020-03-03 2020-03-03 YUV image recognition method, system and computer device
CN202010138367.5 2020-03-03

Publications (1)

Publication Number Publication Date
WO2021174834A1 true WO2021174834A1 (zh) 2021-09-10

Family

ID=71551973

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118686 WO2021174834A1 (zh) 2020-03-03 2020-09-29 YUV image recognition method, system and computer device

Country Status (2)

Country Link
CN (1) CN111428732B (zh)
WO (1) WO2021174834A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428732B (zh) * 2020-03-03 2023-10-17 平安科技(深圳)有限公司 Yuv图像识别方法、系统和计算机设备
CN111950727B (zh) * 2020-08-06 2022-10-04 中科智云科技有限公司 图像数据的神经网络训练和测试方法及设备

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897747A (zh) * 2017-02-28 2017-06-27 深圳市捷顺科技实业股份有限公司 一种基于卷积神经网络模型鉴别车辆颜色的方法及装置
CN109472270A (zh) * 2018-10-31 2019-03-15 京东方科技集团股份有限公司 图像风格转换方法、装置及设备
CN111428732A (zh) * 2020-03-03 2020-07-17 平安科技(深圳)有限公司 Yuv图像识别方法、系统和计算机设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2797326A1 (en) * 2013-04-22 2014-10-29 Nederlandse Organisatie voor toegepast -natuurwetenschappelijk onderzoek TNO Image color correction
CN110622169A (zh) * 2017-05-15 2019-12-27 渊慧科技有限公司 用于视频中的动作识别的神经网络系统
CN110136071B (zh) * 2018-02-02 2021-06-25 杭州海康威视数字技术股份有限公司 一种图像处理方法、装置、电子设备及存储介质
CN108259997B (zh) * 2018-04-02 2019-08-23 腾讯科技(深圳)有限公司 图像相关处理方法及装置、智能终端、服务器、存储介质


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SERMANET PIERRE; KAVUKCUOGLU KORAY; CHINTALA SOUMITH; LECUN YANN: "Pedestrian Detection with Unsupervised Multi-stage Feature Learning", 2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE COMPUTER SOCIETY, US, 23 June 2013 (2013-06-23), US, pages 3626 - 3633, XP032492965, ISSN: 1063-6919, DOI: 10.1109/CVPR.2013.465 *
THOMAS BOULAY; SAID EL-HACHIMI; MANI KUMAR SURISETTI; PULLARAO MADDU; SARANYA KANDAN: "YUVMultiNet: Real-time YUV multi-task CNN for autonomous driving", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 11 April 2019 (2019-04-11), 201 Olin Library Cornell University Ithaca, NY 14853, XP081168040 *

Also Published As

Publication number Publication date
CN111428732B (zh) 2023-10-17
CN111428732A (zh) 2020-07-17

Similar Documents

Publication Publication Date Title
US10110936B2 (en) Web-based live broadcast
WO2019184657A1 Image recognition method and apparatus, electronic device and storage medium
WO2021174834A1 YUV image recognition method, system and computer device
CN109522902B Extraction of spatio-temporal feature representations
CN108734653B Image style conversion method and apparatus
KR20210074360A Image processing method, device and apparatus, and storage medium
US8400523B2 (en) White balance method and white balance device
WO2023174098A1 Real-time gesture detection method and apparatus
CN113344794B Image processing method and apparatus, computer device and storage medium
WO2023035531A1 Text image super-resolution reconstruction method and related device
US10133955B2 (en) Systems and methods for object recognition based on human visual pathway
CN116188808B Image feature extraction method and system, storage medium and electronic device
US20230011823A1 (en) Method for converting image format, device, and storage medium
US20220343507A1 (en) Process of Image
WO2021042895A1 Neural-network-based verification code recognition method and system, and computer device
JP2023001926A Image fusion method and apparatus, image fusion model training method and apparatus, electronic device, storage medium, and computer program
CN113627328A Electronic device and image recognition method thereof, system-on-chip and medium
CN112819874A Depth information processing method, apparatus, device, storage medium and program product
CN110930474A Insect density heat map construction method, apparatus and system
WO2021164329A1 Image processing method and apparatus, communication device and readable storage medium
CN112990370B Image data processing method and apparatus, storage medium and electronic device
CN115660984A Image high-definition restoration method, apparatus and storage medium
CN114694065A Video processing method and apparatus, computer device and storage medium
CN114170082A Video playing, image processing and model training method, apparatus and electronic device
WO2022149127A1 (en) Method of training a neural network configured for converting 2d images into 3d models

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20922958

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20922958

Country of ref document: EP

Kind code of ref document: A1