WO2021147316A1 - Object recognition method and device - Google Patents

Object recognition method and device

Info

Publication number
WO2021147316A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
super
area
division
Prior art date
Application number
PCT/CN2020/111141
Other languages
French (fr)
Chinese (zh)
Inventor
高源 (Gao Yuan)
陈士胃 (Chen Shiwei)
郭一民 (Guo Yimin)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2021147316A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • This application relates to the field of image technology, and in particular to an object recognition method and device.
  • Image technology has been applied to many fields of people's production and daily life, such as the field of object recognition. For example, performing face recognition on images collected by surveillance cameras can assist in finding criminal suspects.
  • Because the distance between a surveillance camera and its subject is relatively long, the resolution of the collected image is low, making face recognition on the image difficult. Therefore, it is usually necessary to first increase the resolution of the color image of the collected image, and then perform face recognition on the resolution-increased image.
  • However, the face information in the resolution-increased image often differs considerably from the face information in the image collected by the surveillance camera, which lowers the accuracy of face recognition on the resolution-increased image.
  • the present application provides an object recognition method and device, which can solve the problem of low accuracy of face recognition on an image obtained by increasing the resolution.
  • the technical solution is as follows:
  • an object recognition method includes: after acquiring a target image generated by an image sensor, first super-dividing the image of the target region in the target image to obtain a super-division color image, and then recognizing the target object based on the super-division color image.
  • the target area includes at least a part of the area in the target raw image, and the resolution of the super-division color image is greater than the resolution of the image of the target area.
  • the image of the target region in the target raw image is subjected to super-division processing to enlarge and enhance the detailed information in the image of the target region.
  • the image being super-divided is a raw image that has not been processed by an ISP (image signal processor).
  • the image of the target area after the super-division processing can contain more detailed information. Therefore, the super-division color image (the color image of the super-division-processed image of the target area) obtained in this embodiment has more detailed information than the color image in the prior art, and recognizing the target object based on the super-division color image can improve the accuracy of target object recognition.
  • the target image is a Bayer mode image.
  • the target image is a Bayer mode image as an example.
  • the target image can also be in any mode other than the Bayer mode (such as the red, green, blue, white (RGBW) mode), which is not limited in this application.
  • the method further includes: performing first processing on the target image to obtain a color image of the target image; and determining a first region in the color image of the target image that contains the target object
  • the above-mentioned recognition of the target object based on the super-division color image specifically includes: first determining a second region in the super-division color image based on the first region, and then recognizing the target object based on the image of the second region in the super-division color image.
  • After the first region in the color image of the target image is determined, the first region can be mapped to the super-division color image in a certain manner, so as to obtain the second region in the super-division color image.
  • the second area is an area obtained by mapping the first area
  • the second area also includes the target object on the premise that the first area contains the target object.
  • the object recognition device may intercept the image of the second region, and perform target object recognition on the image of the second region.
  • the image of the second region in the super-division color image also contains more detailed information of the target object, based on the image of the second region The accuracy of the recognition of the target object is high.
  • the first area and the second area are both rectangular, the upper left corner coordinates of the first area are (X A1 , Y A1 ), and the lower right corner coordinates are (X A2 , Y A2 );
  • the coordinates of the upper left corner of the second area are (X B1 , Y B1 ), and the coordinates of the lower right corner are (X B2 , Y B2 );
  • X B1 = X A1 * K
  • Y B1 = Y A1 * K
  • X B2 = X B1 + (X A2 - X A1 ) * K
  • Y B2 = Y B1 + (Y A2 - Y A1 ) * K
  • K represents the resolution ratio between the super-division color image and the image in the target area.
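Under the assumption that both regions are axis-aligned rectangles and K is the resolution ratio, the corner mapping above can be sketched as follows (the function name and tuple layout are illustrative, not from the patent):

```python
def map_region(first_region, k):
    """Map a rectangular region from the color image of the target image
    into the super-division color image, whose resolution is k times larger.

    first_region: (x_a1, y_a1, x_a2, y_a2), the upper-left and lower-right
    corners. Returns the corresponding second region (x_b1, y_b1, x_b2, y_b2).
    """
    x_a1, y_a1, x_a2, y_a2 = first_region
    # The upper-left corner scales directly with the resolution ratio K.
    x_b1 = x_a1 * k
    y_b1 = y_a1 * k
    # The lower-right corner is the scaled upper-left corner plus the
    # scaled width and height of the first region.
    x_b2 = x_b1 + (x_a2 - x_a1) * k
    y_b2 = y_b1 + (y_a2 - y_a1) * k
    return (x_b1, y_b1, x_b2, y_b2)

# A 2x super-division of a region spanning (10, 20) to (30, 60):
print(map_region((10, 20, 30, 60), 2))  # (20, 40, 60, 120)
```

The second region therefore covers the same pixels of the scene as the first region, only at the higher resolution.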
  • the target image is a Bayer mode image
  • the image of the target area in the target image is super-divided to obtain
  • the super-division color image includes: acquiring the target four-channel image, and inputting the target four-channel image into a first model to obtain the super-division color image output by the first model; wherein the first model is used to super-divide the four-channel image of an un-super-divided Bayer mode image and output a color image of the Bayer mode image whose resolution has been increased.
  • the image of the target area includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: the combination map of the pixels in the first row and first column, the combination map of the pixels in the first row and second column, the combination map of the pixels in the second row and first column, and the combination map of the pixels in the second row and second column of the plurality of pixel groups.
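As a rough sketch of this decomposition (assuming an image whose height and width are even, so that 2x2 pixel groups tile it exactly), the four single-channel combination maps can be obtained by strided slicing; NumPy is used here purely for illustration:

```python
import numpy as np

def to_four_channel(bayer):
    """Split a Bayer mode image (H x W, with H and W even) into the four
    combination maps described above: one map per position inside the
    2-row-by-2-column pixel group."""
    top_left     = bayer[0::2, 0::2]  # pixels in row 1, column 1 of each group
    top_right    = bayer[0::2, 1::2]  # pixels in row 1, column 2 of each group
    bottom_left  = bayer[1::2, 0::2]  # pixels in row 2, column 1 of each group
    bottom_right = bayer[1::2, 1::2]  # pixels in row 2, column 2 of each group
    # Stack the maps into an (H/2) x (W/2) x 4 "four-channel image".
    return np.stack([top_left, top_right, bottom_left, bottom_right], axis=-1)

bayer = np.arange(16).reshape(4, 4)  # a toy 4x4 Bayer mode image
four = to_four_channel(bayer)
print(four.shape)  # (2, 2, 4)
```

Each channel then holds all the pixels of one color position, which is what makes the four-channel image a convenient input for the super-division model.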
  • the target image is a Bayer mode image
  • the image of the target area in the target image is super-divided to obtain
  • the super-division color image includes: performing super-division on the image of the target area to obtain a Bayer mode image; and then performing a second processing on the Bayer mode image to obtain the super-division color image.
  • the super-division of the image of the target area to obtain the Bayer mode image includes: acquiring the target four-channel image, wherein the image of the target area includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in the first row and first column, a combination map of the pixels in the first row and second column, a combination map of the pixels in the second row and first column, and a combination map of the pixels in the second row and second column of the plurality of pixel groups.
  • the first implementation obtains the super-division color image directly through the first model, and therefore obtains the super-division color image faster.
  • the second implementation method is to first super-divide the image of the target area to obtain the Bayer mode image (for example, obtain a target four-channel image with increased resolution through the second model, and then convert the target four-channel image into a Bayer mode image), After that, the Bayer mode image is processed into a super-division color image.
  • the object recognition device can use either of these two implementations to execute the process of obtaining the super-division color image, or the object recognition device can, based on the user's choice, use the implementation selected by the user from the two implementations to execute the process of obtaining the super-division color image.
  • the first processing includes a first distortion correction processing
  • the second processing includes a second distortion correction processing
  • the parameters of the first distortion correction processing include: a first distortion curve
  • the parameters of the second distortion correction processing include: a second distortion curve
  • the coordinates of the pixel corresponding to any sampling point (X 0 , Y 0 ) in the second distortion curve in the super-divided target image are (X Ai , Y Ai )
  • the coordinates of the pixel corresponding to any sampling point (X 0 , Y 0 ) in the first distortion curve in the target image are (X Bi , Y Bi );
  • X Ai = (X Bi - (W f /2+0.5)) * K + (W f * K/2+0.5)
  • Y Ai = (Y Bi - (H f /2+0.5)) * K + (H f * K/2+0.5);
  • W f represents the width of the target image
  • H f represents the height of the target image
  • K represents the resolution ratio between the super-division color image and the image of the target area
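A hedged sketch of the coordinate relation above, which shifts a sampling point's pixel coordinate to be relative to the image centre, scales it by K, and shifts it back relative to the centre of the K-times-larger image (the function name is illustrative):

```python
def scale_distortion_point(x_b, y_b, w_f, h_f, k):
    """Map a pixel coordinate (x_b, y_b) associated with a sampling point of
    the first distortion curve in the target image (width w_f, height h_f)
    to the corresponding coordinate in the super-divided image, whose
    resolution is k times that of the target image."""
    # Subtract the centre offset of the original image, scale by K,
    # then add the centre offset of the K-times-larger image.
    x_a = (x_b - (w_f / 2 + 0.5)) * k + (w_f * k / 2 + 0.5)
    y_a = (y_b - (h_f / 2 + 0.5)) * k + (h_f * k / 2 + 0.5)
    return x_a, y_a

# The centre of a 100x80 image stays at the centre of the 200x160 image:
print(scale_distortion_point(50.5, 40.5, 100, 80, 2))  # (100.5, 80.5)
```

The 0.5 terms account for pixel centres, so distortion is corrected about the optical centre rather than the corner of the image.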
  • the parameters for processing the target image into a color image may not be suitable for processing the super-divided Bayer mode image into a color image. If the Bayer mode image is processed directly with the parameters used to process the target image into a color image, the resulting super-division color image may exhibit problems such as color cast, image distortion, and contrast changes.
  • the parameters of the distortion correction processing for processing the target raw image into a color image can be corrected, thus avoiding these problems in the obtained super-division color image.
  • corresponding processing is performed on the Bayer mode image to obtain a super-division color image.
  • the corresponding processing on the Bayer mode image may not be performed based on the parameters for processing the target raw image into a color image. This application does not limit this.
  • the target area of the target image is taken as the entire area of the target image as an example.
  • the target area of the target image may be an area (such as a partial area) that contains the target object in the target image.
  • before the super-division of the image of the target region in the target image to obtain a super-division color image, the method further includes: determining the target region in the target image that contains the target object.
  • the target raw image is the m+1-th frame image in the raw image video, m ≥ 1, and before the determination of the target region containing the target object in the target raw image, the method further includes: performing third processing on the m-th frame image in the raw image video to obtain a color image of the m-th frame image, and determining a third region containing the target object in the color image of the m-th frame image. The determining of the target region containing the target object in the target image includes: determining the target region in the target image that corresponds to the third region.
  • the target area and the third area are both rectangular, the coordinates of the upper left corner of the third area are (X D1 , Y D1 ), and the coordinates of the lower right corner are (X D2 , Y D2 );
  • the coordinates of the upper left corner of the target area corresponding to the third area are (X C1 , Y C1 ), and the coordinates of the lower right corner are (X C2 , Y C2 );
  • X D1 = (X C1 - (X C1 + X C2 )/2) * L + (X C1 + X C2 )/2;
  • Y D1 = (Y C1 - (Y C1 + Y C2 )/2) * L + (Y C1 + Y C2 )/2;
  • X D2 = (X C2 - (X C1 + X C2 )/2) * L + (X C1 + X C2 )/2;
  • Y D2 = (Y C2 - (Y C1 + Y C2 )/2) * L + (Y C1 + Y C2 )/2; where L represents a scaling ratio
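These four formulas scale a rectangle about its own centre. A small sketch under the assumption that L is a scalar enlargement ratio (the surrounding text implies but does not state this here):

```python
def scale_about_center(rect, l):
    """Scale a rectangle (x1, y1, x2, y2) about its centre by factor l,
    following the X_D/Y_D formulas above."""
    x1, y1, x2, y2 = rect
    cx = (x1 + x2) / 2  # centre x coordinate
    cy = (y1 + y2) / 2  # centre y coordinate
    # Each corner moves away from (or toward) the centre by factor l.
    return ((x1 - cx) * l + cx,
            (y1 - cy) * l + cy,
            (x2 - cx) * l + cx,
            (y2 - cy) * l + cy)

# Doubling a 10x10 rectangle centred at (15, 15) yields a 20x20 rectangle
# with the same centre:
print(scale_about_center((10, 10, 20, 20), 2))  # (5.0, 5.0, 25.0, 25.0)
```

Scaling about the centre lets the target region in frame m+1 cover the object even if it has moved slightly since frame m.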
  • the method further includes: when the target image includes multiple target areas and there are two target areas that satisfy the replacement condition among the multiple target areas, replacing the two target areas with a candidate target area to obtain the updated multiple target areas; wherein the replacement condition includes: the two target areas at least partially overlap, and the sum of the areas of the two target areas is greater than the area of the candidate target area; the coordinates of the upper left corner of one of the two target areas are (X 11 , Y 11 ) and the coordinates of its lower right corner are (X 12 , Y 12 ); the coordinates of the upper left corner of the other of the two target areas are (X 21 , Y 21 ) and the coordinates of its lower right corner are (X 22 , Y 22 ); the coordinates of the upper left corner of the candidate target area are (X M1 , Y M1 ) and the coordinates of its lower right corner are (X M2 , Y M2 ); X M1 is the minimum value of X 11 and X 21 , Y M1 is the minimum value of Y 11 and Y 21 , X M2 is the maximum value of X 12 and X 22 , and Y M2 is the maximum value of Y 12 and Y 22 .
  • an object recognition device in a second aspect, includes: modules for executing the object recognition method provided in the first aspect.
  • the object recognition device includes: an acquisition module for acquiring a target image generated by an image sensor, the target image being a Bayer mode image; and a super-division module for acquiring the target four-channel image, wherein the image of the target area in the target raw image includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in the first row and first column, a combination map of the pixels in the first row and second column, a combination map of the pixels in the second row and first column, and a combination map of the pixels in the second row and second column; the target area includes at least part of the area in the target raw image
  • the super-division module is also used to input the target four-channel image into a first model to obtain the super-division color image output by the first model; wherein the first model is used to super-divide the four-channel image of an un-super-divided Bayer mode image and output a color image of the Bayer mode image whose resolution has been increased
  • the object recognition device includes: an acquisition module for acquiring a target image generated by an image sensor, the target image being a Bayer mode image; and a super-division module for acquiring the target four-channel image, wherein the image of the target area in the target raw image includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in the first row and first column, a combination map of the pixels in the first row and second column, a combination map of the pixels in the second row and first column, and a combination map of the pixels in the second row and second column; the target area includes at least part of the area in the target raw image
  • the super-division module is also used to input the target four-channel image into a second model to obtain the resolution-increased target four-channel image output by the second model, and to convert the resolution-increased target four-channel image into a Bayer mode image; wherein the second model is used to super-divide the four-channel image of an un-super-divided Bayer mode image
  • the first processing includes a first distortion correction processing
  • the second processing includes a second distortion correction processing
  • the parameters of the first distortion correction processing include: a first distortion curve
  • the parameters of the second distortion correction processing include: a second distortion curve
  • the coordinates of the pixel corresponding to any sampling point (X 0 , Y 0 ) in the second distortion curve in the super-divided target image are (X Ai , Y Ai )
  • the coordinates of the pixel corresponding to any sampling point (X 0 , Y 0 ) in the first distortion curve in the target image are (X Bi , Y Bi );
  • X Ai = (X Bi - (W f /2+0.5)) * K + (W f * K/2+0.5)
  • Y Ai = (Y Bi - (H f /2+0.5)) * K + (H f * K/2+0.5)
  • W f represents the width of the target image
  • H f represents the height of the target image
  • K represents the resolution ratio between the super-division color image and the image of the target area
  • the object recognition device includes: an acquisition module for acquiring a target image generated by an image sensor; a second determining module for determining a target area containing the target object in the target image; and a replacement module for, when the target raw image includes multiple target areas and there are two target areas satisfying the replacement condition among the multiple target areas, replacing the two target areas with a candidate target area to obtain the updated multiple target areas;
  • a super-division module for super-dividing the image of the target area in the target image to obtain a super-division color image, wherein the target area includes at least part of the area in the target image,
  • and the resolution of the super-division color image is greater than the resolution of the image of the target area; and a recognition module configured to recognize the target object based on the super-division color image.
  • the replacement condition includes: the two target regions overlap at least partially, and the sum of the areas of the two target regions is greater than the area of the candidate target region;
  • the coordinates of the upper left corner are (X 11 , Y 11 ), and the coordinates of the lower right corner are (X 12 , Y 12 ); the coordinates of the upper left corner of the other of the two target areas are (X 21 , Y 21 ), and
  • the coordinates of the lower right corner are (X 22 , Y 22 );
  • the coordinates of the upper left corner of the candidate target area are (X M1 , Y M1 ), and the coordinates of the lower right corner are (X M2 , Y M2 );
  • X M1 is the minimum value of X 11 and X 21 ;
  • Y M1 is the minimum value of Y 11 and Y 21 ;
  • X M2 is the maximum value of X 12 and X 22 ;
  • Y M2 is the maximum value of Y 12 and Y 22
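The candidate target area is thus simply the bounding box of the two areas, and the replacement condition additionally requires overlap plus a combined area exceeding the bounding box's area. A sketch under those assumptions (function names are illustrative):

```python
def bounding_box(a, b):
    """Candidate target area: element-wise min of the upper-left corners and
    max of the lower-right corners of boxes given as (x1, y1, x2, y2)."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def overlaps(a, b):
    """True if two axis-aligned boxes at least partially overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def area(r):
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])

def should_replace(a, b):
    """Replacement condition: the boxes overlap, and the sum of their areas
    exceeds the area of the candidate (bounding) box."""
    return overlaps(a, b) and area(a) + area(b) > area(bounding_box(a, b))

a, b = (0, 0, 10, 10), (2, 2, 12, 12)
print(bounding_box(a, b), should_replace(a, b))  # (0, 0, 12, 12) True
```

Requiring the combined area to exceed the bounding box's area ensures the merge only happens when the two boxes overlap substantially, so the merged region stays tight around the object.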
  • an object recognition device in a third aspect, includes a processor and an interface.
  • the processor is used to obtain a raw image from an image sensor through the interface, and the processor is used to run a program to make The object recognition device executes the object recognition method described in the first aspect.
  • a computer storage medium is provided, and a computer program is stored in the storage medium, and the computer program is used to execute the object recognition method described in the first aspect.
  • a computer program product is provided. When the computer program product runs on an object recognition device, the object recognition device executes the object recognition method described in the first aspect.
  • FIG. 1 is a schematic structural diagram of an object recognition device provided by an embodiment of this application.
  • FIG. 2 is a flowchart of an object recognition method provided by an embodiment of the application
  • FIG. 3 is a schematic diagram of a target image provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of a color image of a target image provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of a super-division color image provided by an embodiment of the application.
  • FIG. 6 is a flowchart of a method for obtaining a super-division color image according to an embodiment of the application
  • FIG. 7 is a schematic diagram of a target four-channel image provided by an embodiment of this application.
  • FIG. 8 is a schematic diagram of a first model provided by an embodiment of this application.
  • FIG. 9 is a schematic diagram of a pixel rearrangement layer splicing four first combination maps into one second combination map.
  • FIG. 10 is a flowchart of another method for obtaining a super-division color image according to an embodiment of the application.
  • FIG. 11 is a schematic diagram of a second model provided by an embodiment of this application.
  • FIG. 12 is a schematic diagram of an image obtained after the resolution of the target four-channel image in FIG. 7 is increased according to an embodiment of the application;
  • FIG. 13 is a schematic diagram of a Bayer mode image provided by an embodiment of the application.
  • FIG. 14 is a supplementary flowchart of an object recognition method provided by an embodiment of the application.
  • FIG. 15 is a schematic diagram of an m-th frame image provided by an embodiment of this application.
  • FIG. 16 is a schematic diagram of an m+1-th frame image provided by an embodiment of this application.
  • FIG. 17 is a schematic diagram of a target area and candidate target areas in a target image provided by an embodiment of the application.
  • FIG. 18 is a block diagram of an object recognition device provided by an embodiment of the application.
  • the target object in the image can be recognized.
  • the target object may be any object such as a human face, hand, clothes, table, and bench.
  • the image is often a color image (compared with a raw image, a color image is more suitable for observation by human eyes), and resolution-increasing processing can be performed directly on the color image to increase the size of the color image and try to enhance the detailed information of the target object in the image.
  • color images are often obtained by performing a series of image signal processing (ISP) operations (such as image demosaicing, image denoising, and image compression) on the raw images (also called RAW images) directly collected by a shooting device (such as a video camera or camera).
  • In the course of this processing, some detailed information of the target object in the raw image is eliminated, so the color image itself contains less detailed information of the target object. Even if the resolution of the color image is increased, it is difficult to restore the detailed information of the target object in the color image.
  • the object recognition method provided in the embodiments of this application enlarges and enhances the detailed information of the target object in the raw image by performing resolution-increasing processing (that is, "super-resolution" processing) on the raw image. Since the image being super-divided is a raw image that has not been processed by the ISP, the resolution-increased raw image can contain more detailed information of the target object. After super-division, the color image of the resolution-increased raw image is obtained, so the color image obtained in this embodiment has more detailed information than in the prior art. Therefore, recognizing the target object based on this color image can improve the accuracy of target object recognition.
  • the ISP operation may be further performed on the color image.
  • the above-mentioned raw image is an image (also referred to as an original optical image) that is directly collected by the imaging device through its own image sensor without image signal processing.
  • the raw image is the most primitive image information inside the shooting device, so it retains the richest high-frequency details of the image that the shooting device can obtain.
  • the high-frequency details of the color image obtained by performing image signal processing on the raw image have been weakened or have even disappeared, so the resulting color image cannot retain the details of the target object.
  • the object recognition device includes: a processor 101 and an interface 106.
  • the processor 101 is connected to the interface 106, and the interface 106 is connected to an image sensor 105.
  • the image sensor 105 is used to generate a raw image.
  • the processor 101 is used to obtain a raw image from the image sensor 105 through the interface 106.
  • the processor 101 is configured to run a program, so that the object recognition apparatus executes the object recognition method provided in the embodiment of the present application.
  • the object recognition device may also include: a communication component 102, a memory 103, and at least one communication bus 104.
  • the processor 101, the interface 106, the communication component 102, and the memory 103 can be connected through the communication bus 104.
  • the program used by the processor 101 to execute may be the program 1031 in the memory 103.
  • the memory 103 may include a high-speed random access memory (RAM: Random Access Memory), and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory.
  • the communication component 102 (which may be wired or wireless) implements a communication connection between the object identification device and at least one other network element, and the Internet, a wide area network, a local area network, or a metropolitan area network may be used.
  • the processor and the memory are independent of each other as an example.
  • the memory 103 may also be integrated in the processor, which is not limited in the embodiment of the present application.
  • FIG. 2 is a flowchart of an object recognition method provided by an embodiment of the application.
  • the object recognition method may include:
  • Step 201 Acquire a target image in the Bayer mode.
  • the object recognition device can obtain the target raw image in the Bayer mode generated by the image sensor 105.
  • the object recognition device includes an image sensor as an example.
  • the object recognition device may not include an image sensor, but may be connected to an image sensor and obtain the Bayer mode target raw image generated by the image sensor.
  • the target raw image may include a plurality of pixels arranged in an array, every 2 rows and 2 columns of pixels among the plurality of pixels form a pixel group, and the colors of the four pixels in each pixel group are red, green 1, green 2, and blue.
  • green 1 and green 2 both represent green, but green 1 and green 2 respectively correspond to two pixels in the pixel group.
  • the color of the pixel in the first row and the first column of the pixel group is red
  • the color of the pixel in the first row and second column is green 1
  • the color of the pixel in the second row and first column is green 2.
  • the color of the pixel in the second row and second column is blue.
  • the distribution positions of the four pixels can also be different from the distribution positions shown in FIG. 3.
  • For example, the color of the pixel in the first row and first column is green 1, the color of the pixel in the first row and second column is red, the color of the pixel in the second row and first column is blue, and the color of the pixel in the second row and second column is green 2; or, the color of the pixel in the first row and first column is green 1, the color of the pixel in the first row and second column is blue, the color of the pixel in the second row and first column is red, and the color of the pixel in the second row and second column is green 2.
  • the embodiment of the application does not limit this.
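For concreteness, a toy layout matching the first arrangement described above (red, green 1, green 2, blue within each 2x2 pixel group) can be built and inspected like this; the channel labels are illustrative:

```python
import numpy as np

# Label each pixel of a 4x4 Bayer mode image with its color, using the
# arrangement where row 1/column 1 of each 2x2 pixel group is red,
# row 1/column 2 is green 1, row 2/column 1 is green 2, and
# row 2/column 2 is blue.
COLORS = np.array([["R", "G1"],
                   ["G2", "B"]])
h, w = 4, 4
# Tile the 2x2 pixel-group pattern across the whole image.
layout = np.tile(COLORS, (h // 2, w // 2))
print(layout)
```

Any of the alternative arrangements mentioned above correspond to a different 2x2 `COLORS` pattern being tiled in the same way.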
  • the embodiments of the present application do not limit the number of pixels in the target image.
  • the target image is a Bayer mode image as an example.
  • the target image can also be in any mode other than the Bayer mode (such as the red, green, blue, white (RGBW) mode), which is not limited in the embodiments of the present application.
  • Step 202 Perform first processing on the target raw image to obtain a color image of the target raw image.
  • the object recognition device needs to process the target raw image into a color image.
  • the color image can be an image in any color mode, such as a red-green-blue (RGB) format, a luminance and chrominance (YUV) format, and so on.
  • the first processing may include at least one of automatic white balance processing, color correction processing, gamma correction processing, and distortion correction processing.
  • the first processing may also include processing other than these four types of processing.
  • In the embodiments of the present application, the first processing including automatic white balance processing, color correction processing, gamma correction processing, and distortion correction processing is taken as an example, and the embodiments of the present application do not limit the sequence of these types of processing.
  • Color images are, for example, images in the Joint Photographic Experts Group (JPEG) format, images in the High Efficiency Image File Format (HEIF), and so on.
  • JPEG format is also called the JPG format.
  • Step 203 Determine the first region in the color image of the target image that contains the target object.
  • the object recognition device can recognize the target object on the color image to obtain the first region containing the target object in the color image.
  • the color image may include one or more first regions, which is not limited in the embodiment of the present application.
  • the object recognition device needs to recognize each first region in the color image.
  • Step 204 Perform super-division on the image of the target region in the target image to obtain a super-division color image, where the target region includes at least a part of the region in the target image, and the resolution of the super-division color image is greater than that of the target region in the target image The resolution of the image.
• When the object recognition device performs super-division on the image of the target area in the target raw image to obtain the super-division color image, it can first increase the resolution of the image of the target area (also a kind of raw image) to obtain the image of the target area after the resolution is increased, and then process that image to obtain the super-division color image. Since step 204 directly increases the resolution of the image of the target area in the target raw image, the image of the target area after the resolution is increased may contain more detailed information of the target object, and the super-division color image obtained from it can likewise contain more detailed information of the target object.
  • Step 205 Determine a second area in the super-division color image based on the first area.
  • the first area in the color image of the target image is determined in step 203
  • the first area can be mapped to the super-division color image in a certain manner, so as to obtain the second area in the super-division color image.
  • the second area is an area obtained by mapping the first area
• the second area also includes the target object on the premise that the first area contains the target object. It should be noted that if multiple first areas are determined in step 203, each of the multiple first areas needs to be mapped to the super-division color image in step 205, so that each first area corresponds to one second area.
  • both the first area and the second area may be rectangular.
  • the coordinates of the upper left corner of the first area are (X A1 , Y A1 ), and the coordinates of the lower right corner are (X A2 , Y A2 )
  • the coordinates of the upper left corner of the second area corresponding to the first area are (X B1 , Y B1 )
  • the coordinates of the lower right corner are (X B2 , Y B2 )
• X B1 = X A1 * K
• Y B1 = Y A1 * K
• X B2 = X B1 + (X A2 - X A1 ) * K
• Y B2 = Y B1 + (Y A2 - Y A1 ) * K
  • K represents the resolution ratio of the super-division color image to the target image (can be called the resolution improvement rate), K>1.
  • the color image of the target image is shown in Figure 4, and the super-division color image is shown in Figure 5.
• Assuming K=2, the coordinates of the upper left corner of the first area are (3, 3), and the coordinates of the lower right corner are (6, 6).
  • the length and width of the first region are both 3, the length and width of the second region are both 6, and the second region is twice as large as the first region in terms of length and width.
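As a minimal sketch, the mapping from the first area to the second area described above can be written as follows (the function name and argument layout are illustrative, not from the source):

```python
def map_region_to_superdiv(x_a1, y_a1, x_a2, y_a2, k):
    """Map a first area (rectangle in the color image) to the second
    area in the super-division color image, per the formulas above."""
    # Upper-left corner scales directly by the resolution ratio K.
    x_b1 = x_a1 * k
    y_b1 = y_a1 * k
    # Lower-right corner: upper-left plus the scaled width/height.
    x_b2 = x_b1 + (x_a2 - x_a1) * k
    y_b2 = y_b1 + (y_a2 - y_a1) * k
    return x_b1, y_b1, x_b2, y_b2
```

With the example values above (first area (3, 3) to (6, 6), K=2), this yields a second area from (6, 6) to (12, 12), consistent with the stated doubling of length and width.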
  • Step 206 Recognize the target object based on the second area.
• For example: license plate recognition and face recognition based on the second area.
  • the recognition of the target object based on the second area may refer to the recognition of the target object within the range of the second area.
• the object recognition device may intercept the image of the second area and perform target object recognition on it. Moreover, since the super-division color image contains more detailed information of the target object, the image of the second area in the super-division color image also contains more detailed information of the target object, so the accuracy of target object recognition based on the image of the second area is high.
• the target object can be recognized based only on the second area, instead of the areas other than the second area in the super-division color image, which reduces the complexity of target object recognition.
• The following takes the case where the target area in step 204 is the entire area of the target raw image as an example.
  • step 204 may include:
• Step 2041a Obtain a target four-channel image, where the image of the target area in the target raw image includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: the combined map of the pixels in the first row and first column, the combined map of the pixels in the first row and second column, the combined map of the pixels in the second row and first column, and the combined map of the pixels in the second row and second column of the plurality of pixel groups.
  • the image of the target area in the target raw image may include multiple pixel groups, and each pixel group includes four pixels of red, green 1, green 2, and blue.
  • the object recognition device may convert the image of the target area in the target raw image into a target four-channel image.
• the target four-channel image includes four combined maps, and each of the four combined maps includes the pixels at the same position in the multiple pixel groups.
• the target four-channel image obtained by converting the target area in the target raw image may be as shown in FIG. 7
• the four combined maps shown are the combined map of red pixels, the combined map of green 1-color pixels, the combined map of green 2-color pixels, and the combined map of blue pixels.
• if both the number of rows and the number of columns of a pixel are odd, the pixel belongs to the combined map of red pixels in the target four-channel image; if the number of rows of the pixel is odd and the number of columns is even, the pixel belongs to the combined map of green 1-color pixels; if the number of rows is even and the number of columns is odd, the pixel belongs to the combined map of green 2-color pixels; if both the number of rows and the number of columns are even, the pixel belongs to the combined map of blue pixels.
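The row/column rule above amounts to splitting the RGGB Bayer mosaic into four sub-sampled planes. A hedged sketch using NumPy (function name is illustrative; the text's 1-indexed "odd" rows/columns correspond to even indices in 0-indexed arrays):

```python
import numpy as np

def pack_bayer_to_4ch(raw):
    """Split an RGGB Bayer raw image of shape (H, W), H and W even,
    into four (H/2, W/2) combined maps: red, green 1, green 2, blue."""
    return np.stack([
        raw[0::2, 0::2],  # odd row,  odd col  (1-indexed) -> red
        raw[0::2, 1::2],  # odd row,  even col -> green 1
        raw[1::2, 0::2],  # even row, odd col  -> green 2
        raw[1::2, 1::2],  # even row, even col -> blue
    ])
```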
  • Step 2042a Input the target four-channel image into the first model to obtain the super-division color image output by the first model.
• the first model is used to super-divide the four-channel image of a Bayer mode image that has not been super-divided, and output the color image of the Bayer mode image after the resolution is increased. Therefore, after the four-channel image of the target area (a Bayer mode image; in this example the target area is the entire target raw image), called the target four-channel image, is input into the first model, the first model can process the target four-channel image and output a super-division color image.
  • the first model may be a neural network model.
  • FIG. 8 is a schematic diagram of a first model provided by an embodiment of the application.
• the first model may include: a first module, 16 second modules connected in series, k third modules connected in series, a fourth module, a fifth module, a sixth module, a seventh module and an eighth module.
  • the first module may include: a convolutional layer and a leaky linear rectification layer.
• This convolutional layer is used to perform convolution processing on the image input to the first model (such as the above-mentioned target four-channel image) with 64 convolution kernels respectively, and output a 64-channel image.
  • the size of each convolution kernel is 3*3.
  • the leaky linear rectification layer is used to activate the image output by the convolutional layer (such as the above-mentioned sixty-four channel image) based on the leaky linear rectification function to obtain the activation feature map of the image.
• the leaky linear rectification function can be:
• y i,j = x i,j , if x i,j ≥ 0; y i,j = x i,j /a, if x i,j < 0
• where x i,j is the pixel value (such as pixel intensity) of the pixel in the i-th row and j-th column of the image input to the leaky linear rectification layer, y i,j is the pixel value of the pixel in the i-th row and j-th column of the activation feature map, a is the weakening parameter, 1<a<2, i≥1, j≥1. It can be seen that after the activation process, the pixel value of a pixel whose value is greater than or equal to zero is retained, while the pixel value of a pixel whose value is less than zero is weakened.
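A numeric sketch of this activation. The text says negative pixel values are "weakened" by a parameter a with 1 < a < 2; division by a is one form consistent with that description and is an assumption here:

```python
import numpy as np

def leaky_rectify(x, a=1.5):
    # Values >= 0 pass through unchanged; negative values are weakened.
    # Division by a (1 < a < 2) is an assumed form of the weakening.
    return np.where(x >= 0, x, x / a)
```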
  • the second module may include: a first convolution layer, a linear rectification layer, a second convolution layer, and an addition layer.
• the first convolution layer is used to process the image input to the second module with 64 convolution kernels (3*3 size) and output a 64-channel image; the linear rectification layer is used to activate the 64-channel image output by the first convolution layer based on the linear rectification function and output the activated 64-channel image; the second convolution layer is used to process the 64-channel image output by the linear rectification layer with 64 convolution kernels (3*3 size) and output a 64-channel image; the addition layer is used to add the 64-channel image output by the second convolution layer to the image input to the second module and output the feature map of the 64-channel image.
• the linear rectification function is as follows:
• y i,j = max(0, x i,j )
• where x i,j is the pixel value of the pixel in the i-th row and j-th column of the image output by the first convolution layer, and y i,j is the pixel value of the pixel in the i-th row and j-th column of the activation feature map of the image.
  • the third module may include: a convolutional layer, a first leaky linear rectification layer, a pixel drawing layer, and a second leaky linear rectification layer.
• the convolution layer is used to process the image input to the third module with 64 convolution kernels (3*3 size) and output a 64-channel image; the first leaky linear rectification layer is used to activate the 64-channel image output by the convolution layer based on the leaky linear rectification function and output the activated 64-channel image; the pixel draw layer is used to divide the 64-channel image output by the first leaky linear rectification layer into 4 parts (each including a 16-channel image) and splice these 4 parts into one 16-channel image; the second leaky linear rectification layer is used to activate the 16-channel image output by the pixel draw layer based on the leaky linear rectification function and output the activated 16-channel image.
• the leaky linear rectification function can refer to the leaky linear rectification function in the first module, which is not described in detail in the embodiment of the present application.
• For example, the size of each combined map (called a first combined map) in the 64-channel image (including 64 first combined maps) output by the first leaky linear rectification layer can be x*y; then the size of each combined map (called a second combined map) in the 16-channel image (including 16 second combined maps) output by the pixel draw layer can be 2x*2y.
• The pixel draw layer can divide the 64 first combined maps into 4 parts (each part including 16 first combined maps), and take the m-th first combined map in each part to form a group of first combined maps. In this way, 16 groups of first combined maps can be obtained, and each group includes 4 first combined maps. After that, the pixel draw layer can splice the 4 first combined maps in each group into one second combined map, thereby obtaining 16 second combined maps.
  • FIG. 9 is a schematic diagram of the pixel drawing layer splicing four first combination diagrams in a group of first combination diagrams into one second combination diagram.
• the four first combined maps are called images A, B, C, and D; the pixels in image A are called 1, the pixels in image B are called 2, the pixels in image C are called 3, and the pixels in image D are called 4.
  • the second combination map formed by splicing the four first combination maps may include a plurality of pixel groups arranged in an array, and each pixel group includes two rows and two columns of pixels, where the pixels in the first row and the first column Is pixel 1 from image A, the pixel in row 1 and column 2 is pixel 2 from image B, the pixel in row 2 and column 1 is pixel 3 from image C, and the pixel in row 2 and column 2 is pixel from Pixel 4 of image D.
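The splicing in Figure 9 (a pixel-shuffle-style interleaving) can be sketched as follows; the function name is illustrative:

```python
import numpy as np

def splice_four(a, b, c, d):
    """Splice four (x, y) first combined maps into one (2x, 2y)
    second combined map, following Figure 9: within each 2x2 pixel
    group, A supplies row 1/col 1, B row 1/col 2, C row 2/col 1,
    and D row 2/col 2."""
    x, y = a.shape
    out = np.empty((2 * x, 2 * y), dtype=a.dtype)
    out[0::2, 0::2] = a
    out[0::2, 1::2] = b
    out[1::2, 0::2] = c
    out[1::2, 1::2] = d
    return out
```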
• each third module can magnify the input image by 2 times in each dimension. Since the first model includes k third modules, the k third modules can magnify the input image by 2^k times in total.
• The number k of third modules in the first model can be set reasonably according to the size of K.
  • the fourth module includes a convolutional layer, and the convolutional layer is used to process the image input to the fourth module by using 64 convolution kernels (3*3 size), and output a 64-channel image.
• the fifth module includes a convolutional layer, and the convolutional layer is used to process the image input to the fifth module with 4 convolution kernels (3*3 size), and output a four-channel image.
  • the resolution of the four-channel image output by the fifth module is greater than the resolution of the four-channel image input to the first model.
  • the sixth module includes an amplification layer and an addition layer.
  • the magnification layer is used to interpolate and magnify the four-channel image input to the first model by means of bilinear interpolation, and the magnification factor is also K.
  • the addition layer is used to add the output result of the amplification layer and the output result of the fifth module to obtain the above-mentioned image of the target area after the resolution is increased.
  • the seventh module includes: a convolutional layer and a leaky linear rectification layer.
  • This convolutional layer is used to convolve the output of the sixth module with 64 types of convolution kernels, and output 64-channel images. Among them, the size of each of the 64 types of convolution kernels is equal to 3*3.
  • the leaky linear rectification layer is used to activate the image output by the convolutional layer based on the leaky linear rectification function to obtain the activation feature map of the image.
  • the leakage linear rectification function can refer to the leakage linear rectification function in the first module, which is not described in detail in the embodiment of the present application.
• the eighth module includes: a convolutional layer. The convolutional layer is used to convolve the output result of the seventh module with 3 convolution kernels, and output the above-mentioned super-division color image (with 3 channels). The size of each of the 3 convolution kernels is 3*3.
• The above takes the case where the size of the convolution kernel used by each convolution layer in the first model is 3*3 as an example. Optionally, the size of each convolution kernel may not be 3*3 (such as 4*4, etc.); this embodiment of the application does not limit this.
• In step 204, after acquiring the target four-channel image, the object recognition device directly uses the first model to process the target four-channel image to obtain the super-division color image, so the efficiency of obtaining the super-division color image is higher.
  • the object recognition device may train the initial model based on the first training data to obtain the first model.
  • the process of training the initial model to obtain the first model may not be executed by the object recognition device, which is not limited in the embodiment of the present application.
• the raw image of the Bayer mode (which can be any raw image) can be obtained first, and then the obtained raw image can be interpolated according to the binning interpolation method to obtain a small-size degraded image of the raw image (the degraded image can be regarded as a raw image with reduced resolution).
  • the acquired raw image may also be processed to obtain a color image of the raw image.
  • the degraded image of the raw image and the color image of the raw image can be used as the first training data to train the initial model.
  • the degraded image is used as input during training, and the output result of the initial model is compared with the color image, and then the initial model is adjusted according to the comparison result.
  • the initial model can be trained as the first model by repeating this process many times.
• the pixel value of the pixel in the degraded image satisfies the following formulas, where r and c are summed over 0, 1, …, K/2-1:
• R i,j = (4/K²) Σ r Σ c f (i-1)·K+1+2·r, (j-1)·K+1+2·c
• GB i,j+1 = (4/K²) Σ r Σ c f (i-1)·K+1+2·r, (j-1)·K+2+2·c
• GR i+1,j = (4/K²) Σ r Σ c f (i-1)·K+2+2·r, (j-1)·K+1+2·c
• B i+1,j+1 = (4/K²) Σ r Σ c f (i-1)·K+2+2·r, (j-1)·K+2+2·c
• Among them, R i,j is the pixel value of the red pixel at the (i,j) coordinate in the degraded image; GR i+1,j is the pixel value of the green pixel at the (i+1,j) coordinate in the degraded image; GB i,j+1 is the pixel value of the green pixel at the (i,j+1) coordinate in the degraded image; B i+1,j+1 is the pixel value of the blue pixel at the (i+1,j+1) coordinate in the degraded image.
• f p,q represents the pixel value of the pixel at the (p,q) coordinate in the raw image corresponding to the degraded image (the degraded image is obtained by interpolating that raw image); for example, f (i-1)·K+1+2·r, (j-1)·K+1+2·c is the pixel value at the ((i-1)·K+1+2·r, (j-1)·K+1+2·c) coordinate.
  • K is the degradation factor, which is equal to the resolution ratio K of the super-division color image and the target region of the target image mentioned in step 205.
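A sketch of the binning interpolation under one plausible reading of the description: each colour plane of the Bayer raw image is averaged over K/2 × K/2 blocks of same-colour pixels and the planes are re-interleaved. The function name and the exact averaging layout are assumptions:

```python
import numpy as np

def degrade_bayer(raw, k):
    """Bin an RGGB Bayer raw image down by degradation factor k
    (k even, dividing both image dimensions): each colour plane is
    averaged over k/2 x k/2 blocks of same-colour pixels, then the
    planes are re-interleaved into a smaller Bayer image."""
    h, w = raw.shape
    s = k // 2
    out = np.empty((h // k * 2, w // k * 2), dtype=float)
    for dr in (0, 1):
        for dc in (0, 1):
            plane = raw[dr::2, dc::2].astype(float)  # one colour plane
            ph, pw = plane.shape
            out[dr::2, dc::2] = plane.reshape(ph // s, s, pw // s, s).mean(axis=(1, 3))
    return out
```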
• In a related technology, the training data required by the model needs to be obtained based on the degraded image of a color image, while in this application the first training data used to train the first model is obtained based on the degraded image of a raw image of the Bayer mode.
• The process of acquiring the degraded image of a color image is relatively complicated, while the process of acquiring the degraded image of a raw image is simple. Therefore, the efficiency of acquiring the first training data for training the first model in this application is relatively high, and accordingly, the accuracy of the first model obtained by training is also higher.
  • step 204 may include:
• Step 2041b Obtain a target four-channel image, where the image of the target area in the target raw image includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: the combined map of the pixels in the first row and first column, the combined map of the pixels in the first row and second column, the combined map of the pixels in the second row and first column, and the combined map of the pixels in the second row and second column of the plurality of pixel groups.
  • step 2041b reference may be made to the above step 2041a, which is not described in detail in the embodiment of the present application.
  • Step 2042b Input the target four-channel image into the second model to obtain the target four-channel image output by the second model after the resolution has been increased.
• the second model is used to super-divide the four-channel image of a Bayer mode image that has not been super-divided, and output the four-channel image after the resolution is increased. Therefore, after the four-channel image of the image of the target area in the target raw image (a Bayer mode image; the embodiment of the present application takes the target area as the entire target raw image as an example), called the target four-channel image, is input into the second model, the second model can process the target four-channel image and output the target four-channel image with increased resolution.
  • the second model may be a neural network model.
  • FIG. 11 is a schematic diagram of a second model provided by an embodiment of the application.
• the second model may include: a first module, 16 second modules connected in series, k third modules connected in series, a fourth module, a fifth module and a sixth module. For the explanation of these modules, reference may be made to the explanation of these modules in the first model shown in FIG. 8, which is not repeated in the embodiment of the present application.
  • Step 2043b Convert the target four-channel image with the increased resolution into a Bayer mode image.
  • the object recognition device may obtain the four-channel image of the target region in the target image (referred to as the target four-channel image) in step 2041b.
  • the target four-channel image with the increased resolution is converted into a Bayer mode image.
  • the Bayer mode image may include a plurality of pixel groups, and each pixel group includes four kinds of pixels of red, green 1, green 2, and blue.
  • the object recognition device may convert the target four-channel image with the increased resolution into a Bayer mode image, and the pixels at the same position in the multiple pixel groups all come from the same combined image in the target four-channel image.
• the image obtained after the resolution of the target four-channel image in Figure 7 is increased is as shown in Figure 12 (including the combined map of red pixels, the combined map of green 1-color pixels, the combined map of green 2-color pixels, and the combined map of blue pixels).
  • the Bayer mode image obtained by the conversion of the target four-channel image after the resolution is increased may be as shown in FIG. 13.
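The conversion in step 2043b is the inverse of the packing in step 2041b: the four super-divided combined maps are re-interleaved into a single Bayer mosaic. A hedged sketch (function name is illustrative):

```python
import numpy as np

def unpack_4ch_to_bayer(four):
    """Re-interleave four combined maps (red, green 1, green 2, blue)
    of shape (4, h, w) into a (2h, 2w) RGGB Bayer mode image, so that
    pixels at the same position in each 2x2 group all come from the
    same combined map."""
    _, h, w = four.shape
    bayer = np.empty((2 * h, 2 * w), dtype=four.dtype)
    bayer[0::2, 0::2] = four[0]  # red
    bayer[0::2, 1::2] = four[1]  # green 1
    bayer[1::2, 0::2] = four[2]  # green 2
    bayer[1::2, 1::2] = four[3]  # blue
    return bayer
```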
  • Step 2044b Perform a second process on the Bayer mode image to obtain a super-division color image.
• the object recognition apparatus may perform the second processing on the Bayer mode image obtained in step 2043b (that is, the image of the target area after the resolution is increased) based on the parameters of the first processing in step 202, to obtain the super-division color image.
• When performing corresponding processing on the Bayer mode image based on the parameters for processing the target raw image into a color image: if the parameter is a parameter of automatic white balance processing, color correction processing or gamma correction processing, this parameter can be used directly to process the Bayer mode image; if the parameter is a distortion correction processing parameter, then because the Bayer mode image and the target raw image differ in size and resolution, the distortion correction processing parameter needs to be corrected, and the corrected distortion correction processing parameter is used to process the Bayer mode image.
  • the first processing includes the first distortion correction processing
  • the second processing includes the second distortion correction processing
• both the first distortion correction processing and the second distortion correction processing may be correction processing based on a distortion curve (such as Zhang's distortion correction processing).
• the parameters of the first distortion correction processing include the first distortion curve, and the parameters of the second distortion correction processing include the second distortion curve. Let the coordinates of the pixel corresponding to any sampling point (X 0 , Y 0 ) of the second distortion curve in the Bayer mode image be (X Ai , Y Ai ), and the coordinates of the pixel corresponding to the same sampling point (X 0 , Y 0 ) of the first distortion curve in the target raw image be (X Bi , Y Bi ); then:
• X Ai = (X Bi - (W f /2 + 0.5)) * K + (W f *K/2 + 0.5)
• Y Ai = (Y Bi - (H f /2 + 0.5)) * K + (H f *K/2 + 0.5)
• where W f and H f may be understood as the width and height of the target raw image.
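A sketch of the sampling-point correction, assuming W f and H f denote the width and height of the target raw image (that reading is an assumption; the function name is illustrative):

```python
def map_distortion_point(x_b, y_b, w_f, h_f, k):
    """Map a distortion-curve sampling point from target raw image
    coordinates (x_b, y_b) to the super-divided Bayer mode image,
    per the X_Ai / Y_Ai formulas above."""
    x_a = (x_b - (w_f / 2 + 0.5)) * k + (w_f * k / 2 + 0.5)
    y_a = (y_b - (h_f / 2 + 0.5)) * k + (h_f * k / 2 + 0.5)
    return x_a, y_a
```

Note the center-relative form: a point at the image center (w_f/2 + 0.5, h_f/2 + 0.5) maps to the center of the K-times larger image.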
• the parameters for processing the target raw image into a color image may not be suitable for processing the Bayer mode image into a color image: if the Bayer mode image is processed directly with the parameters used to process the target raw image into a color image, problems (such as color cast, image distortion, and contrast changes) may occur in the resulting super-division color image.
• In the embodiment of the present application, the parameters of the distortion correction processing used to process the target raw image into a color image are corrected, thereby avoiding these problems in the obtained super-division color image.
  • corresponding processing is performed on the Bayer mode image to obtain a super-division color image.
  • the corresponding processing on the Bayer mode image may not be performed based on the parameters for processing the target raw image into a color image. The embodiment of the application does not limit this.
  • the first implementation manner is to directly obtain the super-division color image based on the first model, and the first implementation manner obtains the super-division color image faster.
  • the second implementation method is to first super-divide the image of the target area to obtain the Bayer mode image (for example, obtain a target four-channel image with increased resolution through the second model, and then convert the target four-channel image into a Bayer mode image), After that, the Bayer mode image is processed into a super-division color image.
  • the object recognition apparatus may use any one of these two implementation modes to perform step 204, or the object recognition apparatus may perform step 204 based on the user's selection by adopting the implementation mode selected by the user in the two implementation modes.
  • the object recognition device may train the initial model based on the second training data to obtain the second model.
  • the process of training the initial model to obtain the second model may not be executed by the object recognition device, which is not limited in the embodiment of the present application.
• the raw image of the Bayer mode (which can be any raw image) can be obtained first, and then the obtained raw image can be interpolated according to the binning interpolation method to obtain a small-size degraded image of the raw image (the degraded image can be regarded as a raw image with reduced resolution).
• When training the initial model, the degraded image can be input into the initial model, the output result of the initial model can be compared with the raw image corresponding to the degraded image, and the initial model can be adjusted according to the comparison result; by repeating this process many times, the initial model can be trained into the second model.
• The above takes the case where the target area in step 204 is the entire area of the target raw image as an example. Optionally, the target area may also be an area (such as a partial area) containing the target object in the target raw image.
• the target raw image may be the (m+1)th frame image in a raw image video (in the embodiment of the application, the raw image video is a Bayer mode video), m ≥ 1. In this case, on the basis of Fig. 2, the object recognition method may further include:
  • Step 301 Perform a third process on the m-th frame image in the video to obtain a color image of the m-th frame image.
  • Step 302 Determine a third region containing the target object in the color image of the m-th frame of image.
  • Step 303 Determine a target area corresponding to the third area in the (m+1)th frame of image.
  • the object recognition device may determine the target area corresponding to the third area in the (m+1)th frame of image (that is, the target raw image).
  • the content contained in the third area is roughly similar to the corresponding target area.
  • the similarity of the features of these two areas is greater than the similarity threshold (such as 80%, 90%, etc.).
• the target area in the target raw image may be obtained by transforming (for example, scaling) the corresponding third area in the m-th frame image.
• both the target area and the third area are rectangular; the coordinates of the upper left corner of the third area are (X D1 , Y D1 ) and the coordinates of the lower right corner are (X D2 , Y D2 ); the coordinates of the upper left corner of the target area corresponding to the third area are (X C1 , Y C1 ) and the coordinates of the lower right corner are (X C2 , Y C2 ); where:
• X D1 = (X C1 -(X C1 +X C2 )/2)*L+(X C1 +X C2 )/2;
• Y D1 = (Y C1 -(Y C1 +Y C2 )/2)*L+(Y C1 +Y C2 )/2;
• X D2 = (X C2 -(X C1 +X C2 )/2)*L+(X C1 +X C2 )/2;
• Y D2 = (Y C2 -(Y C1 +Y C2 )/2)*L+(Y C1 +Y C2 )/2;
• where L is a scaling coefficient.
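The four coordinate formulas above all scale a rectangle about its own center by a factor L. A minimal sketch (function name is illustrative; with L > 1 the rectangle grows):

```python
def scale_about_center(x_c1, y_c1, x_c2, y_c2, l):
    """Scale a rectangle about its own center by factor l, matching
    the X_D / Y_D formulas above."""
    cx = (x_c1 + x_c2) / 2
    cy = (y_c1 + y_c2) / 2
    return ((x_c1 - cx) * l + cx, (y_c1 - cy) * l + cy,
            (x_c2 - cx) * l + cx, (y_c2 - cy) * l + cy)
```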
  • the m-th frame image is shown in FIG. 15 and the m+1-th frame image is shown in FIG. 16.
• the third area corresponds to the target area in the (m+1)th frame image.
  • Step 304 When the (m+1)th frame of image contains multiple target areas, and there are two target areas that satisfy the replacement condition in the multiple target areas, replace the two target areas with candidate target areas to obtain an updated Multiple target areas.
  • the replacement condition includes: the two target regions at least partially overlap, and the sum of the areas of the two target regions is greater than the area of the candidate target region.
  • the coordinates of the upper left corner of one of the two target areas are (X 11 , Y 11 ), and the coordinates of the lower right corner are (X 12 , Y 12 ); the coordinates of the upper left corner of the other of the two target areas are (X 21 , Y 21 ), and the lower right corner coordinates are (X 22 , Y 22 ); the upper left corner coordinates of the candidate target area are (X M1 , Y M1 ), and the lower right corner coordinates are (X M2 , Y M2 ) ;
  • X M1 is the minimum value of X 11 and X 21 ;
  • Y M1 is the minimum value of Y 11 and Y 21 ;
  • X M2 is the maximum value of X 12 and X 22 ;
  • Y M2 is the maximum value of Y 12 and Y 22
  • the target area in the target raw image is as shown in FIG. 17, and includes: target area 1 and target area 2.
• the coordinates of the upper left corner of the target area 1 are (3, 6) and the coordinates of the lower right corner are (6, 3); the coordinates of the upper left corner of the target area 2 are (0, 9) and the coordinates of the lower right corner are (9, 0);
• the coordinates of the upper left corner of the candidate target area X (not belonging to the multiple target areas determined in step 303) may be (0, 6), and the coordinates of the lower right corner may be (9, 3).
• the target area 1 and the target area 2 at least partially overlap, and the sum (90) of the area (9) of the target area 1 and the area (81) of the target area 2 is greater than the area (27) of the candidate target area X. Therefore, the target area 1 and the target area 2 meet the above replacement condition, and they can be replaced with the candidate target area X, thereby updating the multiple target areas determined in step 303.
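The replacement condition and candidate area can be sketched as follows. The corner formulas and area condition come from the text; the overlap test and the y-up coordinate convention (upper-left has the larger y, as in the worked example) are assumptions:

```python
def candidate_area(r1, r2):
    """Candidate target area from the min/max corner formulas above.
    Rectangles are (x1, y1, x2, y2): upper-left then lower-right,
    with y increasing upward as in the worked example."""
    return (min(r1[0], r2[0]), min(r1[1], r2[1]),
            max(r1[2], r2[2]), max(r1[3], r2[3]))

def area(r):
    return abs(r[2] - r[0]) * abs(r[3] - r[1])

def satisfies_replacement(r1, r2):
    # Overlap test (sketched, y-up convention) plus the stated area
    # condition: summed areas exceed the candidate's area.
    overlap = (min(r1[2], r2[2]) > max(r1[0], r2[0]) and
               min(r1[1], r2[1]) > max(r1[3], r2[3]))
    return overlap and area(r1) + area(r2) > area(candidate_area(r1, r2))
```

Running this on the worked example (target area 1 = (3, 6, 6, 3), target area 2 = (0, 9, 9, 0)) reproduces the candidate X = (0, 6, 9, 3) and the 90 > 27 comparison.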
• the object recognition apparatus may sequentially use each of the multiple target areas determined in step 303 as the reference area, and execute the update process of the multiple target areas.
• the update process may include: the object recognition device sequentially determines whether the reference area and each of the multiple target areas other than the reference area satisfy the replacement condition. Once the reference area and a certain other area meet the replacement condition, the object recognition device can replace the reference area and that other area with the candidate target area corresponding to the two areas.
  • the update process may alternatively include: the object recognition device finds, among all the other areas, a group of other areas each of which meets the replacement condition with the reference area; after that, the object recognition device determines the area difference corresponding to each area in the group (the difference between the sum of the areas of that area and the reference area, and the area of the candidate target area corresponding to those two areas), and replaces the reference area and the area in the group with the largest corresponding area difference with the candidate target area corresponding to those two areas.
  • the object recognition device can use any one of the updated multiple target areas as the target area in step 204.
  • the object recognition device can intercept each target area in the target raw image to obtain an image of each target area, and use the method shown in FIG. 2 to process the image of each target area and recognize the target object.
  • the recognition process can refer to the recognition based on the m+1 frame image.
  • the area containing the target object determined in the process of recognizing the target object based on the (m+1)th frame image can be used as the third region in the frame preceding the (m+2)th frame image, so that there is no need to re-determine the third region in the frame preceding the (m+2)th frame image during the process of recognizing the target object based on the (m+2)th frame image.
  • the above description takes, as an example, the case where multiple target areas are determined in step 303 and then updated in step 304.
  • alternatively, step 304 may be skipped, and step 204 may be executed directly after the multiple target areas are determined in step 303, which is not limited in the embodiment of the present application.
  • FIG. 18 is a block diagram of an object recognition device provided by an embodiment of this application, which can perform the aforementioned object recognition method. As shown in FIG. 18, the object recognition device includes:
  • the obtaining module 1801 is configured to obtain the target raw image generated by the image sensor; the operation performed by the obtaining module 1801 can refer to step 201 in the embodiment shown in FIG. 2, and details are not described in this embodiment of the present application.
  • the super-division module 1802 is used to super-divide the image of the target region in the target raw image to obtain a super-division color image, wherein the target region includes at least a part of the target raw image, and the resolution of the super-division color image is greater than the resolution of the image of the target region; the operation performed by the super-division module 1802 can refer to step 204 in the embodiment shown in FIG. 2, and details are not described in this embodiment of the present application.
  • the recognition module 1803 is used for recognizing the target object based on the super-division color image.
  • for operations performed by the recognition module 1803, reference may be made to step 205 and step 206 in the embodiment shown in FIG. 2, and details are not described in this embodiment of the present application.
  • the target image is a Bayer mode image.
  • the object recognition device further includes:
  • the first processing module (not shown in FIG. 18) is used to perform the first processing on the target raw image to obtain a color image of the target raw image; for the operations performed by the first processing module, refer to step 202 in the embodiment shown in FIG. 2; details are not repeated here.
  • the first determining module (not shown in FIG. 18) is used to determine the first region containing the target object in the color image of the target raw image; for the operations performed by the first determining module, refer to step 203 in the embodiment shown in FIG. 2; details are not repeated here.
  • the recognition module 1803 is configured to: determine the second area in the super-division color image based on the first area; and perform the recognition of the target object based on the image of the second area in the super-division color image.
  • the target raw image is a Bayer mode image
  • the super-division module 1802 is used to: obtain a target four-channel image, where the image of the target area includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in the first row and first column, a combination map of the pixels in the first row and second column, a combination map of the pixels in the second row and first column, and a combination map of the pixels in the second row and second column of the plurality of pixel groups; and input the target four-channel image into a first model to obtain the super-division color image output by the first model, where the first model is used to super-divide the four-channel image of an un-super-divided Bayer mode image and output a color image of the Bayer mode image with increased resolution.
  • the target image is a Bayer mode image
  • the super-division module 1802 includes: a super-division sub-module (not shown in FIG. 18), used to super-divide the image of the target area to obtain a Bayer mode image; and a processing sub-module (not shown in FIG. 18), used to perform the second processing on the Bayer mode image to obtain the super-division color image.
  • the super-division sub-module is used to: obtain the target four-channel image, where the image of the target area includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in the first row and first column, a combination map of the pixels in the first row and second column, a combination map of the pixels in the second row and first column, and a combination map of the pixels in the second row and second column of the plurality of pixel groups; input the target four-channel image into a second model to obtain the target four-channel image with increased resolution output by the second model; and convert the target four-channel image with increased resolution into the Bayer mode image; where the second model is used to super-divide the four-channel image of an un-super-divided Bayer mode image and output a four-channel image with increased resolution.
  • the first process includes a first distortion correction process
  • the second process includes a second distortion correction process
  • the parameters of the first distortion correction process include: a first distortion curve;
  • the parameters of the second distortion correction process include: a second distortion curve;
  • the coordinates of the pixel corresponding to any sampling point (X0, Y0) of the second distortion curve in the super-divided target raw image are (XAi, YAi), and the coordinates of the pixel corresponding to the same sampling point (X0, Y0) of the first distortion curve in the target raw image are (XBi, YBi); where XAi = (XBi - (Wf/2 + 0.5))*K + (Wf*K/2 + 0.5); YAi = (YBi - (Hf/2 + 0.5))*K + (Hf*K/2 + 0.5); Wf represents the width of the target raw image, Hf represents the height of the target raw image, and K represents the resolution ratio of the super-division color image to the image of the target area.
  • the object recognition device further includes: a second determination module (not shown in FIG. 18), configured to determine a target area in the target image containing the target object.
  • the target raw image is the m+1th frame image in the raw image video, m ⁇ 1
  • the object recognition device further includes:
  • the third processing module (not shown in FIG. 18) is used to perform the third processing on the m-th frame image in the raw image video to obtain a color image of the m-th frame image; for this process, refer to step 301 in the embodiment shown in FIG. 14; details are not repeated here.
  • the third determining module (not shown in FIG. 18) is used to determine the third region containing the target object in the color image of the m-th frame image; for this process, refer to step 302 in the embodiment shown in FIG. 14; details are not repeated here.
  • the second determining module is used for determining the target area corresponding to the third area in the target raw image.
  • the object recognition device further includes:
  • a replacement module (not shown in FIG. 18), used to: when the target raw image contains multiple target areas, and two target areas among the multiple target areas meet the replacement condition, replace the two target areas with a candidate target area to obtain the updated multiple target areas; where the replacement condition includes: the two target areas at least partially overlap, and the sum of the areas of the two target areas is greater than the area of the candidate target area; the coordinates of the upper left corner of one of the two target areas are (X11, Y11), and the coordinates of the lower right corner are (X12, Y12); the coordinates of the upper left corner of the other of the two target areas are (X21, Y21), and the coordinates of the lower right corner are (X22, Y22); the coordinates of the upper left corner of the candidate target area are (XM1, YM1), and the coordinates of the lower right corner are (XM2, YM2); XM1 is the minimum of X11 and X21; YM1 is the minimum of Y11 and Y21; XM2 is the maximum of X12 and X22; YM2 is the maximum of Y12 and Y22.
  • the super-division module can perform super-division processing on the image of the target area in the target raw image to enlarge and enhance the detailed information in the image of the target area. Since the super-divided image is a raw image that has not been processed by ISP, the image of the target area after the super-division processing can contain more detailed information. Therefore, the super-division color image (the color image of the image of the target area after the super-division processing) obtained by this embodiment has more detailed information than the color image in the prior art. Therefore, the recognition of the target object based on the super-division color image can improve the accuracy of the recognition of the target object.
  • the embodiment of the present application provides a computer storage medium in which a computer program is stored, and the computer program is used to execute any object recognition method provided in the embodiment of the present application.
  • the embodiments of the present application provide a computer program product containing instructions; when the computer program product runs on the object recognition device, the object recognition device executes any object recognition method provided in the embodiments of the present application.
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented by software, the embodiments may be implemented in whole or in part in the form of a computer program product, and the computer program product includes one or more computer instructions.
  • when the computer program instructions are loaded and executed on the computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (such as coaxial cable, optical fiber, or digital subscriber line) or wireless (such as infrared, radio, or microwave) manner.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or a data center integrated with one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium, or a semiconductor medium (for example, a solid state drive).
  • the terms "first" and "second" are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance.
  • the term “at least one” refers to one or more, and “multiple” refers to two or more, unless expressly defined otherwise.
  • the disclosed device and the like can be implemented in other structural manners.
  • the device embodiments described above are merely illustrative; for example, the division of units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separate, and the components described as units may or may not be physical units, and may be located in one place or distributed among multiple object recognition devices (such as terminal devices). Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

Abstract

The present application relates to the technical field of images, and disclosed therein are an object recognition method and device. The method comprises: acquiring a target raw image generated by an image sensor; carrying out super resolution on a target region in the target raw image and obtaining a super resolution color image, the target region comprising at least part of a region in the target raw image, and the resolution of the super resolution color image being greater than the resolution of an image of the target region; and recognizing a target object on the basis of the super resolution color image. Thus, the accuracy of target object recognition can be increased.

Description

Object recognition method and device
This application claims priority to Chinese Patent Application No. 202010071857.8, titled "Object Recognition Method and Device" and filed on January 21, 2020, the entire content of which is incorporated herein by reference.
Technical field

This application relates to the field of image technology, and in particular to an object recognition method and device.
Background

With the development of image technology, image technology has been applied to many fields of production and daily life, such as the field of object recognition. For example, performing face recognition on images collected by surveillance cameras can assist in finding criminals.

Usually the distance between a surveillance camera and the subject is large, so the resolution of the images collected by the surveillance camera is low, and performing face recognition on such an image is difficult. Therefore, it is usually necessary to first increase the resolution of the color image of the collected image, and then perform face recognition on the resolution-increased image.

However, the face information in the resolution-increased image often differs considerably from the face information in the image collected by the surveillance camera, which results in low accuracy of face recognition performed on the resolution-increased image.
Summary

The present application provides an object recognition method and device, which can solve the problem of low accuracy of face recognition performed on a resolution-increased image. The technical solution is as follows:
In a first aspect, an object recognition method is provided. The method includes: after acquiring a target raw image generated by an image sensor, first super-dividing the image of the target region in the target raw image to obtain a super-division color image, and then recognizing the target object based on the super-division color image. The target region includes at least a part of the target raw image, and the resolution of the super-division color image is greater than the resolution of the image of the target region. In the object recognition method provided in the present application, the image of the target region in the target raw image is super-divided to enlarge and enhance the detailed information in the image of the target region. Since the super-divided image is a raw image that has not been processed by the ISP, the super-divided image of the target region can contain more detailed information, so that the super-division color image obtained in this embodiment (the color image of the super-divided image of the target region) has more detailed information than a color image in the prior art. Therefore, recognizing the target object based on the super-division color image can improve the accuracy of the recognition of the target object.
Optionally, the target raw image is a Bayer mode image. It should be noted that this application takes a Bayer mode raw image as an example of the target raw image; of course, the target raw image may also be a raw image in any mode other than the Bayer mode (such as the red-green-blue-white (RGBW) mode), which is not limited in this application.
Optionally, the method further includes: performing first processing on the target raw image to obtain a color image of the target raw image; and determining a first region containing the target object in the color image of the target raw image. Recognizing the target object based on the super-division color image specifically includes: first determining a second region in the super-division color image based on the first region; and then recognizing the target object based on the image of the second region in the super-division color image. After the first region in the color image of the target raw image is determined, the first region can be mapped to the super-division color image in a certain manner to obtain the second region in the super-division color image. In addition, since the second region is obtained by mapping the first region, the second region also contains the target object provided that the first region contains it. After determining the second region in the super-division color image, the object recognition device may intercept the image of the second region and perform recognition of the target object on that image. Moreover, since the super-division color image contains more detailed information of the target object, the image of the second region in the super-division color image also contains more detailed information of the target object, and the accuracy of recognizing the target object based on the image of the second region is therefore high.
Optionally, the first region and the second region are both rectangular; the coordinates of the upper left corner of the first region are (XA1, YA1), and the coordinates of the lower right corner are (XA2, YA2); the coordinates of the upper left corner of the second region are (XB1, YB1), and the coordinates of the lower right corner are (XB2, YB2); where XB1 = XA1*K; YB1 = YA1*K; XB2 = XB1 + (XA2 - XA1)*K; YB2 = YB1 + (YA2 - YA1)*K, and K represents the resolution ratio of the super-division color image to the image of the target region.
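The coordinate mapping above can be sketched in a few lines of Python (the function name and the rectangle-as-tuple layout are illustrative assumptions, not part of the patent):

```python
def map_first_to_second(first_region, k):
    """Map a rectangular first region (XA1, YA1, XA2, YA2) in the color image
    of the target raw image to the corresponding second region in the
    super-division color image, where k is the resolution ratio K."""
    xa1, ya1, xa2, ya2 = first_region
    xb1 = xa1 * k
    yb1 = ya1 * k
    xb2 = xb1 + (xa2 - xa1) * k
    yb2 = yb1 + (ya2 - ya1) * k
    return (xb1, yb1, xb2, yb2)

# With K = 4, a 20x20 first region at (10, 20) maps to an 80x80 second region.
second = map_first_to_second((10, 20, 30, 40), 4)
```

Note that XB2 = XB1 + (XA2 - XA1)*K is algebraically equal to XA2*K; the patent simply states the mapping in offset form.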
For example, there are multiple implementations of the above process of obtaining a super-division color image, which is not limited in this application. The following explains this process by taking the following two implementations as examples.
Optionally, in the first implementation of the process of obtaining a super-division color image, the target raw image is a Bayer mode image, and super-dividing the image of the target region in the target raw image to obtain the super-division color image includes: obtaining the target four-channel image, and inputting the target four-channel image into a first model to obtain the super-division color image output by the first model; where the first model is used to super-divide the four-channel image of an un-super-divided Bayer mode image and output a color image of the Bayer mode image with increased resolution. The image of the target region includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in the first row and first column, a combination map of the pixels in the first row and second column, a combination map of the pixels in the second row and first column, and a combination map of the pixels in the second row and second column of the plurality of pixel groups.
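The "combination map" construction amounts to de-interleaving each two-row, two-column pixel group of the Bayer image into four quarter-resolution channel maps. A minimal pure-Python sketch (the list-of-rows image representation and the function name are assumptions; the first model that consumes the result is not reproduced here):

```python
def to_four_channels(bayer):
    """Split a Bayer-pattern image (list of rows of pixel values) into the
    four combination maps described above: within each 2x2 pixel group, the
    pixels at positions (row 1, col 1), (row 1, col 2), (row 2, col 1) and
    (row 2, col 2) are gathered into four separate maps."""
    h, w = len(bayer), len(bayer[0])
    assert h % 2 == 0 and w % 2 == 0, "expect whole 2x2 pixel groups"
    return [
        [[bayer[r + dr][c + dc] for c in range(0, w, 2)]
         for r in range(0, h, 2)]
        for dr in (0, 1) for dc in (0, 1)
    ]

# A 2x4 toy image, i.e. two 2x2 pixel groups side by side.
bayer = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
]
ch = to_four_channels(bayer)  # four maps, each 1x2
```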
Optionally, in the second implementation of the process of obtaining a super-division color image, the target raw image is a Bayer mode image, and super-dividing the image of the target region in the target raw image to obtain the super-division color image includes: first super-dividing the image of the target region to obtain a Bayer mode image; and then performing second processing on the Bayer mode image to obtain the super-division color image.
Optionally, super-dividing the image of the target region to obtain the Bayer mode image includes: obtaining the target four-channel image, where the image of the target region includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in the first row and first column, a combination map of the pixels in the first row and second column, a combination map of the pixels in the second row and first column, and a combination map of the pixels in the second row and second column of the plurality of pixel groups; inputting the target four-channel image into a second model to obtain the target four-channel image with increased resolution output by the second model; and converting the target four-channel image with increased resolution into the Bayer mode image; where the second model is used to super-divide the four-channel image of an un-super-divided Bayer mode image and output a four-channel image with increased resolution.
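The final conversion step, turning the resolution-increased four-channel image back into a Bayer mode mosaic, is the inverse of the four-channel de-interleaving described above: the four channel maps are re-interleaved into 2x2 pixel groups. A hedged sketch (names and list layout are assumptions):

```python
def from_four_channels(ch):
    """Interleave four equally sized channel maps (row1/col1, row1/col2,
    row2/col1, row2/col2 of each pixel group) back into a Bayer-pattern
    mosaic of twice the height and width of each map."""
    gh, gw = len(ch[0]), len(ch[0][0])
    out = [[0] * (2 * gw) for _ in range(2 * gh)]
    for idx, (dr, dc) in enumerate([(0, 0), (0, 1), (1, 0), (1, 1)]):
        for r in range(gh):
            for c in range(gw):
                out[2 * r + dr][2 * c + dc] = ch[idx][r][c]
    return out

# Re-interleaving four 1x2 maps yields a 2x4 Bayer mosaic.
ch = [[[1, 5]], [[2, 6]], [[3, 7]], [[4, 8]]]
mosaic = from_four_channels(ch)
```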
It can be seen that, of the above two implementations of the process of obtaining a super-division color image, the first implementation directly obtains the super-division color image based on the first model, and is therefore faster. The second implementation first super-divides the image of the target region to obtain a Bayer mode image (for example, obtains a resolution-increased target four-channel image through the second model and then converts it into a Bayer mode image), and then processes the Bayer mode image into the super-division color image. The object recognition device may perform the process of obtaining the super-division color image in either of these two implementations, or may, based on the user's selection, perform it in the implementation selected by the user.
Optionally, the first processing includes first distortion correction processing, and the second processing includes second distortion correction processing; the parameters of the first distortion correction processing include a first distortion curve, and the parameters of the second distortion correction processing include a second distortion curve; the coordinates of the pixel corresponding to any sampling point (X0, Y0) of the second distortion curve in the super-divided target raw image are (XAi, YAi), and the coordinates of the pixel corresponding to the same sampling point (X0, Y0) of the first distortion curve in the target raw image are (XBi, YBi); where XAi = (XBi - (Wf/2 + 0.5))*K + (Wf*K/2 + 0.5); YAi = (YBi - (Hf/2 + 0.5))*K + (Hf*K/2 + 0.5); Wf represents the width of the target raw image, Hf represents the height of the target raw image, and K represents the resolution ratio of the super-division color image to the image of the target region.
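The correction of the distortion-curve sampling points can be sketched as follows (the function name is an assumption; the formulas are exactly those stated above):

```python
def correct_sample_point(xb, yb, wf, hf, k):
    """Map a pixel coordinate (XBi, YBi) associated with a sampling point of
    the first distortion curve in the target raw image (width wf, height hf)
    to the corresponding coordinate (XAi, YAi) in the super-divided image,
    where k is the resolution ratio K."""
    xa = (xb - (wf / 2 + 0.5)) * k + (wf * k / 2 + 0.5)
    ya = (yb - (hf / 2 + 0.5)) * k + (hf * k / 2 + 0.5)
    return xa, ya
```

Two sanity checks fall out of the formulas: with K = 1 the mapping is the identity, and the image center (Wf/2 + 0.5, Hf/2 + 0.5) always maps to the center of the K-times larger image, so distortion offsets scale about the optical center rather than the image origin.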
Since the resolution of the target raw image differs from that of the Bayer mode image, the parameters for processing the target raw image into a color image may not be suitable for processing the Bayer mode image into a color image; directly processing the Bayer mode image with the parameters for processing the target raw image into a color image may cause problems in the resulting super-division color image (such as color cast, image distortion, and contrast changes). In this application, the parameters of the distortion correction processing used to process the target raw image into a color image can be corrected, thereby avoiding these problems in the obtained super-division color image. In this application, the Bayer mode image is processed correspondingly based on the parameters for processing the target raw image into a color image to obtain the super-division color image. Of course, the corresponding processing of the Bayer mode image may also be performed without being based on those parameters; this application does not limit this.
Further, the above takes the case where the target region is the entire region of the target raw image as an example. Optionally, the target region of the target raw image may be a region (such as a partial region) of the target raw image that contains the target object. Optionally, before super-dividing the image of the target region in the target raw image to obtain the super-division color image, the method further includes: determining the target region containing the target object in the target raw image.
Optionally, the target raw image is the (m+1)th frame image in a raw image video, m ≥ 1. Before determining the target region containing the target object in the target raw image, the method further includes: performing third processing on the m-th frame image in the raw image video to obtain a color image of the m-th frame image; and determining a third region containing the target object in the color image of the m-th frame image. Determining the target region containing the target object in the target raw image includes: determining the target region corresponding to the third region in the target raw image.
Optionally, the target region and the third region are both rectangular; the coordinates of the upper left corner of the third region are (XD1, YD1), and the coordinates of the lower right corner are (XD2, YD2); the coordinates of the upper left corner of the target region corresponding to the third region are (XC1, YC1), and the coordinates of the lower right corner are (XC2, YC2); where XD1 = (XC1 - (XC1 + XC2)/2)*L + (XC1 + XC2)/2; YD1 = (YC1 - (YC1 + YC2)/2)*L + (YC1 + YC2)/2; XD2 = (XC2 - (XC1 + XC2)/2)*L + (XC1 + XC2)/2; YD2 = (YC2 - (YC1 + YC2)/2)*L + (YC1 + YC2)/2; L > 1.
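The four formulas above simply scale one rectangle about its center by the factor L. A hedged Python sketch (helper name and tuple layout assumed):

```python
def expand_about_center(region, l):
    """Apply the formulas above: given a rectangle (X1, Y1, X2, Y2) and a
    scale factor L > 1, return the rectangle scaled about its own center.
    For each coordinate V with center c: V' = (V - c) * L + c."""
    x1, y1, x2, y2 = region
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    return ((x1 - cx) * l + cx, (y1 - cy) * l + cy,
            (x2 - cx) * l + cx, (y2 - cy) * l + cy)
```

Because L > 1, the result strictly contains the input rectangle and shares its center, which is why one frame's detected region can serve as a safe search window in the next frame.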
可选地，在所述对所述目标生图像中目标区域的图像进行超分，得到超分彩色图像之前，所述方法还包括：在所述目标生图像包含多个目标区域，且所述多个目标区域中存在满足替换条件的两个目标区域时，将所述两个目标区域替换为备选目标区域，得到更新的所述多个目标区域；其中，所述替换条件包括：所述两个目标区域至少部分重合，且所述两个目标区域的面积之和大于所述备选目标区域的面积；所述两个目标区域中一个目标区域的左上角坐标为(X_11, Y_11)，且右下角坐标为(X_12, Y_12)；所述两个目标区域中另一个目标区域的左上角坐标为(X_21, Y_21)，且右下角坐标为(X_22, Y_22)；所述备选目标区域的左上角坐标为(X_M1, Y_M1)，且右下角坐标为(X_M2, Y_M2)；X_M1为X_11和X_21的最小值；Y_M1为Y_11和Y_21的最小值；X_M2为X_12和X_22的最大值；Y_M2为Y_12和Y_22的最大值。由于将满足替换条件的多个目标区域替换为备选目标区域，因此减少了目标生图像中目标区域的个数，简化了基于目标区域进行目标物体的识别的过程。Optionally, before the super-division of the image of the target area in the target raw image to obtain a super-division color image, the method further includes: when the target raw image contains multiple target areas and two of those target areas satisfy a replacement condition, replacing the two target areas with a candidate target area to obtain the updated multiple target areas. The replacement condition includes: the two target areas at least partially overlap, and the sum of the areas of the two target areas is greater than the area of the candidate target area. The coordinates of the upper left corner of one of the two target areas are (X_11, Y_11), and the coordinates of its lower right corner are (X_12, Y_12); the coordinates of the upper left corner of the other target area are (X_21, Y_21), and the coordinates of its lower right corner are (X_22, Y_22); the coordinates of the upper left corner of the candidate target area are (X_M1, Y_M1), and the coordinates of its lower right corner are (X_M2, Y_M2); X_M1 is the minimum of X_11 and X_21; Y_M1 is the minimum of Y_11 and Y_21; X_M2 is the maximum of X_12 and X_22; Y_M2 is the maximum of Y_12 and Y_22. Since multiple target areas satisfying the replacement condition are replaced with a candidate target area, the number of target areas in the target raw image is reduced, which simplifies the process of identifying the target object based on the target areas.
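The replacement rule above merges two overlapping axis-aligned rectangles into their joint bounding box whenever their combined area exceeds the bounding box's area. An illustrative sketch only (function names and the (x1, y1, x2, y2) tuple convention are assumptions):

```python
def rect_area(r):
    """Area of an axis-aligned rectangle (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = r
    return (x2 - x1) * (y2 - y1)

def rects_overlap(a, b):
    """True if the two rectangles at least partially overlap
    (touching only at an edge does not count here)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def try_merge(a, b):
    """Return the candidate target area (joint bounding box) if the
    replacement condition holds, otherwise None.

    Condition: the rectangles at least partially overlap, and the sum of
    their areas is greater than the area of the joint bounding box.
    """
    merged = (min(a[0], b[0]), min(a[1], b[1]),
              max(a[2], b[2]), max(a[3], b[3]))
    if rects_overlap(a, b) and rect_area(a) + rect_area(b) > rect_area(merged):
        return merged
    return None
```

Note that the area condition keeps the merge conservative: two regions that barely overlap at a corner of a large span are left separate, since their bounding box would mostly cover empty space.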
第二方面,提供了一种物体识别装置,所述物体识别装置包括:用于执行第一方面提供的物体识别方法的各个模块。In a second aspect, an object recognition device is provided. The object recognition device includes: modules for executing the object recognition method provided in the first aspect.
可选地，所述物体识别装置包括：获取模块，用于获取图像传感器产生的目标生图像，所述目标生图像为拜耳模式的图像；超分模块，用于获取目标四通道图像，其中，所述目标生图像中目标区域的图像包括阵列排布的多个像素组，所述像素组包括两行两列像素，所述目标四通道图像包括：所述多个像素组中第1行第1列像素的组合图、第1行第2列像素的组合图、第2行第1列像素的组合图以及第2行第2列像素的组合图；所述目标区域包括所述目标生图像中的至少部分区域；所述超分模块还用于将所述目标四通道图像输入第一模型，得到所述第一模型输出的超分彩色图像；其中，所述第一模型用于对未超分的拜耳模式的图像的四通道图像进行超分，输出提升分辨率后的所述拜耳模式的图像的彩色图像；所述超分彩色图像的分辨率大于所述目标区域的图像的分辨率；识别模块，用于基于所述超分彩色图像进行目标物体的识别。Optionally, the object recognition device includes: an acquisition module, configured to acquire a target raw image generated by an image sensor, the target raw image being a Bayer-pattern image; a super-division module, configured to acquire a target four-channel image, where the image of the target area in the target raw image includes multiple pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in row 1, column 1 of the multiple pixel groups, a combination map of the pixels in row 1, column 2, a combination map of the pixels in row 2, column 1, and a combination map of the pixels in row 2, column 2; the target area includes at least a partial area of the target raw image. The super-division module is further configured to input the target four-channel image into a first model to obtain a super-division color image output by the first model, where the first model is used to super-divide the four-channel image of a Bayer-pattern image that has not been super-divided and output a resolution-increased color image of the Bayer-pattern image; the resolution of the super-division color image is greater than the resolution of the image of the target area. A recognition module is configured to recognize the target object based on the super-division color image.
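The target four-channel image described above is obtained by gathering, for each 2x2 pixel group, the pixel at the same position into one sub-image. A minimal sketch, assuming the raw image is held in a NumPy array (the function name and channel order are illustrative, not mandated by the claims):

```python
import numpy as np

def bayer_to_four_channel(raw):
    """Split an H x W Bayer-pattern raw image into a 4 x H/2 x W/2 stack.

    Channel i collects the pixel at one fixed position of every 2x2
    group, e.g. R, G1, G2, B for the layout of FIG. 3.
    """
    return np.stack([
        raw[0::2, 0::2],  # row 1, column 1 of each 2x2 group
        raw[0::2, 1::2],  # row 1, column 2
        raw[1::2, 0::2],  # row 2, column 1
        raw[1::2, 1::2],  # row 2, column 2
    ])
```

Each resulting channel is a "combination map" of like-colored pixels, which is what the first model consumes.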
可选地，所述物体识别装置包括：获取模块，用于获取图像传感器产生的目标生图像，所述目标生图像为拜耳模式的图像；超分子模块，用于获取目标四通道图像，其中，所述目标生图像中目标区域的图像包括阵列排布的多个像素组，所述像素组包括两行两列像素，所述目标四通道图像包括：所述多个像素组中第1行第1列像素的组合图、第1行第2列像素的组合图、第2行第1列像素的组合图以及第2行第2列像素的组合图；所述目标区域包括所述目标生图像中的至少部分区域；所述超分子模块，还用于将所述目标四通道图像输入第二模型，得到所述第二模型输出的提升分辨率后的所述目标四通道图像；将提升分辨率后的所述目标四通道图像转换为拜耳模式图像；其中，所述第二模型用于对未超分的拜耳模式的图像的四通道图像进行超分，输出提升分辨率后的四通道图像；处理子模块，用于对所述拜耳模式图像进行第二处理，得到超分彩色图像，所述超分彩色图像的分辨率大于所述目标区域的图像的分辨率；识别模块，用于基于所述超分彩色图像进行目标物体的识别。Optionally, the object recognition device includes: an acquisition module, configured to acquire a target raw image generated by an image sensor, the target raw image being a Bayer-pattern image; a super-division sub-module, configured to acquire a target four-channel image, where the image of the target area in the target raw image includes multiple pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in row 1, column 1 of the multiple pixel groups, a combination map of the pixels in row 1, column 2, a combination map of the pixels in row 2, column 1, and a combination map of the pixels in row 2, column 2; the target area includes at least a partial area of the target raw image. The super-division sub-module is further configured to: input the target four-channel image into a second model to obtain the resolution-increased target four-channel image output by the second model, and convert the resolution-increased target four-channel image into a Bayer-pattern image, where the second model is used to super-divide the four-channel image of a Bayer-pattern image that has not been super-divided and output a resolution-increased four-channel image. A processing sub-module is configured to perform second processing on the Bayer-pattern image to obtain a super-division color image, where the resolution of the super-division color image is greater than the resolution of the image of the target area. A recognition module is configured to recognize the target object based on the super-division color image.
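The conversion back from the resolution-increased four-channel image to a Bayer-pattern image is the inverse of the 2x2 split: each channel is interleaved back into its position within every 2x2 group. A minimal sketch (NumPy array layout and channel order are illustrative assumptions):

```python
import numpy as np

def four_channel_to_bayer(ch):
    """Interleave a 4 x H x W channel stack back into a 2H x 2W
    Bayer-pattern image; the inverse of the 2x2 four-channel split."""
    _, h, w = ch.shape
    raw = np.empty((2 * h, 2 * w), dtype=ch.dtype)
    raw[0::2, 0::2] = ch[0]  # row 1, column 1 of each 2x2 group
    raw[0::2, 1::2] = ch[1]  # row 1, column 2
    raw[1::2, 0::2] = ch[2]  # row 2, column 1
    raw[1::2, 1::2] = ch[3]  # row 2, column 2
    return raw
```

The second processing (demosaicing etc.) is then applied to this reconstructed Bayer-pattern image to obtain the super-division color image.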
可选地，所述物体识别装置包括：获取模块，用于获取图像传感器产生的目标生图像，所述目标生图像为拜耳模式的图像；超分子模块，用于获取目标四通道图像，其中，所述目标生图像中目标区域的图像包括阵列排布的多个像素组，所述像素组包括两行两列像素，所述目标四通道图像包括：所述多个像素组中第1行第1列像素的组合图、第1行第2列像素的组合图、第2行第1列像素的组合图以及第2行第2列像素的组合图；所述目标区域包括所述目标生图像中的至少部分区域；所述超分子模块，还用于将所述目标四通道图像输入第二模型，得到所述第二模型输出的提升分辨率后的所述目标四通道图像；将提升分辨率后的所述目标四通道图像转换为拜耳模式图像；其中，所述第二模型用于对未超分的拜耳模式的图像的四通道图像进行超分，输出提升分辨率后的四通道图像；处理子模块，用于对所述拜耳模式图像进行第二处理，得到超分彩色图像，所述超分彩色图像的分辨率大于所述目标区域的图像的分辨率；第一处理模块，用于对所述目标生图像进行第一处理，得到所述目标生图像的彩色图像；第一确定模块，用于确定所述目标生图像的彩色图像中包含所述目标物体的第一区域；识别模块，用于基于所述第一区域，确定所述超分彩色图像中的第二区域；基于所述超分彩色图像中所述第二区域的图像，进行所述目标物体的识别；Optionally, the object recognition device includes: an acquisition module, configured to acquire a target raw image generated by an image sensor, the target raw image being a Bayer-pattern image; a super-division sub-module, configured to acquire a target four-channel image, where the image of the target area in the target raw image includes multiple pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the pixels in row 1, column 1 of the multiple pixel groups, a combination map of the pixels in row 1, column 2, a combination map of the pixels in row 2, column 1, and a combination map of the pixels in row 2, column 2; the target area includes at least a partial area of the target raw image. The super-division sub-module is further configured to: input the target four-channel image into a second model to obtain the resolution-increased target four-channel image output by the second model, and convert the resolution-increased target four-channel image into a Bayer-pattern image, where the second model is used to super-divide the four-channel image of a Bayer-pattern image that has not been super-divided and output a resolution-increased four-channel image. A processing sub-module is configured to perform second processing on the Bayer-pattern image to obtain a super-division color image, where the resolution of the super-division color image is greater than the resolution of the image of the target area. A first processing module is configured to perform first processing on the target raw image to obtain a color image of the target raw image. A first determining module is configured to determine a first region, in the color image of the target raw image, that contains the target object. A recognition module is configured to determine a second region in the super-division color image based on the first region, and to recognize the target object based on the image of the second region in the super-division color image.
其中，所述第一处理包括第一畸变校正处理，所述第二处理包括第二畸变校正处理；所述第一畸变校正处理的参数包括：第一畸变曲线；所述第二畸变校正处理的参数包括：第二畸变曲线；所述第二畸变曲线中的任一采样点(X_0, Y_0)在所述超分目标生图像中对应的像素的坐标为(X_Ai, Y_Ai)，所述第一畸变曲线中的任一采样点(X_0, Y_0)在所述目标生图像中对应的像素的坐标为(X_Bi, Y_Bi)；其中，X_Ai=(X_Bi-(W_f/2+0.5))*K+(W_f*K/2+0.5)；Y_Ai=(Y_Bi-(H_f/2+0.5))*K+(H_f*K/2+0.5)；W_f表示所述目标生图像的宽，H_f表示所述目标生图像的高，K表示所述超分彩色图像与所述目标区域的图像的分辨率比值。The first processing includes first distortion correction processing, and the second processing includes second distortion correction processing. The parameters of the first distortion correction processing include a first distortion curve, and the parameters of the second distortion correction processing include a second distortion curve. The coordinates of the pixel corresponding to any sampling point (X_0, Y_0) of the second distortion curve in the super-division target raw image are (X_Ai, Y_Ai), and the coordinates of the pixel corresponding to any sampling point (X_0, Y_0) of the first distortion curve in the target raw image are (X_Bi, Y_Bi); where X_Ai=(X_Bi-(W_f/2+0.5))*K+(W_f*K/2+0.5); Y_Ai=(Y_Bi-(H_f/2+0.5))*K+(H_f*K/2+0.5); W_f represents the width of the target raw image, H_f represents the height of the target raw image, and K represents the resolution ratio of the super-division color image to the image of the target area.
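The relation above rescales each distortion-curve sample's pixel coordinates about the image center (W_f/2+0.5, H_f/2+0.5) by the resolution ratio K. A minimal sketch under those definitions (the function name is illustrative):

```python
def map_distortion_point(xb, yb, wf, hf, k):
    """Map a distortion-curve sample's pixel (X_Bi, Y_Bi) in the raw
    image to (X_Ai, Y_Ai) in the super-divided image of ratio k.

    The offset from the raw image center is scaled by k, then shifted
    to the center of the enlarged image."""
    xa = (xb - (wf / 2 + 0.5)) * k + (wf * k / 2 + 0.5)
    ya = (yb - (hf / 2 + 0.5)) * k + (hf * k / 2 + 0.5)
    return xa, ya
```

For instance, with W_f=H_f=8 and K=2, the center pixel (4.5, 4.5) maps to the center of the enlarged image, (8.5, 8.5).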
可选地，所述物体识别装置包括：获取模块，用于获取图像传感器产生的目标生图像；第二确定模块，用于确定所述目标生图像中包含目标物体的目标区域；替换模块，用于在所述目标生图像包含多个目标区域，且所述多个目标区域中存在满足替换条件的两个目标区域时，将所述两个目标区域替换为备选目标区域，得到更新的所述多个目标区域；超分模块，用于对所述目标生图像中目标区域的图像进行超分，得到超分彩色图像，其中，所述目标区域包括所述目标生图像中的至少部分区域，所述超分彩色图像的分辨率大于所述目标区域的图像的分辨率；识别模块，用于基于所述超分彩色图像进行目标物体的识别。其中，所述替换条件包括：所述两个目标区域至少部分重合，且所述两个目标区域的面积之和大于所述备选目标区域的面积；所述两个目标区域中一个目标区域的左上角坐标为(X_11, Y_11)，且右下角坐标为(X_12, Y_12)；所述两个目标区域中另一个目标区域的左上角坐标为(X_21, Y_21)，且右下角坐标为(X_22, Y_22)；所述备选目标区域的左上角坐标为(X_M1, Y_M1)，且右下角坐标为(X_M2, Y_M2)；X_M1为X_11和X_21的最小值；Y_M1为Y_11和Y_21的最小值；X_M2为X_12和X_22的最大值；Y_M2为Y_12和Y_22的最大值。Optionally, the object recognition device includes: an acquisition module, configured to acquire a target raw image generated by an image sensor; a second determining module, configured to determine a target area, in the target raw image, that contains the target object; a replacement module, configured to, when the target raw image contains multiple target areas and two of those target areas satisfy a replacement condition, replace the two target areas with a candidate target area to obtain the updated multiple target areas; a super-division module, configured to super-divide the image of the target area in the target raw image to obtain a super-division color image, where the target area includes at least a partial area of the target raw image, and the resolution of the super-division color image is greater than the resolution of the image of the target area; and a recognition module, configured to recognize the target object based on the super-division color image. The replacement condition includes: the two target areas at least partially overlap, and the sum of the areas of the two target areas is greater than the area of the candidate target area; the coordinates of the upper left corner of one of the two target areas are (X_11, Y_11), and the coordinates of its lower right corner are (X_12, Y_12); the coordinates of the upper left corner of the other target area are (X_21, Y_21), and the coordinates of its lower right corner are (X_22, Y_22); the coordinates of the upper left corner of the candidate target area are (X_M1, Y_M1), and the coordinates of its lower right corner are (X_M2, Y_M2); X_M1 is the minimum of X_11 and X_21; Y_M1 is the minimum of Y_11 and Y_21; X_M2 is the maximum of X_12 and X_22; Y_M2 is the maximum of Y_12 and Y_22.
第三方面,提供了一种物体识别装置,所述物体识别装置包括:处理器和接口,所述处理器用于通过所述接口从图像传感器获取生图像,所述处理器用于运行程序,以使得所述物体识别装置执行如第一方面所述的物体识别方法。In a third aspect, an object recognition device is provided. The object recognition device includes a processor and an interface. The processor is used to obtain a raw image from an image sensor through the interface, and the processor is used to run a program to make The object recognition device executes the object recognition method described in the first aspect.
第四方面,提供了一种计算机存储介质,所述存储介质内存储有计算机程序,所述计算机程序用于执行第一方面所述的物体识别方法。In a fourth aspect, a computer storage medium is provided, and a computer program is stored in the storage medium, and the computer program is used to execute the object recognition method described in the first aspect.
第五方面,提供了一种计算机程序产品,当计算机程序产品在物体识别装置上运行时,使得物体识别装置执行如第一方面所述的物体识别方法。In a fifth aspect, a computer program product is provided. When the computer program product runs on an object recognition device, the object recognition device executes the object recognition method as described in the first aspect.
上述第二方面至第五方面中任一方面的有益效果可以参考上述第一方面的有益效果,本申请在此不做赘述。For the beneficial effects of any one of the second aspect to the fifth aspect described above, reference may be made to the beneficial effects of the first aspect described above, which will not be repeated in this application.
附图说明Description of the drawings
图1为本申请实施例提供的一种物体识别装置的结构示意图;FIG. 1 is a schematic structural diagram of an object recognition device provided by an embodiment of this application;
图2为本申请实施例提供的一种物体识别方法的流程图;FIG. 2 is a flowchart of an object recognition method provided by an embodiment of the application;
图3为本申请实施例提供的一种目标生图像的示意图;FIG. 3 is a schematic diagram of a target image provided by an embodiment of the application;
图4为本申请实施例提供的一种目标生图像的彩色图像的示意图;4 is a schematic diagram of a color image of a target image provided by an embodiment of the application;
图5为本申请实施例提供的一种超分彩色图像的示意图;FIG. 5 is a schematic diagram of a super-division color image provided by an embodiment of the application;
图6为本申请实施例提供的一种获取超分彩色图像的方法流程图;FIG. 6 is a flowchart of a method for obtaining a super-division color image according to an embodiment of the application;
图7为本申请实施例提供的一种目标四通道图像的示意图;FIG. 7 is a schematic diagram of a target four-channel image provided by an embodiment of this application;
图8为本申请实施例提供的一种第一模型的示意图;FIG. 8 is a schematic diagram of a first model provided by an embodiment of this application;
图9为像素抽牌层将4个第一组合图拼接为1个第二组合图的示意图；FIG. 9 is a schematic diagram of a pixel shuffle layer splicing 4 first combination maps into 1 second combination map;
图10为本申请实施例提供的另一种获取超分彩色图像的方法流程图；FIG. 10 is a flowchart of another method for obtaining a super-division color image according to an embodiment of the application;
图11为本申请实施例提供的一种第二模型的示意图;FIG. 11 is a schematic diagram of a second model provided by an embodiment of this application;
图12为本申请实施例提供的一种对图7中的目标四通道图像进行分辨率提升后得到的图像的示意图;FIG. 12 is a schematic diagram of an image obtained after the resolution of the target four-channel image in FIG. 7 is increased according to an embodiment of the application;
图13为本申请实施例提供的一种拜耳模式图像的示意图;FIG. 13 is a schematic diagram of a Bayer mode image provided by an embodiment of the application;
图14为本申请实施例提供的一种物体识别方法的补充流程图;FIG. 14 is a supplementary flowchart of an object recognition method provided by an embodiment of the application;
图15为本申请实施例提供的一种第m帧图像的示意图;FIG. 15 is a schematic diagram of an m-th frame image provided by an embodiment of this application;
图16为本申请实施例提供的一种第m+1帧图像的示意图;FIG. 16 is a schematic diagram of an m+1-th frame image provided by an embodiment of this application;
图17为本申请实施例提供的一种目标生图像中的目标区域和备选目标区域的示意图;FIG. 17 is a schematic diagram of a target area and candidate target areas in a target image provided by an embodiment of the application;
图18为本申请实施例提供的一种物体识别装置的框图。FIG. 18 is a block diagram of an object recognition device provided by an embodiment of the application.
具体实施方式Detailed ways
为使本申请的原理、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。In order to make the principles, technical solutions, and advantages of the present application clearer, the implementation manners of the present application will be described in further detail below in conjunction with the accompanying drawings.
基于图像技术能够实现对图像中目标物体的识别。示例地,该目标物体可以为人脸、手、衣服、桌子、板凳等任意物体。Based on image technology, the target object in the image can be recognized. For example, the target object may be any object such as a human face, hand, clothes, table, and bench.
然而，当图像中目标物体的尺寸较小时，该图像就无法包含目标物体的细节信息，基于该图像进行目标物体的识别的难度较高，且准确率较低。因此，通常需要对该图像进行分辨率提升的处理，比如，该图像往往是彩色图像（相较于生图像而言，彩色图像更适合人眼进行观察），可以直接对该彩色图像进行分辨率提升处理，以拉升该彩色图像的尺寸，并尽量增强该图像中目标物体的细节信息。但是，彩色图像往往是通过对拍摄装置（如摄像机或相机）直接采集到的生图像（也称RAW图像）进行一系列图像信号处理（image signal processing，ISP）（例如：图像去马赛克、图像压缩、图像去噪）后得到的图像。在该处理过程中生图像中目标物体的一些细节信息会被消除，进而导致彩色图像本身包含的目标物体的细节信息变少，即使对彩色图像进行分辨率提升处理，也较难复原彩色图像中目标物体的细节信息。However, when the size of the target object in an image is small, the image cannot contain the detail information of the target object; recognizing the target object based on such an image is difficult, and the accuracy is low. Therefore, resolution enhancement processing usually needs to be performed on the image. For example, the image is often a color image (compared with a raw image, a color image is more suitable for human observation), and resolution enhancement processing can be performed directly on the color image to enlarge the color image and enhance the detail information of the target object in the image as much as possible. However, a color image is usually obtained by performing a series of image signal processing (ISP) operations (for example, image demosaicing, image compression, and image denoising) on a raw image (also called a RAW image) directly collected by a shooting device (such as a video camera or a camera). During this processing, some detail information of the target object in the raw image is eliminated, so the color image itself contains less detail information of the target object; even if resolution enhancement processing is performed on the color image, it is difficult to restore the detail information of the target object in the color image.
本申请提供的物体识别方法实施例，通过对生图像进行分辨率提升处理（也就是“超分”处理），以放大增强生图像中的目标物体的细节信息。由于被超分的图像是未经过ISP处理的生图像，因此分辨率提升后的生图像能够包含目标物体的较多细节信息。超分之后再获取分辨率提升后的生图像的彩色图像，使得本实施例得到的彩色图像相较于现有技术拥有更多的细节信息。因此，基于该彩色图像进行目标物体的识别，能够提升目标物体的识别的准确度。本申请实施例中，在有需要的情况下，可以在得到彩色图像之后，进一步对彩色图像执行ISP操作。In the embodiments of the object recognition method provided in this application, resolution enhancement processing (that is, "super-resolution" processing) is performed on the raw image to enlarge and enhance the detail information of the target object in the raw image. Since the super-resolved image is a raw image that has not undergone ISP processing, the resolution-increased raw image can contain more detail information of the target object. After super-resolution, the color image of the resolution-increased raw image is obtained, so that the color image obtained in this embodiment has more detail information than the prior art. Therefore, recognizing the target object based on this color image can improve the accuracy of target object recognition. In the embodiments of this application, if necessary, an ISP operation may be further performed on the color image after the color image is obtained.
需要说明的是,上述生图像是拍摄装置直接通过自身的图像传感器直接采集到的未经图像信号处理的图像(也称原始光学图像)。该生图像为拍摄装置内部最原始的图像信息,因此保留了拍摄装置能够获取的最丰富的图像高频细节。而对生图像进行图像信号处理得到的彩色图像的高频细节已经被减弱,甚至消失,最终导致得到的彩色图像无法保留目标物体的细节内容。It should be noted that the above-mentioned raw image is an image (also referred to as an original optical image) that is directly collected by the imaging device through its own image sensor without image signal processing. The raw image is the most primitive image information inside the shooting device, so it retains the richest high-frequency details of the image that the shooting device can obtain. However, the high-frequency details of the color image obtained by image signal processing on the raw image have been weakened or even disappeared, resulting in the resulting color image being unable to retain the details of the target object.
本申请实施例提供了一种物体识别方法,该物体识别方法可以用于物体识别装置。示例地,如图1所示,该物体识别装置包括:处理器101和接口106,该处理器101与接口106 连接,接口106与图像传感器105连接,图像传感器105用于生成生图像,处理器101用于通过接口106从图像传感器105获取生图像。处理器101用于运行程序,以使得物体识别装置执行本申请实施例提供的物体识别方法。The embodiment of the present application provides an object recognition method, which can be used in an object recognition device. For example, as shown in FIG. 1, the object recognition device includes: a processor 101 and an interface 106. The processor 101 is connected to the interface 106, and the interface 106 is connected to an image sensor 105. The image sensor 105 is used to generate a raw image. 101 is used to obtain a raw image from the image sensor 105 through the interface 106. The processor 101 is configured to run a program, so that the object recognition apparatus executes the object recognition method provided in the embodiment of the present application.
可选地,请继续参考图1,该物体识别装置还可以包括:通信组件102,存储器103,和至少一个通信总线104,处理器101、接口106、通信组件102和存储器103之间可以通过该通信总线104连接。处理器101用于执行的程序可以为存储器103中的程序1031。存储器103可能包含高速随机存取存储器(RAM:Random Access Memory),也可能还包括非不稳定的存储器(non-volatile memory),例如至少一个磁盘存储器。通过通信组件102(可以是有线或者无线)实现该物体识别装置与至少一个其他网元之间的通信连接,可以使用互联网、广域网、本地网或城域网等。需要说明的是,本申请实施例中以处理器和存储器相互独立为例,当然,存储器103也可以集成在处理器中,本申请实施例对此不作限定。Optionally, please continue to refer to FIG. 1. The object recognition device may also include: a communication component 102, a memory 103, and at least one communication bus 104. The processor 101, the interface 106, the communication component 102, and the memory 103 can pass through the The communication bus 104 is connected. The program used by the processor 101 to execute may be the program 1031 in the memory 103. The memory 103 may include a high-speed random access memory (RAM: Random Access Memory), and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory. The communication connection between the object identification device and at least one other network element is realized through the communication component 102 (which may be wired or wireless), and the Internet, a wide area network, a local network, or a metropolitan area network may be used. It should be noted that, in the embodiment of the present application, the processor and the memory are independent of each other as an example. Of course, the memory 103 may also be integrated in the processor, which is not limited in the embodiment of the present application.
示例地,图2为本申请实施例提供的一种物体识别方法的流程图,如图2所示,该物体识别方法可以包括:For example, FIG. 2 is a flowchart of an object recognition method provided by an embodiment of the application. As shown in FIG. 2, the object recognition method may include:
步骤201、获取拜耳模式的目标生图像。Step 201: Acquire a target image of the Bayer model.
如图1所示,该物体识别装置可以获取该图像传感器105生成的拜耳(Bayer)模式的目标生图像。可选地,图1中以物体识别装置包括图像传感器为例,可选地,该物体识别装置也可以不包括图像传感器,而是与图像传感器连接,并能够获取到图像传感器生成的拜耳模式的目标生图像。As shown in FIG. 1, the object recognition device can obtain the target raw image in the Bayer mode generated by the image sensor 105. Optionally, in FIG. 1, the object recognition device includes an image sensor as an example. Optionally, the object recognition device may not include an image sensor, but is connected to the image sensor, and can obtain the Bayer pattern generated by the image sensor. Target student image.
示例地，如图3所示，目标生图像可以包括阵列排布的多个像素，并且，该多个像素中每2行2列的像素组团形成一个像素组，且每个像素组中的四个像素的颜色分别为红、绿1、绿2和蓝。其中，绿1和绿2均表示绿色，但绿1和绿2分别对应像素组中的两个像素。如图3所示，该像素组中第1行第1列的像素的颜色为红，第1行第2列的像素的颜色为绿1，第2行第1列的像素的颜色为绿2，第2行第2列的像素的颜色为蓝。For example, as shown in FIG. 3, the target raw image may include multiple pixels arranged in an array, and every 2 rows and 2 columns of these pixels form a pixel group; the colors of the four pixels in each pixel group are red, green 1, green 2, and blue, respectively. Here, green 1 and green 2 both represent green, but correspond to two different pixels in the pixel group. As shown in FIG. 3, the color of the pixel in row 1, column 1 of the pixel group is red, the color of the pixel in row 1, column 2 is green 1, the color of the pixel in row 2, column 1 is green 2, and the color of the pixel in row 2, column 2 is blue.
当然,这四个像素的分布位置还可以与图3所示的分布位置不同。比如,第1行第1列的像素的颜色为绿1,第1行第2列的像素的颜色为红,第2行第1列的像素的颜色为蓝,第2行第2列的像素的颜色为绿2;或者,第1行第1列的像素的颜色为绿1,第1行第2列的像素的颜色为蓝,第2行第1列的像素的颜色为红,第2行第2列的像素的颜色为绿2。本申请实施例对此不作限定。另外,本申请实施例也不对目标生图像中像素的个数进行限定。Of course, the distribution positions of the four pixels can also be different from the distribution positions shown in FIG. 3. For example, the color of the pixel in the first row and the first column is green 1, the color of the pixel in the first row and second column is red, the color of the pixel in the second row and first column is blue, and the color of the pixel in the second row and second column is blue. The color of the pixel is green 2; or, the color of the pixel in the first row and the first column is green 1, the color of the pixel in the first row and the second column is blue, the color of the pixel in the second row and the first column is red, and the color of the pixel in the second row and first column is red. The color of the pixel in the second column of the row is green 2. The embodiment of the application does not limit this. In addition, the embodiments of the present application do not limit the number of pixels in the target image.
需要说明的是,本申请实施例中以目标生图像为拜耳模式的生图像为例,当然,该目标生图像还可以为除拜耳模式之外的任一种模式(如红绿蓝亮度(RGBW)模式)的生图像,本申请实施例对此不作限定。It should be noted that in the embodiments of the present application, the target image is a Bayer mode image as an example. Of course, the target image can also be any mode other than the Bayer mode (such as red, green, and blue brightness (RGBW) ) Mode), which is not limited in the embodiment of the present application.
步骤202、对目标生图像进行第一处理,得到目标生图像的彩色图像。Step 202: Perform first processing on the target raw image to obtain a color image of the target raw image.
在步骤202中,物体识别装置需要将该目标生图像处理为彩色图像。其中,彩色图像可以为任一种彩色模式的图像,如红绿蓝(RGB)格式、明亮度色度(YUV)格式等。In step 202, the object recognition device needs to process the target raw image into a color image. Among them, the color image can be an image in any color mode, such as a red-green-blue (RGB) format, a luminance and chrominance (YUV) format, and so on.
示例地,该第一处理可以包括:自动白平衡处理、颜色校正处理、伽马校正处理以及畸变校正处理中的至少一种处理,当然该第一处理还可以包括除这四种处理之外的其他处理。本申请实施例中以第一处理包括自动白平衡处理、颜色校正处理、伽马校正处理和畸变校正处理为例,并且,本申请实施例并不对这几种处理的先后顺序进行限定。For example, the first processing may include at least one of automatic white balance processing, color correction processing, gamma correction processing, and distortion correction processing. Of course, the first processing may also include processing other than these four types of processing. Other processing. In the embodiment of the present application, the first processing including automatic white balance processing, color correction processing, gamma correction processing, and distortion correction processing is taken as an example, and the embodiment of the present application does not limit the sequence of these types of processing.
彩色图像例如联合图像专家组（Joint Photographic Experts Group，JPEG）格式的图像、高效率图档格式（High Efficiency Image File Format，HEIF）的图像等，JPEG格式也称JPG格式。Color images include, for example, images in the Joint Photographic Experts Group (JPEG) format and images in the High Efficiency Image File Format (HEIF); the JPEG format is also called the JPG format.
步骤203、确定目标生图像的彩色图像中包含目标物体的第一区域。Step 203: Determine the first region in the color image of the target image that contains the target object.
在得到目标生图像的彩色图像后,物体识别装置可以对该彩色图像进行目标物体的识别,以得到该彩色图像中包含目标物体的第一区域。当然,该彩色图像中可以包含一个或多个第一区域,本申请实施例对此不做限定,在步骤203中,物体识别装置需要识别出该彩色图像中的每个第一区域。After obtaining the color image of the target raw image, the object recognition device can recognize the target object on the color image to obtain the first region containing the target object in the color image. Of course, the color image may include one or more first regions, which is not limited in the embodiment of the present application. In step 203, the object recognition device needs to recognize each first region in the color image.
步骤204、对目标生图像中目标区域的图像进行超分,得到超分彩色图像,其中,目标区域包括目标生图像中的至少部分区域,超分彩色图像的分辨率大于目标生图像中目标区域的图像的分辨率。Step 204: Perform super-division on the image of the target region in the target image to obtain a super-division color image, where the target region includes at least a part of the region in the target image, and the resolution of the super-division color image is greater than that of the target region in the target image The resolution of the image.
物体识别装置在对目标生图像中目标区域的图像进行超分，得到超分彩色图像的过程中，可以首先对目标生图像的目标区域的图像（也是一种生图像）进行分辨率的提升，得到提升分辨率后的上述目标区域的图像，之后，再对提升分辨率后的上述目标区域的图像进行处理，得到该超分彩色图像。由于在步骤204中直接对目标生图像中目标区域的图像进行分辨率的提升处理，因此提升分辨率后的目标区域的图像中可以包含目标物体的较多细节信息。之后得到的该超分彩色图像也能包含目标物体较多的细节信息。In the process of super-dividing the image of the target area in the target raw image to obtain a super-division color image, the object recognition device may first increase the resolution of the image of the target area of the target raw image (which is itself a raw image) to obtain a resolution-increased image of the target area, and then process the resolution-increased image of the target area to obtain the super-division color image. Since the resolution of the image of the target area in the target raw image is increased directly in step 204, the resolution-increased image of the target area can contain more detail information of the target object, and the super-division color image obtained subsequently can also contain more detail information of the target object.
步骤205、基于第一区域,确定超分彩色图像中的第二区域。Step 205: Determine a second area in the super-division color image based on the first area.
在步骤203中确定目标生图像的彩色图像中的第一区域后,可以采用一定的方式将该第一区域映射到超分彩色图像中,从而得到该超分彩色图像中的第二区域。并且,由于第二区域是第一区域映射得到的区域,因此,在第一区域包含目标物体的前提下,第二区域也包含目标物体。需要说明的是,若步骤203中确定出了多个第一区域,则在步骤205中需要将多个第一区域中的每个第一区域均映射至超分彩色图像中,从而得到多个第一区域中每个第一区域对应的第二区域。After the first area in the color image of the target image is determined in step 203, the first area can be mapped to the super-division color image in a certain manner, so as to obtain the second area in the super-division color image. In addition, since the second area is an area obtained by mapping the first area, the second area also includes the target object on the premise that the first area contains the target object. It should be noted that if multiple first regions are determined in step 203, each of the multiple first regions needs to be mapped to the super-division color image in step 205, so as to obtain multiple Each first area in the first area corresponds to a second area.
示例地，对于每个第一区域以及该第一区域对应的第二区域，该第一区域和该第二区域均可以呈矩形。其中，若第一区域的左上角坐标为(X_A1, Y_A1)，且右下角坐标为(X_A2, Y_A2)，则该第一区域对应的第二区域的左上角坐标为(X_B1, Y_B1)，且右下角坐标为(X_B2, Y_B2)；其中，X_B1=X_A1*K；Y_B1=Y_A1*K；X_B2=X_B1+(X_A2-X_A1)*K；Y_B2=Y_B1+(Y_A2-Y_A1)*K；K表示超分彩色图像与目标生图像的分辨率比值（可以称为分辨率提升率），K>1。For example, each first region and its corresponding second region may both be rectangular. If the coordinates of the upper left corner of the first region are (X_A1, Y_A1) and the coordinates of the lower right corner are (X_A2, Y_A2), the coordinates of the upper left corner of the corresponding second region are (X_B1, Y_B1) and the coordinates of the lower right corner are (X_B2, Y_B2), where X_B1=X_A1*K; Y_B1=Y_A1*K; X_B2=X_B1+(X_A2-X_A1)*K; Y_B2=Y_B1+(Y_A2-Y_A1)*K; K represents the resolution ratio of the super-division color image to the target raw image (which may be called the resolution improvement rate), K>1.
例如，目标生图像的彩色图像如图4所示，超分彩色图像如图5所示。假设K=2，如图4所示，第一区域的左上角坐标为(3, 3)，且右下角坐标为(6, 6)。则如图5所示，该第二区域的左上角坐标为(X_A1*K, Y_A1*K)=(3*2, 3*2)=(6, 6)，第二区域的右下角坐标为(X_B1+(X_A2-X_A1)*K, Y_B1+(Y_A2-Y_A1)*K)=(6+(6-3)*2, 6+(6-3)*2)=(12, 12)。可以看出，第一区域的长和宽均为3，第二区域的长和宽均为6，第二区域比第一区域在长和宽方面均放大了两倍。For example, the color image of the target raw image is shown in FIG. 4, and the super-division color image is shown in FIG. 5. Assuming K=2, as shown in FIG. 4, the coordinates of the upper left corner of the first region are (3, 3), and the coordinates of the lower right corner are (6, 6). Then, as shown in FIG. 5, the coordinates of the upper left corner of the second region are (X_A1*K, Y_A1*K)=(3*2, 3*2)=(6, 6), and the coordinates of the lower right corner of the second region are (X_B1+(X_A2-X_A1)*K, Y_B1+(Y_A2-Y_A1)*K)=(6+(6-3)*2, 6+(6-3)*2)=(12, 12). It can be seen that the length and width of the first region are both 3, the length and width of the second region are both 6, and the second region is enlarged twofold relative to the first region in both length and width.
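The mapping from the first region to the second region simply scales the first region's corners by the resolution ratio K. As a sketch (the function name and tuple order are illustrative):

```python
def map_region(x1, y1, x2, y2, k):
    """Map a first-region rectangle in the color image of the target raw
    image to the second region in the super-division color image, where
    k > 1 is the resolution ratio.

    Implements X_B1 = X_A1*K, Y_B1 = Y_A1*K,
    X_B2 = X_B1 + (X_A2 - X_A1)*K, Y_B2 = Y_B1 + (Y_A2 - Y_A1)*K.
    """
    nx1 = x1 * k
    ny1 = y1 * k
    nx2 = nx1 + (x2 - x1) * k
    ny2 = ny1 + (y2 - y1) * k
    return nx1, ny1, nx2, ny2
```

With K=2, map_region(3, 3, 6, 6, 2) returns (6, 6, 12, 12), matching the worked example above.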
Step 206: Recognize the target object based on the second area.

For example: perform license plate recognition or face recognition based on the second area. Here, recognizing the target object based on the second area may mean recognizing the target object within the range of the second area.

After determining the second area in the super-division color image, the object recognition device may crop the image of the second area and perform target object recognition on it. Moreover, since the super-division color image contains more detailed information about the target object, the image of the second area in the super-division color image also contains more of that detail, so recognition of the target object based on the image of the second area is more accurate.

In addition, since the second area is the area determined by the object recognition device to contain the target object, the target object can be recognized based on the second area alone, without performing recognition on the areas of the super-division color image other than the second area, which simplifies the recognition of the target object.
The above step 204 can be implemented in multiple ways, which is not limited in the embodiments of this application. The following explains step 204 by taking the following two implementations as examples. In both implementations, the target area of the target raw image in step 204 is taken to be the entire area of the target raw image.

In the first implementation of step 204, as shown in Figure 6, step 204 may include:
Step 2041a: Obtain a target four-channel image, where the image of the target area in the target raw image includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: the combined map of the pixels in row 1, column 1 of the pixel groups, the combined map of the pixels in row 1, column 2, the combined map of the pixels in row 2, column 1, and the combined map of the pixels in row 2, column 2.

For example, as described in step 201, the image of the target area in the target raw image may include multiple pixel groups, and each pixel group includes four pixels: red, green 1, green 2, and blue. In step 2041a, the object recognition device may convert the image of the target area in the target raw image into a target four-channel image, which includes four combined maps, each of which includes the pixels at the same position in the pixel groups.

For example, for the target raw image shown in Figure 3, if the target area is the entire target raw image, the target four-channel image obtained by converting the image of the target area may include the four combined maps shown in Figure 7: the combined map of red pixels, the combined map of green-1 pixels, the combined map of green-2 pixels, and the combined map of blue pixels. Among the pixels in Figure 3, if both the row number and the column number of a pixel are odd, the pixel belongs to the combined map of red pixels in the target four-channel image; if the row number is odd and the column number is even, the pixel belongs to the combined map of green-1 pixels; if the row number is even and the column number is odd, the pixel belongs to the combined map of green-2 pixels; and if both the row number and the column number are even, the pixel belongs to the combined map of blue pixels.
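This packing of a Bayer-pattern raw image into four channel maps can be sketched as follows. This is a minimal illustration assuming an RGGB layout and even image dimensions; the function name is hypothetical:

```python
import numpy as np

def raw_to_four_channel(raw):
    """Split a Bayer raw image (H x W, H and W even) into the four
    combined maps of step 2041a: one map per position in each 2x2 group."""
    return np.stack([
        raw[0::2, 0::2],  # row 1, col 1 of each group (red)
        raw[0::2, 1::2],  # row 1, col 2 (green 1)
        raw[1::2, 0::2],  # row 2, col 1 (green 2)
        raw[1::2, 1::2],  # row 2, col 2 (blue)
    ])

raw = np.arange(16).reshape(4, 4)  # toy 4x4 raw image
channels = raw_to_four_channel(raw)
print(channels.shape)  # (4, 2, 2)
```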
Step 2042a: Input the target four-channel image into the first model to obtain the super-division color image output by the first model.

The first model is used to super-divide the four-channel image of a Bayer-mode image that has not been super-divided, and to output the color image of that Bayer-mode image with increased resolution. Therefore, after the four-channel image (called the target four-channel image) of the image of the target area (a Bayer-mode image; here the target area is the entire target raw image) is input into the first model, the first model can process the target four-channel image and output the super-division color image.

Optionally, the first model may be a neural network model.

For example, Figure 8 is a schematic diagram of a first model provided by an embodiment of this application. As shown in Figure 8, the first model may include: a first module, 16 second modules connected in series, k third modules connected in series, a fourth module, a fifth module, a sixth module, a seventh module, and an eighth module.

(1) The first module may include a convolutional layer and a leaky linear rectification layer. The convolutional layer applies 64 convolution kernels (each of size 3*3) to every image input into the first model (such as the target four-channel image) and outputs a 64-channel image. The leaky linear rectification layer applies a leaky linear rectification function to activate the image output by the convolutional layer (such as the 64-channel image), obtaining the activation feature map of that image.
The leaky linear rectification function may be:

y_{i,j} = x_{i,j}, if x_{i,j} ≥ 0; y_{i,j} = x_{i,j}/a, if x_{i,j} < 0

where x_{i,j} is the pixel value (such as the pixel intensity) of the pixel in row i, column j of the image input into the leaky linear rectification layer, y_{i,j} is the pixel value of the pixel in row i, column j of the activation feature map of that image, and a is a weakening parameter, 1 ≤ a ≤ 2. It can be seen that after activation, pixels whose values are greater than or equal to zero keep their values, while pixels whose values are less than zero have their values weakened. i ≥ 1, j ≥ 1.
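A minimal sketch of this activation follows. The negative-branch slope 1/a is an assumption, since the formula images are not reproduced in the text and only "weakening by parameter a" is stated:

```python
import numpy as np

def leaky_rectify(x, a=1.5):
    """Leaky linear rectification: values >= 0 pass through unchanged;
    negative values are weakened by the factor a (assumed slope 1/a,
    with 1 <= a <= 2 as stated in the text)."""
    return np.where(x >= 0, x, x / a)

print(leaky_rectify(np.array([-3.0, 0.0, 2.0]), a=1.5))  # [-2.  0.  2.]
```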
(2) The second module may include: a first convolutional layer, a linear rectification layer, a second convolutional layer, and an addition layer. The first convolutional layer applies 64 convolution kernels (of size 3*3) to the image input into the second module and outputs a 64-channel image; the linear rectification layer applies a linear rectification function to activate the 64-channel image output by the first convolutional layer and outputs the activated 64-channel image; the second convolutional layer applies 64 convolution kernels (of size 3*3) to the 64-channel image output by the linear rectification layer and outputs a 64-channel image; the addition layer adds the 64-channel image output by the second convolutional layer to the image input into the second module and outputs the feature map of the 64-channel image.
The linear rectification function is as follows:

y_{i,j} = x_{i,j}, if x_{i,j} ≥ 0; y_{i,j} = 0, if x_{i,j} < 0

where x_{i,j} is the pixel value of the pixel in row i, column j of the image output by the first convolutional layer, and y_{i,j} is the pixel value of the pixel in row i, column j of the activation feature map of that image. It can be seen that after activation, pixels whose values are greater than or equal to zero keep their values, while pixels whose values are less than zero have their values reduced to 0. i ≥ 1, j ≥ 1.
(3) The third module may include: a convolutional layer, a first leaky linear rectification layer, a pixel shuffle layer, and a second leaky linear rectification layer. The convolutional layer applies 64 convolution kernels (of size 3*3) to the image input into the third module and outputs a 64-channel image; the first leaky linear rectification layer applies a leaky linear rectification function to activate the 64-channel image output by the convolutional layer and outputs the activated 64-channel image; the pixel shuffle layer divides the 64-channel image output by the first leaky linear rectification layer into 4 parts (each part comprising a 16-channel image) and splices these 4 parts into a 16-channel image; the second leaky linear rectification layer applies a leaky linear rectification function to activate the 16-channel image output by the pixel shuffle layer and outputs the activated 16-channel image. For the leaky linear rectification function, refer to the leaky linear rectification function in the first module, which is not repeated here.

The size of each combined map (called a first combined map) in the 64-channel image (comprising 64 combined maps) output by the first leaky linear rectification layer may be x*y; then the size of each channel map (called a second combined map) in the 16-channel image (comprising 16 combined maps) output by the pixel shuffle layer may be 2x*2y. The pixel shuffle layer may take, from the 4 parts of first combined maps (each part comprising 16 first combined maps), the m-th first combined map of each part to form one group of first combined maps; in this way, 16 groups of first combined maps are obtained, each group comprising 4 first combined maps. The pixel shuffle layer may then splice the 4 first combined maps in each group into 1 second combined map, thereby obtaining 16 second combined maps.

For example, Figure 9 is a schematic diagram of the pixel shuffle layer splicing the 4 first combined maps in a group into 1 second combined map. As shown in Figure 9, suppose the 4 first combined maps are called images A, B, C, and D, the pixels in image A are denoted 1, the pixels in image B are denoted 2, the pixels in image C are denoted 3, and the pixels in image D are denoted 4. The second combined map spliced from these 4 first combined maps may include a plurality of pixel groups arranged in an array, each pixel group including two rows and two columns of pixels, where the pixel in row 1, column 1 is pixel 1 from image A, the pixel in row 1, column 2 is pixel 2 from image B, the pixel in row 2, column 1 is pixel 3 from image C, and the pixel in row 2, column 2 is pixel 4 from image D.
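Under the Figure 9 arrangement (each group's four maps A, B, C, and D interleaved into the 2x2 positions of each pixel group), the splicing can be sketched as follows; the grouping rule (m-th map of each of the 4 parts) follows the text, and the helper name is hypothetical:

```python
import numpy as np

def pixel_shuffle(feature_maps):
    """Splice a (4n, x, y) stack of first combined maps into (n, 2x, 2y)
    second combined maps, taking the m-th map of each of the 4 parts as
    one group and interleaving them per the Figure 9 arrangement."""
    n4, x, y = feature_maps.shape
    n = n4 // 4
    out = np.empty((n, 2 * x, 2 * y), dtype=feature_maps.dtype)
    for m in range(n):
        a, b, c, d = (feature_maps[m + part * n] for part in range(4))
        out[m, 0::2, 0::2] = a  # pixel 1 positions (image A)
        out[m, 0::2, 1::2] = b  # pixel 2 (image B)
        out[m, 1::2, 0::2] = c  # pixel 3 (image C)
        out[m, 1::2, 1::2] = d  # pixel 4 (image D)
    return out

# One group of four 1x1 maps spliced into one 2x2 map
group = np.array([[[1]], [[2]], [[3]], [[4]]])
print(pixel_shuffle(group)[0].tolist())  # [[1, 2], [3, 4]]
```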
Further, each third module can magnify the input image by a factor of 2. Since the first model includes k third modules, the k third modules together can magnify the input image by a factor of 2^k. The resolution ratio of the super-division color image to the image of the target area in the target raw image is K = 2^k. In the embodiments of this application, the number of third modules in the first model can be set appropriately according to the size of K.
(4) The fourth module includes a convolutional layer, which applies 64 convolution kernels (of size 3*3) to the image input into the fourth module and outputs a 64-channel image.

(5) The fifth module includes a convolutional layer, which applies 4 convolution kernels (of size 3*3) to the image input into the fifth module and outputs a four-channel image. The resolution of the four-channel image output by the fifth module is greater than the resolution of the four-channel image input into the first model.

(6) The sixth module includes a magnification layer and an addition layer. The magnification layer uses bilinear interpolation to magnify the four-channel image input into the first model, with a magnification factor likewise of K. The addition layer adds the output of the magnification layer to the output of the fifth module to obtain the above-mentioned image of the target area with increased resolution.
(7) The seventh module includes a convolutional layer and a leaky linear rectification layer. The convolutional layer applies 64 convolution kernels (each of size 3*3) to the output of the sixth module and outputs a 64-channel image. The leaky linear rectification layer applies a leaky linear rectification function to activate the image output by the convolutional layer, obtaining the activation feature map of that image.

For the leaky linear rectification function, refer to the leaky linear rectification function in the first module, which is not repeated here.

(8) The eighth module includes a convolutional layer, which applies 3 convolution kernels (each of size 3*3) to the output of the seventh module and outputs the above-mentioned super-division color image (with 3 channels).

Optionally, the above embodiment takes the case in which every convolutional layer in the first model uses 3*3 convolution kernels as an example; alternatively, the kernels may have sizes other than 3*3, such as 4*4, which is not limited in the embodiments of this application.
In the first implementation of step 204, after obtaining the target four-channel image, the object recognition device directly processes it with the first model to obtain the super-division color image, so the super-division color image is obtained efficiently.

Optionally, before using the first model, the object recognition device may train an initial model based on first training data to obtain the first model. Of course, the process of training the initial model to obtain the first model may also not be performed by the object recognition device, which is not limited in the embodiments of this application.

For example, when obtaining the first training data used to train the first model, a Bayer-mode raw image (which may be any raw image) may be obtained first, and then the obtained raw image may be interpolated according to a binning interpolation method to obtain a small-sized degraded image of the raw image (the degraded image can be regarded as the raw image with reduced resolution). When obtaining the first training data, the obtained raw image may also be processed to obtain a color image of the raw image. The degraded image of the raw image and the color image of the raw image can then be used as the first training data to train the initial model. For example, during training, the degraded image is used as input, the output of the initial model is compared with the color image, and the initial model is adjusted according to the comparison result; repeating this process many times trains the initial model into the first model.
The pixel values of the pixels in the degraded image satisfy the following formulas:

R_{i,j} = (1/K²) × Σ_{r=0}^{K−1} Σ_{c=0}^{K−1} f_{(i−1)×K+1+2×r, (j−1)×K+1+2×c}

GR_{i+1,j} = (1/K²) × Σ_{r=0}^{K−1} Σ_{c=0}^{K−1} f_{(i−1)×K+2+2×r, (j−1)×K+1+2×c}

GB_{i,j+1} = (1/K²) × Σ_{r=0}^{K−1} Σ_{c=0}^{K−1} f_{(i−1)×K+1+2×r, (j−1)×K+2+2×c}

B_{i+1,j+1} = (1/K²) × Σ_{r=0}^{K−1} Σ_{c=0}^{K−1} f_{(i−1)×K+2+2×r, (j−1)×K+2+2×c}

where R_{i,j} is the pixel value of the red pixel at coordinate (i, j) in the degraded image, GR_{i+1,j} is the pixel value of the green pixel at coordinate (i+1, j) in the degraded image, GB_{i,j+1} is the pixel value of the green pixel at coordinate (i, j+1) in the degraded image, and B_{i+1,j+1} is the pixel value of the blue pixel at coordinate (i+1, j+1) in the degraded image. f_{m,n} denotes the pixel value of the pixel at coordinate (m, n) in the raw image corresponding to the degraded image (the degraded image is obtained by interpolating that raw image); for example, f_{(i−1)×K+1+2×r, (j−1)×K+1+2×c} is the pixel value at coordinate ((i−1)×K+1+2×r, (j−1)×K+1+2×c) of that raw image. K is the degradation multiple, equal to the resolution ratio K of the super-division color image to the image of the target area in the target raw image mentioned in step 205.
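A sketch of this binning degradation follows. It assumes each degraded Bayer pixel averages the K × K same-color raw pixels it covers (stride 2 in the raw image preserves the Bayer color), with K even; the summation bounds and 1/K² normalization are assumptions, since the original formula images are not reproduced in the text:

```python
import numpy as np

def binning_degrade(raw, K):
    """Degrade a Bayer raw image by a factor K: each degraded pixel is the
    mean of the K x K same-color raw pixels it covers."""
    H, W = raw.shape
    out = np.empty((H // K, W // K))
    for p in range(H // K):
        for q in range(W // K):
            r0 = (p - p % 2) * K + p % 2  # first same-color raw row
            c0 = (q - q % 2) * K + q % 2  # first same-color raw column
            out[p, q] = raw[r0:r0 + 2 * K:2, c0:c0 + 2 * K:2].mean()
    return out

raw = np.arange(64, dtype=float).reshape(8, 8)
# Degraded pixel (0, 0) averages raw[0,0], raw[0,2], raw[2,0], raw[2,2]
print(binning_degrade(raw, 2)[0, 0])  # 9.0
```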
A model for increasing the resolution of color images also exists in the related art, but the training data required by that model must be obtained from degraded images of color images, whereas the first training data used to train the first model in this application is obtained from degraded images of Bayer-mode raw images. The process of obtaining a degraded image of a color image is relatively complex, while the process of obtaining a degraded image of a raw image is simple. Therefore, in this application the first training data used to train the first model is obtained efficiently, and correspondingly the accuracy of the trained first model is also higher.
In the second implementation of step 204, as shown in Figure 10, step 204 may include:

Step 2041b: Obtain a target four-channel image, where the image of the target area in the target raw image includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: the combined map of the pixels in row 1, column 1 of the pixel groups, the combined map of the pixels in row 1, column 2, the combined map of the pixels in row 2, column 1, and the combined map of the pixels in row 2, column 2.

For step 2041b, refer to the above step 2041a, which is not repeated here.

Step 2042b: Input the target four-channel image into the second model to obtain the increased-resolution target four-channel image output by the second model.
The second model is used to super-divide the four-channel image of a Bayer-mode image that has not been super-divided, and to output that four-channel image with increased resolution. Therefore, after the four-channel image (called the target four-channel image) of the image of the target area in the target raw image (a Bayer-mode image; the embodiments of this application take the target area to be the entire target raw image as an example) is input into the second model, the second model can process the target four-channel image and output the increased-resolution target four-channel image.

Optionally, the second model may be a neural network model.

For example, Figure 11 is a schematic diagram of a second model provided by an embodiment of this application. As shown in Figure 11, the second model may include: a first module, 16 second modules connected in series, k third modules connected in series, a fourth module, a fifth module, and a sixth module. For the explanation of these modules, refer to the explanation of the corresponding modules in the first model shown in Figure 8, which is not repeated here.
Step 2043b: Convert the increased-resolution target four-channel image into a Bayer-mode image.

After obtaining the increased-resolution target four-channel image output by the second model, the object recognition device may convert it into a Bayer-mode image by reversing the way in which the four-channel image (the target four-channel image) of the image of the target area in the target raw image was obtained in step 2041b.

For example, the Bayer-mode image may include multiple pixel groups, each of which includes four pixels: red, green 1, green 2, and blue. In step 2043b, the object recognition device may convert the increased-resolution target four-channel image into a Bayer-mode image in which the pixels at the same position in the pixel groups all come from the same combined map of the target four-channel image.

For example, suppose the image obtained by increasing the resolution of the target four-channel image in Figure 7 is as shown in Figure 12 (comprising a combined map of red pixels, a combined map of green-1 pixels, a combined map of green-2 pixels, and a combined map of blue pixels). The Bayer-mode image converted from the increased-resolution target four-channel image may then be as shown in Figure 13. In the Bayer-mode image in Figure 13, if both the row number and the column number of a pixel are odd, the pixel comes from the combined map of red pixels in Figure 12; if the row number is odd and the column number is even, the pixel comes from the combined map of green-1 pixels in Figure 12; if the row number is even and the column number is odd, the pixel comes from the combined map of green-2 pixels in Figure 12; and if both the row number and the column number are even, the pixel comes from the combined map of blue pixels in Figure 12.
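The inverse conversion can be sketched as follows, under the same RGGB layout assumption as the packing sketch; the helper name is hypothetical:

```python
import numpy as np

def four_channel_to_raw(channels):
    """Reassemble four increased-resolution channel maps (4, h, w) into a
    Bayer-mode image (2h, 2w), reversing the step 2041b packing."""
    _, h, w = channels.shape
    raw = np.empty((2 * h, 2 * w), dtype=channels.dtype)
    raw[0::2, 0::2] = channels[0]  # red at (odd row, odd column), 1-indexed
    raw[0::2, 1::2] = channels[1]  # green 1 at (odd row, even column)
    raw[1::2, 0::2] = channels[2]  # green 2 at (even row, odd column)
    raw[1::2, 1::2] = channels[3]  # blue at (even row, even column)
    return raw

# Round trip: packing a Bayer image and reassembling it restores the image
bayer = np.arange(16).reshape(4, 4)
packed = np.stack([bayer[0::2, 0::2], bayer[0::2, 1::2],
                   bayer[1::2, 0::2], bayer[1::2, 1::2]])
print((four_channel_to_raw(packed) == bayer).all())  # True
```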
Step 2044b: Perform second processing on the Bayer-mode image to obtain the super-division color image.

Optionally, in step 2044b, the object recognition device may perform the second processing on the Bayer-mode image obtained in step 2043b (that is, the increased-resolution image of the target area) based on the parameters of the first processing in step 202, to obtain the super-division color image.

Optionally, when performing the corresponding processing on the Bayer-mode image based on the parameters used for processing the target raw image into a color image: if a parameter is a parameter of automatic white balance processing, color correction processing, or gamma correction processing, it can be used directly to process the Bayer-mode image; if a parameter is a parameter of distortion correction processing, then, because the Bayer-mode image differs from the target raw image in size and resolution, the distortion correction parameter needs to be corrected, and the corrected distortion correction parameter is then used to process the Bayer-mode image.
For example, the first processing includes first distortion correction processing, and the second processing includes second distortion correction processing; both may be correction processing based on a distortion curve (such as Zhang's distortion correction). Suppose the parameters of the first distortion correction processing include a first distortion curve and the parameters of the second distortion correction processing include a second distortion curve. Any sampling point (X_0, Y_0) of the second distortion curve corresponds to the pixel at coordinate (X_Ai, Y_Ai) in the Bayer-mode image, and the same sampling point (X_0, Y_0) of the first distortion curve corresponds to the pixel at coordinate (X_Bi, Y_Bi) in the target raw image, where X_Ai = (X_Bi − (W_f/2 + 0.5)) × K + (W_f × K/2 + 0.5); Y_Ai = (Y_Bi − (H_f/2 + 0.5)) × K + (H_f × K/2 + 0.5); W_f denotes the width of the target raw image, and H_f denotes its height.
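Numerically, these remapping formulas scale each sample point's offset from the raw image center by K while re-centering it in the K-times-larger image. A sketch, with a hypothetical helper name:

```python
def remap_sample_point(x_b, y_b, w_f, h_f, k):
    """Map a first-distortion-curve sample point's pixel coordinate
    (x_b, y_b) in the target raw image (width w_f, height h_f) to the
    coordinate (x_a, y_a) in the K-times super-divided Bayer-mode image."""
    x_a = (x_b - (w_f / 2 + 0.5)) * k + (w_f * k / 2 + 0.5)
    y_a = (y_b - (h_f / 2 + 0.5)) * k + (h_f * k / 2 + 0.5)
    return x_a, y_a

# The raw image center maps to the center of the enlarged image
print(remap_sample_point(50.5, 50.5, 100, 100, 2))  # (100.5, 100.5)
```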
Because the target raw image and the Bayer-mode image have different resolutions, the parameters used for processing the target raw image into a color image may not be suitable for processing the Bayer-mode image into a color image; if those parameters are used directly to process the Bayer-mode image, the resulting super-division color image may exhibit problems such as color cast, image distortion, and contrast changes. In the embodiments of this application, the parameters of the distortion correction processing used for processing the target raw image into a color image can be corrected, so these problems in the obtained super-division color image are avoided.

In the embodiments of this application, the Bayer-mode image is processed based on the parameters used for processing the target raw image into a color image to obtain the super-division color image. Of course, when performing the corresponding processing on the Bayer-mode image, the processing may also not be based on those parameters, which is not limited in the embodiments of this application.
可以看出，步骤204的上述两种实现方式中，第一种实现方式是基于第一模型直接得到超分彩色图像，第一种实现方式得到超分彩色图像的速度较快。而第二种实现方式是先对目标区域的图像进行超分得到拜耳模式图像（如通过第二模型得到提升分辨率的目标四通道图像，再将该目标四通道图像转换为拜耳模式图像），之后再将拜耳模式图像处理为超分彩色图像。物体识别装置可以采用这两种实现方式中的任一种实现方式执行步骤204，或者，物体识别装置可以基于用户的选择，采用这两种实现方式中用户选择的实现方式执行步骤204。It can be seen that, of the two implementations of step 204 above, the first implementation obtains the super-division color image directly from the first model and is therefore faster. The second implementation first super-divides the image of the target area to obtain a Bayer mode image (for example, obtains a resolution-increased target four-channel image through the second model and then converts it into a Bayer mode image), and then processes the Bayer mode image into the super-division color image. The object recognition apparatus may perform step 204 using either of these two implementations, or may perform step 204 using the implementation selected by the user from the two.
可选地,在使用该第二模型之前,物体识别装置可以基于第二训练数据对初始模型进行训练以得到第二模型。当然,训练初始模型得到第二模型的过程也可以不由物体识别装置执行,本申请实施例对此不作限定。Optionally, before using the second model, the object recognition device may train the initial model based on the second training data to obtain the second model. Of course, the process of training the initial model to obtain the second model may not be executed by the object recognition device, which is not limited in the embodiment of the present application.
示例地，在获取用于训练得到第二模型的第二训练数据时，可以首先获取拜耳模式的生图像（可以是任意的生图像），之后，再依照装箱（binning）插值方式对获取到的生图像进行插值，从而获得生图像的小尺寸的退化图像（该退化图像可以看做是分辨率降低后的生图像）。这些过程可以参考上述实施例中获取第一训练数据的过程，本申请实施例在此不做赘述。得到的退化图像以及生图像便可以作为第二训练数据对初始模型进行训练。在训练初始模型时，可以将退化图像输入初始模型，并将该初始模型输出的结果与该退化图像对应的生图像进行比对，并根据比对结果对初始模型进行调整；重复多次该过程便可以将初始模型训练为第二模型。For example, when acquiring the second training data used to train the second model, a raw image in Bayer mode (which may be any raw image) may be obtained first; the obtained raw image is then interpolated according to a binning interpolation method to obtain a small-size degraded image of the raw image (the degraded image can be regarded as the raw image with reduced resolution). For these processes, reference may be made to the process of obtaining the first training data in the foregoing embodiment, and details are not repeated here. The obtained degraded images and raw images can then be used as the second training data to train the initial model. When training the initial model, a degraded image may be input into the initial model, the output of the initial model may be compared with the raw image corresponding to that degraded image, and the initial model may be adjusted according to the comparison result; repeating this process multiple times trains the initial model into the second model.
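The binning-based degradation described above can be sketched as follows. This is one plausible reading of "binning interpolation" (averaging same-color Bayer samples so the half-size output keeps the Bayer layout); the function name and the averaging scheme are assumptions:

```python
import numpy as np

def binning_degrade(raw):
    """Produce a half-size degraded raw image from an (H, W) Bayer raw image
    (H and W divisible by 4) by 2x2-binning each of the four color planes."""
    # Split the Bayer raw into its four channel planes (one per 2x2 position).
    planes = [raw[i::2, j::2] for i in (0, 1) for j in (0, 1)]
    # Average each plane's 2x2 neighbourhoods (bin same-color samples).
    binned = [(p[0::2, 0::2] + p[0::2, 1::2] + p[1::2, 0::2] + p[1::2, 1::2]) / 4.0
              for p in planes]
    # Re-interleave into a half-size Bayer image with the same layout.
    h, w = binned[0].shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2], out[0::2, 1::2] = binned[0], binned[1]
    out[1::2, 0::2], out[1::2, 1::2] = binned[2], binned[3]
    return out
```

Pairs of (degraded image, original raw image) produced this way would then serve as training inputs and targets for the initial model.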
进一步地，上述实施例中以步骤204中目标生图像的目标区域为目标生图像的全部区域为例。可选地，该目标生图像的目标区域可以为目标生图像中包含目标物体的区域（如部分区域）。Further, the foregoing embodiment takes, as an example, the case where the target area of the target raw image in step 204 is the entire area of the target raw image. Optionally, the target area of the target raw image may be an area of the target raw image (such as a partial area) that contains the target object.
示例地，目标生图像可以为生图像视频（本申请实施例中以该生图像视频为拜耳模式的视频为例）中的第m+1帧图像，m≥1，此时，在图2所示的物体识别方法的基础上，如图14所示，在步骤204之前，该物体识别方法还可以包括：For example, the target raw image may be the (m+1)-th frame image in a raw image video (in the embodiments of the present application, the raw image video is taken to be a Bayer mode video as an example), where m≥1. In this case, on the basis of the object recognition method shown in FIG. 2, as shown in FIG. 14, before step 204 the object recognition method may further include:
步骤301、对视频中的第m帧图像进行第三处理,得到第m帧图像的彩色图像。Step 301: Perform a third process on the m-th frame image in the video to obtain a color image of the m-th frame image.
步骤302、确定第m帧图像的彩色图像中包含目标物体的第三区域。Step 302: Determine a third region containing the target object in the color image of the m-th frame of image.
步骤303、确定第m+1帧图像中对应第三区域的目标区域。Step 303: Determine a target area corresponding to the third area in the (m+1)th frame of image.
在步骤303中，物体识别装置可以确定第三区域在第m+1帧图像（也即目标生图像）中对应的目标区域。第三区域与对应的目标区域所包含的内容大致相似，这两个区域的特征的相似度大于相似度阈值（比如80%、90%等），目标生图像中的目标区域由该目标区域在第m帧图像中对应的第三区域变化而来。In step 303, the object recognition apparatus may determine the target area corresponding to the third area in the (m+1)-th frame image (that is, the target raw image). The content contained in the third area is roughly similar to that of the corresponding target area, and the similarity between the features of the two areas is greater than a similarity threshold (for example, 80% or 90%); the target area in the target raw image is derived from the corresponding third area in the m-th frame image.
可选地，目标区域和第三区域均呈矩形，第三区域的左上角坐标为(X_C1, Y_C1)，且右下角坐标为(X_C2, Y_C2)；第三区域对应的目标区域的左上角坐标为(X_D1, Y_D1)，且右下角坐标为(X_D2, Y_D2)；其中，X_D1 = (X_C1 - (X_C1+X_C2)/2)*L + (X_C1+X_C2)/2；Y_D1 = (Y_C1 - (Y_C1+Y_C2)/2)*L + (Y_C1+Y_C2)/2；X_D2 = (X_C2 - (X_C1+X_C2)/2)*L + (X_C1+X_C2)/2；Y_D2 = (Y_C2 - (Y_C1+Y_C2)/2)*L + (Y_C1+Y_C2)/2；L>1。L可以由用户设置，比如1.5≤L≤3。Optionally, both the target area and the third area are rectangular; the upper-left corner of the third area has coordinates (X_C1, Y_C1) and its lower-right corner has coordinates (X_C2, Y_C2); the upper-left corner of the target area corresponding to the third area has coordinates (X_D1, Y_D1) and its lower-right corner has coordinates (X_D2, Y_D2); where X_D1 = (X_C1 - (X_C1+X_C2)/2)*L + (X_C1+X_C2)/2; Y_D1 = (Y_C1 - (Y_C1+Y_C2)/2)*L + (Y_C1+Y_C2)/2; X_D2 = (X_C2 - (X_C1+X_C2)/2)*L + (X_C1+X_C2)/2; Y_D2 = (Y_C2 - (Y_C1+Y_C2)/2)*L + (Y_C1+Y_C2)/2; L > 1. L can be set by the user, for example 1.5 ≤ L ≤ 3.
示例地，假设L=3，第m帧图像如图15所示，第m+1帧图像如图16所示。若第m帧图像中的某一第三区域的左上角坐标为(3,3)，且右下角坐标为(6,6)，则该第三区域在第m+1帧图像中对应的目标区域的左上角坐标为((X_C1-(X_C1+X_C2)/2)*L+(X_C1+X_C2)/2, (Y_C1-(Y_C1+Y_C2)/2)*L+(Y_C1+Y_C2)/2) = ((3-(3+6)/2)*3+(3+6)/2, (3-(3+6)/2)*3+(3+6)/2) = (0,0)，且右下角坐标为((6-(3+6)/2)*3+(3+6)/2, (6-(3+6)/2)*3+(3+6)/2) = (9,9)。For example, assuming L=3, the m-th frame image is shown in FIG. 15 and the (m+1)-th frame image is shown in FIG. 16. If the upper-left corner of a third area in the m-th frame image has coordinates (3, 3) and its lower-right corner has coordinates (6, 6), then the target area corresponding to this third area in the (m+1)-th frame image has upper-left corner coordinates ((X_C1-(X_C1+X_C2)/2)*L+(X_C1+X_C2)/2, (Y_C1-(Y_C1+Y_C2)/2)*L+(Y_C1+Y_C2)/2) = ((3-(3+6)/2)*3+(3+6)/2, (3-(3+6)/2)*3+(3+6)/2) = (0, 0), and lower-right corner coordinates ((6-(3+6)/2)*3+(3+6)/2, (6-(3+6)/2)*3+(3+6)/2) = (9, 9).
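The center-scaling of the third area into the target area can be sketched as follows (illustrative Python; the function and parameter names are assumptions):

```python
def expand_region(x1, y1, x2, y2, scale):
    """Scale a rectangle about its center by a factor L (> 1): the third area
    (x1, y1)-(x2, y2) in frame m yields the corresponding target area in
    frame m+1, per the formulas above."""
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    return ((x1 - cx) * scale + cx, (y1 - cy) * scale + cy,
            (x2 - cx) * scale + cx, (y2 - cy) * scale + cy)
```

With L = 3, the example's third area (3, 3)-(6, 6) expands to (0, 0)-(9, 9), matching the computation above.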
步骤304、在第m+1帧图像包含多个目标区域，且该多个目标区域中存在满足替换条件的两个目标区域时，将该两个目标区域替换为备选目标区域，得到更新的多个目标区域。Step 304: When the (m+1)-th frame image contains multiple target areas and two of them satisfy the replacement condition, replace the two target areas with a candidate target area to obtain updated multiple target areas.
该替换条件包括：该两个目标区域至少部分重合，且该两个目标区域的面积之和大于备选目标区域的面积。其中，两个目标区域中一个目标区域的左上角坐标为(X_11, Y_11)，且右下角坐标为(X_12, Y_12)；两个目标区域中另一个目标区域的左上角坐标为(X_21, Y_21)，且右下角坐标为(X_22, Y_22)；备选目标区域的左上角坐标为(X_M1, Y_M1)，且右下角坐标为(X_M2, Y_M2)；X_M1为X_11和X_21的最小值；Y_M1为Y_11和Y_21的最小值；X_M2为X_12和X_22的最大值；Y_M2为Y_12和Y_22的最大值。The replacement condition includes: the two target areas at least partially overlap, and the sum of the areas of the two target areas is greater than the area of the candidate target area. The upper-left corner of one of the two target areas has coordinates (X_11, Y_11) and its lower-right corner has coordinates (X_12, Y_12); the upper-left corner of the other target area has coordinates (X_21, Y_21) and its lower-right corner has coordinates (X_22, Y_22); the upper-left corner of the candidate target area has coordinates (X_M1, Y_M1) and its lower-right corner has coordinates (X_M2, Y_M2); X_M1 is the minimum of X_11 and X_21; Y_M1 is the minimum of Y_11 and Y_21; X_M2 is the maximum of X_12 and X_22; Y_M2 is the maximum of Y_12 and Y_22.
示例地，假设目标生图像中的目标区域如图17所示，包括：目标区域1和目标区域2。其中，目标区域1的左上角坐标为(3,6)，且右下角坐标为(6,3)；目标区域2的左上角坐标为(0,9)，且右下角坐标为(9,0)；备选目标区域X（并不属于步骤303中确定出的多个目标区域）的左上角坐标可以为(0,6)，右下角坐标为(9,3)。可以看出，目标区域1和目标区域2至少部分重合，且目标区域1的面积(9)与目标区域2的面积(81)之和(90)大于备选目标区域X的面积(27)。因此，目标区域1和目标区域2满足上述替换条件，可以将多个目标区域中的目标区域1和目标区域2替换为备选目标区域X，从而实现了对步骤303中确定出的多个目标区域的更新。For example, suppose the target areas in the target raw image are as shown in FIG. 17 and include target area 1 and target area 2, where the upper-left corner of target area 1 has coordinates (3, 6) and its lower-right corner has coordinates (6, 3), the upper-left corner of target area 2 has coordinates (0, 9) and its lower-right corner has coordinates (9, 0), and the candidate target area X (which does not belong to the multiple target areas determined in step 303) may have upper-left corner coordinates (0, 6) and lower-right corner coordinates (9, 3). It can be seen that target area 1 and target area 2 at least partially overlap, and the sum (90) of the area of target area 1 (9) and the area of target area 2 (81) is greater than the area (27) of the candidate target area X. Therefore, target area 1 and target area 2 satisfy the above replacement condition and can be replaced with the candidate target area X, thereby updating the multiple target areas determined in step 303.
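The replacement condition and the candidate target area can be sketched as follows. Illustrative Python; regions are (x_ul, y_ul, x_lr, y_lr) tuples, the function names are assumptions, and the overlap test is written to be independent of the y-axis direction used in the example:

```python
def candidate_region(r1, r2):
    # Component-wise min of the upper-left corners and max of the lower-right
    # corners, as defined for (X_M1, Y_M1) and (X_M2, Y_M2) above.
    return (min(r1[0], r2[0]), min(r1[1], r2[1]),
            max(r1[2], r2[2]), max(r1[3], r2[3]))

def area(r):
    return abs(r[2] - r[0]) * abs(r[3] - r[1])

def overlaps(r1, r2):
    # Axis-aligned rectangles overlap iff both their x- and y-extents overlap.
    x_ok = max(r1[0], r2[0]) <= min(r1[2], r2[2])
    y_ok = (max(min(r1[1], r1[3]), min(r2[1], r2[3]))
            <= min(max(r1[1], r1[3]), max(r2[1], r2[3])))
    return x_ok and y_ok

def should_replace(r1, r2):
    # Replacement condition: partial overlap, and the combined area of the
    # two target areas exceeds the candidate target area's area.
    return overlaps(r1, r2) and area(r1) + area(r2) > area(candidate_region(r1, r2))
```

For the example above, area((3, 6, 6, 3)) = 9 and area((0, 9, 9, 0)) = 81, whose sum 90 exceeds the candidate area 27, so the condition holds.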
由于将满足替换条件的多个目标区域替换为备选目标区域,因此减少了目标生图像中目标区域的个数,简化了基于目标区域进行目标物体的识别的过程。Since multiple target regions satisfying the replacement conditions are replaced with candidate target regions, the number of target regions in the target raw image is reduced, and the process of identifying target objects based on the target regions is simplified.
可选地，物体识别装置可以依次以步骤303中确定出的多个目标区域中的每个区域为基准区域，执行多个目标区域的更新流程。Optionally, the object recognition apparatus may sequentially take each of the multiple target areas determined in step 303 as a reference area and execute the update process for the multiple target areas.
其中，该更新流程可以包括：物体识别装置依次判定该基准区域与该多个目标区域中除该基准区域之外的每个其他区域是否满足替换条件。一旦某一基准区域与某一其他区域满足替换条件，则物体识别装置就可以将该基准区域与该其他区域替换为这两个区域对应的备选目标区域。The update process may include: the object recognition apparatus sequentially determines whether the reference area and each other area (among the multiple target areas, excluding the reference area) satisfy the replacement condition. Once a reference area and another area satisfy the replacement condition, the object recognition apparatus can replace the reference area and that other area with the candidate target area corresponding to the two areas.
或者，该更新流程可以包括：物体识别装置还可以找出所有其他区域中的一组其他区域，该组其他区域中的每个区域均与该基准区域满足替换条件；之后，物体识别装置还可以确定该组其他区域中每个区域对应的面积差（该区域与基准区域的面积之和，与这两个区域对应的备选目标区域的面积之差），并将该基准区域与该组其他区域中对应的面积差最大的区域替换为这两个区域对应的备选目标区域。Alternatively, the update process may include: the object recognition apparatus may find, among all the other areas, a group of other areas each of which satisfies the replacement condition together with the reference area; the object recognition apparatus may then determine the area difference corresponding to each area in the group (the difference between the sum of the areas of that area and the reference area, and the area of the candidate target area corresponding to the two areas), and replace the reference area and the area in the group with the largest area difference with the candidate target area corresponding to these two areas.
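The second update flow can be sketched as a greedy merge loop. This is only one possible reading of the text; the re-scan after each merge and the handling of ties are assumptions, and regions are (x_ul, y_ul, x_lr, y_lr) tuples:

```python
def update_regions(regions):
    """Repeatedly merge a reference region with the overlapping partner whose
    area difference (sum of the two areas minus the candidate region's area)
    is largest and positive, until no pair satisfies the condition."""
    def area(r):
        return abs(r[2] - r[0]) * abs(r[3] - r[1])

    def candidate(r1, r2):
        return (min(r1[0], r2[0]), min(r1[1], r2[1]),
                max(r1[2], r2[2]), max(r1[3], r2[3]))

    def overlaps(r1, r2):
        return (max(r1[0], r2[0]) <= min(r1[2], r2[2]) and
                max(min(r1[1], r1[3]), min(r2[1], r2[3]))
                <= min(max(r1[1], r1[3]), max(r2[1], r2[3])))

    regions = list(regions)
    i = 0
    while i < len(regions):
        base = regions[i]
        best, best_diff = None, None
        for j, other in enumerate(regions):
            if j == i or not overlaps(base, other):
                continue
            diff = area(base) + area(other) - area(candidate(base, other))
            if diff > 0 and (best_diff is None or diff > best_diff):
                best, best_diff = j, diff
        if best is None:
            i += 1
        else:
            merged = candidate(base, regions[best])
            regions = [r for k, r in enumerate(regions) if k not in (i, best)]
            regions.append(merged)
            i = 0  # re-scan: the merged region may combine with others
    return regions
```

Applied to the two example regions from FIG. 17, the loop merges them into the single candidate region (0, 6, 9, 3).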
在对多个目标区域更新完毕后，物体识别装置便可以将更新后的多个目标区域中的任一目标区域作为步骤204中目标区域。示例地，物体识别装置可以截取目标生图像中的每个目标区域，得到每个目标区域的图像，并采用图2所示的方法对每个目标区域的图像进行处理以及识别目标物体。After updating the multiple target areas, the object recognition apparatus can take any one of the updated target areas as the target area in step 204. For example, the object recognition apparatus can crop each target area out of the target raw image to obtain an image of each target area, and use the method shown in FIG. 2 to process the image of each target area and recognize the target object.
进一步地，若在基于第m+1帧图像进行目标物体的识别后，还需要继续基于第m+2帧图像进行目标物体的识别，则该识别的过程可以参考基于第m+1帧图像进行目标物体的识别的过程。并且，上述步骤204中第m+1帧图像中的目标区域可以作为第m+2帧图像的前一帧图像中的第三区域，而无需在基于第m+2帧图像进行目标物体的识别的过程中，重新确定第m+2帧图像的前一帧图像中的第三区域。Further, if, after the target object is recognized based on the (m+1)-th frame image, the target object also needs to be recognized based on the (m+2)-th frame image, that recognition process may refer to the process of recognizing the target object based on the (m+1)-th frame image. Moreover, the target area in the (m+1)-th frame image in the above step 204 can serve as the third area in the frame preceding the (m+2)-th frame image, without re-determining the third area in that preceding frame during the recognition of the target object based on the (m+2)-th frame image.
上述实施例中以在步骤303中确定多个目标区域之后，还在步骤304中对多个目标区域进行更新为例，可选地，也可以不执行步骤304，而是在步骤303中确定多个目标区域之后，直接执行步骤204，本申请实施例对此不作限定。In the foregoing embodiment, the multiple target areas determined in step 303 are further updated in step 304 as an example. Optionally, step 304 may not be executed; instead, step 204 may be executed directly after the multiple target areas are determined in step 303, which is not limited in the embodiments of the present application.
本申请实施例提供的方法实施例步骤的先后顺序能够进行适当调整，步骤也能够根据情况进行相应增减，任何熟悉本技术领域的技术人员在本申请揭露的技术范围内，可轻易想到变化的方法，都应涵盖在本申请的保护范围之内，因此不再赘述。The order of the steps in the method embodiments provided in the embodiments of this application can be adjusted appropriately, and steps can be added or removed according to the situation. Any varied method that a person skilled in the art can readily conceive within the technical scope disclosed in this application shall be covered by the protection scope of this application, and is therefore not described further here.
图18为本申请实施例提供的一种物体识别装置的框图,可以运行前面提及的物体识别方法。如图18所示,该物体识别装置包括:FIG. 18 is a block diagram of an object recognition device provided by an embodiment of this application, which can run the aforementioned object recognition method. As shown in Figure 18, the object recognition device includes:
获取模块1801,用于获取图像传感器产生的目标生图像;获取模块1801用于执行的操作可以参考图2所示的实施例中的步骤201,本申请实施例在此不做赘述。The obtaining module 1801 is configured to obtain the target raw image generated by the image sensor; the operation performed by the obtaining module 1801 can refer to step 201 in the embodiment shown in FIG. 2, and details are not described in this embodiment of the present application.
超分模块1802，用于对目标生图像中目标区域的图像进行超分，得到超分彩色图像，其中，目标区域包括目标生图像中的至少部分区域，超分彩色图像的分辨率大于目标区域的图像的分辨率；超分模块1802用于执行的操作可以参考图2所示的实施例中的步骤204，本申请实施例在此不做赘述。The super-division module 1802 is configured to perform super-division on the image of the target area in the target raw image to obtain a super-division color image, where the target area includes at least a partial area of the target raw image, and the resolution of the super-division color image is greater than the resolution of the image of the target area; for the operations performed by the super-division module 1802, reference may be made to step 204 in the embodiment shown in FIG. 2, and details are not repeated here.
识别模块1803，用于基于超分彩色图像进行目标物体的识别。识别模块1803用于执行的操作可以参考图2所示的实施例中的步骤205和步骤206，本申请实施例在此不做赘述。The recognition module 1803 is configured to recognize the target object based on the super-division color image. For the operations performed by the recognition module 1803, reference may be made to steps 205 and 206 in the embodiment shown in FIG. 2, and details are not repeated here.
可选地，目标生图像为拜耳模式的图像。Optionally, the target raw image is a Bayer mode image.
可选地,物体识别装置还包括:Optionally, the object recognition device further includes:
第一处理模块（图18中未示出），用于对目标生图像进行第一处理，得到目标生图像的彩色图像；第一处理模块用于执行的操作可以参考图2所示的实施例中的步骤202，本申请实施例在此不做赘述。The first processing module (not shown in FIG. 18) is configured to perform the first processing on the target raw image to obtain a color image of the target raw image; for the operations performed by the first processing module, reference may be made to step 202 in the embodiment shown in FIG. 2, and details are not repeated here.
第一确定模块（图18中未示出），用于确定目标生图像的彩色图像中包含目标物体的第一区域；第一确定模块用于执行的操作可以参考图2所示的实施例中的步骤203，本申请实施例在此不做赘述。The first determining module (not shown in FIG. 18) is configured to determine the first area containing the target object in the color image of the target raw image; for the operations performed by the first determining module, reference may be made to step 203 in the embodiment shown in FIG. 2, and details are not repeated here.
其中,识别模块1803用于:基于第一区域,确定超分彩色图像中的第二区域;基于超分彩色图像中第二区域的图像,进行目标物体的识别。Wherein, the recognition module 1803 is configured to: determine the second area in the super-division color image based on the first area; and perform the recognition of the target object based on the image of the second area in the super-division color image.
可选地，目标生图像为拜耳模式的图像，超分模块1802用于：获取目标四通道图像，其中，目标区域的图像包括阵列排布的多个像素组，像素组包括两行两列像素，目标四通道图像包括：多个像素组中第1行第1列像素的组合图、第1行第2列像素的组合图、第2行第1列像素的组合图以及第2行第2列像素的组合图；将目标四通道图像输入第一模型，得到第一模型输出的超分彩色图像；其中，第一模型用于对未超分的拜耳模式的图像的四通道图像进行超分，输出提升分辨率后的拜耳模式的图像的彩色图像。该过程可以参考图6所示的实施例，本申请实施例在此不做赘述。Optionally, the target raw image is a Bayer mode image, and the super-division module 1802 is configured to: obtain a target four-channel image, where the image of the target area includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combined map of the pixels in row 1, column 1 of the pixel groups, a combined map of the pixels in row 1, column 2, a combined map of the pixels in row 2, column 1, and a combined map of the pixels in row 2, column 2; and input the target four-channel image into a first model to obtain the super-division color image output by the first model, where the first model is used to super-divide the four-channel image of a Bayer mode image that has not been super-divided, and to output a color image of the Bayer mode image with increased resolution. For this process, reference may be made to the embodiment shown in FIG. 6, and details are not repeated here.
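The four-channel decomposition described above can be sketched with strided slicing. Illustrative Python with NumPy assumed; the function name is an assumption:

```python
import numpy as np

def to_four_channel(bayer):
    """Split an (H, W) Bayer-pattern image into the four-channel image
    described above: one plane per position of the 2x2 pixel groups
    (row 1 col 1, row 1 col 2, row 2 col 1, row 2 col 2)."""
    return np.stack([bayer[0::2, 0::2], bayer[0::2, 1::2],
                     bayer[1::2, 0::2], bayer[1::2, 1::2]], axis=0)
```

Each output plane is half the height and width of the input, so the result is a (4, H/2, W/2) array.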
可选地，目标生图像为拜耳模式的图像，超分模块1802包括：超分子模块（图18中未示出），用于对目标区域的图像进行超分，得到拜耳模式图像；处理子模块（图18中未示出），用于对拜耳模式图像进行第二处理，得到超分彩色图像。Optionally, the target raw image is a Bayer mode image, and the super-division module 1802 includes: a super-division submodule (not shown in FIG. 18), configured to super-divide the image of the target area to obtain a Bayer mode image; and a processing submodule (not shown in FIG. 18), configured to perform the second processing on the Bayer mode image to obtain the super-division color image.
可选地，超分子模块用于：获取目标四通道图像，其中，目标区域的图像包括阵列排布的多个像素组，像素组包括两行两列像素，目标四通道图像包括：多个像素组中第1行第1列像素的组合图、第1行第2列像素的组合图、第2行第1列像素的组合图以及第2行第2列像素的组合图；将目标四通道图像输入第二模型，得到第二模型输出的提升分辨率后的目标四通道图像；将提升分辨率后的目标四通道图像转换为拜耳模式图像；其中，第二模型用于对未超分的拜耳模式的图像的四通道图像进行超分，输出提升分辨率后的四通道图像。该过程可以参考图10所示的实施例，本申请实施例在此不做赘述。Optionally, the super-division submodule is configured to: obtain a target four-channel image, where the image of the target area includes a plurality of pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combined map of the pixels in row 1, column 1 of the pixel groups, a combined map of the pixels in row 1, column 2, a combined map of the pixels in row 2, column 1, and a combined map of the pixels in row 2, column 2; input the target four-channel image into a second model to obtain the resolution-increased target four-channel image output by the second model; and convert the resolution-increased target four-channel image into a Bayer mode image, where the second model is used to super-divide the four-channel image of a Bayer mode image that has not been super-divided, and to output a resolution-increased four-channel image. For this process, reference may be made to the embodiment shown in FIG. 10, and details are not repeated here.
可选地，第一处理包括第一畸变校正处理，第二处理包括第二畸变校正处理；第一畸变校正处理的参数包括：第一畸变曲线；第二畸变校正处理的参数包括：第二畸变曲线；第二畸变曲线中的任一采样点(X_0, Y_0)在超分目标生图像中对应的像素的坐标为(X_Ai, Y_Ai)，第一畸变曲线中的任一采样点(X_0, Y_0)在目标生图像中对应的像素的坐标为(X_Bi, Y_Bi)；其中，X_Ai = (X_Bi - (W_f/2 + 0.5))*K + (W_f*K/2 + 0.5)；Y_Ai = (Y_Bi - (H_f/2 + 0.5))*K + (H_f*K/2 + 0.5)；W_f表示目标生图像的宽，H_f表示目标生图像的高，K表示超分彩色图像与目标区域的图像的分辨率比值。Optionally, the first processing includes a first distortion correction processing, and the second processing includes a second distortion correction processing; the parameters of the first distortion correction processing include a first distortion curve, and the parameters of the second distortion correction processing include a second distortion curve; any sampling point (X_0, Y_0) in the second distortion curve corresponds to a pixel with coordinates (X_Ai, Y_Ai) in the super-divided target raw image, and any sampling point (X_0, Y_0) in the first distortion curve corresponds to a pixel with coordinates (X_Bi, Y_Bi) in the target raw image; where X_Ai = (X_Bi - (W_f/2 + 0.5))*K + (W_f*K/2 + 0.5); Y_Ai = (Y_Bi - (H_f/2 + 0.5))*K + (H_f*K/2 + 0.5); W_f denotes the width of the target raw image, H_f denotes the height of the target raw image, and K denotes the resolution ratio of the super-division color image to the image of the target area.
可选地,物体识别装置还包括:第二确定模块(图18中未示出),用于确定目标生图像中包含目标物体的目标区域。该过程可以参考图14所示的实施例中的步骤303,本申请实施例在此不做赘述。Optionally, the object recognition device further includes: a second determination module (not shown in FIG. 18), configured to determine a target area in the target image containing the target object. For this process, reference may be made to step 303 in the embodiment shown in FIG. 14, and details are not described in the embodiment of the present application.
可选地,目标生图像为生图像视频中的第m+1帧图像,m≥1,物体识别装置还包括:Optionally, the target raw image is the m+1th frame image in the raw image video, m≥1, and the object recognition device further includes:
第三处理模块（图18中未示出），用于对生图像视频中的第m帧图像进行第三处理，得到第m帧图像的彩色图像；该过程可以参考图14所示的实施例中的步骤301，本申请实施例在此不做赘述。The third processing module (not shown in FIG. 18) is configured to perform the third processing on the m-th frame image in the raw image video to obtain a color image of the m-th frame image; for this process, reference may be made to step 301 in the embodiment shown in FIG. 14, and details are not repeated here.
第三确定模块（图18中未示出），用于确定第m帧图像的彩色图像中包含目标物体的第三区域；该过程可以参考图14所示的实施例中的步骤302，本申请实施例在此不做赘述。The third determining module (not shown in FIG. 18) is configured to determine the third area containing the target object in the color image of the m-th frame image; for this process, reference may be made to step 302 in the embodiment shown in FIG. 14, and details are not repeated here.
第二确定模块用于:确定目标生图像中对应第三区域的目标区域。The second determining module is used for determining the target area corresponding to the third area in the target raw image.
可选地,物体识别装置还包括:Optionally, the object recognition device further includes:
替换模块（图18中未示出），用于在目标生图像包含多个目标区域，且多个目标区域中存在满足替换条件的两个目标区域时，将两个目标区域替换为备选目标区域，得到更新的多个目标区域；其中，替换条件包括：两个目标区域至少部分重合，且两个目标区域的面积之和大于备选目标区域的面积；两个目标区域中一个目标区域的左上角坐标为(X_11, Y_11)，且右下角坐标为(X_12, Y_12)；两个目标区域中另一个目标区域的左上角坐标为(X_21, Y_21)，且右下角坐标为(X_22, Y_22)；备选目标区域的左上角坐标为(X_M1, Y_M1)，且右下角坐标为(X_M2, Y_M2)；X_M1为X_11和X_21的最小值；Y_M1为Y_11和Y_21的最小值；X_M2为X_12和X_22的最大值；Y_M2为Y_12和Y_22的最大值。该过程可以参考图14所示的实施例中的步骤304，本申请实施例在此不做赘述。The replacement module (not shown in FIG. 18) is configured to, when the target raw image contains multiple target areas and two of them satisfy the replacement condition, replace the two target areas with a candidate target area to obtain updated multiple target areas; the replacement condition includes: the two target areas at least partially overlap, and the sum of the areas of the two target areas is greater than the area of the candidate target area; the upper-left corner of one of the two target areas has coordinates (X_11, Y_11) and its lower-right corner has coordinates (X_12, Y_12); the upper-left corner of the other target area has coordinates (X_21, Y_21) and its lower-right corner has coordinates (X_22, Y_22); the upper-left corner of the candidate target area has coordinates (X_M1, Y_M1) and its lower-right corner has coordinates (X_M2, Y_M2); X_M1 is the minimum of X_11 and X_21; Y_M1 is the minimum of Y_11 and Y_21; X_M2 is the maximum of X_12 and X_22; Y_M2 is the maximum of Y_12 and Y_22. For this process, reference may be made to step 304 in the embodiment shown in FIG. 14, and details are not repeated here.
综上所述,本申请实施例提供的物体识别装置中,超分模块能够对目标生图像中目标区域的图像进行超分处理,以放大增强目标区域的图像中的细节信息。由于被超分的图像是未经过ISP处理的生图像,因此超分处理后的目标区域的图像能够包含较多细节信息。使得本实施例得到的超分彩色图像(超分处理后的目标区域的图像的彩色图像)相较于现有技术中的彩色图像拥有更多的细节信息。因此,基于该超分彩色图像进行目标物体的识别,能够提升目标物体的识别的准确度。To sum up, in the object recognition device provided by the embodiments of the present application, the super-division module can perform super-division processing on the image of the target area in the target raw image to enlarge and enhance the detailed information in the image of the target area. Since the super-divided image is a raw image that has not been processed by ISP, the image of the target area after the super-division processing can contain more detailed information. Therefore, the super-division color image (the color image of the image of the target area after the super-division processing) obtained by this embodiment has more detailed information than the color image in the prior art. Therefore, the recognition of the target object based on the super-division color image can improve the accuracy of the recognition of the target object.
本申请实施例提供了一种计算机存储介质,所述存储介质内存储有计算机程序,所述计算机程序用于执行本申请实施例提供的任一物体识别方法。The embodiment of the present application provides a computer storage medium in which a computer program is stored, and the computer program is used to execute any object recognition method provided in the embodiment of the present application.
本申请实施例提供了一种包含指令的计算机程序产品,当计算机程序产品在物体识别装置上运行时,使得物体识别装置执行本申请实施例提供的任一物体识别方法。The embodiments of the present application provide a computer program product containing instructions. When the computer program product runs on an object recognition device, the object recognition device executes any object recognition method provided in the embodiments of the present application.
在上述实施例中，可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时，可以全部或部分地以计算机程序产品的形式实现，所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时，全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机的可读存储介质中，或者从一个计算机可读存储介质向另一个计算机可读存储介质传输，例如，所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线（例如同轴电缆、光纤、数字用户线）或无线（例如红外、无线、微波等）方式向另一个网站站点、计算机、服务器或数据中心传输。所述计算机可读存储介质可以是计算机能够存取的任何可用介质或者包含一个或多个可用介质集成的服务器、数据中心等数据存储装置。所述可用介质可以是磁性介质（例如，软盘、硬盘、磁带）、光介质，或者半导体介质（例如固态硬盘）等。In the above embodiments, the implementation may be carried out in whole or in part by software, hardware, firmware, or any combination thereof. When software is used, the implementation may take the form of a computer program product, in whole or in part, which includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (for example, coaxial cable, optical fiber, or digital subscriber line) or wireless (for example, infrared, radio, or microwave) manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage apparatus such as a server or data center integrating one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium, a semiconductor medium (for example, a solid-state drive), or the like.
在本申请中,术语“第一”和“第二”等仅用于描述目的,而不能理解为指示或暗示相对重要性。术语“至少一个”指一个或多个,“多个”指两个或两个以上,除非另有明确的限定。In this application, the terms "first" and "second", etc. are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance. The term "at least one" refers to one or more, and "multiple" refers to two or more, unless expressly defined otherwise.
本申请实施例提供的方法实施例和装置实施例等不同类型的实施例均可以相互参考,本申请实施例对此不做限定。Different types of embodiments such as method embodiments and device embodiments provided in the embodiments of the present application can be referred to each other, which is not limited in the embodiments of the present application.
在本申请提供的相应实施例中，应该理解到，所揭露的装置等可以通过其它的构成方式实现。例如，以上所描述的装置实施例仅仅是示意性的，例如，单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如多个单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，装置或单元的间接耦合或通信连接，可以是电性或其它的形式。In the corresponding embodiments provided in this application, it should be understood that the disclosed apparatus and the like may be implemented in other manners. For example, the described apparatus embodiments are merely illustrative; the division into units is only a logical function division, and there may be other divisions in actual implementation, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the displayed or discussed mutual couplings, direct couplings, or communication connections may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be electrical or in other forms.
作为分离部件说明的单元可以是或者也可以不是物理上分开的，作为单元描述的部件可以是或者也可以不是物理单元，既可以位于一个地方，或者也可以分布到多个物体识别装置（例如终端设备）上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separate, and components described as units may or may not be physical units; they may be located in one place or distributed across multiple object recognition apparatuses (for example, terminal devices). Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
以上所述，仅为本申请的具体实施方式，但本申请的保护范围并不局限于此，任何熟悉本技术领域的技术人员在本申请揭露的技术范围内，可轻易想到各种等效的修改或替换，这些修改或替换都应涵盖在本申请的保护范围之内。因此，本申请的保护范围应以权利要求的保护范围为准。The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any equivalent modification or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (18)

  1. 一种物体识别方法,其特征在于,所述方法包括:An object recognition method, characterized in that the method includes:
    获取图像传感器产生的目标生图像;Acquire the target image generated by the image sensor;
对所述目标生图像中目标区域的图像进行超分，得到超分彩色图像，其中，所述目标区域包括所述目标生图像中的至少部分区域，所述超分彩色图像的分辨率大于所述目标区域的图像的分辨率；Performing super-division on the image of the target area in the target raw image to obtain a super-division color image, wherein the target area comprises at least a partial area of the target raw image, and the resolution of the super-division color image is greater than the resolution of the image of the target area;
    基于所述超分彩色图像进行目标物体的识别。The target object is recognized based on the super-division color image.
  2. 根据权利要求1所述的方法,其特征在于,所述目标生图像为拜耳模式的图像。The method according to claim 1, wherein the target raw image is a Bayer mode image.
  3. 根据权利要求1或2所述的方法,其特征在于,所述方法还包括:The method according to claim 1 or 2, wherein the method further comprises:
    对所述目标生图像进行第一处理，得到所述目标生图像的彩色图像;Performing first processing on the target raw image to obtain a color image of the target raw image;
    确定所述目标生图像的彩色图像中包含所述目标物体的第一区域;Determining a first region that contains the target object in the color image of the target raw image;
    其中,基于所述超分彩色图像进行目标物体的识别,具体包括:Wherein, the recognition of the target object based on the super-division color image specifically includes:
    基于所述第一区域,确定所述超分彩色图像中的第二区域;Determining a second area in the super-division color image based on the first area;
    基于所述超分彩色图像中所述第二区域的图像,进行所述目标物体的识别。Based on the image of the second region in the super-division color image, the target object is recognized.
  4. 根据权利要求1至3任一所述的方法，其特征在于，所述目标生图像为拜耳模式的图像，所述对所述目标生图像中的目标区域的图像进行超分，得到超分彩色图像，包括:The method according to any one of claims 1 to 3, wherein the target raw image is a Bayer mode image, and the performing super-division on the image of the target region in the target raw image to obtain a super-division color image comprises:
    获取目标四通道图像，其中，所述目标区域的图像包括阵列排布的多个像素组，所述像素组包括两行两列像素，所述目标四通道图像包括：所述多个像素组中第1行第1列像素的组合图、第1行第2列像素的组合图、第2行第1列像素的组合图以及第2行第2列像素的组合图;Acquire a target four-channel image, wherein the image of the target region includes multiple pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the row-1 column-1 pixels of the multiple pixel groups, a combination map of the row-1 column-2 pixels, a combination map of the row-2 column-1 pixels, and a combination map of the row-2 column-2 pixels;
    将所述目标四通道图像输入第一模型,得到所述第一模型输出的所述超分彩色图像;Input the target four-channel image into a first model to obtain the super-division color image output by the first model;
    其中,所述第一模型用于对未超分的拜耳模式的图像的四通道图像进行超分,输出提升分辨率后的所述拜耳模式的图像的彩色图像。Wherein, the first model is used to super-divide the four-channel image of the Bayer-mode image that is not super-divided, and output the color image of the Bayer-mode image after the resolution is increased.
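Claim 4 (and claim 6 below) packs the Bayer-pattern target region into a four-channel image by collecting, from every 2x2 pixel group, the row-1/column-1, row-1/column-2, row-2/column-1 and row-2/column-2 pixels into four half-resolution maps. The claims do not prescribe an implementation; a minimal NumPy sketch of this packing and its inverse could look like the following (function names are illustrative, not part of the claims):

```python
import numpy as np

def bayer_to_four_channels(raw):
    """Split a Bayer-pattern raw image (H x W, H and W even) into the
    four channel maps described in claim 4: for each 2x2 pixel group,
    the pixels at positions (1,1), (1,2), (2,1), (2,2) are gathered
    into their own half-resolution map."""
    h, w = raw.shape
    assert h % 2 == 0 and w % 2 == 0, "Bayer image dimensions must be even"
    return np.stack([
        raw[0::2, 0::2],  # row 1, column 1 of each 2x2 group
        raw[0::2, 1::2],  # row 1, column 2
        raw[1::2, 0::2],  # row 2, column 1
        raw[1::2, 1::2],  # row 2, column 2
    ], axis=0)            # shape: (4, H/2, W/2)

def four_channels_to_bayer(channels):
    """Inverse rearrangement: interleave a (4, h, w) channel stack back
    into a (2h, 2w) Bayer-pattern image, as done in claim 6 after the
    model has output the upscaled four-channel image."""
    _, h, w = channels.shape
    raw = np.empty((2 * h, 2 * w), dtype=channels.dtype)
    raw[0::2, 0::2] = channels[0]
    raw[0::2, 1::2] = channels[1]
    raw[1::2, 0::2] = channels[2]
    raw[1::2, 1::2] = channels[3]
    return raw
```

This is a space-to-depth rearrangement with block size 2; the inverse corresponds to the step in claim 6 that converts the upscaled four-channel image back into a Bayer mode image.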
  5. 根据权利要求3所述的方法，其特征在于，所述目标生图像为拜耳模式的图像，所述对所述目标生图像中的目标区域的图像进行超分，得到超分彩色图像，包括:The method according to claim 3, wherein the target raw image is a Bayer mode image, and the performing super-division on the image of the target region in the target raw image to obtain a super-division color image comprises:
    对所述目标区域的图像进行超分,得到拜耳模式图像;Super-divide the image of the target area to obtain a Bayer mode image;
    对所述拜耳模式图像进行第二处理,得到所述超分彩色图像。The second processing is performed on the Bayer mode image to obtain the super-division color image.
  6. 根据权利要求5所述的方法,其特征在于,所述对所述目标区域的图像进行超分,得到拜耳模式图像,包括:The method according to claim 5, wherein the super-division of the image of the target area to obtain the Bayer mode image comprises:
    获取目标四通道图像，其中，所述目标区域的图像包括阵列排布的多个像素组，所述像素组包括两行两列像素，所述目标四通道图像包括：所述多个像素组中第1行第1列像素的组合图、第1行第2列像素的组合图、第2行第1列像素的组合图以及第2行第2列像素的组合图;Acquire a target four-channel image, wherein the image of the target region includes multiple pixel groups arranged in an array, each pixel group includes two rows and two columns of pixels, and the target four-channel image includes: a combination map of the row-1 column-1 pixels of the multiple pixel groups, a combination map of the row-1 column-2 pixels, a combination map of the row-2 column-1 pixels, and a combination map of the row-2 column-2 pixels;
    将所述目标四通道图像输入第二模型,得到所述第二模型输出的提升分辨率后的所述目标四通道图像;Inputting the target four-channel image into a second model to obtain the target four-channel image with an increased resolution output by the second model;
    将提升分辨率后的所述目标四通道图像转换为所述拜耳模式图像;Converting the target four-channel image after the resolution has been increased into the Bayer mode image;
    其中,所述第二模型用于对未超分的拜耳模式的图像的四通道图像进行超分,输出提升分辨率后的四通道图像。Wherein, the second model is used to super-divide the four-channel image of the unsuper-divided Bayer mode image, and output the four-channel image with increased resolution.
  7. 根据权利要求6所述的方法,其特征在于,所述第一处理包括第一畸变校正处理,所述第二处理包括第二畸变校正处理;The method according to claim 6, wherein the first processing includes a first distortion correction processing, and the second processing includes a second distortion correction processing;
    所述第一畸变校正处理的参数包括：第一畸变曲线；所述第二畸变校正处理的参数包括：第二畸变曲线；所述第二畸变曲线中的任一采样点(X_0, Y_0)在所述超分目标生图像中对应的像素的坐标为(X_Ai, Y_Ai)，所述第一畸变曲线中的任一采样点(X_0, Y_0)在所述目标生图像中对应的像素的坐标为(X_Bi, Y_Bi)；The parameters of the first distortion correction processing include a first distortion curve, and the parameters of the second distortion correction processing include a second distortion curve; any sampling point (X_0, Y_0) on the second distortion curve corresponds to a pixel at coordinates (X_Ai, Y_Ai) in the super-divided target raw image, and the same sampling point (X_0, Y_0) on the first distortion curve corresponds to a pixel at coordinates (X_Bi, Y_Bi) in the target raw image;
    其中，X_Ai = (X_Bi - (W_f/2 + 0.5)) * K + (W_f*K/2 + 0.5)；Y_Ai = (Y_Bi - (H_f/2 + 0.5)) * K + (H_f*K/2 + 0.5)；W_f表示所述目标生图像的宽，H_f表示所述目标生图像的高，K表示所述超分彩色图像与所述目标区域的图像的分辨率比值。where X_Ai = (X_Bi - (W_f/2 + 0.5)) * K + (W_f*K/2 + 0.5); Y_Ai = (Y_Bi - (H_f/2 + 0.5)) * K + (H_f*K/2 + 0.5); W_f represents the width of the target raw image, H_f represents the height of the target raw image, and K represents the ratio of the resolution of the super-division color image to the resolution of the image of the target region.
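The mapping in claim 7 scales each sampling point's offset from the image center by the resolution ratio K, so that the second distortion curve addresses pixels of the super-divided image. A sketch of the stated formula in plain Python (the function name is illustrative, not part of the claim):

```python
def map_distortion_sample(x_b, y_b, w_f, h_f, k):
    """Map the pixel coordinate (x_b, y_b) that a distortion-curve
    sampling point hits in the original raw image (width w_f, height
    h_f) to the corresponding coordinate in the K-times super-divided
    image, per claim 7:
        X_Ai = (X_Bi - (W_f/2 + 0.5)) * K + (W_f*K/2 + 0.5)
    i.e. the offset from the image center is scaled by K and re-centered
    on the center of the enlarged image."""
    x_a = (x_b - (w_f / 2 + 0.5)) * k + (w_f * k / 2 + 0.5)
    y_a = (y_b - (h_f / 2 + 0.5)) * k + (h_f * k / 2 + 0.5)
    return x_a, y_a
```

For example, with W_f = H_f = 100 and K = 2, the center coordinate (50.5, 50.5) maps to the new center (100.5, 100.5), while (0.5, 0.5) maps to (0.5, 0.5): distances from the center scale by K and the top-left corner is preserved.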
  8. 根据权利要求1至7任一所述的方法,其特征在于,在所述对所述目标生图像中目标区域的图像进行超分,得到超分彩色图像之前,所述方法还包括:The method according to any one of claims 1 to 7, characterized in that, before the super-division of the image of the target area in the target raw image to obtain a super-division color image, the method further comprises:
    确定所述目标生图像中包含所述目标物体的所述目标区域。Determining the target area containing the target object in the target raw image.
  9. 根据权利要求8所述的方法，其特征在于，所述目标生图像为生图像视频中的第m+1帧图像，m≥1，在所述确定所述目标生图像中包含所述目标物体的所述目标区域之前，所述方法还包括:The method according to claim 8, wherein the target raw image is the (m+1)-th frame in a raw image video, m ≥ 1, and before the determining of the target region that contains the target object in the target raw image, the method further comprises:
    对所述生图像视频中的第m帧图像进行第三处理,得到所述第m帧图像的彩色图像;Performing a third process on the m-th frame image in the raw image video to obtain a color image of the m-th frame image;
    确定所述第m帧图像的彩色图像中包含所述目标物体的第三区域;Determining a third region that contains the target object in the color image of the m-th frame;
    所述确定所述目标生图像中包含所述目标物体的所述目标区域,包括:The determining the target area in the target image containing the target object includes:
    确定所述目标生图像中对应所述第三区域的所述目标区域。Determining the target region in the target raw image corresponding to the third region.
  10. 根据权利要求8或9所述的方法,其特征在于,在所述对所述目标生图像中目标区域的图像进行超分,得到超分彩色图像之前,所述方法还包括:The method according to claim 8 or 9, characterized in that, before the super-division of the image of the target region in the target raw image to obtain a super-division color image, the method further comprises:
    在所述目标生图像包含多个目标区域，且所述多个目标区域中存在满足替换条件的两个目标区域时，将所述两个目标区域替换为备选目标区域，得到更新的所述多个目标区域;When the target raw image includes multiple target regions and two of the target regions satisfy a replacement condition, replacing the two target regions with a candidate target region to obtain an updated plurality of target regions;
    其中,所述替换条件包括:所述两个目标区域至少部分重合,且所述两个目标区域的面积之和大于所述备选目标区域的面积;Wherein, the replacement condition includes: the two target regions at least partially overlap, and the sum of the areas of the two target regions is greater than the area of the candidate target region;
    所述两个目标区域中一个目标区域的左上角坐标为(X_11, Y_11)，且右下角坐标为(X_12, Y_12)；所述两个目标区域中另一个目标区域的左上角坐标为(X_21, Y_21)，且右下角坐标为(X_22, Y_22)；所述备选目标区域的左上角坐标为(X_M1, Y_M1)，且右下角坐标为(X_M2, Y_M2)；X_M1为X_11和X_21的最小值；Y_M1为Y_11和Y_21的最小值；X_M2为X_12和X_22的最大值；Y_M2为Y_12和Y_22的最大值。The upper-left corner of one of the two target regions has coordinates (X_11, Y_11) and its lower-right corner has coordinates (X_12, Y_12); the upper-left corner of the other target region has coordinates (X_21, Y_21) and its lower-right corner has coordinates (X_22, Y_22); the upper-left corner of the candidate target region has coordinates (X_M1, Y_M1) and its lower-right corner has coordinates (X_M2, Y_M2); X_M1 is the minimum of X_11 and X_21; Y_M1 is the minimum of Y_11 and Y_21; X_M2 is the maximum of X_12 and X_22; Y_M2 is the maximum of Y_12 and Y_22.
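Claim 10's replacement condition merges two target regions into the smallest axis-aligned box containing both, but only when the regions at least partially overlap and their combined area exceeds that candidate box's area. A plain-Python sketch (the function name is illustrative, and areas are computed as (x2 - x1) * (y2 - y1), a coordinate convention the claim does not specify):

```python
def merge_if_replaceable(box1, box2):
    """Apply the replacement condition of claim 10 to two target regions
    given as (x1, y1, x2, y2) upper-left / lower-right boxes. Returns
    the candidate (merged) box if the two regions partially overlap and
    the sum of their areas exceeds the candidate's area, else None."""
    x11, y11, x12, y12 = box1
    x21, y21, x22, y22 = box2

    # Candidate region: component-wise min of upper-left corners,
    # component-wise max of lower-right corners (as in the claim).
    cand = (min(x11, x21), min(y11, y21), max(x12, x22), max(y12, y22))

    # Partial-overlap test on the two original boxes.
    overlap = (min(x12, x22) > max(x11, x21)) and (min(y12, y22) > max(y11, y21))

    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    if overlap and area(box1) + area(box2) > area(cand):
        return cand
    return None
```

Note that the area condition only holds when the overlap is substantial: for boxes (0, 0, 10, 10) and (2, 2, 12, 12) the merged box (0, 0, 12, 12) is returned, whereas for the weakly overlapping pair (0, 0, 10, 10) and (5, 5, 15, 15) the combined area (200) does not exceed the candidate's area (225) and no replacement occurs.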
  11. 一种物体识别装置,其特征在于,所述物体识别装置包括:An object recognition device, characterized in that the object recognition device includes:
    获取模块，用于获取图像传感器产生的目标生图像;The acquisition module is configured to acquire a target raw image generated by an image sensor;
    超分模块，用于对所述目标生图像中目标区域的图像进行超分，得到超分彩色图像，其中，所述目标区域包括所述目标生图像中的至少部分区域，所述超分彩色图像的分辨率大于所述目标区域的图像的分辨率;The super-division module is configured to perform super-division on an image of a target region in the target raw image to obtain a super-division color image, wherein the target region includes at least part of the target raw image, and the resolution of the super-division color image is greater than the resolution of the image of the target region;
    识别模块,用于基于所述超分彩色图像进行目标物体的识别。The recognition module is used to recognize the target object based on the super-division color image.
  12. 根据权利要求11所述的物体识别装置,其特征在于,所述目标生图像为拜耳模式的图像。The object recognition device according to claim 11, wherein the target raw image is a Bayer pattern image.
  13. 根据权利要求11或12所述的物体识别装置,其特征在于,所述物体识别装置还包括:The object recognition device according to claim 11 or 12, wherein the object recognition device further comprises:
    第一处理模块,用于对所述目标生图像进行第一处理,得到所述目标生图像的彩色图像;The first processing module is configured to perform first processing on the target image to obtain a color image of the target image;
    第一确定模块,用于确定所述目标生图像的彩色图像中包含所述目标物体的第一区域;A first determining module, configured to determine a first region in the color image of the target image that contains the target object;
    其中,所述识别模块用于:Wherein, the identification module is used for:
    基于所述第一区域,确定所述超分彩色图像中的第二区域;Determining a second area in the super-division color image based on the first area;
    基于所述超分彩色图像中所述第二区域的图像,进行所述目标物体的识别。Based on the image of the second region in the super-division color image, the target object is recognized.
  14. 根据权利要求13所述的物体识别装置,其特征在于,所述目标生图像为拜耳模式的图像,所述超分模块包括:The object recognition device according to claim 13, wherein the target raw image is a Bayer pattern image, and the super-division module comprises:
    超分子模块，用于对所述目标区域的图像进行超分，得到拜耳模式图像;The super-division submodule is configured to perform super-division on the image of the target region to obtain a Bayer mode image;
    处理子模块,用于对所述拜耳模式图像进行第二处理,得到所述超分彩色图像。The processing sub-module is configured to perform a second processing on the Bayer mode image to obtain the super-division color image.
  15. 根据权利要求11至14任一所述的物体识别装置,其特征在于,所述物体识别装置还包括:The object recognition device according to any one of claims 11 to 14, wherein the object recognition device further comprises:
    第二确定模块,用于确定所述目标生图像中包含所述目标物体的所述目标区域。The second determining module is configured to determine the target area containing the target object in the target raw image.
  16. 根据权利要求15所述的物体识别装置,其特征在于,所述目标生图像为生图像视频中的第m+1帧图像,m≥1,所述物体识别装置还包括:The object recognition device according to claim 15, wherein the target raw image is the m+1th frame image in the raw image video, m≥1, and the object recognition device further comprises:
    第三处理模块,用于对所述生图像视频中的第m帧图像进行第三处理,得到所述第m帧图像的彩色图像;The third processing module is configured to perform third processing on the m-th frame image in the raw image video to obtain a color image of the m-th frame image;
    第三确定模块,用于确定所述第m帧图像的彩色图像中包含所述目标物体的第三区域;A third determining module, configured to determine a third region containing the target object in the color image of the m-th frame of image;
    所述第二确定模块用于:确定所述目标生图像中对应所述第三区域的所述目标区域。The second determining module is configured to determine the target area corresponding to the third area in the target image.
  17. 一种物体识别装置，其特征在于，所述物体识别装置包括：处理器和接口，所述处理器用于通过所述接口从图像传感器获取生图像，所述处理器用于运行程序，以使得所述物体识别装置执行如权利要求1至10任一项所述的物体识别方法。An object recognition device, characterized in that the object recognition device comprises a processor and an interface, the processor being configured to acquire a raw image from an image sensor through the interface and to run a program, so that the object recognition device performs the object recognition method according to any one of claims 1 to 10.
  18. 一种计算机存储介质,其特征在于,所述存储介质内存储有计算机程序,所述计算机程序用于执行权利要求1至10任一项所述的物体识别方法。A computer storage medium, characterized in that a computer program is stored in the storage medium, and the computer program is used to execute the object recognition method according to any one of claims 1 to 10.
PCT/CN2020/111141 2020-01-21 2020-08-25 Object recognition method and device WO2021147316A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010071857.8 2020-01-21
CN202010071857.8A CN111242087B (en) 2020-01-21 2020-01-21 Object identification method and device

Publications (1)

Publication Number Publication Date
WO2021147316A1 true WO2021147316A1 (en) 2021-07-29

Family

ID=70872955

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/111141 WO2021147316A1 (en) 2020-01-21 2020-08-25 Object recognition method and device

Country Status (2)

Country Link
CN (1) CN111242087B (en)
WO (1) WO2021147316A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111242087B (en) * 2020-01-21 2024-06-07 华为技术有限公司 Object identification method and device

Citations (4)

Publication number Priority date Publication date Assignee Title
US20100254630A1 (en) * 2009-04-03 2010-10-07 Sony Corporation Method and apparatus for forming super resolution images from raw data representative of color filter array images
CN103810685A (en) * 2014-02-25 2014-05-21 清华大学深圳研究生院 Super resolution processing method for depth image
CN109886875A (en) * 2019-01-31 2019-06-14 深圳市商汤科技有限公司 Image super-resolution rebuilding method and device, storage medium
CN111242087A (en) * 2020-01-21 2020-06-05 华为技术有限公司 Object recognition method and device

Family Cites Families (14)

Publication number Priority date Publication date Assignee Title
KR101652722B1 (en) * 2010-01-15 2016-09-01 삼성전자주식회사 Image interpolating method by bayer-pattern-converting signal and program recording medium
KR101243285B1 (en) * 2011-08-19 2013-03-13 한경대학교 산학협력단 Apparatus and method for reconstructing color image based on multi-spectrum using Bayer Color Filter Array camera
CN104252703B (en) * 2014-09-04 2017-05-03 吉林大学 Wavelet preprocessing and sparse representation-based satellite remote sensing image super-resolution reconstruction method
US9665927B2 (en) * 2015-06-03 2017-05-30 Samsung Electronics Co., Ltd. Method and apparatus of multi-frame super resolution robust to local and global motion
JP6602743B2 (en) * 2016-12-08 2019-11-06 株式会社ソニー・インタラクティブエンタテインメント Information processing apparatus and information processing method
KR102495753B1 (en) * 2017-10-10 2023-02-03 삼성전자주식회사 Method and electronic device for processing raw image acquired by using camera by using external electronic device
CN110555800A (en) * 2018-05-30 2019-12-10 北京三星通信技术研究有限公司 image processing apparatus and method
CN109190520A (en) * 2018-08-16 2019-01-11 广州视源电子科技股份有限公司 A kind of super-resolution rebuilding facial image method and device
CN109543548A (en) * 2018-10-26 2019-03-29 桂林电子科技大学 A kind of face identification method, device and storage medium
CN109871902B (en) * 2019-03-08 2022-12-13 哈尔滨工程大学 SAR small sample identification method based on super-resolution countermeasure generation cascade network
CN110298790A (en) * 2019-06-28 2019-10-01 北京金山云网络技术有限公司 A kind of pair of image carries out the processing method and processing device of super-resolution rebuilding
CN110414372A (en) * 2019-07-08 2019-11-05 北京亮亮视野科技有限公司 Method for detecting human face, device and the electronic equipment of enhancing
CN110366034A (en) * 2019-07-18 2019-10-22 浙江宇视科技有限公司 A kind of super-resolution image processing method and processing device
CN110503618A (en) * 2019-08-30 2019-11-26 维沃移动通信有限公司 Image processing method and electronic equipment


Non-Patent Citations (1)

Title
XU XIANGYU; MA YONGRUI; SUN WENXIU: "Towards Real Scene Super-Resolution With Raw Images", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 15 June 2019 (2019-06-15), pages 1723 - 1731, XP033686780, DOI: 10.1109/CVPR.2019.00182 *

Also Published As

Publication number Publication date
CN111242087A (en) 2020-06-05
CN111242087B (en) 2024-06-07


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 20915955; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 20915955; Country of ref document: EP; Kind code of ref document: A1)