WO2024001345A1 - Image processing method, electronic device, and computer storage medium - Google Patents

Image processing method, electronic device, and computer storage medium Download PDF

Info

Publication number
WO2024001345A1
WO2024001345A1 · PCT/CN2023/084039
Authority
WO
WIPO (PCT)
Prior art keywords
area
motion
pixels
processed
pixel
Prior art date
Application number
PCT/CN2023/084039
Other languages
French (fr)
Chinese (zh)
Inventor
游晶
陈杰
孔德辉
徐科
Original Assignee
深圳市中兴微电子技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市中兴微电子技术有限公司
Publication of WO2024001345A1

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103 Selection of coding mode or of prediction mode
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/142 Detection of scene cut or scene change
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation

Definitions

  • the present disclosure relates to, but is not limited to, the technical field of image processing.
  • Motion estimation is a technology widely used in video encoding/decoding and video processing (such as deinterlacing).
  • In conventional video codecs, motion estimation is usually performed on prediction units (PUs), and PUs are usually partitioned crudely, directly from position information; motion estimation at the PU level therefore inevitably suffers from low accuracy.
  • Moreover, conventional video codecs usually rely on global motion estimation, which is not only time-consuming but also requires considerable bandwidth support; as video quality and resolution keep improving, the bandwidth demand grows even larger.
  • the present disclosure provides an image processing method, an electronic device, and a computer storage medium.
  • the present disclosure provides an image processing method, which includes: dividing an image frame to be processed into a static area and a suspected motion area; determining the motion vector information of each pixel in the suspected motion area, and dividing the pixels into moving pixels and stationary pixels according to their motion vector information; marking the stationary pixels and all pixels in the static area with a static state, and marking the moving pixels with a motion state and the corresponding motion vector information; and performing video encoding/decoding processing on the marked image frame to be processed.
  • the present disclosure provides an electronic device, including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement any image processing method described herein.
  • the present disclosure provides a computer storage medium on which a computer program is stored, wherein when the computer program is executed by a processor, it causes the processor to implement any of the image processing methods described herein.
  • Figure 1 is a schematic flowchart of the image processing method provided by the present disclosure;
  • Figure 2 is a schematic flowchart of the image processing method provided by the present disclosure;
  • Figure 3 is a schematic diagram of block matching provided by the present disclosure;
  • Figure 4 is a schematic flowchart of the image processing method provided by the present disclosure;
  • Figure 5 is a schematic flowchart of the image processing method provided by the present disclosure;
  • Figure 6 is a schematic flowchart of the image processing method provided by the present disclosure.
  • the embodiments of the present disclosure observe that some local-motion scenes (such as live-broadcast scenes) share a common feature: most of the frame is actually static, and only a small part is in motion.
  • Preliminary detection can therefore be performed first to identify the static area and the suspected motion area; further motion detection and local motion estimation are then performed only in the suspected motion area to determine the motion vector information of its pixels, the suspected motion area is further divided into stationary pixels and moving pixels, the motion states of both kinds of pixels are marked, and video encoding/decoding is performed directly based on those motion-state marks.
  • the present disclosure provides an image processing method, which may include the following steps S11 to S14.
  • step S11 a static area and a suspected motion area are divided from the image frame to be processed.
  • step S12 the motion vector information of each pixel in the suspected motion area is determined, and each pixel is divided into a moving pixel and a stationary pixel according to the motion vector information of each pixel.
  • step S13 the stationary pixels and all pixels in the static area are marked with a static state, and the moving pixels are marked with a motion state and the corresponding motion vector information.
  • step S14 video encoding and decoding processing is performed on the marked image frame to be processed.
  • both the static area and the suspected motion area include multiple pixels.
  • the static area refers to the area where the pixels do not move
  • the suspected motion area refers to the area where the pixels are suspected of moving.
  • Dividing the image frame to be processed into a static area and a suspected motion area is a motion detection process, which can be carried out by any traditional image processing operation or any deep learning neural network capable of motion detection, for example an image segmentation network, an MSE (Mean Square Error) operation, an MAE (Mean Absolute Error) operation, an SAD (Sum of Absolute Differences) operation, or a frame difference calculation.
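As a concrete illustration of the frame-difference option above, the following sketch splits a frame into static and suspected-motion blocks by the per-block mean absolute difference (MAE) against a reference frame. The function name, block size, and threshold are illustrative assumptions, not values fixed by the disclosure.

```python
import numpy as np

def split_static_and_suspected(frame, ref_frame, block=16, threshold=8.0):
    """Divide a grayscale frame into static / suspected-motion areas by
    per-block mean absolute frame difference (MAE) against a reference frame.
    Returns a boolean mask per block: True = suspected motion, False = static.
    Block size and threshold are illustrative assumptions."""
    h, w = frame.shape
    bh, bw = h // block, w // block
    mask = np.zeros((bh, bw), dtype=bool)
    for by in range(bh):
        for bx in range(bw):
            cur = frame[by*block:(by+1)*block, bx*block:(bx+1)*block].astype(np.int32)
            ref = ref_frame[by*block:(by+1)*block, bx*block:(bx+1)*block].astype(np.int32)
            # a block whose MAE reaches the threshold is "suspected" of motion
            mask[by, bx] = np.abs(cur - ref).mean() >= threshold
    return mask
```

MSE or SAD could be substituted for MAE on the same loop structure; only the per-block statistic changes.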
  • Determining the motion vector information of each pixel in the suspected motion area, and dividing the pixels into moving pixels and stationary pixels based on that information, is a local motion estimation process, which can be performed by any traditional image processing operation or deep learning neural network capable of motion estimation, such as an image block matching method, an optical flow method, or an optical flow network.
  • The image processing method provided by the embodiments of the present disclosure divides the image frame to be processed into a static area and a suspected motion area and performs local motion estimation only on the suspected motion area, so as to determine the motion vector information of each pixel there and divide those pixels into moving pixels and stationary pixels.
  • The stationary pixels and all pixels in the static area are marked with a static state, the moving pixels are marked with a motion state and the corresponding motion vector information, and video encoding/decoding is then performed on the marked image frame to be processed, without performing global motion estimation on the whole frame. This shortens the duration of motion estimation and improves the efficiency of image processing. Furthermore, the identified stationary pixels (including the pixels in the static area) require less bandwidth during video encoding/decoding, while the moving pixels require more; processing the two kinds of pixels in this targeted way also saves bandwidth resources and relieves the pressure of video transmission.
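The marking of step S13 could be represented, for example, as a simple per-pixel map. The sketch below is an assumed representation for illustration only (the disclosure does not fix a data structure): stationary pixels get a static mark with a zero motion vector, moving pixels get a motion mark plus their motion vector.

```python
import numpy as np

def mark_pixels(static_mask, motion_vectors):
    """Build per-pixel marks as in step S13 (representation is an assumption).
    `static_mask` flags the stationary pixels plus all pixels of the static
    area; `motion_vectors` maps each moving pixel (y, x) to its motion vector."""
    marks = {}
    # static pixels: static state, zero motion vector
    for y, x in zip(*np.nonzero(static_mask)):
        marks[(int(y), int(x))] = ("static", (0, 0))
    # moving pixels: motion state plus the corresponding motion vector
    for (y, x), mv in motion_vectors.items():
        marks[(y, x)] = ("motion", mv)
    return marks
```

The codec stage would then read these marks instead of re-estimating motion globally.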
  • determining the motion vector information of each pixel in the suspected motion area may include the following steps S121 to S123.
  • step S121 the suspected motion area is divided into a plurality of non-overlapping macroblocks.
  • step S122 for each macroblock, the matching block of the current macroblock is determined from the reference frame corresponding to the current macroblock.
  • step S123 the motion vector information of all pixels in each macro block is determined based on each macro block and the matching block of each macro block.
  • a macroblock usually consists of one luminance pixel block and two additional chrominance pixel blocks.
  • the reference frame corresponding to the macroblock refers to the reference frame of the image frame where the macroblock is located.
  • The type and number of reference frames are related to the type of the current frame. For example, when the current frame is a P frame, the reference frame is an I frame or P frame preceding it; when the current frame is a B frame, the reference frames are the preceding and/or following I and/or P frames. This will not be described further in the embodiments of this disclosure.
  • the suspected motion area is divided into multiple non-overlapping macro blocks, and it is considered that all pixels in each macro block have the same motion vector information.
  • the block most similar to the macroblock is searched for in the reference frame; this block is called the matching block of the macroblock.
  • the SAD algorithm can be used to calculate the similarity and determine the most similar block.
  • the algorithm is simple and fast.
  • the motion vector information corresponding to the macroblock can be determined based on the macroblock and the matching blocks of the macroblock, that is, the motion vector information of all pixels in the macroblock.
  • Figure 3, a block matching schematic provided by the present disclosure, takes a certain macroblock in the suspected motion area (called the current block) as an example.
  • The center point of the current block is the point (x, y) shown in the figure; within the search range (Search Region), the matching block most similar to the current block is searched for, and the center point of that matching block is (x1, y1).
  • The geometric coordinate difference between the center point of the current block and the center point of the matching block can be used as the motion vector (Motion Vector) from the current block to the matching block, and can also be used as the motion vector of all pixels in the current block.
  • Determining the motion vector information of all pixels in each macroblock based on the macroblock and its matching block may include the following step: for each macroblock, the geometric coordinate difference between the center point of its matching block and its own center point is determined as the motion vector information of all pixels in the current macroblock.
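A minimal sketch of this block matching step, using SAD as the similarity measure mentioned earlier. The function name, the exhaustive full-search strategy, and the search-window size are assumptions; the disclosure does not fix a search strategy.

```python
import numpy as np

def match_block_sad(ref, block, top, left, search=4):
    """Full search over a +/-`search` window in the reference frame for the
    candidate with minimum SAD. Returns (dy, dx): the coordinate difference
    between the matching block's position and the current block's position,
    which equals the difference between their center points."""
    bh, bw = block.shape
    best_sad, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # skip candidates falling outside the reference frame
            if y < 0 or x < 0 or y + bh > ref.shape[0] or x + bw > ref.shape[1]:
                continue
            cand = ref[y:y+bh, x:x+bw].astype(np.int32)
            sad = np.abs(cand - block.astype(np.int32)).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv
```

In practice encoders use faster search patterns (three-step, diamond) over the same SAD criterion; the full search here just keeps the sketch short and unambiguous.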
  • When the motion vector is not zero, the pixel can be said to have certainly moved. When the motion vector is zero, however, this alone is not sufficient to show that the pixel is not moving; the frame difference between the image frame to be processed in which the pixel is located and its reference frame must also be considered for further judgment.
  • Dividing the pixels into moving pixels and stationary pixels according to their motion vector information includes: determining the pixels that satisfy a preset condition as stationary pixels, and determining the remaining pixels as moving pixels, wherein the preset condition is that the motion vector information is zero and the frame difference between the image frame to be processed in which the pixel is located and its reference frame is less than a preset threshold.
  • In other words, the pixels in the suspected motion area whose motion vector information is zero and whose corresponding frame difference is less than the preset threshold are determined to be stationary pixels; the pixels whose motion vector information is zero but whose frame difference is greater than or equal to the threshold, and the pixels whose motion vector is not zero (regardless of the frame difference), are determined to be moving pixels.
  • the frame difference between the image frame to be processed where the pixel is located and its reference frame refers to the average difference between each pixel in the image frame to be processed where the pixel is located and each pixel in the reference frame, that is, the average pixel difference value.
  • When the motion vector information is zero and the frame difference between the image frame to be processed in which the pixel is located and its reference frame is less than the preset threshold, it can reasonably be considered that the pixel has not moved and is a stationary pixel.
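The preset condition above reduces to a one-line predicate. In the sketch below, the function name and the threshold value are illustrative assumptions; the logic is the condition stated in the text: static only when the motion vector is zero and the frame difference is below the threshold.

```python
def classify_pixel(motion_vector, frame_diff, threshold=2.0):
    """Return 'static' only if the motion vector is (0, 0) AND the frame
    difference between the frame to be processed and its reference frame is
    below the threshold; otherwise 'moving'. Threshold value is assumed."""
    dy, dx = motion_vector
    if dy == 0 and dx == 0 and frame_diff < threshold:
        return "static"
    return "moving"
```

Note that a nonzero motion vector makes the pixel moving regardless of the frame difference, matching the text.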
  • The static area and the suspected motion area can be divided in two ways: an image segmentation algorithm can be used to segment the static area and the suspected motion area from each image frame, or multiple image frames can be classified directly through motion pre-detection, with each image frame determined as a whole to be a still area or a suspected motion area.
  • the static area includes a background area and a static target area
  • the suspected motion area includes a moving target area; as shown in Figure 4, dividing the static area and the suspected motion area from the image frame to be processed (i.e., step S11) may include the following steps S111 to S113.
  • step S111 the image frame to be processed is divided into a foreground area and a background area.
  • step S112 targets in each foreground area are identified.
  • step S113 each foreground area is divided into a stationary target area and a moving target area according to the targets in each foreground area.
  • Segmenting the image frame to be processed into a foreground area and a background area and identifying targets in each foreground area can be performed by any traditional image processing operation or deep learning neural network capable of image segmentation, for example an FCN (Fully Convolutional Network), SegNet, or U-Net.
  • the foreground area usually refers to the area containing local motion
  • the target usually refers to the subject in the image, such as a person, animal, or plant; this will not be elaborated further in the embodiments of the present disclosure.
  • each foreground area is divided into a stationary target area and a moving target area
  • the stationary target area and the background area are directly used as static areas; the pixels in these areas are considered to have no motion, so no motion estimation is required for them.
  • the moving target area is a suspected moving area and requires motion estimation to further determine whether there is motion in each pixel in the moving target area.
  • Dividing each foreground area into a stationary target area and a moving target area according to the targets in each foreground area (i.e., step S113) may include the following steps S1131 and S1132.
  • step S1131 for any target in any foreground area, when the current target is detected to be in motion, a preset-range area within the current foreground area centered on the current target is determined as a moving target area.
  • step S1132 all areas in each foreground area except the moving target area are determined as the stationary target areas.
  • Detecting whether a target is moving can be carried out with some simple image processing methods, for example by comparing the geometric position change of the target between the previous and next frames, i.e., the frames immediately preceding and following the image frame to be processed in which the target is located.
  • The preset-range area centered on a target must contain at least the entire target. For each target detected to be moving, such a preset-range area becomes a moving target area; moving target areas may intersect. Only after all moving target areas have been determined are the remaining areas treated as stationary target areas.
  • The reason the stationary target areas are determined only after all moving target areas, rather than marking a preset-range area around each non-moving target as stationary while the targets are processed in sequence, is that a stationary target area determined later could otherwise cover a moving target area determined earlier, causing the moving target area to be misrecognized as stationary. Determining all moving target areas first and only then treating everything else as stationary avoids this misrecognition, reduces the risk of identification errors, and improves identification accuracy.
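The two-pass order described above can be sketched as follows. All names and the `pad` half-size of the preset range are assumptions; the point of the sketch is the ordering: moving-target regions are collected first, and only afterwards is everything outside them treated as stationary, so a stationary label can never overwrite a moving-target region.

```python
def label_foreground_areas(targets, moved_flags, pad=8):
    """Two-pass labelling: `targets` are (cy, cx) centers, `moved_flags` says
    which targets were detected as moving. Returns the moving-target regions
    (as top/left/bottom/right boxes) and a predicate for the second pass."""
    moving_regions = []
    # pass 1: gather ALL moving-target regions before labelling anything else
    for (cy, cx), moved in zip(targets, moved_flags):
        if moved:
            moving_regions.append((cy - pad, cx - pad, cy + pad, cx + pad))

    # pass 2: any point not inside some moving-target region is stationary
    def is_moving(y, x):
        return any(t <= y <= b and l <= x <= r
                   for (t, l, b, r) in moving_regions)

    return moving_regions, is_moving
```

A non-moving target whose preset range overlaps a moving target's range is thus still labelled moving where the ranges intersect, which is exactly the misrecognition the two-pass order prevents.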
  • multiple image frames can also be directly classified by performing motion pre-detection.
  • Motion pre-detection can use traditional image processing operations such as calculating frame differences.
  • the number of image frames to be processed is more than one; as shown in Figure 6, dividing the still areas and suspected motion areas from the image frames to be processed (i.e., step S11) may include the following steps S111' and S112'.
  • step S111' the frame difference between each image frame to be processed and the corresponding reference frame is determined.
  • step S112' each of the image frames to be processed is divided into the still area and the suspected motion area according to each of the frame differences.
  • the frame difference between the image frame to be processed and the corresponding reference frame refers to the average value of the difference between each pixel in the image frame to be processed and each pixel in the reference frame, that is, the average pixel difference value.
  • Dividing each image frame to be processed into the still area or the suspected motion area according to the frame differences may include: determining the image frames whose frame difference is greater than or equal to a preset motion/stillness discrimination threshold as suspected motion areas, and determining the image frames whose frame difference is less than that threshold as still areas.
  • A frame difference smaller than the preset motion/stillness discrimination threshold can be expressed as: frame_diff < threshold.
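This per-frame classification follows directly from the `frame_diff < threshold` criterion. The function name and threshold value below are illustrative assumptions; the frame difference is computed as the text defines it, the mean absolute pixel difference against the reference frame.

```python
import numpy as np

def classify_frames(frames, refs, threshold=4.0):
    """Label each whole frame 'still' or 'suspected_motion' by comparing its
    mean absolute pixel difference against the reference frame with a
    motion/stillness discrimination threshold. Threshold value is assumed."""
    labels = []
    for cur, ref in zip(frames, refs):
        # frame difference = average per-pixel absolute difference
        frame_diff = np.abs(cur.astype(np.int32) - ref.astype(np.int32)).mean()
        labels.append("still" if frame_diff < threshold else "suspected_motion")
    return labels
```

Frames labelled "suspected_motion" would then go on to the per-pixel motion estimation of step S12, while "still" frames skip it entirely.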
  • the present disclosure also provides an electronic device, including: one or more processors; a storage device on which one or more programs are stored; when the one or more programs are processed by the one or more processors, When executed, the one or more processors are caused to implement the image processing method as described above.
  • the present disclosure also provides a computer storage medium on which a computer program is stored, wherein when the program is executed by a processor, it causes the processor to implement the image processing method as described above.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disk (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
  • Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as will be apparent to those skilled in the art, features, characteristics and/or elements described in connection with a particular embodiment may be used alone, or in combination with features, characteristics and/or elements described in connection with other embodiments, unless expressly stated otherwise. Accordingly, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present application provides an image processing method. The method comprises: dividing a static area and a suspected motion area from an image frame to be processed; determining motion vector information of pixels in the suspected motion area, and dividing the pixels into motion pixels and static pixels according to the motion vector information of the pixels; marking the static states of the static pixels and all pixels in the static area, and marking the motion states and corresponding motion vector information of the motion pixels; and performing video encoding and decoding processing on the marked image frame to be processed. The present application also provides an electronic device and a computer storage medium.

Description

图像处理方法、电子设备及计算机存储介质Image processing methods, electronic equipment and computer storage media
相关申请的交叉引用Cross-references to related applications
本申请要求2022年6月30日提交给中国专利局的第202210761067.1号专利申请的优先权,其全部内容通过引用合并于此。This application claims priority from Patent Application No. 202210761067.1 filed with the China Patent Office on June 30, 2022, the entire content of which is incorporated herein by reference.
技术领域Technical field
本公开涉及但不限于图像处理技术领域。The present disclosure relates to, but is not limited to, the technical field of image processing.
背景技术Background technique
运动估计(Motion Estimation)是视频编解码和视频处理(例如去交织)中广泛使用的一种技术。在传统的视频编解码技术中,运动估计通常是基于划分预测单元(PU)进行的,而划分PU通常又是直接根据位置信息进行粗暴的分割,因此在进行运动估计时,不可避免地会出现PU的运动估计准确性较低的问题。并且,传统的视频编解码技术通常采用的是全局运动估计,全局运动估计不仅耗时长,而且还需要较大的带宽支持,再加之视频质量、视频分辨率的不断提升,对带宽的要求更大。Motion estimation (Motion Estimation) is a technology widely used in video encoding and decoding and video processing (such as deinterleaving). In traditional video coding and decoding technology, motion estimation is usually based on dividing prediction units (PUs), and dividing PUs is usually crudely segmented directly based on position information. Therefore, when performing motion estimation, inevitable problems will occur. The problem of low motion estimation accuracy of PU. Moreover, traditional video encoding and decoding technology usually uses global motion estimation. Global motion estimation is not only time-consuming, but also requires larger bandwidth support. In addition, with the continuous improvement of video quality and video resolution, the requirements for bandwidth are even greater. .
发明内容Contents of the invention
本公开提供一种图像处理方法、一种电子设备及一种计算机存储介质。The present disclosure provides an image processing method, an electronic device, and a computer storage medium.
第一方面,本公开提供一种图像处理方法,所述方法包括:从待处理图像帧中划分出静止区域和疑似运动区域;确定出所述疑似运动区域中各像素的运动矢量信息,并根据所述各像素的运动矢量信息将所述各像素划分为运动像素和静止像素;对所述静止像素以及所述静止区域中的所有像素进行静止状态的标记,对所述运动像素进行运动状态以及相应的所述运动矢量信息的标记;对标记后的所述待处理图像帧进行视频编解码处理。 In a first aspect, the present disclosure provides an image processing method, which method includes: dividing a static area and a suspected moving area from an image frame to be processed; determining the motion vector information of each pixel in the suspected moving area, and based on The motion vector information of each pixel divides each pixel into a moving pixel and a static pixel; the static pixel and all pixels in the static area are marked as static, and the moving pixel is marked as moving. Mark the corresponding motion vector information; perform video encoding and decoding processing on the marked image frame to be processed.
第二方面,本公开提供一种电子设备,包括:一个或多个处理器;存储装置,其上存储有一个或多个程序;当所述一个或多个程序被所述一个或多个处理器执行时,使得所述一个或多个处理器实现本文所述的任一图像处理方法。In a second aspect, the present disclosure provides an electronic device, including: one or more processors; a storage device having one or more programs stored thereon; when the one or more programs are processed by the one or more When the processor is executed, the one or more processors are caused to implement any image processing method described herein.
第三方面,本公开提供一种计算机存储介质,其上存储有计算机程序,其中,所述计算机程序被处理器执行时,使得所述处理器实现本文所述的任一图像处理方法。In a third aspect, the present disclosure provides a computer storage medium on which a computer program is stored, wherein when the computer program is executed by a processor, it causes the processor to implement any of the image processing methods described herein.
附图说明Description of drawings
图1是本公开提供的图像处理方法的流程示意图;Figure 1 is a schematic flowchart of the image processing method provided by the present disclosure;
图2是本公开提供的图像处理方法的流程示意图;Figure 2 is a schematic flowchart of the image processing method provided by the present disclosure;
图3是本公开提供的块匹配示意图;Figure 3 is a schematic diagram of block matching provided by the present disclosure;
图4是本公开提供的图像处理方法的流程示意图;Figure 4 is a schematic flowchart of the image processing method provided by the present disclosure;
图5是本公开提供的图像处理方法的流程示意图;Figure 5 is a schematic flowchart of the image processing method provided by the present disclosure;
图6是本公开提供的图像处理方法的流程示意图。Figure 6 is a schematic flowchart of the image processing method provided by the present disclosure.
具体实施方式Detailed ways
Example embodiments will be described more fully below with reference to the accompanying drawings; they may, however, be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the terms "comprising" and/or "made of", when used in this specification, specify the presence of the stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
The embodiments described herein may be described with reference to plan views and/or cross-sectional views by way of idealized schematic illustrations of the present disclosure. Accordingly, the example illustrations may be modified according to manufacturing techniques and/or tolerances. Therefore, the embodiments are not limited to those shown in the drawings, but include modifications of configurations formed on the basis of manufacturing processes. Accordingly, the regions illustrated in the figures are schematic in nature, and the shapes of the regions shown in the figures illustrate the specific shapes of regions of elements, but are not intended to be limiting.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will also be understood that terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
In conventional video coding and decoding techniques, motion estimation is usually performed on the basis of partitioned prediction units (PUs), and global motion estimation is performed on the whole image frame, which has low accuracy, is time-consuming, and places high demands on bandwidth. In view of this, the embodiments of the present disclosure observe that some local-motion scenes (for example, live-streaming scenes) share a common characteristic: most of the area is actually static, and only a small part is in motion. Therefore, the static area and the suspected motion area can first be roughly detected; further motion detection is then performed only within the suspected motion area, applying local motion estimation to it so as to determine the motion vector information of the pixels in the suspected motion area and to further divide the suspected motion area into stationary pixels and moving pixels. The stationary pixels and the moving pixels are then given corresponding motion-state marks, and video coding and decoding can be performed directly according to the motion-state marks of the pixels.
As shown in Figure 1, the present disclosure provides an image processing method, which may include the following steps S11 to S14.
In step S11, a static area and a suspected motion area are divided from an image frame to be processed.
In step S12, motion vector information of each pixel in the suspected motion area is determined, and each pixel is divided into moving pixels and stationary pixels according to the motion vector information of each pixel.
In step S13, the stationary pixels and all pixels in the static area are marked as being in a static state, and the moving pixels are marked as being in a motion state together with the corresponding motion vector information.
In step S14, video coding and decoding processing is performed on the marked image frame to be processed.
Both the static area and the suspected motion area include a plurality of pixels. The static area refers to an area in which the pixels do not move, and the suspected motion area refers to an area in which the pixels are suspected of having moved.
Dividing the image frame to be processed into a static area and a suspected motion area is a motion detection process, which can be carried out by any conventional image processing operation or deep-learning neural network capable of motion detection, for example an image segmentation network, an MSE (Mean Square Error) operation, an MAE (Mean Absolute Error) operation, an SAD (Sum of Absolute Differences) operation, frame-difference calculation, and so on.
Determining the motion vector information of each pixel in the suspected motion area, and dividing the pixels into moving pixels and stationary pixels according to that motion vector information, is a local motion estimation process, which can be carried out by any conventional image processing operation or deep-learning neural network capable of motion estimation, for example a block matching method, an optical flow method, an optical flow network, and so on.
As can be seen from steps S11-S14 above, the image processing method provided by the embodiments of the present disclosure divides the image frame to be processed into a static area and a suspected motion area and performs local motion estimation only on the suspected motion area, so as to determine the motion vector information of each pixel in the suspected motion area and divide the pixels into moving pixels and stationary pixels according to that information; the stationary pixels and all pixels in the static area are marked as static, the moving pixels are marked as moving together with the corresponding motion vector information, and video coding and decoding is performed on the marked image frame to be processed. Since no global motion estimation of the whole frame is needed, the duration of motion estimation is shortened and the efficiency of image processing is improved. Moreover, the identified stationary pixels (including the pixels in the static area) have a small bandwidth requirement during video coding and decoding, whereas the moving pixels have a larger one; processing stationary and moving pixels separately and in a targeted manner can therefore also save bandwidth resources and relieve the pressure of video transmission.
Motion estimation restricted to the suspected motion area may use a block matching method, an optical flow method, an optical flow network, and so on; among these, the block matching method is convenient, fast, and highly accurate. Correspondingly, in some embodiments, as shown in Figure 2, determining the motion vector information of each pixel in the suspected motion area (i.e., step S12) may include the following steps S121 to S123.
In step S121, the suspected motion area is divided into a plurality of non-overlapping macroblocks.
In step S122, for each macroblock, a matching block of the current macroblock is determined from the reference frame corresponding to the current macroblock.
In step S123, the motion vector information of all pixels in each macroblock is determined according to each macroblock and its matching block.
A macroblock usually consists of one luminance pixel block and two additional chrominance pixel blocks. The reference frame corresponding to a macroblock is the reference frame of the image frame in which the macroblock is located. In this field, the type and number of reference frames depend on the type of the current frame: for example, when the current frame is a P frame, the reference frame is an I frame or P frame preceding it; when the current frame is a B frame, the reference frames are I frames and/or P frames preceding and/or following it. This is not described further in the embodiments of the present disclosure.
First, the suspected motion area is divided into a plurality of non-overlapping macroblocks, and all pixels within each macroblock are considered to share the same motion vector information. Then, for each macroblock, the block most similar to it is searched for in the reference frame; this block is called the matching block of the macroblock. The similarity can be computed, and the most similar block determined, with the SAD algorithm, which is simple and fast. Finally, for each macroblock, the motion vector information corresponding to the macroblock, i.e., the motion vector information of all pixels within it, can be determined from the macroblock and its matching block.
For example, Figure 3 is a schematic diagram of block matching provided by the present disclosure. Taking a certain macroblock of the suspected motion area (called the current block) as an example, the search in the reference frame is centered on the center point of the current block (i.e., the point (x, y) shown in the figure). Within a search region around this center point, the matching block most similar to the current block is searched for; its center point is (x1, y1). The geometric coordinate difference between the center point of the current block and the center point of the matching block can be taken as the motion vector from the current block to the matching block, and also as the motion vector of all pixels in the current block.
Correspondingly, in some embodiments, determining the motion vector information of all pixels in each macroblock according to each macroblock and its matching block (i.e., step S123) may include the following step: for each macroblock, determining the geometric coordinate difference between the center point of the matching block of the current macroblock and the center point of the current macroblock as the motion vector information of all pixels in the current macroblock.
For example, if the center point of the matching block of the current macroblock is (x1, y1) and the center point of the current macroblock is (x, y), then the geometric coordinate difference mv between (x1, y1) and (x, y) can be used as the motion vector information of all pixels in the current macroblock.
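The block matching and motion-vector computation described above can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation: it assumes grayscale frames stored as 2D lists, uses top-left block coordinates rather than center points (the resulting offset is the same geometric coordinate difference), and performs an exhaustive SAD full search; the function names `get_block`, `sad`, and `block_match` are chosen for illustration.

```python
def get_block(frame, top, left, size):
    # extract a size x size sub-block with top-left corner (top, left)
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(block_a, block_b):
    # Sum of Absolute Differences between two equally sized blocks
    return sum(abs(p - q)
               for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))

def block_match(cur, ref, top, left, size, radius):
    # Exhaustive full search: try every offset (dy, dx) within +/- radius
    # and keep the reference block with the lowest SAD cost. Returns the
    # motion vector as the coordinate difference (dy, dx) from the
    # current block to its matching block.
    src = get_block(cur, top, left, size)
    h, w = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            t, l = top + dy, left + dx
            if 0 <= t and t + size <= h and 0 <= l and l + size <= w:
                cost = sad(src, get_block(ref, t, l, size))
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, (dy, dx)
    return best_mv
```

In a real codec the search radius and block size would follow the encoder configuration, and faster search patterns (three-step, diamond) would typically replace the full search; the full search is used here only because it is the simplest correct baseline.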
When the motion vector is non-zero, the pixel must have moved; however, a zero motion vector is not by itself sufficient to show that the pixel has not moved, and the frame difference between the image frame to be processed in which the pixel is located and its reference frame must also be taken into account. Correspondingly, in some embodiments, dividing the pixels into moving pixels and stationary pixels according to their motion vector information (i.e., as described in step S12) includes: determining the pixels that satisfy a preset condition as the stationary pixels, and determining all other pixels as the moving pixels, where the preset condition includes: the motion vector information is zero, and the frame difference between the image frame to be processed in which the pixel is located and its reference frame is less than a preset threshold.
In other words, among the pixels of the suspected motion area, those whose motion vector information is zero and whose corresponding frame difference is less than the preset threshold are determined to be stationary pixels, while those whose motion vector information is zero but whose corresponding frame difference is greater than or equal to the preset threshold, as well as those whose motion vector is non-zero (regardless of the corresponding frame difference), are determined to be moving pixels.
The frame difference between the image frame to be processed in which a pixel is located and its reference frame refers to the average of the differences between the pixels of the image frame to be processed and the corresponding pixels of the reference frame, i.e., the average pixel difference. When the motion vector information is zero and this frame difference is less than the preset threshold, it is reasonable to consider that the pixel has not moved and is a stationary pixel.
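The stationary/moving classification rule above can be sketched as follows, assuming the per-pixel motion vectors and the per-frame average frame difference have already been computed; the function name `classify_pixels` and the data layout (a dict mapping pixel coordinates to motion vectors) are illustrative assumptions, not part of the disclosure.

```python
def classify_pixels(motion_vectors, frame_diff, threshold):
    # motion_vectors: {(row, col): (dy, dx)} for every pixel of the
    # suspected motion area; frame_diff: average absolute pixel difference
    # between the frame to be processed and its reference frame.
    stationary, moving = [], []
    for pixel, mv in motion_vectors.items():
        # stationary only if the vector is zero AND the frame difference
        # is below the preset threshold; everything else is moving
        if mv == (0, 0) and frame_diff < threshold:
            stationary.append(pixel)
        else:
            moving.append(pixel)
    return stationary, moving
```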
The static area and the suspected motion area may be divided from the image frame to be processed either by using an image segmentation algorithm to segment the static area and the suspected motion area out of each image frame, or by directly classifying multiple image frames through motion pre-detection, determining each image frame as a whole to be a static area or a suspected motion area.
Correspondingly, in some embodiments, the static area includes a background area and a stationary target area, and the suspected motion area includes a moving target area. As shown in Figure 4, dividing the static area and the suspected motion area from the image frame to be processed (i.e., step S11) may include the following steps S111 to S113.
In step S111, the image frame to be processed is segmented into a foreground area and a background area.
In step S112, targets in each foreground area are identified.
In step S113, each foreground area is divided into a stationary target area and a moving target area according to the targets in that foreground area.
Segmenting the image frame to be processed into a foreground area and a background area, and identifying the targets in each foreground area, can be carried out by any conventional image processing operation or deep-learning neural network capable of image segmentation, for example an FCN (Fully Connected Network), SegNet (a segmentation network), U-Net (a U-shaped network), and so on. In this field, the foreground area usually refers to the area containing local motion, and a target usually refers to a subject in the image such as a person, animal, or plant; this is not described further in the embodiments of the present disclosure.
After each foreground area is divided into a stationary target area and a moving target area, both the stationary target area and the background area are directly treated as static areas: the pixels within them are considered not to move, and no motion estimation is needed. The moving target area is treated as a suspected motion area and requires motion estimation to further determine whether each pixel in it is actually moving.
After the targets in each foreground area are identified, whether each target is moving can be further detected. Correspondingly, in some embodiments, as shown in Figure 5, dividing each foreground area into a stationary target area and a moving target area according to its targets (i.e., step S113) may include the following steps S1131 and S1132.
In step S1131, for any target in any foreground area, when it is detected that the current target is moving, a preset-range region of the current foreground area centered on the current target is determined as a moving target area.
In step S1132, all regions of each foreground area other than the moving target areas are determined as the stationary target areas.
Whether a target is moving can be detected by some simple image processing methods, for example by comparing the change in the target's geometric position between the preceding and following frames, i.e., the frames immediately before and after the image frame to be processed in which the target is located. The preset-range region centered on the target must at least fully contain the target. For each target detected as moving, the preset-range region centered on it is taken as a moving target area; the moving target areas may overlap one another. Only after all moving target areas have been determined are the remaining regions taken as stationary target areas.
The reason why the regions other than the moving target areas are designated as stationary target areas only after all moving target areas have been determined, rather than taking the preset-range region centered on a target as a stationary target area as soon as that target is detected as not moving, is the following: if targets were processed one by one, with the preset-range region around each moving target marked as a moving target area and the preset-range region around each non-moving target marked as a stationary target area, a stationary target area determined later could very well cover a moving target area determined earlier, causing the moving target area to be misidentified as stationary. Therefore, to avoid such misidentification, reduce the risk of misrecognition, and improve recognition accuracy, the regions other than the moving target areas are designated as stationary target areas only after all moving target areas have been determined.
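The two-pass order described above (mark all moving target areas first, and only then treat everything else as stationary) can be sketched as follows. The target representation (center coordinates plus a motion flag) and the square preset-range region of half-width `pad` are simplifying assumptions for illustration only.

```python
def mark_moving_regions(height, width, targets, pad):
    # targets: list of (center_row, center_col, is_moving) tuples.
    # First pass: mark the preset-range region around every moving
    # target. No stationary box is ever written explicitly, so a
    # stationary target processed later can never overwrite a
    # previously marked moving region.
    moving = [[False] * width for _ in range(height)]
    for cy, cx, is_moving in targets:
        if is_moving:
            for r in range(max(0, cy - pad), min(height, cy + pad + 1)):
                for c in range(max(0, cx - pad), min(width, cx + pad + 1)):
                    moving[r][c] = True
    # Second pass is implicit: every pixel left False is stationary.
    return moving
```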
Besides using an image segmentation algorithm to segment the static area and the suspected motion area out of each image frame, multiple image frames can also be classified directly through motion pre-detection. Motion pre-detection can use conventional image processing operations such as frame-difference calculation. Correspondingly, in some embodiments, there are multiple image frames to be processed; as shown in Figure 6, dividing the static area and the suspected motion area from the image frames to be processed (i.e., step S11) may include the following steps S111' and S112'.
In step S111', the frame difference between each image frame to be processed and its corresponding reference frame is determined.
In step S112', each image frame to be processed is classified as the static area or the suspected motion area according to its frame difference.
The frame difference between an image frame to be processed and its corresponding reference frame refers to the average of the differences between the pixels of the image frame to be processed and the corresponding pixels of the reference frame, i.e., the average pixel difference. The difference between the current image frame to be processed and the reference frame can be expressed as frame_diff = |frame(t) - frame(t-1)|, where frame(t) denotes the current image frame to be processed, frame(t-1) denotes its reference frame, and frame_diff denotes the frame difference.
When the frame difference is sufficiently small, the difference between the current image frame to be processed and the reference frame is small, and it is reasonable to consider that the current image frame to be processed is a static area, i.e., that its pixels do not move. Correspondingly, in some embodiments, classifying each image frame to be processed as the static area or the suspected motion area according to its frame difference (i.e., step S112') may include the following step: determining the image frames to be processed whose frame difference is greater than or equal to a preset motion/stillness discrimination threshold as the suspected motion area, and determining the image frames to be processed whose frame difference is less than that threshold as the static area.
Denoting the preset motion/stillness discrimination threshold as threshold, a frame difference greater than or equal to the threshold can be expressed as frame_diff >= threshold, and a frame difference less than the threshold can be expressed as frame_diff < threshold.
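The frame-level motion pre-detection above can be sketched as follows, assuming grayscale frames stored as 2D lists; `frame_diff` implements the average absolute pixel difference frame_diff = |frame(t) - frame(t-1)|, and `classify_frame` applies the frame_diff >= threshold comparison. The function names and the string labels are illustrative assumptions.

```python
def frame_diff(cur, ref):
    # average absolute pixel difference between the frame to be
    # processed and its reference frame: mean(|frame(t) - frame(t-1)|)
    total = sum(abs(a - b)
                for row_c, row_r in zip(cur, ref)
                for a, b in zip(row_c, row_r))
    return total / (len(cur) * len(cur[0]))

def classify_frame(cur, ref, threshold):
    # frame_diff >= threshold -> the whole frame is a suspected motion
    # area; frame_diff < threshold -> the whole frame is a static area
    if frame_diff(cur, ref) >= threshold:
        return "suspected_motion"
    return "static"
```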
In addition, the present disclosure further provides an electronic device, including: one or more processors; and a storage device having one or more programs stored thereon, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the image processing method described above.
In addition, the present disclosure further provides a computer storage medium having a computer program stored thereon, where the program, when executed by a processor, causes the processor to implement the image processing method described above.
Those of ordinary skill in the art will understand that all or some of the steps in the methods disclosed above, and the functional modules/units in the devices, may be implemented as software, firmware, hardware, and appropriate combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the physical components may be implemented as software executed by a processor such as a central processing unit, a digital signal processor, or a microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for storage of information (such as computer-readable instructions, data structures, program modules, or other data).
Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, it is well known to those of ordinary skill in the art that communication media typically embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and are to be interpreted in a generic and descriptive sense only and not for purposes of limitation. In some instances, as would be apparent to one skilled in the art, features, characteristics, and/or elements described in connection with a particular embodiment may be used singly or in combination with features, characteristics, and/or elements described in connection with other embodiments, unless otherwise expressly indicated. Accordingly, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the scope of the present disclosure as set forth in the appended claims.

Claims (10)

  1. An image processing method, comprising:
    dividing a static area and a suspected motion area from an image frame to be processed;
    determining motion vector information of each pixel in the suspected motion area, and dividing each pixel into moving pixels and stationary pixels according to the motion vector information of each pixel;
    marking the stationary pixels and all pixels in the static area as being in a static state, and marking the moving pixels as being in a motion state together with the corresponding motion vector information; and
    performing video coding and decoding processing on the marked image frame to be processed.
  2. The method according to claim 1, wherein determining the motion vector information of each pixel in the suspected motion area comprises:
    dividing the suspected motion area into a plurality of non-overlapping macroblocks;
    for each macroblock, determining a matching block of the current macroblock from a reference frame corresponding to the current macroblock; and
    determining the motion vector information of all pixels in each macroblock according to each macroblock and the matching block of each macroblock.
  3. The method according to claim 2, wherein determining the motion vector information of all pixels in each macroblock according to each macroblock and the matching block of each macroblock comprises:
    for each macroblock, determining a geometric coordinate difference between a center point of the matching block of the current macroblock and a center point of the current macroblock as the motion vector information of all pixels in the current macroblock.
  4. The method according to claim 3, wherein classifying each of the pixels as a moving pixel or a stationary pixel according to its motion vector information comprises:
    determining the pixels that satisfy a preset condition as the stationary pixels, and determining all other pixels as the moving pixels, wherein the preset condition comprises: the motion vector information is zero, and the frame difference between the image frame to be processed in which the pixel is located and its reference frame is less than a preset threshold.
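The preset condition of claim 4 is a conjunction: a zero motion vector alone is not enough (it could come from a failed match), so the frame difference must also be small. A sketch of that two-part test, with the threshold value `8` as an arbitrary illustrative assumption:

```python
def is_stationary(mv, frame_diff, thresh=8):
    """Claim 4's preset condition, sketched: a pixel counts as stationary
    only if its motion vector is zero AND the frame difference between the
    frame to be processed and its reference frame is below the threshold.
    Every pixel failing either part is classified as moving."""
    return mv == (0, 0) and frame_diff < thresh
```

For example, a pixel with a zero vector but a large frame difference is still classified as moving, which guards against false matches in flat or noisy regions.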
  5. The method according to any one of claims 1-4, wherein the static area comprises a background area and a stationary target area, and the suspected motion area comprises a moving target area; and dividing the image frame to be processed into the static area and the suspected motion area comprises:
    segmenting the image frame to be processed into a foreground area and a background area;
    identifying the targets in each foreground area; and
    dividing each foreground area into a stationary target area and a moving target area according to the targets in that foreground area.
  6. The method according to claim 5, wherein dividing each foreground area into a stationary target area and a moving target area according to the targets in that foreground area comprises:
    for any target in any foreground area, when motion of the current target is detected, determining a preset-range area centered on the current target in the current foreground area as the moving target area; and
    determining the areas in each foreground area other than the moving target area as the stationary target area.
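The "preset range area centered on the current target" of claim 6 amounts to taking a fixed-radius box around each detected moving target and clipping it to the frame. A minimal sketch under that reading; the square window shape and the name `target_window` are assumptions, since the claim does not fix the geometry of the preset range:

```python
def target_window(center, radius, height, width):
    """Return the (top, left, bottom, right) bounds of a box of the given
    radius centered on a detected moving target, clipped to the frame,
    standing in for claim 6's preset-range moving-target area."""
    cy, cx = center
    top = max(0, cy - radius)
    left = max(0, cx - radius)
    bottom = min(height, cy + radius + 1)
    right = min(width, cx + radius + 1)
    return top, left, bottom, right

# A target near the frame corner yields a clipped window.
corner = target_window((2, 2), radius=4, height=10, width=10)
interior = target_window((5, 5), radius=2, height=10, width=10)
```

Everything in the foreground outside these windows would then be labeled as the stationary target area.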
  7. The method according to any one of claims 1-4, wherein there are a plurality of image frames to be processed, and dividing the image frames to be processed into the static area and the suspected motion area comprises:
    determining the frame difference between each image frame to be processed and its corresponding reference frame; and
    dividing each image frame to be processed into the static area and the suspected motion area according to the respective frame difference.
  8. The method according to claim 7, wherein dividing each image frame to be processed into the static area and the suspected motion area according to the respective frame difference comprises:
    determining image frames to be processed whose frame difference is greater than or equal to a preset motion/stillness discrimination threshold as the suspected motion area, and determining image frames to be processed whose frame difference is less than the preset motion/stillness discrimination threshold as the static area.
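Claims 7 and 8 reduce to a simple threshold test on the frame difference. A sketch using mean absolute pixel difference as the frame-difference measure; the claims do not specify the metric, so both the metric and the threshold value here are illustrative assumptions:

```python
import numpy as np

def classify_frames(frames, refs, thresh=5.0):
    """Sketch of claims 7-8: compute a frame difference (here, mean absolute
    pixel difference) between each frame to be processed and its reference,
    then classify frames at or above the motion/stillness discrimination
    threshold as suspected motion, and the rest as static."""
    out = []
    for f, r in zip(frames, refs):
        diff = np.abs(f.astype(np.int16) - r.astype(np.int16)).mean()
        out.append("suspected_motion" if diff >= thresh else "static")
    return out

# One unchanged frame and one uniformly brightened frame.
still = np.zeros((4, 4), dtype=np.uint8)
moved = np.full((4, 4), 20, dtype=np.uint8)
result = classify_frames([still, moved], [still.copy(), still.copy()], thresh=5.0)
```

Frames labeled static can then skip the per-pixel motion estimation of claim 2 entirely.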
  9. An electronic device, comprising:
    one or more processors; and
    a storage device having one or more programs stored thereon,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the image processing method according to any one of claims 1-8.
  10. A computer storage medium having a computer program stored thereon, wherein, when the computer program is executed by a processor, the processor is caused to implement the image processing method according to any one of claims 1-8.
PCT/CN2023/084039 2022-06-30 2023-03-27 Image processing method, electronic device, and computer storage medium WO2024001345A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210761067.1 2022-06-30
CN202210761067.1A CN117376571A (en) 2022-06-30 2022-06-30 Image processing method, electronic device, and computer storage medium

Publications (1)

Publication Number Publication Date
WO2024001345A1 true WO2024001345A1 (en) 2024-01-04

Family

ID=89382689

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/084039 WO2024001345A1 (en) 2022-06-30 2023-03-27 Image processing method, electronic device, and computer storage medium

Country Status (2)

Country Link
CN (1) CN117376571A (en)
WO (1) WO2024001345A1 (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005295215A (en) * 2004-03-31 2005-10-20 Victor Co Of Japan Ltd Moving image encoding device
CN101699856A (en) * 2009-10-30 2010-04-28 北京中科大洋科技发展股份有限公司 De-interlacing method with self-adapting motion
CN102693537A (en) * 2011-01-17 2012-09-26 三星泰科威株式会社 Image surveillance system and method of detecting whether object is left behind or taken away
US20130336387A1 (en) * 2011-03-09 2013-12-19 Nippon Telegraph And Telephone Corporation Video encoding device, video encoding method and video encoding program
CN106878674A (en) * 2017-01-10 2017-06-20 哈尔滨工业大学深圳研究生院 A kind of parking detection method and device based on monitor video
CN106993187A (en) * 2017-04-07 2017-07-28 珠海全志科技股份有限公司 A kind of coding method of variable frame rate and device
US20200128252A1 (en) * 2017-04-07 2020-04-23 Allwinner Technology Co., Ltd. Variable Frame Rate Encoding Method and Device, Computer Device and Computer Readable Storage Medium

Also Published As

Publication number Publication date
CN117376571A (en) 2024-01-09


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23829536

Country of ref document: EP

Kind code of ref document: A1