WO2022206679A1 - Image processing method and apparatus, computer device and storage medium - Google Patents

Image processing method and apparatus, computer device and storage medium

Info

Publication number
WO2022206679A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
image
target
images
stationary
Application number
PCT/CN2022/083400
Other languages
English (en)
Chinese (zh)
Inventor
张伟俊
谢朝毅
Original Assignee
影石创新科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by 影石创新科技股份有限公司
Publication of WO2022206679A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/73 Deblurring; Sharpening
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion

Definitions

  • the present application relates to the field of computer technology, and in particular, to an image processing method, apparatus, computer device and storage medium.
  • in the prior art, moving objects are usually recognized by relying on changes in pixel values across different frames of images, so that the moving objects can be removed from the images.
  • if the moving object does not move enough, or stays in a single location for too long, the pixel values of the object change little across the frames, and recognition errors occur. That is to say, the recognition accuracy for moving objects in the prior art is insufficient, so ghost images exist in the composite image after the moving objects are removed, and the image quality is poor.
  • an image processing method comprising:
  • obtain multiple frames of images shot by the camera module of the same scene; use the target detection model to perform target detection on the multiple frames of images to obtain the target objects included in each frame of the multi-frame images; classify the target objects included in each frame of image to determine the moving objects and stationary objects included in each frame of image; remove the moving objects in each frame of image to obtain the background image corresponding to each frame of image, and fuse all the background images to generate the target background image; and cover the stationary objects in the target background image with the stationary objects in the multi-frame images to obtain the output image of the camera module.
  • classifying the target objects included in each frame of images, and determining the moving objects and stationary objects included in each frame of images includes: determining the position of the target object in each frame of images; and determining, according to the position of the target object in each frame of image, whether the target object is a moving object or a stationary object.
  • determining whether the target object is a moving object or a stationary object according to the position of the target object in each frame of images includes: calculating the position deviation value of the target object between any two frames of the multi-frame images; if the maximum position deviation value is less than the position deviation threshold, the target object is determined to be a stationary object; if the position deviation value of the target object between any two frames of the multi-frame images is greater than or equal to the position deviation threshold, the target object is determined to be a moving object.
  • classifying the target objects included in each frame of image, and determining the moving objects and stationary objects included in each frame of image includes: determining the number of target pixels in the tracking position of each frame of image, where the target pixels are used to display the target object, and the tracking position is the position of the target object in any one frame of the multi-frame images; and determining, according to the number of target pixels in the tracking position of each frame of image, whether the target object is a moving object or a stationary object.
  • determining whether the target object is a moving object or a stationary object according to the number of target pixels at the tracking position for each frame of images includes: calculating the difference in the number of target pixels at the tracking position between any two frames of the multi-frame images; if the maximum number difference is less than the pixel number threshold, the target object is determined to be a stationary object; if the difference in the number of target pixels at the tracking position between any two frames of images is greater than or equal to the pixel number threshold, the target object is determined to be a moving object.
  • removing the moving objects in each frame of images to obtain a background image corresponding to each frame of images includes: marking the pixels corresponding to the moving objects in each frame of images as invalid pixels; and generating the background image corresponding to each frame of image from the remaining pixels in each frame of image except the invalid pixels.
  • using the stationary objects in the multi-frame images to cover the stationary objects in the target background image, and obtaining the output image of the camera module includes: determining a reference image according to the multi-frame images, where the definition of the stationary object in the reference image is greater than the definition of the corresponding stationary object in the target background image; and using the stationary object in the reference image to cover the stationary object in the target background image to obtain the output image.
  • in a second aspect, an image processing apparatus includes:
  • the acquisition module is used for acquiring multiple frames of images shot by the camera module on the same scene, and using the target detection model to perform target detection on the multiple frames of images to obtain the target object included in each frame of the multiple frames of images;
  • a determination module used for classifying and processing the target objects included in each frame of images, and determining the moving objects and stationary objects included in each frame of images;
  • the removal module is used to remove the moving objects in each frame of image, obtain the background image corresponding to each frame of image, and perform fusion processing on all the background images to generate the target background image;
  • the covering module is used for covering the stationary objects in the target background image with the stationary objects in the multi-frame images to obtain the output image of the camera module.
  • a computer device including a memory and a processor, wherein the memory stores a computer program, and when the processor executes the computer program, the method according to any one of the foregoing first aspects is implemented.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the method according to any one of the above-mentioned first aspects.
  • the above-mentioned image processing method, apparatus, computer device and storage medium obtain multiple frames of images shot by a camera module of the same scene, and use a target detection model to perform target detection on the multiple frames of images to obtain the target objects included in each frame of the multi-frame images; classify the target objects included in each frame of image to determine the moving objects and stationary objects included in each frame of image; remove the moving objects in each frame of image to obtain the background image corresponding to each frame of image, and fuse all the background images to generate a target background image; and use the stationary objects in the multi-frame images to cover the stationary objects in the target background image to obtain the output image of the camera module.
  • the foreground object (for example, the target object described above) in the multi-frame images can be accurately identified by the target detection model, which improves the accuracy of the target object identification result.
  • by classifying the target objects included in each frame of images, it is determined whether each target object is a moving object or a stationary object, thereby improving the recognition accuracy for stationary objects and moving objects.
  • the moving objects in the multi-frame images are removed to generate a background image. By fusing the multiple frames of background images, ghosts existing in any single frame of the multi-frame images are eliminated, so that there are no ghosts in the generated target background image, and the clarity of the target background image is ensured.
  • the image corresponding to the stationary object in the multi-frame images is used to cover the image corresponding to the stationary object in the target background image, so that the moving objects are removed from the final output image, the ghosts in the output image are eliminated, and the sharpness of the background and the stationary objects is ensured, which improves image quality.
  • FIG. 1 is a diagram of the application environment of the image processing method in one embodiment
  • FIG. 2 is a schematic flowchart of an image processing method in one embodiment
  • FIG. 3 is a schematic diagram of determining a target position in a multi-frame image in an image processing method in one embodiment
  • FIG. 4 is a schematic diagram of a stationary object in a multi-frame image covering a stationary object in a target background image in an image processing method in one embodiment
  • FIG. 6 is a schematic flowchart of an image processing method in another embodiment
  • FIG. 7 is a schematic diagram of determining a target object in a multi-frame image in an image processing method in one embodiment
  • FIG. 8 is a schematic flowchart of an image processing method in another embodiment
  • FIG. 10 is a schematic flowchart of an image processing method in another embodiment
  • FIG. 11 is a schematic flowchart of an image processing method in another embodiment
  • FIG. 13 is a structural block diagram of an image processing apparatus in one embodiment
  • FIG. 14 is a structural block diagram of an image processing apparatus in one embodiment
  • FIG. 15 is a structural block diagram of an image processing apparatus in one embodiment.
  • the image processing method provided by the present application can be applied to the computer device as shown in FIG. 1 .
  • the computer equipment may be a terminal. Its internal structure diagram can be shown in Figure 1.
  • the computer equipment includes a processor, memory, a communication interface, a display screen, and an input device connected by a system bus. Among them, the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium, an internal memory.
  • the nonvolatile storage medium stores an operating system and a computer program.
  • the internal memory provides an environment for the execution of the operating system and computer programs in the non-volatile storage medium.
  • the communication interface of the computer device is used for wired or wireless communication with an external terminal, and the wireless communication can be realized by WIFI, operator network, NFC (Near Field Communication) or other technologies.
  • the computer program implements an image processing method when executed by a processor.
  • the display screen of the computer equipment may be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment may be a touch layer covering the display screen, a button, a trackball or a touchpad provided on the housing of the computer equipment, or an external keyboard, trackpad or mouse.
  • FIG. 1 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • an image processing method is provided, and the method is applied to the terminal in FIG. 1 as an example for description, including the following steps:
  • step 201 the terminal acquires multiple frames of images shot by the camera module on the same scene, and uses a target detection model to perform target detection on the multiple frames of images to obtain a target object included in each frame of the multiple frames of images.
  • the user can place the shooting device where the camera module is located at a fixed position, and keep the shooting device still, so that the camera module can shoot multiple frames of images of the same scene.
  • the relative positions of stationary objects in the multiple frames of images captured by the camera module of the same scene do not change (for example, the stationary objects may be buildings, people being photographed, or trees), while the relative positions of moving objects may change (for example, a moving object may be a person, animal, or vehicle that suddenly intrudes into the scene currently being filmed).
  • the same scene here is mainly the same shooting scene with respect to the stationary objects; that is, the stationary object is the target object in the final desired image, while the moving object has mistakenly entered the shooting scene and is not what the user needs.
  • the above-mentioned method of fixing the shooting device where the camera module is located can obtain multiple frames of images of the same scene, but the method of obtaining multiple frames of images of the same scene by shooting is not limited to this, which is not specifically limited in this embodiment.
  • the terminal or the photographing device may control the camera module to photograph multiple frames of continuous images.
  • the photographing instruction input by the user may be the user pressing the shutter button, the user speaking a voice photographing password, or the terminal or the photographing device detecting the user's photographing gesture; the form of the photographing instruction is not specifically limited.
  • the multi-frame images can be stored in the storage device, and the terminal can obtain the multiple frames of images captured by the camera module of the same scene from the storage device.
  • the terminal can input multiple frames of images into the target detection model, and use the target detection model to extract features in the multiple frames of images, thereby determining the target object in each frame of images.
  • the target detection model can be a model based on hand-crafted features, such as DPM (Deformable Parts Model), or a model based on a convolutional neural network, such as YOLO (You Only Look Once), R-CNN (Region-based Convolutional Neural Networks), SSD (Single Shot MultiBox Detector), and Mask R-CNN (Mask Region-based Convolutional Neural Networks), etc.
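  • as an illustration of this detection step, the following is a minimal sketch of per-frame target detection, assuming a pretrained Mask R-CNN from torchvision stands in for the target detection model; the embodiment does not mandate any particular library, and the function name detect_objects is illustrative only.

```python
# Hedged sketch: torchvision's pretrained Mask R-CNN as the target detection
# model; any of the models listed above (YOLO, R-CNN, SSD, ...) would also fit.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_objects(frames, score_threshold=0.5):
    """Detect target objects in each frame; frames are HxWx3 uint8 arrays."""
    results = []
    with torch.no_grad():
        for frame in frames:
            pred = model([to_tensor(frame)])[0]
            keep = pred["scores"] > score_threshold
            results.append({
                "boxes": pred["boxes"][keep],    # (N, 4) as x1, y1, x2, y2
                "labels": pred["labels"][keep],  # class indices
                "masks": pred["masks"][keep],    # (N, 1, H, W) soft masks
            })
    return results
```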
  • Step 202 The terminal classifies the target objects included in each frame of image, and determines the moving objects and stationary objects included in each frame of image.
  • the terminal can use the target tracking algorithm to track the same target object included in the multi-frame images, determine the position of the same target object in the different frames, and determine whether the same target object is a moving object or a stationary object, thereby classifying the moving and stationary objects in each frame of image.
  • for example, the terminal uses the target tracking algorithm to identify the position of target object A in each of the multi-frame images, and determines, according to the positions of target object A in the multi-frame images, whether target object A is a moving object or a stationary object.
  • the terminal can also use the target tracking algorithm to track the same position in the multi-frame images, determine the number of pixels at which the target object is displayed at that same position in each frame, and determine, according to these pixel counts, whether the target object is a moving object or a stationary object.
  • the terminal detects the position of the target object B in the first frame image according to the target tracking algorithm, and determines the position of the target object B in the first frame image as the target position.
  • the terminal also determines, according to the target position in the first frame of image, the same position in the other frames as the target position.
  • the terminal tracks the number of pixels occupied by target object B at the same target position in the multi-frame images, and determines, according to the number of pixels of target object B at the target position in the multi-frame images, whether target object B is a moving object or a stationary object.
  • Step 203 the terminal removes the moving objects in each frame of images, obtains a background image corresponding to each frame of images, and performs fusion processing on all the background images to generate a target background image.
  • the terminal marks the pixels in the target rectangular frame where the moving object in each frame of image is located as invalid pixels, obtains the background image corresponding to each frame of image, and fuses all the background images to generate the target background image.
  • after the pixels in the target rectangular frame where the moving objects in each frame of image are located have been marked as invalid pixels and the background image corresponding to each frame of image has been acquired, the terminal can use a pixel-level image fusion method to fuse the multiple frames of background images and generate the target background image. The pixel-level image fusion method can be an image fusion method based on non-multi-scale transformation (for example, the average and weighted-average methods, the logical filter method, the mathematical morphology method, or the image algebra method) or an image fusion method based on multi-scale transformation (for example, the pyramid image fusion method, the wavelet-transform image fusion method, or a neural-network-based image fusion method). The fusion method for the multiple frames of background images is not limited here; using a pixel-level image fusion method retains more image information.
  • the terminal may also use the background modeling method to perform fusion processing on the background images corresponding to each frame of images.
  • the background modeling method can be a non-recursive background modeling method or a recursive background modeling method, where the non-recursive background modeling methods can include the median model, the mean model, the linear prediction model, non-parametric kernel density estimation, etc.
  • recursive background modeling methods may include approximate median filtering methods, single Gaussian model methods, mixture Gaussian model methods, and the like.
  • the embodiments of the present application take the median model modeling method in the non-recursive background modeling method as an example for detailed description.
  • suppose there are n frames of images I 1 , …, I n and a corresponding set of n mask images M 1 , …, M n .
  • the pixels corresponding to the moving objects in each mask image of the mask image set are invalid pixels and can be marked as 0, and the pixels other than the moving objects are valid pixels and can be marked as 1, generating the corresponding mask image. The value range of the pixel value of each pixel in M k is therefore {0, 1}, where 0 represents an invalid pixel and 1 represents a valid pixel.
  • I k (p) and M k (p) respectively represent the pixel values of I k and M k at the pixel with coordinate position p. From these definitions, the target background image B given by the median model can be written as formula (1): B(p) = Median({I k (p) | M k (p) = 1, k = 1, …, n}), where Median(*) represents the operation of taking the median of the elements in the set.
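  • as a concrete sketch of this median-model fusion (formula (1) above), the following NumPy code ignores invalid pixels via NaN masking; the function name fuse_background and the array layout are assumptions, not part of the embodiment.

```python
# Sketch of formula (1): B(p) = Median({I_k(p) | M_k(p) = 1}), with frames
# I_k as (H, W, 3) uint8 arrays and masks M_k as (H, W) arrays of 0/1.
import numpy as np

def fuse_background(frames, masks):
    stack = np.stack(frames).astype(np.float64)            # (n, H, W, 3)
    valid = np.stack(masks).astype(bool)[..., None]        # (n, H, W, 1)
    stack[~np.broadcast_to(valid, stack.shape)] = np.nan   # drop invalid pixels
    background = np.nanmedian(stack, axis=0)               # median over frames
    # A pixel invalid in every frame yields NaN; fall back to a plain median.
    holes = np.isnan(background)
    if holes.any():
        background[holes] = np.median(np.stack(frames), axis=0)[holes]
    return background.astype(np.uint8)
```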
  • step 204 the terminal uses the stationary objects in the multi-frame images to cover the stationary objects in the target background image to obtain an output image of the camera module.
  • when the camera module acquires multiple frames of images of the same scene, there will be human error or equipment error, resulting in slight deviations in the position of each stationary object or moving object across the multiple frames, so that after fusion processing the edges corresponding to the stationary objects in the generated target background image become blurred.
  • in order to improve the clarity of the image corresponding to the stationary objects in the output image, the terminal can select an image with higher definition from the multiple frames obtained by the camera module, and replace the image corresponding to the stationary object in the target background image with the image corresponding to the stationary object in that frame, thereby obtaining an image in which the moving objects are removed and the stationary objects are still very clear, as the output image of the camera module.
  • for example, picture A in FIG. 4 is any one frame of the multi-frame images, picture B is the target background image, and picture C is the output image of the camera module.
  • person (1) in picture A is a stationary person; the terminal can extract the pixels corresponding to stationary person (1) in picture A and overlay them on the position of the pixels corresponding to stationary person (1) in picture B, thereby generating picture C, that is, the output image of the camera module.
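  • a minimal sketch of this direct pixel-level covering follows; object_mask is a hypothetical HxW boolean mask of stationary person (1) in picture A, obtained, for example, from the detector's segmentation output.

```python
# Sketch of the covering step of FIG. 4: copy the stationary object's pixels
# from the selected frame (picture A) onto the target background (picture B).
import numpy as np

def cover_stationary_object(frame_a, target_background, object_mask):
    output = target_background.copy()
    output[object_mask] = frame_a[object_mask]  # overwrite masked positions
    return output  # picture C, the output image of the camera module
```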
  • in summary, the target objects included in each frame of the multiple frames of images are obtained; the target objects included in each frame of image are classified to determine the moving objects and stationary objects included in each frame of image; the moving objects in each frame of image are removed to obtain the background image corresponding to each frame of image, and all the background images are fused to generate a target background image; and the stationary objects in the target background image are covered by the stationary objects in the multi-frame images to obtain the output image of the camera module.
  • the target object in the multi-frame images can be accurately recognized by the target detection model, which improves the accuracy of the target object recognition result.
  • by classifying the target objects included in each frame of images, it is determined whether each target object is a moving object or a stationary object, thereby improving the recognition accuracy for stationary objects and moving objects.
  • the moving objects in the multi-frame images are removed to generate background images; by fusing the multiple frames of background images, ghosts existing in any single frame are eliminated, so that there are no ghosts in the generated target background image and the clarity of the target background image is ensured.
  • the image corresponding to the stationary object in the multi-frame images is used to cover the image corresponding to the stationary object in the target background image, so that the moving objects are removed from the final output image, the ghosts in the output image are eliminated, and the sharpness of the background and the stationary objects is ensured, which improves image quality.
  • in one embodiment, step 202 "classify the target objects included in each frame of image, and determine the moving objects and stationary objects included in each frame of image" may include the following steps:
  • Step 501 the terminal determines the position of the target object in each frame of image.
  • the terminal determines the target object according to the recognition result of the target detection model. For the same target object in multiple frames of images, the terminal determines the position of the same target object in each frame of images respectively.
  • Step 502 the terminal determines whether the target object is a moving object or a stationary object according to the position of the target object in each frame of image.
  • the terminal marks the position corresponding to the same target object in each frame of image.
  • the terminal compares the positions of the target object across the frames to check whether they change, and determines from the comparison result whether the target object is a moving object or a stationary object.
  • the terminal recognizes the same target object C in each frame of images according to the recognition result of the target detection model.
  • the terminal marks the position corresponding to the target object C in each frame of image according to the recognition result.
  • the terminal may use a bounding box to frame the target object in each frame of image.
  • the terminal compares whether the position mark of the target object C in each frame of images has changed, and judges whether the target object is a moving object or a stationary object according to the comparison result.
  • the terminal determines the position of the target object in each frame of image, and determines whether the target object is a moving object or a stationary object according to the position of the target object in each frame of image. Therefore, it is possible to accurately determine whether the target object is a moving object or a stationary object, avoid errors in the output image caused by the detection error of the moving object, and ensure the quality of the output image after removing the moving object.
  • in one embodiment, the above step 502 "the terminal determines whether the target object is a moving object or a stationary object according to the position of the target object in each frame of image" may include the following steps:
  • Step 601 the terminal calculates the position deviation value of the target object in any two frames of the multi-frame images. If the maximum position deviation value is less than the position deviation threshold value, go to step 602; if the target object position deviation value in any two frames of the multi-frame images is greater than or equal to the position deviation threshold value, go to step 603.
  • Step 602 the terminal determines that the target object is a stationary object.
  • Step 603 the terminal determines that the target object is a moving object.
  • the terminal may determine the position of the same target object in each frame of images.
  • the terminal can compare the positions corresponding to the same target object in any two frames of images, and calculate the difference between the positions corresponding to the same target object in any two frames of images, so as to obtain the position deviation value of the target objects in any two frames of images.
  • the terminal can compare the position deviation value of the target object in any two frames of images, so as to determine the maximum position deviation value.
  • after determining the maximum position deviation value of the target object between any two frames of images, the terminal compares the maximum position deviation value with the position deviation threshold. If the maximum position deviation value is less than the position deviation threshold, the position deviation of the target object across the multi-frame images is small, and the terminal determines that the target object is a stationary object. If the position deviation value of the target object between any two frames of the multi-frame images is greater than or equal to the position deviation threshold, the position deviation of the target object across the multi-frame images is relatively large, and the terminal determines that the target object is a moving object.
  • for example, the terminal calculates the position deviation of target object D between any two frames of images. Assuming there are 5 frames of images, the terminal calculates the position deviation between the position of target object D in the first frame and its position in the second frame, then between its position in the first frame and its position in the third frame, and so on, until the position deviation of target object D between every pair of frames has been calculated. The terminal compares the obtained position deviations and determines the largest one.
  • the terminal compares the maximum position deviation with the position deviation threshold; if the maximum position deviation is less than the threshold, the terminal determines that the target object is a stationary object. If, for example, the position deviation of target object D between some two frames is 15 pixels and the position deviation threshold is 10 pixels, the maximum position deviation is greater than the threshold, and the terminal determines that the target object is a moving object.
  • in the above manner, the terminal calculates the position deviation value of the target object between any two frames of the multi-frame images; if the maximum position deviation value is less than the position deviation threshold, the terminal determines that the target object is a stationary object, and if the position deviation value between any two frames is greater than or equal to the position deviation threshold, the terminal determines that the target object is a moving object.
  • by comparing the position deviation values of the target object between any two frames of the multi-frame images with the position deviation threshold, the terminal can accurately and effectively determine whether the target object is a moving object or a stationary object, avoiding errors in the output image caused by moving-object detection errors and thereby ensuring the quality of the output image after the moving objects are removed.
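  • a minimal sketch of steps 601 to 603 follows, assuming each per-frame position is a bounding-box center (x, y) and taking the deviation between two frames as the Euclidean distance in pixels; the embodiment does not specify the distance measure, so this choice is an assumption.

```python
# Sketch of steps 601-603: largest pairwise position deviation vs. threshold.
import itertools
import math

def classify_by_position(positions, deviation_threshold=10.0):
    """positions: (x, y) centers of the same target object, one per frame."""
    max_deviation = max(
        math.dist(p, q) for p, q in itertools.combinations(positions, 2)
    )
    # Step 602: stationary if below threshold; step 603: moving otherwise.
    return "stationary" if max_deviation < deviation_threshold else "moving"
```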
  • in another embodiment, step 202 "the terminal classifies the target objects included in each frame of image, and determines the moving objects and stationary objects included in each frame of image" may include the following steps:
  • Step 801 the terminal determines the number of target pixels at the tracking position for each frame of image.
  • the target pixel is used to display the target object, and the tracking position is the position of the target object in any one frame of the multi-frame images.
  • the terminal can determine the position of the target object in any one frame of images as the tracking position, and determine the same position in the other frames as the tracking position according to the tracking position in that frame, thereby ensuring that the tracking position is the same across the multi-frame images; the tracking position may then reveal more or less of the target object in each frame.
  • after determining the tracking position in each frame of image, the terminal can calculate the number of target pixels at the tracking position in each frame of image.
  • the target pixel is used to display the target object. That is, the terminal can calculate the number of pixels in which the target object is displayed in the tracking position for each frame of image.
  • Step 802 the terminal determines whether the target object is a moving object or a stationary object according to the number of target pixels in the tracking position of each frame of image.
  • specifically, the terminal may compare the number of target pixels at the tracking position between any two frames of images, and determine whether the target object is a moving object or a stationary object according to the comparison result.
  • the terminal determines the number of target pixels at the tracking position for each frame of image, and determines whether the target object is a moving object or a stationary object according to the number of target pixels at the tracking position for each frame of image.
  • the terminal can accurately determine whether the target object is a moving object or a stationary object, avoiding the error of the output image caused by the detection error of the moving object, thereby ensuring the quality of the output image after removing the moving object.
  • in one embodiment, the above step 802 "the terminal determines whether the target object is a moving object or a stationary object according to the number of target pixels in the tracking position of each frame of image" may include the following steps:
  • Step 901 the terminal calculates the difference in the number of target pixels at the tracking position of any two frames of images in the multi-frame images.
  • the terminal may calculate the difference in the number of target pixels at the tracking position for any two frames of images respectively.
  • for example, the number of target pixels at the tracking position is 108 in the first frame image, 111 in the second frame image, 100 in the third frame image, 105 in the fourth frame image, and 113 in the fifth frame image.
  • the terminal calculates the difference in the number of target pixels at the tracking position of any two frames of images respectively.
  • Step 902 if the largest difference in quantity is less than the threshold of the number of pixels, the terminal determines that the target object is a stationary object.
  • the terminal calculates the difference in the number of target pixels at the tracking position between each pair of frames, sorts the calculated number differences, and selects the largest number difference among them. The terminal then compares the maximum number difference with the pixel number threshold; if the maximum number difference is less than the pixel number threshold, the target object has not moved, and the terminal determines that the target object is a stationary object.
  • for example, if the maximum number difference is 9 and the pixel number threshold is 15, the terminal determines that the maximum number difference is less than the pixel number threshold, and therefore that the target object is a stationary object.
  • Step 903 if the difference in the number of target pixels at the tracking position between any two frames of images is greater than or equal to the pixel number threshold, the terminal determines that the target object is a moving object.
  • each time the terminal calculates the difference in the number of target pixels at the tracking position between two frames of images, it can compare the newly calculated number difference with the pixel number threshold. Once a number difference is greater than or equal to the pixel number threshold, the terminal determines that the target object is a moving object and no longer calculates the differences in the number of target pixels at the tracking position for the remaining pairs of frames.
  • for example, if the terminal determines that the difference in the number of target pixels at the tracking position between the first frame image and the second frame image is 20 and the pixel number threshold is 15, the difference in the number of target pixels at the tracking position between the two frames is greater than the pixel number threshold, so the terminal determines that the target object is a moving object and no longer calculates the differences for the remaining pairs of frames.
  • in this way, the terminal calculates the difference in the number of target pixels at the tracking position between any two frames of the multi-frame images; if the largest number difference is less than the pixel number threshold, the terminal determines that the target object is a stationary object, and if the difference in the number of target pixels at the tracking position between any two frames of images is greater than or equal to the pixel number threshold, the terminal determines that the target object is a moving object.
  • by comparing the differences in the number of target pixels at the tracking position between any two frames of the multi-frame images with the pixel number threshold, the terminal can accurately and effectively determine whether the target object is a moving object or a stationary object, avoiding errors in the output image caused by moving-object detection errors and thereby ensuring the quality of the output image from which the moving objects are removed.
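  • a minimal sketch of steps 901 to 903 follows, including the early exit described above once any pairwise difference reaches the pixel number threshold; the function name is illustrative.

```python
# Sketch of steps 901-903: pairwise differences in target-pixel counts at the
# tracking position, stopping early when one reaches the threshold.
import itertools

def classify_by_pixel_count(pixel_counts, count_threshold=15):
    """pixel_counts: target pixels at the tracking position, one per frame."""
    for a, b in itertools.combinations(pixel_counts, 2):
        if abs(a - b) >= count_threshold:
            return "moving"      # step 903: stop, no further differences needed
    return "stationary"          # step 902: largest difference below threshold
```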
  • step 203 "remove moving objects in each frame of image, and obtain a background image corresponding to each frame of image" may include the following steps:
  • Step 1001 the terminal marks the pixels corresponding to the moving objects in each frame of images as invalid pixels.
  • the terminal can use a target segmentation algorithm to perform target segmentation on the moving objects in each frame of images, thereby obtaining more accurate mask images corresponding to multiple frames of images.
  • the terminal may represent the mask image corresponding to each frame of image as a binary image.
  • in the mask image, the pixel positions corresponding to the moving object may be 0 and the other pixel positions may be 1; a value of 1 indicates that the pixel is valid, and a value of 0 indicates that the pixel is invalid, so that the pixels corresponding to the moving objects in each frame of image are marked as invalid pixels.
  • Step 1002 the terminal generates a background image corresponding to each frame of image according to the remaining pixels in each frame of image except invalid pixels.
  • after the terminal marks the pixels of the moving objects in each frame of images as invalid pixels, it can generate the background image corresponding to each frame of images according to the remaining pixels other than the invalid pixels.
  • in this way, the terminal marks the pixels corresponding to the moving objects in each frame of images as invalid pixels and generates the background image corresponding to each frame of images according to the remaining pixels in each frame of images except the invalid pixels, so that the moving objects in each frame of image are eliminated and no moving objects remain in the background images.
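  • a minimal sketch of steps 1001 and 1002 follows, building the binary validity mask used by the fusion sketch above; here the moving objects are given as rectangular boxes, matching the target-rectangular-frame marking described earlier, though a segmentation mask could be used instead.

```python
# Sketch of steps 1001-1002: mark moving-object pixels 0 (invalid) and all
# other pixels 1 (valid); the resulting masks feed fuse_background() above.
import numpy as np

def make_validity_mask(frame_shape, moving_boxes):
    """moving_boxes: iterable of (x1, y1, x2, y2) boxes of moving objects."""
    mask = np.ones(frame_shape[:2], dtype=np.uint8)   # all pixels valid
    for x1, y1, x2, y2 in moving_boxes:
        mask[int(y1):int(y2), int(x1):int(x2)] = 0    # invalidate the box
    return mask
```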
  • in one embodiment, the above step 204 "the terminal uses the stationary object in the multi-frame image to cover the stationary object in the target background image, and obtains the output image of the camera module" may include the following steps:
  • Step 1101 the terminal determines a reference image according to the multiple frames of images.
  • the definition of the stationary object in the reference image is greater than that of the corresponding stationary object in the target background image.
  • the terminal can identify the sharpness of the stationary objects in the multi-frame images through an image sharpness recognition algorithm.
  • the terminal can sort the sharpness of the still objects in the multi-frame images according to the sharpness identification results of the multi-frame images, and select a frame of images with the highest sharpness of the still objects as the reference image.
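  • a minimal sketch of step 1101 follows, using variance of the Laplacian as the sharpness measure; the embodiment only requires some image sharpness recognition algorithm, so this particular metric is an assumption, and scoring only the stationary-object region (rather than the whole frame, as here) would follow the text even more closely.

```python
# Sketch of step 1101: pick the frame with the highest Laplacian variance,
# a common proxy for image sharpness.
import cv2

def pick_reference_image(frames):
    def sharpness(img):
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        return cv2.Laplacian(gray, cv2.CV_64F).var()
    return max(frames, key=sharpness)
```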
  • Step 1102 the terminal uses the stationary object in the reference image to cover the stationary object in the target background image to obtain an output image.
  • the terminal may extract the image corresponding to the stationary object in the reference image, and overlay the image corresponding to the stationary object in the reference image on the image corresponding to the stationary object in the target background image.
  • when the terminal overlays the image corresponding to the stationary object in the reference image onto the image corresponding to the stationary object in the target background image, classical techniques such as Poisson fusion and multi-band fusion can be used, so that the boundaries of the stationary-object regions in the output image are more natural.
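  • a minimal sketch of step 1102 follows, using OpenCV's Poisson blending (seamlessClone), one of the classical techniques mentioned above; object_mask is a hypothetical 8-bit mask of the stationary object in the reference image.

```python
# Sketch of step 1102: Poisson-blend the stationary object from the reference
# image into the target background so the region boundaries look natural.
import cv2
import numpy as np

def cover_with_poisson(reference, target_background, object_mask):
    ys, xs = np.nonzero(object_mask)
    center = (int(xs.mean()), int(ys.mean()))  # center of the object region
    return cv2.seamlessClone(
        reference, target_background, object_mask, center, cv2.NORMAL_CLONE
    )
```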
  • the terminal determines a reference image according to multiple frames of images, and uses the stationary object in the reference image to cover the stationary object in the target background image to obtain the output image. Therefore, the clarity of the still objects in the output image can be ensured, so that the output image as a whole is clearer.
  • as shown in FIG. 12, an optional operation flow of the image processing method includes the following steps:
  • Step 1201 The terminal acquires multiple frames of images shot by the camera module on the same scene, uses the target detection model to perform target detection on the multiple frames of images, obtains the target object included in each frame of the multiple frames of images, and executes step 1202 or step 1206.
  • Step 1202 the terminal determines the position of the target object in each frame of image.
  • Step 1203 the terminal calculates the position deviation value of the target object between any two frames of the multi-frame images; if the maximum position deviation value is less than the position deviation threshold, step 1204 is executed; if the position deviation value of the target object between any two frames of the multi-frame images is greater than or equal to the position deviation threshold, step 1205 is executed.
  • Step 1204 the terminal determines that the target object is a stationary object.
  • Step 1205 the terminal determines that the target object is a moving object, and executes step 1210.
  • Step 1206 the terminal determines the number of target pixels at the tracking position for each frame of image.
  • Step 1207 the terminal calculates the difference in the number of target pixels at the tracking position of any two frames of images in the multi-frame images. If the maximum number difference is less than the pixel number threshold, step 1208 is performed; if the target pixel number difference between any two frames of images at the tracking position is greater than or equal to the pixel number threshold, step 1209 is performed.
  • Step 1208 the terminal determines that the target object is a stationary object.
  • Step 1209 the terminal determines that the target object is a moving object, and executes step 1210.
  • Step 1210 the terminal marks the pixels corresponding to the moving objects in each frame of images as invalid pixels.
  • Step 1211 the terminal generates a background image corresponding to each frame of image according to the remaining pixels except the invalid pixels in each frame of image.
  • Step 1212 The terminal performs fusion processing on all background images to generate a target background image.
  • Step 1213 the terminal determines a reference image according to the multiple frames of images.
  • Step 1214 the terminal uses the stationary object in the reference image to cover the stationary object in the target background image to obtain an output image.
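  • as a closing sketch, the helper functions sketched in the earlier sections (classify_by_position, make_validity_mask, fuse_background, cover_with_poisson) can be composed into the flow of FIG. 12; the track association between frames is assumed to have been done already and is passed in as {object_id: [box per frame]}, since the embodiment leaves the choice of tracking algorithm open.

```python
# Sketch of the FIG. 12 flow, composing the earlier sketches; tracks maps each
# object id to one (x1, y1, x2, y2) box per frame.
import cv2
import numpy as np

def box_center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def sharpness(img):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def process(frames, tracks, deviation_threshold=10.0):
    # Steps 1202-1205: classify each tracked object by position deviation.
    moving_ids = {
        oid for oid, boxes in tracks.items()
        if classify_by_position([box_center(b) for b in boxes],
                                deviation_threshold) == "moving"
    }
    # Step 1210: invalidate moving-object pixels in every frame.
    masks = [make_validity_mask(frame.shape,
                                [tracks[oid][k] for oid in moving_ids])
             for k, frame in enumerate(frames)]
    background = fuse_background(frames, masks)       # steps 1211-1212
    ref_index = max(range(len(frames)), key=lambda i: sharpness(frames[i]))
    reference = frames[ref_index]                     # step 1213
    # Step 1214: cover the stationary objects from the reference frame.
    stationary_mask = np.zeros(frames[0].shape[:2], np.uint8)
    for oid in set(tracks) - moving_ids:
        x1, y1, x2, y2 = tracks[oid][ref_index]
        stationary_mask[int(y1):int(y2), int(x1):int(x2)] = 255
    return cover_with_poisson(reference, background, stationary_mask)
```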
  • it should be understood that although the steps in the flowcharts of FIGS. 2, 5-6 and 8-12 are shown in sequence according to the arrows, these steps are not necessarily executed in the sequence indicated by the arrows. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and the steps may be executed in other orders. Moreover, at least a part of the steps in FIGS. 2, 5-6 and 8-12 may include multiple sub-steps or stages, which are not necessarily executed and completed at the same time but may be executed at different times; the execution order of these sub-steps or stages is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
  • an image processing apparatus 1300 including: an acquisition module 1310, a determination module 1320, a removal module 1330, and an overlay module 1340, wherein:
  • the acquisition module 1310 is configured to acquire multiple frames of images shot by the camera module on the same scene, and to perform target detection on the multiple frames of images by using a target detection model to obtain a target object included in each frame of the multiple frames of images;
  • the determining module 1320 is used to classify and process the target objects included in each frame of image, and determine the moving objects and stationary objects included in each frame of image;
  • the removing module 1330 is used to remove the moving objects in each frame of image, obtain the background image corresponding to each frame of image, perform fusion processing on all the background images, and generate the target background image;
  • the covering module 1340 is configured to cover the stationary object in the target background image with the stationary object in the multi-frame images, and obtain the output image of the camera module.
  • the above determination module 1320 includes: a first determination unit 1321 and a second determination unit 1322, wherein:
  • the first determining unit 1321 is configured to determine the position of the target object in each frame of image.
  • the second determining unit 1322 is configured to determine whether the target object is a moving object or a stationary object according to the position of the target object in each frame of image.
  • the above-mentioned second determining unit 1322 is specifically configured to calculate the position deviation value of the target object in any two frames of the multi-frame images. If the maximum position deviation value is less than the position deviation threshold, the target object is determined to be For a stationary object, if the position deviation value of the target object in any two frames of the multi-frame images is greater than or equal to the position deviation threshold, the target object is determined to be a moving object.
  • the above determination module 1320 includes: a third determination unit 1323 and a fourth determination unit 1324, wherein:
  • the third determining unit 1323 is used to determine the number of target pixels in the tracking position of each frame of images, the target pixels are used to display the target object, and the tracking position is the position of the target object in any frame of the multi-frame images.
  • the fourth determining unit 1324 is configured to determine whether the target object is a moving object or a stationary object according to the number of target pixels in the tracking position of each frame of image.
  • the above-mentioned fourth determining unit 1324 is specifically configured to calculate the difference in the number of target pixels at the tracking position between any two frames of the multi-frame images; if the maximum number difference is less than the pixel number threshold, the target object is determined to be a stationary object; if the difference in the number of target pixels at the tracking position between any two frames of images is greater than or equal to the pixel number threshold, the target object is determined to be a moving object.
  • the above-mentioned removal module 1330 is specifically configured to mark the pixels corresponding to the moving objects in each frame of images as invalid pixels, and to generate the background image corresponding to each frame of image according to the remaining pixels in each frame of images except the invalid pixels.
  • the overlay module 1340 is specifically configured to determine a reference image according to multiple frames of images, and the definition of the stationary object in the reference image is greater than that of the corresponding stationary object in the target background image; using the stationary object in the reference image Overlay the stationary object in the target background image to obtain the output image.
  • Each module in the above-mentioned image processing apparatus may be implemented in whole or in part by software, hardware, and combinations thereof.
  • the above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • in one embodiment, a computer device is provided, including a memory and a processor, where a computer program is stored in the memory, and the processor implements the following steps when executing the computer program: acquiring multiple frames of images shot by a camera module of the same scene, and using the target detection model to perform target detection on the multiple frames of images to obtain the target objects included in each frame of the multi-frame images; classifying the target objects included in each frame of images to determine the moving objects and stationary objects included in each frame of images; removing the moving objects in each frame of image to obtain the background image corresponding to each frame of image, and fusing all the background images to generate the target background image; and using the stationary objects in the multi-frame images to cover the stationary objects in the target background image to obtain the output image of the camera module.
  • the processor further implements the following steps when executing the computer program: determining the position of the target object in each frame of image; and determining, according to the position of the target object in each frame of image, whether the target object is a moving object or a stationary object.
  • the processor further implements the following steps when executing the computer program: calculating the position deviation value of the target object between any two frames of the multi-frame images; if the maximum position deviation value is less than the position deviation threshold, determining that the target object is a stationary object; and if the position deviation value of the target object between any two frames of the multi-frame images is greater than or equal to the position deviation threshold, determining that the target object is a moving object.
  • the processor further implements the following steps when executing the computer program: determining the number of target pixels in the tracking position of each frame of image, where the target pixels are used to display the target object and the tracking position is the position of the target object in any one frame of the multi-frame images; and determining, according to the number of target pixels in the tracking position of each frame of image, whether the target object is a moving object or a stationary object.
  • the processor also implements the following steps when executing the computer program: calculating the difference in the number of target pixels at the tracking position between any two frames of the multi-frame images; if the largest number difference is less than the pixel number threshold, determining that the target object is a stationary object; and if the difference in the number of target pixels at the tracking position between any two frames of images is greater than or equal to the pixel number threshold, determining that the target object is a moving object.
  • the processor also implements the following steps when executing the computer program: marking the pixels corresponding to the moving objects in each frame of image as invalid pixels; and generating the background image corresponding to each frame of image according to the remaining pixels in each frame of image except the invalid pixels.
  • the processor further implements the following steps when executing the computer program: determining a reference image according to the multiple frames of images, where the definition of the stationary object in the reference image is greater than that of the corresponding stationary object in the target background image; and using the stationary object in the reference image to cover the stationary object in the target background image to obtain the output image.
  • in one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented: acquiring multiple frames of images shot by a camera module of the same scene, and using the target detection model to perform target detection on the multiple frames of images to obtain the target objects included in each frame of the multi-frame images; classifying the target objects included in each frame of images to determine the moving objects and stationary objects included in each frame of images; removing the moving objects in each frame of image to obtain the background image corresponding to each frame of image, and fusing all the background images to generate the target background image; and using the stationary objects in the multi-frame images to cover the stationary objects in the target background image to obtain the output image of the camera module.
  • when the computer program is executed by the processor, the following steps are also implemented: determining the position of the target object in each frame of image; and determining, according to the position of the target object in each frame of image, whether the target object is a moving object or a stationary object.
  • when the computer program is executed by the processor, the following steps are further implemented: calculating the position deviation value of the target object between any two frames of the multi-frame images; if the maximum position deviation value is less than the position deviation threshold, determining that the target object is a stationary object; and if the position deviation value of the target object between any two frames of the multi-frame images is greater than or equal to the position deviation threshold, determining that the target object is a moving object.
  • when the computer program is executed by the processor, the following steps are further implemented: determining the number of target pixels in the tracking position of each frame of image, where the target pixels are used to display the target object and the tracking position is the position of the target object in any one frame of the multi-frame images; and determining, according to the number of target pixels in the tracking position of each frame of image, whether the target object is a moving object or a stationary object.
  • when the computer program is executed by the processor, the following steps are further implemented: calculating the difference in the number of target pixels at the tracking position between any two frames of the multi-frame images; if the largest number difference is less than the pixel number threshold, determining that the target object is a stationary object; and if the difference in the number of target pixels at the tracking position between any two frames of images is greater than or equal to the pixel number threshold, determining that the target object is a moving object.
  • when the computer program is executed by the processor, the following steps are further implemented: marking the pixels corresponding to moving objects in each frame of image as invalid pixels; and generating, from the remaining valid pixels, the background image corresponding to each frame of image.
  • when the computer program is executed by the processor, the following steps are further implemented: determining a reference image from the multi-frame images, the sharpness of the stationary object in the reference image being greater than that of the corresponding stationary object in the target background image; and covering the stationary object in the target background image with the stationary object in the reference image to obtain the output image.
  • Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, or optical memory, and the like.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • The RAM may take various forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
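
The classification and fusion steps described in the embodiments above can be illustrated with brief, non-authoritative sketches. First, the position-deviation test: a minimal Python/NumPy sketch, assuming each detected target object has been reduced to one (x, y) bounding-box center per frame; the function name and the default threshold are illustrative assumptions, not values specified by this application.

```python
import numpy as np

def is_moving_by_position(positions, deviation_threshold=20.0):
    """Moving if the position deviation of the object between any two
    frames reaches the threshold (in pixels); stationary otherwise."""
    pts = np.asarray(positions, dtype=np.float64)  # shape: (n_frames, 2)
    diffs = pts[:, None, :] - pts[None, :, :]      # pairwise displacements
    deviations = np.linalg.norm(diffs, axis=-1)    # Euclidean deviation matrix
    return bool(deviations.max() >= deviation_threshold)
```

For example, is_moving_by_position([(100, 50), (160, 54), (230, 60)]) reports a moving object, whereas centers that drift by only a few pixels across all frames stay below the threshold and are classified as stationary.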
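Next, the pixel-count test at the tracking position, in the same spirit: here `masks` is assumed to be a list of per-frame boolean segmentation masks for the target object, and `tracking_box` an (x, y, w, h) window taken around the object in one frame; the count threshold is again a hypothetical tuning parameter.

```python
import numpy as np

def is_moving_by_pixel_count(masks, tracking_box, count_threshold=500):
    """Count target pixels inside the tracking window in every frame;
    moving if the count differs by at least the threshold between some
    pair of frames, stationary otherwise."""
    x, y, w, h = tracking_box
    counts = [int(m[y:y + h, x:x + w].sum()) for m in masks]
    # The largest pairwise difference over a set of counts is max - min,
    # so no explicit loop over frame pairs is needed.
    return (max(counts) - min(counts)) >= count_threshold
```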
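For marking invalid pixels and fusing the background images, one plausible reading is a per-pixel average over the frames in which each pixel is valid (i.e., not covered by a moving object). The sketch below assumes `frames` are HxWx3 uint8 images and `moving_masks` are HxW boolean masks of moving-object pixels; the pixel-wise mean is an assumed fusion rule, not one prescribed by this application.

```python
import numpy as np

def fuse_background(frames, moving_masks):
    """Mark moving-object pixels invalid, then fuse the per-frame
    backgrounds by averaging each pixel over the frames where it is valid."""
    stack = np.stack(frames).astype(np.float64)   # (n, H, W, 3)
    valid = ~np.stack(moving_masks)               # (n, H, W), True = valid pixel
    weight = valid[..., None].astype(np.float64)  # broadcast over color channels
    total = (stack * weight).sum(axis=0)
    count = weight.sum(axis=0)
    # Pixels invalid in every frame stay black instead of producing NaN.
    fused = np.divide(total, count, out=np.zeros_like(total), where=count > 0)
    return fused.astype(np.uint8)
```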
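Finally, the reference-image step: choose, for each stationary object, the frame in which it appears sharpest, and paste it over the target background image when it beats the fused background's sharpness. The variance-of-Laplacian measure and the OpenCV (`cv2`) calls are stand-ins chosen for illustration; this application does not prescribe a particular sharpness metric.

```python
import cv2
import numpy as np

def sharpness(image, mask):
    """Variance of the Laplacian over the masked region (higher = sharper);
    assumes the mask selects a non-empty region."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    lap = cv2.Laplacian(gray, cv2.CV_64F)
    return float(lap[mask].var())

def cover_stationary(target_bg, frames, stationary_masks):
    """Copy each stationary object into the target background image from
    the frame in which that object is sharpest (the reference image)."""
    output = target_bg.copy()
    for mask in stationary_masks:  # one HxW boolean mask per stationary object
        ref = max(frames, key=lambda f: sharpness(f, mask))
        if sharpness(ref, mask) > sharpness(target_bg, mask):
            output[mask] = ref[mask]
    return output
```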

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are an image processing method and apparatus, a computer device, and a storage medium, which relate to the technical field of computers. The method comprises: acquiring a plurality of frames of images of the same scene captured by a camera module, and performing target detection on the plurality of frames of images by using a target detection model, so as to obtain a target object included in each frame of the plurality of frames of images; classifying the target object included in each frame of image, so as to determine a moving object and a stationary object included in each frame of image; removing the moving object from each frame of image to obtain a background image corresponding to each frame of image, and fusing all the background images to generate a target background image; and covering the stationary object in the target background image with the stationary objects in the plurality of frames of images, so as to obtain an output image of the camera module, such that the image quality of a composite image from which moving objects have been removed can be improved.
PCT/CN2022/083400 2021-03-29 2022-03-28 Image processing method and apparatus, computer device and storage medium WO2022206679A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110330377.3A CN113129227A (zh) Image processing method and apparatus, computer device and storage medium
CN202110330377.3 2021-03-29

Publications (1)

Publication Number Publication Date
WO2022206679A1 (fr)

Family

ID=76773963

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/083400 WO2022206679A1 (fr) Image processing method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN113129227A (fr)
WO (1) WO2022206679A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113129227A (zh) * 2021-03-29 2021-07-16 影石创新科技股份有限公司 Image processing method and apparatus, computer device and storage medium

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1471694A (zh) * 2001-06-27 2004-01-28 Image processing apparatus and method, and image capturing apparatus
CN1471693A (zh) * 2001-06-25 2004-01-28 Image processing device and method, and image acquisition device
CN1471691A (zh) * 2001-06-20 2004-01-28 Image processing device and method, and image capture device
CN1471692A (zh) * 2001-06-27 2004-01-28 Image processing device and method, and image capture device
US20080019566A1 (en) * 2006-07-21 2008-01-24 Wolfgang Niem Image-processing device, surveillance system, method for establishing a scene reference image, and computer program
CN103310454A (zh) * 2013-05-08 2013-09-18 北京大学深圳研究生院 Method and system for judging the type and owner of a stationary object in abandoned-object detection
CN107844765A (zh) * 2017-10-31 2018-03-27 广东欧珀移动通信有限公司 Photographing method and apparatus, terminal and storage medium
CN107943837A (zh) * 2017-10-27 2018-04-20 江苏理工学院 Video summary generation method based on key-framing of foreground targets
CN109002787A (zh) * 2018-07-09 2018-12-14 Oppo广东移动通信有限公司 Image processing method and apparatus, storage medium and electronic device
CN110290323A (zh) * 2019-06-28 2019-09-27 Oppo广东移动通信有限公司 Image processing method and apparatus, electronic device and computer-readable storage medium
CN111028263A (zh) * 2019-10-29 2020-04-17 福建师范大学 Moving object segmentation method and system based on optical-flow color clustering
CN113129229A (zh) * 2021-03-29 2021-07-16 影石创新科技股份有限公司 Image processing method and apparatus, computer device and storage medium
CN113129227A (zh) * 2021-03-29 2021-07-16 影石创新科技股份有限公司 Image processing method and apparatus, computer device and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106469311B (zh) * 2015-08-19 2019-11-05 南京新索奇科技有限公司 Target detection method and apparatus
CN105827952B (zh) * 2016-02-01 2019-05-17 维沃移动通信有限公司 Photographing method for removing a specified object, and mobile terminal
CN109040603A (zh) * 2018-10-15 2018-12-18 Oppo广东移动通信有限公司 High-dynamic-range image acquisition method and apparatus, and mobile terminal
CN110166710A (zh) * 2019-06-21 2019-08-23 上海闻泰电子科技有限公司 Image synthesis method, apparatus, device and medium
CN111242128B (zh) * 2019-12-31 2023-08-04 深圳奇迹智慧网络有限公司 Target detection method and apparatus, computer-readable storage medium and computer device

Also Published As

Publication number Publication date
CN113129227A (zh) 2021-07-16

Similar Documents

Publication Publication Date Title
CN109325954B (zh) Image segmentation method and apparatus, and electronic device
WO2022206680A1 (fr) Image processing method and apparatus, computer device and storage medium
WO2019218824A1 (fr) Motion track acquisition method and related device, storage medium and terminal
Li et al. Supervised people counting using an overhead fisheye camera
CN109815843B (zh) Image processing method and related product
EP3982322A1 (fr) Method for stitching panoramic images and videos, computer-readable storage medium and panoramic camera
CN109344727B (zh) ID card text information detection method and apparatus, readable storage medium and terminal
KR101603019B1 (ko) Image processing apparatus, image processing method and computer-readable recording medium
CN109299658B (zh) Face detection method, face image rendering method and apparatus, and storage medium
CN112102340B (zh) Image processing method and apparatus, electronic device and computer-readable storage medium
Shi et al. Robust foreground estimation via structured Gaussian scale mixture modeling
CN110796041B (zh) Subject recognition method and apparatus, electronic device, and computer-readable storage medium
CN111753882B (zh) Training method and apparatus for an image recognition network, and electronic device
CN107346414B (zh) Pedestrian attribute recognition method and apparatus
WO2022160857A1 (fr) Image processing method and apparatus, computer-readable storage medium and electronic device
WO2022194079A1 (fr) Sky region segmentation method and apparatus, computer device and storage medium
CN109447022B (zh) Shot type identification method and apparatus
WO2022233252A1 (fr) Image processing method and apparatus, computer device and storage medium
WO2022206679A1 (fr) Image processing method and apparatus, computer device and storage medium
Wang et al. Object counting in video surveillance using multi-scale density map regression
CN113658197B (zh) Image processing method and apparatus, electronic device and computer-readable storage medium
Choi et al. A method for fast multi-exposure image fusion
AU2014277855A1 (en) Method, system and apparatus for processing an image
CN112884804A (zh) Moving object tracking method and related device
CN115115552B (zh) Image correction model training and image correction method, apparatus and computer device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22778861

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22778861

Country of ref document: EP

Kind code of ref document: A1