WO2020168515A1 - Image processing method and apparatus, image capture and processing system, and carrier - Google Patents

Image processing method and apparatus, image capture and processing system, and carrier

Info

Publication number
WO2020168515A1
WO2020168515A1 (application PCT/CN2019/075707)
Authority
WO
WIPO (PCT)
Prior art keywords
image
frame
target
target object
sequence
Prior art date
Application number
PCT/CN2019/075707
Other languages
English (en)
Chinese (zh)
Inventor
Xue Lijun (薛立君)
Fyodor Kravchenko (克拉夫琴科·费奥多尔)
Original Assignee
SZ DJI Technology Co., Ltd. (深圳市大疆创新科技有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SZ DJI Technology Co., Ltd. (深圳市大疆创新科技有限公司)
Priority to PCT/CN2019/075707 (WO2020168515A1)
Priority to CN201980004937.7A (CN111247790A)
Publication of WO2020168515A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/64 Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038 Image mosaicing, e.g. composing plane images from plane sub-images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20224 Image subtraction

Definitions

  • the embodiments of the present invention relate to the field of image processing technology, and in particular to an image processing method, device, image shooting and processing system and carrier.
  • Time-lapse shooting is a time-compression technique: after a group of photos or a video is captured, a long real-world process is compressed into a short playback, either by playing the photo series in sequence or by extracting frames from the video.
  • The embodiments of the present invention provide an image processing method, device, image shooting and processing system and carrier, which can effectively remove abnormal objects from images and improve the playback effect of image frame sequences obtained by time-lapse shooting.
  • the first aspect of the embodiments of the present invention is to provide an image processing method, including:
  • the second aspect of the embodiments of the present invention is to provide an image processing device, including a memory and a processor;
  • the memory is used to store program codes
  • the processor calls the program code, and when the program code is executed, is used to perform the following operations:
  • a third aspect of the embodiments of the present invention is to provide an image shooting and processing system, which is characterized in that it includes a shooting device and one or more processors, wherein:
  • the photographing device is configured to obtain an image frame sequence by time-lapse photographing, and send the image frame sequence to the one or more processors;
  • The one or more processors are configured to determine a target frame with a target object in the sequence of image frames, cut out the image area where the target object exists in the target frame, and fill the image area after the target object is cut out.
  • The fourth aspect of the embodiments of the present invention is to provide a carrier, characterized by comprising an image capturing and processing device, wherein the image capturing and processing device is configured to:
  • In the embodiments of the present invention, the control terminal may first acquire the time-lapse captured image frame sequence, determine the target frame with the target object in the image frame sequence, cut out the image area where the target object exists in the target frame, and fill the image area after the target object is cut out. This effectively removes the image corresponding to the target object from the target frame, thereby improving the playback effect of time-lapse shooting.
  • FIG. 1 is a schematic diagram of an image processing scene provided by an embodiment of the present invention
  • FIG. 2 is a schematic diagram of an image processing scene provided by another embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an image frame sequence provided by an embodiment of the present invention.
  • Figure 4a is a schematic diagram of a target frame with a target object provided by an embodiment of the present invention.
  • FIG. 4b is a schematic diagram after subtracting the target object in the target frame shown in FIG. 4a according to an embodiment of the present invention
  • FIG. 4c is a schematic diagram after filling the image shown in FIG. 4b according to an embodiment of the present invention.
  • FIG. 5 is a schematic flowchart of an image processing method provided by an embodiment of the present invention.
  • FIG. 6 is a schematic flowchart of an image processing method according to another embodiment of the present invention.
  • FIG. 7 is a schematic diagram, provided by an embodiment of the present invention, of a target object that is a partial object
  • FIG. 8 is a schematic block diagram of an image shooting and processing system according to an embodiment of the present invention.
  • Fig. 9a is a schematic structural diagram of a partial mask provided by an embodiment of the present invention.
  • Figure 9b is a schematic diagram of a partial-convolution neural network structure provided by an embodiment of the present invention.
  • In order to eliminate the target object in the sequence of image frames obtained by time-lapse shooting, where the target object is an abnormal object that appears in a target frame during the time-lapse shooting process, or an object the user designates for removal from the target frame, the target frame can be processed manually.
  • If manual processing is adopted, however, there is a risk of omission: the removal efficiency of the target object in the image frame sequence is low, and the target object in the image cannot be effectively eliminated.
  • This application therefore proposes an image processing method that automatically recognizes the time-lapse captured image frame sequence, subtracts the image area where the target object exists, and fills the subtracted image area. This improves both the removal efficiency and the removal quality of the target object in the image frame sequence, and thereby the playback effect of the sequence obtained by time-lapse shooting.
  • The image processing method may be applied to the image processing scene shown in FIG. 1, and specifically to the image shooting and processing system shown in FIG. 1, which includes a shooting device and one or more processors.
  • In one arrangement, the shooting device and the one or more processors are integrated in the same physical device, and the image shooting and processing system consists of that single physical device.
  • For example, the one or more processors may be configured inside a drone, with the photographing device mounted on the drone and used for time-lapse photography. The photographing device collects images at preset time intervals and sends them to the one or more processors integrated in the drone; the processors sort the collected images in time sequence and compress the sorted images into a sequence of image frames, thereby obtaining the image frame sequence of the time-lapse shooting.
  • Alternatively, the photographing device and the one or more processors may be integrated in different physical devices, in which case the image shooting and processing system is composed of multiple physical devices. The photographing device may, for example, be integrated in a mobile phone or camera, while the one or more processors may be integrated in such devices or in a ground station or remote control device. The physical devices holding the camera and the one or more processors perform image transmission over a pre-established communication connection, so as to realize the processing of the image of the target object.
  • The image processing method may also be applied to the image processing scene shown in FIG. 2, and specifically to the carrier shown in FIG. 2. The carrier includes an image capturing and processing device, which can be mounted on the carrier; the carrier may be an unmanned aerial vehicle, an unmanned vehicle, or a handheld device or carrier device with a pan/tilt.
  • Take the case where the carrier is a handheld pan/tilt as an example.
  • The handheld pan/tilt is equipped with the above-mentioned image capturing and processing device, which can be configured to acquire the time-lapse captured sequence of image frames and process the sequence to obtain a target image from which abnormal objects in the sequence have been subtracted.
  • The image capturing device may be a part of the carrier, or it may be fixedly installed on the carrier.
  • The image processing device communicates with the image capturing device in a wired or wireless manner to receive the image data captured by the image capturing device.
  • the drone or the handheld pan/tilt may sort the collected images in chronological order.
  • The following takes the shooting scene shown in FIG. 1 as an example to describe this solution in detail.
  • The drone, that is, the one or more processors in the drone, can sort the collected images according to the time sequence indicated by the arrow in FIG. 3; the image frame sequence obtained after sorting and compression can be as shown in FIG. 3.
  • Specifically, each frame image in the image frame sequence may be identified so as to determine, from the image frame sequence, a target frame that includes a target object. The target object is an abnormal object that appears during the time-lapse shooting; it corresponds to the image formed by the pixels whose values, at the same positions, differ between the target frame and an adjacent frame. When the image frame sequence shown in FIG. 3 is recognized, the image determined to include the target object is the image numbered 2 in the figure.
  • In this way, the target frame with the target object can be determined in the image frame sequence. Assume the drone has determined, from the image frame sequence, the target frame with the target object shown in FIG. 4a; the image shown in FIG. 4a is the image marked with serial number 2 in FIG. 3 above.
  • The target object is assumed to be the abnormal object identified by area 401 in the figure.
  • The target object may be an interference object, such as a bird, preset as an abnormal object in the UAV, or an object selected by the user to be eliminated.
  • In practice, the drone may determine the target object from the target frame when it detects that the target frame includes a preset interference object. Alternatively, the drone can recognize the types of objects included in each image frame and determine the target object based on the number of objects of each type appearing in the image frames; for example, the object type that appears least often in the image frames can be determined as the target object.
  • After determining the target frame, the image area where the target object exists in the target frame can be subtracted, that is, the image identified by area 401 in FIG. 4a can be cut out. The target frame after the cut-out can be as shown in FIG. 4b, and the cut-out area can then be filled. The filled image can be as shown in FIG. 4c. This prevents the target object from affecting the playback of the image frame sequence and improves the user's viewing quality.
  • In the handheld pan/tilt scenario, the target object, that is, the abnormal object, may be a tourist who suddenly appears during shooting, that is, the person shown in the corresponding figure.
  • FIG. 5 is a schematic flowchart of an image processing method proposed by an embodiment of the present invention.
  • the image processing method can be specifically applied to the above-mentioned image capturing and processing system and carrier.
  • Taking the image shooting and processing system as the execution subject, the image processing method is described in detail below. As shown in FIG. 5, the method includes:
  • S501 Acquire a time-lapse shot image frame sequence.
  • the photographing device in the image photographing and processing system is used for time-lapse photographing to obtain an image frame sequence, and the image frame sequence is sent to one or more processors included in the system.
  • the processor can be configured to obtain a sequence of time-lapse images.
  • the photographing device may obtain multiple frames of images by photographing according to a preset time interval, the preset time interval may be, for example, 30 minutes, 2 hours, etc., and the photographing device may be, for example, an image acquisition device such as a camera.
  • the multiple frames of images can be directly sorted based on a time sequence to obtain an initial image sequence, and further, the initial image sequence can be compressed to generate an image frame sequence.
  • the sequence of image frames may also be generated by a device integrated with the one or more processors.
  • the device integrated with the one or more processors may be, for example, the aforementioned drone, unmanned vehicle, ground station, and remote control device.
  • Alternatively, the photographing device may directly send the multiple frames of images to the one or more processors, which sort them based on time sequence to obtain the initial image sequence and compress the initial image sequence to generate the image frame sequence.
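  • As a concrete illustration of S501, the following is a minimal sketch of assembling the image frame sequence from timestamped captures. The directory layout, the file naming, and the blank-frame test used for "compression" are assumptions for illustration; the disclosure does not fix them.

```python
# Sketch of S501 under assumed inputs: a directory of timestamped captures.
from pathlib import Path

import cv2
import numpy as np


def load_image_frame_sequence(capture_dir: str) -> list[np.ndarray]:
    """Sort captures by timestamp and drop blank frames to form the sequence."""
    paths = sorted(Path(capture_dir).glob("*.jpg"))  # names assumed to encode capture time
    frames = [cv2.imread(str(p)) for p in paths]
    # "Compress" the initial sequence: keep only non-blank frames so playback is continuous.
    return [f for f in frames if f is not None and f.std() > 1.0]
```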
  • S502 Determine a target frame with a target object in the sequence of image frames.
  • After the image capturing and processing system, specifically the one or more processors, acquires the image frame sequence, the sequence may first be preprocessed in order to determine the target frame with the target object: the image frame sequence can be split into image groups sorted by time sequence, and a target frame with a target object can then be determined from each image group.
  • When the image capturing and processing system determines the target object, it may do so based on a target object preset in the system. Specifically, the preset target object type may be determined first; image recognition is then performed on each frame of the image frame sequence to determine the categories of the objects it contains; the recognized categories are compared with the preset target object type; and, according to the comparison result, a target frame including the target object's type is determined.
  • The target object types preset by the image shooting and processing system may be set according to the shooting scene, with different types preset for different scenes. The shooting scene may be, for example, natural scenery, urban life, or biological evolution; when the shooting scene is natural scenery, the preset target object type may be birds, and when the shooting scene is biological evolution, the preset target object type may be humans.
  • The type of target object preset by the image capturing and processing system may be one type or multiple types.
  • The image capturing and processing system may first determine the image area corresponding to the target object in the target frame based on a preset network model, which may, for example, be a region-based convolutional neural network (Region CNN, RCNN) model. Specifically, the system can input the target frame into the RCNN model and determine, from the model's output, the image area of the target object in the target frame.
  • When the RCNN model determines the image area corresponding to the target object from the input target frame, it first performs feature extraction on the target frame and determines, from the extraction result, the categories of the objects the frame includes. These categories are then compared with the target object type preset by the image capturing and processing system, and the image area corresponding to the target object in the target frame is determined according to the comparison result.
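  • The disclosure names an RCNN-family model but no specific weights or framework. As a hedged illustration, the sketch below uses torchvision's off-the-shelf Faster R-CNN as a stand-in detector, with COCO label ids (e.g. 16 for bird) standing in for the preset target object types; both choices are assumptions.

```python
# A minimal sketch of the detection step with a stand-in detector.
import torch
import torchvision

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()


def detect_target_regions(frame_bgr, target_labels={16}, score_thresh=0.5):
    """Return boxes in the target frame whose predicted class is a preset target type."""
    # Convert BGR uint8 (H, W, 3) to the float CHW tensor in [0, 1] the model expects.
    x = torch.from_numpy(frame_bgr[..., ::-1].copy()).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        out = model([x])[0]
    keep = [(b, l) for b, l, s in zip(out["boxes"], out["labels"], out["scores"])
            if s >= score_thresh and int(l) in target_labels]
    return [b.tolist() for b, _ in keep]  # each box as [x1, y1, x2, y2]
```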
  • S503 Cut out the image area where the target object exists in the target frame.
  • After the image area is determined, a corresponding partial mask graphic can be generated over it; that is, a mask image is used to identify the partial area of the image, and the partial image area identified by the mask is cut out, thereby cutting out the image area where the target object exists in the target frame.
  • The cut-out area may be represented by a white area in the image, after which step S504 is executed.
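  • A minimal sketch of this mask-and-cut step: a binary partial mask is generated over the detected area and the masked pixels are shown white, as in FIG. 4b. The (x1, y1, x2, y2) box format is an assumption carried over from the detector sketch above.

```python
import numpy as np


def cut_out_region(frame: np.ndarray, box) -> tuple[np.ndarray, np.ndarray]:
    """Generate a partial mask for the box and cut the region out of the frame."""
    x1, y1, x2, y2 = (int(v) for v in box)
    mask = np.zeros(frame.shape[:2], dtype=np.uint8)
    mask[y1:y2, x1:x2] = 255            # partial mask marking the target's image area
    cut = frame.copy()
    cut[mask == 255] = 255              # represent the removed area in white
    return cut, mask
```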
  • S504 Fill the image area after the target object is cut out.
  • After the image capturing and processing system cuts out the image area where the target object exists in the target frame, it needs to fill the cut-out area to maintain the continuity of the target frame image and to guarantee the playback effect of the image frame sequence.
  • When filling the image area after the target object is subtracted, the area may be filled based on the previous frame image and the next frame image of the target frame. Alternatively, the target frame with the target object's image area cut out, together with the unit image corresponding to that image area, can be input into a convolutional neural network model that performs the filling; the output image of the model is the filled target frame.
  • The convolutional neural network model used for pixel filling may specifically be a UNet structure built from partial convolution (Partial Convolutions) layers.
  • As a further alternative, the previous frame image and next frame image of the target frame, the target frame with the target object's image area cut out, and the unit image corresponding to that image area may all be input into the convolutional neural network model for pixel filling.
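  • One plausible way to assemble the model inputs listed above (previous frame, next frame, cut-out target frame, and the unit image) is to stack them along the channel axis. The 10-channel layout below is an assumption; the disclosure does not fix the model's input format.

```python
import numpy as np


def stack_inpaint_inputs(prev_f, next_f, cut_frame, mask):
    """Stack the four named inputs into one H x W x 10 array for the fill model."""
    m = (mask[..., None] > 0).astype(np.float32)          # unit image as a 1-channel mask
    planes = [prev_f.astype(np.float32) / 255.0,
              next_f.astype(np.float32) / 255.0,
              cut_frame.astype(np.float32) / 255.0,
              m]
    return np.concatenate(planes, axis=-1)                # 3 + 3 + 3 + 1 channels
```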
  • In summary, the image capturing and processing system first obtains the time-lapse captured sequence of image frames, determines the target frame with the target object in the sequence, cuts out the image area where the target object exists in the target frame, and fills the cut-out area. The image corresponding to the target object in the target frame is thus effectively removed, improving the playback effect of the time-lapse shooting.
  • FIG. 6 is a schematic flowchart of an image processing method according to another embodiment of the present invention.
  • the image processing method can also be specifically applied to the aforementioned image capturing and processing system and carrier.
  • Taking the image capturing and processing system as the execution subject, the image processing method is described in detail below. As shown in FIG. 6, the method includes:
  • S601 Acquire a time-lapse captured sequence of image frames.
  • Specifically, at least one frame of initial images photographed by the photographing device in the image capturing and processing system may first be obtained; the initial images are sorted based on time sequence to obtain the initial image sequence, and the image frame sequence is then obtained by compressing the initial image sequence.
  • When the initial image sequence is compressed, blank images in the initial image sequence may be deleted first and the timestamps of the remaining images adjusted, so that each frame of the resulting image frame sequence is a continuous non-blank image.
  • S602 Determine a target frame with a target object in the sequence of image frames.
  • Specifically, when the image capturing and processing system determines a target frame with a target object, it may first determine the adjacent frames of a candidate frame and compare the frame with its adjacent frames. A target image composed of the pixels whose values differ, at the same positions, from those of the adjacent frame can thus be determined in the frame; the object corresponding to that target image is the target object.
  • In this way, a target frame including the target object is determined in the sequence of image frames.
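  • A minimal sketch of this adjacent-frame comparison follows; the difference threshold and the minimum changed-pixel count are assumed tuning parameters, not values fixed by the disclosure.

```python
import cv2
import numpy as np


def changed_pixel_mask(frame, neighbour, diff_thresh=25):
    """Mask of pixels whose values differ, at the same position, between the two frames."""
    diff = cv2.absdiff(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(neighbour, cv2.COLOR_BGR2GRAY))
    return (diff > diff_thresh).astype(np.uint8)


def find_target_frames(frames, min_changed=500):
    """Indices of frames whose difference against the previous frame is large enough."""
    return [i for i in range(1, len(frames))
            if changed_pixel_mask(frames[i], frames[i - 1]).sum() > min_changed]
```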
  • It should be noted that the target object is generally a moving object, and may be all or part of one. When the whole moving object is captured in the target frame, the target object is the whole object; as shown in FIG. 4a, the target object in that target frame is the whole bird. When only part of the moving object is captured in the target frame, the target object is that part; as shown in FIG. 7, where only the bird's foot is captured, the target object in the target frame is the part of the bird identified by 701.
  • Further, the image capturing and processing system can recognize image edges in the target frame based on the above-mentioned convolutional neural network structure, which improves the speed of recognizing the image corresponding to the target object. Specifically, the UNet structure with Partial Convolutions layers recognizes the edges of the image, and the edge of the image corresponding to the target object in the target frame is determined from the recognition result.
  • When the Partial Convolutions layer is used to recognize image edges, a set of pixels belonging to the same semantics can be determined, and the image area formed by that set can be taken as the image area of the image corresponding to the target object; the target frame can then be determined accordingly.
  • Pixels belonging to the same semantics are the pixels used to describe the characteristics of the same object. For example, the pixels corresponding to the bird's wings and the pixels corresponding to the bird's feet in FIG. 4a both describe characteristics of the bird, so they form a set of pixels belonging to the same semantics. The pixels corresponding to the car door, by contrast, do not describe characteristics of the bird, so they do not belong to the same semantic pixel set as the wing pixels.
  • In addition, the target object can be determined by distance comparison, and the target frame with the target object determined from it. For example, if the distance between the bird (determined from its same-semantics pixel set in FIG. 4a) and the image capturing and processing system is a, the distance between the car and the system is b, and a is less than b, then the car and the bird are not on the same plane; the determined target object is therefore the bird.
  • In one feasible implementation, the image shooting and processing system may first process the image frame sequence with a neural network model to determine the category of the objects included in each image of the sequence. Specifically, the system inputs the multiple images of the image frame sequence into the neural network model, calls the model to perform feature extraction on each image to obtain a feature extraction result, and determines from that result the category of the objects each image includes.
  • the features may be color features, texture features, and the like, for example.
  • In one feasible implementation, the image shooting and processing system may first call the neural network model to summarize the feature extraction result, obtaining a feature summary result.
  • The system may summarize different features separately, for example summarizing texture features to obtain a texture feature summary result and color features to obtain a color feature summary result. Alternatively, it may summarize all the feature extraction results together, for example combining texture and color features into a single feature summary result.
  • In practice, the feature extraction results can be added directly to obtain the feature summary result, or they can be weighted and summed.
  • Further, the system can determine from the feature summary result the category of objects each image in the image frame sequence includes. Specifically, the feature summary result is matched against the preset feature values for each object type, and the object categories in each image are determined according to the matching result.
  • Further, the target frame can be determined from the output of the neural network model's processing of the image frame sequence.
  • Specifically, the image capturing and processing system may determine the target frame based on the object categories the model outputs for each image: the categories in each image are matched against the preset target object types, each image in the sequence is thereby checked for a target object, and the image frames containing a target object are determined as target frames.
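  • This selection step reduces to a set intersection between each frame's predicted categories and the preset target types. In the sketch below the classifier is abstracted as a callable returning a set of category names, and the preset types are illustrative assumptions:

```python
PRESET_TARGET_TYPES = {"bird", "person"}  # example presets per shooting scene; an assumption


def select_target_frames(frames, classify) -> list[int]:
    """Keep frames whose predicted object categories intersect the preset target types."""
    return [i for i, f in enumerate(frames)
            if classify(f) & PRESET_TARGET_TYPES]
```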
  • In another feasible implementation, when determining a target frame with a target object from the sequence, the image capturing and processing system may also divide any frame of the sequence into regions to obtain multiple regional images, acquire the characteristic parameters of each regional image, and determine the target frame with the target object based on those characteristic parameters.
  • Specifically, the object category included in any frame of the image frame sequence can be determined from the characteristic parameters, and the target frame with the target object determined according to the object category.
  • S603 Determine an image area where the target object exists in the target frame.
  • S604 Based on the image area, generate a partial mask pattern corresponding to the image area.
  • S605 Based on the partial mask pattern and the target frame, cut out the image area where the target object exists in the target frame.
  • Step S603-Step S605 are specific details of step S503 in the foregoing embodiment.
  • In a specific implementation, the image shooting and processing system first determines the image area where the target object exists in the target frame. The system may call the preset network model to determine this image area; the preset network model may, for example, be the aforementioned RCNN model.
  • After determining the image area, the system generates a partial mask pattern, that is, a mask pattern, corresponding to the image area. The partial mask pattern is used to mark the partial image area of the target frame where the target object exists. Based on the mask image and the target frame, the image area where the target object exists can be cut out, and after the cut-out the area is represented by a white area in the target frame.
  • For example, for the target frame shown in FIG. 4a, the image area where the target object exists corresponds to the area identified by 401, and the corresponding partial mask pattern is generated from that area. The image within the mask pattern is then cut out, and the image area where the target object existed is represented in white, as shown in FIG. 4b; the subsequently filled result can be as shown in FIG. 4c.
  • Further, the image area after the target object is subtracted may be filled based on the image information surrounding the partial mask graphic. Specifically, the surrounding image domain of the image area where the target object exists in the target frame is determined first: the pixels of the surrounding image domain lie within a preset distance threshold of the pixels of the target object's image area. The cut-out area can then be filled based on this surrounding image domain.
  • When filling based on the surrounding image domain, a reference frame may first be determined from the sequence of image frames; the reference frame is any one of the first M frames preceding the target frame, where M is an integer greater than 1. The image capturing and processing system then determines the exposure intensity of the reference frame and uses a white balance algorithm, based on the reference frame's exposure intensity and the surrounding image domain, to fill the image area after the target object is subtracted.
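  • A hedged sketch of this variant follows: the surrounding image domain is taken as a dilated ring around the mask, and a gray-world-style gain stands in for the unspecified white balance algorithm. The ring width and the use of mean brightness as "exposure intensity" are assumptions.

```python
import cv2
import numpy as np


def fill_from_surroundings(frame, mask, reference, ring_px=15):
    """Fill the masked area with the ring's mean colour, rescaled to the reference exposure."""
    kernel = np.ones((2 * ring_px + 1, 2 * ring_px + 1), np.uint8)
    ring = (cv2.dilate(mask, kernel) > 0) & (mask == 0)      # surrounding image domain
    fill = frame[ring].mean(axis=0)                           # mean colour of the ring (assumes non-empty ring)
    gain = reference.mean() / max(frame[ring].mean(), 1e-6)   # exposure ratio to the reference frame
    out = frame.copy()
    out[mask > 0] = np.clip(fill * gain, 0, 255).astype(np.uint8)
    return out
```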
  • S606 Fill the image area after the target object is cut out.
  • In one feasible implementation, when the image capturing and processing system fills the cut-out image area, it may first acquire the first unit image included in the target frame, where the first unit image is the image area of the target frame in which the target object exists. The target frame with that image area removed, together with the first unit image, is input into the convolutional neural network model, and the model's output image is the filled target frame.
  • In another feasible implementation, the system may additionally acquire the previous frame image and the next frame image of the target frame, and input the previous frame image, the next frame image, the target frame with the target object's image area removed, and the first unit image into the convolutional neural network model; the model's output image is the filled target frame.
  • The convolutional neural network model may specifically be the UNet structure with Partial Convolutions layers described above.
  • In yet another implementation, the system may fill the cut-out area directly from the previous frame image and the next frame image of the target frame. Specifically, it first obtains a second unit image from the previous frame, where the second unit image is the image at the same position as the target object's image area in the target frame, and a third unit image from the next frame, defined in the same way. The first value of each pixel in the second unit image and the second value of each pixel in the third unit image are obtained, and for each pixel position the average of the first value and the second value is calculated. Based on these averages, pixel filling is performed on the target frame from which the target object's image area was removed, yielding the filled target frame.
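  • A minimal sketch of this averaging fill, assuming aligned frames of equal size:

```python
import numpy as np


def fill_by_neighbour_average(cut_frame, mask, prev_f, next_f):
    """For each cut-out pixel, use the mean of the previous and next frame at that position."""
    avg = (prev_f.astype(np.float32) + next_f.astype(np.float32)) / 2.0
    out = cut_frame.copy()
    out[mask > 0] = avg[mask > 0].astype(out.dtype)
    return out
```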
  • Because the system fills pixels by reference to the same positions in the previous frame image and the next frame image, the continuity of the content and color of the filled target frame is ensured.
  • In summary, the image capturing and processing system first obtains the time-lapse captured sequence of image frames and determines, in the sequence, a target frame with a target object. It then determines the image area of the target frame in which the target object exists, generates a partial mask pattern corresponding to that image area, subtracts the image area based on the partial mask pattern and the target frame, and fills the subtracted area. While removing the image corresponding to the target object from the target frame, this keeps the filled target frame's content and color consistent across the time sequence, which effectively improves the playback effect of time-lapse shooting.
  • FIG. 8 is a structural diagram of the image shooting and processing system provided by an embodiment of the present invention. The image shooting and processing system 800 includes a shooting device 801 and one or more processors 802, and can be applied in the image processing scene shown in FIG. 1, where:
  • the photographing device 801 is configured to obtain an image frame sequence by time-lapse photographing, and send the image frame sequence to the one or more processors;
  • The one or more processors 802 are configured to acquire the time-lapse captured image frame sequence, determine a target frame with a target object in the sequence, cut out the image area where the target object exists in the target frame, and fill the image area after the target object is cut out.
  • When acquiring the time-lapse captured image frame sequence, the one or more processors 802 are specifically configured to:
  • the initial image sequence is compressed to obtain an image frame sequence.
  • When the one or more processors 802 determine a target frame with a target object in the image frame sequence, they are specifically configured to:
  • the target frame is determined according to an output result of processing the image frame sequence by the neural network model.
  • When processing the image frame sequence based on the neural network model, the one or more processors 802 are specifically configured to:
  • determine the category of the objects included in each image of the image frame sequence.
  • When determining, based on the feature extraction result, the category of the objects included in each image of the image frame sequence, the one or more processors 802 are specifically configured to:
  • determine the category of the objects included in each image of the image frame sequence.
  • When determining the target frame according to the output result of the neural network model's processing of the image frame sequence, the one or more processors 802 are specifically configured to:
  • the image frame containing the target object is determined as the target frame.
  • When the one or more processors 802 determine a target frame with a target object in the image frame sequence, they are specifically configured to:
  • When determining a target frame with a target object in the image frame sequence based on the characteristic parameters, the one or more processors 802 are specifically configured to:
  • a target frame with a target object in the sequence of image frames is determined.
  • When the one or more processors 802 cut out the image area where the target object exists in the target frame, they are specifically configured to:
  • When the one or more processors 802 fill the image area after the target object is cut out, they are specifically configured to:
  • Determining the surrounding image domain of the image area where the target object exists in the target frame, and the distance between the pixel points in the surrounding image domain and the pixel point of the image area where the target object exists is less than or equal to a preset distance threshold
  • When filling the image area after the target object is subtracted based on the surrounding image domain, the one or more processors 802 are specifically configured to:
  • a white balance algorithm is used to fill the image area after deducting the target object based on the exposure intensity of the reference frame and the surrounding image domain.
  • When the one or more processors 802 fill the image area after the target object is cut out, they are specifically configured to:
  • When the one or more processors 802 fill the image area after the target object is cut out, they are specifically configured to:
  • When the one or more processors 802 fill the image area after the target object is cut out, they are specifically configured to:
  • When the one or more processors 802 fill the image area after the target object is cut out based on the previous frame image and the next frame image to obtain the filled target frame, they are specifically configured to:
  • pixel filling is performed on the target frame from which the image area where the target object exists is removed, to obtain the filled target frame.
  • the target object is an abnormal object included in the target frame during the time-lapse photographing process of the photographing device.
  • When determining a target frame with a target object, the one or more processors 802 are specifically configured to:
  • compare the target frame with its adjacent frame, and determine from the target frame a target image composed of the pixels whose values differ, at the same positions, from those of the adjacent frame, the object corresponding to the target image being the target object;
  • the frame including the target object in the sequence of image frames is used as a target frame.
  • the target object is all or part of a moving object.
  • the image shooting and processing system provided in this embodiment can execute the image processing methods shown in FIG. 5 and FIG. 6 provided in the foregoing embodiment, and the execution mode and beneficial effects are similar, and details are not repeated here.
  • The embodiment of the present invention further provides a carrier. The carrier includes an image shooting and processing device and can be applied to the image processing scene shown in FIG. 2, wherein the image shooting and processing device is configured to:
  • the image capturing and processing device is specifically used to:
  • the initial image sequence is compressed to obtain an image frame sequence.
  • When the image capturing and processing device determines a target frame with a target object in the image frame sequence, it is specifically used to:
  • the target frame is determined according to an output result of processing the image frame sequence by the neural network model.
  • When the image shooting and processing device processes the image frame sequence based on the neural network model, it is specifically configured to:
  • determine the category of the objects included in each image of the image frame sequence.
  • When the image capturing and processing device determines, based on the feature extraction result, the category of the objects included in each image of the image frame sequence, it is specifically configured to:
  • determine the category of the objects included in each image of the image frame sequence.
  • When the image capturing and processing device determines the target frame according to the output result of the neural network model's processing of the image frame sequence, it is specifically configured to:
  • the image frame containing the target object is determined as the target frame.
  • When the image capturing and processing device determines a target frame with a target object in the image frame sequence, it is specifically used to:
  • When the image capturing and processing device determines a target frame with a target object in the image frame sequence based on the characteristic parameters, it is specifically configured to:
  • a target frame with a target object in the sequence of image frames is determined.
  • When the image capturing and processing device cuts out the image area where the target object exists in the target frame, it is specifically configured to:
  • When the image capturing and processing device fills the image area after the target object is cut out, it is specifically configured to:
  • Determining the surrounding image domain of the image area where the target object exists in the target frame, and the distance between the pixel points in the surrounding image domain and the pixel point of the image area where the target object exists is less than or equal to a preset distance threshold
  • When the image capturing and processing device fills the image area after subtracting the target object based on the surrounding image domain, it is specifically configured to:
  • a white balance algorithm is used to fill the image area after deducting the target object based on the exposure intensity of the reference frame and the surrounding image domain.
  • When the image capturing and processing device fills the image area after the target object is cut out, it is specifically configured to:
  • When the image capturing and processing device fills the image area after the target object is cut out, it is specifically configured to:
  • When the image capturing and processing device fills the image area after the target object is cut out, it is specifically configured to:
  • When the image capturing and processing device fills the image area after the target object is cut out based on the previous frame image and the next frame image to obtain the filled target frame, it is specifically used to:
  • pixel filling is performed on the target frame from which the image area where the target object exists is removed, to obtain the filled target frame.
  • the target object is an abnormal object included in the target frame during the time-lapse photographing process of the photographing device.
  • the image capturing and processing device is specifically configured to:
  • compare the target frame with its adjacent frame, and determine from the target frame a target image composed of the pixels whose values differ, at the same positions, from those of the adjacent frame, the object corresponding to the target image being the target object;
  • the frame including the target object in the sequence of image frames is used as a target frame.
  • the target object is all or part of a moving object.
  • the carrier provided in this embodiment can execute the image processing methods as shown in FIG. 5 and FIG. 6 provided in the foregoing embodiment, and the execution mode and beneficial effects are similar, and details are not repeated here.
  • Finally, the partial mask (Mask) and partial convolution (Partial Convolution) referred to in this specification are described.
  • For pixel-level semantic recognition of an image, a fully convolutional network (Fully Convolutional Network) is often used.
  • A fully convolutional network, however, must perform traversal convolution over the entire input image, which consumes considerable resources and reduces processing speed to a certain extent.
  • The local mask, by contrast, convolves only the region of interest to identify the semantics of each pixel within it, and at the same time performs regression on the bounding box of the local mask to obtain the pixel features around that bounding box.
  • the framed area of the input image is the region of interest (RoI, Region of interest).
  • (Figure legend: RoI denotes the region of interest; ClassBox denotes the classification branch.)
  • Lcls and Lbox can be defined as in standard Fast R-CNN. For each RoI, the mask branch has a K·m²-dimensional output, encoding K binary masks of resolution m×m, one per class. A sigmoid is applied to each pixel, and Lmask is defined as the average binary cross-entropy loss.
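  • For concreteness, the loss this passage paraphrases is the published Mask R-CNN formulation (the notation below follows that paper and is not fixed by the disclosure):

```latex
L = L_{cls} + L_{box} + L_{mask}, \qquad
L_{mask} = -\frac{1}{m^{2}} \sum_{i=1}^{m^{2}}
\Bigl[\, y_i \log \sigma\bigl(x_i^{(k)}\bigr)
      + (1 - y_i) \log\bigl(1 - \sigma\bigl(x_i^{(k)}\bigr)\bigr) \Bigr]
```

  where the mask branch outputs K·m² values (an m×m mask per class), σ is the per-pixel sigmoid, and only the mask of the ground-truth class k contributes to Lmask.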
  • Finally, the U-net neural network model used with Partial Convolution is given as an example.
  • The network performs multiple down-convolutions and up-convolutions on the input image.
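  • The following is a minimal partial-convolution layer in the spirit of Liu et al.'s Partial Convolutions, from which such a UNet can be assembled; it is a sketch of the published formulation, not the disclosure's exact model.

```python
import torch
import torch.nn.functional as F
from torch import nn


class PartialConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding, bias=True)
        # Fixed all-ones kernel that counts valid (unmasked) pixels in each window.
        self.register_buffer("ones", torch.ones(1, 1, kernel_size, kernel_size))
        self.window = kernel_size * kernel_size

    def forward(self, x, mask):
        # mask: (N, 1, H, W) with 1 for valid pixels, 0 for the cut-out region.
        valid = F.conv2d(mask, self.ones, stride=self.conv.stride,
                         padding=self.conv.padding)
        out = self.conv(x * mask)                          # convolve only valid pixels
        bias = self.conv.bias.view(1, -1, 1, 1)
        scale = self.window / valid.clamp(min=1.0)         # re-normalise by valid count
        out = (out - bias) * scale + bias
        new_mask = (valid > 0).float()                     # holes shrink after each layer
        return out * new_mask, new_mask
```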

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

Embodiments of the present invention provide an image processing method and apparatus, an image capture and processing system, and a carrier. The method comprises: acquiring an image frame sequence captured by time-lapse shooting; determining, in the image frame sequence, a target frame having a target object; cutting out from the target frame the image area in which the target object is present; and filling the image area after the target object has been cut out. A target object in an image can thus be effectively removed, and the playback quality of a time-lapse captured image frame sequence is improved.
PCT/CN2019/075707 2019-02-21 2019-02-21 Image processing method and apparatus, image capture and processing system, and carrier WO2020168515A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/075707 WO2020168515A1 (fr) 2019-02-21 2019-02-21 Image processing method and apparatus, image capture and processing system, and carrier
CN201980004937.7A CN111247790A (zh) 2019-02-21 2019-02-21 Image processing method and apparatus, image photographing and processing system, and carrier

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/075707 WO2020168515A1 (fr) 2019-02-21 2019-02-21 Image processing method and apparatus, image capture and processing system, and carrier

Publications (1)

Publication Number Publication Date
WO2020168515A1 true WO2020168515A1 (fr) 2020-08-27

Family

ID=70877357

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/075707 WO2020168515A1 (fr) 2019-02-21 2019-02-21 Image processing method and apparatus, image capture and processing system, and carrier

Country Status (2)

Country Link
CN (1) CN111247790A (fr)
WO (1) WO2020168515A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113744141B (zh) * 2020-11-19 2024-04-16 Beijing Jingdong Qianshi Technology Co., Ltd. Image enhancement method and apparatus, and automatic driving control method and apparatus

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113014799B (zh) * 2021-01-28 2023-01-31 Vivo Mobile Communication Co., Ltd. Image display method and apparatus, and electronic device


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2494498B1 (fr) * 2009-10-30 2018-05-23 QUALCOMM Incorporated Procédé et appareil de détection d'image avec retrait d'objet non désiré
US10021363B2 (en) * 2015-10-16 2018-07-10 Novatek Microelectronics Corp. Method and apparatus for processing source image to generate target image
CN108399362B (zh) * 2018-01-24 2022-01-07 Sun Yat-sen University Rapid pedestrian detection method and apparatus
CN108961302B (zh) * 2018-07-16 2021-03-02 OPPO Guangdong Mobile Communications Co., Ltd. Image processing method and apparatus, mobile terminal, and computer-readable storage medium
CN109167893B (zh) * 2018-10-23 2021-04-27 OPPO Guangdong Mobile Communications Co., Ltd. Method and apparatus for processing captured images, storage medium, and mobile terminal

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160027159A1 (en) * 2014-07-24 2016-01-28 Adobe Systems Incorporated Low memory content aware image modification
CN106204567A (zh) * 2016-07-05 2016-12-07 South China University of Technology Natural-background video matting method
CN106250874A (zh) * 2016-08-16 2016-12-21 NetPosa Technologies, Ltd. Method and apparatus for recognizing clothing and personal items
CN106651762A (zh) * 2016-12-27 2017-05-10 Nubia Technology Co., Ltd. Photo processing method, apparatus and terminal
CN106951899A (zh) * 2017-02-24 2017-07-14 Li Gangyi Anomaly detection method based on image recognition
CN107481244A (zh) * 2017-07-04 2017-12-15 Kunming University of Science and Technology Method for producing a visual semantic segmentation database for industrial robot vision
CN109191414A (zh) * 2018-08-21 2019-01-11 Beijing Megvii Technology Co., Ltd. Image processing method and apparatus, electronic device, and storage medium


Also Published As

Publication number Publication date
CN111247790A (zh) 2020-06-05

Similar Documents

Publication Publication Date Title
CN109636754B (zh) 基于生成对抗网络的极低照度图像增强方法
US10937167B2 (en) Automated generation of pre-labeled training data
KR101573131B1 (ko) 이미지 촬상 방법 및 장치
CN110839129A (zh) 图像处理方法、装置以及移动终端
CN101512549B (zh) 数字图像采集装置中的实时人脸追踪
CN108830208A (zh) 视频处理方法和装置、电子设备、计算机可读存储介质
CN111898581B (zh) 动物检测方法、装置、电子设备及可读存储介质
WO2015034725A1 (fr) Sélection automatisée d'images de gardien d'un ensemble capturé d'images en rafale
Karaman et al. Comparison of static background segmentation methods
CN110751630B (zh) 基于深度学习的输电线路异物检测方法、装置及介质
CN107465855B (zh) 图像的拍摄方法及装置、无人机
CN108566513A (zh) 一种无人机对运动目标的拍摄方法
WO2020168515A1 (fr) Procédé et appareil de traitement d'image, système de capture et de traitement d'image, et support
CN106373139A (zh) 一种图像处理方法及装置
CN114022823A (zh) 一种遮挡驱动的行人再识别方法、系统及可存储介质
CN111080546B (zh) 一种图片处理方法及装置
CN107547839A (zh) 基于图像分析的远程控制平台
CN113038002B (zh) 图像处理方法、装置、电子设备及可读存储介质
CN111192286A (zh) 一种图像合成方法、电子设备及存储介质
CN111144156A (zh) 一种图像数据处理方法和相关装置
CN112839167A (zh) 图像处理方法、装置、电子设备及计算机可读介质
CN111105369B (zh) 图像处理方法、图像处理装置、电子设备和可读存储介质
US20230306564A1 (en) System and Methods for Photo In-painting of Unwanted Objects with Auxiliary Photos on Smartphone
CN116095363B (zh) 基于关键行为识别的移动端短视频高光时刻剪辑方法
CN116051477A (zh) 一种超高清视频文件的图像噪声检测方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19915790

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19915790

Country of ref document: EP

Kind code of ref document: A1