WO2023071189A1 - Image processing method and apparatus, computer device, and storage medium - Google Patents

Image processing method and apparatus, computer device, and storage medium

Info

Publication number
WO2023071189A1
WO2023071189A1 (PCT/CN2022/096442)
Authority
WO
WIPO (PCT)
Prior art keywords
image
detected
detection
images
target
Prior art date
Application number
PCT/CN2022/096442
Other languages
English (en)
French (fr)
Inventor
Yuan Xi (袁熙)
Original Assignee
Shanghai SenseTime Intelligent Technology Co., Ltd. (上海商汤智能科技有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai SenseTime Intelligent Technology Co., Ltd.
Publication of WO2023071189A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Definitions

  • the present disclosure relates to the technical field of computer vision and image processing, and in particular to an image processing method, device, computer equipment and storage medium.
  • Since the behavior recognition algorithm is subject to a time limit when detecting object behavior in images, that is, it is required to output the object behavior recognition result within a specified time, the algorithm can rely on only a limited number of input images within that time.
  • The traditional way of selecting input images is generally to select a limited number of image frames at equal intervals from the video as the input of the subsequent object behavior recognition process. However, this method easily misses the image frames that reflect the key behavior of the object in the video, which severely reduces the accuracy of recognizing the object's behavior and thus the accuracy of behavior detection.
  • Embodiments of the present disclosure at least provide an image processing method, device, computer equipment, and storage medium.
  • an embodiment of the present disclosure provides an image processing method, including:
  • For each image group, determine image difference information for reflecting the difference between the first target images in the image group;
  • At least one frame of the second target image is screened from the multiple frames of the first target image.
  • By using the image difference information reflecting the difference between the first target images in each image group, the difference between a group of first target images with the latest shooting times can be obtained accurately. Then, according to the image difference information, image screening is performed to screen out, from the multiple frames of the first target image, images whose degree of difference satisfies the preset difference condition as second target images, that is, images that can represent key features, such as images of an object's abnormal behavior. Correspondingly reducing the number of second target images can improve the efficiency of subsequent image processing and thereby improve the accuracy of image processing.
  • it also includes:
  • image processing is performed to obtain an image processing result.
  • Since the second target image is an image capable of representing key features, image processing can be performed using the second target image to obtain an accurate image processing result and improve image detection accuracy.
  • determining the image difference information used to reflect the difference between the first target images in the image group includes:
  • the pixel group includes at least two pixel points
  • image difference information for reflecting the difference between the first target images in the image group is determined.
  • pixels constitute an image
  • a pixel can be used as the smallest unit representing an image.
  • Determining, based on the pixel difference information corresponding to each pixel group in the plurality of pixel groups, the image difference information for reflecting the difference between the first target images in the image group includes:
  • image difference information for reflecting the difference between the first target images in the image group is determined.
  • Since the pixel difference information can finely distinguish the differences between the first target images in the image group, comprehensively processing the pixel difference information of the pixel groups, for example by cumulative summation, can yield accurate first comprehensive difference information. The first comprehensive difference information can characterize the maximum difference between the first target images in the image group (the accumulated and summed result); then, using the accurate first comprehensive difference information, accurate image difference information can be obtained.
  • the screening of at least one frame of the second target image from the multiple frames of the first target image based on the image difference information corresponding to each image group includes:
  • At least one frame of the second target image is selected from the plurality of frames of the first target image.
  • The second comprehensive difference information in this embodiment can represent the maximum difference between the first target image of the first frame and the first target image of the last frame in a video segment (covering all image groups). Then, according to the image difference information corresponding to each image group, the second comprehensive difference information, and the time information corresponding to each image group, the second target images that can represent key features, that is, the key frames, can be selected more accurately from the multiple frames of the first target image.
  • Selecting at least one frame of the second target image from the multiple frames of the first target image includes:
  • At least one frame of the second target image is selected from the plurality of frames of the first target image.
  • Since the second comprehensive difference information can represent the maximum difference between the first frame and the last frame of the first target images in a video segment (covering all image groups), the first difference probability corresponding to each image group can be calculated accurately based on the second comprehensive difference information and the image difference information corresponding to each image group. The first difference probabilities corresponding to the image groups are then integrated according to the time information, for example by cumulative summation of the first difference probabilities; based on the integration result, the second target images whose difference change satisfies the preset difference condition, that is, the key frames, can be screened out from the multiple frames of the first target image.
  • Selecting at least one frame of the second target image from the multiple frames of the first target image includes:
  • each image group in the sequence of image groups is arranged from first to last according to the corresponding time information;
  • At least one subsequence is extracted from the image group sequence comprising the image groups; the sum of the first difference probabilities corresponding to the image groups in each subsequence is greater than or equal to the preset threshold; for each subsequence, a first target image in the last image group of the subsequence is used as the second target image.
  • A first target image in the last image group of the image group sequence can be used as an image that characterizes key features.
  • The sum of the first difference probabilities in the subsequence is greater than or equal to the preset threshold; that is, the difference between the first target image in the first image group and the first target image in the last image group of the subsequence satisfies the preset difference condition. Therefore, a first target image in the last image group is an image capable of representing key features, and it is used as a second target image; that is, the second target image is an image capable of representing key features.
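For illustration, the subsequence rule described above can be sketched in code. This is a minimal sketch, not the disclosure's implementation: the function names, the greedy reset of the accumulator after each emitted key frame, and the fallback to the last image group when the threshold is never reached are all assumptions.

```python
def select_keyframes(first_diff_probs, frames, threshold):
    """Greedy sketch of subsequence-based key-frame selection.

    first_diff_probs[i] is the first difference probability of image group i,
    ordered by time information; frames[i] is a first target image of group i.
    Whenever the running sum of probabilities reaches the preset threshold,
    the current subsequence ends and its last group's image is emitted as a
    second target image (key frame)."""
    keyframes = []
    acc = 0.0
    for prob, frame in zip(first_diff_probs, frames):
        acc += prob
        if acc >= threshold:          # subsequence probability sum >= preset threshold
            keyframes.append(frame)   # last image group's first target image
            acc = 0.0                 # start a new subsequence
    if not keyframes:                 # sum stayed below threshold for the whole sequence
        keyframes.append(frames[-1])  # use the last image group, as described above
    return keyframes
```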
  • Selecting at least one frame of the second target image from the multiple frames of the first target image includes:
  • the difference smoothing factor is greater than or equal to 0 and less than or equal to 1;
  • Based on the difference smoothing factor, the image difference information corresponding to each image group, the second comprehensive difference information, and the time information corresponding to each image group, at least one frame of the second target image is screened from the multiple frames of the first target image.
  • the difference smoothing factor is used to attenuate image difference information.
  • Attenuating the image difference information with the difference smoothing factor can reduce the impact of abnormal image difference values of the first target images on the second comprehensive difference information.
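The disclosure does not spell out the exact attenuation formula, so the following sketch is purely illustrative: it damps spikes in the per-group image difference values with an exponential moving average controlled by a smoothing factor gamma in [0, 1]. The function name and the moving-average form are assumptions.

```python
def attenuate_diffs(diffs, gamma):
    """Hypothetical attenuation of image difference values with a difference
    smoothing factor gamma (0 <= gamma <= 1). An isolated abnormal spike in
    `diffs` is damped before the values feed into the second comprehensive
    difference information."""
    assert 0.0 <= gamma <= 1.0
    smoothed = []
    prev = diffs[0]
    for d in diffs:
        # gamma = 0 passes values through unchanged; gamma = 1 freezes at the first value
        prev = gamma * prev + (1.0 - gamma) * d
        smoothed.append(prev)
    return smoothed
```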
  • the acquiring multiple frames of the first target image includes:
  • Multiple frames of first target images are acquired from the video segment to be processed according to a preset frequency.
  • First target images of different frame counts can be obtained from the video segment to be processed by adjusting the preset frequency. If the preset frequency is high, relatively many first target images are retained, and the time needed to obtain the second target images from the entire video segment to be processed can be reduced; that is, the second target images only need to be obtained from the multiple frames of the first target image, which improves the efficiency of image acquisition.
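A minimal sketch of sampling first target images at a preset frequency (the function and parameter names are illustrative, not from the disclosure):

```python
def sample_first_targets(video_frames, fps, preset_freq_hz):
    """Pick first target images from a to-be-processed video segment at a
    preset sampling frequency. A higher preset frequency keeps more first
    target images for the later key-frame screening step."""
    step = max(1, round(fps / preset_freq_hz))  # source frames per sampled frame
    return video_frames[::step]
```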
  • performing image processing based on at least one frame of the second target image to obtain an image processing result includes:
  • Image processing is performed on the sub-image to obtain an image processing result.
  • Since the sub-image includes the target object, cropping the sub-image from the second target image and performing processing on the sub-image does not affect the image processing result, while the image processing speed can be improved.
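The cropping step can be sketched as follows; the (left, top, right, bottom) box format is an assumption (e.g. the output of an object detector), and images are numpy-style arrays indexed [row, column]:

```python
import numpy as np

def crop_subimages(second_targets, boxes):
    """Crop, from each second target image, the sub-image containing the
    target object, based on previously determined position information."""
    subimages = []
    for image, (left, top, right, bottom) in zip(second_targets, boxes):
        subimages.append(image[top:bottom, left:right])  # keep only the object region
    return subimages
```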
  • the image processing result includes at least one of object behavior information and supervisory event result information.
  • accurate object behavior information and/or supervisory event result information can be identified by using the above-mentioned embodiment.
  • an embodiment of the present disclosure further provides an image processing device, including:
  • An image acquisition module configured to acquire multiple frames of the first target image, and according to the shooting time of each frame of the first target image, at least two frames of the first target image with the latest shooting time are taken as a group to obtain multiple image groups;
  • An information determination module configured to, for each image group, determine image difference information for reflecting the difference between the first target images in the image group
  • An image screening module configured to screen at least one frame of the second target image from the multiple frames of the first target image based on the image difference information corresponding to each image group.
  • the image processing device further includes an image processing module, configured to perform image processing based on at least one frame of the second target image to obtain an image processing result.
  • The information determination module is configured to take pixels with the same image position in multiple temporally adjacent frames of the first target image in the image group as a group to obtain multiple pixel groups, where a pixel group includes at least two pixel points; for each pixel group in the plurality of pixel groups, determine the pixel difference information between the pixel points in the pixel group; and, based on the pixel difference information corresponding to each pixel group in the plurality of pixel groups, determine the image difference information used to reflect the difference between the first target images in the image group.
  • The information determination module is configured to determine first comprehensive difference information based on the pixel difference information corresponding to each pixel group in the plurality of pixel groups, and to determine, based on the first comprehensive difference information, the image difference information used to reflect the difference between the first target images in the image group.
  • The image screening module is configured to determine second comprehensive difference information based on the image difference information corresponding to each image group, and to screen at least one frame of the second target image from the plurality of frames of the first target image based on the image difference information corresponding to each image group, the second comprehensive difference information, and the time information corresponding to each image group.
  • the image screening module is configured to respectively determine the first difference probability corresponding to each image group based on the image difference information corresponding to each image group and the second comprehensive difference information; The first difference probability corresponding to each image group and the time information corresponding to each image group select at least one frame of the second target image from the plurality of frames of the first target image.
  • The image screening module is configured to, when the sum of the first difference probabilities corresponding to the image groups in the image group sequence is less than a preset threshold, use, based on the time information corresponding to each image group, a first target image in the last image group of the image group sequence as the second target image, wherein the image groups in the image group sequence are arranged from first to last according to their corresponding time information; and to extract at least one subsequence from the image group sequence comprising the image groups, wherein the sum of the first difference probabilities corresponding to the image groups in each subsequence is greater than or equal to the preset threshold, and for each subsequence, a first target image in the last image group of the subsequence is used as the second target image.
  • The image screening module is configured to obtain a difference smoothing factor, the difference smoothing factor being greater than or equal to 0 and less than or equal to 1, and to select at least one frame of the second target image from the plurality of frames of the first target image based on the difference smoothing factor, the image difference information corresponding to each image group, the second comprehensive difference information, and the time information corresponding to each image group.
  • the image acquisition module is configured to acquire a video segment to be processed; and acquire multiple frames of the first target image from the video segment to be processed at a preset frequency.
  • The image processing module is configured to identify the target object in each frame of the second target image in the at least one frame of the second target image and determine the position information of the target object in each frame of the second target image; based on the determined position information, respectively crop sub-images containing the target object from each frame of the second target image; and perform image processing on the sub-images to obtain an image processing result.
  • the image processing result includes at least one of object behavior information and supervisory event result information.
  • An embodiment of the present disclosure further provides a computer device, including a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor; when the computer device runs, the processor communicates with the memory through the bus, and when the machine-readable instructions are executed by the processor, the steps of the image processing method in the above first aspect, or in any possible implementation of the first aspect, are executed.
  • Embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the image processing method in the above first aspect, or in any possible implementation of the first aspect, are executed.
  • FIG. 1 shows a flowchart of an image processing method provided by an embodiment of the present disclosure
  • Fig. 2 shows a schematic diagram of different first target images in an image group provided by an embodiment of the present disclosure
  • FIG. 3 shows a schematic diagram of cropping a second target image provided by an embodiment of the present disclosure
  • Fig. 4a shows a schematic diagram of the display effect of images screened based on the technical solutions provided by the embodiments of the present disclosure;
  • Fig. 4b shows a schematic diagram of the display effect of images screened out at equal intervals according to an embodiment of the present disclosure
  • FIG. 5 shows a flow chart of selecting a second target image provided by an embodiment of the present disclosure
  • FIG. 6 shows a schematic diagram of an image processing device provided by an embodiment of the present disclosure
  • FIG. 7 shows a schematic structural diagram of a computer device provided by an embodiment of the present disclosure.
  • The behavior recognition algorithm is subject to a time limit when detecting object behavior in images; that is, it is required to output the object behavior recognition result within a specified time, so the algorithm can only recognize a limited number of images within that time.
  • The traditional way of selecting input images is generally to select a limited number of image frames at equal intervals from the video as the input of the subsequent object behavior recognition process. However, this method easily misses the image frames that reflect the key behavior of the object in the video, which severely reduces the accuracy of recognizing the object's behavior and thus the accuracy of behavior detection.
  • The present disclosure provides an image processing method that can accurately obtain the difference between a group of first target images with the latest shooting times by using the image difference information reflecting the difference between the first target images in each image group. After that, image screening can be carried out according to the image difference information, and images whose degree of difference satisfies the preset difference condition are selected from the multiple frames of the first target image as second target images, that is, images that can represent key features, such as images of an object's abnormal behavior. Therefore, correspondingly reducing the number of second target images can improve the efficiency of subsequent image processing and further improve the accuracy of image processing.
  • RGB is an industry color standard in which colors are obtained by varying and superimposing the three color channels red (R), green (G), and blue (B); RGB refers to the colors of the red, green, and blue channels. This standard covers almost all colors that human vision can perceive and is one of the most widely used color systems. Every color on a computer screen is formed by mixing red, green, and blue in different proportions; a triplet of red, green, and blue is the smallest display unit, and any color on the screen can be recorded and represented by a set of RGB values.
  • YUV is a color encoding method used for true-color color spaces. "Y" stands for luminance (luma), that is, the grayscale value; "U" and "V" stand for chrominance (chroma), which describe the color and saturation of the image and specify the color of a pixel.
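As a small worked example of the "Y" component: the luma of an RGB pixel is a weighted sum of its channels. The BT.601 weights below are the common standard values; the disclosure itself only notes that pixel values may be RGB or YUV.

```python
def rgb_to_luma(r, g, b):
    """BT.601 luma: the grayscale ("Y") value of an RGB pixel."""
    return 0.299 * r + 0.587 * g + 0.114 * b
```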
  • an image processing method disclosed in the embodiment of the present disclosure is firstly introduced in detail.
  • the image processing method provided in the embodiment of the present disclosure is generally executed by a computer device with certain computing capabilities.
  • the image processing method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • FIG. 1 is a flowchart of an image processing method provided by an embodiment of the present disclosure, the method includes steps S101 to S103, wherein:
  • S101 Acquire multiple frames of the first target image, and according to the shooting time of each frame of the first target image, use at least two frames of the first target image with the latest shooting time as a group to obtain multiple image groups.
  • multiple frames of the first target image may be selected from a video segment to be processed. Multiple frames of the first target image form multiple image groups. The first target images in each image group have the same size and the same resolution.
  • the video segment to be processed may be a video segment with longer video content, such as a video segment of 3 to 6 s.
  • the screening manners for screening multiple frames of the first target image from a video segment to be processed may include multiple types, specifically:
  • Mode 1: Filter out multiple frames of the first target image from the video segment to be processed at equal time intervals.
  • the equal interval duration can be set according to the duration of the video segment to be processed, for example, when the video segment to be processed is 1s, the equal interval duration can be set to 32ms;
  • The equal interval duration can also be set according to the duration of the video segment to be processed and the subsequent image processing task. For example, when it is determined that the subsequent image processing task needs 8 input frames each time (i.e., the second target images described below), the equal interval can be 31.25 ms (32 frames from a 1 s segment); after that, the key-feature images (second target images) are selected from the 32 frames of the first target image to obtain an accurate image processing result.
  • different equal interval durations may be set according to different image processing tasks and different durations of video clips to be processed. Therefore, the embodiment of the present disclosure does not specifically limit the equal interval durations.
  • Mode 2: Consecutive frames of the first target image may be filtered out from the video segment to be processed.
  • Mode 3: Multiple frames of the first target image can be screened according to different application scenarios. Taking the scenario of identifying abnormal behavior of an object as an example, multiple frames of the first target image containing the object can be screened from the video segment to be processed.
  • a target detection algorithm (such as Object Detection) is used to identify image frames in the video segment to be processed, and multiple frames of first target images containing objects are screened out from the image frames.
  • Exemplarily, a target detection algorithm (such as Object Detection) is used to identify the image frames in the video segment to be processed and screen out, from those frames, multiple frames of object images that contain the object; multiple frames of the first target image are then screened out from the multiple frames of object images.
  • the first target images with the latest shooting time are two frames of first target images separated by an equal time interval.
  • One image group may include at least two frames of the first target image separated by at least one equal interval.
  • an image group may include two frames of the first target image separated by an equal time interval.
  • the multiple frames of the first target images are continuous frame images
  • at least two frames of the first target images with the latest shooting time are multiple adjacent frames of the first target images.
  • the specific number of frames of the first target image included in an image group may be set according to the processing capability of the algorithm during subsequent image processing, which is not specifically limited in the embodiments of the present disclosure.
  • S102 For each image group, determine image difference information for reflecting differences between first target images in the image group.
  • pixels can be used as the smallest unit to characterize an image.
  • The difference between images can be understood as the pixel values of corresponding pixel points differing between the images. Different pixel values reflect different colors and present different images; therefore, the image difference information between images can be calculated using the pixel difference information between pixels.
  • the image difference information can reflect the difference between the first target images in the image group.
  • Pixels with the same image position in multiple temporally adjacent frames of the first target image in the image group are taken as a group to obtain multiple pixel groups, where a pixel group includes at least two pixels; for each pixel group in the plurality of pixel groups, the pixel difference information between the pixel points in the pixel group is determined; based on the pixel difference information corresponding to each pixel group in the plurality of pixel groups, the image difference information reflecting the difference between the first target images in the image group is determined.
  • The first target images in an image group are arranged from front to back according to timing information, and the image groups are likewise arranged from front to back according to timing information; specifically, the timing information of the first target images in each image group can be referred to.
  • the multiple frames of temporally adjacent first target images in the image group may include two adjacent frames of the first target image, three adjacent frames of the first target image, or more than three adjacent frames of the first target image.
  • the pixel difference information may include a difference in pixel value between pixel points having the same image position.
  • the absolute value of the difference in pixel value between pixel points having the same image position may be included.
  • FIG. 2 is a schematic diagram of two adjacent frames of the first target image in an image group. Since the first target image A and the first target image B have the same size and the same resolution, by comparing the pixel difference information of pixels with the same image position (x, y), the difference between the first target images in the image group with the latest shooting times can be obtained.
  • The pixels with the same image position (1,3) can be pixel A1 and pixel A2 in FIG. 2, so pixels A1 and A2 form pixel group 1; the pixels with the same image position (3,3) can be pixels B1 and B2 in FIG. 2, so pixels B1 and B2 form pixel group 3; and so on, giving the nine pixel groups shown in FIG. 2.
  • The pixel difference information of the pixels at each same image position (x, y) can be compared separately to obtain the pixel difference information corresponding to each image position (x, y), such as the pixel difference information corresponding to the nine pixel groups in FIG. 2; the pixel difference information corresponding to the nine pixel groups can then be added together to obtain the image difference information between the first target image A and the first target image B.
  • The pixels at the same image positions (x, y) can also be compared across more frames. For example, first compare the pixel difference information of the pixels at the same image positions between the first target image A and the first target image B; following the above calculation, the image difference information S_AB between the first target image A and the first target image B can be obtained. After that, calculate in the same way the pixel difference information of the pixels at the same image positions between the first target image B and the first target image C, obtaining the image difference information S_BC between the first target image B and the first target image C. The average of S_AB and S_BC can also be calculated, with S_AB and S_BC being, respectively, the image difference information between the first target image A and the first target image B and between the first target image B and the first target image C in the image group.
  • the difference between the first target images in the image group can be finely distinguished, and more accurate image difference information can be obtained.
  • The first comprehensive difference information can also be determined based on the pixel difference information corresponding to each pixel group in the multiple pixel groups; based on the first comprehensive difference information, the image difference information between the images is determined.
  • the first comprehensive difference information may be the sum of pixel difference information corresponding to all pixel groups corresponding to different first target images in the image group.
  • In a case where the pixel difference information is the absolute value of the difference in pixel values between pixels with the same image position, the image difference information can be calculated using Formula 1:
  • S_t = Σ_{x=1}^{W} Σ_{y=1}^{H} |I(x, y, t) − I(x, y, t−1)|    (Formula 1)
  • S_t represents the image difference information between the first target image of frame t and the first target image of frame t−1;
  • I(x, y, t) represents the pixel value of the pixel at image position (x, y) in the first target image of frame t;
  • I(x, y, t−1) represents the pixel value of the pixel at image position (x, y) in the first target image of frame t−1;
  • the first target image of frame t is the next frame after the first target image of frame t−1;
  • t indexes the multiple frames of the first target image, of which there are T in total, where T is a positive integer;
  • H represents the height and W the width in the size of the first target image and the second target image.
  • the pixel value can be an RGB value or a YUV value.
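Formula 1 can be written directly as code. This sketch assumes grayscale numpy arrays of identical size; the cast to a wider integer type avoids uint8 wrap-around when subtracting.

```python
import numpy as np

def image_difference(frame_t_minus_1, frame_t):
    """S_t: sum over all image positions (x, y) of the absolute pixel-value
    difference between the first target images of frame t and frame t-1."""
    a = frame_t_minus_1.astype(np.int64)  # avoid uint8 wrap-around on subtraction
    b = frame_t.astype(np.int64)
    return int(np.abs(b - a).sum())
```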
  • The pixel difference information of the pixel groups is comprehensively processed, for example by cumulative summation, to obtain accurate first comprehensive difference information. The first comprehensive difference information can represent the maximum difference between the first target images in the image group (the accumulated and summed result); then, using the accurate first comprehensive difference information, accurate image difference information can be obtained.
  • S103: Based on the image difference information corresponding to each image group, screen at least one frame of the second target image from the multiple frames of the first target image.
  • The second comprehensive difference information can be determined based on the image difference information corresponding to each image group; then, based on the image difference information corresponding to each image group, the second comprehensive difference information, and the time information corresponding to each image group, at least one frame of the second target image is screened from the multiple frames of the first target image.
  • The second comprehensive difference information may be the sum of the image difference information corresponding to all image groups in the multiple frames of the first target image; that is, it can represent the difference from the first frame of the first target image in the first image group to the last frame of the first target image in the last image group. For example, it is the sum of the S_t calculated in S102, t ∈ {2, 3, ..., T}, namely Σ_{t=2}^{T} S_t. The time information corresponding to each image group can be defined according to the shooting time order of the last frame of the first target image in the image group.
  • The image group whose first target image was captured earlier is sorted first, and the image group whose first target image was captured later is sorted last. For example, the image group corresponding to S_2 is the second image group, the image group corresponding to S_3 is the third image group, ..., and the image group corresponding to S_T is the T-th image group.
  • the shooting time of the last frame of the first target image in the image group may also be used as the time information corresponding to the image group.
  • In this way, the second target images that can represent the key features can be selected more accurately from the multiple frames of the first target image.
  • Based on the image difference information corresponding to each image group and the second comprehensive difference information, the first difference probability corresponding to each image group can be determined respectively; then, based on the first difference probability corresponding to each image group and the time information corresponding to each image group, at least one frame of the second target image is selected from the multiple frames of the first target image.
  • the first difference probability can be calculated using Formula 2:
  • P_t = S_t / Σ_{i=2}^{T} S_i    (Formula 2)
  • That is, the first difference probability of each image group is its image difference information divided by the second comprehensive difference information: P_2 for the second image group, P_3 for the third image group, ..., P_T for the T-th image group. Afterwards, the first difference probabilities can be accumulated in the order given by the time information. Each time the first difference probability of an image group is accumulated, it is checked whether the accumulated sum is greater than or equal to the preset threshold. If the accumulated sum is less than the preset threshold, accumulation continues with the next image group in the order; once the accumulated sum is greater than or equal to the preset threshold, accumulation stops, the image group corresponding to the last accumulated first difference probability is determined, and one of the first target images in that image group is selected as a second target image. The accumulated sum is then reset to zero, and the first difference probabilities of the remaining image groups continue to be accumulated until the first difference probabilities of all image groups have been processed.
  • the preset threshold may be set according to an actual application scenario, which is not specifically limited in this embodiment of the present disclosure.
  • Each time the accumulated sum reaches the preset threshold, one first target image is selected as a second target image; for example, if this occurs three times, three frames of the second target image are obtained.
  • In an example with 16 image groups, screening of a second target image from the final stretch of the image group sequence composed of three image groups (the fourteenth, fifteenth, and sixteenth), whose accumulated first difference probability is less than the preset threshold, is abandoned; therefore, in this screening, two frames of the second target image are screened out from the 16 image groups.
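The accumulate-and-reset screening described above can be sketched as follows. The group indices and the threshold value are illustrative, not from the patent; the example data are shaped so that the last three groups carry little difference, mirroring the 16-group example:

```python
# Hypothetical sketch of S103: compute each group's first difference probability
# P_t = S_t / sum(S), walk the groups in time order accumulating P_t, and each
# time the running sum reaches the preset threshold, pick one first target image
# from the current group as a second target image and reset the sum.

def select_key_groups(diffs, threshold):
    total = sum(diffs.values())
    selected, acc = [], 0.0
    for group_id in sorted(diffs):          # groups ordered by time information
        acc += diffs[group_id] / total      # first difference probability P_t
        if acc >= threshold:
            selected.append(group_id)       # one frame from this group is kept
            acc = 0.0                       # reset, continue with the rest
    return selected

# 16 image groups (indices 2..16): groups 2..13 carry most of the difference,
# groups 14..16 very little, so the tail never reaches the threshold and yields
# no key frame, as in the example above.
diffs = {t: (10 if t <= 13 else 1) for t in range(2, 17)}
print(select_key_groups(diffs, threshold=0.45))  # → [7, 13]
```

With these numbers the two selections fall at the seventh and thirteenth image groups, matching the subsequence example discussed later in the text.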
  • The multiple image groups obtained in S101 can also be combined into an image group sequence, in which the image groups are arranged from first to last according to their corresponding time information.
  • the time information may include a time sequence corresponding to the first target image in the image group.
  • There may be multiple subsequences in the image group sequence, where the sum of the first difference probabilities corresponding to the image groups in each subsequence is greater than or equal to a preset threshold; alternatively, there may be no subsequence in the image group sequence, that is, the sum of the first difference probabilities corresponding to all image groups in the image group sequence is less than the preset threshold.
  • In the latter case, a first target image in the last image group of the image group sequence is used as the second target image.
  • A first target image can be arbitrarily selected from the last image group as the second target image, or the selection from the last image group can be specified according to task requirements, for example, specifying that the last frame of the first target image in the last image group is used as the second target image.
  • the task requirements may be set according to actual application scenarios, which are not specifically limited in the embodiments of the present disclosure.
  • each image group in the image group sequence is arranged from first to last according to the corresponding time information.
  • For example, the image group sequence can consist of the second image group Z_2, the third image group Z_3, ..., the T-th image group Z_T, that is, the image group sequence is Z_2 Z_3 ... Z_T. Any first target image in the image group Z_T may be used as the second target image.
  • Alternatively, in this case the second target image is no longer acquired from the image group sequence; thereafter, S101 continues to be executed to obtain multiple image groups forming a new image group sequence, and the relationship between the sum of the first difference probabilities of the image groups in the new image group sequence and the preset threshold is determined, so as to obtain the second target image.
  • At least one subsequence is extracted from the image group sequence including each image group; the sum of the first difference probabilities corresponding to the image groups in each subsequence is greater than or equal to a preset threshold; for each subsequence, a first target image in the last image group of the subsequence is used as the second target image.
  • the image group sequence is obtained by arranging each image group from first to last according to the time information corresponding to each image group.
  • A subsequence is a part of the image group sequence.
  • For example, the first difference probabilities of the image groups are accumulated in order, and two subsequences are extracted from the image group sequence: the first subsequence includes the second, third, fourth, fifth, sixth, and seventh image groups arranged in the order of the time information, that is, Z_2 Z_3 Z_4 Z_5 Z_6 Z_7; the second subsequence includes the eighth, ninth, tenth, eleventh, twelfth, and thirteenth image groups arranged in the order of the time information, that is, Z_8 Z_9 Z_10 Z_11 Z_12 Z_13. Since the cumulative sum of the first difference probabilities corresponding to the last three image groups is less than the preset threshold, the fourteenth, fifteenth, and sixteenth image groups do not belong to any subsequence.
  • The first target image of a specified frame may also be selected from the last image group of each subsequence as the second target image; for example, the last frame of the first target image may be selected from the seventh image group as a second target image, and the first frame of the first target image from the thirteenth image group as a second target image.
  • In some cases, the difference between two adjacent frames of the first target image among the screened multiple frames is very large; for example, a certain frame C captured by the camera is blurred or black while the other frames are intact, so the image difference information calculated between frame C and the previous or next frame is large even though the objects or events in the actual environment have not changed greatly.
  • A difference smoothing factor can be used to perform attenuation processing, which reduces the influence of such abnormal image difference information and makes the transition of the image difference information between different first target images in the same image group smoother.
  • The difference smoothing factor is greater than or equal to 0 and less than or equal to 1; then, based on the difference smoothing factor, the image difference information corresponding to each image group, the second comprehensive difference information, and the time information corresponding to each image group, at least one frame of the second target image is filtered from the multiple frames of the first target image.
  • The second difference probability corresponding to each image group can be determined respectively and calculated using Formula 3, in which the difference smoothing factor appears as a parameter whose value ranges from 0 to 1, including 0 and 1.
  • The difference smoothing factor attenuates the image difference information: the larger the difference smoothing factor, the easier it is to select the two adjacent frames of the first target image whose S_t changes greatly; when the difference smoothing factor is 0, the selection is equivalent to uniform selection.
  • The second difference probability of each image group is calculated: that for the second image group, that for the third image group, ..., that for the T-th image group. Afterwards, the second difference probabilities can be accumulated in the order given by the time information. Each time the second difference probability of an image group is accumulated, it is checked whether the accumulated sum is greater than or equal to the preset threshold. If the accumulated sum is less than the preset threshold, accumulation continues with the next image group in the order; once the accumulated sum is greater than or equal to the preset threshold, accumulation stops, the image group corresponding to the last accumulated second difference probability is determined, and a first target image in that image group is selected as a second target image. The accumulated sum is then reset to zero, and the second difference probabilities of the remaining image groups continue to be accumulated until the second difference probabilities of all image groups in the image group sequence have been processed.
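The exact form of Formula 3 is not reproduced in this text, so the sketch below uses an assumed power-attenuation form, P_t = S_t**a / Σ S_i**a with smoothing factor a ∈ [0, 1], chosen only because it matches the stated behavior: a = 0 gives uniform selection, and larger a favors frames with large S_t:

```python
# Assumed attenuation form of the second difference probability (Formula 3 is
# not shown in the source; this is one form consistent with its description).

def second_difference_probabilities(diffs, smoothing):
    powered = [s ** smoothing for s in diffs]   # attenuate large S_t when smoothing < 1
    total = sum(powered)
    return [p / total for p in powered]

diffs = [100.0, 1.0, 1.0, 1.0]   # one abnormally large difference (e.g. a black frame)
print(second_difference_probabilities(diffs, 0.0))  # uniform: all 0.25
print(second_difference_probabilities(diffs, 1.0))  # dominated by the large S_t
```

Intermediate smoothing values keep the abnormal frame from dominating the accumulation while still weighting it above the others, which is the smoothing effect the text describes.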
  • The second target image screened out in S103 is then processed. Since the second target image is an image capable of representing key features, performing image processing with the second target image can obtain accurate image processing results and improve image detection precision. In specific implementation, image processing is performed based on at least one frame of the second target image to obtain an image processing result.
  • image processing tasks correspond to different image processing results.
  • When the image processing task is recognizing object behavior in the image, the image processing result is object behavior information, such as horseplay in an elevator, fighting behavior, dangerous actions, smoking behavior in public places, etc.;
  • when the image processing task is detecting events in the image, the image processing result is supervisory event result information, such as the violation result corresponding to a violation event; uncivilized events correspond to, for example, littering and smoking in public places.
  • the position information may include the coordinate position of the target object on the second target image, such as the coordinates of the detection frame.
  • a target detection algorithm (such as target detection Object Detection) can be used to identify the target object in the second target image, and a detection frame can be marked for the target object.
  • The target object can be cropped out of the second target image according to the position of the marked detection frame by using an attention model (Attention Model), obtaining a sub-image containing the target object, and image processing is then performed on the sub-image.
  • Different image processing tasks have different processing procedures. Embodiments of the present disclosure do not limit the image processing task and therefore do not limit the image processing pipeline. Reference can be made to FIG. 3, which is a schematic diagram of cropping the second target image, where 31 represents multiple frames of the second target image, 311 represents one second target image, and 32 represents the sub-images cropped from the multiple frames of the second target image.
  • The sub-image contains the target object; since the sub-image is cropped from the second target image, processing the sub-image does not affect the image processing result and, at the same time, improves the image processing speed.
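The crop step can be sketched as follows, assuming the detection frame is given as corner coordinates; a toy 2-D list stands in for a real image array:

```python
# Illustrative sketch of the crop step: given the detection-frame coordinates
# determined for a target object, cut the sub-image out of the second target image.

def crop_sub_image(frame, box):
    """box = (x1, y1, x2, y2) detection-frame corners; x2 and y2 are exclusive."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in frame[y1:y2]]

frame = [[c + 10 * r for c in range(6)] for r in range(4)]  # 4 x 6 toy image
sub = crop_sub_image(frame, (1, 1, 4, 3))
print(sub)  # [[11, 12, 13], [21, 22, 23]]
```

Only the sub-image is passed to the downstream processing, which is why the crop reduces work without changing the result for the target object.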
  • FIG. 4a and FIG. 4b compare the images screened by the technical solutions provided in the embodiments of the present disclosure with images screened at equal intervals. FIG. 4a shows the second target images 41 obtained through the screening of this embodiment, which can characterize the key features, that is, the time-sequential actions of the target object 42 (such as a person) entering the car. FIG. 4b shows a sequence of frames (including multiple frames of images 43, such as the three images in the figure) extracted at equal intervals, from which the movement of the target object cannot be observed.
  • Since the second target image is an image capable of representing key features, accurate image processing results can still be obtained while correspondingly reducing the number of second target images to be recognized, and reducing that number improves image processing efficiency.
  • In addition, video clips with longer content can be processed, and accurate image processing results can be obtained even for such longer video clips.
  • the first comprehensive difference information may also be a sum of pixel difference information corresponding to pixel groups corresponding to partial image positions. For example, use the target detection algorithm to identify the object in the first target image, determine the object detection frame, and use the sum of pixel difference information corresponding to the pixel group corresponding to the image position of the part framed by the object detection frame as the first comprehensive difference information.
  • When screening the second target image, first determine the number of frames of the second target image that have been screened out. If this number is less than the required eight frames, for example only two frames of the second target image were screened out from the two subsequences in the above embodiment, then for the remaining six frames, six frames may be randomly selected as second target images from the remaining first target images, excluding the two frames of the second target image already screened out, so that eight frames of the second target image are obtained.
  • Alternatively, six frames may be selected at equal intervals from the remaining first target images as second target images to obtain eight frames of the second target image.
  • the last frame of the first target image in the image group sequence is directly copied into six copies as the second target image, that is, eight frames of the second target image are obtained.
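Two of the padding strategies just described (equal-interval fill and copying the last frame) can be sketched as below; the function name, frame labels, and target count of eight are illustrative:

```python
# Hypothetical sketch of padding the screened result up to a required frame
# count when too few second target images were screened out.

def pad_to_count(selected, remaining, needed, strategy="copy_last"):
    missing = needed - len(selected)
    if missing <= 0:
        return selected[:needed]
    if strategy == "equal_interval" and remaining:
        step = max(1, len(remaining) // missing)
        fill = remaining[::step][:missing]      # equal-interval picks from the rest
    else:
        fill = [selected[-1]] * missing         # copy the last frame repeatedly
    return selected + fill

# Two screened frames plus six copies of the last one gives eight frames.
print(pad_to_count(["f7", "f13"], [f"f{i}" for i in range(1, 31)], 8))
```

Random fill, the third strategy mentioned, would replace the equal-interval pick with `random.sample(remaining, missing)`.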
  • a video segment to be processed is obtained; multiple frames of the first target image are obtained from the video segment to be processed according to a preset frequency.
  • The preset frequency can be any value in the range of 16 fps to 32 fps, inclusive, and the video clip to be processed can be 1 s to 2 s long. Using the preset frequency, multiple frames of the first target image with relatively small time intervals can be screened out from the video clip to be processed. For example, with a preset frequency of 32 fps, 32 frames of the first target image with a time interval of 31.25 ms can be selected from a 1 s video clip to be processed.
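The sampling arithmetic in that example can be checked with a short sketch; the helper name is illustrative:

```python
# Sampling arithmetic from the text: a preset frequency of 32 fps over a 1 s
# clip yields 32 first target images spaced 1000 / 32 = 31.25 ms apart.

def sample_times_ms(clip_seconds, fps):
    interval = 1000.0 / fps
    return [i * interval for i in range(int(clip_seconds * fps))]

times = sample_times_ms(1, 32)
print(len(times), times[1] - times[0])  # 32 frames, 31.25 ms apart
```

A lower preset frequency such as 16 fps would halve the frame count and double the interval over the same clip.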
  • the preset frequency may be determined according to subsequent image processing tasks or empirical values, which is not specifically limited in this embodiment of the present disclosure.
  • First target images with different frame counts can also be screened from the video clip to be processed by adjusting the preset frequency. A high preset frequency ensures that relatively many first target images are obtained, and screening the second target image only from these first target images, rather than from the entire video clip to be processed, reduces screening time and improves the efficiency of image screening.
  • FIG. 5 is a flow chart for selecting a second target image.
  • multiple frames of first target images may be screened out from the video clips to be processed at equal intervals; the first target images are composed into a frame sequence according to the order of shooting time, and then S502 is executed.
  • In the process of executing S502, in the frame sequence, the image difference information between two adjacent frames of the first target image, that is, S_t, is determined according to the shooting time of each frame of the first target image; after that, the second difference probability between first target images is continuously calculated in the frame sequence according to the above Formula 3.
  • S503 is executed after S502, and specifically the following steps are executed in the process of executing S503: starting from the second difference probability between the first target image of the first frame and the first target image of the second frame, the second difference probabilities are accumulated in order; the frame sequence is segmented at the position of the first target image corresponding to the last accumulated second difference probability, and at least one subsequence is extracted from the segmented frame sequence.
  • After performing S503, S504 is performed. In the process of performing S504, the last frame of the first target image in each of the at least one subsequence may be used as the second target image, that is, an image with key features. Finally, S505 is executed after S504, that is, at least one second target image is input into the image processing model for processing.
  • The writing order of the steps does not imply a strict execution order or constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible inner logic.
  • The embodiments of the present disclosure also provide an image processing device corresponding to the image processing method. Since the problem-solving principle of the device in the embodiments of the present disclosure is similar to that of the above image processing method, the implementation of the device can refer to the implementation of the method, and repeated parts will not be described again.
  • the device includes: an image acquisition module 601, an information determination module 602, and an image screening module 603; wherein,
  • the image acquisition module 601 is configured to acquire multiple frames of the first target image, and according to the shooting time of each frame of the first target image, take at least two frames of the first target image with the latest shooting time as a group to obtain multiple image groups ;
  • An information determining module 602 configured to, for each image group, determine image difference information for reflecting the difference between the first target images in the image group;
  • An image screening module 603, configured to screen at least one frame of the second target image from the multiple frames of the first target image based on the image difference information corresponding to each image group.
  • the image processing device further includes an image processing module 604, configured to perform image processing based on at least one frame of the second target image, to obtain an image processing result.
  • the information determining module 602 is configured to use pixels with the same image position in multiple frames of the first target image that are time-sequentially adjacent in the image group as a group to obtain multiple pixel groups; the pixel group includes at least two pixel points; for each pixel group in the plurality of pixel groups, determine the pixel difference information between the pixel points in the pixel group; based on the plurality of pixel groups The pixel difference information corresponding to each pixel group in the pixel groups is determined to determine the image difference information used to reflect the difference between the first target images in the image group.
  • the information determination module 602 is configured to determine first comprehensive difference information based on the pixel difference information respectively corresponding to each pixel group in the plurality of pixel groups; The difference information is integrated to determine the image difference information for reflecting the difference between the first target images in the image group.
  • The image screening module 603 is configured to determine the second comprehensive difference information based on the image difference information corresponding to each image group; and to screen at least one frame of the second target image from the plurality of frames of the first target image based on the image difference information corresponding to each image group, the second comprehensive difference information, and the time information corresponding to each image group.
  • the image screening module 603 is configured to respectively determine the first difference probability corresponding to each image group based on the image difference information corresponding to each image group and the second comprehensive difference information; Based on the first difference probability corresponding to each image group and the time information corresponding to each image group, at least one frame of the second target image is selected from the plurality of frames of the first target image.
  • The image screening module 603 is configured to, when the sum of the first difference probabilities corresponding to the image groups in the image group sequence is less than a preset threshold, take a first target image in the last image group of the image group sequence as the second target image based on the time information corresponding to each image group; wherein the image groups in the image group sequence are arranged from first to last according to the corresponding time information;
  • Alternatively, at least one subsequence is extracted from the image group sequence including each image group; the sum of the first difference probabilities corresponding to the image groups in each subsequence is greater than or equal to the preset threshold; for each subsequence, a first target image in the last image group of the subsequence is used as the second target image.
  • the image screening module 603 is configured to obtain a difference smoothing factor; the difference smoothing factor is greater than or equal to 0 and less than or equal to 1; based on the difference smoothing factor, each image group Corresponding to the image difference information, the second comprehensive difference information, and the time information corresponding to each image group, at least one frame of the second target image is selected from the plurality of frames of the first target image.
  • the image acquiring module 601 is configured to acquire a video segment to be processed; acquire multiple frames of the first target image from the video segment to be processed at a preset frequency.
  • the image processing module 604 is configured to identify the target object in each frame of the second target image in the at least one frame of the second target image, and determine that the target object is The position information of the frame of the second target image; based on the determined position information, respectively cut out a sub-image containing the target object from each frame of the second target image; perform image processing on the sub-image to obtain an image process result.
  • the image processing result includes at least one of object behavior information and supervisory event result information.
  • the embodiment of the present application also provides a computer device.
  • As shown in FIG. 7, which is a schematic structural diagram of a computer device provided in an embodiment of the present application, the computer device includes:
  • The processor 71 executes the following steps: S101: acquire multiple frames of the first target image, and according to the shooting time of each frame of the first target image, take at least two frames of the first target image with the most recent shooting times as a group to obtain multiple image groups; S102: for each image group, determine the image difference information used to reflect the difference between the first target images in the image group; S103: based on the image difference information corresponding to each image group, screen at least one frame of the second target image from the multiple frames of the first target image.
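The grouping in S101 can be sketched as a sliding window over frames ordered by shooting time; the tuple representation and group size of two are illustrative, since the text only requires "at least two frames" per group:

```python
# Illustrative sketch of S101: order frames by shooting time and take each run
# of adjacent frames (here, pairs) as one image group.

def build_image_groups(frames, group_size=2):
    ordered = sorted(frames)                      # sort by shooting time
    return [ordered[i - group_size + 1 : i + 1]   # window of time-adjacent frames
            for i in range(group_size - 1, len(ordered))]

frames = [(3, "f3"), (1, "f1"), (2, "f2"), (4, "f4")]  # (shooting_time, label)
print(build_image_groups(frames))
# [[(1, 'f1'), (2, 'f2')], [(2, 'f2'), (3, 'f3')], [(3, 'f3'), (4, 'f4')]]
```

With pairs, T frames produce T − 1 image groups, matching the indexing of the difference values S_2 through S_T used in S102 and S103.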
  • The memory 72 comprises an internal memory 721 and an external memory 722. The internal memory 721, also called memory, is used for temporarily storing computing data of the processor 71 and data exchanged with the external memory 722 such as a hard disk; the processor 71 exchanges data with the external memory 722 through the internal memory 721.
  • the processor 71 communicates with the memory 72 through the bus 73, so that the processor 71 executes the execution instructions mentioned in the above method embodiments.
  • Embodiments of the present disclosure further provide a computer-readable storage medium, on which a computer program is stored, and when the computer program is run by a processor, the steps of the image processing method described in the foregoing method embodiments are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • An embodiment of the present disclosure further provides a computer program product, including computer instructions, and when the computer instructions are executed by a processor, the steps of the above-mentioned image processing method are implemented.
  • the computer program product may be any product capable of realizing the above-mentioned image processing method, and part or all of the solutions in the computer program product that contribute to the prior art may be implemented as a software product (such as a software development kit (Software Development Kit, SDK) ), the software product can be stored in a storage medium, and the computer instructions contained therein make relevant devices or processors execute some or all of the steps of the above-mentioned image processing method.
  • the disclosed devices and methods may be implemented in other ways.
  • The device embodiments described above are only illustrative; the division of the modules is only a logical function division, and in actual implementation there may be other division manners; multiple modules or components can be combined, and some features can be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some communication interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional module in each embodiment of the present disclosure may be integrated into one processing module, each module may exist separately physically, or two or more modules may be integrated into one module.
  • the functions are implemented in the form of software function modules and sold or used as independent products, they can be stored in a non-volatile computer-readable storage medium executable by a processor.
  • the technical solution of the present disclosure is essentially or the part that contributes to the prior art or the part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a storage medium, including Several instructions are used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in various embodiments of the present disclosure.
  • The aforementioned storage media include: USB flash disk, mobile hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), magnetic disk, optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present disclosure provides an image processing method and apparatus, a computer device, and a storage medium, wherein the method includes: acquiring multiple frames of a first target image, and according to the shooting time of each frame of the first target image, taking at least two frames of the first target image with the most recent shooting times as a group to obtain multiple image groups; for each image group, determining image difference information for reflecting the difference between the first target images in the image group; and screening at least one frame of a second target image from the multiple frames of the first target image based on the image difference information corresponding to each image group.

Description

Image processing method and apparatus, computer device, and storage medium
The present disclosure claims priority to Chinese patent application No. 202111272556.2, filed with the China Patent Office on October 29, 2021 and entitled "Image processing method and apparatus, computer device and storage medium", the entire contents of which are incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to the technical field of computer vision and image processing, and in particular to an image processing method and apparatus, a computer device, and a storage medium.
Background
Since a behavior recognition processing algorithm is subject to a limited time constraint when detecting object behavior in images, that is, it is required to output an object behavior recognition result within a specified time, the algorithm can rely on only a limited number of input images when recognizing images within that limited time. The traditional way of selecting input images is generally to select a limited number of image frames from a video at equal intervals as the input of the subsequent object behavior recognition process, but this approach easily misses the image frames in the video that reflect the key behavior of the object, which seriously affects the accuracy of the recognized object behavior and results in low behavior detection precision.
发明内容
本公开实施例至少提供一种图像处理方法、装置、计算机设备和存储介质。
第一方面,本公开实施例提供了一种图像处理方法,包括:
获取多帧第一目标图像,并按照每帧所述第一目标图像的拍摄时间,将拍摄时间最近的至少两帧第一目标图像作为一组,得到多个图像组;
针对每个图像组,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息;
基于每个图像组对应的图像差异信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
本方面,利用反映图像组中各第一目标图像之间差异的图像差异信息,能够准确地得到拍摄时间最近的一组第一目标图像之间的差异,之后,可以根据该图像差异信息,进行图像筛选,从多帧第一目标图像中筛选出差异程度满足预设差异条件的图像,作为第二目标图像,即能够表征关键特征的图像,比如对象异常行为的图像等,因此,相应减少第二目标图像的数量,能够提高后续图像处理效率,进而提高图像处理精度。
一种可选的实施方式中,还包括:
基于至少一帧所述第二目标图像,进行图像处理,得到图像处理结果。
该实施方式中,由于第二目标图像为能够表征关键特征的图像,因此利用第二目标图像进行图像处理,能够得到精准的图像处理结果,提高图像的检测精度。
一种可选的实施方式中,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息,包括:
将所述图像组中多帧时序相邻的第一目标图像中,具有相同的图像位置的像素点作为一组,得到多个像素组;所述像素组包括至少两个像素点;
针对所述多个像素组中的每个像素组,确定所述像素组中的各像素点之间的像素差异信息;
基于所述多个像素组中各像素组分别对应的所述像素差异信息,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息。
本实施方式,由于像素点构成图像,因此像素点可以作为表征图像的最小单元,通过计算像素组中的各像素点之间的像素差异信息,能够细致化的区分图像组中的各第一目标图像之间的差异,得到较为准确的图像差异信息。
一种可选的实施方式中,所述基于所述多个像素组中各像素组分别对应的所述像素差异信息,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息,包括:
基于所述多个像素组中各像素组分别对应的所述像素差异信息,确定第一综合差异信息;
基于所述第一综合差异信息,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息。
本实施方式,由于像素差异信息能够细致化的区分图像组中的各第一目标图像之间的差异,因此,将各像素组之间的像素差异信息进行综合处理,比如累加求和等,能够得到精确的第一综合差异信息,该第一综合差异信息能够表征图像组中各第一目标图像之间的最大差异(累加求和后的结果),之后,利用精确的第一综合差异信息,能够得到精确的图像差异信息。
一种可选的实施方式中,所述基于每个图像组对应的图像差异信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像,包括:
基于每个图像组对应的图像差异信息,确定第二综合差异信息;
基于每个图像组对应的图像差异信息、所述第二综合差异信息和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
本实施方式中的第二综合差异信息能够表征一段视频片段(包括全部图像组)中第一帧第一目标图像到最后一帧第一目标图像之间的最大差异,之后,根据每个图像组对应的图像差异信息、第二综合差异信息和每个图像组对应的时间信息,可以较为准确地从多帧第一目标图像中筛选出能够表征关键特征的第二目标图像,即关键帧。
一种可选的实施方式中,所述基于每个图像组对应的图像差异信息、所述第二综合差异信息和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像,包括:
基于每个图像组对应的图像差异信息和所述第二综合差异信息,分别确定每个图像组对应的第一差异概率;
基于每个图像组对应的所述第一差异概率和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
本实施方式,由于第二综合差异信息能够表征一段视频片段(包括全部图像组)中第一帧第一目标图像到最后一帧第一目标图像之间的最大差异,因此,利用第二综合差异信息和每个图像对应的图像差异信息,能够准确计算出每个图像组对应的第一差异概率,之后,根据时间信息整合每个图像组对应的第一差异概率,当整合结果(比如第一差异概率 累加求和)满足预设差异条件的情况下,可以从多帧第一目标图像中筛选出差异变化满足预设差异条件的第二目标图像,即关键帧。
一种可选的实施方式中,所述基于每个图像组对应的所述第一差异概率和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像,包括:
在所述图像组序列中的各个图像组对应的第一差异概率的和小于预设阈值的情况下,基于每个图像组对应的时间信息,将所述图像组序列中的最后一个图像组中的一个第一目标图像作为所述第二目标图像;其中,所述图像组序列中的各个图像组按照对应的时间信息从先到后排列;
基于每个图像组对应的所述第一差异概率和每个图像组对应的时间信息,从包括各个图像组的图像组序列中提取至少一个子序列;所述子序列中各个图像组对应的第一差异概率的和大于或等于所述预设阈值;针对每个子序列,将所述子序列中的最后一个图像组中的一个第一目标图像作为所述第二目标图像。
本实施方式中,在图像组序列中的各个图像组对应的第一差异概率的和小于预设阈值的情况下,可以将图像组序列中最后一个图像组中的一个第一目标图像,作为能够表征关键特征的图像。子序列中的第一差异概率的和大于或等于预设阈值,即子序列中第一图像组中的第一目标图像和最后一个图像组中的第一目标图像之间的差异满足预设差异条件,因此,最后一组图像组中的一个第一目标图像为能够表征关键特征的图像,并将其作为第二目标图像,即该第二目标图像为能够表征关键特征的图像。
一种可选的实施方式中,所述基于每个图像组对应的图像差异信息、所述第二综合差异信息和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像,包括:
获取差异平滑因子;所述差异平滑因子大于或等于0、且小于或等于1;
基于所述差异平滑因子、每个图像组对应的图像差异信息、所述第二综合差异信息和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
本实施方式中差异平滑因子用于衰减图像差异信息,在第一目标图像中存在图像差异信息异常的情况下,可以通过差异平滑因子相应的衰减图像差异信息,能够降低该第一目标图像的图像差异信息的异常结果对第二差异信息的影响。
一种可选的实施方式中,所述获取多帧第一目标图像,包括:
获取待处理视频片段;
按照预设频率从所述待处理视频片段中获取多帧第一目标图像。
该实施方式,可以通过调整预设频率从待处理视频片段中获取帧数不同的第一目标图像,如果预设频率为高频率的情况下,则既能够保证从待处理视频片段中获取到相对多的第一目标图像,又能够降低从整个待处理视频片段中获取第二目标图像的时间,即后续只需要从多帧第一目标图像中获取第二目标图像即可,提高了图像获取的效率。
一种可选的实施方式中,所述基于至少一帧所述第二目标图像,进行图像处理,得到图像处理结果,包括:
识别所述至少一帧所述第二目标图像中每帧第二目标图像中的目标对象,并确定所述目标对象在每帧第二目标图像的位置信息;
基于确定的所述位置信息,分别从每帧所述第二目标图像中裁剪出包含所述目标对象的子图像;
对所述子图像进行图像处理,得到图像处理结果。
本实施方式,由于子图像包含目标对象,因此从第二目标图像中裁剪子图像,并对子图像进行处理,不影响图像处理结果,同时能够提高图像处理速度。
一种可选的实施方式中,所述图像处理结果包括对象行为信息,监管事件结果信息中的至少一种。
本实施方式中,利用上述实施方式能够识别出准确的对象行为信息和/或监管事件结果信息。
第二方面,本公开实施例还提供一种图像处理装置,包括:
图像获取模块,用于获取多帧第一目标图像,并按照每帧所述第一目标图像的拍摄时间,将拍摄时间最近的至少两帧第一目标图像作为一组,得到多个图像组;
信息确定模块,用于针对每个图像组,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息;
图像筛选模块,用于基于每个图像组对应的图像差异信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
一种可选的实施方式中,所述图像处理装置还包括图像处理模块,用于基于至少一帧所述第二目标图像,进行图像处理,得到图像处理结果。
一种可选的实施方式中,所述信息确定模块,用于将所述图像组中多帧时序相邻的第一目标图像中,具有相同的图像位置的像素点作为一组,得到多个像素组;所述像素组包括至少两个像素点;针对所述多个像素组中的每个像素组,确定所述像素组中的各像素点之间的像素差异信息;基于所述多个像素组中各像素组分别对应的所述像素差异信息,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息。
一种可选的实施方式中,所述信息确定模块,用于基于所述多个像素组中各像素组分别对应的所述像素差异信息,确定第一综合差异信息;基于所述第一综合差异信息,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息。
一种可选的实施方式中,所述图像筛选模块,用于基于每个图像组对应的图像差异信息,确定第二综合差异信息;基于每个图像组对应的图像差异信息、所述第二综合差异信息和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
一种可选的实施方式中,所述图像筛选模块,用于基于每个图像组对应的图像差异信息和所述第二综合差异信息,分别确定每个图像组对应的第一差异概率;基于每个图像组对应的所述第一差异概率和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
一种可选的实施方式中,所述图像筛选模块,用于在图像组序列中的各个图像组对应的第一差异概率的和小于预设阈值的情况下,基于每个图像组对应的时间信息,将所述图像组序列中的最后一个图像组中的一个第一目标图像作为所述第二目标图像;其中,所述图像组序列中的各个图像组按照对应的时间信息从先到后排列;
基于每个图像组对应的所述第一差异概率和每个图像组对应的时间信息，从包括各个图像组的图像组序列中提取至少一个子序列；所述子序列中各个图像组对应的第一差异概率的和大于或等于所述预设阈值；针对每个子序列，将所述子序列中的最后一个图像组中的一个第一目标图像作为所述第二目标图像。
一种可选的实施方式中,所述图像筛选模块,用于获取差异平滑因子;所述差异平滑因子大于或等于0、且小于或等于1;基于所述差异平滑因子、每个图像组对应的图像差异信息、所述第二综合差异信息和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
一种可选的实施方式中,所述图像获取模块,用于获取待处理视频片段;按照预设频率从所述待处理视频片段中获取多帧第一目标图像。
一种可选的实施方式中,所述图像处理模块,用于识别所述至少一帧所述第二目标图像中每帧第二目标图像中的目标对象,并确定所述目标对象在每帧第二目标图像的位置信息;基于确定的所述位置信息,分别从每帧所述第二目标图像中裁剪出包含所述目标对象的子图像;对所述子图像进行图像处理,得到图像处理结果。
一种可选的实施方式中,所述图像处理结果包括对象行为信息,监管事件结果信息中的至少一种。
第三方面,本公开实施例还提供一种计算机设备,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,当计算机设备运行时,所述处理器与所述存储器之间通过总线通信,所述机器可读指令被所述处理器执行时执行上述第一方面,或第一方面中任一种可能的图像处理方法的步骤。
第四方面,本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述第一方面,或第一方面中任一种可能的图像处理方法的步骤。
关于上述图像处理装置、计算机设备和存储介质的效果描述参见上述图像处理方法的说明,这里不再赘述。
为使本公开的上述目的、特征和优点能更明显易懂,下文特举较佳实施例,并配合所附附图,作详细说明如下。
附图说明
为了更清楚地说明本公开实施例的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,此处的附图被并入说明书中并构成本说明书中的一部分,这些附图示出了符合本公开的实施例,并与说明书一起用于说明本公开的技术方案。应当理解,以下附图仅示出了本公开的某些实施例,因此不应被看作是对范围的限定,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他相关的附图。
图1示出了本公开实施例所提供的一种图像处理方法的流程图;
图2示出了本公开实施例所提供的一个图像组中不同第一目标图像的示意图;
图3示出了本公开实施例所提供的裁剪第二目标图像的展示示意图;
图4a示出了本公开实施例所提供的基于本公开实施例提供的技术方案筛选出的图像的展示效果示意图;
图4b示出了本公开实施例所提供的按等间隔时长筛选出的图像的展示效果示意图;
图5示出了本公开实施例所提供的选取第二目标图像的流程图;
图6示出了本公开实施例所提供的一种图像处理装置的示意图;
图7示出了本公开实施例所提供的一种计算机设备的结构示意图。
具体实施方式
为使本公开实施例的目的、技术方案和优点更加清楚,下面将结合本公开实施例中附图,对本公开实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。通常在此处附图中描述和示出的本公开实施例的组件可以以各种不同的配置来布置和设计。因此,以下对在附图中提供的本公开的实施例的详细描述并非旨在限制要求保护的本公开的范围,而是仅仅表示本公开的选定实施例。基于本公开的实施例,本领域技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例,都属于本公开保护的范围。
另外,本公开实施例中的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的数据在适当情况下可以互换,以便这里描述的实施例能够以除了在这里图示或描述的内容以外的顺序实施。
在本文中提及的“多个或者若干个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。字符“/”一般表示前后关联对象是一种“或”的关系。
经研究发现,由于行为识别处理算法在检测图像中对象行为时受有限的时间限制,即要求在规定的时间内输出对象行为识别结果,因此,导致该算法在有限时间内对图像进行识别仅能依赖有限数量的输入图像。传统选取输入图像的方式一般为从视频中等间隔的选取有限数量的图像帧作为后续对象行为识别流程的输入,但是这种方式容易遗漏视频中体现对象关键行为的图像帧,这样会严重影响识别到的对象行为的准确度,导致行为检测精度较低。
基于上述研究,本公开提供了一种图像处理方法,利用反映图像组中各第一目标图像之间差异的图像差异信息,能够准确地得到拍摄时间最近的一组第一目标图像之间的差异,之后,可以根据该图像差异信息,进行图像筛选,从多帧第一目标图像中筛选出差异程度满足预设差异条件的图像,作为第二目标图像,即能够表征关键特征的图像,比如对象异常行为的图像等,因此,相应减少第二目标图像的数量,能够提高后续图像处理效率,进而提高图像处理精度。
针对以上方案所存在的缺陷,均是发明人在经过实践并仔细研究后得出的结果,因此,上述问题的发现过程以及下文中本公开针对上述问题所提出的解决方案,都应该是发明人在本公开过程中对本公开做出的贡献。
应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步定义和解释。
针对本公开实施例中涉及到的特殊名词做详细介绍:
1、颜色系统RGB，是工业界的一种颜色标准，通过对红(R)、绿(G)、蓝(B)三个颜色通道的变化以及它们相互之间的叠加来得到各式各样的颜色。RGB即代表红、绿、蓝三个通道的颜色，这个标准几乎包括了人类视力所能感知的所有颜色，是运用最广的颜色系统之一。电脑屏幕上的所有颜色，都由红、绿、蓝三种色光按照不同的比例混合而成；一组红、绿、蓝数值就是一个最小的显示单位，屏幕上的任何一个颜色都可以由一组RGB值来记录和表达。
2、YUV，是一种颜色编码方法，用于编码true-color颜色空间（color space）。“Y”表示明亮度（Luminance或Luma），也就是灰阶值；“U”和“V”表示色度（Chrominance或Chroma），作用是描述影像色彩及饱和度，用于指定像素的颜色。
为便于对本实施例进行理解,首先对本公开实施例所公开的一种图像处理方法进行详细介绍,本公开实施例所提供的图像处理方法的执行主体一般为具有一定计算能力的计算机设备。在一些可能的实现方式中,该图像处理方法可以通过处理器调用存储器中存储的计算机可读指令的方式来实现。
下面以执行主体为计算机设备为例对本公开实施例提供的图像处理方法加以说明。
参见图1所示,为本公开实施例提供的图像处理方法的流程图,所述方法包括步骤S101至S103,其中:
S101:获取多帧第一目标图像,并按照每帧第一目标图像的拍摄时间,将拍摄时间最近的至少两帧第一目标图像作为一组,得到多个图像组。
本步骤中,可以是从一段待处理视频片段中筛选多帧第一目标图像。多帧第一目标图像组成多个图像组。每个图像组中的第一目标图像的尺寸大小相同,分辨率相同。
这里,待处理视频片段可以为具有较长视频内容的视频片段,比如3至6s的视频片段。
从一段待处理视频片段中筛选多帧第一目标图像的筛选方式可以包括多种,具体的:
方式1、按照等间隔时长,从一段待处理视频片段中筛选出多帧第一目标图像。
其中,等间隔时长可以按照待处理视频片段的时长进行设定,例如,在待处理视频片段为1s的情况下,等间隔时长可以设置为32ms;
或者,等间隔时长也可以按照待处理视频片段的时长和后续图像处理任务进行设定,例如,在确定后续图像处理任务需要每次输入8帧图像(即下述的第二目标图像)的情况下,为了为图像处理任务提供关键特征的图像(第二目标图像),因此,需要优先从待处理视频片段中筛选出帧数大于8帧的多帧第一目标图像,例如,从1s待处理视频片段中筛选出32帧第一目标图像,则等间隔时长可以为31.25ms,之后,再从32帧第一目标图像中筛选出关键特征的图像(第二目标图像),可以得到精准的图像处理结果。
这里,根据不同的图像处理任务以及不同的待处理视频片段的时长可以设置不同的等间隔时长,因此,针对等间隔时长本公开实施例不进行具体限定。
方式2、可以从待处理视频片段中筛选出连续帧第一目标图像。
例如,从1s待处理视频片段中任意抽取连续帧第一目标图像,或者,按照后续图像处理任务需要,确定筛选第一目标图像的帧数,按照该帧数从待处理视频片段中筛选连续帧第一目标图像。
方式3、可以按照不同应用场景筛选多帧第一目标图像，以识别对象异常行为的场景为例，可以从待处理视频片段中筛选包含对象的多帧第一目标图像等。
示例性的,利用目标检测算法(比如目标检测Object Detection)识别在待处理视频片段中的图像帧,从图像帧中筛选出包含对象的多帧第一目标图像。
示例性的,利用目标检测算法(比如目标检测Object Detection)识别在待处理视频片段中的图像帧,从图像帧中筛选出包含对象的多帧对象图像;之后,按照等间隔时长,还可以从多帧对象图像中筛选出多帧第一目标图像。
在一些实施例中,如果按照等间隔时长筛选出第一目标图像,则拍摄时间最近的第一目标图像为间隔一个等间隔时长的两帧第一目标图像。一个图像组中可以包括间隔至少一个等间隔时长的至少两帧第一目标图像。示例性的,一个图像组可以包括间隔一个等间隔时长的两帧第一目标图像。
在一些实施例中,在多帧第一目标图像为连续帧图像的情况下,拍摄时间最近的至少两帧第一目标图像为相邻多帧第一目标图像。
一个图像组中具体包括几帧第一目标图像可以按照后续进行图像处理时,算法的处理能力进行设定,本公开实施例不进行具体限定。
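作为对S101中“按拍摄时间将时序相邻的第一目标图像分组”这一步骤的示意，下面给出一个最小化的Python草图。其中函数名build_image_groups、以(时间戳, 图像)元组表示帧的方式均为本示例的假设，并非本公开限定的实现：

```python
def build_image_groups(frames, group_size=2):
    """frames: (拍摄时间戳, 图像) 元组的列表；返回图像组列表。

    按拍摄时间从先到后排序后，以步长为1的滑动窗口将相邻
    group_size 帧作为一个图像组，即拍摄时间最近的若干帧为一组。
    """
    frames = sorted(frames, key=lambda f: f[0])  # 按拍摄时间排列
    return [tuple(frames[i:i + group_size])
            for i in range(len(frames) - group_size + 1)]
```

例如，三帧乱序输入经排序后会得到两个相邻两帧构成的图像组。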
S102:针对每个图像组,确定用于反映图像组中的各第一目标图像之间差异的图像差异信息。
由于像素点构成图像,因此像素点可以作为表征图像的最小单元。而图像之间的不同,则可以理解为图像之间的像素点对应的像素值不同。不同像素值反映不同颜色,呈现不同的图像。因此,可以利用像素点之间的像素差异信息计算图像之间的图像差异信息。图像差异信息能够反映图像组中各第一目标图像之间的差异。
具体实施时，针对每个图像组，将图像组中多帧时序相邻的第一目标图像中，具有相同的图像位置的像素点作为一组，得到多个像素组；像素组包括至少两个像素点；针对多个像素组中的每个像素组，确定像素组中的各像素点之间的像素差异信息；基于多个像素组中各像素组分别对应的像素差异信息，确定用于反映图像组中的各第一目标图像之间差异的图像差异信息。
这里,图像组中的第一目标图像按照时序信息从前到后排列;各个图像组之间也按照时序信息从前到后排列,具体的,可以参照图像组中的第一目标图像的时序信息。图像组中多帧时序相邻的第一目标图像可以包括相邻两帧第一目标图像、相邻三帧第一目标图像、或者相邻三帧以上的第一目标图像等。
这里,像素差异信息可以包括具有相同的图像位置的像素点之间的像素值的差值。或者,可以包括具有相同的图像位置的像素点之间的像素值的差值的绝对值。
参见图2所示,其为一个图像组中相邻两帧第一目标图像的示意图。由于第一目标图像A和第一目标图像B的尺寸大小相同,分辨率相同,因此,比较具有相同的图像位置(x,y)的像素点的像素差异信息,能够得到拍摄时间最近的一组图像组中不同第一目标图像之间的差异。示例性的,具有相同的图像位置(1,3)的像素点可以为图2中的像素点A1和像素点A2,则像素点A1和像素点A2组成一个像素组①;具有相同的图像位置(3,3)的像素点可以为图2中的像素点B1和像素点B2,则像素点B1和像素点B2组成一个像素组③,……,图2中共展示有九个像素组。
示例性的,针对图像组中的两帧第一目标图像,即第一目标图像A和第一目标图像B,可以分别比较相同图像位置(x,y)的像素点的像素差异信息,分别得到每个图像位置(x,y)对应的像素差异信息,如图2中的九个像素组对应的像素差异信息,将九个像素组对应的像素差异信息相加,可以得到第一目标图像A和第一目标图像B之间的图像差异信息。
示例性的，针对图像组中的相邻三帧第一目标图像，即第一目标图像A、第一目标图像B和第一目标图像C，可以分别比较相同图像位置(x,y)的像素点的像素差异信息，比如，首先比较第一目标图像A和第一目标图像B之间相同图像位置的像素点的像素差异信息，如上述计算过程，能够得到第一目标图像A和第一目标图像B之间的图像差异信息$S_{AB}$；之后，再以同样的计算方式计算第一目标图像B和第一目标图像C之间相同图像位置的像素点的像素差异信息，能够得到第一目标图像B和第一目标图像C之间的图像差异信息$S_{BC}$。或者，也可以计算$S_{AB}$和$S_{BC}$的平均值$\bar{S}=(S_{AB}+S_{BC})/2$，将该平均值$\bar{S}$分别作为该图像组中第一目标图像A和第一目标图像B之间的图像差异信息，以及第一目标图像B和第一目标图像C之间的图像差异信息。
上述通过计算像素组中的各像素点之间的像素差异信息,能够细致化的区分图像组中的各第一目标图像之间的差异,得到较为准确的图像差异信息。
在一些实施例中,还可以基于多个像素组中的各像素组对应的像素差异信息,确定第一综合差异信息;基于第一综合差异信息,确定用于反映图像组中的各第一目标图像之间的图像差异信息。
这里,第一综合差异信息可以为图像组中不同第一目标图像对应的全部像素组对应的像素差异信息之和。
在像素差异信息为具有相同的图像位置的像素点之间的像素值的差值的绝对值的情况下，图像差异信息可以利用公式一计算得到：

$$S_t=\sum_{x=1}^{W}\sum_{y=1}^{H}\left|I(x,y,t)-I(x,y,t-1)\right| \tag{公式一}$$

其中，$S_t$表示第t帧第一目标图像和第t-1帧第一目标图像之间的图像差异信息；$I(x,y,t)$表示第t帧第一目标图像中图像位置(x,y)处对应的像素点的像素值，$I(x,y,t-1)$表示第t-1帧第一目标图像中图像位置(x,y)处对应的像素点的像素值，这里，第t帧第一目标图像为第t-1帧第一目标图像的后一帧第一目标图像；$|I(x,y,t)-I(x,y,t-1)|$表示图像位置(x,y)处对应的像素点之间的像素差异信息。t表示第一目标图像的帧序号，多帧第一目标图像一共T帧，其中T为正整数；H表示第一目标图像和第二目标图像的尺寸中的高度信息，W表示第一目标图像和第二目标图像的尺寸中的宽度信息。
这里，像素值可以为RGB值，或者是YUV值。
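公式一的逐像素差分累加可以用如下Python草图示意。假设输入为两帧同尺寸的单通道（例如灰度或Y通道）像素值二维列表；实际实现中像素值也可以取RGB或YUV的各通道分别计算：

```python
def frame_difference(img_t, img_prev):
    """计算公式一：S_t = Σ_x Σ_y |I(x,y,t) - I(x,y,t-1)|。

    img_t / img_prev 为同尺寸、同分辨率的二维像素值列表，
    逐图像位置求像素差异信息（差值的绝对值）后累加求和。
    """
    return sum(abs(a - b)
               for row_t, row_p in zip(img_t, img_prev)
               for a, b in zip(row_t, row_p))
```

求和结果即为该图像组对应的图像差异信息$S_t$，两帧完全相同时结果为0。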
上述由于像素差异信息能够细致化的区分图像组中的各第一目标图像之间的差异,因此,将各像素组之间的像素差异信息进行综合处理,比如累加求和等,能够得到精确的第一综合差异信息,该第一综合差异信息能够表征图像组中各第一目标图像之间的最大差异(累加求和后的结果),之后,利用精确的第一综合差异信息,能够得到精确的图像差异信息。
S103:基于每个图像组对应的图像差异信息,从多帧第一目标图像中筛选至少一帧第二目标图像。
具体实施时,可以基于每个图像组对应的图像差异信息,确定第二综合差异信息;基于每个图像组对应的图像差异信息、第二综合差异信息和每个图像组对应的时间信息,从多帧第一目标图像中筛选至少一帧第二目标图像。
这里，第二综合差异信息可以为多帧第一目标图像中的全部图像组对应的图像差异信息之和，即能够表征全部图像组所包括的第一个图像组中的第一帧第一目标图像到最后一个图像组中的最后一帧第一目标图像之间的差异。比如，S102中计算出的$S_t$之和（$t\in\{2,3,\dots,T\}$），即$\sum_{t=2}^{T}S_t$。
每个图像组对应的时间信息可以按照图像组中最后一帧第一目标图像的拍摄时间顺序进行定义,先拍摄到的第一目标图像所在的图像组排序在先,后拍摄到的第一目标图像所在的图像组排序在后。或者,可以按照得到图像差异信息的时间顺序进行定义,例如,S 2对应的图像组为第二个图像组,S 3对应的图像组为第三个图像组,……,S T对应的图像组为第T个图像组。又或者,也可以将图像组中最后一帧第一目标图像的拍摄时间作为图像组对应的时间信息。
上述,可以根据每个图像组对应的图像差异信息、第二综合差异信息和每个图像组对应的时间信息,可以较为准确地从多帧第一目标图像中筛选出能够表征关键特征的第二目标图像,即关键帧。
在一些实施例中,可以基于每个图像组对应的图像差异信息和第二综合差异信息,分别确定每个图像组对应的第一差异概率;基于每个图像组对应的第一差异概率和每个图像组对应的时间信息,从多帧第一目标图像中筛选至少一帧第二目标图像。
这里，第一差异概率可以利用公式二计算得到：

$$p_t=\frac{S_t}{\sum_{j=2}^{T}S_j} \tag{公式二}$$

其中，$p_t$表示第t个图像组对应的第一差异概率，这里，没有第1个图像组，图像组计数从第2个开始。

示例性的，计算出每个图像组对应的第一差异概率，即第二个图像组对应的第一差异概率$p_2$，第三个图像组对应的第一差异概率$p_3$，……，第T个图像组对应的第一差异概率$p_T$。
之后,可以按照时间信息依次累加每个图像组对应的第一差异概率,每累加一个图像组对应的第一差异概率,检测是否累加之和大于或等于预设阈值,在累加之和小于预设阈值的情况下,继续累加排序之后的图像组对应的第一差异概率,直到累加之和大于或等于预设阈值,停止累加,并确定最后一次累加第一差异概率对应的图像组,选取该图像组中的一个第一目标图像,作为第二目标图像。之后,第一差异概率归零,继续累加之后剩余的图像组对应的第一差异概率,直到该多帧第一目标图像中的全部图像组对应的第一差异概率累加完毕为止。
这里,预设阈值可以按照实际应用场景进行设定,本公开实施例不进行具体限定。
示例性的，已知预设阈值为0.8，T=16。按照时间信息从第2个图像组开始累加第一差异概率，当累加到第4个图像组时，第一差异概率累加求和后为0.8，等于预设阈值，则可以从第4个图像组中任意选取一帧第一目标图像作为第二目标图像。之后，继续累加，当累加到第9个图像组时，第一差异概率累加求和后为1，大于预设阈值，则可以从第9个图像组中任意选取一帧第一目标图像作为第二目标图像。之后，继续累加剩余图像组对应的第一差异概率，累加求和后为0.3，小于预设阈值，由于此时多帧第一目标图像中只有16个图像组，在一些实施例中，可以从第16个图像组中任意选取一帧第一目标图像作为第二目标图像，进而能够得到三帧第二目标图像。在另一些实施例中，放弃从第一差异概率累加之和小于预设阈值的三个图像组（即第十四个、第十五个和第十六个图像组）所组成的图像组序列中筛选第二目标图像，因此本次筛选，从16个图像组中筛选得到两帧第二目标图像。
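上述“按时间顺序累加第一差异概率、累加和达到阈值即选帧并归零”的过程可以用如下Python草图示意。函数名select_key_indices为本示例的假设；返回被选中图像组在概率序列中的下标：

```python
def select_key_indices(probs, threshold):
    """按时间顺序累加每个图像组的第一差异概率。

    每当累加和大于或等于预设阈值，记录当次累加对应的图像组下标
    （从该图像组中可任选一帧第一目标图像作为第二目标图像），
    然后将累加和归零，继续累加剩余图像组对应的第一差异概率。
    """
    selected, acc = [], 0.0
    for i, p in enumerate(probs):
        acc += p
        if acc >= threshold:
            selected.append(i)
            acc = 0.0  # 归零后继续累加
    return selected
```

尾部累加和始终小于阈值的图像组不会被选中，对应“放弃从剩余图像组中筛选”的实施例。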
在一些实施例中,还可以将S101中得到的多个图像组组成图像组序列,多个图像组在图像组序列中的排列方式可以为,各个图像组按照对应的时间信息从先到后排列。其中,时间信息可以包括图像组中第一目标图像对应的时序。图像组序列中可能存在多个子序列,其中,子序列中各个图像组对应的第一差异概率的和大于或等于预设阈值;或者,图像组序列不存在一个子序列,也即,图像组序列中各个图像组对应的第一差异概率的和小于预设阈值。
一种情况下,在图像组序列中的各个图像组对应的第一差异概率的和小于预设阈值的情况下,基于每个图像组对应的时间信息,将图像组序列中的最后一个图像组中的一个第一目标图像作为第二目标图像。
这里,可以是从最后一个图像组中任意选取一个第一目标图像作为第二目标图像,或者,也可以是根据任务要求,指定从最后一个图像中选取一个第一目标图像,比如指定选取最后一个图像组中的最后一帧第一目标图像作为第二目标图像等。其中,任务要求可以根据实际应用场景设定,本公开实施例不进行具体限定。
示例性的，在图像组对应的时间信息为按照得到图像差异信息的时间顺序进行定义的情况下，图像组序列中的各个图像组按照对应的时间信息从先到后排列，具体的，可以为第二个图像组$Z_2$，第三个图像组$Z_3$，……，第T个图像组$Z_T$，即图像组序列为$Z_2 Z_3\cdots Z_T$。可以将图像组$Z_T$中的任意一个第一目标图像作为第二目标图像。
或者,在一些实施例中,在确定图像组序列中的各个图像组对应的第一差异概率的和小于预设阈值的情况下,不再从该图像组序列中获取第二目标图像;之后,继续执行S101,得到多个图像组,组成新的图像组序列,在判断新的图像组序列中各个图像组对应的第一差异概率的和与预设阈值的大小关系,从而获取第二目标图像。
另一种情况下,基于每个图像组对应的第一差异概率和每个图像组对应的时间信息,从包括各个图像组的图像组序列中提取至少一个子序列;子序列中各个图像组对应的第一差异概率的和大于或等于预设阈值;针对每个子序列,将子序列中的最后一个图像组中的一个第一目标图像作为第二目标图像。
这里,图像组序列为各个图像组按照每个图像组对应的时间信息从先到后顺序排列得到的。子序列可以为图像组序列的一部分。
延续上例，按照时间信息依次累加图像组对应的第一差异概率，从图像组序列中提取出两个子序列。其中，第一个子序列由第二个、第三个、第四个、第五个、第六个和第七个图像组按照时间信息顺序排列组成，即$Z_2 Z_3 Z_4 Z_5 Z_6 Z_7$；第二个子序列由第八个、第九个、第十个、第十一个、第十二个和第十三个图像组按照时间信息顺序排列组成，即$Z_8 Z_9 Z_{10} Z_{11} Z_{12} Z_{13}$。这里，由于多帧第一目标图像对应的最后三个图像组对应的第一差异概率的累加之和小于预设阈值，因此，第十四个、第十五个和第十六个图像组不属于子序列部分。之后，从每个子序列中的最后一个图像组中选择任意一帧第一目标图像作为第二目标图像，即从第七个图像组中选择任意一帧第一目标图像和从第十三个图像组中选择任意一帧第一目标图像作为第二目标图像。或者，也可以从每个子序列中的最后一个图像组中选择指定帧第一目标图像作为第二目标图像，比如从第七个图像组中选择最后一帧第一目标图像作为第二目标图像，从第十三个图像组中选择第一帧第一目标图像作为第二目标图像。
在一些实施例中，由于外在因素的影响，比如拍摄画面被遮挡或设备故障等，导致在筛选出的多帧第一目标图像中的相邻两帧第一目标图像之间的差异很大，比如摄像头拍摄到的某一帧第一目标图像C模糊或黑屏，其他帧第一目标图像完好，则针对第一目标图像C与其前一帧或后一帧第一目标图像计算得到的图像差异信息较大，但是实际环境中对象或事件并未发生较大改变。因此，针对该第一目标图像C，可以利用差异平滑因子进行衰减处理，能够降低该第一目标图像的图像差异信息的异常结果对第二差异信息的影响，使得同一图像组中不同第一目标图像之间的图像差异信息过渡较为平滑。
具体实施时,首先,获取差异平滑因子;差异平滑因子大于或等于0、且小于或等于1;之后,基于差异平滑因子、每个图像组对应的图像差异信息、第二综合差异信息和每个图像组对应的时间信息,从多帧第一目标图像中筛选至少一帧第二目标图像。
示例性的，可以基于差异平滑因子、每个图像组对应的图像差异信息和第二综合差异信息，分别确定每个图像组对应的第二差异概率，可以利用公式三计算得到：

$$p'_t=\mu\cdot\frac{S_t}{\sum_{j=2}^{T}S_j}+(1-\mu)\cdot\frac{1}{T-1} \tag{公式三}$$

其中，$p'_t$表示第t个图像组对应的第二差异概率，这里，没有第1个图像组，图像组计数从第2个开始。$\mu$表示差异平滑因子，其取值范围为0至1，包括0和1。差异平滑因子可以用于衰减图像差异信息：差异平滑因子越大，越容易选取到图像差异信息对应的差分值（即$S_t$）变化较大的相邻两帧第一目标图像；差异平滑因子为0时，等同于均匀选取。

示例性的，计算出每个图像组对应的第二差异概率，即第二个图像组对应的第二差异概率$p'_2$，第三个图像组对应的第二差异概率$p'_3$，……，第T个图像组对应的第二差异概率$p'_T$。
之后,可以按照时间信息依次累加每个图像组对应的第二差异概率,每累加一个图像组对应的第二差异概率,检测是否累加之和大于或等于预设阈值,在累加之和小于预设阈值的情况下,继续累加排序之后的图像组对应的第二差异概率,直到累加之和大于或等于预设阈值,停止累加,并确定最后一次累加第二差异概率对应的图像组,任意选取该图像组中的第一目标图像,作为第二目标图像。之后,第二差异概率归零,继续累加之后剩余的图像组对应的第二差异概率,直到该多帧第一目标图像中的全部图像组对应的第二差异概率累加完毕为止。
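引入差异平滑因子后计算第二差异概率的过程可以用如下草图示意。需要说明的是，这里采用的具体形式$p'_t=\mu\cdot S_t/\sum S+(1-\mu)/(T-1)$是根据“$\mu=0$时等同于均匀选取、$\mu$越大越偏向差异大的图像组”等描述作出的假设性重构，并非本公开限定的公式：

```python
def smoothed_probs(diffs, mu):
    """按假设性重构的公式三计算各图像组的第二差异概率。

    diffs: 各图像组的图像差异信息 S_t 列表（共 T-1 个图像组）。
    mu: 差异平滑因子，取值范围 [0, 1]。
    mu=0 时退化为均匀选取（每组概率相同）；mu 越大，
    差异越大的图像组获得的概率越高。
    """
    total = sum(diffs)
    n = len(diffs)  # 即 T-1 个图像组
    return [mu * s / total + (1 - mu) / n for s in diffs]
```

无论$\mu$取何值，各第二差异概率之和恒为1，可直接接入上述按阈值累加选帧的流程。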
之后,对S103中筛选出的第二目标图像进行处理,由于第二目标图像为能够表征关键特征的图像,因此利用第二目标图像进行图像处理,能够得到精准的图像处理结果,提高图像的检测精度。具体实施时,基于至少一帧第二目标图像,进行图像处理,得到图像处理结果。
这里，不同图像处理任务对应不同的图像处理结果。示例性的，在图像处理任务为检测图像中对象的异常行为的任务的情况下，图像处理结果为对象行为信息，比如电梯间打闹行为、打架斗殴行为、危险动作行为、公共场所吸烟行为等；在图像处理任务为检测图像中所发生的事件的任务的情况下，图像处理结果为监管事件结果信息，比如违章事件对应的违章结果，不文明事件对应的乱扔垃圾、公共场所吸烟等。
在一些实施例中,识别至少一帧第二目标图像中每帧第二目标图像中的目标对象,并确定目标对象在每帧第二目标图像的位置信息;基于确定的位置信息,分别从每帧第二目标图像中裁剪出包含目标对象的子图像;对子图像进行图像处理,得到图像处理结果。其中,位置信息可以包括目标对象在第二目标图像上的坐标位置,比如检测框坐标。
示例性的,可以利用目标检测算法(比如目标检测Object Detection)识别出在第二目标图像中的目标对象,并为目标对象标记检测框,之后,可以按照检测框标记的位置,通过注意力模型Attention Model将目标对象从第二目标图像中裁剪出来,得到包含目标对象的子图像,之后,对该子图像进行图像处理,不同的图像处理任务,其处理流程不同,本公开实施例不限定图像处理任务,因此,也不限定图像处理流程。可以参见图3所示,其为裁剪第二目标图像的展示示意图,其中,31表示多帧第二目标图像,其中311表示第二目标图像,32表示从多帧第二目标图像中裁剪出来的目标对象的多个子图像,其中321表示子图像。
这里,由于子图像包含目标对象,因此从第二目标图像中裁剪子图像,并对子图像进行处理,不影响图像处理结果,同时能够提高图像处理速度。
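按检测框从第二目标图像中裁剪包含目标对象的子图像这一步骤可以用如下Python草图示意。假设上游目标检测算法输出$(x_1, y_1, x_2, y_2)$形式的检测框坐标、图像以行优先的二维像素列表表示，这些均为本示例的假设：

```python
def crop_target(image, box):
    """基于目标对象的位置信息从第二目标图像中裁剪子图像。

    image: 行优先的二维像素列表（image[y][x] 为位置 (x, y) 的像素）。
    box: 目标检测算法输出的检测框坐标 (x1, y1, x2, y2)，
         左上角闭、右下角开。
    返回仅包含目标对象区域的子图像。
    """
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]
```

后续图像处理只需作用于尺寸更小的子图像，因而能够在不影响图像处理结果的前提下提高处理速度。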
上述利用图像组中的不同第一目标图像之间的图像差异信息，能够准确地得到拍摄时间最近的一组第一目标图像之间的差异，之后，可以根据该图像差异信息进行图像筛选，从多帧第一目标图像中筛选出差异程度满足预设差异条件的图像，作为第二目标图像，即能够表征关键特征的图像，比如对象异常行为的图像等；之后，针对第二目标图像进行图像处理，能够得到精准的图像处理结果，提高图像的检测精度。示例性的，图4a和图4b提供了基于本公开实施例提供的技术方案筛选出的图像与按等间隔时长筛选出的图像的对比展示效果。其中，图4a为按照本实施方式筛选得到的第二目标图像41，能够表征关键特征，即目标对象42（如人物）进入车内的时序动作。图4b为按照等间隔时长提取到的帧序列（包括多帧图像43，如图中的三帧图像），无法从该帧序列中观察到目标对象的动作。
另外,由于第二目标图像是能够表征关键特征的图像,因此,相应减少针对第二目标图像的识别,依然能够得到准确的图像处理结果,而减少第二目标图像识别,能够提高图像处理效率。
另外,利用上述实施方式,能够处理具有较长视频内容的视频片段,针对一些具有较长视频内容的视频片段,能够检测得到准确的图像处理结果。
针对S102,在一些实施例中,针对第一综合差异信息,还可以为部分图像位置对应的像素组对应的像素差异信息之和。例如,利用目标检测算法识别第一目标图像中的对象,并确定对象检测框,将该对象检测框框出部分的图像位置对应的像素组对应的像素差异信息之和作为第一综合差异信息。
这里，确定对象检测框，可以确定该对象检测框的坐标以及框出部分的坐标，进而确定框出部分的像素点坐标集合P，之后，根据公式一确定第一综合差异信息，此时，求和仅针对满足$(x,y)\in P$的图像位置进行。
针对S103,在一些实施例中,还可以按照图像处理任务中一次处理第二目标图像的数量,确定从图像组序列中筛选第二目标图像,示例性的,在确定图像处理任务中一次处理八帧第二目标图像的情况下,首先确定筛选出的第二目标图像的帧数,如果帧数少于八帧,即上述实施例中从两个子序列中筛选出两帧第二目标图像,则其余六帧可以从多帧第一目标图像中除了已经筛选出的两帧第二目标图像剩余的第一目标图像随机抽取六帧作为第二目标图像,即得到八帧第二目标图像。或者,按照等间隔时长从剩余的第一目标图像随机抽取六帧作为第二目标图像,得到八帧第二目标图像。或者,直接将图像组序列中最后一帧第一目标图像复制为六份作为第二目标图像,即得到八帧第二目标图像。
针对S101,在一些实施例中,获取待处理视频片段;按照预设频率从待处理视频片段中获取多帧第一目标图像。
这里，预设频率可以为16fps至32fps范围中的任意值（包括16fps和32fps），待处理视频片段的时长可以为1s至2s。利用预设频率能够从待处理视频片段中筛选出较小时间间隔的多帧第一目标图像。例如，可以利用预设频率32fps从1s待处理视频片段中筛选出时间间隔为31.25ms的32帧第一目标图像。
这里,预设频率可以根据后续图像处理任务或者经验值进行确定,本公开实施例不进行具体限定。
另外,还可以通过调整预设频率从待处理视频片段中筛选帧数不同的第一目标图像,如果预设频率为高频率的情况下,则既能够保证从待处理视频片段中筛选出相对多的第一目标图像,又能够降低从整个待处理视频片段中筛选第二目标图像的时间,即后续只需要从多帧第一目标图像中筛选第二目标图像即可,提高了图像筛选的效率。
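按预设频率从待处理视频片段中获取多帧第一目标图像的过程可以用如下草图示意。函数名sample_frames及各参数均为本示例的假设；返回所取帧在原视频中的帧下标，例如1s片段、32fps的预设频率即对应31.25ms的等间隔时长：

```python
def sample_frames(duration_s, video_fps, preset_fps):
    """按预设频率从视频片段中等间隔取帧。

    duration_s: 待处理视频片段的时长（秒）。
    video_fps: 原视频的帧率。
    preset_fps: 预设频率（每秒取多少帧第一目标图像）。
    返回所取帧在原视频中的帧下标列表。
    """
    n = int(duration_s * preset_fps)   # 需要获取的第一目标图像帧数
    step = video_fps / preset_fps      # 原视频中相邻取样帧的间隔
    return [round(i * step) for i in range(n)]
```

提高预设频率可以增加候选的第一目标图像帧数，后续再从中筛选第二目标图像即可。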
在一种可能的实施方式中，针对上述图像处理方法，下面给出一个较为完整的实施例。参见图5所示，其为选取第二目标图像的流程图。首先执行S501，在执行S501的过程中，可以等间隔地从待处理视频片段中筛选出多帧第一目标图像，将该第一目标图像按照拍摄时间顺序组成帧序列，之后，执行S502。在执行S502的过程中，在帧序列中按照每帧第一目标图像的拍摄时间，确定相邻两帧第一目标图像之间的图像差异信息，即$S_t$；之后，在帧序列中不断计算第一目标图像之间的差异，可以按照上述公式三进行计算。
在执行完S502后执行S503，在执行S503的过程中具体执行以下步骤：从第一帧第一目标图像和第二帧第一目标图像之间的第二差异概率$p'_2$开始，不断累加后续两帧第一目标图像之间的第二差异概率（比如$p'_3,p'_4,\dots$），在累加之和大于或等于预设阈值的情况下，可以确定第一帧第一目标图像与最后累加的第二差异概率对应的第一目标图像之间存在较大差异，因此，可以将帧序列从最后累加的第二差异概率对应的第一目标图像的位置进行分割处理，从分割后的帧序列中提取至少一个子序列。
在执行完S503后执行S504,在执行S504的过程中,可以将至少一个子序列中的最后一帧第一目标图像作为第二目标图像,即具有关键特征的图像。最后在执行完S504后执行S505,即将至少一个第二目标图像输入到图像处理模型进行处理。
本领域技术人员可以理解,在具体实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤的具体执行顺序应当以其功能和可能的内在逻辑确定。
基于同一发明构思,本公开实施例中还提供了与图像处理方法对应的图像处理装置,由于本公开实施例中的装置解决问题的原理与本公开实施例上述图像处理方法相似,因此装置的实施可以参见方法的实施,重复之处不再赘述。
参照图6所示,为本公开实施例提供的一种图像处理装置的示意图,所述装置包括:图像获取模块601、信息确定模块602和图像筛选模块603;其中,
图像获取模块601,用于获取多帧第一目标图像,并按照每帧所述第一目标图像的拍摄时间,将拍摄时间最近的至少两帧第一目标图像作为一组,得到多个图像组;
信息确定模块602,用于针对每个图像组,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息;
图像筛选模块603,用于基于每个图像组对应的图像差异信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
一种可选的实施方式中,所述图像处理装置还包括图像处理模块604,用于基于至少一帧所述第二目标图像,进行图像处理,得到图像处理结果。
一种可选的实施方式中,所述信息确定模块602,用于将所述图像组中多帧时序相邻的第一目标图像中,具有相同的图像位置的像素点作为一组,得到多个像素组;所述像素组包括至少两个像素点;针对所述多个像素组中的每个像素组,确定所述像素组中的各像素点之间的像素差异信息;基于所述多个像素组中各像素组分别对应的所述像素差异信息,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息。
一种可选的实施方式中,所述信息确定模块602,用于基于所述多个像素组中各像素组分别对应的所述像素差异信息,确定第一综合差异信息;基于所述第一综合差异信息,确定用于反映所述图像组中各第一目标图像之间差异的图像差异信息。
一种可选的实施方式中,所述图像筛选模块603,用于基于每个图像组对应的图像差异信息,确定第二综合差异信息;基于每个图像组对应的图像差异信息、所述第二综合差异信息和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
一种可选的实施方式中,所述图像筛选模块603,用于基于每个图像组对应的图像差异信息和所述第二综合差异信息,分别确定每个图像组对应的第一差异概率;基于每个图像组对应的所述第一差异概率和每个图像组对应的时间信息,从所述多帧第一目标图像中筛选至少一帧第二目标图像。
一种可选的实施方式中,所述图像筛选模块603,用于在图像组序列中的各个图像组对应的第一差异概率的和小于预设阈值的情况下,基于每个图像组对应的时间信息,将所述图像组序列中的最后一个图像组中的一个第一目标图像作为所述第二目标图像;其中,所述图像组序列中的各个图像组按照对应的时间信息从先到后排列;
基于每个图像组对应的所述第一差异概率和每个图像组对应的时间信息,从包括各个图像组的图像组序列中提取至少一个子序列;所述子序列中各个图像组对应的第一差异概率的和大于或等于所述预设阈值;针对每个子序列,将所述子序列中的最后一个图像组中的一个第一目标图像作为所述第二目标图像。
一种可选的实施方式中，所述图像筛选模块603，用于获取差异平滑因子；所述差异平滑因子大于或等于0、且小于或等于1；基于所述差异平滑因子、每个图像组对应的图像差异信息、所述第二综合差异信息和每个图像组对应的时间信息，从所述多帧第一目标图像中筛选至少一帧第二目标图像。
一种可选的实施方式中,所述图像获取模块601,用于获取待处理视频片段;按照预设频率从所述待处理视频片段中获取多帧第一目标图像。
一种可选的实施方式中,所述图像处理模块604,用于识别所述至少一帧所述第二目标图像中每帧第二目标图像中的目标对象,并确定所述目标对象在每帧第二目标图像的位置信息;基于确定的所述位置信息,分别从每帧所述第二目标图像中裁剪出包含所述目标对象的子图像;对所述子图像进行图像处理,得到图像处理结果。
一种可选的实施方式中,所述图像处理结果包括对象行为信息,监管事件结果信息中的至少一种。
关于图像处理装置中的各模块的处理流程、以及各模块之间的交互流程的描述可以参照上述图像处理方法实施例中的相关说明,这里不再详述。
基于同一技术构思,本申请实施例还提供了一种计算机设备。参照图7所示,为本申请实施例提供的计算机设备的结构示意图,包括:
处理器71、存储器72和总线73。其中,存储器72存储有处理器71可执行的机器可读指令,处理器71用于执行存储器72中存储的机器可读指令,所述机器可读指令被处理器71执行时,处理器71执行下述步骤:S101:获取多帧第一目标图像,并按照每帧第一目标图像的拍摄时间,将拍摄时间最近的至少两帧第一目标图像作为一组,得到多个图像组;S102:针对每个图像组,确定用于反映图像组中各第一目标图像之间差异的图像差异信息;S103:基于每个图像组对应的图像差异信息,从多帧第一目标图像中筛选至少一帧第二目标图像。
上述存储器72包括内存721和外部存储器722;这里的内存721也称内存储器,用于暂时存放处理器71中的运算数据,以及与硬盘等外部存储器722交换的数据,处理器71通过内存721与外部存储器722进行数据交换,当计算机设备运行时,处理器71与存储器72之间通过总线73通信,使得处理器71在执行上述方法实施例中所提及的执行指令。
本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述方法实施例中所述的图像处理方法的步骤。其中,该存储介质可以是易失性或非易失的计算机可读取存储介质。
本公开实施例还提供一种计算机程序产品,包括计算机指令,所述计算机指令被处理器执行时实现上述的图像处理方法的步骤。其中,计算机程序产品可以是任何能实现上述图像处理方法的产品,该计算机程序产品中对现有技术做出贡献的部分或全部方案可以以软件产品(例如软件开发包(Software Development Kit,SDK))的形式体现,该软件产品可以被存储在一个存储介质中,通过包含的计算机指令使得相关设备或处理器执行上述图像处理方法的部分或全部步骤。
所属领域的技术人员可以清楚地了解到，为描述的方便和简洁，上述描述的装置的具体工作过程，可以参考前述方法实施例中的对应过程，在此不再赘述。在本公开所提供的几个实施例中，应该理解到，所揭露的装置和方法，可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的，例如，所述模块的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，又例如，多个模块或组件可以结合，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些通信接口，装置或模块的间接耦合或通信连接，可以是电性，机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本公开各个实施例中的各功能模块可以集成在一个处理模块中,也可以是各个模块单独物理存在,也可以两个或两个以上模块集成在一个模块中。
所述功能如果以软件功能模块的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可执行的非易失的计算机可读取存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本公开各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
最后应说明的是:以上所述实施例,仅为本公开的具体实施方式,用以说明本公开的技术方案,而非对其限制,本公开的保护范围并不局限于此,尽管参照前述实施例对本公开进行了详细的说明,本领域的普通技术人员应当理解:任何熟悉本技术领域的技术人员在本公开揭露的技术范围内,其依然可以对前述实施例所记载的技术方案进行修改或可轻易想到变化,或者对其中部分技术特征进行等同替换;而这些修改、变化或者替换,并不使相应技术方案的本质脱离本公开实施例技术方案的精神和范围,都应涵盖在本公开的保护范围之内。因此,本公开的保护范围应所述以权利要求的保护范围为准。

Claims (13)

  1. 一种活体检测方法,其特征在于,包括:
    响应活体检测请求,获取与目标动作对应的多帧待检测图像,其中,所述目标动作为进行活体检测时指示用户做出的动作;
    基于所述各帧待检测图像中与所述目标动作匹配的特征点位置信息,确定各帧待检测图像分别对应的用于表示所述目标动作完成情况的检测值;
    基于与所述目标动作匹配的检测方案对所述检测值进行检测,得到活体检测结果。
  2. 根据权利要求1所述的方法,其特征在于,所述基于与所述目标动作匹配的检测方案对所述检测值进行检测,得到活体检测结果,包括:
    基于所述目标动作对应的检测阈值和与所述目标动作匹配的检测方案,对所述多帧待检测图像的检测值进行检测,得到活体检测结果。
  3. 根据权利要求2所述的方法,其特征在于,在所述目标动作为点头或摇头的情况下,所述检测值包括头部偏移角度;所述检测阈值包括正向偏移阈值、负向偏移阈值、图像帧数阈值;
    所述基于所述目标动作对应的检测阈值和与所述目标动作匹配的检测方案,对所述多帧待检测图像的检测值进行检测,得到活体检测结果,包括:
    确定头部偏移角度大于所述正向偏移阈值的第一目标检测图像,以及小于所述负向偏移阈值的第二目标检测图像;
    在所述第一目标检测图像的数量和所述第二目标检测图像的数量超过图像帧数阈值的情况下,确定活体检测通过。
  4. 根据权利要求1至3任一所述的方法,其特征在于,在所述目标动作为张闭嘴的情况下,与所述目标动作匹配的特征点包括嘴部特征点;
    所述基于所述各帧待检测图像中与所述目标动作匹配的特征点位置信息,确定各帧待检测图像分别对应的用于表示所述目标动作完成情况的检测值,包括:
    基于嘴部特征点位置信息,确定表征嘴部中央位置处张开幅度的第一嘴部距离和表征嘴角位置处张开幅度的第二嘴部距离;
    基于所述第一嘴部距离和第二嘴部距离,确定所述检测值。
  5. 根据权利要求2或3所述的方法,其特征在于,所述检测阈值包括张嘴阈值、闭嘴阈值、张嘴帧数阈值;
    所述基于所述目标动作对应的检测阈值和与所述目标动作匹配的检测方案,对所述多帧待检测图像的检测值进行检测,得到活体检测结果,包括:
    确定检测值为所述闭嘴阈值的多帧第一待检测图像;
    确定所述多帧第一待检测图像中,每两个相邻的第一待检测图像之间的第二待检测图像;
    在检测到所述第二待检测图像满足第一预设条件的情况下,确定通过活体检测。
  6. 根据权利要求5所述的方法,其特征在于,所述第一预设条件包括:
    所述第二待检测图像中,对应的检测值为所述张嘴阈值的第三待检测图像的数量为第一预设值;
    所述多个第三待检测图像中,相邻两个第三待检测图像之间的第二待检测图像的数量大于所述张嘴帧数阈值。
  7. 根据权利要求5或6所述的方法,其特征在于,所述第一预设条件还包括:
    相邻两个第三待检测图像之间的第二待检测图像中,任意两个第二待检测图像的检测值之间的差值小于第二预设值。
  8. 根据权利要求5至7任一所述的方法,其特征在于,所述确定所述多帧第一待检测图像中,每两个相邻的第一待检测图像之间的第二待检测图像,包括:
    确定所述多帧第一待检测图像中,满足第二预设条件的两个相邻的第一待检测图像之间的第二待检测图像;
    其中,所述第二预设条件包括:
    相邻的第一待检测图像之间的待检测图像的数量大于所述张嘴帧数阈值;相邻的第一待检测图像之间的待检测图像的检测值最值满足所述张嘴阈值对应的筛选条件。
  9. 根据权利要求1至8任一所述的方法,其特征在于,在所述目标动作为睁闭眼的情况下,所述基于所述各帧待检测图像中与所述目标动作匹配的特征点位置信息,确定各帧待检测图像分别对应的用于表示所述目标动作完成情况的检测值,包括:
    针对多帧待检测图像中的每帧待检测图像,基于所述待检测图像的特征点位置信息,对所述待检测图像进行矫正处理;
    将矫正处理后的所述待检测图像输入至预先训练好的神经网络,确定所述待检测图像对应的检测值。
  10. 根据权利要求2、3、5至8任一所述的方法,其特征在于,所述检测值包括用于描述眼部遮挡情况的第一检测值,以及用于描述睁闭眼完成情况的第二检测值;
    所述检测阈值包括睁眼阈值、闭眼阈值、睁眼帧数阈值、眼部遮挡阈值;
    所述基于所述目标动作对应的检测阈值和与所述目标动作匹配的检测方案,对所述多帧待检测图像的检测值进行检测,得到活体检测结果,包括:
    基于所述睁眼阈值、闭眼阈值、以及所述多帧待检测图像的第二检测值,确定满足第三预设条件的第四待检测图像;
    确定对应的第一检测值小于所述眼部遮挡阈值的第四待检测图像的目标数量;
    在所述目标数量超过所述睁眼帧数阈值的情况下,确定通过活体检测。
  11. 一种活体检测装置,其特征在于,包括:
    获取模块,用于响应活体检测请求,获取与目标动作对应的多帧待检测图像,其中,所述目标动作为进行活体检测时指示用户做出的动作;
    确定模块,用于基于所述各帧待检测图像中与所述目标动作匹配的特征点位置信息,确定各帧待检测图像分别对应的用于表示所述目标动作完成情况的检测值;
    检测模块,用于基于与所述目标动作匹配的检测方案对所述检测值进行检测,得到活体检测结果。
  12. 一种计算机设备，其特征在于，包括：处理器、存储器和总线，所述存储器存储有所述处理器可执行的机器可读指令，当计算机设备运行时，所述处理器与所述存储器之间通过总线通信，所述机器可读指令被所述处理器执行时执行如权利要求1至10任一所述的活体检测方法的步骤。
  13. 一种计算机可读存储介质,其特征在于,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行如权利要求1至10任一项所述的活体检测方法的步骤。
PCT/CN2022/096442 2021-10-29 2022-08-01 一种图像处理方法、装置、计算机设备和存储介质 WO2023071189A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111272556.2 2021-10-29
CN202111272556.2A CN113989531A (zh) 2021-10-29 2021-10-29 一种图像处理方法、装置、计算机设备和存储介质

Publications (1)

Publication Number Publication Date
WO2023071189A1 true WO2023071189A1 (zh) 2023-05-04

Family

ID=79744433

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096442 WO2023071189A1 (zh) 2021-10-29 2022-08-01 一种图像处理方法、装置、计算机设备和存储介质

Country Status (2)

Country Link
CN (1) CN113989531A (zh)
WO (1) WO2023071189A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989531A (zh) * 2021-10-29 2022-01-28 北京市商汤科技开发有限公司 一种图像处理方法、装置、计算机设备和存储介质
CN115082400A (zh) * 2022-06-21 2022-09-20 北京字跳网络技术有限公司 一种图像处理方法、装置、计算机设备及可读存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200265592A1 (en) * 2019-02-18 2020-08-20 Raytheon Company Three-frame difference target acquisition and tracking using overlapping target images
CN112333467A (zh) * 2020-11-27 2021-02-05 中国船舶工业系统工程研究院 一种用于检测视频的关键帧的方法、系统和介质
CN112749603A (zh) * 2019-10-31 2021-05-04 上海商汤智能科技有限公司 活体检测方法、装置、电子设备及存储介质
CN113111770A (zh) * 2021-04-12 2021-07-13 杭州赛鲁班网络科技有限公司 一种视频处理方法、装置、终端及存储介质
CN113989531A (zh) * 2021-10-29 2022-01-28 北京市商汤科技开发有限公司 一种图像处理方法、装置、计算机设备和存储介质


Also Published As

Publication number Publication date
CN113989531A (zh) 2022-01-28

Similar Documents

Publication Publication Date Title
WO2023071189A1 (zh) 一种图像处理方法、装置、计算机设备和存储介质
TWI607409B (zh) 影像優化方法以及使用此方法的裝置
JP4373840B2 (ja) 動物体追跡方法、動物体追跡プログラムおよびその記録媒体、ならびに、動物体追跡装置
US11836903B2 (en) Subject recognition method, electronic device, and computer readable storage medium
US7450778B2 (en) Artifact reduction in a digital video
JPH0944670A (ja) 特定画像領域抽出方法及び特定画像領域抽出装置
CN110866486B (zh) 主体检测方法和装置、电子设备、计算机可读存储介质
JP2016134803A (ja) 画像処理装置及び画像処理方法
CN111368819B (zh) 光斑检测方法和装置
WO2023273111A1 (zh) 一种图像处理方法、装置、计算机设备和存储介质
WO2017170084A1 (ja) 動線表示システム、動線表示方法およびプログラム記録媒体
CN108574803B (zh) 图像的选取方法、装置、存储介质及电子设备
JP2017229061A (ja) 画像処理装置およびその制御方法、ならびに撮像装置
CN107148237B (zh) 信息处理装置、信息处理方法和程序
CN107346417B (zh) 人脸检测方法及装置
CN111743524A (zh) 一种信息处理方法、终端和计算机可读存储介质
WO2021078276A1 (zh) 连拍照片获取方法、智能终端及存储介质
US8891833B2 (en) Image processing apparatus and image processing method
JP4565273B2 (ja) 被写体追跡装置、およびカメラ
JP2002269545A (ja) 顔画像処理方法及び顔画像処理装置
JP2018018500A (ja) 顔識別方法
CN104112266B (zh) 一种图像边缘虚化的检测方法和装置
WO2018159037A1 (ja) 顔検出装置およびその制御方法、並びにプログラム
JP2005309740A (ja) 動物体追跡方法、動物体追跡プログラムおよびその記録媒体、ならびに、動物体追跡装置
JPWO2018155269A1 (ja) 画像処理装置および方法、並びにプログラム

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22885099

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE