WO2022048578A1 - Image content detection method, apparatus, electronic device and readable storage medium - Google Patents

Image content detection method, apparatus, electronic device and readable storage medium

Info

Publication number
WO2022048578A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
score
sub
region
Prior art date
Application number
PCT/CN2021/116099
Other languages
English (en)
French (fr)
Inventor
马欣
祝夭龙
Original Assignee
北京灵汐科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京灵汐科技有限公司
Publication of WO2022048578A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • The present invention relates to the technical field of image processing, and in particular to an image content detection method, apparatus, electronic device and readable storage medium.
  • Target detection is a common image processing technology.
  • Currently, target detection is performed on every image frame and over the entire area of each frame, which makes the power consumption (computation amount) of target detection relatively large.
  • Embodiments of the present invention provide an image content detection method, apparatus, electronic device and readable storage medium, so as to solve the problem that the power consumption (computation amount) of target detection is relatively large.
  • An embodiment of the present invention provides an image content detection method, which includes: dividing an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1; performing suspected target detection on each of the S image sub-regions to obtain a suspected target detection result; when it is determined according to the suspected target detection result that there is an image position satisfying a preset condition, cropping a partial image area that includes the image position from the currently acquired image frame; and performing target detection on the partial image area.
  • Cropping a partial image area that includes the image position from the currently acquired image frame includes: when the suspected target detection result indicates that an image sub-region containing a suspected target exists, taking that sub-region as the target sub-region and calculating the scores of the pixels in the target sub-region according to the location of the suspected target; and, when the score of at least one pixel satisfies the preset condition, cropping the area of the currently acquired image frame that includes the pixels satisfying the preset condition as the partial image area.
  • Calculating the scores of the pixels in the target sub-region according to the location of the suspected target includes: when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, determining the center position of the suspected target and calculating the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
  • Cropping the area of the currently acquired image frame that includes the pixels satisfying the preset condition as the partial image area includes: accumulating the score of each pixel into the corresponding element of a matrix, where the elements of the matrix are in one-to-one correspondence with the pixels of the acquired image frame; and, when the cumulative score of at least one element in the matrix exceeds a preset threshold, cropping the area of the currently acquired image frame that includes the pixels satisfying the preset condition as the partial image area, where the pixels satisfying the preset condition are in one-to-one correspondence with the elements whose cumulative scores exceed the preset threshold.
  • The cumulative score of a target element at a first moment equals the sum of two terms: the score, at the first moment, of the pixel corresponding to the target element; and the decay value of the target element's cumulative score at the moment preceding the first moment. The target element is any element in the matrix.
  • Performing suspected target detection on the S image sub-regions to obtain the suspected target detection result includes: using a first network model to perform suspected target detection on each of the S image sub-regions. Performing target detection on the partial image area includes: using a second network model to perform target detection on the partial image area. When processing the same image, the computation amount of the first network model is less than that of the second network model.
  • An embodiment of the present invention provides an image content detection apparatus, which includes: a division module configured to divide an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1; a first detection module configured to perform suspected target detection on each of the S image sub-regions to obtain a suspected target detection result; an interception module configured to crop, when it is determined according to the suspected target detection result that there is an image position satisfying a preset condition, a partial image area that includes the image position from the currently acquired image frame; and a second detection module configured to perform target detection on the partial image area.
  • The interception module includes: a calculation unit configured to calculate, when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, the scores of the pixels in the target sub-region according to the location of the suspected target; and an interception unit configured to crop, when the score of at least one pixel satisfies the preset condition, the area of the currently acquired image frame that includes the pixels satisfying the preset condition as the partial image area.
  • The calculation unit is configured to determine the center position of the suspected target when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, and to calculate the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
  • The interception unit is configured to accumulate the score of each pixel into the corresponding element of a matrix, where the elements of the matrix are in one-to-one correspondence with the pixels of the acquired image frame; and, when the cumulative score of at least one element in the matrix exceeds a preset threshold, to crop the partial image area that includes the pixels satisfying the preset condition from the currently acquired image frame, where the pixels satisfying the preset condition are in one-to-one correspondence with the elements whose cumulative scores exceed the preset threshold.
  • The cumulative score of a target element at a first moment equals the sum of two terms: the score, at the first moment, of the pixel corresponding to the target element; and the decay value of the target element's cumulative score at the moment preceding the first moment. The target element is any element in the matrix.
  • The first detection module is configured to use a first network model to perform suspected target detection on each of the S image sub-regions to obtain the suspected target detection result; the second detection module is configured to use a second network model to perform target detection on the partial image area. When processing the same image, the computation amount of the first network model is less than that of the second network model.
  • An embodiment of the present invention provides an electronic device, including: a memory, a processor, and a program or instruction stored on the memory and executable on the processor. When the program or instruction is executed by the processor, the steps in the image content detection method provided by the embodiment of the present invention are implemented.
  • An embodiment of the present invention provides a readable storage medium on which a program or instruction is stored. When the program or instruction is executed by a processor, the steps in the image content detection method provided by the embodiment of the present invention are implemented.
  • In the embodiments of the present invention, the acquired image frame is divided into regions to obtain S image sub-regions, where S is an integer greater than 1; suspected target detection is performed on each of the S image sub-regions to obtain a suspected target detection result; when it is determined according to the suspected target detection result that there is an image position satisfying the preset condition, a partial image area that includes the image position is cropped from the currently acquired image frame; and target detection is performed on the partial image area.
  • In this way, the power consumption (computation amount) of target detection can be reduced.
  • FIG. 1 is a flowchart of an image content detection method provided by an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of an image division provided by an embodiment of the present invention.
  • FIG. 3 is a structural diagram of an image content detection apparatus provided by an embodiment of the present invention.
  • FIG. 4 is a structural diagram of an electronic device provided by an embodiment of the present invention.
  • FIG. 5 is a structural diagram of a readable storage medium provided by an embodiment of the present invention.
  • The terms "first", "second" and the like in the description and claims of the present application are used to distinguish similar objects, not to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first" and "second" are usually of one class, and the number of objects is not limited; for example, the first object may be one or multiple.
  • FIG. 1 is a flowchart of an image content detection method provided by an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps 101 to 104.
  • Step 101: Divide the acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1.
  • The acquired image frame may be an image frame collected by a camera, for example an image collected by a built-in or external camera of an electronic device.
  • The image may also be received through a network, for example a received image frame of a video.
  • Acquisition may therefore mean receiving an image frame, or reading an image frame of a local video.
  • The region division may follow preset division positions to obtain S regions, where S may be a preset integer greater than 1, for example 2, 4 or 6, set according to the application scenario.
  • Step 102: Perform suspected target detection on each of the S image sub-regions to obtain a suspected target detection result.
  • Suspected target detection may mean detecting whether an image sub-region contains a suspected target (that is, an object that may be the target) without performing specific detection on the target, for example without discriminating its type, posture, behavior or attributes.
  • Suspected target detection may also be defined as target detection with lower precision, accuracy and/or computational complexity than the target detection performed in step 104.
  • The suspected target detection result may indicate that one or more image sub-regions contain a suspected target, or, for some image frames, that none of the image sub-regions contains a suspected target.
  • The image frame may be an ultra-high-resolution or high-resolution image frame. This is not limiting, and low-resolution image frames are also possible.
  • Step 102 can be performed continuously, for example on every image frame captured by the camera or on consecutive image frames of a video.
  • Step 103: When it is determined according to the suspected target detection result that there is an image position satisfying a preset condition, crop a partial image area that includes the image position from the currently acquired image frame.
  • The preset condition may be that the scores of some pixels of the image exceed a preset threshold, in which case the positions of the pixels whose scores exceed the preset threshold are used as the image position; or the preset condition may be that a suspected target is detected in a sub-region, in which case the image position corresponding to the suspected target is used as the image position.
  • The preset condition is not limited and may be pre-configured according to the application scenario or the target detection requirements.
  • Cropping the partial image area that includes the image position from the currently acquired image frame may mean cropping the image frame with a preset width and height centered on the image position, or cropping only the image position itself.
  • The currently acquired image frame may be the image currently collected by the camera, or the image frame currently received or currently read.
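  • The fixed-size crop centered on the image position can be illustrated with a minimal sketch. The function name `crop_window`, its parameters, and the choice to shift the window so that it stays inside the frame are assumptions made for illustration; the source only states that a preset width and height centered on the image position may be used.

```python
def crop_window(frame_w, frame_h, cx, cy, crop_w, crop_h):
    """Return (left, top, right, bottom) of a crop_w x crop_h window.

    The window is centred on (cx, cy) where possible. Assumption: near the
    frame border the window is shifted inward rather than shrunk.
    Right and bottom are exclusive.
    """
    crop_w = min(crop_w, frame_w)
    crop_h = min(crop_h, frame_h)
    left = min(max(cx - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(cy - crop_h // 2, 0), frame_h - crop_h)
    return (left, top, left + crop_w, top + crop_h)


centre_box = crop_window(1920, 1080, 960, 540, 200, 100)
corner_box = crop_window(1920, 1080, 1915, 1075, 200, 100)
```

  • Shifting the window keeps the input size of the later detection step constant, which is one possible design choice; clipping the window at the border would be an equally valid reading of the text.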
  • Step 104: Perform target detection on the partial image area.
  • Target detection may mean determining the type of the target, detecting relevant attributes of the target, detecting the posture of the target, detecting the behavior of the target, and so on.
  • The form of target detection is not limited.
  • Target detection may confirm the suspected target detected in step 102, that is, confirm whether the suspected target is the target that needs to be detected.
  • With the above steps, suspected target detection is performed on each sub-region, and target detection is then performed only on a partial image area and only when the preset condition is met, which reduces the power consumption (computation amount) of target detection.
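  • The control flow of steps 101 to 104 can be summarized in a short sketch. All names here (`detect_frame`, `coarse_detect`, `pick_position`, `crop`, `fine_detect`) are hypothetical placeholders, and the stub functions stand in for real detectors purely to show when the expensive step is skipped.

```python
def detect_frame(frame, sub_regions, coarse_detect, pick_position, crop, fine_detect):
    """One pass over a frame whose sub-regions (step 101) are already known.

    All callables are hypothetical placeholders supplied by the caller.
    """
    # Step 102: cheap suspected-target detection on every sub-region.
    suspects = [coarse_detect(frame, region) for region in sub_regions]
    # Step 103: decide whether any image position satisfies the preset condition.
    position = pick_position(suspects)
    if position is None:
        return None  # the expensive detector is not run for this frame
    partial_area = crop(frame, position)
    # Step 104: full target detection on the cropped partial area only.
    return fine_detect(partial_area)


# Stub detectors used only to exercise the control flow.
fine_calls = []

def coarse(frame, region):
    # Pretend only the region labelled "B" holds a suspected target at (7, 3).
    return (7, 3) if region == "B" else None

def pick(suspects):
    hits = [s for s in suspects if s is not None]
    return hits[0] if hits else None

def crop(frame, position):
    return ("crop", position)

def fine(partial_area):
    fine_calls.append(partial_area)
    return "target confirmed"

result_hit = detect_frame("frame-0", ["A", "B"], coarse, pick, crop, fine)
result_miss = detect_frame("frame-1", ["A", "C"], coarse, pick, crop, fine)
```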
  • Embodiments of the present invention may be applied to electronic devices such as embedded devices, mobile phones, tablet computers, wearable devices and vehicles, without being limited thereto.
  • Adjacent image sub-regions among the S image sub-regions have overlapping regions.
  • That is, the areas where adjacent image sub-regions adjoin each other are overlapping areas.
  • For example, let the first image decoded from the video be I. I is divided into sub-regions of m rows and n columns, giving S = m × n image sub-regions, with an overlap of t pixels in width or height between adjacent image sub-regions.
  • The parameters m, n and t are preset according to actual business needs. For example, when m and n are both 2, S is 4, and the overlap of the divided image sub-regions (sub-regions 1 to 4 in the figure) is as shown in FIG. 2.
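  • A minimal sketch of the overlapping division into m rows and n columns follows. The function name `divide_frame` and the choice to centre the t-pixel overlap band on each nominal grid line are assumptions; the source only specifies that adjacent sub-regions overlap by t pixels.

```python
def divide_frame(width, height, m, n, t):
    """Split a width x height frame into m rows and n columns of sub-regions.

    Each sub-region is (left, top, right, bottom) with right/bottom exclusive.
    Assumption: the t-pixel overlap band is centred on the nominal grid line,
    so each neighbour extends roughly t/2 pixels past it.
    """
    regions = []
    for r in range(m):
        for c in range(n):
            left = max(0, c * width // n - t // 2)
            right = min(width, (c + 1) * width // n + (t - t // 2))
            top = max(0, r * height // m - t // 2)
            bottom = min(height, (r + 1) * height // m + (t - t // 2))
            regions.append((left, top, right, bottom))
    return regions


# The 2 x 2 case from the text (S = 4), with an assumed 10-pixel overlap.
regions = divide_frame(width=100, height=80, m=2, n=2, t=10)
```

  • The overlap reduces the chance that a target lying on a grid line is split between two sub-regions and missed by both.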
  • Cropping a partial image area that includes the image position from the currently acquired image frame includes: when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, calculating the scores of the pixels in the target sub-region according to the location of the suspected target; and, when the score of at least one pixel satisfies the preset condition, cropping the area of the currently acquired image frame that includes the pixels satisfying the preset condition as the partial image area.
  • The suspected target may be an object that has relevant attributes of the target, or an object with a relatively high probability of being the target (for example, higher than a preset threshold), and so on.
  • The target can be pre-defined as the object to be detected, such as a specific person, a specific vehicle or a specific object.
  • The score of at least one pixel satisfying the preset condition may mean that the score of the at least one pixel exceeds the preset threshold.
  • The currently acquired image frame may be the image frame acquired when the cropping operation is performed after the cumulative score of at least one element in the matrix described later is determined to exceed the preset threshold; or the image frame currently collected by the camera, or currently read, at the time the cumulative score of at least one element in the matrix exceeds the preset threshold; or an image frame collected by the camera, or read, after the cumulative score of at least one element in the matrix has exceeded the preset threshold.
  • In this implementation, the image sub-region containing the suspected target is taken as the target sub-region, the score of each pixel in the target sub-region is calculated, a partial area of the image frame is cropped according to the pixel scores as the partial image area, and target detection is performed on the partial image area, which can improve the accuracy of target detection.
  • Calculating the scores of the pixels in the target sub-region includes: when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, determining the center position of the suspected target and calculating the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
  • In this implementation, the score of each pixel in the target sub-region is calculated from its distance to the center position (for example, the geometric center) of the suspected target; the farther a pixel is from the center position, the lower its score.
  • A two-dimensional Gaussian distribution function may be used to determine the score of each pixel.
  • For example, the positions of the pixels corresponding to the suspected target form the region of interest (ROI), the coordinate position of the central pixel of the ROI is (x_c, y_c), that is, the center position, and every other pixel is denoted pixel (i, j) by its coordinate position. The score of pixel (i, j) is then given by a two-dimensional Gaussian function G(i, j) centered on (x_c, y_c), where G(i, j) represents the score of pixel (i, j).
  • The score of the center position can be pre-configured, or obtained from the same function by setting pixel (i, j) equal to pixel (x_c, y_c).
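  • As an illustration only, one common form of a two-dimensional Gaussian score is G(i, j) = peak · exp(−((i − x_c)² + (j − y_c)²) / (2σ²)). The spread `sigma`, the scale `peak` and the function name `gaussian_score` are assumptions, since the exact formula and parameter values used in the embodiment are not given in this text.

```python
import math


def gaussian_score(i, j, xc, yc, sigma=10.0, peak=1.0):
    """Score of pixel (i, j) for a suspected target centred on (xc, yc).

    Assumed isotropic Gaussian: the score equals `peak` at the centre and
    falls off with squared distance; `sigma` controls how quickly.
    """
    squared_distance = (i - xc) ** 2 + (j - yc) ** 2
    return peak * math.exp(-squared_distance / (2.0 * sigma ** 2))


centre_score = gaussian_score(5, 5, 5, 5)
near_score = gaussian_score(5, 8, 5, 5)
far_score = gaussian_score(5, 15, 5, 5)
```

  • With these assumed parameters the score at the center equals `peak`, and any function of this shape satisfies the negative correlation between score and distance that the method requires.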
  • In this way, the pixel scores of the image sub-region can be determined outward from the center position to the surrounding area.
  • The embodiment of the present invention is not limited to using a two-dimensional Gaussian distribution function to determine the score of each pixel.
  • For example, the score can also be determined directly from the distance between the pixel and the center position: the distance from the center position is divided into multiple intervals, and each interval corresponds to a score.
  • Because the score of a pixel is negatively correlated with its distance from the center position, pixels farther from the suspected target receive lower scores, which reduces how often the preset condition is triggered and further saves power.
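  • The interval-based alternative can be sketched as a lookup over distance bands. The band limits and scores below, and the name `banded_score`, are illustrative assumptions rather than values from the source.

```python
import math


def banded_score(i, j, xc, yc, bands=((5.0, 1.0), (15.0, 0.5), (30.0, 0.2))):
    """Score pixel (i, j) by the distance interval it falls into.

    `bands` is a sequence of (max_distance, score) pairs sorted by distance;
    the values are assumed for illustration. Pixels beyond the last band
    score zero.
    """
    distance = math.hypot(i - xc, j - yc)
    for max_distance, score in bands:
        if distance <= max_distance:
            return score
    return 0.0


band_scores = [banded_score(d, 0, 0, 0) for d in (0, 10, 20, 100)]
```

  • A lookup of this kind avoids evaluating an exponential per pixel, which may matter on the low-power devices the method targets.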
  • Pixels in image sub-regions where no suspected target is detected may have a score of zero, or, in the matrix-based implementation, no score accumulation is performed for them.
  • Cropping the area of the currently acquired image frame that includes the pixels satisfying the preset condition as the partial image area includes: accumulating the score of each pixel into the corresponding element of a matrix, where the elements of the matrix are in one-to-one correspondence with the pixels of the acquired image frame; and, when the cumulative score of at least one element in the matrix exceeds a preset threshold, cropping the area of the currently acquired image frame that includes the pixels satisfying the preset condition as the partial image area, where the pixels satisfying the preset condition are in one-to-one correspondence with the elements whose cumulative scores exceed the preset threshold.
  • The matrix is predefined, and the initial value of each element in the matrix may be zero.
  • The scores of the elements in the matrix can be accumulated by processing multiple image frames.
  • The score of an element does not necessarily keep increasing; it can also decay.
  • In this implementation, a matrix is preset whose elements correspond one-to-one with the pixels of the image.
  • The length and width of the matrix may equal the length and width of the image. For each image frame, the score of the corresponding element in the matrix can therefore be calculated from the scores of the pixels in that frame, and by processing multiple image frames the cumulative scores of the elements of the matrix are obtained. Only when the cumulative score of at least one element (a reference element) in the matrix exceeds the preset threshold is the area of the currently acquired image frame that includes the pixels corresponding to these reference elements cropped as the partial image area, which reduces the number of target detections and further saves power.
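  • The accumulation and threshold test can be sketched with a plain nested list as the matrix. The helper names `accumulate` and `elements_over_threshold`, the tiny 3 × 4 matrix and the score values are assumptions for illustration, and decay of the cumulative scores is left out of this sketch.

```python
def accumulate(matrix, frame_scores):
    """Add one frame's pixel scores into the matching matrix elements.

    `frame_scores` maps (row, col) to the score of that pixel in the current
    frame; pixels with no suspected target nearby are simply absent.
    """
    for (row, col), score in frame_scores.items():
        matrix[row][col] += score


def elements_over_threshold(matrix, threshold):
    """Return the (row, col) of every element whose cumulative score exceeds threshold."""
    return [
        (r, c)
        for r, row in enumerate(matrix)
        for c, value in enumerate(row)
        if value > threshold
    ]


# 3 rows x 4 columns, one element per pixel, initial value zero.
matrix = [[0.0] * 4 for _ in range(3)]
for _ in range(3):  # the same suspected target seen in three frames
    accumulate(matrix, {(1, 2): 0.5, (1, 3): 0.25})
hits = elements_over_threshold(matrix, threshold=1.0)
```

  • In this example the pixel at (1, 2) crosses the assumed threshold only after the third frame, so target detection would be triggered once instead of three times.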
  • The cumulative score of a target element at a first moment equals the sum of two terms: the score, at the first moment, of the pixel corresponding to the target element; and the decay value of the target element's cumulative score at the moment preceding the first moment.
  • Each moment corresponds to one image frame.
  • The cumulative score of each element in the matrix is attenuated to a decay value at each moment. The current score of the pixel (the score added at this moment) is therefore added to the decay value of the corresponding element's cumulative score from the previous moment (the cumulative score that would remain at this moment if no score were added) to obtain the cumulative score at the current moment.
  • A suspected target may not be accurately identified from one image frame (that is, from the pixel scores of a single frame), whereas accumulation allows it to be determined jointly from multiple image frames. At the same time, because the cumulative score decays, the influence of image frames from too long ago is excluded, so that the suspected target is determined from multiple consecutive image frames.
  • If an element receives no new score for long enough, its score decays to 0. For example, if no suspected target is detected in a certain image sub-region for m consecutive frames (m moments), the corresponding elements receive no score in those m frames and their scores decay to 0.
  • The decay value of the cumulative score may be obtained by applying an exponential decay method to the elements of the matrix (which may be denoted M). Assuming that, with no new score added, the score of each element in the matrix M decays to 0 after m frames, every m consecutive frames are regarded as one statistical period, and the exponential decay method is used to compute the decay coefficient applied to the matrix within this statistical period. Here m and the decay-rate parameter are constants, and the coefficient decays from 1 to close to 0 over the period.
  • Exponential decay is not the only way to attenuate element scores. For example, the score may be attenuated by a fixed percentage, such as 20%, every other image frame.
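  • The update rule (new pixel score plus the decayed previous cumulative score) can be sketched for a single element. This sketch uses the fixed-percentage variant and assumes the attenuation is applied once per frame with a keep factor of 0.8, that is, a 20% reduction; the name `update_cumulative` and the sample scores are hypothetical.

```python
def update_cumulative(previous, new_score, keep=0.8):
    """Cumulative score at the current moment for one matrix element.

    cumulative(t) = new_score(t) + keep * cumulative(t - 1)
    `keep` = 0.8 models an assumed 20% attenuation per frame.
    """
    return new_score + keep * previous


# A suspected target is seen in two frames, then disappears for three frames.
history = []
value = 0.0
for new_score in [1.0, 1.0, 0.0, 0.0, 0.0]:
    value = update_cumulative(value, new_score)
    history.append(value)
```

  • The cumulative score rises while the suspected target is present and then falls frame by frame, so a threshold test on it favours targets that persist over several consecutive frames.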
  • Performing suspected target detection on the S image sub-regions to obtain the suspected target detection result includes: using a first network model to perform suspected target detection on each of the S image sub-regions to obtain the suspected target detection result.
  • Performing target detection on the partial image area includes: using a second network model to perform target detection on the partial image area. When processing the same image, the computation amount of the first network model is less than that of the second network model.
  • The first network model and the second network model may be pre-trained.
  • Because the computation amount of the first network model is less than that of the second network model under the same conditions, the first network model may be called a shallow target recognition network model, and the second network model a deep target recognition network model.
  • The first network model screens out suspected targets from all images with low computational overhead and does not discriminate the specific type of target in detail; it only needs to be fast and does not need high recognition accuracy. The second network model is then used to perform target detection on the partial image area.
  • The second network model can be a network that meets a specific recognition accuracy according to actual business needs, such as a deep Faster R-CNN network, and the first network model can be a shallow SSD network. This is not limiting; for example, both the first network model and the second network model may be other network models.
  • Although the first network model has to process all the images, it is only used to identify suspected targets rather than to perform accurate target detection, so its total computation amount is relatively low.
  • The second network model accurately detects the final target, which ensures the quality of target detection. At the same time, the second network model only processes the partial image area containing the suspected target, and only when a suspected target exists, so both the number of recognitions and the size of the image to be recognized are small, and its total computation amount is also low.
  • In this way, power consumption can be reduced and computational efficiency improved.
  • In the embodiments of the present invention, the acquired image frame is divided into regions to obtain S image sub-regions, where S is an integer greater than 1; suspected target detection is performed on each of the S image sub-regions to obtain a suspected target detection result; when it is determined according to the suspected target detection result that there is an image position satisfying the preset condition, a partial image area that includes the image position is cropped from the currently acquired image frame; and target detection is performed on the partial image area.
  • In this way, the power consumption (computation amount) of target detection can be reduced.
  • FIG. 3 is a structural diagram of an image content detection apparatus provided by an embodiment of the present invention.
  • The image content detection apparatus 300 includes: a division module 301, configured to divide the acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1; a first detection module 302, configured to perform suspected target detection on each of the S image sub-regions to obtain a suspected target detection result; an interception module 303, configured to crop, when it is determined according to the suspected target detection result that there is an image position satisfying the preset condition, a partial image area that includes the image position from the currently acquired image frame; and a second detection module 304, configured to perform target detection on the partial image area.
  • the interception module includes: a calculation unit, configured to calculate the target sub-region according to the location of the suspected target when the suspected target detection result indicates that there is an image sub-region with a suspected target as the target sub-region.
  • the score of the pixels in the target sub-area is configured to intercept the area of the currently acquired image frame including the pixels that satisfy the preset condition as the partial area of the image when the score of at least one pixel satisfies the preset condition.
  • the calculation unit is configured to determine the center position of the suspected target when the suspected target detection result indicates that there is an image sub-region with a suspected target as the target sub-region, and calculate the center position according to the center position.
  • the score of each pixel in the target sub-region is negatively correlated with the distance between the pixel and the center position.
  • the intercepting unit is configured to accumulate the scores of each pixel into the elements corresponding to the matrix, wherein the elements included in the matrix are in one-to-one correspondence with the pixels included in the acquired image frame; and in the matrix at least When the cumulative score of an element exceeds the preset threshold, intercept the image partial area that includes the pixels that meet the preset conditions in the currently acquired image frame, wherein the pixels that meet the preset conditions and the accumulated score exceed the preset threshold.
  • Elements correspond one-to-one.
  • the cumulative score of the target element at the first moment is equal to the sum of the following two items:
  • the first detection module is configured to use the first network model to respectively perform suspected target detection on the S image sub-regions to obtain a suspected target detection result;
  • the second detection module is configured to use the second network model to perform target detection on the partial area of the image; wherein, when processing the same image, the computation amount of the first network model is less than the computation amount of the second network model.
  • the image content detection apparatus provided in this embodiment of the present invention can implement each process in the method embodiment of FIG. 1 , and to avoid repetition, details are not described here.
  • the image content detection apparatus in the embodiment of the present invention may be a relatively independent apparatus, or may be a component, an integrated circuit, or a chip in an electronic device.
  • FIG. 4 is a structural diagram of an electronic device provided by an embodiment of the present invention.
  • the electronic device 400 includes: a memory 401, a processor 402, and a program or instructions that are stored in the memory 401 and can be run on the processor 402.
  • an embodiment of the present invention further provides a readable storage medium 500, where a program or an instruction is stored on the readable storage medium 500; when the program or instruction is executed by a processor, each process of the above image content detection method embodiment is implemented and the same technical effect can be achieved. To avoid repetition, details are not repeated here.
  • the method of the above embodiment can be implemented by means of software plus a necessary general hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application can be embodied in the form of a software product in essence or in a part that contributes to the prior art, and the computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk, CD-ROM), including several instructions to make a terminal (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the methods described in the various embodiments of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

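The disclosed image content detection method includes: dividing an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1; performing suspected target detection on each of the S image sub-regions to obtain a suspected target detection result; when it is determined from the suspected target detection result that an image position satisfying a preset condition exists, intercepting, from the currently acquired image frame, a partial image region that includes the image position; and performing target detection on the partial image region.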

Description

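Image content detection method, apparatus, electronic device, and readable storage medium
Technical Field
The present invention relates to the technical field of image processing, and in particular to an image content detection method, an apparatus, an electronic device, and a readable storage medium.
Background
With the development of image processing technology, its applications are becoming increasingly widespread, and target detection is now a common image processing technique. Currently, target detection is performed on every image frame and over the entire area of each frame, which makes the power consumption (amount of computation) of target detection relatively high.
Summary
Embodiments of the present invention provide an image content detection method, apparatus, electronic device, and readable storage medium, to solve the problem that the power consumption (amount of computation) of target detection is relatively high.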
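In a first aspect, an embodiment of the present invention provides an image content detection method, including: dividing an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1; performing suspected target detection on each of the S image sub-regions to obtain a suspected target detection result; when it is determined from the suspected target detection result that an image position satisfying a preset condition exists, intercepting, from the currently acquired image frame, a partial image region that includes the image position; and performing target detection on the partial image region.
In some embodiments, intercepting the partial image region that includes the image position from the currently acquired image frame, when it is determined from the suspected target detection result that an image position satisfying the preset condition exists, includes: when the suspected target detection result indicates that an image sub-region containing a suspected target exists as a target sub-region, calculating the scores of the pixels in the target sub-region according to the position of the suspected target; and when the score of at least one pixel satisfies the preset condition, intercepting, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region.
In some embodiments, calculating the scores of the pixels in the target sub-region according to the position of the suspected target includes: when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, determining the center position of the suspected target and calculating the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
In some embodiments, intercepting the region that includes the pixels satisfying the preset condition as the partial image region, when the score of at least one pixel satisfies the preset condition, includes: accumulating the score of each pixel into the corresponding element of a matrix, where the elements of the matrix correspond one-to-one to the pixels of the acquired image frame; and when the cumulative score of at least one element in the matrix exceeds a preset threshold, intercepting, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region, where the pixels satisfying the preset condition correspond one-to-one to the elements whose cumulative scores exceed the preset threshold.
In some embodiments, the cumulative score of a target element at a first moment equals the sum of the following two items: the score, at the first moment, of the pixel corresponding to the target element; and the decayed value of the cumulative score of the target element at the moment preceding the first moment; where the target element is any element in the matrix.
In some embodiments, performing suspected target detection on each of the S image sub-regions to obtain the suspected target detection result includes: using a first network model to perform suspected target detection on each of the S image sub-regions to obtain the suspected target detection result; and performing target detection on the partial image region includes: using a second network model to perform target detection on the partial image region; where, when processing the same image, the amount of computation of the first network model is less than that of the second network model.
In a second aspect, an embodiment of the present invention provides an image content detection apparatus, including: a division module, configured to divide an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1; a first detection module, configured to perform suspected target detection on each of the S image sub-regions to obtain a suspected target detection result; an interception module, configured to intercept, from the currently acquired image frame, a partial image region that includes an image position when it is determined from the suspected target detection result that an image position satisfying a preset condition exists; and a second detection module, configured to perform target detection on the partial image region.
In some embodiments, the interception module includes: a calculation unit, configured to calculate the scores of the pixels in a target sub-region according to the position of the suspected target when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region; and an intercepting unit, configured to intercept, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region when the score of at least one pixel satisfies the preset condition.
In some embodiments, the calculation unit is configured to, when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, determine the center position of the suspected target and calculate the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
In some embodiments, the intercepting unit is configured to accumulate the score of each pixel into the corresponding element of a matrix, where the elements of the matrix correspond one-to-one to the pixels of the acquired image frame; and, when the cumulative score of at least one element in the matrix exceeds a preset threshold, to intercept, from the currently acquired image frame, the partial image region that includes the pixels satisfying the preset condition, where the pixels satisfying the preset condition correspond one-to-one to the elements whose cumulative scores exceed the preset threshold.
In some embodiments, the cumulative score of a target element at a first moment equals the sum of the following two items: the score, at the first moment, of the pixel corresponding to the target element; and the decayed value of the cumulative score of the target element at the moment preceding the first moment; where the target element is any element in the matrix.
In some embodiments, the first detection module is configured to use a first network model to perform suspected target detection on each of the S image sub-regions to obtain the suspected target detection result; the second detection module is configured to use a second network model to perform target detection on the partial image region; where, when processing the same image, the amount of computation of the first network model is less than that of the second network model.
In a third aspect, an embodiment of the present invention provides an electronic device, including: a memory, a processor, and a program or instructions stored in the memory and executable on the processor, where the program or instructions, when executed by the processor, implement the steps of the image content detection method provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a readable storage medium, characterized in that a program or instructions are stored on the readable storage medium, and the program or instructions, when executed by a processor, implement the steps of the image content detection method provided by the embodiments of the present invention.
In the embodiments of the present invention, an acquired image frame is divided into regions to obtain S image sub-regions, where S is an integer greater than 1; suspected target detection is performed on each of the S image sub-regions to obtain a suspected target detection result; when it is determined from the suspected target detection result that an image position satisfying a preset condition exists, a partial image region that includes the image position is intercepted from the currently acquired image frame; and target detection is performed on the partial image region. Because suspected target detection is performed on each sub-region, and target detection is then performed only on a partial image region and only when the preset condition is satisfied, the power consumption (amount of computation) of target detection can be reduced.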
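Brief Description of the Drawings
FIG. 1 is a flowchart of an image content detection method provided by an embodiment of the present invention.
FIG. 2 is a schematic diagram of image division provided by an embodiment of the present invention.
FIG. 3 is a structural diagram of an image content detection apparatus provided by an embodiment of the present invention.
FIG. 4 is a structural diagram of an electronic device provided by an embodiment of the present invention.
FIG. 5 is a structural diagram of a readable storage medium provided by an embodiment of the present invention.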
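Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
The terms "first", "second", and the like in the specification and claims of the present application are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present application can be implemented in orders other than those illustrated or described here. Objects distinguished by "first" and "second" are usually of one type, and their number is not limited; for example, there may be one first object or several.
Referring to FIG. 1, FIG. 1 is a flowchart of an image content detection method provided by an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps 101 to 104.
Step 101: divide an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1.
The acquired image frame may be an image frame captured by a camera, for example an image captured by a built-in or external camera of the electronic device. This is not limiting: the image may also be received over a network, such as an image frame of a received video. In addition, acquiring may mean receiving, or reading an image frame of a locally stored video.
The region division may follow preset division positions to obtain S regions, where S may be a preset integer greater than 1, for example 2, 4, or 6; the specific value may be set according to the application scenario.
Step 102: perform suspected target detection on each of the S image sub-regions to obtain a suspected target detection result.
The suspected target detection may detect whether an image sub-region contains a suspected target (that is, an object that "may" be a target) without examining the target in detail; for example, the target's type, posture, behavior, and attributes are not determined.
Further, the suspected target detection may also be defined as target detection whose precision, accuracy, and/or amount of computation is lower than that of the target detection performed in step 104.
The suspected target detection result may indicate that one or more image sub-regions contain a suspected target, or, for some image frames, that none of the image sub-regions contains one.
In addition, the image frame may be an ultra-high-resolution or high-resolution image frame; this is not limiting, and low-resolution image frames are also possible.
It should be noted that step 102 may be performed continuously, for example on every image frame captured by the camera or on consecutive image frames of a video.
Step 103: when it is determined from the suspected target detection result that an image position satisfying a preset condition exists, intercept, from the currently acquired image frame, a partial image region that includes the image position.
The preset condition may be that the scores of some pixels of the image exceed a preset threshold, in which case the positions of the pixels whose scores exceed the threshold are taken as the image position; or the preset condition may be that a suspected target is detected in the same image sub-region in multiple consecutive image frames, in which case the image position corresponding to that suspected target is taken as the image position. The embodiments of the present invention do not limit the preset condition, which may be preconfigured according to the application scenario or the target detection requirements.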
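The second form of the preset condition (a suspected target seen in the same image sub-region over several consecutive frames) can be sketched as a per-sub-region streak counter. This is only an illustration: the function name `update_streaks` and the parameter `required` (the number of consecutive frames) are assumptions, not names from the source.

```python
def update_streaks(streaks, detections, required):
    """Update per-sub-region streak counters for one frame.

    streaks:    list of ints, one per image sub-region (modified in place)
    detections: list of bools, True if a suspected target was found
                in that sub-region in the current frame
    required:   hypothetical number of consecutive frames needed
    Returns the indices of sub-regions that satisfy the condition.
    """
    triggered = []
    for index, found in enumerate(detections):
        # A miss resets the streak; a hit extends it.
        streaks[index] = streaks[index] + 1 if found else 0
        if streaks[index] >= required:
            triggered.append(index)
    return triggered
```

Calling this once per frame keeps the state small (one integer per sub-region), which fits the low-power goal described above.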
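Intercepting the partial image region that includes the image position from the currently acquired image frame may mean intercepting the image frame with a preset width and height centered on the image position, or intercepting only the image position itself.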
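Intercepting (cropping) a window of preset width and height centered on the image position might look like the sketch below. The source does not say what happens near the frame border; shifting the window so it stays inside the frame is an assumption made here, and `crop_box` is a hypothetical name.

```python
def crop_box(center_x, center_y, crop_w, crop_h, frame_w, frame_h):
    """Return (left, top, right, bottom) of a crop centered on a position.

    Assumption: the window is shifted, not shrunk, when it would
    otherwise extend past the frame border.
    """
    crop_w = min(crop_w, frame_w)
    crop_h = min(crop_h, frame_h)
    left = min(max(center_x - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(center_y - crop_h // 2, 0), frame_h - crop_h)
    return left, top, left + crop_w, top + crop_h
```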
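The currently acquired image frame may be the image currently captured by the camera, or the image frame currently received or currently read.
Step 104: perform target detection on the partial image region.
The target detection may determine the type of the target, detect attributes of the target, detect the posture of the target, or detect the behavior of the target; the embodiments of the present invention do not limit the target detection. Alternatively, the target detection may confirm the suspected target detected in step 102, that is, confirm whether the suspected target is a target that needs to be detected.
In the embodiments of the present invention, the above steps perform suspected target detection on each sub-region and then perform target detection only on a partial image region and only when the preset condition is satisfied, which reduces the power consumption (amount of computation) of target detection.
It should be noted that the embodiments of the present invention may be applied to electronic devices such as embedded devices, mobile phones, tablet computers, wearable devices, and vehicles; this is not limiting.
As an optional implementation, adjacent image sub-regions among the S image sub-regions have an overlapping area.
That adjacent image sub-regions have an overlapping area may mean that the bordering areas of adjacent image sub-regions overlap.
For example, suppose the first image obtained after video decoding is I. I is divided into sub-regions in m rows and n columns, giving S = m × n image sub-regions in total, with adjacent image sub-regions overlapping by t pixels in width or height. The parameters m, n, and t are all values preset according to actual service needs. For example, when m and n are both 2 and S is 4, the overlap of the resulting image sub-regions (denoted sub-regions 1-4 in the figure) is as shown in FIG. 2.
In this implementation, because adjacent image sub-regions overlap, the image content evaluated in each sub-region remains continuous when the sub-regions are detected. This avoids the problem of a target on the boundary between adjacent image sub-regions being "split" in two and going unrecognized in either sub-region, and so improves the accuracy of target detection.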
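The m-row, n-column division with a t-pixel overlap can be sketched as follows. The source fixes only m, n, and t; how the overlap is distributed around each cell boundary is not specified, so splitting it roughly evenly on both sides is an assumption, as is the function name `split_with_overlap`.

```python
def split_with_overlap(width, height, m, n, t):
    """Return (left, top, right, bottom) boxes for an m-row, n-column grid.

    Interior neighbours overlap by t pixels; boxes are clamped to the
    frame, so border cells are not extended past the image edge.
    """
    boxes = []
    cell_w = width / n
    cell_h = height / m
    half = t // 2
    for row in range(m):
        for col in range(n):
            left = max(0, int(col * cell_w) - half)
            top = max(0, int(row * cell_h) - half)
            right = min(width, int((col + 1) * cell_w) + (t - half))
            bottom = min(height, int((row + 1) * cell_h) + (t - half))
            boxes.append((left, top, right, bottom))
    return boxes
```

With m = n = 2 this yields the four sub-regions of the example above, each sharing a t-pixel band with its horizontal and vertical neighbour.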
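As an optional implementation, intercepting the partial image region that includes the image position from the currently acquired image frame, when it is determined from the suspected target detection result that an image position satisfying the preset condition exists, includes: when the suspected target detection result indicates that an image sub-region containing a suspected target exists as a target sub-region, calculating the scores of the pixels in the target sub-region according to the position of the suspected target; and when the score of at least one pixel satisfies the preset condition, intercepting, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region.
The suspected target may be an object that has attributes of the target, or an object with a relatively high probability of being the target (for example, higher than a preset threshold). The target may be predefined as the object that needs to be detected, such as a specific person, a specific vehicle, or a specific item.
That the score of at least one pixel satisfies the preset condition may mean that the score of at least one pixel exceeds a preset threshold.
The currently acquired image frame may be the image frame acquired when the interception is performed upon subsequently determining that the cumulative score of at least one element in the matrix exceeds the preset threshold; or it may be the image frame currently captured by the camera or currently read when the cumulative score of at least one element in the matrix subsequently exceeds the preset threshold; or it may be an image frame captured by the camera or read after the cumulative score of at least one element in the matrix exceeds the preset threshold.
In this implementation, if a suspected target is detected in an image sub-region, that sub-region is taken as the target sub-region and the score of each pixel in it is calculated. Based on the pixel scores, part of the image frame is intercepted as the partial image region, and target detection is performed on it, which can improve the accuracy of target detection.
Optionally, calculating the scores of the pixels in the target sub-region according to the position of the suspected target, when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, includes: determining the center position of the suspected target and calculating the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
In this implementation, each pixel in the target sub-region receives a score based on its distance from the center position (such as the geometric center) of the suspected target: the farther a pixel is from the center position, the lower its score.
For example, a two-dimensional Gaussian distribution function may be used to determine the score of each pixel. In the target sub-region, the positions of the pixels corresponding to the suspected target are taken as the region of interest (ROI). The coordinates of the center pixel of the ROI are (x_c, y_c), which is the center position above, and any other pixel is denoted by its coordinates as pixel (i, j). The score of pixel (i, j) is then obtained by the following formula: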
Figure PCTCN2021116099-appb-000001
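Here G(i, j) denotes the score of pixel (i, j).
It should be noted that the above formula is only an example; for instance, the 0.5 in the formula may be set to another constant, or further constants may be added to the formula. This is not limiting.
Further, the score of the center position may be preconfigured, or may be obtained from the above formula with pixel (i, j) equal to pixel (x_c, y_c).
According to the two-dimensional Gaussian distribution function, the pixel scores of the image sub-region weaken from the center position toward the periphery.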
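The exact formula is an image that did not survive in this copy, so the sketch below is not a reproduction of it. It assumes an unnormalized isotropic Gaussian, G(i, j) = exp(-0.5 · d² / σ²), where d is the distance from (i, j) to (x_c, y_c); the 0.5 comes from the text, while the spread `sigma` and the name `gaussian_score` are assumptions.

```python
import math

def gaussian_score(i, j, xc, yc, sigma=10.0):
    """Score of pixel (i, j) relative to the ROI center (xc, yc).

    Assumed form: exp(-0.5 * squared_distance / sigma**2).
    sigma is a hypothetical spread parameter, not given in the source.
    """
    squared_distance = (i - xc) ** 2 + (j - yc) ** 2
    return math.exp(-0.5 * squared_distance / (sigma ** 2))
```

Under this assumed form the center pixel scores 1.0 and scores fall off monotonically with distance, which matches the negative correlation the text requires.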
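It should be noted that the embodiments of the present invention are not limited to using a two-dimensional Gaussian distribution function to determine the pixel scores. For example, the score may be determined directly from the distance between a pixel and the center position: the distances between the pixels of the image sub-region and the center position are divided into multiple intervals, and each interval corresponds to one score.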
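The interval-based alternative can be sketched as a small lookup table. The interval bounds and scores in `DISTANCE_BINS` are made-up illustrative values, and `binned_score` is a hypothetical name; the source only states that each distance interval maps to one score.

```python
import math

# Hypothetical intervals: (upper bound on distance in pixels, score).
DISTANCE_BINS = [(5.0, 1.0), (15.0, 0.5), (30.0, 0.2)]

def binned_score(i, j, xc, yc, bins=DISTANCE_BINS):
    """Return the score of the first interval that contains the distance."""
    distance = math.hypot(i - xc, j - yc)
    for max_distance, score in bins:
        if distance <= max_distance:
            return score
    # Pixels beyond the last interval contribute nothing (assumption).
    return 0.0
```

A table like this avoids evaluating an exponential per pixel, at the cost of a stepped rather than smooth falloff.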
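In this implementation, the score of a pixel is negatively correlated with its distance from the center position, so pixels farther from the suspected target score lower. This reduces how often the preset condition is triggered and further saves power.
It should also be noted that the embodiments of the present invention are not limited to calculating the pixel scores of the target sub-region from the center position; for example, the scores may be calculated directly from the area occupied by the suspected target.
In addition, the pixels of an image sub-region in which no suspected target is detected (that is, a sub-region that is not a target sub-region) may have a score of zero; or, in the matrix-based implementation, the score accumulation is not performed for sub-regions in which no suspected target is detected.
Optionally, intercepting the region that includes the pixels satisfying the preset condition as the partial image region, when the score of at least one pixel satisfies the preset condition, includes: accumulating the score of each pixel into the corresponding element of a matrix, where the elements of the matrix correspond one-to-one to the pixels of the acquired image frame; and when the cumulative score of at least one element in the matrix exceeds a preset threshold, intercepting, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region, where the pixels satisfying the preset condition correspond one-to-one to the elements whose cumulative scores exceed the preset threshold.
The matrix is predefined, and the initial value of each of its elements may be zero.
It should be noted that, because the image frames captured by the camera or the acquired video are continuous, the scores of the elements of the matrix can accumulate as multiple image frames are processed. In the embodiments of the present invention, however, an element's score does not necessarily keep increasing; it may be subject to some decay.
In this implementation, a matrix is set in advance whose elements correspond one-to-one to the pixels of the image; for example, the matrix may have the same width and height as the image. For each image frame, the scores of the corresponding matrix elements are computed from the pixel scores, and processing multiple image frames yields the cumulative scores of the matrix elements. A region of the currently acquired image frame that includes the pixels corresponding to the reference elements is therefore intercepted as the partial image region only when at least one element (a reference element) of the matrix has a cumulative score above the preset threshold, which reduces the number of target detections and further saves power.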
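The accumulate-then-threshold logic can be sketched with a nested-list matrix. Returning the bounding box of all over-threshold elements is one possible way to choose "a region that includes" those pixels and is an assumption here; the names `accumulate` and `region_over_threshold` are hypothetical, and decay is deliberately left out of this sketch.

```python
def accumulate(matrix, frame_scores):
    """Add one frame's per-pixel scores into the cumulative matrix in place."""
    for y, row in enumerate(frame_scores):
        for x, score in enumerate(row):
            matrix[y][x] += score

def region_over_threshold(matrix, threshold):
    """Return (left, top, right, bottom) covering every element whose
    cumulative score exceeds the threshold, or None if there is none."""
    hits = [(x, y)
            for y, row in enumerate(matrix)
            for x, value in enumerate(row)
            if value > threshold]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    # right/bottom are exclusive, so add 1 to the largest index.
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1
```

The second-stage detector runs only when `region_over_threshold` returns a box, which is what keeps the number of full detections low.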
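Optionally, the cumulative score of a target element at a first moment equals the sum of the following two items: the score, at the first moment, of the pixel corresponding to the target element; and the decayed value of the cumulative score of the target element at the moment preceding the first moment; where the target element is any element in the matrix.
Each moment corresponds to one image frame.
In this implementation, with each passing moment the cumulative score of every matrix element "decays" to its "decayed value". The current score of a pixel (the score added at this moment) is therefore added to the decayed value of the corresponding element's cumulative score from the previous moment (the cumulative score that would remain at this moment if no score were added) to obtain the cumulative score at the current moment.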
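The per-frame update "current score plus decayed previous cumulative score" can be written in one line per element. The sketch assumes the decayed value is the previous cumulative score multiplied by a coefficient `decay` between 0 and 1; the multiplicative form and the name `update_cumulative` are assumptions.

```python
def update_cumulative(previous, current_scores, decay):
    """Return the new cumulative matrix for one frame.

    new[y][x] = decay * previous[y][x] + current_scores[y][x]
    decay is a hypothetical coefficient in [0, 1].
    """
    return [
        [decay * prev + cur for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current_scores)
    ]
```

For an element that receives no new score, repeated updates shrink its cumulative score geometrically, so old detections gradually stop influencing the trigger.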
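In some cases a suspected target cannot be accurately identified from a single image frame (that is, from the pixel scores of one frame alone), whereas accumulation allows the suspected target to be determined jointly from multiple image frames. At the same time, because the cumulative score also decays, the influence of image frames from too long ago is excluded, so the suspected target is determined from multiple consecutive image frames.
Further, because of the decayed value of the target element's cumulative score at the moment preceding the first moment, the score of an element that gains no score for a certain period decays to 0. For example, if no suspected target is detected in an image sub-region for m consecutive frames (m moments), the corresponding elements gain no score in those m frames, and their scores decay to 0.
In addition, the decayed value of the cumulative score may be obtained by applying exponential decay to the elements of the matrix (denoted M). Suppose that, absent any newly added score, the score of each element of M decays to 0 after m frames, that is, every m consecutive frames form one time-statistics period; M is then decayed exponentially within that period. Let k_{x,y} denote the score of each element of the cumulative matrix M at the current moment, corresponding to the image frame (denoted the original image I). At the next moment (the next frame), the element decays by multiplication with a preset coefficient α(t):
$k_{x,y}(t+1) = \alpha(t) \times k_{x,y}(t)$
$\alpha(t) = \exp(-\theta \times t)$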
Figure PCTCN2021116099-appb-000002
Here m and ε are constants, and the above formula for θ makes the coefficient decay from 1 to close to 0.
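The formula for θ is an image that is missing from this copy, so the sketch below does not reproduce it. It assumes θ = ln(1/ε) / m, one choice that makes α(0) = 1 and α(m) = ε, consistent with the statement that the coefficient decays from 1 to close to 0 over m frames. The function name and the default ε are assumptions.

```python
import math

def decay_coefficient(t, m, epsilon=1e-3):
    """alpha(t) = exp(-theta * t), with theta assumed to be ln(1/epsilon) / m.

    t:       frame index within the current statistics period (0..m)
    m:       number of frames after which the score should be near 0
    epsilon: hypothetical residual value reached at t = m
    """
    theta = math.log(1.0 / epsilon) / m
    return math.exp(-theta * t)
```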
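Of course, the embodiments of the present invention are not limited to exponential decay of the element scores; for example, the scores may decay by a fixed proportion per image frame, such as 20% per frame.
As an optional implementation, performing suspected target detection on each of the S image sub-regions to obtain the suspected target detection result includes: using a first network model to perform suspected target detection on each of the S image sub-regions to obtain the suspected target detection result; and performing target detection on the partial image region includes: using a second network model to perform target detection on the partial image region; where, when processing the same image, the amount of computation of the first network model is less than that of the second network model.
The first network model and the second network model may be pre-trained. Because the first network model requires less computation than the second under the same conditions, the first network model may also be called a shallow target recognition network model and the second a deep target recognition network model.
In this implementation, the first network model screens the entire image for suspected targets at low computational cost without determining the specific type of the target; only speed is required, not high recognition accuracy. The second network model is then used to perform target detection on the partial image region.
In addition, the second network model may be a network that meets a specific recognition accuracy according to actual service needs, such as a deep FasterRcnn network, and the first network model may be a shallow SSD network. This is not limiting; both the first and second network models may be other network models.
Although the first network model has to process the entire image, it only needs to identify suspected targets rather than perform accurate target detection, so its total amount of computation is low. The second network model can accurately detect the final target and thus ensures the detection quality; at the same time, it only processes a partial image region containing a suspected target, and only when a suspected target is actually present, so it runs infrequently and on small inputs, and its total amount of computation is also low.
In short, in this implementation the second network model performs detection only on the intercepted image region and only after the preset condition is satisfied, which saves power (amount of computation) and also improves computational efficiency.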
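The division of labour between the two models can be sketched with plain callables standing in for the networks. This is a simplified illustration: it skips the scoring, accumulation, and interception steps and passes a flagged sub-region straight to the second stage. The names `two_stage_detect`, `screen`, and `confirm` are hypothetical.

```python
def two_stage_detect(sub_regions, screen, confirm):
    """Run a cheap screen on every sub-region and an expensive
    confirmation only on the sub-regions the screen flags.

    screen(region)  -> bool           (stand-in for the first network model)
    confirm(region) -> label or None  (stand-in for the second network model)
    Returns a list of (region, label) pairs for confirmed targets.
    """
    results = []
    for region in sub_regions:
        if not screen(region):
            continue  # the expensive model is never invoked here
        label = confirm(region)
        if label is not None:
            results.append((region, label))
    return results
```

The saving comes from how rarely `confirm` is called: in a mostly empty scene it may not run at all for many frames.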
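In the embodiments of the present invention, an acquired image frame is divided into regions to obtain S image sub-regions, where S is an integer greater than 1; suspected target detection is performed on each of the S image sub-regions to obtain a suspected target detection result; when it is determined from the suspected target detection result that an image position satisfying a preset condition exists, a partial image region that includes the image position is intercepted from the currently acquired image frame; and target detection is performed on the partial image region. Because suspected target detection is performed on each sub-region, and target detection is then performed only on a partial image region and only when the preset condition is satisfied, the power consumption (amount of computation) of target detection can be reduced.
Referring to FIG. 3, FIG. 3 is a structural diagram of an image content detection apparatus provided by an embodiment of the present invention. As shown in FIG. 3, the image content detection apparatus 300 includes: a division module 301, configured to divide an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1; a first detection module 302, configured to perform suspected target detection on each of the S image sub-regions to obtain a suspected target detection result; an interception module 303, configured to intercept, from the currently acquired image frame, a partial image region that includes an image position when it is determined from the suspected target detection result that an image position satisfying a preset condition exists; and a second detection module 304, configured to perform target detection on the partial image region.
Optionally, the interception module includes: a calculation unit, configured to calculate the scores of the pixels in a target sub-region according to the position of the suspected target when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region; and an intercepting unit, configured to intercept, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region when the score of at least one pixel satisfies the preset condition.
Optionally, the calculation unit is configured to, when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, determine the center position of the suspected target and calculate the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
Optionally, the intercepting unit is configured to accumulate the score of each pixel into the corresponding element of a matrix, where the elements of the matrix correspond one-to-one to the pixels of the acquired image frame; and, when the cumulative score of at least one element in the matrix exceeds a preset threshold, to intercept, from the currently acquired image frame, the partial image region that includes the pixels satisfying the preset condition, where the pixels satisfying the preset condition correspond one-to-one to the elements whose cumulative scores exceed the preset threshold.
Optionally, the cumulative score of a target element at a first moment equals the sum of the following two items:
the score, at the first moment, of the pixel corresponding to the target element; and the decayed value of the cumulative score of the target element at the moment preceding the first moment; where the target element is any element in the matrix.
Optionally, the first detection module is configured to use a first network model to perform suspected target detection on each of the S image sub-regions to obtain the suspected target detection result; the second detection module is configured to use a second network model to perform target detection on the partial image region; where, when processing the same image, the amount of computation of the first network model is less than that of the second network model.
The image content detection apparatus provided by the embodiments of the present invention can implement each process of the method embodiment of FIG. 1; to avoid repetition, details are not repeated here.
It should be noted that the image content detection apparatus in the embodiments of the present invention may be a relatively independent apparatus, or a component, integrated circuit, or chip in an electronic device.
Referring to FIG. 4, FIG. 4 is a structural diagram of an electronic device provided by an embodiment of the present invention. As shown in FIG. 4, the electronic device 400 includes: a memory 401, a processor 402, and a program or instructions stored in the memory 401 and executable on the processor 402, where the program or instructions, when executed by the processor 402, implement the steps of the above image content detection method.
Referring to FIG. 5, an embodiment of the present invention further provides a readable storage medium 500 on which a program or instructions are stored. When executed by a processor, the program or instructions implement each process of the above image content detection method embodiment and can achieve the same technical effect; to avoid repetition, details are not repeated here.
It should be noted that, in this document, the terms "comprise", "include", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element. Furthermore, the scope of the methods and apparatuses in the embodiments of the present application is not limited to performing functions in the order shown or discussed; functions may also be performed substantially simultaneously or in reverse order depending on the functions involved. For example, the described methods may be performed in an order different from that described, and steps may be added, omitted, or combined. In addition, features described with reference to some examples may be combined in other examples.
From the description of the above implementations, a person skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the specific implementations described, which are merely illustrative and not restrictive. Inspired by the present application, a person of ordinary skill in the art may devise many other forms without departing from the purpose of the present application and the scope protected by the claims, all of which fall within the protection of the present application.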

Claims (14)

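  1. An image content detection method, characterized by comprising:
    dividing an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1;
    performing suspected target detection on each of the S image sub-regions to obtain a suspected target detection result;
    when it is determined from the suspected target detection result that an image position satisfying a preset condition exists, intercepting, from the currently acquired image frame, a partial image region that includes the image position;
    performing target detection on the partial image region.
  2. The method according to claim 1, characterized in that intercepting the partial image region that includes the image position from the currently acquired image frame, when it is determined from the suspected target detection result that an image position satisfying the preset condition exists, comprises:
    when the suspected target detection result indicates that an image sub-region containing a suspected target exists as a target sub-region, calculating the scores of the pixels in the target sub-region according to the position of the suspected target;
    when the score of at least one pixel satisfies the preset condition, intercepting, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region.
  3. The method according to claim 2, characterized in that calculating the scores of the pixels in the target sub-region according to the position of the suspected target, when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, comprises:
    when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, determining the center position of the suspected target and calculating the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
  4. The method according to claim 2, characterized in that intercepting the region that includes the pixels satisfying the preset condition as the partial image region, when the score of at least one pixel satisfies the preset condition, comprises:
    accumulating the score of each pixel into the corresponding element of a matrix, where the elements of the matrix correspond one-to-one to the pixels of the acquired image frame;
    when the cumulative score of at least one element in the matrix exceeds a preset threshold, intercepting, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region, where the pixels satisfying the preset condition correspond one-to-one to the elements whose cumulative scores exceed the preset threshold.
  5. The method according to claim 4, characterized in that the cumulative score of a target element at a first moment equals the sum of the following two items:
    the score, at the first moment, of the pixel corresponding to the target element;
    the decayed value of the cumulative score of the target element at the moment preceding the first moment;
    where the target element is any element in the matrix.
  6. The method according to any one of claims 1 to 5, characterized in that performing suspected target detection on each of the S image sub-regions to obtain the suspected target detection result comprises:
    using a first network model to perform suspected target detection on each of the S image sub-regions to obtain the suspected target detection result;
    and performing target detection on the partial image region comprises:
    using a second network model to perform target detection on the partial image region;
    where, when processing the same image, the amount of computation of the first network model is less than that of the second network model.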
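  7. An image content detection apparatus, characterized by comprising:
    a division module, configured to divide an acquired image frame into regions to obtain S image sub-regions, where S is an integer greater than 1;
    a first detection module, configured to perform suspected target detection on each of the S image sub-regions to obtain a suspected target detection result;
    an interception module, configured to intercept, from the currently acquired image frame, a partial image region that includes an image position when it is determined from the suspected target detection result that an image position satisfying a preset condition exists;
    a second detection module, configured to perform target detection on the partial image region.
  8. The apparatus according to claim 7, characterized in that the interception module comprises:
    a calculation unit, configured to calculate the scores of the pixels in a target sub-region according to the position of the suspected target when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region;
    an intercepting unit, configured to intercept, from the currently acquired image frame, a region that includes the pixels satisfying the preset condition as the partial image region when the score of at least one pixel satisfies the preset condition.
  9. The apparatus according to claim 8, characterized in that the calculation unit is configured to, when the suspected target detection result indicates that an image sub-region containing a suspected target exists as the target sub-region, determine the center position of the suspected target and calculate the score of each pixel in the target sub-region according to the center position, where the score of any pixel is negatively correlated with the distance between that pixel and the center position.
  10. The apparatus according to claim 8, characterized in that the intercepting unit is configured to accumulate the score of each pixel into the corresponding element of a matrix, where the elements of the matrix correspond one-to-one to the pixels of the acquired image frame; and, when the cumulative score of at least one element in the matrix exceeds a preset threshold, to intercept, from the currently acquired image frame, the partial image region that includes the pixels satisfying the preset condition, where the pixels satisfying the preset condition correspond one-to-one to the elements whose cumulative scores exceed the preset threshold.
  11. The apparatus according to claim 10, characterized in that the cumulative score of a target element at a first moment equals the sum of the following two items:
    the score, at the first moment, of the pixel corresponding to the target element;
    the decayed value of the cumulative score of the target element at the moment preceding the first moment;
    where the target element is any element in the matrix.
  12. The apparatus according to any one of claims 7 to 11, characterized in that the first detection module is configured to use a first network model to perform suspected target detection on each of the S image sub-regions to obtain the suspected target detection result;
    the second detection module is configured to use a second network model to perform target detection on the partial image region;
    where, when processing the same image, the amount of computation of the first network model is less than that of the second network model.
  13. An electronic device, characterized by comprising: a memory, a processor, and a program or instructions stored in the memory and executable on the processor, where the program or instructions, when executed by the processor, implement the steps of the image content detection method according to any one of claims 1 to 6.
  14. A readable storage medium, characterized in that a program or instructions are stored on the readable storage medium, and the program or instructions, when executed by a processor, implement the steps of the image content detection method according to any one of claims 1 to 6.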
PCT/CN2021/116099 2020-09-04 2021-09-02 图像内容检测方法、装置、电子设备和可读存储介质 WO2022048578A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010922688.4A CN112070083A (zh) 2020-09-04 2020-09-04 图像内容检测方法、装置、电子设备和存储介质
CN202010922688.4 2020-09-04

Publications (1)

Publication Number Publication Date
WO2022048578A1 true WO2022048578A1 (zh) 2022-03-10

Family

ID=73666067

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/116099 WO2022048578A1 (zh) 2020-09-04 2021-09-02 图像内容检测方法、装置、电子设备和可读存储介质

Country Status (2)

Country Link
CN (1) CN112070083A (zh)
WO (1) WO2022048578A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082836A (zh) * 2022-07-23 2022-09-20 深圳神目信息技术有限公司 一种行为识别辅助的目标物体检测方法及装置

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112070083A (zh) * 2020-09-04 2020-12-11 北京灵汐科技有限公司 图像内容检测方法、装置、电子设备和存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663348A (zh) * 2012-03-21 2012-09-12 中国人民解放军国防科学技术大学 一种光学遥感图像中的海上舰船检测方法
CN111126252A (zh) * 2019-12-20 2020-05-08 浙江大华技术股份有限公司 摆摊行为检测方法以及相关装置
CN111402191A (zh) * 2018-12-28 2020-07-10 阿里巴巴集团控股有限公司 一种目标检测方法、装置、计算设备及介质
CN111489342A (zh) * 2020-04-09 2020-08-04 西安星舟天启智能装备有限责任公司 一种基于视频的火焰检测方法、系统及可读存储介质
CN112070083A (zh) * 2020-09-04 2020-12-11 北京灵汐科技有限公司 图像内容检测方法、装置、电子设备和存储介质

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663348A (zh) * 2012-03-21 2012-09-12 中国人民解放军国防科学技术大学 一种光学遥感图像中的海上舰船检测方法
CN111402191A (zh) * 2018-12-28 2020-07-10 阿里巴巴集团控股有限公司 一种目标检测方法、装置、计算设备及介质
CN111126252A (zh) * 2019-12-20 2020-05-08 浙江大华技术股份有限公司 摆摊行为检测方法以及相关装置
CN111489342A (zh) * 2020-04-09 2020-08-04 西安星舟天启智能装备有限责任公司 一种基于视频的火焰检测方法、系统及可读存储介质
CN112070083A (zh) * 2020-09-04 2020-12-11 北京灵汐科技有限公司 图像内容检测方法、装置、电子设备和存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082836A (zh) * 2022-07-23 2022-09-20 深圳神目信息技术有限公司 一种行为识别辅助的目标物体检测方法及装置
CN115082836B (zh) * 2022-07-23 2022-11-11 深圳神目信息技术有限公司 一种行为识别辅助的目标物体检测方法及装置

Also Published As

Publication number Publication date
CN112070083A (zh) 2020-12-11

Similar Documents

Publication Publication Date Title
US10846867B2 (en) Apparatus, method and image processing device for smoke detection in image
CN108960163B (zh) 手势识别方法、装置、设备和存储介质
CN110598558B (zh) 人群密度估计方法、装置、电子设备及介质
CN109165589B (zh) 基于深度学习的车辆重识别方法和装置
CN109726658B (zh) 人群计数及定位方法、系统、电子终端及存储介质
WO2022048578A1 (zh) 图像内容检测方法、装置、电子设备和可读存储介质
CN108875723B (zh) 对象检测方法、装置和系统及存储介质
CN108961318B (zh) 一种数据处理方法及计算设备
CN109145771B (zh) 一种人脸抓拍方法及装置
US10922535B2 (en) Method and device for identifying wrist, method for identifying gesture, electronic equipment and computer-readable storage medium
WO2019033575A1 (zh) 电子装置、人脸追踪的方法、系统及存储介质
CN108647587B (zh) 人数统计方法、装置、终端及存储介质
KR20140028809A (ko) 이미지 피라미드의 적응적 이미지 처리 장치 및 방법
CN112509003A (zh) 解决目标跟踪框漂移的方法及系统
CN109087347B (zh) 一种图像处理方法及装置
JP7115579B2 (ja) 情報処理装置、情報処理方法、及びプログラム
CN111062415B (zh) 基于对比差异的目标对象图像提取方法、系统及存储介质
CN113298852A (zh) 目标跟踪方法、装置、电子设备及计算机可读存储介质
US20230368397A1 (en) Method and system for detecting moving object
CN113129298A (zh) 文本图像的清晰度识别方法
WO2019242388A1 (zh) 一种基于深度图像的图书馆机器人障碍识别方法
CN108764206B (zh) 目标图像识别方法和系统、计算机设备
CN116128922A (zh) 基于事件相机的物体掉落检测方法、装置、介质及设备
CN113723375B (zh) 一种基于特征抽取的双帧人脸跟踪方法和系统
CN113205079B (zh) 一种人脸检测方法、装置、电子设备及存储介质

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC. EPO FORM 1205A DATED 07-07-2023-

122 Ep: pct application non-entry in european phase

Ref document number: 21863642

Country of ref document: EP

Kind code of ref document: A1