WO2023000856A1 - Abnormal event detection method and apparatus, electronic device, storage medium and computer program product - Google Patents

Abnormal event detection method and apparatus, electronic device, storage medium and computer program product

Info

Publication number
WO2023000856A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
frame
area
sequence
detection
Prior art date
Application number
PCT/CN2022/097584
Other languages
English (en)
French (fr)
Inventor
苏海昇
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2023000856A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/40 - Extraction of image or video features
    • G06V10/50 - Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53 - Recognition of crowd images, e.g. recognition of crowd congestion
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/103 - Static body considered as a whole, e.g. static pedestrian or occupant recognition

Definitions

  • The present disclosure relates to the technical field of computer vision, and in particular to an abnormal event detection method and apparatus, an electronic device, a storage medium, and a computer program product.
  • Abnormal event detection in video is an important problem in the field of computer vision.
  • Computer vision and deep learning techniques can be used to automatically detect abnormal events occurring in video.
  • Traditional abnormal event detection methods either apply full-image data enhancement or other preprocessing to the input video sequence and then feed it into a classification model for detection, or roughly locate the center point of an event from the dense distribution of the crowd; both approaches are inefficient and inaccurate.
  • Embodiments of the present disclosure are expected to provide an abnormal event detection method and apparatus, an electronic device, and a storage medium.
  • An embodiment of the present disclosure provides an abnormal event detection method. The method includes: acquiring an image sequence and determining at least one image region set from the image sequence, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; for each image region set, selecting a central region based on the spatial distribution of the image regions in the set; for each central region, extracting the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and performing abnormal event detection with each trajectory sequence separately.
  • In the above method, determining at least one image region set from the image sequence includes: for each frame of the image sequence, determining at least one image region in descending order of pedestrian density to obtain an image region group corresponding to that frame; and assigning, across the image region groups of the frames, the image regions with the same pedestrian-density rank to the same set, to obtain the at least one image region set.
  • In the above method, determining at least one image region for each frame of the image sequence in descending order of pedestrian density to obtain the image region group corresponding to that frame includes: for each frame, acquiring every pedestrian detection box corresponding to the frame and expanding each one into an extended detection box, so as to form an extended detection box group corresponding to the frame; for each extended detection box group, determining the overlap ratios between its different extended detection boxes and constructing, from the obtained overlap ratios, a first adjacency matrix corresponding to the group; for each extended detection box group, determining, based on its first adjacency matrix, the number of matches of each extended detection box in the group and selecting at least one extended detection box in descending order of match count, to obtain a group of center detection boxes corresponding to the frame; and, for each frame of the image sequence, determining an image region group corresponding to the frame based on its group of center detection boxes.
  • In the above method, acquiring every pedestrian detection box corresponding to a frame and expanding each one into an extended detection box to form the extended detection box group corresponding to the frame includes: acquiring at least one pedestrian detection box from the frame; in the frame, taking each of the at least one pedestrian detection box as a center and expanding it outward by a first preset ratio to form an extended detection box; and forming the extended detection box group corresponding to the frame from the extended detection boxes obtained by the expansion.
  • In the above method, determining the image region group corresponding to a frame based on the group of center detection boxes corresponding to the frame includes: acquiring, from the frame, at least one pedestrian detection box corresponding to the at least one extended detection box that makes up the group of center detection boxes; in the frame, taking each such pedestrian detection box as a center and expanding it outward by a second preset ratio to form an image region; and forming the image region group corresponding to the frame from the image regions obtained by the expansion.
  • In the above method, selecting a central region based on the spatial distribution of the image regions in an image region set includes: determining the overlap ratios between different image regions in the set and constructing a corresponding second adjacency matrix from the obtained overlap ratios; determining, based on the second adjacency matrix, the number of matches of each image region in the set; and selecting the image region with the largest number of matches as the central region corresponding to the set.
  • In the above method, performing abnormal event detection with each trajectory sequence separately includes: performing abnormal event detection on each trajectory sequence with a preset abnormal event detection model to obtain a corresponding abnormal event detection result.
  • An embodiment of the present disclosure provides an abnormal event detection apparatus. The apparatus includes: a region determination part configured to acquire an image sequence and determine at least one image region set from the image sequence, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; a region selection part configured to, for each image region set, select a central region based on the spatial distribution of the image regions in the set; a sequence determination part configured to, for each central region, extract the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and an anomaly detection part configured to perform abnormal event detection with each trajectory sequence separately.
  • In the above apparatus, the region determination part is further configured to: for each frame of the image sequence, determine at least one image region in descending order of pedestrian density to obtain an image region group corresponding to that frame; and assign, across the image region groups of the frames, the image regions with the same pedestrian-density rank to the same set, to obtain the at least one image region set.
  • In the above apparatus, the region determination part is further configured to: for each frame of the image sequence, acquire every pedestrian detection box corresponding to the frame and expand each one into an extended detection box to form an extended detection box group corresponding to the frame; for each extended detection box group, determine the overlap ratios between its different extended detection boxes and construct, from the obtained overlap ratios, a first adjacency matrix corresponding to the group; for each extended detection box group, determine, based on its first adjacency matrix, the number of matches of each extended detection box in the group and select at least one extended detection box in descending order of match count, to obtain a group of center detection boxes corresponding to the frame; and, for each frame of the image sequence, determine an image region group corresponding to the frame based on its group of center detection boxes.
  • In the above apparatus, the region determination part is further configured to: acquire at least one pedestrian detection box from the frame; in the frame, take each of the at least one pedestrian detection box as a center and expand it outward by a first preset ratio to form an extended detection box; and form the extended detection box group corresponding to the frame from the extended detection boxes obtained by the expansion.
  • In the above apparatus, the region determination part is further configured to: acquire, from the frame, at least one pedestrian detection box corresponding to the at least one extended detection box that makes up the corresponding group of center detection boxes; in the frame, take each such pedestrian detection box as a center and expand it outward by a second preset ratio to form an image region; and form the image region group corresponding to the frame from the image regions obtained by the expansion.
  • In the above apparatus, the region selection part is further configured to: determine the overlap ratios between different image regions in the image region set and construct a corresponding second adjacency matrix from the obtained overlap ratios; determine, based on the second adjacency matrix, the number of matches of each image region in the set; and select the image region with the largest number of matches as the central region corresponding to the set.
  • In the above apparatus, the anomaly detection part is further configured to perform abnormal event detection on each trajectory sequence with a preset abnormal event detection model to obtain a corresponding abnormal event detection result.
  • An embodiment of the present disclosure provides an electronic device. The electronic device includes a processor, a memory, and a communication bus, where the communication bus is configured to enable connection and communication between the processor and the memory, and the processor is configured to implement the above abnormal event detection method when executing one or more programs stored in the memory.
  • An embodiment of the present disclosure provides a computer-readable storage medium storing one or more programs which, when executed by one or more processors, implement the above abnormal event detection method.
  • An embodiment of the present disclosure provides a computer program product. The computer program product includes a non-transitory computer storage medium storing a computer program which, when read and executed by a computer, implements the above abnormal event detection method.
  • An embodiment of the present disclosure provides an abnormal event detection method and apparatus, an electronic device, a storage medium, and a computer program product.
  • The method includes: acquiring an image sequence and determining at least one image region set from it, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; selecting a central region based on the spatial distribution of the image regions in each image region set; for each central region, extracting the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and performing abnormal event detection with each trajectory sequence separately.
  • The technical solution provided by the embodiments of the present disclosure not only reduces the amount of data processed for abnormal event detection and improves detection efficiency, but also provides an effective receptive field for detection, improving detection accuracy.
  • FIG. 1 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure.
  • FIG. 2 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure.
  • FIG. 3 is a schematic flowchart of determining an image region group provided by an embodiment of the present disclosure.
  • FIG. 4 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure.
  • FIG. 5 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure.
  • FIG. 6 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of an exemplary trajectory sequence provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure.
  • FIG. 9 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure.
  • FIG. 10 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure.
  • FIG. 11 is a schematic structural diagram of an abnormal event detection apparatus provided by an embodiment of the present disclosure.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure.
  • An embodiment of the present disclosure provides an abnormal event detection method, which may be executed by an abnormal event detection apparatus.
  • The abnormal event detection method may be executed by a terminal device, a server, or another electronic device, where the terminal device may be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like.
  • In some implementations, the abnormal event detection method may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • FIG. 1 is a schematic flowchart of a method for detecting an abnormal event provided by an embodiment of the present disclosure.
  • As shown in FIG. 1, the abnormal event detection method mainly includes the following steps: acquiring an image sequence and determining at least one image region set from it; selecting a central region for each image region set; extracting a trajectory sequence for each central region; and performing abnormal event detection with each trajectory sequence.
  • Each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs.
  • In the embodiment of the present disclosure, the abnormal event detection apparatus may first acquire an image sequence, so as to determine at least one image region set from it.
  • The abnormal event detection apparatus may include an image acquisition device, so that continuous frames of a scene are collected by the image acquisition device to obtain the image sequence.
  • The image sequence may also be collected by an independent camera device, for example a surveillance camera, and then transmitted to the abnormal event detection apparatus.
  • The manner of acquiring the image sequence, and the image sequence itself, may be determined according to actual requirements and application scenarios, and are not limited in the embodiments of the present disclosure.
  • In each determined image region set, one image region is included from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs.
  • FIG. 2 is a schematic flowchart of a method for detecting an abnormal event provided by an embodiment of the present disclosure.
  • S101 may be implemented by the abnormal event detection apparatus through S201 and S202, which are described with reference to the steps shown in FIG. 2.
  • In S201, for each frame of the image sequence, at least one image region is determined in descending order of pedestrian density to obtain an image region group corresponding to that frame; in S202, across the image region groups of the frames, the image regions with the same pedestrian-density rank are assigned to the same set to obtain the at least one image region set.
  • The image sequence includes multiple frames arranged in time order, and for each frame the abnormal event detection apparatus may determine at least one image region in descending order of pedestrian density, treat these image regions as the image region group corresponding to that frame, and thereby finally determine one image region group for every frame.
  • The abnormal event detection apparatus then assigns, across the image region groups, the image regions with the same pedestrian-density rank to the same set: from the image region groups, the image region ranked first by pedestrian density in each group is selected and placed in one set, and so on, until the image region ranked last in each group is selected and placed in one set; since every image region group includes at least one image region, the number of resulting image region sets equals the number of image regions in each group.
  • At least one image region in each frame is thus determined according to pedestrian density to obtain the image region group corresponding to the frame, and the image regions with the same pedestrian-density rank across the image region groups then form at least one image region set; the number of image regions in each set equals the number of frames in the image sequence, and grouping the regions of the same rank into one set allows a single image region set to represent one event in the image sequence, as sketched below.
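  • A minimal, non-authoritative Python sketch of this grouping step is given below; the (x1, y1, x2, y2) box representation and the function name are illustrative assumptions rather than anything specified by the disclosure.

```python
# Hedged sketch: collect the i-th density-ranked region of every frame into set i.
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) layout

def group_into_region_sets(region_groups: List[List[Box]]) -> List[List[Box]]:
    """region_groups[t] holds the regions of frame t, already ordered from
    most to least pedestrian-dense. Set i gathers the i-th ranked region of
    every frame, so each set's length equals the number of frames."""
    num_sets = min(len(group) for group in region_groups)  # one set per shared rank
    return [[group[i] for group in region_groups] for i in range(num_sets)]

# Example: 3 frames with 2 ranked regions each -> 2 sets of 3 regions.
groups = [
    [(10, 10, 50, 50), (60, 10, 90, 40)],   # frame 0
    [(12, 11, 52, 52), (61, 12, 91, 42)],   # frame 1
    [(11, 10, 51, 51), (62, 11, 92, 41)],   # frame 2
]
region_sets = group_into_region_sets(groups)
assert len(region_sets) == 2 and len(region_sets[0]) == 3
```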
  • FIG. 3 is a schematic flowchart of determining an image region group provided by an embodiment of the present disclosure.
  • S201 may be implemented through S301 to S304, which are described with reference to the steps shown in FIG. 3.
  • The abnormal event detection apparatus may perform pedestrian detection on each frame of the image sequence, generate a pedestrian detection box for every detected pedestrian, and then enlarge each pedestrian detection box; each enlarged pedestrian detection box is an extended detection box, and all the extended detection boxes in a frame form the extended detection box group corresponding to that frame. Since the image sequence includes multiple frames, the abnormal event detection apparatus determines an extended detection box group for every frame in this way.
  • FIG. 4 is a schematic flowchart of a method for detecting an abnormal event provided by an embodiment of the present disclosure.
  • S301 may be implemented by the abnormal event detection apparatus through S401 to S403, which are described with reference to the steps shown in FIG. 4.
  • For any frame of the image sequence, namely the i-th frame (i is a natural number greater than or equal to 1), the abnormal event detection apparatus may acquire at least one pedestrian detection box from it and, in the i-th frame, expand outward by a certain ratio with each obtained pedestrian detection box as the center.
  • The resulting extended detection box includes not only the pedestrian inside the pedestrian detection box but also the surrounding image information, such as people or objects around the pedestrian, which makes it easier to evaluate information such as crowd density at the corresponding position and makes the subsequent determination of image regions more accurate.
  • As to the expansion manner, the abnormal event detection apparatus may also directly expand each pedestrian detection box to a preset size, so that every extended detection box has the same preset size, or it may apply different expansion manners based on the positions of different pedestrian detection boxes within the frame; this is not limited in the embodiments of the present disclosure.
  • The first preset ratio may be set according to actual requirements and application scenarios, which is not limited in the embodiments of the present disclosure.
  • For example, the first preset ratio may be 1.5, that is, each detection box is enlarged 1.5 times outward in the image with the pedestrian detection box as the center, and each enlarged pedestrian detection box is an extended detection box.
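  • A hedged sketch of this outward expansion follows; the box format and the clamping to image bounds are assumptions, and 1.5 is only the example ratio mentioned above.

```python
# Hedged sketch: expand a pedestrian detection box about its center by a preset
# ratio (e.g. 1.5x for extended detection boxes, 2x for image regions), clamped
# to the image bounds. The (x1, y1, x2, y2) box format is an assumption.
def expand_box(box, ratio, img_w, img_h):
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w, half_h = (x2 - x1) * ratio / 2.0, (y2 - y1) * ratio / 2.0
    return (max(0.0, cx - half_w), max(0.0, cy - half_h),
            min(float(img_w), cx + half_w), min(float(img_h), cy + half_h))

# 1.5x expansion of a 40x60 box inside a 1920x1080 frame.
print(expand_box((100, 100, 140, 160), 1.5, 1920, 1080))  # (90.0, 85.0, 150.0, 175.0)
```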
  • For each extended detection box group, the overlap ratio between its different extended detection boxes is determined, and a first adjacency matrix corresponding to the group is constructed from the obtained overlap ratios.
  • After obtaining the extended detection box group of a frame, the abnormal event detection apparatus may, for each extended detection box group, determine the overlap ratios between its different extended detection boxes and construct the first adjacency matrix corresponding to that group from the obtained overlap ratios.
  • The abnormal event detection apparatus may compute the overlap ratio between every extended detection box in a group and the other extended detection boxes in the group, and thereby construct the corresponding first adjacency matrix; the first adjacency matrix in fact records the pairwise overlap between the extended detection boxes in the group.
  • Constructing the first adjacency matrix from the obtained overlap ratios between different extended detection boxes amounts to using each overlap ratio as an element of the first adjacency matrix.
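  • The disclosure does not fix how the overlap ratio is computed; the sketch below assumes intersection-over-union and builds the N x N first adjacency matrix for one frame's extended detection boxes.

```python
# Hedged sketch: pairwise overlap ratio (IoU here, one plausible choice) between
# the extended boxes of one frame, stored as an N x N adjacency matrix whose
# (i, j) entry is the overlap of boxes i and j; the diagonal is left at zero.
import numpy as np

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter > 0 else 0.0

def build_adjacency(boxes):
    n = len(boxes)
    adj = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for j in range(i + 1, n):
            adj[i, j] = adj[j, i] = iou(boxes[i], boxes[j])
    return adj

boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
print(build_adjacency(boxes))  # only the first two boxes overlap
```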
  • After the abnormal event detection apparatus determines the first adjacency matrix corresponding to each extended detection box group, it may, based on that first adjacency matrix, determine the number of matches of each extended detection box in the group and select at least one extended detection box in descending order of match count to form a group of center detection boxes.
  • Because the first adjacency matrix of a group contains the overlap ratios between its different extended detection boxes, the abnormal event detection apparatus can read directly from it whether each extended detection box overlaps the other extended detection boxes in the group: when the recorded overlap ratio between an extended detection box and another extended detection box in the group is 0, there is no overlap between the two; when the overlap ratio is not 0, the extended detection box overlaps that other extended detection box.
  • Each case in which one extended detection box overlaps another is counted as one match, so that the number of matches of the extended detection box can be determined.
  • The abnormal event detection apparatus may then select extended detection boxes in descending order of match count, and the at least one selected extended detection box forms the group of center detection boxes of the frame.
  • Since every extended detection box is obtained by expanding a pedestrian detection box and therefore contains a pedestrian, an extended detection box with more matches, that is, one that overlaps more of the other extended detection boxes, has more pedestrians distributed around it, and its location is in fact the center of a pedestrian-dense area.
  • By selecting extended detection boxes from the group in this way, the abnormal event detection apparatus can accurately determine the regions with relatively high pedestrian density in the frame to which the group belongs.
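  • One plausible realization of the match counting and center-box selection is sketched below; treating any non-zero adjacency entry as a match and breaking ties by area follow the description here, while K and the helper names are assumptions.

```python
# Hedged sketch: a box "matches" every other box it overlaps (non-zero adjacency
# entry, zero diagonal assumed); the K boxes with the most matches become the
# frame's center detection boxes, with ties broken by larger box area.
import numpy as np

def select_center_boxes(boxes, adj, k):
    match_counts = (adj > 0).sum(axis=1)                        # matches per box
    areas = np.array([(b[2] - b[0]) * (b[3] - b[1]) for b in boxes])
    order = sorted(range(len(boxes)),
                   key=lambda i: (match_counts[i], areas[i]),   # count, then area
                   reverse=True)
    keep = order[:k]
    return [boxes[i] for i in keep], keep
```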
  • Within an extended detection box group, some extended detection boxes may have the same match count; when selecting in descending order of match count, the abnormal event detection apparatus may break such ties using other information, for example by obtaining the area of each tied extended detection box and selecting the one with the larger area.
  • The abnormal event detection apparatus may determine an image region group for each frame of the image sequence based on the corresponding group of center detection boxes.
  • Each group of center detection boxes in fact consists of at least one extended detection box from one extended detection box group, and each extended detection box group belongs to one frame of the image sequence; therefore a group of center detection boxes also belongs to one frame of the image sequence.
  • An extended detection box group is obtained by expansion, a group of center detection boxes in a frame is determined from the overlap ratios between every two extended detection boxes in the group, and an image region group corresponding to each frame is obtained on that basis; in this way the most pedestrian-dense area of each frame, that is, the area where an abnormal event is most likely to occur, can be determined, enabling the subsequent abnormal event detection.
  • FIG. 5 is a schematic flowchart of a method for detecting an abnormal event provided by an embodiment of the present disclosure.
  • S304 may be implemented by the abnormal event detection apparatus through S501 to S503, which are described with reference to the steps shown in FIG. 5.
  • Since each extended detection box is obtained by expanding a pedestrian detection box, the abnormal event detection apparatus can directly obtain the corresponding at least one pedestrian detection box from the k-th frame, and then expand outward by a certain ratio with each pedestrian detection box as the center.
  • The resulting image region includes not only the pedestrian inside the pedestrian detection box but also the image information around the pedestrian, for example, the people or objects around the pedestrian.
  • The abnormal event detection apparatus may also directly expand each pedestrian detection box to a preset size, so that every resulting image region has the same preset size, or it may apply different expansion manners based on the positions of different pedestrian detection boxes within the frame; this is not limited in the embodiments of the present disclosure.
  • The second preset ratio may be set according to actual requirements and application scenarios, which is not limited in the embodiments of the present disclosure.
  • For example, the second preset ratio may be 2, that is, the detection box is enlarged 2 times outward in the image with the pedestrian detection box as the center, and the enlarged pedestrian detection box is the image region.
  • The second preset ratio is greater than the first preset ratio, so that when the image region is determined it contains more information, and in particular more pedestrians, than the extended detection box, which provides an effective receptive field for the subsequent abnormal event detection.
  • The abnormal event detection apparatus may select a central region for each image region set based on the spatial distribution of the image regions in the set.
  • FIG. 6 is a schematic flowchart of a method for detecting an abnormal event provided by an embodiment of the present disclosure.
  • S102 may be implemented by the abnormal event detection apparatus through S601 to S603, which are described with reference to the steps shown in FIG. 6.
  • For each image region set, the abnormal event detection apparatus may determine the overlap ratios between the different image regions in the set and construct the corresponding second adjacency matrix; the second adjacency matrix characterizes the spatial distribution of the image regions in the set.
  • To improve the consistency of the center point, the abnormal event detection apparatus selects from the image region set the image region with the densest pedestrians, that is, the image region with the largest match count, as the central region, and finally selects at least one central region from the at least one image region set.
  • The image region with the densest pedestrians determined in each image region set is taken as the central region, that is, the area in the image sequence where an abnormal event is most likely to occur, which provides a more accurate detection area for the subsequent abnormal event detection.
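  • A sketch of the central-region selection for one image region set (one region per frame over T frames) follows; it reuses an IoU-style overlap, which is an assumption, and redefines the helper so the snippet stands alone.

```python
# Hedged sketch: build the T x T overlap adjacency matrix of one image region
# set and keep the region that overlaps the most other regions as the set's
# central region.
import numpy as np

def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if inter > 0 else 0.0

def select_central_region(region_set):
    t = len(region_set)
    adj = np.array([[iou(region_set[i], region_set[j]) if i != j else 0.0
                     for j in range(t)] for i in range(t)])
    match_counts = (adj > 0).sum(axis=1)
    best = int(match_counts.argmax())      # region overlapping the most others
    return region_set[best], best
```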
  • For each central region, the abnormal event detection apparatus may extract the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence.
  • The abnormal event detection apparatus may adjust the scale of these image regions and then stitch them into a trajectory sequence in time order.
  • FIG. 7 is a schematic diagram of an exemplary trajectory sequence provided by an embodiment of the present disclosure.
  • In FIG. 7, a trajectory sequence may include 5 image regions, which are the regions at the same position selected from 5 frames.
  • The benchmark for extracting the image regions is the central region; for example, it may be the area shown in the second image region, from which the five image regions making up the illustrated trajectory sequence are obtained.
  • When extracting image regions based on a central region, the abnormal event detection apparatus may take the position of the central region's center point as the center in every frame and extract an image region of a preset size; if that center lies near an image edge so that a full-size region cannot be cut, the missing part may be filled with black.
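  • A minimal sketch of this fixed-size cropping with black fill follows; the HWC image layout, the patch-size parameter, and the function name are assumptions.

```python
# Hedged sketch: cut a fixed-size patch centered on the central region's center
# from every frame; any part falling outside the image is filled with black.
import numpy as np

def crop_with_black_fill(frame, center_xy, size):
    h, w = frame.shape[:2]
    half = size // 2
    cx, cy = int(round(center_xy[0])), int(round(center_xy[1]))
    patch = np.zeros((size, size, frame.shape[2]), dtype=frame.dtype)
    x1, y1 = max(0, cx - half), max(0, cy - half)
    x2, y2 = min(w, cx - half + size), min(h, cy - half + size)
    px1, py1 = x1 - (cx - half), y1 - (cy - half)   # offset inside the black patch
    patch[py1:py1 + (y2 - y1), px1:px1 + (x2 - x1)] = frame[y1:y2, x1:x2]
    return patch

# A trajectory sequence is then the per-frame patches stacked in time order.
```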
  • If the center point were re-determined in every frame, the center points of different frames could be inconsistent, the covered abnormal event detection range would be too large and would easily contain a large amount of invalid information, and the detection efficiency would be low.
  • Instead, the abnormal event detection apparatus directly determines one central region from a group of image regions and then uniformly extracts the image region at that central position from every frame, so the center points of the resulting trajectory sequence are consistent; this improves the representativeness and effectiveness of the cropped area, makes its subsequent use for abnormal event detection more reasonable, and yields higher detection efficiency.
  • After obtaining the trajectory sequences, the abnormal event detection apparatus may perform abnormal event detection with each trajectory sequence separately.
  • FIG. 8 is a schematic flowchart of a method for detecting an abnormal event provided by an embodiment of the present disclosure.
  • S104 may be implemented by the abnormal event detection apparatus through S801, which is described with reference to the steps shown in FIG. 8.
  • the preset abnormal event detection model is configured to perform abnormal event detection.
  • the preset abnormal event detection model may be a specific image classification network, etc., which is not limited in this embodiment of the present disclosure.
  • The abnormal event detection apparatus may use the preset abnormal event detection model to perform detection on each trajectory sequence, so as to determine whether an abnormal event has occurred in the real-scene area corresponding to the trajectory sequence.
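  • The preset abnormal event detection model is not specified beyond being an image classification network; the sketch below shows one hypothetical PyTorch classifier that pools per-patch features over the trajectory, purely as an illustration and not the model defined by the disclosure.

```python
# Hedged sketch: a tiny stand-in backbone applied to each patch of a trajectory
# sequence, followed by temporal average pooling and a normal/abnormal head.
import torch
import torch.nn as nn

class TrajectoryClassifier(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.backbone = nn.Sequential(             # illustrative, not the patent's model
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, num_classes)     # normal vs abnormal

    def forward(self, traj):                       # traj: (B, T, 3, H, W)
        b, t = traj.shape[:2]
        feats = self.backbone(traj.flatten(0, 1))  # per-frame features
        feats = feats.view(b, t, -1).mean(dim=1)   # temporal average pooling
        return self.head(feats)

logits = TrajectoryClassifier()(torch.randn(1, 5, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 2])
```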
  • The abnormal event detection apparatus repeats the trajectory-sequence generation process at least once, generates at least one trajectory sequence corresponding to different regions, and performs abnormal event detection on each of them, which improves the recall of image regions and, at the same time, the model's effective receptive field for the event.
  • The abnormal event detection apparatus may also input multiple trajectory sequences into the preset abnormal event detection model at the same time, in which case the preset abnormal event detection model can combine the multiple trajectory sequences for abnormal event detection and analyze one corresponding abnormal event.
  • The following describes the application of the abnormal event detection method provided by the embodiments of the present disclosure in an actual scenario, taking abnormal event detection on acquired city street videos as an example.
  • Anomaly detection in video is an important issue in the field of computer vision.
  • Computer vision and deep learning techniques can be used to automatically detect abnormal events in videos, such as detecting abnormal behavior, traffic accidents, and some uncommon events.
  • For machines, the stronger the visual features, the better the expected anomaly-detection performance.
  • Related behavior recognition methods usually perform data enhancement or other preprocessing on the input video sequence and then feed it into a classification model for prediction.
  • this method is only suitable for human-centric video action recognition.
  • The video captured by a surveillance camera often contains more information and covers a larger field of view.
  • The location of the target event and the scale of the human body are also random, so simply using the entire image as the model input is clearly unreasonable. Roughly locating the event center point from the dense distribution of the crowd is a feasible solution, but the center points determined in different frames may be inconsistent, so the finally merged area covers too large a region.
  • FIG. 9 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure. In the embodiment of the present disclosure, the abnormal event detection method will be described with reference to FIG. 9 .
  • The pedestrian detection boxes in the video images (corresponding to the image sequence in the above embodiments) are acquired by calling an upstream detection component.
  • The overlap ratio between every two pedestrian detection boxes in each frame is calculated, and the match count of each pedestrian detection box is counted.
  • The pedestrian detection boxes with the K largest match counts are taken as centers (corresponding to a group of center detection boxes in the above embodiments) and adaptively expanded in proportion (corresponding to the second preset ratio in the above embodiments), giving K candidate regions for each frame (corresponding to an image region group in the above embodiments).
  • a first candidate area 9021 , a second candidate area 9022 and a third candidate area 9023 are shown in FIG. 9 .
  • The i-th (i is a natural number greater than or equal to 1) candidate regions of the T frames are formed into a set (corresponding to the image region set in the above embodiments); the intersection-over-union ratios between every two candidate regions are calculated to obtain a T x T adjacency matrix (corresponding to the second adjacency matrix in the above embodiments), and the candidate region with the largest number of overlaps over the time series is determined as the final cropping region (corresponding to the central region in the above embodiments); the T frames are each cropped to obtain the corresponding trajectory sequence, which is given Patch ID i and is input into the image classification model instead of the entire image to increase the effective perception area.
  • The trajectory sequence corresponding to each set of candidate regions (corresponding to the image region set in the above embodiments) is input into the image classification network for recognition, from which it can be determined whether an abnormal event exists in the real-scene area corresponding to that trajectory sequence.
  • The abnormal event result corresponding to the trajectory sequence is finally output, that is, it is output whether an abnormal event has or has not occurred in the real-scene area corresponding to the trajectory sequence.
  • Inputting the trajectory sequences corresponding to the candidate-region sets of the frames into the image classification network separately is conducive to high recall and also improves the model's effective perception range for abnormal events, and since every trajectory sequence follows the same center-point consistency construction rule, the overly large cropping coverage caused by disturbance is reduced.
  • FIG. 10 is a schematic flowchart of an abnormal event detection method provided by an embodiment of the present disclosure. As shown in FIG. 10, abnormal events in a video can be detected based on the center-point consistency constraint, which is described with reference to S1001 to S1008.
  • The full-image video sequence captured by the surveillance camera is obtained, and the upstream structured detection component is called to extract the pedestrian detection boxes in the video.
  • All pedestrian detection boxes of the current frame are expanded by m times (1.5 times by default), and the expanded pedestrian detection boxes are then sorted by area in descending order.
  • An adjacency matrix is constructed from the intersection-over-union ratios between the expanded pedestrian detection boxes, and the K expanded pedestrian detection boxes with the largest match counts determined from the adjacency matrix are used as the center pedestrian detection boxes.
  • The expanded rectangular boxes of the current frame determined in this way are used as candidate regions of the current frame.
  • The i-th candidate regions of the T frames are formed into a candidate-region set; the intersection-over-union ratio between every two candidate regions is determined to obtain a T x T adjacency matrix, which represents the spatial distribution of the candidate regions.
  • The candidate region at the center of the distribution (that is, the one with the most matches) is taken as the i-th candidate region shared by the T frames and is determined as the cropping region of the i-th pedestrian trajectory.
  • The T frames are cropped separately to obtain the corresponding trajectory sequence, which is given Patch ID i and replaces the whole-image input to increase the effective perception area.
  • the corresponding local areas are selected on the original image sequence to form the corresponding K trajectory sequences, which are respectively input into the image classification network for behavior recognition.
  • The local-area extraction process is as follows: taking the cropping region of the K-th trajectory sequence determined in S1007 above as an example, the long side is scaled to 224 (resolution) and the short side is scaled proportionally, and black borders are padded on the short side so that the final image size is 224 x 224. If the input video frame sequence has no pedestrian detection box results, a 224 x 224 area is uniformly selected around the center point of the image, and the video frame sequence then has only one trajectory.
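  • A sketch of this patch normalization follows; OpenCV is an assumed dependency, the 3-channel layout is an assumption, and centering the resized patch on the black canvas is an implementation choice not dictated by the description.

```python
# Hedged sketch: scale the long side to 224, scale the short side by the same
# factor, then pad with black to 224 x 224; plus a center-crop fallback for
# frames without any pedestrian detections.
import cv2
import numpy as np

def to_224(patch: np.ndarray) -> np.ndarray:
    h, w = patch.shape[:2]
    scale = 224.0 / max(h, w)
    resized = cv2.resize(patch, (max(1, int(round(w * scale))),
                                 max(1, int(round(h * scale)))))
    canvas = np.zeros((224, 224, 3), dtype=patch.dtype)   # black borders
    y0 = (224 - resized.shape[0]) // 2
    x0 = (224 - resized.shape[1]) // 2
    canvas[y0:y0 + resized.shape[0], x0:x0 + resized.shape[1]] = resized
    return canvas

def center_fallback(frame: np.ndarray) -> np.ndarray:
    """No pedestrian detections: take a 224x224 crop around the image center,
    giving a single-trajectory sequence."""
    h, w = frame.shape[:2]
    y0, x0 = max(0, h // 2 - 112), max(0, w // 2 - 112)
    return frame[y0:y0 + 224, x0:x0 + 224]
```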
  • An embodiment of the present disclosure provides an abnormal event detection method, including: acquiring an image sequence and determining at least one image region set from it, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; selecting a central region based on the spatial distribution of the image regions in each image region set; for each central region, extracting the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and performing abnormal event detection with each trajectory sequence separately.
  • Through the heuristic estimation of densely populated pedestrian locations under the center-point consistency constraint, the abnormal event detection method provided by the embodiments of the present disclosure not only reduces the amount of data processed for abnormal event detection and improves detection efficiency, but also provides an effective receptive field for abnormal event detection, improving detection accuracy.
  • The abnormal event detection apparatus 1100 includes: a region determination part 1101 configured to acquire an image sequence and determine at least one image region set from the image sequence, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; a region selection part 1102 configured to, for each image region set, select a central region based on the spatial distribution of the image regions in the set; a sequence determination part 1103 configured to, for each central region, extract the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and an anomaly detection part 1104 configured to perform abnormal event detection with each trajectory sequence separately.
  • The region determination part 1101 is further configured to: for each frame of the image sequence, determine at least one image region in descending order of pedestrian density to obtain an image region group corresponding to that frame; and assign, across the image region groups of the frames, the image regions with the same pedestrian-density rank to the same set, to obtain the at least one image region set.
  • The region determination part 1101 is further configured to: for each frame of the image sequence, acquire every pedestrian detection box corresponding to the frame and expand each one into an extended detection box to form an extended detection box group corresponding to the frame; for each extended detection box group, determine the overlap ratios between its different extended detection boxes and construct, from the obtained overlap ratios, a first adjacency matrix corresponding to the group; for each extended detection box group, determine, based on its first adjacency matrix, the number of matches of each extended detection box in the group and select at least one extended detection box in descending order of match count, to obtain a group of center detection boxes corresponding to the frame; and, for each frame of the image sequence, determine an image region group corresponding to the frame based on its group of center detection boxes.
  • The region determination part 1101 is further configured to: acquire at least one pedestrian detection box from the frame; in the frame, take each of the at least one pedestrian detection box as a center and expand it outward by a first preset ratio to form an extended detection box; and form the extended detection box group corresponding to the frame from the extended detection boxes obtained by the expansion.
  • The region determination part 1101 is further configured to: acquire, from the frame, at least one pedestrian detection box corresponding to the at least one extended detection box that makes up the corresponding group of center detection boxes; in the frame, take each such pedestrian detection box as a center and expand it outward by a second preset ratio to form an image region; and form the image region group corresponding to the frame from the image regions obtained by the expansion.
  • The region selection part 1102 is further configured to: determine the overlap ratios between different image regions in the image region set and construct a corresponding second adjacency matrix from the obtained overlap ratios; determine, based on the second adjacency matrix, the number of matches of each image region in the set; and select the image region with the largest number of matches as the central region corresponding to the set.
  • The anomaly detection part 1104 is further configured to perform abnormal event detection on each trajectory sequence with a preset abnormal event detection model to obtain a corresponding abnormal event detection result.
  • An embodiment of the present disclosure provides an abnormal event detection apparatus, including: a region determination part configured to acquire an image sequence and determine at least one image region set from the image sequence, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; a region selection part configured to select, for each image region set, a central region based on the spatial distribution of the image regions in the set; a sequence determination part configured to extract, for each central region, the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and an anomaly detection part configured to perform abnormal event detection with each trajectory sequence separately.
  • The abnormal event detection apparatus provided by the embodiments of the present disclosure not only reduces the amount of data processed for abnormal event detection and improves detection efficiency, but also provides an effective receptive field for abnormal event detection, improving detection accuracy.
  • An electronic device 1200 includes: a processor 1201, a memory 1202, and a communication bus 1203, where the communication bus 1203 is configured to enable connection and communication between the processor 1201 and the memory 1202, and the processor 1201 is configured to implement the above abnormal event detection method when executing one or more programs stored in the memory 1202.
  • An embodiment of the present disclosure provides a computer-readable storage medium storing one or more programs which, when executed by one or more processors, implement the above abnormal event detection method.
  • The computer-readable storage medium may be a volatile memory (VM), such as a random-access memory (RAM), or a non-volatile memory (NVM), such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); it may also be a device including one of the above memories or any combination thereof, such as a mobile phone, a computer, a tablet device, a personal digital assistant, or the like.
  • the embodiments of the present disclosure may be provided as methods, systems or computer program products. Accordingly, the present disclosure can take the form of a hardware embodiment, a software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable signal processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in at least one flow of the flowchart and at least one block of the block diagram.
  • An embodiment of the present disclosure provides an abnormal event detection method and apparatus, an electronic device, a storage medium, and a computer program product.
  • The method includes: acquiring an image sequence and determining at least one image region set from it, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; selecting a central region based on the spatial distribution of the image regions in each image region set; for each central region, extracting the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and performing abnormal event detection with each trajectory sequence separately.
  • The technical solution provided by the embodiments of the present disclosure not only reduces the amount of data processed for abnormal event detection and improves detection efficiency, but also provides an effective receptive field for detection, improving detection accuracy.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

The present disclosure provides an abnormal event detection method and apparatus, an electronic device, a storage medium, and a computer program product. The method includes: acquiring an image sequence and determining at least one image region set from it, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; for each image region set, selecting one central region based on the spatial distribution of the image regions in the set; for each central region, cropping the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and performing abnormal event detection with each trajectory sequence separately.

Description

Abnormal event detection method and apparatus, electronic device, storage medium and computer program product
Cross-Reference to Related Applications
The present disclosure claims priority to Chinese Patent Application No. 202110836410.X, entitled "一种异常事件检测方法、装置、电子设备及存储介质" (abnormal event detection method, apparatus, electronic device and storage medium), filed on 23 July 2021 by 上海商汤智能科技有限公司; the entire contents of that application are incorporated into the present disclosure by reference.
Technical Field
The present disclosure relates to the technical field of computer vision, and in particular to an abnormal event detection method and apparatus, an electronic device, a storage medium, and a computer program product.
Background
Abnormal event detection in video is an important problem in the field of computer vision. At present, computer vision and deep learning techniques can be used to automatically detect abnormal events occurring in video.
Traditional abnormal event detection methods either apply full-image data enhancement or other preprocessing to the input video sequence and then feed it into a classification model for detection, or roughly locate the center point of an event from the dense distribution of the crowd; such detection approaches are inefficient and inaccurate.
Summary
Embodiments of the present disclosure are expected to provide an abnormal event detection method and apparatus, an electronic device, and a storage medium.
The technical solutions of the embodiments of the present disclosure are implemented as follows.
An embodiment of the present disclosure provides an abnormal event detection method. The method includes: acquiring an image sequence and determining at least one image region set from the image sequence, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; for each image region set, selecting a central region based on the spatial distribution of the image regions in the set; for each central region, cropping the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and performing abnormal event detection with each trajectory sequence separately.
In the above method, determining at least one image region set from the image sequence includes: for each frame of the image sequence, determining at least one image region in descending order of pedestrian density to obtain an image region group corresponding to that frame; and assigning, across the image region groups of the frames, the image regions with the same pedestrian-density rank to the same set, to obtain the at least one image region set.
In the above method, determining at least one image region for each frame of the image sequence in descending order of pedestrian density to obtain the image region group corresponding to that frame includes: for each frame of the image sequence, acquiring every pedestrian detection box corresponding to the frame and expanding each one into an extended detection box, so as to form an extended detection box group corresponding to the frame; for each extended detection box group, determining the overlap ratios between its different extended detection boxes and constructing, from the obtained overlap ratios, a first adjacency matrix corresponding to the group; for each extended detection box group, determining, based on its first adjacency matrix, the number of matches of each extended detection box in the group and selecting at least one extended detection box in descending order of match count, to obtain a group of center detection boxes corresponding to the frame; and, for each frame of the image sequence, determining an image region group corresponding to the frame based on its group of center detection boxes.
In the above method, acquiring every pedestrian detection box corresponding to a frame and expanding each one into an extended detection box to form the extended detection box group corresponding to the frame includes: acquiring at least one pedestrian detection box from the frame; in the frame, taking each of the at least one pedestrian detection box as a center and expanding it outward by a first preset ratio to form an extended detection box; and forming the extended detection box group corresponding to the frame from the extended detection boxes obtained by the expansion.
In the above method, determining the image region group corresponding to a frame based on the group of center detection boxes corresponding to the frame includes: acquiring, from the frame, at least one pedestrian detection box corresponding to the at least one extended detection box that makes up the group of center detection boxes; in the frame, taking each such pedestrian detection box as a center and expanding it outward by a second preset ratio to form an image region; and forming the image region group corresponding to the frame from the image regions obtained by the expansion.
In the above method, selecting a central region based on the spatial distribution of the image regions in an image region set includes: determining the overlap ratios between different image regions in the set and constructing a corresponding second adjacency matrix from the obtained overlap ratios; determining, based on the second adjacency matrix, the number of matches of each image region in the set; and selecting the image region with the largest number of matches as the central region corresponding to the set.
In the above method, performing abnormal event detection with each trajectory sequence separately includes: performing abnormal event detection on each trajectory sequence with a preset abnormal event detection model to obtain a corresponding abnormal event detection result.
An embodiment of the present disclosure provides an abnormal event detection apparatus. The apparatus includes: a region determination part configured to acquire an image sequence and determine at least one image region set from the image sequence, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; a region selection part configured to, for each image region set, select a central region based on the spatial distribution of the image regions in the set; a sequence determination part configured to, for each central region, crop the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and an anomaly detection part configured to perform abnormal event detection with each trajectory sequence separately.
In the above apparatus, the region determination part is further configured to: for each frame of the image sequence, determine at least one image region in descending order of pedestrian density to obtain an image region group corresponding to that frame; and assign, across the image region groups of the frames, the image regions with the same pedestrian-density rank to the same set, to obtain the at least one image region set.
In the above apparatus, the region determination part is further configured to: for each frame of the image sequence, acquire every pedestrian detection box corresponding to the frame and expand each one into an extended detection box to form an extended detection box group corresponding to the frame; for each extended detection box group, determine the overlap ratios between its different extended detection boxes and construct, from the obtained overlap ratios, a first adjacency matrix corresponding to the group; for each extended detection box group, determine, based on its first adjacency matrix, the number of matches of each extended detection box in the group and select at least one extended detection box in descending order of match count, to obtain a group of center detection boxes corresponding to the frame; and, for each frame of the image sequence, determine an image region group corresponding to the frame based on its group of center detection boxes.
In the above apparatus, the region determination part is further configured to: acquire at least one pedestrian detection box from the frame; in the frame, take each of the at least one pedestrian detection box as a center and expand it outward by a first preset ratio to form an extended detection box; and form the extended detection box group corresponding to the frame from the extended detection boxes obtained by the expansion.
In the above apparatus, the region determination part is further configured to: acquire, from the frame, at least one pedestrian detection box corresponding to the at least one extended detection box that makes up the corresponding group of center detection boxes; in the frame, take each such pedestrian detection box as a center and expand it outward by a second preset ratio to form an image region; and form the image region group corresponding to the frame from the image regions obtained by the expansion.
In the above apparatus, the region selection part is further configured to: determine the overlap ratios between different image regions in the image region set and construct a corresponding second adjacency matrix from the obtained overlap ratios; determine, based on the second adjacency matrix, the number of matches of each image region in the set; and select the image region with the largest number of matches as the central region corresponding to the set.
In the above apparatus, the anomaly detection part is further configured to perform abnormal event detection on each trajectory sequence with a preset abnormal event detection model to obtain a corresponding abnormal event detection result.
An embodiment of the present disclosure provides an electronic device. The electronic device includes a processor, a memory, and a communication bus, where the communication bus is configured to enable connection and communication between the processor and the memory, and the processor is configured to implement the above abnormal event detection method when executing one or more programs stored in the memory.
An embodiment of the present disclosure provides a computer-readable storage medium storing one or more programs which, when executed by one or more processors, implement the above abnormal event detection method.
An embodiment of the present disclosure provides a computer program product. The computer program product includes a non-transitory computer storage medium storing a computer program which, when read and executed by a computer, implements the above abnormal event detection method.
Embodiments of the present disclosure provide an abnormal event detection method and apparatus, an electronic device, a storage medium, and a computer program product. The method includes: acquiring an image sequence and determining at least one image region set from it, where each image region set includes one image region from every frame of the image sequence, and each image region holds the same pedestrian-density rank among the image regions of the frame to which it belongs; selecting a central region based on the spatial distribution of the image regions in each image region set; for each central region, cropping the image region at the corresponding position from every frame of the image sequence to form a trajectory sequence; and performing abnormal event detection with each trajectory sequence separately. Through the heuristic estimation of densely populated pedestrian locations under the center-point consistency constraint, the technical solution provided by the embodiments of the present disclosure not only reduces the amount of data processed for abnormal event detection and improves detection efficiency, but also provides an effective receptive field for abnormal event detection, improving detection accuracy.
附图说明
此处的附图被并入说明书中并构成本说明书的一部分,这些附图示出了符合本公开的实施例,并与说明书一起用于说明本公开的技术方案。
图1为本公开实施例提供的一种异常事件检测方法的流程示意图;
图2为本公开实施例提供的一种异常事件检测方法的流程示意图;
图3为本公开实施例提供的一种确定图像区域组的流程示意图;
图4为本公开实施例提供的一种异常事件检测方法的流程示意图;
图5为本公开实施例提供的一种异常事件检测方法的流程示意图;
图6为本公开实施例提供的一种异常事件检测方法的流程示意图;
图7为本公开实施例提供的一种示例性的轨迹序列的示意图;
图8为本公开实施例提供的一种异常事件检测方法的流程示意图;
图9为本公开实施例提供的一种异常事件检测方法的流程示意图;
图10为本公开实施例提供的一种异常事件检测方法的流程示意图;
图11为本公开实施例提供的一种异常事件检测装置的结构示意图;
图12为本公开实施例提供的一种电子设备的结构示意图。
具体实施方式
下面将结合本公开实施例中的附图,对本公开实施例中的技术方案进行清楚、完整地描述。
本公开实施例提供了一种异常事件检测方法，其执行主体可以是异常事件检测装置，例如，异常事件检测方法可以由终端设备或服务器或其它电子设备执行，其中，终端设备可以为用户设备（User Equipment，UE）、移动设备、用户终端、终端、蜂窝电话、无绳电话、个人数字助理（Personal Digital Assistant，PDA）、手持设备、计算设备、车载设备和可穿戴设备等。在一些实现方式中，异常事件检测方法可以通过处理器调用存储器中存储的计算机可读指令的方式来实现。
图1为本公开实施例提供的一种异常事件检测方法的流程示意图。如图1所示,在本公开的实施例中,异常事件检测方法主要包括以下步骤:
S101、获取图像序列,并从图像序列中确定出至少一个图像区域集合;每个图像区域集合中,包括图像序列中每一帧图像内的一个图像区域,且每个图像区域在所属一帧图像中各图像区域间行人密集程度的排序相同。
在本公开的实施例中,异常事件检测装置可以先获取图像序列,从而从图像序列中确定出至少一个图像区域集合。
需要说明的是,在本公开的实施例中,异常事件检测装置可以包括图像采集器件,从而利用图像采集器件采集某一场景的连续帧图像,得到图像序列,当然,图像序列也可以是独立的摄像装置采集到的,例如,摄像头采集到的,之后,传输至异常事件检测装置。图像序列的获取方式,以及图像序列,可以根据实际需求和应用场景确定,本公开实施例不作限定。
需要说明的是,在本公开的实施例中,异常事件检测装置确定的至少一个图像区域集合,每个图像区域集合中,包括图像序列中每一帧图像内的一个图像区域,且每个图像区域在所属一帧图像中各图像区域间行人密集程度的排序相同。
图2为本公开实施例提供的一种异常事件检测方法的流程示意图。如图2所示,在本公开的实施例中,S101可以由异常事件检测装置通过S201至S202实现,将结合图2示出的步骤进行说明。
S201、针对图像序列的每一帧图像,按照行人密集程度由大到小,依次确定出至少一个图像区域,得到图像对应的图像区域组;
S202、将图像序列中每一帧图像分别对应的图像区域组中,行人密集程度排序相同的图像区域划分至同一集合,得到至少一个图像区域集合。
需要说明的是，在本公开的实施例中，图像序列包括在时序上依次排布的多帧图像，异常事件检测装置对于其中的每一帧图像，均可以按照行人密集程度由大到小，依次确定出一帧图像中至少一个图像区域，从而由确定出的各图像区域组成该帧图像对应的一个图像区域组，进而最终确定出每一帧图像分别对应的一个图像区域组。
可以理解的是,在本公开的实施例中,异常事件检测装置针对多个图像区域组,将不同组中行人密集程度排序相同的图像区域划分至同一集合,即,将多个图像区域组中,每个图像区域组中行人密集程度排序第一的图像区域选取出,划分至同一集合,依次类推,直至将每个图像区域组中行人密集程度排序最后的图像区域选取出,划分至同一集合,并且,由于每个图像区域组均包括至少一个图像区域,因此,最终划分出的图像区域集合的数量和每个图像区域组中的图像区域的数量一致。
上述实施例中，根据行人密集程度，确定出每一帧图像中的至少一个图像区域，从而得到每一帧图像对应的图像区域组，进而由每一个图像区域组中行人密集程度排序相同的图像区域组成至少一个图像区域集合，每一图像区域集合中包括的图像区域数量和图像序列包括的图像帧数相同，将图像序列中行人密集程度排序相同的图像区域划分至同一个集合中，可以由一个图像区域集合表示图像序列中的一个事件。
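作为示意，下面给出一段Python代码，用于说明将各帧图像按行人密集程度排序得到的图像区域组、按排序位次划分为图像区域集合的过程；其中以列表表示区域组的数据结构仅为说明性假设，并非本公开的具体实现。

from typing import List, Tuple

Region = Tuple[float, float, float, float]  # 示意：以 (x1, y1, x2, y2) 表示一个图像区域

def group_regions_by_rank(region_groups: List[List[Region]]) -> List[List[Region]]:
    # region_groups[t] 为第 t 帧图像中按行人密集程度由大到小排序的图像区域组
    # 返回 region_sets[i]：各帧中排序位次同为 i 的图像区域组成的集合
    if not region_groups:
        return []
    num_ranks = min(len(group) for group in region_groups)  # 假设按各帧的最小区域数对齐
    return [[group[i] for group in region_groups] for i in range(num_ranks)]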
图3为本公开实施例提供的一种确定图像区域组的流程示意图。如图3所示,在本公开的实施例中,S201可以通过S301至S304实现,将结合图3示出的步骤进行说明。
S301、针对图像序列的每一帧图像，获取图像对应的每个行人检测框，并分别扩大成一个扩展检测框，组成图像对应的一个扩展检测框组。
需要说明的是,在本公开的实施例中,异常事件检测装置针对图像序列中的每一帧图像,可以分别进行行人检测,从而从中针对检测到的每个行人生成一个行人检测框,之后,将每个行人检测框进行扩大,扩大后的每个行人检测框即为扩展检测框,一帧图像中全部的扩展检测框即组成该帧图像对应的一个扩展检测框组。由于图像序列包括多帧图像,因此,异常事件检测装置按照上述方式确定每一帧图像对应的一个扩展检测框组。
图4为本公开实施例提供的一种异常事件检测方法的流程示意图。如图4所示,在本公开的实施例中,S301可以由异常事件检测装置通过S401至S403实现,将结合图4示出的步骤进行说明。
S401、从图像中获取至少一个行人检测框;
S402、在图像中,将至少一个行人检测框中每个行人检测框分别作为中心,按照第一预设比例外扩形成一个扩展检测框;
S403、将外扩得到的扩展检测框组成图像对应的一个扩展检测框组。
需要说明的是,在本公开的实施例中,对于图像序列中任意一帧图像,即第i帧(i为大于等于1的自然数)图像,异常事件检测装置可以从中获取到至少一个行人检测框,从而在第i帧图像中,分别以获得的每个行人检测框为中心,以一定比例进行外扩,这样,得到的扩展检测框不仅仅包括行人检测框中包括的行人,还包括行人周围的图像信息,例如,行人周边的人或物,这样,便于后续评估对应位置的人群密度等信息,使得后续图像区域的确定更为精准。对于行人检测框的外扩方式,异常事件检测装置也可以直接将每个行人检测框外扩到预设尺寸,即得到的每个扩展检测框的尺寸均为预设的同一尺寸,异常事件检测装置还可以基于不同行人检测框在所属一帧图像中的位置,采用不同的外扩方式进行外扩,本公开实施例不作限定。
需要说明的是,在本公开的实施例中,第一预设比例可以根据实际需求和应用场景设定,本公开实施例不作限定。例如,第一预设比例可以为1.5倍,即以行人检测框为中心,在图像中向外将检测框扩大1.5倍,扩大后的每个行人检测框即为扩展检测框。
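作为参考，以下Python代码示意了以行人检测框为中心、按预设比例外扩并裁剪到图像范围内的一种实现方式；其中1.5倍仅沿用上文示例，边界裁剪方式为假设性处理。

def expand_box(box, ratio, img_w, img_h):
    # 以 box=(x1, y1, x2, y2) 的中心为中心，将宽、高分别放大 ratio 倍，并裁剪到图像范围内
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    w, h = (x2 - x1) * ratio, (y2 - y1) * ratio
    nx1 = max(0.0, cx - w / 2.0)
    ny1 = max(0.0, cy - h / 2.0)
    nx2 = min(float(img_w), cx + w / 2.0)
    ny2 = min(float(img_h), cy + h / 2.0)
    return (nx1, ny1, nx2, ny2)

# 示例：按第一预设比例（此处沿用上文的1.5倍示例）外扩一帧图像中的全部行人检测框
# expanded_group = [expand_box(b, 1.5, img_w, img_h) for b in pedestrian_boxes]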
S302、针对每一扩展检测框组,确定扩展检测框组包含的不同扩展检测框之间的重叠比,并根据得到的重叠比构建扩展检测框组对应的第一邻接矩阵。
在本公开的实施例中，异常事件检测装置在获得一帧图像的一个扩展检测框组之后，即可针对每一扩展检测框组，确定扩展检测框组包含的不同扩展检测框之间的重叠比，并根据得到的重叠比构建该扩展检测框组对应的第一邻接矩阵。
需要说明的是,在本公开的实施例中,对于每一扩展检测框组,异常事件检测装置可以计算一个扩展检测框组的每个扩展检测框,与该扩展检测框组内其它扩展检测框之间的重叠比,从而构建相应的第一邻接矩阵,该第一邻接矩阵实际上记录了该扩展检测框组内两两扩展检测框之间的重叠情况。
需要说明的是,在本公开的实施例中,异常事件检测装置针对每一扩展检测框组,根据得到的不同扩展检测框之间的重叠比构建对应的第一邻接矩阵,实际上就是将每个重叠比作为第一邻接矩阵中的一个元素。
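以下Python代码示意了根据两两扩展检测框的重叠比构建第一邻接矩阵的过程；此处以交并比作为重叠比的度量仅为一种假设。

def iou(a, b):
    # 计算两个框 (x1, y1, x2, y2) 的交并比，作为重叠比的一种示意性度量
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def build_adjacency(boxes):
    # 返回 N×N 邻接矩阵，元素为两两框之间的重叠比，对角线置 0
    n = len(boxes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = iou(boxes[i], boxes[j])
    return adj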
S303、针对每一扩展检测框组,基于扩展检测框组对应的第一邻接矩阵,确定扩展检测框组包含的每个扩展检测框的匹配次数,并按照匹配次数从大到小,依次选取至少一个扩展检测框,得到该图像对应的一组中心检测框。
在本公开的实施例中,异常事件检测装置在针对每一扩展检测框组确定出对应的第一邻接矩阵之后,即可基于该第一邻接矩阵,确定该扩展检测框组内每个扩展检测框的匹配次数,并按照匹配次数从大到小,依次选取至少一个扩展检测框组成一组中心检测框。
可以理解的是，在本公开的实施例中，对于每一扩展检测框组，其对应的第一邻接矩阵包括了其中不同扩展检测框之间的重叠比信息，因此，异常事件检测装置可以从第一邻接矩阵中，直接获知每个扩展检测框与扩展检测框组内其它扩展检测框是否存在重叠，例如，对于一个扩展检测框，在第一邻接矩阵中其与扩展检测框组内另一个扩展检测框之间的重叠比为0的情况下，表征该扩展检测框与该另一个扩展检测框不存在重叠；在重叠比不为0的情况下，表征二者存在重叠。在一个扩展检测框与另一个扩展检测框存在重叠的情况下，匹配次数记为一次，从而可以确定出该扩展检测框的匹配次数。
在本公开的实施例中,异常事件检测装置在确定出一个扩展检测框组中每个扩展检测框的匹配次数的情况下,即可按照匹配次数从大到小,进行扩展检测框的选取,并将选取出的至少一个扩展检测框组成该图像的一组中心检测框。
可以理解的是，在本公开的实施例中，每一扩展检测框组中，由于每个扩展检测框实际上是由行人检测框扩大得到的，即包括行人，在一个扩展检测框的匹配次数较多，即与较多的其它扩展检测框存在重叠的情况下，该扩展检测框周围将分布较多行人，该扩展检测框所在位置实际上是一个行人密集区域的中心，因此，异常事件检测装置从该扩展检测框组中选取出该扩展检测框，从而可以基于该扩展检测框从所属的一帧图像中准确确定出行人密集程度较大的区域。
需要说明的是，在本公开的实施例中，对于一个扩展检测框组，其中可能存在部分扩展检测框的匹配次数相同的情况，异常事件检测装置在按照匹配次数从大到小进行选取时，可以进一步基于其他信息从匹配次数相同的部分扩展检测框中进行选取，例如，可以进一步获取匹配次数相同的部分扩展检测框中，每个扩展检测框的面积，从中选取出面积较大的扩展检测框。
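以下Python代码示意了基于第一邻接矩阵统计每个扩展检测框的匹配次数、并按匹配次数从大到小（匹配次数相同时按面积从大到小，该并列处理方式仅为上文所述的一种示例）选取一组中心检测框的过程。

def select_center_boxes(boxes, adj, k):
    # boxes 为扩展检测框组，adj 为其第一邻接矩阵，返回匹配次数最多的前 k 个框的下标
    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])

    match_counts = [sum(1 for v in row if v > 0) for row in adj]  # 重叠比大于 0 记为一次匹配
    order = sorted(range(len(boxes)),
                   key=lambda i: (match_counts[i], area(boxes[i])),
                   reverse=True)
    return order[:k]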
S304、针对图像序列的每一帧图像,基于图像对应的一组中心检测框,确定图像对应的一个图像区域组。
在本公开的实施例中,异常事件检测装置可以针对图像序列的每一帧图像,基于对应的一组中心检测框,确定一个图像区域组。
需要说明的是,在本公开的实施例中,如步骤S303所述,每组中心检测框实际上由一个扩展检测框组中的至少一个扩展检测框组成,而每一扩展检测框组实际上包括于图像序列的一帧图像中,因此,一组中心检测框实际上包括于图像序列的一帧图像中。
上述实施例中,基于同一帧图像中的每个行人检测框,扩展得到一个扩展检测框组,根据一个扩展检测框组中的每两个扩展检测框之间的重叠比,确定出一帧图像中的一组中心检测框,从而得到每一帧图像对应的一组图像区域,这样,可以确定出每一帧图像中行人最为密集的区域,即确定出每一帧图像中异常事件发生的可能性最大的区域,从而可以实现后续的异常事件检测。
图5为本公开实施例提供的一种异常事件检测方法的流程示意图。如图5所示,在本公开的实施例中,S304可以由异常事件检测装置通过S501至S503实现,将结合图5示出的步骤进行说明。
S501、从图像中,获取组成对应的一组中心检测框的至少一个扩展检测框对应的至少一个行人检测框;
S502、从图像中,将每个行人检测框分别作为中心,按照第二预设比例外扩形成一个图像区域;
S503、将外扩得到的图像区域组成图像对应的一个图像区域组。
需要说明的是,在本公开的实施例中,对于图像序列中的第k帧(k为大于等于1的自然数)图像对应的一组中心检测框,其包括在第k帧图像中,由至少一个扩展检测框组成,每个扩展检测框均由一个行人检测框扩大得到,异常事件检测装置可以从第k帧图像中,直接获取对应的至少一个行人检测框,从而进一步以每个行人检测框为中心,以一定比例进行外扩,这样,得到的图像区域不仅仅包括行人检测框中包括的行人,还包括行人周围的图像信息。例如,行人周边的人或物。这样,便于后续进行异常事件检测,使得异常事件检测更为精准。对于行人检测框的外扩方式,异常事件检测装置也可以直接将每个行人检测框外扩到预设尺寸,即得到的每个扩展检测框的尺寸均为预设的同一尺寸,异常事件检测装置还可以基于不同行人检测框在所属一帧图像中的位置,采用不同的外扩方式进行外扩,本公开实施例不作限定。
需要说明的是，在本公开的实施例中，第二预设比例可以根据实际需求和应用场景设定，本公开实施例不作限定。例如，第二预设比例可以为2倍，即以行人检测框为中心，在图像中向外将检测框扩大2倍，扩大后的行人检测框即为图像区域。
需要说明的是,在本公开的实施例中,由于扩展检测框的选取实际上只是为了基本的进行行人中心的估计,而图像区域实际上是进行后续异常事件检测的基础区域,因此,可以规定第二预设比例大于第一预设比例,即在确定图像区域时,相比于扩展检测框,包括的信息更多,当然,可以包括更多的行人,从而为后续进行异常事件检测提供有效的感知野。
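在上文假设的 expand_box 函数基础上，可按更大的第二预设比例（例如2倍，仅为示例）由中心检测框对应的行人检测框生成图像区域，示意如下；其中下标的对应关系为假设性约定。

def build_region_group(pedestrian_boxes, center_indices, img_w, img_h, second_ratio=2.0):
    # center_indices：被选为中心检测框的扩展检测框所对应的行人检测框下标（假设一一对应）
    # 以对应行人检测框为中心、按第二预设比例外扩，组成该帧图像对应的图像区域组
    return [expand_box(pedestrian_boxes[i], second_ratio, img_w, img_h)
            for i in center_indices]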
S102、针对每个图像区域集合,基于图像区域集合内图像区域的空间分布情况选取出一个中心区域。
在本公开的实施例中,异常事件检测装置在从图像序列中确定出至少一个图像区域集合的情况下,可以针对每个图像区域集合,基于图像区域集合内图像区域的空间分布情况选取一个中心区域。
图6为本公开实施例提供的一种异常事件检测方法的流程示意图。如图6所示,在本公开的实施例中,S102可以由异常事件检测装置通过S601至S603实现,将结合图6示出的步骤进行说明。
S601、确定图像区域集合内不同图像区域之间的重叠比,并根据得到的重叠比构建对应的第二邻接矩阵;
S602、基于对应的第二邻接矩阵,确定图像区域集合内每个图像区域的匹配次数;
S603、选取出匹配次数最大的图像区域作为图像区域集合对应的中心区域。
需要说明的是，在本公开的实施例中，与上述针对每一扩展检测框组中每个扩展检测框确定匹配次数的方式相同，异常事件检测装置可以针对每个图像区域集合，确定图像区域集合内不同图像区域之间的重叠比，从而构建相应的第二邻接矩阵，该第二邻接矩阵表征了图像区域集合内的图像区域在空间位置的分布情况，为了提高中心点的一致性，异常事件检测装置从图像区域集合中选取出行人最密集的图像区域作为中心区域，即匹配次数最大的图像区域，最终从至少一个图像区域集合中选取出至少一个中心区域。
上述实施例中,根据每个图像区域集合中的每两个图像区域之间的重叠比,将确定出的每个图像区域集合中行人最密集的图像区域作为中心区域,即确定出了图像序列中异常事件发生的可能性最大的区域,为后续的异常事件检测提供一个覆盖区域更为准确的图像检测区域。
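以下Python代码示意了针对一个图像区域集合、基于第二邻接矩阵选取匹配次数最大的图像区域作为中心区域的过程；重叠比与匹配次数的计算方式沿用上文假设的示例。

def select_center_region(region_set):
    # region_set：同一排序位次下、来自图像序列各帧的图像区域
    adj = build_adjacency(region_set)  # 沿用上文示意的邻接矩阵构建方式
    match_counts = [sum(1 for v in row if v > 0) for row in adj]
    center_index = max(range(len(region_set)), key=lambda i: match_counts[i])
    return region_set[center_index]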
S103、针对每个中心区域,从图像序列的每一帧图像抠取对应位置的图像区域,组成一个轨迹序列。
在本公开的实施例中,异常事件检测装置在得到每个中心区域之后,即可针对每个中心区域,从图像序列的每一帧图像抠取对应位置的图像区域,组成一个轨迹序列。
需要说明的是,在本公开的实施例中,异常事件检测装置在从图像序列的每一帧图像中,抠取出一个中心区域对应位置的图像区域之后,可以将这些图像区域进行尺度调整,再基于时序拼接成一个轨迹序列。
图7为本公开实施例提供的一种示例性的轨迹序列的示意图。如图7所示,在图像序列包括5帧图像的情况下,一个轨迹序列实际上可以包括5个图像区域,其中,5个图像区域分别为5帧图像中选取的同一位置的区域,5个图像区域抠取的基准,即中心区域。例如,可以是第2个图像区域所示的区域,从而得到组成所示轨迹序列的5个图像区域。
需要说明的是,在本公开的实施例中,异常事件检测装置基于一个中心区域进行图像区域抠取,可以在每帧图像中以该中心区域的中心点对应的位置为中心,抠取预设尺寸的图像区域,在中心在图像边缘的情况下,对于无法抠取到预设尺寸的部分,可以以黑色填充。
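以下基于NumPy的示意代码展示了以中心区域的中心点为基准、在一帧图像上抠取预设尺寸区域、并对越界部分以黑色填充的一种实现方式；具体尺寸与填充细节为假设。

import numpy as np

def crop_with_padding(image, center_xy, out_h, out_w):
    # image 为 H×W×3 的帧图像；以 center_xy=(cx, cy) 为中心抠取 out_h×out_w 的区域，越界部分以黑色（0）填充
    h, w = image.shape[:2]
    cx, cy = int(round(center_xy[0])), int(round(center_xy[1]))
    x1, y1 = cx - out_w // 2, cy - out_h // 2
    patch = np.zeros((out_h, out_w, 3), dtype=image.dtype)
    sx1, sy1 = max(0, x1), max(0, y1)
    sx2, sy2 = min(w, x1 + out_w), min(h, y1 + out_h)
    if sx2 > sx1 and sy2 > sy1:
        patch[sy1 - y1:sy2 - y1, sx1 - x1:sx2 - x1] = image[sy1:sy2, sx1:sx2]
    return patch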
可以理解的是,对于图像序列的每一帧图像,不同帧确定的中心点可能会存在不一致的现象,在直接进行图像区域合并的情况下,覆盖的异常事件检测的范围过大,容易包含大量的无效信息,从而检测效率较低,而在本公开的实施例中,异常事件检测装置直接从一组图像区域中确定一个中心区域,之后,针对每一帧图像统一抠取中心区域位置的图像区域,进行组合,这样得到的轨迹序列中心点一致,提升了抠图区域的代表性和有效性,后续利用其进行异常事件检测的范围较为合理,检测效率较高。
S104、利用每个轨迹序列分别进行异常事件检测。
在本公开的实施例中,异常事件检测装置在得到轨迹序列的情况下,即可利用每个轨迹序列分别进行异常事件检测。
图8为本公开实施例提供的一种异常事件检测方法的流程示意图。如图8所示,在本公开的实施例中,S104可以由异常事件检测装置通过S801实现,将结合图8示出的步骤进行说明。
S801、利用预设异常事件检测模型,对每个轨迹序列分别进行异常事件检测,得到对应的异常事件检测结果。
需要说明的是,在本公开的实施例中,预设异常事件检测模型,被配置为进行异常事件检测。例如,预设异常事件检测模型可以是特定的图像分类网络等,本公开实施例不作限定。
需要说明的是,在本公开的实施例中,异常事件检测装置可以利用预设异常事件检测模型,根据每个轨迹序列进行异常事件检测,从而得到该轨迹序列所对应的现实场景区域是否出现异常事件。
可以理解的是,在本公开的实施例中,异常事件检测装置重复至少一次轨迹序列的生成过程,生成至少一个不同区域对应的轨迹序列,并分别进行异常事件检测,提升了图像区域的召回,同时提升了模型对事件的有效感受野。
需要说明的是,在本公开的实施例中,对于多个轨迹序列,也有可能表征出一个异常事件,因此,异常事件检测装置还可以将多个轨迹序列同时输入预设异常事件检测模型,预设异常事件检测模型可以结合多个轨迹序列进行异常事件检测,分析出对应的一个异常事件。
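以下以PyTorch为例，示意利用预设异常事件检测模型对每个轨迹序列分别进行检测的调用方式；模型结构、输入张量的组织方式以及类别含义均为假设，仅说明以轨迹序列代替整张图像作为模型输入的用法。

import torch

def detect_anomalies(track_sequences, model, device="cpu"):
    # track_sequences：长度为 K 的列表，每个元素为形状 [T, 3, H, W] 的张量（一个轨迹序列）
    # 返回每个轨迹序列是否发生异常事件的判断结果
    model = model.to(device)
    model.eval()
    results = []
    with torch.no_grad():
        for seq in track_sequences:
            logits = model(seq.unsqueeze(0).to(device))  # 假设模型接受 [1, T, 3, H, W] 的输入
            prob = torch.softmax(logits, dim=-1)
            results.append(bool(prob[0, 1] > 0.5))       # 假设类别 1 表示“发生异常事件”
    return results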
下面说明本公开实施例提供的异常事件检测方法在实际场景中的应用，以基于获取的城市街道视频进行异常事件检测的场景为例进行说明。
视频中的异常检测是计算机视觉领域的一个重要问题,可以利用计算机视觉和深度学习的技术来自动检测视频中发生的异常事件,例如检测异常行为、交通事故和一些不常见的事件等。对于机器来说,视觉特征越强,则期望的异常检测性能就越好。
对于监控场景下的异常行为检测,如何能在视频帧序列的整张图像(不同视角下)中准确定位到异常事件的发生区域,从而以局部区域代替整张图像,输入到识别网络中进行行为分类,有助于提升机器对于目标事件的有效感知范围,减少图像中大多数无关信息的干扰,使得模型更关注于如何区分目标人物不同的运动细节。可以实现对城市街道和轨道交通工具等室内外公共场所等场景中发生的异常行为的自动检测。
在一些实施例中,相关的行为识别方法,通常对输入的视频序列进行整张图像的数据增强或其他预处理后,输入到分类模型中进行预测。然而这种方法只适用于以人为中心的视频行为识别。对于监控摄像头拍摄的视频来说,往往包含更多的信息,覆盖的视野也更大,同时,目标事件发生的位置和人体尺度也具有随机性。因此,简单地以整张图像作为模型的输入显然是不合理的。通过人群密集分布情况来粗略定位事件中心点是一个可行的方案,但是不同帧图像中确定的中心点可能会存在不一致的现象,导致最终合并的区域覆盖了较大的范围。
图9为本公开实施例提供的一种异常事件检测方法的流程示意图。在本公开实施例中,异常事件检测方法将结合图9进行说明。
S901、获取视频图像中的行人检测框。
在一些实施例中,通过调用上游检测组件获取视频图像(对应上述实施例中的图像序列)中的行人检测框。
S902、获取轨迹系列。
在一些实施例中，基于行人检测框的密度分布情况，计算每一帧图像中的每两个行人检测框之间的重叠比，并统计每个行人检测框的匹配次数，将匹配次数最大的K个行人检测框作为中心（对应上述实施例中的一组中心检测框），按比例（对应上述实施例中的第二预设比例）自适应扩充，得到各帧图像的K个候选区域（对应上述实施例中的一个图像区域组）。如图9中所示的第一候选区域9021、第二候选区域9022和第三候选区域9023。
将T帧图像的第i个(i为大于等于1的自然数)候选区域组成一个集合(对应上述实施例中的图像区域集合)。分别计算T个候选区域的两两之间的交并比,得到一个T乘以T的邻接矩阵(对应上述实施例中的第二邻接矩阵),最终计算出时序上重叠次数最多的候选区域,即确定为最终的抠图区域(对应上述实施例中的中心区域),对T帧图像分别进行抠取,得到对应的轨迹序列,并赋予Patch ID为i,取代整张图像,输入到图像分类模型中以增大有效感知区域。
对T帧图像的每个候选区域分别重复上述步骤的操作,即可得到K个轨迹序列。
S903、对轨迹序列采用图像分类网络进行识别。
在一些实施例中,将各个帧的集合(对应上述实施例中的图像区域集合)对应的轨迹序列分别输入到图像分类网络中进行识别,可以得到该轨迹序列所对应的现实场景区域是否出现异常事件。
S904、获得异常事件识别结果。
在一些实施例中,通过图像分类网络对轨迹序列的识别,最终输出该轨迹序列对应的异常事件的结果,即输出该轨迹序列所对应的现实场景区域发生了异常事件或没有发生异常事件。
在这一实施例中,将各个帧图像的候选区域集合对应的轨迹序列分别输入到图像分类网络中进行识别,有利于高召回的同时也能提升模型对于异常事件的有效感知范围,且各个轨迹序列都遵循中心点一致的构建规则,减小因扰动而带来较大的抠图覆盖范围。
图10为本公开实施例提供的一种异常事件检测方法的流程示意图,如图10所示,可以基于中心点一致性约束的异常事件检测方法,实现对视频中的异常事件的检测。将结合S1001至S1008进行说明。
S1001、输入视频(对应上述实施例中的图像序列),提取每一帧图像的行人检测框;
S1002、按固定比例(对应上述实施例中的第一预设比例)外扩行人检测框,并对外扩后的行人检测框(对应上述实施例中的扩展检测框)根据面积大小进行排序;
S1003、根据每两个行人检测框之间的交叉合并比(Intersection over Union,IoU)构建邻接矩阵(对应上述实施例中的第一邻接矩阵),确定被匹配次数最大的K个外扩后的行人检测框(对应上述实施例中的一组中心检测框);
S1004、对匹配次数最大的K个外扩后的行人检测框分别恢复其分辨率,并以其为中心按一个更大的固定比例(对应上述实施例中的第二预设比例)自适应的外扩为一个候选区域(对应上述实施例中的图像区域);
S1005、将T帧图像的第i个候选区域组成一个集合(对应上述实施例中的一个图像区域集合);
S1006、分别计算T个候选区域的两两之间的交叉合并比,得到一个T乘以T的邻接矩阵(对应上述实施例中的第二邻接矩阵),基于邻接矩阵,将时序上重叠次数最多的候选区域(对应上述实施例中的中心区域)确定为最终的抠图区域;
S1007、对T帧图像分别进行抠取,得到对应的轨迹序列,并赋予Patch ID为i,以此取代整张图像,作为图像分类网络的输入图像;
S1008、将各个帧图像的K个候选区域集合对应的K个轨迹序列分别输入到图像分类网络中进行异常事件识别。
在一些实施例中，获取监控摄像头拍摄的全图视频序列，调用上游结构化检测组件，提取视频中的行人检测框。对当前帧图像的所有的行人检测框进行m倍（默认为1.5倍）的外扩，然后按照外扩后的行人检测框的面积大小对外扩后的行人检测框进行排序（降序排列）。根据外扩后的行人检测框两两之间的交叉合并比构建邻接矩阵，将根据邻接矩阵确定的匹配次数最大的K个外扩后的行人检测框作为中心行人检测框。恢复获得的中心行人检测框的分辨率，并定义一个更大的外扩比例n（默认为2倍），按照固定比例进行自适应的外扩，目的是以上述预测的中心行人检测框为中心，尽可能使得外扩后的矩形框包围异常事件的群体。将以此确定的当前帧图像的外扩矩形框作为当前帧图像的候选区域。将T帧图像的第i个候选区域组成一个候选区域集合。分别确定这T个候选区域两两之间的交叉合并比，得到一个T乘以T的邻接矩阵，代表候选区域在空间位置的分布情况，为了提升中心点的一致性，选取最密集的候选区域中心（即匹配次数最多的）作为最终由T帧共享的第i个候选区域，并确定为第i个行人轨迹的抠图区域。对T帧图像分别进行抠取，得到对应的轨迹序列，并赋予Patch ID为i，取代整张图像输入来增大有效感知区域。对T帧图像的每个候选区域分别重复上述步骤S1005至S1007，即对K个候选区域分别在原始图像序列上抠取对应的局部区域，形成相应的K个轨迹序列，分别输入图像分类网络进行行为的识别。
在一些实施例中,局部区域抠取过程为:以上述S1007确定的第K个轨迹序列的抠图区域为例,对长边进行尺度放缩到224(分辨率),短边等比例缩放,不足224的上下补黑边,最终图像大小为224x224。在输入的视频帧序列均无行人检测框结果的情况下,统一以图像中心点抠取224x224的区域,此时该视频帧序列只有一条轨迹。
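以下基于OpenCV与NumPy的示意代码对应上述尺度处理方式：长边缩放到224，短边等比例缩放，不足部分补黑边，得到224x224的输入图像；实现细节为假设。

import cv2
import numpy as np

def resize_and_pad(patch, size=224):
    # 将抠取的局部区域长边缩放到 size，短边等比例缩放，不足部分补黑边
    h, w = patch.shape[:2]
    scale = size / max(h, w)
    new_h, new_w = max(1, int(round(h * scale))), max(1, int(round(w * scale)))
    resized = cv2.resize(patch, (new_w, new_h))
    canvas = np.zeros((size, size, 3), dtype=patch.dtype)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas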
本公开实施例提供了一种异常事件检测方法,包括:获取图像序列,并从图像序列中确定出至少一个图像区域集合;每个图像区域集合中,包括图像序列中每一帧图像内的一个图像区域,且每个图像区域在所属一帧图像中各图像区域间行人密集程度的排序相同;基于所述图像区域集合内图像区域的空间分布情况选取出一个中心区域;针对每个中心区域,从图像序列的每一帧图像抠取对应位置的图像区域,组成一个轨迹序列;利用每个轨迹序列分别进行异常事件检测。本公开实施例提供的异常事件检测方法,通过基于中心点一致性约束的启发式行人密集群体位置估计方式,不仅降低了异常事件检测的数据量,提高了检测效率,还为异常事件检测提供了有效感知野,提高了检测准确性。
本公开实施例提供了一种异常事件检测装置。图11为本公开实施例提供的一种异常事件检测装置的结构示意图。如图11所示，在本公开的实施例中，异常事件检测装置1100包括：区域确定部分1101，被配置为获取图像序列，并从所述图像序列中确定出至少一个图像区域集合；每个所述图像区域集合中，包括所述图像序列中每一帧图像内的一个图像区域，且每个所述图像区域在所属一帧图像中各图像区域间行人密集程度的排序相同；区域选取部分1102，被配置为针对每个图像区域集合，基于所述图像区域集合内的图像区域的空间分布情况，选取出一个中心区域；序列确定部分1103，被配置为针对每个中心区域，从所述图像序列的每一帧图像抠取对应位置的所述图像区域，组成一个轨迹序列；异常检测部分1104，被配置为利用每个轨迹序列分别进行异常事件检测。
在本公开一实施例中,所述区域确定部分1101,还被配置为针对所述图像序列的每一帧图像,按照行人密集程度由大到小,依次确定出至少一个所述图像区域,得到所述图像对应的图像区域组;将所述图像序列中每一帧图像分别对应的图像区域组中,行人密集程度排序相同的图像区域划分至同一集合,得到所述至少一个图像区域集合。
在本公开一实施例中,所述区域确定部分1101,还被配置为针对所述图像序列的每一帧图像,获取所述图像对应的每个行人检测框,并分别扩大成一个扩展检测框,组成所述图像对应的一个扩展检测框组;针对每一所述扩展检测框组,确定所述扩展检测框组包含的不同扩展检测框之间的重叠比,并根据得到的重叠比构建所述扩展检测框组对应的第一邻接矩阵;针对每一所述扩展检测框组,基于所述扩展检测框组对应的第一邻接矩阵,确定所述扩展检测框组包含的每个扩展检测框的匹配次数,并按照匹配次数从大到小,依次选取至少一个扩展检测框,得到所述图像对应的一组中心检测框;针对所述图像序列的每一帧图像,基于所述图像对应的一组中心检测框,确定所述图像对应的一个图像区域组。
在本公开一实施例中,所述区域确定部分1101,还被配置为从所述图像中获取至少一个行人检测框;在所述图像中,将所述至少一个行人检测框中每个行人检测框分别作为中心,按照第一预设比例外扩形成一个扩展检测框;将外扩得到的扩展检测框组成所述图像对应的一个扩展检测框组。
在本公开一实施例中,所述区域确定部分1101,还被配置为从所述图像中,获取组成对应一组中心检测框的至少一个扩展检测框对应的至少一个行人检测框;从所述图像中,将所述每个行人检测框分别作为中心,按照第二预设比例外扩形成一个图像区域;将外扩得到的图像区域组成所述图像对应的一个图像区域组。
在本公开一实施例中,所述区域选取部分1102,还被配置为确定所述图像区域集合内不同图像区域之间的重叠比,并根据得到的重叠比构建对应的第二邻接矩阵;基于对应的第二邻接矩阵,确定所述图像区域集合内每个图像区域的匹配次数;选取出匹配次数最大的图像区域作为所述图像区域集合对应的中心区域。
在本公开一实施例中,所述异常检测部分1104,还被配置为利用预设异常事件检测模型,对每个所述轨迹序列进行异常事件检测,得到对应的异常事件检测结果。
本公开实施例提供了一种异常事件检测装置，包括：区域确定部分，被配置为获取图像序列，并从图像序列中确定出至少一个图像区域集合；每个图像区域集合中，包括图像序列中每一帧图像内的一个图像区域，且每个图像区域在所属一帧图像中各图像区域间行人密集程度的排序相同；区域选取部分，被配置为针对每个图像区域集合，基于所述图像区域集合内图像区域的空间分布情况选取出一个中心区域；序列确定部分，被配置为针对每个中心区域，从图像序列的每一帧图像抠取对应位置的图像区域，组成一个轨迹序列；异常检测部分，被配置为利用每个轨迹序列分别进行异常事件检测。本公开实施例提供的异常事件检测装置，通过基于中心点一致性约束的启发式行人密集群体位置估计方式，不仅降低了异常事件检测的数据量，提高了检测效率，还为异常事件检测提供了有效感知野，提高了检测准确性。
本公开实施例提供了一种电子设备。图12为本公开实施例提供的一种电子设备的结构示意图。如图12所示,在本公开的实施例中,电子设备1200包括:处理器1201、存储器1202和通信总线1203;其中,所述通信总线1203,被配置为实现所述处理器1201和所述存储器1202之间的连接通信;所述处理器1201,被配置为执行所述存储器1202中存储的一个或者多个程序时,实现上述异常事件检测方法。
本公开实施例提供了一种计算机可读存储介质，所述计算机可读存储介质存储有一个或者多个程序，所述一个或者多个程序被一个或者多个处理器执行时，实现上述异常事件检测方法。计算机可读存储介质可以是易失性存储器（volatile memory，VM），例如随机存取存储器（Random-Access Memory，RAM）；或者非易失性存储器（non-volatile memory，NVM），例如只读存储器（Read-Only Memory，ROM），快闪存储器（flash memory），硬盘（Hard Disk Drive，HDD）或固态硬盘（Solid-State Drive，SSD）；也可以是包括上述存储器之一或任意组合的各种设备，如移动电话、计算机、平板设备和个人数字助理等。
本领域内的技术人员应明白,本公开的实施例可提供为方法、系统或计算机程序产品。因此,本公开可采用硬件实施例、软件实施例或结合软件和硬件方面的实施例的形式。而且,本公开可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器和光学存储器等)上实施的计算机程序产品的形式。
本公开是参照根据本公开实施例的方法、设备(系统)和计算机程序产品的流程图和方框图中的至少之一来描述的。应理解可由计算机程序指令实现流程图和方框图中的每一流程和方框中的至少之一、以及由计算机程序指令实现流程图和方框图中的流程和方框的结合中的至少之一。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程信号处理设备的处理器以产生一个机器,使得通过计算机或其他可编程信号处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和方框图一个方框或多个方框中的至少一个指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程信号处理设备以特定方式工作的计算机可读存储器中，使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品，该指令装置实现在流程图的至少一个流程和方框图的至少一个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程信号处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图的至少一个流程和方框图的至少一个方框中指定的功能的步骤。
以上所述,仅为本公开的较佳实施例而已,并非用于限定本公开的保护范围。
工业实用性
本公开实施例提供了一种异常事件检测方法、装置、电子设备、存储介质及计算机程序产品,方法包括:获取图像序列,并从图像序列中确定出至少一个图像区域集合;每个图像区域集合中,包括图像序列中每一帧图像内的一个图像区域,且每个图像区域在所属一帧图像中各图像区域间行人密集程度的排序相同;基于所述图像区域集合内图像区域的空间分布情况选取出一个中心区域;针对每个中心区域,从图像序列的每一帧图像抠取对应位置的图像区域,组成一个轨迹序列;利用每个轨迹序列分别进行异常事件检测。本公开实施例提供的技术方案,通过基于中心点一致性约束的启发式行人密集群体位置估计方式,不仅降低了异常事件检测的数据量,提高了检测效率,还为异常事件检测提供了有效感知野,提高了检测准确性。

Claims (17)

  1. 一种异常事件检测方法,所述方法由电子设备执行,所述方法包括:
    获取图像序列,并从所述图像序列中确定出至少一个图像区域集合;每个所述图像区域集合中,包括所述图像序列中每一帧图像内的一个图像区域,且每个所述图像区域在所属一帧图像中各图像区域间行人密集程度的排序相同;
    针对每个图像区域集合,基于所述图像区域集合内图像区域的空间分布情况选取出一个中心区域;
    针对每个中心区域,从所述图像序列的每一帧图像抠取对应位置的所述图像区域,组成一个轨迹序列;
    利用每个轨迹序列分别进行异常事件检测。
  2. 根据权利要求1所述的方法,其中,所述从所述图像序列中确定出至少一个图像区域集合,包括:
    针对所述图像序列的每一帧图像,按照行人密集程度由大到小,依次确定出至少一个所述图像区域,得到所述图像对应的图像区域组;
    将所述图像序列中每一帧图像分别对应的图像区域组中,行人密集程度排序相同的图像区域划分至同一集合,得到所述至少一个图像区域集合。
  3. 根据权利要求2所述的方法,其中,所述针对所述图像序列的每一帧图像,按照行人密集程度由大到小,依次确定出至少一个所述图像区域,得到所述图像对应的图像区域组,包括:
    针对所述图像序列的每一帧图像,获取所述图像对应的每个行人检测框,并分别扩大成一个扩展检测框,组成所述图像对应的一个扩展检测框组;
    针对每一所述扩展检测框组,确定所述扩展检测框组包含的不同扩展检测框之间的重叠比,并根据得到的重叠比构建所述扩展检测框组对应的第一邻接矩阵;
    针对每一所述扩展检测框组,基于所述扩展检测框组对应的第一邻接矩阵,确定所述扩展检测框组包含的每个扩展检测框的匹配次数,并按照匹配次数从大到小,依次选取至少一个扩展检测框,得到所述图像对应的一组中心检测框;
    针对所述图像序列的每一帧图像,基于所述图像对应的一组中心检测框,确定所述图像对应的一个图像区域组。
  4. 根据权利要求3所述的方法,其中,所述针对所述图像序列的每一帧图像,获取所述图像对应的每个行人检测框,并分别扩大成一个扩展检测框,组成所述图像对应的一个扩展检测框组,包括:
    从所述图像中获取至少一个行人检测框;
    在所述图像中,将所述至少一个行人检测框中每个行人检测框分别作为中心,按照第一预设比例外扩形成一个扩展检测框;
    将外扩得到的扩展检测框组成所述图像对应的一个扩展检测框组。
  5. 根据权利要求3所述的方法,其中,所述基于所述图像对应的一组中心检测框,确定所述图像对应的一个图像区域组,包括:
    从所述图像中,获取组成对应的一组中心检测框的至少一个扩展检测框对应的至少一个行人检测框;
    从所述图像中,将所述每个行人检测框分别作为中心,按照第二预设比例外扩形成一个图像区域;
    将外扩得到的图像区域组成所述图像对应的一个图像区域组。
  6. 根据权利要求1所述的方法,其中,所述基于所述图像区域集合内图像区域的空间分布情况选取出一个中心区域,包括:
    确定所述图像区域集合内不同图像区域之间的重叠比,并根据得到的重叠比构建对应的第二邻接矩阵;
    基于对应的第二邻接矩阵,确定所述图像区域集合内每个图像区域的匹配次数;
    选取出匹配次数最大的图像区域作为所述图像区域集合对应的中心区域。
  7. 根据权利要求1所述的方法,其中,所述利用每个轨迹序列分别进行异常事件检测,包括:
    利用预设异常事件检测模型,对每个所述轨迹序列分别进行异常事件检测,得到对应的异常事件检测结果。
  8. 一种异常事件检测装置,所述装置包括:
    区域确定部分,被配置为获取图像序列,并从所述图像序列中确定出至少一个图像区域集合;每个所述图像区域集合中,包括所述图像序列中每一帧图像内的一个图像区域,且每个所述图像区域在所属一帧图像中各图像区域间行人密集程度的排序相同;
    区域选取部分,被配置为针对每个图像区域集合,基于所述图像区域集合内图像区域的空间分布情况选取出一个中心区域;
    序列确定部分,被配置为针对每个中心区域,从所述图像序列的每一帧图像抠取对应位置的所述图像区域,组成一个轨迹序列;
    异常检测部分,被配置为利用每个轨迹序列分别进行异常事件检测。
  9. 根据权利要求8所述的装置,其中,所述区域确定部分,还被配置为针对所述图像序列的每一帧图像,按照行人密集程度由大到小,依次确定出至少一个所述图像区域,得到所述图像对应的图像区域组;
    将所述图像序列中每一帧图像分别对应的图像区域组中,行人密集程度排序相同的图像区域划分至同一集合,得到所述至少一个图像区域集合。
  10. 根据权利要求9所述的装置,其中,所述区域确定部分,还被配置为针对所述图像序列的每一帧图像,获取所述图像对应的每个行人检测框,并分别扩大成一个扩展检测框,组成所述图像对应的一个扩展检测框组;
    针对每一所述扩展检测框组，确定所述扩展检测框组包含的不同扩展检测框之间的重叠比，并根据得到的重叠比构建所述扩展检测框组对应的第一邻接矩阵；
    针对每一所述扩展检测框组,基于所述扩展检测框组对应的第一邻接矩阵,确定所述扩展检测框组包含的每个扩展检测框的匹配次数,并按照匹配次数从大到小,依次选取至少一个扩展检测框,得到所述图像对应的一组中心检测框;
    针对所述图像序列的每一帧图像,基于所述图像对应的一组中心检测框,确定所述图像对应的一个图像区域组。
  11. 根据权利要求10所述的装置,其中,所述区域确定部分,还被配置为从所述图像中获取至少一个行人检测框;
    在所述图像中,将所述至少一个行人检测框中每个行人检测框分别作为中心,按照第一预设比例外扩形成一个扩展检测框;
    将外扩得到的扩展检测框组成所述图像对应的一个扩展检测框组。
  12. 根据权利要求10所述的装置,其中,所述区域确定部分,还被配置为从所述图像中,获取组成对应的一组中心检测框的至少一个扩展检测框对应的至少一个行人检测框;
    从所述图像中,将所述每个行人检测框分别作为中心,按照第二预设比例外扩形成一个图像区域;
    将外扩得到的图像区域组成所述图像对应的一个图像区域组。
  13. 根据权利要求8所述的装置,其中,所述区域选取部分,还被配置为确定所述图像区域集合内不同图像区域之间的重叠比,并根据得到的重叠比构建对应的第二邻接矩阵;
    基于对应的第二邻接矩阵,确定所述图像区域集合内每个图像区域的匹配次数;
    选取出匹配次数最大的图像区域作为所述图像区域集合对应的中心区域。
  14. 根据权利要求8所述的装置,其中,所述异常检测部分,还被配置为利用预设异常事件检测模型,对每个所述轨迹序列进行异常事件检测,得到对应的异常事件检测结果。
  15. 一种电子设备,所述电子设备包括:处理器、存储器和通信总线;其中,
    所述通信总线,被配置为实现所述处理器和所述存储器之间的连接通信;
    所述处理器,被配置为执行所述存储器中存储的一个或者多个程序,实现权利要求1至7任一项所述的异常事件检测方法。
  16. 一种计算机可读存储介质,所述计算机可读存储介质存储有一个或者多个程序,所述一个或者多个程序被一个或者多个处理器执行时,实现权利要求1至7任一项所述的异常事件检测方法。
  17. 一种计算机程序产品,所述计算机程序产品包括存储了计算机程序的非瞬时性计算机存储介质,所述计算机程序被计算机读取并执行,实现权利要求1至7任一项所述的异常事件检测方法。