WO2022222445A1 - Event detection and output method, event policy determination method and apparatus, electronic device, and computer-readable storage medium - Google Patents

Event detection and output method, event policy determination method and apparatus, electronic device, and computer-readable storage medium

Info

Publication number
WO2022222445A1
WO2022222445A1 · PCT/CN2021/130349 · CN2021130349W
Authority
WO
WIPO (PCT)
Prior art keywords
event
output
alarm information
detected
preset
Prior art date
Application number
PCT/CN2021/130349
Other languages
English (en)
French (fr)
Inventor
熊梓云 (Xiong Ziyun)
李亚南 (Li Yanan)
Original Assignee
Shenzhen SenseTime Technology Co., Ltd. (深圳市商汤科技有限公司)
Priority date
Filing date
Publication date
Application filed by Shenzhen SenseTime Technology Co., Ltd.
Publication of WO2022222445A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/23418 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N 21/234309 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N 21/266 Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
    • H04N 21/2662 Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N 21/440218 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/488 Data services, e.g. news ticker
    • H04N 21/4882 Data services, e.g. news ticker for displaying messages, e.g. warnings, reminders
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/44 Event detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Definitions

  • the present disclosure relates to the field of computer vision technology, and in particular, to an event detection and output method, an event policy determination method and apparatus, an electronic device, and a computer-readable storage medium.
  • the target video is usually decoded to obtain image frames, target-object recognition is then performed on the image frames, and after the target object is recognized, the corresponding event information is output.
  • Embodiments of the present disclosure provide at least an event detection and output method, an event policy determination method and apparatus, an electronic device, and a computer-readable storage medium.
  • an embodiment of the present disclosure provides a method for detecting and outputting an event, including:
  • the alarm information of each of the events is output.
  • the output of the alarm information is more efficient.
  • the frequency conforms to human perception and is closer to the frequency at which events actually occur, which can improve the intelligence of event detection and thereby system performance. Event alarm information can also be output based on different task types.
  • the preset decoding mode is determined by the type of the event; the preset decoding mode includes a key frame decoding mode or an all frame decoding mode.
  • determining the preset decoding mode according to the type of the event can not only improve the decoding efficiency, but also reduce the occurrence of missed detection.
  • the decoding of the acquired target video according to the preset decoding mode includes:
  • the alarm policy includes a second preset number of parsed image frames and a second preset time; outputting the alarm information of each event according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames includes:
  • the alarm information of the event is output, so that misjudgments caused by flashing frames can be reduced.
  • the alarm policy further includes a third preset time; in the case that the parsing results of each parsed image frame in the second preset number of parsed image frames are all occurrences, outputting the alarm information of the event includes:
  • the alarm information of the event is output.
  • the alarm information of the event is output, which can reduce misjudgments caused by events that occur only for a short time.
  • the outputting the alarm information of the event includes:
  • alarm information including the event state is output; the event state includes at least one of an event start state, an event ongoing state, and an event end state.
  • a more eye-catching prompt can be provided for the user, so that the user can clearly understand the current state of the event.
  • the alarm policy further includes a fourth preset time; in the case that the event state is the event-continuing state, outputting the alarm information including the event state includes:
  • alarm information including the event-continuing state is output.
  • the method further comprises:
  • the alarm information including the event-continuing state is not output.
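  • The event-state logic described above (an event start alarm, an ongoing alarm repeated at most once per fourth preset time, and an event end alarm) can be sketched as follows. This is a hypothetical illustration only: the class, method names, and the `repeat_interval` parameter are assumptions, not taken from the patent.

```python
# Illustrative sketch of the event-state alarm logic (names are assumed).
from dataclasses import dataclass
from typing import Optional

@dataclass
class EventStateTracker:
    repeat_interval: float                  # fourth preset time, in seconds
    started: bool = False
    last_ongoing_output: Optional[float] = None

    def update(self, occurring: bool, now: float) -> Optional[str]:
        """Return the alarm state to output at time `now`, or None."""
        if occurring and not self.started:
            # event first observed: output the event start state
            self.started = True
            self.last_ongoing_output = now
            return "event_start"
        if occurring and self.started:
            # only repeat the "ongoing" alarm after the repeat interval
            if now - self.last_ongoing_output >= self.repeat_interval:
                self.last_ongoing_output = now
                return "event_ongoing"
            return None                      # suppress too-frequent output
        if not occurring and self.started:
            # event no longer observed: output the event end state
            self.started = False
            return "event_end"
        return None
```

Suppressing repeated "ongoing" alarms inside the repeat interval is what keeps the output frequency close to human perception rather than per-frame.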
  • an embodiment of the present disclosure provides a method for determining an event policy, including:
  • an event policy corresponding to the event to be detected is determined, and the event policy includes a decoding method and an alarm policy of alarm information.
  • the event policy corresponding to the to-be-detected event is determined based on the type of the to-be-detected event, including:
  • the decoding method is used to decode the target video to obtain a target image frame; the target image frame is used to detect the to-be-detected event, and each target image frame corresponds to an analysis result.
  • the alarm information includes: first alarm information used to characterize the occurrence of an event;
  • the alarm policy includes: a first output condition, a second preset time, and a second preset number;
  • the determining an event policy corresponding to the event to be detected based on the type of the event to be detected includes:
  • the second preset time represents the time interval between two consecutive readings of the parsing results; the second preset number represents the number of parsing results read each time; the first output condition indicates that the first alarm information is output in the case that the second preset number of parsing results are all occurrences.
  • the alarm information includes: second alarm information for representing an event start state;
  • the alarm policy includes: a third preset time and a second output condition;
  • the determining an event policy corresponding to the event to be detected based on the type of the event to be detected includes:
  • the third preset time and the second output condition corresponding to the second alarm information are determined; the second output condition indicates that the second alarm information is output when the duration since the first occurrence of the to-be-detected event reaches the third preset time.
  • the alarm information includes: third alarm information used to represent a state in which the event continues;
  • the alarm policy includes: a third output condition and a fourth preset time;
  • the determining an event policy corresponding to the event to be detected based on the type of the event to be detected includes:
  • the fourth preset time and the third output condition corresponding to the third alarm information are determined based on the type of the to-be-detected event; the third output condition indicates that, during the time the to-be-detected event is in the event-continuing state, the third alarm information is output when the time since the third alarm information was last output reaches the fourth preset time.
  • the event strategy further includes: a target neural network model; the target neural network model is used to analyze each target image frame to obtain the analysis result;
  • the determining an event policy corresponding to the event to be detected based on the type of the event to be detected includes:
  • the target neural network model corresponding to the event to be detected is determined from at least one neural network model.
  • an event detection and output device including:
  • the video decoding part is configured to decode the acquired target video according to the preset decoding method to obtain the target image frame;
  • the image analysis part is configured to input the target image frame into the target neural network model for analysis, and obtain the analysis image frame marked with the analysis result;
  • the information output part is configured to output alarm information of each event according to the alarm policy of the event corresponding to the parsed image frame and the parsed image frame.
  • the preset decoding mode is determined by the type of the event; the preset decoding mode includes a key frame decoding mode or an all frame decoding mode.
  • the video decoding portion is further configured to:
  • the alarm policy includes a second preset number of parsed image frames and a second preset time; the information output part is further configured to:
  • the alarm policy further includes a third preset time; the information output part is further configured to:
  • the alarm information of the event is output.
  • the information output part is further configured to:
  • alarm information including the event state is output; the event state includes at least one of an event start state, an event ongoing state, and an event end state.
  • the alarm policy further includes a fourth preset time; the information output part is further configured to:
  • alarm information including the event-continuing state is output.
  • the information output part is further configured to:
  • the alarm information including the event-continuing state is not output.
  • an apparatus for determining an event policy including:
  • the acquisition part is configured to acquire the target video
  • the determining part is configured to determine the type of the event to be detected based on the target video, and to determine, based on the type of the event to be detected, the event policy corresponding to the event to be detected, the event policy including a decoding method and an alarm policy of the alarm information.
  • the determining part is further configured to: determine, based on the type of the to-be-detected event, the decoding manner corresponding to the to-be-detected event from at least one preset decoding manner; the decoding manner is used to decode the target video to obtain target image frames; the target image frames are used to detect the to-be-detected event, and each target image frame corresponds to an analysis result.
  • the alarm information includes: first alarm information used to characterize the occurrence of an event;
  • the alarm policy includes: a first output condition, a second preset time, and a second preset number;
  • the determining part is further configured to: determine the second preset time, the second preset number, and the first output condition corresponding to the first alarm information based on the type of the event to be detected; the second preset time represents the time interval between two consecutive readings of the parsing results; the second preset number represents the number of parsing results read each time; the first output condition indicates that the first alarm information is output in the case that the second preset number of parsing results are all occurrences.
  • the alarm information includes: second alarm information for representing an event start state;
  • the alarm policy includes: a third preset time and a second output condition;
  • the determining part is further configured to: determine the third preset time and the second output condition corresponding to the second alarm information based on the type of the event to be detected; the second output condition indicates that the second alarm information is output when the duration since the first occurrence of the to-be-detected event reaches the third preset time.
  • the alarm information includes: third alarm information used to represent a state in which the event continues;
  • the alarm policy includes: a third output condition and a fourth preset time;
  • the determining part is further configured to: determine the fourth preset time and the third output condition corresponding to the third alarm information based on the type of the event to be detected; the third output condition indicates that, while the to-be-detected event is in the event-continuing state, the third alarm information is output when the time since the third alarm information was last output reaches the fourth preset time.
  • the event strategy further includes: a target neural network model; the target neural network model is used to analyze each target image frame to obtain the analysis result;
  • the determining part is further configured to: determine the target neural network model corresponding to the to-be-detected event from at least one neural network model based on the type of the to-be-detected event.
  • an embodiment of the present disclosure provides an electronic device, including: a processor, a memory, and a bus, where the memory stores machine-readable instructions executable by the processor; when the electronic device runs, the processor and the memory communicate through the bus, and when the machine-readable instructions are executed by the processor, the event detection and output method according to the first aspect is executed.
  • an embodiment of the present disclosure provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the event detection and output method described in the first aspect is executed.
  • an embodiment of the present disclosure provides a computer program product, which includes a computer program or instructions; when the computer program or instructions are run on a computer, the computer is caused to execute the event detection and output method described in the first aspect.
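  • As an illustration of the event policy described in the aspects above (a decoding method plus alarm parameters chosen by event type), the following sketch maps an event type to a policy. All event-type names and parameter values here are assumptions made for illustration, not values from the patent.

```python
# Hypothetical mapping from event type to an event policy (all values assumed).
from dataclasses import dataclass

@dataclass
class EventPolicy:
    decoding_mode: str        # "key_frame" or "all_frame" (preset decoding mode)
    read_interval_s: float    # second preset time: interval between readings
    batch_size: int           # second preset number of parsed image frames
    activation_time_s: float  # third preset time (event activation time)
    repeat_interval_s: float  # fourth preset time (ongoing-alarm repeat interval)

def determine_event_policy(event_type: str) -> EventPolicy:
    # Slow-changing events (e.g. an illegally set canopy or illegal parking)
    # persist for a while, so key-frame decoding and loose polling suffice;
    # other events get all-frame decoding and tighter polling.
    if event_type in ("illegal_canopy", "illegal_parking"):
        return EventPolicy("key_frame", read_interval_s=60.0, batch_size=5,
                           activation_time_s=30.0, repeat_interval_s=300.0)
    return EventPolicy("all_frame", read_interval_s=1.0, batch_size=3,
                       activation_time_s=2.0, repeat_interval_s=30.0)
```

The point of the mapping is the one made by the claims: the decoding mode and every alarm parameter follow from the event type, rather than being fixed globally.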
  • FIG. 1 shows a schematic diagram of an application system architecture of an event detection and output method or an event policy determination method provided by an embodiment of the present disclosure
  • FIG. 2 shows a flowchart of an event detection and output method provided by an embodiment of the present disclosure
  • FIG. 3 shows a flowchart of a first method for outputting alarm information provided by an embodiment of the present disclosure
  • FIG. 4 shows a schematic diagram of a parameter control output provided by an embodiment of the present disclosure
  • FIG. 5 shows a flowchart of a second method for outputting alarm information provided by an embodiment of the present disclosure
  • FIG. 6 shows a schematic diagram of outputting alarm information of a motor vehicle event provided by an embodiment of the present disclosure
  • FIG. 7 shows a flowchart of a third method for outputting alarm information provided by an embodiment of the present disclosure
  • FIG. 8 shows a flowchart of a fourth method for outputting alarm information provided by an embodiment of the present disclosure
  • FIG. 9 shows a flowchart of a method for determining an event policy provided by an embodiment of the present disclosure
  • FIG. 10 shows a schematic structural diagram of an event detection and output device provided by an embodiment of the present disclosure
  • FIG. 11 shows a schematic structural diagram of an event policy determination apparatus provided by an embodiment of the present disclosure
  • FIG. 12 shows a schematic diagram of an electronic device provided by an embodiment of the present disclosure.
  • Computer vision (CV) is a science that studies how to make machines "see". More specifically, it refers to using cameras and computers in place of human eyes to identify, track, and measure targets, and to further perform graphics processing so that the processed images are more suitable for human observation or for transmission to instruments for detection. As a scientific discipline, computer vision studies related theories and technologies in an attempt to build artificial intelligence systems that can obtain information from images or multidimensional data.
  • Computer vision technology usually includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, 3D object reconstruction, 3D technology, virtual reality, augmented reality, and simultaneous localization and mapping. It also includes common biometric identification technologies such as face recognition and fingerprint recognition.
  • the target video can be decoded to obtain the image frame, and then the target object can be recognized on the image frame, and after the target object is recognized, the corresponding event information can be output.
  • if the corresponding event information is output for every image frame in which the target object is identified, it will not only affect the user experience but also degrade system performance.
  • an embodiment of the present disclosure provides an event detection and output method.
  • the acquired target video is decoded according to a preset decoding method to obtain target image frames; the target image frames are then input into a target neural network model for parsing to obtain parsed image frames marked with parsing results; then, according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames, the alarm information of each event is output.
  • the output frequency of the alarm information of the event can be made close to the frequency at which the event actually occurs, improving the intelligence of event detection and system performance; at the same time, event alarm information can be output based on different task types.
  • an embodiment of the present disclosure also provides a method for determining an event strategy, first acquiring a target video, then determining the type of the event to be detected based on the target video, and finally determining an event strategy corresponding to the event to be detected based on the type of the event to be detected,
  • the event policy includes the decoding method and the alarm policy of the alarm information. In this way, a corresponding event policy can be determined for different events to be detected, improving flexibility in event detection.
  • the method for detecting and outputting the event or the method for determining the event policy can be applied to a terminal, or can be applied to a server, or can be applied to a system architecture composed of a terminal and a server.
  • the event detection and output method or the event policy determination method may also be software running in a terminal or server, such as an application program with event detection and output functions, or an event policy determination function.
  • the terminal may be a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc., but is not limited thereto.
  • the server can be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, domain name services, security services, big data, and artificial intelligence platforms.
  • the method for detecting and outputting the event or the method for determining the event policy may be implemented by the processor calling computer-readable instructions stored in the memory.
  • FIG. 1 is a schematic diagram of an application system architecture of an event detection and output method or an event policy determination method provided by an embodiment of the present disclosure.
  • the system architecture 100 includes a notebook computer 10 , a desktop computer 20 , a smart phone 30 and a server 40 .
  • the server 40 can communicate with the notebook computer 10 , the desktop computer 20 and the smart phone 30 respectively through the network 50 .
  • the network 50 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the server 40 is a video parsing server, configured to execute the event detection and output method provided by the embodiments of the present disclosure and output the alarm information of the event to a terminal device (such as the notebook computer 10, the desktop computer 20, or the smart phone 30 in FIG. 1), so that the user can view the corresponding alarm information through the corresponding terminal device; or it is used to execute the event policy determination method provided by the embodiments of the present disclosure.
  • the method for detecting and outputting an event includes the following S101-S103:
  • S101 Decode the acquired target video according to a preset decoding method to obtain a target image frame.
  • the target video may be a related video of the target place, and is obtained by photographing the target place with a camera disposed at the target place.
  • the target place can be an umbrella canopy site, an urban road (such as a motor vehicle lane, a non-motor vehicle lane, etc.), a kitchen, or a place where people gather (such as a stadium or a station), which is not limited here.
  • the target site in each embodiment of the present disclosure is described by taking an urban road as an example.
  • a no-parking area for vehicles can be set at the target place, and the related video of the no-parking area can be obtained by photographing it with a camera.
  • the no-parking area is a specific area where motor vehicles are prohibited from parking according to traffic laws.
  • the preset decoding mode is determined by the type of the event; the preset decoding mode includes a key frame decoding mode or an all frame decoding mode.
  • the type of event refers to the type of the target event at the target location.
  • key video frames may be extracted from all video frames of the target video every first preset number of frames or every first preset time, and the key video frames are decoded to obtain the target image frames.
  • the target video needs to be decoded to obtain image frames, the smallest units of a video; the target objects (such as umbrella canopies) can then be identified in the image frames, and the corresponding information can be output.
  • an illegally set umbrella canopy usually persists for a period of time from appearing to disappearing and does not vanish in a flash. Therefore, it is not necessary to obtain every image frame: an image frame can be obtained every first preset number of frames (such as 10 or 20 frames), or every first preset time (such as 40 ms). In this way, the amount of processing can be reduced, unnecessary processing time saved, and processing efficiency improved.
  • the preset decoding method can be determined according to the type of event. In this way, not only the decoding efficiency can be improved, but also the occurrence of missed detection can be reduced.
  • the target image frame is input into the target neural network model for analysis, and the analysis image frame marked with the analysis result is obtained.
  • the target neural network model includes a target detection neural network model, a semantic segmentation neural network model or a target tracking neural network model.
  • the target neural network model can be determined according to the event type, and the annotated parsing results obtained after analysis by different target neural network models also differ.
  • the parsing results may include detection boxes and quality scores.
  • the detection frame is used to indicate the detected target object, and the quality score is used to measure the detection accuracy of the detected target object;
  • the analysis result includes the density map and the target box;
  • the target neural network model is a target tracking neural network model, the parsing result may include a tracking number, and the tracking number is used to mark the same target object in adjacent time periods.
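  • The per-model parsing results described above can be illustrated with hypothetical containers; every field name here is an assumption for illustration, not from the patent.

```python
# Hypothetical containers for the three kinds of parsing results (names assumed).
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height), an assumed layout

@dataclass
class DetectionResult:            # from a target detection model
    boxes: List[Box]              # detection boxes marking detected objects
    quality_scores: List[float]   # detection accuracy per box

@dataclass
class SegmentationResult:         # from a semantic segmentation model
    density_map: List[List[float]]
    target_boxes: List[Box]

@dataclass
class TrackingResult:             # from a target tracking model
    boxes: List[Box]
    track_ids: List[int]          # the same id marks the same object over time
```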
  • the alarm information of each event is output, so that the output frequency of the alarm information conforms to human perception and is closer to the true frequency at which the event occurs, which in turn can improve the intelligence of event detection.
  • the alarm policy may be preset according to the type of the event and/or the user's requirement.
  • the alarm policy includes a second preset number of parsed image frames and a second preset time.
  • the alarm information of each event is output, including S1031-S1032:
  • S1031: At intervals of the second preset time, read the parsing results of the adjacent second preset number of parsed image frames.
  • the second preset number is described by taking M as an example, where M is a positive integer greater than or equal to 1.
  • in the figure, one symbol indicates that the event does not occur, and the other indicates that the event occurs.
  • a parsed image frame corresponding to each image frame will be obtained.
  • the parsed image frame is marked with the parsing result.
  • starting from the initial image frame, M frames can be read at a time.
  • the analysis result of each image frame is read and whether the event occurs is judged according to the analysis result; alarm information indicating that the event has occurred is output only when the analysis result of every one of the M read frames is "occurred". Conversely, if the analysis result of any image frame among the M frames is "not occurred", no alarm information is output.
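The M-consecutive-frames check described above can be sketched as follows (a minimal illustration; the function name and the boolean encoding of "occurred" are assumptions):

```python
def should_alarm(parse_results, m):
    """Return True only if the last M parsing results are all 'occurred'.

    parse_results: list of booleans, True meaning the event occurred in
    that parsed image frame. A single False (e.g. a flashing frame)
    suppresses the alarm, which reduces misjudgments from flash frames.
    """
    window = parse_results[-m:]
    return len(window) == m and all(window)
```

For example, three consecutive "occurred" results trigger the alarm, while any "not occurred" result in the window suppresses it.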
  • the parsing results can be read once every specific time a, and the specific time a can be set according to the event type and the user's needs. Taking post detection as an example, "occurred" means that the staff member has left the post, and "not occurred" means that the staff member is on duty. In the case that staffing of the post is relatively flexible, the interval can be longer (for example, the analysis results of M image frames can be read once every 5 minutes), which reduces the waste of resources.
  • the alarm policy further includes a third preset time; for the above step S1032, outputting the alarm information of the event when the parsing result of each of the second preset number of parsed image frames is "occurred" may include the following S10321 to S10322:
  • the third preset time is also called the event activation time. Referring again to FIG. 4, if the event occurs but the duration from its start to the current time is less than the activation time of e seconds, no message is output to the user; if the event continues for the activation time of e seconds, it is determined that the event is in the event start state, and the corresponding alarm information is output.
  • the activation time is described below by taking the illegal parking of a motor vehicle as an example.
  • an activation time of 2 minutes is taken as an example. Referring to FIG. 6, it is determined that an event occurs after the motor vehicle is first detected as parked; the vehicle is then determined to be parked several more times, but the activation time of 2 minutes has not yet been reached, so no corresponding alarm information is output. When the parking time reaches 2 minutes, the corresponding alarm information is output. In this way, misjudgments caused by a motor vehicle leaving after a short pause (for example, 1 minute) can be reduced, thereby improving the reliability of the detection and output method.
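The activation-time rule above can be sketched as follows (the function name and the timestamp representation in seconds are assumed for illustration):

```python
def alarm_after_activation(first_seen, now, activation_seconds=120):
    """Report the event only once it has persisted for the activation time.

    first_seen / now: timestamps in seconds. A motor vehicle that pauses
    for 1 minute and then leaves never reaches the 2-minute activation
    time, so no alarm is raised for it.
    """
    return (now - first_seen) >= activation_seconds
```

The activation time would typically be configured per event type, as the surrounding text describes.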
  • outputting the alarm information of the event may include the following S103221-S103222:
  • S103221: Determine the event state of the event according to the current result of whether the event occurs and the result of whether the event occurred in the preceding record.
  • Output alarm information including the event state; the event state includes at least one of an event start state, an event ongoing state, and an event end state.
  • the event state judgment may be performed according to the parsing result. Referring again to FIG. 4, when the current result is "occurring", if the previous result is also "occurring", the event state is assigned "event ongoing"; otherwise, the event has just started, and the event state is assigned "event start". When the current result is "not occurring", if the previous result is also "not occurring", the event state is assigned "event not occurred"; otherwise, the event has just ended, and the event state is assigned "event ended".
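The state assignment described above maps directly onto a small two-bit lookup (an illustrative sketch; the state strings follow the wording of this description rather than any actual API):

```python
def event_state(current, previous):
    """Derive the event state from the current and previous parse results.

    current / previous: booleans, True meaning the event occurred in
    that parsing result.
    """
    if current and previous:
        return "event ongoing"
    if current and not previous:
        return "event start"
    if not current and previous:
        return "event ended"
    return "event not occurred"
```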
  • as for the alarm information when the event ends: since the current result is "not occurred", corresponding alarm information need not be output at this time. In other implementations, if alarm information needs to be output when the event ends according to actual requirements, the alarm information may be output when the event does not occur but is in the event end state.
  • the alarm policy further includes a fourth preset time; as shown in FIG. 8, for the above step S10322, in the case that the event state is the event ongoing state, the following steps S10322a to S10322c may be performed:
  • the fourth preset time is also called cooling time.
  • the cool-down time refers to the alarm-free time during the continuous occurrence of the event.
  • the cooling time is c seconds. Therefore, when the event is in progress, the alarm information including the state in which the event is in progress is output at intervals of c seconds.
  • the cooling time is 60 seconds; that is, after it is determined that the motor vehicle is parked, the alarm information is output at a frequency of once every 60 seconds, which reduces the redundant output of events and the occurrence of continuous alarms for the same event.
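The cooldown rule can be sketched in the same style (names assumed; timestamps are in seconds):

```python
def should_realarm(last_alarm_time, now, cooldown_seconds=60):
    """During an ongoing event, re-emit the alarm only once per cooldown.

    last_alarm_time / now: timestamps in seconds. Between two alarms the
    event stays silent, suppressing continuous alarms for the same event.
    """
    return (now - last_alarm_time) >= cooldown_seconds
```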
  • the writing order of the steps does not imply a strict execution order and does not constitute any limitation on the implementation process; the execution order of the steps should be determined by their functions and possible internal logic.
  • the embodiment of the present disclosure also provides an event detection and output apparatus corresponding to the event detection and output method. Since the principle by which the apparatus in the embodiment of the present disclosure solves the problem is similar to that of the above event detection and output method of the embodiment of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated descriptions are omitted.
  • the method for determining an event policy includes the following S201-S203:
  • the target video may be a related video of the target place, and is obtained by photographing the target place with a camera set at the target place.
  • the target place can be a place with umbrella canopies, an urban road (such as a motor vehicle lane or a non-motor vehicle lane), a kitchen, or a place where people gather (such as a stadium or a station), which is not limited here.
  • target detection can be performed on the target video to obtain the shooting object of the target video, and the type of the event to be detected can be determined according to the shooting object. For example, in the case that the shooting object of the target video is a motor vehicle lane, the type of the event to be detected can be determined to be illegal behavior on the motor vehicle lane; for another example, in the case that the shooting object of the target video is a shopping mall, the type of the event to be detected can be determined to be people-flow counting behavior.
  • S203 Determine an event policy corresponding to the event to be detected based on the type of the event to be detected, where the event policy includes a decoding method and an alarm policy of alarm information.
  • each type of event to be detected corresponds to an event strategy, and different types of events to be detected usually have different event strategies; for example, the event strategy corresponding to the to-be-detected event of illegal behavior on a motor vehicle lane differs from the event strategy corresponding to the to-be-detected event of people-flow counting behavior.
  • the target video is obtained first, the type of the event to be detected is then determined based on the target video, and finally the event strategy corresponding to the event to be detected is determined based on the type of the event to be detected; the event strategy includes the decoding method and the alarm policy of the alarm information. In this way, a corresponding event strategy can be determined for different events to be detected, improving the flexibility of event detection.
  • the event policy corresponding to each type of event to be detected includes: a decoding method and an alarm policy of alarm information.
  • the target video can be decoded by the decoding method to obtain the target image frame, so as to determine the event state of the event to be detected according to the target image frame.
  • the event strategy corresponding to each type of event to be detected further includes: a target neural network model for analyzing each target image frame to obtain the analysis result of each target image frame.
  • the above S203 can be implemented by S2031-S2032:
  • the decoding method is used to decode the target video to obtain the target image frame; the target image frame is used to detect the event to be detected, and each target image frame corresponds to an analysis result.
  • the decoding methods of different to-be-detected events may be different.
  • at least one decoding method is preset, for example, it may be a key frame decoding method and an all-frame decoding method.
  • the decoding method corresponding to the event may be a key frame decoding method or an all-frame decoding method.
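A minimal sketch of choosing a preset decoding method by event type (the mapping table is hypothetical; the source does not specify which event type uses which method):

```python
# Hypothetical event-type -> decoding-method table; the concrete pairing
# is configuration-dependent and is assumed here for illustration only.
DECODE_METHOD = {
    "illegal_umbrella_canopy": "key_frame",
    "people_flow_counting": "all_frame",
}

def decoding_method_for(event_type, default="all_frame"):
    """Select the preset decoding method for an event type."""
    return DECODE_METHOD.get(event_type, default)
```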
  • the alarm information includes: first alarm information used to characterize the occurrence of an event;
  • the alarm policy includes: a first output condition, a second preset time, and a second preset number; the above S203 can be implemented by S2033:
  • the second preset time represents the time interval between two adjacent readings of the parsing results
  • the second preset number represents the number of parsing results read each time
  • the first output condition represents that the first alarm information is output when the second preset number of parsing results all occur.
  • the second preset time may be a seconds, and the second preset number may be M; for example, as shown in FIG. 4 above, where one symbol indicates that the event does not occur and the other indicates that the event occurs, reading may start from the initial image frame.
  • different types of events to be detected may correspond to different M and a, or, different types of events to be detected may correspond to different M or a.
  • the alarm information includes: second alarm information used to represent the start state of the event;
  • the alarm policy includes: a third preset time and a second output condition; the above S203 can be implemented through S2034:
  • S2034. Determine, based on the type of the event to be detected, the third preset time and the second output condition corresponding to the second alarm information; the second output condition represents that the second alarm information is output in the case that the duration between the current occurrence of the event to be detected and the first occurrence of the event to be detected reaches the third preset time.
  • the third preset time is also called the event activation time.
  • the third preset time may be e seconds; continuing to refer to FIG. 4 above, when the event to be detected occurs but the duration from its start to the current time is less than the activation time of e seconds, the second alarm information is not output to the user; if the event to be detected continues for the activation time of e seconds, it is determined that the event to be detected is in the event start state, and the second alarm information is output.
  • e corresponding to different types of events to be detected may be different.
  • the alarm information includes: third alarm information used to represent the ongoing state of the event; the alarm policy includes: a third output condition and a fourth preset time; the above S203 can be implemented through S2035:
  • the fourth preset time is also called cooling time.
  • the fourth preset time may be c seconds; with continued reference to the above FIG. 4 , when the event to be detected continues, the third alarm information may be output every c seconds.
  • the c corresponding to different types of events to be detected may be different.
  • the event strategy further includes: a target neural network model; the target neural network model is used to parse each target image frame to obtain an analysis result; the above S203 can be implemented by S2036:
  • different types of events to be detected may correspond to different neural network models. At least one neural network model is preset, for example a target detection neural network model, a semantic segmentation neural network model, and a target tracking neural network model, from which the neural network model corresponding to the event to be detected can be selected as the target neural network model.
  • a corresponding event strategy can be adapted for each type of to-be-detected events through the above method.
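The per-event-type strategy adaptation described above can be sketched as a lookup of strategy records (all field names and values here are hypothetical examples, not values given by the source):

```python
from dataclasses import dataclass

# Sketch of an event-strategy record; field names are assumptions, not
# the patent's actual data structure.
@dataclass
class EventStrategy:
    decoding_method: str      # "key_frame" or "all_frame"
    model: str                # e.g. "target_detection"
    activation_seconds: int   # third preset time (e)
    cooldown_seconds: int     # fourth preset time (c)

# Hypothetical per-event-type configuration table.
STRATEGIES = {
    "illegal_parking": EventStrategy("key_frame", "target_tracking", 120, 60),
    "crowd_density": EventStrategy("all_frame", "semantic_segmentation", 10, 30),
}

def strategy_for(event_type):
    """Return the event strategy adapted to a given event type."""
    return STRATEGIES[event_type]
```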
  • the event policy determination method provides robust message filtering and event alerting methods.
  • the event detection and output device 500 includes:
  • the video decoding part 501 is configured to decode the acquired target video according to the preset decoding mode to obtain the target image frame;
  • the image analysis part 502 is configured to input the target image frame into the target neural network model for analysis, and obtain the analysis image frame marked with the analysis result;
  • the information output part 503 is configured to output alarm information of each event according to the alarm policy of the event corresponding to the parsed image frame and the parsed image frame.
  • the preset decoding mode is determined by the type of the event; the preset decoding mode includes a key frame decoding mode or an all frame decoding mode.
  • the video decoding section 501 is further configured to:
  • the alarm policy includes a second preset number of parsed image frames and a second preset time; the information output part 503 is further configured to:
  • the alarm policy further includes a third preset time; the information output part 503 is further configured to:
  • the alarm information of the event is output.
  • the information output part 503 is further configured to:
  • output alarm information including the event state; the event state includes at least one of an event start state, an event ongoing state, and an event end state.
  • the alarm policy further includes a fourth preset time; the information output part 503 is further configured to:
  • alarm information including the event-continuing state is output.
  • the information output part 503 is further configured to:
  • the alarm information including the event-continuing state is not output.
  • a "part" may be a part of a circuit, a part of a processor, a part of a program or software, and the like; of course, it may also be a unit, and it may be modular or non-modular.
  • the event policy determination apparatus 600 includes:
  • an acquisition part 601 configured to acquire a target video
  • the determining part 602 is configured to determine the type of the event to be detected based on the target video, and to determine, based on the type of the event to be detected, the event strategy corresponding to the event to be detected, the event strategy including a decoding method and an alarm policy of alarm information.
  • the determining part 602 is further configured to: determine the decoding mode corresponding to the to-be-detected event from at least one preset decoding mode based on the type of the to-be-detected event; The decoding method is used to decode the target video to obtain a target image frame; the target image frame is used to detect the to-be-detected event, and each target image frame corresponds to an analysis result.
  • the alarm information includes: first alarm information used to characterize the occurrence of an event;
  • the alarm policy includes: a first output condition, a second preset time, and a second preset number;
  • the determining part 602 is further configured to: determine, based on the type of the event to be detected, the second preset time, the second preset number, and the first output condition corresponding to the first alarm information; the second preset time represents the time interval between two adjacent readings of the parsing results; the second preset number represents the number of parsing results read each time; the first output condition represents that the first alarm information is output in the case that the second preset number of parsing results are all "occurred".
  • the alarm information includes: second alarm information for representing an event start state;
  • the alarm policy includes: a third preset time and a second output condition;
  • the determining part 602 is further configured to: determine the third preset time and the second output condition corresponding to the second alarm information based on the type of the event to be detected; the second output The condition means that the second alarm information is output when the duration from the occurrence of the to-be-detected event to the first occurrence of the to-be-detected event reaches the third preset time.
  • the alarm information includes: third alarm information used to represent a state in which the event continues; the alarm policy includes: a third output condition and a fourth preset time;
  • the determining part 602 is further configured to: determine the fourth preset time and the third output condition corresponding to the third alarm information based on the type of the event to be detected; the third output The condition represents that the third alarm information is output when the time when the to-be-detected event is in the event-continuing state has reached the fourth preset time since the last time the third alarm information was output.
  • the event strategy further includes: a target neural network model; the target neural network model is used to analyze each target image frame to obtain the analysis result;
  • the determining part 602 is further configured to: determine the target neural network model corresponding to the to-be-detected event from at least one neural network model based on the type of the to-be-detected event.
  • as shown in the schematic structural diagram provided by an embodiment of the present disclosure, the electronic device 700 includes a processor 701, a memory 702, and a bus 703.
  • the memory 702 is configured to store execution instructions and includes an internal memory 7021 and an external memory 7022; the internal memory 7021, also called main memory, is configured to temporarily store operation data in the processor 701 and data exchanged with the external memory 7022 such as a hard disk.
  • the processor 701 exchanges data with the external memory 7022 through the memory 7021 .
  • the memory 702 is further configured to store application program codes for executing the technical solutions of the present disclosure, and the execution is controlled by the processor 701 . That is, when the electronic device 700 is running, the processor 701 communicates with the memory 702 through the bus 703, so that the processor 701 executes the application code stored in the memory 702, thereby executing the method described in any of the foregoing embodiments.
  • the memory 702 may be, but not limited to, random access memory (Random Access Memory, RAM), read only memory (Read Only Memory, ROM), programmable read only memory (Programmable Read-Only Memory, PROM), or Erasable Programmable Read-Only Memory (EPROM), Electrical Erasable Programmable Read-Only Memory (EEPROM), etc.
  • the processor 701 may be an integrated circuit chip with signal processing capability.
  • the above-mentioned processor can be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it can also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC) , Field Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components.
  • a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the structures illustrated in the embodiments of the present disclosure do not constitute a specific limitation on the electronic device 700 .
  • the electronic device 700 may include more or less components than shown, or combine some components, or separate some components, or different component arrangements.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • Embodiments of the present disclosure further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium.
  • when the computer program is run by a processor, the steps of the event detection and output method in the foregoing method embodiments are executed, or the steps of the event policy determination method are executed.
  • the storage medium may be a volatile or non-volatile computer-readable storage medium.
  • embodiments of the present disclosure further provide a computer program product, where the computer program product carries program code, and the program code includes instructions that can be used to execute the steps of the event detection and output method in the above method embodiments, or the steps of the event policy determination method; reference may be made to the above method embodiments.
  • the above-mentioned computer program product can be specifically implemented by means of hardware, software or a combination thereof.
  • the computer program product is embodied as a computer storage medium; in another optional embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), and the like.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the functions, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a processor-executable non-volatile computer-readable storage medium.
  • in essence, the technical solutions of the present disclosure, the parts that contribute to the prior art, or parts of the technical solutions may be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • a computer-readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of computer-readable storage media includes: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in grooves with instructions stored thereon, and any suitable combination of the above.
  • computer-readable storage media are not to be construed as transient signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber-optic cables), or electrical signals transmitted through electrical wires.
  • Embodiments of the present disclosure provide an event detection and output method, an event policy determination method and apparatus, an electronic device, and a computer-readable storage medium.
  • the event detection and output method includes: decoding an acquired target video according to a preset decoding method to obtain target image frames; inputting the target image frames into a target neural network model for analysis to obtain parsed image frames marked with analysis results; and outputting the alarm information of each event according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames.
  • the embodiments of the present disclosure make the output alarm information of an event closer to the true frequency at which the event occurs, improve the intelligence of event detection, and thus improve system performance; meanwhile, event alarm information can be output based on different task types.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Embodiments of the present disclosure provide an event detection and output method, an event policy determination method and apparatus, an electronic device, and a computer-readable storage medium. The event detection and output method includes: decoding an acquired target video according to a preset decoding method to obtain target image frames; inputting the target image frames into a target neural network model for analysis to obtain parsed image frames marked with analysis results; and outputting the alarm information of each event according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames.

Description

Event detection and output method, event policy determination method and apparatus, electronic device, and computer-readable storage medium
Cross-reference to related applications
The present disclosure is based on, and claims priority to, Chinese patent application No. 202110442017.2, filed on April 23, 2021 and entitled "Event detection and output method and apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of computer vision technology, and in particular to an event detection and output method, an event policy determination method and apparatus, an electronic device, and a computer-readable storage medium.
Background
In the application of computer vision technology, a target video is usually decoded to obtain image frames, target object recognition is performed on the image frames, and after a target object is recognized, corresponding event information is output.
However, since the transmission frequency of a video stream is relatively high (e.g., 25 frames per second), if image recognition and the corresponding event output are performed at this frequency, the user will receive the corresponding event output information at a high frequency; the user is thus frequently disturbed, which degrades the user experience and system performance, and event alarm information cannot be output based on different task types.
Summary
Embodiments of the present disclosure provide at least an event detection and output method, an event policy determination method and apparatus, an electronic device, and a computer-readable storage medium.
In a first aspect, an embodiment of the present disclosure provides an event detection and output method, including:
decoding an acquired target video according to a preset decoding method to obtain target image frames;
inputting the target image frames into a target neural network model for analysis to obtain parsed image frames marked with analysis results;
outputting the alarm information of each event according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames.
In the embodiments of the present disclosure, since the alarm information of each event is output according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames, rather than directly at the transmission frequency of the video stream, the frequency of the output alarm information matches human perception and is closer to the true frequency at which events occur, which improves the intelligence of event detection and thus system performance; meanwhile, since the alarm information of each event can be output according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames, event alarm information can also be output based on different task types.
According to the first aspect, in some implementations, the preset decoding method is determined by the type of the event; the preset decoding method includes a key frame decoding method or an all-frame decoding method.
In this implementation, determining the preset decoding method according to the type of the event can not only improve decoding efficiency but also reduce missed detections.
According to the first aspect, in some implementations, in the case that the preset decoding method is the key frame decoding method, decoding the acquired target video according to the preset decoding method includes:
extracting key video frames from all video frames of the target video every first preset number of frames or every first preset time, and decoding the key video frames.
In the embodiments of the present disclosure, extracting key video frames from all video frames of the target video every first preset number of frames or every first preset time achieves frame-sampled decoding and improves decoding efficiency.
According to the first aspect, in some implementations, the alarm policy includes a second preset number of parsed image frames and a second preset time; outputting the alarm information of each event according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames includes:
reading, at intervals of the second preset time, the parsing results of the adjacent second preset number of parsed image frames among the parsed image frames;
outputting the alarm information of the event in the case that the parsing result of each of the second preset number of parsed image frames is "occurred".
In the embodiments of the present disclosure, since the alarm information of the event is output only in the case that the parsing result of each of the second preset number of parsed image frames is "occurred", misjudgments caused by flashing frames can be reduced.
According to the first aspect, in some implementations, the alarm policy further includes a third preset time; outputting the alarm information of the event in the case that the parsing result of each of the second preset number of parsed image frames is "occurred" includes:
determining that the event occurs in the case that the parsing result of each of the second preset number of parsed image frames is "occurred";
outputting the alarm information of the event in the case that the duration between the current occurrence of the event and the first occurrence of the event reaches the third preset time.
In this implementation, outputting the alarm information of the event in the case that the duration between the current occurrence of the event and the first occurrence of the event reaches the third preset time (also called the activation time) can reduce misjudgments caused by short-lived events.
According to the first aspect, in some implementations, outputting the alarm information of the event includes:
determining the event state of the event according to the current result of whether the event occurs and the result of whether the event occurred in the preceding record;
outputting alarm information including the event state; the event state includes at least one of an event start state, an event ongoing state, and an event end state.
In this implementation, judging the event state can provide the user with a more conspicuous prompt, so that the user clearly understands the current state of the event.
According to the first aspect, in some implementations, the alarm policy further includes a fourth preset time; in the case that the event state is the event ongoing state, outputting the alarm information including the event state includes:
judging whether the time for which the event has been in the event ongoing state since the last output of the alarm information reaches the fourth preset time;
outputting the alarm information including the event ongoing state in the case that the time for which the event has been in the event ongoing state since the last output of the alarm information reaches the fourth preset time.
In this implementation, outputting the alarm information including the event ongoing state only when the time for which the event has been in the event ongoing state reaches the fourth preset time reduces redundant output of event alarm information and further improves system performance.
According to the first aspect, in some implementations, the method further includes:
not outputting the alarm information including the event ongoing state in the case that the time for which the event has been in the event ongoing state since the last output of the alarm information has not reached the fourth preset time.
In a second aspect, an embodiment of the present disclosure provides an event policy determination method, including:
acquiring a target video;
determining the type of an event to be detected based on the target video;
determining, based on the type of the event to be detected, an event policy corresponding to the event to be detected, the event policy including a decoding method and an alarm policy of alarm information.
According to the second aspect, in some implementations, determining the event policy corresponding to the event to be detected based on the type of the event to be detected includes:
determining, based on the type of the event to be detected, the decoding method corresponding to the event to be detected from at least one preset decoding method;
the decoding method is used to decode the target video to obtain target image frames; the target image frames are used to detect the event to be detected, and each target image frame corresponds to one parsing result.
According to the second aspect, in some implementations, the alarm information includes: first alarm information used to characterize the occurrence of an event; the alarm policy includes: a first output condition, a second preset time, and a second preset number;
determining the event policy corresponding to the event to be detected based on the type of the event to be detected includes:
determining, based on the type of the event to be detected, the second preset time, the second preset number, and the first output condition corresponding to the first alarm information;
the second preset time represents the time interval between two adjacent readings of the parsing results; the second preset number represents the number of parsing results read each time; the first output condition represents that the first alarm information is output in the case that the second preset number of parsing results are all "occurred".
According to the second aspect, in some implementations, the alarm information includes: second alarm information used to characterize an event start state; the alarm policy includes: a third preset time and a second output condition;
determining the event policy corresponding to the event to be detected based on the type of the event to be detected includes:
determining, based on the type of the event to be detected, the third preset time and the second output condition corresponding to the second alarm information; the second output condition represents that the second alarm information is output in the case that the duration between the occurrence of the event to be detected and the first occurrence of the event to be detected reaches the third preset time.
According to the second aspect, in some implementations, the alarm information includes: third alarm information used to characterize an event ongoing state; the alarm policy includes: a third output condition and a fourth preset time;
determining the event policy corresponding to the event to be detected based on the type of the event to be detected includes:
determining, based on the type of the event to be detected, the fourth preset time and the third output condition corresponding to the third alarm information; the third output condition represents that the third alarm information is output in the case that the time for which the event to be detected has been in the event ongoing state since the last output of the third alarm information reaches the fourth preset time.
According to the second aspect, in some implementations, the event policy further includes: a target neural network model; the target neural network model is used to parse each target image frame to obtain the parsing result;
determining the event policy corresponding to the event to be detected based on the type of the event to be detected includes:
determining, based on the type of the event to be detected, the target neural network model corresponding to the event to be detected from at least one neural network model.
In a third aspect, an embodiment of the present disclosure provides an event detection and output apparatus, including:
a video decoding part configured to decode an acquired target video according to a preset decoding method to obtain target image frames;
an image parsing part configured to input the target image frames into a target neural network model for analysis to obtain parsed image frames marked with analysis results;
an information output part configured to output the alarm information of each event according to the alarm policy of the event corresponding to the parsed image frames and the parsed image frames.
According to the third aspect, in some implementations, the preset decoding method is determined by the type of the event; the preset decoding method includes a key frame decoding method or an all-frame decoding method.
According to the third aspect, in some implementations, the video decoding part is further configured to:
extract key video frames from all video frames of the target video every first preset number of frames or every first preset time, and decode the key video frames.
According to the third aspect, in some implementations, the alarm policy includes a second preset number of parsed image frames and a second preset time; the information output part is further configured to:
read, at intervals of the second preset time, the parsing results of the adjacent second preset number of parsed image frames among the parsed image frames;
output the alarm information of the event in the case that the parsing result of each of the second preset number of parsed image frames is "occurred".
According to the third aspect, in some implementations, the alarm policy further includes a third preset time; the information output part is further configured to:
determine that the event occurs in the case that the parsing result of each of the second preset number of parsed image frames is "occurred";
output the alarm information of the event in the case that the duration between the current occurrence of the event and the first occurrence of the event reaches the third preset time.
According to the third aspect, in some implementations, the information output part is further configured to:
determine the event state of the event according to the current result of whether the event occurs and the result of whether the event occurred in the preceding record;
output alarm information including the event state; the event state includes at least one of an event start state, an event ongoing state, and an event end state.
According to the third aspect, in some implementations, the alarm policy further includes a fourth preset time; the information output part is further configured to:
judge whether the time for which the event has been in the event ongoing state since the last output of the alarm information reaches the fourth preset time;
output the alarm information including the event ongoing state in the case that the time for which the event has been in the event ongoing state since the last output of the alarm information reaches the fourth preset time.
According to the third aspect, in some implementations, the information output part is further configured to:
not output the alarm information including the event ongoing state in the case that the time for which the event has been in the event ongoing state since the last output of the alarm information has not reached the fourth preset time.
第四方面,本公开实施例提供了一种事件策略确定装置,包括:
获取部分,被配置为获取目标视频;
确定部分,被配置为基于所述目标视频,确定待检测事件的类型;基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,所述事件策略包括解码方式和告警信息的告警策略。
根据第四方面,在一些实施方式中,所述确定部分,还被配置为:基于所述待检测事件的类型,从至少一个预设的解码方式中确定与所述待检测事件对应的所述解码方式;所述解码方式用于对所述目标视频进行解码得到目标图像帧;所述目标图像帧用于检测所述待检测事件,且每个目标图像帧对应一个解析结果。
根据第四方面,在一些实施方式中,所述告警信息包括:用于表征事件发生的第一告警信息;所述告警策略包括:第一输出条件、第二预设时间和第二预设数量;
所述确定部分,还被配置为:基于所述待检测事件的类型,确定与所述第一告警信息对应的所述第二预设时间、所述第二预设数量和所述第一输出条件;所述第二预设时间表征相邻两次读取所述解析结果时的时间间隔;所述第二预设数量表征每次读取的所述解析结果的数量;所述第一输出条件表征在所述第二预设数量的解析结果均为发生的情况下,输出所述第一告警信息。
根据第四方面，在一些实施方式中，所述告警信息包括：用于表征事件开始状态的第二告警信息；所述告警策略包括：第三预设时间和第二输出条件；
所述确定部分，还被配置为：基于所述待检测事件的类型，确定与所述第二告警信息对应的所述第三预设时间和所述第二输出条件；所述第二输出条件表征在所述待检测事件发生距离首次所述待检测事件发生的持续时间，达到所述第三预设时间的情况下，输出所述第二告警信息。
根据第四方面,在一些实施方式中,所述告警信息包括:用于表征事件持续中状态的第三告警信息;所述告警策略包括:第三输出条件和第四预设时间;
所述确定部分,还被配置为:基于所述待检测事件的类型,确定与所述第三告警信息对应的所述第四预设时间和所述第三输出条件;所述第三输出条件表征在所述待检测事件处于所述事件持续中状态的时间,距上一次输出所述第三告警信息的时间达到所述第四预设时间的情况下,输出所述第三告警信息。
根据第四方面,在一些实施方式中,所述事件策略还包括:目标神经网络模型;所述目标神经网络模型用于对每张目标图像帧进行解析,得到所述解析结果;
所述确定部分,还被配置为:基于所述待检测事件的类型,从至少一个神经网络模型中,确定与所述待检测事件对应的所述目标神经网络模型。
第五方面,本公开实施例提供了一种电子设备,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,在电子设备运行的情况下,所述处理器与所述存储器之间通过总线通信,所述机器可读指令被所述处理器执行时执行如第一方面所述的事件的检测输出方法。
第六方面,本公开实施例提供了一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行如第一方面所述的事件的检测输出方法。
第七方面，本公开实施例提供了一种计算机程序产品，所述计算机程序产品包括计算机程序或指令，在所述计算机程序或指令在计算机上运行的情况下，所述计算机执行如第一方面所述的事件的检测输出方法。
为使本公开的上述目的、特征和优点能更明显易懂,下文特举较佳实施例,并配合所附附图,作详细说明如下。
附图说明
为了更清楚地说明本公开实施例的技术方案,下面将对实施例中所需要使用的附图作简单地介绍,此处的附图被并入说明书中并构成本说明书中的一部分,这些附图示出了符合本公开的实施例,并与说明书一起用于说明本公开的技术方案。应当理解,以下附图仅示出了本公开的某些实施例,因此不应被看作是对范围的限定,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他相关的附图。
图1示出了本公开实施例所提供的一种事件的检测输出方法或一种事件策略确定方法的应用系统架构示意图;
图2示出了本公开实施例所提供的一种事件的检测输出方法的流程图;
图3示出了本公开实施例所提供的第一种告警信息的输出方法的流程图;
图4示出了本公开实施例所提供的一种参数控制输出的示意图;
图5示出了本公开实施例所提供的第二种告警信息的输出方法的流程图;
图6示出了本公开实施例所提供的一种机动车事件的告警信息的输出示意图;
图7示出了本公开实施例所提供的第三种告警信息的输出方法的流程图;
图8示出了本公开实施例所提供的第四种告警信息的输出方法的流程图;
图9示出了本公开实施例所提供的一种事件策略确定方法的流程图;
图10示出了本公开实施例所提供的一种事件的检测输出装置的结构示意图;
图11示出了本公开实施例所提供的一种事件策略确定装置的结构示意图;
图12示出了本公开实施例所提供的一种电子设备的示意图。
具体实施方式
为使本公开实施例的目的、技术方案和优点更加清楚,下面将结合本公开实施例中附图,对本公开实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本公开一部分实施例,而不是全部的实施例。通常在此处附图中描述和示出的本公开实施例的组件可以以各种不同的配置来布置和设计。因此,以下对在附图中提供的本公开的实施例的详细描述并非旨在限制要求保护的本公开的范围,而是仅仅表示本公开的选定实施例。基于本公开的实施例,本领域技术人员在没有做出创造性劳动的前提下所获得的所有其他实施例,都属于本公开保护的范围。
应注意到:相似的标号和字母在下面的附图中表示类似项,因此,一旦某一项在一个附图中被定义,则在随后的附图中不需要对其进行进一步定义和解释。
本文中术语“和/或”,仅仅是描述一种关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中术语“至少一种”表示多种中的任意一种或多种中的至少两种的任意组合,例如,包括A、B、C中的至少一种,可以表示包括从A、B和C构成的集合中选择的任意一个或多个元素。
计算机视觉技术(Computer Vision,CV)是一门研究如何使机器“看”的科学,更进一步地说,就是指用摄影机和电脑代替人眼对目标进行识别、跟踪和测量等机器视觉,并进一步做图形处理,使电脑处理成为更适合人眼观察或传送给仪器检测的图像。作为一个科学学科,计算机视觉研究相关的理论和技术,试图建立能够从图像或者多维数据中获取信息的人工智能系统。计算机视觉技术通常包括图像处理、图像识别、图像语义理解、图像检索、OCR、视频处理、视频语义理解、视频内容/行为识别、三维物体重建、3D技术、虚拟现实、增强现实、同步定位与地图构建等技术,还包括常见的人脸识别、指纹识别等生物特征识别技术。
随着计算机视觉技术的发展,计算机视觉技术被广泛的应用于各个领域中。例如,在智慧城市的应用中,可以对目标视频进行解码得到图像帧,再对图像帧进行目标物体识别,并在识别出目标物体后,输出相应的事件信息。然而,若在识别出目标物体后的每一图像帧均输出相应的事件信息,既会影响用户体验,也会影响系统性能。
基于上述研究,本公开实施例提供了一种事件的检测输出方法,首先根据预设的解码方式对获取的目标视频进行解码,得到目标图像帧;接着将所述目标图像帧输入至目标神经网络模型进行解析,得到标注有解析结果的解析图像帧;然后根据所述解析图像帧对应的事件的告警策略以及所述解析图像帧,输出各所述事件的告警信息。如此,通过根据解析图像帧对应的事件的告警策略,输出各所述事件的告警信息,可以使得事件的告警信息的输出频率接近事件真实发生的频率,可以提高事件检测的智能性,从而可以提高系统的性能;同时,实现基于不同的任务类型进行事件告警信息的输出。
另外,本公开实施例还提供了一种事件策略确定方法,首先获取目标视频,接着基于目标视频,确定待检测事件的类型,最后基于待检测事件的类型,确定待检测事件对应的事件策略,事件策略包括解码方式和告警信息的告警策略。如此,可以针对不同的待检测事件进行相应的事件策略的确定,从而提高了事件检测时的灵活性。
该事件的检测输出方法或该事件策略确定方法可应用于终端中,或者可应用于服务器中,或者可应用于由终端和服务器所组成的系统架构中。此外,该事件的检测输出方法或该事件策略确定方法还可以是运行于终端或服务器中的软体,例如具有事件检测和输出功能的应用程序,或具有事件策略确定功能的应用程序等。
其中，终端可以是智能手机、平板电脑、笔记本电脑、台式计算机、智能音箱、智能手表等，但并不局限于此。服务器可以是独立的物理服务器，也可以是多个物理服务器构成的服务器集群或者分布式系统，还可以是提供云服务、云数据库、云计算、云函数、云存储、网络服务、云通信、域名服务、安全服务以及大数据和人工智能平台等基础云计算服务的云服务器。在一些实现方式中，该事件的检测输出方法或该事件策略确定方法可以通过处理器调用存储器中存储的计算机可读指令的方式来实现。
请参阅图1,图1是本公开实施例提供的一种事件的检测输出方法或一种事件策略确定方法的应用系统架构示意图。参见图1所示,该系统架构100包括笔记本电脑10、台式计算机20、智能手机30以及服务器40。服务器40可以通过网络50分别与笔记本电脑10、台式计算机20及智能手机30进行通信。网络50可以包括各种连接类型,例如有线、无线通信链路或者光纤电缆等等。
本公开实施方式中，服务器40为视频解析服务器，用于执行本公开实施例所提供的事件的检测输出方法，并将事件的告警信息输出至终端设备（比如图1中的笔记本电脑10、台式计算机20或者智能手机30），以使得用户可以通过相应的终端设备查看相应的告警信息，或用于执行本公开实施例所提供的事件策略确定方法。
参见图2所示,为本公开实施例提供的事件的检测输出方法的流程图,该事件的检测输出方法包括以下S101~S103:
S101,根据预设的解码方式对获取的目标视频进行解码,得到目标图像帧。
示例性地,目标视频可以是目标场所的相关视频,并通过设置于目标场所的摄像头对目标场所进行拍摄而获得。该目标场所可以是伞篷、城市道路(如机动车、非机动车等)、厨房、人员聚集场所(如体育场馆、车站等),在此不做限定。
本公开各实施例中的目标场所以城市道路为例进行说明,例如,该目标场所可以设置车辆禁停区域,通过摄像头对该禁停区域进行拍摄,即可得到关于该禁停区域的相关视频。其中,所述车辆禁停区域是按照交通法规禁止机动车辆停放的某些特定的区域。
一些实施方式中,所述预设的解码方式由所述事件的类型确定;所述预设的解码方式包括关键帧解码方式或者全部帧解码方式。
其中,事件的类型是指目标场所的目标事件的类型。在所述预设的解码方式为所述关键帧解码方式的情况下,可以每隔第一预设数量帧或第一预设时间从所述目标视频的全部视频帧中抽取关键视频帧,并对所述关键视频帧进行解码,得到所述目标图像帧。
可以理解,为了利用目标神经网络模型对目标场所的目标视频进行检测,得到检测结果,需要将目标视频进行解码,从而得到视频最小的单元—图像帧。例如,可以将目标视频的视频流解码成图像帧后,对图像帧里的目标物体(比如伞篷)进行识别发现,并输出相应的信息。
然而，在该实施例中，由于违规设置的伞篷从出现到消失通常会持续一段时间，不会闪现后即消失；如果闪现后即消失，则说明该伞篷只是闪现，并未违规设置。因此没有必要去获取每一帧的图像帧，可以每隔第一预设数量帧（比如10帧或者20帧）获取一次图像帧，也可以每隔第一预设时间（比如40ms）获取一次图像帧。如此，可以减少处理量，节省不必要的处理时间，提高处理效率。
但是,在其他实施例中,比如在目标场所为厨房的情况下,需要关注厨房作业人员(比如厨师)的行为是否违规操作,则需要对每一图像帧进行解析,以免漏掉在该过程中的违规行为,因此,可以根据事件的类型确定预设的解码方式。如此不仅可以提高解码效率,还可以减少漏检测的情况发生。
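上述按事件类型选择关键帧解码或全部帧解码的过程，可用如下 Python 片段示意（假设性示例，函数名与参数均为便于说明而设，并非本公开的具体实现）：

```python
def decode_frames(frames, mode="keyframe", interval=10):
    """示意性解码：frames 为视频帧序列，mode 取 "keyframe" 或 "all"。

    关键帧解码方式：每隔 interval 帧抽取一帧（对应第一预设数量帧）；
    全部帧解码方式：逐帧返回，适用于不能漏检的事件（如厨房违规行为检测）。
    """
    if mode == "all":
        return list(frames)
    # 关键帧解码：抽取索引为 0, interval, 2*interval, ... 的帧
    return [f for i, f in enumerate(frames) if i % interval == 0]


video = list(range(100))  # 用 0~99 模拟 100 帧的目标视频
print(len(decode_frames(video, "keyframe", 10)))  # 10
print(len(decode_frames(video, "all")))           # 100
```

实际系统中，抽帧间隔（帧数或时间）可依据事件类型与用户需求配置。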
S102,将所述目标图像帧输入至目标神经网络模型进行解析,得到标注有解析结果的解析图像帧。
示例性地,所述目标神经网络模型包括目标检测神经网络模型、语义分割神经网络模型或者目标跟踪神经网络模型。这里,可以根据事件类型的不同确定目标神经网络模型,且经过不同的目标神经网络模型解析后,得到的标注的解析结果也不同。
例如，在目标神经网络模型为目标检测神经网络模型的情况下，解析结果可以包括检测框和质量分数。其中，检测框用于指示检测出的目标物体，质量分数用于衡量检测出的目标物体的检测精度；在目标神经网络模型为语义分割神经网络模型的情况下，该解析结果包含密度图以及目标框；在目标神经网络模型为目标跟踪神经网络模型的情况下，该解析结果可以包含跟踪编号，该跟踪编号用于在相邻的时间段标记同一个目标物体。
S103,根据所述解析图像帧对应的事件的告警策略以及所述解析图像帧,输出各所述事件的告警信息。
可以理解，由于常规的视频流以25帧/秒的频率进行传输，也就是在伞篷出现的一分钟之内，会有1500（25*60=1500）条事件信息输出到用户；即使是在步骤S101进行关键帧抽帧解码后，一秒钟仍有多条告警信息输出。如果事件信息按照该频率进行输出，将使得用户很难正常使用。也即，由于每一图像帧的间隔时间很短，逐帧输出的告警信息通常是肉眼无法逐条察觉的。
因此,本实施方式中,根据解析图像帧对应的事件的告警策略,以及所述解析图像帧,输出各所述事件的告警信息,使得输出的告警信息的输出频率符合人体的感知,更接近事件真实的发生频率,进而可以提高事件检测时的智能性。其中,告警策略可以根据事件的类型和/或用户的需求而进行预先设定。
在一些实施方式中，所述告警策略包括第二预设数量的解析图像帧以及第二预设时间。参见图3所示，针对上述步骤S103，根据所述解析图像帧对应的事件的告警策略以及所述解析图像帧，输出各所述事件的告警信息，包括S1031~S1032：
S1031,每间隔所述第二预设时间读取所述解析图像帧中相邻的所述第二预设数量的解析图像帧的解析结果。
S1032,在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,输出所述事件的告警信息。
本实施方式中,第二预设数量以M为例来进行说明,其中,M为大于等于1的正整数。
参见图4所示,其中○表示事件不发生,△表示事件发生。这里,将解码得到的图像帧输入目标神经网络之后,会得到与该图像帧相对应的解析图像帧,该解析图像帧标注有解析结果,根据告警策略可以从初始图像帧开始读取每M帧图像帧的解析结果,并根据解析结果进行事件是否发生的判断,并在所读取的M帧的图像帧中每个图像帧的解析结果均为“发生”的情况下,输出事件发生的告警信息。反之,若该M帧的图像帧中的任一图像帧的解析结果为“不发生”,则不输出告警信息。
在一些实施方式中,可以间隔特定时间a,读取一次解析结果,特定时间a可以根据事件类型以及用户的需求而设定。比如,以岗位检测为例,其中△表示事件发生,代表工作人员离岗,而○表示事件不发生,则表示工作人员在岗,在该岗位设置较为自由的情况下,可以间隔较长的时间(比如每5分钟检测一次)读取一次M帧的图像帧的解析结果即可,如此可以减少资源的浪费。
参见图5所示,在一些实施方式中,所述告警策略还包括第三预设时间;针对上述步骤S1032,在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,输出所述事件的告警信息,可以包括以下S10321~S10322:
S10321,在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,判断事件发生。
S10322,在当前的事件发生距离首次事件发生的持续时间达到所述第三预设时间的情况下,输出所述事件的告警信息。
本实施方式中，第三预设时间也称事件激活时间。请再次参阅图4，若事件发生，但从开始到当前的持续时间不到激活时间e秒，则不输出消息给用户；如果事件持续时间达到激活时间e秒，则确定事件处于事件开始状态，输出相应的告警信息。
下面以机动车违停为例,来对该激活时间进行相应的说明,本实施方式中,以激活时间为2分钟为例来进行说明。参见图6所示,在首次确定机动车驶入停放后确定事件发生,后续接着多次确定机动车停放,但是没有达到激活时间2分钟,因此不输出相应的告警信息;在停放时间达到2分钟时,确定事件开始,此时输出相应的告警信息。如此,可以减少机动车因短暂停放(比如1分钟)后离开,而导致的误判的情况发生,提高了检测输出方法的可靠性。
在一些实施方式中,参见图7所示,针对上述步骤S10322,输出所述事件的告警信息,可以包括以下S103221~S103222:
S103221,根据所述事件发生的当前结果,以及所述当前结果的前一条事件是否发生的结果,确定所述事件的事件状态。
S103222,输出包括所述事件状态的告警信息;所述事件状态包括事件开始状态、事件持续中状态,以及事件结束状态中的至少一种。
示例性地,可以根据解析结果进行事件状态判断。请再次参阅图4,对于当前结果为“发生”的状态,如果前一条结果也为“发生”,则给该事件状态赋值为“事件持续中”。反之,则表示这个事件刚开始发生,给该事件状态赋值为“事件开始”。对于当前结果为“不发生”的状态,如果前一条结果状态也为“不发生”,则给该事件状态赋值为“事件没有发生”。反之,则表示这个事件刚刚结束,给该事件状态赋值为“事件结束”。
需要说明的是,本实施方式中,事件结束时,由于当前结果为“不发生”,此时则不需要输出相应的告警信息。而在其他实施方式中,若根据实际需求需要在事件结束时也输出告警信息,则可以在事件不发生,但处于事件结束状态时,输出告警信息。
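根据当前结果与前一条结果确定事件状态的规则，可用如下 Python 片段示意（假设性示例，状态取值与图4的描述对应）：

```python
def event_state(prev_occurred, cur_occurred):
    """根据前一条结果(prev_occurred)与当前结果(cur_occurred)确定事件状态。

    当前为"发生"：前一条也为"发生"则"事件持续中"，否则"事件开始"；
    当前为"不发生"：前一条为"发生"则"事件结束"，否则"事件没有发生"。
    """
    if cur_occurred:
        return "事件持续中" if prev_occurred else "事件开始"
    return "事件结束" if prev_occurred else "事件没有发生"


print(event_state(False, True))   # 事件开始
print(event_state(True, True))    # 事件持续中
print(event_state(True, False))   # 事件结束
print(event_state(False, False))  # 事件没有发生
```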
在一些实施方式中，所述告警策略还包括第四预设时间；参见图8所示，针对上述步骤S10322，在所述事件状态为所述事件持续中状态的情况下，还包括以下S10322a~S10322c：
S10322a,判断事件处于事件持续中状态的时间距上一次输出告警信息的时间是否达到第四预设时间;若是,则执行步骤S10322b;若否,则执行步骤S10322c。
S10322b,输出包括所述事件持续中状态的告警信息。
S10322c,不输出包括所述事件持续中状态的告警信息。
本实施方式中,第四预设时间也称冷却时间。示例性地,冷却时间是指事件持续发生过程中的告警免推送时间。请再次参阅图4,冷却时间为c秒,因此在事件持续进行中时,间隔c秒输出包括所述事件持续中状态的告警信息。
请再次参阅图6,还以机动车停放为例进行说明,该实施方式中,冷却时间为60秒,也即,在确定机动车停放后,以60秒一次的频率输出告警信息,如此可以减少事件的冗余输出,减少同一事件持续告警的情况发生。
本领域技术人员可以理解,在具体实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤之间的执行顺序应当以其功能和可能的内在逻辑确定。
基于同一技术构思,本公开实施例中还提供了与事件的检测输出方法对应的事件的检测输出装置,由于本公开实施例中的装置解决问题的原理与本公开实施例上述事件的检测输出方法相似,因此装置的实施可以参见方法的实施,重复之处不再赘述。
参见图9所示,为本公开实施例提供的事件策略确定方法的流程图,该事件策略确定方法包括以下S201~S203:
S201、获取目标视频。
目标视频可以是目标场所的相关视频,并通过设置于目标场所的摄像头对目标场所进行拍摄而获得。该目标场所可以是伞篷、城市道路(如机动车、非机动车等)、厨房、人员聚集场所(如体育场馆、车站等),在此不做限定。
S202、基于目标视频,确定待检测事件的类型。
可以通过对目标视频进行目标检测,得到目标视频的拍摄物体,并根据拍摄物体确定待检测事件的类型;例如,在目标视频的拍摄物体是机动车道的情况下,可以确定待检测事件的类型可以为:机动车道上的违章行为;又例如,在目标视频的拍摄物体是商场的情况下,可以确定待检测事件的类型可以为:人流量统计行为。
S203、基于待检测事件的类型,确定待检测事件对应的事件策略,事件策略包括解码方式和告警信息的告警策略。
每种类型的待检测事件均对应有事件策略,且不同类型的待检测事件所对应事件策略通常不同;例如,在机动车道上的违章行为这一待检测事件对应的事件策略,与人流量统计行为这一待检测事件对应的事件策略不同。
本公开实施例中,通过首先获取目标视频,接着基于目标视频,确定待检测事件的类型,最后基于待检测事件的类型,确定待检测事件对应的事件策略,事件策略包括解码方式和告警信息的告警策略。如此,可以针对不同的待检测事件进行相应的事件策略的确定,从而提高了事件检测时的灵活性。
在一些实施例中,每种类型的待检测事件对应的事件策略,包括:解码方式和告警信息的告警策略。采用解码方式可以对目标视频进行解码,得到目标图像帧,以根据目标图像帧确定待检测事件的事件状态。
在另一些实施例中,每种类型的待检测事件对应的事件策略还包括:目标神经网络模型,用于对每张目标图像帧进行解析,得到每张目标图像帧的解析结果。
在一些实施例中,上述S203可以通过S2031-S2032实现:
S2031、基于待检测事件的类型,从至少一个预设的解码方式中确定与待检测事件对应的解码方式。
S2032、解码方式用于对目标视频进行解码得到目标图像帧;目标图像帧用于检测待检测事件,且每个目标图像帧对应一个解析结果。
在一些实施例中，不同的待检测事件的解码方式可以不同。在预设有至少一个解码方式（例如关键帧解码方式和全部帧解码方式）的情况下，可以从中选择与待检测事件对应的解码方式。
在一些实施例中,告警信息包括:用于表征事件发生的第一告警信息;告警策略包括:第一输出条件、第二预设时间和第二预设数量;上述S203可以通过S2033实现:
S2033、基于待检测事件的类型,确定与第一告警信息对应的第二预设时间、第二预设数量和第一输出条件;第二预设时间表征相邻两次读取解析结果时的时间间隔;第二预设数量表征每次读取的解析结果的数量;第一输出条件表征在第二预设数量的解析结果均为发生的情况下,输出第一告警信息。
示例性的,第二预设时间可以为a秒,第二预设数量可以为M;例如,如上述图4,其中○表示事件不发生,△表示事件发生,可以从初始图像帧开始读取每M帧图像帧对应的M个解析结果,并根据这M个解析结果进行事件是否发生的判断,并在所读取的M个解析结果均为“发生”的情况下,输出事件发生的第一告警信息。反之,若该M个解析结果中的任一解析结果为“不发生”,则不输出第一告警信息。
这里,不同类型的待检测事件可以对应不同的M和a,或者,不同类型的待检测事件可以对应不同M或a。
在一些实施例中,告警信息包括:用于表征事件开始状态的第二告警信息;告警策略包括:第三预设时间、和第二输出条件;上述S203可以通过S2034实现:
S2034、基于待检测事件的类型,确定与第二告警信息对应的第三预设时间和第二输出条件;第二输出条件表征在待检测事件发生距离首次待检测事件发生的持续时间,达到第三预设时间的情况下,输出第二告警信息。
这里,第三预设时间也称事件激活时间。
示例性的，第三预设时间可以是e秒；继续参考上述图4，在待检测事件发生、但从开始到当前的持续时间不到激活时间e秒的情况下，不输出第二告警信息给用户；如果待检测事件持续达到激活时间e秒，则确定待检测事件处于事件开始状态，输出第二告警信息。
这里,不同类型的待检测事件对应的e可以不同。
在一些实施例中,告警信息包括:用于表征事件持续中状态的第三告警信息;告警策略包括:第三输出条件和第四预设时间;上述S203可以通过S2035实现:
S2035、基于待检测事件的类型,确定与第三告警信息对应的第四预设时间和第三输出条件;第三输出条件表征在待检测事件处于事件持续中状态的时间,距上一次输出第三告警信息的时间达到第四预设时间的情况下,输出第三告警信息。
这里,第四预设时间也称冷却时间。示例性的,第四预设时间可以为c秒;继续参考上述图4,在待检测事件持续进行中的情况下,可以每间隔c秒输出第三告警信息。
这里,不同类型的待检测事件对应的c可以不同。
在一些实施例中,事件策略还包括:目标神经网络模型;目标神经网络模型用于对每张目标图像帧进行解析,得到解析结果;上述S203可以通过S2036实现:
S2036、基于待检测事件的类型,从至少一个神经网络模型中,确定与待检测事件对应的目标神经网络模型。
这里，不同类型的待检测事件可以对应不同的神经网络模型。在预设有至少一个神经网络模型（例如目标检测神经网络模型、语义分割神经网络模型和目标跟踪神经网络模型）的情况下，可以从中选择与待检测事件对应的神经网络模型，从而得到目标神经网络模型。
在一些实施例中，当同一个目标视频中包括多种并列的待检测事件时，通过上述方法可为每种类型的待检测事件适配对应的事件策略。在配合视频的全结构化分析的过程中，该事件策略确定方法可提供鲁棒的消息过滤和事件告警能力。
参照图10所示,为本公开实施例提供的一种事件的检测输出装置500的示意图,该事件的检测输出装置500包括:
视频解码部分501,被配置为根据预设的解码方式对获取的目标视频进行解码,得到目标图像帧;
图像解析部分502,被配置为将所述目标图像帧输入至目标神经网络模型进行解析,得到标注有解析结果的解析图像帧;
信息输出部分503,被配置为根据所述解析图像帧对应的事件的告警策略以及所述解析图像帧,输出各所述事件的告警信息。
在一些实施方式中,所述预设的解码方式由所述事件的类型确定;所述预设的解码方式包括关键帧解码方式或者全部帧解码方式。
在一些实施方式中,视频解码部分501还被配置为:
每隔第一预设数量帧或第一预设时间从所述目标视频的全部视频帧中抽取关键视频帧,并对所述关键视频帧进行解码。
在一些实施方式中,所述告警策略包括第二预设数量的解析图像帧以及第二预设时间;所述信息输出部分503还被配置为:
每间隔所述第二预设时间读取所述解析图像帧中相邻的所述第二预设数量的解析图像帧的解析结果;
在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,输出所述事件的告警信息。
在一些实施方式中,所述告警策略还包括第三预设时间;所述信息输出部分503还被配置为:
在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,判断事件发生;
在当前的事件发生距离首次事件发生的持续时间达到所述第三预设时间的情况下,输出所述事件的告警信息。
在一些实施方式中,所述信息输出部分503还被配置为:
根据所述事件发生的当前结果,以及所述当前结果的前一条事件是否发生的结果,确定所述事件的事件状态;
输出包括所述事件状态的告警信息;所述事件状态包括事件开始状态、事件持续中状态,以及事件结束状态中的至少一种。
在一些实施方式中,所述告警策略还包括第四预设时间;所述信息输出部分503还被配置为:
判断所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间是否达到所述第四预设时间;
在所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间达到所述第四预设时间的情况下,输出包括所述事件持续中状态的告警信息。
在一些实施方式中,所述信息输出部分503还被配置为:
在所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间未达到所述第四预设时间的情况下,不输出包括所述事件持续中状态的告警信息。
在本公开实施例以及其他的实施例中,“部分”可以是部分电路、部分处理器、部分程序或软件等等,当然也可以是单元,还可以是模块也可以是非模块化的。
参照图11所示,为本公开实施例提供的事件策略确定装置600的示意图,该事件策略确定装置600包括:
获取部分601,被配置为获取目标视频;
确定部分602,被配置为基于所述目标视频,确定待检测事件的类型;基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,所述事件策略包括解码方式和告警信息的告警策略。
在一些实施方式中,所述确定部分602,还被配置为:基于所述待检测事件的类型,从至少一个预设的解码方式中确定与所述待检测事件对应的所述解码方式;所述解码方式用于对所述目标视频进行解码得到目标图像帧;所述目标图像帧用于检测所述待检测事件,且每个目标图像帧对应一个解析结果。
在一些实施方式中,所述告警信息包括:用于表征事件发生的第一告警信息;所述告警策略包括:第一输出条件、第二预设时间和第二预设数量;
所述确定部分602,还被配置为:基于所述待检测事件的类型,确定与所述第一告警信息对应的所述第二预设时间、所述第二预设数量和所述第一输出条件;所述第二预设时间表征相邻两次读取所述解析结果时的时间间隔;所述第二预设数量表征每次读取的所述解析结果的数量;所述第一输出条件表征在所述第二预设数量的解析结果均为发生的情况下,输出所述第一告警信息。
在一些实施方式中，所述告警信息包括：用于表征事件开始状态的第二告警信息；所述告警策略包括：第三预设时间和第二输出条件；
所述确定部分602,还被配置为:基于所述待检测事件的类型,确定与所述第二告警信息对应的所述第三预设时间和所述第二输出条件;所述第二输出条件表征在所述待检测事件发生距离首次所述待检测事件发生的持续时间,达到所述第三预设时间的情况下,输出所述第二告警信息。
在一些实施方式中,所述告警信息包括:用于表征事件持续中状态的第三告警信息;所述告警策略包括:第三输出条件和第四预设时间;
所述确定部分602,还被配置为:基于所述待检测事件的类型,确定与所述第三告警信息对应的所述第四预设时间和所述第三输出条件;所述第三输出条件表征在所述待检测事件处于所述事件持续中状态的时间,距上一次输出所述第三告警信息的时间达到所述第四预设时间的情况下,输出所述第三告警信息。
在一些实施方式中,所述事件策略还包括:目标神经网络模型;所述目标神经网络模型用于对每张目标图像帧进行解析,得到所述解析结果;
所述确定部分602,还被配置为:基于所述待检测事件的类型,从至少一个神经网络模型中,确定与所述待检测事件对应的所述目标神经网络模型。
关于装置中的各部分的处理流程、以及各部分之间的交互流程的描述可以参照上述方法实施例中的相关说明,这里不再详述。
基于同一技术构思,本公开实施例还提供了一种电子设备。参照图12所示,为本公开实施例提供的电子设备700的结构示意图,包括处理器701、存储器702和总线703。其中,存储器702被配置为存储执行指令,包括内存7021和外部存储器7022;这里的内存7021也称内存储器,被配置为暂时存放处理器701中的运算数据,以及与硬盘等外部存储器7022交换的数据,处理器701通过内存7021与外部存储器7022进行数据交换。
本公开实施例中,存储器702还被配置为存储执行本公开的技术方案的应用程序代码,并由处理器701来控制执行。也即,当电子设备700运行时,处理器701与存储器702之间通过总线703通信,使得处理器701执行存储器702中存储的应用程序代码,进而执行前述任一实施例中所述的方法。
其中,存储器702可以是,但不限于,随机存取存储器(Random Access Memory,RAM),只读存储器(Read Only Memory,ROM),可编程只读存储器(Programmable Read-Only Memory,PROM),可擦除只读存储器(Erasable Programmable Read-Only Memory,EPROM),电可擦除只读存储器(Electric Erasable Programmable Read-Only Memory,EEPROM)等。
处理器701可以是一种集成电路芯片，具有信号的处理能力。上述的处理器可以是通用处理器，包括中央处理器（Central Processing Unit，CPU）、网络处理器（Network Processor，NP）等；还可以是数字信号处理器（DSP）、专用集成电路（ASIC）、现场可编程门阵列（FPGA）或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件，可以实现或者执行本公开实施例中公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器，或者该处理器也可以是任何常规的处理器等。
可以理解的是,本公开实施例示意的结构并不构成对电子设备700的具体限定。在本公开另一些实施例中,电子设备700可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。
本公开实施例还提供一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行上述方法实施例中的事件的检测输出方法的步骤,或事件策略确定方法的步骤。其中,该存储介质可以是易失性或非易失的计算机可读取存储介质。
本公开实施例还提供一种计算机程序产品,该计算机程序产品承载有程序代码,所述程序代码包括的指令可用于执行上述方法实施例中的事件的检测输出方法的步骤,或事件策略确定方法的步骤,可参见上述方法实施例。
其中,上述计算机程序产品可以具体通过硬件、软件或其结合的方式实现。在一个可选实施例中,所述计算机程序产品具体体现为计算机存储介质,在另一个可选实施例中,计算机程序产品具体体现为软件产品,例如软件开发包(Software Development Kit,SDK)等等。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统和装置的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。在本公开所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,又例如,多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些通信接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个处理器可执行的非易失的计算机可读取存储介质中。基于这样的理解,本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本公开各个实施例所述方法的全部或部分步骤。
计算机可读存储介质(存储介质)可以是可以保持和存储由指令执行设备使用的指令的有形设备。计算机可读存储介质例如可以是(但不限于)电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任意合适的组合。计算机可读存储介质的更具体的例子(非穷举的列表)包括:便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、静态随机存取存储器(SRAM)、便携式压缩盘只读存储器(CD-ROM)、数字多功能盘(DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。这里所使用的计算机可读存储介质不被解释为瞬时信号本身,诸如无线电波或者其他自由传播的电磁波、通过波导或其他传输媒介传播的电磁波(例如,通过光纤电缆的光脉冲)、或者通过电线传输的电信号。
最后应说明的是:以上所述实施例,仅为本公开的具体实施方式,用以说明本公开的技术方案,而非对其限制,本公开的保护范围并不局限于此,尽管参照前述实施例对本公开进行了详细的说明,本领域的普通技术人员应当理解:任何熟悉本技术领域的技术人员在本公开揭露的技术范围内,其依然可以对前述实施例所记载的技术方案进行修改或可轻易想到变化,或者对其中部分技术特征进行等同替换;而这些修改、变化或者替换,并不使相应技术方案的本质脱离本公开实施例技术方案的精神和范围,都应涵盖在本公开的保护范围之内。因此,本公开的保护范围应所述以权利要求的保护范围为准。
工业实用性
本公开实施例提供了一种事件的检测输出方法、事件策略确定方法及装置、电子设备和计算机可读存储介质,该事件的检测输出方法包括:根据预设的解码方式对获取的目标视频进行解码,得到目标图像帧;将所述目标图像帧输入至目标神经网络模型进行解析,得到标注有解析结果的解析图像帧;根据所述解析图像帧对应的事件的告警策略以及所述解析图像帧,输出各所述事件的告警信息。本公开实施例,使得输出的事件的告警信息更接近事件真实的发生频率,提高了事件检测的智能性,从而提高了系统性能;同时,可以实现基于不同的任务类型进行事件告警信息的输出。

Claims (31)

  1. 一种事件的检测输出方法,包括:
    根据预设的解码方式对获取的目标视频进行解码,得到目标图像帧;
    将所述目标图像帧输入至目标神经网络模型进行解析,得到标注有解析结果的解析图像帧;
    根据所述解析图像帧对应的事件的告警策略以及所述解析图像帧,输出各所述事件的告警信息。
  2. 根据权利要求1所述的方法,其中,所述预设的解码方式由所述事件的类型确定;所述预设的解码方式包括关键帧解码方式或者全部帧解码方式。
  3. 根据权利要求2所述的方法,其中,在所述预设的解码方式为所述关键帧解码方式的情况下,所述根据预设的解码方式对获取的目标视频进行解码,包括:
    每隔第一预设数量帧或第一预设时间从所述目标视频的全部视频帧中抽取关键视频帧,并对所述关键视频帧进行解码。
  4. 根据权利要求1-3任一所述的方法,其中,所述告警策略包括第二预设数量的解析图像帧以及第二预设时间;所述根据所述解析图像帧对应的事件的告警策略以及所述解析图像帧,输出各所述事件的告警信息,包括:
    每间隔所述第二预设时间读取所述解析图像帧中相邻的所述第二预设数量的解析图像帧的解析结果;
    在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,输出所述事件的告警信息。
  5. 根据权利要求4所述的方法,其中,所述告警策略还包括第三预设时间;所述在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,输出所述事件的告警信息,包括:
    在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,判断事件发生;
    在当前的事件发生距离首次事件发生的持续时间达到所述第三预设时间的情况下,输出所述事件的告警信息。
  6. 根据权利要求4或5所述的方法,其中,所述输出所述事件的告警信息,包括:
    根据所述事件发生的当前结果,以及所述当前结果的前一条事件是否发生的结果,确定所述事件的事件状态;
    输出包括所述事件状态的告警信息;所述事件状态包括事件开始状态、事件持续中状态,以及事件结束状态中的至少一种。
  7. 根据权利要求6所述的方法,其中,所述告警策略还包括第四预设时间;在所述事件状态为所述事件持续中状态的情况下,所述输出包括所述事件状态的告警信息,包括:
    判断所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间是否达到所述第四预设时间;
    在所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间达到所述第四预设时间的情况下,输出包括所述事件持续中状态的告警信息。
  8. 根据权利要求7所述的方法,其中,所述方法还包括:
    在所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间未达到所述第四预设时间的情况下,不输出包括所述事件持续中状态的告警信息。
  9. 一种事件策略确定方法,包括:
    获取目标视频;
    基于所述目标视频,确定待检测事件的类型;
    基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,所述事件策略包括解码方式和告警信息的告警策略。
  10. 根据权利要求9所述的方法,其中,所述基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,包括:
    基于所述待检测事件的类型,从至少一个预设的解码方式中确定与所述待检测事件对应的所述解码方式;
    所述解码方式用于对所述目标视频进行解码得到目标图像帧;所述目标图像帧用于检测所述待检测事件,且每个目标图像帧对应一个解析结果。
  11. 根据权利要求9或10所述的方法,其中,所述告警信息包括:用于表征事件发生的第一告警信息;所述告警策略包括:第一输出条件、第二预设时间和第二预设数量;
    所述基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,包括:
    基于所述待检测事件的类型,确定与所述第一告警信息对应的所述第二预设时间、所述第二预设数量和所述第一输出条件;
    所述第二预设时间表征相邻两次读取所述解析结果时的时间间隔;所述第二预设数量表征每次读取的所述解析结果的数量;所述第一输出条件表征在所述第二预设数量的解析结果均为发生的情况下,输出所述第一告警信息。
  12. 根据权利要求9-11任一项所述的方法,其中,所述告警信息包括:用于表征事件开始状态的第二告警信息;所述告警策略包括:第三预设时间、和第二输出条件;
    所述基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,包括:
    基于所述待检测事件的类型,确定与所述第二告警信息对应的所述第三预设时间和所述第二输出条件;所述第二输出条件表征在所述待检测事件发生距离首次所述待检测事件发生的持续时间,达到所述第三预设时间的情况下,输出所述第二告警信息。
  13. 根据权利要求9-12任一项所述的方法,其中,所述告警信息包括:用于表征事件持续中状态的第三告警信息;所述告警策略包括:第三输出条件和第四预设时间;
    所述基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,包括:
    基于所述待检测事件的类型,确定与所述第三告警信息对应的所述第四预设时间和所述第三输出条件;所述第三输出条件表征在所述待检测事件处于所述事件持续中状态的时间,距上一次输出所述第三告警信息的时间达到所述第四预设时间的情况下,输出所述第三告警信息。
  14. 根据权利要求11所述的方法,其中,所述事件策略还包括:目标神经网络模型;所述目标神经网络模型用于对每张目标图像帧进行解析,得到所述解析结果;
    所述基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,包括:
    基于所述待检测事件的类型,从至少一个神经网络模型中,确定与所述待检测事件对应的所述目标神经网络模型。
  15. 一种事件的检测输出装置,所述装置包括:
    视频解码部分,被配置为根据预设的解码方式对获取的目标视频进行解码,得到目标图像帧;
    图像解析部分,被配置为将所述目标图像帧输入至目标神经网络模型进行解析,得到标注有解析结果的解析图像帧;
    信息输出部分,被配置为根据所述解析图像帧对应的事件的告警策略以及所述解析图像帧,输出各所述事件的告警信息。
  16. 根据权利要求15所述的装置,其中,所述预设的解码方式由所述事件的类型确定;所述预设的解码方式包括关键帧解码方式或者全部帧解码方式。
  17. 根据权利要求16所述的装置,其中,在所述预设的解码方式为所述关键帧解码方式的情况下,所述视频解码部分,还被配置为:
    每隔第一预设数量帧或第一预设时间从所述目标视频的全部视频帧中抽取关键视频帧,并对所述关键视频帧进行解码。
  18. 根据权利要求15-17任一项所述的装置,其中,所述告警策略包括第二预设数量的解析图像帧以及第二预设时间;所述信息输出部分,还被配置为:
    每间隔所述第二预设时间读取所述解析图像帧中相邻的所述第二预设数量的解析图像帧的解析结果;
    在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,输出所述事件的告警信息。
  19. 根据权利要求18所述的装置,其中,所述告警策略还包括第三预设时间;所述信息输出部分,还被配置为:
    在所述第二预设数量的解析图像帧中的每个所述解析图像帧的解析结果均为发生的情况下,判断事件发生;
    在当前的事件发生距离首次事件发生的持续时间达到所述第三预设时间的情况下,输出所述事件的告警信息。
  20. 根据权利要求18或19所述的装置,其中,所述信息输出部分,还被配置为:
    根据所述事件发生的当前结果,以及所述当前结果的前一条事件是否发生的结果,确定所述事件的事件状态;
    输出包括所述事件状态的告警信息;所述事件状态包括事件开始状态、事件持续中状态,以及事件结束状态中的至少一种。
  21. 根据权利要求20所述的装置,其中,所述告警策略还包括第四预设时间;所述信息输出部分,还被配置为:
    判断所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间是否达到所述第四预设时间;
    在所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间达到所述第四预设时间的情况下,输出包括所述事件持续中状态的告警信息。
  22. 根据权利要求21所述的装置,其中,所述信息输出部分,还被配置为:
    在所述事件处于所述事件持续中状态的时间距上一次输出所述告警信息的时间未达到所述第四预设时间的情况下,不输出包括所述事件持续中状态的告警信息。
  23. 一种事件策略确定装置,所述装置包括:
    获取部分,被配置为获取目标视频;
    确定部分,被配置为基于所述目标视频,确定待检测事件的类型;基于所述待检测事件的类型,确定所述待检测事件对应的事件策略,所述事件策略包括解码方式和告警信息的告警策略。
  24. 根据权利要求23所述的装置,其中,所述确定部分,还被配置为:基于所述待检测事件的类型,从至少一个预设的解码方式中确定与所述待检测事件对应的所述解码方式;所述解码方式用于对所述目标视频进行解码得到目标图像帧;所述目标图像帧用于检测所述待检测事件,且每个目标图像帧对应一个解析结果。
  25. 根据权利要求23或24所述的装置,其中,所述告警信息包括:用于表征事件发生的第一告警信息;所述告警策略包括:第一输出条件、第二预设时间和第二预设数量;
    所述确定部分,还被配置为:基于所述待检测事件的类型,确定与所述第一告警信息对应的所述第二预设时间、所述第二预设数量和所述第一输出条件;所述第二预设时间表征相邻两次读取所述解析结果时的时间间隔;所述第二预设数量表征每次读取的所述解析结果的数量;所述第一输出条件表征在所述第二预设数量的解析结果均为发生的情况下,输出所述第一告警信息。
  26. 根据权利要求23-25任一项所述的装置，其中，所述告警信息包括：用于表征事件开始状态的第二告警信息；所述告警策略包括：第三预设时间和第二输出条件；
    所述确定部分,还被配置为:基于所述待检测事件的类型,确定与所述第二告警信息对应的所述第三预设时间和所述第二输出条件;所述第二输出条件表征在所述待检测事件发生距离首次所述待检测事件发生的持续时间,达到所述第三预设时间的情况下,输出所述第二告警信息。
  27. 根据权利要求23-26任一项所述的装置,其中,所述告警信息包括:用于表征事件持续中状态的第三告警信息;所述告警策略包括:第三输出条件和第四预设时间;
    所述确定部分,还被配置为:基于所述待检测事件的类型,确定与所述第三告警信息对应的所述第四预设时间和所述第三输出条件;所述第三输出条件表征在所述待检测事件处于所述事件持续中状态的时间,距上一次输出所述第三告警信息的时间达到所述第四预设时间的情况下,输出所述第三告警信息。
  28. 根据权利要求24所述的装置,其中,所述事件策略还包括:目标神经网络模型;所述目标神经网络模型用于对每张目标图像帧进行解析,得到所述解析结果;
    所述确定部分,还被配置为:基于所述待检测事件的类型,从至少一个神经网络模型中,确定与所述待检测事件对应的所述目标神经网络模型。
  29. 一种电子设备,包括:处理器、存储器和总线,所述存储器存储有所述处理器可执行的机器可读指令,在电子设备运行的情况下,所述处理器与所述存储器之间通过总线通信,所述机器可读指令被所述处理器执行时执行如权利要求1-8任一所述的事件的检测输出方法,或,9至14任一所述的事件策略确定方法。
  30. 一种计算机可读存储介质,该计算机可读存储介质上存储有计算机程序,该计算机程序被处理器运行时执行如权利要求1-8任一所述的事件的检测输出方法,或,9至14任一所述的事件策略确定方法。
  31. 一种计算机程序产品，所述计算机程序产品包括计算机程序或指令，在所述计算机程序或指令在计算机上运行的情况下，所述计算机执行权利要求1至8任一所述的事件的检测输出方法，或，9至14任一所述的事件策略确定方法。
PCT/CN2021/130349 2021-04-23 2021-11-12 事件的检测输出方法、事件策略确定方法及装置、电子设备和计算机可读存储介质 WO2022222445A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110442017.2 2021-04-23
CN202110442017.2A CN113179423A (zh) 2021-04-23 2021-04-23 事件的检测输出方法及装置、电子设备和存储介质

Publications (1)

Publication Number Publication Date
WO2022222445A1 true WO2022222445A1 (zh) 2022-10-27

Family

ID=76924626

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/130349 WO2022222445A1 (zh) 2021-04-23 2021-11-12 事件的检测输出方法、事件策略确定方法及装置、电子设备和计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN113179423A (zh)
WO (1) WO2022222445A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113179423A (zh) * 2021-04-23 2021-07-27 深圳市商汤科技有限公司 事件的检测输出方法及装置、电子设备和存储介质
CN116527853B (zh) * 2023-06-20 2023-10-13 深圳比特微电子科技有限公司 电子设备、云端设备、客户端设备及其操作方法

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060109341A1 (en) * 2002-08-15 2006-05-25 Roke Manor Research Limited Video motion anomaly detector
US10055961B1 (en) * 2017-07-10 2018-08-21 Careview Communications, Inc. Surveillance system and method for predicting patient falls using motion feature patterns
CN110969115A (zh) * 2019-11-28 2020-04-07 深圳市商汤科技有限公司 行人事件的检测方法及装置、电子设备和存储介质
CN112069939A (zh) * 2020-08-21 2020-12-11 深圳市商汤科技有限公司 事件检测方法及其装置、电子设备、存储介质
CN112419639A (zh) * 2020-10-13 2021-02-26 中国人民解放军国防大学联合勤务学院 一种视频信息的获取方法及装置
CN112633184A (zh) * 2020-12-25 2021-04-09 成都商汤科技有限公司 告警方法及装置、电子设备和存储介质
CN113179423A (zh) * 2021-04-23 2021-07-27 深圳市商汤科技有限公司 事件的检测输出方法及装置、电子设备和存储介质

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107301373B (zh) * 2017-05-18 2018-03-27 深圳云天励飞技术有限公司 数据处理方法、装置及存储介质
CN110928255B (zh) * 2019-11-20 2021-02-05 珠海格力电器股份有限公司 数据异常统计报警方法、装置、存储介质及电子设备
CN111310665A (zh) * 2020-02-18 2020-06-19 深圳市商汤科技有限公司 违规事件检测方法及装置、电子设备和存储介质
CN112069937A (zh) * 2020-08-21 2020-12-11 深圳市商汤科技有限公司 事件检测方法及其装置、电子设备、存储介质

Also Published As

Publication number Publication date
CN113179423A (zh) 2021-07-27

Similar Documents

Publication Publication Date Title
WO2022222445A1 (zh) 事件的检测输出方法、事件策略确定方法及装置、电子设备和计算机可读存储介质
US8510795B1 (en) Video-based CAPTCHA
CN111082966A (zh) 基于批量告警事件的定位方法、装置、电子设备及介质
CN112200081A (zh) 异常行为识别方法、装置、电子设备及存储介质
WO2022227764A1 (zh) 事件检测的方法、装置、电子设备以及可读存储介质
CN112487886A (zh) 一种有遮挡的人脸识别方法、装置、存储介质及终端
CN111860377A (zh) 基于人工智能的直播方法、装置、电子设备及存储介质
CN105095415A (zh) 网络情绪的确定方法和装置
CN110705494A (zh) 人流量监测方法、装置、电子设备及计算机可读存储介质
CN113381962A (zh) 一种数据处理方法、装置和存储介质
CN111552800A (zh) 摘要生成方法、装置、电子设备及介质
CN108459845A (zh) 一种监控标签属性的埋点方法及装置
CN113381963A (zh) 一种域名检测方法、装置和存储介质
CN111062823A (zh) 一种社交图谱分析方法、装置及存储介质
CN114743157B (zh) 一种基于视频的行人监控方法、装置、设备及介质
CN111128233A (zh) 录音检测方法、装置、电子设备及存储介质
CN112906671B (zh) 面审虚假图片识别方法、装置、电子设备及存储介质
CN113989859A (zh) 一种防刷机设备指纹相似度识别方法和装置
CN111061975B (zh) 一种页面中无关内容的处理方法、装置
CN111259216A (zh) 一种信息识别方法、装置及设备
CN110598115A (zh) 一种基于人工智能多引擎的敏感网页识别方法及系统
US9008104B2 (en) Methods and apparatus for detecting and filtering forced traffic data from network data
CN110209880A (zh) 视频内容检索方法、视频内容检索装置及存储介质
CN114491429A (zh) 一种基于区块链的直播短视频大数据篡改识别方法及系统
CN113449506A (zh) 一种数据检测方法、装置、设备及可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21937667

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM XXXX DATED 30/01/2024)