WO2023000639A1

WO2023000639A1 - Video generation method and apparatus, electronic device, storage medium, and program

Info

Publication number: WO2023000639A1
Application number: PCT/CN2022/074117
Authority: WO
Inventors: 徐尤龙; 樊俊良; 陈万里; 张广程; 温祖钦; 陈宇恒
Original assignee: 上海商汤智能科技有限公司
Priority date: 2021-07-23
Filing date: 2022-01-26
Publication date: 2023-01-26
Also published as: CN113556485A

Abstract

The present disclosure relates to a video generation method and apparatus, an electronic device, a storage medium, and a program. The method comprises: writing video data of a video stream into a preset cache queue, the video data comprising a data packet of the video stream and time information of the data packet; when an event message for the video stream is received, according to a trigger moment of a predetermined event in the event message, determining a time interval corresponding to the predetermined event, wherein the duration of the time interval is less than or equal to the duration of the video data in the cache queue; from within the video data of the cache queue, acquiring a video file of a video clip corresponding to the time interval; and storing the video file of the video clip into a preset storage space so as to acquire a video clip associated with the predetermined event. Embodiments of the present disclosure can increase the efficiency of event backtracking.

Description

Video generating method, device, electronic device, storage medium and program

Cross References to Related Applications

This patent application claims priority to Chinese patent application No. 202110837802.8, filed on July 23, 2021 by Shanghai SenseTime Intelligent Technology Co., Ltd. and entitled "Video generation method and device, electronic equipment and storage medium", the entirety of which is incorporated into this application by reference.

Technical Field

The present disclosure relates to the technical field of computer vision, including but not limited to a video generation method, device, electronic equipment, storage medium and program.

Background

With the popularization of various video capture devices, the data volume of video resources is increasing rapidly. In a video system without computer vision, the video clips that users are interested in must be screened manually, and the labor cost of processing massive video resources is very high. In a pure computer vision system, image recognition can be performed and the corresponding detection results can be output, but manual intervention is still required to trace the event-related video clips back from the detection results. When the video system has no video storage function, event backtracking cannot be performed; when the video system has a video storage function, a large number of invalid videos may be stored, wasting a large amount of storage resources, and problems such as insufficient permissions and excessive video data arise, resulting in low efficiency and high cost of event backtracking.

Summary of the Invention

The present disclosure proposes a video generation technical solution, including but not limited to a video generation method, device, electronic equipment, storage medium and program, which can improve the efficiency of event backtracking during video analysis.

The present disclosure provides a video generation method, including:

writing video data of a video stream into a preset cache queue, the video data including data packets of the video stream and time information of the data packets; when an event message for the video stream is received, determining, according to the triggering moment of a predetermined event in the event message, a time interval corresponding to the predetermined event, the duration of the time interval being less than or equal to the duration of the video data in the cache queue; obtaining, from the video data in the cache queue, a video file of the video segment corresponding to the time interval; and storing the video file of the video segment in a preset storage space, so as to obtain the video segment associated with the predetermined event.

In a possible implementation manner, the trigger moment is within the time interval, and the time interval includes a start moment and an end moment, wherein obtaining, from the video data in the cache queue, the video file of the video segment corresponding to the time interval includes: when the current moment reaches the end moment, copying, from the video data in the cache queue, the data packets of the video segment corresponding to the time interval; and encapsulating the data packets of the video segment to obtain the video file of the video segment.

In a possible implementation manner, storing the video file of the video segment in the preset storage space so as to obtain the video segment associated with the predetermined event includes: storing the video file of the video segment in the storage space; and establishing an association between the predetermined event and the storage address of the video file in the storage space, so as to obtain the video segment associated with the predetermined event.

In a possible implementation manner, the method further includes: performing event detection on video frames of the video stream, and determining whether a predetermined event occurs in the video frame; when a predetermined event occurs in the video frame, determining the triggering moment of the predetermined event according to the time information of the video frame; and sending an event message for the video stream, where the event message includes the predetermined event and the triggering moment.

In a possible implementation manner, the method is applied to an electronic device in which a visual analysis service and a video editing service run, and the visual analysis service is configured to: perform event detection on video frames of the video stream, and determine whether a predetermined event occurs in the video frame; when a predetermined event occurs in the video frame, determine the triggering moment of the predetermined event according to the time information of the video frame; and send the event message to the video editing service.

In a possible implementation manner, the video editing service is configured to: write the video data of the video stream into a preset cache queue; when the event message sent by the visual analysis service is received, determine the time interval corresponding to the predetermined event according to the triggering moment of the predetermined event; obtain, from the video data in the cache queue, the video file of the video segment corresponding to the time interval; and store the video file of the video segment in a preset storage space, so as to obtain the video segment associated with the predetermined event.

In a possible implementation manner, the method further includes: reading a video file of a video segment associated with the predetermined event from the storage space in response to a viewing operation on the predetermined event; playing the video file.

In a possible implementation manner, the video stream includes one or more video streams, the cache queue includes a circular queue, and the predetermined event includes at least one of a pedestrian fall event, a pedestrian retrograde event, a pedestrian squatting event, and a smoking event.

The present disclosure provides a video generation device, including:

The data cache module is configured to write the video data of the video stream into a preset cache queue, the video data including the data packets of the video stream and the time information of the data packets;

The time interval determination module is configured to determine, when an event message for the video stream is received, the time interval corresponding to the predetermined event according to the triggering moment of the predetermined event in the event message, the duration of the time interval being less than or equal to the duration of the video data in the cache queue;

A video file obtaining module configured to obtain a video file of a video segment corresponding to the time interval from the video data in the cache queue;

The storing and associating module is configured to store the video file of the video segment in a preset storage space, so as to obtain the video segment associated with the predetermined event.

In a possible implementation manner, the trigger moment is within the time interval, and the time interval includes a start moment and an end moment, wherein the video file obtaining module is configured to: when the current moment reaches the end moment, copy, from the video data in the cache queue, the data packets of the video segment corresponding to the time interval; and encapsulate the data packets of the video segment to obtain the video file of the video segment.

In a possible implementation manner, the storing and associating module is configured to: store the video file of the video segment in the storage space; and establish an association between the predetermined event and the storage address of the video file in the storage space, so as to obtain the video segment associated with the predetermined event.

In a possible implementation manner, the device further includes: an event detection module configured to perform event detection on video frames of the video stream and determine whether a predetermined event occurs in the video frame; a time determination module configured to determine, when a predetermined event occurs in the video frame, the triggering moment of the predetermined event according to the time information of the video frame; and a message sending module configured to send an event message for the video stream, the event message including the predetermined event and the triggering moment.

In a possible implementation manner, the apparatus is applied to an electronic device in which a visual analysis service and a video editing service run, and the visual analysis service is configured to: perform event detection on video frames of the video stream, and determine whether a predetermined event occurs in the video frame; when a predetermined event occurs in the video frame, determine the triggering moment of the predetermined event according to the time information of the video frame; and send the event message to the video editing service.

In a possible implementation manner, the device further includes: a video file reading module configured to read, in response to a viewing operation on the predetermined event, the video file of the video segment associated with the predetermined event from the storage space; and a video playing module configured to play the video file.

The present disclosure provides an electronic device, including: a memory configured to store executable instructions; a processor configured to call the instructions stored in the memory to execute the above video generation method.

The present disclosure provides a computer-readable storage medium, on which computer program instructions are stored, and the above-mentioned method is implemented when the computer program instructions are executed by a processor.

The present disclosure provides a computer program, including computer-readable code. When the computer-readable code runs in an electronic device, a processor in the electronic device executes it to implement any one of the above video generation methods.

In the embodiments of the present disclosure, the video data of the video stream can be cached in the cache queue; when a predetermined event occurs, the time interval corresponding to the predetermined event is determined; the video file of the corresponding video segment is obtained from the cache queue and stored; and the stored video file is associated with the corresponding predetermined event, thereby improving the efficiency of event backtracking and saving a large amount of storage resources.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Brief Description of the Drawings

The accompanying drawings here are incorporated into the description and constitute a part of the present description. These drawings show embodiments consistent with the present disclosure, and are used together with the description to explain the technical solution of the present disclosure.

Fig. 1 shows a flowchart of a video generation method in an embodiment of the present disclosure.

Fig. 2 shows a schematic diagram of the processing procedure of the video generation method of the embodiment of the present disclosure.

Fig. 3 shows a block diagram of a video generating device of an embodiment of the present disclosure.

Fig. 4 shows a block diagram of an electronic device according to an embodiment of the present disclosure.

Fig. 5 shows a block diagram of an electronic device according to an embodiment of the present disclosure.

Detailed Description

The present application will be described in further detail below in conjunction with the accompanying drawings and embodiments. It should be understood that the embodiments provided here are only used to explain the present application, not to limit it. In addition, the embodiments provided below are some, rather than all, of the embodiments for implementing the application. Where no conflict arises, the technical solutions described in the embodiments of the application can be combined and implemented in any manner.

The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of them; for example, "including at least one of A, B, and C" can mean including any one or more elements selected from the set formed by A, B, and C.

The video generation method provided by the embodiments of the present disclosure can be executed by an electronic device such as a terminal device or a server. The terminal device can be user equipment (UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. The method can be implemented by a processor calling computer-readable instructions stored in a memory. Alternatively, the method may be performed by a server.

The video generation method provided by the embodiments of the present disclosure can be applied to smart city systems and smart community systems to realize the intelligent video analysis function in the security monitoring process, and present video clips associated with the alarm event when the video stream triggers the alarm event.

Hereinafter, the video generation method, device, electronic device, storage medium, and program provided by the embodiments of the present disclosure are introduced in detail.

Fig. 1 shows a flowchart of a video generation method in an embodiment of the present disclosure. As shown in Fig. 1, the video generation method provided in an embodiment of the present disclosure may include the following steps:

Step S11: Write video data of the video stream into a preset buffer queue, the video data including data packets of the video stream and time information of the data packets.

Step S12: When an event message for the video stream is received, determine the time interval corresponding to the predetermined event according to the triggering moment of the predetermined event in the event message, the duration of the time interval being less than or equal to the duration of the video data in the cache queue.

Step S13: Obtain the video file of the video segment corresponding to the time interval from the video data in the cache queue.

Step S14: storing the video file of the video segment in a preset storage space, so as to obtain the video segment associated with the predetermined event.

The video generation method of the embodiments of the present disclosure can cache the video data of the video stream in the cache queue and detect predetermined events of interest in the video stream through computer vision. When a predetermined event occurs, the time interval corresponding to the predetermined event is determined, the corresponding video segment is clipped from the cache queue and stored, and the stored video segment is associated with the corresponding predetermined event. Users can thus efficiently review the footage of the event as it occurred, only the useful video is stored, and a large amount of storage resources is saved.

In an example, the video stream to be analyzed may be an online video stream or an offline video stream collected by various video capture devices, and a predetermined event may occur in a picture of the video stream. The embodiment of the present disclosure does not limit the source and type of the video stream.

In an example, in the video stream of the escalator, pedestrians may have abnormal behaviors, such as falling, walking backwards, squatting, etc. Correspondingly, the predetermined events in the video stream may include pedestrian falling events, pedestrian retrograde events, pedestrian squatting events, and the like.

In an example, in video streams of shopping malls, exhibition halls and other places, people may have abnormal behaviors, such as smoking, entering non-visiting areas, and so on. Correspondingly, the predetermined events in the video stream may include smoking events, illegal entry into non-visiting areas, and the like.

In an example, in the video stream of the road, abnormal behaviors of people or vehicles may occur, for example, the driver does not wear a seat belt, fatigue driving, running a red light, etc. Correspondingly, predetermined events in the video stream may include corresponding various violation events. The embodiment of the present disclosure does not limit the specific type of the predetermined event.

In a possible implementation manner, a visual analysis service may run in the electronic device, and analyze predetermined events of interest in the video stream by way of computer vision. For example, image recognition and event detection are performed on one or more video frames of the video stream, and corresponding event detection results are output.

In an example, the event detection result may include any one of the following information: whether a scheduled event occurs, the type of the scheduled event that occurred. Predetermined events can be detected through a trained event detection network, and the event detection network can use a convolutional neural network. Embodiments of the present disclosure do not limit the network type and network structure of the event detection network.

In a possible implementation manner, a video clipping service may run in the electronic device for clipping a video stream. Both the visual analysis service and the video editing service can be processes running on an electronic device, and can run on the same electronic device or different electronic devices, which is not limited in the embodiments of the present disclosure.

In an example, the video editing service can continuously pull the video data of the video stream, combine each received video data packet with its time information into a new data packet, and write it into the preset cache queue. The time information of the video data packet may be the receiving time of the video data packet, or the collection time of the video data packet, which is not limited in the embodiments of the present disclosure.

In a possible implementation, the cache queue can be a ring queue or a queue of another structure, and the capacity of the cache queue can be set according to the actual situation. The embodiments of the present disclosure limit neither.

In a possible implementation, when the cache queue is full, new video data can be used to overwrite the earliest cached video data, thereby enabling circular buffering of video data.

In a possible implementation, when a circular queue is used and the queue is full, new video data automatically overwrites the earliest cached video data, so that circular buffering of video data is realized automatically, saving processing resources.
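The overwrite-on-full behaviour described above can be sketched in Python with a bounded `collections.deque`. This is only an illustration under assumptions: the `TimedPacket` type, the `make_cache_queue` helper, and the capacity of 3 are hypothetical names and values chosen for the example, not part of the disclosure.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedPacket:
    """A video data packet paired with its time information (hypothetical type)."""
    timestamp: float  # seconds; receiving time or collection time
    payload: bytes


def make_cache_queue(capacity: int) -> deque:
    # A deque with maxlen drops its oldest item when a new item is appended
    # to a full queue, which mimics a ring queue overwriting old data.
    return deque(maxlen=capacity)


queue = make_cache_queue(capacity=3)
for i in range(5):
    queue.append(TimedPacket(timestamp=float(i), payload=b"pkt%d" % i))

# After five appends into a queue of capacity 3, only the three most
# recent packets remain.
remaining = [pkt.timestamp for pkt in queue]
```

A count-bounded queue is the simplest choice; a real service would more likely size the queue from the stream's packet rate and the desired buffered duration.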

In a possible implementation manner, when detecting that a predetermined event occurs in the video stream, the visual analysis service will send an event message for the video stream to the video editing service, and the event message includes the predetermined event and the corresponding trigger time.

In an example, the trigger moment of the predetermined event is determined according to the time information of the video frame in which the predetermined event occurs. The time information of the video frame can be the time at which the electronic device received the video frame, or the collection time of the video frame, as long as it can be synchronized with the time information of the video data in the cache queue. In this way, time misalignment can be avoided.

In a possible implementation manner, in step S12, the video editing service determines that a predetermined event occurs in the video stream when receiving the event message. According to the triggering moment of the predetermined event in the event message, the time interval corresponding to the predetermined event can be determined, that is, the time interval corresponding to the video segment to be edited.

In an example, assuming that the triggering moment of the predetermined event is time t, the interval can extend m seconds before and n seconds after the triggering moment t, so that the time interval corresponding to the video segment to be edited is [t-m, t+n]. The duration (m+n) of the time interval is less than or equal to the duration of the video data in the cache queue. The embodiments of the present disclosure do not limit specific values of m and n.

In a possible implementation manner, when the predetermined event is triggered, the video data after the trigger time t may not be cached in the cache queue, and in this case, asynchronous processing may be performed.

In an example, when the end moment (t+n) of the time interval is reached, the video segment corresponding to the time interval is obtained from the video data in the cache queue, and the video data of the video segment is encapsulated to generate a video file.

In a possible implementation, in step S14, the video file can be stored in a preset storage space, and the association between the video file and the predetermined event can be established to obtain the video segment associated with the predetermined event, so that relevant personnel can trace back the event.

In an example, the preset storage space may be a storage space of the electronic device itself, or a storage space of a storage server in the cloud, which is not limited in this embodiment of the present disclosure.
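As a rough illustration of storing a video file and associating its storage address with an event, the following Python sketch writes clip bytes to a local directory and keeps an in-memory index from event identifier to file path. The `ClipStore` class, the `fall-0001` identifier, and the `.bin` suffix are all hypothetical; the disclosure does not prescribe a storage layout, and a cloud storage server could take the place of the local directory.

```python
import tempfile
from pathlib import Path
from typing import Dict


class ClipStore:
    """Stores clip files and maps each event to its storage address (hypothetical)."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        self.index: Dict[str, Path] = {}  # event id -> storage address

    def save(self, event_id: str, video_file: bytes) -> Path:
        # Store the file, then record the association event -> address.
        path = self.root / f"{event_id}.bin"
        path.write_bytes(video_file)
        self.index[event_id] = path
        return path

    def load(self, event_id: str) -> bytes:
        # Look up the address through the association and read the file back,
        # as a viewing operation on the event would.
        return self.index[event_id].read_bytes()


store = ClipStore(Path(tempfile.mkdtemp()))
saved_path = store.save("fall-0001", b"clip-bytes")
```

In practice the index would usually live in a database so that the association survives a restart.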

In the embodiments of the present disclosure, the video data of the video stream is cached in the cache queue; when a predetermined event occurs, the time interval corresponding to the predetermined event is determined; the video file of the corresponding video segment is obtained from the cache queue and stored; and the stored video file is associated with the corresponding predetermined event. Users can thus efficiently review the footage of the event as it occurred, only the useful video is stored, and a large amount of storage resources is saved.

Hereinafter, the video generation method provided by the embodiment of the present disclosure will be described in detail.

In a possible implementation manner, the visual analysis service and the video editing service may be respectively run on the same or different electronic devices. The visual analysis service and the video editing service can obtain video streams respectively. Wherein, the acquired video stream may include one or more video streams, for example, video streams of different areas of a shopping mall.

In an example, the visual analysis service and the video editing service can process each video stream separately. The embodiment of the present disclosure does not limit the number of video streams. For any acquired video stream, the video stream can be detected.

In a possible implementation manner, the video generation method in the embodiment of the present disclosure may further include the following steps: performing event detection on the video frame of the video stream, and determining whether a predetermined event occurs in the video frame; When a predetermined event occurs in a video frame, determine the triggering moment of the predetermined event according to the time information of the video frame; send an event message for the video stream, the event message includes the predetermined event and the trigger moment.

In an example, the video stream may be decoded to obtain multiple video frames of the video stream, and event detection may be performed on the video frames. For example, event detection may be performed on consecutive video frames; frames may first be sampled from the consecutive video frames, with event detection then performed on the sampled frames; or event detection may be performed on key frames among the video frames. The embodiments of the present disclosure do not limit this.

In a possible implementation manner, different event detection modes may be set according to the types of the predetermined events to be detected. For example, when detecting a pedestrian retrograde event on an escalator, the pedestrians in the video frame can be identified and the relative speed between a pedestrian and the escalator can be analyzed to determine whether a pedestrian retrograde event occurs; when detecting a pedestrian fall event on an escalator, the pedestrians in the video frame can be identified and their body posture can be analyzed to judge whether a pedestrian has fallen. The embodiments of the present disclosure do not limit the specific manner of event detection.

In a possible implementation, the detection of predetermined events can be realized through a trained event detection network, and the event detection network can use one or more convolutional neural networks. The embodiments of the present disclosure do not limit the network type or network structure of the event detection network.
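The frame selection options mentioned here (every frame, sampled frames, or key frames only) can be sketched as small Python generators. The `Frame` tuple layout and both helper names are assumptions made for illustration; a real decoder would expose key-frame flags through its own API.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical frame record: (frame index, is_key_frame).
Frame = Tuple[int, bool]


def every_kth(frames: Iterable[Frame], k: int) -> Iterator[Frame]:
    """Frame extraction: keep one frame out of every k consecutive frames."""
    if k < 1:
        raise ValueError("k must be at least 1")
    for position, frame in enumerate(frames):
        if position % k == 0:
            yield frame


def key_frames_only(frames: Iterable[Frame]) -> Iterator[Frame]:
    """Keep only the frames flagged as key frames."""
    return (frame for frame in frames if frame[1])


# Ten frames, with a key frame at every fifth position.
frames = [(i, i % 5 == 0) for i in range(10)]
sampled = list(every_kth(frames, 3))
keys = list(key_frames_only(frames))
```

Passing `k=1` to `every_kth` corresponds to running detection on every consecutive frame.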

In a possible implementation manner, when it is detected that a predetermined event occurs in a video frame, the triggering moment of the predetermined event may be determined according to time information of the video frame.

In an example, when the predetermined event is an event in a single video frame, the time information of that video frame can be directly determined as the trigger moment of the predetermined event; when the predetermined event spans multiple video frames, the time information of the last frame or the middle frame of the multiple video frames may be determined as the triggering moment of the predetermined event. Embodiments of the present disclosure do not limit this.
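A minimal Python sketch of choosing the trigger moment from frame time information follows. The function name `trigger_moment` and the `use` parameter are hypothetical; the only behaviour taken from the text is that a single frame's time is used directly, and that the last or the middle frame may be used when several frames are involved.

```python
from typing import Sequence


def trigger_moment(frame_times: Sequence[float], use: str = "last") -> float:
    """Pick the trigger moment from the times of the frames showing the event."""
    if not frame_times:
        raise ValueError("at least one frame time is required")
    if len(frame_times) == 1:
        # Single-frame event: the frame's own time is the trigger moment.
        return frame_times[0]
    if use == "last":
        return frame_times[-1]
    if use == "middle":
        return frame_times[len(frame_times) // 2]
    raise ValueError("use must be 'last' or 'middle'")


single = trigger_moment([10.0])
last = trigger_moment([10.0, 10.04, 10.08], use="last")
middle = trigger_moment([10.0, 10.04, 10.08], use="middle")
```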

In a possible implementation manner, an event message for the video stream may be sent, where the event message includes the scheduled event and the triggering moment of the scheduled event, so as to clip the video data in the buffer queue. In this way, the detection of predetermined events can be realized, and the accuracy of event detection can be improved.

In a possible implementation manner, when the video stream is detected by the visual analysis service, the visual analysis service is configured to: perform event detection on video frames of the video stream, and determine whether a predetermined event occurs in the video frame; when the predetermined event occurs in the video frame, determine the triggering moment of the predetermined event according to the time information of the video frame; and send the event message to the video editing service.

In an example, the visual analysis service may perform event detection on video frames of the acquired video stream and determine whether a predetermined event occurs in the video frame; when a predetermined event occurs in the video frame, the trigger moment of the predetermined event is determined according to the time information of the video frame.

In a possible implementation, the visual analysis service may send an event message to the video clipping service, where the event message includes the scheduled event and the triggering time of the scheduled event, so that the video clipping service clips the video data in the cache queue .
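One way to picture the event message is as a small serializable record. The Python sketch below assumes JSON as the wire format and invents the field names `stream_id`, `event_type`, and `trigger_time`; the disclosure only states that the message carries the predetermined event and its triggering moment, for a given video stream.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EventMessage:
    """Hypothetical event message exchanged between the two services."""
    stream_id: str       # which video stream the event belongs to
    event_type: str      # the predetermined event, e.g. "pedestrian_fall"
    trigger_time: float  # triggering moment, on the same clock as the cached packets

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @staticmethod
    def from_json(text: str) -> "EventMessage":
        return EventMessage(**json.loads(text))


sent = EventMessage(stream_id="cam-1", event_type="pedestrian_fall", trigger_time=100.0)
received = EventMessage.from_json(sent.to_json())
```

The transport (message queue, HTTP, in-process call) is left open here, as it is in the text.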

In the example, the visual analysis process is separated from the video editing process, so that the visual analysis process can be set according to the requirements of the business system without affecting the video editing process. The visual analysis service and the video editing service can obtain video streams separately, and do not rely on the function of obtaining video clips by time from the original video stream. Even if the original video stream does not support video clips by time, the video editing process can also be implemented normally.

In a possible implementation manner, the processing procedures of steps S11-S14 may be realized through the video editing service. The video editing service can be configured to: write the video data of the video stream into a preset cache queue; upon receiving the event message sent by the visual analysis service, determine the time interval corresponding to the predetermined event according to the triggering moment of the predetermined event; obtain, from the video data in the cache queue, the video file of the video segment corresponding to the time interval; and store the video file of the video segment in a preset storage space, so as to obtain the video segment associated with the predetermined event.

In an example, the video editing service may set up a video processing instance for each video stream, with a memory-based cache queue in each video processing instance. The video reading module of the video processing instance continuously reads the video data of the corresponding video stream, combines each received video data packet with its time information into a new data packet, and writes it into the preset cache queue. The time information of the video data packet may be the receiving time of the video data packet, or may be the collection time of the video data packet, which is not limited in the embodiments of the present disclosure.

In a possible implementation manner, the buffer queue may be a ring queue. By adopting a circular queue, when the queue is full, new video data can automatically overwrite the earliest cached video data, thereby automatically realizing circular buffering of video data and saving processing resources.

In an example, the ring queue can buffer 60s of video data, and the embodiment of the present disclosure does not limit the length of the ring queue.
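If the queue is sized by duration rather than by packet count, one possible approach is to evict packets whose timestamps fall outside a sliding window. The Python sketch below is an assumption-laden illustration: `TimeBoundedQueue` is a hypothetical class, it assumes timestamps arrive in non-decreasing order, and it is a swapped-in alternative to a fixed-capacity ring queue rather than the structure the disclosure describes.

```python
from collections import deque
from typing import Deque, Tuple

Packet = Tuple[float, bytes]  # (timestamp in seconds, packet bytes)


class TimeBoundedQueue:
    """Keeps only packets from the most recent max_seconds (hypothetical)."""

    def __init__(self, max_seconds: float) -> None:
        self.max_seconds = max_seconds
        self.packets: Deque[Packet] = deque()

    def append(self, timestamp: float, payload: bytes) -> None:
        self.packets.append((timestamp, payload))
        # Drop packets older than max_seconds relative to the newest one.
        while timestamp - self.packets[0][0] > self.max_seconds:
            self.packets.popleft()

    def buffered_seconds(self) -> float:
        if not self.packets:
            return 0.0
        return self.packets[-1][0] - self.packets[0][0]


buffer = TimeBoundedQueue(max_seconds=60.0)
for second in range(101):  # one packet per second, timestamps 0..100
    buffer.append(float(second), b"x")
```

The value returned by `buffered_seconds()` is the quantity that the interval duration (m+n) must not exceed.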

In a possible implementation manner, the video editing service will continuously monitor event messages from the visual analysis service. When the event message is received in step S12, the video editing service may verify the event message to confirm whether it is a scheduled event of interest.

It should be understood that the predetermined events concerned by the visual analysis service and the video clip service may be different, for example, the predetermined events concerned by the visual analysis service include A, B, and C, while the predetermined events concerned by the video clip service include A and B. Embodiments of the present disclosure do not limit this.

In a possible implementation, if the predetermined event in the event message is an event of interest, the verification passes. The video clipping service can then determine, according to the trigger moment in the event message, the time interval corresponding to the predetermined event, that is, the time interval of the video clip to be extracted.
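As a rough illustration of the verification step, the following sketch checks whether the event in a message belongs to the set of events the clipping side cares about. The `EventMessage` fields and the set `EVENTS_OF_INTEREST` are assumptions made for the example; the disclosure does not specify a message format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventMessage:
    stream_id: str       # hypothetical identifier of the video stream
    event_type: str      # name of the predetermined event, e.g. "A"
    trigger_time: float  # trigger moment t, in seconds


# The visual analysis service may report A, B and C, while the
# clipping service is only interested in A and B (as in the text).
EVENTS_OF_INTEREST = frozenset({"A", "B"})


def verify(message: EventMessage, events_of_interest=EVENTS_OF_INTEREST) -> bool:
    """Return True if the message carries a predetermined event of interest."""
    return message.event_type in events_of_interest


accepted = verify(EventMessage("stream-1", "A", 12.0))
rejected = verify(EventMessage("stream-1", "C", 15.0))
```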

In an example, assume that the trigger moment of the predetermined event is time t. Taking the trigger moment t as the reference, the interval extends m seconds before t and n seconds after t, so the time interval corresponding to the video clip to be extracted is [t-m, t+n]. The duration (m+n) of the time interval is less than or equal to the duration of the video data in the cache queue. For example, m may be 3 s and n may be 6 s. Embodiments of the present disclosure do not limit the specific values of m and n.

In a possible implementation, the trigger moment t lies within the time interval [t-m, t+n], and the time interval [t-m, t+n] includes a start moment (t-m) and an end moment (t+n).
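The interval computation can be written out as a short sketch. The function name `clip_interval` and the parameter `buffered_duration` are hypothetical; the check simply encodes the stated constraint that m+n must not exceed the duration of the cached video data.

```python
def clip_interval(t: float, m: float, n: float, buffered_duration: float):
    """Return (start, end) = (t - m, t + n) for trigger moment t.

    Raises ValueError if the interval would be longer than the
    video data that the cache queue can hold.
    """
    if m < 0 or n < 0:
        raise ValueError("m and n must be non-negative")
    if m + n > buffered_duration:
        raise ValueError("interval is longer than the cached video data")
    return (t - m, t + n)


# Values from the text: m = 3 s, n = 6 s, ring queue of 60 s.
start, end = clip_interval(t=100.0, m=3.0, n=6.0, buffered_duration=60.0)
```

With these values the interval is 9 s long, well within the 60 s held by the ring queue in the earlier example.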

In a possible implementation manner, video editing may be performed in step S13. Wherein, step S13 may include:

When the current moment reaches the end moment, copying, from the video data in the cache queue, the data packets of the video clip corresponding to the time interval; and encapsulating the data packets of the video clip to obtain the video file of the video clip.

In an example, when the predetermined event is triggered, the video data after the trigger moment t may not yet have been written into the cache queue, so asynchronous processing may be used. When the video clipping service receives the event message and the verification passes, it can start an asynchronous video clipping process.

In a possible implementation, when the end moment (t+n) of the time interval is reached, the asynchronous process can copy the data packets of the video clip corresponding to the time interval from the video data in the cache queue, and encapsulate the data packets of the video clip to generate a video file of the video clip.

It should be understood that the video file may be temporarily cached in the memory of the electronic device for subsequent storage in the storage space. In this way, asynchronous video clipping can be realized, so that the video clip corresponding to the predetermined event is extracted effectively, and processing efficiency and accuracy are improved.
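The asynchronous clipping step can be sketched as follows, under several stated assumptions: packets are modelled as `(timestamp, payload)` tuples, the clock is an injected callable `now`, and `encapsulate` merely concatenates payloads as a stand-in for real container muxing. None of these names come from the disclosure.

```python
import asyncio


def copy_segment(packets, start, end):
    """Copy the (timestamp, payload) pairs whose timestamp lies in [start, end]."""
    return [(ts, data) for ts, data in packets if start <= ts <= end]


def encapsulate(segment):
    """Placeholder for muxing into a video container: concatenates payloads."""
    return b"".join(data for _, data in segment)


async def clip_when_ready(queue, start, end, now, poll=0.01):
    """Wait until the current moment reaches the end moment, then clip."""
    while now() < end:
        await asyncio.sleep(poll)
    return encapsulate(copy_segment(list(queue), start, end))


async def demo():
    queue = []              # stands in for the cache queue
    clock = {"now": 0.0}    # simulated stream time

    async def writer():
        # Simulated video reading module: one packet per time unit.
        for i in range(10):
            clock["now"] = float(i)
            queue.append((float(i), bytes([i])))
            await asyncio.sleep(0)

    # The event arrives early; the clip task waits for the end moment.
    task = asyncio.create_task(
        clip_when_ready(queue, 2.0, 5.0, lambda: clock["now"], poll=0)
    )
    await writer()
    return await task


video_file = asyncio.run(demo())
```

The clip task is started before the packets after the trigger moment exist, and it only copies from the queue once the simulated clock has reached the end moment, which mirrors the reason the text gives for processing asynchronously.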

In a possible implementation manner, video clips may be stored and associated with events in step S14. Wherein, step S14 may include the following steps: storing the video file of the video segment in the storage space; establishing an association between the predetermined event and the storage address of the video file in the storage space, to obtain video clips associated with the predetermined event.

In the example, the generated video file can be uploaded to the storage space, and the video file temporarily cached in the memory is deleted after the upload is successful. The storage space can return the storage address of the video file in the storage space to the electronic device. Here, the storage space may be a storage space of a cloud storage server.

In a possible implementation, after determining the storage address of the video file in the storage space, the video clipping service can establish an association between the predetermined event and the storage address of the video file, and end the asynchronous video clipping process for this event. In this way, a video clip associated with the predetermined event is obtained, so that relevant personnel can review the video picture of the predetermined event.

In this way, video clips can be stored and associated with predetermined events. Because only the video clips associated with predetermined events need to be persistently stored, a large amount of storage resources is saved.
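The store-and-associate step, together with the later read-back for viewing, can be sketched as below. A local temporary directory stands in for the cloud storage server, and the class `ClipStore`, its methods, and the event identifier are hypothetical names chosen for the example.

```python
import os
import tempfile


class ClipStore:
    """Stores clip files and keeps an event -> storage address mapping."""

    def __init__(self, root: str):
        self.root = root
        self.event_to_address = {}

    def store_and_associate(self, event_id: str, video_file: bytes) -> str:
        # "Upload" the file; a real system would call a storage service here.
        address = os.path.join(self.root, f"{event_id}.bin")
        with open(address, "wb") as f:
            f.write(video_file)
        # Associate the predetermined event with the storage address.
        self.event_to_address[event_id] = address
        return address

    def read_for_event(self, event_id: str) -> bytes:
        """Read back the clip associated with an event, e.g. for playback."""
        with open(self.event_to_address[event_id], "rb") as f:
            return f.read()


store = ClipStore(tempfile.mkdtemp())
address = store.store_and_associate("fall-0001", b"clip-bytes")
```

Keeping only the mapping and the clip files, rather than the full stream, is what allows the storage saving described above.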

FIG. 2 shows a schematic diagram of the processing procedure of the video generation method of the embodiment of the present disclosure. As shown in FIG. 2, there are n video streams: video stream 1, video stream 2, ..., video stream n. The video clipping service can set up a video processing instance for each video stream, giving n video processing instances: instance 1, instance 2, ..., instance n, each provided with a memory-based ring queue.

In the example, the visual analysis service and the video clipping service can each read and process every video stream. Video stream 1 is taken as an example below.

In the example, in step 1, the video clipping service reads the video data of video stream 1 through the video reading module of instance 1; in step 2, each received video data packet and the time information of that packet are combined into a new data packet, which is written into the ring queue.

In an example, the visual analysis service may read the video data of video stream 1, decode multiple video frames of the video stream, and perform event detection on each video frame. If a predetermined event is detected, the predetermined event is triggered in step 3 and an event message is sent to the video clipping service; the message includes the predetermined event and its trigger moment.
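The detection loop on the visual analysis side can be sketched in a few lines. The `detector` callable is a hypothetical stand-in for whatever computer vision model is used, and the message is modelled as a plain dictionary; both are assumptions made for the example.

```python
def analyze_stream(frames, detector):
    """Yield an event message for each frame in which an event is detected.

    frames:   iterable of (timestamp, frame) pairs, already decoded
    detector: callable returning an event name, or None if nothing occurred
    """
    for timestamp, frame in frames:
        event = detector(frame)
        if event is not None:
            # The trigger moment is taken from the frame's time information.
            yield {"event": event, "trigger_time": timestamp}


# Toy frames: strings stand in for decoded images.
frames = [(0.0, "walking"), (1.0, "walking"), (2.0, "fall"), (3.0, "lying")]


def toy_detector(frame):
    return "pedestrian_fall" if frame == "fall" else None


messages = list(analyze_stream(frames, toy_detector))
```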

In an example, when the video clipping service receives the event message and the verification passes, it determines the time interval corresponding to the predetermined event and starts an asynchronous video clipping process. When the end moment of the time interval is reached, in step 4, the corresponding video clip is obtained from the ring queue and a file of the video clip is generated.

It should be understood that the video clipping service can store the video clip and associate it with the predetermined event: it uploads the file of the video clip to the storage space, establishes the association between the predetermined event and the storage address of the video file, and ends the asynchronous video clipping process. In this way, a video clip associated with the predetermined event is obtained, so that relevant personnel can review the video picture of the predetermined event.

In a possible implementation, the video generation method of the embodiment of the present disclosure may further include the following steps: in response to a viewing operation on the predetermined event, reading the video file of the video clip associated with the predetermined event from the storage space; and playing the video file.

For example, if relevant personnel need to review a predetermined event, they can trigger a viewing operation for the predetermined event, such as clicking the icon of the predetermined event or clicking a control for viewing the predetermined event. The present disclosure does not limit the specific triggering method.

In a possible implementation, in response to the viewing operation on the predetermined event, the electronic device can read the video file of the associated video clip from the storage space according to the association between the predetermined event and the storage address of the video file. For example, a file read request is sent to a cloud storage server, so that the storage server returns the corresponding video file.

In a possible implementation, after receiving the video file, the electronic device may play the video file, so that relevant personnel can watch the video picture corresponding to the predetermined event. In this way, reviewing the predetermined event becomes more convenient, and relevant personnel can quickly observe more details of the event.

With the video generation method of the embodiments of the present disclosure, visual perception rules can be configured to define the predetermined events of interest, and the corresponding event is output when a rule is triggered. The video clipping service can cache the video data of the video stream in the cache queue; when an event occurs, determine the time interval corresponding to the predetermined event; clip the corresponding video clip from the cache queue and store it; and associate the stored video clip with the corresponding predetermined event. In this way, users can efficiently review the video picture from when the event occurred, and only the useful video is stored, which saves a large amount of storage resources.

The video generation method of the embodiments of the present disclosure can automatically trigger the extraction of video clips based on computer vision, without manual screening of the video clips of interest. The video data of the video stream is cached in a memory-based ring queue, so no NVR (Network Video Recorder) or similar device is required to store all of the video; only the video clips associated with predetermined events are persistently stored, which greatly saves storage space. Whereas events are usually associated only with snapshots, this solution associates predetermined events with video clips, so that an event can be indexed to its related video and more event details can be observed quickly.

The video generation method of the embodiments of the present disclosure can be applied to scenarios such as intelligent video analysis, security monitoring, smart cities, and smart communities, and can filter the video clips of interest out of massive video data. Only the video of interest needs to be stored, and when an online video stream triggers an alarm event, the video picture from that moment can be viewed quickly.

It should be understood that the above-mentioned method embodiments mentioned in the present disclosure can all be combined with each other to form a combined embodiment without violating the principles and logics. Due to space limitations, the present disclosure will not repeat them here. Those skilled in the art can understand that, in the above method in the specific implementation manner, the specific execution order of each step should be determined according to its function and possible internal logic.

The present disclosure also provides a video generating device, electronic equipment, a computer-readable storage medium, and a program, all of which can be used to implement any video generation method provided in the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding records in the method section.

FIG. 3 shows a block diagram of a video generation device in an embodiment of the present disclosure. As shown in FIG. 3 , the video generation device provided in an embodiment of the present disclosure includes:

The data cache module 31 is configured to write the video data of the video stream into a preset cache queue, the video data including the data packets of the video stream and the time information of the data packets;

The time interval determination module 32 is configured to determine, when an event message for the video stream is received, the time interval corresponding to the predetermined event according to the trigger moment of the predetermined event in the event message, wherein the duration of the time interval is less than or equal to the duration of the video data in the cache queue;

The video file obtaining module 33 is configured to obtain the video file of the video segment corresponding to the time interval from the video data in the cache queue;

The storing and associating module 34 is configured to store the video file of the video segment in a preset storage space, so as to obtain the video segment associated with the predetermined event.

In a possible implementation, the trigger moment is within the time interval, and the time interval includes a start moment and an end moment, wherein the video file obtaining module 33 is configured to: when the current moment reaches the end moment, copy, from the video data in the cache queue, the data packets of the video clip corresponding to the time interval; and encapsulate the data packets of the video clip to obtain the video file of the video clip.

In a possible implementation, the storing and associating module 34 is configured to: store the video file of the video clip in the storage space; and establish an association between the predetermined event and the storage address of the video file in the storage space, so as to obtain the video clip associated with the predetermined event.

In a possible implementation, the apparatus is applied to an electronic device in which a visual analysis service and a video clipping service run, and the visual analysis service is configured to: perform event detection on the video frames of the video stream to determine whether a predetermined event occurs in a video frame; if a predetermined event occurs in the video frame, determine the trigger moment of the predetermined event according to the time information of the video frame; and send the event message to the video clipping service.

In a possible implementation, the functions or modules included in the apparatus provided by the embodiments of the present disclosure can be configured to execute the methods described in the above method embodiments. For their specific implementation, refer to the descriptions of the above method embodiments; for brevity, details are not repeated here.

Embodiments of the present disclosure also provide a computer-readable storage medium, on which computer program instructions are stored, and the above-mentioned method is implemented when the computer program instructions are executed by a processor. Computer readable storage media may be volatile or nonvolatile computer readable storage media.

An embodiment of the present disclosure also proposes an electronic device, including: a memory configured to store executable instructions; a processor configured to call the instructions stored in the memory to execute the above video generation method.

In an example, the type of the electronic device may be a terminal device, a server or other types of devices.

An embodiment of the present disclosure also provides a computer program product, including computer-readable codes, or a non-volatile computer-readable storage medium carrying computer-readable codes. When the computer-readable codes run in a processor of an electronic device, the processor in the electronic device executes the above method.

FIG. 4 shows a block diagram of an electronic device 400 according to an embodiment of the present disclosure. The device type of the electronic device 400 may include any of the following: mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant.

Referring to FIG. 4, the electronic device 400 may include one or more of the following components: processing component 402, memory 404, power supply component 406, multimedia component 408, audio component 410, input/output (I/O) interface 412, sensor component 414, and communication component 416.

Among other things, the processing component 402 generally controls the overall operations of the electronic device 400, such as operations associated with display, phone calls, data communications, camera operations, and recording operations. The processing component 402 may include one or more processors 420 to execute instructions to complete all or part of the steps of the above method. Additionally, processing component 402 may include one or more modules that facilitate interaction between processing component 402 and other components. For example, processing component 402 may include a multimedia module to facilitate interaction between multimedia component 408 and processing component 402 .

The memory 404 is configured to store various types of data to support operations at the electronic device 400 . Examples of such data include instructions for any application or method operating on the electronic device 400, contact data, phonebook data, messages, pictures, videos, and the like. The memory 404 can be implemented by any type of volatile or non-volatile storage device or their combination, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic or Optical Disk.

The power supply component 406 provides power to various components of the electronic device 400 . Power components 406 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for electronic device 400 .

The multimedia component 408 includes a screen providing an output interface between the electronic device 400 and the user. The screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense a boundary of a touch or swipe action, but also detect duration and pressure associated with the touch or swipe action.

In some embodiments, the multimedia component 408 includes a front camera and/or a rear camera. When the electronic device 400 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera can be a fixed optical lens system or have focal length and optical zoom capability.

The audio component 410 is configured to output and/or input audio signals. For example, the audio component 410 includes a microphone (MIC), which is configured to receive external audio signals when the electronic device 400 is in operation modes, such as call mode, recording mode and voice recognition mode. Received audio signals may be further stored in memory 404 or sent via communication component 416 . The audio component 410 may also include a speaker for outputting audio signals.

The I/O interface 412 provides an interface between the processing component 402 and a peripheral interface module. The peripheral interface module may be a keyboard, a click wheel, a button, and the like. These buttons may include, but are not limited to: a home button, volume buttons, start button, and lock button.

Sensor assembly 414 includes one or more sensors for providing status assessments of various aspects of the electronic device 400. For example, the sensor assembly 414 can detect the open/closed state of the electronic device 400 and the relative positioning of components, such as the display and the keypad of the electronic device 400. The sensor assembly 414 can also detect a change in position of the electronic device 400 or of one of its components, the presence or absence of user contact with the electronic device 400, the orientation or acceleration/deceleration of the electronic device 400, and temperature changes of the electronic device 400.

The sensor assembly 414 may include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact. Sensor assembly 414 may also include an optical sensor, such as a complementary metal-oxide-semiconductor (CMOS) or charge-coupled device (CCD) image sensor, for use in imaging applications. The sensor assembly 414 may also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 416 is configured to facilitate wired or wireless communication between the electronic device 400 and other devices. The electronic device 400 can access a wireless network based on a communication standard, such as a wireless network (WiFi), a second generation mobile communication technology (2G) or a third generation mobile communication technology (3G), or a combination thereof. In an exemplary embodiment, the communication component 416 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 416 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wide Band (UWB) technology, Bluetooth (BT) technology and other technologies.

In some embodiments, the electronic device 400 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the methods described above.

An embodiment of the present disclosure provides a non-volatile computer-readable storage medium, such as a memory 404 including computer program instructions, which can be executed by the processor 420 of the electronic device 400 to complete the above video generation method.

FIG. 5 shows a block diagram of an electronic device 500 according to an embodiment of the present disclosure. For example, the electronic device 500 may be provided as a server. Referring to FIG. 5 , electronic device 500 includes processing component 522 , which further includes one or more processors, and a memory resource represented by memory 532 for storing instructions executable by processing component 522 , such as application programs. The application program stored in memory 532 may include one or more modules each corresponding to a set of instructions. In addition, the processing component 522 is configured to execute instructions to perform the above method.

In an example, the electronic device 500 may further include a power supply component 526 configured to perform power management of the electronic device 500, a wired or wireless network interface 550 configured to connect the electronic device 500 to a network, and an input/output (I/O) interface 558. The electronic device 500 can operate based on an operating system stored in the memory 532, such as the Microsoft server operating system (Windows Server™), the graphical user interface-based operating system introduced by Apple Inc. (Mac OS X™), the multi-user and multi-process computer operating system (Unix™), a free and open source Unix-like operating system (Linux™), an open source Unix-like operating system (FreeBSD™), or the like.

An embodiment of the present disclosure provides a non-volatile computer-readable storage medium, such as a memory 532 including computer program instructions, which can be executed by the processing component 522 of the electronic device 500 to complete the above method.

An embodiment of the present disclosure provides a computer program, including computer-readable codes. When the computer-readable codes are run in an electronic device, a processor in the electronic device executes any one of the video generation methods described above.

The methods disclosed in the various method embodiments provided in this application can be combined arbitrarily without conflict to obtain new method embodiments.

The features disclosed in the various product embodiments provided in this application can be combined arbitrarily without conflict to obtain new product embodiments.

The features disclosed in each method or device embodiment provided in this application can be combined arbitrarily without conflict to obtain a new method embodiment or device embodiment.

In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. The device embodiments described above are only illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other ways of dividing, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.

The units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application can all be integrated into one processing module, or each unit can serve as a separate unit, or two or more units can be integrated into one unit. The integrated unit can be realized in the form of hardware, or in the form of hardware plus software functional units.

Those of ordinary skill in the art can understand that all or part of the steps of the above method embodiments can be completed by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium, and when the program is executed, it performs the steps of the above method embodiments.

The above is only a specific embodiment of the application, but the scope of protection of the application is not limited thereto. Any person familiar with the technical field can easily think of changes or replacements within the technical scope disclosed in the application, and should be covered within the protection scope of this application. Therefore, the protection scope of the present application should be based on the protection scope of the claims.

Industrial Applicability

The present disclosure relates to a video generation method, device, electronic equipment, storage medium and program. The method comprises: writing the video data of a video stream into a preset cache queue, the video data including a data packet of the video stream and the time information of the data packet; when an event message for the video stream is received, determining the time interval corresponding to a predetermined event according to the trigger moment of the predetermined event in the event message, wherein the duration of the time interval is less than or equal to the duration of the video data in the cache queue; obtaining, from the video data in the cache queue, the video file of the video clip corresponding to the time interval; and storing the video file of the video clip in a preset storage space, so as to obtain the video clip associated with the predetermined event. The embodiments of the present disclosure can improve the efficiency of event backtracking during video analysis.

Claims

A video generation method, comprising:

Writing video data of the video stream into a preset buffer queue, the video data including data packets of the video stream and time information of the data packets;

When an event message for the video stream is received, determine a time interval corresponding to the predetermined event according to the triggering moment of the predetermined event in the event message, wherein the duration of the time interval is less than or equal to the duration of the video data in the cache queue;

From the video data in the cache queue, obtain the video file of the video segment corresponding to the time interval;

The video file of the video segment is stored in a preset storage space, so as to obtain the video segment associated with the predetermined event.
The method according to claim 1, wherein the trigger moment is within the time interval, and the time interval includes a start moment and an end moment,

Wherein, the acquisition of the video file of the video segment corresponding to the time interval from the video data in the cache queue includes:

When the current moment reaches the end moment, from the video data in the buffer queue, copy the data packet of the video segment corresponding to the time interval;

Encapsulate the data packet of the video segment to obtain a video file of the video segment.
The method according to claim 1 or 2, wherein the storing the video file of the video segment in a preset storage space to obtain the video segment associated with the predetermined event comprises:

storing the video file of the video segment in the storage space;

Establishing an association between the predetermined event and the storage address of the video file in the storage space, so as to obtain a video segment associated with the predetermined event.
The method according to any one of claims 1-3, wherein the method further comprises:

Perform event detection on the video frame of the video stream, and determine whether a predetermined event occurs in the video frame;

When a predetermined event occurs in the video frame, determine the triggering moment of the predetermined event according to the time information of the video frame;

sending an event message for the video stream, where the event message includes the predetermined event and the trigger moment.
The method according to claim 4, wherein the method is applied to an electronic device, and a visual analysis service and a video clipping service are run in the electronic device, and the visual analysis service is configured to:

Perform event detection on the video frame of the video stream, and determine whether a predetermined event occurs in the video frame;

When a predetermined event occurs in the video frame, determine the triggering moment of the predetermined event according to the time information of the video frame;

Send the event message to the video clipping service.
The method of claim 5, wherein the video clipping service is configured to:

Writing the video data of the video stream into a preset buffer queue;

In the case of receiving the event message sent by the visual analysis service, according to the triggering moment of the predetermined event, determine the time interval corresponding to the predetermined event;

From the video data in the cache queue, obtain the video file of the video segment corresponding to the time interval;

The video file of the video segment is stored in a preset storage space, so as to obtain the video segment associated with the predetermined event.
The method according to any one of claims 1-6, wherein the method further comprises:

reading a video file of a video segment associated with the predetermined event from the storage space in response to a viewing operation for the predetermined event;

Play said video file.
The method according to any one of claims 1-7, wherein the video stream includes one or more video streams, the cache queue includes a ring queue, and the predetermined events include at least one of pedestrian fall events, pedestrian wrong-way walking events, pedestrian squatting events, and smoking events.
A video generating device, comprising:

The data cache module is configured to write the video data of the video stream into a preset cache queue, the video data including the data packets of the video stream and the time information of the data packets;

The time interval determination module is configured to determine the time interval corresponding to the predetermined event according to the triggering moment of the predetermined event in the event message when the event message for the video stream is received, wherein the duration of the time interval is less than or equal to the duration of the video data in the cache queue;

A video file obtaining module configured to obtain a video file of a video segment corresponding to the time interval from the video data in the cache queue;

The storing and associating module is configured to store the video file of the video segment in a preset storage space, so as to obtain the video segment associated with the predetermined event.
An electronic device comprising:

a memory for storing processor-executable instructions;

A processor configured to invoke instructions stored in the memory to perform the method of any one of claims 1-8.
A computer-readable storage medium, on which computer program instructions are stored, and when the computer program instructions are executed by a processor, the method according to any one of claims 1 to 8 is implemented.
A computer program, comprising computer-readable codes, wherein when the computer-readable codes are run in an electronic device, a processor in the electronic device executes the method according to any one of claims 1 to 8.