CN112437233B - Video generation method, video processing device and camera equipment - Google Patents


Info

Publication number
CN112437233B
CN112437233B (application CN202110100556.8A)
Authority
CN
China
Prior art keywords
video
event
signal
target event
trigger signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110100556.8A
Other languages
Chinese (zh)
Other versions
CN112437233A (en)
Inventor
谷周亮
李升�
刘强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenlan Changsheng Technology Co ltd
Original Assignee
Beijing Shenlan Changsheng Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenlan Changsheng Technology Co., Ltd.
Priority to CN202110100556.8A
Publication of CN112437233A
Application granted
Publication of CN112437233B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N21/23418 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects

Abstract

The present application relates to a video generation method, a video processing method and apparatus, a camera device, an intelligent short video cloud, and a computer-readable storage medium. The method comprises: receiving a trigger signal generated by a target event, the trigger signal comprising at least one of a vibration signal, an audio signal, and a real-time positioning signal; and generating a corresponding clipping instruction according to the trigger signal, and clipping the captured original video according to the clipping instruction to obtain a video corresponding to the target event. With this method, the video segment corresponding to the target event can be obtained automatically and efficiently.

Description

Video generation method, video processing device and camera equipment
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a video generation method, a video processing method, an apparatus, a camera device, an intelligent short video cloud, and a computer-readable storage medium.
Background
Short videos, being brief and information-dense, have become increasingly popular. For example, when a highlight event occurs during a game, players or spectators may want to rewatch or share a video of that event.
However, conventional short video generation and processing rely entirely on manual identification of highlight events and post-hoc video editing, with videos distributed only after editing, so neither the efficiency nor the timeliness of short video generation can be guaranteed.
Disclosure of Invention
In view of the above, it is necessary to provide a video generation method, a video processing method, an apparatus, a camera device, an intelligent short video cloud, and a computer-readable storage medium that address the above technical problems.
A video generation method applied to an image pickup apparatus, the method comprising:
receiving a trigger signal generated by a target event, wherein the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
and generating a corresponding clipping instruction according to the trigger signal, and clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event.
In one embodiment, receiving the trigger signal generated by the target event includes:
performing signal identification on the received trigger signal generated by the target event and determining that the trigger signal includes the audio signal, the audio signal being a signal that is collected by an audio collection device and sent to the camera device after the device determines that the voiceprint feature data value in the audio signal meets a preset threshold;
generating a corresponding clipping instruction according to the trigger signal, and clipping the captured original video according to the clipping instruction to obtain a video corresponding to the target event, includes:
generating a corresponding clipping instruction according to the audio signal, and clipping a video segment of preset duration from the captured original video file to obtain a video corresponding to the target event.
In one embodiment, receiving the trigger signal generated by the target event includes:
performing signal identification on the received trigger signal generated by the target event, determining that the trigger signal includes a candidate audio signal, and performing voiceprint analysis on the candidate audio signal to extract the voiceprint feature data value in the candidate audio signal;
determining whether the voiceprint feature data value meets a preset threshold, and if so, determining the candidate audio signal to be a valid audio signal;
generating a corresponding clipping instruction according to the trigger signal, and clipping the captured original video according to the clipping instruction to obtain a video corresponding to the target event, includes:
generating a corresponding clipping instruction according to the valid audio signal, and clipping a video segment of preset duration from the captured original video file to obtain a video corresponding to the target event.
In one embodiment, receiving the trigger signal generated by the target event includes:
performing signal identification on the received trigger signal and determining that the trigger signal includes the vibration signal, the vibration signal being a signal that is collected by a vibration sensor device and sent to the camera device after the device determines that the vibration spectrum in the vibration signal meets a preset threshold; the vibration signal is representative of a shot hit event;
generating a corresponding clipping instruction according to the trigger signal, and clipping the captured original video according to the clipping instruction to obtain a video corresponding to the target event, includes:
generating a corresponding clipping instruction according to the vibration signal, and clipping a video segment of preset duration from the captured original video file to obtain a video corresponding to the shot hit event.
In one embodiment, receiving the trigger signal generated by the target event includes:
performing signal identification on the received trigger signal, determining that the trigger signal includes the real-time positioning signal, and storing the event information carried by the real-time positioning signal;
generating a corresponding clipping instruction according to the trigger signal, and clipping the captured original video according to the clipping instruction to obtain a video corresponding to the target event, includes:
generating a corresponding clipping instruction according to the real-time positioning signal, and clipping a video segment of preset duration from the captured original video file to obtain a video corresponding to the target event.
In one embodiment, the method further includes:
uploading the clipped videos corresponding to the target event to the intelligent short video cloud in groups according to event type, and deleting from the original video file the footage remaining after the videos corresponding to the target event have been clipped out.
A video processing method, applied to an intelligent short video cloud, the method comprising:
receiving a video corresponding to a target event uploaded by at least one camera device, wherein the video carries grouping identification information; the video is obtained by clipping the collected original video according to a clipping instruction generated by the trigger signal; the target event is an event generating a trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
grouping the received videos according to the grouping identification information to obtain videos corresponding to the same target event;
identifying whether the video corresponding to the target event contains event information of the target event; the event information of the target event is information that is obtained through analysis by the real-time positioning system and transmitted with the video via the real-time positioning signal, and includes an event type, an event principal, and an event result;
if the video corresponding to the target event does not contain the event information, performing video analysis on the video corresponding to the target event, extracting image features from the video, and obtaining the event type, event principal, and event result contained in the video corresponding to the target event from the extracted image features, based on their time sequence and a preset feature attribute rule.
In one embodiment, the method further comprises:
and storing the obtained event type, event principal and event result contained in the video corresponding to the target event into a memory of the intelligent short video cloud together with the video as the relevant information of the video.
In one embodiment, the method further comprises:
and if the video corresponding to the target event contains the event information, storing the event information as the related information of the video together with the video into a memory of the intelligent short video cloud.
In one embodiment, the method further comprises:
identifying a target object corresponding to the target event according to the event principal contained in the video corresponding to the target event; the target object is the event principal in the target event;
and establishing an incidence relation with an account corresponding to the target object, and pushing the video to the account corresponding to the target object according to the incidence relation.
A video generation apparatus applied to an image pickup device, the apparatus comprising:
the signal receiving module is used for receiving a trigger signal generated by a target event, wherein the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
and the video clipping module is used for generating a corresponding clipping instruction according to the trigger signal, and clipping the captured original video according to the clipping instruction to obtain a video corresponding to the target event.
A video processing apparatus, the apparatus being applied to a smart short video cloud, the apparatus comprising:
the video receiving module is used for receiving videos corresponding to target events uploaded by at least one camera device, and the videos carry grouping identification information; the video is obtained by clipping the collected original video according to a clipping instruction generated by the trigger signal; the target event is an event generating a trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
the video grouping module is used for grouping the received videos according to the grouping identification information to obtain videos corresponding to the same target event;
the identification module is used for identifying whether the video corresponding to the target event contains event information of the target event; the event information of the target event is information that is obtained through analysis by the real-time positioning system and transmitted with the video via the real-time positioning signal, and includes an event type, an event principal, and an event result;
and the video analysis module is used for performing video analysis on the video corresponding to the target event if the video does not contain the event information, extracting image features from the video, and obtaining the event type, event principal, and event result contained in the video corresponding to the target event from the extracted image features, based on their time sequence and a preset feature attribute rule.
An image pickup apparatus comprising a memory storing a computer program and a processor implementing the following steps when executing the computer program:
receiving a trigger signal generated by a target event, wherein the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
and generating a corresponding clipping instruction according to the trigger signal, and clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event.
An intelligent short video cloud comprising a memory storing a computer program and a processor implementing the following steps when executing the computer program:
receiving a video corresponding to a target event uploaded by at least one camera device, wherein the video carries grouping identification information; the video is obtained by clipping the collected original video according to a clipping instruction generated by the trigger signal; the target event is an event generating a trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
grouping the received videos according to the grouping identification information to obtain videos corresponding to the same target event;
identifying whether the video corresponding to the target event contains event information of the target event; the event information of the target event is information that is obtained through analysis by the real-time positioning system and transmitted with the video via the real-time positioning signal, and includes an event type, an event principal, and an event result;
if the video corresponding to the target event does not contain the event information, performing video analysis on the video corresponding to the target event, extracting image features from the video, and obtaining the event type, event principal, and event result contained in the video corresponding to the target event from the extracted image features, based on their time sequence and a preset feature attribute rule.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
receiving a trigger signal generated by a target event, wherein the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
and generating a corresponding clipping instruction according to the trigger signal, and clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
receiving a video corresponding to a target event uploaded by at least one camera device, wherein the video carries grouping identification information; the video is obtained by clipping the collected original video according to a clipping instruction generated by the trigger signal; the target event is an event generating a trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
grouping the received videos according to the grouping identification information to obtain videos corresponding to the same target event;
identifying whether the video corresponding to the target event contains event information of the target event; the event information of the target event is information that is obtained through analysis by the real-time positioning system and transmitted with the video via the real-time positioning signal, and includes an event type, an event principal, and an event result;
if the video corresponding to the target event does not contain the event information, performing video analysis on the video corresponding to the target event, extracting image features from the video, and obtaining the event type, event principal, and event result contained in the video corresponding to the target event from the extracted image features, based on their time sequence and a preset feature attribute rule.
The video generation method, the video processing method and apparatus, the camera device, the intelligent short video cloud, and the storage medium receive a trigger signal generated by a target event, the trigger signal including at least one of a vibration signal, an audio signal, and a real-time positioning signal; a corresponding clipping instruction is generated according to the trigger signal, and the captured original video is clipped according to the clipping instruction to obtain a video corresponding to the target event. With this approach, video clipping is performed automatically upon the trigger signal, and the video segment corresponding to the target event is obtained promptly.
Drawings
FIG. 1 is a diagram of an exemplary video generation and processing method;
FIG. 2 is a diagram of an exemplary video generation process in an embodiment, such as a basketball game field;
FIG. 3 is a flow diagram illustrating a method for video generation in one embodiment;
FIG. 4 is a flowchart illustrating a method for receiving a trigger signal and editing a video in one embodiment;
FIG. 5 is a flowchart illustrating a method for receiving a trigger signal and editing a video according to another embodiment;
FIG. 6 is a flowchart illustrating a method for receiving a trigger signal and editing a video according to another embodiment;
FIG. 7 is a flowchart illustrating a method for receiving a trigger signal and editing a video according to another embodiment;
FIG. 8 is a flow diagram that illustrates a video processing method in one embodiment;
FIG. 9 is a diagram illustrating an implementation of OpenPose technology to identify key nodes of a human body in one embodiment;
FIG. 10 is a flow diagram that illustrates associating users with corresponding event principals in one embodiment;
FIG. 11 is a flow diagram that illustrates an example of a method for video generation processing in one embodiment;
fig. 12 is a flowchart showing an example of a video generation processing method in another embodiment;
FIG. 13 is a block diagram showing the structure of a video generating apparatus according to an embodiment;
FIG. 14 is a block diagram showing the structure of a video processing apparatus according to one embodiment;
FIG. 15 is a block diagram showing internal configurations of a video generating apparatus and a video processing apparatus according to an embodiment;
FIG. 16 is an internal block diagram of the intelligent short video cloud in one embodiment;
fig. 17 is an internal configuration diagram of an image pickup apparatus in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The video generation method provided by the application can be applied to the application environment shown in fig. 1. The image capturing apparatus 102 and the server (i.e., the smart short video cloud) 104 communicate with each other via a network.
Specifically, the camera device 102 receives a trigger signal generated by a target event, where the trigger signal includes at least one of a vibration signal, an audio signal, and a real-time positioning signal; the camera device 102 then generates a corresponding clipping instruction according to the trigger signal and clips the captured original video according to the clipping instruction to obtain a video corresponding to the target event. The camera device 102 uploads the clipped video corresponding to the target event to the intelligent short video cloud 104, so that the intelligent short video cloud 104 receives videos corresponding to the target event uploaded by at least one camera device 102, each video carrying grouping identification information. The intelligent short video cloud 104 groups the received videos according to the grouping identification information to obtain the videos corresponding to the same target event, and identifies whether a video corresponding to a target event contains event information of that event; the event information is obtained through analysis by the real-time positioning system and transmitted with the video via the real-time positioning signal, and specifically includes an event type, an event principal, and an event result. If the video corresponding to the target event does not contain event information, the intelligent short video cloud 104 performs video analysis on the video, extracts image features from it, and obtains the event type, event principal, and event result contained in the video according to the extracted image features and a preset feature attribute rule.
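The cloud-side decision just described (prefer the event information carried with the video, and fall back to image-feature analysis otherwise) can be sketched in Python as follows. The function and field names are illustrative assumptions, not taken from the patent's implementation:

```python
def analyze_video(video):
    """Placeholder for the image-feature analysis pipeline that would
    derive the event type, event principal, and event result from the
    video's frames; a real system would run frame analysis here."""
    return {"event_type": "unknown", "event_principal": None, "event_result": None}

def process_uploaded_video(video):
    """Use the event information transmitted with the video via the
    real-time positioning signal when present; otherwise fall back to
    video analysis. `video` is a plain dict for illustration."""
    info = video.get("event_info")
    if info is not None:
        return info          # event info carried with the RTLS signal
    return analyze_video(video)  # fallback: image-feature analysis
```

A video uploaded with RTLS-derived event information skips the analysis path entirely; only videos without it incur the cost of frame-level processing.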
One or more camera devices 102 may be deployed in the application environment of the video generation method. Typically, for viewpoint diversity, an application scene is equipped with a plurality of camera devices 102 laid out at different angles for synchronized shooting. The intelligent short video cloud 104 may be implemented as an independent server or as a server cluster composed of multiple servers. The camera device 102 may be a smart camera, a mobile phone equipped with a camera, or the like; this embodiment is not limited in this regard. In addition, the trigger signals received in the video generation method are generated by the vibration sensor, the audio collection device (or audio sensor), and the RTLS system, respectively, according to the target event.
Specifically, taking a basketball court as an example, a specific application scenario of the video generation method and the video processing method is shown in fig. 2. Camera devices (i.e., shooting terminals) are disposed on both sides of the court, and the audio sensors that generate trigger signals are also disposed on both sides of the court (or integrated inside the camera devices). A vibration sensor is disposed on the basketball net, and an RTLS (Real-Time Location System) server (including a Bluetooth base station) is disposed outside the court. Each sensor (the vibration sensor and the audio sensor) and the RTLS communicate with the camera devices via Bluetooth. The camera devices are connected to the intelligent short video cloud through a network and can upload files; the intelligent short video cloud is associated with the user account of a client and transmits videos to that account through the network.
In one embodiment, as shown in fig. 3, there is provided a video generation method, which is described by way of example as being applied to the image pickup apparatus 102 in fig. 1, and includes the steps of:
step 301, receiving a trigger signal generated by a target event, where the trigger signal includes at least one of a vibration signal, an audio signal, and a real-time positioning signal.
In an implementation, the image pickup apparatus receives a trigger signal generated by a target event, wherein the trigger signal includes at least one of a vibration signal, an audio signal, and a real-time positioning signal (signal of an RTLS system).
Specifically, the target event may be a highlight event in a live game. Taking a basketball game as an example, the target event may be a shot hit event, a pass event, a block event, a rebound event, and the like. The trigger signal includes at least one of a vibration signal, an audio signal, and a real-time positioning signal; the possible combinations are: 1. a vibration signal; 2. an audio signal; 3. a real-time positioning signal; 4. a vibration signal and an audio signal; 5. a vibration signal and a real-time positioning signal; 6. an audio signal and a real-time positioning signal; 7. a vibration signal, an audio signal, and a real-time positioning signal.
Optionally, when multiple signals are combined, the camera device processes the signals by priority: the real-time positioning signal takes priority over the vibration signal, which takes priority over the audio signal. For example, if a trigger-signal combination includes the real-time positioning signal together with other signals, the real-time positioning signal is processed first and, for processing efficiency, the other signals are ignored; if the combination includes a vibration signal and an audio signal (but no real-time positioning signal), the vibration signal is processed first and the audio signal may be left unprocessed.
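The priority rule above (real-time positioning over vibration over audio, with lower-priority signals ignored) amounts to a small selection function. A minimal sketch with illustrative names, not the patent's implementation:

```python
# Lower value = higher processing priority, per the rule above.
PRIORITY = {"rtls": 0, "vibration": 1, "audio": 2}

def select_trigger(signal_types):
    """Return the single signal type to process from one trigger-signal
    combination; the remaining signals are ignored for efficiency."""
    if not signal_types:
        return None
    return min(signal_types, key=lambda s: PRIORITY[s])
```

For the combination of a vibration signal and an audio signal, this returns "vibration", matching the rule that the audio signal may be left unprocessed.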
Optionally, the vibration signal, the audio signal, and the real-time positioning signal may be converted into a bluetooth broadcast signal by a bluetooth communication technology and sent to the image capturing apparatus. If an audio acquisition module (or called as an audio acquisition device) is integrated in the camera device, the camera device can also directly receive and process an audio signal.
And 302, generating a corresponding clipping instruction according to the trigger signal, and clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event.
In implementation, the camera device is always in a shooting state while a program running inside it monitors received trigger signals in real time. It generates a corresponding clipping instruction from the received trigger signal and, in response to that instruction, clips the captured original video (i.e., the video currently buffered from shooting) to obtain the video corresponding to the target event. The moment the trigger signal is detected essentially coincides with the moment the event occurs, and the signal reaches each camera device at essentially the same time.
Specifically, the camera device analyzes and recognizes the received trigger signal, generates an executable clipping instruction, and then clips the captured original video according to that instruction. Taking a delay of, for example, m seconds into account, the program running inside the camera device uses the moment the trigger signal is received as a reference and intercepts the n seconds before and the m seconds after that moment, m + n seconds in total, as the video corresponding to the target event.
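The interception window just described (n seconds before the trigger moment and m seconds after it) is simple arithmetic on the buffered timeline. A hedged sketch, with the parameter names m and n taken from the text:

```python
def clip_window(trigger_time, n, m):
    """Return the (start, end) timestamps of the clipped segment:
    n seconds before the trigger moment and m seconds after it,
    m + n seconds in total."""
    return (trigger_time - n, trigger_time + m)
```

For a trigger received at t = 100 s with n = 5 and m = 3, the clipped segment spans 95 s to 103 s, i.e., 8 seconds in total.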
Optionally, m seconds and n seconds included in the video clip duration may be flexibly configured according to the video type and the event type included in the video, which is not limited in this embodiment.
In this video generation method, a trigger signal generated by a target event is received, the trigger signal including at least one of a vibration signal, an audio signal, and a real-time positioning signal; a corresponding clipping instruction is generated according to the trigger signal, and the captured original video is clipped according to the clipping instruction to obtain a video corresponding to the target event. With this method, video clipping is triggered automatically by the trigger signal, and the video segment corresponding to the target event is obtained promptly.
In one embodiment, as shown in fig. 4, the specific processing procedure of step 301 is as follows:
step 401, performing signal identification on the received trigger signal generated by the target event and determining that the trigger signal includes the audio signal; the audio signal is a signal that is collected by the audio collection device and sent to the camera device after the device determines that the voiceprint feature data value in the audio signal meets a preset threshold.
In implementation, the camera device performs signal identification on the received trigger signal generated by the target event and determines that the trigger signal includes an audio signal. The audio signal is collected by an audio collection device independent of the camera device; inside the audio collection device, voiceprint analysis is performed on the audio signal and, according to a preset threshold rule, the device determines whether the voiceprint feature data value extracted by the analysis meets the preset threshold. If it does, the audio collection device sends the audio signal to the camera device, for example as a Bluetooth broadcast via Bluetooth communication.
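The gating step inside the audio collection device can be sketched as below. The patent does not specify which voiceprint feature is used, so RMS frame energy stands in as an assumed feature, and the threshold value is illustrative:

```python
import math

VOICEPRINT_THRESHOLD = 0.5  # preset threshold; the value is illustrative

def voiceprint_feature(samples):
    """Assumed stand-in feature: RMS energy of one audio frame."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def should_forward(samples):
    """Send the audio signal to the camera device only when the
    extracted feature value meets the preset threshold."""
    return voiceprint_feature(samples) >= VOICEPRINT_THRESHOLD
```

A loud frame (e.g., a crowd cheering at a shot) clears the threshold and is forwarded as a trigger; quiet ambient frames are dropped at the sensor, so the camera device is not interrupted for non-events.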
The specific processing procedure of the above step 202 is as follows:
step 402, generating a corresponding clipping instruction according to the audio signal, and clipping a video segment with a preset duration from the acquired original video file to obtain a video corresponding to the target event.
In implementation, the image capturing apparatus generates a corresponding clipping instruction according to the audio signal, and clips a video segment with a preset duration (for example, m + n seconds) from the acquired original video file to obtain a video corresponding to the target event.
Optionally, the clipping criterion for the video may be the event type or another required criterion, which is not limited in this embodiment.
In this embodiment, sound caused by an event is collected by an audio acquisition device external to and independent of the camera device to obtain an audio signal. The voiceprint characteristic data value extracted from the audio signal is judged against a preset threshold, so that the occurrence of a target event is determined, and the audio signal is then sent to the camera device as a trigger signal to trigger automatic video clipping, improving the generation efficiency of the video corresponding to the target event.
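The device-side gate can be sketched as follows; RMS energy stands in for the "voiceprint characteristic data value" (a real device would extract richer voiceprint features), and the `send` callback stands in for the Bluetooth broadcast:

```python
import math

def voiceprint_feature(samples):
    """Stand-in feature value: RMS energy of one audio frame."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def on_audio_frame(samples, threshold, send):
    """Forward the frame to the camera device only when the feature
    value satisfies the preset threshold; otherwise drop it."""
    if voiceprint_feature(samples) >= threshold:
        send(samples)
        return True
    return False

sent = []
on_audio_frame([0.5, -0.5, 0.5, -0.5], threshold=0.4, send=sent.append)     # loud frame
on_audio_frame([0.01, -0.01, 0.01, -0.01], threshold=0.4, send=sent.append) # quiet frame
```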
In another embodiment, as shown in FIG. 5, the specific process of step 102 is as follows:
step 501, performing signal identification on a trigger signal generated by a received target event, determining that the trigger signal comprises a candidate audio signal, performing voiceprint analysis on the candidate audio signal, and extracting a voiceprint characteristic data value in the candidate audio signal.
In implementation, the camera device integrates an audio acquisition device (also called an audio acquisition module), so the camera device can directly collect external audio signals. After the camera device performs signal identification on the trigger signal generated by the received target event, it determines that the trigger signal comprises a candidate audio signal. Because the candidate audio signal is collected directly by the audio acquisition device inside the camera device, the audio acquisition device transmits the trigger signal through a microelectronic circuit to the video editing unit inside the camera device; voiceprint analysis is performed on the candidate audio signal inside the camera device, and the voiceprint characteristic data value in the candidate audio signal is extracted.
Step 502, judging whether the voiceprint characteristic data value meets a preset threshold value, and if so, determining that the candidate audio signal is an effective audio signal.
In implementation, the camera device judges whether the voiceprint characteristic data value satisfies the preset threshold; if it does, the candidate audio signal is determined to be a valid audio signal, that is, an audio signal that can trigger clipping.
The specific processing procedure of the above step 302 is as follows:
step 503, generating a corresponding clipping instruction according to the effective audio signal, and clipping a video segment with a preset duration from the acquired original video file to obtain a video corresponding to the target event.
In implementation, the image capturing apparatus generates a corresponding clipping instruction according to the effective audio signal, and clips a video segment with a preset duration (for example, m + n seconds) from the acquired original video file to obtain a video corresponding to the target event.
In this embodiment, sound caused by an event is collected by an audio acquisition device (a microphone or a microphone array) inside the camera device and converted into an audio signal. The voiceprint characteristic data value extracted from the audio signal is judged against a preset threshold, so that the occurrence of a target event is determined, and the audio signal serves as a trigger signal to trigger the camera device to perform automatic video clipping, improving the generation efficiency of the video corresponding to the target event.
In one embodiment, as shown in fig. 6, the specific processing procedure of step 201 is as follows:
step 601, performing signal identification on the received trigger signal and determining that the trigger signal comprises a vibration signal, wherein the vibration signal is a signal that has been collected by a vibration sensor device, judged to have a vibration spectrum satisfying a preset threshold, and then sent to the camera device; the vibration signal characterizes a shooting hit event.
In implementation, the camera device performs signal identification on the received trigger signal and determines through identification that the trigger signal comprises a vibration signal. The vibration signal is collected by a vibration sensor device independent of the camera device; a micro-processing chip for data analysis is integrated in the vibration sensor, which analyzes the vibration signal, extracts its vibration spectrum and judges the spectrum against a preset threshold rule. If the spectrum satisfies the preset threshold, the vibration sensor sends the vibration signal to the camera device as a trigger signal. Optionally, in a basketball game for example, the vibration signal included in the trigger signal represents a shooting hit event.
Specifically, the vibration sensor is arranged on the basketball net. When a player shoots during a game, the basketball causes the net to vibrate; the vibration sensor arranged on the net collects the vibration and generates a vibration signal, which is then analyzed and judged. If the vibration data contained in the vibration spectrum of the signal satisfies the preset threshold, the shot result is a hit; otherwise the result is a miss. When a shooting hit event occurs, a Bluetooth signal corresponding to the vibration signal is sent to the camera device in broadcast form.
The specific processing procedure of the above step 302 is as follows:
step 602, generating a corresponding clipping instruction according to the vibration signal, and clipping a video segment with a preset duration from the acquired original video file to obtain a video corresponding to the shooting hit event.
In implementation, the image capturing device generates a corresponding clipping instruction according to the vibration signal, and clips a video segment with a preset duration (for example, m + n seconds) from the acquired original video file to obtain a video corresponding to the shooting hit event.
In this embodiment, vibration caused by an event is collected by the vibration sensor and converted into a vibration signal. The spectrum data value extracted from the vibration signal is judged against a preset threshold, so that the occurrence of a target event (a shooting hit event) is determined, and the vibration signal is sent to the camera device as a trigger signal to trigger automatic video clipping, improving the generation efficiency of the video corresponding to the target event.
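A spectrum-threshold rule of this kind can be sketched as follows. The naive DFT, the frequency band and the synthetic waveform are all illustrative assumptions; a real sensor chip would use an FFT and a calibrated threshold:

```python
import math

def dominant_frequency(samples, sample_rate):
    """Return the dominant frequency (Hz) of a vibration sample
    window via a naive DFT over the positive-frequency bins."""
    n = len(samples)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2):
        re = sum(samples[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(samples[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * sample_rate / n

def is_shot_hit(samples, sample_rate, low_hz, high_hz):
    """Hypothetical threshold rule: a net vibration whose dominant
    frequency falls inside the preset band is judged a shot hit."""
    return low_hz <= dominant_frequency(samples, sample_rate) <= high_hz

# A synthetic 25 Hz net vibration sampled at 200 Hz.
wave = [math.sin(2 * math.pi * 25 * t / 200) for t in range(64)]
```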
In one embodiment, as shown in fig. 7, the specific processing procedure of step 201 is as follows:
step 701, performing signal identification on the received trigger signal, determining that the trigger signal comprises a real-time positioning signal, and storing event information carried by the real-time positioning signal.
In implementation, the image pickup device performs signal identification on the received trigger signal, determines that the trigger signal includes a real-time positioning signal, and the real-time positioning signal carries event information of a target event, and stores the event information carried by the real-time positioning signal.
Specifically, the real-time positioning signal is sent to the camera device by a positioning device equipped with a real-time locating system (RTLS). Through the position changes of target objects and the relative positional relationships among them, the real-time locating system can judge the event type, the event principal and the event result (collectively, the three event elements) of an occurring target event. Taking an RTLS deployed on a basketball court as an example, by tracking the positions of the basketball and of multiple players and analyzing their relative positions, shot hits (and, correspondingly, misses), passes, steals and blocks occurring in a game can be judged. After identifying the three event elements, the RTLS sends the three event elements and the position information of the event to the camera device in the form of a Bluetooth broadcast signal over Bluetooth communication; this broadcast corresponds to the real-time positioning signal in the trigger signal of this embodiment.
The specific processing procedure of step 302 is as follows:
step 702, generating a corresponding clipping instruction according to the real-time positioning signal, and clipping a video segment with a preset duration from the acquired original video file to obtain a video corresponding to the target event.
In implementation, the camera device generates a corresponding clipping instruction according to the real-time positioning signal, clips a video segment with a preset duration (for example, m + n seconds) from the acquired original video file, and obtains a video corresponding to the target event.
In this embodiment, the RTLS identifies the position changes of the target objects, determines the event type, event principal and event result of the target event in combination with the position information, converts them into a corresponding RTLS signal and sends it to the camera device, thereby triggering the camera device to perform automatic video clipping and improving the generation efficiency of the video corresponding to the target event.
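The three event elements plus position can be pictured as a small broadcast payload (JSON is used here purely for illustration; the embodiment specifies Bluetooth broadcast but not a wire format):

```python
import json

def encode_rtls_event(event_type, principal, result, position):
    """Pack the three event elements plus the event position into a
    broadcast payload (an illustrative encoding, not the actual one)."""
    return json.dumps({
        "event_type": event_type,
        "principal": principal,
        "result": result,
        "position": position,
    })

def decode_rtls_event(payload):
    """Camera-side parse of a received RTLS broadcast payload."""
    return json.loads(payload)

msg = encode_rtls_event("shot", "player_23", "hit", {"x": 4.2, "y": 1.5})
event = decode_rtls_event(msg)
```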
In one embodiment, the video generation method further comprises: uploading the clipped videos corresponding to the target events to the intelligent short video cloud in groups according to event type, and deleting from the original video files the footage remaining after the videos corresponding to the target events have been clipped out.
In implementation, the camera device uploads the clipped videos corresponding to the target events to the intelligent short video cloud in groups according to event type, and deletes from the original video files the footage remaining after the videos corresponding to the target events have been clipped out.
Specifically, when a plurality of camera devices capture video simultaneously, the synchronized videos from the plurality of angles are divided into the same group according to the clip signals (which carry synchronization time information) for the same target event, and uploaded to the intelligent short video cloud.
In this embodiment, the camera device promptly cleans up the files remaining after clipping and any video files stored for too long, so that sufficient internal storage space is maintained for new video.
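The post-clip cleanup can be sketched as follows, using throwaway files in a temporary directory to stand in for video files (paths and names are illustrative):

```python
import os
import tempfile

def clean_after_clip(original_path, clip_paths):
    """After the clips for the target events have been produced,
    delete the remaining original recording so the camera keeps
    enough free storage; the clips themselves are kept for upload."""
    if os.path.exists(original_path):
        os.remove(original_path)
    return [p for p in clip_paths if os.path.exists(p)]

# Demonstration with throwaway files standing in for video files.
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "original.mp4")
clip = os.path.join(workdir, "event_clip.mp4")
for path in (original, clip):
    with open(path, "wb") as f:
        f.write(b"\x00")
kept = clean_after_clip(original, [clip])
```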
In one embodiment, as shown in fig. 8, there is provided a video processing method applied to an intelligent short video cloud, the method including:
step 801, receiving a video corresponding to a target event uploaded by at least one camera device, wherein the video carries grouping identification information; the video is obtained by clipping the collected original video according to a clipping instruction generated by the trigger signal; the target event is an event generating a trigger signal; the trigger signal includes at least one of a vibration signal, an audio signal, and a real-time positioning signal.
The camera device is connected to a public network through a wireless or wired network. A program running inside the camera device associates the video (video file) corresponding to the target event obtained after clipping with the file attribute information of that file (and, if the video also carries event information about the video content, with that event information), and uploads the video over the network to the intelligent short video cloud in the public network.
In implementation, the intelligent short video cloud receives the video corresponding to the target event uploaded by at least one camera device, the video carrying grouping identification information. The video uploaded by the camera device is likewise obtained, as in the above video generation method, by generating a clipping instruction according to the trigger signal generated by the target event and clipping the original video according to that clipping instruction. The trigger signal that can trigger the camera device to perform video clipping comprises at least one of a vibration signal, an audio signal and a real-time positioning signal.
And step 802, grouping the received videos according to the grouping identification information to obtain videos corresponding to the same target event.
In implementation, the videos uploaded by the camera device all carry the grouping identifier, and the intelligent short video cloud performs grouping processing on the uploaded videos according to the grouping identifier information, wherein each group of videos corresponds to a video of the same target event.
Specifically, one or more camera devices may be deployed in a target shooting scene. When one camera device is deployed, only a video segment from a single shooting angle is acquired for each target event, so the intelligent short video cloud groups the single video segment corresponding to the same target event according to the grouping identifier. When a plurality of camera devices are deployed, video segments of the same target event at a plurality of shooting angles are acquired for each target event, so the intelligent short video cloud divides the plurality of video segments corresponding to the same target event into one group according to the grouping identifier; that is, each group of videos corresponds to one target event.
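Cloud-side grouping by the grouping identifier can be sketched as follows (the `group_id` and `camera` field names are assumptions for illustration):

```python
from collections import defaultdict

def group_uploads(uploads):
    """Group uploaded clips by their group identifier so that all
    camera angles of the same target event end up in one group."""
    groups = defaultdict(list)
    for clip in uploads:
        groups[clip["group_id"]].append(clip["camera"])
    return dict(groups)

uploads = [
    {"group_id": "evt-7", "camera": "cam-A"},
    {"group_id": "evt-7", "camera": "cam-B"},
    {"group_id": "evt-8", "camera": "cam-A"},
]
groups = group_uploads(uploads)
```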
Step 803, identifying whether the video corresponding to the target event contains event information of the target event; the event information of the target event is obtained through analysis by the real-time locating system and carried with the video, and comprises the event type, the event principal and the event result.
Since the trigger signal in the above video generation method may comprise an RTLS signal, and an RTLS signal carries event information and event position information, when the camera device clips the original video according to the clipping instruction generated from the RTLS signal, the resulting video file corresponding to the target event also contains the event information and event position information of the target event carried in the RTLS signal. The camera device uploads the video file, the event information and the event position information over the network to the intelligent short video cloud in the public network.
In implementation, the intelligent short video cloud first identifies whether a video corresponding to a target event contains event information of the target event, wherein the event information of the target event is obtained based on analysis of an RTLS real-time positioning system, and the event information includes three elements of the target event: event type, event principal and event outcome, as well as location information for real-time location in the RTLS signal.
Step 804, if the video corresponding to the target event does not contain the event information, performing video analysis on the video corresponding to the target event, extracting image features in the video, and obtaining the event type, event principal and event result contained in the video corresponding to the target event according to the extracted image features and a preset feature attribute rule.
In implementation, if the video corresponding to the target event does not contain event information, the intelligent short video cloud performs video analysis on the video, extracts image features in it, and analyzes them according to the extracted image features and a preset feature attribute rule to obtain the event type, event principal and event result contained in the video corresponding to the target event.
Specifically, the intelligent short video cloud takes the target event as the dimension of video analysis and identifies the action type of the event principal in the event using third-party human multi-pose estimation techniques. Taking a basketball event as an example, the action types identified for the acting objects in the video may include shooting, driving past a defender, blocking and the like; the result produced by each action is judged correspondingly, and the event principal in the video corresponding to the target event is determined according to the performer of the action.

Concretely, among the video image analysis algorithms applied by the intelligent short video cloud: the event principal can be identified with BodyPix under a third-party computer vision (CV) framework, and features such as the principal's face, hair length and color, jersey number and jersey name are extracted. Further, the whole course of the target event is analyzed with vision techniques, and third-party image processing software such as OpenCV is used to distinguish jersey colors so as to tell apart the teams to which the players belong. A third-party detection framework (such as SSD or Faster R-CNN) is used to detect the basketball, the basket and the marked areas of the court in the video, such as the three-point area, the three-second area and the center circle; and OpenPose, a third-party human multi-pose estimation framework, is used to identify 14 key nodes of every player's body in the video, as shown in FIG. 9. The event type and the event result are identified from the changes, across the frame sequence (i.e., the time dimension of the video), of each player's joints and the basketball's position information, or of the joints of two or more players of both sides together with the basketball's position information.
The specific judgment process is as follows:
step one, judging the ball-holding player: traverse the relative positions of all players and the basketball over the whole court; when the position of the palm node of player A of the attacking side is identified as coinciding with the position of the basketball, it is judged that player A holds the basketball in the video at that moment;
step two (a), judging a shooting action: the subsequent frame sequence in the video shows that player A's palm node rises above the top of the head, and the basketball leaves player A's palm, moves toward the basket in the horizontal direction, moves first upward and then downward in the vertical direction, and finally falls near the basket; the target event is then judged to be a shooting event with player A as the event principal;
step three (a), judging a shot hit: analyze the positions of the basketball and the basket in the video; if the two positions coincide and the descent speed decreases noticeably while the ball passes through the net, the system judges that the event is a shot hit.
step two (b), judging the defending player: traverse the relative positions of all players and the basketball over the whole court; a player whose jersey color differs from player A's, who faces player A, and whose arms come within a certain threshold distance of player A is judged by the system to be defending player B;
step three (b), judging a block: the subsequent frame sequence in the video shows that, after ball-holding player A performs a shooting action, a palm joint point of defending player B rises above the top of the head and its distance from the center of the basketball is less than a certain threshold, or the displacement speed or direction of the basketball changes noticeably after it leaves player A's palm; the target event is then judged to be a block event, the block is successful, and player B is the event principal.
Optionally, the event principal is determined according to the performer of the action: for attacking actions such as shooting and driving past a defender, the attacking player is the event principal; for defending actions such as blocking and stealing, the defending player is the event principal.
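Steps one and two (a) above can be sketched with toy keypoints (2-D points with y pointing up; the keypoint names, coordinates and tolerance are illustrative, and a real system would apply the same tests to pose-estimation joints over a frame sequence):

```python
def distance(p, q):
    """Euclidean distance between two 2-D points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def ball_holder(players, ball, tol=0.3):
    """Step one: the player whose palm keypoint coincides
    (within `tol`) with the ball position holds the ball."""
    for name, joints in players.items():
        if distance(joints["palm"], ball) <= tol:
            return name
    return None

def is_shooting(joints):
    """Step two (a), simplified: a palm keypoint above the
    head-top keypoint is the cue for a shooting action."""
    return joints["palm"][1] > joints["head_top"][1]

players = {
    "A": {"palm": (4.0, 2.3), "head_top": (4.0, 2.0)},
    "B": {"palm": (9.0, 1.6), "head_top": (9.0, 2.0)},
}
holder = ball_holder(players, ball=(4.1, 2.3))
```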
According to the video processing method, the intelligent short video cloud receives videos corresponding to target events uploaded by at least one camera device, the videos carrying grouping identification information; groups the received videos according to the grouping identification information to obtain the videos corresponding to the same target event; and identifies whether a video corresponding to a target event contains the event information of the target event. If the video corresponding to the target event does not contain the event information, video analysis is performed on it, image features in the video are extracted, and the event type, event principal and event result contained in the video are obtained according to the extracted image features and a preset feature attribute rule. By adopting this method, the video files are grouped, and the intelligent short video cloud extracts and analyzes features to obtain the event information contained in the video files, yielding metadata used for the subsequent storage, retrieval and pushing of the video files and thereby improving video distribution efficiency.
In one embodiment, the method further comprises: and taking the event type, the event principal and the event result contained in the video corresponding to the obtained target event as related information of the video, and storing the related information and the video into a memory of the intelligent short video cloud.
In implementation, the intelligent short video cloud performs feature extraction and analysis on the video to obtain the event type, event principal and event result corresponding to the target event. These are taken as the related information of the video; an association is established between the related information and the video, and both are stored together in the memory of the intelligent short video cloud, so that the video can be queried and pushed using the related information as retrieval or push conditions.
Optionally, the video search may also be performed according to conventional search conditions, such as the type of event, video time information, and video title included in the video.
Optionally, the video library for storing the videos may also be built on a third-party distributed file system and distributed database system; the video files corresponding to the target events and their related information (serving as metadata describing the videos) are stored together in the distributed database, and mass storage and high-speed retrieval of the video files are achieved on the distributed file system architecture.
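Storing each video together with its related information as queryable metadata can be sketched with an in-memory store (the class and field names are illustrative stand-ins for a distributed database):

```python
class VideoStore:
    """Toy stand-in for the short-video cloud storage: each video is
    saved with its related information (event type, principal, result)
    so it can later be retrieved or pushed by those fields."""

    def __init__(self):
        self._videos = []

    def save(self, video_id, event_type, principal, result):
        self._videos.append({
            "video_id": video_id,
            "event_type": event_type,
            "principal": principal,
            "result": result,
        })

    def find(self, **conditions):
        """Return every stored video matching all given conditions."""
        return [v for v in self._videos
                if all(v.get(k) == val for k, val in conditions.items())]

store = VideoStore()
store.save("v1", "shot", "player_23", "hit")
store.save("v2", "block", "player_8", "success")
hits = store.find(event_type="shot", result="hit")
```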
In one embodiment, the method further comprises: and if the video corresponding to the target event contains the event information, storing the event information and the video into a memory of the intelligent short video cloud together as the relevant information of the video.
In implementation, if the intelligent short video cloud recognizes that the video file uploaded by the camera device carries RTLS information, that is, event information carried by an RTLS signal is included, the video cloud directly establishes an association relationship between the video and the event information corresponding to the video without performing video analysis, and stores the association relationship into a memory of the intelligent short video cloud.
In one embodiment, as shown in fig. 10, the video processing method further includes:
step 1001, identifying a target object corresponding to the target event according to the event principal contained in the video corresponding to the target event; the target object is the event principal in the target event.
In implementation, the intelligent short video cloud identifies the target object corresponding to the target event according to the event principal contained in the video corresponding to the target event, wherein the target object is the event principal in the target event.
Specifically, the intelligent short video cloud determines the event principal in the video corresponding to the target event, and identifies the target object in the video according to features extracted in the process of determining the event principal, such as the players' faces, hair colors, jersey numbers and jersey names.
Step 1002, establishing an association relation with an account corresponding to the target object, and pushing the video to the account corresponding to the target object according to the association relation.
In implementation, the intelligent short video cloud establishes an association relation with an account corresponding to the target object, and pushes the video to the account corresponding to the target object according to the association relation.
Optionally, a user may log in to a personal account through an APP or applet to obtain the videos corresponding to target events in which the user is the event principal. The intelligent short video cloud may subsequently push videos to the corresponding user account automatically, according to a preset period, the upload status of video files, and so on; the user may also manually establish the association between videos and the account by "following", so that subsequently clipped videos are pushed to the account automatically.
Optionally, the video distribution mode may be a push mode of the smart short video cloud (server) to the user account (client), and may also be a pull mode of the client from the server, so that the embodiment is not limited to the specific implementation mode of video distribution.
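The association-then-push step can be sketched as follows (the account identifiers and the `deliver` callback are illustrative; either a push or a pull distribution mode could sit behind the same association table):

```python
def push_videos(videos, accounts, deliver):
    """Push each clip to the account associated with its event
    principal; clips whose principal has no registered account are
    returned so they can be retried once an association exists."""
    unmatched = []
    for video in videos:
        account = accounts.get(video["principal"])
        if account is None:
            unmatched.append(video["video_id"])
        else:
            deliver(account, video["video_id"])
    return unmatched

inbox = {}
accounts = {"player_23": "acct-1001"}
videos = [{"video_id": "v1", "principal": "player_23"},
          {"video_id": "v2", "principal": "player_8"}]
left = push_videos(videos, accounts,
                   deliver=lambda acct, vid: inbox.setdefault(acct, []).append(vid))
```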
In this embodiment, through the association established between the video file and the video's related information, the video file is pushed to the event principal appearing in the video, completing automatic distribution of the video and improving distribution efficiency. Specifically, the whole pipeline from live clipping, uploading and analysis to pushing a short video to the target object's account typically takes no more than 3 minutes.
In the present embodiment, an example of the video generation and processing procedure is provided, as shown in fig. 11, in which the audio acquisition device is inside the camera device:
in the basket sensor: the sensor receives a vibration signal, analyzes it, and judges whether the vibration spectrum in the signal reaches the preset threshold (i.e., whether the shot scored); if so, a Bluetooth broadcast signal is sent to the camera device; if not, the sensor continues to receive vibration signals;
in the RTLS (server) system: the event (event type, event principal and event result) is analyzed from the recognized position information, and the position information and event information are sent to the camera device in the form of a Bluetooth broadcast signal;
step 1110 (a): the camera device listens for a Bluetooth broadcast signal; if one is received, step 1120 is executed. Step 1110 (b): the camera device listens for a live audio signal; if one is received, step 1130 is executed;
step 1120, analyzing whether the bluetooth broadcast signal contains the event information of the RTLS, if so, temporarily storing the RTLS event information and executing step 1140, and if not, executing step 1140;
step 1130, performing voiceprint analysis on the audio signal, and determining whether a voiceprint feature in the audio signal reaches a preset threshold, if so, performing step 1140, otherwise, performing step 1110 (b);
and step 1140, clipping m + n seconds of video in the original video as the video corresponding to the target event.
Step 1150, cleaning the remaining videos in the original video file after clipping, and uploading videos corresponding to the clipped target event, video grouping information (the same target event has the same information identifier), and RTLS event information (if any) to the smart short video cloud;
step 1160, receiving a video corresponding to the target event by the intelligent short video cloud;
step 1170, performing grouping processing and video preprocessing on the video according to the grouping information;
step 1180, storing the video according to the grouping result;
step 1190, determining whether the video corresponding to the target event carries the RTLS event information, if yes, executing step 11110, and if not, executing step 11100;
step 11100: analyzing and processing videos of the same target event to obtain event information (event type, event principal and event result) of the target event in the videos;
step 11110, associate the event information with the video of the corresponding target event, and store the event information;
step 11120, associating the video corresponding to each target event with the account number of the event principal of the target event contained in the video, so as to push the video to the event principal.
In one embodiment, an example of the video generation and processing procedure is provided, as shown in fig. 12, in which the audio acquisition device is independent of the camera device:
in the basket sensor: the sensor receives a vibration signal, analyzes it, and judges whether the vibration spectrum in the signal reaches the preset threshold (i.e., whether the shot scored); if so, a Bluetooth broadcast signal is sent to the camera device; if not, the sensor continues to receive vibration signals;
in the audio sensor: the sensor receives the live audio signal, performs voiceprint analysis on it, and judges whether the voiceprint features in the signal reach the preset threshold; if so, a Bluetooth broadcast signal is sent; if not, the sensor continues to receive the live audio signal;
in the RTLS (server) system: the event (event type, event principal and event result) is analyzed from the recognized position information, and the position information and event information are sent to the camera device in the form of a Bluetooth broadcast signal;
step 1201, the camera device receives a Bluetooth broadcast signal;
step 1202, determining whether the bluetooth broadcast signal contains event information of RTLS, if yes, executing step 1203, and if not, executing step 1204;
step 1203, temporarily storing RTLS event information;
step 1204, clipping m + n seconds of video from the original video as the video corresponding to the target event;
Step 1205, deleting the footage remaining in the original video file after clipping, and uploading the clipped videos corresponding to the target event, the video grouping information (videos of the same target event carry the same identifier), and the RTLS event information (if any) to the intelligent short video cloud;
step 1206, the intelligent short video cloud receives a video corresponding to the target event;
step 1207, grouping processing and video preprocessing are carried out on the video according to the grouping information;
step 1208, storing the video according to the grouping result;
step 1209, judging whether the video corresponding to the target event carries RTLS event information; if so, executing step 1211, and if not, executing step 1210;
step 1210, analyzing and processing videos of the same target event to obtain event information (event type, event principal and event result) of the target event in the videos;
step 1211, associating the event information with the video of the corresponding target event and storing the event information;
step 1212, associating the video corresponding to each target event with the account of the event principal contained in the video, so as to push the video to that event principal.
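Steps 1206-1208 group the uploaded clips by their grouping identifier before storage, so that clips of the same target event — possibly from different cameras — end up together. A minimal sketch, assuming each upload is a (group identifier, clip) pair; the data shapes and names are illustrative, as the patent only specifies that clips of the same target event carry the same identifier:

```python
from collections import defaultdict

def group_videos(uploads):
    """Group uploaded clips by their grouping identifier: clips that carry
    the same identifier belong to the same target event, even when they
    were uploaded by different camera devices."""
    groups = defaultdict(list)
    for group_id, clip in uploads:
        groups[group_id].append(clip)
    return dict(groups)

# Two cameras captured event "evt-1"; one camera captured event "evt-2".
uploads = [("evt-1", "cam-A/clip1.mp4"),
           ("evt-2", "cam-A/clip2.mp4"),
           ("evt-1", "cam-B/clip7.mp4")]
print(group_videos(uploads))
```

Each resulting group can then be preprocessed and stored as a unit, as in step 1208.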
It should be understood that although the steps in the flowcharts of figs. 3-8 and 10-12 are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in figs. 3-8 and 10-12 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments; the execution order of these sub-steps or stages is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 13, there is provided a video generation apparatus 1300 including: a signal receiving module 1310 and a video cropping module 1320, wherein:
a signal receiving module 1310, configured to receive a trigger signal generated by a target event, where the trigger signal includes at least one of a vibration signal, an audio signal, and a real-time positioning signal;
the video clipping module 1320 is configured to generate a corresponding clipping instruction according to the trigger signal, and clip the acquired original video according to the clipping instruction to obtain a video corresponding to the target event.
The above video generation apparatus receives a trigger signal generated by a target event, where the trigger signal includes at least one of a vibration signal, an audio signal, and a real-time positioning signal; it then generates a corresponding clipping instruction according to the trigger signal and clips the captured original video according to that instruction to obtain the video corresponding to the target event. With this apparatus, the trigger signal automatically initiates video clipping, so that the video segment corresponding to the target event is obtained promptly.
In one embodiment, the signal receiving module 1310 is specifically configured to perform signal identification on a received trigger signal generated by the target event and determine that the trigger signal includes the audio signal; the audio signal is a signal collected by the audio capture device and sent to the camera device after the device determines that the voiceprint feature data value in the signal meets a preset threshold;
the video clipping module 1320 is specifically configured to generate a corresponding clipping instruction according to the audio signal, and clip a video segment with a preset duration from the acquired original video file to obtain a video corresponding to the target event.
In another embodiment, the signal receiving module 1310 is specifically configured to perform signal identification on a trigger signal generated by a received target event, determine that the trigger signal includes a candidate audio signal, perform voiceprint analysis on the candidate audio signal, and extract a voiceprint feature data value in the candidate audio signal;
judging whether the voiceprint characteristic data value meets a preset threshold value or not, and if so, determining that the candidate audio signal is an effective audio signal;
the video clipping module 1320 is specifically configured to generate a corresponding clipping instruction according to the effective audio signal, and clip a video segment with a preset duration from the acquired original video file to obtain a video corresponding to the target event.
In an embodiment, the signal receiving module 1310 is specifically configured to perform signal identification on the received trigger signal, and determine that the trigger signal includes a vibration signal, where the vibration signal is a signal that is acquired by the vibration sensor device and sent to the image capturing device after determining that a vibration spectrum in the vibration signal satisfies a preset threshold; the vibration signal represents a shooting hit event;
the video clipping module 1320 is specifically configured to generate a corresponding clipping instruction according to the vibration signal, and clip a video segment with a preset duration from the acquired original video file to obtain a video corresponding to the shooting hit event.
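The threshold decision the vibration sensor makes before emitting its signal — checking whether the vibration spectrum reaches a preset value — can be sketched with a naive single-bin DFT; the audio voiceprint check follows the same gate-then-broadcast pattern. This is an illustrative sketch: the threshold, sample rate, and examined frequency below are hypothetical placeholders, not values from the patent.

```python
import cmath
import math

# Hypothetical values; the patent specifies only that "a preset threshold"
# exists, not its magnitude or which frequency is examined.
SPECTRUM_THRESHOLD = 100.0
SAMPLE_RATE = 1000   # samples per second
RIM_FREQ = 40.0      # assumed resonant frequency of the rim, in Hz

def spectral_magnitude(samples, freq, sample_rate):
    """Magnitude of `samples` at `freq` Hz via a naive single-bin DFT."""
    return abs(sum(s * cmath.exp(-2j * cmath.pi * freq * k / sample_rate)
                   for k, s in enumerate(samples)))

def is_shot_hit(samples):
    """True when the vibration spectrum reaches the preset threshold --
    the condition under which the sensor would emit its broadcast."""
    return spectral_magnitude(samples, RIM_FREQ, SAMPLE_RATE) >= SPECTRUM_THRESHOLD

# Simulated rim vibration (strong 40 Hz component) vs. a quiet backboard.
hit = [math.sin(2 * math.pi * RIM_FREQ * k / SAMPLE_RATE) for k in range(1000)]
quiet = [0.0] * 1000
print(is_shot_hit(hit), is_shot_hit(quiet))  # → True False
```

A production sensor would of course run this continuously on a microcontroller and use a calibrated spectrum analysis, but the gate-then-broadcast structure is the same.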
In an embodiment, the signal receiving module 1310 is specifically configured to perform signal identification on a received trigger signal, determine that the trigger signal includes a real-time positioning signal, and store event information carried by the real-time positioning signal;
the video clipping module 1320 is specifically configured to generate a corresponding clipping instruction according to the real-time positioning signal, and clip a video segment with a preset duration from the acquired original video file to obtain a video corresponding to the target event.
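When a trigger carries several signal types at once, claim 1 below assigns them a processing priority: the real-time positioning signal outranks the vibration signal, which outranks the audio signal, and lower-priority signals in the same trigger are dropped. A minimal dispatch sketch of that rule (the type labels are illustrative):

```python
# Highest priority first, per the ordering described in claim 1:
# real-time positioning > vibration > audio.
PRIORITY = ("rtls", "vibration", "audio")

def select_signal(signal_types):
    """Pick the single signal type to act on from the set of types found
    in one trigger; lower-priority signals in the same trigger are ignored."""
    for kind in PRIORITY:
        if kind in signal_types:
            return kind
    return None  # no recognized signal type in the trigger

print(select_signal({"audio", "vibration"}))          # → vibration
print(select_signal({"rtls", "vibration", "audio"}))  # → rtls
```

The clipping instruction is then generated only from the selected signal, which avoids clipping the same target event twice when, say, a goal produces both a vibration and a crowd-noise trigger.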
In one embodiment, the video generation device further comprises a video cleaning module, wherein the video cleaning module is used for uploading videos corresponding to the clipped target events to the intelligent short video cloud in a grouping mode according to event types, and deleting the video files left after the videos corresponding to the target events are clipped from the original video files.
In one embodiment, as shown in fig. 14, there is provided a video processing apparatus 1400 comprising: a video receiving module 1410, a video grouping module 1420, a recognition module 1430, and a video analysis module 1440, wherein:
the video receiving module 1410 is configured to receive a video corresponding to a target event uploaded by at least one camera device, where the video carries packet identification information; the video is obtained by clipping the collected original video according to a clipping instruction generated by the trigger signal; the target event is an event generating a trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
the video grouping module 1420 is configured to perform grouping processing on the received multiple videos according to the grouping identification information to obtain videos corresponding to the same target event;
the identifying module 1430 is configured to identify whether the video corresponding to the target event contains event information of the target event; the event information of the target event is obtained through analysis by the real-time positioning system and transmitted along with the real-time positioning signal and the video, and includes an event type, an event principal, and an event result;
the video analysis module 1440 is configured to, if the video corresponding to the target event does not contain the event information, perform video analysis on the video, extract image features from it, and obtain the event type, event principal, and event result contained in the video according to the extracted image features and preset feature attribute rules.
The above video processing apparatus, through the intelligent short video cloud, receives videos corresponding to target events uploaded by at least one camera device, where the videos carry grouping identification information; groups the received videos according to the grouping identification information to obtain the videos corresponding to the same target event; and identifies whether a video corresponding to a target event contains event information of the target event. If it does not, the apparatus performs video analysis on the video, extracts image features, and obtains the event type, event principal, and event result contained in the video according to the extracted image features and preset feature attribute rules. With this apparatus, the intelligent short video cloud groups the video files and extracts and analyzes their features to obtain the event information they contain; this event information serves as metadata for the storage, retrieval, and pushing of the video files, improving video distribution efficiency.
In one embodiment, the video processing apparatus 1400 further includes a storage module, where the storage module is specifically configured to store the event type, event principal, and event result contained in the video corresponding to the target event, together with the video, as the video's related information in the memory of the intelligent short video cloud.
In another embodiment, the storage module is specifically configured to, if the video corresponding to the target event contains event information, store the event information, together with the video, as the video's related information in the memory of the intelligent short video cloud.
In one embodiment, the video processing apparatus 1400 further comprises:
the identification module is used for identifying a target object corresponding to a target event according to the event principal contained in the video corresponding to the target event; the target object is the event principal in the target event;
and the pushing module is used for establishing an association with the account corresponding to the target object and pushing the video to that account according to the association.
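The identification and pushing modules above link each clip to the account of its event principal and deliver it there. A sketch under illustrative assumptions — the record fields and the account lookup table are placeholders, and `push` stands in for whatever delivery channel the cloud actually uses:

```python
def associate_and_push(videos, accounts, push):
    """Link each clip to the account of its event principal and push it.

    `videos` is a list of dicts with an "event_principal" and a "clip"
    field; `accounts` maps a principal's identity to an account id; `push`
    is a callable (account_id, clip) -> None. All of these shapes are
    illustrative, not taken from the patent.
    """
    for video in videos:
        account = accounts.get(video["event_principal"])
        if account is not None:          # skip principals with no account
            video["account"] = account   # record the association
            push(account, video["clip"])

# Deliver one clip whose event principal has a registered account.
sent = []
videos = [{"event_principal": "player-7", "clip": "evt-1.mp4"}]
accounts = {"player-7": "acct-42"}
associate_and_push(videos, accounts, lambda acct, clip: sent.append((acct, clip)))
print(sent)  # → [('acct-42', 'evt-1.mp4')]
```

Clips whose principal has no registered account are simply left unassociated rather than pushed anywhere.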
Optionally, fig. 15 shows the module architectures of the camera device (also called the shooting terminal), which contains the video generation apparatus 1300, and of the intelligent short video cloud, which contains the video processing apparatus 1400. Specifically, the camera device includes: a camera module, a signal receiving module, a video clipping module, an uploading module, a storage cleaning module, a sound collection module, and a voiceprint analysis module; the intelligent short video cloud includes: a video receiving and preprocessing module, a video grouping module, a video-account association module, a video analysis module, a video retrieval module, and a video and classification information storage module.
For specific limitations of the video generation apparatus 1300, reference may be made to the above limitations on the video generation method; similarly, for specific limitations of the video processing apparatus 1400, reference may be made to the above limitations on the video processing method, which are not repeated here. The modules in the above video generation apparatus and video processing apparatus may be implemented wholly or partially by software, hardware, or a combination thereof. Each module may be embedded in or independent of a processor of the computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke it and perform the corresponding operations.
In one embodiment, an intelligent short video cloud device is provided, and the intelligent short video cloud device may be a server, and the internal structure diagram of the intelligent short video cloud device may be as shown in fig. 16. The intelligent short video cloud device comprises a processor, a memory and a network interface which are connected through a system bus. Wherein the processor of the intelligent short video cloud device is configured to provide computing and control capabilities. The memory of the intelligent short video cloud equipment comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the intelligent short video cloud equipment is used for storing videos and event information data in the videos. The network interface of the intelligent short video cloud equipment is used for being connected and communicated with an external terminal through a network. The computer program is executed by a processor to implement a video processing method.
In one embodiment, there is provided an image pickup apparatus whose internal structural view may be as shown in fig. 17. The image pickup apparatus includes a processor, a memory, a communication interface, a display screen, and an input device connected through a system bus. Wherein the processor of the imaging apparatus is configured to provide computational and control capabilities. The memory of the image pickup apparatus includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the image pickup apparatus is used for performing wired or wireless communication with an external terminal, and the wireless communication can be realized by WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a video generation method. The display screen of the image pickup apparatus may be a liquid crystal display screen or an electronic ink display screen.
Those skilled in the art will appreciate that the configurations shown in fig. 16 and 17 are only block diagrams of partial configurations relevant to the present disclosure, and do not constitute a limitation on the image capture device and the intelligent short video cloud to which the present disclosure is applied, and a particular image capture device and intelligent short video cloud may include more or fewer components than those shown in the figures, or combine certain components, or have different arrangements of components.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the above method embodiments. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application, and their descriptions are relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (15)

1. A video generation method applied to an image pickup apparatus, the method comprising:
receiving a trigger signal generated by a target event, performing signal identification on the received trigger signal, and determining the types of signals contained in the trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal, and a real-time positioning signal; the trigger signal has a plurality of signal combination modes; processing priorities exist among the different signal types within a combination mode, the real-time positioning signal taking precedence over the vibration signal and the audio signal; if the combination mode of the trigger signal contains the real-time positioning signal and other signals, the real-time positioning signal is processed preferentially and the other signals are not processed; if the combination mode of the trigger signal does not contain the real-time positioning signal but contains the vibration signal and the audio signal, the vibration signal is processed preferentially and the audio signal is not processed;
and generating a corresponding clipping instruction according to the trigger signal, and clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event.
2. The method of claim 1, wherein receiving the trigger signal generated by the target event comprises:
performing signal identification on a received trigger signal generated by the target event, and determining that the trigger signal comprises the audio signal; the audio signal is a signal which is acquired by audio acquisition equipment, judged that a voiceprint characteristic data value in the audio signal meets a preset threshold value and then sent to the camera equipment;
generating a corresponding clipping instruction according to the trigger signal, clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event, wherein the method comprises the following steps:
and generating a corresponding clipping instruction according to the audio signal, and clipping a video segment with preset duration from the acquired original video file to obtain a video corresponding to the target event.
3. The method of claim 1, wherein receiving the trigger signal generated by the target event comprises:
performing signal identification on a received trigger signal generated by the target event, determining that the trigger signal comprises a candidate audio signal, and performing voiceprint analysis on the candidate audio signal to extract a voiceprint characteristic data value in the candidate audio signal;
judging whether the voiceprint characteristic data value meets a preset threshold value or not, and if so, determining the candidate audio signal as an effective audio signal;
generating a corresponding clipping instruction according to the trigger signal, clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event, wherein the method comprises the following steps:
and generating a corresponding clipping instruction according to the effective audio signal, and clipping a video segment with preset duration from the acquired original video file to obtain a video corresponding to the target event.
4. The method of claim 1, wherein receiving the trigger signal generated by the target event comprises:
performing signal identification on the received trigger signal, and determining that the trigger signal comprises the vibration signal, wherein the vibration signal is a signal which is acquired based on vibration sensor equipment and is sent to the camera equipment after judging that a vibration frequency spectrum in the vibration signal meets a preset threshold; the vibration signal is representative of a shot hit event;
generating a corresponding clipping instruction according to the trigger signal, clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event, wherein the method comprises the following steps:
and generating a corresponding clipping instruction according to the vibration signal, and clipping a video segment with preset duration from the acquired original video file to obtain a video corresponding to the shooting hit event.
5. The method of claim 1, wherein receiving the trigger signal generated by the target event comprises:
performing signal identification on the received trigger signal, determining that the trigger signal comprises the real-time positioning signal, and storing event information carried by the real-time positioning signal;
generating a corresponding clipping instruction according to the trigger signal, clipping the acquired original video according to the clipping instruction to obtain a video corresponding to the target event, wherein the method comprises the following steps:
and generating a corresponding clipping instruction according to the real-time positioning signal, and clipping a video segment with preset duration from the acquired original video file to obtain a video corresponding to the target event.
6. The method according to any one of claims 1-5, further comprising:
and uploading the videos corresponding to the target event obtained by clipping to an intelligent short video cloud in a grouping mode according to the event types, and deleting the video files left after the videos corresponding to the target event are clipped from the original video files.
7. A video processing method is applied to an intelligent short video cloud, and comprises the following steps:
receiving a video corresponding to a target event uploaded by at least one camera device, wherein the video carries grouping identification information; the video is obtained by clipping the collected original video according to a clipping instruction generated by the trigger signal; the target event is an event generating a trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
grouping the received videos according to the grouping identification information to obtain videos corresponding to the same target event;
identifying whether the video corresponding to the target event contains event information of the target event; the event information of the target event is obtained through analysis by the real-time positioning system and transmitted along with the real-time positioning signal and the video, and the event information comprises an event type, an event principal, and an event result;
if the video corresponding to the target event does not contain the event information, performing video analysis on the video corresponding to the target event, extracting image features from the video, and obtaining the event type, event principal, and event result contained in the video according to the extracted time-sequence image features and preset feature attribute rules.
8. The method of claim 7, further comprising:
and storing the obtained event type, event principal and event result contained in the video corresponding to the target event into a memory of the intelligent short video cloud together with the video as the relevant information of the video.
9. The method of claim 7, further comprising:
and if the video corresponding to the target event contains the event information, storing the event information as the related information of the video together with the video into a memory of the intelligent short video cloud.
10. The method of claim 7, further comprising:
identifying a target object corresponding to the target event according to the event principal contained in the video corresponding to the target event; the target object is the event principal in the target event;
and establishing an association with the account corresponding to the target object, and pushing the video to the account corresponding to the target object according to the association.
11. A video generation apparatus, characterized in that the apparatus is applied to an image pickup device, the apparatus comprising:
the signal receiving module, configured to receive a trigger signal generated by a target event, perform signal identification on the received trigger signal, and determine the types of signals contained in the trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal, and a real-time positioning signal; the trigger signal has a plurality of signal combination modes; processing priorities exist among the different signal types within a combination mode, the real-time positioning signal taking precedence over the vibration signal and the audio signal; if the combination mode of the trigger signal contains the real-time positioning signal and other signals, the real-time positioning signal is processed preferentially and the other signals are not processed; if the combination mode of the trigger signal does not contain the real-time positioning signal but contains the vibration signal and the audio signal, the vibration signal is processed preferentially and the audio signal is not processed;
and the video cutting module is used for generating a corresponding cutting instruction according to the trigger signal, and cutting the acquired original video according to the cutting instruction to obtain a video corresponding to the target event.
12. A video processing apparatus, wherein the apparatus is applied to an intelligent short video cloud, and the apparatus comprises:
the video receiving module is used for receiving videos corresponding to target events uploaded by at least one camera device, and the videos carry grouping identification information; the video is obtained by clipping the collected original video according to a clipping instruction generated by the trigger signal; the target event is an event generating a trigger signal; the trigger signal comprises at least one of a vibration signal, an audio signal and a real-time positioning signal;
the video grouping module is used for grouping the received videos according to the grouping identification information to obtain videos corresponding to the same target event;
the identification module is used for identifying whether the video corresponding to the target event contains event information of the target event; the event information of the target event is obtained through analysis by the real-time positioning system and transmitted along with the real-time positioning signal and the video, and the event information comprises an event type, an event principal, and an event result;
and the video analysis module is used for, if the video corresponding to the target event does not contain the event information, performing video analysis on the video corresponding to the target event, extracting image features from the video, and obtaining the event type, event principal, and event result contained in the video according to the extracted time-sequence image features and preset feature attribute rules.
13. An image capturing apparatus comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 6 when executing the computer program.
14. An intelligent short video cloud apparatus comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 7 to 10.
15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 6 or 7 to 10.
CN202110100556.8A 2021-01-26 2021-01-26 Video generation method, video processing device and camera equipment Active CN112437233B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110100556.8A CN112437233B (en) 2021-01-26 2021-01-26 Video generation method, video processing device and camera equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110100556.8A CN112437233B (en) 2021-01-26 2021-01-26 Video generation method, video processing device and camera equipment

Publications (2)

Publication Number Publication Date
CN112437233A CN112437233A (en) 2021-03-02
CN112437233B true CN112437233B (en) 2021-04-16

Family

ID=74697255

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110100556.8A Active CN112437233B (en) 2021-01-26 2021-01-26 Video generation method, video processing device and camera equipment

Country Status (1)

Country Link
CN (1) CN112437233B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113556485A (en) * 2021-07-23 2021-10-26 上海商汤智能科技有限公司 Video generation method and device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105120191A (en) * 2015-07-31 2015-12-02 Xiaomi Technology Co., Ltd. Video recording method and device
CN107079201A (en) * 2014-08-13 2017-08-18 Intel Corporation Technology and device for editing video
WO2018044731A1 (en) * 2016-09-02 2018-03-08 Vid Scale, Inc. Systems and methods for hybrid network delivery of objects of interest in video
CN109326310A (en) * 2017-07-31 2019-02-12 Ximei Technology (Beijing) Co., Ltd. Automatic editing method, apparatus, and electronic device
CN110717071A (en) * 2018-06-26 2020-01-21 Beijing Shenlan Changsheng Technology Co., Ltd. Image clipping method, image clipping device, computer device, and storage medium
CN111167105A (en) * 2019-11-29 2020-05-19 Beijing Shenlan Changsheng Technology Co., Ltd. Shooting detection method, device, equipment, system and storage medium
CN111494912A (en) * 2019-01-31 2020-08-07 Beijing Shenlan Changsheng Technology Co., Ltd. Basketball exercise assisting system and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8731239B2 (en) * 2009-12-09 2014-05-20 Disney Enterprises, Inc. Systems and methods for tracking objects under occlusion

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107079201A (en) * 2014-08-13 2017-08-18 Intel Corporation Technology and device for editing video
CN105120191A (en) * 2015-07-31 2015-12-02 Xiaomi Technology Co., Ltd. Video recording method and device
WO2018044731A1 (en) * 2016-09-02 2018-03-08 Vid Scale, Inc. Systems and methods for hybrid network delivery of objects of interest in video
CN109326310A (en) * 2017-07-31 2019-02-12 Ximei Technology (Beijing) Co., Ltd. Automatic editing method, apparatus, and electronic device
CN110717071A (en) * 2018-06-26 2020-01-21 Beijing Shenlan Changsheng Technology Co., Ltd. Image clipping method, image clipping device, computer device, and storage medium
CN111494912A (en) * 2019-01-31 2020-08-07 Beijing Shenlan Changsheng Technology Co., Ltd. Basketball exercise assisting system and method
CN111167105A (en) * 2019-11-29 2020-05-19 Beijing Shenlan Changsheng Technology Co., Ltd. Shooting detection method, device, equipment, system and storage medium

Also Published As

Publication number Publication date
CN112437233A (en) 2021-03-02

Similar Documents

Publication Publication Date Title
CN111163259A (en) Image capturing method, monitoring camera and monitoring system
JPWO2018198373A1 (en) Video surveillance system
US9451178B2 (en) Automatic insertion of video into a photo story
US10360481B2 (en) Unconstrained event monitoring via a network of drones
EP3120358A1 (en) Automatically curating video to fit display time
WO2020094088A1 (en) Image capturing method, monitoring camera, and monitoring system
GB2414614A (en) Image processing to determine most dissimilar images
CN109410278B (en) Target positioning method, device and system
JP2016129347A (en) Method for automatically determining probability of image capture with terminal using contextual data
US9842258B2 (en) System and method for video preview
US20150341559A1 (en) Thumbnail Editing
WO2021068553A1 (en) Monitoring method, apparatus and device
CN109922311B (en) Monitoring method, device, terminal and storage medium based on audio and video linkage
CN112437233B (en) Video generation method, video processing device and camera equipment
CN111586432B (en) Method and device for determining air-broadcast live broadcast room, server and storage medium
CN110717071B (en) Image clipping method, image clipping device, computer device, and storage medium
CN110266953B (en) Image processing method, image processing apparatus, server, and storage medium
CN106886746B (en) Identification method and back-end server
CN108540817B (en) Video data processing method, device, server and computer readable storage medium
CN107092636A (en) The retrieval device and method of CCTV images
CN104170367A (en) Virtual shutter image capture
CN112468735B (en) Video processing system and video processing method
CN112287771A (en) Method, apparatus, server and medium for detecting video event
CN109640022A (en) Video recording method, device, network shooting device and storage medium
CN113259734B (en) Intelligent broadcasting guide method, device, terminal and storage medium for interactive scene

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant