CN111770359A - Event video clipping method, system and computer readable storage medium - Google Patents


Info

Publication number
CN111770359A
CN111770359A
Authority
CN
China
Prior art keywords
video
event
playback
playback segment
track
Prior art date
Legal status
Granted
Application number
CN202010493124.3A
Other languages
Chinese (zh)
Other versions
CN111770359B (en)
Inventor
赵筠
尹东芹
吴双龙
Current Assignee
Suning Cloud Computing Co Ltd
Original Assignee
Suning Cloud Computing Co Ltd
Priority date
Filing date
Publication date
Application filed by Suning Cloud Computing Co Ltd
Priority to CN202010493124.3A
Publication of CN111770359A
Application granted
Publication of CN111770359B
Status: Active

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/2187Live feed
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233Processing of audio elementary streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8547Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

The invention discloses an event video clipping method, system and computer readable storage medium, wherein the method comprises the following steps: separating and correspondingly storing the video track and the audio track of an event video to be processed; identifying playback segments and non-playback segments in the video track; analyzing the audio track corresponding to a non-playback segment to obtain the commentary speech end point, and intercepting the non-playback segment according to the commentary speech end point to obtain non-playback segment material; filtering the playback special-effect frames of a playback segment to obtain playback segment material; intercepting the audio track according to the playback segment material and the non-playback segment material to obtain the corresponding audio material; and merging the playback segment material and the non-playback segment material into a material video, then synthesizing the audio material onto the material video to obtain the target clipped video. The invention supports multiple event types, automatically extracts and retains important playback segments while accurately shortening the duration, and can quickly produce large numbers of highlight collections of different dimensions with high accuracy.

Description

Event video clipping method, system and computer readable storage medium
Technical Field
The invention relates to the field of video data editing and processing, in particular to an event video clipping method, system and computer readable storage medium.
Background
In traditional licensed-event operation, media outlets clip highlight segments or highlight collections during a live broadcast so that large audiences can quickly browse and share them. This process usually relies on extensive manual video editing and has the following problems: 1. Poor timeliness: a full-match highlight video often requires an editor to spend a great deal of time rewatching the match to find highlight segments and repeatedly experimenting to position them precisely; the tedious, inefficient production process means the video is often published long after the match ends, hurting the user's viewing experience. 2. Low content output: limited by operating resources, content production can only guarantee key events, output for non-headline events is particularly affected, and the number of videos produced for a single event is limited.
As sports events become ever more plentiful, manual editing can no longer meet the demand for professional, fast clipping and output across a large number of matches. Automatic video data screening methods and devices have begun to appear, but the videos they currently produce are of poor accuracy.
Disclosure of Invention
The invention aims to provide an event video clipping method, system and computer readable storage medium that enable clipped videos to be produced quickly and accurately.
The technical scheme of the invention is as follows:
in a first aspect, a method for editing a video of an event is provided, the method comprising:
separating and correspondingly storing a video track and an audio track of the event video to be processed;
identifying playback segments and non-playback segments in the video track;
analyzing the audio track corresponding to the non-playback segment to obtain the commentary speech end point, and intercepting the non-playback segment according to the commentary speech end point to obtain non-playback segment material;
filtering the playback special effect frame of the playback segment to obtain playback segment materials;
intercepting the audio track according to the playback segment material and the non-playback segment material to obtain a corresponding audio material;
and combining the playback segment material and the non-playback segment material to obtain a material video, and synthesizing the audio material to the material video to obtain a target clip video.
Further, the identifying playback segments and non-playback segments in the video track specifically includes:
identifying the event logo picture in the video frame of the video track by using a ResNet50 neural network to classify the playback special effect frame, and recording the classification result in real time;
and identifying playback segments and non-playback segments in the classification result.
Further, the analyzing to obtain the commentary speech end point in the audio track corresponding to the non-playback segment specifically includes:
setting a retained-duration range threshold T after an event occurs, according to the event type;
and analyzing the audio track with a voice endpoint detection method based on short-time energy and the short-time average zero-crossing rate to find the commentator's commentary speech end point closest to the boundary of the duration-range threshold T.
Further, the intercepting of the non-playback segment according to the commentary speech end point to obtain non-playback segment material specifically includes:
intercepting the non-playback segment with the commentary speech end point as the segment end time point to obtain the non-playback segment material.
Further, the intercepting the audio track according to the playback section material and the non-playback section material to obtain the corresponding audio material specifically includes:
and correspondingly intercepting the audio track from the initial position of the audio track according to the sum of the durations of the non-playback segment material and the playback segment material to obtain the audio material.
Further, before the separating and correspondingly storing the video track and the audio track of the event video to be processed, the method comprises the following steps:
acquiring an event video to be processed; the method specifically comprises the following steps:
acquiring N video frames in a to-be-processed event video stream and a display timestamp corresponding to each video frame;
identifying an event time identifier in each video frame picture in the N video frames, and performing first time axis matching on the event time identifier and a display time stamp to position an effective event video;
analyzing the event data corresponding to the event video stream to be processed to obtain structured data for all events, the structured data including each event's occurrence time within the match, and performing second time-axis matching between the occurrence times and the display timestamps;
acquiring a plurality of related events constituting any target event from all events, determining the start-point display timestamp and end-point display timestamp of the target event according to the start and end points of the plurality of related events, locating and extracting all video frames of the target event from the effective event video, and clipping and encoding them into the event video to be processed.
Further, the identifying the event time identifier in each video frame picture of the N video frames, and performing the first time axis matching between the event time identifier and the display timestamp to locate the effective event video specifically includes:
identifying and verifying, through an AI identification module, the event time identifier in each of the N video frame pictures using a deep-learning algorithm based on the Faster R-CNN neural network, and performing the first time-axis matching between the verified event time identifier and the display timestamps to locate the effective event video.
Further, the method further comprises: and merging the target clip video according to the occurrence time to obtain a collection video.
In a second aspect, there is provided an event video clip system, the system comprising:
the separation storage module is used for separating and storing the video track and the audio track of the event video to be processed;
the filtering module is used for filtering the playback special effect frames of the playback segments to obtain playback segment materials;
the recognition analysis module is used for recognizing the playback segments and non-playback segments in the video track, and for analyzing the audio track corresponding to a non-playback segment to obtain the commentary speech end point;
the intercepting module is used for intercepting the non-playback segment according to the commentary speech end point to obtain non-playback segment material, and for intercepting the audio track according to the playback segment material and the non-playback segment material to obtain the corresponding audio material;
and the merging module is used for merging the playback segment material and the non-playback segment material to obtain a material video, and synthesizing the audio material to the material video to obtain a target clip video.
In a third aspect, a computer-readable storage medium is provided, having stored thereon a computer program which, when being executed by a processor, carries out the steps of the method of any one of the first aspect.
The invention has the advantages that: the method supports multiple event types, can automatically extract and retain important playback segments from event videos while accurately removing playback special-effect pictures and shortening video duration, quickly produces large numbers of clipped videos from massive video resources with high accuracy, helps users quickly grasp the outline and essence of an event, and provides useful support for the clipping, production, sharing and spreading of increasingly popular short videos.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings required to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
FIG. 1 is a flow chart illustrating a method for editing a video clip of an event according to an embodiment of the present invention;
FIG. 2 is a block diagram of a soccer game highlights editing system according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating transition effects achieved by a video editing method for an event according to an embodiment of the present invention;
FIG. 4 is an exemplary diagram of goal coordinates and range in an event video clipping method according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments that can be derived from the embodiments given herein by a person of ordinary skill in the art are intended to be within the scope of the present disclosure.
In this application, ResNet, also known as a residual neural network, adds the idea of residual learning to a traditional convolutional neural network, alleviating the gradient vanishing and accuracy degradation (on the training set) that occur in deep networks; the network can therefore be made much deeper while accuracy is preserved and speed remains under control.
BlockID: and (5) identifying the fragments.
CDN: abbreviation of Content Delivery Network, an intelligent virtual network layered on top of the existing internet and formed by placing node servers throughout the network; it avoids, as far as possible, the bottlenecks and links on the internet that may affect data transmission speed and stability, making content delivery faster and more stable.
VAR: the abbreviation of Video assistance Referee refers to that an active Referee provides information to the Referee by playing back videos, and assists the Referee to correct mistakes and omissions clearly and obviously changing the competition trend, so that the accuracy of the judgment is improved.
An event video clip must keep each event intact, and the video duration must be compressed on the premise of that integrity. To address the inefficiency and cost of manual clipping and the low accuracy of automatic clipping (the duration is not effectively compressed, or inaccurate cuts leave the event incomplete), this application provides an event video clipping method that performs automatic fine clipping, compresses the display duration of each single event, highlights the key moments, and effectively shortens the overall video duration.
Example 1: an event video clipping method, as shown in fig. 1, includes:
101. separating and correspondingly storing a video track and an audio track of the event video to be processed;
specifically, a video track and an audio track of the event video to be processed are separated and correspondingly stored, and a picture sequence and a corresponding storage of the audio sequence are established.
Prior to this step, the method further comprises: the method for acquiring the event video to be processed specifically comprises the following steps:
101-1, acquiring N video frames in a video stream of the event to be processed and a display time stamp corresponding to each video frame;
the display time stamp is mainly used for measuring when the decoded video frame is displayed, i.e. for marking the display time point of each frame in the manufactured target video.
The event video stream to be processed is a live event video stream provided by an event data provider or an on-demand event video stream downloaded through an on-demand video playing address, and the specific type of the event video stream is not limited in this embodiment.
N is at least 4. Take a soccer game as an example:
When the event video stream is a live stream, the video frames are acquired through the CDN: the CDN intercepts N video frames from the live stream at a preset fixed frequency and extracts the BlockID of the TS fragment containing each frame together with the corresponding display timestamp. After the first or second half of a football match starts, the system acquires from the CDN the first 6 video frames after each restart of the stream to be processed, the BlockID of the corresponding TS fragment, and the display timestamp of each frame.
When the event video stream is an on-demand stream, the on-demand file is read directly by the AI identification module. Since the total duration of the file is fixed, relative time points in the first and second halves are selected at random, frames are extracted at those time points, and the corresponding display timestamps are obtained by decoding. Here N is 4, i.e. 2 frames each from the first and second half; if identification fails, 2 further frames are extracted at random. If there is extra time, frames for its first and second halves must be extracted by the same method.
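The random frame sampling for on-demand files might look like the following sketch, which assumes each half occupies roughly half of the total duration (the embodiment only says relative time points are chosen at random):

```python
import random

def sample_timepoints(total_duration: float, per_half: int = 2, seed=None):
    """Pick random frame-extraction times in the first and second half of
    an on-demand file, approximating each half as 50% of the duration."""
    rng = random.Random(seed)
    half = total_duration / 2
    first = sorted(rng.uniform(0.0, half) for _ in range(per_half))
    second = sorted(rng.uniform(half, total_duration) for _ in range(per_half))
    return first + second  # N = 2 * per_half sampling times, in order
```

On an identification failure the caller would simply call this again to draw 2 more frames per half.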
101-2, identifying the event time identifier in each video frame picture in the N video frames, and carrying out first time axis matching on the event time identifier and the display time stamp to position an effective event video.
Specifically, an AI identification module identifies the event time identifier in each of the N video frame pictures using a deep-learning algorithm. In this embodiment, the event time identifier is preferably obtained by having the AI identification module recognize the match time displayed with the score in each video frame using a deep-learning algorithm based on the Faster R-CNN neural network, and then verifying the recognized time.
Because the video contains footage from before the match starts, the event time identifier is matched against the display timestamps on a first time axis to locate the match portion, i.e. the effective event video; determining the match start point from the effective event time improves positioning precision.
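A minimal sketch of this first time-axis matching, assuming the recognized match clock is an `MM:SS` string and display timestamps are in seconds (both representations are assumptions of this sketch):

```python
def clock_to_seconds(clock: str) -> int:
    """'12:34' on the on-screen match clock -> seconds of match time."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)

def match_start_offset(samples) -> float:
    """samples: (recognized_clock, display_timestamp_s) pairs from the N
    frames. Each pair implies where the match start sits on the stream's
    time axis; the median is robust to one misrecognized frame."""
    offsets = sorted(pts - clock_to_seconds(clock) for clock, pts in samples)
    return offsets[len(offsets) // 2]

def locate_match_time(clock: str, start_offset: float) -> float:
    """Map a match-clock time to a display timestamp in the stream."""
    return start_offset + clock_to_seconds(clock)
```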
101-3, analyzing the event data corresponding to the to-be-processed event video stream to obtain the structured data of all events, wherein the structured data comprises the occurrence time of the events in the event, and performing second time axis matching on the occurrence time and the display time stamp of the events;
specifically, when the event video stream is a live event video stream provided by an event data provider, the event data is obtained by feedback from the event data provider or by query;
and when the event video stream is the on-demand event video stream downloaded through the on-demand video playing address, the event data is obtained by directly inquiring the detailed event data of the historical events.
Since a match may be interrupted or extended, the occurrence time of each event within the match must be extracted and matched against the display timestamps on a second time axis, aligning occurrence time with display time.
101-4, acquiring a plurality of associated events forming any target event from all events, determining a starting point display time stamp and an end point display time stamp of the target event according to the starting points and the end points of the plurality of associated events, positioning and extracting all video frames of the target event in the effective event video, and clipping and pressing the video frames into an event video to be processed.
Specifically, a core event is determined, and its related events are obtained through a preset association rule. For example, when the core event is a goal, the related events include the passing, dribbling and centre-circle kick-off events that may occur before and after the goal. The core event and its related events together constitute the associated events.
In this embodiment, the preset association rule is not specifically limited.
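Once an association rule has produced the related events, the clip window of the target event could be derived as below; the tuple representation of an event is a hypothetical simplification, not the patent's data model:

```python
def target_event_window(related_events):
    """related_events: (start_pts, end_pts) pairs, one per associated
    event (the core event plus its related events). The clip for the
    target event spans the earliest start to the latest end."""
    starts, ends = zip(*related_events)
    return min(starts), max(ends)
```

All video frames between the returned start-point and end-point display timestamps would then be extracted from the effective event video.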
After the event video to be processed is obtained, it can be screened according to preset weighting rules. For example, when the event is a football match, the preset weighting rules at least include: the start of the first half and the end of the second half, plus at least one of a goal, a Highlight event supplied by the event data provider, a VAR review, a threatening shot, a red or yellow card, or a penalty. For a football match without a goal, the preset weighting rules include the start of the first half and the end of the second half plus at least one threatening shot, and the degree of threat must be defined. See the diagram of goal coordinates and range in Fig. 4: if, after a shot, the ball's trajectory falls within the circular "Close" area near the outer frame of the goal, the shot is judged threatening; other trajectories far from the goal are not screened and recorded.
It should be noted that the preset weighting rule is set according to the event content, the existing editing habit and experience, and the preset weighting rule is not specifically limited in this embodiment.
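The "Close"-area test of Fig. 4 reduces to a point-in-circle check; the coordinate system and radius units here are illustrative assumptions:

```python
import math

def is_threatening_shot(ball_xy, goal_center_xy, close_radius):
    """Judge a shot threatening when the ball trajectory ends inside the
    circular 'Close' area around the goal frame, as in Fig. 4."""
    dx = ball_xy[0] - goal_center_xy[0]
    dy = ball_xy[1] - goal_center_xy[1]
    return math.hypot(dx, dy) <= close_radius
```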
102. Identifying playback segments and non-playback segments in the video track; the method specifically comprises the following steps:
identifying the event logo picture in the video frame of the video track by using a ResNet50 neural network to classify the playback special effect frame, and recording the classification result in real time;
and identifying playback segments and non-playback segments in the classification result.
More specifically, the classification results recorded in real time are analyzed: if the inter-frame distance between playback special-effect frames is smaller than a preset value, they are judged to belong to the same special effect, and the video content between two such special effects is judged to be a playback segment, thereby identifying the playback segments.
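That grouping logic might be sketched as follows, with the frame indices and the `max_gap` threshold as hypothetical inputs (the preset value is not given by the patent):

```python
def find_playback_segments(effect_frames, max_gap):
    """effect_frames: sorted frame indices classified as playback special
    effects. Frames closer than max_gap belong to the same effect burst;
    the frames between two consecutive bursts form one playback segment."""
    bursts, current = [], [effect_frames[0]]
    for f in effect_frames[1:]:
        if f - current[-1] < max_gap:
            current.append(f)          # same special effect
        else:
            bursts.append((current[0], current[-1]))
            current = [f]              # a new special effect begins
    bursts.append((current[0], current[-1]))
    # a playback segment lies between the end of one burst and the start
    # of the next
    return [(prev_end + 1, nxt_start - 1)
            for (_, prev_end), (nxt_start, _) in zip(bursts, bursts[1:])]
```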
103. Analyzing the audio track corresponding to the non-playback segment to obtain the commentary speech end point, and intercepting the non-playback segment according to the commentary speech end point to obtain non-playback segment material;
the analyzing to obtain the commentary speech end point in the audio track corresponding to the non-playback segment specifically includes:
setting a retained-duration range threshold T after an event occurs, according to the event type;
and analyzing the audio track with a voice endpoint detection method based on short-time energy and the short-time average zero-crossing rate to find the commentator's commentary speech end point closest to the boundary of the duration-range threshold T.
The intercepting of the non-playback segment according to the commentary speech end point to obtain non-playback segment material specifically includes:
intercepting the non-playback segment with the commentary speech end point as the segment end time point to obtain the non-playback segment material.
104. Filtering the playback special effect frame of the playback segment to obtain playback segment materials;
105. intercepting the audio track according to the playback segment material and the non-playback segment material to obtain a corresponding audio material; the method specifically comprises the following steps:
and correspondingly intercepting the audio track from the initial position of the audio track according to the sum of the durations of the non-playback segment material and the playback segment material to obtain the audio material. More specifically, the purpose of capturing the audio track to obtain the audio material is to enable the audio and video synchronization of the non-playback segment, and no limitation is imposed on the description of whether the corresponding portion of the playback segment is the image time interval in the original event video to be processed in this embodiment.
106. And combining the playback segment material and the non-playback segment material to obtain a material video, and synthesizing the audio material to the material video to obtain a target clip video.
The method further comprises the following steps: and merging the target clip video according to the occurrence time to obtain a collection video.
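The merge by occurrence time could use ffmpeg's concat demuxer; the list-file format shown is ffmpeg's, while the `(occurrence_time, path)` tuples are a hypothetical representation:

```python
def highlight_concat_list(clips):
    """clips: (occurrence_time_s, clip_path) pairs for the target clip
    videos. Order them by occurrence time and emit the text of an ffmpeg
    concat-demuxer list that merges them into one collection video."""
    ordered = sorted(clips)
    return "".join(f"file '{path}'\n" for _, path in ordered)
```

Writing the returned text to a file and running `ffmpeg -f concat -i list.txt -c copy out.mp4` would produce the collection video.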
The event video clipping method provided by this embodiment supports multiple event types, can automatically extract and retain important playback segments from event videos while accurately removing playback special-effect pictures and shortening the duration, quickly produces large numbers of clipped videos from massive video resources with high accuracy, helps users quickly grasp the outline and essence of an event, and provides useful support for the clipping, production, sharing and spreading of increasingly popular short videos.
Example 2: the present embodiment provides an event video clip system, which, as shown in fig. 2, includes:
a separation storage module 21, configured to separately store a video track and an audio track of the event video to be processed;
a filtering module 22, configured to filter the playback special effect frames of the playback segment to obtain playback segment material;
the recognition analysis module 23 is configured to recognize the playback segments and non-playback segments in the video track, and to analyze the audio track corresponding to a non-playback segment to obtain the commentary speech end point;
the intercepting module 24 is configured to intercept the non-playback segment according to the commentary speech end point to obtain non-playback segment material, and to intercept the audio track according to the playback segment material and the non-playback segment material to obtain the corresponding audio material;
and the merging module 25 is configured to merge the playback segment material and the non-playback segment material to obtain a material video, and synthesize the audio material onto the material video.
The beneficial effects of the event video clipping system provided in this embodiment, which implements the event video clipping method of Embodiment 1, are the same as those of that method and are not repeated here.
It should be noted that: in the event video clipping system provided in the above embodiment, when executing an event video clipping method, only the division of the above function modules is used for illustration, and in practical applications, the function distribution may be completed by different function modules according to needs, that is, the internal structure of the device is divided into different function modules to complete all or part of the functions described above. In addition, the event video editing system and the event video editing method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments and are not described herein again.
Embodiment 3: this embodiment provides a computer-readable storage medium having a computer program stored thereon; when executed by a processor, the computer program implements the following steps:
acquiring an event video to be processed;
separating and correspondingly storing the video track and the audio track of the event video to be processed;
identifying playback segments and non-playback segments in the video track;
analyzing the audio track corresponding to the non-playback segment to obtain a commentary speech endpoint, and intercepting the non-playback segment according to the commentary speech endpoint to obtain non-playback segment material;
filtering the playback special effect frames out of the playback segment to obtain playback segment material;
intercepting the audio track according to the playback segment material and the non-playback segment material to obtain the corresponding audio material;
merging the playback segment material and the non-playback segment material to obtain a material video, and synthesizing the audio material onto the material video to obtain a target clip video;
and merging the target clip videos according to their occurrence times to obtain a highlight video.
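The steps above amount to interval arithmetic on the source timeline: trim the non-playback segment at the commentary speech endpoint (the "explication voice tail point" of the machine translation), collect the materials, and cut a matching length of audio from the start of the audio track. A minimal sketch of that logic, with illustrative segment times and helper names not taken from the patent:

```python
def clip_event(non_playback, playback, speech_endpoint):
    """Trim the non-playback segment at the commentary speech endpoint,
    then return the material intervals and their total duration (the
    length of audio to cut from the start of the audio track).

    Segments are (start, end) tuples in seconds on the source timeline.
    """
    np_start, np_end = non_playback
    # The speech endpoint becomes the segment's end time point.
    trimmed = (np_start, min(np_end, speech_endpoint))
    materials = [trimmed, playback]
    total = sum(end - start for start, end in materials)
    return materials, total

# Example: a 30 s non-playback run whose commentary ends at 26.5 s,
# followed by a 12 s playback segment.
materials, audio_len = clip_event((0.0, 30.0), (30.0, 42.0), 26.5)
# materials -> [(0.0, 26.5), (30.0, 42.0)]; audio_len -> 38.5
```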
The computer-readable storage medium provided in this embodiment executes the steps of the event video clipping method of Embodiment 1, so its beneficial effects are the same as those of that method and are not repeated here.
It will be understood by those skilled in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware, and the program may be stored in a computer-readable storage medium; the storage medium may be, but is not limited to, a read-only memory, a magnetic disk, an optical disk, and the like.
It should be understood that the above embodiments merely illustrate the technical concepts and features of the present invention; they are intended to enable those skilled in the art to understand and implement the invention, not to limit its scope. All modifications made according to the spirit of the main technical scheme of the invention fall within its protection scope.

Claims (10)

1. A method for video editing of an event, the method comprising:
separating and correspondingly storing a video track and an audio track of the event video to be processed;
identifying playback segments and non-playback segments in the video track;
analyzing the audio track corresponding to the non-playback segment to obtain a commentary speech endpoint, and intercepting the non-playback segment according to the commentary speech endpoint to obtain non-playback segment material;
filtering the playback special effect frames out of the playback segment to obtain playback segment material;
intercepting the audio track according to the playback segment material and the non-playback segment material to obtain the corresponding audio material;
and merging the playback segment material and the non-playback segment material to obtain a material video, and synthesizing the audio material onto the material video to obtain a target clip video.
2. The event video clipping method of claim 1, wherein said identifying playback segments and non-playback segments in the video track specifically comprises:
using a ResNet50 neural network to identify the event logo picture in each video frame of the video track, thereby classifying playback special effect frames, and recording the classification results in real time;
and identifying the playback segments and non-playback segments from the classification results.
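The second half of this claim turns per-frame classification results into segment boundaries. A minimal sketch of that grouping step (the per-frame labels would come from the ResNet50 classifier the claim describes; the label names here are assumptions):

```python
def group_segments(frame_labels):
    """Run-length encode per-frame labels (e.g. 'playback' / 'live')
    into (label, start_frame, end_frame_exclusive) segments."""
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # Close the current run at the end of the list or on a label change.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((frame_labels[start], start, i))
            start = i
    return segments

labels = ['live'] * 3 + ['playback'] * 2 + ['live'] * 2
# group_segments(labels) -> [('live', 0, 3), ('playback', 3, 5), ('live', 5, 7)]
```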
3. The event video clipping method according to claim 1, wherein said analyzing the audio track corresponding to the non-playback segment to obtain the commentary speech endpoint specifically comprises:
setting, according to the event type, a threshold T on the range of duration retained after an event occurs;
and detecting, in the audio track, the speech endpoint of the commentator closest to the boundary of the duration threshold T, using a voice endpoint detection method based on short-time energy and short-time average zero-crossing rate.
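A short-time-energy / zero-crossing-rate endpoint detector of the kind this claim names can be sketched as follows. The frame length and thresholds are illustrative values, not parameters from the patent:

```python
import numpy as np

def speech_endpoint(samples, sr, frame_len=160,
                    energy_thresh=0.01, zcr_thresh=0.45):
    """Return the end time (s) of the last speech-like frame.

    A frame counts as speech when its short-time energy is high and its
    short-time average zero-crossing rate is low (voiced speech carries
    energy but crosses zero relatively slowly). Thresholds are
    illustrative, not tuned values from the patent.
    """
    n_frames = len(samples) // frame_len
    last_speech = -1
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = float(np.mean(frame ** 2))                    # short-time energy
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))  # zero-crossing rate
        if energy > energy_thresh and zcr < zcr_thresh:
            last_speech = i
    return None if last_speech < 0 else (last_speech + 1) * frame_len / sr

# Synthetic check: 0.1 s silence, 0.2 s of a 200 Hz tone, 0.1 s silence.
sr = 16000
t = np.arange(int(0.2 * sr)) / sr
clip = np.concatenate([np.zeros(int(0.1 * sr)),
                       0.5 * np.sin(2 * np.pi * 200 * t),
                       np.zeros(int(0.1 * sr))])
end = speech_endpoint(clip, sr)   # ~0.3 s, where the tone ends
```

In a production detector the endpoint nearest the threshold-T boundary would be selected among all detected speech runs; this sketch simply returns the last one.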
4. The event video clipping method according to claim 1, wherein said intercepting the non-playback segment according to the commentary speech endpoint to obtain non-playback segment material specifically comprises:
intercepting the non-playback segment with the commentary speech endpoint as the segment end time point to obtain the non-playback segment material.
5. The event video clipping method according to claim 1, wherein said intercepting the audio track according to the playback segment material and the non-playback segment material to obtain the corresponding audio material specifically comprises:
intercepting the audio track from its initial position for the sum of the durations of the non-playback segment material and the playback segment material to obtain the audio material.
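Concretely, this claim takes as many audio samples from the start of the track as the summed duration of all video materials. A sketch with numpy (the sample rate and track length are illustrative):

```python
import numpy as np

def cut_audio(audio, sr, material_durations):
    """Take, from the initial position of the audio track, a slice whose
    length equals the summed duration of the video materials."""
    total = sum(material_durations)            # seconds of material video
    return audio[:int(round(total * sr))]      # matching number of samples

sr = 8000
track = np.zeros(10 * sr)                      # a 10 s audio track
material = cut_audio(track, sr, [2.5, 1.5])    # 4 s of materials in total
# len(material) -> 32000 samples (4 s at 8 kHz)
```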
6. The event video clipping method according to claim 1, further comprising, before said separating and correspondingly storing the video track and the audio track of the event video to be processed:
acquiring the event video to be processed, which specifically comprises:
acquiring N video frames in an event video stream to be processed and the display timestamp corresponding to each video frame;
identifying the event time identifier in each of the N video frame pictures, and performing first time axis matching between the event time identifiers and the display timestamps to locate a valid event video;
parsing the event data corresponding to the event video stream to be processed to obtain structured data of all events, the structured data comprising the occurrence time of each event, and performing second time axis matching between the occurrence times and the display timestamps;
and acquiring, from all events, the several related events that constitute any target event, determining the start-point and end-point display timestamps of the target event from the start points and end points of those related events, locating and extracting all video frames of the target event in the valid event video, and clipping and encoding these video frames into the event video to be processed.
7. The event video clipping method according to claim 6, wherein said identifying the event time identifier in each of the N video frame pictures and performing first time axis matching between the event time identifiers and the display timestamps to locate a valid event video specifically comprises:
identifying and verifying, via an AI identification module, the event time identifier in each of the N video frame pictures using a deep learning algorithm based on a Faster R-CNN neural network, and performing first time axis matching between the verified event time identifiers and the display timestamps to locate the valid event video.
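Claims 6 and 7 first map the on-screen event clock (as read by the Faster R-CNN recognizer) to each frame's display timestamp, then locate events by their occurrence time on that mapping. A sketch of the matching step, assuming the clock readings are already available as seconds (all names here are illustrative):

```python
def locate_event(clock_to_pts, occurrence_clock):
    """clock_to_pts: (game_clock_seconds, display_timestamp) pairs from
    the first time-axis matching. Returns the display timestamp of the
    frame whose clock reading is closest to the event's occurrence time
    (the second time-axis matching)."""
    return min(clock_to_pts, key=lambda p: abs(p[0] - occurrence_clock))[1]

# e.g. the 10th match minute appears at stream timestamp 1200.0 s
pairs = [(600, 1200.0), (601, 1201.0), (602, 1202.0)]
ts = locate_event(pairs, 601)   # -> 1201.0
```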
8. The event video clipping method according to claim 1, further comprising: merging the target clip videos according to their occurrence times to obtain a highlight video.
9. An event video clipping system, the system comprising:
the separation storage module is used for separating and storing the video track and the audio track of the event video to be processed;
the filtering module is used for filtering the playback special effect frames of the playback segments to obtain playback segment materials;
the recognition and analysis module is used for identifying the playback segments and non-playback segments in the video track, and for analyzing the audio track corresponding to the non-playback segment to obtain a commentary speech endpoint;
the intercepting module is used for intercepting the non-playback segment according to the commentary speech endpoint to obtain non-playback segment material, and for intercepting the audio track according to the playback segment material and the non-playback segment material to obtain the corresponding audio material;
and the merging module is used for merging the playback segment material and the non-playback segment material to obtain a material video, and synthesizing the audio material to the material video to obtain a target clip video.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
CN202010493124.3A 2020-06-03 2020-06-03 Event video clipping method, system and computer readable storage medium Active CN111770359B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010493124.3A CN111770359B (en) 2020-06-03 2020-06-03 Event video clipping method, system and computer readable storage medium


Publications (2)

Publication Number Publication Date
CN111770359A true CN111770359A (en) 2020-10-13
CN111770359B CN111770359B (en) 2022-10-11

Family

ID=72720600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010493124.3A Active CN111770359B (en) 2020-06-03 2020-06-03 Event video clipping method, system and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111770359B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112839236A (en) * 2020-12-31 2021-05-25 北京达佳互联信息技术有限公司 Video processing method, device, server and storage medium
CN113515997A (en) * 2020-12-28 2021-10-19 腾讯科技(深圳)有限公司 Video data processing method and device and readable storage medium
CN114697722A (en) * 2022-04-01 2022-07-01 湖南快乐阳光互动娱乐传媒有限公司 Video playing method and device, electronic equipment and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101599179A (en) * 2009-07-17 2009-12-09 北京邮电大学 Method for automatically generating field motion wonderful scene highlights
US20120017153A1 (en) * 2010-07-15 2012-01-19 Ken Matsuda Dynamic video editing
US20160037217A1 (en) * 2014-02-18 2016-02-04 Vidangel, Inc. Curating Filters for Audiovisual Content
WO2017177859A1 (en) * 2016-04-11 2017-10-19 腾讯科技(深圳)有限公司 Video playing method and device, and computer readable storage medium
CN109194978A (en) * 2018-10-15 2019-01-11 广州虎牙信息科技有限公司 Live video clipping method, device and electronic equipment
CN109862388A (en) * 2019-04-02 2019-06-07 网宿科技股份有限公司 Generation method, device, server and the storage medium of the live video collection of choice specimens
CN110012348A (en) * 2019-06-04 2019-07-12 成都索贝数码科技股份有限公司 A kind of automatic collection of choice specimens system and method for race program
CN110087097A (en) * 2019-06-05 2019-08-02 西安邮电大学 It is a kind of that invalid video clipping method is automatically removed based on fujinon electronic video endoscope
CN110188241A (en) * 2019-06-04 2019-08-30 成都索贝数码科技股份有限公司 A kind of race intelligence manufacturing system and production method
CN110572722A (en) * 2019-09-26 2019-12-13 腾讯科技(深圳)有限公司 Video clipping method, device, equipment and readable storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SAWITCHAYA TIPPAYA; SUCHADA SITJONGSATAPORN; TELE TAN; MASOOD ME: "Multi-Modal Visual Features-Based Video Shot Boundary Detection", IEEE Access *
XU Haoran: "Object-based spatio-temporal domain compressed video summarization technology", Digital Technology and Application *


Also Published As

Publication number Publication date
CN111770359B (en) 2022-10-11

Similar Documents

Publication Publication Date Title
CN111770359B (en) Event video clipping method, system and computer readable storage medium
CN112565825B (en) Video data processing method, device, equipment and medium
CN106162223B (en) News video segmentation method and device
Hanjalic Adaptive extraction of highlights from a sport video based on excitement modeling
CN110381366B (en) Automatic event reporting method, system, server and storage medium
US8195038B2 (en) Brief and high-interest video summary generation
CN111757147B (en) Method, device and system for event video structuring
US8068678B2 (en) Electronic apparatus and image processing method
US9426411B2 (en) Method and apparatus for generating summarized information, and server for the same
CN102595206B (en) Data synchronization method and device based on sport event video
CN103165151A (en) Method and device for playing multi-media file
KR20200023013A (en) Video Service device for supporting search of video clip and Method thereof
CN104320670A (en) Summary information extracting method and system for network video
Merler et al. Automatic curation of golf highlights using multimodal excitement features
CN110881131B (en) Classification method of live review videos and related device thereof
CN112699787A (en) Method and device for detecting advertisement insertion time point
Kapela et al. Real-time event detection in field sport videos
CN107369450B (en) Recording method and recording apparatus
CN114845149A (en) Editing method of video clip, video recommendation method, device, equipment and medium
CN112287771A (en) Method, apparatus, server and medium for detecting video event
Tsao et al. Thumbnail image selection for VOD services
CN114782879B (en) Video identification method and device, computer equipment and storage medium
CN115022663A (en) Live stream processing method and device, electronic equipment and medium
CN115080792A (en) Video association method and device, electronic equipment and storage medium
CN114866788A (en) Video processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant