CN107222795A - Multi-feature fusion video abstract generation method - Google Patents

Multi-feature fusion video abstract generation method Download PDF

Info

Publication number
CN107222795A
Authority
CN
China
Prior art keywords
video
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710486660.9A
Other languages
Chinese (zh)
Other versions
CN107222795B (en)
Inventor
李泽超
唐金辉
胡铜铃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology filed Critical Nanjing University of Science and Technology
Priority to CN201710486660.9A priority Critical patent/CN107222795B/en
Publication of CN107222795A publication Critical patent/CN107222795A/en
Application granted granted Critical
Publication of CN107222795B publication Critical patent/CN107222795B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8549Creating video summaries, e.g. movie trailer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip


Abstract

The invention provides a multi-feature fusion video abstract generation method comprising the following steps: obtaining a video and using it as input data; segmenting the input video data into fragments and recording the cut points and the number of video segments; extracting the video frames and frame central blocks in each video segment; calculating features and image quality for the extracted video frames and frame central blocks respectively; calculating global importance and local importance from the obtained features; fusing the global importance and local importance of each frame to obtain a fused importance; calculating the importance of each video segment according to the cut points; selecting video segments according to the importance of each segment and a preset threshold, so that an optimized subset of video segments is selected; and synthesizing the video abstract from the selected subset of video segments.

Description

Multi-feature fusion video abstract generation method
Technical field
The present invention relates to video analysis and image processing techniques, and in particular to a multi-feature fusion video abstract generation method.
Background technology
The rapid development of Internet technology and smart devices has made the ways people obtain and browse videos increasingly diverse, and the amount of video data they face keeps growing. Faced with such massive amounts of video data, how to find the video data or visual information we need is a current research hotspot and a core topic of video analysis technology. Research on massive video data still lacks adequate methods for analyzing, processing and storing the data, so users search for useful video data blindly. Moreover, most current video summarization results are unsatisfactory, because many methods generate static video abstracts, which are inconvenient to browse and do little to help users grasp the video content. It is therefore necessary to apply data mining and image processing to video data to obtain a practical multi-feature fusion video abstract generation method based on global importance and local importance.
The content of the invention
It is an object of the invention to provide a multi-feature fusion video abstract generation method based on global importance and local importance, comprising the following steps:
Step 1, obtain a video and use the video as input data;
Step 2, segment the input video data into fragments, recording the cut points and the number of video segments;
Step 3, extract the video frames and frame central blocks in each video segment;
Step 4, calculate features and image quality for the extracted video frames and frame central blocks respectively;
Step 5, calculate global importance and local importance from the obtained features;
Step 6, fuse the global importance and local importance of each frame to obtain a fused importance;
Step 7, calculate the importance of each video segment according to the cut points;
Step 8, select video segments according to the importance of each video segment and a preset threshold, selecting an optimized subset of video segments;
Step 9, synthesize the video abstract from the selected subset of video segments.
The present invention makes use of video data obtained by users from many sources, including smart devices and the Internet, so that the acquired data covers as many kinds of video on the network as possible. The invention can quickly produce the video abstract the user wants without any training, saving the user a great deal of time and effort. In addition, the invention detects whether the video contains audio information and dynamically extracts the audio information into the video abstract. When presenting the summary results to the user, the invention applies video analysis and image processing to condense the original video into a video abstract, enabling users to quickly obtain the condensed video and considerably improving the user experience.
The present invention is described further below with reference to the accompanying drawings.
Brief description of the drawings
Fig. 1 is a flow chart of the multi-feature fusion video abstract generation method based on global importance and local importance according to the present invention.
Fig. 2 is a schematic diagram of original video frames extracted from an original video by the present invention.
Fig. 3 is a schematic diagram showing how the present invention first divides an extracted video frame into 5x5 blocks and then extracts the central 3x3 block for calculating local importance.
Fig. 4 is a screenshot of a demonstration of the multi-feature fusion video abstract generation system based on global importance and local importance according to the present invention.
Embodiment
With reference to Fig. 1, a multi-feature fusion video abstract generation method based on global importance and local importance comprises the following steps:
Step 1, obtain a video and use the video as input data;
Step 2, process the input video data to obtain the individual cut points and the number of video segments;
Step 3, extract the video frames and frame central blocks in each video segment;
Step 4, calculate features and image quality for the extracted video frames and frame central blocks respectively;
Step 5, calculate global importance and local importance from the obtained features;
Step 6, fuse the global importance and local importance of each frame to obtain the final fused importance;
Step 7, calculate the importance of each video segment according to the cut points;
Step 8, select video segments according to the importance of each video segment and a preset threshold, selecting an optimized subset of video segments;
Step 9, synthesize the video abstract from the selected subset of video segments.
The video data in step 1 can be obtained from the Internet and from various smart devices; websites for obtaining videos include http://www.youku.com/ and http://www.iqiyi.com/, and smart devices for obtaining videos include smart phones, tablets and the like.
In step 2, the obtained video data is taken as the input video and segmented into fragments: the superframe cut method, combining the foreground, background and motion information of the video, splits the video into small video segments, yielding the cut points and the number of video segments. The cut points and the segment count are saved for later calculation.
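A minimal sketch of this segmentation step is shown below. It is a simplified stand-in for the superframe cut: where the patent combines foreground, background and motion information, this sketch marks a cut whenever the mean gray-level difference between consecutive frames exceeds a threshold, and the threshold value is an assumption, not a value from the patent.

```python
import cv2
import numpy as np

def segment_video(path, diff_threshold=30.0):
    """Split a video into fragments at large inter-frame changes.

    Simplified stand-in for the superframe cut of step 2: only the
    mean absolute gray-level difference between consecutive frames
    marks a cut point; diff_threshold is an assumed tuning value.
    """
    cap = cv2.VideoCapture(path)
    cut_points, prev, idx = [0], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None and np.mean(cv2.absdiff(gray, prev)) > diff_threshold:
            cut_points.append(idx)          # record a cut point
        prev, idx = gray, idx + 1
    cap.release()
    cut_points.append(idx - 1)              # last frame closes the final segment
    return cut_points, len(cut_points) - 1  # cut points and segment count
```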
In step 3, the video frames and frame central blocks are extracted. The video frames are extracted by a conventional method, but extracting the frame central block requires first partitioning the frame: so that the visual content is effectively preserved, the video frame is divided into 5x5 blocks and the central 3x3 block is extracted for calculating local importance.
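Extracting the central block then reduces to array slicing; the sketch below keeps the middle three fifths of each dimension, which is exactly the central 3x3 region of the 5x5 grid:

```python
def central_block(frame):
    """Crop the central 3x3 region of a 5x5 grid laid over the frame,
    as described in step 3; used for the local-importance features."""
    h, w = frame.shape[:2]
    y0, y1 = h // 5, 4 * h // 5   # rows 1..3 of the 5-row grid
    x0, x1 = w // 5, 4 * w // 5   # columns 1..3 of the 5-column grid
    return frame[y0:y1, x0:x1]
```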
In step 4, image features and image quality are calculated for the extracted video frames and frame central blocks. The calculated features include visual saliency, exposure, saturation, chroma, the rule of thirds, contrast and directionality; in addition, the image quality of the video frame and of the frame central block must be calculated. The calculation formula of visual saliency is:
$f_1 = F_A(A_S, A_T) = \frac{1}{2}\left[(A_S + A_T) + \frac{1}{1+\gamma}\,|A_S - A_T|\right]$  (1)
where $A_S$ is the static saliency, $A_T$ is the temporal saliency, $\gamma$ is a non-negative empirical parameter, and $F_A$ is merely a function name denoting the fusion of the two kinds of visual saliency;
The calculation formula of exposure is:
$f_2 = \frac{1}{XY}\sum_{x_v=0}^{X-1}\sum_{y_v=0}^{Y-1} I_V(x_v, y_v)$  (2)
where X and Y are respectively the length and width of the HSV image converted from the extracted video frame, $x_v$, $y_v$ are pixel positions in channel V, and $I_V(x_v, y_v)$ is the V channel of the HSV image.
The calculation formula of chroma is:
$f_3 = \frac{1}{XY}\sum_{x_s=0}^{X-1}\sum_{y_s=0}^{Y-1} I_S(x_s, y_s)$  (3)
where $x_s$, $y_s$ are pixel positions in channel S and $I_S(x_s, y_s)$ is the S channel of the HSV image.
The calculation formula of saturation is:
$f_4 = \frac{1}{XY}\sum_{x_h=0}^{X-1}\sum_{y_h=0}^{Y-1} I_H(x_h, y_h)$  (4)
where $x_h$, $y_h$ are pixel positions in channel H and $I_H(x_h, y_h)$ is the H channel of the HSV image.
The calculation formulas of the rule of thirds are:
$f_5 = \frac{9}{XY}\sum_{x_h=X/3}^{2X/3}\sum_{y_h=Y/3}^{2Y/3} I_H(x_h, y_h)$  (5)
$f_6 = \frac{9}{XY}\sum_{x_s=X/3}^{2X/3}\sum_{y_s=Y/3}^{2Y/3} I_S(x_s, y_s)$  (6)
$f_7 = \frac{9}{XY}\sum_{x_v=X/3}^{2X/3}\sum_{y_v=Y/3}^{2Y/3} I_V(x_v, y_v)$  (7)
where $I_H$, $I_S$ and $I_V$ are the three channels of the HSV image and each sum runs over the central third of the image. $f_5$, $f_6$ and $f_7$ are the three feature values calculated according to the rule of thirds; they mainly reflect whether the main information of the image lies near the one-third positions of the image.
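Since each of equations (2) to (7) is a channel mean over the whole HSV image or over its central third, they can be computed directly, as the sketch below shows. The patent's pairing of chroma with the S channel and saturation with the H channel is kept exactly as written; the [0, 1] scaling is an assumption:

```python
import cv2
import numpy as np

def hsv_features(frame):
    """Exposure f2, chroma f3, saturation f4 and rule-of-thirds f5-f7
    per equations (2)-(7): channel means over the whole HSV image and
    over its central third."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV).astype(np.float64)
    H = hsv[..., 0] / 179.0                 # OpenCV hue range is 0..179
    S = hsv[..., 1] / 255.0
    V = hsv[..., 2] / 255.0
    rows, cols = H.shape
    cy = slice(rows // 3, 2 * rows // 3)    # central third, vertical
    cx = slice(cols // 3, 2 * cols // 3)    # central third, horizontal
    return (V.mean(),                       # f2, eq. (2)
            S.mean(),                       # f3, eq. (3)
            H.mean(),                       # f4, eq. (4)
            H[cy, cx].mean(),               # f5, eq. (5)
            S[cy, cx].mean(),               # f6, eq. (6)
            V[cy, cx].mean())               # f7, eq. (7)
```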
Contrast and directionality are calculated mainly using Tamura texture features. The Tamura image texture features comprise six features: coarseness, contrast, directionality, line-likeness, regularity and roughness; the first three of the six play a very important role in the field of image retrieval.
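The patent names the Tamura features without spelling out their formulas; the sketch below computes Tamura contrast in its usual textbook form (standard deviation divided by the fourth root of the kurtosis), which is one reasonable reading of contrast $f_8$:

```python
import numpy as np

def tamura_contrast(gray):
    """Tamura contrast of a grayscale image: sigma / alpha4^(1/4),
    where alpha4 = mu4 / sigma^4 is the kurtosis. Standard textbook
    formulation; the patent does not give its own."""
    g = gray.astype(np.float64)
    mu = g.mean()
    sigma2 = ((g - mu) ** 2).mean()         # variance
    if sigma2 == 0:
        return 0.0                          # flat image has no contrast
    alpha4 = ((g - mu) ** 4).mean() / sigma2 ** 2
    return float(np.sqrt(sigma2) / alpha4 ** 0.25)
```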
The image quality $q_{G_k}$ of the video frame and the image quality $q_{L_k}$ of the frame central block are obtained by a no-reference image quality assessment method. Image quality is mainly used to judge the quality of the extracted video frames and frame central blocks: some frames and central blocks extracted from the video may be of relatively low quality, so it must be considered whether the features computed on such distorted, blurred frames and blocks can still express the video well, because image quality has a very important effect on the generation of the video abstract.
In step 5, the global importance and the local importance of every video frame are calculated. The calculation formula of global importance is:
$I_{G_k} = q_{G_k} \cdot [f_{G_1} + f_{G_2} + (1 - f_{G_3}) + f_{G_4} + f_{G_5} + (1 - f_{G_6}) + f_{G_7} + f_{G_8} + f_{G_9}]$  (8)
where k refers to the k-th video frame, $q_{G_k}$ is the quality of the video frame, and $f_{G_1}$ to $f_{G_9}$ are the values of the nine features of step 4 computed on the whole video frame.
The calculation formula of local importance is:
$I_{L_k} = q_{L_k} \cdot [f_{L_1} + f_{L_2} + (1 - f_{L_3}) + f_{L_4} + f_{L_5} + (1 - f_{L_6}) + f_{L_7} + f_{L_8} + f_{L_9}]$  (9)
where $q_{L_k}$ is the quality of the frame central block and $f_{L_1}$ to $f_{L_9}$ are the values of the nine features computed on the frame central block.
In step 6, the fused importance of every video frame is calculated. The fused importance consists of two parts, the global importance and the local importance, and its calculation formula is:
$I_{G_k \& L_k} = I_{G_k} + I_{L_k}$  (10)
where $I_{G_k}$ and $I_{L_k}$ are respectively the global importance and the local importance of the video frame.
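Equations (8) to (10) then reduce to a quality-weighted feature sum per frame. A direct transcription might look like the following; that the nine features are normalized to [0, 1] is an assumption, since $f_3$ and $f_6$ enter as $(1 - f)$:

```python
def frame_importance(q, f):
    """Quality-weighted feature sum of equations (8)/(9): q_k times
    the nine features, with f3 and f6 entering as (1 - f)."""
    s = (f[0] + f[1] + (1 - f[2]) + f[3] + f[4]
         + (1 - f[5]) + f[6] + f[7] + f[8])
    return q * s

# Equation (10): the fused importance of frame k is the global term
# (whole frame) plus the local term (central 3x3 block).
# fused_k = frame_importance(q_G, f_global) + frame_importance(q_L, f_local)
```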
In step 7, the importance of each video segment is calculated: from the cut points of the video segments obtained in step 2 and the per-frame fused importance obtained in step 6, the average fused importance of each video segment is computed. This calculation mainly prepares for the subsequent selection of the video segment subset.
The calculation formulas of video segment importance are:
$I_C = \sum_{k=i}^{next\_i} I_{G_k \& L_k}$  (11)
$I_j = \frac{I_C}{next\_i - i + 1}$  (12)
where $I_C$ is the sum of the fused importance over the video segment, $I_j$ is the average fused importance of the video segment, i is a cut point obtained in step 2, and next_i is the next cut point. The average fused importance $I_j$ of the video segment serves as the basis for the subsequent selection of the video segment subset.
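A sketch of the per-segment averaging of equations (11) and (12), assuming `fused` holds the per-frame fused importance values and `cut_points` the boundaries recorded in step 2:

```python
def segment_importances(fused, cut_points):
    """Average fused importance I_j of each segment, equations
    (11)-(12): segment j spans cut point i through next_i inclusive,
    matching the denominator (next_i - i + 1)."""
    avgs = []
    for i, next_i in zip(cut_points, cut_points[1:]):
        seg = fused[i:next_i + 1]           # frames i..next_i, eq. (11)
        avgs.append(sum(seg) / len(seg))    # I_j, eq. (12)
    return avgs
```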
In step 8, according to the fused importance of each video segment calculated in step 7 and a preset threshold, a subset is selected from the set of video segments obtained by the segmentation in step 2. Here the threshold is set as the proportion of all video segments taken up by the abstract segments; the proportion must be neither too high nor too low, since choosing too many or too few video segments degrades the quality of the video abstract. Setting the proportion to, for example, 15% or 20% is appropriate.
The calculation formula of the subset selection is:
$I = \arg\max \sum_{c=1}^{N} \{1, 0\} \cdot I_C$  (13)
where {1, 0} is a decision function used to judge whether a video segment is chosen as part of the video abstract: if the segment is chosen as part of the abstract the value of the function is 1, and otherwise it is 0. Based on the above formula a suitable subset of video segments can be selected.
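The patent states the selection as a 0/1 argmax under this proportion threshold but does not prescribe a solver; a greedy pass over the segments ordered by average fused importance, as sketched below, is one plausible realization:

```python
def select_segments(avgs, lengths, ratio=0.15):
    """Greedily pick segments by average fused importance I_j until
    the summary length budget (ratio of total length) is exhausted.

    The greedy order is an assumption; the patent only states the
    0/1 argmax objective and the 15%-20% proportion threshold.
    """
    budget = ratio * sum(lengths)
    order = sorted(range(len(avgs)), key=lambda j: avgs[j], reverse=True)
    chosen, used = [], 0
    for j in order:
        if used + lengths[j] <= budget:     # decision function = 1
            chosen.append(j)
            used += lengths[j]
    return sorted(chosen)                   # restore temporal order
```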
In step 9, the video abstract is synthesized from the subset of video segments selected in step 8. Synthesis simply merges the video segments of the resulting subset in their order in the original video. During synthesis it is necessary to consider whether the video contains audio information; if it does, the audio information is included in the synthesized video abstract. Fig. 4 shows the video abstract demonstration system. This video summarization method presents the video abstract results to the user in a concise manner, significantly improving the user's experience of viewing the video data.
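A minimal synthesis sketch using moviepy (an assumed dependency, not named in the patent) is given below; moviepy carries the audio track through automatically when the source video has one, matching the audio handling described above:

```python
from moviepy.editor import VideoFileClip, concatenate_videoclips

def synthesize_summary(path, cut_points, chosen, out="summary.mp4"):
    """Cut the chosen segments out of the original video in temporal
    order and concatenate them into the summary; frame indices are
    converted to seconds via the clip's own frame rate."""
    clip = VideoFileClip(path)
    parts = [clip.subclip(cut_points[j] / clip.fps,
                          min((cut_points[j + 1] + 1) / clip.fps, clip.duration))
             for j in chosen]               # chosen is already in order
    concatenate_videoclips(parts).write_videofile(out)
```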

Claims (10)

1. A multi-feature fusion video abstract generation method, characterized by comprising the following steps:
step 1, obtaining a video and using the video as input data;
step 2, segmenting the input video data into fragments and recording the cut points and the number of video segments;
step 3, extracting the video frames and frame central blocks in each video segment;
step 4, calculating features and image quality for the extracted video frames and frame central blocks respectively;
step 5, calculating global importance and local importance from the obtained features;
step 6, fusing the global importance and local importance of each frame to obtain a fused importance;
step 7, calculating the importance of each video segment according to the cut points;
step 8, selecting video segments according to the importance of each video segment and a preset threshold, so that an optimized subset of video segments is selected;
step 9, synthesizing the video abstract from the selected subset of video segments.
2. The method according to claim 1, characterized in that in step 2 the input video is split into several small video segments by the superframe cut method, which uses the foreground, background and motion information of the video, yielding the cut points and the number of video segments.
3. The method according to claim 1, characterized in that the extraction process for the frame central block in step 3 is: the video frame is divided into 5x5 blocks, and then the central 3x3 block is extracted.
4. The method according to claim 1, characterized in that the features calculated in step 4 include visual saliency $f_1$, exposure $f_2$, chroma $f_3$, saturation $f_4$, the three rule-of-thirds feature values $f_5$, $f_6$, $f_7$, contrast $f_8$ and directionality $f_9$, and the image quality calculated in step 4 includes the image quality $q_{G_k}$ of the video frame and the image quality $q_{L_k}$ of the frame central block, wherein
$f_1 = F_A(A_S, A_T) = \frac{1}{2}\left[(A_S + A_T) + \frac{1}{1+\gamma}\,|A_S - A_T|\right], \text{ where } \gamma > 0$  (1)
wherein $A_S$ is the static saliency, $A_T$ is the temporal saliency, and $\gamma$ is a non-negative empirical parameter;
$f_2 = \frac{1}{XY}\sum_{x_v=0}^{X-1}\sum_{y_v=0}^{Y-1} I_V(x_v, y_v)$  (2)
wherein X and Y are respectively the length and width of the HSV image converted from the extracted video frame, $x_v$, $y_v$ are pixel positions in channel V, and $I_V(x_v, y_v)$ is the V channel of the HSV image;
$f_3 = \frac{1}{XY}\sum_{x_s=0}^{X-1}\sum_{y_s=0}^{Y-1} I_S(x_s, y_s)$  (3)
wherein $x_s$, $y_s$ are pixel positions in channel S and $I_S(x_s, y_s)$ is the S channel of the HSV image;
$f_4 = \frac{1}{XY}\sum_{x_h=0}^{X-1}\sum_{y_h=0}^{Y-1} I_H(x_h, y_h)$  (4)
wherein $x_h$, $y_h$ are pixel positions in channel H and $I_H(x_h, y_h)$ is the H channel of the HSV image;
$f_5 = \frac{9}{XY}\sum_{x_h=X/3}^{2X/3}\sum_{y_h=Y/3}^{2Y/3} I_H(x_h, y_h)$  (5)
$f_6 = \frac{9}{XY}\sum_{x_s=X/3}^{2X/3}\sum_{y_s=Y/3}^{2Y/3} I_S(x_s, y_s)$  (6)
$f_7 = \frac{9}{XY}\sum_{x_v=X/3}^{2X/3}\sum_{y_v=Y/3}^{2Y/3} I_V(x_v, y_v)$  (7)
contrast $f_8$ and directionality $f_9$ are calculated using Tamura texture features;
the image quality $q_{G_k}$ of the video frame and the image quality $q_{L_k}$ of the frame central block are obtained by a no-reference image quality assessment method.
5. The method according to claim 4, characterized in that the calculation formula of the global importance $I_{G_k}$ in step 5 is:
$I_{G_k} = q_{G_k} \cdot [f_{G_1} + f_{G_2} + (1 - f_{G_3}) + f_{G_4} + f_{G_5} + (1 - f_{G_6}) + f_{G_7} + f_{G_8} + f_{G_9}]$  (8)
wherein k is the index of the video frame and $f_{G_1}$ to $f_{G_9}$ are respectively the values of the nine features based on the video frame;
the calculation formula of the local importance $I_{L_k}$ in step 5 is:
$I_{L_k} = q_{L_k} \cdot [f_{L_1} + f_{L_2} + (1 - f_{L_3}) + f_{L_4} + f_{L_5} + (1 - f_{L_6}) + f_{L_7} + f_{L_8} + f_{L_9}]$  (9)
wherein k is the index of the video frame and $f_{L_1}$ to $f_{L_9}$ are respectively the values of the nine features based on the frame central block.
6. The method according to claim 1, characterized in that step 6 obtains the fused importance by formula (10):
$I_{G_k \& L_k} = I_{G_k} + I_{L_k}$  (10)
wherein k is the index of the video frame, $I_{G_k \& L_k}$ is the fused importance, and $I_{G_k}$ and $I_{L_k}$ are respectively the global importance and the local importance of the video frame.
7. The method according to claim 1, characterized in that the importance of each video segment in step 7 includes the fused importance sum $I_C$ of the video segment and the average fused importance $I_j$ of the video segment,
$I_j = \frac{I_C}{next\_i - i + 1}$  (12)
wherein k is the index of the video frame, $I_{G_k \& L_k}$ is the fused importance of each frame, i denotes the i-th cut point obtained in step 2, and next_i is the next cut point.
8. The method according to claim 7, characterized in that step 8 selects the optimized subset of video segments by formula (13):
$I = \arg\max \sum_{c=1}^{N} \{1, 0\} \cdot I_C$  (13)
wherein N is the total number of video segments and {1, 0} is a decision function for judging whether a video segment is chosen as part of the video abstract: if the segment is chosen as part of the video abstract, the value of the function is 1; otherwise it is 0.
9. The method according to claim 1, characterized in that in step 9 the video segments in the subset selected in step 8 are merged in their order in the original video.
10. The method according to claim 9, characterized in that if the video contains audio information, the audio information is included in the synthesized video abstract during the synthesis of the video abstract.
CN201710486660.9A 2017-06-23 2017-06-23 Multi-feature fusion video abstract generation method Active CN107222795B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710486660.9A CN107222795B (en) 2017-06-23 2017-06-23 Multi-feature fusion video abstract generation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710486660.9A CN107222795B (en) 2017-06-23 2017-06-23 Multi-feature fusion video abstract generation method

Publications (2)

Publication Number Publication Date
CN107222795A (en) 2017-09-29
CN107222795B CN107222795B (en) 2020-07-31

Family

ID=59950929

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710486660.9A Active CN107222795B (en) 2017-06-23 2017-06-23 Multi-feature fusion video abstract generation method

Country Status (1)

Country Link
CN (1) CN107222795B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140037269A1 (en) * 2012-08-03 2014-02-06 Mrityunjay Kumar Video summarization using group sparsity analysis
CN102930061A (en) * 2012-11-28 2013-02-13 安徽水天信息科技有限公司 Video abstraction method and system based on moving target detection
US20160299968A1 (en) * 2015-04-09 2016-10-13 Yahoo! Inc. Topical based media content summarization system and method
CN105228033A * 2015-08-27 2016-01-06 联想(北京)有限公司 Video processing method and electronic equipment
CN106941631A * 2015-11-20 2017-07-11 联发科技股份有限公司 Summary video production method and video data processing system
CN106713964A (en) * 2016-12-05 2017-05-24 乐视控股(北京)有限公司 Method of generating video abstract viewpoint graph and apparatus thereof

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804578A (en) * 2018-05-24 2018-11-13 南京理工大学 The unsupervised video summarization method generated based on consistency segment
CN108804578B (en) * 2018-05-24 2022-06-07 南京理工大学 Unsupervised video abstraction method based on consistency segment generation
CN110868630A (en) * 2018-08-27 2020-03-06 北京优酷科技有限公司 Method and device for generating forecast report
WO2020077999A1 (en) * 2018-10-19 2020-04-23 深圳市商汤科技有限公司 Video abstract generation method and apparatus, electronic device and computer storage medium
CN111246246A (en) * 2018-11-28 2020-06-05 华为技术有限公司 Video playing method and device
WO2020134926A1 (en) * 2018-12-28 2020-07-02 广州市百果园信息技术有限公司 Video quality evaluation method, apparatus and device, and storage medium
US11762905B2 (en) 2018-12-28 2023-09-19 Bigo Technology Pte. Ltd. Video quality evaluation method and apparatus, device, and storage medium
CN109819338A (en) * 2019-02-22 2019-05-28 深圳岚锋创视网络科技有限公司 A kind of automatic editing method, apparatus of video and portable terminal
US11955143B2 (en) 2019-02-22 2024-04-09 Arashi Vision Inc. Automatic video editing method and portable terminal
CN109819338B (en) * 2019-02-22 2021-09-14 影石创新科技股份有限公司 Automatic video editing method and device and portable terminal
CN111062284A (en) * 2019-12-06 2020-04-24 浙江工业大学 Visual understanding and diagnosing method of interactive video abstract model
CN111062284B (en) * 2019-12-06 2023-09-29 浙江工业大学 Visual understanding and diagnosis method for interactive video abstract model
CN111641868A (en) * 2020-05-27 2020-09-08 维沃移动通信有限公司 Preview video generation method and device and electronic equipment
CN112052841A (en) * 2020-10-12 2020-12-08 腾讯科技(深圳)有限公司 Video abstract generation method and related device
CN112052841B (en) * 2020-10-12 2021-06-29 腾讯科技(深圳)有限公司 Video abstract generation method and related device
CN112734733A (en) * 2021-01-12 2021-04-30 天津大学 Non-reference image quality monitoring method based on channel recombination and feature fusion
CN112734733B (en) * 2021-01-12 2022-11-01 天津大学 Non-reference image quality monitoring method based on channel recombination and feature fusion
CN113052149B (en) * 2021-05-20 2021-08-13 平安科技(深圳)有限公司 Video abstract generation method and device, computer equipment and medium
CN113052149A (en) * 2021-05-20 2021-06-29 平安科技(深圳)有限公司 Video abstract generation method and device, computer equipment and medium
CN114140461B (en) * 2021-12-09 2023-02-14 成都智元汇信息技术股份有限公司 Picture cutting method based on edge picture recognition box, electronic equipment and medium
CN114140461A (en) * 2021-12-09 2022-03-04 成都智元汇信息技术股份有限公司 Picture cutting method based on edge picture recognition box, electronic equipment and medium

Also Published As

Publication number Publication date
CN107222795B (en) 2020-07-31

Similar Documents

Publication Publication Date Title
CN107222795A (en) A kind of video abstraction generating method of multiple features fusion
CN109862391A (en) Video classification methods, medium, device and calculating equipment
CN105874449B (en) For extracting and generating the system and method for showing the image of content
EP2587826A1 (en) Extraction and association method and system for objects of interest in video
CN112132197B (en) Model training, image processing method, device, computer equipment and storage medium
CN110414519A Picture character recognition method and recognition device
US20130094756A1 (en) Method and system for personalized advertisement push based on user interest learning
US9606975B2 (en) Apparatus and method for automatically generating visual annotation based on visual language
CN107818105A (en) The recommendation method and server of application program
CN106649663B Video copy detection method based on compact video representation
CN103838754B (en) Information retrieval device and method
CN103810251B (en) Method and device for extracting text
CN109308324A Image retrieval method and system based on hand-drawing style recommendation
CN111429341B (en) Video processing method, device and computer readable storage medium
CN109299277A Public opinion analysis method, server and computer readable storage medium
CN104951495A (en) Apparatus and method for managing representative video images
CN105912684A (en) Cross-media retrieval method based on visual features and semantic features
CN110782448A (en) Rendered image evaluation method and device
CN106600482A (en) Multi-source social data fusion multi-angle travel information perception and intelligent recommendation method
CN106600213A (en) Intelligent resume management system and method
CN112231554A (en) Search recommendation word generation method and device, storage medium and computer equipment
US9569213B1 (en) Semantic visual hash injection into user activity streams
CN107656760A (en) Data processing method and device, electronic equipment
CN111163366A (en) Video processing method and terminal
US20170061642A1 (en) Information processing apparatus, information processing method, and non-transitory computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant