CN104540044A

CN104540044A - Video segmentation method and device

Info

Publication number: CN104540044A
Application number: CN201410843714.9A
Authority: CN
Inventors: 周正杰; 张彦刚
Original assignee: Beijing QIYI Century Science and Technology Co Ltd
Current assignee: Beijing QIYI Century Science and Technology Co Ltd
Priority date: 2014-12-30
Filing date: 2014-12-30
Publication date: 2015-04-22
Anticipated expiration: 2034-12-30
Also published as: CN104540044B

Abstract

The embodiment of the invention discloses a video segmentation method and device, and relates to the technical field of video processing. The method comprises the steps that character information flow in target video is spliced, and a character segment to be processed is generated; a sliding window with the size of w1 and the unit sliding distance of d1 is utilized for calculating the character relevancy Ri of characters in the ith window of the character segment to be processed, and then a character relevancy sequence, {R1, R2, R3...}, is obtained; according to the variation trend of the character relevancy sequence, the segmentation points of the target video are determined; the target video is segmented according to the determined segmentation points. According to the scheme, video segmentation is carried out, accurate segmentation information facilitating operation can be provided for users, and user experience can be improved.

Description

Video segmentation method and device

Technical field

The present invention relates to the technical field of video processing, and in particular to a video segmentation method and device.

Background

Video is well liked by users because it carries a large amount of information and rich content. However, while watching an information-rich video, a user may dislike some part of it. In that case, the user generally chooses to skip the disliked part and continue watching the rest of the video.

Because a video contains many video frames, it is difficult to locate the end frame of the disliked part accurately when skipping it by dragging the playback progress bar. For this reason, in practical applications, video is generally segmented in advance, that is, divided into multiple video segments, so that the end frame of the part to be skipped can be located quickly and accurately and the user can continue watching the subsequent part. When a user needs to skip, a whole video segment is skipped directly according to the segment information of the video, and the user does not have to adjust the progress bar step by step.

In the prior art, video segmentation is usually realized by scene detection, that is, consecutive video frames belonging to the same scene are placed in the same video segment. After a video is segmented by scene detection, a user who wants to skip a certain part of the video can do so in units of scenes. However, in a news video, one news item may comprise a presenter announcement scene, one or more live report scenes, and so on. If the user does not want to watch a certain news item and wants to skip it to continue watching the following news, the user can only skip that item through multiple scene-by-scene operations. In addition, when the scenes corresponding to several consecutive news items in a video are all presenter announcement scenes, the scene does not change between two consecutive news items, so these consecutive items are generally divided into one video segment; skipping one of them then also skips the news items that follow. Therefore, in some situations, segmenting video with the existing scene detection approach cannot provide users with accurate, easy-to-operate segment information, which affects user experience.

Summary of the invention

The embodiments of the present invention disclose a video segmentation method and device, so as to provide users with accurate, easy-to-operate segment information and improve user experience.

To achieve the above object, an embodiment of the present invention discloses a video segmentation method, the method comprising:

Splicing the word message stream in a target video to generate a pending word section;

Using a sliding window of size w₁ and unit sliding distance d₁ to calculate, for each i, the word degree of correlation R_i of the words in the i-th window of the pending word section, thereby obtaining a word degree of correlation sequence {R₁, R₂, R₃, ...}, wherein the starting point of the i-th window is 1 + (i-1)d₁, its end point is w₁ + (i-1)d₁, i = 1, 2, 3, ..., and w₁ >= d₁;

Determining the segmentation points of the target video according to the variation trend of the word degree of correlation sequence, wherein a determined segmentation point shows a decreasing change relative to the values located before it in the word degree of correlation sequence, and the values located after it in the sequence show an increasing change relative to it;

Segmenting the target video according to the determined segmentation points.

Specifically, calculating the word degree of correlation R_i of the words in the i-th window of the pending word section comprises:

Using a sliding subwindow of size w₂ and unit sliding distance d₂ to calculate the inter-window word degree of correlation between the words in any two subwindows, the x-th and the y-th, within the i-th window of the pending word section, wherein the starting point of the x-th subwindow is 1 + (i-1)d₁ + (x-1)d₂ and its end point is w₂ + (i-1)d₁ + (x-1)d₂, the starting point of the y-th subwindow is 1 + (i-1)d₁ + (y-1)d₂ and its end point is w₂ + (i-1)d₁ + (y-1)d₂, x, y = 1, 2, 3, ..., w₂ >= d₂, w₁ - w₂ is a non-negative integer multiple of d₂, and d₁ is a non-negative integer multiple of d₂;

Calculating the word degree of correlation R_i of the words in the i-th window of the pending word section according to the calculated inter-window word degrees of correlation.

Specifically, calculating the inter-window word degree of correlation between the words in any two subwindows, the x-th and the y-th, within the i-th window of the pending word section comprises:

Counting the probability with which each word occurs in the x-th subwindow and in the y-th subwindow of the i-th window of the pending word section;

Calculating, according to the word occurrence probabilities obtained by counting, the inter-window word degree of correlation between the words in the x-th and the y-th subwindows of the i-th window of the pending word section.

Specifically, calculating the inter-window word degree of correlation between the words in the x-th and the y-th subwindows of the i-th window of the pending word section according to the word occurrence probabilities obtained by counting comprises:

Determining the words that are identical in the x-th and the y-th subwindows of the i-th window of the pending word section;

Calculating the inter-window word degree of correlation S_xy between the words in the x-th subwindow and the y-th subwindow according to the following formula:

S_{xy} = \frac{P_{sx1} P_{sy1} + P_{sx2} P_{sy2} + \cdots + P_{sxm} P_{sym}}{P_{x1}^{2} + P_{x2}^{2} + \cdots + P_{xn}^{2} + P_{y1}^{2} + P_{y2}^{2} + \cdots + P_{yp}^{2}},

Wherein m represents the number of identical words in the x-th subwindow and the y-th subwindow, n represents the number of different words contained in the x-th subwindow, p represents the number of different words contained in the y-th subwindow, P_sxm and P_sym represent the probabilities with which the m-th identical word occurs in the x-th subwindow and in the y-th subwindow respectively, P_xn represents the probability with which the n-th different word occurs in the x-th subwindow, and P_yp represents the probability with which the p-th different word occurs in the y-th subwindow.
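For reference, the same expression can be written compactly with summation notation. The running index k is introduced here only for illustration and does not appear in the original text.

```latex
S_{xy} = \frac{\sum_{k=1}^{m} P_{sxk}\, P_{syk}}
              {\sum_{k=1}^{n} P_{xk}^{2} + \sum_{k=1}^{p} P_{yk}^{2}}
```

Written this way, it is easy to see that two subwindows with identical word distributions give S_xy = 1/2, since the numerator then equals each of the two sums in the denominator, and that two subwindows with no word in common give S_xy = 0.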

Specifically, calculating the word degree of correlation R_i of the words in the i-th window of the pending word section according to the calculated inter-window word degrees of correlation comprises:

Weighting the calculated inter-window word degrees of correlation according to preset weight coefficients to obtain the word degree of correlation R_i of the words in the i-th window of the pending word section.

Specifically, determining the video segmentation points of the target video according to the variation trend of the word degree of correlation sequence comprises:

Determining, as a video segmentation point of the target video, the video playback time corresponding to any word degree of correlation R_j in the word degree of correlation sequence that satisfies the following conditions:

The slope g₁ of the curve determined by R_j and the word degrees of correlation located before R_j in the sequence satisfies g₁ < 0;

The slope g₂ of the curve determined by R_j and the word degrees of correlation located after R_j in the sequence satisfies g₂ > 0.

Specifically, the word message stream in the target video is obtained in one of the following ways:

Obtaining the word message stream of the target video according to a speech recognition algorithm; or

Obtaining the word message stream of the target video from a preset file; or

Obtaining the word message stream of the target video from a preset position of each video frame of the target video according to a text recognition algorithm.

To achieve the above object, an embodiment of the present invention discloses a video segmentation device, the device comprising:

A word section generation module, configured to splice the word message stream in a target video to generate a pending word section;

A word degree of correlation calculation module, configured to use a sliding window of size w₁ and unit sliding distance d₁ to calculate, for each i, the word degree of correlation R_i of the words in the i-th window of the pending word section, thereby obtaining a word degree of correlation sequence {R₁, R₂, R₃, ...}, wherein the starting point of the i-th window is 1 + (i-1)d₁, its end point is w₁ + (i-1)d₁, i = 1, 2, 3, ..., and w₁ >= d₁;

A segmentation point determination module, configured to determine the segmentation points of the target video according to the variation trend of the word degree of correlation sequence, wherein a determined segmentation point shows a decreasing change relative to the values located before it in the word degree of correlation sequence, and the values located after it in the sequence show an increasing change relative to it;

A video segmentation module, configured to segment the target video according to the determined segmentation points.

Specifically, the word degree of correlation calculation module comprises:

An inter-window word degree of correlation calculation submodule, configured to use a sliding subwindow of size w₂ and unit sliding distance d₂ to calculate the inter-window word degree of correlation between the words in any two subwindows, the x-th and the y-th, within the i-th window of the pending word section, wherein the starting point of the x-th subwindow is 1 + (i-1)d₁ + (x-1)d₂ and its end point is w₂ + (i-1)d₁ + (x-1)d₂, the starting point of the y-th subwindow is 1 + (i-1)d₁ + (y-1)d₂ and its end point is w₂ + (i-1)d₁ + (y-1)d₂, x, y = 1, 2, 3, ..., w₂ >= d₂, w₁ - w₂ is a non-negative integer multiple of d₂, and d₁ is a non-negative integer multiple of d₂;

A word degree of correlation calculation submodule, configured to calculate the word degree of correlation R_i of the words in the i-th window of the pending word section according to the calculated inter-window word degrees of correlation.

Specifically, the inter-window word degree of correlation calculation submodule comprises:

A probability statistics unit, configured to count the probability with which each word occurs in the x-th subwindow and in the y-th subwindow of the i-th window of the pending word section;

An inter-window word degree of correlation calculation unit, configured to calculate, according to the word occurrence probabilities obtained by counting, the inter-window word degree of correlation between the words in the x-th and the y-th subwindows of the i-th window of the pending word section.

Specifically, the inter-window word degree of correlation calculation unit comprises:

An identical word determination subunit, configured to determine the words that are identical in the x-th and the y-th subwindows of the i-th window of the pending word section;

A similarity value calculation subunit, configured to calculate the inter-window word degree of correlation S_xy between the words in the x-th subwindow and the y-th subwindow according to the following formula:

S_{xy} = \frac{P_{sx1} P_{sy1} + P_{sx2} P_{sy2} + \cdots + P_{sxm} P_{sym}}{P_{x1}^{2} + P_{x2}^{2} + \cdots + P_{xn}^{2} + P_{y1}^{2} + P_{y2}^{2} + \cdots + P_{yp}^{2}}.

Specifically, the word degree of correlation calculation submodule is configured to weight the calculated inter-window word degrees of correlation according to preset weight coefficients to obtain the word degree of correlation R_i of the words in the i-th window of the pending word section.

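Specifically, the segmentation point determination module is configured to determine, as a video segmentation point of the target video, the video playback time corresponding to any word degree of correlation R_j in the word degree of correlation sequence that satisfies the same conditions as stated for the method: the slope g₁ of the curve determined by R_j and the word degrees of correlation located before R_j in the sequence satisfies g₁ < 0, and the slope g₂ of the curve determined by R_j and the word degrees of correlation located after R_j in the sequence satisfies g₂ > 0.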

Specifically, the video segmentation device further comprises a word message stream obtaining module, configured to obtain the word message stream in the target video;

The word message stream obtaining module is specifically configured to obtain the word message stream of the target video according to a speech recognition algorithm; or

Specifically configured to obtain the word message stream of the target video from a preset file; or

Specifically configured to obtain the word message stream of the target video from a preset position of each video frame of the target video according to a text recognition algorithm.

As can be seen from the above, in the scheme provided by the embodiments of the present invention, the word message stream in the target video is spliced to generate a pending word section; a sliding window is then used to calculate the word degree of correlation of the words in each i-th window of the pending word section, thereby obtaining a word degree of correlation sequence; the segmentation points of the target video are determined according to the variation trend of that sequence; and the target video is segmented according to the determined segmentation points. Because the word messages corresponding to video frames that describe the same event are strongly correlated, segmenting video with this scheme can place the video frames describing the same event in one video segment, so that when a user chooses to skip a certain part of the target video, the video segment corresponding to that part can be skipped directly, without multiple operations. In addition, because the word messages corresponding to video frames that describe different events are only weakly correlated, even if the video scenes of frames describing different events are similar, those frames can still be divided, according to the word messages, into different video segments for the two events. In summary, the scheme provided by the embodiments of the present invention can provide users with accurate, easy-to-operate segment information and can improve user experience.

Brief description of the drawings

To illustrate the technical schemes of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative work.

Fig. 1 is a schematic flowchart of a video segmentation method provided by an embodiment of the present invention;

Fig. 2 is a schematic flowchart of a method for calculating the word degree of correlation provided by an embodiment of the present invention;

Fig. 3 is a schematic flowchart of another method for calculating the word degree of correlation provided by an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a video segmentation device provided by an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a device for calculating the word degree of correlation provided by an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of another device for calculating the word degree of correlation provided by an embodiment of the present invention.

Detailed description of the embodiments

In practical applications, the words describing the same event in a video are highly correlated, while the words describing different events are less correlated. In the present application, the inventors use this property to segment video and propose a video segmentation method and device.

The technical schemes in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative work fall within the protection scope of the present invention.

Fig. 1 is a schematic flowchart of a video segmentation method provided by an embodiment of the present invention. The method comprises:

S101: splicing the word message stream in the target video to generate a pending word section.

It should be noted that the video mentioned in the present application may be video in the ordinary sense understood by users, that is, video comprising audio information and image information; it may also be video that comprises only image information and no audio information.

The word message stream in the target video can be obtained in the following ways:

First way: obtaining the word message stream of the target video according to a speech recognition algorithm.

Some videos may not contain caption information. In that case, the word message stream of the video can be obtained through a speech recognition algorithm. A variety of mature speech recognition algorithms exist in the prior art and are not described in detail here.

Second way: obtaining the word message stream of the target video from a preset file.

In one case, the caption information of a video is embedded in the video images. In another case, the caption information is separate from the video file and is stored in a preset file; when the video is played, the caption information is read from the preset file and displayed at a specific position of the corresponding video frame.

Therefore, when the caption information is stored in a preset file, the word message stream of the target video can be obtained by reading the word message in the preset file.

Third way: obtaining the word message stream of the target video from a preset position of each video frame of the target video according to a text recognition algorithm.

Text recognition technology has matured in recent years, so word message can also be recognized from a preset position of a video frame through a text recognition algorithm and used as the word message stream of the target video. A video frame generally contains not only caption information but possibly also word message that belongs to the scene content. However, the position where caption information appears is relatively fixed; for example, it generally appears near the bottom of the video frame. Therefore, the text recognition region can be determined first and text recognition performed afterwards, so that the word message stream of the target video is obtained more accurately.

In addition, those skilled in the art will understand that consecutive video frames are temporally correlated: the picture content of neighbouring frames is similar, and their caption information may be identical. After the caption information of each frame has been obtained through the text recognition algorithm, if the caption information of several frames is judged to be identical, the caption information of only one of those frames may be retained.
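The de-duplication described above can be sketched as follows. This is only an illustrative sketch: the function name, the (frame number, caption text) input format, the choice to keep the first frame of each run, and the decision to drop frames with no recognized caption are all assumptions, not details given in the text.

```python
from itertools import groupby

def collapse_repeated_captions(frame_captions):
    """Keep one caption per run of consecutive identical captions.

    frame_captions: list of (frame_number, caption_text) in playback order.
    Returns a list of (first_frame_number, caption_text), one per run.
    """
    collapsed = []
    for text, run in groupby(frame_captions, key=lambda item: item[1]):
        first_frame, _ = next(run)      # assumption: keep the first frame of the run
        if text:                        # assumption: ignore frames without a caption
            collapsed.append((first_frame, text))
    return collapsed

frames = [(1, "flood warning"), (2, "flood warning"), (3, ""),
          (4, "market update"), (5, "market update")]
result = collapse_repeated_captions(frames)
```

Here `result` holds one entry for each distinct run of captions, which is the input one would then splice into the pending word section.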

A variety of mature text recognition algorithms exist in the prior art and are not described in detail here.

Of course, in practical applications the ways of obtaining the word message stream in the target video are not limited to the above three, and the present application does not limit this.

In addition to the word message stream in the target video, the above three ways can also yield the correspondence between the words in the word message stream and the video playback time.

Specifically, the correspondence between the words in the word message stream and the video playback time can be obtained according to synchronization information in the audio information, and so on;

The correspondence can be obtained according to the playback time, relative to the playback start time, of each section of caption text recorded in the preset file;

The correspondence can be obtained according to the frame numbers of the video frames on which text recognition is performed, and so on.
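A minimal sketch of splicing captions while keeping the word-to-playback-time correspondence might look like the following. The function name and the (playback time, text) input format are hypothetical, and each character is treated as one word, which suits Chinese captions but is an assumption here.

```python
def splice_captions(timed_captions):
    """Concatenate caption texts and remember the playback time of every word.

    timed_captions: list of (playback_time_seconds, text) in playback order.
    Returns (pending_section, times), where times[k] is the playback time of
    the k-th character of pending_section (0-based).
    """
    pieces = []
    times = []
    for playback_time, text in timed_captions:
        pieces.append(text)
        times.extend([playback_time] * len(text))   # one timestamp per character
    return "".join(pieces), times

section, times = splice_captions([(0.0, "abc"), (2.5, "de")])
```

With such a mapping, a position in the pending word section (for example the position of a window) can be translated back to a video playback time.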

It should be noted that a word mentioned in the present application may be a Chinese character or an English word; the present application does not limit this.

S102: using a sliding window of size w₁ and unit sliding distance d₁ to calculate, for each i, the word degree of correlation R_i of the words in the i-th window of the pending word section, thereby obtaining a word degree of correlation sequence {R₁, R₂, R₃, ...}.

Wherein the starting point of the i-th window is 1 + (i-1)d₁, its end point is w₁ + (i-1)d₁, i = 1, 2, 3, ..., and w₁ >= d₁.
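The window positions follow directly from the two formulas above and can be enumerated as in the sketch below. The function name is hypothetical, and two details are assumptions rather than statements from the text: d₁ is required to be at least 1, and a trailing window that would run past the end of the pending word section is dropped.

```python
def window_bounds(total_words, w1, d1):
    """Return 1-based inclusive (start, end) for every full window.

    Window i starts at 1 + (i - 1) * d1 and ends at w1 + (i - 1) * d1.
    """
    if not (w1 >= d1 >= 1):
        raise ValueError("expected w1 >= d1 >= 1")
    bounds = []
    i = 1
    while w1 + (i - 1) * d1 <= total_words:   # assumption: skip partial windows
        bounds.append((1 + (i - 1) * d1, w1 + (i - 1) * d1))
        i += 1
    return bounds

bounds = window_bounds(total_words=40, w1=14, d1=12)
```

For a 40-word section with w₁ = 14 and d₁ = 12 this yields three windows, each overlapping its neighbour by w₁ - d₁ = 2 words.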

S103: determining the segmentation points of the target video according to the variation trend of the word degree of correlation sequence.

It can be understood that, as the sliding window slides from the words describing a first event to the words describing a second event, the word degree of correlation of the words in each successive sliding window first weakens gradually; then, as the amount of word message describing the second event contained in the sliding window increases, the word degree of correlation of the words in each successive sliding window strengthens gradually. Therefore, during the slide from the words describing the first event to the words describing the second event, the video playback time corresponding to the sliding window with the lowest word degree of correlation can be taken as the segmentation point.

That is, a determined segmentation point shows a decreasing change relative to the values located before it in the word degree of correlation sequence, and the values located after it in the sequence show an increasing change relative to it.

In a preferred embodiment of the present invention, the video playback time corresponding to any word degree of correlation R_j in the above sequence that satisfies the following conditions can be determined as a video segmentation point of the target video:

The slope g₁ of the curve determined by R_j and the word degrees of correlation located before R_j in the sequence satisfies g₁ < 0;

The slope g₂ of the curve determined by R_j and the word degrees of correlation located after R_j in the sequence satisfies g₂ > 0.

Because temporal correlation exists between consecutive video frames of a video, the video playback time corresponding to a word degree of correlation that is selected, according to the variation trend of the sequence, for determining a segmentation point may be a time period rather than a single moment. In another preferred embodiment of the present invention, the video playback moment corresponding to the last video frame in that time period can be determined as the segmentation point, which ensures to the greatest extent that video frames describing the first event are not divided into the video segment formed by the video frames describing the second event.
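One way to read the two slope conditions is as a local-minimum test on the sequence. The sketch below approximates the slopes by differences between adjacent values, which is an assumption, since the text does not say how the curve slopes g₁ and g₂ are computed. It also reports a flat valley bottom once, at its last index, in the spirit of choosing the last video frame of a time period. The function name is hypothetical.

```python
def find_segmentation_indices(correlations):
    """Return 0-based indices j where the sequence falls into R_j and rises after it.

    A run of equal values forming the bottom of a valley is reported once,
    at the last index of the run.
    """
    points = []
    n = len(correlations)
    j = 1
    while j < n - 1:
        if correlations[j] < correlations[j - 1]:          # slope before R_j < 0
            k = j
            while k + 1 < n and correlations[k + 1] == correlations[j]:
                k += 1                                      # walk along the flat bottom
            if k + 1 < n and correlations[k + 1] > correlations[k]:  # slope after > 0
                points.append(k)
            j = k + 1
        else:
            j += 1
    return points

sequence = [0.9, 0.7, 0.3, 0.3, 0.6, 0.8, 0.5, 0.2, 0.4]
indices = find_segmentation_indices(sequence)
```

Each returned index still has to be mapped to a video playback time through the word-to-time correspondence obtained in S101. Exact equality on floating-point values is used only to keep the sketch short; a tolerance may be preferable in practice.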

S104: segmenting the target video according to the determined segmentation points.
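As a simple illustration of this step, segmentation points expressed as playback times can be turned into (start, end) segments as sketched below. The function name and the use of seconds are assumptions; the text does not prescribe how segment information is represented.

```python
def split_into_segments(duration, segmentation_times):
    """Turn segmentation times (seconds) into (start, end) video segments."""
    # Ignore duplicates and points that fall outside the playable range.
    cuts = sorted(t for t in set(segmentation_times) if 0 < t < duration)
    edges = [0.0] + cuts + [duration]
    return list(zip(edges[:-1], edges[1:]))

segments = split_into_segments(120.0, [75.0, 30.0])
```

A player could then skip the current segment by seeking to the end time of the segment that contains the current playback position.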

As can be seen from the above, in the scheme provided by this embodiment, the word message stream in the target video is spliced to generate a pending word section; a sliding window is then used to calculate the word degree of correlation of the words in each i-th window of the pending word section, thereby obtaining a word degree of correlation sequence; the segmentation points of the target video are determined according to the variation trend of that sequence; and the target video is segmented according to the determined segmentation points. Because the word messages corresponding to video frames that describe the same event are strongly correlated, segmenting video with the scheme of this embodiment can place the video frames describing the same event in one video segment, so that when a user chooses to skip a certain part of the target video, the video segment corresponding to that part can be skipped directly, without multiple operations. In addition, because the word messages corresponding to video frames that describe different events are only weakly correlated, even if the video scenes of frames describing different events are similar, those frames can still be divided, according to the word messages, into different video segments for the two events. In summary, the scheme provided by this embodiment can provide users with accurate, easy-to-operate segment information and can improve user experience.

In one specific embodiment of the present invention, referring to Fig. 2, a schematic flowchart of a method for calculating the word degree of correlation is provided. In this embodiment, calculating the word degree of correlation R_i of the words in the i-th window of the pending word section comprises:

S1021: using a sliding subwindow of size w₂ and unit sliding distance d₂ to calculate the inter-window word degree of correlation between the words in any two subwindows, the x-th and the y-th, within the i-th window of the pending word section.

Wherein the starting point of the x-th subwindow is 1 + (i-1)d₁ + (x-1)d₂ and its end point is w₂ + (i-1)d₁ + (x-1)d₂, the starting point of the y-th subwindow is 1 + (i-1)d₁ + (y-1)d₂ and its end point is w₂ + (i-1)d₁ + (y-1)d₂, x, y = 1, 2, 3, ..., w₂ >= d₂, w₁ - w₂ is a non-negative integer multiple of d₂, and d₁ is a non-negative integer multiple of d₂.

When w₂ = d₂, the words in two adjacent subwindows do not overlap; when w₂ > d₂, the words in two adjacent subwindows overlap. Those skilled in the art will understand that the longer the overlapping part, the higher the calculated inter-window word degree of correlation. Of course, two adjacent subwindows cannot overlap completely. For example, when d₂ = 1, the calculated inter-window word degree of correlation is an inter-window word degree of correlation for each word.

Further, when w₁ = d₁, the pending word section can be understood as being divided into several consecutive word segments; when w₂ = d₂, each window of the pending word section can be understood as being divided into several consecutive word sub-segments.

In a preferred embodiment of the present invention, when (w₁ - w₂)/d₂ = d₁/d₂ - 1, the set of subwindows obtained within the windows of the pending word section is identical to the set of subwindows obtained by sliding a subwindow of size w₂ and unit sliding distance d₂ directly over the pending word section.
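The equivalence stated in this preferred embodiment can be checked numerically. The sketch below uses illustrative values that satisfy (w₁ - w₂)/d₂ = d₁/d₂ - 1 and compares the subwindows generated window by window with those generated by sliding directly over the section. The function names are hypothetical, and taking (w₁ - w₂)/d₂ + 1 subwindows per window (those that fit entirely inside the window) is an assumption.

```python
def subwindows_per_window(num_windows, w1, d1, w2, d2):
    """Collect every subwindow (start, end) generated inside the first num_windows windows."""
    found = set()
    for i in range(1, num_windows + 1):
        count = (w1 - w2) // d2 + 1          # subwindows that fit inside one window
        for x in range(1, count + 1):
            start = 1 + (i - 1) * d1 + (x - 1) * d2
            found.add((start, start + w2 - 1))
    return found

def subwindows_direct(last_end, w2, d2):
    """Slide a subwindow of size w2 and step d2 directly over words 1..last_end."""
    found = set()
    start = 1
    while start + w2 - 1 <= last_end:
        found.add((start, start + w2 - 1))
        start += d2
    return found

w1, d1, w2, d2 = 14, 12, 5, 3                # (14 - 5) / 3 == 12 / 3 - 1 == 3
nested = subwindows_per_window(3, w1, d1, w2, d2)
direct = subwindows_direct(w1 + 2 * d1, w2, d2)
same = nested == direct
```

For these values both procedures produce the same twelve subwindows over words 1 to 38, so inter-window results computed for one window could in principle be reused wherever the same subwindow appears again.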

Preferably, to ensure that the same number of subwindows is used for every window when calculating the word degree of correlation of the words in each window of the pending word section, the values of w₂ and d₂ can be set so that every window of the pending word section contains the same number of subwindows.

The windows of the pending word section and the subwindows in each window are described below through specific examples.

Example one, w₁ = d₁, w₂ = d₂:

Suppose w₁ = d₁ = 15 and w₂ = d₂ = 5, so that w₁ - w₂ is 2 times d₂ and d₁ is 3 times d₂. Then the starting point of the 1st window of the pending word section is 1 + (1-1)×15 = 1 and its end point is 15 + (1-1)×15 = 15. The starting point and end point of each subwindow in the 1st window are shown in Table 1.

Table 1

The starting point of the 2nd window of the pending word section is 1 + (2-1)×15 = 16 and its end point is 15 + (2-1)×15 = 30. The starting point and end point of each subwindow in the 2nd window are shown in Table 2.

Table 2

Example two, w₁ > d₁, w₂ > d₂:

Suppose w₁ = 14, d₁ = 12, w₂ = 5 and d₂ = 3, so that w₁ - w₂ is 3 times d₂ and d₁ is 4 times d₂. Then the starting point of the 1st window of the pending word section is 1 + (1-1)×12 = 1 and its end point is 14 + (1-1)×12 = 14. The starting point and end point of each subwindow in the 1st window are shown in Table 3.

Table 3

The starting point of the 2nd window of the pending word section is 1 + (2-1)×12 = 13 and its end point is 14 + (2-1)×12 = 26. The starting point and end point of each subwindow in the 2nd window are shown in Table 4.

Table 4
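The contents of Tables 1 to 4 are not reproduced in this text. As a hedged illustration, the subwindow positions for the two examples can be derived from the starting-point and end-point formulas given for S1021. The function name is hypothetical, and the number of subwindows per window is assumed to be (w₁ - w₂)/d₂ + 1, that is, only subwindows lying entirely inside the window are counted.

```python
def subwindow_bounds(i, w1, d1, w2, d2):
    """1-based inclusive (start, end) of each subwindow inside window i."""
    count = (w1 - w2) // d2 + 1              # assumption: subwindows must fit in the window
    offset = (i - 1) * d1
    return [(1 + offset + (x - 1) * d2, w2 + offset + (x - 1) * d2)
            for x in range(1, count + 1)]

# Example one: w1 = d1 = 15, w2 = d2 = 5
example_one_window_1 = subwindow_bounds(1, w1=15, d1=15, w2=5, d2=5)
example_one_window_2 = subwindow_bounds(2, w1=15, d1=15, w2=5, d2=5)

# Example two: w1 = 14, d1 = 12, w2 = 5, d2 = 3
example_two_window_1 = subwindow_bounds(1, w1=14, d1=12, w2=5, d2=3)
example_two_window_2 = subwindow_bounds(2, w1=14, d1=12, w2=5, d2=3)
```

Under these assumptions, example one gives three non-overlapping subwindows per window, while example two gives four subwindows per window, each overlapping its neighbour by w₂ - d₂ = 2 words.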

In another specific embodiment of the present invention, referring to Fig. 3, a schematic flowchart of another method for calculating the word degree of correlation is provided. Compared with the embodiment shown in Fig. 2, in this embodiment, calculating the inter-window word degree of correlation between the words in any two subwindows, the x-th and the y-th, within the i-th window of the pending word section comprises:

S1021A: counting the probability with which each word occurs in the x-th subwindow and in the y-th subwindow of the i-th window of the pending word section.

The probability with which each word occurs may be represented by the number of times the word occurs in the subwindow, or by the ratio of the number of times the word occurs in the subwindow to the total number of words in the subwindow; the present application does not limit this.
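Both representations mentioned above can be computed with a simple counter, as in the following sketch. The function name and the `normalise` flag are hypothetical, and each list element is treated as one word.

```python
from collections import Counter

def word_probabilities(subwindow_words, normalise=True):
    """Map each distinct word to its count, or to count / total when normalise is True."""
    counts = Counter(subwindow_words)
    if not normalise:
        return dict(counts)                  # raw occurrence counts
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

probs = word_probabilities(list("abca"))
raw = word_probabilities(list("abca"), normalise=False)
```

For the four-word subwindow "abca", the ratio form assigns 0.5 to "a" and 0.25 to each of "b" and "c", while the count form assigns 2, 1 and 1.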

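S1021B: according to the occurrence probabilities obtained by counting, calculate the inter-window character relevancy between the characters in any two sub-windows, the x-th and the y-th, of the i-th window of the character segment to be processed.

In one implementation of the present invention, the characters that are identical in the x-th and y-th sub-windows of the i-th window of the character segment to be processed may first be determined, and the inter-window character relevancy S_xy between the characters in the x-th sub-window and those in the y-th sub-window is then calculated according to the following formula:

S_{xy} = \frac{P_{sx1} \cdot P_{sy1} + P_{sx2} \cdot P_{sy2} + \cdots + P_{sxm} \cdot P_{sym}}{P_{x1}^{2} + P_{x2}^{2} + \cdots + P_{xn}^{2} + P_{y1}^{2} + P_{y2}^{2} + \cdots + P_{yp}^{2}},

where m is the number of characters that are identical in the two sub-windows, P_sx1 ... P_sxm and P_sy1 ... P_sym are the occurrence probabilities of these identical characters in the x-th and y-th sub-windows respectively, n and p are the numbers of distinct characters in the x-th and y-th sub-windows respectively, and P_x1 ... P_xn and P_y1 ... P_yp are the occurrence probabilities of those distinct characters.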

It should be noted that n and p are not necessarily equal. When n is not equal to p, the number of values P_x1 ... P_xn differs from the number of values P_y1 ... P_yp. To ensure that the above expression can be used normally to calculate the inter-window character relevancy, zeros may be appended after P_xn when n < p, or after P_yp when n > p, so that the two groups contain the same number of values.

For example, if n = 3 and p = 2, then P_x1 ... P_xn are P_x1, P_x2, P_x3 and P_y1 ... P_yp are P_y1, P_y2, so a 0 needs to be appended after P_y2, giving P_y1, P_y2, 0.

In addition, the correspondence between P_x1 ... P_xn and the distinct characters in the x-th sub-window may be determined by the order in which these characters appear in the x-th sub-window, or by their occurrence probabilities in the x-th sub-window sorted from high to low or from low to high, and so on. The way of determining the correspondence is not limited to the above and may be chosen according to the specific situation in practical applications.

The correspondence between P_y1 ... P_yp and the distinct characters in the y-th sub-window may be determined in the same way as for the x-th sub-window, and is not repeated here.

S1022: according to the calculated inter-window character relevancies, calculate the character relevancy R_i of the characters in the i-th window of the character segment to be processed.

Specifically, the calculated inter-window character relevancies may be weighted according to preset weight coefficients to obtain the character relevancy R_i of the characters in the i-th window of the character segment to be processed.
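The weighting step can be sketched as a weighted sum. The description does not specify the weight coefficients or the exact combination rule, so the weighted sum, the uniform default weights and the function name below are assumptions for illustration only.

```python
def window_relevancy(pair_relevancies, weights=None):
    """Combine inter-window relevancies S_xy into one value R_i.

    Assumption: R_i is the weighted sum of the S_xy values; when no
    preset weights are given, uniform weights are used.
    """
    if not pair_relevancies:
        raise ValueError("at least one inter-window relevancy is required")
    if weights is None:
        weights = [1 / len(pair_relevancies)] * len(pair_relevancies)
    if len(weights) != len(pair_relevancies):
        raise ValueError("one weight per inter-window relevancy is required")
    return sum(w * s for w, s in zip(weights, pair_relevancies))


# Two hypothetical sub-window pairs with equal preset weights
r_i = window_relevancy([0.5, 0.25], weights=[0.5, 0.5])
```

Other preset weightings, for example giving adjacent sub-window pairs more influence than distant ones, fit the same interface.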

As can be seen from the above, in the scheme provided by the above embodiment, the character relevancy of the characters in the i-th window of the character segment to be processed is calculated using a sliding window. By adjusting the size and the unit sliding distance of the sliding window, character relevancies of different precision can be obtained, which makes it convenient to meet the various calculation precision requirements of users.

The embodiment shown in Fig. 3 is described in detail again below by means of a specific example.

S1021A: suppose that, in the 1st window of the character segment to be processed, the x-th sub-window contains the characters "upper", "face" (twice), "under" and "left", and the y-th sub-window contains the characters "face" (three times), "right" and "rear". Then:

The occurrence probability of each character in the x-th sub-window (represented here by the number of times the character occurs in the sub-window) is: upper: 1, face: 2, under: 1, left: 1;

The occurrence probability of each character in the y-th sub-window is: face: 3, right: 1, rear: 1.

S1021B: from the characters in the x-th sub-window and those in the y-th sub-window, it can be seen that the only character identical in the two sub-windows is "face". The number of identical characters in the two sub-windows is therefore m = 1, the number of distinct characters in the x-th sub-window is n = 4, and the number of distinct characters in the y-th sub-window is p = 3. The occurrence probability of "face" is P_sx1 = 2 in the x-th sub-window and P_sy1 = 3 in the y-th sub-window. The occurrence probabilities of "upper", "face", "under" and "left" in the x-th sub-window are P_x1 = 1, P_x2 = 2, P_x3 = 1 and P_x4 = 1 respectively, and the occurrence probabilities of "face", "right" and "rear" in the y-th sub-window are P_y1 = 3, P_y2 = 1 and P_y3 = 1 respectively;

In addition, since n > p, P_y4 = 0 may be appended after P_y3;

S_xy is then calculated according to the following formula:

\begin{aligned} S_{xy} &= \frac{P_{sx1} \cdot P_{sy1}}{P_{x1}^{2} + P_{x2}^{2} + P_{x3}^{2} + P_{x4}^{2} + P_{y1}^{2} + P_{y2}^{2} + P_{y3}^{2} + P_{y4}^{2}} \\ &= \frac{2 \cdot 3}{1^{2} + 2^{2} + 1^{2} + 1^{2} + 3^{2} + 1^{2} + 1^{2} + 0^{2}} = \frac{6}{18} = \frac{1}{3}. \end{aligned}
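The worked example can be reproduced with a short sketch of the S_xy formula. The function name is hypothetical, counts are used as the occurrence probabilities as in the example, and exact fractions are used so that the result is not subject to rounding.

```python
from collections import Counter
from fractions import Fraction


def inter_window_relevancy(counts_x, counts_y):
    """Compute S_xy from per-character occurrence counts.

    Numerator: products of the counts of characters present in both
    sub-windows. Denominator: sum of squared counts of both sub-windows.
    Zero padding is not needed here because 0 squared adds nothing.
    """
    shared = counts_x.keys() & counts_y.keys()
    numerator = sum(counts_x[ch] * counts_y[ch] for ch in shared)
    denominator = (sum(v * v for v in counts_x.values())
                   + sum(v * v for v in counts_y.values()))
    return Fraction(numerator, denominator)


# Counts from the example: only "face" occurs in both sub-windows
counts_x = Counter({"upper": 1, "face": 2, "under": 1, "left": 1})
counts_y = Counter({"face": 3, "right": 1, "rear": 1})
s_xy = inter_window_relevancy(counts_x, counts_y)  # Fraction(1, 3)
```

If both sub-windows were empty the denominator would be zero, so a caller would need to handle that case separately.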

S1022: the inter-window character relevancy between the characters in any two sub-windows of the i-th window can be calculated by the above process; the calculated inter-window character relevancies are then weighted according to preset weight coefficients to obtain the character relevancy R_i of the characters in the i-th window of the character segment to be processed.

Corresponding to the above video segmentation method, the embodiment of the present invention further provides a video segmentation device.

Fig. 4 is a schematic structural diagram of a video segmentation device provided by the embodiment of the present invention. The device comprises: a character segment generation module 401, a character relevancy calculation module 402, a segmentation point determination module 403 and a video segmentation module 404.

The character segment generation module 401 is configured to splice the character information stream in a target video to generate a character segment to be processed;

The character relevancy calculation module 402 is configured to use a sliding window of size w₁ and unit sliding distance d₁ to calculate the character relevancy R_i of the characters in the i-th window of the character segment to be processed, thereby obtaining the character relevancy sequence {R₁, R₂, R₃, ...}, where the start point of the i-th window is 1 + (i-1)d₁, the end point is w₁ + (i-1)d₁, i = 1, 2, 3, ..., and w₁ ≥ d₁;

The segmentation point determination module 403 is configured to determine the segmentation points of the target video according to the variation trend of the character relevancy sequence, where a determined segmentation point shows a decreasing change relative to the value located before it in the character relevancy sequence, and the value located after it in the character relevancy sequence shows an increasing change relative to it;

The video segmentation module 404 is configured to segment the target video according to the determined segmentation points.
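The trend condition used by the segmentation point determination module 403 can be sketched as a search for local minima of the character relevancy sequence. Comparing only the immediately adjacent values, requiring strict inequalities and returning 0-based indices are assumptions made for illustration; mapping an index to a video playing time is outside this sketch.

```python
def segmentation_indices(relevancies):
    """Return 0-based indices k where R[k-1] > R[k] < R[k+1].

    Assumption: a segmentation point is a strict local minimum judged
    against its two immediate neighbours only.
    """
    return [
        k
        for k in range(1, len(relevancies) - 1)
        if relevancies[k - 1] > relevancies[k] < relevancies[k + 1]
    ]


# Hypothetical relevancy sequence {R1, R2, ...}
points = segmentation_indices([0.9, 0.4, 0.7, 0.8, 0.3, 0.6])
```

A low relevancy value suggests that the characters inside that window share little content, which is consistent with a topic change at that position.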

Specifically, the segmentation point determination module 403 is configured to determine, as a video segmentation point of the target video, the video playing time corresponding to a character relevancy in the character relevancy sequence that satisfies the following condition:

Specifically, the above video segmentation device may further comprise a character information stream obtaining module (not shown in the figure).

The character information stream obtaining module is configured to obtain the character information stream in the target video;

In a specific embodiment of the present invention, referring to Fig. 5, a schematic structural diagram of a device for calculating the character relevancy is provided. In this embodiment, the character relevancy calculation module 402 of the previous embodiment comprises: an inter-window character relevancy calculation submodule 4021 and a character relevancy calculation submodule 4022.

The inter-window character relevancy calculation submodule 4021 is configured to use a sliding sub-window of size w₂ and unit sliding distance d₂ to calculate the inter-window character relevancy between the characters in any two sub-windows, the x-th and the y-th, of the i-th window of the character segment to be processed, where the start point of the x-th sub-window is 1 + (i-1)d₁ + (x-1)d₂ and its end point is w₂ + (i-1)d₁ + (x-1)d₂, the start point of the y-th sub-window is 1 + (i-1)d₁ + (y-1)d₂ and its end point is w₂ + (i-1)d₁ + (y-1)d₂, x, y = 1, 2, 3, ..., w₂ ≥ d₂, w₁ - w₂ is a non-negative integer multiple of d₂, and d₁ is a non-negative integer multiple of d₂;

Specifically, the character relevancy calculation submodule 4022 is configured to weight the calculated inter-window character relevancies according to preset weight coefficients to obtain the character relevancy R_i of the characters in the i-th window of the character segment to be processed.

In another specific embodiment of the present invention, referring to Fig. 6, another schematic structural diagram of a device for calculating the character relevancy is provided. Compared with the embodiment shown in Fig. 5, in this embodiment, the inter-window character relevancy calculation submodule 4021 comprises: a probability statistics unit 40211 and an inter-window character relevancy calculation unit 40212.

The probability statistics unit 40211 is configured to separately count the occurrence probability of each character in any two sub-windows, the x-th and the y-th, of the i-th window of the character segment to be processed;

The inter-window character relevancy calculation unit 40212 is configured to calculate, according to the occurrence probabilities obtained by counting, the inter-window character relevancy between the characters in any two sub-windows, the x-th and the y-th, of the i-th window of the character segment to be processed.

Specifically, the inter-window character relevancy calculation unit 40212 may comprise: a same-character determination subunit and a similarity characteristic value calculation subunit (not shown in the figure).

The same-character determination subunit is configured to determine the characters that are identical in any two sub-windows, the x-th and the y-th, of the i-th window of the character segment to be processed;

S_{xy} = \frac{P_{sx1} \cdot P_{sy1} + P_{sx2} \cdot P_{sy2} + \cdots + P_{sxm} \cdot P_{sym}}{P_{x1}^{2} + P_{x2}^{2} + \cdots + P_{xn}^{2} + P_{y1}^{2} + P_{y2}^{2} + \cdots + P_{yp}^{2}},

Since the device embodiment is substantially similar to the method embodiment, its description is relatively brief; for relevant parts, refer to the description of the method embodiment.

It should be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or device comprising that element.

Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments can be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk or an optical disc.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A video segmentation method, characterized in that the method comprises:

splicing the character information stream in a target video to generate a character segment to be processed;

2. The method according to claim 1, characterized in that the calculating of the character relevancy R_i of the characters in the i-th window of the character segment to be processed comprises:

3. The method according to claim 2, characterized in that the calculating of the inter-window character relevancy between the characters in any two sub-windows, the x-th and the y-th, of the i-th window of the character segment to be processed comprises:

4. The method according to claim 3, characterized in that the calculating, according to the occurrence probabilities obtained by counting, of the inter-window character relevancy between the characters in any two sub-windows, the x-th and the y-th, of the i-th window of the character segment to be processed comprises:

S_{xy} = \frac{P_{sx1} \cdot P_{sy1} + P_{sx2} \cdot P_{sy2} + \cdots + P_{sxm} \cdot P_{sym}}{P_{x1}^{2} + P_{x2}^{2} + \cdots + P_{xn}^{2} + P_{y1}^{2} + P_{y2}^{2} + \cdots + P_{yp}^{2}},

5. The method according to claim 2, characterized in that the calculating, according to the calculated inter-window character relevancies, of the character relevancy R_i of the characters in the i-th window of the character segment to be processed comprises:

6. The method according to claim 1, characterized in that the determining of the video segmentation points of the target video according to the variation trend of the character relevancy sequence comprises:

7. The method according to any one of claims 1 to 6, characterized in that the character information stream in the target video is obtained in the following manner:

8. A video segmentation device, characterized in that the device comprises:

9. The device according to claim 8, characterized in that the character relevancy calculation module comprises:

10. The device according to claim 9, characterized in that the inter-window character relevancy calculation submodule comprises:

11. The device according to claim 10, characterized in that the inter-window character relevancy calculation unit comprises:

S_{xy} = \frac{P_{sx1} \cdot P_{sy1} + P_{sx2} \cdot P_{sy2} + \cdots + P_{sxm} \cdot P_{sym}}{P_{x1}^{2} + P_{x2}^{2} + \cdots + P_{xn}^{2} + P_{y1}^{2} + P_{y2}^{2} + \cdots + P_{yp}^{2}},

12. The device according to claim 9, characterized in that the character relevancy calculation submodule is specifically configured to weight the calculated inter-window character relevancies according to preset weight coefficients to obtain the character relevancy R_i of the characters in the i-th window of the character segment to be processed.

13. The device according to claim 8, characterized in that the segmentation point determination module is specifically configured to determine, as a video segmentation point of the target video, the video playing time corresponding to a character relevancy in the character relevancy sequence that satisfies the following condition:

14. The device according to any one of claims 8 to 13, characterized in that the device further comprises: a character information stream obtaining module, configured to obtain the character information stream in the target video;