CN101616264B - Method and system for cataloging news video - Google Patents

Info

Publication number
CN101616264B
Authority
CN
China
Prior art keywords
news
frame
video
host
word message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008101157870A
Other languages
Chinese (zh)
Other versions
CN101616264A (en
Inventor
陈众
张树武
曾智
杨武夷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN2008101157870A
Publication of CN101616264A
Application granted
Publication of CN101616264B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method and a system for cataloging news video. The method catalogs news video automatically based on caption bar, host (anchor), and audio silence-point information in a news program, and comprises the following steps: separating the news video stream into audio and video and matching the audio data against opening theme music to determine the effective time range of the news program in the file; determining the audio silence points and the appearance times of host frames and caption frames within the effective time range, and analyzing them jointly to determine the segmentation time points of the news items; and recognizing the video caption information, associating it with the segmentation result, and using the associated caption information as the semantic information of the catalog. The system comprises a story-segmentation module and an export module connected to a news video segmentation result database, as well as a browse module, a play module, and a correction module connected in parallel between the client and that database. The method and the system solve the problems of automatic news story segmentation and automatic semantic labeling of news items and achieve automatic cataloging of news programs, with the advantages of high efficiency and low cost.

Description

Method and system for cataloging news video
Technical field
The invention belongs to the field of video structure analysis and, more specifically, relates to techniques for structuring news video.
Background technology
Video structuring means recovering the structural information that a video acquires when it is shot, such as shots and scenes, and using that information to build indexes for the video so that it is easier to manage and use. A video program can be cut manually into several content-based segments, and those segments annotated for users to index and use. Manual processing, however, costs a great deal of time and labor and is inefficient. Manual annotation is also subjectively inconsistent: for the same program, different annotators and users understand the content differently, so the annotations may not objectively reflect the true content of the video, which complicates the management of video content.
Because structuring video by hand is impractical, automated methods are used instead. The computing power of a computer is used to structure the video and complete work that is difficult to do manually.
Summary of the invention
The present invention addresses the low efficiency, high cost, and strong dependence on the cataloger's subjective judgment of existing manual news cataloging. To this end, it provides a method and a system for automatically cataloging news video.
To achieve this purpose, a first aspect of the invention provides a news video cataloging method. The method automatically catalogs news video based on the caption bars, hosts, and audio silence points that occur in a news program, and comprises the following steps:
Step 1: separate the news video stream into audio data and video data; Step 2: match the audio data against opening theme music to determine the effective time range of the news program within the file; perform silence-point detection on the audio data within that time range to obtain a silence-point sequence; perform key frame extraction, host frame detection, and text frame detection on the video data within that time range, obtaining the silence times, host appearance times, and text appearance times within the time range of the news program; Step 3: jointly analyze the silence-point sequence, the host appearance times, and the text information appearance times to obtain the segmentation time points of the news items; at the same time, recognize the text that appears in the video and extract the text information; Step 4: associate the segmentation result of the news program with the recognized text information to obtain a news catalog result that carries semantic information.
The video data processing steps comprise:
Step S2B1: take the video data produced by the audio-video separation; Step S2B2: extract key frames from the video data for detecting host frames and text frames; Step S2B3: detect the time points at which host frames appear, based on local feature matching and the temporal distribution of the host, to generate information that helps determine the start time of each news item; Step S2B4: detect text frames in the key frame set, to determine the number of news items contained in the news program.
The segmentation time points of the news items are determined as follows:
Step 31: arrange the host frames and text frames in chronological order into one mixed sequence M; Step 32: using the two kinds of time points in M (host and text information), together with the information in the silence-point sequence V, determine the time points at which the news is segmented.
The segmentation time points are determined by rule 1 together with rule 2, or by rule 1 together with rule 3. Rule 1: each text frame represents one news item, and the start time of that item is at or before the point where the text frame appears. Rule 2: in the mixed sequence M, if the frame immediately preceding the current text frame is a host key frame, the current text frame and that host frame belong to the same news item, and the host shot is the lead-in shot of that item; the silence point in V that precedes the host frame and is nearest to it is taken as the start time of the item. Rule 3: in the mixed sequence M, if the frame immediately preceding the current text frame is also a text frame, the two text frames belong to different news items; the silence point in V that precedes the current text frame and is nearest to it is taken as the start time of the current item.
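The three rules can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function name `story_start_times`, the use of plain timestamps in seconds, and the fallback used when no silence point precedes a frame (or when a text frame has no predecessor in M) are assumptions made for the example.

```python
from bisect import bisect_left


def story_start_times(host_times, text_times, silence_points):
    """Return one start time per text frame (rule 1: one text frame = one news item)."""
    # Mixed sequence M: host frames and text frames merged in time order.
    mixed = sorted([(t, "host") for t in host_times] +
                   [(t, "text") for t in text_times])
    silences = sorted(silence_points)  # silence-point sequence V

    def nearest_silence_before(t):
        # Last silence point strictly before t.
        # Assumption: fall back to t itself when no earlier silence point exists.
        i = bisect_left(silences, t)
        return silences[i - 1] if i > 0 else t

    starts = []
    for idx, (t, kind) in enumerate(mixed):
        if kind != "text":
            continue
        prev = mixed[idx - 1] if idx > 0 else None
        if prev is not None and prev[1] == "host":
            # Rule 2: the preceding host frame is the lead-in shot of this item.
            starts.append(nearest_silence_before(prev[0]))
        else:
            # Rule 3: preceded by another text frame (assumption: also used
            # when the text frame is the first element of M).
            starts.append(nearest_silence_before(t))
    return starts
```

With hosts at 10 s and 100 s, text frames at 15 s, 60 s, and 105 s, and silence points at 8, 55, 58, 97, and 103 s, the first and third items start at the silence before their host frame and the second at the silence just before its text frame.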
Key frame extraction: key frames are extracted from the I-frames of the video. A window three frames wide slides along the I-frame sequence. At each window position, the similarity of the frame-difference target region is computed between the first and second frames and between the second and third frames, denoted sim(n, n+1) and sim(n+1, n+2) respectively. Similarity is computed by histogram intersection. Let the color histograms of the frame-difference target region in the three I-frames in the window be $H_n(k)$, $H_{n+1}(k)$, and $H_{n+2}(k)$; the similarities are:
$$\mathrm{sim}(n, n+1) = \frac{\sum_{k=0}^{N-1} \min\big(H_n(k),\, H_{n+1}(k)\big)}{\sum_{k=0}^{N-1} H_n(k)}$$
$$\mathrm{sim}(n+1, n+2) = \frac{\sum_{k=0}^{N-1} \min\big(H_{n+1}(k),\, H_{n+2}(k)\big)}{\sum_{k=0}^{N-1} H_{n+1}(k)}$$
In these formulas, N is the number of color bins in the color histogram. Given a preset I-frame similarity threshold T, the following test is applied: if sim(n, n+1) < T and sim(n+1, n+2) > T, that is, the n-th I-frame is dissimilar to the (n+1)-th and the (n+1)-th is similar to the (n+2)-th, then the (n+1)-th I-frame is extracted as a key frame; otherwise the (n+1)-th I-frame is not a key frame. The window then slides forward by one frame and the similarity computation and test are repeated. Once the window has traversed the whole I-frame sequence, the set of key frames that may contain a host or text information has been extracted.
Host frame detection is performed on the extracted key frame set. Face detection is used to filter the extracted key frames, and the key frames that contain a face form a new face key frame set. After visual features are extracted from certain regions of the face key frames, a local feature point detection algorithm detects local feature points in a specific region of each face key frame. Taking one key frame as the reference, its local feature points are matched against those of the other key frames, and groups of key frames that match a sufficient number of local feature points are taken as candidate host key frame groups. Before local feature point matching is applied to two face key frames, color histogram similarity is used to check whether the two frames can be similar; if histogram matching finds them dissimilar, local feature point matching is not performed on that pair. Based on the temporal distribution of host key frames over the whole program, a group of key frames whose time span in the video exceeds a certain threshold is regarded as a candidate host key frame group; otherwise the group is judged not to be host key frames and is discarded. Finally, the candidate groups containing a single host and the groups containing two hosts are considered together to decide which frames are host key frames.
To achieve this purpose, a second aspect of the invention provides a news video cataloging system, comprising: a story-segmentation module whose output is connected to the input of a news video segmentation result database, for outputting the segmentation result obtained by fusing audio and video features; an export module whose input is connected to the output of the news video segmentation result database, which receives the segmentation result and exports the news video catalog result to XML files outside the system, so that these XML files can be loaded into other systems and those systems can obtain the catalog result; and a browse module, a play module, and a correction module connected in parallel between the user side and the news video segmentation result database. The browse module receives the number of the news item the user asks to browse, receives the catalog information of that item from the database, and outputs it to the user, including the segmentation time points, the headline, and the content description of the item. The play module receives the number of the news item the user asks to play, receives the file path and time range of that item from the database, and plays its picture and sound to the user. The correction module receives the number of the news item the user asks to correct, receives the existing catalog information of that item from the database, shows it to the user, and outputs the corrected catalog information to the database.
The story-segmentation module comprises an audio-video separation unit whose output is connected to the input of an audio-video feature fusion unit. The audio-video separation unit receives the news video stream, separates it into audio data and video data, and outputs them. The audio-video feature fusion unit receives the audio data and video data, generates the segmentation result from them, and outputs it.
The audio-video separation unit further comprises:
an audio data subunit having an opening theme music matching part and a silence-point detection part, connected in parallel; and a video data subunit having a host frame detection part, a title bar frame detection part, and a caption text recognition part, connected in parallel.
The browse module comprises a text headline browsing subunit and a key frame image browsing subunit connected in parallel, which present the news catalog result to the user in different forms.
The correction module comprises a news item splitting-or-merging subunit, a news item time point correction subunit, and a news item text information correction subunit connected in parallel, which correct, from different angles, the problems that may arise during automatic news cataloging.
Beneficial effects of the invention: the invention catalogs news programs automatically using the silence-point information, host information, and text information in the program. It solves the problems of automatic news story segmentation and automatic semantic labeling of news items, and achieves automatic cataloging of news programs with high efficiency and low cost. The scheme also uses XML as an intermediate medium to exchange data and share information between the cataloging system and other video-on-demand systems.
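As an illustration of XML used as an exchange medium, the sketch below serializes a list of cataloged news items with Python's standard `xml.etree.ElementTree`. The patent does not specify an XML schema, so the element and attribute names (`NewsCatalog`, `NewsItem`, `Start`, `End`, `Title`) and the function name `catalog_to_xml` are hypothetical.

```python
import xml.etree.ElementTree as ET


def catalog_to_xml(items):
    """items: list of dicts with 'start', 'end' (seconds) and 'title'.

    Returns the catalog as an XML string. All element names are
    hypothetical; the source does not define a schema.
    """
    root = ET.Element("NewsCatalog")
    for number, item in enumerate(items, start=1):
        node = ET.SubElement(root, "NewsItem", number=str(number))
        ET.SubElement(node, "Start").text = str(item["start"])
        ET.SubElement(node, "End").text = str(item["end"])
        ET.SubElement(node, "Title").text = item["title"]
    return ET.tostring(root, encoding="unicode")
```

Another system could load such a file with `ET.fromstring` (or `ET.parse`) and read each item's time range and headline without access to the cataloging database.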
Description of drawings
Fig. 1 is a flow chart of the news cataloging scheme of the invention.
Fig. 2 shows the target region used for frame-difference computation in the invention.
Fig. 3 shows three consecutive frames of the I-frame sequence forming a window.
Fig. 4 is a structural diagram of the news cataloging system of the invention.
Fig. 5 shows the interface of the news cataloging system of the invention.
Embodiment
The details of the technical solution are described below with reference to the drawings. Note that the described embodiments are intended only to aid understanding of the invention and do not limit it in any way.
The invention proposes a method for automatically cataloging news video, shown in Fig. 1. The method catalogs a news video program automatically and recognizes the caption text in the program, which serves as the semantic information of each news story. The cataloging method works mainly by detecting the appearance of caption bars, hosts, and audio silence points in the news video, and determines the segmentation time points and the headline information through a joint analysis of this information. The automatic news cataloging system supports cataloging a video file as well as browsing, playing, correcting, annotating, and exporting the catalog result data. The system uses XML files as an intermediary to exchange data with existing systems.
1. News video cataloging method
The cataloging process comprises the following steps: audio-video separation, opening theme music matching, silence-point detection, key frame extraction, host frame detection, text frame detection, text information recognition, determination of the segmentation time points from the combined audio and video information, and association of news items with text information.
(1) Audio-video separation:
The cataloging scheme analyzes news content using both picture and sound information. Before the cataloging computation itself, the audio data and the video data in the video file are therefore extracted separately, for use by the subsequent audio processing and video processing.
(2) Opening theme (head) music matching: the audio data is matched against opening theme music to determine the effective time range of the news program within the file. Silence-point detection is performed on the audio data within that range to obtain a silence-point sequence. Key frame extraction, host frame detection, and text frame detection are performed on the video data within that range, yielding the silence times, host appearance times, and text appearance times within the time range of the news program. The silence-point sequence, host appearance times, and text information appearance times are then analyzed jointly to obtain the segmentation time points of the news items; at the same time, the text that appears in the video is recognized and the text information is extracted.
The audio data processing steps comprise: Step S2A1: take the audio data produced by the audio-video separation; Step S2A2: extract frequency-domain difference features from the audio data and match the resulting audio features against the opening theme music template features, to find the start time of the news program in its file, obtain the audio silence-point sequence, and identify the news program type at the same time; Step S2A3: sample the audio stream discretely and divide it into short-time audio frames, with some overlap between adjacent frames, then detect silence points in the audio data using short-time average energy to find candidate segmentation time points of the news items.
The video data processing steps comprise:
Step S2B1: take the video data produced by the audio-video separation; Step S2B2: extract key frames from the video data for detecting host frames and text frames; Step S2B3: detect the time points at which host frames appear, based on local feature matching and the temporal distribution of the host, to generate information that helps determine the start time of each news item; Step S2B4: detect text frames in the key frame set, to determine the number of news items contained in the news program.
News video is usually obtained by recording news from television. To make sure the whole program is captured, recording generally starts some time before the program begins and continues for some time after it ends. The actual news program therefore sits at an unknown position within the video file. Before cataloging, the time range of the news program within the file must be determined first; the cataloging computation is then applied only to the valid data within that range.
The cataloging method proposed by the invention uses some prior knowledge about news programs as program parameters. Prior knowledge simplifies the cataloging scheme and sidesteps problems that fully automatic algorithms solve poorly, such as locating text information in video, so that the method is practical. Different types of news have different temporal and spatial structures, so the parameters used to catalog them also differ. The type of the news program must therefore be determined before the cataloging computation starts.
Every news program starts with a piece of opening theme music, and different news programs use different music. Based on this, matching the program's opening theme music locates the start time of the news program in the file and identifies the program type at the same time.
The opening theme music of common current news programs is stored in advance as templates. To determine the start time and type of the news program contained in a file, each template is matched against the audio data in the file. Audio spectral difference features are used as the feature vector in the matching.
The similarity between two audio segments can be computed from their feature vectors:
$$\mathrm{Sim}(a_1, a_2) = 1 - \frac{HD(H_1, H_2)}{N}$$
where $a_1$ and $a_2$ denote the two audio segments, $H_1$ and $H_2$ denote the N-dimensional feature vectors extracted from $a_1$ and $a_2$ respectively, and $HD(\cdot,\cdot)$ denotes the Hamming distance between two vectors.
Suppose there are P known types of news program, $News_1, News_2, \ldots, News_P$, with corresponding opening theme music templates $HM_1, HM_2, \ldots, HM_P$. Template $HM_1$ slides from the start of the audio stream to be matched in steps of one frame, and a match is computed at every step. If the similarity between $HM_1$ and the audio segment at the current position exceeds a preset threshold, a possible start of the opening music has been found: the sliding match for this template stops, and the start time is recorded as $ST_1$ and the similarity as $Sim_1$. After all templates have been slid and matched, a similarity sequence $Sim_1, Sim_2, \ldots, Sim_P$ is obtained. If the maximum is $Sim_k$, then $ST_k$ is chosen as the start time of the news program in the file, and the news type is $News_k$.
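The sliding match can be sketched as follows. This is a simplified illustration under stated assumptions: the features are modelled as flat lists of 0/1 bits (the source only says N-dimensional vectors compared by Hamming distance), the template slides one bit position at a time rather than one audio frame, and the names `similarity`, `first_match`, and `identify_program` and the default threshold are hypothetical.

```python
def similarity(h1, h2):
    """Sim = 1 - HD(h1, h2) / N for two equal-length binary feature vectors."""
    if len(h1) != len(h2):
        raise ValueError("feature vectors must have the same length")
    hamming = sum(a != b for a, b in zip(h1, h2))
    return 1.0 - hamming / len(h1)


def first_match(stream_bits, template_bits, threshold):
    """Slide the template one step at a time.

    Returns (start, sim) for the first position whose similarity exceeds
    the threshold, or None if no position qualifies.
    """
    n = len(template_bits)
    for start in range(len(stream_bits) - n + 1):
        sim = similarity(stream_bits[start:start + n], template_bits)
        if sim > threshold:
            return start, sim
    return None


def identify_program(stream_bits, templates, threshold=0.8):
    """templates: {news_type: template_bits}.

    Returns (news_type, start, sim) for the template with the highest
    similarity among the first hits, or None if nothing matched.
    """
    best = None
    for news_type, template_bits in templates.items():
        hit = first_match(stream_bits, template_bits, threshold)
        if hit is not None and (best is None or hit[1] > best[2]):
            best = (news_type, hit[0], hit[1])
    return best
```

Stopping at the first hit per template mirrors the description above; a variant that keeps the best hit over the whole stream would cost more but would be less sensitive to the threshold.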
The news type gives the duration of the program; combined with the start time, this gives the time range of the news program in the video file.
(3) Silence-point detection: the silence-point sequence, host appearance times, and text information appearance times are analyzed jointly to obtain the segmentation time points of the news items; at the same time, the text that appears in the video is recognized and the text information is extracted.
In a news video, the sound of the host's report or of background narration is present most of the time. Where one news item gives way to the next, the report or narration pauses, which leaves a clearly detectable silent segment in the audio stream. This silent segment helps determine the segmentation time point between news items.
Silence points are detected in the audio data using short-time average energy, to find candidate segmentation time points. Short-time average energy is the mean energy of the sampled signal within one short-time audio frame. Let x denote a continuous audio signal; x is sampled discretely and divided into short-time audio frames, with some overlap between adjacent frames. The short-time average energy of the m-th audio frame is then:
$$E_m = \frac{1}{N} \sum_{n=0}^{N-1} \big[x(n)\big]^2$$
where $E_m$ is the short-time average energy of the m-th audio frame, N is the number of samples in the m-th frame, and x(n) is the value of the n-th sample in the m-th frame.
If the average energy of a short-time audio frame is below a preset threshold, the frame is judged silent; otherwise it is non-silent. For a short audio segment, if the proportion of its short-time frames judged silent exceeds a given ratio, the segment is judged to be a silent segment.
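A minimal sketch of this two-level decision is shown below. The frame length, hop size, and both thresholds are parameters the source leaves unspecified, so the values a caller passes in are assumptions; the function names are hypothetical.

```python
def short_time_energy(frame):
    """E_m = (1/N) * sum of x(n)^2 over the N samples of one short-time frame."""
    return sum(x * x for x in frame) / len(frame)


def split_frames(samples, frame_len, hop):
    """Cut samples into short-time frames; hop < frame_len gives overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def is_silent_segment(samples, frame_len, hop, energy_threshold, min_silent_ratio):
    """A segment is silent if the share of low-energy frames exceeds min_silent_ratio."""
    frames = split_frames(samples, frame_len, hop)
    if not frames:
        return False  # assumption: segments shorter than one frame are not silent
    silent = sum(short_time_energy(f) < energy_threshold for f in frames)
    return silent / len(frames) > min_silent_ratio
```

Judging a segment by the ratio of silent frames, rather than requiring every frame to be silent, makes the decision tolerant of brief noise inside a pause.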
(4) Key frame extraction:
Before the video data is processed, key frames are extracted; the key frames then stand in for the full video data in subsequent computation. Because key frames eliminate redundant data, they greatly reduce the subsequent computation, which makes key frame extraction a very important step.
Key frame extraction here mainly prepares for the subsequent host frame detection and text frame detection. The goal is to extract the frames that may contain a host or text information, not representative frames for every distinct picture content. Far fewer key frames are extracted this way than with key frames in the usual sense, which further reduces subsequent computation.
Because this special kind of key frame extraction is applied to news programs, prior knowledge about news programs can be used to improve on conventional key frame extraction. Only frames that may contain a host or text information need to be extracted, so when the frame difference is computed it suffices to consider the change in a small region that reflects the appearance of a host or of text information, rather than the change in the whole frame. This reduces the number of pixels that take part in the frame-difference computation and hence the amount of computation. As shown in Fig. 2, the white rectangular region at the lower left of the picture is selected as the target region for computing the frame difference. In Fig. 2, (a) shows news footage, (b) shows text information, (c) shows a male host, and (d) shows a female host.
As the figure shows, when the video changes from a picture without text information to one with text information, or from a non-host picture to a host picture, the visual content of the selected rectangular region changes markedly. The region therefore satisfies the requirement that the frame-difference target region reflect the appearance of text information or of a host. When the video switches among the four picture types above, the color features of the target region change significantly, so the color histogram of this region is chosen as the feature vector for computing the frame difference.
Key frames are extracted from the I-frames (intra pictures) of the video. As shown in Fig. 3, the difference in picture content between adjacent frames is used to decide whether a key frame is present. A window three frames wide slides along the I-frame sequence. At each window position, the similarity of the frame-difference target region is computed between the first and second frames and between the second and third frames, denoted sim(n, n+1) and sim(n+1, n+2) respectively. Similarity is computed by histogram intersection. Let the color histograms of the frame-difference target region in the three I-frames in the window be $H_n(k)$, $H_{n+1}(k)$, and $H_{n+2}(k)$; the similarities are:
$$\mathrm{sim}(n, n+1) = \frac{\sum_{k=0}^{N-1} \min\big(H_n(k),\, H_{n+1}(k)\big)}{\sum_{k=0}^{N-1} H_n(k)}$$
$$\mathrm{sim}(n+1, n+2) = \frac{\sum_{k=0}^{N-1} \min\big(H_{n+1}(k),\, H_{n+2}(k)\big)}{\sum_{k=0}^{N-1} H_{n+1}(k)}$$
where N is the number of color bins in the color histogram.
Given a preset I-frame similarity threshold T, the following test is applied: if sim(n, n+1) < T and sim(n+1, n+2) > T, that is, the n-th I-frame is dissimilar to the (n+1)-th and the (n+1)-th is similar to the (n+2)-th, then the (n+1)-th I-frame is extracted as a key frame; otherwise the (n+1)-th I-frame is not a key frame. The window then slides forward by one frame and the similarity computation and test are repeated, for n = 1, 2, 3, 4, and so on. Once the window has traversed the whole I-frame sequence, the set of key frames that may contain a host or text information has been extracted.
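The window test can be sketched as below, assuming the color histogram of the target region has already been computed for every I-frame and is given as a list of bin counts. The function names are hypothetical, indices are 0-based (the text counts from 1), and returning 0 for an empty histogram is an assumption made to avoid division by zero.

```python
def hist_intersection(h_a, h_b):
    """sim = sum_k min(h_a[k], h_b[k]) / sum_k h_a[k], normalised by the first histogram."""
    total = sum(h_a)
    if total == 0:
        return 0.0  # assumption: an empty histogram matches nothing
    return sum(min(a, b) for a, b in zip(h_a, h_b)) / total


def extract_key_frames(histograms, threshold):
    """histograms[i] is the target-region color histogram of I-frame i.

    Returns the 0-based indices of the I-frames selected as key frames.
    """
    keys = []
    for n in range(len(histograms) - 2):          # 3-frame window: n, n+1, n+2
        sim_prev = hist_intersection(histograms[n], histograms[n + 1])
        sim_next = hist_intersection(histograms[n + 1], histograms[n + 2])
        # Frame n+1 differs from its predecessor but resembles its successor:
        # it is the first frame of a new, stable picture type.
        if sim_prev < threshold and sim_next > threshold:
            keys.append(n + 1)
    return keys
```

Requiring similarity to the following frame as well as dissimilarity to the preceding one filters out isolated transition frames, so only the first frame of each stable run is kept.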
This method cannot directly extract only the frames that contain a host or text information, but it yields a superset of them, and that superset contains far fewer frames than the total number of frames or I-frames in the video file. This greatly reduces the number of frames passed on to host frame detection and text frame detection, and hence the amount of computation.
(5) Host frame detection:
The appearance of a host frame usually marks the start of a news item, so the start time of a news item can be determined by detecting the time point at which a host frame appears.
The invention detects host frames using face detection combined with local feature point matching. The method rests on the following assumptions: (1) a news program has one or two hosts; a host appears several times in the same program, and a long time elapses between the first and last appearance; (2) the host appears in the picture facing the camera, with the upper body visible; (3) when the same host appears at different time points in the program, the upper-body posture changes only slightly; (4) within one news program the host's clothing does not change, although the background may change considerably.
Host frame detection is performed on the extracted key frame set. Face detection is used to filter the extracted key frames: only key frames that contain a face are selected, and they form a new face key frame set. After visual features are extracted from certain regions of the face key frames, a local feature point detection algorithm detects local feature points in a specific region of each face key frame. Taking one key frame as the reference, its local feature points are matched against those of the other key frames, and groups of key frames that match a sufficient number of local feature points are taken as candidate host key frame groups. Note that before local feature point matching is applied to two face key frames, color histogram similarity can first be used to check whether the two frames can be similar; if histogram matching finds them dissimilar, local feature point matching need not be performed on that pair, which reduces the work of local feature point detection and matching. This pre-check is used because color histogram matching is far cheaper to compute than local feature point detection and matching. Based on the temporal distribution of host key frames over the whole program, a group of key frames whose time span in the video exceeds a certain threshold is regarded as a candidate host key frame group; otherwise the group is judged not to be host key frames and is discarded. Finally, the candidate groups containing a single host and the groups containing two hosts are considered together to decide which frames are host key frames.
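Two of the cheaper steps in this pipeline, the histogram pre-check and the time-span filter, can be sketched as follows. Face detection and local feature point matching are outside the sketch. The function names, the histogram-intersection form of the pre-check, and the use of timestamps in seconds are assumptions; the source only says that color histogram similarity and a time-span threshold are used.

```python
def should_run_local_matching(hist_a, hist_b, hist_threshold):
    """Cheap pre-check: run local feature point matching only if the two
    face key frames already look similar by color histogram."""
    total = sum(hist_a)
    if total == 0:
        return False  # assumption: an empty histogram never passes the check
    sim = sum(min(a, b) for a, b in zip(hist_a, hist_b)) / total
    return sim >= hist_threshold


def candidate_host_groups(groups, min_span_seconds):
    """groups: lists of key frame timestamps whose frames matched each other.

    Keeps the groups whose first-to-last time span exceeds the threshold,
    reflecting the assumption that a host reappears throughout the program.
    """
    return [sorted(g) for g in groups
            if g and max(g) - min(g) > min_span_seconds]
```

An interviewee who appears in several consecutive shots would form a matching group too, but its short time span lets the filter discard it.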
(6) Word message frame detection:
Observation of a large number of news video programs shows that every news item in a program is accompanied by related text (the Word message), which describes the content of that item. Because Word messages and news items correspond one to one, the number of news items in a news program can be determined by detecting the Word messages.
Word message frame detection is carried out on the extracted key frame set. For a news program of a given type, the spatial position in the video frame of the Word message describing the news content is fixed. This prior knowledge can be exploited by marking the Word message display region in the video frame and using that region as the effective region for computing the similarity of two frames when detecting Word message frames. In other words, the similarity between two frames depends only on the marked region, and content outside it does not take part in the similarity calculation. This region is called the "Word message target area".
Word message frame templates for common types of news program are stored in advance. When detecting Word message frames, the template for the news program type determined by head music matching is selected. The similarity between the Word message target area of this template and the Word message target area of each key frame is then computed, and every key frame whose similarity exceeds a given threshold is selected as a Word message frame.
Similarity is computed by color histogram intersection. Let $H_{model}(k)$ be the color histogram of the Word message target area of the template, and $H_i(k)$ the color histogram of the Word message target area of the i-th key frame. The similarity of the template and the i-th key frame is then:

$$sim(model, i) = \frac{\sum_{k=0}^{N-1} \min\left(H_{model}(k),\, H_i(k)\right)}{\sum_{k=0}^{N-1} H_{model}(k)}$$

where $sim(model, i)$ is the similarity of the Word message frame template and the i-th key frame, and N is the number of color intervals (bins) in the color histogram.
Let T be a similarity threshold given in advance. If $sim(model, i) > T$, the i-th key frame is regarded as a Word message frame; otherwise the i-th key frame is not a Word message frame and is discarded.
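The template comparison above can be illustrated with a short sketch. It assumes the Word message target area has already been cropped from each frame and is given as a list of RGB pixels; the uniform quantisation into 4 levels per channel, the function names and the example threshold are assumptions made here, since the text does not specify how the histogram bins are defined.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component in 0..255


def color_histogram(region: Sequence[Pixel], levels: int = 4) -> List[int]:
    """Count pixels per color bin; each channel is quantised into `levels`."""
    hist = [0] * levels ** 3
    for r, g, b in region:
        qr, qg, qb = (v * levels // 256 for v in (r, g, b))
        hist[(qr * levels + qg) * levels + qb] += 1
    return hist


def similarity(h_model: Sequence[int], h_i: Sequence[int]) -> float:
    """sim(model, i): histogram intersection divided by the template mass."""
    denom = sum(h_model)
    if denom == 0:
        return 0.0
    return sum(min(m, x) for m, x in zip(h_model, h_i)) / denom


def detect_word_message_frames(
    template_hist: Sequence[int],
    frame_hists: Sequence[Sequence[int]],
    threshold: float,
) -> List[int]:
    """Return indices of key frames whose similarity exceeds the threshold T."""
    return [i for i, h in enumerate(frame_hists)
            if similarity(template_hist, h) > threshold]
```

Note that the measure is not symmetric: it is normalised by the template histogram only, exactly as in the formula, so a key frame is judged by how much of the template's color mass it reproduces.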
(7) Determining the segmentation time points by combining audio and video information:
Determining the news item segmentation time points comprises the following steps. Step 31: host frames and Word message frames are merged in chronological order into one mixed sequence M. Step 32: using the two kinds of time points in the mixed sequence M (host and Word message) together with the information in the quiet point sequence V, the time points at which the news is segmented are determined.
The preceding processing yields three time point sequences for the news program: the audio mute time points, the host appearance time points and the Word message appearance time points. Combining the information in these three sequences makes it possible to determine the number of news items in the news program and the start time of each news item within the whole file.
A news item must be accompanied by a Word message describing its content; this is the basic premise on which the news program is segmented. A detected Word message frame therefore establishes the existence of one news item. The host appearance time points and audio mute points help determine the exact start time of each news item.
Host frames and Word message frames are merged in chronological order into one sequence, called sequence M. The quiet point sequence detected in step (3) is called V. Using the host and Word message time points in sequence M together with the information in the quiet point sequence V, the news segmentation time points are determined according to the following rules:
Rule 1: One Word message frame represents one news item, and the start time point of that news item lies at or before the appearance of the Word message frame.
Rule 2: In sequence M, if the element immediately preceding the current Word message frame is a host key frame, the current Word message frame and that host frame belong to the same news item, and the host shot is the lead-in shot of that item. The quiet point in sequence V that precedes this host frame and is nearest to it is taken as the start time of the current news item.
Rule 3: In sequence M, if the element immediately preceding the current Word message frame is also a Word message frame, the two Word message frames belong to different news items. The quiet point in sequence V that precedes the current Word message frame and is nearest to it is taken as the start time of the current news item. Each news segmentation time point is determined by applying Rule 1 together with Rule 2, or Rule 1 together with Rule 3.
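Steps 31 and 32 and the three rules can be sketched as below. The sketch makes several assumptions that the text leaves open: times are plain numbers in seconds, "before" is read as "at or before", the first Word message frame with no predecessor is handled like Rule 3, and when no quiet point precedes the anchor frame the anchor time itself is used as the start. All function and variable names are illustrative.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def nearest_quiet_before(quiet_points: List[float], t: float) -> Optional[float]:
    """Return the latest quiet point at or before t, or None if there is none."""
    i = bisect_right(quiet_points, t)
    return quiet_points[i - 1] if i else None


def segment_news(
    host_times: List[float],
    word_times: List[float],
    quiet_points: List[float],
) -> List[Tuple[float, float]]:
    """Return (start_time, word_message_time) for every detected news item."""
    # Step 31: merge both kinds of frames chronologically into sequence M.
    m = sorted([(t, "host") for t in host_times] +
               [(t, "word") for t in word_times])
    v = sorted(quiet_points)
    items = []
    for idx, (t, kind) in enumerate(m):
        if kind != "word":
            continue  # Rule 1: only a Word message frame establishes an item
        prev = m[idx - 1] if idx > 0 else None
        # Rule 2: a directly preceding host frame is the lead-in shot, so the
        # start is searched before the host frame. Rule 3 (and the assumed
        # no-predecessor case): search before the Word message frame itself.
        anchor = prev[0] if prev and prev[1] == "host" else t
        start = nearest_quiet_before(v, anchor)
        items.append((start if start is not None else anchor, t))
    return items
```

For example, with host frames at 10 s and 100 s, Word message frames at 15 s, 60 s and 105 s, and quiet points at 8 s, 55 s, 58 s and 98 s, the first and third items start before their host frames (Rule 2) and the second starts at the quiet point nearest before its own Word message frame (Rule 3).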
(8) Word message recognition:
The text in a news video carries rich semantic content and describes the content of the corresponding news item. This Word message can be extracted from the video and used as part of the news cataloging result.
(9) Associating news items with Word messages:
The OCR result for a Word message frame contains not only the recognized text but also the time position at which the Word message frame appears. Using this time label, the recognized Word message can be associated with the news item it describes, giving a news cataloging result with text description information.
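One way to picture this association is the sketch below. It assumes each OCR result is a (time, text) pair and that a news item owns every Word message whose time falls at or after its own start and before the next item's start; this ownership rule and the dictionary layout are assumptions made for illustration, as the text only states that the time label is used.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple


def attach_text(
    item_starts: List[float],
    ocr_results: List[Tuple[float, str]],
) -> List[Dict]:
    """Attach each recognised text to the news item whose span contains it.

    item_starts must be sorted in ascending order.
    """
    catalog = [{"start": s, "texts": []} for s in item_starts]
    for t, text in sorted(ocr_results):
        i = bisect_right(item_starts, t) - 1  # last item starting at or before t
        if i >= 0:  # text before the first item belongs to no item
            catalog[i]["texts"].append(text)
    return catalog
```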
2. Cataloging system functional module design
Hardware and software environment of the system: the system of the present invention is developed and run on a conventional microcomputer with an Intel Pentium 4 processor and the Windows XP operating system. The development languages are C++ and Java, the development tools are VC6.0 and Eclipse, and the database is SQL Server 2000.
The structure of the news cataloging system of the present invention is shown in Figure 4. The system mainly consists of five modules, namely demolition module 1, export module 3, browse module 4, playing module 5 and correction module 6, together with news video demolition result database 2 and user 7.
The output of the demolition module is connected to the input of the news video demolition result database and outputs the demolition result obtained by fusing audio and video features;
The output of the news video demolition result database is connected to the input of the export module, which receives the demolition result obtained by fusing audio and video features. The export module outputs the news video cataloging result to XML files outside the system; these XML files can be loaded into other systems so that those systems obtain the news video cataloging result;
The browse module, the playing module and the correction module are connected in parallel between the user side and the news video demolition result database;
The browse module receives the number of the news item that the user asks to browse and receives the cataloging information of that news item from the news video demolition result database; it outputs the cataloging information of the specified news item to the user, including the segmentation time points, the headline and the content description of the news item;
The playing module receives the number of the news item that the user asks to play and receives the file path and time range of that news item from the news video demolition result database; it plays the picture and sound of that news item to the user;
The correction module receives the number of the news item that the user asks to correct and receives the existing cataloging information of that news item from the news video demolition result database; it shows this existing cataloging information to the user and outputs the corrected cataloging information of the news item to the news video demolition result database.
(1) Demolition module 1 is the core module of the system. It extracts audio data and video data from the news video stream. Head music matching and quiet point detection are applied to the audio data to obtain audio feature information; host frame detection, title bar frame detection and title text recognition are applied to the video data to obtain visual feature information. The audio and video feature information is fused according to certain rules to determine the segmentation time points of the news items. The demolition result mainly comprises the start and end time points of each news item, the headline information and so on; these results are stored in news video demolition result database 2 to support subsequent service functions.
Demolition module 1 comprises audio/video data separation unit 11 and audio/video feature fusion unit 12, with the output of unit 11 connected in series to the input of unit 12. Audio/video data separation unit 11 receives the news video stream, separates it into audio data and video data, and outputs them. Audio/video feature fusion unit 12 receives the audio data and video data, generates the demolition result from them, and outputs it. Audio/video data separation unit 11 further comprises: audio data subunit 1a, which has a head music matching part and a second head music matching part, the two parts being connected in parallel; and video data subunit 1b, which has a host frame detection part, a title bar frame detection part and a title text recognition part, the three parts being connected in parallel.
Audio/video data separation unit 11 separates the video stream into audio data and video data. The audio data is used for head music matching and quiet point detection; the video data is used for host frame detection, Word message frame detection and Word message recognition. The comprehensive analysis module fuses the audio and video information to obtain the news demolition result.
(2) Browse module 4 provides two browsing modes, text and picture. In text mode the user can quickly read the headline of each news item and grasp the general content of the news; in picture mode the user can browse the key frame pictures of a news item and gain an intuitive impression of its content, much like the news illustrations in a newspaper.
The module has two submodules, a text headline browsing subunit and a key frame image browsing subunit. The two are parallel to each other and present the news cataloging result to user 7 in different forms.
(3) Playing module 5 uses the video player built into the system to play back the news item specified by user 7, providing user 7 with the detailed content of the news report.
(4) Correction module 6 provides two functions: editing the text information of an item and editing its start and end time points. Text information editing allows user 7 to revise the automatically recognized item title and to add other related text information to the item. Start and end time point editing allows user 7 to modify the start time and end time of an item, and also to delete and add items, so that inaccurate item time points produced by automatic demolition can be corrected manually.
The module has three submodules: a news item splitting or merging subunit, a news item time point information correction subunit and a news item text information correction subunit. The three are parallel to each other and each corrects, from a different angle, problems that may arise during automatic news cataloging.
(5) Cataloging result export module 3 exports the news video cataloging result held in news video demolition result database 2 to XML files outside the system. Loading these XML files into other systems allows those systems to obtain the news video cataloging result.
The cataloging result export function saves the exported cataloging result to XML files outside the system.
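The text does not give the layout of the exported XML files. Purely as an illustration of what such an export might look like, the sketch below serialises a list of cataloged items with Python's standard `xml.etree.ElementTree` module; the element names (`newsProgram`, `newsItem`, `title`), the attribute names and the item dictionary keys are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def catalog_to_xml(program_name: str, items: List[Dict]) -> str:
    """Serialise cataloged news items to an XML string (hypothetical schema).

    Each item is a dict with 'start' and 'end' (seconds) and 'title' (text).
    """
    root = ET.Element("newsProgram", {"name": program_name})
    for number, item in enumerate(items, start=1):
        node = ET.SubElement(root, "newsItem", {
            "number": str(number),
            "start": f"{item['start']:.2f}",
            "end": f"{item['end']:.2f}",
        })
        # ElementTree escapes special characters in text automatically.
        ET.SubElement(node, "title").text = item["title"]
    return ET.tostring(root, encoding="unicode")
```

A receiving system could read the string back with `ET.fromstring` and walk the `newsItem` elements to recover the segmentation times and headlines.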
3. System interface layout
The system interface is shown in Figure 5. The left side of the interface holds the list of news video files, organized by the TV station to which each news program belongs. The upper left is the TV station directory tree and below it is the news program file list; when a TV station node is selected in the directory tree, the news video file list is updated in step to show the news program files belonging to that station. Each file node can be expanded to show the titles of the news programs the file contains. A news program node can be expanded further to show the headlines of the news items obtained by cataloging that program. The right side of the interface is the news item key frame display panel, which gives a clear visual synopsis of the news items in the form of pictures. The upper middle of the interface is the video player, which plays the news clip selected in the file list on the left or the key frame panel on the right so that the user can learn the detailed content of the news. Below the player is a panel showing information about the news item currently playing, where the user can read or modify the time information and semantic information related to the item. News cataloging and cataloging result export are invoked through menu items in the File menu.
Table 1. News cataloging experimental results
News program | Actual news items | Detected news items | Missed items | Extra detected items
News 30 minutes-1 | 18 | 18 | 0 | 0
News 30 minutes-2 | 26 | 26 | 0 | 0
News hookup-1 | 32 | 29 | 3 | 0
News hookup-2 | 40 | 40 | 0 | 0
News when international | 8 | 8 | 0 | 0
The Zhejiang news hookup | 18 | 17 | 1 | 0
Summer is looked news | 9 | 9 | 0 | 0
The Xinjiang news hookup | 17 | 13 | 4 | 0
Zun Yi news hookup | 8 | 8 | 0 | 0
Zhengzhou news | 14 | 14 | 0 | 0
Total | 190 | 182 | 8 | 0

Claims (7)

1. A news video cataloging method, characterized in that news video is cataloged automatically on the basis of the caption bars, the host and the audio mute point information appearing in a news program, the steps being as follows:
Step 1: separate the news video stream into audio and video, obtaining audio data and video data;
Step 2: perform head music matching on the audio data to determine the effective time range of the news program within the file; perform quiet point detection on the audio data within the time range of the news program to obtain the audio mute point sequence; perform key frame extraction, host picture frame detection and Word message frame detection on the video data within the time range of the news program; this yields the quiet times, the host appearance times and the Word message appearance times within the time range of the news program;
Step 3: perform comprehensive analysis on the audio mute point sequence, the host appearance times, the Word message appearance times and the rules; step 31: merge the host frames and Word message frames in chronological order into one mixed sequence M; step 32: using the host and Word message time points in the mixed sequence M together with the information in the quiet point sequence V, obtain the news item segmentation time points; at the same time, recognize the Word messages appearing in the video and extract them;
The rules are rule 1, rule 2 and rule 3, and each news item segmentation time point is determined by applying rule 1 together with rule 2, or rule 1 together with rule 3. Rule 1: one Word message frame represents one news item, and the start time point of that news item lies at or before the appearance of the Word message frame. Rule 2: in the mixed sequence M, if the element immediately preceding the current Word message frame is a host key frame, the current Word message frame and that host frame belong to the same news item, and the host shot is the lead-in shot of that item; the quiet point in the quiet point sequence V that precedes this host picture frame and is nearest to it is taken as the start time of the current news item. Rule 3: in the mixed sequence M, if the element immediately preceding the current Word message frame is also a Word message frame, the two Word message frames belong to different news items; the quiet point in the quiet point sequence V that precedes the current Word message frame and is nearest to it is taken as the start time of the current news item;
Step 4: associate the demolition result of the news program with the recognized Word messages, obtaining a news cataloging result with semantic information.
2. The news video cataloging method according to claim 1, characterized in that the processing of the video data comprises:
Step S2B1: extract the video data obtained by the audio/video data separation;
Step S2B2: extract key frames from the video data, for detecting host picture frames and Word message picture frames;
Step S2B3: detect the time points at which host picture frames appear, based on local feature matching and the temporal distribution characteristics of the host, to generate information that helps determine the start time of each news item;
Step S2B4: detect Word message picture frames in the key frame set, to determine the number of news items contained in the news program.
3. The news video cataloging method according to claim 1, characterized in that the video key frame extraction is as follows: key frames are extracted on the basis of the video I frames; a window of 3 frames slides along the I frame sequence, and the similarity of the frame difference target areas of the first and second frames in the window and the similarity of the frame difference target areas of the second and third frames are computed, denoted $sim(n, n+1)$ and $sim(n+1, n+2)$ respectively; similarity is computed by histogram intersection; letting the color histograms of the frame difference target areas of the three I frames in the window be $H_n(k)$, $H_{n+1}(k)$ and $H_{n+2}(k)$, the similarities are:

$$sim(n, n+1) = \frac{\sum_{k=0}^{N-1} \min\left(H_n(k),\, H_{n+1}(k)\right)}{\sum_{k=0}^{N-1} H_n(k)}$$

$$sim(n+1, n+2) = \frac{\sum_{k=0}^{N-1} \min\left(H_{n+1}(k),\, H_{n+2}(k)\right)}{\sum_{k=0}^{N-1} H_{n+1}(k)}$$

where N is the number of color intervals (bins) in the color histogram; according to a preset I frame similarity threshold T, the following comparison is made: if $sim(n, n+1) < T$ and $sim(n+1, n+2) > T$, that is, the n-th I frame is dissimilar to the (n+1)-th I frame and the (n+1)-th I frame is similar to the (n+2)-th I frame, then the (n+1)-th I frame is extracted as a key frame; otherwise the (n+1)-th I frame is not a key frame; the window is then slid one frame further and the similarity calculation and comparison are repeated; once the window has slid over the entire I frame sequence, the set of key frames that may contain a host or a Word message has been extracted.
4. The news video cataloging method according to claim 2, characterized in that:
the host picture frame detection is carried out on the extracted key frame set; face detection is used to filter the extracted key frames, and the key frames containing a face form a new face key frame set; after visual features have been extracted from certain regions of each face key frame, a local feature point detection algorithm is used to detect local feature points in specific regions of the face key frame; taking one key frame as the reference, its local feature points are matched against those of the other key frames, and each group of key frames that yields a sufficient number of matched local feature points is taken as a candidate host key frame group; before local feature point matching is applied to two face key frames, a color histogram similarity is computed to judge whether the two face key frames could be similar, and if histogram matching finds the two key frames dissimilar, local feature point matching is not applied to them; based on the temporal distribution of host key frames over the whole program, a group of key frames whose time span in the video exceeds a certain threshold is regarded as a candidate host key frame group, and otherwise the group is regarded as not being host key frames and is discarded; finally, the candidate key frame groups containing only one host and the key frame groups containing two hosts are considered together to decide which key frames are host key frames.
5. A news video cataloging system, characterized in that it comprises:
A demolition module, comprising an audio/video data separation unit whose output is connected to the input of an audio/video feature fusion unit; the audio/video data separation unit receives the news video stream, separates it into audio data and video data, and outputs them; the audio/video data separation unit further comprises: an audio data subunit, which has a head music matching part and a second head music matching part, the two parts being connected in parallel; and a video data subunit, which has a host frame detection part, a title bar frame detection part and a title text recognition part, the three parts being connected in parallel; the audio/video feature fusion unit receives the audio data and video data, generates the demolition result from them, and outputs it; the audio/video feature fusion unit determines the news item segmentation time points by the following steps: step 31: merge the host frames and Word message frames in chronological order into one mixed sequence M; step 32: using the host and Word message time points in the mixed sequence M together with the information in the quiet point sequence V, determine the time points at which the news is segmented; each news segmentation time point is determined by applying rule 1 together with rule 2, or rule 1 together with rule 3, the rules being: rule 1: one Word message frame represents one news item, and the start time point of that news item lies at or before the appearance of the Word message frame; rule 2: in the mixed sequence M, if the element immediately preceding the current Word message frame is a host key frame, the current Word message frame and that host frame belong to the same news item, and the host shot is the lead-in shot of that item; the quiet point in the quiet point sequence V that precedes this host picture frame and is nearest to it is taken as the start time of the current news item; rule 3: in the mixed sequence M, if the element immediately preceding the current Word message frame is also a Word message frame, the two Word message frames belong to different news items; the quiet point in the quiet point sequence V that precedes the current Word message frame and is nearest to it is taken as the start time of the current news item;
The output of the demolition module is connected to the input of a news video demolition result database and outputs the demolition result obtained by fusing audio and video features;
The output of the news video demolition result database is connected to the input of an export module, which receives the demolition result obtained by fusing audio and video features; the export module outputs the news video cataloging result to XML files outside the system, these XML files being loadable into other systems so that those systems obtain the news video cataloging result;
A browse module, a playing module and a correction module are connected in parallel between the user side and the news video demolition result database;
The browse module receives the number of the news item that the user asks to browse and receives the cataloging information of that news item from the news video demolition result database; it outputs the cataloging information of the specified news item to the user, including the segmentation time points, the headline and the content description of the news item;
The playing module receives the number of the news item that the user asks to play and receives the file path and time range of that news item from the news video demolition result database; it plays the picture and sound of that news item to the user;
The correction module receives the number of the news item that the user asks to correct and receives the existing cataloging information of that news item from the news video demolition result database; it shows this existing cataloging information to the user and outputs the corrected cataloging information of the news item to the news video demolition result database.
6. The news video cataloging system according to claim 5, characterized in that the browse module comprises a text headline browsing subunit and a key frame image browsing subunit connected in parallel, which present the news cataloging result to the user in different forms.
7. The news video cataloging system according to claim 5, characterized in that the correction module comprises a news item splitting or merging subunit, a news item time point information correction subunit and a news item text information correction subunit connected in parallel, each of which corrects, from a different angle, problems that may arise during automatic news cataloging.
CN2008101157870A 2008-06-27 2008-06-27 Method and system for cataloging news video Expired - Fee Related CN101616264B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101157870A CN101616264B (en) 2008-06-27 2008-06-27 Method and system for cataloging news video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2008101157870A CN101616264B (en) 2008-06-27 2008-06-27 Method and system for cataloging news video

Publications (2)

Publication Number Publication Date
CN101616264A CN101616264A (en) 2009-12-30
CN101616264B true CN101616264B (en) 2011-03-30

Family

ID=41495625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101157870A Expired - Fee Related CN101616264B (en) 2008-06-27 2008-06-27 Method and system for cataloging news video

Country Status (1)

Country Link
CN (1) CN101616264B (en)

Families Citing this family (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9443147B2 (en) * 2010-04-26 2016-09-13 Microsoft Technology Licensing, Llc Enriching online videos by content detection, searching, and information aggregation
CN101968819B (en) * 2010-11-05 2012-05-30 中国传媒大学 Audio/video intelligent catalog information acquisition method facing to wide area network
CN102323948A (en) * 2011-09-07 2012-01-18 上海大学 Automatic detection method for title sequence and tail leader of TV play video
CN103118222B (en) * 2011-09-29 2015-10-28 成都索贝数码科技股份有限公司 Net platform collecting and editing system
CN102724598A (en) * 2011-12-05 2012-10-10 新奥特(北京)视频技术有限公司 Method for splitting news items
WO2013097101A1 (en) * 2011-12-28 2013-07-04 华为技术有限公司 Method and device for analysing video file
CN102497590B (en) * 2011-12-30 2014-04-23 百视通网络电视技术发展有限责任公司 IPTV-based method for automatically generating catalogued picture of strip-splitting video and system thereof
CN102595206B (en) * 2012-02-24 2014-07-02 央视国际网络有限公司 Data synchronization method and device based on sport event video
CN102750349B (en) * 2012-06-08 2014-10-08 华南理工大学 Video browsing method based on video semantic modeling
US9251421B2 (en) * 2012-09-13 2016-02-02 General Electric Company System and method for generating semantic annotations
CN103065511B (en) * 2012-12-29 2015-04-01 福州新锐同创电子科技有限公司 Implementation method of teaching plan editor
CN103079041B (en) * 2013-01-25 2016-01-27 深圳先进技术研究院 The method of news video automatic strip-cutting device and news video automatic strip
CN104519401B (en) * 2013-09-30 2018-04-17 贺锦伟 Video segmentation point preparation method and equipment
CN104660666A (en) * 2013-10-08 2015-05-27 深圳市王菱科技开发有限公司 Video electronic product supporting seamless connection of interaction association system and WIFI
CN103533459B (en) * 2013-10-09 2017-05-03 北京中科模识科技有限公司 Method and system for splitting news video entry
CN103546667B (en) * 2013-10-24 2016-08-17 中国科学院自动化研究所 A kind of automatic news demolition method towards magnanimity broadcast television supervision
CN103646094B (en) * 2013-12-18 2017-05-31 上海紫竹数字创意港有限公司 Realize that audiovisual class product content summary automatically extracts the system and method for generation
CN103731675B (en) * 2013-12-30 2017-02-01 广州中大数字家庭工程技术研究中心有限公司 Intelligent news broadcast system based on digital family interactive service middleware
CN104185080B (en) * 2014-03-24 2018-05-08 无锡天脉聚源传媒科技有限公司 A kind of generation method and device of digital television program list
CN103915106B (en) * 2014-03-31 2017-01-11 宇龙计算机通信科技(深圳)有限公司 Title generation method and system
CN103870598B (en) * 2014-04-02 2017-02-08 北京航空航天大学 Unmanned aerial vehicle surveillance video information extracting and layered cataloguing method
CN103905742A (en) * 2014-04-10 2014-07-02 北京数码视讯科技股份有限公司 Video file segmentation method and device
CN104410867A (en) * 2014-11-17 2015-03-11 北京京东尚科信息技术有限公司 Improved video shot detection method
CN104780388B (en) * 2015-03-31 2018-03-09 北京奇艺世纪科技有限公司 The cutting method and device of a kind of video data
CN105608423A (en) * 2015-12-17 2016-05-25 天脉聚源(北京)科技有限公司 Video matching method and device
CN106066862B (en) * 2016-05-25 2019-05-31 东软集团股份有限公司 Media event display methods and device
CN108228658B (en) * 2016-12-22 2022-06-03 阿里巴巴集团控股有限公司 Method and device for automatically generating dubbing characters and electronic equipment
CN108319888B (en) * 2017-01-17 2023-04-07 阿里巴巴集团控股有限公司 Video type identification method and device and computer terminal
CN108024146A (en) * 2017-12-14 2018-05-11 深圳Tcl数字技术有限公司 News interface automatic setting method, smart television and computer-readable recording medium
CN108093314B (en) * 2017-12-19 2020-09-01 北京奇艺世纪科技有限公司 Video news splitting method and device
CN108388872B (en) * 2018-02-28 2021-10-22 北京奇艺世纪科技有限公司 Method and device for identifying news headlines based on font colors
CN108551584B (en) * 2018-05-17 2021-03-16 北京奇艺世纪科技有限公司 News segmentation method and device
CN108810568B (en) * 2018-05-17 2020-11-27 北京奇艺世纪科技有限公司 News segmentation method and device
CN108734166B (en) * 2018-05-23 2022-03-11 深圳市茁壮网络股份有限公司 News title detection method and device
CN108810569B (en) * 2018-05-23 2021-01-22 北京奇艺世纪科技有限公司 Video news segmentation method and device
CN108710860B (en) * 2018-05-23 2021-01-12 北京奇艺世纪科技有限公司 Video news segmentation method and device
CN109005451B (en) * 2018-06-29 2021-07-30 杭州星犀科技有限公司 Video strip splitting method based on deep learning
CN112863547B (en) * 2018-10-23 2022-11-29 腾讯科技(深圳)有限公司 Virtual resource transfer processing method, device, storage medium and computer equipment
CN109348289B (en) * 2018-11-15 2021-08-24 北京奇艺世纪科技有限公司 News program title extraction method and device
CN109472243B (en) * 2018-11-15 2021-08-17 北京奇艺世纪科技有限公司 News program segmentation method and device
CN109350084A (en) * 2018-12-04 2019-02-19 安徽阳光心健科技发展有限公司 A kind of psychological test device and its test method
CN109640193B (en) * 2018-12-07 2021-02-26 成都东方盛行电子有限责任公司 News strip splitting method based on scene detection
CN111314775B (en) 2018-12-12 2021-09-07 华为终端有限公司 Video splitting method and electronic equipment
CN109743624B (en) * 2018-12-14 2021-08-17 深圳壹账通智能科技有限公司 Video cutting method and device, computer equipment and storage medium
CN110532983A (en) * 2019-09-03 2019-12-03 北京字节跳动网络技术有限公司 Video processing method, device, medium and equipment
CN112784106B (en) * 2019-11-04 2024-05-14 阿里巴巴集团控股有限公司 Content data processing method, report data processing method, computer device, and storage medium
CN111324753B (en) * 2020-01-22 2021-09-03 天窗智库文化传播(苏州)有限公司 Media information publishing management method and system
CN111556254B (en) * 2020-04-10 2021-04-02 早安科技(广州)有限公司 Method, system, medium and intelligent device for video cutting by using video content
CN111242110B (en) * 2020-04-28 2020-08-14 成都索贝数码科技股份有限公司 Training method of self-adaptive conditional random field algorithm for automatically breaking news items
CN111709324A (en) * 2020-05-29 2020-09-25 中山大学 News video strip splitting method based on space-time consistency
CN111901696B (en) * 2020-07-31 2022-04-15 杭州当虹科技股份有限公司 Real-time recording and strip-splitting system based on HLS technology using a preloading mode
CN111813998B (en) * 2020-09-10 2020-12-11 北京易真学思教育科技有限公司 Video data processing method, device, equipment and storage medium
CN112258513A (en) * 2020-10-23 2021-01-22 岭东核电有限公司 Nuclear power test video segmentation method and device, computer equipment and storage medium
CN113542820B (en) * 2021-06-30 2023-12-22 北京中科模识科技有限公司 Video cataloging method, system, electronic equipment and storage medium
CN113992944A (en) * 2021-10-28 2022-01-28 北京中科闻歌科技股份有限公司 Video cataloging method, device, equipment, system and medium
CN114051154A (en) * 2021-11-05 2022-02-15 新华智云科技有限公司 News video strip splitting method and system
CN116939291B (en) * 2023-09-13 2023-11-28 浙江新华移动传媒股份有限公司 Video quick stripping method and related device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000045291A1 (en) * 1999-01-28 2000-08-03 Koninklijke Philips Electronics N.V. System and method for analyzing video content using detected text in video frames
CN1658226A (en) * 2004-02-20 2005-08-24 三星电子株式会社 Method and apparatus for detecting anchorperson shot
WO2006103633A1 (en) * 2005-03-31 2006-10-05 Koninklijke Philips Electronics, N.V. Synthesis of composite news stories
CN101031035A (en) * 2006-03-03 2007-09-05 广州市纽帝亚资讯科技有限公司 Method for cutting news video unit automatically based on video sequence analysis

Also Published As

Publication number Publication date
CN101616264A (en) 2009-12-30

Similar Documents

Publication Publication Date Title
CN101616264B (en) Method and system for cataloging news video
CN102342124B (en) Method and apparatus for providing information related to broadcast programs
RU2494566C2 (en) Display control device and method
CN103593363B (en) Method for building a video content index structure, video retrieval method and device
CN110012349B (en) End-to-end news program structuring method
US7876381B2 (en) Telop collecting apparatus and telop collecting method
CN106021496A (en) Video search method and video search device
CN102547139A (en) Method for splitting news video program, and method and system for cataloging news videos
KR20000054561A (en) A network-based video data retrieving system using a video indexing formula and operating method thereof
Jiang et al. Automatic consumer video summarization by audio and visual analysis
KR101550886B1 (en) Apparatus and method for generating additional information of moving picture contents
JP2002533841A (en) Personal video classification and search system
US20120242897A1 (en) Method and system for preprocessing the region of video containing text
CN111432140B (en) Method for splitting television news into strips by using artificial neural network
US7349477B2 (en) Audio-assisted video segmentation and summarization
CN112019871B (en) Live E-commerce content intelligent management platform based on big data
CN112291574B (en) Large-scale sports event content management system based on artificial intelligence technology
Jindal et al. Efficient and language independent news story segmentation for telecast news videos
JP4270118B2 (en) Semantic label assigning method, apparatus and program for video scene
Haloi et al. Unsupervised story segmentation and indexing of broadcast news video
Haloi et al. Unsupervised broadcast news video shot segmentation and classification
JP4906552B2 (en) Meta information adding apparatus and meta information adding program
JP2007060606A (en) Computer program comprised of automatic video structure extraction/provision scheme
Tapu et al. TV news retrieval based on story segmentation and concept association
Bechet et al. Detecting person presence in tv shows with linguistic and structural features

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20110330

Termination date: 20180627
