Background art
The purpose of automatically structuring and precisely searching audio/video content is to help users find the audio/video content they most want, more quickly and accurately, among the massive volume of audio/video content on the Internet, thereby saving the time users spend obtaining relevant and accurate audio/video content and lowering the cost of obtaining it.
With the rapid development of Internet technology and Internet services, the types of data on the Internet keep growing rapidly; besides text and pictures, there is also a large amount of audio and video. Among these data types, text and pictures have by now all become structured data, so users can quickly and accurately find the content they need most. The massive audio/video content in Internet data, however, has not yet been converted into structured data on a large scale. How to perform fully automatic content structuring on massive audio/video quickly and effectively, and how to search audio/video content precisely, have therefore become problems to be solved.
The audio/video search method in general use at present searches on the words in the manually edited title, synopsis, or tags of the audio/video. Its drawbacks are that the searchable words are limited and are all added by manual editing after the fact, so the objectivity and accuracy of the search results are low; moreover, this kind of search cannot precisely locate the key content at a particular second of the audio/video.
Another current audio/video search method extracts certain key audio tracks or key frames from the audio/video and uses the static information in those key tracks or key frames as features that are matched and screened, track by track or frame by frame, against the audio/video to be searched. Its drawback is that the key tracks or key frames must be repeatedly matched, screened, and searched in chronological order, so the amount of computation during a search is very large; as the audio/video library to be searched keeps growing, the search efficiency of this method declines exponentially and searches take too long.
Summary of the invention
To solve the above problems and overcome the problems in the related art, examples of the present invention disclose a fully automatic audio/video structuring and precise search method, which rapidly structures the massive audio/video content in Internet data on a large scale, helps users improve the accuracy of audio/video content searches, and reduces both the duration of such searches and the cost of obtaining search results.
The fully automatic audio/video structuring and precise search method disclosed in the examples of the present invention comprises two aspects: a fully automatic data structuring method for audio/video content, and a precise search method for the structured audio/video.
According to a first aspect of the disclosed examples, a fully automatic data structuring method for audio/video content is provided. The process is as follows.
The system automatically extracts, in batches, the audio/video to be structured from the Internet or a local area network (LAN), and records the Internet or LAN address of each extracted audio/video to be structured.
Using audio analysis techniques, the system automatically extracts in batches the complete audio track corresponding to each extracted audio/video to be structured and compresses it into an audio signal of no less than 16 bits for later use.
The system automatically cuts each extracted track, compressed into an audio signal of no less than 16 bits and held for later use, logically into multiple short tracks whose lengths are measured in seconds.
The system automatically labels the logically cut short tracks, in sequence, with millisecond-level start and end time codes.
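The logical cutting and time-code labeling steps can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `Segment` and `cut_logically` and the 5-second default segment length are hypothetical and do not appear in the disclosure, which only requires short tracks measured in seconds and time codes at millisecond precision. "Logical" cutting is modeled here as computing boundaries without copying any audio data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """One logically cut short track (hypothetical structure)."""
    index: int      # sequential position within the complete track
    start_ms: int   # start time code, in milliseconds
    end_ms: int     # end time code, in milliseconds


def cut_logically(total_ms: int, segment_ms: int = 5000) -> List[Segment]:
    """Split a track of total_ms milliseconds into consecutive segments.

    Only boundaries are computed; the audio itself is not copied.
    The 5000 ms default is an assumed value, not taken from the source.
    """
    if total_ms < 0 or segment_ms <= 0:
        raise ValueError("total_ms must be >= 0 and segment_ms must be > 0")
    segments = []
    for index, start in enumerate(range(0, total_ms, segment_ms)):
        # The last segment may be shorter than segment_ms.
        end = min(start + segment_ms, total_ms)
        segments.append(Segment(index, start, end))
    return segments
```

For example, `cut_logically(12_500)` yields three segments covering 0 to 5000 ms, 5000 to 10000 ms, and 10000 to 12500 ms.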
The system automatically submits the logically cut short tracks, labeled in sequence with millisecond-level start and end time codes, to multiple speech recognition servers simultaneously in batch multithreaded mode, and uses speech recognition technology to complete the fully automatic conversion of sound into text characters.
The system automatically retrieves the converted text fragment corresponding to each short track whose sound-to-text conversion has completed, and labels every character in all of the converted text fragments, in sequence, with its corresponding millisecond-level start and end time codes.
The system automatically recombines, in sequence, all of the characters and text fragments labeled with millisecond-level start and end time codes into a complete text, in which every character has its own corresponding millisecond-level start and end time codes.
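The multithreaded submission, per-character labeling, and reassembly steps might be sketched as follows. This is a hedged illustration, not the disclosed implementation: `recognize` stands in for a call to a speech recognition server and is supplied by the caller, and `spread_timecodes` assumes that characters are spaced evenly across their segment, which the disclosure does not state (a real recognizer may return its own per-character timings).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

TimedChar = Tuple[str, int, int]  # (character, start_ms, end_ms)


def spread_timecodes(text: str, start_ms: int, end_ms: int) -> List[TimedChar]:
    """Assign each character an equal slice of the segment (assumption)."""
    n = len(text)
    span = end_ms - start_ms
    return [
        (ch, start_ms + span * i // n, start_ms + span * (i + 1) // n)
        for i, ch in enumerate(text)
    ]


def transcribe_all(
    segments: List[Tuple[int, int]],
    recognize: Callable[[int, int], str],
    workers: int = 4,
) -> List[TimedChar]:
    """Recognize all segments concurrently and rebuild the full timed text.

    `recognize(start_ms, end_ms)` is a hypothetical stand-in for a request
    to a speech recognition server.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so fragments
        # come back already in sequence.
        texts = list(pool.map(lambda seg: recognize(*seg), segments))
    timed: List[TimedChar] = []
    for (start_ms, end_ms), text in zip(segments, texts):
        timed.extend(spread_timecodes(text, start_ms, end_ms))
    return timed
```

Because the fragments are returned in submission order, joining the characters of the result reproduces the complete text, and each character keeps its own start and end time codes.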
The system automatically and synchronously establishes a complete, unique mapping between the complete text labeled with millisecond-level start and end time codes, its corresponding complete track, and the audio/video to be structured; that is, each sound in the complete track of the audio/video to be structured has a unique corresponding text character labeled with millisecond-level start and end time codes.
The system automatically enters the Internet or LAN address of the audio/video to be structured, its corresponding complete track, and its corresponding unique complete text labeled with millisecond-level start and end time codes, as character strings, into the structured audio/video index database.
At this point, the fully automatic data structuring process for the audio/video content is complete.
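As a rough illustration of the index database entry, the sketch below stores one record per audio/video resource in SQLite. The schema, the table name `av_index`, the helper names, and the choice to serialize time codes as JSON are all assumptions made for this example; the disclosure only states that the address, the complete track, and the time-coded complete text are entered as character strings.

```python
import json
import sqlite3
from typing import List, Optional, Tuple

TimedChar = Tuple[str, int, int]  # (character, start_ms, end_ms)


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the hypothetical index database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS av_index ("
        " address TEXT PRIMARY KEY,"   # Internet or LAN address
        " track_ref TEXT NOT NULL,"    # reference to the complete track
        " full_text TEXT NOT NULL,"    # complete recognized text
        " timecodes TEXT NOT NULL)"    # JSON list of [start_ms, end_ms]
    )
    return conn


def add_entry(conn: sqlite3.Connection, address: str, track_ref: str,
              timed_chars: List[TimedChar]) -> None:
    """Store the text and its per-character time codes as strings."""
    full_text = "".join(ch for ch, _, _ in timed_chars)
    timecodes = json.dumps([[s, e] for _, s, e in timed_chars])
    with conn:  # commits on success
        conn.execute("INSERT OR REPLACE INTO av_index VALUES (?, ?, ?, ?)",
                     (address, track_ref, full_text, timecodes))


def locate(conn: sqlite3.Connection, address: str,
           phrase: str) -> Optional[Tuple[int, int]]:
    """Return (start_ms, end_ms) of the first occurrence of phrase."""
    if not phrase:
        return None
    row = conn.execute(
        "SELECT full_text, timecodes FROM av_index WHERE address = ?",
        (address,)).fetchone()
    if row is None:
        return None
    pos = row[0].find(phrase)
    if pos < 0:
        return None
    codes = json.loads(row[1])
    return codes[pos][0], codes[pos + len(phrase) - 1][1]
```

Keeping one time-code pair per character is what lets a text match be mapped back to an exact position in the track, as `locate` does.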
According to a second aspect of the disclosed examples, a precise search method for the structured audio/video is provided. The process is as follows.
The system receives a precise video search request initiated by a user; the search request carries at least video content keywords or descriptive text expressing the user's own idea of the video.
In full-text retrieval mode, the system automatically extracts from the structured audio/video index database described in the first aspect of the disclosed examples the multiple character strings that match the user's search request, uses a clustering algorithm to determine each audio/video resource to be presented in the search results, and determines a string matching score for each audio/video resource to be presented.
In contextual semantic analysis mode, the system automatically extracts from the structured audio/video index database described in the first aspect of the disclosed examples the multiple character strings that are semantically similar to the user's search request, uses a clustering algorithm to determine each audio/video resource to be presented in the search results, and determines a semantic matching score for each audio/video resource to be presented.
The system automatically calculates the final score of each audio/video resource to be presented using the formula: final score = string matching score + semantic matching score.
The system returns the final search results to the user as a list sorted in descending order of the final score of each audio/video resource to be presented.
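The scoring and ranking steps can be sketched as below. The additive formula and the descending order come from the text above; everything else is an assumption for illustration. In particular, `string_match_scores` uses a plain occurrence count as the string matching score, the semantic scores are taken as given from some separate semantic-analysis component, and the clustering step is reduced to grouping by resource address, since the disclosure does not specify either the scoring function or the clustering algorithm.

```python
from typing import Dict, List, Tuple


def string_match_scores(index: Dict[str, str], query: str) -> Dict[str, float]:
    """Score each resource by how often the query occurs in its text.

    `index` maps a resource address to its complete recognized text.
    Occurrence counting is an assumed scoring rule.
    """
    if not query:
        return {}
    scores: Dict[str, float] = {}
    for address, text in index.items():
        hits = text.count(query)  # non-overlapping occurrences
        if hits:
            scores[address] = float(hits)
    return scores


def rank_results(string_scores: Dict[str, float],
                 semantic_scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Final score = string matching score + semantic matching score."""
    resources = set(string_scores) | set(semantic_scores)
    totals = {
        r: string_scores.get(r, 0.0) + semantic_scores.get(r, 0.0)
        for r in resources
    }
    # Descending by final score; ties are broken by address for stability.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))
```

A resource that appears in only one of the two score sets is still ranked, with the missing score treated as zero; this too is a design choice of the sketch rather than something the source specifies.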
Fig. 1 of the accompanying drawings is a flowchart of the implementation of the fully automatic audio/video structuring and precise search method in an embodiment of the present invention.