CN106055570A - Video retrieval device based on audio data and video retrieval method for same - Google Patents

Video retrieval device based on audio data and video retrieval method for same

Info

Publication number
CN106055570A
Authority
CN
China
Prior art keywords
audio
module
voice data
video
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610339063.9A
Other languages
Chinese (zh)
Inventor
高万林
李佳璇
冯慧
张莉
于丽娜
宋越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University
Priority to CN201610339063.9A
Publication of CN106055570A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features

Abstract

The invention discloses a video retrieval device based on audio data and a video retrieval method for the same. The device comprises a video database module, a first audio-video separation module, an audio database module, an audio-video data receiving module, a second audio-video separation module, an audio data matching module and a video retrieval display module, wherein the video database module is used to store video data; the first audio-video separation module is used to separate the audio data from the video data in the video database module; the audio database module is used to store the audio data obtained by the first audio-video separation module; the audio-video data receiving module is used to receive audio or video data input by a user; the second audio-video separation module is used to separate the audio data from received video data after the audio-video data receiving module receives the video data; the audio data matching module is used to match the audio data input by the user, or the audio data obtained by the second audio-video separation module, against the audio data in the audio database module, so that one or more pieces of target audio data are obtained; and the video retrieval display module is used to display the target video data corresponding to the target audio data to the user.

Description

Video retrieval device based on audio data and video retrieval method for same
Technical Field
The present invention relates to the field of multimedia technology, and in particular to a video retrieval device based on audio data and a video retrieval method for the same.
Background
In the era of big data, video data is growing rapidly, in great variety and enormous quantity. How to retrieve video in real time, efficiently, and accurately is one of the problems that today's information society urgently needs to solve. People are no longer satisfied with obtaining video content through its metadata (such as the video title or author); they increasingly want to use a short video clip of unknown origin to quickly and intelligently obtain complete information about the video it came from. Content-based video retrieval has therefore been a research hotspot in recent years. Video is a composite kind of data that carries many types of information, such as images, text, and sound, so current content-based video retrieval usually combines several information modalities. Image retrieval is typically the primary retrieval method, audio information usually serves only as auxiliary information for refining the results, and little research starts from audio alone. On the other hand, content-based audio retrieval aims to retrieve the complete information of an audio item from certain features of the audio content itself; content-based music retrieval, for example, has been implemented in many apps. However, there is as yet no reasonably complete research on systems that extend this "listen to a song and identify it" function to video.
Summary of the Invention
In view of the above problems, the present invention proposes a video retrieval device based on audio data, and a video retrieval method for the same, that overcome the above problems or at least partially solve them.
To this end, in a first aspect, the present invention proposes a video retrieval device based on audio data, comprising:
a video database module, used to store video data and to receive video data input by a user and/or an administrator for updating the video database;
a first audio-video separation module, used to separate the audio data from the video data stored in the video database module;
an audio database module, used to store the audio data separated by the first audio-video separation module;
an audio-video data receiving module, used to receive audio data or video data input by a user;
a second audio-video separation module, used to separate, after the audio-video data receiving module receives video data, the audio data from the video data received by the audio-video data receiving module;
an audio data matching module, used to match the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data, the target audio data being audio data stored in the audio database module that matches the audio data input by the user;
a video retrieval display module, used to display to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module.
Optionally, the first audio-video separation module comprises:
a separation submodule, used to separate the audio data from the video data stored in the video database module;
a labeling submodule, used to add a label to the audio data separated by the separation submodule, the label indicating the correspondence between audio data and video data;
correspondingly, the audio database module is used to store the labeled audio data.
Optionally, the device further comprises:
a first audio fingerprint extraction module, used to extract audio fingerprints from the audio data stored in the audio database module according to a preset audio fingerprint extraction rule;
a fingerprint database module, used to store the audio fingerprints extracted by the first audio fingerprint extraction module;
an index database module, used to store the index relationship between the audio fingerprints extracted by the first audio fingerprint extraction module and the audio data;
a first audio classification module, used to classify the audio data stored in the audio database module based on the audio fingerprints stored in the fingerprint database module.
Optionally, the device further comprises:
a second audio fingerprint extraction module, used to extract, according to the preset audio fingerprint extraction rule, audio fingerprints from the user-input audio data received by the audio-video data receiving module or from the audio data separated by the second audio-video separation module;
a second audio classification module, used to classify the audio data input by the user, or the audio data separated by the second audio-video separation module, based on the audio fingerprints extracted by the second audio fingerprint extraction module.
Optionally, the audio data matching module comprises:
a to-be-retrieved audio data determining subunit, used to determine each piece of audio data to be retrieved from the audio data stored in the audio database module, based on the category of the audio data obtained by the second audio classification module and the categories of the stored audio data obtained by the first audio classification module, the category of each piece of audio data to be retrieved being the same as the category of the audio data obtained by the second audio classification module;
a to-be-retrieved audio fingerprint determining subunit, used to determine the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
an audio fingerprint matching subunit, used to match the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprint of each piece of audio data to be retrieved, as determined by the to-be-retrieved audio fingerprint determining subunit, to obtain one or more pieces of target audio data.
In a second aspect, the present invention also proposes a video retrieval method based on the device of the first aspect, comprising:
the audio-video data receiving module receives audio data or video data input by a user;
after the audio-video data receiving module receives video data, the second audio-video separation module separates the audio data from the video data received by the audio-video data receiving module;
the audio data matching module matches the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data, the target audio data being audio data stored in the audio database module that matches the audio data input by the user;
the video retrieval display module displays to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module; the audio data in the audio database module is obtained by the first audio-video separation module separating the video data in the video database module.
Optionally, after the audio-video data receiving module receives the audio data or video data input by the user, the method further comprises:
the second audio fingerprint extraction module extracts, according to a preset audio fingerprint extraction rule, audio fingerprints from the user-input audio data received by the audio-video data receiving module or from the audio data separated by the second audio-video separation module;
the second audio classification module classifies the audio data input by the user, or the audio data separated by the second audio-video separation module, based on the audio fingerprints extracted by the second audio fingerprint extraction module.
Optionally, the audio data matching module matching the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data comprises:
the audio data matching module determines each piece of audio data to be retrieved from the audio data stored in the audio database module, based on the category of the audio data obtained by the second audio classification module and the categories of the stored audio data obtained by the first audio classification module, the category of each piece of audio data to be retrieved being the same as the category of the audio data obtained by the second audio classification module;
the audio data matching module determines the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
the audio data matching module matches the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprint determined for each piece of audio data to be retrieved, to obtain one or more pieces of target audio data.
Compared with the prior art, the video retrieval device based on audio data and the video retrieval method proposed by the present invention retrieve, from the audio in a short video clip that interests the user, the complete videos that contain similar audio content, overcoming the shortcoming that existing video retrieval schemes do not retrieve on the basis of the audio data in a video alone.
Brief Description of the Drawings
Fig. 1 is a structural diagram of a video retrieval device based on audio data provided by the first embodiment of the present invention;
Fig. 2 is a flowchart of the video retrieval method of a video retrieval device based on audio data provided by the second embodiment of the present invention.
Detailed Description
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention.
It should be noted that, in this document, "first" and "second" are used only to distinguish items that share the same name, and do not imply any relationship or order between them.
As shown in Fig. 1, this embodiment discloses a video retrieval device based on audio data, which may include the following modules: a video database module 11, a first audio-video separation module 12, an audio database module 13, an audio-video data receiving module 14, a second audio-video separation module 15, an audio data matching module 16, and a video retrieval display module 17. Each module is described in detail below:
The video database module 11 is used to store video data and to receive video data input by a user and/or an administrator for updating the video database. In this embodiment, the video data stored in the video database module 11 can be updated, and either a user or an administrator may update it. In a specific application, the video database module 11 may be implemented by combining storage hardware, such as a hard disk, with the associated database software.
The first audio-video separation module 12 is used to separate the audio data from the video data stored in the video database module 11. In this embodiment, the first audio-video separation module 12 separates out the audio data of the video data stored in the video database module 11 so that video retrieval can be performed based on audio data. In a specific application, the first audio-video separation module 12 may be implemented by processor hardware such as a single-chip microcomputer, a DSP, or an ARM processor.
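As a rough illustration of the separation step, the sketch below builds a command line for the ffmpeg tool. Using ffmpeg is an assumption made here: the patent names only processor hardware and does not specify any software. The function names `build_extract_command` and `separate_audio`, and the choice of 16 kHz mono WAV output, are likewise hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_extract_command(video_path: str, audio_path: str, sample_rate: int = 16000) -> List[str]:
    """Build an ffmpeg command that drops the video stream and keeps mono PCM audio."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", video_path,         # input video file
        "-vn",                    # discard the video stream
        "-acodec", "pcm_s16le",   # encode audio as 16-bit little-endian PCM
        "-ar", str(sample_rate),  # resample to the requested rate
        "-ac", "1",               # downmix to a single channel
        audio_path,
    ]


def separate_audio(video_path: str, out_dir: str) -> Path:
    """Run ffmpeg (assumed to be installed) and return the extracted WAV path."""
    audio_path = Path(out_dir) / (Path(video_path).stem + ".wav")
    subprocess.run(build_extract_command(video_path, str(audio_path)), check=True)
    return audio_path
```

Keeping the command construction separate from the call that runs it makes the sketch easy to inspect without ffmpeg installed.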
The audio database module 13 is used to store the audio data separated by the first audio-video separation module 12. In this embodiment, the audio database module 13 may be implemented by combining storage hardware, such as a hard disk, with the associated database software.
The audio-video data receiving module 14 is used to receive audio data or video data input by a user. In this embodiment, the audio-video data receiving module 14 may consist of a microphone, a denoising device, a USB interface, and a display. The display provides the user operation interface, in which the user can choose to play a video clip directly or to copy the video clip to the terminal that carries the video retrieval device. The operation interface may also be used to provide auxiliary information about the data being queried, including whether the audio type is unique, the main audio type, and whether this is the first query.
The second audio-video separation module 15 is used to separate, after the audio-video data receiving module 14 receives video data, the audio data from that video data. In a specific application, if the user inputs audio data through the microphone, the denoising device denoises the audio data and then transfers it to the audio data matching module 16; if the user inputs video data through USB, the second audio-video separation module 15 separates out the audio data in the video data and transfers it to the audio data matching module 16.
The audio data matching module 16 is used to perform audio similarity matching between the audio data input by the user, or the audio data separated by the second audio-video separation module 15, and the audio data stored in the audio database module 13, obtaining one or more pieces of target audio data; the target audio data is audio data stored in the audio database module 13 that matches the audio data input by the user. In this embodiment, audio data in the audio database module 13 whose audio similarity exceeds a preset audio similarity threshold may be taken as target audio data.
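The threshold rule described above can be sketched as follows. The patent does not say how audio similarity is computed, so cosine similarity over feature vectors is an assumption chosen only for illustration; the names `cosine_similarity` and `find_targets` and the default threshold of 0.9 are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_targets(
    query: Sequence[float],
    audio_db: Dict[str, Sequence[float]],
    threshold: float = 0.9,
) -> List[Tuple[str, float]]:
    """Return (audio_id, score) for every stored item scoring above the threshold."""
    hits = []
    for audio_id, features in audio_db.items():
        score = cosine_similarity(query, features)
        if score > threshold:  # strictly greater, as in "exceeds a preset threshold"
            hits.append((audio_id, score))
    return hits
```

Any other similarity measure could be substituted without changing the thresholding logic.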
The video retrieval display module 17 is used to display to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module 11. In this embodiment, if there are multiple pieces of target video data, the video retrieval display module 17 may sort them in descending order of audio similarity and display the sorted target video data to the user. To make the retrieval results more useful, only the several top-ranked videos may be displayed, for example the top 3. The user can then choose among the displayed video data.
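A minimal sketch of this ranking step is shown below. The function name `select_videos_to_display`, the `(audio_id, score)` pair format, and the `audio_to_video` mapping are assumptions introduced for illustration; only the descending sort and the "top 3" cutoff come from the text.

```python
from typing import Dict, List, Tuple


def select_videos_to_display(
    matches: List[Tuple[str, float]],
    audio_to_video: Dict[str, str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Sort matches by similarity (highest first) and keep the top_k videos."""
    ranked = sorted(matches, key=lambda match: match[1], reverse=True)
    # Map each retained audio item to the video it was separated from.
    return [(audio_to_video[audio_id], score) for audio_id, score in ranked[:top_k]]
```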
It can be seen that the video retrieval device based on audio data disclosed in this embodiment matches the audio data corresponding to a video clip of interest input by the user against the audio data stored in the audio database module, thereby retrieving the complete video and meeting the user's need to find the complete video that contains a clip of interest.
The video retrieval device based on audio data disclosed in this embodiment retrieves, from the audio in a short video clip that interests the user, the complete videos that contain similar audio content, overcoming the shortcoming that existing video retrieval schemes do not retrieve on the basis of the audio data in a video alone.
In a specific example, the first audio-video separation module 12 comprises:
a separation submodule, used to separate the audio data from the video data stored in the video database module 11;
a labeling submodule, used to add a label to the audio data separated by the separation submodule, the label indicating the correspondence between audio data and video data;
correspondingly, the audio database module 13 is used to store the labeled audio data.
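One way to picture the label is as a video identifier stored alongside each separated audio item, so that a matched audio item can be traced back to its source video. The sketch below assumes this reading; the class and function names (`LabeledAudio`, `label_audio`, `AudioDatabase`) and the identifier format are hypothetical and not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class LabeledAudio:
    audio_id: str
    video_id: str                # the label: the video this audio was separated from
    samples: Tuple[float, ...]


def label_audio(video_id: str, samples: Sequence[float]) -> LabeledAudio:
    """Attach the source video's identifier to a separated audio track."""
    return LabeledAudio(audio_id=f"{video_id}/audio", video_id=video_id, samples=tuple(samples))


class AudioDatabase:
    """In-memory stand-in for the audio database module."""

    def __init__(self) -> None:
        self._records: Dict[str, LabeledAudio] = {}

    def store(self, record: LabeledAudio) -> None:
        self._records[record.audio_id] = record

    def video_for(self, audio_id: str) -> str:
        """Follow the label from a stored audio item back to its video."""
        return self._records[audio_id].video_id
```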
In a specific example, the device further includes the following modules not shown in Fig. 1:
a first audio fingerprint extraction module, used to extract audio fingerprints from the audio data stored in the audio database module 13 according to a preset audio fingerprint extraction rule;
a fingerprint database module, used to store the audio fingerprints extracted by the first audio fingerprint extraction module;
an index database module, used to store the index relationship between the audio fingerprints extracted by the first audio fingerprint extraction module and the audio data;
a first audio classification module, used to classify the audio data stored in the audio database module 13 based on the audio fingerprints stored in the fingerprint database module.
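The patent leaves the "preset audio fingerprint extraction rule" unspecified, so the sketch below substitutes a deliberately simple toy rule: one bit per pair of adjacent frames, set when the frame energy rises. This rule, the `frame_size` parameter, and the names `extract_fingerprint` and `build_index` are all assumptions for illustration. Classification is omitted because the text gives no categories or criteria.

```python
from typing import Dict, Sequence, Tuple

Fingerprint = Tuple[int, ...]


def extract_fingerprint(samples: Sequence[float], frame_size: int = 4) -> Fingerprint:
    """Toy rule: bit i is 1 if frame i+1 has more energy than frame i, else 0."""
    energies = [
        sum(x * x for x in samples[start:start + frame_size])
        for start in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return tuple(
        1 if later > earlier else 0
        for earlier, later in zip(energies, energies[1:])
    )


def build_index(audio_db: Dict[str, Sequence[float]]) -> Dict[str, Fingerprint]:
    """Index relationship: map each stored audio id to its fingerprint."""
    return {audio_id: extract_fingerprint(samples) for audio_id, samples in audio_db.items()}
```

A production system would use a fingerprint that is robust to noise and time offsets; this toy version only shows where extraction and indexing sit in the data flow.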
In a specific example, the device further includes the following modules not shown in Fig. 1:
a second audio fingerprint extraction module, used to extract, according to the preset audio fingerprint extraction rule, audio fingerprints from the user-input audio data received by the audio-video data receiving module 14 or from the audio data separated by the second audio-video separation module 15;
a second audio classification module, used to classify the audio data input by the user, or the audio data separated by the second audio-video separation module 15, based on the audio fingerprints extracted by the second audio fingerprint extraction module.
In a specific example, the audio data matching module 16 comprises:
a to-be-retrieved audio data determining subunit, used to determine each piece of audio data to be retrieved from the audio data stored in the audio database module 13, based on the category of the audio data obtained by the second audio classification module and the categories of the stored audio data obtained by the first audio classification module, the category of each piece of audio data to be retrieved being the same as the category of the audio data obtained by the second audio classification module;
a to-be-retrieved audio fingerprint determining subunit, used to determine the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
an audio fingerprint matching subunit, used to match the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprint of each piece of audio data to be retrieved, as determined by the to-be-retrieved audio fingerprint determining subunit, to obtain one or more pieces of target audio data.
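The three subunits above can be read as a filter, a lookup, and a comparison. The sketch below follows that reading under several assumptions: fingerprints are bit tuples, similarity is the fraction of agreeing bits, and the threshold of 0.8 is arbitrary. The names `bit_similarity` and `match_by_category` are hypothetical.

```python
from typing import Dict, List, Tuple

Fingerprint = Tuple[int, ...]


def bit_similarity(a: Fingerprint, b: Fingerprint) -> float:
    """Fraction of positions at which two bit fingerprints agree."""
    length = min(len(a), len(b))
    if length == 0:
        return 0.0
    return sum(1 for x, y in zip(a, b) if x == y) / length


def match_by_category(
    query_fingerprint: Fingerprint,
    query_category: str,
    categories: Dict[str, str],
    index: Dict[str, Fingerprint],
    threshold: float = 0.8,
) -> List[Tuple[str, float]]:
    # Subunit 1: keep only stored audio in the same category as the query.
    candidates = [aid for aid, cat in categories.items() if cat == query_category]
    # Subunit 2: look up each candidate's fingerprint through the index.
    candidate_fingerprints = {aid: index[aid] for aid in candidates}
    # Subunit 3: compare fingerprints and keep those above the threshold.
    targets = []
    for aid, fingerprint in candidate_fingerprints.items():
        score = bit_similarity(query_fingerprint, fingerprint)
        if score > threshold:
            targets.append((aid, score))
    return targets
```

Filtering by category first means the fingerprint comparison runs over a subset of the database rather than every stored item.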
As shown in Fig. 2, this embodiment discloses a video retrieval method based on the audio-data-based video retrieval device disclosed in the above embodiment. The method may include the following steps 201 to 204:
201. The audio-video data receiving module 14 receives audio data or video data input by a user;
202. After the audio-video data receiving module 14 receives video data, the second audio-video separation module 15 separates the audio data from the video data received by the audio-video data receiving module 14;
203. The audio data matching module 16 matches the audio data input by the user, or the audio data separated by the second audio-video separation module 15, against the audio data stored in the audio database module 13 to obtain one or more pieces of target audio data; the target audio data is audio data stored in the audio database module 13 that matches the audio data input by the user;
204. The video retrieval display module 17 displays to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module 11; the audio data in the audio database module 13 is obtained by the first audio-video separation module 12 separating the video data in the video database module 11.
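Steps 201 to 204 can be sketched as a single function that wires the modules together. The separation and matching behaviour is passed in as callables so that the sketch stays independent of any particular implementation; the function name `retrieve_videos` and its parameters are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence


def retrieve_videos(
    audio_input: Optional[Sequence[float]],
    video_input: Optional[str],
    separate_audio: Callable[[str], Sequence[float]],
    match_audio: Callable[[Sequence[float]], List[str]],
    audio_to_video: Dict[str, str],
) -> List[str]:
    # Step 201: receive either audio data or video data from the user.
    if audio_input is None and video_input is None:
        raise ValueError("either audio_input or video_input is required")
    # Step 202: separation is needed only when video data was received.
    query = audio_input if audio_input is not None else separate_audio(video_input)
    # Step 203: match the query audio against the stored audio data.
    target_audio_ids = match_audio(query)
    # Step 204: map each target audio item to the video to be displayed.
    return [audio_to_video[audio_id] for audio_id in target_audio_ids]
```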
It can be seen that the video retrieval method disclosed in this embodiment matches the audio data corresponding to a video clip of interest input by the user against the audio data stored in the audio database module, thereby retrieving the complete video and meeting the user's need to find the complete video that contains a clip of interest.
The video retrieval method disclosed in this embodiment retrieves, from the audio in a short video clip that interests the user, the complete videos that contain similar audio content, overcoming the shortcoming that existing video retrieval schemes do not retrieve on the basis of the audio data in a video alone.
In a specific example, after the audio-video data receiving module 14 receives the audio data or video data input by the user, the method further includes the following steps not shown in Fig. 2:
the second audio fingerprint extraction module extracts, according to a preset audio fingerprint extraction rule, audio fingerprints from the user-input audio data received by the audio-video data receiving module 14 or from the audio data separated by the second audio-video separation module 15;
the second audio classification module classifies the audio data input by the user, or the audio data separated by the second audio-video separation module 15, based on the audio fingerprints extracted by the second audio fingerprint extraction module.
In a specific example, the audio data matching module 16 matching the audio data input by the user, or the audio data separated by the second audio-video separation module 15, against the audio data stored in the audio database module 13 to obtain one or more pieces of target audio data includes:
the audio data matching module 16 determines each piece of audio data to be retrieved from the audio data stored in the audio database module 13, based on the category of the audio data obtained by the second audio classification module and the categories of the stored audio data obtained by the first audio classification module, the category of each piece of audio data to be retrieved being the same as the category of the audio data obtained by the second audio classification module;
the audio data matching module 16 determines the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
the audio data matching module 16 matches the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprint determined for each piece of audio data to be retrieved, to obtain one or more pieces of target audio data.
Those skilled in the art will understand that the units in an embodiment may be combined into a single unit and, in addition, may be divided into multiple subunits. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification, and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Those skilled in the art will also appreciate that, although some embodiments described herein include certain features that are included in other embodiments and not other features, combinations of features from different embodiments are meant to fall within the scope of the present invention and to form different embodiments.
Although embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art may make various modifications and variations without departing from the spirit and scope of the invention, and all such modifications and variations fall within the scope defined by the appended claims.

Claims (8)

1. A device for video retrieval based on audio data, characterized by comprising:
a video database module, configured to store video data and to receive video data input by a user and/or an administrator for updating the video database;
a first audio/video separation module, configured to separate audio data from the video data stored in the video database module;
an audio database module, configured to store the audio data separated by the first audio/video separation module;
an audio/video data receiving module, configured to receive audio data or video data input by a user;
a second audio/video separation module, configured to separate, after the audio/video data receiving module receives video data, the audio data from the video data received by the audio/video data receiving module;
an audio data matching module, configured to match the audio data input by the user, or the audio data separated by the second audio/video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data, the target audio data being audio data that is stored in the audio database module and matches the audio data input by the user; and
a video retrieval display module, configured to display to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module.
2. The device according to claim 1, characterized in that
the first audio/video separation module comprises:
a separation submodule, configured to separate audio data from the video data stored in the video database module; and
a labeling submodule, configured to add a label to the audio data separated by the separation submodule, the label indicating the correspondence between the audio data and the video data;
correspondingly, the audio database module is configured to store the labeled audio data.
3. The device according to claim 1, characterized in that the device further comprises:
a first audio fingerprint extraction module, configured to perform audio fingerprint extraction on the audio data stored in the audio database module based on a preset audio fingerprint extraction rule;
a fingerprint database module, configured to store the audio fingerprints extracted by the first audio fingerprint extraction module;
an index database module, configured to store the index relationship between the audio fingerprints extracted by the first audio fingerprint extraction module and the audio data; and
a first audio classification module, configured to classify the audio data stored in the audio database module based on the audio fingerprints stored in the fingerprint database module.
4. The device according to claim 3, characterized in that the device further comprises:
a second audio fingerprint extraction module, configured to perform, based on a preset audio fingerprint extraction rule, audio fingerprint extraction on the audio data input by the user and received by the audio/video data receiving module, or on the audio data separated by the second audio/video separation module; and
a second audio classification module, configured to classify the audio data input by the user, or the audio data separated by the second audio/video separation module, based on the audio fingerprint extracted by the second audio fingerprint extraction module.
5. The device according to claim 4, characterized in that the audio data matching module comprises:
a to-be-retrieved audio data determining subunit, configured to determine each piece of to-be-retrieved audio data from the audio data stored in the audio database module, based on the category of the audio data obtained by the second audio classification module and on the categories, obtained by the first audio classification module, of the audio data stored in the audio database module, the category of each piece of to-be-retrieved audio data being the same as the category of the audio data obtained by the second audio classification module;
a to-be-retrieved audio fingerprint determining subunit, configured to determine the audio fingerprint corresponding to each piece of to-be-retrieved audio data based on the index relationship between audio fingerprints and audio data stored in the index database module; and
an audio fingerprint matching subunit, configured to match the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprint corresponding to each piece of to-be-retrieved audio data, as determined by the to-be-retrieved audio fingerprint determining subunit, to obtain one or more pieces of target audio data.
6. A video retrieval method based on the device according to any one of claims 1 to 5, characterized by comprising:
receiving, by the audio/video data receiving module, audio data or video data input by a user;
after the audio/video data receiving module receives video data, separating, by the second audio/video separation module, the audio data from the video data received by the audio/video data receiving module;
matching, by the audio data matching module, the audio data input by the user, or the audio data separated by the second audio/video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data, the target audio data being audio data that is stored in the audio database module and matches the audio data input by the user; and
displaying to the user, by the video retrieval display module, the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module, wherein the audio data in the audio database module is obtained by the first audio/video separation module separating it from the video data in the video database module.
7. The method according to claim 6, characterized in that, after the audio/video data receiving module receives the audio data or video data input by the user, the method further comprises:
performing, by the second audio fingerprint extraction module and based on a preset audio fingerprint extraction rule, audio fingerprint extraction on the audio data input by the user and received by the audio/video data receiving module, or on the audio data separated by the second audio/video separation module; and
classifying, by the second audio classification module, the audio data input by the user, or the audio data separated by the second audio/video separation module, based on the audio fingerprint extracted by the second audio fingerprint extraction module.
8. The method according to claim 7, characterized in that the matching, by the audio data matching module, of the audio data input by the user, or the audio data separated by the second audio/video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data comprises:
determining, by the audio data matching module, each piece of to-be-retrieved audio data from the audio data stored in the audio database module, based on the category of the audio data obtained by the second audio classification module and on the categories, obtained by the first audio classification module, of the audio data stored in the audio database module, the category of each piece of to-be-retrieved audio data being the same as the category of the audio data obtained by the second audio classification module;
determining, by the audio data matching module, the audio fingerprint corresponding to each piece of to-be-retrieved audio data based on the index relationship between audio fingerprints and audio data stored in the index database module; and
matching, by the audio data matching module, the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprint corresponding to each piece of to-be-retrieved audio data, as determined by the to-be-retrieved audio fingerprint determining subunit, to obtain one or more pieces of target audio data.
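
As an informal illustration only, the flow recited in the claims (label each stored audio track with its source video, fingerprint and classify it, then narrow a query to same-category candidates before comparing fingerprints) might be sketched in Python as below. The claims do not specify the fingerprint extraction rule, the classification scheme, or the matching criterion, so `extract_fingerprint`, `classify`, `similarity`, the `threshold` parameter, and the `Retriever` class are all hypothetical stand-ins chosen to keep the sketch small; audio is assumed to be already separated into a list of samples.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

Fingerprint = Tuple[int, ...]


def extract_fingerprint(samples: Sequence[float], frame: int = 4) -> Fingerprint:
    """Toy 'preset extraction rule' (assumption): one bit per frame pair,
    set when the frame energy rises relative to the previous frame."""
    energies = [
        sum(x * x for x in samples[i:i + frame])
        for i in range(0, len(samples) - frame + 1, frame)
    ]
    return tuple(int(b > a) for a, b in zip(energies, energies[1:]))


def classify(fp: Fingerprint) -> str:
    """Toy classifier (assumption): bucket by the share of rising frames."""
    if not fp:
        return "empty"
    return "rising" if sum(fp) * 2 >= len(fp) else "falling"


def similarity(a: Fingerprint, b: Fingerprint) -> float:
    """Fraction of equal bits over the shorter fingerprint (assumption)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


@dataclass
class AudioRecord:
    audio_id: str
    video_id: str  # label linking the audio to its source video (claim 2)
    category: str  # category assigned when the audio is stored (claim 3)


class Retriever:
    def __init__(self) -> None:
        self.audio_db: Dict[str, AudioRecord] = {}           # audio database
        self.fingerprint_index: Dict[str, Fingerprint] = {}  # audio_id -> fingerprint

    def add_video(self, video_id: str, audio_samples: Sequence[float]) -> None:
        """Store the (already separated) audio of a video with label, category, fingerprint."""
        fp = extract_fingerprint(audio_samples)
        audio_id = f"audio-{video_id}"
        self.audio_db[audio_id] = AudioRecord(audio_id, video_id, classify(fp))
        self.fingerprint_index[audio_id] = fp

    def search(self, query_samples: Sequence[float], threshold: float = 0.8) -> List[str]:
        """Return video ids whose audio matches the query, best match first."""
        query_fp = extract_fingerprint(query_samples)
        query_category = classify(query_fp)
        # Step 1: keep only stored audio in the same category as the query.
        candidates = [r for r in self.audio_db.values() if r.category == query_category]
        # Step 2: look up each candidate's fingerprint through the index and compare.
        scored = [
            (similarity(query_fp, self.fingerprint_index[r.audio_id]), r.video_id)
            for r in candidates
        ]
        # Step 3: map matching audio back to videos through the label.
        return [vid for score, vid in sorted(scored, reverse=True) if score >= threshold]
```

Filtering by category first means the fingerprint comparison runs only over a subset of the stored audio, which appears to be the purpose of the classification step in claims 5 and 8.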
CN201610339063.9A 2016-05-19 2016-05-19 Video retrieval device based on audio data and video retrieval method for same Pending CN106055570A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610339063.9A CN106055570A (en) 2016-05-19 2016-05-19 Video retrieval device based on audio data and video retrieval method for same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610339063.9A CN106055570A (en) 2016-05-19 2016-05-19 Video retrieval device based on audio data and video retrieval method for same

Publications (1)

Publication Number Publication Date
CN106055570A true CN106055570A (en) 2016-10-26

Family

ID=57177344

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610339063.9A Pending CN106055570A (en) 2016-05-19 2016-05-19 Video retrieval device based on audio data and video retrieval method for same

Country Status (1)

Country Link
CN (1) CN106055570A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120185475A1 (en) * 2008-08-19 2012-07-19 Miller Frank W Variable audio/visual data incorporation system and method
CN102411578A (en) * 2010-09-25 2012-04-11 盛乐信息技术(上海)有限公司 Multimedia playing system and method
CN102799605A (en) * 2012-05-02 2012-11-28 天脉聚源(北京)传媒科技有限公司 Method and system for monitoring advertisement broadcast
CN104598541A (en) * 2014-12-29 2015-05-06 乐视网信息技术(北京)股份有限公司 Identification method and device for multimedia file
CN104778238A (en) * 2015-04-03 2015-07-15 中国农业大学 Video saliency analysis method and video saliency analysis device
CN105184610A (en) * 2015-09-02 2015-12-23 王磊 Real-time mobile advertisement synchronous putting method and device based on audio fingerprints

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106686401A (en) * 2017-01-13 2017-05-17 山东鑫诚信电子科技有限公司 Video data distributed storage method, video data distributed storage device, video data retrieval method and video data retrieval device
CN107886959A (en) * 2017-09-30 2018-04-06 中国农业科学院蜜蜂研究所 Method and device for extracting video clips of honeybees visiting flowers
CN107886959B (en) * 2017-09-30 2021-07-27 中国农业科学院蜜蜂研究所 Method and device for extracting video clips of honeybees visiting flowers
CN108520078A (en) * 2018-04-20 2018-09-11 百度在线网络技术(北京)有限公司 Video identification method and device
CN108520078B (en) * 2018-04-20 2020-03-20 百度在线网络技术(北京)有限公司 Video identification method and device
CN109446356A (en) * 2018-09-21 2019-03-08 深圳市九洲电器有限公司 Multimedia file retrieval method and device
WO2020057347A1 (en) * 2018-09-21 2020-03-26 深圳市九洲电器有限公司 Multimedia file retrieval method and apparatus
CN110412368A (en) * 2019-06-27 2019-11-05 安徽继远软件有限公司 Electrical equipment online supervision method and system based on voiceprint recognition
CN110958485A (en) * 2019-10-30 2020-04-03 维沃移动通信有限公司 Video playing method, electronic equipment and computer readable storage medium
CN112165634A (en) * 2020-09-29 2021-01-01 北京百度网讯科技有限公司 Method for establishing audio classification model and method and device for automatically converting video

Similar Documents

Publication Publication Date Title
CN106055570A (en) Video retrieval device based on audio data and video retrieval method for same
WO2018072071A1 (en) Knowledge map building system and method
US20200125981A1 (en) Systems and methods for recognizing ambiguity in metadata
US10089392B2 (en) Automatically selecting thematically representative music
US8457368B2 (en) System and method of object recognition and database population for video indexing
KR20140093957A (en) Interactive multi-modal image search
US20100217755A1 (en) Classifying a set of content items
US20110029510A1 (en) Method and apparatus for searching a plurality of stored digital images
KR100615522B1 (en) music contents classification method, and system and method for providing music contents using the classification method
CN1759396A (en) Improved data retrieval method and system
CN106682012A (en) Commodity object information searching method and device
CN108228682A (en) Character string verification method, character string expansion method and verification model training method
CN109857898A (en) A kind of method and system of mass digital audio-frequency fingerprint storage and retrieval
JP5306114B2 (en) Query extraction device, query extraction method, and query extraction program
JP2014500528A (en) Enhancement of meaning using TOP-K processing
CN106446235A (en) Video searching method and device
TW201417093A (en) Electronic device with video/audio files processing function and video/audio files processing method
CN101957860B (en) Method and device for releasing and searching information
TW200805251A (en) A method and apparatus for accessing a digital file from a collection of digital films
CN113065018A (en) Audio and video index library creating and retrieving method and device and electronic equipment
KR100644016B1 (en) Moving picture search system and method thereof
JP2012015809A (en) Music selection apparatus, music selection method, and music selection program
JP2005346259A5 (en)
JP2010003219A (en) Related query derivation device, and related query derivation method and program
JPH09223150A (en) Information classification processing method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20161026