CN106055570A - Video retrieval device based on audio data and video retrieval method for same - Google Patents
- Publication number
- CN106055570A (application CN201610339063.9A)
- Authority
- CN
- China
- Prior art keywords
- audio
- module
- voice data
- video
- data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/783—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/7834—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
Abstract
The invention discloses a video retrieval device based on audio data and a video retrieval method for the same. The device comprises a video database module, a first audio-video separation module, an audio database module, an audio-video data receiving module, a second audio-video separation module, an audio data matching module and a video retrieval display module, wherein the video database module is used to store video data; the first audio-video separation module is used to separate audio data from the video data in the video database module; the audio database module is used to store the audio data obtained by the first audio-video separation module; the audio-video data receiving module is used to receive audio or video data input by a user; the second audio-video separation module is used to separate audio data from received video data after the audio-video data receiving module receives the video data; the audio data matching module is used to match the audio data input by the user, or the audio data obtained by the second audio-video separation module, against the audio data in the audio database module, so that one or more pieces of target audio data can be obtained; and the video retrieval display module is used to display target video data corresponding to the target audio data to the user.
Description
Technical field
The present invention relates to the field of multimedia technology, and in particular to a video retrieval device based on audio data and a video retrieval method for the same.
Background
In the era of big data, video data is growing rapidly, in great variety and enormous volume. How to retrieve video in real time, efficiently, and accurately is one of the pressing problems of today's information society. Users are no longer satisfied with obtaining video content through its metadata (such as the video title or author); they increasingly want to use a short clip of unknown origin to quickly and intelligently obtain complete information about the video it came from. Content-based video retrieval has therefore been a research hotspot in recent years. Video is a composite type of data that carries several kinds of information, such as images, text, and sound, so current content-based video retrieval typically combines multiple information modalities. Image retrieval is usually the primary retrieval mode, while audio information usually serves only as auxiliary information to refine the results; little research starts from audio alone. Content-based audio retrieval, on the other hand, aims to retrieve the complete information of an audio item from features of the audio content itself, and content-based music retrieval has already been implemented in many apps. However, systems that extend this "identify a song by listening" function to video have not yet been studied thoroughly.
Summary of the invention
In view of the above problems, the present invention proposes a video retrieval device based on audio data, and a video retrieval method for the same, that overcome the above problems or at least partially solve them.
To this end, in a first aspect, the present invention proposes a video retrieval device based on audio data, comprising:
A video database module, configured to store video data and to receive video data input by a user and/or an administrator for updating the video database;
A first audio-video separation module, configured to separate the audio data from the video data stored in the video database module;
An audio database module, configured to store the audio data separated by the first audio-video separation module;
An audio-video data receiving module, configured to receive audio data or video data input by a user;
A second audio-video separation module, configured to separate, after the audio-video data receiving module receives video data, the audio data from the video data received by the audio-video data receiving module;
An audio data matching module, configured to match the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data; the target audio data is the audio data stored in the audio database module that matches the audio data input by the user;
A video retrieval display module, configured to display to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module.
Optionally, the first audio-video separation module comprises:
A separation submodule, configured to separate the audio data from the video data stored in the video database module;
A labeling submodule, configured to add a mark to the audio data separated by the separation submodule, the mark indicating the correspondence between the audio data and the video data;
Correspondingly, the audio database module is configured to store the audio data to which the mark has been added.
Optionally, the device further comprises:
A first audio fingerprint extraction module, configured to extract audio fingerprints from the audio data stored in the audio database module, based on a preset audio fingerprint extraction rule;
A fingerprint database module, configured to store the audio fingerprints extracted by the first audio fingerprint extraction module;
An index database module, configured to store the index relationship between the audio fingerprints extracted by the first audio fingerprint extraction module and the audio data;
A first audio classification module, configured to classify the audio data stored in the audio database module, based on the audio fingerprints stored in the fingerprint database module.
Optionally, the device further comprises:
A second audio fingerprint extraction module, configured to extract an audio fingerprint, based on the preset audio fingerprint extraction rule, from the user-input audio data received by the audio-video data receiving module or from the audio data separated by the second audio-video separation module;
A second audio classification module, configured to classify the audio data input by the user, or the audio data separated by the second audio-video separation module, based on the audio fingerprint extracted by the second audio fingerprint extraction module.
Optionally, the audio data matching module comprises:
A to-be-retrieved audio data determination subunit, configured to determine each piece of audio data to be retrieved from the audio data stored in the audio database module, based on the category of the audio data obtained by the second audio classification module and the categories, obtained by the first audio classification module, of the audio data stored in the audio database module; the category of each piece of audio data to be retrieved is the same as the category of the audio data obtained by the second audio classification module;
A to-be-retrieved audio fingerprint determination subunit, configured to determine the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
An audio fingerprint matching subunit, configured to match the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprints, determined by the to-be-retrieved audio fingerprint determination subunit, that correspond to each piece of audio data to be retrieved, to obtain one or more pieces of target audio data.
In a second aspect, the present invention also proposes a video retrieval method based on the device described in the first aspect, comprising:
The audio-video data receiving module receives audio data or video data input by a user;
After the audio-video data receiving module receives video data, the second audio-video separation module separates the audio data from the video data received by the audio-video data receiving module;
The audio data matching module matches the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data; the target audio data is the audio data stored in the audio database module that matches the audio data input by the user;
The video retrieval display module displays to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module; the audio data in the audio database module is obtained by the first audio-video separation module separating the video data in the video database module.
Optionally, after the audio-video data receiving module receives the audio data or video data input by the user, the method further comprises:
The second audio fingerprint extraction module extracts an audio fingerprint, based on a preset audio fingerprint extraction rule, from the user-input audio data received by the audio-video data receiving module or from the audio data separated by the second audio-video separation module;
The second audio classification module classifies the audio data input by the user, or the audio data separated by the second audio-video separation module, based on the audio fingerprint extracted by the second audio fingerprint extraction module.
Optionally, the step in which the audio data matching module matches the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data comprises:
The audio data matching module determines each piece of audio data to be retrieved from the audio data stored in the audio database module, based on the category of the audio data obtained by the second audio classification module and the categories, obtained by the first audio classification module, of the audio data stored in the audio database module; the category of each piece of audio data to be retrieved is the same as the category of the audio data obtained by the second audio classification module;
The audio data matching module determines the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
The audio data matching module matches the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprints corresponding to each piece of audio data to be retrieved, to obtain one or more pieces of target audio data.
Compared with the prior art, the video retrieval device based on audio data and the video retrieval method proposed by the present invention use the audio in a short video clip that interests the user to retrieve all complete videos containing similar audio content, overcoming the shortcoming that existing video retrieval schemes do not perform retrieval based solely on the audio data in a video.
Brief description of the drawings
Fig. 1 is a structural diagram of a video retrieval device based on audio data provided by the first embodiment of the present invention;
Fig. 2 is a flowchart of a video retrieval method of a video retrieval device based on audio data provided by the second embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention.
It should be noted that, in this document, "first" and "second" are used only to distinguish identically named items and do not imply any relationship or order between them.
As shown in Fig. 1, this embodiment discloses a video retrieval device based on audio data, which may include the following modules: a video database module 11, a first audio-video separation module 12, an audio database module 13, an audio-video data receiving module 14, a second audio-video separation module 15, an audio data matching module 16, and a video retrieval display module 17. Each module is described in detail below:
The video database module 11 is configured to store video data and to receive video data input by a user and/or an administrator for updating the video database. In this embodiment, the video data stored in the video database module 11 can be updated, and both users and administrators can update the video database module 11. In a specific application, the video database module 11 may be implemented by storage hardware, such as a hard disk, in combination with the associated database software.
The first audio-video separation module 12 is configured to separate the audio data from the video data stored in the video database module 11. In this embodiment, the first audio-video separation module 12 separates out the audio data of the video data stored in the video database module 11, which facilitates video retrieval based on audio data. In a specific application, the first audio-video separation module 12 may be implemented by processor hardware such as a single-chip microcomputer, a DSP, or an ARM processor.
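As an illustration of the separation step only, the following minimal sketch builds a command line for the ffmpeg tool that drops the video stream and writes the audio track to a WAV file. Using ffmpeg is an assumption made here for illustration; the embodiment only states that the module runs on processor hardware. The function name `build_extract_audio_command`, the file paths, and the 16 kHz mono output format are likewise hypothetical.

```python
from pathlib import Path


def build_extract_audio_command(video_path: str, audio_dir: str) -> list[str]:
    """Build an ffmpeg command that writes a video's audio track to a WAV file.

    Assumption: ffmpeg is available on the host; the patent does not name a tool.
    """
    audio_path = Path(audio_dir) / (Path(video_path).stem + ".wav")
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-i", video_path,        # input video file
        "-vn",                   # discard the video stream, keep audio only
        "-acodec", "pcm_s16le",  # encode audio as 16-bit PCM
        "-ar", "16000",          # resample to 16 kHz (assumed rate)
        "-ac", "1",              # mix down to a single channel
        str(audio_path),
    ]


# Hypothetical paths, used only to show the resulting command.
command = build_extract_audio_command("videos/clip_0001.mp4", "audio")
print(" ".join(command))
```

To actually run the separation, the list could be passed to `subprocess.run(command, check=True)`; building the command separately keeps the sketch testable without ffmpeg installed.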
The audio database module 13 is configured to store the audio data separated by the first audio-video separation module 12. In this embodiment, the audio database module 13 may be implemented by storage hardware, such as a hard disk, in combination with the associated database software.
The audio-video data receiving module 14 is configured to receive audio data or video data input by a user. In this embodiment, the audio-video data receiving module 14 may consist of a microphone, a denoising device, a USB interface, and a display. The display provides the user with an operation interface, through which the user can choose to play a video clip directly or to copy the video clip to the terminal that carries this video retrieval device. The operation interface can also be used to provide auxiliary information about the data being queried, including whether the audio type is unique, the main audio type, whether this is the first query, and so on.
The second audio-video separation module 15 is configured to separate, after the audio-video data receiving module 14 receives video data, the audio data from the video data received by the audio-video data receiving module 14. In a specific application, if the user inputs audio data through the microphone, the denoising device denoises the audio data and then passes it to the audio data matching module 16; if the user inputs video data through the USB interface, the second audio-video separation module 15 separates the audio data from the video data and passes it to the audio data matching module 16.
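The two input paths described above can be sketched as a small dispatch function. This is only an illustrative sketch: the `UserInput` type, the `route_input` function, and the `denoise` and `separate_audio` callables are hypothetical stand-ins for the denoising device and the second audio-video separation module, whose internals the embodiment does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Samples = Sequence[float]


@dataclass
class UserInput:
    kind: str        # "audio" for microphone input, "video" for USB input
    payload: object  # raw samples for audio, a file path for video


def route_input(
    user_input: UserInput,
    denoise: Callable[[Samples], Samples],
    separate_audio: Callable[[str], Samples],
) -> Samples:
    """Return the audio that should be handed to the matching module."""
    if user_input.kind == "audio":
        # Microphone path: denoise first, then forward to matching.
        return denoise(user_input.payload)
    if user_input.kind == "video":
        # USB path: strip the audio track from the video, then forward it.
        return separate_audio(user_input.payload)
    raise ValueError(f"unsupported input kind: {user_input.kind!r}")
```

Passing the two processing steps in as callables keeps the routing logic independent of how denoising and separation are actually implemented.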
The audio data matching module 16 is configured to perform audio similarity matching between the audio data input by the user, or the audio data separated by the second audio-video separation module 15, and the audio data stored in the audio database module 13, to obtain one or more pieces of target audio data; the target audio data is the audio data stored in the audio database module 13 that matches the audio data input by the user. In this embodiment, audio data in the audio database module 13 whose audio similarity is greater than a preset audio similarity threshold may be taken as the target audio data.
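The thresholding rule can be sketched as follows. The embodiment does not say how audio similarity is computed, so cosine similarity over fixed-length feature vectors is an assumption made purely for illustration; the names `cosine_similarity` and `find_targets`, the sample vectors, and the default threshold of 0.9 are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_targets(
    query: list[float],
    audio_db: dict[str, list[float]],
    threshold: float = 0.9,
) -> dict[str, float]:
    """Return every stored audio item whose similarity exceeds the threshold."""
    targets = {}
    for audio_id, features in audio_db.items():
        score = cosine_similarity(query, features)
        if score > threshold:  # strictly greater, as in the embodiment
            targets[audio_id] = score
    return targets


# Hypothetical feature vectors standing in for stored audio data.
audio_db = {
    "a1": [1.0, 0.0, 1.0],
    "a2": [0.0, 1.0, 0.0],
    "a3": [1.0, 0.1, 0.9],
}
print(sorted(find_targets([1.0, 0.0, 1.0], audio_db)))
```

Returning the scores along with the identifiers lets a later step sort the results without recomputing similarity.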
The video retrieval display module 17 is configured to display to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module 11. In this embodiment, if there are multiple pieces of target video data, the video retrieval display module 17 may sort them in descending order of audio similarity and display the sorted target video data to the user. Of course, to make the retrieval results more useful, only the several top-ranked pieces of video data may be displayed, for example the top 3. The user can select among the displayed video data.
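The sorting and truncation described above can be sketched in a few lines. The function name `rank_targets`, the `audio_to_video` mapping, and the sample scores are hypothetical; only the descending order and the example cutoff of 3 come from the embodiment.

```python
def rank_targets(
    scores: dict[str, float],
    audio_to_video: dict[str, str],
    top_k: int = 3,
) -> list[tuple[str, float]]:
    """Sort matches by descending similarity and keep the top_k videos."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(audio_to_video[audio_id], score) for audio_id, score in ranked[:top_k]]


# Hypothetical similarity scores and audio-to-video correspondence.
scores = {"a1": 0.95, "a2": 0.99, "a3": 0.91, "a4": 0.97}
audio_to_video = {"a1": "v1", "a2": "v2", "a3": "v3", "a4": "v4"}
for video_id, score in rank_targets(scores, audio_to_video):
    print(video_id, score)
```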
It can be seen that the video retrieval device based on audio data disclosed in this embodiment matches the audio data corresponding to a video clip that interests the user against the audio data stored in the audio database module, thereby retrieving complete videos and meeting the user's need to retrieve the complete video from which a clip of interest comes.
The video retrieval device based on audio data disclosed in this embodiment uses the audio in a short video clip that interests the user to retrieve all complete videos containing similar audio content, overcoming the shortcoming that existing video retrieval schemes do not perform retrieval based solely on the audio data in a video.
In a specific example, the first audio-video separation module 12 comprises:
A separation submodule, configured to separate the audio data from the video data stored in the video database module 11;
A labeling submodule, configured to add a mark to the audio data separated by the separation submodule, the mark indicating the correspondence between the audio data and the video data;
Correspondingly, the audio database module 13 is configured to store the audio data to which the mark has been added.
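One way to picture the mark is as a field on each stored audio record that names its source video. The sketch below is an assumption about the data layout, not the patent's storage format; `LabeledAudio`, `label_audio`, and the identifier scheme are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledAudio:
    audio_id: str
    video_id: str               # the mark: which video this audio was separated from
    samples: tuple[float, ...]  # separated audio samples


def label_audio(video_id: str, samples: list[float]) -> LabeledAudio:
    """Attach the source video's identifier to freshly separated audio."""
    return LabeledAudio(
        audio_id=f"{video_id}.audio",  # hypothetical naming scheme
        video_id=video_id,
        samples=tuple(samples),
    )


# The audio database stores labeled records, so a matched audio item
# can be traced back to its video.
audio_db = {}
record = label_audio("video_42", [0.1, 0.2, 0.3])
audio_db[record.audio_id] = record
print(audio_db["video_42.audio"].video_id)
```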
In a specific example, the device further comprises the following modules, which are not shown in Fig. 1:
A first audio fingerprint extraction module, configured to extract audio fingerprints from the audio data stored in the audio database module 13, based on a preset audio fingerprint extraction rule;
A fingerprint database module, configured to store the audio fingerprints extracted by the first audio fingerprint extraction module;
An index database module, configured to store the index relationship between the audio fingerprints extracted by the first audio fingerprint extraction module and the audio data;
A first audio classification module, configured to classify the audio data stored in the audio database module 13, based on the audio fingerprints stored in the fingerprint database module.
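The embodiment leaves the "preset audio fingerprint extraction rule" unspecified. Purely as an illustration, the sketch below uses a toy rule: split the samples into frames, compute each frame's energy, and emit one bit per frame boundary indicating whether the energy rose. The rule, the frame size, and the names `extract_fingerprint`, `register_audio`, `fingerprint_db`, and `index_db` are all assumptions, and the classification module is not modeled.

```python
def extract_fingerprint(samples: list[float], frame_size: int = 4) -> tuple[int, ...]:
    """Toy fingerprint: one bit per frame boundary, 1 if frame energy increased."""
    energies = [
        sum(x * x for x in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return tuple(int(cur > prev) for prev, cur in zip(energies, energies[1:]))


# Fingerprint database: fingerprint id -> fingerprint.
fingerprint_db: dict[str, tuple[int, ...]] = {}
# Index database: audio id -> fingerprint id (the index relationship).
index_db: dict[str, str] = {}


def register_audio(audio_id: str, samples: list[float]) -> None:
    """Extract a fingerprint for stored audio and record the index relationship."""
    fingerprint_id = f"fp:{audio_id}"  # hypothetical id scheme
    fingerprint_db[fingerprint_id] = extract_fingerprint(samples)
    index_db[audio_id] = fingerprint_id


# Four frames of four samples each, with energies 0.04, 1.0, 0.16, 3.24.
samples = [0.1] * 4 + [0.5] * 4 + [0.2] * 4 + [0.9] * 4
register_audio("a1", samples)
print(fingerprint_db[index_db["a1"]])  # (1, 0, 1)
```

A production system would more likely use a spectral fingerprint; the point here is only how the fingerprint database and the index database relate to each other.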
In a specific example, the device further comprises the following modules, which are not shown in Fig. 1:
A second audio fingerprint extraction module, configured to extract an audio fingerprint, based on the preset audio fingerprint extraction rule, from the user-input audio data received by the audio-video data receiving module 14 or from the audio data separated by the second audio-video separation module 15;
A second audio classification module, configured to classify the audio data input by the user, or the audio data separated by the second audio-video separation module 15, based on the audio fingerprint extracted by the second audio fingerprint extraction module.
In a specific example, the audio data matching module 16 comprises:
A to-be-retrieved audio data determination subunit, configured to determine each piece of audio data to be retrieved from the audio data stored in the audio database module 13, based on the category of the audio data obtained by the second audio classification module and the categories, obtained by the first audio classification module, of the audio data stored in the audio database module 13; the category of each piece of audio data to be retrieved is the same as the category of the audio data obtained by the second audio classification module;
A to-be-retrieved audio fingerprint determination subunit, configured to determine the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
An audio fingerprint matching subunit, configured to match the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprints, determined by the to-be-retrieved audio fingerprint determination subunit, that correspond to each piece of audio data to be retrieved, to obtain one or more pieces of target audio data.
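The three subunits can be read as a filter, a lookup, and a comparison. The following sketch walks through them under stated assumptions: fingerprints are bit tuples, similarity is the fraction of matching bits, and a match requires similarity above a threshold. The names `bit_similarity` and `match_by_category`, the threshold of 0.8, and the sample data are hypothetical.

```python
def bit_similarity(a: tuple[int, ...], b: tuple[int, ...]) -> float:
    """Fraction of positions at which two bit fingerprints agree (assumed measure)."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def match_by_category(
    query_fingerprint: tuple[int, ...],
    query_category: str,
    categories: dict[str, str],                   # audio id -> category
    index_db: dict[str, str],                     # audio id -> fingerprint id
    fingerprint_db: dict[str, tuple[int, ...]],   # fingerprint id -> fingerprint
    threshold: float = 0.8,
) -> dict[str, float]:
    # Subunit 1: keep only stored audio in the same category as the query.
    candidates = [aid for aid, cat in categories.items() if cat == query_category]
    # Subunit 2: look up each candidate's fingerprint through the index.
    candidate_fps = {aid: fingerprint_db[index_db[aid]] for aid in candidates}
    # Subunit 3: compare fingerprints and keep those above the threshold.
    scores = {
        aid: bit_similarity(query_fingerprint, fp)
        for aid, fp in candidate_fps.items()
    }
    return {aid: score for aid, score in scores.items() if score > threshold}


# Hypothetical stored data: "a2" has an identical fingerprint but another category.
categories = {"a1": "music", "a2": "speech", "a3": "music"}
index_db = {"a1": "f1", "a2": "f2", "a3": "f3"}
fingerprint_db = {"f1": (1, 0, 1, 1), "f2": (1, 0, 1, 1), "f3": (0, 1, 0, 1)}
print(match_by_category((1, 0, 1, 1), "music", categories, index_db, fingerprint_db))
```

Filtering by category first means fingerprints are compared only within one class, which shrinks the search space before the more expensive comparison step.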
As shown in Fig. 2, this embodiment discloses a video retrieval method based on the video retrieval device based on audio data disclosed in the above embodiment. The method may include the following steps 201 to 204:
201. The audio-video data receiving module 14 receives audio data or video data input by a user;
202. After the audio-video data receiving module 14 receives video data, the second audio-video separation module 15 separates the audio data from the video data received by the audio-video data receiving module 14;
203. The audio data matching module 16 matches the audio data input by the user, or the audio data separated by the second audio-video separation module 15, against the audio data stored in the audio database module 13 to obtain one or more pieces of target audio data; the target audio data is the audio data stored in the audio database module 13 that matches the audio data input by the user;
204. The video retrieval display module 17 displays to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module 11; the audio data in the audio database module 13 is obtained by the first audio-video separation module 12 separating the video data in the video database module 11.
It can be seen that the video retrieval method of the video retrieval device based on audio data disclosed in this embodiment matches the audio data corresponding to a video clip that interests the user against the audio data stored in the audio database module, thereby retrieving complete videos and meeting the user's need to retrieve the complete video from which a clip of interest comes.
The video retrieval method of the video retrieval device based on audio data disclosed in this embodiment uses the audio in a short video clip that interests the user to retrieve all complete videos containing similar audio content, overcoming the shortcoming that existing video retrieval schemes do not perform retrieval based solely on the audio data in a video.
In a specific example, after the audio-video data receiving module 14 receives the audio data or video data input by the user, the method further includes the following steps, which are not shown in Fig. 2:
The second audio fingerprint extraction module extracts an audio fingerprint, based on a preset audio fingerprint extraction rule, from the user-input audio data received by the audio-video data receiving module 14 or from the audio data separated by the second audio-video separation module 15;
The second audio classification module classifies the audio data input by the user, or the audio data separated by the second audio-video separation module 15, based on the audio fingerprint extracted by the second audio fingerprint extraction module.
In a specific example, the step in which the audio data matching module 16 matches the audio data input by the user, or the audio data separated by the second audio-video separation module 15, against the audio data stored in the audio database module 13 to obtain one or more pieces of target audio data comprises:
The audio data matching module 16 determines each piece of audio data to be retrieved from the audio data stored in the audio database module 13, based on the category of the audio data obtained by the second audio classification module and the categories, obtained by the first audio classification module, of the audio data stored in the audio database module 13; the category of each piece of audio data to be retrieved is the same as the category of the audio data obtained by the second audio classification module;
The audio data matching module 16 determines the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
The audio data matching module 16 matches the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprints corresponding to each piece of audio data to be retrieved, to obtain one or more pieces of target audio data.
Those skilled in the art will understand that the units in an embodiment may be combined into a single unit and may, in addition, be divided into multiple subunits. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification, and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.
Those skilled in the art will also appreciate that, although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the present invention and to form different embodiments.
Although embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the present invention, and such modifications and variations all fall within the scope defined by the claims.
Claims (8)
1. A video retrieval device based on audio data, characterized by comprising:
A video database module, configured to store video data and to receive video data input by a user and/or an administrator for updating the video database;
A first audio-video separation module, configured to separate the audio data from the video data stored in the video database module;
An audio database module, configured to store the audio data separated by the first audio-video separation module;
An audio-video data receiving module, configured to receive audio data or video data input by a user;
A second audio-video separation module, configured to separate, after the audio-video data receiving module receives video data, the audio data from the video data received by the audio-video data receiving module;
An audio data matching module, configured to match the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data; the target audio data is the audio data stored in the audio database module that matches the audio data input by the user;
A video retrieval display module, configured to display to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module.
2. The device according to claim 1, characterized in that
the first audio-video separation module comprises:
a separation submodule, configured to separate audio data from the video data stored in the video database module;
a labeling submodule, configured to add an identifier to the audio data separated by the separation submodule, the identifier indicating the correspondence between the audio data and the video data;
correspondingly, the audio database module is configured to store the audio data to which the identifier has been added.
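One possible way to realize the identifier of claim 2 is to store each separated audio track as a record that carries the id of its source video. This is a minimal sketch under assumptions: the claim does not define the identifier's form, and `LabeledAudio`, `AudioDatabase` and the `audio-N` naming scheme are hypothetical.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Sequence, Tuple


@dataclass(frozen=True)
class LabeledAudio:
    audio_id: str
    video_id: str               # the identifier: which video this audio was separated from
    samples: Tuple[float, ...]


class AudioDatabase:
    """Hypothetical audio database that stores only labeled audio records."""

    def __init__(self) -> None:
        self._records: Dict[str, LabeledAudio] = {}
        self._next = count(1)

    def store(self, samples: Sequence[float], video_id: str) -> LabeledAudio:
        # Labeling step: attach the source video id before storing.
        record = LabeledAudio(f"audio-{next(self._next)}", video_id, tuple(samples))
        self._records[record.audio_id] = record
        return record

    def video_for(self, audio_id: str) -> str:
        # Follow the identifier from a matched audio record back to its video.
        return self._records[audio_id].video_id
```

With such a record, the display step only needs the `audio_id` of each target audio to find the corresponding video.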
3. The device according to claim 1, characterized in that the device further comprises:
a first audio fingerprint extraction module, configured to extract audio fingerprints from the audio data stored in the audio database module, based on a preset audio fingerprint extraction rule;
a fingerprint database module, configured to store the audio fingerprints extracted by the first audio fingerprint extraction module;
an index database module, configured to store the index relationship between the audio fingerprints extracted by the first audio fingerprint extraction module and the audio data;
a first audio classification module, configured to classify the audio data stored in the audio database module, based on the audio fingerprints stored in the fingerprint database module.
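The claim leaves the "preset audio fingerprint extraction rule" and the classification rule unspecified. The sketch below substitutes toy rules purely for illustration: one fingerprint bit per pair of adjacent frames (1 when frame energy rises), and a category chosen by the share of rising bits. The function names, the frame size and the category labels are all assumptions, not the patent's method.

```python
from typing import Dict, List, Sequence, Tuple

Fingerprint = Tuple[int, ...]


def extract_fingerprint(samples: Sequence[float], frame: int = 4) -> Fingerprint:
    """Assumed rule: one bit per adjacent frame pair, 1 when frame energy increases."""
    energies = [sum(x * x for x in samples[i:i + frame])
                for i in range(0, len(samples) - frame + 1, frame)]
    return tuple(int(b > a) for a, b in zip(energies, energies[1:]))


def classify(fp: Fingerprint) -> str:
    """Assumed rule: bucket by the share of rising-energy bits."""
    if not fp:
        return "empty"
    return "rising" if sum(fp) * 2 > len(fp) else "steady"


def build_indexes(audio_db: Dict[str, Sequence[float]]):
    fingerprint_db: Dict[str, Fingerprint] = {}  # fingerprint database: fp_id -> fingerprint
    index_db: Dict[str, str] = {}                # index database: audio_id -> fp_id
    categories: Dict[str, List[str]] = {}        # classification result: category -> audio_ids
    for audio_id, samples in audio_db.items():
        fp_id = f"fp-{audio_id}"
        fingerprint_db[fp_id] = extract_fingerprint(samples)
        index_db[audio_id] = fp_id
        categories.setdefault(classify(fingerprint_db[fp_id]), []).append(audio_id)
    return fingerprint_db, index_db, categories
```

Keeping the fingerprint store and the index as separate mappings mirrors the separate fingerprint database and index database modules in the claim.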
4. The device according to claim 3, characterized in that the device further comprises:
a second audio fingerprint extraction module, configured to extract, based on the preset audio fingerprint extraction rule, an audio fingerprint from the audio data input by the user and received by the audio-video data receiving module, or from the audio data separated by the second audio-video separation module;
a second audio classification module, configured to classify, based on the audio fingerprint extracted by the second audio fingerprint extraction module, the audio data input by the user or the audio data separated by the second audio-video separation module.
5. The device according to claim 4, characterized in that the audio data matching module comprises:
a to-be-retrieved audio data determination subunit, configured to determine each piece of audio data to be retrieved from the audio data stored in the audio database module, based on the category of the audio data obtained by the second audio classification module and the categories, obtained by the first audio classification module, of the audio data stored in the audio database module; the category of each piece of audio data to be retrieved is the same as the category of the audio data obtained by the second audio classification module;
a to-be-retrieved audio fingerprint determination subunit, configured to determine the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
an audio fingerprint matching subunit, configured to match the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprints, determined by the to-be-retrieved audio fingerprint determination subunit, that correspond to each piece of audio data to be retrieved, to obtain one or more pieces of target audio data.
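The three subunits of claim 5 amount to a filter-then-compare search: narrow the candidates to the query's category, look up each candidate's fingerprint through the index, then compare fingerprints. The sketch below assumes precomputed fingerprints and categories. The claim does not state a similarity criterion, so the Hamming-distance threshold `max_distance`, like the function names, is an assumption.

```python
from typing import Dict, List, Tuple

Fingerprint = Tuple[int, ...]


def hamming(a: Fingerprint, b: Fingerprint) -> int:
    """Number of positions at which two equal-length fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


def match_audio(query_fp: Fingerprint,
                query_category: str,
                categories: Dict[str, List[str]],
                index_db: Dict[str, str],
                fingerprint_db: Dict[str, Fingerprint],
                max_distance: int = 1) -> List[str]:
    # Subunit 1: only stored audio in the same category as the query is searched.
    candidates = categories.get(query_category, [])
    targets = []
    for audio_id in candidates:
        # Subunit 2: follow the index from audio data to its stored fingerprint.
        stored_fp = fingerprint_db[index_db[audio_id]]
        # Subunit 3: compare fingerprints (assumed criterion: small Hamming distance).
        if len(stored_fp) == len(query_fp) and hamming(stored_fp, query_fp) <= max_distance:
            targets.append(audio_id)
    return targets
```

Filtering by category first means fingerprints are compared only within one class of stored audio, which is presumably the purpose of classifying both the stored and the query audio.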
6. A video retrieval method based on the device according to any one of claims 1 to 5, characterized by comprising:
the audio-video data receiving module receiving audio data or video data input by a user;
after the audio-video data receiving module receives video data, the second audio-video separation module separating audio data from the video data received by the audio-video data receiving module;
the audio data matching module matching the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data; the target audio data being audio data stored in the audio database module that matches the audio data input by the user;
the video retrieval display module displaying to the user the target video data corresponding to the one or more pieces of target audio data, the target video data being video data stored in the video database module; wherein the audio data in the audio database module is obtained by the first audio-video separation module separating the video data in the video database module.
7. The method according to claim 6, characterized in that, after the audio-video data receiving module receives the audio data or video data input by the user, the method further comprises:
the second audio fingerprint extraction module extracting, based on a preset audio fingerprint extraction rule, an audio fingerprint from the audio data input by the user and received by the audio-video data receiving module, or from the audio data separated by the second audio-video separation module;
the second audio classification module classifying, based on the audio fingerprint extracted by the second audio fingerprint extraction module, the audio data input by the user or the audio data separated by the second audio-video separation module.
8. The method according to claim 7, characterized in that the audio data matching module matching the audio data input by the user, or the audio data separated by the second audio-video separation module, against the audio data stored in the audio database module to obtain one or more pieces of target audio data comprises:
the audio data matching module determining each piece of audio data to be retrieved from the audio data stored in the audio database module, based on the category of the audio data obtained by the second audio classification module and the categories, obtained by the first audio classification module, of the audio data stored in the audio database module; the category of each piece of audio data to be retrieved is the same as the category of the audio data obtained by the second audio classification module;
the audio data matching module determining the audio fingerprint corresponding to each piece of audio data to be retrieved, based on the index relationship between audio fingerprints and audio data stored in the index database module;
the audio data matching module matching the audio fingerprint obtained by the second audio fingerprint extraction module against the audio fingerprints, determined by the to-be-retrieved audio fingerprint determination subunit, that correspond to each piece of audio data to be retrieved, to obtain one or more pieces of target audio data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610339063.9A CN106055570A (en) | 2016-05-19 | 2016-05-19 | Video retrieval device based on audio data and video retrieval method for same |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106055570A (en) | 2016-10-26 |
Family
ID=57177344
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610339063.9A Pending CN106055570A (en) | 2016-05-19 | 2016-05-19 | Video retrieval device based on audio data and video retrieval method for same |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106055570A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120185475A1 (en) * | 2008-08-19 | 2012-07-19 | Miller Frank W | Variable audio/visual data incorporation system and method |
CN102411578A (en) * | 2010-09-25 | 2012-04-11 | 盛乐信息技术(上海)有限公司 | Multimedia playing system and method |
CN102799605A (en) * | 2012-05-02 | 2012-11-28 | 天脉聚源(北京)传媒科技有限公司 | Method and system for monitoring advertisement broadcast |
CN104598541A (en) * | 2014-12-29 | 2015-05-06 | 乐视网信息技术(北京)股份有限公司 | Identification method and device for multimedia file |
CN104778238A (en) * | 2015-04-03 | 2015-07-15 | 中国农业大学 | Video saliency analysis method and video saliency analysis device |
CN105184610A (en) * | 2015-09-02 | 2015-12-23 | 王磊 | Real-time mobile advertisement synchronous putting method and device based on audio fingerprints |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106686401A (en) * | 2017-01-13 | 2017-05-17 | 山东鑫诚信电子科技有限公司 | Video data distributed storage method, video data distributed storage device, video data retrieval method and video data retrieval device |
CN107886959A (en) * | 2017-09-30 | 2018-04-06 | 中国农业科学院蜜蜂研究所 | A kind of method and apparatus extracted honeybee and visit flower video segment |
CN107886959B (en) * | 2017-09-30 | 2021-07-27 | 中国农业科学院蜜蜂研究所 | Method and device for extracting bee interview video clip |
CN108520078A (en) * | 2018-04-20 | 2018-09-11 | 百度在线网络技术(北京)有限公司 | Video frequency identifying method and device |
CN108520078B (en) * | 2018-04-20 | 2020-03-20 | 百度在线网络技术(北京)有限公司 | Video identification method and device |
CN109446356A (en) * | 2018-09-21 | 2019-03-08 | 深圳市九洲电器有限公司 | A kind of multimedia document retrieval method and device |
WO2020057347A1 (en) * | 2018-09-21 | 2020-03-26 | 深圳市九洲电器有限公司 | Multimedia file retrieval method and apparatus |
CN110412368A (en) * | 2019-06-27 | 2019-11-05 | 安徽继远软件有限公司 | Electrical equipment online supervision method and system based on Application on Voiceprint Recognition |
CN110958485A (en) * | 2019-10-30 | 2020-04-03 | 维沃移动通信有限公司 | Video playing method, electronic equipment and computer readable storage medium |
CN112165634A (en) * | 2020-09-29 | 2021-01-01 | 北京百度网讯科技有限公司 | Method for establishing audio classification model and method and device for automatically converting video |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106055570A (en) | Video retrieval device based on audio data and video retrieval method for same | |
WO2018072071A1 (en) | Knowledge map building system and method | |
US20200125981A1 (en) | Systems and methods for recognizing ambiguity in metadata | |
US10089392B2 (en) | Automatically selecting thematically representative music | |
US8457368B2 (en) | System and method of object recognition and database population for video indexing | |
KR20140093957A (en) | Interactive multi-modal image search | |
US20100217755A1 (en) | Classifying a set of content items | |
US20110029510A1 (en) | Method and apparatus for searching a plurality of stored digital images | |
KR100615522B1 (en) | music contents classification method, and system and method for providing music contents using the classification method | |
CN1759396A (en) | Improved data retrieval method and system | |
CN106682012A (en) | Commodity object information searching method and device | |
CN108228682A (en) | Character string verification method, character string expansion method and verification model training method | |
CN109857898A (en) | A kind of method and system of mass digital audio-frequency fingerprint storage and retrieval | |
JP5306114B2 (en) | Query extraction device, query extraction method, and query extraction program | |
JP2014500528A (en) | Enhancement of meaning using TOP-K processing | |
CN106446235A (en) | Video searching method and device | |
TW201417093A (en) | Electronic device with video/audio files processing function and video/audio files processing method | |
CN101957860B (en) | Method and device for releasing and searching information | |
TW200805251A (en) | A method and apparatus for accessing a digital file from a collection of digital films | |
CN113065018A (en) | Audio and video index library creating and retrieving method and device and electronic equipment | |
KR100644016B1 (en) | Moving picture search system and method thereof | |
JP2012015809A (en) | Music selection apparatus, music selection method, and music selection program | |
JP2005346259A5 (en) | ||
JP2010003219A (en) | Related query derivation device, and related query derivation method and program | |
JPH09223150A (en) | Information classification processing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20161026 |