CN104268279B - Query method and device for corpus data - Google Patents

Query method and device for corpus data

Info

Publication number
CN104268279B
CN104268279B (application CN201410549904.XA)
Authority
CN
China
Prior art keywords
voiceprint model
corpus data
prestored model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410549904.XA
Other languages
Chinese (zh)
Other versions
CN104268279A (en)
Inventor
张征 (Zhang Zheng)
张烁 (Zhang Shuo)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weizhen Technology (Beijing) Co., Ltd
Original Assignee
Rubik's Cube Sky Technology (Beijing) Co., Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rubik's Cube Sky Technology (Beijing) Co., Ltd
Priority to CN201410549904.XA
Publication of CN104268279A
Application granted
Publication of CN104268279B
Legal status: Active


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/04: Training, enrolment or model building

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a query method and device for corpus data. The query method for corpus data includes: obtaining a first voiceprint model of a user; searching the prestored voiceprint models in a corpus database for a voiceprint model that matches the first voiceprint model, to obtain a second voiceprint model; obtaining, according to the association between the prestored voiceprint models and the corpus data stored in the corpus database, first corpus data associated with the second voiceprint model; and sending the first corpus data to the user. The invention solves the problem of low efficiency when searching for corpus data in the prior art and improves the efficiency of corpus data lookup.

Description

Query method and device for corpus data
Technical field
The present invention relates to the multimedia field, and in particular to a query method and device for corpus data.
Background art
With the continuous development and progress of multimedia technology, more and more corpus data is produced and stored. When this corpus data needs to be retrieved, it has to be looked up file by file, comparing file names in the storage directory one by one. For large amounts of corpus data, looking up corpus data by comparing file names one by one is clearly inefficient. In addition, when file naming is not standardized, looking up corpus data by file name may fail to locate the corresponding corpus data accurately, or may fail to find it at all.
No effective solution has yet been proposed for the problem of low efficiency when searching for corpus data in the prior art.
Summary of the invention
A primary object of the present invention is to provide a query method and device for corpus data, so as to solve the problem of low efficiency when searching for corpus data in the prior art.
To achieve this goal, according to one aspect of the invention, a query method for corpus data is provided. The query method for corpus data according to the invention includes: obtaining a first voiceprint model of a user; searching the prestored voiceprint models in a corpus database for a voiceprint model that matches the first voiceprint model, to obtain a second voiceprint model; obtaining, according to the association between the prestored voiceprint models and the corpus data stored in the corpus database, first corpus data associated with the second voiceprint model; and sending the first corpus data to the user.
Further, obtaining the first corpus data associated with the second voiceprint model includes: searching the corpus database for an object identity that has a mapping relation with the second voiceprint model; obtaining second corpus data associated with the object identity; and taking the second corpus data associated with the object identity as the first corpus data.
Further, sending the first corpus data to the user includes: obtaining object information of the object identity; and sending the object information to the user when the first corpus data is sent to the user.
Further, searching the prestored voiceprint models in the corpus database for a voiceprint model that matches the first voiceprint model, to obtain the second voiceprint model, includes: computing, using posterior probabilities, the similarity between the first voiceprint model and each prestored voiceprint model, to obtain multiple similarities; comparing the multiple similarities to obtain the maximum similarity; and taking the prestored voiceprint model with the maximum similarity as the second voiceprint model.
Further, before searching the prestored voiceprint models in the corpus database for a voiceprint model that matches the first voiceprint model, the method further includes: collecting the corpus data of multiple objects indicated by multiple object identities; obtaining voiceprint models of the corpus data of the multiple objects, to obtain the prestored voiceprint models; and establishing the correspondence between the prestored voiceprint models and the multiple object identities.
Further, obtaining the voiceprint models of the corpus data of the multiple objects, to obtain the prestored voiceprint models, includes: associating the corpus data with the identities of the multiple objects; extracting speech feature parameters of each frame of the speech signal in all corpus data associated with each of the multiple object identities; training the extracted speech feature parameters of each object identity, to obtain a voiceprint model belonging to that object identity; and taking the voiceprint models belonging to the object identities as the prestored voiceprint models.
Further, after establishing the correspondence between the prestored voiceprint models and the multiple object identities, and before obtaining the first voiceprint model of the user, the method further includes: obtaining corpus data of a first object; identifying a voiceprint model in the corpus data of the first object; searching the prestored voiceprint models for a voiceprint model that matches the voiceprint model of the first object; and associating the corpus data of the first object with the object identity corresponding to the found voiceprint model.
To achieve this goal, according to another aspect of the invention, a query device for corpus data is provided. The query device for corpus data according to the invention includes: a first acquisition unit for obtaining a first voiceprint model of a user; a first searching unit for searching the prestored voiceprint models in a corpus database for a voiceprint model that matches the first voiceprint model, to obtain a second voiceprint model; a second acquisition unit for obtaining, according to the association between the prestored voiceprint models and the corpus data stored in the corpus database, first corpus data associated with the second voiceprint model; and a sending unit for sending the first corpus data to the user.
Further, the first acquisition unit includes: a searching module for searching the corpus database for an object identity that has a mapping relation with the second voiceprint model; a first acquisition module for obtaining second corpus data associated with the object identity; and a first determining module for taking the second corpus data associated with the object identity as the first corpus data.
Further, the sending unit includes: a second acquisition module for obtaining object information of the object identity; and a sending module for sending the object information to the user when the first corpus data is sent to the user.
Further, the searching unit includes: a computing module for computing, using posterior probabilities, the similarity between the first voiceprint model and each prestored voiceprint model, to obtain multiple similarities; a comparison module for comparing the multiple similarities to obtain the maximum similarity; and a second determining module for taking the prestored voiceprint model with the maximum similarity as the second voiceprint model.
Further, the device further includes: a collecting unit for collecting, before the prestored voiceprint models in the corpus database are searched for a voiceprint model that matches the first voiceprint model, the corpus data of multiple objects indicated by multiple object identities; a third acquisition unit for obtaining voiceprint models of the corpus data of the multiple objects, to obtain the prestored voiceprint models; and an establishing unit for establishing the correspondence between the prestored voiceprint models and the multiple object identities.
Further, the third acquisition unit includes: an associating module for associating the corpus data with the identities of the multiple objects; an extraction module for extracting speech feature parameters of each frame of the speech signal in all corpus data associated with each of the multiple object identities; a training module for training the extracted speech feature parameters of each object identity, to obtain a voiceprint model belonging to that object identity; and a third determining module for taking the voiceprint models belonging to the object identities as the prestored voiceprint models.
Further, the device further includes: a fourth acquisition unit for obtaining corpus data of a first object after the correspondence between the prestored voiceprint models and the multiple object identities has been established and before the first voiceprint model of the user is obtained; a recognition unit for identifying a voiceprint model in the corpus data of the first object; a second searching unit for searching the prestored voiceprint models for a voiceprint model that matches the voiceprint model of the first object; and an associating unit for associating the corpus data of the first object with the object identity corresponding to the found voiceprint model.
With the present invention, the second voiceprint model and the mapping relations between the second voiceprint model and the corpus data are stored in the corpus database. Therefore, once the second voiceprint model that matches the first voiceprint model has been found in the corpus database, the corpus data associated with the second voiceprint model can be found, so that all corpus data matching the user's first voiceprint model is found. This solves the problem of low efficiency when searching for corpus data in the prior art and improves the efficiency of corpus data lookup.
Brief description of the drawings
The accompanying drawings, which form a part of this application, are provided for a further understanding of the present invention. The exemplary embodiments of the invention and their description are used to explain the invention and do not constitute an undue limitation of the invention. In the drawings:
Fig. 1 is a flow chart of a query method for corpus data according to an embodiment of the present invention;
Fig. 2 is a flow chart of a query method for corpus data according to another embodiment of the present invention;
Fig. 3 is a schematic diagram, according to an embodiment of the present invention, of associating a corpus data table with an object information table through object identities; and
Fig. 4 is a schematic diagram of a query device for corpus data according to an embodiment of the present invention.
Detailed description of the embodiments
It should be noted that, in the case of no conflict, the embodiments of the application and the features in the embodiments may be combined with each other. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
In order to enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used may be interchanged where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "comprising" and "having", and any variations thereof, are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device comprising a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units not explicitly listed or inherent to the process, method, product or device.
The present invention provides a query method for corpus data. Fig. 1 is a flow chart of the query method for corpus data according to an embodiment of the present invention. As shown in the figure, the query method for corpus data includes the following steps:
Step S102: obtain a first voiceprint model of a user;
Step S104: search the prestored voiceprint models in a corpus database for a voiceprint model that matches the first voiceprint model, to obtain a second voiceprint model;
Step S106: obtain, according to the association between the prestored voiceprint models and the corpus data stored in the corpus database, first corpus data associated with the second voiceprint model; and
Step S108: send the first corpus data to the user.
The corpus database stores the second voiceprint model and the mapping relations between the second voiceprint model and the corpus data. Therefore, once the second voiceprint model that matches the first voiceprint model has been found in the corpus database, the corpus data associated with the second voiceprint model can be found, so that all corpus data matching the user's first voiceprint model is found. This improves the efficiency of corpus data lookup and makes it possible to accurately match the corpus data that needs to be found.
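By way of illustration only, the flow of steps S102 to S108 can be sketched as follows in Python. The helper functions extract_voiceprint and similarity are assumed placeholders (they are not defined by this description); concrete sketches of feature extraction and matching are given further below.

    def query_corpus_data(user_speech, corpus_db):
        # Step S102: obtain the user's first voiceprint model from the speech signal.
        first_model = extract_voiceprint(user_speech)

        # Step S104: among the prestored voiceprint models, pick the one with the
        # highest similarity to the first voiceprint model (the second voiceprint model).
        best_id = max(corpus_db,
                      key=lambda obj_id: similarity(first_model,
                                                    corpus_db[obj_id]["voiceprint"]))

        # Step S106: follow the stored association from the matched voiceprint model
        # (via its object identity) to the first corpus data.
        first_corpus_data = corpus_db[best_id]["corpus_data"]

        # Step S108: return the first corpus data so it can be sent to the user.
        return first_corpus_data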
The present embodiment is described below with reference to Table 1.
Table 1
Object identity | Prestored voiceprint model | Corpus data
ID1             | M1                         | Y11, Y12
ID2             | M2                         | Y2
As shown in Table 1, after the first voiceprint model of the user has been obtained, the first voiceprint model is matched one by one against the prestored voiceprint models M1 and M2. If the first voiceprint model matches the prestored voiceprint model M1, the corpus data Y11 and Y12 associated with the prestored voiceprint model M1 is obtained and sent to the user. In other words, the corpus database stores the prestored voiceprint models and the mapping relations between the prestored voiceprint models and the corpus data; once the second voiceprint model matching the first voiceprint model has been found among the prestored voiceprint models, the corpus data can be found from the mapping relations, so the corpus data matching the first voiceprint model is found quickly.
Specifically, searching the prestored voiceprint models in the corpus database for a voiceprint model that matches the first voiceprint model, to obtain the second voiceprint model, includes: computing, using posterior probabilities, the similarity between the first voiceprint model and each prestored voiceprint model, to obtain multiple similarities; comparing the multiple similarities to obtain the maximum similarity; and taking the prestored voiceprint model with the maximum similarity as the second voiceprint model.
After the user's speech signal has been acquired, the input speech signal is preprocessed to remove non-speech signals and is divided into frames. The speech feature parameters of each frame of the speech signal are extracted and stored, yielding the user's voiceprint model, i.e. the first voiceprint model. The similarity between the first voiceprint model and each of the prestored voiceprint models stored in the database is then computed using posterior probabilities, and the prestored voiceprint model with the maximum similarity is taken as the second voiceprint model.
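The description does not fix the form of the similarity beyond "posterior probability". As one assumed realization only, each prestored voiceprint model can be a per-speaker Gaussian mixture model (for example scikit-learn's GaussianMixture), and the similarity is the average log-likelihood of the user's feature frames under that model:

    def match_voiceprint(features, prestored_models):
        # features: array of shape (n_frames, n_dims) holding the per-frame speech
        # feature parameters extracted from the user's speech signal.
        # prestored_models: dict mapping object identity -> a fitted model exposing
        # score(), e.g. sklearn.mixture.GaussianMixture.
        scores = {obj_id: model.score(features)      # average log-likelihood per frame
                  for obj_id, model in prestored_models.items()}
        best_id = max(scores, key=scores.get)        # maximum similarity wins
        return best_id, scores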
Specifically, since each prestored voiceprint model in the database is trained from all corpus data belonging to the same object identity, obtaining the first corpus data associated with the second voiceprint model includes: searching the corpus database for the object identity that has a mapping relation with the second voiceprint model; obtaining second corpus data associated with the object identity; and taking the second corpus data associated with the object identity as the first corpus data.
For example, the object identity ID1 is the unique identity of Passerby A, and the corpus data Y11 and Y12 corresponding to the object identity ID1 are both speech files belonging to Passerby A. The prestored voiceprint model M1 is extracted from the corpus data Y11 and Y12 and characterizes the speech features of Passerby A. When the prestored voiceprint model M1 was obtained from the corpus data Y11 and Y12, the mapping relation between the object identity ID1 of Passerby A and the prestored voiceprint model M1 was established. After the second voiceprint model M1 matching the first voiceprint model has been obtained, the corpus data associated with the second voiceprint model M1 can therefore be determined to be Y11 and Y12, according to the mapping relation between the second voiceprint model M1 and the object identity ID1 and the association between the object identity ID1 and the corpus data Y11 and Y12.
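As a minimal illustration of this lookup chain, using plain dictionaries that mirror Table 1 (the values are the example values above, not a prescribed schema):

    # Mapping relations mirroring Table 1: prestored voiceprint model -> object
    # identity, and object identity -> associated corpus data.
    voiceprint_to_id = {"M1": "ID1", "M2": "ID2"}
    id_to_corpus = {"ID1": ["Y11", "Y12"], "ID2": ["Y2"]}

    def corpus_for_model(second_model_id):
        # Object identity that has a mapping relation with the second voiceprint model.
        obj_id = voiceprint_to_id[second_model_id]
        # All corpus data associated with that identity is taken as the first corpus data.
        return id_to_corpus[obj_id]

    assert corpus_for_model("M1") == ["Y11", "Y12"]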
Alternatively, sending the first corpus data to the user includes: obtaining object information of the object identity; and sending the object information to the user when the first corpus data is sent to the user.
To provide the user with more information, the database also stores object information for each object identity, as shown in Table 2.
Table 2
Compared with Table 1, Table 2 also records object information, and the corpus data also records audio, video and text content. The object information may be the object names shown in Table 2, such as Passerby A and Passerby B, and may also include a photo of the object and so on. If there are many kinds of object information, a separate object information table can be created to store the object information and can be associated with the corpus data through the object identities; after the object identity has been determined, the object information associated with that object identity is looked up in the object information table and sent to the user.
The audio and video in the corpus data of Table 2 are associated with the object information; that is, the audio and video belong to the object corresponding to the associated object information. For example, the audio A1, the video V1 and the text T1 belong to the corpus data of Passerby A: the sound in the audio A1 and the video V1 comes from Passerby A, and the text T1 is the written text corresponding to the speech in the audio A1 and the video V1, for example the lines of a TV series. When the corpus data is sent to the user, the audio A1, the video V1 and the text T1 can be sent together, and the user can follow the text while listening to the audio A1 or watching the video V1.
The audio and video files may be films, operas, dramas and similar works, which the user can use for imitation, dubbing and learning. For example, the object information corresponding to the object identity ID2 is Mei Lanfang, the audio A2 is an audio file of Mei Lanfang's Farewell My Concubine, the video V2 is a video file of Mei Lanfang's Farewell My Concubine, and the text T2 is the lines of Farewell My Concubine; the prestored voiceprint model M2 is a model trained from the audio A2 and the video V2 and reflects Mei Lanfang's voiceprint features. Suppose the user now wants to learn Mei Lanfang's Farewell My Concubine: the first voiceprint model is extracted from a speech file provided by the user, and once the second voiceprint model M2 matching the first voiceprint model has been found, the video file and the audio file of Farewell My Concubine have also been found. The user can then study and imitate Mei Lanfang's Farewell My Concubine through the video and audio files, thereby learning from existing multimedia files. Similarly, audio or video files of film and television works can be used for dubbing and the like in the same way, which is not repeated here.
Alternatively, before the voiceprint model matching the first voiceprint model is searched for, a database storing the corpus data is established. Before searching the prestored voiceprint models in the corpus database for a voiceprint model that matches the first voiceprint model, the method further includes the following steps shown in Fig. 2:
Step S202: collect the corpus data of multiple objects indicated by multiple object identities.
Step S204: obtain voiceprint models of the corpus data of the multiple objects, to obtain the prestored voiceprint models.
Step S206: establish the correspondence between the prestored voiceprint models and the multiple object identities.
The corpus data of the multiple objects indicated by the multiple object identities is collected, for example the corpus data Y11, Y12 and Y2 in Table 1. To make it easier for the database to look up and store the data, an object information table and a corpus data table are created separately. The object information table stores information such as names and portraits, the corpus data table stores files such as audio, video and text, and the object information table can be associated with the corpus data table through the object identities, as shown in Fig. 3.
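One possible relational layout for the two tables of Fig. 3 is sketched below; the table and column names are assumptions for illustration and are not prescribed by this description. The object identity is the joining key between the object information table and the corpus data table.

    import sqlite3

    # Illustrative layout of the two tables in Fig. 3 (names are assumptions).
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE object_info (
            object_id   TEXT PRIMARY KEY,   -- e.g. ID1, ID2
            name        TEXT,               -- object name, e.g. a performer
            photo       BLOB                -- optional photo of the object
        );
        CREATE TABLE corpus_data (
            corpus_id   TEXT PRIMARY KEY,   -- e.g. Y11, Y12, Y2
            object_id   TEXT REFERENCES object_info(object_id),
            audio_path  TEXT,               -- audio file
            video_path  TEXT,               -- video file
            text_path   TEXT                -- text file, e.g. lines or script
        );
    """)

    # Looking up all corpus data associated with one object identity:
    rows = conn.execute(
        "SELECT audio_path, video_path, text_path FROM corpus_data WHERE object_id = ?",
        ("ID1",),
    ).fetchall()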
After the prestored voiceprint models have been obtained from all corpus data belonging to the same object, the prestored voiceprint models are put into correspondence with the multiple object identities, i.e. one object identity corresponds to one prestored voiceprint model. An object identity can therefore be determined from a prestored voiceprint model, and the corpus data can in turn be found from the object identity.
After the correspondence between the prestored voiceprint models and the object identities has been established, the multiple associated corpus data items can be found through the object identity, which avoids having to establish a separate association between the prestored voiceprint models and the corpus data. The mapping relations stored in the database are therefore simpler and are computed more efficiently, the second voiceprint model is found in the database more efficiently, and the efficiency of corpus data lookup is improved accordingly.
Specifically, obtaining the voiceprint models of the corpus data of the multiple objects, to obtain the prestored voiceprint models, includes: associating the corpus data with the identities of the multiple objects; extracting speech feature parameters of each frame of the speech signal in all corpus data associated with each of the multiple object identities; training the extracted speech feature parameters of each object identity, to obtain a voiceprint model belonging to that object identity; and taking the voiceprint models belonging to the object identities as the prestored voiceprint models.
As in Table 1, the corpus data is associated with the identities of the multiple objects, and one object identity can be associated with several items of corpus data; for example, in Table 1 the object identity ID1 is associated with the corpus data Y11 and Y12. The corpus data is preprocessed to remove non-speech signals, the speech signals in the corpus data Y11 and Y12 are divided into frames, speech feature parameters are extracted from each frame of the speech signal in the corpus data Y11 and Y12, and the extracted speech feature parameters are trained to obtain the voiceprint model M1 belonging to the object identity ID1. The voiceprint model M1 reflects the voiceprint features of the object corresponding to the object identity ID1. The same operations are performed on the corpus data associated with each of the multiple object identities to obtain a voiceprint model for each object identity; the voiceprint models corresponding to all the object identities constitute the prestored voiceprint models.
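A sketch of this training step under explicit assumptions: MFCCs computed with the librosa library stand in for the per-frame speech feature parameters, and each object's voiceprint model is a Gaussian mixture fitted on all frames of all corpus data associated with that object identity. The description itself does not mandate these particular tools.

    import numpy as np
    import librosa                                    # assumed feature-extraction library
    from sklearn.mixture import GaussianMixture       # assumed model family

    def frame_features(path, sr=16000, n_mfcc=13):
        # Per-frame speech feature parameters for one corpus file; MFCCs are only
        # an assumed stand-in for the feature parameters described above.
        signal, _ = librosa.load(path, sr=sr)
        # Remove non-speech (silent) segments before framing, as described above.
        intervals = librosa.effects.split(signal, top_db=30)
        if len(intervals):
            signal = np.concatenate([signal[s:e] for s, e in intervals])
        return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).T   # (frames, dims)

    def train_prestored_models(corpus_by_object):
        # corpus_by_object: dict mapping object identity -> list of corpus file paths.
        prestored = {}
        for obj_id, paths in corpus_by_object.items():
            # Pool the frames of all corpus data associated with this object identity
            # and train one voiceprint model per object identity (steps S204 and S206).
            frames = np.vstack([frame_features(p) for p in paths])
            prestored[obj_id] = GaussianMixture(n_components=16).fit(frames)
        return prestored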
Preferably, after the correspondence between the prestored voiceprint models and the multiple object identities has been established, and before the first voiceprint model of the user is obtained, the method further includes: obtaining corpus data of a first object; identifying a voiceprint model in the corpus data of the first object; searching the prestored voiceprint models for a voiceprint model that matches the voiceprint model of the first object; and associating the corpus data of the first object with the object identity corresponding to the found voiceprint model.
Once the correspondence between the prestored voiceprint models and the multiple object identities has been established, the voiceprint model database is available. When new corpus data is added, the prestored voiceprint model matching the voiceprint model of the new corpus data is searched for, and the new corpus data is associated with the object identity corresponding to the found prestored voiceprint model. New corpus data is thus stored automatically and is stored in the database in association with the existing object identities, which makes data maintenance of the database easier.
For example, Mei Lanfang's Farewell My Concubine is stored in the database. When an audio file of Mei Lanfang's Drunken Concubine needs to be added to the database, the voiceprint model of the Drunken Concubine audio file is extracted; the matching model found among the prestored voiceprint models according to this voiceprint model is the voiceprint model M2, so the Drunken Concubine audio is associated with the object identity ID2 corresponding to the found voiceprint model M2, and the Drunken Concubine audio is thereby added to the database and associated with that object identity. If the user then needs to retrieve all of Mei Lanfang's recordings, the corpus data associated with Mei Lanfang is found according to the prestored voiceprint model matching Mei Lanfang's voice, and the corpus data found includes not only Farewell My Concubine but also Drunken Concubine.
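The automatic enrollment of new corpus data can be sketched by reusing the assumed helpers and tables introduced above (frame_features, match_voiceprint and the illustrative corpus_data table); this is only an assumed sketch, not a definitive implementation.

    def add_corpus_data(conn, prestored_models, corpus_id, audio_path,
                        video_path=None, text_path=None):
        # Extract the voiceprint of the new corpus data and find the matching
        # prestored voiceprint model (e.g. M2 for the Drunken Concubine audio).
        features = frame_features(audio_path)
        matched_id, _ = match_voiceprint(features, prestored_models)
        # Associate the new corpus data with the object identity (e.g. ID2) that
        # corresponds to the matched prestored voiceprint model.
        conn.execute(
            "INSERT INTO corpus_data (corpus_id, object_id, audio_path, video_path, text_path) "
            "VALUES (?, ?, ?, ?, ?)",
            (corpus_id, matched_id, audio_path, video_path, text_path),
        )
        conn.commit()
        return matched_id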
The above embodiments of the present invention achieve the following effects:
1. By receiving the user's speech signal and searching for the prestored voiceprint model matching that speech signal, i.e. by using voiceprint recognition, and by using the correspondence between the object identities, the prestored voiceprint models and the prestored corpus data, the corpus data corresponding to the matched prestored voiceprint model is found quickly, which improves the efficiency of corpus data lookup;
2. Using the prestored voiceprint models, new corpus data can be added to the database quickly by comparing voiceprint models, and mapping relations with the corresponding object identities are established, which makes database maintenance easier.
An embodiment of the present invention further provides a query device for corpus data. The query method for corpus data of the embodiment of the present invention can be performed by the query device for corpus data provided by the embodiment of the present invention, and the query device for corpus data of the embodiment of the present invention can be used to perform the query method for corpus data provided by the embodiment of the present invention.
Fig. 4 is a schematic diagram of the query device for corpus data according to an embodiment of the present invention. As shown in the figure, the query device for corpus data includes: a first acquisition unit 10, a first searching unit 30, a second acquisition unit 50 and a sending unit 70.
The first acquisition unit 10 is configured to obtain a first voiceprint model of a user;
The first searching unit 30 is configured to search the prestored voiceprint models in a corpus database for a voiceprint model that matches the first voiceprint model, to obtain a second voiceprint model;
The second acquisition unit 50 is configured to obtain, according to the association between the prestored voiceprint models and the corpus data stored in the corpus database, first corpus data associated with the second voiceprint model;
The sending unit 70 is configured to send the first corpus data to the user.
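As a rough structural sketch only, the functional split of Fig. 4 can be mirrored by one class whose methods reuse the assumed helpers from the earlier sketches; method and attribute names are illustrative, not taken from the description.

    class CorpusQueryDevice:
        # Structural sketch of the units in Fig. 4, not a definitive implementation.

        def __init__(self, prestored_models, id_to_corpus):
            self.prestored_models = prestored_models   # object identity -> voiceprint model
            self.id_to_corpus = id_to_corpus           # object identity -> corpus data

        def first_acquisition_unit(self, user_speech):
            # Obtain the first voiceprint model of the user.
            return frame_features(user_speech)

        def first_searching_unit(self, first_model):
            # Find the prestored voiceprint model matching the first voiceprint model.
            matched_id, _ = match_voiceprint(first_model, self.prestored_models)
            return matched_id

        def second_acquisition_unit(self, matched_id):
            # Obtain the first corpus data associated with the second voiceprint model.
            return self.id_to_corpus[matched_id]

        def sending_unit(self, user, first_corpus_data):
            # Stand-in for delivering the first corpus data to the user.
            return user, first_corpus_data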
The corpus database stores the second voiceprint model and the mapping relations between the second voiceprint model and the corpus data. Therefore, once the second voiceprint model that matches the first voiceprint model has been found in the corpus database, the corpus data associated with the second voiceprint model can be found, so that all corpus data matching the user's first voiceprint model is found. This improves the efficiency of corpus data lookup and makes it possible to accurately match the corpus data that needs to be found.
The present embodiment is described below with reference to Table 1.
As shown in Table 1, after the first voiceprint model of the user has been obtained, the first voiceprint model is matched one by one against the prestored voiceprint models M1 and M2. If the first voiceprint model matches the prestored voiceprint model M1, the corpus data Y11 and Y12 associated with the prestored voiceprint model M1 is obtained and sent to the user. In other words, the corpus database stores the prestored voiceprint models and the mapping relations between the prestored voiceprint models and the corpus data; once the second voiceprint model matching the first voiceprint model has been found among the prestored voiceprint models, the corpus data can be found from the mapping relations, so the corpus data matching the first voiceprint model is found quickly.
Specifically, the searching unit includes: a computing module for computing, using posterior probabilities, the similarity between the first voiceprint model and each prestored voiceprint model, to obtain multiple similarities; a comparison module for comparing the multiple similarities to obtain the maximum similarity; and a second determining module for taking the prestored voiceprint model with the maximum similarity as the second voiceprint model.
After the user's speech signal has been acquired, the input speech signal is preprocessed to remove non-speech signals and is divided into frames. The speech feature parameters of each frame of the speech signal are extracted and stored, yielding the user's voiceprint model, i.e. the first voiceprint model. The similarity between the first voiceprint model and each of the prestored voiceprint models stored in the database is then computed using posterior probabilities, and the prestored voiceprint model with the maximum similarity is taken as the second voiceprint model.
Specifically, since each prestored voiceprint model in the database is trained from all corpus data belonging to the same object identity, the first acquisition unit includes: a searching module for searching the corpus database for an object identity that has a mapping relation with the second voiceprint model; a first acquisition module for obtaining second corpus data associated with the object identity; and a first determining module for taking the second corpus data associated with the object identity as the first corpus data.
For example, the object identity ID1 is the unique identity of Passerby A, and the corpus data Y11 and Y12 corresponding to the object identity ID1 are both speech files belonging to Passerby A. The prestored voiceprint model M1 is extracted from the corpus data Y11 and Y12 and characterizes the speech features of Passerby A. When the prestored voiceprint model M1 was obtained from the corpus data Y11 and Y12, the mapping relation between the object identity ID1 of Passerby A and the prestored voiceprint model M1 was established. After the second voiceprint model M1 matching the first voiceprint model has been obtained, the corpus data associated with the second voiceprint model M1 can therefore be determined to be Y11 and Y12, according to the mapping relation between the second voiceprint model M1 and the object identity ID1 and the association between the object identity ID1 and the corpus data Y11 and Y12.
Alternatively, the sending unit includes: a second acquisition module for obtaining object information of the object identity; and a sending module for sending the object information to the user when the first corpus data is sent to the user.
To provide the user with more information, the database also stores object information for each object identity, as shown in Table 2. Compared with Table 1, Table 2 also records object information, and the corpus data also records audio, video and text content. The object information may be the object names shown in Table 2, such as Passerby A and Passerby B, and may also include a photo of the object and so on. If there are many kinds of object information, a separate object information table can be created to store the object information and can be associated with the corpus data through the object identities; after the object identity has been determined, the object information associated with that object identity is looked up in the object information table and sent to the user.
The audio and video in the corpus data of Table 2 are associated with the object information; that is, the audio and video belong to the object corresponding to the associated object information. For example, the audio A1, the video V1 and the text T1 belong to the corpus data of Passerby A: the sound in the audio A1 and the video V1 comes from Passerby A, and the text T1 is the written text corresponding to the speech in the audio A1 and the video V1, for example the lines of a TV series. When the corpus data is sent to the user, the audio A1, the video V1 and the text T1 can be sent together, and the user can follow the text while listening to the audio A1 or watching the video V1.
The audio and video files may be films, operas, dramas and similar works, which the user can use for imitation, dubbing and learning. For example, the object information corresponding to the object identity ID2 is Mei Lanfang, the audio A2 is an audio file of Mei Lanfang's Farewell My Concubine, the video V2 is a video file of Mei Lanfang's Farewell My Concubine, and the text T2 is the lines of Farewell My Concubine; the prestored voiceprint model M2 is a model trained from the audio A2 and the video V2 and reflects Mei Lanfang's voiceprint features. Suppose the user now wants to learn Mei Lanfang's Farewell My Concubine: the first voiceprint model is extracted from a speech file provided by the user, and once the second voiceprint model M2 matching the first voiceprint model has been found, the video file and the audio file of Farewell My Concubine have also been found. The user can then study and imitate Mei Lanfang's Farewell My Concubine through the video and audio files, thereby learning from existing multimedia files. Similarly, audio or video files of film and television works can be used for dubbing and the like in the same way, which is not repeated here.
Alternatively, before the voiceprint model matching the first voiceprint model is searched for, a database storing the corpus data is established. The device further includes: a collecting unit for collecting, before the prestored voiceprint models in the corpus database are searched for a voiceprint model that matches the first voiceprint model, the corpus data of multiple objects indicated by multiple object identities; a third acquisition unit for obtaining voiceprint models of the corpus data of the multiple objects, to obtain the prestored voiceprint models; and an establishing unit for establishing the correspondence between the prestored voiceprint models and the multiple object identities.
The corpus data of the multiple objects indicated by the multiple object identities is collected, for example the corpus data Y11, Y12 and Y2 in Table 1. To make it easier for the database to look up and store the data, an object information table and a corpus data table are created separately. The object information table stores information such as names and portraits, the corpus data table stores files such as audio, video and text, and the object information table can be associated with the corpus data table through the object identities, as shown in Fig. 3.
After the prestored voiceprint models have been obtained from all corpus data belonging to the same object, the prestored voiceprint models are put into correspondence with the multiple object identities, i.e. one object identity corresponds to one prestored voiceprint model. An object identity can therefore be determined from a prestored voiceprint model, and the corpus data can in turn be found from the object identity.
After the correspondence between the prestored voiceprint models and the object identities has been established, the multiple associated corpus data items can be found through the object identity, which avoids having to establish a separate association between the prestored voiceprint models and the corpus data. The mapping relations stored in the database are therefore simpler and are computed more efficiently, the second voiceprint model is found in the database more efficiently, and the efficiency of corpus data lookup is improved accordingly.
Specifically, the third acquisition unit includes: an associating module for associating the corpus data with the identities of the multiple objects; an extraction module for extracting speech feature parameters of each frame of the speech signal in all corpus data associated with each of the multiple object identities; a training module for training the extracted speech feature parameters of each object identity, to obtain a voiceprint model belonging to that object identity; and a third determining module for taking the voiceprint models belonging to the object identities as the prestored voiceprint models.
As in Table 1, the corpus data is associated with the identities of the multiple objects, and one object identity can be associated with several items of corpus data; for example, in Table 1 the object identity ID1 is associated with the corpus data Y11 and Y12. The corpus data is preprocessed to remove non-speech signals, the speech signals in the corpus data Y11 and Y12 are divided into frames, speech feature parameters are extracted from each frame of the speech signal in the corpus data Y11 and Y12, and the extracted speech feature parameters are trained to obtain the voiceprint model M1 belonging to the object identity ID1. The voiceprint model M1 reflects the voiceprint features of the object corresponding to the object identity ID1. The same operations are performed on the corpus data associated with each of the multiple object identities to obtain a voiceprint model for each object identity; the voiceprint models corresponding to all the object identities constitute the prestored voiceprint models.
Preferably, the device further includes: a fourth acquisition unit for obtaining corpus data of a first object after the correspondence between the prestored voiceprint models and the multiple object identities has been established and before the first voiceprint model of the user is obtained; a recognition unit for identifying a voiceprint model in the corpus data of the first object; a second searching unit for searching the prestored voiceprint models for a voiceprint model that matches the voiceprint model of the first object; and an associating unit for associating the corpus data of the first object with the object identity corresponding to the found voiceprint model.
Once the correspondence between the prestored voiceprint models and the multiple object identities has been established, the voiceprint model database is available. When new corpus data is added, the prestored voiceprint model matching the voiceprint model of the new corpus data is searched for, and the new corpus data is associated with the object identity corresponding to the found prestored voiceprint model. New corpus data is thus stored automatically and is stored in the database in association with the existing object identities, which makes data maintenance of the database easier.
For example, Mei Lanfang's Farewell My Concubine is stored in the database. When an audio file of Mei Lanfang's Drunken Concubine needs to be added to the database, the voiceprint model of the Drunken Concubine audio file is extracted; the matching model found among the prestored voiceprint models according to this voiceprint model is the voiceprint model M2, so the Drunken Concubine audio is associated with the object identity ID2 corresponding to the found voiceprint model M2, and the Drunken Concubine audio is thereby added to the database and associated with that object identity. If the user then needs to retrieve all of Mei Lanfang's recordings, the corpus data associated with Mei Lanfang is found according to the prestored voiceprint model matching Mei Lanfang's voice, and the corpus data found includes not only Farewell My Concubine but also Drunken Concubine.
The above embodiments of the present invention achieve the following effects:
1. By receiving the user's speech signal and searching for the prestored voiceprint model matching that speech signal, i.e. by using voiceprint recognition, and by using the correspondence between the object identities, the prestored voiceprint models and the prestored corpus data, the corpus data corresponding to the matched prestored voiceprint model is found quickly, which improves the efficiency of corpus data lookup;
2. Using the prestored voiceprint models, new corpus data can be added to the database quickly by comparing voiceprint models, and mapping relations with the corresponding object identities are established, which makes database maintenance easier.
If the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing one or more computer devices (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, various modifications and variations may be made to the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (12)

  1. A query method for corpus data, characterized by comprising:
    obtaining a first voiceprint model of a user;
    searching prestored voiceprint models in a corpus database for a voiceprint model that matches the first voiceprint model, to obtain a second voiceprint model;
    obtaining, according to an association between the prestored voiceprint models and corpus data stored in the corpus database, first corpus data associated with the second voiceprint model; and
    sending the first corpus data to the user;
    wherein the prestored voiceprint models are obtained by collecting corpus data of multiple objects indicated by multiple object identities and obtaining voiceprint models of the corpus data of the multiple objects, and a correspondence between the prestored voiceprint models and the multiple object identities is established;
    after establishing the correspondence between the prestored voiceprint models and the multiple object identities, and before obtaining the first voiceprint model of the user, the method further comprises:
    obtaining corpus data of a first object;
    identifying a voiceprint model in the corpus data of the first object;
    searching the prestored voiceprint models for a voiceprint model that matches the voiceprint model of the first object; and
    associating the corpus data of the first object with the object identity corresponding to the found voiceprint model.
  2. The method according to claim 1, characterized in that obtaining the first corpus data associated with the second voiceprint model comprises:
    searching the corpus database for an object identity that has a mapping relation with the second voiceprint model;
    obtaining second corpus data associated with the object identity; and
    taking the second corpus data associated with the object identity as the first corpus data.
  3. The method according to claim 2, characterized in that sending the first corpus data to the user comprises:
    obtaining object information of the object identity; and
    sending the object information to the user when the first corpus data is sent to the user.
  4. The method according to claim 1, characterized in that searching the prestored voiceprint models in the corpus database for a voiceprint model that matches the first voiceprint model, to obtain the second voiceprint model, comprises:
    computing, using posterior probabilities, a similarity between the first voiceprint model and each of the prestored voiceprint models, to obtain multiple similarities;
    comparing the multiple similarities to obtain a maximum similarity; and
    taking the prestored voiceprint model having the maximum similarity as the second voiceprint model.
  5. The method according to claim 1, characterized in that, before searching the prestored voiceprint models in the corpus database for a voiceprint model that matches the first voiceprint model, the method further comprises:
    collecting the corpus data of the multiple objects indicated by the multiple object identities;
    obtaining the voiceprint models of the corpus data of the multiple objects, to obtain the prestored voiceprint models; and
    establishing the correspondence between the prestored voiceprint models and the multiple object identities.
  6. The method according to claim 5, characterized in that obtaining the voiceprint models of the corpus data of the multiple objects, to obtain the prestored voiceprint models, comprises:
    associating the corpus data with the identities of the multiple objects;
    extracting speech feature parameters of each frame of a speech signal in all corpus data associated with each of the multiple object identities;
    training the extracted speech feature parameters of each object identity, to obtain a voiceprint model belonging to that object identity; and
    taking the voiceprint models belonging to the object identities as the prestored voiceprint models.
  7. A kind of 7. inquiry unit of corpus data, it is characterised in that including:
    First acquisition unit, for obtaining the first sound-groove model of user;
    First searching unit, for searching the phase with first sound-groove model from the sound-groove model that prestores in corpus data storehouse The sound-groove model matched somebody with somebody, obtains the second sound-groove model;
    Second acquisition unit, for sound-groove model and the corpus data of prestoring according to being prestored in the corpus data storehouse Incidence relation, obtain first corpus data associated with second sound-groove model;And
    Transmitting element, for first corpus data to be sent to the user;
    Wherein, described device further includes:
    4th acquiring unit, for after the correspondence of prestore described in foundation sound-groove model and multiple object identities, and is obtained Before taking first sound-groove model at family, the corpus data of the first object is obtained, wherein, it is signified by collecting multiple object identities The corpus data of the multiple objects shown, and the sound-groove model of the corpus data of the multiple object is obtained, obtain the sound that prestores Line model, the correspondence of prestore described in foundation sound-groove model and the multiple object identity;
    Recognition unit, the sound-groove model in corpus data for identifying first object;
    Second searching unit, for searching the matched sound of sound-groove model with first object from the sound-groove model that prestores Line model;And
    Associative cell, for the object mark being associated with the corpus data of first object corresponding to the sound-groove model found Know.
  8. 8. device according to claim 7, it is characterised in that the first acquisition unit includes:
    Searching module, for searching the object mark that there are mapping relations with second sound-groove model from the corpus data storehouse Know;
    First acquisition module, for obtaining and associated second corpus data of the object identity;And
    First determining module, for will be with associated second corpus data of the object identity as the first language material number According to.
  9. 9. device according to claim 8, it is characterised in that the transmitting element includes:
    Second acquisition module, for obtaining the object information of the object identity;And
    Sending module, for when first corpus data is sent to the user, the object information to be sent to institute State user.
  10. 10. device according to claim 7, it is characterised in that the searching unit includes:
    Computing module, for calculated respectively using posterior probability first sound-groove model with it is every in the sound-groove model that prestores The similarity of a sound-groove model that prestores, obtains multiple similarities;
    Comparison module, for more the multiple similarity, obtains maximum similarity;And
    Second determining module, for will there is the sound-groove model of the maximum similarity as described in the sound-groove model that prestores Second sound-groove model.
  11. 11. device according to claim 7, it is characterised in that described device further includes:
    Collector unit, matches for being searched in the sound-groove model that prestores from corpus data storehouse with first sound-groove model Sound-groove model before, collect the corpus datas of multiple objects indicated by multiple object identities;
    3rd acquiring unit, the sound-groove model of the corpus data for obtaining the multiple object, obtains the vocal print mould that prestores Type;And
    Unit is established, for establishing the correspondence of the prestore sound-groove model and the multiple object identity.
  12. The device according to claim 11, characterized in that the third acquiring unit comprises:
    an associating module, configured to associate the corpus data with the identifiers of the multiple objects;
    an extracting module, configured to extract a speech characteristic parameter of each frame of voice signal in all corpus data associated with each of the multiple object identifiers;
    a training module, configured to train the extracted speech characteristic parameters of each object identifier to obtain a voiceprint model belonging to that object identifier; and
    a third determining module, configured to take the voiceprint models belonging to the respective object identifiers as the pre-stored voiceprint models.
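
Claims 11 and 12 describe the enrollment side: collect corpus data per object identifier, extract a speech characteristic parameter for each frame, and train one voiceprint model per identifier. The sketch below assumes MFCCs as the frame-level parameters and a Gaussian mixture model as the voiceprint model; librosa and scikit-learn are illustrative tool choices, not part of the claims.

    import numpy as np
    import librosa
    from sklearn.mixture import GaussianMixture

    def train_prestored_models(corpus_by_object, n_components=8):
        """corpus_by_object: dict mapping object identifier -> list of WAV paths.
        Returns a dict mapping object identifier -> fitted voiceprint model."""
        prestored = {}
        for object_id, wav_paths in corpus_by_object.items():   # associating module
            frames = []
            for path in wav_paths:
                signal, sr = librosa.load(path, sr=None)
                mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)  # extracting module
                frames.append(mfcc.T)                                    # (n_frames, 13)
            features = np.vstack(frames)
            gmm = GaussianMixture(n_components=n_components).fit(features)  # training module
            prestored[object_id] = gmm                 # third determining module
        return prestored
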
CN201410549904.XA 2014-10-16 2014-10-16 The querying method and device of corpus data Active CN104268279B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410549904.XA CN104268279B (en) 2014-10-16 2014-10-16 The querying method and device of corpus data

Publications (2)

Publication Number Publication Date
CN104268279A CN104268279A (en) 2015-01-07
CN104268279B true CN104268279B (en) 2018-04-20

Family

ID=52159800

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410549904.XA Active CN104268279B (en) 2014-10-16 2014-10-16 The querying method and device of corpus data

Country Status (1)

Country Link
CN (1) CN104268279B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105679296A (en) * 2015-12-28 2016-06-15 百度在线网络技术(北京)有限公司 Instrumental performance assessment method and device
CN108962253A (en) * 2017-05-26 2018-12-07 北京搜狗科技发展有限公司 Voice-based data processing method and device, and electronic equipment
CN108364654B (en) * 2018-01-30 2020-10-13 网易乐得科技有限公司 Voice processing method, medium, device and computing equipment
CN108922543B (en) * 2018-06-11 2022-08-16 平安科技(深圳)有限公司 Model base establishing method, voice recognition method, device, equipment and medium
CN108986825A (en) * 2018-07-02 2018-12-11 北京百度网讯科技有限公司 Context acquisition method and device based on voice interaction
CN109129509A (en) * 2018-09-17 2019-01-04 金碧地智能科技(珠海)有限公司 Elderly-care companion robot based on intelligent screen interaction
CN111368191B (en) * 2020-02-29 2021-04-02 重庆百事得大牛机器人有限公司 User portrait system based on legal consultation interaction process
CN113327622A (en) * 2021-06-02 2021-08-31 云知声(上海)智能科技有限公司 Voice separation method and device, electronic equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102404278A (en) * 2010-09-08 2012-04-04 盛乐信息技术(上海)有限公司 Song request system based on voiceprint recognition and application method thereof
CN102831890A (en) * 2011-06-15 2012-12-19 镇江佳得信息技术有限公司 Method for recognizing text-independent voice prints
CN103035247A (en) * 2012-12-05 2013-04-10 北京三星通信技术研究有限公司 Method and device for operating on audio/video files based on voiceprint information
CN103077713A (en) * 2012-12-25 2013-05-01 青岛海信电器股份有限公司 Speech processing method and device
CN103853778A (en) * 2012-12-04 2014-06-11 大陆汽车投资(上海)有限公司 Methods for updating music label information and pushing music, as well as corresponding device and system
CN103956168A (en) * 2014-03-29 2014-07-30 深圳创维数字技术股份有限公司 Voice recognition method and device, and terminal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070033044A1 (en) * 2005-08-03 2007-02-08 Texas Instruments, Incorporated System and method for creating generalized tied-mixture hidden Markov models for automatic speech recognition

Also Published As

Publication number Publication date
CN104268279A (en) 2015-01-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200409

Address after: Room 522, floor 5, chuangji Building 1, No. 10 yard, Longyu North Street, Huilongguan, Changping District, Beijing 100085

Patentee after: Weizhen Technology (Beijing) Co., Ltd

Address before: 100193 Beijing city Haidian District Dongbeiwang West Road No. 8 Zhongguancun Software Park Building 5, building 2 207 Hanvon

Patentee before: Mofun Sky Science & Technology (Beijing) Co.,Ltd.

TR01 Transfer of patent right