CN109582822A - Music recommendation method and device based on user speech - Google Patents
Music recommendation method and device based on user speech
- Publication number
- CN109582822A CN109582822A CN201811222418.1A CN201811222418A CN109582822A CN 109582822 A CN109582822 A CN 109582822A CN 201811222418 A CN201811222418 A CN 201811222418A CN 109582822 A CN109582822 A CN 109582822A
- Authority
- CN
- China
- Prior art keywords
- user
- subscriber
- class
- music
- voice data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/22—Interactive procedures; Man-machine interfaces
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
The application provides a music recommendation method and device based on user speech. The method includes: obtaining the user's voice data and extracting the corresponding user category and recognized text from the voice data; obtaining corresponding recommended music according to the user category and the recognized text; and presenting the corresponding recommended music to the user. This avoids the prior-art problem that the user's age is not taken into account during music recommendation, which leads to inaccurate recommendations for older users. The recommendation strategy is more complete and the recommendations are more accurate, thereby improving user satisfaction.
Description
[technical field]
This application relates to the field of artificial intelligence applications, and in particular to a music recommendation method and device based on user speech.
[background technique]
Artificial intelligence (AI) is a new technical science that researches and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. As a branch of computer science, it attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. Research in this field includes robotics, speech recognition, image recognition, natural language processing, and expert systems.
In recent years, artificial intelligence technology has developed far-reachingly and is gradually being commercialized. Intelligent voice dialogue products in particular, following the rise abroad of the Amazon Echo and Google Home smart speakers, have set off a wave of popularity for smart home products, especially smart speaker products, whose main interaction mode is dialogue.
The typical usage scenario of intelligent voice dialogue products, including smart speakers, is the home, where it is very natural for users to interact with machines by voice. A frequent application in such interaction is playing songs according to the user's voice.
In current smart home products, when a user expresses an imprecise demand such as "play a song", the cloud generally recommends some new songs or some popular songs at random. These recommended songs, however, do not take the user's age into account. Because older users spend relatively little time online, their searches and download records account for only a small proportion of the data recorded on the server; new songs and popular songs therefore tend to cover only the demands of some young users and are often unable to satisfy older users.
As can be seen, in traditional music recommendation methods the content recommendation strategy is imperfect and inaccurate, and user satisfaction is not high.
[summary of the invention]
Various aspects of the application provide a music recommendation method and device based on user speech, so as to provide personalized service for users.
One aspect of the application provides a music recommendation method based on user speech, comprising:
obtaining the user's voice data and extracting the corresponding user category and recognized text from the voice data;
obtaining corresponding recommended music according to the user category and the recognized text;
presenting the corresponding recommended music to the user.
In the above aspect and any possible implementation, an implementation is further provided in which the user category includes the user's gender and age range.
In the above aspect and any possible implementation, an implementation is further provided in which extracting the corresponding user category from the voice data includes:
identifying, by means of voiceprint recognition according to the obtained user speech, the user category of the user issuing the command voice.
In the above aspect and any possible implementation, an implementation is further provided in which, before identifying the user category of the user issuing the command voice by means of voiceprint recognition according to the obtained user speech, the method further includes:
performing model training according to the sound characteristics of different user categories to establish voiceprint processing models for the different user categories.
In the above aspect and any possible implementation, an implementation is further provided in which extracting the corresponding recognized text from the voice data includes:
performing speech recognition on the voice data using a speech recognition model to obtain the text request corresponding to the voice data; or
performing speech recognition on the voice data using the speech recognition model of the corresponding user category to obtain the text request corresponding to the voice data.
In the above aspect and any possible implementation, an implementation is further provided in which obtaining corresponding recommended music according to the user category and the recognized text includes:
judging the type of the recognized text;
if it is a precise demand, retrieving the corresponding recommended music according to the recognized text;
if it is a general demand, obtaining the corresponding recommended music from the recommendation music library corresponding to the user category.
In the above aspect and any possible implementation, an implementation is further provided in which, before obtaining the corresponding recommended music from the recommendation music library corresponding to the user category, the method further includes:
establishing the recommendation music library corresponding to each user category, based on the searched music content corresponding to each historical user's category, as extracted from the sample voice request data of each historical user.
Another aspect of the application provides a music recommendation device based on user speech, comprising:
an extraction module, for obtaining the user's voice data and extracting the corresponding user category and recognized text from the voice data;
a search module, for obtaining corresponding recommended music according to the user category and the recognized text;
a display module, for presenting the corresponding recommended music to the user.
In the above aspect and any possible implementation, an implementation is further provided in which the user category includes the user's gender and age range.
In the above aspect and any possible implementation, an implementation is further provided in which the extraction module includes a voiceprint recognition submodule for identifying, by means of voiceprint recognition according to the obtained user speech, the user category of the user issuing the command voice.
In the above aspect and any possible implementation, an implementation is further provided in which the extraction module further includes a voiceprint processing model establishment submodule for performing model training according to the sound characteristics of different user categories to establish voiceprint processing models for the different user categories.
In the above aspect and any possible implementation, an implementation is further provided in which the extraction module includes a speech recognition submodule for performing speech recognition on the voice data using a speech recognition model to obtain the text request corresponding to the voice data; or for performing speech recognition on the voice data using the speech recognition model of the corresponding user category to obtain the text request corresponding to the voice data.
In the above aspect and any possible implementation, an implementation is further provided in which the search module is specifically used to:
judge the type of the recognized text;
if it is a precise demand, retrieve the corresponding recommended music according to the recognized text;
if it is a general demand, obtain the corresponding recommended music from the recommendation music library corresponding to the user category.
In the above aspect and any possible implementation, an implementation is further provided in which the search module further includes a recommendation music library establishment submodule, used to:
establish the recommendation music library corresponding to each user category, based on the searched music content corresponding to each historical user's category, as extracted from the sample voice request data of each historical user.
Another aspect of the application provides a device, characterized in that the device includes:
one or more processors;
a storage means for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement any of the above methods.
Another aspect of the application provides a computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements any of the above methods.
It can be seen from the above description that, with the solution of the application, the recommendation strategy is more complete and the recommendations are more accurate, thereby improving user satisfaction.
[Brief description of the drawings]
In order to explain the technical solutions in the embodiments of the application more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without any creative effort.
Fig. 1 is a schematic flowchart of the music recommendation method based on user speech provided by some embodiments of the application;
Fig. 2 is a schematic structural diagram of the music recommendation device based on user speech provided by some embodiments of the application;
Fig. 3 is a block diagram of an exemplary computer system/server suitable for implementing embodiments of the present invention.
[specific embodiment]
To make the purposes, technical solutions, and advantages of the embodiments of the application clearer, the technical solutions in the embodiments of the application are described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. Based on the embodiments in the application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the application.
In addition, the term "and/or" herein merely describes an association relationship between associated objects, indicating that three kinds of relationships may exist; for example, "A and/or B" can mean: A alone, both A and B, or B alone. Furthermore, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
Fig. 1 is a schematic flowchart of the music recommendation method based on user speech provided by an embodiment of the application. As shown in Fig. 1, the method comprises the following steps:
Step S11: obtain the user's voice data and extract the corresponding user category and recognized text from the voice data;
Step S12: obtain corresponding recommended music according to the user category and the recognized text;
Step S13: present the corresponding recommended music to the user.
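Steps S11-S13 can be sketched as follows. The category tuples, song data, and precise-demand test below are invented stand-ins for illustration, not details taken from the patent:

```python
# Sketch of steps S11-S13; all names and data below are hypothetical.

RECOMMENDATION_LIBRARY = {  # per-user-category recommendation libraries (S12)
    ("female", "61-70"): ["Song A", "Song B"],
    ("male", "19-25"): ["Song C"],
}

def is_precise_demand(text_request: str) -> bool:
    # A precise demand names a specific song, e.g. "play the song Super Star".
    return text_request.startswith("play the song ")

def recommend(user_category: tuple, text_request: str) -> list:
    # S11 (voiceprint + speech recognition) is assumed to have already
    # produced user_category and text_request.
    if is_precise_demand(text_request):
        # Precise demand: search by the named song, ignore the category.
        return [text_request[len("play the song "):]]
    # General demand: use the category's recommendation library.
    return RECOMMENDATION_LIBRARY.get(user_category, [])

# S13 would present the returned list to the user.
```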
In a preferred implementation of step S11:
Preferably, the user category is identified by means of voiceprint recognition according to the obtained user voice data.
Specifically, the user category includes the user's gender and age range.
Taking gender as an example, the attribute values of user gender are male and female.
Taking age as an example, the attribute values of the age range are children, youth, middle-aged, and elderly; or the values are 1-5, 6-9, 10-15, 16-18, 19-25, 26-30, 31-35, 36-40, 41-50, 51-60, 61-70, 71-80, and over 80 years old.
Since different user categories, i.e., user groups of different genders and age ranges, have distinctive voiceprint features, model training can be performed before voiceprint recognition: voiceprint processing models for the different user categories are established according to the sound characteristics of the categories, so as to realize voiceprint analysis oriented to the user groups of the different categories. When the user initiates a voice search, the gender and age-range information of the user issuing the command voice can be identified from the command voice by means of voiceprint recognition.
Before voiceprint recognition, the speaker's voiceprint must first be modeled, i.e., "trained" or "learned". The model is generally a neural network model from deep learning, such as a deep neural network model or a convolutional neural network model. Specifically, a deep neural network (DNN) voiceprint baseline system is applied to extract the first feature vector of each utterance in the training set; a gender classifier and an age classifier are then trained separately according to the first feature vector of each utterance and its pre-annotated gender and age-range labels, thereby establishing voiceprint processing models that distinguish gender and age range.
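The training step above can be illustrated with a toy sketch. A nearest-class-mean classifier stands in for the gender and age-range classifiers, since the patent does not fix the classifier type, and the feature vectors and labels are synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)

def train(features, labels):
    # One template per class: the mean feature vector of that class.
    return {c: features[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(templates, x):
    # Nearest class mean in Euclidean distance.
    return min(templates, key=lambda c: np.linalg.norm(x - templates[c]))

# Synthetic per-utterance feature vectors for three separable groups.
feats = np.concatenate([rng.normal(i, 0.1, size=(50, 8)) for i in range(3)])
age_labels = np.repeat(np.array(["child", "youth", "elderly"]), 50)
gender_labels = np.repeat(np.array(["female", "female", "male"]), 50)

# Train the two classifiers separately, as described in the text.
age_clf = train(feats, age_labels)
gender_clf = train(feats, gender_labels)
```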
According to the obtained command voice, the first feature information of the command voice is extracted and sent to the pre-generated gender classifier and age-range classifier respectively. The gender classifier and the age-range classifier analyze the first feature information and obtain its gender label and age-range label, that is, the gender label and age-range label of the command voice.
For example, taking a Gaussian mixture model as the gender classifier, the fundamental frequency feature and the mel-frequency cepstral coefficient (MFCC) feature can first be extracted from the voice request; posterior probability values can then be computed for the fundamental frequency feature and the MFCC feature based on the Gaussian mixture model, and the user's gender determined from the result. For example, assuming the Gaussian mixture model is a male Gaussian mixture model, when the computed posterior probability value is very high, e.g. greater than a certain threshold, the user's gender can be determined to be male; when the posterior probability value is very small, e.g. less than a certain threshold, the user's gender can be determined to be female.
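A toy version of this gender decision is sketched below. For simplicity a single diagonal Gaussian stands in for the Gaussian mixture model, and the average log-likelihood of the query frames stands in for the posterior probability value; the feature data and threshold are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented stand-ins for pitch + MFCC frames of male training speech.
male_train = rng.normal(loc=-1.0, scale=1.0, size=(500, 13))

# "Male model": a diagonal Gaussian fitted to the male training frames.
mu = male_train.mean(axis=0)
var = male_train.var(axis=0)

def avg_log_likelihood(frames):
    # Mean per-frame log-density under the male model.
    ll = -0.5 * (np.log(2 * np.pi * var) + (frames - mu) ** 2 / var)
    return ll.sum(axis=1).mean()

def classify_gender(frames, threshold=-30.0):
    # Score above the threshold -> male; below -> female.
    return "male" if avg_log_likelihood(frames) > threshold else "female"
```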
Preferably, the voiceprint feature is a d-vector feature, a kind of feature extracted by a deep neural network (DNN), specifically the output of the last hidden layer of the DNN.
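The d-vector idea can be sketched with a toy feedforward pass. The weights below are random stand-ins for a trained voiceprint DNN, and the layer sizes are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random stand-ins for the weights of a trained voiceprint DNN.
W1, b1 = rng.normal(size=(13, 64)), np.zeros(64)  # input -> hidden layer 1
W2, b2 = rng.normal(size=(64, 32)), np.zeros(32)  # hidden 1 -> last hidden

def d_vector(frames):
    # Forward pass up to the last hidden layer; the utterance-level
    # d-vector is the average of its activations over all frames.
    h1 = np.maximum(frames @ W1 + b1, 0.0)  # ReLU
    h2 = np.maximum(h1 @ W2 + b2, 0.0)      # last hidden layer output
    return h2.mean(axis=0)

frames = rng.normal(size=(100, 13))  # 100 acoustic feature frames
vec = d_vector(frames)
```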
Preferably, speech recognition is performed on the obtained user voice data according to a preset speech recognition model to obtain the recognized text corresponding to the voice data.
Speech recognition is performed on the user's voice data using an existing speech recognition method (the specific extraction method is the same as in the prior art and is not specifically limited in this embodiment) to obtain the recognized text corresponding to the voice data.
Preferably, the speech recognition step can be performed simultaneously with the above voiceprint recognition step.
Preferably, in a preferred embodiment of the application:
first, the above voiceprint recognition step is performed to identify the user category;
then, according to the user category, speech recognition is performed on the command voice using the speech recognition model of the corresponding user category to obtain the text request corresponding to the voice data.
Specifically, corpora corresponding to the different user types are collected to form corpus sets, and speech recognition model training is performed using these corpora to obtain a speech recognition model for each corresponding user type.
By using the corresponding speech recognition model for each type of user, the accuracy of speech recognition can be improved.
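Selecting the category-specific recognition model can be sketched as a lookup with a general fallback. The model objects here are placeholder callables, not real recognizers:

```python
# Placeholder "speech recognition models": each is just a callable
# tagging which corpus it was trained on.
def make_model(name):
    return lambda audio: f"<text decoded by the {name} model>"

MODELS = {  # one model per user category, trained on that group's corpus
    ("female", "elderly"): make_model("elderly-female"),
    ("male", "youth"): make_model("youth-male"),
}
DEFAULT_MODEL = make_model("general")

def recognize(user_category, audio):
    # Use the matching category model; fall back to a general model.
    return MODELS.get(user_category, DEFAULT_MODEL)(audio)
```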
Preferably, the text request is of one of the following two types: a precise demand or a general demand. For example, a precise demand is "play the song Super Star"; a general demand is "play a song".
In a preferred implementation of step S12:
Corresponding recommended music is obtained according to the user category and the recognized text.
Preferably, the type of the recognized text is judged first. For a precise demand, the user's category is not considered; an exact query is performed directly and the corresponding music list is found. For a general demand, music screening is performed according to the user's category, including age and gender.
Preferably, semantic features of the text corresponding to the voice data are extracted from the recognized text. The semantic features are used to characterize the semantic information of the text and can specifically be represented by the word vectors or the sentence vector of the text; the sentence vector can be obtained by summing the word vectors of the words in the recognized text and taking the mean. The word-vector extraction method is the same as in the prior art, for example using Word2Vec to extract the word vector of each word in the recognized text, and is not specifically limited in this embodiment.
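The sentence-vector computation described above, averaging the word vectors of the recognized text, can be sketched directly. The tiny word-vector table below is invented rather than trained:

```python
import numpy as np

# Invented 2-dimensional word vectors; a real system would train these,
# e.g. with Word2Vec.
WORD_VECS = {
    "play": np.array([0.1, 0.9]),
    "a": np.array([0.0, 0.1]),
    "song": np.array([0.8, 0.2]),
}

def sentence_vector(words):
    # Sum the word vectors of the recognized text and take the mean.
    return np.mean([WORD_VECS[w] for w in words], axis=0)
```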
Preferably, word segmentation is performed on the recognized text to determine whether the recognized text is a music-playing instruction and whether it contains a specific music name. If it is a music-playing instruction containing a specific music name, it is judged to be a precise demand; if it does not contain a specific music name, it is judged to be a general demand.
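This judgment can be sketched over already-segmented text. The song catalogue and token test are invented stand-ins for the real word segmentation and name matching:

```python
KNOWN_SONGS = {"Super Star"}  # stand-in catalogue of specific music names

def demand_type(tokens):
    # tokens: the recognized text after word segmentation.
    if "play" not in tokens:
        return "not a music-playing instruction"
    named = [t for t in tokens if t in KNOWN_SONGS]
    # A play instruction naming a specific song is a precise demand;
    # one without a specific name is a general demand.
    return "precise" if named else "general"
```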
Preferably, for a precise demand such as "play the song Super Star", the user has specified the music to be played, so there is no need to consider the user's category; the corresponding music list is found directly and presented.
Preferably, the specific music name contained in the recognized text is used as the search term, and a music search list is obtained.
Preferably, for a general demand such as "play a song", the user has not specified the music to be played, so the recommendation must be made according to the user category.
Preferably, the corresponding recommended content is obtained from the recommendation music library according to the user category, including age and gender. The recommendation music library contains multiple correspondence models, each of which is established based on the historical user categories and the corresponding searched music content extracted from the sample voice request data of the historical users.
Preferably, the recommendation music library is constructed in advance by collecting the interaction information between a large number of historical users and the smart home products executing the scheme; the interaction information includes the users' voice data, the user category corresponding to the voice data, and the music names corresponding to the users' precise demands.
Preferably, the correspondence models can also be established according to the music search records of users of different categories on other music servers. In that case, users are clustered according to their attribute information, for example the gender and age information that users fill in, and divided into multiple user categories.
Preferably, in a correspondence model, the music is sorted according to its search count or play count.
For example, for the user category of female users aged 61-70, music candidate results are obtained from its correspondence model.
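Building a per-category correspondence model sorted by search/play counts can be sketched from a toy interaction history; all records below are invented:

```python
from collections import Counter

# Invented (user category, searched song) records from historical users.
HISTORY = [
    (("female", "61-70"), "Song A"),
    (("female", "61-70"), "Song B"),
    (("female", "61-70"), "Song A"),
    (("male", "19-25"), "Song C"),
]

# One correspondence model (song -> count) per user category.
library = {}
for category, song in HISTORY:
    library.setdefault(category, Counter())[song] += 1

def candidates(category, n=10):
    # Candidate results, most searched/played first.
    return [song for song, _ in library[category].most_common(n)]
```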
Preferably, in a preferred implementation of this embodiment, users are clustered by age, and correspondence models for users of different ages are established. That is, the music candidate results in a correspondence model are related only to the user's age, without considering the user's gender.
After the corresponding music recommendation results are obtained according to the user's age, they are further screened according to the user's gender. For example, for a female user, music whose singer is female is selected from the music recommendation results as the final recommendation results.
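The two-stage variant, an age-only lookup followed by a gender screen on the singer, can be sketched as follows; the candidate lists and singer genders are invented:

```python
# Stage 1 data: candidates keyed by age range only; each entry carries
# the singer's gender for the stage 2 screen. All entries are invented.
AGE_LIBRARY = {
    "61-70": [("Song A", "female"), ("Song B", "male"), ("Song D", "female")],
}

def recommend(age_range, user_gender):
    candidates = AGE_LIBRARY[age_range]          # stage 1: by age only
    return [song for song, singer in candidates  # stage 2: keep songs whose
            if singer == user_gender]            # singer matches the user
```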
In a preferred implementation of step S13:
The corresponding recommended music is presented to the user.
Preferably, for a precise demand such as "play the song Super Star", the highest-ranked music in the found "Super Star" music list is played directly, where the ranking is made according to search and play counts.
Preferably, for a general demand such as "play a song", the music recommendation results are presented to the user; the music recommendation results include the top-ranked song, or a list of several songs ranked according to their search and play counts, for the user to choose from.
According to the method of this embodiment, the corresponding user category and recognized text can be extracted from the user's voice data, and the corresponding recommended music obtained, so that music can be provided to different user categories in a targeted manner. Traditional recommendation strategies cannot learn the user's age and gender; with age and gender added, the recommendation strategy is more complete and the recommendations are more accurate, thereby improving user satisfaction. Moreover, because the recommendation identification is implicit, even if the identification is occasionally wrong and a wrong recommendation is made, the user will not obviously perceive it.
It should be noted that, for simplicity of description, the foregoing method embodiments are expressed as a series of action combinations; however, those skilled in the art should understand that the application is not limited by the described action sequence, because according to the application some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the application.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
Fig. 2 is a schematic structural diagram of the music recommendation device based on user speech provided by an embodiment of the application. As shown in Fig. 2, the device comprises:
an extraction module 21, for obtaining the user's voice data and extracting the corresponding user category and recognized text from the voice data;
a search module 22, for obtaining corresponding recommended music according to the user category and the recognized text;
a display module 23, for presenting the corresponding recommended music to the user.
In a preferred implementation of the extraction module 21:
Preferably, the extraction module 21 includes a voiceprint recognition submodule for identifying the user category by means of voiceprint recognition according to the obtained user voice data.
Specifically, the user category includes the user's gender and age range.
Taking gender as an example, the attribute values of user gender are male and female.
Taking age as an example, the attribute values of the age range are children, youth, middle-aged, and elderly; or the values are 1-5, 6-9, 10-15, 16-18, 19-25, 26-30, 31-35, 36-40, 41-50, 51-60, 61-70, 71-80, and over 80 years old.
Since different user categories, i.e., user groups of different genders and age ranges, have distinctive voiceprint features, the extraction module 21 further includes a voiceprint processing model establishment submodule, used before voiceprint recognition to perform model training according to the sound characteristics of different user categories and establish voiceprint processing models for the different categories, so as to realize voiceprint analysis oriented to the user groups of the different categories. When the user initiates a voice search, the gender and age-range information of the user issuing the command voice can be identified from the command voice by means of voiceprint recognition.
Before voiceprint recognition, the speaker's voiceprint must first be modeled, i.e., "trained" or "learned". The model is generally a neural network model from deep learning, such as a deep neural network model or a convolutional neural network model. Specifically, a deep neural network (DNN) voiceprint baseline system is applied to extract the first feature vector of each utterance in the training set; a gender classifier and an age classifier are then trained separately according to the first feature vector of each utterance and its pre-annotated gender and age-range labels, thereby establishing voiceprint processing models that distinguish gender and age range.
According to the obtained command voice, the first feature information of the command voice is extracted and sent to the pre-generated gender classifier and age-range classifier respectively. The gender classifier and the age-range classifier analyze the first feature information and obtain its gender label and age-range label, that is, the gender label and age-range label of the command voice.
For example, taking a Gaussian mixture model as the gender classifier, the fundamental frequency feature and the mel-frequency cepstral coefficient (MFCC) feature can first be extracted from the voice request; posterior probability values can then be computed for the fundamental frequency feature and the MFCC feature based on the Gaussian mixture model, and the user's gender determined from the result. For example, assuming the Gaussian mixture model is a male Gaussian mixture model, when the computed posterior probability value is very high, e.g. greater than a certain threshold, the user's gender can be determined to be male; when the posterior probability value is very small, e.g. less than a certain threshold, the user's gender can be determined to be female.
Preferably, the voiceprint feature is a d-vector feature, a kind of feature extracted by a deep neural network (DNN), specifically the output of the last hidden layer of the DNN.
Preferably, the extraction module 21 further includes a speech recognition submodule for performing speech recognition on the obtained user voice data according to a preset speech recognition model to obtain the recognized text corresponding to the voice data.
Speech recognition is performed on the user's voice data using an existing speech recognition method (the specific extraction method is the same as in the prior art and is not specifically limited in this embodiment) to obtain the recognized text corresponding to the voice data.
Preferably, the speech recognition step can be performed simultaneously with the above voiceprint recognition step.
Preferably, in the preferred embodiment of the application, the speech recognition submodule is using corresponding class of subscriber
Speech recognition modeling to the voice data carry out speech recognition, to obtain the corresponding text request of the voice data.
Above-mentioned Application on Voiceprint Recognition step is carried out by the Application on Voiceprint Recognition submodule first, identifies class of subscriber;
Then by the speech recognition submodule according to the class of subscriber, using the speech recognition mould of corresponding class of subscriber
Type carries out speech recognition to order voice, to obtain the corresponding text request of the voice data.
Specifically, corpora for the different user types are collected to form a corpus, and speech recognition model training is performed using the corpus to obtain a speech recognition model for each user type.
By using a dedicated speech recognition model for each type of user, the accuracy of speech recognition can be improved.
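The routing from identified user class to a class-specific recognizer can be sketched as a simple dispatch. The class keys and the stand-in recognizer functions below are hypothetical, since the patent does not specify a model interface; only the routing idea comes from the embodiment.

```python
def make_recognizer(tag):
    # Stand-in for a speech recognition model trained on the corpus
    # of one user class; returns a labeled dummy transcript.
    return lambda audio: f"[{tag}] transcript"

# Hypothetical registry of class-specific speech recognition models.
RECOGNIZERS = {
    ("male", "18-30"): make_recognizer("male-18-30"),
    ("female", "61-70"): make_recognizer("female-61-70"),
}
GENERIC = make_recognizer("generic")

def recognize(audio, user_class):
    # Use the model of the identified class; fall back to a generic
    # model when no dedicated model exists for that class.
    return RECOGNIZERS.get(user_class, GENERIC)(audio)
```

The fallback to a generic model is an assumption; the patent only states that a class-specific model is used once the class is known.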
Preferably, the text request is of one of two types: a precision demand or a general demand. For example, a precision demand is "play the song Super Star"; a general demand is "play a song".
In a preferred implementation of the searching module 22,
the searching module 22 is configured to obtain corresponding recommended music according to the user class and the identification text.
Preferably, the type of the identification text is judged first. A precision demand is queried precisely and directly, without considering the user's class, to find the corresponding music list; for a general demand, music screening is performed according to the user's class, including age and gender.
Preferably, a semantic feature of the text corresponding to the voice data is extracted from the identification text. The semantic feature characterizes the semantic information of the text corresponding to the voice data and may specifically be represented by the word vectors or a sentence vector of that text, where the sentence vector may be obtained by summing the word vectors of the words in the identification text and taking their mean. The word-vector extraction method is the same as in the prior art, for example using the Word2Vec technique to extract the word vector of each word in the identification text, and this embodiment is not specifically limited in this respect.
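The sentence-vector construction just described (averaging the word vectors of the segmented request) can be sketched as follows. The two-dimensional toy vectors stand in for real Word2Vec embeddings, which the embodiment assumes but does not supply.

```python
# Toy word-vector table standing in for trained Word2Vec output.
WORD_VECTORS = {
    "play": [0.9, 0.1],
    "a":    [0.0, 0.0],
    "song": [0.8, 0.3],
}

def sentence_vector(tokens):
    # Average the word vectors of the segmented request to obtain a
    # fixed-length semantic feature, as the embodiment describes.
    vecs = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vecs:
        return [0.0, 0.0]  # no known words: return the zero vector
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
```

Treating out-of-vocabulary words by skipping them, and the zero vector for an empty request, are implementation choices not specified by the patent.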
Preferably, word segmentation is performed on the identification text to determine whether the identification text is an instruction to recommend music, and whether it contains a specific music name. A recommend-music instruction that contains a specific music name is judged to be a precision demand; a recommend-music instruction that does not contain a specific music name is judged to be a general demand.
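The precision/general judgment can be sketched with a pattern match. The English trigger phrases below are illustrative stand-ins; the actual system would apply Chinese word segmentation and its own instruction grammar.

```python
import re

def classify_request(text):
    # A request naming a specific track is a "precision demand";
    # any other request to play music is a "general demand".
    m = re.search(r"play (?:the )?song (.+)", text)
    if m:
        return ("precision", m.group(1))  # extracted music name
    if "play" in text:
        return ("general", None)
    return ("other", None)  # not a music instruction
```

For a precision demand the extracted name doubles as the search term used in the subsequent exact lookup.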
Preferably, for a precision demand such as "play the song Super Star", the user has already specified the music to be played; therefore the user's class need not be considered, and the corresponding music list is searched directly and presented.
Preferably, the specific music name contained in the identification text is used as the search term to obtain the music search list.
Preferably, for a general demand such as "play a song", the user has not specified the music to be played; therefore the recommendation must be made according to the user class.
Preferably, the corresponding recommended content is obtained from the recommendation music library according to the user class, including age and gender. The searching module 22 further includes a recommendation music library establishing submodule, configured to establish the recommendation music library corresponding to each user class based on the searched music content, corresponding to each historical user's class, extracted from the sample voice request data of each historical user.
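Building the per-class library from historical requests can be sketched as a frequency count. The `(user_class, searched_track)` pair structure for the extracted history is an assumed representation, not specified by the patent.

```python
from collections import defaultdict, Counter

def build_recommendation_library(history):
    # history: iterable of (user_class, searched_track) pairs extracted
    # from historical users' sample voice request data.
    library = defaultdict(Counter)
    for user_class, track in history:
        library[user_class][track] += 1
    # Per class, keep tracks ranked by how often that class searched them,
    # matching the frequency-based ordering of the correspondence models.
    return {cls: [t for t, _ in cnt.most_common()]
            for cls, cnt in library.items()}
```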
Preferably, the recommendation music library contains a plurality of correspondence models, each of which is established based on a historical user class and the corresponding searched music content extracted from the sample voice request data of each historical user.
Preferably, the recommendation music library is constructed in advance by collecting interaction information between a large number of historical users and the smart home product executing the scheme; the interaction information includes user voice data, the user class corresponding to the user voice data, and the music name corresponding to the user's precision demand.
Preferably, can also be according in other music servers, different classes of user records the search of music and establishes
Each corresponding relationship model.Wherein, according to the attribute information of user, for example, gender, age information that user fills in carry out user
Cluster, is divided into multiple class of subscribers.
Preferably, within a correspondence model, the music is sorted according to its search count or play count.
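The frequency-based ordering within a correspondence model can be sketched as follows; the `{track: (search_count, play_count)}` schema and the choice to rank by the combined total are assumptions, since the patent states only that sorting follows search/play frequency.

```python
def rank_tracks(stats):
    # stats: {track: (search_count, play_count)} (schema assumed).
    # Order candidates by total activity, highest first.
    return sorted(stats, key=lambda track: sum(stats[track]), reverse=True)
```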
For example, for a user class whose age is 61-70 and whose gender is female, the music candidate results are obtained from its correspondence model.
Preferably, in a preferred implementation of this embodiment, users are clustered by age, and a correspondence model is established for users of each age range. That is, the music candidate results in the correspondence model are related only to the user's age, without considering the user's gender.
After the music recommendation results are obtained according to the user's age, they are further screened according to the user's gender. For example, for a female user, music whose singer is a female singer is screened from the music recommendation results as the final recommendation result.
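The two-stage selection just described (age-band candidates, then a gender filter on the singer) can be sketched as follows. The age-band library and the track-to-singer-gender metadata are assumed structures; the patent specifies only the filtering order.

```python
def recommend(age_library, age_band, gender, artist_gender):
    # Stage 1: pull the candidate list for the user's age band.
    # Stage 2: keep only tracks whose singer matches the user's gender,
    # as in the female-user / female-singer example of the embodiment.
    candidates = age_library.get(age_band, [])
    return [t for t in candidates if artist_gender.get(t) == gender]
```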
In a preferred implementation of the display module 23,
the display module 23 is configured to display the corresponding recommended music to the user.
Preferably, for a precision demand such as "play the song Super Star", the highest-ranked music in the searched "Super Star" music list is played directly, where the ranking is according to its search and play counts.
Preferably, for a general demand such as "play a song", the music recommendation results are displayed to the user; the music recommendation results include the top-ranked song, or a list of several songs ranked according to their search and play counts, for the user to select from.
According to the device of this embodiment, the corresponding user class and identification text in the voice data can be extracted from the user's voice data, and corresponding recommended music obtained, so that targeted music can be provided for different user classes. Compared with a traditional recommendation strategy, which cannot learn the user's age and gender, the recommendation strategy with age and gender added is more complete and the recommendations are more accurate, improving user satisfaction. Moreover, because the recommendation identification is implicit, even if recognition occasionally errs and a wrong recommendation is made, the user will not obviously perceive it.
It is apparent to those skilled in the art that, for convenience and brevity of description, for the specific working processes of the terminal and server described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely exemplary; for instance, the division into units is only a division by logical function, and other division manners are possible in actual implementation, e.g. multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Fig. 3 shows a block diagram of an exemplary computer system/server 012 suitable for implementing embodiments of the present invention. The computer system/server 012 shown in Fig. 3 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 3, the computer system/server 012 takes the form of a general-purpose computing device. Components of the computer system/server 012 may include, but are not limited to: one or more processors or processing units 016, a system memory 028, and a bus 018 connecting the different system components (including the system memory 028 and the processing units 016).
Bus 018 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Computer system/server 012 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by computer system/server 012, including volatile and non-volatile media and removable and non-removable media.
System memory 028 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 030 and/or cache memory 032. Computer system/server 012 may further include other removable/non-removable, volatile/non-volatile computer-system storage media. By way of example only, storage system 034 may be used for reading from and writing to non-removable, non-volatile magnetic media (not shown in Fig. 3, commonly referred to as a "hard disk drive"). Although not shown in Fig. 3, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g. a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (e.g. a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 018 via one or more data media interfaces. Memory 028 may include at least one program product having a set of (e.g. at least one) program modules configured to carry out the functions of embodiments of the present invention.
A program/utility 040 having a set of (at least one) program modules 042 may be stored, for example, in memory 028. Such program modules 042 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. Program modules 042 generally carry out the functions and/or methods of the embodiments described in the present invention.
Computer system/server 012 may also communicate with one or more external devices 014 (e.g. a keyboard, pointing device, display 024, etc.); in the present invention, computer system/server 012 communicates with external radar devices. It may also communicate with one or more devices that enable a user to interact with the computer system/server 012, and/or with any device (e.g. a network card, modem, etc.) that enables the computer system/server 012 to communicate with one or more other computing devices. Such communication can occur via input/output (I/O) interfaces 022. Furthermore, computer system/server 012 can communicate with one or more networks (e.g. a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via network adapter 020. As shown in Fig. 3, network adapter 020 communicates with the other modules of computer system/server 012 via bus 018. It should be understood that, although not shown in Fig. 3, other hardware and/or software modules could be used in conjunction with computer system/server 012, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, etc.
The processing unit 016 performs the functions and/or methods of the embodiments described in the present invention by running programs stored in the system memory 028.
The above computer program may be provided in a computer storage medium; that is, the computer storage medium is encoded with a computer program that, when executed by one or more computers, causes the one or more computers to perform the method flows and/or apparatus operations shown in the above embodiments of the present invention.
With the development of time and technology, the meaning of "medium" has become increasingly broad; the transmission path of a computer program is no longer limited to tangible media, and the program may, for example, also be downloaded directly from a network. Any combination of one or more computer-readable media may be used.
The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a propagated data signal, in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer-readable medium may be transmitted using any appropriate medium, including, but not limited to, wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solutions of the present application. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of the technical features therein equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (16)
1. A music recommendation method based on user speech, comprising:
acquiring voice data of a user, and extracting a corresponding user class and identification text from the voice data;
obtaining corresponding recommended music according to the user class and the identification text; and
displaying the corresponding recommended music to the user.
2. The method according to claim 1, wherein
the user class comprises the user's gender and the user's age range.
3. The method according to claim 1, wherein extracting the corresponding user class from the voice data comprises:
identifying, according to the acquired user speech and by means of voiceprint recognition, the user class of the user issuing the command voice.
4. The method according to claim 3, wherein, before identifying, according to the acquired user speech and by means of voiceprint recognition, the user class of the user issuing the command voice, the method further comprises:
performing model training according to sound characteristics of different user classes, to establish voiceprint processing models for the different user classes.
5. The method according to claim 1, wherein extracting the corresponding identification text from the voice data comprises:
performing speech recognition on the voice data using a speech recognition model, to obtain a text request corresponding to the voice data; or
performing speech recognition on the voice data using a speech recognition model of the corresponding user class, to obtain the text request corresponding to the voice data.
6. The method according to claim 1, wherein obtaining the corresponding recommended music according to the user class and the identification text comprises:
judging the type of the identification text;
if it is of the precision demand type, searching for the corresponding recommended music according to the identification text; and
if it is of the general demand type, obtaining, according to the user class, the corresponding recommended music from the recommendation music library corresponding to each user class.
7. The method according to claim 6, wherein, before obtaining, according to the user class, the corresponding recommended music from the recommendation music library corresponding to each user class, the method further comprises:
establishing the recommendation music library corresponding to each user class based on searched music content, corresponding to the user class of each historical user, extracted from sample voice request data of each historical user.
8. A music recommendation apparatus based on user speech, comprising:
an extraction module, configured to acquire voice data of a user and extract a corresponding user class and identification text from the voice data;
a searching module, configured to obtain corresponding recommended music according to the user class and the identification text; and
a display module, configured to display the corresponding recommended music to the user.
9. The apparatus according to claim 8, wherein
the user class comprises the user's gender and the user's age range.
10. The apparatus according to claim 8, wherein the extraction module comprises a voiceprint recognition submodule configured to identify, according to the acquired user speech and by means of voiceprint recognition, the user class of the user issuing the command voice.
11. The apparatus according to claim 10, wherein the extraction module further comprises a voiceprint processing model establishing submodule, configured to perform model training according to sound characteristics of different user classes, to establish voiceprint processing models for the different user classes.
12. The apparatus according to claim 8, wherein the extraction module comprises a speech recognition submodule configured to perform speech recognition on the voice data using a speech recognition model, to obtain a text request corresponding to the voice data; or to perform speech recognition on the voice data using a speech recognition model of the corresponding user class, to obtain the text request corresponding to the voice data.
13. The apparatus according to claim 8, wherein the searching module is specifically configured to:
judge the type of the identification text;
if it is of the precision demand type, search for the corresponding recommended music according to the identification text; and
if it is of the general demand type, obtain, according to the user class, the corresponding recommended music from the recommendation music library corresponding to each user class.
14. The apparatus according to claim 13, wherein the searching module further comprises a recommendation music library establishing submodule, configured to:
establish the recommendation music library corresponding to each user class based on searched music content, corresponding to the user class of each historical user, extracted from sample voice request data of each historical user.
15. A device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-7.
16. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811222418.1A CN109582822A (en) | 2018-10-19 | 2018-10-19 | A kind of music recommended method and device based on user speech |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811222418.1A CN109582822A (en) | 2018-10-19 | 2018-10-19 | A kind of music recommended method and device based on user speech |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109582822A true CN109582822A (en) | 2019-04-05 |
Family
ID=65920672
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811222418.1A Pending CN109582822A (en) | 2018-10-19 | 2018-10-19 | A kind of music recommended method and device based on user speech |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109582822A (en) |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110223134A (en) * | 2019-04-28 | 2019-09-10 | 平安科技(深圳)有限公司 | Products Show method and relevant device based on speech recognition |
CN110598011A (en) * | 2019-09-27 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Data processing method, data processing device, computer equipment and readable storage medium |
CN111023470A (en) * | 2019-12-06 | 2020-04-17 | 厦门快商通科技股份有限公司 | Air conditioner temperature adjusting method, medium, equipment and device |
CN111371838A (en) * | 2020-02-14 | 2020-07-03 | 厦门快商通科技股份有限公司 | Information pushing method and system based on voiceprint recognition and mobile terminal |
CN111414512A (en) * | 2020-03-02 | 2020-07-14 | 北京声智科技有限公司 | Resource recommendation method and device based on voice search and electronic equipment |
CN111488485A (en) * | 2020-04-16 | 2020-08-04 | 北京雷石天地电子技术有限公司 | Music recommendation method based on convolutional neural network, storage medium and electronic device |
CN111638830A (en) * | 2020-05-27 | 2020-09-08 | 杭州网易云音乐科技有限公司 | Multimedia file selection method, device, equipment and computer readable storage medium |
CN111694982A (en) * | 2019-11-27 | 2020-09-22 | 深圳友宝科斯科技有限公司 | Song recommendation method and system |
CN111782878A (en) * | 2020-07-06 | 2020-10-16 | 聚好看科技股份有限公司 | Server, display equipment and video searching and sorting method thereof |
CN111798857A (en) * | 2019-04-08 | 2020-10-20 | 北京嘀嘀无限科技发展有限公司 | Information identification method and device, electronic equipment and storage medium |
CN111862991A (en) * | 2019-04-30 | 2020-10-30 | 杭州海康威视数字技术股份有限公司 | Method and system for identifying baby crying |
CN111859008A (en) * | 2019-04-29 | 2020-10-30 | 深圳市冠旭电子股份有限公司 | Music recommending method and terminal |
CN111951809A (en) * | 2019-05-14 | 2020-11-17 | 深圳子丸科技有限公司 | Multi-person voiceprint identification method and system |
CN112230555A (en) * | 2020-10-12 | 2021-01-15 | 珠海格力电器股份有限公司 | Intelligent household equipment, control method and device thereof and storage medium |
CN112948662A (en) * | 2019-12-10 | 2021-06-11 | 北京搜狗科技发展有限公司 | Recommendation method and device and recommendation device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106128467A (en) * | 2016-06-06 | 2016-11-16 | 北京云知声信息技术有限公司 | Method of speech processing and device |
CN107507612A (en) * | 2017-06-30 | 2017-12-22 | 百度在线网络技术(北京)有限公司 | A kind of method for recognizing sound-groove and device |
- 2018-10-19 CN CN201811222418.1A patent/CN109582822A/en active Pending
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106128467A (en) * | 2016-06-06 | 2016-11-16 | 北京云知声信息技术有限公司 | Method of speech processing and device |
CN107507612A (en) * | 2017-06-30 | 2017-12-22 | 百度在线网络技术(北京)有限公司 | A kind of method for recognizing sound-groove and device |
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111798857A (en) * | 2019-04-08 | 2020-10-20 | 北京嘀嘀无限科技发展有限公司 | Information identification method and device, electronic equipment and storage medium |
CN110223134A (en) * | 2019-04-28 | 2019-09-10 | 平安科技(深圳)有限公司 | Products Show method and relevant device based on speech recognition |
CN111859008A (en) * | 2019-04-29 | 2020-10-30 | 深圳市冠旭电子股份有限公司 | Music recommending method and terminal |
CN111859008B (en) * | 2019-04-29 | 2023-11-10 | 深圳市冠旭电子股份有限公司 | Music recommending method and terminal |
CN111862991A (en) * | 2019-04-30 | 2020-10-30 | 杭州海康威视数字技术股份有限公司 | Method and system for identifying baby crying |
CN111951809A (en) * | 2019-05-14 | 2020-11-17 | 深圳子丸科技有限公司 | Multi-person voiceprint identification method and system |
CN110598011A (en) * | 2019-09-27 | 2019-12-20 | 腾讯科技(深圳)有限公司 | Data processing method, data processing device, computer equipment and readable storage medium |
CN111694982A (en) * | 2019-11-27 | 2020-09-22 | 深圳友宝科斯科技有限公司 | Song recommendation method and system |
CN111023470A (en) * | 2019-12-06 | 2020-04-17 | 厦门快商通科技股份有限公司 | Air conditioner temperature adjusting method, medium, equipment and device |
CN112948662A (en) * | 2019-12-10 | 2021-06-11 | 北京搜狗科技发展有限公司 | Recommendation method and device and recommendation device |
CN111371838A (en) * | 2020-02-14 | 2020-07-03 | 厦门快商通科技股份有限公司 | Information pushing method and system based on voiceprint recognition and mobile terminal |
CN111414512A (en) * | 2020-03-02 | 2020-07-14 | 北京声智科技有限公司 | Resource recommendation method and device based on voice search and electronic equipment |
CN111488485A (en) * | 2020-04-16 | 2020-08-04 | 北京雷石天地电子技术有限公司 | Music recommendation method based on convolutional neural network, storage medium and electronic device |
CN111488485B (en) * | 2020-04-16 | 2023-11-17 | 北京雷石天地电子技术有限公司 | Music recommendation method based on convolutional neural network, storage medium and electronic device |
CN111638830A (en) * | 2020-05-27 | 2020-09-08 | 杭州网易云音乐科技有限公司 | Multimedia file selection method, device, equipment and computer readable storage medium |
CN111782878A (en) * | 2020-07-06 | 2020-10-16 | 聚好看科技股份有限公司 | Server, display equipment and video searching and sorting method thereof |
CN111782878B (en) * | 2020-07-06 | 2023-09-19 | 聚好看科技股份有限公司 | Server, display device and video search ordering method thereof |
CN112230555A (en) * | 2020-10-12 | 2021-01-15 | 珠海格力电器股份有限公司 | Intelligent household equipment, control method and device thereof and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109582822A (en) | A kind of music recommended method and device based on user speech | |
CN107507612B (en) | Voiceprint recognition method and device | |
CN107481720B (en) | Explicit voiceprint recognition method and device | |
CN107492379B (en) | Voiceprint creating and registering method and device | |
CN106548773B (en) | Child user searching method and device based on artificial intelligence | |
US11030412B2 (en) | System and method for chatbot conversation construction and management | |
EP3803846B1 (en) | Autonomous generation of melody | |
US8972265B1 (en) | Multiple voices in audio content | |
CN108874895B (en) | Interactive information pushing method and device, computer equipment and storage medium | |
CN110069608A (en) | A kind of method, apparatus of interactive voice, equipment and computer storage medium | |
CN110838286A (en) | Model training method, language identification method, device and equipment | |
CN108197282A (en) | Sorting technique, device and the terminal of file data, server, storage medium | |
CN111081280B (en) | Text-independent speech emotion recognition method and device and emotion recognition algorithm model generation method | |
CN107785018A (en) | More wheel interaction semantics understanding methods and device | |
US20220076674A1 (en) | Cross-device voiceprint recognition | |
US10854189B2 (en) | Techniques for model training for voice features | |
CN110853617A (en) | Model training method, language identification method, device and equipment | |
CN109858038A (en) | A kind of text punctuate determines method and device | |
CN110232340A (en) | Establish the method, apparatus of video classification model and visual classification | |
CN109325091A (en) | Update method, device, equipment and the medium of points of interest attribute information | |
CN109785846A (en) | The role recognition method and device of the voice data of monophonic | |
CN110223134A (en) | Products Show method and relevant device based on speech recognition | |
CN111147871B (en) | Singing recognition method and device in live broadcast room, server and storage medium | |
CN110647613A (en) | Courseware construction method, courseware construction device, courseware construction server and storage medium | |
CN109800410A (en) | A kind of list generation method and system based on online chatting record |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190405 ||