CN104766607A - Television program recommendation method and system - Google Patents
- Publication number
- CN104766607A CN104766607A CN201510098643.9A CN201510098643A CN104766607A CN 104766607 A CN104766607 A CN 104766607A CN 201510098643 A CN201510098643 A CN 201510098643A CN 104766607 A CN104766607 A CN 104766607A
- Authority
- CN
- China
- Prior art keywords
- speech data
- dialect
- model
- submodel
- feature sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a television program recommendation method, which comprises the following steps: receiving a voice signal of a user; converting the voice signal into discrete speech data; identifying the dialect category used by the user according to the speech data; and recommending television programs related to the dialect category to the user. By adopting the embodiments of the invention, programs more in line with the user's cultural and linguistic background can be recommended, enhancing the user experience, especially for elderly users who are not fluent in Mandarin or skilled at man-machine operation. The embodiments of the invention also provide a television program recommendation system which can execute all the steps of the television program recommendation method.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a television program recommendation method and system.
Background technology
With the rapid development of digital television technology, a digital cable television system can deliver hundreds of programs under current coding and modulation schemes. Smart televisions with built-in operating systems can additionally access the massive video resources of the Internet, so television users find it difficult to select content of interest among so many videos. To solve this television information "overload" problem, the electronic program guide must be intelligent: it should automatically recommend television programs to the user in advance according to the user's interests, preferences and viewing history, and adjust its recommendations by tracking changes in user interest. This is the concept of a digital television program recommendation system.
Most existing television program recommendation methods recommend programs according to explicit and implicit user characteristics. Explicit characteristics are the attributes a user provides when registering with the recommendation system, such as gender, age and occupation; implicit characteristics are behavioral information such as the time periods in which the user watches television, the program categories, and the programs frequently watched.
The shortcomings of the prior-art television program recommendation methods are as follows. When explicit information is used, the user must register with the recommendation system and provide sufficient explicit information, which does not adequately consider elderly users who are not fluent in Mandarin or skilled at human-machine operation. When implicit information is used, the user characteristics it expresses are insufficient, so programs cannot be recommended to the user accurately enough.
Recently, voice-controlled television program adjustment schemes have also appeared on the market, which switch channels according to user speech. For example, when the user says "I want to watch Hunan TV", the television automatically switches to Hunan TV. Such schemes are not very intelligent: they can only recognize fixed statements, are essentially control systems, and cannot intelligently recommend television programs to the user.
Summary of the invention
The embodiments of the present invention propose a television program recommendation method and system that can recommend programs better matching the user's cultural and linguistic background, enhancing the user experience, particularly for elderly users who are not fluent in Mandarin or skilled at human-machine operation.
An embodiment of the present invention provides a television program recommendation method, comprising:
Receiving a voice signal of the user;
Converting the voice signal into discrete speech data;
Identifying the dialect category used by the user according to the speech data;
Recommending television programs related to the dialect category to the user.
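The four steps above can be sketched as a minimal pipeline. This is an illustrative sketch only: the function names, the lambda-based toy "models" and the program catalog are assumptions, not part of the patent.

```python
import numpy as np

def recommend_programs(signal, dialect_models, program_catalog):
    """Minimal sketch of the four steps: receive -> digitize -> identify -> recommend.
    `dialect_models` maps a dialect name to a scoring function returning a
    likelihood-like score; `program_catalog` maps a dialect name to programs
    labeled with that dialect category in advance."""
    # The signal is assumed already discrete (sampled speech data).
    speech = np.asarray(signal, dtype=np.float64)
    # Identify: pick the dialect whose model gives the highest score.
    dialect = max(dialect_models, key=lambda d: dialect_models[d](speech))
    # Recommend: look up the programs labeled with that dialect category.
    return dialect, program_catalog.get(dialect, [])

# Toy usage: two fake "models" scoring by how close the signal mean is.
models = {
    "cantonese": lambda s: -abs(s.mean() - 0.2),
    "hunan":     lambda s: -abs(s.mean() - 0.8),
}
catalog = {"cantonese": ["Guangdong TV News"], "hunan": ["Hunan TV Variety"]}
dialect, programs = recommend_programs([0.75, 0.85, 0.8], models, catalog)
# dialect -> "hunan"
```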
Further, identifying the dialect category used by the user according to the speech data specifically comprises:
Framing the speech data;
Obtaining the robust features of each frame of speech data to form a first feature sequence X = {x_1, x_2, …, x_M} of the speech data, wherein x_M denotes the robust features of the M-th frame of speech data;
Removing the silent segments from the first feature sequence X to obtain a second feature sequence Y = {y_1, y_2, …, y_N} of the speech data, wherein y_N denotes the robust features of the N-th frame after the silent segments in the first feature sequence X are removed, N ≤ M;
Calculating the likelihood of the speech data under each dialect model according to the second feature sequence Y;
Determining the dialect category used by the user according to the likelihoods of the speech data under the different dialect models.
By extracting the robust features of each frame of speech data and removing silence, a characterization of the speech data is obtained: the second feature sequence. This sequence is then used to calculate the likelihood of the speech data under each dialect model. The higher the likelihood, the more similar the second feature sequence is to that dialect model; the dialect model with the highest likelihood is therefore judged to be the dialect category used by the user.
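The framing and silence-removal steps can be sketched as follows. The frame length, hop size and the simple energy threshold are assumptions (the patent does not specify its silence-detection method); real front ends may use a more elaborate voice activity detector.

```python
import numpy as np

def frame_signal(speech, frame_len=400, hop=160):
    """Split discrete speech data into overlapping frames (e.g. 25 ms frames
    with a 10 ms hop at 16 kHz; sizes are illustrative assumptions)."""
    n = 1 + max(0, (len(speech) - frame_len) // hop)
    return np.stack([speech[i * hop : i * hop + frame_len] for i in range(n)])

def remove_silence(features, energies, threshold_ratio=0.1):
    """Drop frames whose energy falls below a fraction of the peak energy,
    turning the first feature sequence X into the second sequence Y (N <= M).
    The energy threshold is an assumed stand-in for silence detection."""
    keep = energies >= threshold_ratio * energies.max()
    return features[keep]

signal = np.concatenate([np.zeros(800), np.ones(800)])  # silence, then speech
frames = frame_signal(signal)                           # first sequence X (M frames)
energies = (frames ** 2).sum(axis=1)                    # per-frame energy
Y = remove_silence(frames, energies)                    # second sequence Y (N frames)
```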
Further, calculating the likelihood of the speech data under each dialect model according to the second feature sequence Y is performed specifically according to the following formulas:

p(Y | λ_k) = ∏_{i=1}^{N} p(y_i | λ_k)

p(y_i | λ_k) = Σ_{j=1}^{J} ω_(k)j · N(y_i; μ_(k)j, C_(k)j)

Wherein, p(Y | λ_k) is the likelihood of the speech data under the k-th dialect model; p(y_i | λ_k) is the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model; ω_(k)j is the weight, C_(k)j the covariance, and μ_(k)j the mean of the j-th of the J Gaussian submodels of the k-th dialect model; N(·; μ, C) denotes the Gaussian probability density function.
It should be noted that the above formula is the likelihood function of the k-th dialect model. For a discrete likelihood function, the likelihood of the speech data under the k-th dialect model equals the product of the probabilities with which each element of the second feature sequence occurs. In this embodiment, the dialect models are Gaussian mixture models. A Gaussian mixture model quantizes a phenomenon precisely with Gaussian probability density functions (normal distribution curves), decomposing it into several models based on such functions; that is, each dialect model is a mixture of several Gaussian submodels. The probability with which each element occurs therefore equals the weighted sum of its probabilities under the Gaussian submodels of that dialect model, each submodel being assigned a different weight. Gaussian mixture models have the advantages of high stability and good convergence, which improves the accuracy of the likelihood calculation.
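The GMM likelihood scoring described above can be sketched as follows. The diagonal-covariance simplification and the log-space accumulation are assumptions made for numerical stability; the patent's formulas do not restrict the covariance form.

```python
import numpy as np

def gmm_log_likelihood(Y, weights, means, covs):
    """Log-likelihood log p(Y | lambda_k) of a feature sequence Y under one
    dialect GMM: the product over frames of the weighted sum over Gaussian
    submodels, accumulated in log space. Covariances are diagonal, stored as
    a (J, D) array of variances (an assumed simplification)."""
    Y = np.atleast_2d(Y)                        # (N, D)
    total = 0.0
    for y in Y:
        # Per-submodel log densities log N(y; mu_j, C_j) with diagonal C_j.
        diff2 = (y - means) ** 2 / covs         # (J, D)
        log_norm = -0.5 * (np.log(2 * np.pi * covs).sum(axis=1) + diff2.sum(axis=1))
        p = (weights * np.exp(log_norm)).sum()  # p(y_i | lambda_k)
        total += np.log(p)
    return total

def classify_dialect(Y, dialect_models):
    """Return the dialect whose model maximizes the likelihood of Y."""
    return max(dialect_models,
               key=lambda k: gmm_log_likelihood(Y, *dialect_models[k]))

# Toy models: 1-D features, one submodel each, centered at 0 and 5.
models = {
    "dialect_a": (np.array([1.0]), np.array([[0.0]]), np.array([[1.0]])),
    "dialect_b": (np.array([1.0]), np.array([[5.0]]), np.array([[1.0]])),
}
Y = np.array([[4.8], [5.1], [5.3]])
# classify_dialect(Y, models) -> "dialect_b"
```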
Further, before receiving the voice signal of the user, the method also comprises a step of constructing the dialect models, specifically comprising:
Obtaining second feature sequences based on a known dialect;
Clustering the robust features of each frame of the second feature sequences by decision-tree-based clustering, each cluster being characterized by one Gaussian submodel;
Calculating the weight, mean and covariance of the Gaussian submodel corresponding to each cluster from the robust features the cluster contains, by a maximum likelihood algorithm;
Generating the dialect model of the known dialect from the weights, means and covariances of the Gaussian submodels; wherein the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model is p(y_i | λ_k) = Σ_j ω_(k)j · N(y_i; μ_(k)j, C_(k)j), with ω_(k)j the weight, C_(k)j the covariance and μ_(k)j the mean of the j-th Gaussian submodel of the k-th dialect model.
It should be noted that obtaining second feature sequences based on a known dialect is in fact the step of obtaining sample data; the process of obtaining a second feature sequence is the same as described above, and each Gaussian submodel is assigned the robust features of at least one frame of speech data.
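Once the training frames have been clustered (the patent uses decision-tree-based clustering; the hard cluster labels below are assumed to be given by that step), the maximum-likelihood estimates of each submodel's weight, mean and covariance reduce to simple per-cluster statistics:

```python
import numpy as np

def fit_submodels(features, labels):
    """Maximum-likelihood weight, mean and (diagonal) covariance for the
    Gaussian submodel of each cluster. `labels[i]` is the cluster index of
    frame i; the clustering itself (decision-tree based in the patent) is
    assumed already done. Diagonal covariance is an assumed simplification."""
    features = np.atleast_2d(features)
    weights, means, covs = [], [], []
    for c in np.unique(labels):
        pts = features[labels == c]
        weights.append(len(pts) / len(features))  # ML weight: cluster's share of frames
        means.append(pts.mean(axis=0))            # ML mean: cluster average
        covs.append(pts.var(axis=0))              # ML variance (diagonal covariance)
    return np.array(weights), np.array(means), np.array(covs)

# Toy data: 5 one-dimensional frames assigned to two clusters.
feats = np.array([[0.0], [0.2], [-0.2], [5.0], [5.4]])
labels = np.array([0, 0, 0, 1, 1])
w, mu, C = fit_submodels(feats, labels)
# w -> [0.6, 0.4]; mu -> [[0.0], [5.2]]
```

Repeating this for each known dialect yields the parameter set of that dialect's model, as step S51-S54 describes.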
Further, the robust features comprise the energy of each frame of speech data, the mel-frequency cepstral coefficients (MFCCs), and the first-order and second-order differences of the MFCCs.
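Given a per-frame MFCC matrix (an MFCC front end is assumed and not shown here), the remaining robust features are the frame energy and the first- and second-order MFCC differences, which can be sketched as:

```python
import numpy as np

def delta(features):
    """First-order difference along the frame axis, edge-padded so the output
    keeps the same number of frames. A simple two-point central difference;
    real front ends often use a regression window instead."""
    padded = np.pad(features, ((1, 1), (0, 0)), mode="edge")
    return (padded[2:] - padded[:-2]) / 2.0

def robust_features(frames, mfcc):
    """Concatenate per-frame energy, MFCCs, delta-MFCCs and delta-delta-MFCCs,
    matching the feature set the patent names."""
    energy = (frames ** 2).sum(axis=1, keepdims=True)
    d1 = delta(mfcc)       # first-order difference
    d2 = delta(d1)         # second-order difference
    return np.hstack([energy, mfcc, d1, d2])

frames = np.ones((4, 10))             # 4 toy frames of 10 samples each
mfcc = np.arange(12.0).reshape(4, 3)  # assumed 3 MFCCs per frame
X = robust_features(frames, mfcc)     # shape (4, 1 + 3 + 3 + 3) = (4, 10)
```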
Correspondingly, an embodiment of the present invention also provides a television program recommendation system that can implement all steps of the above method, comprising:
A signal receiving module, for receiving the voice signal of the user;
A signal conversion module, for converting the voice signal into discrete speech data;
An identification module, for identifying the dialect category used by the user according to the speech data;
A recommendation module, for recommending television programs related to the dialect category to the user.
Further, the identification module comprises:
A framing unit, for framing the speech data;
A first sequence acquisition unit, for obtaining the robust features of each frame of speech data to form the first feature sequence X = {x_1, x_2, …, x_M} of the speech data, wherein x_M denotes the robust features of the M-th frame of the speech data;
A second sequence acquisition unit, for removing the silent segments from the first feature sequence to obtain the second feature sequence Y = {y_1, y_2, …, y_N} of the speech data, wherein y_N denotes the robust features of the N-th frame after the silent segments in the first feature sequence X are removed, N ≤ M;
A likelihood calculation unit, for calculating the likelihood of the speech data under each dialect model according to the second feature sequence Y;
A determination unit, for determining the dialect category used by the user according to the likelihoods of the speech data under the different dialect models.
Further, the likelihood calculation unit calculates the likelihood of the speech data under each dialect model specifically according to the following formulas:

p(Y | λ_k) = ∏_{i=1}^{N} p(y_i | λ_k), p(y_i | λ_k) = Σ_{j=1}^{J} ω_(k)j · N(y_i; μ_(k)j, C_(k)j)

Wherein, p(Y | λ_k) is the likelihood of the speech data under the k-th dialect model; p(y_i | λ_k) is the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model; ω_(k)j is the weight, C_(k)j the covariance, and μ_(k)j the mean of the j-th Gaussian submodel of the k-th dialect model; N(·; μ, C) denotes the Gaussian probability density function.
Further, the television program recommendation system also comprises a dialect model construction module, which specifically comprises:
A sample sequence acquisition unit, for obtaining second feature sequences based on a known dialect;
A clustering unit, for clustering the robust features of each frame of the second feature sequences by decision-tree-based clustering, each cluster being characterized by one Gaussian submodel;
A model parameter calculation unit, for calculating the weight, mean and covariance of each Gaussian submodel from the robust features assigned to it, by a maximum likelihood algorithm;
A model generation unit, for generating the dialect model of the known dialect from the weights, means and covariances of the Gaussian submodels; wherein the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model is p(y_i | λ_k) = Σ_j ω_(k)j · N(y_i; μ_(k)j, C_(k)j), with ω_(k)j the weight, C_(k)j the covariance and μ_(k)j the mean of the j-th Gaussian submodel of the k-th dialect model.
Further, the robust features comprise the energy of each frame of speech data, the mel-frequency cepstral coefficients, and the first-order and second-order differences of the mel-frequency cepstral coefficients.
Implementing the embodiments of the present invention has the following beneficial effects. A television program recommendation method is provided that can judge the dialect category spoken by the user from the user's voice signal and, on that basis, recommend television programs in which the related dialect is dominant. This allows the television to recommend programs that better match the user's cultural and linguistic background, enhancing the user experience, particularly for elderly users who are not fluent in Mandarin or skilled at human-machine operation. The invention can be regarded as a method specializing in recommending dialect programs, complementing existing television program recommendation systems. Meanwhile, an embodiment of the present invention also provides a television program recommendation system that can perform all steps of the described television program recommendation method.
Description of the drawings
Fig. 1 is a schematic flow chart of the television program recommendation method provided by an embodiment of the present invention;
Fig. 2 is a schematic flow chart of step S3 in Fig. 1;
Fig. 3 is a schematic flow chart of step S5 in Fig. 1;
Fig. 4 is a schematic structural diagram of the television program recommendation system provided by the present invention;
Fig. 5 is a schematic structural diagram of the judgment module 3 in Fig. 4;
Fig. 6 is a schematic structural diagram of the dialect model construction module 5 in Fig. 4.
Embodiment
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, which is a schematic flow chart of the television program recommendation method provided by an embodiment of the present invention, the method comprises the following steps:
S1, receiving a voice signal of the user;
S2, converting the voice signal into discrete speech data;
S3, identifying the dialect category used by the user according to the speech data;
S4, recommending television programs related to the dialect category to the user.
Chinese dialects can be divided into many kinds; modern Chinese is commonly divided into seven major dialect areas, comprising the Northern dialects, the Wu dialects, the Hunan (Xiang) dialects, the Hakka dialects, the Fujian (Min) dialects, the Guangdong (Yue) dialects and the Jiangxi (Gan) dialects. For example, when the user speaks the Guangdong dialect, the television program recommendation method of this embodiment can recommend television programs of Guangdong Province stations to the user; when the user speaks the Hunan dialect, it can recommend television programs of Hunan Province stations. Each television program should be labeled with its dialect category in advance.
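The advance labeling of programs with dialect categories, and the lookup performed in step S4, can be sketched as a simple mapping. The program names and category keys below are invented examples, not data from the patent:

```python
# Each program is labeled with its dialect category in advance, so step S4
# becomes a lookup. Program names and category keys are invented examples.
PROGRAM_CATALOG = {
    "yue":   ["Guangdong TV Evening News", "Cantonese Opera Hour"],
    "xiang": ["Hunan TV Variety Show"],
    "wu":    ["Shanghai Dialect Sitcom"],
}

def recommend(dialect_category):
    """Return the television programs labeled with the identified dialect
    category, or an empty list if no programs carry that label."""
    return PROGRAM_CATALOG.get(dialect_category, [])

# recommend("yue") -> ["Guangdong TV Evening News", "Cantonese Opera Hour"]
```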
In step S4, recommending the television programs related to the dialect category to the user may be done by displaying a program list on the television screen, or by announcing the programs by voice. The television program recommendation method of this embodiment can be applied to a television set or a set-top box.
Further, before step S1 the method may also comprise receiving a program recommendation instruction from the user. Steps S1-S4 are performed only after the user inputs the instruction, preventing accidental operation.
As shown in Fig. 2, which is a schematic flow chart of step S3 in Fig. 1, identifying the dialect category used by the user according to the speech data in step S3 specifically comprises:
S31, framing the speech data;
S32, obtaining the robust features of each frame of speech data to form the first feature sequence X = {x_1, x_2, …, x_M} of the speech data, wherein x_M denotes the robust features of the M-th frame of speech data;
S33, removing the silent segments from the first feature sequence X to obtain the second feature sequence Y = {y_1, y_2, …, y_N} of the speech data, wherein y_N denotes the robust features of the N-th frame after the silent segments in the first feature sequence X are removed, N ≤ M;
S34, calculating the likelihood of the speech data under each dialect model according to the second feature sequence Y;
S35, determining the dialect category used by the user according to the likelihoods of the speech data under the different dialect models.
By extracting the robust features of each frame of speech data and removing silence, a characterization of the speech data is obtained: the second feature sequence. This sequence is then used to calculate the likelihood of the speech data under each dialect model. The higher the likelihood, the more similar the second feature sequence is to that dialect model; the dialect model with the highest likelihood is therefore judged to be the dialect category used by the user.
Particularly, in step S34, the likelihood of the speech data under each dialect model is calculated according to the second feature sequence Y, specifically by the following formulas:

p(Y | λ_k) = ∏_{i=1}^{N} p(y_i | λ_k)

p(y_i | λ_k) = Σ_{j=1}^{J} ω_(k)j · N(y_i; μ_(k)j, C_(k)j)

Wherein, p(Y | λ_k) is the likelihood of the speech data under the k-th dialect model; p(y_i | λ_k) is the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model; ω_(k)j is the weight, C_(k)j the covariance, and μ_(k)j the mean of the j-th Gaussian submodel of the k-th dialect model; N(·; μ, C) denotes the Gaussian probability density function.
It should be noted that the above formula is the likelihood function of the k-th dialect model. For a discrete likelihood function, the likelihood of the speech data under the k-th dialect model equals the product of the probabilities with which each element of the second feature sequence occurs. In this embodiment, the dialect models are Gaussian mixture models. A Gaussian mixture model quantizes a phenomenon precisely with Gaussian probability density functions (normal distribution curves), decomposing it into several models based on such functions; that is, each dialect model is a mixture of several Gaussian submodels. The probability with which each element occurs therefore equals the weighted sum of its probabilities under the Gaussian submodels of that dialect model, each submodel being assigned a different weight. Gaussian mixture models have the advantages of high stability and good convergence, which improves the accuracy of the likelihood calculation.
Further, before receiving the voice signal of the user, the method also comprises a step S5 of constructing the dialect models. As shown in Fig. 3, which is a schematic flow chart of step S5 in Fig. 1, it specifically comprises:
S51, obtaining second feature sequences based on a known dialect;
S52, clustering the robust features of each frame of the second feature sequences by decision-tree-based clustering, each cluster being characterized by one Gaussian submodel;
S53, calculating the weight, mean and covariance of the Gaussian submodel corresponding to each cluster from the robust features the cluster contains, by a maximum likelihood algorithm;
S54, generating the dialect model of the known dialect from the weights, means and covariances of the Gaussian submodels; wherein the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model is p(y_i | λ_k) = Σ_j ω_(k)j · N(y_i; μ_(k)j, C_(k)j), with ω_(k)j the weight, C_(k)j the covariance and μ_(k)j the mean of the j-th Gaussian submodel of the k-th dialect model.
It should be noted that obtaining second feature sequences based on a known dialect is in fact obtaining multiple groups of second feature sequences of the same known dialect as sample data; the process of obtaining a second feature sequence is identical to steps S31-S33, and each Gaussian submodel is assigned the robust features of at least one frame of speech data. By performing steps S51-S54 for different dialects, the parameters of each Gaussian submodel of each dialect model are obtained, thereby constructing each dialect model.
Further, the robust features comprise the energy of each frame of speech data, the mel-frequency cepstral coefficients, and the first-order and second-order differences of the mel-frequency cepstral coefficients.
Correspondingly, an embodiment of the present invention also provides a television program recommendation system that can implement all steps of the above method. As shown in Fig. 4, which is a schematic structural diagram of the television program recommendation system provided by the present invention, this system comprises:
A signal input module 1, for inputting the voice signal of the user;
A signal conversion module 2, for converting the voice signal into discrete speech data;
A judgment module 3, for judging the dialect category used by the user according to the speech data;
A recommendation module 4, for recommending television programs related to the dialect category to the user.
As shown in Fig. 5, which is a schematic structural diagram of the judgment module 3 in Fig. 4, the judgment module 3 comprises:
A framing unit 31, for framing the speech data;
A first sequence acquisition unit 32, for obtaining the robust features of each frame of speech data to form the first feature sequence X = {x_1, x_2, …, x_M} of the speech data, wherein x_M denotes the robust features of the M-th frame of the speech data;
A second sequence acquisition unit 33, for removing the silent segments from the first feature sequence to obtain the second feature sequence Y = {y_1, y_2, …, y_N} of the speech data, wherein y_N denotes the robust features of the N-th frame after the silent segments in the first feature sequence X are removed, N ≤ M;
A likelihood calculation unit 34, for calculating the likelihood of the speech data under each dialect model according to the second feature sequence Y;
A determination unit 35, for determining the dialect category used by the user according to the likelihoods of the speech data under the different dialect models.
Further, the likelihood calculation unit 34 calculates the likelihood of the speech data under each dialect model specifically according to the following formulas:

p(Y | λ_k) = ∏_{i=1}^{N} p(y_i | λ_k), p(y_i | λ_k) = Σ_{j=1}^{J} ω_(k)j · N(y_i; μ_(k)j, C_(k)j)

Wherein, p(Y | λ_k) is the likelihood of the speech data under the k-th dialect model; p(y_i | λ_k) is the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model; ω_(k)j is the weight, C_(k)j the covariance, and μ_(k)j the mean of the j-th Gaussian submodel of the k-th dialect model; N(·; μ, C) denotes the Gaussian probability density function.
Further, the television program recommendation system also comprises a dialect model construction module 5. As shown in Fig. 6, which is a schematic structural diagram of the dialect model construction module 5 in Fig. 4, the dialect model construction module 5 specifically comprises:
A sample sequence acquisition unit 51, for obtaining second feature sequences based on a known dialect;
A clustering unit 52, for clustering the robust features of each frame of the second feature sequences by decision-tree-based clustering, each cluster being characterized by one Gaussian submodel;
A model parameter calculation unit 53, for calculating the weight, mean and covariance of each Gaussian submodel from the robust features assigned to it, by a maximum likelihood algorithm;
A model generation unit 54, for generating the dialect model of the known dialect from the weights, means and covariances of the Gaussian submodels; wherein the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model is p(y_i | λ_k) = Σ_j ω_(k)j · N(y_i; μ_(k)j, C_(k)j), with ω_(k)j the weight, C_(k)j the covariance and μ_(k)j the mean of the j-th Gaussian submodel of the k-th dialect model.
Further, the robust features comprise the energy of each frame of speech data, the mel-frequency cepstral coefficients, and the first-order and second-order differences of the mel-frequency cepstral coefficients.
Implementing the embodiments of the present invention has the following beneficial effects. A television program recommendation method is provided that can judge the dialect category spoken by the user from the user's voice signal and, on that basis, recommend television programs in which the related dialect is dominant. This allows the television to recommend programs that better match the user's cultural and linguistic background, enhancing the user experience, particularly for elderly users who are not fluent in Mandarin or skilled at human-machine operation. The invention can be regarded as a method specializing in recommending dialect programs, complementing existing television program recommendation systems. Meanwhile, an embodiment of the present invention also provides a television program recommendation system that can perform all steps of the described television program recommendation method.
Those of ordinary skill in the art will appreciate that all or part of the flows in the above method embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the flows of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The above are preferred embodiments of the present invention. It should be pointed out that those skilled in the art may make several improvements and modifications without departing from the principles of the present invention, and such improvements and modifications are also considered to be within the scope of protection of the present invention.
Claims (10)
1. A television program recommendation method, characterized by comprising:
Receiving a voice signal of a user;
Converting the voice signal into discrete speech data;
Identifying the dialect category used by the user according to the speech data;
Recommending television programs related to the dialect category to the user.
2. The television program recommendation method as claimed in claim 1, characterized in that identifying the dialect category used by the user according to the speech data specifically comprises:
Framing the speech data;
Obtaining the robust features of each frame of speech data to form a first feature sequence X = {x_1, x_2, …, x_M} of the speech data, wherein x_M denotes the robust features of the M-th frame of speech data;
Removing the silent segments from the first feature sequence X to obtain a second feature sequence Y = {y_1, y_2, …, y_N} of the speech data, wherein y_N denotes the robust features of the N-th frame after the silent segments in the first feature sequence X are removed, N ≤ M;
Calculating the likelihood of the speech data under each dialect model according to the second feature sequence Y;
Determining the dialect category used by the user according to the likelihoods of the speech data under the different dialect models.
3. The television program recommendation method as claimed in claim 2, characterized in that the likelihood of the speech data under each dialect model is calculated from the second feature sequence Y according to the following formula:

p(Y|λ_k) = ∏_{i=1}^{N} p(y_i|λ_k),  where  p(y_i|λ_k) = Σ_{j=1}^{J} ω_(k)j · N(y_i; μ_(k)j, C_(k)j)

Wherein p(Y|λ_k) is the likelihood of the speech data under the k-th dialect model; p(y_i|λ_k) is the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model; N(·; μ, C) denotes the Gaussian density with mean μ and covariance C, and J is the number of Gaussian submodels; ω_(k)j is the weight of the j-th Gaussian submodel of the k-th dialect model; C_(k)j is the covariance of the j-th Gaussian submodel of the k-th dialect model; and μ_(k)j is the mean of the j-th Gaussian submodel of the k-th dialect model.
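The mixture likelihood of claim 3 can be computed numerically as below; this is a generic diagonal-covariance Gaussian-mixture sketch under assumed toy parameters, not the patent's implementation:

```python
import numpy as np

def gmm_frame_prob(y, weights, means, variances):
    """p(y_i|lambda_k) = sum_j w_(k)j * N(y_i; mu_(k)j, C_(k)j),
    with diagonal covariances (an assumption of this sketch)."""
    y = np.asarray(y, dtype=float)
    prob = 0.0
    for w, mu, var in zip(weights, means, variances):
        norm = np.prod(1.0 / np.sqrt(2 * np.pi * var))
        prob += w * norm * np.exp(-0.5 * np.sum((y - mu) ** 2 / var))
    return prob

def log_likelihood(Y, model):
    """log p(Y|lambda_k): the claim-3 product over frames, taken in log domain."""
    return sum(np.log(gmm_frame_prob(y, *model)) for y in Y)

# Two toy one-dimensional dialect models, each with two Gaussian submodels
model_a = ([0.5, 0.5], [np.array([0.0]), np.array([1.0])], [np.array([1.0])] * 2)
model_b = ([0.5, 0.5], [np.array([5.0]), np.array([6.0])], [np.array([1.0])] * 2)
Y = [np.array([0.2]), np.array([0.8]), np.array([1.1])]
# The data lie near model_a's means, so model_a scores higher
assert log_likelihood(Y, model_a) > log_likelihood(Y, model_b)
```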
4. The television program recommendation method as claimed in claim 2 or claim 3, characterized in that, before receiving the voice signal of the user, the method further comprises a step of building the dialect models, which specifically comprises:
Obtaining a second feature sequence from speech data of a known dialect;
Clustering the robust features of each frame of speech data in the second feature sequence by a decision-tree-based clustering method, each resulting class being characterized by one Gaussian submodel;
Calculating, by a maximum-likelihood algorithm, the weight, mean and covariance of the Gaussian submodel corresponding to each class from the robust features contained in that class;
Generating the dialect model of the known dialect from the weights, means and covariances of the Gaussian submodels; wherein the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model is p(y_i|λ_k) = Σ_{j=1}^{J} ω_(k)j · N(y_i; μ_(k)j, C_(k)j); ω_(k)j is the weight of the j-th Gaussian submodel of the k-th dialect model; C_(k)j is the covariance of the j-th Gaussian submodel of the k-th dialect model; and μ_(k)j is the mean of the j-th Gaussian submodel of the k-th dialect model.
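A rough training sketch for the model-building steps above follows. Note one deliberate substitution: the patent clusters frames with a decision tree, while this sketch uses a simple two-center k-means-style clustering as a stand-in; the per-class maximum-likelihood estimates (weight = frame fraction, mean, variance) are as described:

```python
import numpy as np

def train_dialect_model(features, n_iter=10):
    """Build a two-submodel dialect model from 1-D frame features.
    Clustering here is k-means-like, NOT the patent's decision tree."""
    feats = np.asarray(features, dtype=float)
    centers = np.array([feats.min(), feats.max()])  # two Gaussian submodels
    for _ in range(n_iter):
        # Assign each frame feature to its nearest center, then re-estimate
        labels = np.argmin(np.abs(feats[:, None] - centers[None, :]), axis=1)
        centers = np.array([feats[labels == j].mean() for j in range(2)])
    # Maximum-likelihood estimates per cluster
    weights = np.array([(labels == j).mean() for j in range(2)])
    variances = np.array([feats[labels == j].var() + 1e-6 for j in range(2)])
    return weights, centers, variances

# Frames drawn at 0.1 and 4.9 should recover two submodels near those means
frames = np.concatenate([np.full(50, 0.1), np.full(50, 4.9)])
w, mu, var = train_dialect_model(frames)
assert np.allclose(sorted(mu.tolist()), [0.1, 4.9])
```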
5. The television program recommendation method as claimed in claim 4, characterized in that the robust features comprise, for each frame of speech data, the frame energy, the mel-frequency cepstral coefficients (MFCCs), and the first-order and second-order differences of the MFCCs.
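Assembling the per-frame robust feature vector of claim 5 might look like the following; the MFCC values themselves are toy stand-ins (real MFCC extraction is not shown), and np.gradient approximates the first- and second-order differences:

```python
import numpy as np

def add_deltas(mfcc):
    """Append first- and second-order differences along the time axis."""
    d1 = np.gradient(mfcc, axis=0)  # first-order difference
    d2 = np.gradient(d1, axis=0)    # second-order difference
    return np.hstack([mfcc, d1, d2])

def robust_feature_matrix(frames, mfcc):
    """One row per frame: [energy, MFCCs..., deltas..., delta-deltas...]."""
    energy = np.log(np.sum(frames.astype(float) ** 2, axis=1) + 1e-10)
    return np.hstack([energy[:, None], add_deltas(mfcc)])

frames = np.ones((5, 160))               # 5 toy frames of 160 samples each
mfcc = np.tile(np.arange(13.0), (5, 1))  # 5 frames x 13 toy coefficients
X = robust_feature_matrix(frames, mfcc)
assert X.shape == (5, 1 + 13 * 3)        # energy + MFCC + delta + delta-delta
```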
6. A television program recommendation system, characterized in that it comprises:
A signal receiving module, configured to receive a voice signal from a user;
A signal conversion module, configured to convert the voice signal into discrete speech data;
An identification module, configured to identify the dialect category used by the user according to the speech data;
A recommendation module, configured to recommend television programs relevant to the dialect category to the user.
7. The television program recommendation system as claimed in claim 6, characterized in that the identification module comprises:
A framing unit, configured to divide the speech data into frames;
A first sequence acquiring unit, configured to obtain the robust features of each frame of speech data and form a first feature sequence X = {x_1, x_2, ..., x_M} of the speech data, wherein x_M represents the robust features of the M-th frame of speech data;
A second sequence acquiring unit, configured to remove the silent segments from the first feature sequence and obtain a second feature sequence Y = {y_1, y_2, ..., y_N} of the speech data, wherein y_N represents the robust features of the N-th frame of speech data remaining after the silent segments are removed from the first feature sequence X, and N ≤ M;
A likelihood calculating unit, configured to calculate, according to the second feature sequence Y, the likelihood of the speech data under each of the different dialect models;
A determining unit, configured to determine the dialect category used by the user according to the likelihoods of the speech data under the different dialect models.
8. The television program recommendation system as claimed in claim 7, characterized in that the likelihood calculating unit calculates the likelihood of the speech data under each dialect model according to the following formula:

p(Y|λ_k) = ∏_{i=1}^{N} p(y_i|λ_k),  where  p(y_i|λ_k) = Σ_{j=1}^{J} ω_(k)j · N(y_i; μ_(k)j, C_(k)j)

Wherein p(Y|λ_k) is the likelihood of the speech data under the k-th dialect model; p(y_i|λ_k) is the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model; N(·; μ, C) denotes the Gaussian density with mean μ and covariance C, and J is the number of Gaussian submodels; ω_(k)j is the weight of the j-th Gaussian submodel of the k-th dialect model; C_(k)j is the covariance of the j-th Gaussian submodel of the k-th dialect model; and μ_(k)j is the mean of the j-th Gaussian submodel of the k-th dialect model.
9. The television program recommendation system as claimed in claim 7 or claim 8, characterized in that the television program recommendation system further comprises a dialect model construction module, which specifically comprises:
A sample sequence acquiring unit, configured to obtain a second feature sequence from speech data of a known dialect;
A clustering unit, configured to cluster the robust features of each frame of speech data in the second feature sequence by a decision-tree-based clustering method, each resulting class being characterized by one Gaussian submodel;
A model parameter calculating unit, configured to calculate, by a maximum-likelihood algorithm, the weight, mean and covariance of the Gaussian submodel corresponding to each class from the robust features contained in that class;
A model generating unit, configured to generate the dialect model of the known dialect from the weights, means and covariances of the Gaussian submodels; wherein the probability that the robust features y_i of the i-th frame of the second feature sequence appear under the k-th dialect model is p(y_i|λ_k) = Σ_{j=1}^{J} ω_(k)j · N(y_i; μ_(k)j, C_(k)j); ω_(k)j is the weight of the j-th Gaussian submodel of the k-th dialect model; C_(k)j is the covariance of the j-th Gaussian submodel of the k-th dialect model; and μ_(k)j is the mean of the j-th Gaussian submodel of the k-th dialect model.
10. The television program recommendation system as claimed in claim 9, characterized in that the robust features comprise, for each frame of speech data, the frame energy, the mel-frequency cepstral coefficients (MFCCs), and the first-order and second-order differences of the MFCCs.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510098643.9A CN104766607A (en) | 2015-03-05 | 2015-03-05 | Television program recommendation method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104766607A true CN104766607A (en) | 2015-07-08 |
Family
ID=53648391
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510098643.9A Pending CN104766607A (en) | 2015-03-05 | 2015-03-05 | Television program recommendation method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104766607A (en) |
Citations (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1298533A (en) * | 1998-04-22 | 2001-06-06 | 国际商业机器公司 | Adaptation of a speech recognizer for dialectal and linguistic domain variations |
US6411930B1 (en) * | 1998-11-18 | 2002-06-25 | Lucent Technologies Inc. | Discriminative gaussian mixture models for speaker verification |
CN1735924A (en) * | 2002-11-21 | 2006-02-15 | 松下电器产业株式会社 | Standard model creating device and standard model creating method |
CN1763843A (en) * | 2005-11-18 | 2006-04-26 | 清华大学 | Pronunciation quality evaluating method for language learning machine |
CN101154380A (en) * | 2006-09-29 | 2008-04-02 | 株式会社东芝 | Method and device for registration and validation of speaker's authentication |
CN101241699A (en) * | 2008-03-14 | 2008-08-13 | 北京交通大学 | A speaker identification system for remote Chinese teaching |
CN101286317A (en) * | 2008-05-30 | 2008-10-15 | 同济大学 | Speech recognition device, model training method and traffic information service platform |
CN101436403A (en) * | 2007-11-16 | 2009-05-20 | 创新未来科技有限公司 | Method and system for recognizing tone |
CN101552004A (en) * | 2009-05-13 | 2009-10-07 | 哈尔滨工业大学 | Method for recognizing in-set speaker |
CN101573749A (en) * | 2006-12-15 | 2009-11-04 | 摩托罗拉公司 | Method and apparatus for robust speech activity detection |
CN101622660A (en) * | 2007-02-28 | 2010-01-06 | 日本电气株式会社 | Audio recognition device, audio recognition method, and audio recognition program |
CN101645269A (en) * | 2008-12-30 | 2010-02-10 | 中国科学院声学研究所 | Language recognition system and method |
US20110071823A1 (en) * | 2008-06-10 | 2011-03-24 | Toru Iwasawa | Speech recognition system, speech recognition method, and storage medium storing program for speech recognition |
CN102184732A (en) * | 2011-04-28 | 2011-09-14 | 重庆邮电大学 | Fractal-feature-based intelligent wheelchair voice identification control method and system |
CN102231281A (en) * | 2011-07-18 | 2011-11-02 | 渤海大学 | Voice visualization method based on integration characteristic and neural network |
CN102238190A (en) * | 2011-08-01 | 2011-11-09 | 安徽科大讯飞信息科技股份有限公司 | Identity authentication method and system |
CN102290047A (en) * | 2011-09-22 | 2011-12-21 | 哈尔滨工业大学 | Robust speech characteristic extraction method based on sparse decomposition and reconfiguration |
CN102394062A (en) * | 2011-10-26 | 2012-03-28 | 华南理工大学 | Method and system for automatically identifying voice recording equipment source |
CN103313108A (en) * | 2013-06-14 | 2013-09-18 | 山东科技大学 | Smart TV program recommending method based on context aware |
CN103474061A (en) * | 2013-09-12 | 2013-12-25 | 河海大学 | Automatic distinguishing method based on integration of classifier for Chinese dialects |
CN103491411A (en) * | 2013-09-26 | 2014-01-01 | 深圳Tcl新技术有限公司 | Method and device based on language recommending channels |
CN103546773A (en) * | 2013-08-15 | 2014-01-29 | Tcl集团股份有限公司 | Television program recommendation method and system |
CN103839545A (en) * | 2012-11-23 | 2014-06-04 | 三星电子株式会社 | Apparatus and method for constructing multilingual acoustic model |
CN103943104A (en) * | 2014-04-15 | 2014-07-23 | 海信集团有限公司 | Voice information recognition method and terminal equipment |
CN103945250A (en) * | 2013-01-17 | 2014-07-23 | 三星电子株式会社 | Image processing apparatus, control method thereof, and image processing system |
CN104038788A (en) * | 2014-06-19 | 2014-09-10 | 中山大学深圳研究院 | Community social network system and content recommendation method |
CN104123934A (en) * | 2014-07-23 | 2014-10-29 | 泰亿格电子(上海)有限公司 | Speech composition recognition method and system |
CN104200804A (en) * | 2014-09-19 | 2014-12-10 | 合肥工业大学 | Various-information coupling emotion recognition method for human-computer interaction |
2015-03-05: Application CN201510098643.9A filed in China; patent CN104766607A/en; status: Pending
Non-Patent Citations (6)
Title |
---|
CARLOS LIMA ET AL: ""A Robust Features Extraction for Automatic Speech Recognition in Noisy Environments"", 《ICSP PROCEEDINGS》 *
MING-LIANG GU ET AL: ""Chinese Dialect Identification Using SC-GMM"", 《ADVANCED MATERIALS RESEARCH》 * |
WUEI-HE TSAI ET AL: ""Discriminative training of Gaussian mixture bigram models with application to Chinese dialect identification"", 《ELSEVIER》 * |
杨澄宇 (YANG CHENGYU): ""Speaker verification system based on Gaussian mixture models"", 《计算机应用》 (Computer Applications) *
王岐学 等 (WANG QIXUE ET AL): ""Hunan dialect recognition based on differential features and Gaussian mixture models"", 《计算机工程与应用》 (Computer Engineering and Applications) *
顾明亮 (GU MINGLIANG): ""Chinese dialect identification system based on Gaussian mixture models"", 《计算机工程与应用》 (Computer Engineering and Applications) *
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105677722A (en) * | 2015-12-29 | 2016-06-15 | 百度在线网络技术(北京)有限公司 | Method and apparatus for recommending friends in social software |
CN105683964A (en) * | 2016-01-07 | 2016-06-15 | 马岩 | Network social contact searching method and system |
WO2017117786A1 (en) * | 2016-01-07 | 2017-07-13 | 马岩 | Social network search method and system |
CN108172212A (en) * | 2017-12-25 | 2018-06-15 | 横琴国际知识产权交易中心有限公司 | A kind of voice Language Identification and system based on confidence level |
CN108172212B (en) * | 2017-12-25 | 2020-09-11 | 横琴国际知识产权交易中心有限公司 | Confidence-based speech language identification method and system |
CN108810566A (en) * | 2018-06-12 | 2018-11-13 | 忆东兴(深圳)科技有限公司 | A kind of smart television dialect is interpreted method |
CN114449342A (en) * | 2022-01-21 | 2022-05-06 | 腾讯科技(深圳)有限公司 | Video recommendation method and device, computer readable storage medium and computer equipment |
CN114449342B (en) * | 2022-01-21 | 2024-02-27 | 腾讯科技(深圳)有限公司 | Video recommendation method, device, computer readable storage medium and computer equipment |
CN115497475A (en) * | 2022-09-21 | 2022-12-20 | 深圳市人马互动科技有限公司 | Information recommendation method based on voice interaction system and related device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104766607A (en) | Television program recommendation method and system | |
CN108509619B (en) | Voice interaction method and device | |
CN108920648B (en) | Cross-modal matching method based on music-image semantic relation | |
CN105895087B (en) | Voice recognition method and device | |
CN104575504A (en) | Method for personalized television voice wake-up by voiceprint and voice identification | |
CN108538294A (en) | A kind of voice interactive method and device | |
CN113590850A (en) | Multimedia data searching method, device, equipment and storage medium | |
CN111508505B (en) | Speaker recognition method, device, equipment and storage medium | |
CN103970802A (en) | Song recommending method and device | |
CN107358947A (en) | Speaker recognition methods and system again | |
CN108710653B (en) | On-demand method, device and system for reading book | |
CN114676689A (en) | Sentence text recognition method and device, storage medium and electronic device | |
CN111309855A (en) | Text information processing method and system | |
CN112417132A (en) | New intention recognition method for screening negative samples by utilizing predicate guest information | |
CN106204103A (en) | The method of similar users found by a kind of moving advertising platform | |
CN111724766A (en) | Language identification method, related equipment and readable storage medium | |
CN110378190A (en) | Video content detection system and detection method based on topic identification | |
CN113220929A (en) | Music recommendation method based on time-staying and state-staying mixed model | |
Leng et al. | Audio scene recognition based on audio events and topic model | |
CN115438153A (en) | Training method and device for intention matching degree analysis model | |
CN112699831B (en) | Video hotspot segment detection method and device based on barrage emotion and storage medium | |
CN108595630A (en) | A kind of user behavior data analysis model and its construction method | |
CN111477248B (en) | Audio noise detection method and device | |
CN114758664A (en) | Voice data screening method and device, electronic equipment and readable storage medium | |
CN110347824B (en) | Method for determining optimal number of topics of LDA topic model based on vocabulary similarity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20150708 |
RJ01 | Rejection of invention patent application after publication |