CN106128467A - Speech processing method and device

Info

Publication number: CN106128467A
Application number: CN201610395662.2A
Authority: CN
Inventors: 黄宇
Original assignee: Beijing Yunzhisheng Information Technology Co Ltd
Current assignee: Beijing Yunzhisheng Information Technology Co Ltd
Priority date: 2016-06-06
Filing date: 2016-06-06
Publication date: 2016-11-16

Abstract

The present invention relates to a speech processing method and device. The method includes: receiving voice information input by a user; performing voiceprint recognition on the voice information to determine characteristic information of the user; performing speech and semantic recognition on the voice information to obtain a text recognition result; and determining, according to the characteristic information and the text recognition result, a target content recommendation service corresponding to the voice information. With this technical solution, voiceprint recognition and speech and semantic recognition are performed separately on the voice information input by the user to determine the user's characteristic information and a recognition result, where the characteristic information may be the user's sex, age, and the like, and the recognition result is the text content corresponding to the recognized voice information. The corresponding target content recommendation service is then determined from the user's characteristic information and the recognition result, thereby meeting the different recommendation needs of different users.

Description

Speech processing method and device

Technical field

The present invention relates to the field of speech processing technology, and in particular to a speech processing method and device.

Background

Speech recognition is an interdisciplinary field. Over the past two decades, speech recognition technology has made significant progress and has begun to move from the laboratory to the market. It is expected that, in the coming ten years, speech recognition technology will enter fields such as industry, household appliances, communications, automotive electronics, medical care, home services, and consumer electronics. The application of speech recognition dictation machines in some fields was rated by the US press as one of the ten major events in computer development in 1997. Many experts regard speech recognition as one of the ten most important technological developments in information technology between 2000 and 2010. The fields involved in speech recognition technology include signal processing, pattern recognition, probability theory and information theory, the mechanisms of speech production and hearing, artificial intelligence, and so on.

Summary of the invention

Embodiments of the present invention provide a speech processing method and device for recommending services to users according to the different needs of different users.

According to a first aspect of the embodiments of the present invention, a speech processing method is provided, including:

receiving voice information input by a user;

performing voiceprint recognition on the voice information to determine characteristic information of the user;

performing speech and semantic recognition on the voice information to obtain a text recognition result;

determining, according to the characteristic information and the text recognition result, a target content recommendation service corresponding to the voice information.

In this embodiment, voiceprint recognition and speech and semantic recognition are performed separately on the voice information input by the user to determine the user's characteristic information and a recognition result. The characteristic information may be the user's sex, age, and the like, and the recognition result is the text content corresponding to the recognized voice information. The corresponding target content recommendation service is then determined from the user's characteristic information and the recognition result, thereby meeting the different recommendation needs of different users.
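The four steps above can be sketched in Python as a small pipeline. This is only an illustration under stated assumptions: the names `UserFeatures`, `process_voice`, and the three `fake_*` functions are hypothetical, and the stand-in recognizers return fixed values instead of analysing audio, since the specification does not prescribe any particular recognition algorithm.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class UserFeatures:
    """Characteristic information obtained from voiceprint recognition."""
    age: Optional[int] = None
    sex: Optional[str] = None


def process_voice(
    audio: bytes,
    recognize_voiceprint: Callable[[bytes], UserFeatures],
    recognize_text: Callable[[bytes], str],
    recommend: Callable[[UserFeatures, str], str],
) -> str:
    """Run the steps of the method on one piece of received voice information."""
    features = recognize_voiceprint(audio)  # voiceprint recognition
    text = recognize_text(audio)            # speech and semantic recognition
    return recommend(features, text)        # determine the target service


# Stand-in recognizers (assumptions for illustration only).
def fake_voiceprint(audio: bytes) -> UserFeatures:
    return UserFeatures(age=5, sex="female")


def fake_text(audio: bytes) -> str:
    return "I want to listen to a song"


def fake_recommend(features: UserFeatures, text: str) -> str:
    return f"service for age {features.age}, request: {text}"


result = process_voice(b"\x00\x01", fake_voiceprint, fake_text, fake_recommend)
```

Passing the recognizers in as callables keeps the two recognition steps independent of each other, which mirrors the statement that they are performed separately on the same voice information.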

In one embodiment, the characteristic information includes at least one of the following:

the sex and the age of the user.

By performing voiceprint recognition on the voice information input by the user, the user's sex, age, and the like can be determined.

In one embodiment, the method further includes:

outputting the target content recommendation service.

In this embodiment, the target content recommendation service can be output so that the user can view the content recommendation service normally, improving the user experience.

In one embodiment, determining, according to the characteristic information and the text recognition result, the target content recommendation service corresponding to the voice information includes:

obtaining the keyword tags of each content recommendation service from a preset content recommendation service database, where the keyword tags include a service type, an applicable age range, and an applicable sex;

determining, according to the keyword tags, the target content recommendation service that matches the text recognition result and matches the characteristic information.

In this embodiment, each content recommendation service in the preset content recommendation service database carries keyword tags. A keyword tag may be a service type, such as a leisure and entertainment service or a learning service; specific examples include songs, reading material, and crosstalk. Keyword tags may also include an applicable age range, that is, whether the content recommendation service is suitable for the elderly, teenagers, or children, and may of course also include an applicable sex. In this way, the content recommendation service best suited to the user's characteristics and voice content can be found according to the user's characteristics and the input voice content, so that a content recommendation service that meets the user's needs is recommended, improving the user experience.
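A minimal sketch of such a tagged database and of matching against its tags might look as follows. The record layout, the `ContentService` and `match_services` names, and the sample entries are assumptions made for illustration; the specification only requires that each service carry a service type, an applicable age range, and an applicable sex.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ContentService:
    name: str
    service_type: str          # e.g. "song", "audiobook"
    age_range: str             # e.g. "child", "teen", "adult", "elderly"
    sex: Optional[str] = None  # None means suitable for any sex (assumption)


# Hypothetical preset database; the entries are invented for illustration.
DATABASE = [
    ContentService("nursery rhyme A", "song", "child"),
    ContentService("pop song B", "song", "adult"),
    ContentService("children's story C", "audiobook", "child"),
    ContentService("crosstalk D", "audiobook", "elderly"),
]


def match_services(
    database: List[ContentService],
    service_type: str,
    age_range: Optional[str] = None,
    sex: Optional[str] = None,
) -> List[ContentService]:
    """Return services whose tags match the request and the known features."""
    matches = []
    for svc in database:
        if svc.service_type != service_type:
            continue
        # A feature that was not recognized (None) is not used for filtering.
        if age_range is not None and svc.age_range != age_range:
            continue
        if sex is not None and svc.sex not in (None, sex):
            continue
        matches.append(svc)
    return matches
```

For instance, `match_services(DATABASE, "song", "child")` keeps only the nursery rhyme entry, while omitting the age range returns both song entries.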

In one embodiment, determining, according to the keyword tags, the target content recommendation service that matches the text recognition result and matches the characteristic information includes:

determining the target applicable age range to which the user's age in the characteristic information belongs, and/or the target applicable sex to which the user's sex belongs;

determining the target service type contained in the text recognition result;

determining, according to the keyword tags, the target content recommendation service that matches the target service type in the text recognition result and matches the target applicable age range and/or the target applicable sex.

In this embodiment, multiple applicable age ranges can be set, for example, ages 1-6 as the child bracket, 7-17 as the teenager bracket, 18-50 as the adult bracket, and over 50 as the elderly bracket, with different content recommendation services corresponding to different age brackets. For song services, for example, the recommended songs for the child bracket may be nursery rhymes and the like, those for the teenager bracket may be cartoon theme songs and the like, those for the adult bracket may be pop songs and the like, and those for the elderly bracket may be classic old songs and the like. Content recommendation services may also differ by sex: for a female user, the recommended songs may all be songs by male singers, and for a male user, the recommended songs may all be songs by female singers. Thus, the target applicable age range to which the user's age belongs and/or the target applicable sex is determined first, and the content recommendation service is then recommended to the user accordingly.
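The example age brackets above can be written as a simple lookup. The function name `age_bracket`, the bracket labels, and the `SONGS_BY_BRACKET` table are hypothetical; rejecting ages below 1 is also an assumption, since the example brackets start at 1.

```python
def age_bracket(age: int) -> str:
    """Map a recognized age to one of the example applicable age ranges."""
    if age < 1:
        # Assumption: the example brackets start at age 1.
        raise ValueError("age must be at least 1")
    if age <= 6:
        return "child"
    if age <= 17:
        return "teen"
    if age <= 50:
        return "adult"
    return "elderly"


# Example song recommendations per bracket, as listed in the text.
SONGS_BY_BRACKET = {
    "child": "nursery rhymes",
    "teen": "cartoon theme songs",
    "adult": "pop songs",
    "elderly": "classic old songs",
}
```

With these definitions, `SONGS_BY_BRACKET[age_bracket(15)]` evaluates to `"cartoon theme songs"`.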

In this way, services are recommended according to the needs of users in different age brackets, so that the recommended services better meet users' requirements and suit them better, improving the user experience.

According to a second aspect of the embodiments of the present invention, a speech processing device is provided, including:

a receiving module, configured to receive voice information input by a user;

a first determining module, configured to perform voiceprint recognition on the voice information to determine characteristic information of the user;

a recognition module, configured to perform speech and semantic recognition on the voice information to obtain a text recognition result;

a second determining module, configured to determine, according to the characteristic information and the text recognition result, a target content recommendation service corresponding to the voice information.

In one embodiment, the device further includes:

an output module, configured to output the target content recommendation service.

In one embodiment, the characteristic information includes at least one of the following: the sex and the age of the user.

In one embodiment, the second determining module includes:

an obtaining submodule, configured to obtain the keyword tags of each content recommendation service from a preset content recommendation service database, where the keyword tags include a service type, an applicable age range, and an applicable sex;

a determining submodule, configured to determine, according to the keyword tags, the target content recommendation service that matches the text recognition result and matches the characteristic information.

In one embodiment, the determining submodule is configured to:

It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the present invention.

Other features and advantages of the present invention will be set forth in the following description and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained by the structures particularly pointed out in the written description, the claims, and the accompanying drawings.

The technical solution of the present invention is described in further detail below with reference to the drawings and embodiments.

Brief description of the drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the invention.

Fig. 1 is a flowchart of a speech processing method according to an exemplary embodiment.

Fig. 2 is a flowchart of another speech processing method according to an exemplary embodiment.

Fig. 3 is a flowchart of step S104 in a speech processing method according to an exemplary embodiment.

Fig. 4 is a flowchart of step S302 in a speech processing method according to an exemplary embodiment.

Fig. 5 is a block diagram of a speech processing device according to an exemplary embodiment.

Fig. 6 is a block diagram of another speech processing device according to an exemplary embodiment.

Fig. 7 is a block diagram of the second determining module in another speech processing device according to an exemplary embodiment.

Detailed description of the invention

Exemplary embodiments will be described in detail here, examples of which are shown in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of devices and methods consistent with some aspects of the invention as detailed in the appended claims.

Fig. 1 is a flowchart of a speech processing method according to an exemplary embodiment. This speech processing method is applied in a terminal device, which may be any device with a speech processing function, such as a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, or personal digital assistant.

As shown in Fig. 1, the method includes steps S101-S104:

In step S101, voice information input by a user is received;

In step S102, voiceprint recognition is performed on the voice information to determine characteristic information of the user;

A voiceprint is the sound-wave spectrum, displayed by an electroacoustic instrument, that carries speech information. The production of human speech is a complex physiological and physical process between the language centers of the body and the vocal organs. The vocal organs used in speech (tongue, teeth, larynx, lungs, and nasal cavity) vary widely from person to person in size and shape, so the voiceprint spectra of any two people differ. Each person's acoustic speech features are relatively stable yet also variable; they are not absolute or unchanging. The variation may come from physiology, pathology, psychology, imitation, or disguise, and is also related to environmental interference. Nevertheless, because each person's vocal organs differ, it is generally still possible to distinguish the voices of different people or to judge whether two voices belong to the same person.

By performing voiceprint recognition on the voice information, specific characteristics of the user, such as the user's age and sex, can be identified.

In step S103, speech and semantic recognition is performed on the voice information to obtain a text recognition result;

Performing speech and semantic recognition on the voice information identifies the specific text content spoken by the user. For example, if the user says "I want to listen to a song", the text recognition result obtained after speech and semantic recognition is "I want to listen to a song"; that is, the voice content spoken by the user is converted into the corresponding text content.

In step S104, a target content recommendation service corresponding to the voice information is determined according to the characteristic information and the text recognition result.

In one embodiment, the characteristic information includes at least one of the following: the sex and the age of the user.

As shown in Fig. 2, in one embodiment, the method further includes step S201:

In step S201, the target content recommendation service is output.

As shown in Fig. 3, in one embodiment, step S104 may include steps S301-S302:

In step S301, the keyword tags of each content recommendation service are obtained from a preset content recommendation service database, where the keyword tags include a service type, an applicable age range, and an applicable sex;

In step S302, the target content recommendation service that matches the text recognition result and matches the characteristic information is determined according to the keyword tags.

As shown in Fig. 4, in one embodiment, step S302 may include steps S401-S403:

In step S401, the target applicable age range to which the user's age in the characteristic information belongs, and/or the target applicable sex to which the user's sex belongs, is determined;

In step S402, the target service type contained in the text recognition result is determined;

In step S403, the target content recommendation service that matches the target service type in the text recognition result and matches the target applicable age range and/or the target applicable sex is determined according to the keyword tags.
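Step S402 can be sketched as a keyword lookup over the recognized text. This is one simple possibility, not the semantic recognition technique of the specification, which is left unspecified; the `SERVICE_KEYWORDS` table and the `target_service_type` function are hypothetical names.

```python
from typing import Optional

# Hypothetical mapping from words in the recognized text to service types.
SERVICE_KEYWORDS = {
    "song": "song",
    "audiobook": "audiobook",
    "story": "audiobook",
}


def target_service_type(text: str) -> Optional[str]:
    """Return the service type of the first keyword found in the text."""
    lowered = text.lower()
    for keyword, service_type in SERVICE_KEYWORDS.items():
        if keyword in lowered:
            return service_type
    # No known service type is mentioned in the text.
    return None
```

The returned service type, together with the age range and/or sex from step S401, then forms the set of tags that step S403 matches against the database.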

For example, when the user inputs the voice information "I want to listen to a song", the device performs voiceprint recognition and speech and semantic recognition on this voice information. Speech and semantic recognition determines that the user wants to listen to a song, and voiceprint recognition determines the user's age and/or sex. If the user's age is recognized as 5, the user belongs to the child bracket, and the device searches the preset content recommendation service database for content recommendation services whose keyword tags are "song" and "child". If one content recommendation service is found, it is output; if several are found, one can be selected at random. For example, if the services found are several children's songs, one children's song can be played at random. Similarly, if the user's age is recognized as 15, the user belongs to the teenager bracket, and the device can play a cartoon theme song at random. If the user's age is recognized as 30, the user belongs to the adult bracket, and the device can play a pop song at random. If the user's age is recognized as 60, the user belongs to the elderly bracket, and the device can play an old song at random.

As another example, the user inputs the voice information "I want to listen to an audiobook", and the device performs voiceprint recognition and speech and semantic recognition on this voice information. Speech and semantic recognition determines that the user wants to listen to an audiobook, and voiceprint recognition determines the user's age and/or sex. If the user's age is recognized as 5, the user belongs to the child bracket, and the device searches the preset content recommendation service database for content recommendation services whose keyword tags are "audiobook" and "child". If one content recommendation service is found, it is output; if several are found, one can be selected at random. For example, if the services found are several children's stories, one children's story can be played at random. Similarly, if the user's age is recognized as 15, the user belongs to the teenager bracket, and the device can play a piece of prose at random. If the user's age is recognized as 30, the user belongs to the adult bracket, and the device can play an inspirational story at random. If the user's age is recognized as 60, the user belongs to the elderly bracket, and the device can play a crosstalk piece at random.
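The selection rule used in both examples (output the single match, or pick one at random when several match) can be sketched as follows. The function name `choose_service` and the optional `rng` parameter are hypothetical, and returning `None` when nothing matches is an assumption, since the examples do not describe that case.

```python
import random
from typing import Optional, Sequence


def choose_service(
    candidates: Sequence[str],
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick the service to output from the services found in the database."""
    if not candidates:
        # Assumption: the text does not say what happens with no match.
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Several services were found: select one at random.
    chooser = rng if rng is not None else random
    return chooser.choice(candidates)
```

Passing a seeded `random.Random` instance as `rng` makes the random pick reproducible, which can be convenient when testing.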

The following are device embodiments of the present invention, which can be used to carry out the method embodiments of the present invention.

Fig. 5 is a block diagram of a speech processing device according to an exemplary embodiment. The device can be implemented as part or all of a terminal device through software, hardware, or a combination of both. As shown in Fig. 5, the speech processing device includes:

a receiving module 51, configured to receive voice information input by a user;

a first determining module 52, configured to perform voiceprint recognition on the voice information to determine characteristic information of the user;

a recognition module 53, configured to perform speech and semantic recognition on the voice information to obtain a text recognition result;

a second determining module 54, configured to determine, according to the characteristic information and the text recognition result, a target content recommendation service corresponding to the voice information.

As shown in Fig. 6, in one embodiment, the device further includes:

an output module 61, configured to output the target content recommendation service.

In one embodiment, the characteristic information includes at least one of the following: the sex and the age of the user.

As shown in Fig. 7, in one embodiment, the second determining module 54 includes:

an obtaining submodule 71, configured to obtain the keyword tags of each content recommendation service from a preset content recommendation service database, where the keyword tags include a service type, an applicable age range, and an applicable sex;

a determining submodule 72, configured to determine, according to the keyword tags, the target content recommendation service that matches the text recognition result and matches the characteristic information.

In one embodiment, the determining submodule 72 is configured to:

Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims

1. A speech processing method, characterized by comprising:

receiving voice information input by a user;

determining, according to the characteristic information and the text recognition result, a target content recommendation service corresponding to the voice information.

2. The method according to claim 1, characterized in that the method further comprises:

outputting the target content recommendation service.

3. The method according to claim 1, characterized in that the characteristic information comprises at least one of the following:

the sex and the age of the user.

4. The method according to claim 3, characterized in that determining, according to the characteristic information and the text recognition result, the target content recommendation service corresponding to the voice information comprises:

obtaining the keyword tags of each content recommendation service from a preset content recommendation service database, wherein the keyword tags comprise a service type, an applicable age range, and an applicable sex;

determining, according to the keyword tags, the target content recommendation service that matches the text recognition result and matches the characteristic information.

5. The method according to claim 4, characterized in that determining, according to the keyword tags, the target content recommendation service that matches the text recognition result and matches the characteristic information comprises:

determining the target applicable age range to which the user's age in the characteristic information belongs, and/or the target applicable sex to which the user's sex belongs;

determining, according to the keyword tags, the target content recommendation service that matches the target service type in the text recognition result and matches the target applicable age range and/or the target applicable sex.

6. A speech processing device, characterized by comprising:

a receiving module, configured to receive voice information input by a user;

a second determining module, configured to determine, according to the characteristic information and the text recognition result, a target content recommendation service corresponding to the voice information.

7. The device according to claim 6, characterized in that the device further comprises:

8. The device according to claim 6, characterized in that the characteristic information comprises at least one of the following:

the sex and the age of the user.

9. The device according to claim 8, characterized in that the second determining module comprises:

an obtaining submodule, configured to obtain the keyword tags of each content recommendation service from a preset content recommendation service database, wherein the keyword tags comprise a service type, an applicable age range, and an applicable sex;

a determining submodule, configured to determine, according to the keyword tags, the target content recommendation service that matches the text recognition result and matches the characteristic information.

10. The device according to claim 9, characterized in that the determining submodule is configured to: