CN104157286B - Pet-phrase acquisition method and device - Google Patents

Pet-phrase acquisition method and device Download PDF

Info

Publication number
CN104157286B
CN104157286B CN201410374537.4A CN201410374537A
Authority
CN
China
Prior art keywords
voice
byte
threshold
user
voice byte
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410374537.4A
Other languages
Chinese (zh)
Other versions
CN104157286A (en)
Inventor
卢存洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Microphone Holdings Co Ltd
Original Assignee
Shenzhen Jinli Communication Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Jinli Communication Equipment Co Ltd filed Critical Shenzhen Jinli Communication Equipment Co Ltd
Priority to CN201410374537.4A priority Critical patent/CN104157286B/en
Publication of CN104157286A publication Critical patent/CN104157286A/en
Application granted granted Critical
Publication of CN104157286B publication Critical patent/CN104157286B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The embodiment of the invention discloses a pet-phrase acquisition method, including: if a voice signal sent by a user is detected, obtaining speech data corresponding to the voice signal; according to a preset voice-byte threshold, screening out from the speech data a number of target voice bytes corresponding to the threshold; and parsing the target voice bytes to obtain an analysis result containing the user's pet phrases. The embodiment of the invention also discloses a pet-phrase acquisition device. With the present invention, the pet phrases of a given user can be obtained in a targeted manner.

Description

Pet-phrase acquisition method and device
Technical field
The present invention relates to the technical field of media, and more particularly to a pet-phrase acquisition method and device.
Background technology
In daily life, people inevitably communicate with others. In the course of communication, however, each person has his or her own speech habits, so certain pet phrases may be carried into an exchange. Some of these habits, such as uncivil expressions, can damage the atmosphere of a conversation: on a relatively formal occasion, an unintentionally blurted-out crude catchphrase can disturb the harmony of the exchange, reflect negatively on the speaker, and even cause a certain loss. Grasping one's own speech habits in time therefore becomes key. In the prior art, however, there is no means of analyzing a user's speech habits, nor can the speech habits of a given user be obtained through current communication tools.
Summary of the invention
The embodiments of the invention provide a pet-phrase acquisition method and device, which can obtain the pet phrases of a given user in a targeted manner.
An embodiment of the invention provides a pet-phrase acquisition method, including:
if a voice signal sent by a user is detected, obtaining speech data corresponding to the voice signal;
according to a preset voice-byte threshold, screening out from the speech data a number of target voice bytes corresponding to the threshold; and
parsing the target voice bytes, and obtaining an analysis result containing the user's pet phrases.
Correspondingly, an embodiment of the invention further provides a pet-phrase acquisition device, including:
a first acquisition unit configured to obtain, if a voice signal sent by a user is detected, the speech data corresponding to the voice signal;
a screening unit configured to screen out, according to a preset voice-byte threshold, a number of target voice bytes corresponding to the threshold from the speech data obtained by the first acquisition unit; and
a second acquisition unit configured to parse the target voice bytes screened out by the screening unit, and to obtain an analysis result containing the user's pet phrases.
When a voice signal sent by a user is detected, the embodiment of the invention can obtain the corresponding speech data and analyze the target voice bytes screened out of that data, thereby obtaining the pet phrases of the current user; the pet phrases of a given user can thus be obtained in a targeted manner, with greater flexibility.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a pet-phrase acquisition method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another pet-phrase acquisition method according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for acquiring target voice bytes according to an embodiment of the present invention;
Fig. 4 is an interaction diagram of a pet-phrase acquisition method according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of another pet-phrase acquisition method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a pet-phrase acquisition device according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of another pet-phrase acquisition device according to an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another pet-phrase acquisition device according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a terminal according to an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a server according to an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of a pet-phrase acquisition system according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, a schematic flowchart of a pet-phrase acquisition method according to an embodiment of the present invention: the method may be applied in a terminal device such as a mobile phone, tablet computer, or wearable device, or in a server; the embodiment of the present invention imposes no limit in this respect. Specifically, the method includes:
S101: if a voice signal sent by a user is detected, obtaining speech data corresponding to the voice signal.
In a specific embodiment, whether a user is currently sending a voice signal can be detected, and when a voice signal is detected, acquisition of the corresponding speech data is triggered, for example by recording.
Further, before the speech data is obtained, it may also be detected whether the user currently sending the voice signal is the legitimate user of the current terminal, for example by matching against a preset speech sample, where the speech sample is a sound clip of the legitimate user and may be recorded by the legitimate user in advance.
S102: according to a preset voice-byte threshold, screening out from the speech data a number of target voice bytes corresponding to the threshold.
In a specific embodiment, a voice-byte threshold can be preset, and target voice bytes are extracted from the acquired speech data according to that threshold. In general, each word the user speaks corresponds to one voice byte; for example, a three-word greeting corresponds to three voice bytes.
Optionally, the acquired speech data may be a single sentence, and the voice bytes numbering the preset threshold may be extracted as target voice bytes from a specific position of that sentence, such as its beginning and/or end. That is, every time a sentence is acquired, for example every time a sentence is recorded, the screening operation for target voice bytes can be carried out, so that a certain number of target voice bytes are obtained. Sentences can be distinguished from one another by a preset pause-time interval.
Still further optionally, the acquired speech data may also be a passage composed of several sentences, in which case it can be segmented according to the preset pause-time interval to obtain multiple speech fragments (each fragment may correspond to one sentence). Correspondingly, if the voice-byte threshold is set to 5, 5 voice bytes can be extracted from a specific position of each fragment as target voice bytes, for example its first 5 and/or last 5 bytes, so that multiple target voice bytes are obtained.
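The screening described above can be sketched as follows, in a minimal Python illustration where each character of a string stands in for one voice byte and a marker character stands in for the pause between sentences (all function names are illustrative, not from the patent):

```python
def split_into_fragments(speech, pause_marker="|"):
    # A pause longer than the preset interval separates sentences; here
    # the pause is modeled as a literal marker character in the string.
    return [frag for frag in speech.split(pause_marker) if frag]

def screen_target_bytes(fragment, byte_threshold=5):
    # Take `byte_threshold` voice bytes from the beginning and from the
    # end of the fragment, as S102 describes.
    return fragment[:byte_threshold], fragment[-byte_threshold:]

fragments = split_into_fragments("thisclassbegins|thenopenyourbook")
targets = [screen_target_bytes(f) for f in fragments]
# targets[0] == ('thisc', 'egins')
```

A real implementation would of course operate on recorded audio rather than text; the sketch only mirrors the split-then-extract structure of the step.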
S103: parsing the target voice bytes, and obtaining an analysis result containing the user's pet phrases.
Specifically, if the parsing finds identical target voice bytes, that is, some target voice bytes repeat, the number of occurrences (the repetition count) of those voice bytes is calculated, and when the repetition count exceeds a preset quantity threshold, for example 5, the corresponding target voice bytes are stored as pet phrases of the user.
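The repetition-counting step can be sketched as follows, assuming the target voice bytes are represented as plain strings and using the quantity threshold of 5 mentioned in the text:

```python
from collections import Counter

def pet_phrases(target_bytes, quantity_threshold=5):
    # Count how often each target voice byte repeats, and keep those
    # whose repetition count reaches the preset quantity threshold.
    counts = Counter(target_bytes)
    return {byte: n for byte, n in counts.items() if n >= quantity_threshold}

observed = ["then"] * 6 + ["thisc", "egins"]
print(pet_phrases(observed))  # {'then': 6}
```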
Further, the user's pet phrases obtained by parsing, together with their repetition counts, can also be pushed to the current terminal.
Further, when it is subsequently detected that the user sends a voice signal whose corresponding speech data matches a pet phrase obtained by the parsing, a notification message can be sent to draw the user's attention to the relevant wording.
By implementing the embodiment of the present invention, the corresponding speech data can be obtained when a voice signal sent by a user is detected, and the target voice bytes screened out of that data can be analyzed to obtain the pet phrases of the current user; the pet phrases of a given user can thus be obtained in a targeted manner, with greater flexibility.
Referring to Fig. 2, a schematic flowchart of another pet-phrase acquisition method according to an embodiment of the present invention. Specifically, the method includes:
S201: if a voice signal sent by a user is detected, obtaining a voice attribute corresponding to the voice signal.
S202: judging whether the voice attribute corresponding to the voice signal matches the voice attribute corresponding to a preset speech sample.
In a specific embodiment, a speech sample can be preset; the speech sample is a sound clip of the legitimate user and may specifically be recorded by the current legitimate user.
S203: if they match, obtaining speech data corresponding to the voice signal.
Specifically, when a voice signal sent by a user is detected, that is, when someone is detected speaking, the voice attributes of the voice signal can be matched against those of the speech sample, for example by judging whether the timbre and frequency of the two correspond, so as to determine the legitimacy of the current user's identity; when the judgment result is a match, that is, the current user's identity is legitimate, acquisition of the speech data corresponding to the voice signal is triggered. The voice attributes may include speech rate, intonation, timbre, frequency, and the like.
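As a toy illustration of the attribute-matching idea: a real system would extract these attributes from audio, which is out of scope here, and both the feature names and the tolerance are assumptions rather than details from the patent.

```python
def attributes_match(signal_attrs, sample_attrs, tolerance=0.1):
    # Compare each available voice attribute (speech rate, intonation,
    # timbre proxy, frequency) against the legitimate user's sample,
    # allowing a relative tolerance on every shared attribute.
    shared = set(signal_attrs) & set(sample_attrs)
    return bool(shared) and all(
        abs(signal_attrs[k] - sample_attrs[k]) <= tolerance * abs(sample_attrs[k])
        for k in shared
    )

sample = {"speech_rate": 4.2, "frequency": 180.0}
print(attributes_match({"speech_rate": 4.0, "frequency": 185.0}, sample))  # True
```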
S204: segmenting the speech data according to a preset pause-time interval to obtain speech fragments.
When it is determined that the user currently sending the voice signal is the legitimate user, the corresponding speech data can be obtained, for example by recording. Specifically, the speech data may be a whole passage of speech containing multiple speech fragments, in which case it can be segmented in a preset manner, for example at the preset pause-time interval (such as 200 ms) between voice bytes, to obtain speech fragments (each fragment may correspond to one sentence). Further, if the currently recorded speech data is only a single sentence, that sentence can be taken as one speech fragment; each sentence recorded thereafter can likewise be taken as a fragment, until a preset number of speech fragments is obtained.
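The pause-based segmentation can be sketched over word-onset timestamps, a simplification of segmenting raw audio; the 200 ms default mirrors the example interval above, while the names are illustrative:

```python
def segment_by_pause(onsets, words, pause_interval=0.2):
    # Start a new speech fragment wherever the gap between consecutive
    # word onsets exceeds the preset pause interval (200 ms by default).
    fragments, current = [], [words[0]]
    for prev, cur, word in zip(onsets, onsets[1:], words[1:]):
        if cur - prev > pause_interval:
            fragments.append(current)
            current = []
        current.append(word)
    fragments.append(current)
    return fragments

print(segment_by_pause([0.0, 0.1, 0.9, 1.0], ["this", "class", "then", "open"]))
# [['this', 'class'], ['then', 'open']]
```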
S205: according to a preset voice-byte threshold, extracting at the start and/or end of each speech fragment the voice bytes numbering the threshold as target voice bytes.
In a specific embodiment, a voice-byte threshold can also be preset, and target voice bytes are extracted from a specific position, such as the beginning and/or end, of each speech fragment obtained by the segmentation. For example, if the voice-byte threshold is set to 5, the first 5 bytes and the last 5 bytes of each fragment can both be extracted as target voice bytes, so that multiple target voice bytes are obtained.
Further, the voice-byte threshold can be set to decrease step by step, for example from 5 down through 4, 3, 2, and 1, repeating the extraction of the corresponding number of target voice bytes at the beginning and end of each fragment until the threshold reaches 0; that is, 5, 4, 3, 2, and 1 voice bytes are extracted in turn from the beginning and end of each fragment as target voice bytes, so that target voice bytes of different byte counts are obtained.
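The decrementing extraction can be sketched as follows, again with each character of a string standing in for one voice byte:

```python
def decrementing_targets(fragment, start_threshold=5):
    # Extract head and tail voice bytes at each threshold value,
    # decrementing from start_threshold down to 1 (stopping at 0),
    # so targets of every length are collected.
    targets = []
    for n in range(start_threshold, 0, -1):
        targets.append(fragment[:n])   # first n voice bytes
        targets.append(fragment[-n:])  # last n voice bytes
    return targets

print(decrementing_targets("abcdef", start_threshold=2))  # ['ab', 'ef', 'a', 'f']
```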
S206: calculating the repetition count of the target voice bytes, and recording the repetition count.
S207: if the repetition count is detected to reach a preset second quantity threshold, taking the target voice bytes as a pet phrase of the user, and saving the pet phrase.
Specifically, if the parsing finds identical target voice bytes, the number of occurrences (the repetition count) of those voice bytes is calculated, and when the repetition count exceeds the preset quantity threshold, for example 5, the corresponding target voice bytes are stored as a pet phrase of the user, so that the user can query the analysis result, or so that the analysis result containing the user's pet phrases can be pushed to the user directly.
Optionally, a reminder time can be preset, such as 9 o'clock every evening, and when the reminder time is reached, the acquired analysis result, that is, object information such as the user's pet phrases and their corresponding repetition counts, is pushed to the current terminal.
In a specific embodiment, a forbidden-speech bank can also be preset, containing speech fragments that carry a preset forbidden flag, that is, the voice bytes of commonly uncivil expressions. Optionally, if a parsed pet phrase turns out to be a voice byte that should be forbidden, such as an uncivil expression, a forbidden flag can be generated and the flagged pet phrase added to the forbidden-speech bank as a forbidden fragment.
Further, if the speech data corresponding to a voice signal sent by the user is detected to match any fragment in the forbidden-speech bank, a notification message can be sent to draw the user's attention to the relevant wording. Specifically, the notification may be a prompt by short message, ringtone, or vibration; the embodiment of the present invention imposes no limit in this respect.
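A minimal sketch of the forbidden-speech check, with hypothetical bank entries standing in for the uncivil expressions the text mentions:

```python
FORBIDDEN_BANK = {"darnit", "crud"}  # hypothetical entries, not from the patent

def notification_for(fragment, bank=FORBIDDEN_BANK):
    # If the fragment matches any entry in the forbidden-speech bank,
    # return a notification message (delivered by short message, ringtone,
    # or vibration in the patent's description); otherwise return None.
    for phrase in sorted(bank):  # sorted for a deterministic matching order
        if phrase in fragment:
            return f"Reminder: flagged wording '{phrase}' detected."
    return None

print(notification_for("oh crud again"))  # Reminder: flagged wording 'crud' detected.
```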
By implementing the embodiment of the present invention, acquisition of the corresponding speech data can be triggered when the identity of the user currently sending a voice signal is verified as legitimate; the speech data is segmented into speech fragments, and the more representative wording is screened out from the beginning and/or end of each fragment, so that the pet phrases of the current user are obtained by analysis and pushed to the user in a targeted manner. Further, when it is subsequently detected that the user speaks such a pet phrase, for instance an uncivil expression, the user can be reminded.
Referring to Fig. 3, a schematic flowchart of a method for acquiring target voice bytes according to an embodiment of the present invention. Specifically, the method includes:
S301: screening out from the speech fragments the target speech fragments whose voice-byte count is greater than or equal to the preset voice-byte threshold.
S302: if the number of target speech fragments screened out is not less than a preset first quantity threshold, extracting at the start and/or end of each target speech fragment the voice bytes numbering the voice-byte threshold as target voice bytes.
For example, if the voice-byte threshold is set to 5 and the quantity threshold for speech fragments is set to 6, the fragments with 5 or more voice bytes can be screened out, and once 6 such fragments have accumulated, extraction of the first 5 and/or last 5 voice bytes of those 6 fragments as target voice bytes can be triggered.
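This screening-then-extraction logic can be sketched as (characters again stand in for voice bytes; names are illustrative):

```python
def screen_fragments(fragments, byte_threshold=5, first_quantity_threshold=6):
    # Keep only fragments with at least byte_threshold voice bytes;
    # extraction is triggered only once first_quantity_threshold of them
    # have accumulated (steps S301/S302).
    long_enough = [f for f in fragments if len(f) >= byte_threshold]
    if len(long_enough) < first_quantity_threshold:
        return None  # not enough qualifying fragments yet
    batch = long_enough[:first_quantity_threshold]
    return [(f[:byte_threshold], f[-byte_threshold:]) for f in batch]

print(screen_fragments(["thisclassbegins"] * 6)[0])  # ('thisc', 'egins')
```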
Optionally, for those fragments obtained by segmentation whose voice-byte count is less than the preset voice-byte threshold, and on the guess that uncivil wording may appear in them, each short fragment can be compared against the fragments in the preset forbidden-speech bank; if a match is detected, the short fragment can be treated as an uncivil expression, and the expression and its occurrence count saved, so that the user can query them later or so that they can be pushed to the current user.
S303: decrementing the voice-byte threshold step by step, and judging whether the decremented threshold is zero.
Further, the voice-byte threshold can be set to decrease step by step, for example from 5 down through 4, 3, 2, and 1, repeating step S302 until the threshold reaches 0; that is, 5, 4, 3, 2, and 1 voice bytes are extracted in turn from the beginning and/or end of the screened-out target speech fragments as target voice bytes.
S304: obtaining the target voice bytes.
If the voice-byte threshold has reached 0, the extraction of target voice bytes can be deemed finished, so that target voice bytes of different byte counts have been acquired.
For example, suppose the screening yields the following speech fragments:
1. This class is about to begin.
2. Then, classmates, quickly recall what was said in class.
3. All right, don't look yet.
4. Then open your book and turn to page 55.
5. Then have a look at the suggested content there.
6. This class has started.
Here the quantity threshold for speech fragments is 6 and the voice-byte threshold is set to 5, so the 6 consecutively accumulated sentences can serve as one comparison unit, every sentence having at least 5 voice bytes.
For the 6 sentences above, according to the voice-byte threshold 5, the voice bytes corresponding to the first 5 and last 5 units of each sentence (in the original Chinese, the first and last 5 characters) can be extracted as target voice bytes, and each extracted target voice byte is parsed.
In a specific embodiment, each target voice byte can be parsed by comparing the voice bytes extracted at the beginnings of the sentences with one another, and likewise those at the endings. Comparing the 5-byte head of each of the 6 sentences shows that no two are identical; further, comparing the 5-byte tail of each sentence likewise shows that no two are identical, so the voice-byte threshold can be decremented from 5 to 4.
At threshold 4, comparing the 6 heads again finds no two identical, and comparing the 6 tails likewise finds no two identical, so the threshold is decremented from 4 to 3, and so on.
When the threshold has been decremented to 2, 'then' is found to occur three times among the heads of the 6 sentences; the voice bytes corresponding to 'then' are saved and the repetition count 3 is recorded.
Finally the threshold is decremented from 2 to 1. Among the heads, 'this' is found to occur twice, so the corresponding voice byte is saved and its repetition count 2 recorded; the first byte of 'then' occurs three times, and its repetition count 3 is recorded. Among the tails, the sentence-final particle is found to occur four times, so its voice byte is saved and the repetition count 4 recorded. Further, since the repetition count of the single first byte of 'then' equals that of the two-byte 'then' found at threshold 2 (both 3), and 'then' contains that byte, the record for the shorter byte can be discarded directly; otherwise it would be recorded together with its repetition count.
Summing up, the analysis finds that the pet phrases of this user are 'this', 'then', and the sentence-final particle. Further, if the quantity threshold for the repetition count is set to 3, 'then' and the particle can be stored as pet phrases of the user.
Further, the above parsing can be applied to each subsequent group of 6 sentences with 5 or more voice bytes, obtaining an analysis result containing the user's catchphrases. If a monitored catchphrase coincides with one found earlier, its occurrence count is accumulated, and when the count exceeds a certain number within a preset time range, for example more than 20 times within 3 hours, it is marked as a severe warning and a notification message is sent to the current user.
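The whole decrement-and-compare analysis can be sketched end to end. English word lists stand in for the Chinese voice bytes of the original example (the character-level comparisons do not survive translation), and the subsumption rule, discarding a shorter phrase when an already-found longer phrase contains it with an equal or higher count, follows the example above:

```python
from collections import Counter

def contains(longer, shorter):
    # True if `shorter` appears as a contiguous run inside `longer`.
    k = len(shorter)
    return any(longer[i:i + k] == shorter for i in range(len(longer) - k + 1))

def find_pet_phrases(sentences, max_threshold=2, repeat_threshold=3):
    # Decrement the voice-byte threshold from max_threshold down to 1,
    # collect the head and tail n-grams of every sentence, and keep each
    # n-gram repeated at least repeat_threshold times, unless an
    # already-found longer phrase subsumes it with an equal or higher count.
    found = {}
    for n in range(max_threshold, 0, -1):
        grams = []
        for s in sentences:
            grams.append(tuple(s[:n]))   # head of the sentence
            grams.append(tuple(s[-n:]))  # tail of the sentence
        for gram, count in Counter(grams).items():
            if count < repeat_threshold:
                continue
            if any(contains(longer, gram) and found[longer] >= count
                   for longer in found):
                continue  # subsumed by a longer pet phrase
            found[gram] = count
    return found

sentences = [
    ["this", "class", "is", "about", "to", "begin"],
    ["then", "classmates", "recall", "the", "content"],
    ["all", "right", "do", "not", "look", "yet"],
    ["then", "open", "your", "book"],
    ["then", "have", "a", "look", "there"],
    ["this", "class", "has", "started"],
]
print(find_pet_phrases(sentences))  # {('then',): 3}
```

Run on these six stand-in sentences, only the sentence-initial "then" repeats often enough to survive the threshold, mirroring how the example isolates the habitual opener.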
By implementing the embodiment of the present invention, the speech fragments exceeding a certain voice-byte count can be screened out, target voice bytes of the corresponding lengths extracted from the beginning and end of each fragment in descending order of the preset byte count, and each target voice byte parsed for repeated bytes, so that the pet phrases of the current user are obtained by analysis in a more targeted way.
Referring to Fig. 4, an interaction diagram of a pet-phrase acquisition method according to an embodiment of the present invention. The method includes:
S401: if the terminal detects a voice signal sent by a user, the terminal obtains speech data corresponding to the voice signal.
In a specific embodiment, whether a user is currently sending a voice signal can be detected, and when a voice signal is detected, acquisition of the corresponding speech data is triggered, for example by recording.
Optionally, the terminal obtaining the speech data corresponding to the detected voice signal may specifically be: if the terminal detects a voice signal sent by a user, the terminal obtains the voice attribute corresponding to the voice signal; the terminal judges whether that voice attribute matches the voice attribute corresponding to a preset speech sample, the speech sample being a sound clip of the legitimate user and the voice attribute including any one or more of speech rate, intonation, timbre, and frequency; and if the judgment result is a match, that is, the current user is detected to be the legitimate user, the terminal triggers acquisition of the speech data corresponding to the voice signal.
S402: the terminal sends the speech data to a server.
S403: the server receives the speech data sent by the terminal and, according to a preset voice-byte threshold, screens out from the speech data a number of target voice bytes corresponding to the threshold.
In a specific embodiment, a voice-byte threshold can be preset, and target voice bytes are extracted from the acquired speech data according to that threshold.
Optionally, the server screening out the target voice bytes according to the preset voice-byte threshold may specifically be: the server segments the speech data according to a preset pause-time interval to obtain speech fragments, the speech data containing at least one speech fragment; and the server, according to the preset voice-byte threshold, extracts at the start and/or end of each fragment the voice bytes numbering the threshold as target voice bytes.
It should be noted that the step of obtaining target voice bytes in S403 can also be performed by the terminal: the terminal may screen out the target voice bytes from the speech data according to the preset voice-byte threshold and then send the acquired target voice bytes to the server, so that the server parses them.
S404: the server parses the target voice bytes, and obtains an analysis result containing the user's pet phrases.
Optionally, the server parsing the target voice bytes and obtaining the analysis result may specifically be: the server calculates the repetition count of the target voice bytes and records it; and if the server detects that the repetition count reaches a preset quantity threshold, the server takes the target voice bytes as a pet phrase of the user and saves the pet phrase.
Specifically, if the parsing finds identical target voice bytes, that is, some target voice bytes repeat, the number of occurrences (the repetition count) of those voice bytes is calculated, and when the repetition count exceeds the preset quantity threshold, for example 5, the corresponding target voice bytes are stored as pet phrases of the user.
S405: the server pushes the analysis result to the terminal.
By implementing the embodiment of the present invention, the corresponding speech data can be obtained when a voice signal sent by a user is detected, and the target voice bytes screened out of that data can be analyzed to obtain the pet phrases of the current user; the pet phrases of a given user can thus be obtained in a targeted manner, with greater flexibility.
Further, the server can also push the analysis result, such as the user's pet phrases obtained by parsing together with their repetition counts, to the current terminal.
Fig. 5 is referred to, is the schematic flow sheet of another phrasal acquisition methods of the embodiment of the present invention, the side Method can be applied particularly in server, specifically, methods described includes:
S501:The speech data that server receiving terminal is sent, and according to default voice byte-threshold, from the voice The target voice byte of the voice byte-threshold corresponding number is filtered out in data.
Wherein, the speech data be the terminal when detecting the voice signal that user sends it is acquired with it is described Speech data corresponding to voice signal.
In specific embodiment, the server can divide the speech data according to default dead time interval Section, sound bite is obtained, and according to default voice byte-threshold, extracted respectively at the start or end of the sound bite Go out the voice byte of the voice byte-threshold corresponding number as target voice byte.Wherein, the speech data is included extremely A few sound bite;
Specifically, server can pre-set a voice byte-threshold, and each voice sheet according to the threshold value from division Target voice byte is extracted at the ad-hoc location of section such as beginning and/or ending.For example, if the voice byte-threshold is set For 5, then can extract simultaneously the sound bite preceding 5 bytes and rear 5 bytes as target voice byte, it is multiple so as to obtain Target voice byte.
Further, it can be set and the voice byte-threshold successively decrease successively, for example 4,3,2,1 is decremented to successively from 5, and Repeat the target voice word that corresponding voice byte-threshold corresponding number is extracted at the beginning and end of each sound bite Section, until the voice byte-threshold is changed into 0, i.e., extract 5 voice bytes, 4 from the beginning and end of each sound bite respectively Individual voice byte, 3 voice bytes, 2 voice bytes and 1 voice byte are as target voice byte, so as to obtain To the target voice byte of different phonetic byte number.
S502:The server parses to target voice byte, and obtains phrasal comprising the user Analysis result.
In specific embodiment, the server can calculate the number of repetition of the target voice byte, and record described heavy Again count;If the server detects to obtain the number of repetition and reaches default amount threshold, the server will described in Idiom of the target voice byte as the user, and preserve the idiom.
If identical target voice byte be present specifically, being resolved in each target voice byte, the voice word is calculated The occurrence number of section, i.e. number of repetition, and exceed default amount threshold in the number of repetition, such as at 5 times, will be corresponding Target voice byte is stored as the idiom of the user, so that user carries out analysis result inquiry or directly wraps this The phrasal analysis result containing user is pushed to user.
By implementing this embodiment of the present invention, the server can, upon receiving speech data sent by a terminal, analyze the target voice bytes filtered out of that speech data to obtain the habitual phrases of the current user. The habitual phrases of a given user can thus be obtained in a targeted manner, with greater flexibility.

Referring to Fig. 6, which is a schematic structural diagram of a habitual-phrase acquisition device according to an embodiment of the present invention, the device may specifically be arranged in a terminal device such as a mobile phone, tablet computer or wearable device, or in a server; this embodiment of the present invention imposes no limitation. Specifically, the device includes a first acquisition unit 11, a screening unit 12 and a second acquisition unit 13, wherein:

the first acquisition unit 11 is configured to obtain, upon detecting a voice signal sent by a user, the speech data corresponding to the voice signal.

In a specific embodiment, the first acquisition unit 11 may detect whether a user is currently emitting a voice signal and, when a voice signal is detected, trigger acquisition of the corresponding speech data, for example by recording it.

The screening unit 12 is configured to filter out, according to a preset voice-byte threshold, the number of target voice bytes corresponding to the voice-byte threshold from the speech data obtained by the first acquisition unit 11.

In a specific embodiment, a voice-byte threshold may be preset, and the screening unit 12 may extract target voice bytes from the acquired speech data according to that threshold. In general, each word the user speaks corresponds to one voice byte; for example, if the user says "how do you do" (three characters in the original Chinese), three voice bytes correspond to it.

Optionally, the speech data obtained by the first acquisition unit 11 may be a single sentence, and the screening unit 12 may extract, according to the preset threshold, the corresponding number of voice bytes from a specific position of the sentence, such as its beginning and/or ending, as target voice bytes. That is, each time a sentence is acquired, for example each time a sentence is recorded, the screening unit 12 can perform the screening operation for target voice bytes and thereby obtain a certain number of target voice bytes.

Further optionally, the speech data obtained by the first acquisition unit 11 may also be a passage (i.e. composed of multiple sentences), and the screening unit 12 may segment the recorded speech data according to a preset pause interval to obtain multiple speech segments (one speech segment may correspond to one sentence). Accordingly, if the voice-byte threshold is set to 5, the screening unit 12 may extract 5 voice bytes from a specific position of each speech segment as target voice bytes, for example the first 5 bytes and/or the last 5 bytes of the segment, thereby obtaining multiple target voice bytes.

The second acquisition unit 13 is configured to parse the target voice bytes filtered out by the screening unit 12 and obtain an analysis result that includes the user's habitual phrases.

Specifically, if the second acquisition unit 13 finds identical target voice bytes during parsing, i.e. some target voice byte occurs repeatedly, it may calculate the number of occurrences of that voice byte, i.e. its repetition count, and, when the repetition count exceeds a preset quantity threshold, for example 5, store the corresponding target voice byte as a habitual phrase of the user.

Further, the second acquisition unit 13 may also push the parsed habitual phrases of the user, together with their repetition counts, to the current terminal.

By implementing this embodiment of the present invention, the corresponding speech data can be recorded when a voice signal sent by a user is detected, and the target voice bytes filtered out of the recorded speech data can be analyzed to obtain the habitual phrases of the current user. The habitual phrases of a given user can thus be obtained in a targeted manner, with greater flexibility.
Referring to Fig. 7, which is a schematic structural diagram of another habitual-phrase acquisition device according to an embodiment of the present invention, the device includes the first acquisition unit 11, screening unit 12 and second acquisition unit 13 of the habitual-phrase acquisition device described above. Further, in this embodiment of the present invention, the first acquisition unit 11 may include:

an information acquisition unit 111, configured to obtain, upon detecting a voice signal sent by a user, the voice attributes corresponding to the voice signal; and

a judging unit 112, configured to judge whether the voice attributes corresponding to the voice signal obtained by the information acquisition unit 111 match the voice attributes corresponding to a preset speech sample.

The voice attributes include one or more of speech rate, intonation, timbre and frequency.

In a specific embodiment, a speech sample may be preset; the speech sample is a sound clip of a legitimate user and may specifically be recorded by the current legitimate user.

A data acquisition unit 113 is configured to obtain the speech data corresponding to the voice signal when the judging result of the judging unit 112 is a match.

Specifically, when the information acquisition unit 111 detects a voice signal sent by a user, i.e. detects that someone is speaking, the voice attributes corresponding to the voice signal may be obtained, and the judging unit 112 compares the voice attributes of the voice signal with those of the speech sample, for example judging whether the two match in timbre and frequency, so as to determine the legitimacy of the current user's identity. When the judging result is a match, i.e. the current user's identity is legitimate, the data acquisition unit 113 obtains the speech data corresponding to the voice signal.
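The attribute-matching identity check described above can be sketched as follows. This is a toy model under stated assumptions: only two numeric attributes (speech rate and frequency) are compared, and the relative-tolerance rule is an invented stand-in for whatever matching criterion a real implementation would use; all names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class VoiceAttributes:
    rate: float   # speech rate, e.g. syllables per second (assumed unit)
    pitch: float  # fundamental frequency in Hz (assumed unit)

def attributes_match(signal: VoiceAttributes,
                     sample: VoiceAttributes,
                     tolerance: float = 0.15) -> bool:
    """Return True if each attribute of the incoming signal lies within
    a relative tolerance of the legitimate user's recorded sample."""
    def close(a: float, b: float) -> bool:
        return abs(a - b) <= tolerance * abs(b)
    return close(signal.rate, sample.rate) and close(signal.pitch, sample.pitch)
```

Only when the match succeeds, i.e. the current identity is deemed legitimate, would the speech data be acquired for further processing.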
Further, in this embodiment of the present invention, the screening unit 12 may include:

a data segmentation unit 121, configured to segment the speech data according to a preset pause interval to obtain speech segments.

The speech data includes at least one speech segment.

When the judging result of the judging unit 112 is a match, i.e. the user currently sending the voice signal is a legitimate user, the corresponding speech data may be obtained by the data acquisition unit 113, for example by the data acquisition unit 113 recording the speech data. Specifically, the speech data may be a whole passage of speech, i.e. contain multiple speech segments, in which case the data segmentation unit 121 may segment the speech data in a preset manner, for example according to the pause intervals between voice bytes in the speech data, such as 200 ms, to obtain speech segments (one speech segment may correspond to one sentence). Further, if the speech data recorded by the first acquisition unit 11 is only a single sentence, the data segmentation unit 121 may take that sentence as one speech segment; that is, each time the first acquisition unit 11 records a sentence, the data segmentation unit 121 takes it as one speech segment, so that a preset number of speech segments is obtained.
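The pause-based segmentation performed by the data segmentation unit 121 can be sketched as follows, assuming the speech data arrives as timestamped voice bytes and using the 200 ms pause interval mentioned above; the function and variable names are illustrative assumptions.

```python
def segment_by_pauses(timed_bytes, pause_interval=0.2):
    """Split a stream of (timestamp, voice_byte) pairs into segments
    wherever the gap between consecutive bytes exceeds the preset
    pause interval (0.2 s, i.e. 200 ms, by default)."""
    segments, current = [], []
    last_t = None
    for t, b in timed_bytes:
        if last_t is not None and t - last_t > pause_interval:
            segments.append(current)  # pause detected: close the segment
            current = []
        current.append(b)
        last_t = t
    if current:
        segments.append(current)
    return segments
```

Each returned segment then plays the role of one sentence for the downstream extraction step.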
A data extraction unit 122 is configured to extract, according to a preset voice-byte threshold, the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each speech segment divided by the data segmentation unit 121, as target voice bytes.

In a specific embodiment, the data extraction unit 122 may extract target voice bytes, according to the preset voice-byte threshold, from a specific position of each segmented speech segment, such as its beginning and/or ending. For example, if the voice-byte threshold is set to 5, the data extraction unit 122 may extract both the first 5 bytes and the last 5 bytes of each speech segment as target voice bytes, thereby obtaining multiple target voice bytes.

Optionally, the data extraction unit 122 may be specifically configured to:

filter out, from the speech segments, target speech segments whose number of voice bytes is greater than or equal to the preset voice-byte threshold; and, if the number of target speech segments filtered out is not less than a preset first quantity threshold, extract the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes.

For example, if the voice-byte threshold is set to 5 and the quantity threshold for speech segments is set to 6, the data extraction unit 122 may filter out, from the speech segments, those having 5 or more voice bytes and, once 6 such segments have been filtered out, extract the first 5 and/or last 5 voice bytes of each of the 6 speech segments as target voice bytes.
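The optional screening step described above, keeping only segments long enough and requiring a first quantity threshold of qualifying segments, can be sketched as follows, with the example values 5 and 6 taken from the text; all names are assumptions.

```python
def screen_segments(segments, byte_threshold=5, first_quantity_threshold=6):
    """Keep only segments with at least `byte_threshold` voice bytes;
    if enough such target segments exist, take the first and last
    `byte_threshold` bytes of each as target voice bytes."""
    qualified = [s for s in segments if len(s) >= byte_threshold]
    if len(qualified) < first_quantity_threshold:
        return []  # not enough qualifying segments yet
    targets = []
    for s in qualified:
        targets.append(tuple(s[:byte_threshold]))   # first 5 voice bytes
        targets.append(tuple(s[-byte_threshold:]))  # last 5 voice bytes
    return targets
```

Returning an empty list when too few segments qualify models the condition that extraction only proceeds once the first quantity threshold is reached.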
Further, in this embodiment of the present invention, the device may also include:

a control unit 14, configured to control the voice-byte threshold to decrease step by step and notify the data extraction unit 122 to extract the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes, until the voice-byte threshold is zero.

Further, the control unit 14 may decrement the voice-byte threshold step by step, for example from 5 down through 4, 3, 2 and 1, and notify the data extraction unit 122 to extract the corresponding number of target voice bytes at the beginning and end of each speech segment until the voice-byte threshold reaches 0; that is, it notifies the data extraction unit 122 to extract 5 voice bytes, 4 voice bytes, 3 voice bytes, 2 voice bytes and 1 voice byte in turn from the beginning and end of each speech segment as target voice bytes, so as to obtain target voice bytes of different lengths.

Further, in this embodiment of the present invention, the second acquisition unit 13 may include:

a calculation unit 131, configured to calculate the number of repetitions of a target voice byte and record the number of repetitions; and

an information storage unit 132, configured to, if it is detected that the number of repetitions reaches a preset second quantity threshold, take the target voice byte as a habitual phrase of the user and save the habitual phrase.

Specifically, if identical target voice bytes are found during parsing, the number of occurrences of that voice byte, i.e. its repetition count, may be calculated by the calculation unit 131, and, when the repetition count exceeds the preset quantity threshold, for example 5, the corresponding target voice byte is stored by the information storage unit 132 as a habitual phrase of the user, so that the user can query the analysis result, or the analysis result containing the user's habitual phrases can be pushed to the user directly.

By implementing this embodiment of the present invention, acquisition of the corresponding speech data can be triggered when the identity of the user currently sending a voice signal is determined to be legitimate; the speech data is segmented to obtain speech segments, and the more representative words are filtered out from the beginning and end of each speech segment, so that the habitual phrases of the current user are obtained by analysis and pushed to the relevant user in a targeted manner.
Referring to Fig. 8, which is a schematic structural diagram of yet another habitual-phrase acquisition device according to an embodiment of the present invention, the device may specifically be arranged in a server. Specifically, the device includes a screening unit 21 and an acquisition unit 22, wherein:

the screening unit 21 is configured to filter out, according to a preset voice-byte threshold, the number of target voice bytes corresponding to the voice-byte threshold from speech data sent by a terminal.

The speech data is the speech data corresponding to a voice signal, acquired by the terminal upon detecting the voice signal sent by a user.

In a specific embodiment, a voice-byte threshold may be preset, and the screening unit 21 may extract target voice bytes from the acquired speech data according to that threshold.

The acquisition unit 22 is configured to parse the target voice bytes filtered out by the screening unit 21 and obtain an analysis result that includes the user's habitual phrases.

Further, in this embodiment of the present invention, the screening unit 21 may include:

a data segmentation unit 211, configured to segment the speech data according to a preset pause interval to obtain speech segments, the speech data including at least one speech segment; and

a data extraction unit 212, configured to extract, according to a preset voice-byte threshold, the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each speech segment divided by the data segmentation unit 211, as target voice bytes.

In a specific embodiment, the data extraction unit 212 may extract target voice bytes, according to the preset voice-byte threshold, from a specific position of each speech segment divided by the data segmentation unit 211, such as its beginning and/or ending. For example, if the voice-byte threshold is set to 5, the data extraction unit 212 may extract both the first 5 bytes and the last 5 bytes of each speech segment as target voice bytes, thereby obtaining multiple target voice bytes.

Optionally, the data extraction unit 212 may be specifically configured to:

filter out, from the speech segments, target speech segments whose number of voice bytes is greater than or equal to the preset voice-byte threshold; and, if the number of target speech segments filtered out is not less than a preset first quantity threshold, extract the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes.

Further, in this embodiment of the present invention, the acquisition unit 22 may include:

a calculation unit 221, configured to calculate the number of repetitions of a target voice byte and record the number of repetitions; and

an information storage unit 222, configured to, if it is detected that the number of repetitions reaches a preset quantity threshold, take the target voice byte as a habitual phrase of the user and save the habitual phrase.

Specifically, if identical target voice bytes are found during parsing, the number of occurrences of that voice byte, i.e. its repetition count, may be calculated by the calculation unit 221, and, when the repetition count exceeds the preset quantity threshold, for example 5, the corresponding target voice byte is stored by the information storage unit 222 as a habitual phrase of the user, so that the user can query the analysis result, or the analysis result containing the user's habitual phrases can be pushed to the user directly.

By implementing this embodiment of the present invention, the server can, upon receiving speech data sent by a terminal, analyze the target voice bytes filtered out of that speech data to obtain the habitual phrases of the current user. The habitual phrases of a given user can thus be obtained in a targeted manner, with greater flexibility.
Referring further to Fig. 9, which is a schematic structural diagram of a terminal according to an embodiment of the present invention. As shown in Fig. 9, the terminal includes at least one processor 100, such as a CPU, at least one user interface 300, a memory 400 and at least one communication bus 200, wherein the communication bus 200 is used to implement connection and communication between these components. The user interface 300 may include a display (Display) and a keyboard (Keyboard), and optionally may also include a standard wired interface and wireless interface. The memory 400 may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory. Optionally, the memory 400 may also be at least one storage device located remotely from the processor 100. With reference to the habitual-phrase acquisition devices described in connection with Fig. 6 and Fig. 7, a set of program code is stored in the memory 400, and the processor 100 calls the program code stored in the memory 400 to perform the following operations:

if a voice signal sent by a user is detected, obtaining the speech data corresponding to the voice signal;

according to a preset voice-byte threshold, filtering out, from the speech data, the number of target voice bytes corresponding to the voice-byte threshold; and

parsing the target voice bytes and obtaining an analysis result that includes the user's habitual phrases.

In an optional embodiment, when the processor 100 calls the program code stored in the memory 400 to obtain, upon detecting a voice signal sent by a user, the speech data corresponding to the voice signal, the operation may specifically be:

if a voice signal sent by a user is detected, obtaining the voice attributes corresponding to the voice signal;

judging whether the voice attributes corresponding to the voice signal match the voice attributes corresponding to a preset speech sample, the speech sample being recorded by a legitimate user, and the voice attributes including one or more of speech rate, intonation, timbre and frequency; and

if they match, obtaining the speech data corresponding to the voice signal.

Further optionally, when the processor 100 calls the program code stored in the memory 400 to filter out, according to the preset voice-byte threshold, the number of target voice bytes corresponding to the voice-byte threshold from the speech data, the operation may specifically be:

segmenting the speech data according to a preset pause interval to obtain speech segments, the speech data including at least one speech segment; and

according to the preset voice-byte threshold, extracting the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each speech segment as target voice bytes.

In an optional embodiment, when the processor 100 calls the program code stored in the memory 400 to extract, according to the preset voice-byte threshold, the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each speech segment as target voice bytes, the operation may specifically be:

filtering out, from the speech segments, target speech segments whose number of voice bytes is greater than or equal to the preset voice-byte threshold; and

if the number of target speech segments filtered out is not less than a preset first quantity threshold, extracting the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes.

In an optional embodiment, the processor 100 may also perform the following steps:

decrementing the voice-byte threshold step by step; and

repeating the step of extracting the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes, until the voice-byte threshold is zero.

In an optional embodiment, when the processor 100 calls the program code stored in the memory 400 to parse the target voice bytes and obtain the analysis result that includes the user's habitual phrases, the operation may specifically be:

calculating the number of repetitions of a target voice byte and recording the number of repetitions; and

if it is detected that the number of repetitions reaches a preset second quantity threshold, taking the target voice byte as a habitual phrase of the user and saving the habitual phrase.

Specifically, the terminal introduced in this embodiment can be used to implement part or all of the flow of the embodiments of the habitual-phrase acquisition method described in connection with Fig. 1 to Fig. 4.
Referring further to Figure 10, which is a schematic structural diagram of a server according to an embodiment of the present invention. As shown in Figure 10, the server includes at least one processor 500, such as a CPU, at least one user interface 700, a memory 800 and at least one communication bus 600, wherein the communication bus 600 is used to implement connection and communication between these components. The user interface 700 may include a standard wired interface and wireless interface. The memory 800 may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory. Optionally, the memory 800 may also be at least one storage device located remotely from the processor 500. With reference to the habitual-phrase acquisition devices described in connection with Fig. 6 and Fig. 7, a set of program code is stored in the memory 800, and the processor 500 calls the program code stored in the memory 800 to perform the following operations:

if a voice signal sent by a user is detected, obtaining the speech data corresponding to the voice signal;

according to a preset voice-byte threshold, filtering out, from the speech data, the number of target voice bytes corresponding to the voice-byte threshold; and

parsing the target voice bytes and obtaining an analysis result that includes the user's habitual phrases.

In an optional embodiment, when the processor 500 calls the program code stored in the memory 800 to obtain, upon detecting a voice signal sent by a user, the speech data corresponding to the voice signal, the operation may specifically be:

if a voice signal sent by a user is detected, obtaining the voice attributes corresponding to the voice signal;

judging whether the voice attributes corresponding to the voice signal match the voice attributes corresponding to a preset speech sample, the speech sample being recorded by a legitimate user, and the voice attributes including one or more of speech rate, intonation, timbre and frequency; and

if they match, obtaining the speech data corresponding to the voice signal.

Further optionally, when the processor 500 calls the program code stored in the memory 800 to filter out, according to the preset voice-byte threshold, the number of target voice bytes corresponding to the voice-byte threshold from the speech data, the operation may specifically be:

segmenting the speech data according to a preset pause interval to obtain speech segments, the speech data including at least one speech segment; and

according to the preset voice-byte threshold, extracting the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each speech segment as target voice bytes.

In an optional embodiment, when the processor 500 calls the program code stored in the memory 800 to extract, according to the preset voice-byte threshold, the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each speech segment as target voice bytes, the operation may specifically be:

filtering out, from the speech segments, target speech segments whose number of voice bytes is greater than or equal to the preset voice-byte threshold; and

if the number of target speech segments filtered out is not less than a preset first quantity threshold, extracting the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes.

In an optional embodiment, the processor 500 may also perform the following steps:

decrementing the voice-byte threshold step by step; and

repeating the step of extracting the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes, until the voice-byte threshold is zero.

In an optional embodiment, when the processor 500 calls the program code stored in the memory 800 to parse the target voice bytes and obtain the analysis result that includes the user's habitual phrases, the operation may specifically be:

calculating the number of repetitions of a target voice byte and recording the number of repetitions; and

if it is detected that the number of repetitions reaches a preset second quantity threshold, taking the target voice byte as a habitual phrase of the user and saving the habitual phrase.

Specifically, the server introduced in this embodiment can be used to implement part or all of the flow of the embodiments of the habitual-phrase acquisition method described in connection with Fig. 1 to Fig. 4.
Referring further to Figure 11, which is a schematic structural diagram of a habitual-phrase acquisition system according to an embodiment of the present invention, the system includes a terminal 1 and a server 2, wherein:

the terminal 1 is configured to obtain, upon detecting a voice signal sent by a user, the speech data corresponding to the voice signal and send the speech data to the server 2; and

the server 2 is configured to receive the speech data sent by the terminal 1, filter out, according to a preset voice-byte threshold, the number of target voice bytes corresponding to the voice-byte threshold from the speech data, parse the target voice bytes, and obtain an analysis result that includes the user's habitual phrases.

In an optional embodiment, the terminal 1 may also be configured to: if a voice signal sent by a user is detected, obtain the voice attributes corresponding to the voice signal; judge whether the voice attributes corresponding to the voice signal match the voice attributes corresponding to a preset speech sample, the speech sample being a sound clip of a legitimate user and the voice attributes including one or more of speech rate, intonation, timbre and frequency; and, if they match, obtain the speech data corresponding to the voice signal.

In an optional embodiment, the server 2 may also be configured to segment the speech data according to a preset pause interval to obtain speech segments, the speech data including at least one speech segment, and, according to a preset voice-byte threshold, extract the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each speech segment as target voice bytes.

Specifically, the server 2 may filter out, from the speech segments, target speech segments whose number of voice bytes is greater than or equal to the preset voice-byte threshold and, when the number of target speech segments filtered out is not less than a preset first quantity threshold, for example 6, extract the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes.

Further, the server 2 may control the voice-byte threshold to decrease step by step and repeat the step of extracting the number of voice bytes corresponding to the voice-byte threshold from the start and/or end of each target speech segment as target voice bytes, until the voice-byte threshold is zero, thereby acquiring target voice bytes of multiple different lengths.

In an optional embodiment, the server 2 may also be configured to calculate the number of repetitions of a target voice byte and record the number of repetitions; if it is detected that the number of repetitions reaches a preset second quantity threshold, the server 2 takes the target voice byte as a habitual phrase of the user and saves the habitual phrase.

By implementing this embodiment of the present invention, the corresponding speech data can be obtained when a voice signal sent by a user is detected, and the target voice bytes filtered out of the speech data can be analyzed to obtain the habitual phrases of the current user. The habitual phrases of a given user can thus be obtained in a targeted manner, with greater flexibility.
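As an illustrative recap, the overall flow just described (segmentation by pauses, prefix/suffix extraction with a decrementing threshold, and repetition counting) can be pieced together in one toy sketch. All names and threshold values are assumptions chosen small for demonstration, and a "voice byte" is modeled as a single character.

```python
from collections import Counter

def acquire_habitual_phrases(timed_bytes, pause=0.2, byte_threshold=2,
                             second_quantity_threshold=3):
    """End-to-end toy pipeline: segment, extract, count repetitions."""
    # 1. Segment the stream of (timestamp, voice_byte) pairs by pauses.
    segments, current, last_t = [], [], None
    for t, b in timed_bytes:
        if last_t is not None and t - last_t > pause:
            segments.append(current)
            current = []
        current.append(b)
        last_t = t
    if current:
        segments.append(current)
    # 2. Extract prefixes/suffixes for thresholds byte_threshold .. 1.
    targets = []
    for seg in segments:
        if len(seg) < byte_threshold:
            continue  # segment too short to qualify
        for n in range(byte_threshold, 0, -1):
            targets.append("".join(seg[:n]))
            targets.append("".join(seg[-n:]))
    # 3. Keep target bytes whose repetition count reaches the threshold.
    counts = Counter(targets)
    return {p: c for p, c in counts.items() if c >= second_quantity_threshold}
```

A phrase that repeatedly starts or ends the user's sentences thus surfaces with a high count and is reported as a habitual phrase.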
A person of ordinary skill in the art will appreciate that all or part of the flows in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the flows of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.

It should be noted that, in the above embodiments, the description of each embodiment has its own emphasis; for a part not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments. Furthermore, a person skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and units involved are not necessarily indispensable to the present invention.

The steps in the methods of the embodiments of the present invention may be reordered, combined and deleted according to actual needs.

The modules or units in the devices of the embodiments of the present invention may be combined, divided and deleted according to actual needs.

The modules or units described in the embodiments of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).
The habitual-phrase acquisition method, device, terminal and server provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help in understanding the method and core idea of the present invention. Meanwhile, for a person of ordinary skill in the art, there will be changes in the specific implementations and application scope according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (6)

  1. A habitual-phrase acquisition method, characterized by comprising:
    if a voice signal sent by a user is detected, obtaining speech data corresponding to the voice signal;
    according to a preset voice-byte threshold, filtering out, from the speech data, a number of target voice bytes corresponding to the voice-byte threshold; and
    parsing the target voice bytes and obtaining an analysis result that includes habitual phrases of the user;
    wherein the filtering out, from the speech data according to the preset voice-byte threshold, of the number of target voice bytes corresponding to the voice-byte threshold comprises:
    segmenting the speech data according to a preset pause interval to obtain speech segments, the speech data including at least one speech segment;
    according to the preset voice-byte threshold, extracting, from the start and/or end of each speech segment, a number of voice bytes corresponding to the voice-byte threshold as target voice bytes;
    decrementing the voice-byte threshold step by step; and
    repeating the step of extracting, from the start and/or end of each speech segment, the number of voice bytes corresponding to the voice-byte threshold as target voice bytes, until the voice-byte threshold is zero.
  2. The method according to claim 1, characterised in that the acquiring, if a voice signal sent by a user is detected, of speech data corresponding to the voice signal comprises:
    if a voice signal sent by a user is detected, acquiring a voice attribute corresponding to the voice signal;
    judging whether the voice attribute corresponding to the voice signal matches a voice attribute corresponding to a preset speech sample, the speech sample being a sound clip of a legitimate user, and the voice attribute comprising any one or more of speech rate, intonation, timbre and frequency;
    if they match, acquiring the speech data corresponding to the voice signal.
  3. The method according to claim 1, characterised in that the parsing of the target voice bytes and the obtaining of the parsing result containing the idiom of the user comprise:
    calculating the number of repetitions of each target voice byte, and recording the number of repetitions;
    if it is detected that the number of repetitions reaches a preset second number threshold, taking the target voice byte as the idiom of the user, and saving the idiom.
  4. A device for acquiring an idiom of a user, characterised in that it comprises:
    a first acquiring unit, configured to acquire, if a voice signal sent by a user is detected, speech data corresponding to the voice signal;
    a screening unit, configured to screen out, according to a preset voice byte threshold and from the speech data acquired by the first acquiring unit, a number of target voice bytes corresponding to the voice byte threshold;
    a second acquiring unit, configured to parse the target voice bytes screened out by the screening unit and obtain a parsing result containing the idiom of the user;
    wherein the screening unit comprises:
    a data segmenting unit, configured to segment the speech data according to a preset pause interval to obtain voice segments, the speech data comprising at least one voice segment;
    a data extracting unit, configured to extract, according to the preset voice byte threshold and respectively from the start or the end of each voice segment divided by the data segmenting unit, a number of voice bytes corresponding to the voice byte threshold as target voice bytes;
    a control unit, configured to control the voice byte threshold to be decremented step by step, and to notify the data extracting unit to extract, respectively from the start or the end of each voice segment, the number of voice bytes corresponding to the voice byte threshold as target voice bytes, until the voice byte threshold is zero.
  5. The device according to claim 4, characterised in that the first acquiring unit comprises:
    an information acquiring unit, configured to acquire, if a voice signal sent by a user is detected, a voice attribute corresponding to the voice signal;
    a judging unit, configured to judge whether the voice attribute, corresponding to the voice signal, acquired by the information acquiring unit matches a voice attribute corresponding to a preset speech sample, the speech sample being a sound clip of a legitimate user, and the voice attribute comprising any one or more of speech rate, intonation, timbre and frequency;
    a data acquiring unit, configured to acquire, when the judging result of the judging unit is a match, the speech data corresponding to the voice signal.
  6. The device according to claim 4, characterised in that the second acquiring unit comprises:
    a calculating unit, configured to calculate the number of repetitions of each target voice byte and record the number of repetitions;
    an information storing unit, configured to take, if it is detected that the number of repetitions reaches the preset second number threshold, the target voice byte as the idiom of the user, and save the idiom.
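The method of claims 1 to 3 can be sketched in code. The following is a minimal illustrative Python sketch, not an implementation from the patent: pause detection over a timestamped byte stream, tolerance-based attribute matching, and all function and parameter names are assumptions made for illustration — the claims specify only the steps (segment at pauses, extract bytes from segment starts/ends under a decreasing threshold, filter signals by the legitimate user's voice attributes, and keep byte sequences whose repetition count reaches a threshold).

```python
from collections import Counter


def split_at_pauses(samples, pause_interval):
    """Segment speech data at pauses (claim 1): split a stream of
    (timestamp, voice_byte) pairs wherever the gap between consecutive
    bytes exceeds the preset pause interval."""
    segments, current, last_t = [], [], None
    for t, b in samples:
        if last_t is not None and t - last_t > pause_interval and current:
            segments.append(current)
            current = []
        current.append(b)
        last_t = t
    if current:
        segments.append(current)
    return segments


def screen_target_bytes(segments, byte_threshold):
    """Screen out target voice bytes (claim 1): for each threshold value
    n = byte_threshold, byte_threshold - 1, ..., 1, take the first n and
    the last n voice bytes of every segment."""
    targets = []
    n = byte_threshold
    while n > 0:  # repeat until the threshold reaches zero
        for seg in segments:
            targets.append(tuple(seg[:n]))   # n bytes from the segment start
            targets.append(tuple(seg[-n:]))  # n bytes from the segment end
        n -= 1  # decrement the threshold step by step
    return targets


def matches_sample(signal_attrs, sample_attrs, tolerances):
    """Match voice attributes against the preset speech sample of the
    legitimate user (claim 2). The tolerance-based numeric comparison is
    an assumption; the claim only names the attributes (speech rate,
    intonation, timbre, frequency)."""
    return all(
        abs(signal_attrs.get(name, float("inf")) - value) <= tolerances.get(name, 0.0)
        for name, value in sample_attrs.items()
    )


def extract_idioms(target_bytes, repeat_threshold):
    """Keep the target voice bytes whose repetition count reaches the
    preset second number threshold as the user's idioms (claim 3)."""
    counts = Counter(target_bytes)
    return {t for t, n in counts.items() if n >= repeat_threshold}
```

Under this sketch, frequent segment-initial and segment-final byte sequences (where habitual pet phrases tend to occur) accumulate repetitions across segments and threshold passes, so `extract_idioms` surfaces them once their count crosses the repetition threshold.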
CN201410374537.4A 2014-07-31 2014-07-31 A kind of phrasal acquisition methods and device Active CN104157286B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410374537.4A CN104157286B (en) 2014-07-31 2014-07-31 A kind of phrasal acquisition methods and device


Publications (2)

Publication Number Publication Date
CN104157286A CN104157286A (en) 2014-11-19
CN104157286B true CN104157286B (en) 2017-12-29

Family

ID=51882769

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410374537.4A Active CN104157286B (en) 2014-07-31 2014-07-31 A kind of phrasal acquisition methods and device

Country Status (1)

Country Link
CN (1) CN104157286B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105100711A (en) * 2015-07-08 2015-11-25 小米科技有限责任公司 Information-transmitting method and device
CN105553828A (en) * 2015-12-18 2016-05-04 合肥寰景信息技术有限公司 Intelligent voice warning method of network community voice service
CN105895088A (en) * 2016-05-27 2016-08-24 京东方科技集团股份有限公司 Intelligent wearable device and voice error correction system
CN107451437B (en) * 2016-05-31 2021-04-16 百度在线网络技术(北京)有限公司 Locking method and device of mobile terminal
CN107481737A (en) * 2017-08-28 2017-12-15 广东小天才科技有限公司 The method, apparatus and terminal device of a kind of voice monitoring
CN110310623B (en) * 2017-09-20 2021-12-28 Oppo广东移动通信有限公司 Sample generation method, model training method, device, medium, and electronic apparatus
CN110473519B (en) * 2018-05-11 2022-05-27 北京国双科技有限公司 Voice processing method and device
CN109119076B (en) * 2018-08-02 2022-09-30 重庆柚瓣家科技有限公司 System and method for collecting communication habits of old people and users
CN109285544A (en) * 2018-10-25 2019-01-29 江海洋 Speech monitoring system
CN110070858B (en) * 2019-05-05 2021-11-19 广东小天才科技有限公司 Civilization language reminding method and device and mobile device

Citations (13)

Publication number Priority date Publication date Assignee Title
US4831529A (en) * 1986-03-04 1989-05-16 Kabushiki Kaisha Toshiba Machine translation system
CN1713644A (en) * 2004-06-24 2005-12-28 乐金电子(中国)研究开发中心有限公司 Dialog context filtering method and apparatus
CN101552000A (en) * 2009-02-25 2009-10-07 北京派瑞根科技开发有限公司 Music similarity processing method
CN101794576A (en) * 2010-02-02 2010-08-04 重庆大学 Dirty word detection aid and using method thereof
CN102316200A (en) * 2010-07-07 2012-01-11 英业达股份有限公司 Ring tone regulation method for handheld electronic device and handheld electronic device using same
CN102592592A (en) * 2011-12-30 2012-07-18 深圳市车音网科技有限公司 Voice data extraction method and device
CN102915730A (en) * 2012-10-19 2013-02-06 东莞宇龙通信科技有限公司 Voice processing method and system
CN103065625A (en) * 2012-12-25 2013-04-24 广东欧珀移动通信有限公司 Method and device for adding digital voice tag
CN103389979A (en) * 2012-05-08 2013-11-13 腾讯科技(深圳)有限公司 System, device and method for recommending classification lexicon in input method
CN103516915A (en) * 2012-06-27 2014-01-15 百度在线网络技术(北京)有限公司 Method, system and device for replacing sensitive words in call process of mobile terminal
CN103678684A (en) * 2013-12-25 2014-03-26 沈阳美行科技有限公司 Chinese word segmentation method based on navigation information retrieval
CN103778110A (en) * 2012-10-25 2014-05-07 三星电子(中国)研发中心 Method and system for converting simplified Chinese characters into traditional Chinese characters
CN103903621A (en) * 2012-12-26 2014-07-02 联想(北京)有限公司 Method for voice recognition and electronic equipment

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JPS61250769A (en) * 1985-04-30 1986-11-07 Casio Comput Co Ltd System for registering dictionary information
US7818179B2 (en) * 2004-11-12 2010-10-19 International Business Machines Corporation Devices and methods providing automated assistance for verbal communication


Also Published As

Publication number Publication date
CN104157286A (en) 2014-11-19

Similar Documents

Publication Publication Date Title
CN104157286B (en) A kind of phrasal acquisition methods and device
CN104134439B (en) A kind of phrasal acquisition methods, apparatus and system
CN111128223B (en) Text information-based auxiliary speaker separation method and related device
CN109192202B (en) Voice safety recognition method, device, computer equipment and storage medium
Marsh et al. Auditory distraction in semantic memory: A process-based approach
KR102081495B1 (en) How to add accounts, terminals, servers, and computer storage media
WO2017076314A1 (en) Processing method and system for adaptive unwanted call identification
CN108682420B (en) Audio and video call dialect recognition method and terminal equipment
CN110047481A (en) Method for voice recognition and device
CN110444198A (en) Search method, device, computer equipment and storage medium
EP3598444B1 (en) Method and system for muting classified information from an audio
CN109740053A (en) Sensitive word screen method and device based on NLP technology
CN107123418A (en) The processing method and mobile terminal of a kind of speech message
CN105100518A (en) Speech mailbox realization method and apparatus
CN108766431A (en) It is a kind of that method and electronic equipment are automatically waken up based on speech recognition
ES2751375T3 (en) Linguistic analysis based on a selection of words and linguistic analysis device
CN110459223A (en) Data tracking processing method, equipment, storage medium and device
CN109889471B (en) Structured Query Language (SQL) injection detection method and system
KR102166102B1 (en) Device and storage medium for protecting privacy information
CN111919251B (en) Voice analysis system
CN112397052A (en) VAD sentence-breaking test method, VAD sentence-breaking test device, computer equipment and storage medium
CN106656738A (en) Unread message prompt method and terminal
CN109376224B (en) Corpus filtering method and apparatus
CN110705282A (en) Keyword extraction method and device, storage medium and electronic equipment
CN114038487A (en) Audio extraction method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210218

Address after: 518057 Desai Science and Technology Building, 9789 Shennan Avenue, Yuehai Street, Nanshan District, Shenzhen City, Guangdong Province, 17th Floor (15th Floor of Natural Floor) 1702-1703

Patentee after: Shenzhen Microphone Holdings Co.,Ltd.

Address before: 518040 21 floor, east block, Times Technology Building, 7028 Shennan Road, Futian District, Shenzhen, Guangdong.

Patentee before: DONGGUAN GOLDEX COMMUNICATION TECHNOLOGY Co.,Ltd.
