CN109903781A - A pattern-matching sentiment analysis method - Google Patents
Abstract
The invention discloses a pattern-matching sentiment analysis method. It addresses the problem that linear mapping cannot reflect how speech segments vary in time under different conditions, which makes recognition results inaccurate. Specifically, by means of a time-warping function m = ω(n), the time axis n of the input test template is mapped onto the time axis m of the reference template, ensuring maximum acoustic similarity between the two templates. The invention solves the problem that speech emotion sequences to be compared differ in duration: within a given time limit, the reference template is matched in time against the pattern to be recognized, the minimum cumulative distance between the two templates is obtained, and the accuracy of the recognition result is thereby improved.
Description
Technical field
The present invention relates to the technical field of sentiment analysis, and more specifically to a pattern-matching sentiment analysis method.
Background art
Speech carries the information that humans obtain through hearing, including both non-symbolic and symbolic information. Speech signal processing removes the noise in speech and retains the non-symbolic information it contains. The same sentence can be perceived very differently by listeners depending on the emotion the speaker expresses. Humans receive information in different forms through different sensory organs; how to use these different kinds of information efficiently and quickly to achieve the best transfer of information will become an important direction in future information processing research.
Studying the emotional information in speech by computer is therefore of great significance. Extracting the affective features of the speech signal in order to judge the emotion it contains not only has scientific value but also many practical applications.
In existing speech recognition technology, recognition results are not accurate enough. For example, the widely used linear mapping cannot reflect how speech segments vary in time under different conditions, so the recognition result is inaccurate.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art and provide a pattern-matching sentiment analysis method that solves the problem that speech emotion sequences to be compared differ in duration: within a given time limit, the reference template is matched in time against the pattern to be recognized, the minimum cumulative distance between the two templates is obtained, and the accuracy of the recognition result is thereby improved.
The object of the present invention is achieved through the following technical solution:
A pattern-matching sentiment analysis method, comprising:
(1) Establish the representation {R(1), R(2), …, R(m), …, R(M)} of the emotion template, where m is the frame index of the training emotional speech, m = 1 is the starting speech frame, m = M is the ending speech frame, M is the total number of frames in the speech pattern, and R(m) is the speech feature coefficient of frame m;
(2) Establish the representation {T(1), T(2), …, T(n), …, T(N)} of the test pattern, where n is the frame index of the test emotional speech, N is the total number of frames in the speech pattern, and T(n) is the feature coefficient of frame n;
(3) Compute the similarity between T and R by computing the distortion D[T, R]; the smaller the distortion, the higher the similarity. Computing the distortion requires computing the distortion between corresponding frames of T and R. Let n and m be frame indices in T and R respectively, and let D[T(n), R(m)] denote the distortion between the feature vectors of T and R. Two cases arise:
(a) N = M: compute the frame distortions for n = m = 1, …, n = m = N and sum them to obtain the total distortion, as shown below:
D[T, R] = Σ_{n=1}^{N} D[T(n), R(n)]
(b) N ≠ M: assume N > M; by linear expansion, map {R(1), …, R(M)} onto a corresponding N-frame sequence {R′(1), …, R′(N)}, then compute its distortion with {T(1), …, T(N)} frame by frame and sum, as shown below:
D[T, R] = Σ_{n=1}^{N} D[T(n), R′(n)]
(4) Within a given time limit, match the reference emotion template in time against the test pattern to be recognized, obtain the minimum cumulative distance between the two templates, and realize speech recognition by evaluating the minimum cumulative distance between the reference speech and the speech to be recognized.
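The frame-by-frame distortion computation of step (3) can be sketched in Python. This is a minimal sketch, assuming a Euclidean per-frame distortion and floor rounding for the linear expansion; neither choice is fixed by the text:

```python
import numpy as np

def total_distortion(T, R):
    """Total distortion D[T, R] between test pattern T (N frames) and
    reference emotion template R (M frames); each frame is a feature vector.

    Case (a) N == M: sum the frame-by-frame distortions directly.
    Case (b) N != M: linearly expand/compress R to N frames first.
    """
    T, R = np.asarray(T, dtype=float), np.asarray(R, dtype=float)
    N, M = len(T), len(R)
    if N != M:
        # Linear expansion: pair test frame n with reference frame
        # floor(n * M / N) (an assumed rounding rule, for illustration).
        idx = np.minimum(np.arange(N) * M // N, M - 1)
        R = R[idx]
    # D[T, R] = sum over n of D[T(n), R'(n)], Euclidean frame distortion
    return float(sum(np.linalg.norm(T[n] - R[n]) for n in range(N)))
```

With identical equal-length patterns the distortion is 0; it grows as the feature vectors of corresponding frames diverge.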
Further, the method of dynamic time warping is used to reflect how speech segments vary in time under different conditions: by means of a time-warping function m = ω(n), the time axis n of the input test template is mapped onto the time axis m of the reference template in a nonlinear way, and ω satisfies the formula:
D = min_{ω(n)} Σ_{n=1}^{N} d[n, ω(n)]
where d[n, ω(n)] is the distance between the m-th frame reference vector and the n-th frame input vector, and D is the distance measure.
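The minimum over warping functions of the summed frame distances is exactly what dynamic programming computes. A minimal sketch, assuming Euclidean local distance and the common symmetric step pattern (horizontal, vertical, and diagonal moves), neither of which the text fixes:

```python
import numpy as np

def dtw_min_distance(T, R):
    """Minimum cumulative distance min over w of sum_n d[n, w(n)] between
    input frames T (N x dim) and reference frames R (M x dim)."""
    T, R = np.asarray(T, dtype=float), np.asarray(R, dtype=float)
    N, M = len(T), len(R)
    # d[n, m]: local distance between input frame n and reference frame m
    d = np.linalg.norm(T[:, None, :] - R[None, :, :], axis=2)
    D = np.full((N + 1, M + 1), np.inf)  # cumulative distance, 1-based
    D[0, 0] = 0.0
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            # Extend the cheapest of the three allowed predecessor moves
            D[n, m] = d[n - 1, m - 1] + min(D[n - 1, m], D[n, m - 1], D[n - 1, m - 1])
    return float(D[N, M])
```

Unlike the linear expansion of step (3)(b), this lets one input frame align with several reference frames (and vice versa), so timing differences between the two patterns no longer inflate the distortion.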
Further, the frame indices n = 1 to N of the test template are marked on the horizontal axis of a two-dimensional Cartesian coordinate system, and the frame indices m = 1 to M of the reference template on the vertical axis; horizontal and vertical lines drawn through the integer coordinates of these frame indices form a grid, in which each crossing (n, m) pairs a frame of the test template with a frame of the reference template. A DP (dynamic programming) algorithm is used to find a path through the grid crossings, and the correspondence between reference and test frame distortions is then obtained from the crossings along this path.
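The grid search described above can be sketched as follows: fill the cumulative-distance matrix, then backtrack from (N, M) to (1, 1) to recover the optimal sequence of grid crossings. The step pattern and Euclidean local distance are again illustrative assumptions:

```python
import numpy as np

def dtw_warping_path(T, R):
    """Return the optimal path through the (n, m) grid as a list of
    crossings (n, m), 1-based, from (1, 1) to (N, M)."""
    T, R = np.asarray(T, dtype=float), np.asarray(R, dtype=float)
    N, M = len(T), len(R)
    d = np.linalg.norm(T[:, None, :] - R[None, :, :], axis=2)
    D = np.full((N + 1, M + 1), np.inf)
    D[0, 0] = 0.0
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            D[n, m] = d[n - 1, m - 1] + min(D[n - 1, m], D[n, m - 1], D[n - 1, m - 1])
    # Backtrack: from (N, M), repeatedly step to the cheapest predecessor
    path, n, m = [], N, M
    while n > 0 and m > 0:
        path.append((n, m))
        preds = {(n - 1, m - 1): D[n - 1, m - 1],
                 (n - 1, m): D[n - 1, m],
                 (n, m - 1): D[n, m - 1]}
        n, m = min(preds, key=preds.get)
    return path[::-1]
```

Each crossing on the returned path names which reference frame a given test frame is scored against, which is the frame correspondence the text refers to.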
The beneficial effects of the present invention are:
(1) The present invention solves the problem that speech emotion sequences to be compared differ in duration: within a given time limit, the reference template is matched in time against the pattern to be recognized, the minimum cumulative distance between the two templates is obtained, and the accuracy of the recognition result is thereby improved.
(2) In the prior art, linear mapping cannot reflect how speech segments vary in time under different conditions, so the recognition result is inaccurate. The present invention solves this with dynamic time warping: by means of a time-warping function m = ω(n), the time axis n of the input test template is mapped onto the time axis m of the reference template, ensuring maximum acoustic similarity between the two templates.
Brief description of the drawings
To explain the embodiments of the invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of the steps of the present invention.
Specific embodiment
The technical solution of the present invention is described in further detail below with reference to the accompanying drawing, but the protection scope of the present invention is not limited to the following description. All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any way, except for mutually exclusive features and/or steps.
Any feature disclosed in this specification (including any accessory claims, abstract, and drawings) may, unless specifically stated otherwise, be replaced by an equivalent alternative feature or one serving a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
Specific embodiments of the present invention are described in detail below. It should be noted that the embodiments described here are for illustration only and are not intended to limit the invention. In the following description, numerous specific details are set forth to provide a thorough understanding of the invention; it will be apparent to those skilled in the art, however, that the invention can be practiced without these specific details. In other instances, well-known circuits, software, or methods are not described in detail in order not to obscure the invention.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the invention without creative effort fall within the protection scope of the present invention.
Before the embodiments are described, some necessary terms need to be explained. For example:
If terms such as "first" and "second" are used in this application to describe various elements, these elements should not be limited by those terms; the terms are only used to distinguish one element from another. Thus, a "first" element discussed below could also be termed a "second" element without departing from the teachings of the present invention. It should be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to that element, or intervening elements may be present; in contrast, when an element is referred to as being "directly connected" or "directly coupled" to another element, no intervening elements are present.
The terms used in this application are for describing particular embodiments only and are not intended to limit the invention. Unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. When the terms "comprising" and/or "including" are used in this specification, they specify the presence of the stated features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Embodiment 1
As shown in Fig. 1, a pattern-matching sentiment analysis method comprises:
(1) Establish the representation {R(1), R(2), …, R(m), …, R(M)} of the emotion template, where m is the frame index of the training emotional speech, m = 1 is the starting speech frame, m = M is the ending speech frame, M is the total number of frames in the speech pattern, and R(m) is the speech feature coefficient of frame m;
(2) Establish the representation {T(1), T(2), …, T(n), …, T(N)} of the test pattern, where n is the frame index of the test emotional speech, N is the total number of frames in the speech pattern, and T(n) is the feature coefficient of frame n;
(3) Compute the similarity between T and R by computing the distortion D[T, R]; the smaller the distortion, the higher the similarity. Computing the distortion requires computing the distortion between corresponding frames of T and R. Let n and m be frame indices in T and R respectively, and let D[T(n), R(m)] denote the distortion between the feature vectors of T and R. Two cases arise:
(a) N = M: compute the frame distortions for n = m = 1, …, n = m = N and sum them to obtain the total distortion, as shown below:
D[T, R] = Σ_{n=1}^{N} D[T(n), R(n)]
(b) N ≠ M: assume N > M; by linear expansion, map {R(1), …, R(M)} onto a corresponding N-frame sequence {R′(1), …, R′(N)}, then compute its distortion with {T(1), …, T(N)} frame by frame and sum, as shown below:
D[T, R] = Σ_{n=1}^{N} D[T(n), R′(n)]
(4) Within a given time limit, match the reference emotion template in time against the test pattern to be recognized and obtain the minimum cumulative distance between the two templates.
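Putting the embodiment together, the recognition decision of step (4) — choose the emotion whose reference template yields the minimum cumulative distance — can be sketched as below. The emotion labels and toy feature sequences are invented for illustration; the local distance and step pattern are likewise assumptions:

```python
import numpy as np

def dtw_distance(T, R):
    """Minimum cumulative DTW distance between two frame sequences."""
    T, R = np.asarray(T, dtype=float), np.asarray(R, dtype=float)
    N, M = len(T), len(R)
    d = np.linalg.norm(T[:, None, :] - R[None, :, :], axis=2)
    D = np.full((N + 1, M + 1), np.inf)
    D[0, 0] = 0.0
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            D[n, m] = d[n - 1, m - 1] + min(D[n - 1, m], D[n, m - 1], D[n - 1, m - 1])
    return float(D[N, M])

def recognize_emotion(test_pattern, templates):
    """Return the label of the reference emotion template with the minimum
    cumulative distance to the test pattern."""
    return min(templates, key=lambda label: dtw_distance(test_pattern, templates[label]))

# Hypothetical templates: one feature per frame, rising vs. falling contour
templates = {"happy": [[0.0], [1.0], [2.0]],
             "sad":   [[2.0], [1.0], [0.0]]}
print(recognize_emotion([[0.0], [1.0], [1.0], [2.0]], templates))  # "happy"
```

Note that the 4-frame test pattern still matches the 3-frame "happy" template with zero distance, which is the duration-mismatch problem the method is designed to solve.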
Those skilled in the art can flexibly select the remaining technical features of this embodiment according to the actual situation to meet different specific practical needs. It will be apparent to those skilled in the art, however, that these specific details are not required to practice the invention. In other instances, well-known algorithms, methods, or systems are not described in detail in order not to obscure the invention; the invention is limited only by the technical solutions claimed in the claims of the present invention.
As to the foregoing method embodiments, for simplicity of description they are stated as a series of action combinations, but those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps can be performed in other orders or simultaneously. Secondly, those skilled in the art should also know that the embodiments described in the specification are preferred embodiments, and the actions and units involved are not necessarily required by the present application.
Those of skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. A professional may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present invention.
The disclosed system, modules, and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units is only a logical functional division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and the components shown as units may or may not be physical units: they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be completed by instructing the relevant hardware through a computer program; the program can be stored in a computer-readable storage medium and, when executed, can include the processes of the above method embodiments. The storage medium can be a magnetic disk, an optical disc, a ROM, a RAM, etc.
The above is only a preferred embodiment of the present invention. It should be understood that the present invention is not limited to the forms disclosed herein, which should not be regarded as excluding other embodiments; the invention can be used in various other combinations, modifications, and environments, and can be modified within the scope contemplated herein through the above teachings or the skill or knowledge of the related field. Modifications and changes made by those skilled in the art that do not depart from the spirit and scope of the present invention shall fall within the protection scope of the appended claims of the present invention.
Claims (3)
1. A pattern-matching sentiment analysis method, characterized by comprising:
(1) establishing the representation {R(1), R(2), …, R(m), …, R(M)} of the emotion template, where m is the frame index of the training emotional speech, m = 1 is the starting speech frame, m = M is the ending speech frame, M is the total number of frames in the speech pattern, and R(m) is the speech feature coefficient of frame m;
(2) establishing the representation {T(1), T(2), …, T(n), …, T(N)} of the test pattern, where n is the frame index of the test emotional speech, N is the total number of frames in the speech pattern, and T(n) is the feature coefficient of frame n;
(3) computing the similarity between T and R by computing the distortion D[T, R], where the smaller the distortion, the higher the similarity; computing the distortion requires computing the distortion between corresponding frames of T and R; n and m being frame indices in T and R respectively, and D[T(n), R(m)] denoting the distortion between the feature vectors of T and R, two cases are considered:
(a) N = M: computing the frame distortions for n = m = 1, …, n = m = N and summing them to obtain the total distortion, as shown below:
D[T, R] = Σ_{n=1}^{N} D[T(n), R(n)]
(b) N ≠ M: assuming N > M, mapping {R(1), …, R(M)} by linear expansion onto a corresponding N-frame sequence {R′(1), …, R′(N)}, then computing its distortion with {T(1), …, T(N)} frame by frame and summing, as shown below:
D[T, R] = Σ_{n=1}^{N} D[T(n), R′(n)]
(4) within a given time limit, matching the reference emotion template in time against the test pattern to be recognized, obtaining the minimum cumulative distance between the two templates, and realizing speech recognition by evaluating the minimum cumulative distance between the reference speech and the speech to be recognized.
2. The pattern-matching sentiment analysis method according to claim 1, characterized in that the method of dynamic time warping is used to reflect how speech segments vary in time under different conditions: by means of a time-warping function m = ω(n), the time axis n of the input test template is mapped onto the time axis m of the reference template in a nonlinear way, and ω satisfies the formula:
D = min_{ω(n)} Σ_{n=1}^{N} d[n, ω(n)]
where d[n, ω(n)] is the distance between the m-th frame reference vector and the n-th frame input vector, and D is the distance measure.
3. The pattern-matching sentiment analysis method according to claim 2, characterized in that the frame indices n = 1 to N of the test template are marked on the horizontal axis of a two-dimensional Cartesian coordinate system and the frame indices m = 1 to M of the reference template on the vertical axis; horizontal and vertical lines drawn through the integer coordinates of the frame indices form a grid, in which each crossing (n, m) pairs a frame of the test template with a frame of the reference template; a DP algorithm is used to find a path through the grid crossings, and the correspondence between reference and test frame distortions is then obtained from the crossings along the path.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201910296713.XA | 2019-04-14 | 2019-04-14 | A pattern-matching sentiment analysis method
Publications (1)
Publication Number | Publication Date
---|---
CN109903781A | 2019-06-18
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN110619893A | 2019-09-02 | 2019-12-27 | 合肥工业大学 | Time-frequency feature extraction and artificial-intelligence emotion monitoring method for voice signals
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN101436405A | 2008-12-25 | 2009-05-20 | 北京中星微电子有限公司 | Method and system for speaker recognition
CN102723078A | 2012-07-03 | 2012-10-10 | 武汉科技大学 | Emotional speech recognition method based on natural language understanding
CN103065627A | 2012-12-17 | 2013-04-24 | 中南大学 | Special-vehicle horn identification method based on dynamic time warping (DTW) and hidden Markov model (HMM) evidence fusion
CN105788600A | 2014-12-26 | 2016-07-20 | 联想(北京)有限公司 | Voiceprint recognition method and electronic device
CN109300474A | 2018-09-14 | 2019-02-01 | 北京网众共创科技有限公司 | Audio signal processing method and device
Non-Patent Citations (2)
Title
---|
胡航 (Hu Hang): 《语音信号处理》 (Speech Signal Processing), 31 July 2009
马俊 (Ma Jun): 语音识别技术研究 (Research on Speech Recognition Technology), 《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology)
Legal Events
Code | Title | Description
---|---|---
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190618