KR20170098445A - Situation judgment apparatus based on voice/sound analysis
- Publication number
- KR20170098445A
- Authority
- KR
- South Korea
- Prior art keywords
- voice
- module
- speaker
- ambient sound
- analyzing
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
- Telephone Function (AREA)
Abstract
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a situation determination apparatus, and more particularly, to a situation determination apparatus based on voice / sound analysis.
A large number of emergency calls and crime reports are made every year.
However, false reports account for a significant proportion of them.
Unfounded calls can come in as often as every few seconds, so the call taker has to judge within a short time whether a report is false and make a quick decision early on.
The call taker must not miss a genuine call while being tied up for a long time by a false one.
On the other hand, when a report is genuine, the speaker is often flustered and fails to convey the situation properly. In this case, it is necessary to judge quickly and accurately not only the authenticity of the report but also its urgency from the speaker's voice or the surrounding sound, and to grasp and infer as much information as possible about the accident scene or the speaker in a short time.
However, it is not easy to grasp a lot of information accurately and quickly within a short time, and the accuracy and objectivity of such judgments may not be consistent.
It is an object of the present invention to provide a voice / sound analysis based situation determination apparatus.
According to an aspect of the present invention, there is provided an apparatus for determining a situation based on voice / sound analysis, the apparatus comprising: a call receiving module for receiving a voice of a speaker and an ambient sound from the speaker's mobile terminal; an age information reasoning module for analyzing the voice and the ambient sound received from the call receiving module to infer the age information of the speaker; a gender information reasoning module for analyzing the voice and the ambient sound received from the call receiving module to infer the gender of the speaker; a psychological state reasoning module for analyzing the voice and the ambient sound received from the call receiving module to infer the psychological state of the speaker; and a truth / false inference module for inferring the truth / falsehood of the speaker by analyzing the voice and the ambient sound received from the call receiving module.
In this case, the apparatus may further include a GPS (Global Positioning System) remote control module that turns on the GPS function of the speaker's mobile terminal when the call receiving module receives the voice of the speaker and the ambient sound from the speaker's mobile terminal.
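The module composition described above can be sketched in code as follows. This is only an illustrative sketch under stated assumptions, not the apparatus's actual implementation: the names `SituationJudgmentApparatus`, `SituationReport`, and `receive_call`, the string labels, and the idea of supplying each inference module as a plain function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Audio = Sequence[float]  # assumed representation: mono samples as floats
Inference = Callable[[Audio, Audio], str]  # (voice, ambient sound) -> label


@dataclass
class SituationReport:
    """Results gathered for one call; the field names are hypothetical."""
    age: str
    gender: str
    psychological_state: str
    truthfulness: str


class SituationJudgmentApparatus:
    """Passes the received voice and ambient sound to each inference module."""

    def __init__(self, infer_age: Inference, infer_gender: Inference,
                 infer_state: Inference, infer_truth: Inference) -> None:
        self.infer_age = infer_age
        self.infer_gender = infer_gender
        self.infer_state = infer_state
        self.infer_truth = infer_truth

    def receive_call(self, voice: Audio, ambient: Audio) -> SituationReport:
        # Every module analyzes the same pair of inputs, as in the text.
        return SituationReport(
            age=self.infer_age(voice, ambient),
            gender=self.infer_gender(voice, ambient),
            psychological_state=self.infer_state(voice, ambient),
            truthfulness=self.infer_truth(voice, ambient),
        )
```

Keeping the four modules as interchangeable functions is a design choice of this sketch; it lets each inference criterion be developed and tested separately.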
According to the voice / sound analysis based situation determination apparatus described above, the age, gender, psychological state, truthfulness, and surrounding situation of the speaker are inferred from the speaker's voice and the ambient sound, so that the exact situation behind an accident report can be judged effectively.
FIG. 1 is a block diagram of a voice / sound analysis based situation determination apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method of determining a context based on speech / sound analysis according to an exemplary embodiment of the present invention.
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will be described in detail herein. It should be understood, however, that the invention is not intended to be limited to the particular embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention. Like reference numerals are used for like elements in describing each drawing.
The terms first, second, A, B, etc. may be used to describe various elements, but the elements should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as a second component, and similarly, the second component may also be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of the plurality of related listed items.
It is to be understood that when an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or other elements may be present in between. On the other hand, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that there are no other elements in between.
The terminology used in this application is used only to describe a specific embodiment and is not intended to limit the invention. The singular expressions include plural expressions unless the context clearly dictates otherwise. In the present application, terms such as "comprises" or "having" specify the presence of a feature, number, step, operation, element, component, or combination thereof described in the specification, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations thereof.
Unless defined otherwise, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an ideal or overly formal sense unless explicitly so defined in the present application.
Hereinafter, preferred embodiments according to the present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 is a block diagram of a voice / sound analysis based situation determination apparatus according to an embodiment of the present invention.
1, the voice / sound analysis based
Hereinafter, the detailed configuration will be described.
The call receiving
The call receiving
The age
Specifically, the age information can be inferred according to the following inference criteria.
Generally speaking, several factors make the speech behavior of elderly people differ markedly from that of young people. In the elderly, speech is generally slower than in young people, and the syllable rate is not constant. In addition, silences are inserted at inappropriate positions, and pronunciation and prosody tend to show abnormal patterns.
Younger adults also show a longer MPT (maximum phonation time) than older adults, which means that the ability to sustain a vowel tends to decrease with age. The alternating motion rate (AMR) and sequential motion rate (SMR), which check the repetition rate and regularity of syllables, are also faster in younger people than in older people.
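As a rough illustration of how the repetition rate and regularity checked by AMR and SMR could be quantified, the sketch below works on a list of syllable onset times. It assumes that onset times in seconds are already available from some upstream detector, and the function names `repetition_rate` and `interval_variability` are hypothetical; the text does not specify how these measures are computed.

```python
from statistics import mean, pstdev
from typing import Sequence


def _intervals(onsets_s: Sequence[float]) -> list:
    """Time between consecutive syllable onsets, in seconds."""
    if len(onsets_s) < 2:
        raise ValueError("need at least two syllable onsets")
    return [later - earlier for earlier, later in zip(onsets_s, onsets_s[1:])]


def repetition_rate(onsets_s: Sequence[float]) -> float:
    """Syllable repetitions per second between the first and last onset."""
    return len(_intervals(onsets_s)) / (onsets_s[-1] - onsets_s[0])


def interval_variability(onsets_s: Sequence[float]) -> float:
    """Coefficient of variation of the intervals; 0.0 means perfectly regular."""
    intervals = _intervals(onsets_s)
    return pstdev(intervals) / mean(intervals)
```

Under this sketch, a slower and less regular repetition pattern shows up as a lower `repetition_rate` and a higher `interval_variability`.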
In addition, the elderly have lower cognitive, sensory, and motor function, all of which contribute to speech production, than younger age groups. Therefore, their overall speech rate and articulation rate are slower.
The elderly also exhibit a high incidence of voice disorders in both subjective and objective aspects, and elderly women exhibit a significantly higher voice disorder index than younger adult women.
For men, vocal pitch falls until their 40s to 50s and then rises again, whereas for women pitch tends to fall as they get older.
Measurements of jitter and shimmer show that both increase in elderly males, whereas in elderly females only jitter tends to increase. Here, jitter is the rate of change of the vocal fold vibration, and shimmer indicates the regularity of the voice waveform. This tendency is indicative of a decrease in laryngeal function or a degenerative change in the laryngeal tissue. The noise-to-harmonics ratio, another indicator of the stability of vocalization, is significantly increased in elderly women, which supports the view that vocalization becomes less stable with age.
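The jitter and shimmer measures above can be illustrated with a short sketch. It uses the commonly cited "local" definitions (mean absolute cycle-to-cycle difference divided by the mean value, applied to glottal periods for jitter and to per-cycle peak amplitudes for shimmer). That choice is an assumption of this example, as is the availability of per-cycle periods and amplitudes; the text does not state which formulas the apparatus uses.

```python
from typing import Sequence


def relative_perturbation(values: Sequence[float]) -> float:
    """Mean absolute difference between consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(later - earlier) for earlier, later in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def jitter(periods_s: Sequence[float]) -> float:
    """Cycle-to-cycle variation of the vocal fold vibration period (assumed local jitter)."""
    return relative_perturbation(periods_s)


def shimmer(peak_amplitudes: Sequence[float]) -> float:
    """Cycle-to-cycle variation of the waveform amplitude (assumed local shimmer)."""
    return relative_perturbation(peak_amplitudes)
```

Both functions return a ratio; multiplying by 100 gives the percentage form often reported by voice analysis tools.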
Among the voice indices, the change caused by degenerative changes of the larynx tends to appear most strongly in the jitter of the vocal fold vibration.
The gender
The gender
The gender
Fundamental frequency, frequency variation, amplitude variation, and maximum fundamental frequency differ significantly by gender. By contrast, the noise-to-harmonics ratio, average fundamental frequency, and minimum fundamental frequency show no significant difference by gender. In addition, the fundamental frequency differs significantly between connected speech and sustained vowel phonation.
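Since the fundamental frequency is the central quantity in the criteria above, a minimal sketch of estimating it from a short voiced frame may help. The autocorrelation method used here is a standard textbook technique chosen for illustration, not a method stated in the text; the function name `estimate_f0` and the 60 to 400 Hz search range are assumptions.

```python
import math
from typing import Optional, Sequence


def estimate_f0(samples: Sequence[float], sample_rate: int,
                f_min: float = 60.0, f_max: float = 400.0) -> Optional[float]:
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak."""
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = min(int(sample_rate / f_min), len(samples) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def sine_wave(freq_hz: float, sample_rate: int, count: int) -> list:
    """Synthetic test tone standing in for a voiced frame."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate) for n in range(count)]
```

For example, `estimate_f0(sine_wave(200.0, 8000, 800), 8000)` recovers a value close to 200 Hz. Real speech would need voicing detection and smoothing across frames, which this sketch omits.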
The psychological
The psychological state and intention can be inferred by the following criteria.
First, the personality of the speaker can be deduced through the spoken behavior. The extroversion and introversion of the speaker can be judged on the basis of the speaking rate, silence length, silence frequency, and relative variation of the pitch.
In addition, an emotion inference engine that classifies the speaker's emotional state as one of pleasant / unpleasant / stable from EEG / pulse wave sensing information can be used to grasp the emotion, personality, psychological state, intention, and so on of the speaker from various aspects.
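The speaking-rate and silence criteria mentioned above can be sketched as simple features computed from per-frame energies. This is a minimal sketch under stated assumptions: the frame energies, the energy threshold, and the syllable count are taken as given inputs, and the names `silence_segments` and `speaking_features` are hypothetical. How such features map to extroversion or introversion is not specified in the text and is not modeled here.

```python
from typing import Dict, List, Sequence


def silence_segments(frame_energies: Sequence[float], frame_seconds: float,
                     threshold: float) -> List[float]:
    """Durations (seconds) of each run of consecutive frames below the threshold."""
    durations, run = [], 0
    for energy in frame_energies:
        if energy < threshold:
            run += 1
        elif run:
            durations.append(run * frame_seconds)
            run = 0
    if run:  # the recording ended during a silence
        durations.append(run * frame_seconds)
    return durations


def speaking_features(frame_energies: Sequence[float], frame_seconds: float,
                      threshold: float, syllable_count: int) -> Dict[str, float]:
    """Speaking rate, silence frequency, and mean silence length for one utterance."""
    pauses = silence_segments(frame_energies, frame_seconds, threshold)
    total_seconds = len(frame_energies) * frame_seconds
    return {
        "speaking_rate": syllable_count / total_seconds if total_seconds else 0.0,
        "silence_count": float(len(pauses)),
        "mean_silence_seconds": sum(pauses) / len(pauses) if pauses else 0.0,
    }
```

The relative pitch variation named in the text could be added in the same way, for instance as the standard deviation of per-frame pitch divided by its mean.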
The truth /
The truth / falsehood of a speaker can be inferred by the following criteria.
First, the speaker's answer to a question from the report taker can be stored for 5 seconds and analyzed to judge its truth or falsehood.
Here, the report taker can be set up to ask questions that follow the same pattern, with some answers expected for these questions set in advance, and truth / falsehood can be judged from them.
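One way to picture the preset-answer criterion above is a simple consistency score. The sketch below is purely illustrative: it assumes the speaker's answers have already been transcribed to text, and the function name `consistency_score`, the exact-match comparison, and the example questions are all hypothetical. The text does not describe how the answers are actually compared or what score would indicate a false report.

```python
from typing import Dict, Iterable


def consistency_score(answers: Dict[str, str],
                      preset_answers: Dict[str, Iterable[str]]) -> float:
    """Fraction of questions whose answer matches one of its preset answers."""
    if not preset_answers:
        raise ValueError("at least one preset question is required")
    matched = 0
    for question, accepted in preset_answers.items():
        answer = answers.get(question, "").strip().lower()
        # Compare case-insensitively and ignore surrounding whitespace.
        if answer in {text.strip().lower() for text in accepted}:
            matched += 1
    return matched / len(preset_answers)
```

For instance, with preset answers for two questions and only one matching reply, the score is 0.5; any cut-off applied to that score would be a further assumption.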
The peripheral
The ambient
For example, it is possible to preliminarily store sound such as a car sound, a human sound, a rain sound, and the like in the
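Matching a received ambient sound against pre-stored reference sounds can be sketched as a nearest-neighbor lookup. The example assumes each sound has already been reduced to a fixed-length feature vector, and it uses cosine similarity as the comparison; both are assumptions of this sketch, as are the names `cosine_similarity` and `closest_ambient_sound`. The text does not specify the features or the matching rule.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_ambient_sound(features: Sequence[float],
                          database: Dict[str, Sequence[float]]) -> str:
    """Label of the stored sound most similar to the received one.

    Raises ValueError if the database is empty.
    """
    return max(database,
               key=lambda label: cosine_similarity(features, database[label]))
```

A stored entry such as a car sound, a human sound, or a rain sound would then be returned as the inferred surrounding situation when its vector is closest to the incoming one.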
The GPS
Because location tracking by base station or WiFi has a significant error, it is preferable that the GPS remote control module 180 turn on the GPS function of the speaker's mobile terminal 10 and monitor the GPS coordinates in real time.
The
Also, the
FIG. 2 is a flowchart illustrating a method of determining a context based on speech / sound analysis according to an exemplary embodiment of the present invention.
Referring to FIG. 2, the speaker's mobile terminal 10 transmits the voice of the speaker and the ambient sound (S101).
Next, the
Next, the age
Next, the gender
Next, the psychological
Next, the truth /
Next, when the
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention as defined in the following claims.
110: call receiving module
120: age information inference module
130: gender information inference module
140: psychological state inference module
150: truth / false inference module
160: ambient sound inference module
170: ambient sound database
180: GPS remote control module
Claims (2)
An age information reasoning module for analyzing the voice and the ambient sound received from the call receiving module to infer the age information of the speaker;
A gender information reasoning module for analyzing the voice and the ambient sound received from the call receiving module to infer the gender of the speaker;
A psychological state reasoning module for analyzing a voice and a surrounding sound received from the call receiving module to deduce a psychological state of the speaker;
And a truth / false inference module for inferring the truth / false of the talker by analyzing the voice and the ambient sound received from the call reception module.
A GPS (global positioning system) remote control module for turning on the GPS function of the speaker's mobile terminal through the server of the corresponding communication company when the call receiving module receives the voice of the speaker and the ambient sound from the speaker's mobile terminal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160020350A KR20170098445A (en) | 2016-02-22 | 2016-02-22 | Situation judgment apparatus based on voice/sound analysis |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160020350A KR20170098445A (en) | 2016-02-22 | 2016-02-22 | Situation judgment apparatus based on voice/sound analysis |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20170098445A (en) | 2017-08-30 |
Family
ID=59760571
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020160020350A KR20170098445A (en) | 2016-02-22 | 2016-02-22 | Situation judgment apparatus based on voice/sound analysis |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20170098445A (en) |
-
2016
- 2016-02-22 KR KR1020160020350A patent/KR20170098445A/en unknown