KR20170099004A - Situation judgment system and method based on voice/sound analysis - Google Patents
- Publication number
- KR20170099004A
- Authority
- KR
- South Korea
- Prior art keywords
- voice
- module
- speaker
- analyzing
- ambient sound
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Artificial Intelligence (AREA)
- Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
Abstract
Description
BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a situation determination system and method, and more particularly, to a system and method for determining a situation based on voice/sound analysis.

A large number of emergency and crime reports are received every year. However, false reports account for a significant proportion of these calls. Because calls arrive every few seconds, the call taker must decide within a very short time whether a report is false. The call taker must not be tied up by a false report for so long that a genuine report is missed.

On the other hand, when a report is genuine, the caller is often flustered and fails to convey the situation properly. In this case it is necessary to judge quickly and accurately not only the authenticity of the report but also its urgency from the caller's voice and the surrounding sound, while gathering and inferring as much information as possible about the accident scene and the caller in a short time. However, it is difficult to grasp a large amount of information accurately within such a short time, and the accuracy and objectivity of human judgment are not consistent.
It is an object of the present invention to provide a situation determination system based on voice/sound analysis.
It is another object of the present invention to provide a situation determination method based on voice/sound analysis.
According to an aspect of the present invention, a voice/sound analysis based situation determination system includes: a talker mobile terminal for transmitting the voice of a talker and the ambient sound; a situation determination server including a call receiving module for receiving the voice and the ambient sound from the talker mobile terminal, an age information inference module for analyzing the received voice and ambient sound to infer the age information of the talker, a gender information inference module for analyzing the received voice and ambient sound to infer the gender of the talker, a psychological state inference module for analyzing the received voice and ambient sound to infer the psychological state of the talker, and a truth/false inference module for analyzing the received voice and ambient sound to infer the truth or falsehood of the talker's report; and a user terminal for displaying the age information, gender information, psychological state, and truth/false information inferred by the respective modules.
In this case, the situation determination server may further include a global positioning system (GPS) remote control module that, when the call receiving module receives the voice of the talker and the ambient sound from the talker mobile terminal, turns on the GPS function of the talker mobile terminal through the corresponding carrier server.
According to another aspect of the present invention, a voice/sound analysis based situation determination method includes: transmitting, by a talker mobile terminal, the voice of the talker and the ambient sound; receiving, by a situation determination server, the voice and the ambient sound from the talker mobile terminal; analyzing the received voice and ambient sound to infer the age information of the talker; analyzing the received voice and ambient sound to infer the gender information of the talker; analyzing the received voice and ambient sound to infer the psychological state of the talker; analyzing the received voice and ambient sound to infer the truth or falsehood of the talker's report; and displaying, on a user terminal, the age information, gender information, psychological state, and truth/false information inferred by the situation determination server.
Here, when the situation determination server receives the voice of the talker and the ambient sound from the talker mobile terminal, the method may further include turning on the GPS function of the talker mobile terminal through the corresponding carrier server, and receiving and displaying GPS coordinates from the talker mobile terminal in real time.
According to the voice/sound analysis based situation determination system and method described above, the age, gender, psychological state, truthfulness, and surrounding situation of a talker can be inferred from the talker's voice and the ambient sound, which enables an accurate judgment of the situation behind an accident report.
FIG. 1 is a block diagram of a voice/sound analysis based situation determination system according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a method of determining a context based on speech / sound analysis according to an exemplary embodiment of the present invention.
While the invention is susceptible to various modifications and alternative forms, specific embodiments are shown by way of example in the drawings and are described herein in detail. It should be understood, however, that the invention is not limited to the particular embodiments disclosed, but covers all modifications, equivalents, and alternatives falling within the spirit and scope of the invention. Like reference numerals are used for like elements throughout the drawings.

The terms first, second, A, B, and the like may be used to describe various elements, but the elements are not limited by these terms; the terms serve only to distinguish one element from another. For example, without departing from the scope of the present invention, a first element could be termed a second element, and similarly a second element could be termed a first element. The term "and/or" includes any combination of a plurality of related listed items, or any one of them.

When an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In contrast, when an element is referred to as being "directly connected" or "directly coupled" to another element, there are no intervening elements.

The terminology used in this application is for describing particular embodiments only and is not intended to limit the invention. Singular expressions include plural expressions unless the context clearly dictates otherwise. The terms "comprises" and "having" specify the presence of stated features, numbers, steps, operations, elements, components, or combinations thereof, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations thereof.

Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms defined in commonly used dictionaries are to be interpreted as having meanings consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in this application.
Hereinafter, preferred embodiments according to the present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 is a block diagram of a voice/sound analysis based situation determination system according to an embodiment of the present invention.
Referring to FIG. 1, a voice/sound analysis based situation determination system may be configured to include a talker mobile terminal 110, a situation determination server 120, and a user terminal 130.
Hereinafter, the detailed configuration will be described.
The talker mobile terminal 110 may transmit the voice of the talker and the ambient sound to the situation determination server 120.

The situation determination server 120 may be configured to include a call receiving module 121, an age information inference module 122, a gender information inference module 123, a psychological state inference module 124, a truth/false inference module 125, an ambient sound inference module 126, an ambient sound database 127, and a GPS remote control module 128.
Hereinafter, the detailed configuration will be described.
The call receiving module 121 may receive the voice of the talker and the ambient sound transmitted from the talker mobile terminal 110.

The age information inference module 122 may analyze the voice and the ambient sound received from the call receiving module 121 to infer the age information of the talker.
Specifically, the age information can be inferred according to the following inference criteria.
In general, several factors make the speaking behavior of elderly people differ markedly from that of younger people. In the elderly, the speech rate is generally slower than in young people, and the rate of syllable production is less constant. In addition, silences are inserted at inappropriate positions, and atypical patterns appear in pronunciation and phonation.
On the other hand, younger adults show a longer maximum phonation time (MPT) than older adults, which indicates that the ability to prolong vowels tends to decrease with age. The alternating motion rate (AMR) and sequential motion rate (SMR), which measure the repetition rate and regularity of syllables, are also faster in younger people than in older people.

The elderly also have reduced cognitive and motor functions that contribute to speech output, so both the overall speech rate and the articulation rate are slowed.

In addition, the elderly exhibit a high incidence of voice disorders in both subjective and objective measures, and elderly women show a significantly higher voice disorder index than younger adult women.

For men, vocal pitch falls through the 40s and 50s and then rises again, whereas women's pitch tends to fall steadily with age.

Measurements of jitter and shimmer show that both increase in elderly men, while in elderly women only jitter tends to increase. Here, jitter denotes the cycle-to-cycle variation of the vocal fold vibration period, and shimmer denotes the cycle-to-cycle variation of the waveform amplitude. These tendencies indicate a decline in laryngeal function or degenerative changes in the laryngeal tissue. Measurement of the noise-to-harmonics ratio, another indicator of phonation stability, shows a significant increase in elderly women, supporting the instability of phonation with increasing age.

The change of voice indices due to degenerative changes of the larynx tends to appear most strongly in the jitter of the vocal fold vibration.
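As an illustration of these two perturbation measures, the sketch below computes "local" jitter and shimmer from a list of extracted cycle periods and peak amplitudes (the conventional mean-normalized definitions; the patent itself does not prescribe a formula):

```python
def local_jitter(periods):
    """Local jitter: mean absolute difference between consecutive
    vocal-fold cycle periods, divided by the mean period."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

def local_shimmer(amplitudes):
    """Local shimmer: the same perturbation measure applied to the
    peak amplitude of each cycle."""
    return local_jitter(amplitudes)  # identical formula, different input

# Cycle periods (s) and peak amplitudes of a slightly irregular ~100 Hz voice.
periods = [0.0100, 0.0102, 0.0099, 0.0101, 0.0100]
amps = [0.80, 0.78, 0.81, 0.79, 0.80]
print(f"jitter  = {local_jitter(periods):.4f}")
print(f"shimmer = {local_shimmer(amps):.4f}")
```

Extracting the cycle periods themselves requires a pitch tracker; this sketch only covers the perturbation statistics once periods are known.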
The gender information inference module 123 may analyze the voice and the ambient sound received from the call receiving module 121 to infer the gender information of the talker.

The gender information may be inferred according to the following criteria.
According to gender, there are significant differences in fundamental frequency, frequency perturbation, amplitude perturbation, and maximum fundamental frequency, whereas the noise-to-harmonics ratio, average fundamental frequency, and minimum fundamental frequency show no significant gender difference. The fundamental frequency also differs significantly between connected speech and vowel prolongation tasks.
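A minimal sketch of the fundamental-frequency cue described above, using autocorrelation-based F0 estimation; the 165 Hz decision threshold and the typical male/female ranges are common rules of thumb, not values taken from the patent:

```python
import numpy as np

def fundamental_frequency(frame, sr, fmin=60.0, fmax=400.0):
    """Estimate F0 of a voiced frame by picking the autocorrelation
    peak inside the plausible pitch-period range."""
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

def guess_gender(f0_hz, threshold_hz=165.0):
    """Crude cue only: adult male F0 is typically ~85-180 Hz and adult
    female ~165-255 Hz; the single threshold is an assumption."""
    return "male" if f0_hz < threshold_hz else "female"

sr = 16000
male_like = np.sin(2 * np.pi * 120 * np.arange(4000) / sr)  # 120 Hz tone
f0 = fundamental_frequency(male_like, sr)
print(round(f0, 1), guess_gender(f0))  # ~120 Hz -> "male"
```

A production system would combine F0 with the perturbation and range measures listed above rather than rely on a single threshold.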
The psychological state inference module 124 may analyze the voice and the ambient sound received from the call receiving module 121 to infer the psychological state of the talker.
The psychological state and intention can be inferred by the following criteria.
First, the personality of the talker can be inferred from speaking behavior: extroversion or introversion can be judged from the speaking rate, silence length, silence frequency, and relative variation of the pitch.

In addition, an emotion inference engine that judges emotional states such as pleasant, unpleasant, or stable from EEG/pulse-wave sensing information of the talker can grasp the emotion, personality, psychological state, and intention of the talker from multiple aspects.
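The silence-based cues mentioned above (silence length, silence frequency) can be extracted with a simple frame-energy analysis; the frame size and RMS threshold below are illustrative assumptions, not parameters from the patent:

```python
import numpy as np

def silence_features(signal, sr, frame_ms=20, rms_threshold=0.02):
    """Silence statistics usable as extroversion/introversion cues:
    silence ratio, number of silent runs, and mean silence length (s)."""
    hop = int(sr * frame_ms / 1000)
    frames = [signal[i:i + hop] for i in range(0, len(signal) - hop + 1, hop)]
    silent = [float(np.sqrt(np.mean(f ** 2))) < rms_threshold for f in frames]
    runs, run = [], 0  # lengths of consecutive silent-frame runs
    for s in silent:
        if s:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    if run:
        runs.append(run)
    frame_s = frame_ms / 1000.0
    return {
        "silence_ratio": sum(silent) / len(silent),
        "silence_runs": len(runs),
        "mean_silence_s": (sum(runs) / len(runs)) * frame_s if runs else 0.0,
    }

sr = 8000
tone = 0.5 * np.sin(2 * np.pi * 200 * np.arange(sr) / sr)  # 1 s speech-like tone
pause = np.zeros(sr // 2)                                   # 0.5 s pause
feats = silence_features(np.concatenate([tone, pause, tone]), sr)
print(feats)  # one silent run of about 0.5 s
```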
The truth/false inference module 125 may analyze the voice and the ambient sound received from the call receiving module 121 to infer the truth or falsehood of the talker's report.
The truth / falsehood of a speaker can be inferred by the following criteria.
First, the talker's answer to the call taker's question can be recorded for 5 seconds and analyzed to judge its truth or falsehood.

Here, the call taker can be configured to ask questions of the same pattern, with some expected answers to these questions set in advance, so that truth or falsehood is judged by comparing the recorded answers against them.
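A toy sketch of this patterned question-and-answer check; the question set, keywords, and cutoff are hypothetical, since the patent does not specify how the pre-set answers are compared:

```python
# Hypothetical question patterns and expected answer keywords; the
# actual question set and scoring rule are NOT specified by the patent.
QUESTION_PATTERNS = {
    "where are you now?": {"street", "building", "near", "home"},
    "what do you see around you?": {"car", "people", "smoke", "fire"},
}

def consistency_score(question: str, answer: str) -> float:
    """Fraction of expected keywords found in the recorded answer."""
    expected = QUESTION_PATTERNS.get(question, set())
    if not expected:
        return 0.0
    words = set(answer.lower().split())
    return len(expected & words) / len(expected)

def judge(question: str, answer: str, cutoff: float = 0.25) -> str:
    """Flag implausible answers for scrutiny; a low score is a cue for
    the call taker, not proof of a false report."""
    return "plausible" if consistency_score(question, answer) >= cutoff else "suspect"

print(judge("what do you see around you?", "there is smoke and a car on fire"))  # "plausible"
print(judge("where are you now?", "um i do not know"))                            # "suspect"
```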
The ambient sound inference module 126 may analyze the ambient sound received from the call receiving module 121 to infer the surrounding situation of the talker.

To this end, the ambient sound inference module 126 may refer to the ambient sound database 127.

For example, sounds such as car noise, human voices, and the sound of rain can be stored in advance in the ambient sound database 127 and matched against the received ambient sound.
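One way to realize this database matching is a nearest-neighbor search over a coarse spectral fingerprint; the feature and the synthetic stand-ins for the stored sounds below are assumptions for illustration, not the patent's method:

```python
import numpy as np

def spectral_signature(signal, n_bands=8):
    """Coarse fingerprint: log total power in a few equal-width
    frequency bands."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    return np.log1p(np.array([b.sum() for b in np.array_split(power, n_bands)]))

def classify_ambient(signal, database):
    """Nearest-neighbor match of the query's signature against the
    pre-stored reference sounds in the ambient sound database."""
    sig = spectral_signature(signal)
    return min(database, key=lambda name: np.linalg.norm(sig - database[name]))

sr = 8000
t = np.arange(sr) / sr
database = {  # stand-ins for pre-recorded reference clips
    "car": spectral_signature(np.sin(2 * np.pi * 100 * t)),                      # low-frequency rumble
    "rain": spectral_signature(0.1 * np.random.default_rng(0).normal(size=sr)),  # broadband hiss
}
query = np.sin(2 * np.pi * 110 * t)  # low-frequency query sound
print(classify_ambient(query, database))  # "car"
```

Real systems would use richer features (e.g. mel-frequency cepstra) and many reference clips per class, but the lookup structure is the same.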
The GPS remote control module 128 may turn on the GPS function of the talker mobile terminal 110 through the corresponding carrier server when the call receiving module 121 receives the voice of the talker and the ambient sound.

It is preferable that the GPS remote control module 128 receive GPS coordinates from the talker mobile terminal 110 in real time and provide them for display.
The user terminal 130 may display the age information, gender information, psychological state, and truth/false information inferred by the situation determination server 120.

Also, the user terminal 130 may display the GPS coordinates received from the talker mobile terminal 110 in real time.
FIG. 2 is a flowchart illustrating a method of determining a context based on speech / sound analysis according to an exemplary embodiment of the present invention.
Referring to FIG. 2, the talker mobile terminal 110 first transmits the voice of the talker and the ambient sound.

Next, the situation determination server 120 receives the voice of the talker and the ambient sound from the talker mobile terminal 110.

Next, the situation determination server 120 analyzes the received voice and ambient sound to infer the age information of the talker.

Next, the situation determination server 120 analyzes the received voice and ambient sound to infer the gender information of the talker.

Next, the situation determination server 120 analyzes the received voice and ambient sound to infer the psychological state of the talker.

Next, the situation determination server 120 analyzes the received voice and ambient sound to infer the truth or falsehood of the talker's report.

Next, the user terminal 130 displays the age information, gender information, psychological state, and truth/false information inferred by the situation determination server 120.

Next, when the situation determination server 120 receives the voice of the talker and the ambient sound from the talker mobile terminal 110, it turns on the GPS function of the talker mobile terminal 110 through the corresponding carrier server, and GPS coordinates are received from the talker mobile terminal 110 in real time and displayed.
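The flow above can be sketched end to end as a function that combines the per-module inferences into one record for the user terminal; every field name and threshold here is a placeholder, not part of the patent:

```python
from dataclasses import dataclass

@dataclass
class SituationReport:
    """Result record displayed on the user terminal (field names are
    illustrative, not from the patent)."""
    age_group: str
    gender: str
    psych_state: str
    truthfulness: str
    ambient: str

def assess_call(voice_features: dict, ambient_label: str) -> SituationReport:
    """Toy end-to-end sketch of the FIG. 2 flow: each inference consumes
    the same received voice/ambient input, and the combined result is
    shown on the user terminal. All thresholds are placeholders."""
    f0 = voice_features["f0_hz"]
    return SituationReport(
        age_group="elderly" if voice_features["speech_rate_sps"] < 3.0 else "adult",
        gender="male" if f0 < 165 else "female",
        psych_state="agitated" if voice_features["pitch_var"] > 0.3 else "calm",
        truthfulness="plausible" if voice_features["answer_score"] >= 0.25 else "suspect",
        ambient=ambient_label,
    )

report = assess_call(
    {"f0_hz": 210.0, "speech_rate_sps": 4.5, "pitch_var": 0.45, "answer_score": 0.6},
    ambient_label="car",
)
print(report)
```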
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the invention as defined in the following claims.
110: talker mobile terminal
120: situation determination server
121: call receiving module
122: age information inference module
123: gender information inference module
124: psychological state inference module
125: truth/false inference module
126: ambient sound inference module
127: ambient sound database
128: GPS remote control module
130: user terminal
Claims (4)
A voice/sound analysis based situation determination system comprising: a talker mobile terminal for transmitting the voice of a talker and the ambient sound; a situation determination server including a call receiving module for receiving the voice and the ambient sound from the talker mobile terminal, an age information inference module for analyzing the voice and the ambient sound received from the call receiving module to infer the age information of the talker, a gender information inference module for analyzing the voice and the ambient sound received from the call receiving module to infer the gender of the talker, a psychological state inference module for analyzing the voice and the ambient sound received from the call receiving module to infer the psychological state of the talker, and a truth/false inference module for analyzing the voice and the ambient sound received from the call receiving module to infer the truth or falsehood of the talker's report; and a user terminal for displaying the age information inferred by the age information inference module, the gender information inferred by the gender information inference module, the psychological state inferred by the psychological state inference module, and the truth/false information inferred by the truth/false inference module.

The system of claim 1, wherein the situation determination server further includes a global positioning system (GPS) remote control module that, when the call receiving module receives the voice of the talker and the ambient sound from the talker mobile terminal, turns on the GPS function of the talker mobile terminal through the corresponding carrier server.
A voice/sound analysis based situation determination method comprising:
receiving, by a situation determination server, the voice of a talker and the ambient sound from a talker mobile terminal;
analyzing the received voice and ambient sound to infer the age information of the talker;
analyzing the received voice and ambient sound to infer the gender information of the talker;
analyzing the received voice and ambient sound to infer the psychological state of the talker;
analyzing the received voice and ambient sound to infer the truth or falsehood of the talker's report; and
displaying, on a user terminal, the age information, gender information, psychological state, and truth/false information inferred by the situation determination server.

The method of claim 3, further comprising: when the situation determination server receives the voice of the talker and the ambient sound from the talker mobile terminal, turning on the GPS function of the talker mobile terminal through the corresponding carrier server; and receiving and displaying GPS coordinates from the talker mobile terminal in real time.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160020348A KR101799874B1 (en) | 2016-02-22 | 2016-02-22 | Situation judgment system and method based on voice/sound analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170099004A (en) | 2017-08-31 |
KR101799874B1 KR101799874B1 (en) | 2017-12-21 |
Family
ID=59761369
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020160020348A KR101799874B1 (en) | 2016-02-22 | 2016-02-22 | Situation judgment system and method based on voice/sound analysis |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101799874B1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101998650B1 (en) * | 2019-02-12 | 2019-07-10 | 한방유비스 주식회사 | Collecting information management system of report of disaster |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10423773B1 (en) | 2019-04-12 | 2019-09-24 | Coupang, Corp. | Computerized systems and methods for determining authenticity using micro expressions |
- 2016-02-22: KR application KR1020160020348A filed; granted as patent KR101799874B1 (active, IP Right Grant)
Also Published As
Publication number | Publication date |
---|---|
KR101799874B1 (en) | 2017-12-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5834449B2 (en) | Utterance state detection device, utterance state detection program, and utterance state detection method | |
EP1222448B1 (en) | System, method, and article of manufacture for detecting emotion in voice signals by utilizing statistics for voice signal parameters | |
ES2242634T3 (en) | TELEPHONE EMOTION DETECTOR WITH OPERATOR FEEDBACK. | |
US8160210B2 (en) | Conversation outcome enhancement method and apparatus | |
US20140314212A1 (en) | Providing advisory information associated with detected auditory and visual signs in a psap environment | |
WO2017085992A1 (en) | Information processing apparatus | |
EP4020467A1 (en) | Voice coaching system and related methods | |
JP2017100221A (en) | Communication robot | |
US11699043B2 (en) | Determination of transcription accuracy | |
KR101799874B1 (en) | Situation judgment system and method based on voice/sound analysis | |
JP6695057B2 (en) | Cognitive function evaluation device, cognitive function evaluation method, and program | |
JP2020000713A (en) | Analysis apparatus, analysis method, and computer program | |
JP6598227B1 (en) | Cat-type conversation robot | |
JP2006230446A (en) | Health-condition estimating equipment | |
KR20180052907A (en) | System and method of supplying graphic statistics using database based on voice/sound analysis | |
KR20180052909A (en) | Interface system and method for database based on voice/sound analysis and legacy | |
JP6718623B2 (en) | Cat conversation robot | |
KR20170098445A (en) | Situation judgment apparatus based on voice/sound analysis | |
KR102571549B1 (en) | Interactive elderly neglect prevention device | |
KR20170098446A (en) | Situation judgment ethod based on voice/sound analysis | |
US20130143543A1 (en) | Method and device for automatically switching a profile of a mobile phone | |
KR20190085272A (en) | Open api system and method of json format support by mqtt protocol | |
KR102000282B1 (en) | Conversation support device for performing auditory function assistance | |
KR20180019375A (en) | Condition check and management system and the method for emotional laborer | |
KR101329175B1 (en) | Sound analyzing and recognizing method and system for hearing-impaired people |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right |