WO2008004844A1 - Method and system for providing voice analysis service, and apparatus therefor - Google Patents

Method and system for providing voice analysis service, and apparatus therefor

Info

Publication number
WO2008004844A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice analysis
voice
called party
party
calling terminal
Application number
PCT/KR2007/003304
Other languages
French (fr)
Inventor
Si-Woo Park
Hee-Jung Ahn
Kyu-Suk Cho
Original Assignee
Ktfreetel Co., Ltd.
Egtek. Co., Ltd
Priority claimed from KR1020060063593A external-priority patent/KR20080004813A/en
Priority claimed from KR1020060077781A external-priority patent/KR20080016113A/en
Priority claimed from KR20070039360A external-priority patent/KR100940088B1/en
Application filed by Ktfreetel Co., Ltd. and Egtek. Co., Ltd
Publication of WO2008004844A1 publication Critical patent/WO2008004844A1/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 3/00 Automatic or semi-automatic exchanges
    • H04M 3/42 Systems providing special services or facilities to subscribers
    • H04M 3/42025 Calling or Called party identification service
    • H04M 3/42034 Calling party identification service
    • H04M 3/42042 Notifying the called party of information on the calling party
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 3/00 Automatic or semi-automatic exchanges
    • H04M 3/42 Systems providing special services or facilities to subscribers
    • H04M 3/42017 Customized ring-back tones

Definitions

  • the present invention relates to a mobile communication service, and in particular, to a method and system for analyzing the voice of a user who is on a call based on layered voice analysis and providing an analysis result.
  • when listening to the voice of a person, a human can estimate the emotional state of that person to some extent. For example, when a person is excited or angry, the average pitch of his/her voice becomes higher than usual, and even when he/she speaks a sentence composed of the same words as usual, the average pitch within the sentence changes more rapidly. Also, as the voice becomes louder, its energy becomes relatively higher than in the same environment as usual, and the average energy changes relatively more rapidly.
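As a rough illustration of the two cues described above, the following sketch tracks average pitch and frame energy over a mono PCM signal. This is a minimal sketch for illustration only, not part of the patent; the frame length and the autocorrelation pitch search band are assumptions.

```python
# Minimal sketch (assumption, not the patent's method): estimating the two
# cues named above, average pitch and frame energy, from a mono PCM signal.
import numpy as np

def frame_energy(frame: np.ndarray) -> float:
    """Mean squared amplitude of one analysis frame."""
    return float(np.mean(frame.astype(np.float64) ** 2))

def pitch_autocorr(frame: np.ndarray, sr: int, fmin: int = 60, fmax: int = 400) -> float:
    """Crude pitch estimate: autocorrelation peak within [fmin, fmax] Hz."""
    x = frame.astype(np.float64) - np.mean(frame)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]   # non-negative lags
    lo, hi = sr // fmax, sr // fmin                      # lag search bounds
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

def track(signal: np.ndarray, sr: int, frame_ms: int = 30):
    """Yield (pitch_hz, energy) per frame; rising averages suggest excitement."""
    n = int(sr * frame_ms / 1000)
    for i in range(0, len(signal) - n, n):
        f = signal[i:i + n]
        yield pitch_autocorr(f, sr), frame_energy(f)
```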
  • the love detection technology includes an emotional state recognizing technology, a signal processing technology according to emotional state recognition, or an application technology.
  • Prior arts related to the love detection technology are patented or pending, for example "emotion recognition method, and sensibility creating method, device, and software" (Korean Patent Application No. 10-2003-7003615), "method for providing sound source and avatars according to feelings on mobile” (Korean Patent Application No. 10-2003-0081299), “emotion recognition in voice using wavelet transformation” (Korean Patent Application No. 10-2002-0026056), "a doll for representing a user's emotion” (Korean Registration No. 20-0313090-0000) or "accessory capable of expressing one's emotions” (Korean Registration No. 20-0301592-0000).
  • the voice analysis service of the present invention may also be applied to a third generation mobile communication network, for example a W-CDMA (Wideband Code Division Multiple Access) network or an HSDPA (High-Speed Downlink Packet Access) network.
  • the present invention is designed to solve the problems of the prior art, and therefore it is an object of the present invention to provide a method and system for providing a voice analysis service, which analyze the voice of a user who is on a call based on layered voice analysis and provide an emotional analysis result in various manners, and an apparatus therefor.
  • a method for providing a voice analysis service in a communication network includes establishing a traffic channel between a calling terminal and a called terminal according to call connection of the calling terminal; performing layered voice analysis of the voice of a called party transmitted from the called terminal to the calling terminal through the established traffic channel in real time to draw emotional state information of the called party; and transmitting the drawn emotional state information of the called party to the calling terminal in real time.
  • a method for providing a voice analysis service during video communication in a video communication network includes establishing a video traffic channel between a calling terminal and a called terminal according to video communication call connection of the calling terminal; performing layered voice analysis of the voice of a called party transmitted from the called terminal to the calling terminal through the established video traffic channel in real time to draw emotional state information of the called party; converting the drawn emotional state information of the called party into visual data; and transmitting to the calling terminal, and displaying, the converted visual data together with the video of the called party transmitted through the video traffic channel.
  • a method for providing a voice analysis service in a communication network includes receiving a call connection for the voice analysis service from a calling terminal; notifying the calling terminal of mode information of the voice analysis service and receiving a selection response to the notification from the calling terminal; in the case that the calling party selects an exercise mode, notifying the calling terminal of a virtual opponent and an exercise method and receiving a selection response to the notification from the calling terminal; receiving the voice of the calling party from the calling terminal according to the virtual opponent and exercise method selected by the calling party; and performing layered voice analysis of the received voice of the calling party to draw a voice analysis result, and transmitting the drawn voice analysis result to the calling terminal in real time.
  • a system for providing a voice analysis service in a communication network includes a switching center for receiving a call origination, including a feature code for the voice analysis service and a telephone number of a called party, from a calling terminal and routing the call origination based on the feature code; and a layered voice analysis server for receiving the routed call, establishing a traffic channel between the calling terminal and the called terminal corresponding to the telephone number of the called party, analyzing the voice of the called party transmitted to the calling terminal through the established traffic channel to draw emotional state information of the called party in real time, and transmitting the drawn emotional state information to the calling terminal in real time.
  • a mobile communication device for providing a voice analysis service during video communication in a mobile communication network includes a service control unit for controlling a service operation mode according to a video communication call connection signal from a calling terminal; a video communication unit for establishing a traffic channel between the calling terminal and a called terminal under the control of the service control unit to support video communication; and a voice analysis engine for analyzing the voice of a called party transmitted to the calling terminal through the established traffic channel under the control of the service control unit and converting the emotional state of the called party into visual data.
  • FIG. 1 is a view illustrating a configuration of a voice analysis service system of a mobile communication network according to an embodiment of the present invention.
  • FIG. 2 is a block diagram illustrating a layered voice analysis server of FIG. 1 according to an embodiment of the present invention.
  • FIG. 3 is a flow chart illustrating a voice analysis service method according to an embodiment of the present invention.
  • FIG. 4 is a flow chart illustrating a voice analysis service method according to another embodiment of the present invention.
  • FIG. 5 is a flow chart illustrating a voice analysis service method according to still another embodiment of the present invention.
  • FIG. 6 is a block diagram illustrating a mobile communication server for providing an emotional state analysis service during video communication according to an embodiment of the present invention.
  • FIG. 7 is a block diagram illustrating a mobile communication server for providing an emotional state analysis service during video communication according to another embodiment of the present invention.
  • FIG. 8 is a flow chart illustrating a system for providing a voice analysis service during video communication according to an embodiment of the present invention.
  • FIG. 9 is a view illustrating an MMS message provided as a voice analysis result according to an embodiment of the present invention.
  • FIG. 10 is a view illustrating an SMS message provided as a love detection result according to an embodiment of the present invention.
  • FIG. 11 is a view illustrating a process for providing a voice analysis result through a mobile phone Internet according to an embodiment of the present invention.
  • FIG. 12 is a view illustrating a screen providing a love detection result through a web service according to an embodiment of the present invention.
  • FIG. 13 is a view illustrating a screen providing a love detection result through a web service according to another embodiment of the present invention.
  • FIG. 14 is a view illustrating an embodiment of a method for synchronizing the whole conversation between a calling party and a called party with a voice analysis result of the called party.
  • FIG. 1 is a view illustrating a configuration of a voice analysis service system of a mobile communication network according to an embodiment of the present invention.
  • the voice analysis service system includes a mobile switching center (MSC) 120, a home location register (HLR) 130, a layered voice analysis server 140, a short message service center (SMSC) 150, and a multimedia message service center (MMSC) 160.
  • the configuration of the voice analysis service system according to the present invention is not limited in this regard, and the voice analysis service system may include all variations and combinations of the above-mentioned components.
  • the layered voice analysis server 140 may be connected to an Internet network capable of voice communication to provide a voice analysis service, or may be connected to a communication network based on a public switched telephone network (PSTN), an integrated service digital network (ISDN) and a local area network (LAN) to provide a voice analysis service.
  • the layered voice analysis server 140 cooperates with the MSC 120 over E1/T1 links to execute a call processing function for the voice analysis service of the present invention. And, the layered voice analysis server 140 executes voice analysis on-line in real time using an inner voice analysis engine, or executes voice analysis of a prerecorded audio file off-line.
  • the layered voice analysis uses voice as a medium of the brain to make a blueprint of brain activity and analyze the complete emotional state of a talker; it does not focus on the conversation itself, but on the brain activity that occurs while making the conversation. That is, a human brain generally leaves "fingerprints" on all "events" that pass through the brain.
  • the layered voice analysis server 140 according to the present invention disregards words of an object for analysis (a calling party or a called party), and focuses on only the brain activity of the object for analysis. In other words, the voice analysis according to the present invention does not focus on 'what' the object for analysis speaks, but 'how' the object for analysis speaks.
  • the layered voice analysis service can analyze the voice of a calling party 110 or a called party 170 in real time and provide a real time voice analysis result. And, after a call ends, the layered voice analysis service can provide the voice analysis result in the form of a short message or a multimedia message, or through a wire or wireless Internet.
  • the calling party 110 and the called party 170 use a terminal capable of voice communication, for example a mobile phone, a notebook, PDA or a typical telephone.
  • the layered voice analysis server 140 stores a conversation between the calling party 110 and the called party 170 who are on a call. And, the layered voice analysis server 140 separates only the voice of the calling party 110 or the called party 170 from the conversation, stores the separated voice, and extracts a predetermined voice parameter from an audio file storing the voice. At this time, the audio file is a 6/8/11 kHz, 8/16-bit mono/stereo uncompressed pulse code modulation (PCM) format WAV file.
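A minimal sketch of loading the uncompressed PCM WAV format described above into a normalized array suitable for parameter extraction. This is not the patent's implementation; the helper name, the file name, and the mono down-mix are assumptions.

```python
# Minimal sketch (assumption): reading an 8- or 16-bit mono/stereo PCM WAV
# file of the kind described above and normalizing it to floats in [-1, 1].
import wave
import numpy as np

def load_pcm_wav(path: str):
    """Return (samples, sample_rate) for an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as w:
        sr, width, nch = w.getframerate(), w.getsampwidth(), w.getnchannels()
        raw = w.readframes(w.getnframes())
    if width == 1:                                  # 8-bit PCM is unsigned
        x = np.frombuffer(raw, dtype=np.uint8).astype(np.float64)
        x = (x - 128.0) / 128.0
    elif width == 2:                                # 16-bit PCM is signed
        x = np.frombuffer(raw, dtype=np.int16).astype(np.float64) / 32768.0
    else:
        raise ValueError("expected 8- or 16-bit PCM")
    if nch == 2:                                    # average stereo down to mono
        x = x.reshape(-1, 2).mean(axis=1)
    return x, sr

# signal, sr = load_pcm_wav("called_party.wav")    # hypothetical file name
```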
  • the eight voice analysis formulas may include a lie stress analysis formula, a stimulation level analysis formula, an attention level analysis formula, an emotional level analysis formula, a strife or conflict level analysis formula, a fraud and deceit pattern combination analysis formula or additional formulas for reliability evaluation.
  • the voice analysis result includes an excitement level representing a degree of excitement of an object for voice analysis, and a conviction level representing a degree of conviction of the object for voice analysis about his/her words.
  • the voice analysis result includes a stress level representing a degree of stress of the object for voice analysis who knows he/she cannot escape from a current situation, and a thought level representing a degree that the object for voice analysis tries to search for a response.
  • the voice analysis result includes an S.O.S (Say or Stop) level representing whether the object for voice analysis evades a conversation, and a concentration level representing whether the object for voice analysis concentrates on the conversation.
  • the voice analysis result includes an anticipation level representing a degree of anticipation of the object for voice analysis to an opponent, an embarrassment level representing whether the object for voice analysis is embarrassed, or a love level representing a degree of love of the object for voice analysis.
  • a portion or all of the voice analysis result is selected and provided according to service type.
  • the voice analysis result may be synthesized to draw and provide another service level. For example, a lie detection level may be provided.
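The patent does not disclose how the levels are synthesized, so the sketch below is only a hypothetical weighted combination to illustrate the idea of deriving a composite lie detection level; all weights are invented placeholders.

```python
# Minimal sketch (assumption): combining 0-100 component levels into a 0-100
# composite "lie detection" score. The weights are purely illustrative.
def lie_detection_level(stress: float, sos: float, conviction: float,
                        concentration: float) -> float:
    score = (0.35 * stress                    # evasive stress raises suspicion
             + 0.25 * sos                     # conversation avoidance raises suspicion
             + 0.25 * (100 - conviction)      # low conviction raises suspicion
             + 0.15 * (100 - concentration))  # low concentration raises suspicion
    return max(0.0, min(100.0, score))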
  • the voice analysis service may be provided in an actual mode and an exercise mode.
  • the actual mode is such that the voice analysis result is transmitted to the calling party 110 or the called party 170 during an actual call.
  • the exercise mode is such that the calling party 110 receives his/ her voice analysis result while speaking by telephone with a virtual opponent.
  • the exercise mode enables the calling party 110 to make a virtual call with the virtual opponent, so that the calling party 110 can reduce a mistake or misunderstanding that may occur during an actual call with the called party 170 and judge whether he/she is properly expressing his/her emotion.
  • the layered voice analysis server 140 establishes a traffic channel between the calling party 110 and the called party 170, and analyzes voice of the called party 170 transmitted to the calling party 110 through the established traffic channel to extract a love level.
  • the layered voice analysis server 140 separately records and stores a conversation between the calling party 110 and the called party 170. Meanwhile, the layered voice analysis server 140 informs in real time the calling party 110 of the love level that is drawn by analyzing the voice of the called party 170 or provides the calling party 110 or the called party 170 with contents corresponding to the love level.
  • the layered voice analysis server 140 may provide the love level when the love level exceeds a predetermined critical value, or the layered voice analysis server 140 may average the love level at a predetermined time interval and inform the calling party 110 of the averaged love level.
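A minimal sketch of the two delivery rules mentioned above: push the love level immediately when it exceeds a critical value, otherwise report a windowed average at a fixed interval. The threshold and interval values are illustrative assumptions.

```python
# Minimal sketch (assumption): when to notify the calling party of the love
# level. Threshold and interval values are illustrative, not from the patent.
from collections import deque
import time

class LoveLevelNotifier:
    def __init__(self, threshold: float = 70.0, interval_s: float = 10.0):
        self.threshold = threshold
        self.interval_s = interval_s
        self.window = deque()                 # (timestamp, level) samples
        self.last_sent = 0.0

    def on_sample(self, level: float, now: float = None):
        """Return a level to notify, or None to stay silent."""
        now = time.time() if now is None else now
        self.window.append((now, level))
        while self.window and now - self.window[0][0] > self.interval_s:
            self.window.popleft()             # keep only the current interval
        if level >= self.threshold:           # critical-value rule: push now
            return level
        if now - self.last_sent >= self.interval_s:   # periodic averaged push
            self.last_sent = now
            return sum(v for _, v in self.window) / len(self.window)
        return None
```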
  • the layered voice analysis server 140 may store the conversation or a call history between the calling party 110 and the called party 170, and after the call ends, provide the calling party 110 with a voice analysis result obtained by analyzing the recorded conversation in detail and contents corresponding to the voice analysis result as a short message or a multimedia message or through a wire or wireless Internet.
  • the layered voice analysis server 140 receives a virtual opponent and an exercising method selected by the calling party 110 and words from the calling party 110.
  • the virtual opponent may include mother or father, a brother or a sister, a girl friend or a boy friend, a one-sided love interest, or a friend.
  • the exercising method may include persuasion, consolation or confession.
  • the layered voice analysis server 140 analyzes voice of the calling party 110, draws a love level and provides the calling party 110 with the drawn love level or contents corresponding to the love level.
  • the layered voice analysis server 140 may provide the love level when the love level exceeds a predetermined critical value, or the layered voice analysis server 140 may average the love level at a predetermined time interval and provide the calling party 110 with the averaged love level. And, the layered voice analysis server 140 may store the words and a call history of the calling party 110, and after the exercise mode ends, provide the calling party 110 with the voice analysis result obtained by analyzing his/her words in detail and contents corresponding to the voice analysis result as a short message or a multimedia message or through a wire or wireless Internet.
  • a user who wants to use the voice analysis service may connect to the layered voice analysis server 140 using a predetermined feature code. That is, when the user inputs a predetermined feature code (for example, "**45") and a telephone number of a called party into a terminal, the MSC 120 recognizes, based on the feature code included in the call connection request message transmitted from the user's terminal, that the corresponding call is a voice analysis service call. The MSC 120 requests routing information for the layered voice analysis server 140 from the HLR 130, and routes the call of the user to the layered voice analysis server 140 based on the response from the HLR 130.
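A minimal sketch of the feature-code routing described above, using the "**45" example from the text. The destination names are hypothetical, and a real MSC would query the HLR for routing information rather than return a string.

```python
# Minimal sketch (assumption): detecting the service feature code in a dialed
# digit string and choosing a routing destination. Destinations are hypothetical.
FEATURE_CODE = "**45"

def route_call(dialed: str) -> tuple[str, str]:
    """Return (destination, called_number) for a dialed digit string."""
    if dialed.startswith(FEATURE_CODE):
        called = dialed[len(FEATURE_CODE):]
        # Here the MSC would request the analysis server's routing info from the HLR.
        return "layered_voice_analysis_server", called
    return "normal_call_path", dialed

assert route_call("**4501012345678") == ("layered_voice_analysis_server", "01012345678")
```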
  • a packet data serving node (PDSN or SGSN) routes the outbound call from the user's terminal to the layered voice analysis server 140.
  • FIG. 2 is a block diagram illustrating the layered voice analysis server 140 of FIG. 1 according to an embodiment of the present invention.
  • the layered voice analysis server 140 includes a call setup unit 210, a recording unit 230, a voice analysis engine 250 and a storage device 270.
  • the voice analysis engine 250 includes an analysis object separating unit 251, a voice extracting unit 253 and a voice analysis unit 255.
  • the storage device 270 includes an analysis result storage unit 271 and a contents storage unit 273.
  • the voice analysis engine 250, an essential component of the layered voice analysis server 140, may be implemented as a software development kit. However, the voice analysis engine 250 is not limited in this regard.
  • the call setup unit 210 of the layered voice analysis server 140 cooperates with the MSC 120, SMSC 150 and MMSC 160 of the mobile communication network to execute call processing for the voice analysis service of the present invention.
  • the call setup unit 210 establishes a traffic channel between the calling party 110 and the called party 170 to transmit and receive voices of the calling party 110 and the called party 170.
  • the call setup unit 210 establishes a traffic channel with the calling party 110 to receive voice of the calling party 110.
  • the call setup unit 210 transmits a voice analysis result of an object for voice analysis and contents corresponding to the voice analysis result as a short message or a multimedia message, thereby providing the voice analysis service after the call ends.
  • the recording unit 230 records the voice received by the call setup unit 210.
  • the audio file recorded by the recording unit 230 is a 6/8/11 kHz, 8/16-bit mono/stereo uncompressed pulse code modulation (PCM) format WAV file.
  • the audio file recorded by the recording unit 230 is stored in the storage device 270.
  • the analysis object separating unit 251 of the voice analysis engine 250 separates the voice of the object for voice analysis from the audio file input through the recording unit 230.
  • the analysis object separating unit 251 separates only the voice of the called party 170 from the audio file input through the recording unit 230.
  • the analysis object separating unit 251 separates only the voice of the calling party 110 from the audio file input through the recording unit 230.
  • the voice extracting unit 253 extracts a predetermined voice parameter from the audio file of the object for voice analysis that is separated by the analysis object separating unit 251.
  • the voice analysis unit 255 draws a voice analysis result using up to eight voice analysis formulas based on the voice parameter extracted by the voice extracting unit 253.
  • the voice analysis result includes an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an embarrassment level or a love level.
  • the voice analysis unit 255 selectively provides a portion or all of the above-mentioned levels according to service type. And, the voice analysis unit 255 may synthesize the above-mentioned levels to calculate another type level, for example a lie detection level.
  • the voice analysis unit 255 provides the above-mentioned levels numerically or in a graphic or text.
  • the analysis result storage unit 271 of the storage device 270 stores the conversation between the calling party 110 and the called party 170 that is recorded by the recording unit 230. And, the analysis result storage unit 271 stores the voice analysis result of the object for voice analysis that is drawn by the voice analysis unit 255. At this time, the analysis result storage unit 271 maps and stores information of the user who used the voice analysis service and the date of the voice analysis service together with the voice analysis result and the whole conversation of the object for voice analysis. And, the contents storage unit 273 of the storage device 270 stores predetermined contents that are provided corresponding to the voice analysis result, for example music or an emoticon. Thus, the whole conversation and the voice analysis result stored in the analysis result storage unit 271 and the predetermined contents stored in the contents storage unit 273 are provided to the user through a short message service, a multimedia message service or a wire or wireless Internet service.
  • FIG. 3 is a flow chart illustrating a voice analysis service method according to an embodiment of the present invention, and illustrates a method for analyzing the voice of the called party during the call between the calling party and the called party and providing the voice analysis result to the calling party.
  • the calling party 110 connects to the layered voice analysis server 140 using a terminal (S301).
  • a predetermined feature key is used to connect the calling party 110 to the layered voice analysis server 140.
  • a call of the calling terminal is routed to the layered voice analysis server 140 via the MSC 120.
  • the calling party 110 may input 'a telephone number of the called party + an Internet connection key' into the calling terminal to connect to the layered voice analysis server 140.
  • a method in which the calling party 110 connects to the layered voice analysis server 140 is not limited in this regard, and may be variously modified according to service type.
  • the layered voice analysis server 140 may execute user authentication when the calling party 110 connects a call.
  • the user authentication is automatically made based on information of the calling party 110, however the present invention is not limited in this regard.
  • the layered voice analysis server 140 may inform the calling party 110 of the voice analysis service.
  • the voice analysis service is announced in the form of a voice notification, and the notification may include a brief description of the voice analysis service and a description of service charge information.
  • the layered voice analysis server 140 connects a call to the called party 170 based on the telephone number of the called party 170 to establish a traffic channel between the calling party 110 and the called party 170 (S303).
  • the layered voice analysis server 140 separates the audio file of the called party 170 from the conversation between the calling party 110 and the called party 170 (S305). At this time, the layered voice analysis server 140 records and stores the conversation between the calling party 110 and the called party 170, and then separates only the audio file of the called party 170 from the conversation.
  • the layered voice analysis server 140 extracts a predetermined voice parameter from the separated audio file of the called party 170 (S307).
  • the audio file is a 6/8/11 kHz, 8/16-bit mono/stereo uncompressed pulse code modulation (PCM) format WAV file.
  • the layered voice analysis server 140 executes voice analysis based on the extracted voice parameter to draw a voice analysis result (S309).
  • the voice analysis result includes an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an anticipation level, an embarrassment level or a love level.
  • the voice analysis result may include a lie detection level (in other words, a reliability level) calculated by synthesizing the above-mentioned levels.
  • the voice analysis result includes a love level, a concentration level, an anticipation level, and an embarrassment level.
  • the concentration level analyzed by the layered voice analysis server 140 is drawn in the range of 0 to 100.
  • the range of 0 to 44 means a low concentration level or perplexity
  • the range of 45 to 65 means a normal state
  • the range of 66 to 100 means a high concentration level.
  • the concentration level may be indicated by percentage or classified into very low, low, middle, high and very high according to a predetermined standard, and may be provided in text or a graph.
  • the anticipation level is drawn in the range of 0 to 100.
  • the range of 0 to 30 means a normal state, and the range of 31 to 100 means high anticipation to an opponent and may suggest an attempt of deceit.
  • the anticipation level may be indicated by percentage or classified into low, middle and high according to a predetermined standard, and may be provided in text or a graph.
  • the embarrassment level is drawn into five classified stages in the range of 0 to 100.
  • the embarrassment level 0 means no embarrassment
  • the embarrassment level 25 means a slight embarrassment
  • the embarrassment level 50 means a normal state
  • the embarrassment level 75 means a considerable embarrassment
  • the embarrassment level 100 means very much embarrassment.
  • the embarrassment level may be indicated by percentage or classified into very low, low, middle, high and very high according to a predetermined standard, and may be provided in text or a graph.
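A minimal sketch of mapping a 0 to 100 level onto the five named bands. The text names the categories but not the cut points, so the equal-width bands below are assumptions.

```python
# Minimal sketch (assumption): classifying a 0-100 level into the five named
# bands. Cut points are illustrative equal quintiles, not from the patent.
def band(level: float) -> str:
    """Classify a 0-100 level as very low / low / middle / high / very high."""
    for cutoff, name in [(20, "very low"), (40, "low"), (60, "middle"), (80, "high")]:
        if level < cutoff:
            return name
    return "very high"

assert band(10) == "very low" and band(50) == "middle" and band(95) == "very high"
```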
  • the love level is drawn in the range of -10 to 50.
  • the love level of -10 to 0 and 1 to 10 means that love is not detected from the called party 170.
  • the love level of 11 to 50 means that love is detected from the called party 170.
  • the love level may be provided by percentage, as sketched below. In the case that the love level is -10 to 0, the percentage of the love level is calculated by multiplying the corresponding love level by 0, and in the case that the love level is 1 to 50, the percentage of the love level is calculated by multiplying the corresponding love level by 2.
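The percentage rule above in code form: levels of -10 to 0 are multiplied by 0, and levels of 1 to 50 are multiplied by 2, so the full range maps onto 0 to 100 percent.

```python
# The percentage conversion stated above: -10..0 maps to 0%, 1..50 maps to 2..100%.
def love_percentage(level: int) -> int:
    """Convert a raw love level (-10..50) to a percentage (0..100)."""
    if level <= 0:                 # -10..0: love not detected, reported as 0%
        return 0
    return min(level * 2, 100)     # 1..50 -> 2..100

assert love_percentage(-5) == 0 and love_percentage(25) == 50 and love_percentage(50) == 100
```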
  • the voice analysis result drawn by analyzing the voice of the called party 170 is provided in real time to the calling party 110 that is on a call (S311).
  • the voice analysis result that is transmitted in real time to the calling party 110 on a call is the love level calculated by percentage among the above-mentioned levels.
  • the voice analysis result may be provided to the calling party 110 on a call by a predetermined period (for example, at an interval of ten seconds), or provided when the love level exceeds a predetermined critical value.
  • the voice analysis result transmitted in real time to the calling party 110 on a call may be contents corresponding to the love level. In the case that the provided contents are for example, a music file, the corresponding music may be provided as a background music during the call between the calling party 110 and the called party 170.
  • the layered voice analysis server 140 provides the calling party 110 with the voice analysis result that is drawn by analyzing the voice of the called party 170 and stored, as a short message or a multimedia message or through a wire or wireless Internet (S313).
  • the voice analysis result that is provided as a short message or a multimedia message or through a wire or wireless Internet after the call ends includes all of the love level, the concentration level, the anticipation level and the embarrassment level.
  • the voice analysis result that is provided as a short message or a multimedia message or through a wire or wireless Internet is displayed variously, for example, in text or a graph.
  • the layered voice analysis server 140 may provide only the voice analysis result (i.e. a graph) of the called party 170, or provide the voice analysis result together with the whole conversation. That is, in the case that the layered voice analysis server 140 provides only the voice analysis result of the called party 170, the calling party 110 can check the change of emotion of the called party 170, but cannot check where in the conversation the change of emotion occurs. Thus, the layered voice analysis server 140 provides the change of emotion of the called party 170 together with the whole conversation. While replaying the whole conversation, the layered voice analysis server 140 synchronizes the whole conversation with the voice analysis result at the start time of the voice of the called party 170 to provide the voice analysis result in a graph (see the sketch below and FIG. 14).
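A minimal sketch of the synchronization idea: each analysis point is stamped with the offset at which the called party's speech segment starts, and during replay only the points whose segments have begun are drawn. The data layout is an assumption.

```python
# Minimal sketch (assumption): synchronizing recorded-call playback with the
# analysis results. Timestamps are offsets in seconds from the start of the call.
from dataclasses import dataclass

@dataclass
class AnalysisPoint:
    start_s: float      # offset where the called party's speech segment begins
    love_level: int     # result drawn for that segment

def points_to_show(points: list, playback_pos_s: float):
    """Return all analysis points whose segments have started by the playhead."""
    return [p for p in points if p.start_s <= playback_pos_s]

timeline = [AnalysisPoint(2.4, 12), AnalysisPoint(9.1, 30)]
assert [p.love_level for p in points_to_show(timeline, 10.0)] == [12, 30]
```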
  • the voice analysis service may be charged in proportion to the duration of the call or the number of uses.
  • the billing method may differ according to the mobile communication service provider or a separate service provider that provides the voice analysis service; the present invention is not limited in this regard.
  • FIG. 4 is a flow chart illustrating a voice analysis service method according to another embodiment of the present invention.
  • the voice analysis service may be provided in an actual mode and an exercise mode according to an object for voice analysis.
  • the actual mode is such that voices of the calling party and the called party are analyzed during an actual call and a voice analysis result is provided.
  • the exercise mode is such that the voice of the calling party who speaks by telephone with a virtual opponent is analyzed and a voice analysis result is transmitted to the corresponding calling party. The exercise mode is described below with reference to FIG. 4.
  • the calling party 110 connects to the layered voice analysis server 140 by a terminal (S401).
  • a predetermined feature key may be used to connect the calling party 110 to the layered voice analysis server 140 for love detection in the exercise mode.
  • the calling party 110 inputs "a feature code for voice analysis (*007) + a telephone number of the called party" into the terminal.
  • the layered voice analysis server 140 informs the calling party 110 of an actual mode and an exercise mode, and receives selection of the exercise mode from the calling party 110.
  • the calling party 110 may directly input a feature code for the exercise mode and the telephone number of the called party into the terminal.
  • the layered voice analysis server 140 may not inform the calling party 110 of the exercise mode.
  • the calling party 110 may connect to the layered voice analysis server 140 for voice analysis in the exercise mode in various manners according to service type.
  • the layered voice analysis server 140 may execute user authentication when the calling party 110 connects a call.
  • the user authentication is automatically made based on information of the calling party 110, however the present invention is not limited in this regard.
  • the layered voice analysis server 140 may inform the calling party 110 of the voice analysis service.
  • the voice analysis service is announced in the form of a voice notification, and the notification may include a brief description of the voice analysis service and service charge information.
  • the layered voice analysis server 140 receives selection of an opponent for exercise from the calling party 110 (S403).
  • the opponent for exercise is selected such that voice notification is provided to the calling party 110 and a response thereto is received from the calling party 110.
  • the opponent for exercise may include mother or father, a brother or a sister, a girl friend or a boy friend, a one-sided love interest, or a friend.
  • the layered voice analysis server 140 receives selection of an exercising method suitable for the opponent for exercise from the calling party 110 (S405).
  • the exercising method is selected in the same manner as the opponent for exercise.
  • the exercising method may include persuasion, consolation or confession.
  • the calling party 110 selects a one-sided love interest as the opponent for exercise and confession as the exercising method. This means that the calling party 110 secretly loves a person, wants to declare his/her love to the one-sided love interest, and intends to exercise the confession of his/her love prior to an actual confession.
  • thus, when the calling party 110 actually speaks by telephone with the one-sided love interest, the calling party 110 may reduce a potential mistake or misunderstanding.
  • after the layered voice analysis server 140 receives the selection of the opponent for exercise and the exercising method, the layered voice analysis server 140 receives the voice of the calling party 110 (S407). Preferably, while receiving the voice of the calling party 110, the layered voice analysis server 140 records the voice of the calling party 110.
  • the layered voice analysis server 140 extracts a predetermined voice parameter from an audio file of the calling party 110 (S409).
  • the audio file of the calling party 110 is a 6/8/11 kHz, 8/16-bit mono/stereo uncompressed pulse code modulation (PCM) format WAV file.
  • the layered voice analysis server 140 executes voice analysis based on the extracted voice parameter to draw a voice analysis result (S411).
  • the voice analysis is the same as the voice analysis described with reference to FIG. 3, and thus the detailed description thereof is omitted.
  • the voice analysis result drawn by analyzing the voice of the calling party 110 is provided in real time to the calling party 110 who is on a call (S413).
  • the voice analysis result that is transmitted in real time to the calling party 110 is a love level calculated by percentage.
  • the voice analysis result may be provided to the calling party 110 on a call by a predetermined period (for example, at an interval of ten seconds), or provided when the love level exceeds a predetermined critical value.
  • the voice analysis result transmitted in real time to the calling party 110 on a call may be contents corresponding to the love level.
  • the voice analysis result may be provided differently according to service type.
  • the layered voice analysis server 140 provides the calling party 110 with the voice analysis result that is drawn by analyzing the voice of the calling party 110 and stored, as a short message or a multimedia message or through a wire or wireless Internet.
  • the voice analysis result that is provided as a short message or a multimedia message or through a wire or wireless Internet after the call ends includes all of the love level, the concentration level, the anticipation level and the embarrassment level.
  • the voice analysis result that is provided as a short message or a multimedia message or through a wire or wireless Internet is displayed variously, for example, in text or a graph.
  • the voice analysis service may be charged in proportion to the duration of the call or the number of uses.
  • the billing method may differ according to the mobile communication service provider or a separate service provider that provides the voice analysis service; the present invention is not limited in this regard.
  • the layered voice analysis server 140 may establish a call between the calling party 110 and the called party 170 corresponding to the opponent for exercise to provide the voice analysis service in the actual mode.
  • the layered voice analysis server 140 checks only whether the calling party 110 intends to use the voice analysis service in the actual mode and establishes a call between the calling party 110 and the called party 170.
  • the layered voice analysis server 140 receives selection of the called party 170 from the calling party 110 and connects a call between the calling party 110 and the called party 170.
  • the layered voice analysis server 140 performs the steps S305 to S313 described with reference to FIG. 3.
  • FIG. 5 is a flow chart illustrating a voice analysis service method according to still another embodiment of the present invention, and describes a method for detecting a false report in an emergency call service.
  • the layered voice analysis server 140 is installed in a public institution such as a police station or a fire station and analyzes the voice of an emergency caller to detect a false report.
  • the calling party 110 who intends to make an emergency call to a public institution such as a police station or a fire station inputs an emergency call key or an emergency call telephone number into a terminal to make an emergency call.
  • the emergency call of the calling party 110 is input into the layered voice analysis server 140 (S501).
  • the layered voice analysis server 140 establishes a traffic channel between a telephone operator and the calling party 110 (S503).
  • the layered voice analysis server 140 separates an audio file of the calling party 110 from a conversation between the calling party 110 and the telephone operator (S505). At this time, the layered voice analysis server 140 records and stores the conversation between the calling party 110 and the telephone operator, and separates only the audio file of the calling party 110 from the conversation.
  • the layered voice analysis server 140 extracts a predetermined voice parameter from the separated audio file of the calling party 110 (S507).
  • the audio file is a 6/8/11 kHz, 8/16-bit mono/stereo uncompressed pulse code modulation (PCM) format WAV file.
  • the layered voice analysis server 140 executes voice analysis based on the extracted voice parameter to draw a voice analysis result (S509). That is, the layered voice analysis server 140 draws an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an anticipation level, an embarrassment level or a love level, and synthesizes the above-mentioned levels to calculate a lie detection level (in other words, a reliability level).
  • the layered voice analysis server 140 judges whether the report of the calling party 110 is false (S511).
  • the layered voice analysis server 140 draws a lie detection level of the calling party 110 in the range of 0 to 100, and in the case that the lie detection level exceeds 50, judges the report as false (see the sketch below).
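A minimal sketch of the decision described above and of the warning flow described in the following steps. The action names are hypothetical; only the 0 to 100 range and the threshold of 50 come from the text.

```python
# Minimal sketch (assumption): the false-report decision, with the habitual
# false-reporter handling described below. Action names are hypothetical.
def judge_report(lie_level: float, prior_false_reports: int) -> str:
    """Return the action for an emergency call given its lie detection level."""
    if lie_level <= 50:
        return "dispatch_and_send_emergency_guidance"
    if prior_false_reports > 0:      # habitual false reporter
        return "send_warning_with_false_report_count_and_legal_notice"
    return "send_false_report_warning"

assert judge_report(30, 0) == "dispatch_and_send_emergency_guidance"
assert judge_report(80, 2) == "send_warning_with_false_report_count_and_legal_notice"
```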
  • the layered voice analysis server 140 extracts information of the calling party 110, stores the extracted information into the storage device 270, and provides the calling party 110 with information appropriate to the emergency situation (S513).
  • the information according to emergency situation may be provided as a message.
  • the calling party 110 receives the information according to the emergency situation and deals with the emergency situation prior to the arrival of emergency staff or a police officer.
  • the layered voice analysis server 140 maintains the call between the calling party 110 and the telephone operator.
  • the layered voice analysis server 140 extracts information of the calling party 110, stores the extracted information into the storage device 270, and transmits to the calling party 110 a warning message against the false report based on the information of the calling party 110 (S515).
  • the layered voice analysis server 140 judges whether the calling party 110 is a habitual offender based on the information of the calling party 110 (S517). In the case that the calling party 110 is a habitual offender, the layered voice analysis server 140 further transmits to the calling party 110 a warning message including the frequency of false reports and a notice of legal measures (S519). This may make the false reporter aware of the false report and prevent habitual false reports.
  • FIG. 6 is a block diagram illustrating a mobile communication server 600 for providing an emotional state analysis service during video communication according to an embodiment of the present invention.
  • the mobile communication server 600 for providing an emotional state analysis service during video communication includes a communication unit 610, a service control unit 620, a video communication unit 630, a voice analysis engine 640, and an image integrating unit 650.
  • the communication unit 610 receives from a calling terminal a call connection request signal that requests a call service of a specific feature.
  • the call connection request signal includes a feature code having a specific feature, a receiving telephone number, and an identifier for identifying a video communication and an audio communication.
  • the communication unit 610 connects a call to a called terminal to establish a traffic channel between the calling terminal and the called terminal.
  • the mobile communication server 600 performs a series of operations for providing a call service of a specific feature requested by the calling party, and transmits a result to the calling terminal through the communication unit 610.
  • the call service of a specific feature may include various services, however this embodiment focuses on a call service for analyzing and displaying an emotional state of an opponent during video communication.
  • a video communication identifier included in the call connection request signal is indicated as '*'
  • a feature code used to analyze the emotional state of the opponent and display the emotional state on a screen of the calling terminal is indicated as '001'.
  • the service control unit 620 transmits a control signal to the video communication unit 630 according to the video communication identifier '*' included in the call connection request signal received from the communication unit 610. And, the service control unit 620 transmits a control signal to the voice analysis engine 640 according to the feature code '001' included in the call connection request signal.
  • the video communication unit 630 requests video communication call connection to the called terminal according to the control signal transmitted from the service control unit 620, and when the video communication call is connected in response to the request, starts video communication between the calling terminal and the called terminal.
  • the voice analysis engine 640 extracts voice of the called party and analyzes an emotional state of the called party.
  • the voice analysis engine 640 includes an analysis object separating unit 641, a voice extracting unit 643 and a voice analysis unit 645.
  • the analysis object separating unit 641 of the voice analysis engine 640 separates the voice of the called party from a conversation between the calling party and the called party.
  • the voice extracting unit 643 extracts a predetermined voice parameter from an audio file of an object for voice analysis that is separated by the analysis object separating unit 641.
  • the voice analysis unit 645 draws a voice analysis result using up to eight voice analysis formulas based on the voice parameter extracted by the voice extracting unit 643.
  • the voice analysis result includes an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an anticipation level, an embarrassment level, or a love level.
  • the voice analysis unit 645 selectively provides a portion or all of the above-mentioned levels according to service type. And, the voice analysis unit 645 may synthesize the above-mentioned levels to calculate another type level, for example a lie detection level. Preferably, the voice analysis unit 645 provides the above-mentioned levels numerically or in a graphic or text.
  • the voice analysis engine 640 may further include an emotional analysis control unit (not shown) for periodically detecting an emotional analysis start signal or an emotional analysis stop signal from the video communication unit 630 to control the emotional analysis service before operation of the analysis object separating unit 641.
  • the emotional analysis control unit may stop or restart the operation of the voice analysis engine 640 at the request of the calling party. Therefore, the calling party may selectively request or stop the emotional analysis service during video communication. In other words, the emotional analysis control unit periodically detects an operation start signal or an operation stop signal of the voice analysis engine 640 from the video communication unit 630 to control the operation of the voice analysis engine 640.
  • the calling party selects a video communication service for emotion analysis, and then presses a '*' key to stop the emotion analysis of the opponent and use only the video communication service. And, the calling party presses the '*' key during video communication to restart the stopped emotional analysis of the opponent.
  • the '*' key is different from the above-mentioned video communication identifier, and has such a toggle function that after the video communication starts, when the '*' key is pressed during operation of the voice analysis engine 640, the operation of the voice analysis engine 640 stops, and when the '*' key is pressed again, the stopped operation of the voice analysis engine 640 restarts.
  • for example, a '1' key is assigned to start operation of the voice analysis engine 640, a '2' key to stop operation of the voice analysis engine 640, and a '3' key to contents addition of the voice analysis engine 640, thereby receiving various service requests from the calling party in real time (see the sketch below).
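A minimal sketch of the in-call key handling described above: '*' toggles the engine, and '1', '2' and '3' map to start, stop and contents addition. The return values and engine interface are hypothetical.

```python
# Minimal sketch (assumption): '*' toggles the analysis engine on/off, while
# '1'/'2'/'3' map to start, stop, and content addition. Return values are
# hypothetical status strings, not part of the patent.
class AnalysisKeyHandler:
    def __init__(self):
        self.running = True          # analysis starts with the video call

    def on_key(self, key: str) -> str:
        if key == "*":               # toggle: stop if running, restart if stopped
            self.running = not self.running
        elif key == "1":
            self.running = True
        elif key == "2":
            self.running = False
        elif key == "3":
            return "add_contents"    # attach avatar/character to the result
        return "analysis_on" if self.running else "analysis_off"

h = AnalysisKeyHandler()
assert h.on_key("*") == "analysis_off" and h.on_key("*") == "analysis_on"
```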
  • the image integrating unit 650 integrates a video transmitted from the video communication unit 630 with the emotional state analysis result (for example, text or a graph indicating various levels) transmitted from the voice analysis unit 645 at a predetermined ratio and transmits the integrated image.
  • the emotional state analysis result for example, text or a graph indicating various levels
  • the video transmitted from the video communication unit 630 and the emotional state analysis result (for example, text or a graph indicating various levels) transmitted from the voice analysis unit 645 each may be transmitted to the calling terminal without the image integrating unit 650 and displayed on a divided screen of the calling terminal.
  • the image and the emotional state of the opponent are displayed separately on the divided screen of the calling terminal.
  • FIG. 7 is a block diagram illustrating a mobile communication server 700 for providing an emotional state analysis service during video communication according to another embodiment of the present invention.
  • a component having the same reference number as that of FIG. 6 executes the same function, and thus the detailed description thereof is omitted.
  • the mobile communication server 700 for providing an emotional state analysis service during video communication further includes a contents storage unit 760, an analysis result storage unit 770, and a message generating unit 780.
  • the contents storage unit 760 stores various contents corresponding to the voice analysis result of the voice analysis unit 645 therein.
  • the voice analysis unit 645 selects contents corresponding to the voice analysis result from the contents storage unit 760.
  • the contents include various visual data, so that the calling party can easily recognize the voice analysis result.
  • the contents storage unit 760 stores various characters, avatars of various expressions, emoticons, flash or moving images corresponding to the voice analysis result such as a love level or a stress level.
  • the voice analysis unit 645 selects as the voice analysis result characters, avatars or multimedia contents that are stored beforehand in the contents storage unit 760, and properly uses the characters, avatars or multimedia contents according to circumstances. Therefore, the calling party can check the emotional state of the opponent using the visual data, such as a graph, a character or an avatar, that is selected from the contents storage unit 760. The calling party can check the emotional state of the opponent and transmit to the opponent moving images or multimedia contents corresponding to the emotional state.
  • the message generating unit 780 prepares a multimedia message (MMS) including the voice analysis result analyzed by the voice analysis unit 645 and the contents selected from the contents storage unit 760 by the voice analysis unit 645, and transmits the multimedia message to the calling terminal or the called terminal.
  • the analysis result storage unit 770 stores the voice analysis result analyzed by the voice analysis unit 645.
  • the voice analysis result stored in the analysis result storage unit 770 is provided through a web service using a wire or wireless Internet. A user may see the voice analysis result through the web service.
  • FIG. 8 is a flow chart illustrating a method for providing a voice analysis service during video communication according to an embodiment of the present invention.
  • the calling party transmits a call for video communication with the called party, and the mobile communication server receives the video communication call transmitted from the calling party (S801).
  • the call transmitted from the calling party includes the video communication identifier '*' and the feature code '001' for the voice analysis service.
  • when receiving the video communication call transmitted from the calling party, the mobile communication server starts video communication between the calling party and the called party according to the identifier '*' included in the video communication call (S803).
  • the mobile communication server separates the voice of the called party from the voice conversation between the calling party and the called party based on the feature code '001' included in the video communication call (S805).
  • the mobile communication server analyzes the separated voice of the called party, draws an emotional state of the called party, and then converts the drawn emotional state of the called party into visual data (S807). That is, the mobile communication server extracts a voice parameter from the voice of the called party and executes voice analysis based on the extracted voice parameter to draw the emotional state result of the called party. And, the mobile communication server converts the drawn emotional state result into visual data.
  • the emotional state result includes an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an anticipation level, an embarrassment level, or a love level.
  • the emotional state result is converted into visual data, for example a percentage graph.
  • the mobile communication server may select multimedia contents, for example moving images, avatars or characters as the visual data of the emotional state result.
  • the mobile communication server integrates the visual data of the emotional state of the called party with the video of the called party at a predetermined ratio, and transmits the integrated image to the calling party (S809), as sketched below.
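A minimal sketch of integrating the emotional-state visual with the video frame "at a predetermined ratio"; here the overlay simply occupies a fixed fraction of the frame height. The 0.25 ratio and the nearest-neighbor resize are assumptions.

```python
# Minimal sketch (assumption): compositing the emotional-state overlay into
# the bottom portion of the video frame. Frames are HxWx3 uint8 arrays.
import numpy as np

def integrate(frame: np.ndarray, overlay: np.ndarray, ratio: float = 0.25) -> np.ndarray:
    """Paste the overlay into the bottom `ratio` of the video frame."""
    h, w, _ = frame.shape
    oh = int(h * ratio)
    out = frame.copy()
    # Nearest-neighbor resize of the overlay to (oh, w)
    ys = (np.arange(oh) * overlay.shape[0] // oh).clip(0, overlay.shape[0] - 1)
    xs = (np.arange(w) * overlay.shape[1] // w).clip(0, overlay.shape[1] - 1)
    out[h - oh:, :] = overlay[ys][:, xs]
    return out
```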
  • the calling party may stop the emotional state analysis.
  • the calling party inputs a key for stopping the emotional state analysis into a terminal.
  • the mobile communication server reads the key signal received from the terminal of the calling party, and in the case that the key signal indicates stop of the emotional state analysis, stops the ongoing voice analysis. And, after the calling party stops the emotional state analysis, the calling party may request the emotional state analysis again. In the same manner, the calling party inputs a key for requesting the emotional state analysis into the terminal.
  • the mobile communication server reads the key signal received from the terminal of the calling party, and in the case that the key signal indicates a request of the emotional state analysis, performs the stopped emotional state analysis again and provides the emotional state analysis result to the calling party.
  • the mobile communication server may prepare the emotional state analysis result of the called party as an MMS message, and transmit the MMS message to the calling party.
  • the mobile communication server attaches to the MMS message multimedia contents corresponding to the emotional state analysis result of the called party, for example avatars, characters or moving images, and transmits the MMS message to the calling party.
  • FIG. 9 is a view illustrating an MMS message provided as a voice analysis result according to an embodiment of the present invention.
  • the multimedia message includes call information 900, a final love level 910, other detection levels 920 and a mobile phone Internet service URL 930.
  • the call information 900 schematically shows detailed information of the voice analysis call between the calling party and the called party; it includes a telephone number of the called party, the date of the call and the duration of the call, and may further include an image of the love level.
  • the final love level 910 may be displayed in a bar graph and by percentage as shown in FIG. 9.
  • the bar graph type final love level 910 may have a heart-shaped flash effect or a graph visualizing effect for good visibility, and may be expressed in different colors according to predetermined ranges. This visual effect may differ according to the mobile communication service provider or a separate service provider that provides the voice analysis service.
  • the other detection level 920 includes an embarrassment level, a concentration level and an anticipation level as a real time analysis result according to love detection. As shown in FIG. 9, each of these levels is indicated as a table of a predetermined shape, with a color or an image displayed according to the set range.
  • the mobile phone Internet service URL 930 is used for a detailed inquiry into, and contents provided for, the final result; when a user inputs a call button (not shown), a wireless Internet browser included in the mobile phone connects to a server corresponding to the URL to retrieve the detailed detection result.
  • FIG. 10 is a view illustrating an SMS message provided as a love detection result according to an embodiment of the present invention.
  • the love detection result provided as an SMS message may be provided in the type of a basic information 950a as shown in (a) of FIG. 10, or in the type of a detailed information 950b as shown in (b) of FIG. 10.
  • the basic information type 950a may include a telephone number of the called party as an object for voice analysis, a love level of the corresponding called party to the calling party, and a schematic display for connection to a detailed inquiry 960.
  • the detailed information type 950b includes a love level indicated by percentage, and an embarrassment level, a concentration level and an anticipation level indicated in text (970).
  • the detailed information type 950b provides a more detailed love detection result than the basic information type 950a shown in (a) of FIG. 10.
  • when the calling party wants a detailed inquiry from the SMS message, which may be provided as the basic information type 950a or the detailed information type 950b, the calling party inputs a call button or an Internet button 980 of the terminal. Then, the calling party is connected to a callback URL included in the SMS message to receive the more detailed love detection result.
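For illustration, the two SMS result types could be composed as in the sketch below. The fields (telephone number, love level, the three other levels, callback URL) come from the description above; the exact layout, labels and URL are assumptions.

```python
def basic_sms(number: str, love: int, callback_url: str) -> str:
    """Basic information type (950a): number, love level, detail link."""
    return f"[007call] {number} love {love}%  details: {callback_url}"

def detailed_sms(number: str, love: int, embarrassment: str,
                 concentration: str, anticipation: str, callback_url: str) -> str:
    """Detailed information type (950b): adds the other levels in text."""
    return (f"[007call] {number}\n"
            f"love {love}%\n"
            f"embarrassment: {embarrassment}, concentration: {concentration}, "
            f"anticipation: {anticipation}\n"
            f"details: {callback_url}")

print(basic_sms("010-1234-5678", 62, "http://example.com/r/1"))
```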
  • FIG. 11 is a view illustrating a process for providing a voice analysis result through a mobile phone Internet according to an embodiment of the present invention, and a user may connect to the callback URL included in the SMS message or MMS message provided after the call ends, or directly connect to a wireless Internet.
  • a wireless Internet browser included in the mobile phone displays a service screen for connection to a love detection menu. Then, the calling party selects "007call" 1000 for a love detection service among predetermined services displayed on the browser, and moves to a subordinate screen.
  • the mobile terminal of the calling party connects to the server, downloads a subordinate screen and displays the subordinate screen to the calling party.
  • the corresponding subordinate menu screen includes a recent call history menu
  • the subordinate menu may enable the calling party to check information and a love level of the called party to whom the calling party made a call in recent days.
  • a recent call history 1020 is arranged and displayed in a predetermined manner.
  • the recent call history 1020 may include a telephone number and a love level of the called party.
  • information of the called party 1070 is arranged and displayed in a descending order of love level.
  • when the calling party selects an arbitrary called party in the recent call history 1020 or the information of called party 1070, the calling party can check a detailed love detection result 1040 of the corresponding called party.
  • FIG. 12 is a view illustrating a screen providing a love detection result through a web service according to an embodiment of the present invention, and shows a result screen that the calling party receives through a web after the calling party ends the love detection call.
  • the calling party connects to a web server that provides the analysis result of the love detection service of the present invention through a web. After the calling party connects to the web server, the calling party moves to a menu for checking the analysis result to check the detailed analysis result.
  • This connection step corresponds to a general web service, and thus the detailed description is omitted. (a) of FIG. 12 shows a web page displayed when the calling party connects to the web server and selects the recent call history menu.
  • the recent call history web page includes a telephone number 1100 of the called party to whom the calling party made a call in recent days, a date of call 1105, a love level 1110 of the called party, a detailed inquiry 1115, and contents 1120.
  • the recent call history web page including the above-mentioned lists may be different according to service type. For example, in the case that the calling party detects love of the called party by an Internet telephone using a notebook, an IP number of the notebook and an Internet telephone number may be displayed in the telephone number 1100.
  • in the date of call 1105, the date of the call between the calling party and the called party is displayed, and the total call duration of the calling party may be further displayed.
  • in the love level 1110, a love level of the called party corresponding to the telephone number 1100 may be displayed as a predetermined image or value.
  • in the detailed inquiry 1115, a text having a hyperlink is displayed, and when the calling party selects the text, the detailed content of the text is displayed as a pop-up window.
  • in the contents 1120, a conversation file, a music file or a text file that may be provided according to service type is linked. The contents 1120 may or may not be displayed according to the existence of contents.
  • the calling party may check a love level result value 1125, a change of emotion during a call 1130 and a change of love level 1135 of an arbitrary called party through a pop-up window.
  • as shown in (b) of FIG. 12 illustrating the detailed inquiry screen, in the love level result value 1125, a love level, a concentration level, an anticipation level and an embarrassment level are displayed in the type of a graph or a table 1140, and a final result 1145 is displayed in the type of text or an image.
  • the love level result value 1125 displayed in a graph or the table 1140 and the final result 1145 displayed in text or an image may be displayed differently according to a mobile communication service provider or a separate service provider that provides the love detection service, and thus the display method is not limited in this regard.
  • the contents 1120, the change of emotion during a call 1130 and the change of love level 1135 are described with reference to FIG. 13.
  • FIG. 13 is a view illustrating a screen providing a love detection result through a web service according to another embodiment of the present invention, and the change of emotion during a call 1130, the change of love level 1135 and the contents 1120 of the detailed inquiry 1115 shown in FIG. 12 are described with reference to FIG. 13.
  • the change of emotion during a call 1130 of the detailed inquiry 1115 displays, according to time, a love level, a concentration level, an anticipation level and an embarrassment level that are detected during a call between the calling party and the called party, and as shown in (a) of FIG. 13, the change of love level 1135 displays, according to period, the love level, the concentration level, the anticipation level and the embarrassment level that are measured during a predetermined period.
  • the change of emotion during a call 1130 and the change of love level 1135 are displayed in a graph of broken line 1150, however the present invention is not limited in this regard.
  • the change of emotion during a call 1130 and the change of love level 1135 provide the love detection result according to time and a predetermined period, respectively, so that the calling party can easily check the love detection result at an arbitrary time or period.
  • a user may designate an arbitrary time or period in the change of emotion during a call 1130 and the change of love level 1135, respectively, and retrieve the corresponding results 1155.
  • the contents 1120 may include a voice recorded file 1160 in which voice of an object for voice analysis is recorded, or the frequency of detection 1165 corresponding to the analysis result.
  • the voice recorded file 1160 storing a conversation between the calling party and the called party may be erased by selection of the calling party.
  • a graph that may be displayed when replaying the voice recorded file 1160 may be the love detection analysis result of the called party as an object for voice analysis.
  • the final love level may be displayed in the type of text, and the concentration level, anticipation level, embarrassment level and stress level may be displayed in the type of a graph.
  • the contents 1120 corresponding to the analysis result of the called party may be displayed during replay of the voice recorded file 1160.
  • the contents 1120 may include an advice, an atmosphere producing method or a music according to service type, and may be provided in the type of a hyperlink.
  • in the frequency of detection 1165, the frequency of detection of each level and the final love level may be displayed in the type of a bar graph, and the display method may differ according to service type.
  • the calling party may check, through the graph or the frequency of detection, at which words of the corresponding called party the love level is high or low, and receive the corresponding contents 1120.
  • the voice analysis result provided through the web service that is described with reference to FIG. 13 replays only the voice of the called party and provides the voice analysis result of the called party in the type of a graph.
  • the calling party can check the change of emotion of the called party, but cannot check how emotion of the called party is changed at which words of the calling party. Therefore, in another embodiment, the present invention provides the voice analysis result of the called party at the start time of voice of the called party in the type of a graph while replaying the whole conversation between the calling party and the called party. For this purpose, a technique is required to synchronize the conversation between the calling party and the called party with the voice analysis result of the called party.
  • FIG. 14 is a view illustrating an embodiment of a method for synchronizing the whole conversation between the calling party and the called party and the voice analysis result of the called party.
  • the layered voice analysis server 140 records the whole conversation during the call between the calling party and the called party, separates the voice of the called party, executes love detection analysis in the above-mentioned method, and stores the love detection analysis result. (a) of FIG. 14 shows the conversation between the calling party (A) and the called party (B) on a time axis, and (b) of FIG. 14 shows the separated voice of the called party (B).
  • the layered voice analysis server 140 analyzes the separated voice of the called party (B).
  • the voice analysis is made by unit of voice of the called party (B).
  • the conversation of the called party (B) includes four voice units, the love detection analysis is made for each voice unit, and the four voice analysis results are divisionally stored as segments (1seg, 2seg, 3seg and 4seg).
  • the layered voice analysis server 140 checks a start time and an end time of each voice of the called party based on a call start time, and records a voice start time (time pointer) and a voice continuation time (segment length) into a time stamp field of each of the voice analysis result segments.
  • for example, when a voice of the called party starts 1 second after the call start time and continues for 1 second, the layered voice analysis server 140 records 1 as the voice start time (time pointer) and 1 as the voice continuation time (segment length) into the time stamp field of the corresponding voice analysis result segment.
  • in the time stamp field of the first voice analysis result segment of the called party (B), 3 is recorded as the voice start time (time pointer) and 1 is recorded as the voice continuation time (segment length).
  • in the time stamp field of the second voice analysis result segment of the called party (B), 7 is recorded as the voice start time (time pointer) and 1 is recorded as the voice continuation time (segment length).
  • in the time stamp field of the third voice analysis result segment of the called party (B), 11 is recorded as the voice start time (time pointer) and 3 is recorded as the voice continuation time (segment length).
  • in the time stamp field of the fourth voice analysis result segment of the called party (B), 17 is recorded as the voice start time (time pointer) and 2 is recorded as the voice continuation time (segment length).
  • the layered voice analysis server 140 checks the voice start time and the voice continuation time based on the call start time, and records the voice start time and the voice continuation time into the time stamp field of the corresponding voice analysis result segment in the above-mentioned manner. Therefore, the layered voice analysis server 140 synchronizes the whole conversation with the voice analysis result of the called party based on a time information recorded in the time stamp field of the voice analysis result segment of the called party, and provides the synchronized result through the web service.
  • the layered voice analysis server 140 tracks time from the replay start time, and when the replay reaches the time recorded in the time stamp field of each of the voice analysis result segments of the called party, provides the love detection result of the corresponding voice analysis result segment in the type of a graph. Therefore, the calling party can check how the called party reacts to particular words of the calling party.
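The replay-time check described above reduces to a lookup over the stored segments. The sketch below uses the four time pointer/segment length pairs of FIG. 14 (3/1, 7/1, 11/3 and 17/2); the love levels stored per segment are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class AnalysisSegment:
    time_pointer: int    # voice start time, seconds from the call start
    segment_length: int  # voice continuation time, seconds
    love_level: int      # stored analysis result (illustrative values)

segments = [
    AnalysisSegment(3, 1, 34),   # 1seg
    AnalysisSegment(7, 1, 40),   # 2seg
    AnalysisSegment(11, 3, 28),  # 3seg
    AnalysisSegment(17, 2, 45),  # 4seg
]

def segment_at(replay_time: float):
    """Return the segment whose voice is active at the given replay time."""
    for seg in segments:
        if seg.time_pointer <= replay_time < seg.time_pointer + seg.segment_length:
            return seg
    return None  # the calling party is speaking, or silence

# Poll once a second during replay and show the graph when a segment starts.
for t in range(20):
    seg = segment_at(t)
    if seg is not None and t == seg.time_pointer:
        print(f"t={t}s: show graph (love level {seg.love_level})")
```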
  • the present invention analyzes an emotional state of the called party and provides the emotional state of the called party to the calling party in real time, thereby exciting the calling party's curiosity beyond a simple voice/video communication.
  • the present invention may noticeably increase call volume in a saturated voice call market, thereby increasing sales.
  • the present invention enables the calling party to speak with a virtual opponent prior to an actual voice communication, so that the calling party can check in real time whether he/she expresses his/her emotion properly, thereby reducing a mistake or misunderstanding that may occur during a call with an actual called party.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention relates to a system and method for providing a voice analysis service, and an apparatus therefor. The method for providing a voice analysis service in a communication network according to the present invention includes establishing a traffic channel between a calling terminal and a called terminal according to call connection of the calling terminal; performing layered voice analysis of voice of a called party transmitted from the called terminal to the calling terminal through the established traffic channel in real time to draw emotional state information of the called party; and transmitting the drawn emotional state information of the called party to the calling terminal in real time.

Description

METHOD AND SYSTEM FOR PROVIDING VOICE ANALYSIS SERVICE, AND APPARATUS THEREFOR
Technical Field
[1] The present invention relates to a mobile communication service, and in particular, to a method and system for analyzing voice of a user who is on a call based on layered voice analysis and providing an analysis result.
Background Art
[2] Generally, when listening to the voice of a person, a human can estimate the emotional state of that person to some extent. For example, when a person is excited or angry, the average pitch of his/her voice becomes higher than usual, and although he/she speaks a sentence composed of the same words as usual, the average pitch in the sentence changes more rapidly. And, as the voice becomes louder, energy becomes relatively higher than in the same environment as usual and the average energy changes relatively more rapidly.
[3] Based on this fact, studies are briskly in progress regarding a love detection technology for recognizing a human voice to understand an emotional state. The love detection technology includes an emotional state recognizing technology, a signal processing technology according to emotional state recognition, or an application technology. Prior arts related to the love detection technology are patented or pending, for example "emotion recognition method, and sensibility creating method, device, and software" (Korean Patent Application No. 10-2003-7003615), "method for providing sound source and avatars according to feelings on mobile" (Korean Patent Application No. 10-2003-0081299), "emotion recognition in voice using wavelet transformation" (Korean Patent Application No. 10-2002-0026056), "a doll for representing a user's emotion" (Korean Registration No. 20-0313090-0000) or "accessory capable of expressing one's emotions" (Korean Registration No. 20-0301592-0000).
[4] Recently, there was an invention titled "portable telephone having lie searching function and searching method therefor" (Korean Registration No. 10-0381970-0000), which detects a lie through recognition of voice of a talker. The invention induces the talker to speak in a comfortable atmosphere and judges the truthfulness of words of the talker.
[5] However, emotion analysis of the above-mentioned prior arts is made by subjective judgment or setting of a user, and thus its analysis result is not objective. And, an advice for the love detection result or a plan to overcome a current state is not suggested.
[6] Meanwhile, with development of a mobile communication technology, various additional services beyond a simple voice communication function are provided. That is, mobile communication service providers developed various additional services other than voice communication or a short message service, for example a personalized ring-back tone service, a call background sound service, a singing service or a music mail, and provide the additional services to mobile communication subscribers. Further, with development of the mobile communication technology, a third generation mobile communication network (for example, a W-CDMA (Wide-Code Division Multiple Access) network or an HSDPA (High-Speed Downlink Packet Access) network) is established to transmit large-capacity data at a high speed, which goes beyond a conventional voice communication service and allows video communication in which a calling party and a called party speak by telephone while directly seeing each other's faces.
[7] However, a mobile communication additional service using the love detection technology is not currently provided.
Disclosure of Invention
Technical Problem
[8] The present invention is designed to solve the problems of the prior art, and therefore it is an object of the present invention to provide a method and system for providing a voice analysis service, which analyze voice of a user who is on a call based on layered voice analysis and provide an emotional analysis result in various manners, and an apparatus therefor.
Technical Solution
[9] In order to achieve the above-mentioned objects, a method for providing a voice analysis service in a communication network according to a first aspect of the present invention, includes establishing a traffic channel between a calling terminal and a called terminal according to call connection of the calling terminal; performing layered voice analysis of voice of a called party transmitted from the called terminal to the calling terminal through the established traffic channel in real time to draw an emotional state information of the called party; and transmitting the drawn emotional state information of the called party to the calling terminal in real time.
[10] And, a method for providing a voice analysis service during video communication in a video communication network according to a second aspect of the present invention, includes establishing a video traffic channel between a calling terminal and a called terminal according to video communication call connection of the calling terminal; performing layered voice analysis of voice of a called party transmitted from the called terminal to the calling terminal through the established video traffic channel in real time to draw an emotional state information of the called party; converting the drawn emotional state information of the called party to a visual data; and transmitting and displaying to the calling terminal the converted visual data and a video of the called party to be transmitted to the calling terminal through the video traffic channel.
[11] And, a method for providing a voice analysis service in a communication network according to a third aspect of the present invention, includes receiving call connection for the voice analysis service from a calling terminal; notifying a mode information of the voice analysis service to the calling terminal and receiving a selection response to the notification from the calling terminal; in the case that a calling party selects an exercise mode, notifying a virtual opponent and an exercise method to the calling terminal and receiving a selection response to the notification from the calling terminal; receiving voice of the calling party from the calling terminal according to the virtual opponent and exercise method selected by the calling party; and performing layered voice analysis of the received voice of the calling party to draw a voice analysis result, and transmitting the drawn voice analysis result to the calling terminal in real time.
[12] And, a system for providing a voice analysis service in a communication network according to a fourth aspect of the present invention, includes a switching center for receiving a call origination including a feature code for the voice analysis service and a telephone number of a called party from a calling terminal, and routing the call origination based on the feature code; and a layered voice analysis server for receiving the routed call, establishing a traffic channel between the calling terminal and a called terminal corresponding to the telephone number of the called party, analyzing voice of the called party transmitted to the calling terminal through the established traffic channel to draw an emotional state information of the called party in real time, and transmitting the drawn emotional state information to the calling terminal in real time.
[13] And, a mobile communication device for providing a voice analysis service during video communication in a mobile communication network according to a fifth aspect of the present invention, includes a service control unit for controlling a service operation mode according to a video communication call connection signal from a calling terminal; a video communication unit for establishing a traffic channel between the calling terminal and a called terminal under the control of the service control unit to support video communication; and a voice analysis engine for analyzing voice of a called party transmitted to the calling terminal through the established traffic channel under the control of the service control unit to convert an emotional state of the called party to visual data.
Brief Description of the Drawings
[14] These and other features, aspects, and advantages of preferred embodiments of the present invention will be more fully described in the following detailed description, taken in conjunction with the accompanying drawings. In the drawings:
[15] FIG. 1 is a view illustrating a configuration of a voice analysis service system of a mobile communication network according to an embodiment of the present invention.
[16] FIG. 2 is a block diagram illustrating a layered voice analysis server of FIG. 1 according to an embodiment of the present invention.
[17] FIG. 3 is a flow chart illustrating a voice analysis service method according to an embodiment of the present invention.
[18] FIG. 4 is a flow chart illustrating a voice analysis service method according to another embodiment of the present invention.
[19] FIG. 5 is a flow chart illustrating a voice analysis service method according to still another embodiment of the present invention.
[20] FIG. 6 is a block diagram illustrating a mobile communication server for providing an emotional state analysis service during video communication according to an embodiment of the present invention.
[21] FIG. 7 is a block diagram illustrating a mobile communication server for providing an emotional state analysis service during video communication according to another embodiment of the present invention.
[22] FIG. 8 is a flow chart illustrating a method for providing a voice analysis service during video communication according to an embodiment of the present invention.
[23] FIG. 9 is a view illustrating an MMS message provided as a voice analysis result according to an embodiment of the present invention.
[24] FIG. 10 is a view illustrating an SMS message provided as a love detection result according to an embodiment of the present invention.
[25] FIG. 11 is a view illustrating a process for providing a voice analysis result through a mobile phone Internet according to an embodiment of the present invention.
[26] FIG. 12 is a view illustrating a screen providing a love detection result through a web service according to an embodiment of the present invention.
[27] FIG. 13 is a view illustrating a screen providing a love detection result through a web service according to another embodiment of the present invention.
[28] FIG. 14 is a view illustrating an embodiment of a method for synchronizing the whole conversation between a calling party and a called party with a voice analysis result of the called party.
Best Mode for Carrying Out the Invention
[29] Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. Prior to the description, it should be understood that the terms used in the specification and the appended claims should not be construed as limited to general and dictionary meanings, but interpreted based on the meanings and concepts corresponding to technical aspects of the present invention on the basis of the principle that the inventor is allowed to define terms appropriately for the best explanation. Therefore, the description proposed herein is just a preferable example for the purpose of illustration only, not intended to limit the scope of the invention, so it should be understood that other equivalents and modifications could be made thereto without departing from the spirit and scope of the invention.
[30] FIG. 1 is a view illustrating a configuration of a voice analysis service system of a mobile communication network according to an embodiment of the present invention.
[31] Referring to FIG. 1, the voice analysis service system according to the present invention includes a mobile switching center (MSC) 120, a home location register (HLR) 130, a layered voice analysis server 140, a short message service center (SMSC) 150, and a multimedia message service center (MMSC) 160. The configuration of the voice analysis service system according to the present invention is not limited in this regard, and the voice analysis service system may include all variations and combinations of the above-mentioned components.
[32] In particular, although this embodiment of FIG. 1 cites a mobile communication network, the layered voice analysis server 140 may be connected to an Internet network capable of voice communication to provide a voice analysis service, or may be connected to a communication network based on a public switched telephone network (PSTN), an integrated service digital network (ISDN) and a local area network (LAN) to provide a voice analysis service.
[33] The layered voice analysis server 140 cooperates with the MSC 120 over E1/T1 to execute a call processing function for the voice analysis service of the present invention. And, the layered voice analysis server 140 executes voice analysis on-line in real time using an inner voice analysis engine, or executes voice analysis of a prerecorded audio file off-line.
[34] The layered voice analysis for voice analysis according to the present invention uses voice as a medium of the brain to make a blueprint of brain activity and analyze the complete emotional system of a talker, and does not focus on the conversation itself but on the brain activity that occurs while making the conversation. That is, generally a human brain leaves "finger prints" on all "events" that pass through the brain. The layered voice analysis server 140 according to the present invention disregards words of an object for analysis (a calling party or a called party), and focuses only on the brain activity of the object for analysis. In other words, the voice analysis according to the present invention does not focus on 'what' the object for analysis speaks, but 'how' the object for analysis speaks.
[35] The layered voice analysis service according to the present invention can analyze voice of a calling party 110 or a called party 170 in real time and provide a real time voice analysis result. And, after a call ends, the layered voice analysis service can provide the voice analysis result in the type of a short message or a multimedia message or through a wire or wireless Internet. At this time, the calling party 110 and the called party 170 use a terminal capable of voice communication, for example a mobile phone, a notebook, a PDA or a typical telephone.
[36] The layered voice analysis server 140 stores a conversation between the calling party 110 and the called party 170 who are on a call. And, the layered voice analysis server 140 separates only voice of the calling party 110 or the called party 170 from the conversation between the calling party 110 and the called party 170, stores the separated voice, and extracts a predetermined voice parameter from an audio file storing the voice. At this time, the audio file is a 6/8/11 kHz, 8/16-bit mono/stereo non-compressed pulse code modulation (PCM) format wav file. The layered voice analysis server 140 draws a voice analysis result using up to eight voice analysis formulas based on the extracted voice parameter.
[37] The eight voice analysis formulas may include a lie stress analysis formula, a stimulation level analysis formula, an attention level analysis formula, an emotional level analysis formula, a strife or conflict level analysis formula, a fraud and deceit pattern combination analysis formula or additional formulas for reliability evaluation.
[38] The voice analysis result includes an excitement level representing a degree of excitement of an object for voice analysis, and a conviction level representing a degree of conviction of the object for voice analysis about his/her words. And, the voice analysis result includes a stress level representing a degree of stress of the object for voice analysis who knows he/she cannot escape from a current situation, and a thought level representing a degree that the object for voice analysis tries to search for a response. And, the voice analysis result includes an S.O.S (Say or Stop) level representing whether the object for voice analysis evades a conversation, and a concentration level representing whether the object for voice analysis concentrates on the conversation. And, the voice analysis result includes an anticipation level representing a degree of anticipation of the object for voice analysis to an opponent, an embarrassment level representing whether the object for voice analysis is embarrassed, or a love level representing a degree of love of the object for voice analysis.
[39] A portion or all of the voice analysis result is selected and provided according to service type. Alternatively, the voice analysis result may be synthesized to draw and provide another service level. For example, a lie detection level may be provided.
[40] According to embodiments of the present invention, the voice analysis service may be provided in an actual mode and an exercise mode. The actual mode is such that the voice analysis result is transmitted to the calling party 110 or the called party 170 during an actual call. The exercise mode is such that the calling party 110 receives his/ her voice analysis result while speaking by telephone with a virtual opponent. The exercise mode enables the calling party 110 to make a virtual call with the virtual opponent, so that the calling party 110 can reduce a mistake or misunderstanding that may occur during an actual call with the called party 170 and judge whether he/she is properly expressing his/her emotion.
[41] For example, when the calling party 110 connects a call for use of the actual mode, the layered voice analysis server 140 establishes a traffic channel between the calling party 110 and the called party 170, and analyzes voice of the called party 170 transmitted to the calling party 110 through the established traffic channel to extract a love level. At this time, the layered voice analysis server 140 separately records and stores a conversation between the calling party 110 and the called party 170. Meanwhile, the layered voice analysis server 140 informs in real time the calling party 110 of the love level that is drawn by analyzing the voice of the called party 170 or provides the calling party 110 or the called party 170 with contents corresponding to the love level. The layered voice analysis server 140 may provide the love level when the love level exceeds a predetermined critical value, or the layered voice analysis server 140 may average the love level at a predetermined time interval and inform the calling party 110 of the averaged love level. The layered voice analysis server 140 may store the conversation or a call history between the calling party 110 and the called party 170, and after the call ends, provide the calling party 110 with a voice analysis result obtained by analyzing the recorded conversation in detail and contents corresponding to the voice analysis result as a short message or a multimedia message or through a wire or wireless Internet.
[42] For another example, when the calling party 110 connects a call for use of the exercise mode, the layered voice analysis server 140 receives a virtual opponent and an exercising method selected by the calling party 110 and words from the calling party 110. The virtual opponent may include mother or father, a brother or a sister, a girl friend or a boy friend, a one-sided love interest, or a friend, and the exercising method may include persuasion, consolation or confession. The layered voice analysis server 140 analyzes voice of the calling party 110, draws a love level and provides the calling party 110 with the drawn love level or contents corresponding to the love level. In the same manner as the actual mode, the layered voice analysis server 140 may provide the love level when the love level exceeds a predetermined critical value, or the layered voice analysis server 140 may average the love level at a predetermined time interval and provide the calling party 110 with the averaged love level. And, the layered voice analysis server 140 may store the words and a call history of the calling party 110, and after the exercise mode ends, provide the calling party 110 with the voice analysis result obtained by analyzing his/her words in detail and contents corresponding to the voice analysis result as a short message or a multimedia message or through a wire or wireless Internet.
[43] A user who wants to use the voice analysis service may connect to the layered voice analysis server 140 using a predetermined feature code. That is, when the user inputs a predetermined feature code (for example, "**45") and a telephone number of a called party into a terminal, the MSC 120 recognizes, based on the feature code included in a call connection request message transmitted from the user's terminal, that the corresponding call is a voice analysis service call. The MSC 120 requests routing information for the layered voice analysis server 140 from the HLR 130. The MSC 120 routes the call of the user to the layered voice analysis server 140 based on a response from the HLR 130. Alternatively, when the user inputs the telephone number of the called party and a wireless Internet key (for example, the MagicN key of KTF), a packet data serving node (PDSN or SGSN) routes the outbound call from the user's terminal to the layered voice analysis server 140.
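A minimal sketch of this routing decision follows, using the example feature code "**45" from the text; the function and the routing strings are illustrative assumptions, and the HLR lookup is only simulated.

```python
FEATURE_CODE = "**45"  # example feature code from the text

def route_call(dialed: str) -> str:
    """Decide where the MSC routes a call based on its dialed digits."""
    if dialed.startswith(FEATURE_CODE):
        called_number = dialed[len(FEATURE_CODE):]
        # The MSC would ask the HLR for the LVA server's routing information here.
        return f"route {called_number} via layered voice analysis server"
    return f"route {dialed} directly"

print(route_call("**4501012345678"))  # voice analysis service call
print(route_call("01012345678"))      # ordinary call
```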
[44] FIG. 2 is a block diagram illustrating the layered voice analysis server 140 of FIG. 1 according to an embodiment of the present invention.
[45] As shown in FIG. 2, the layered voice analysis server 140 according to the present invention includes a call setup unit 210, a recording unit 230, a voice analysis engine 250 and a storage device 270. And, the voice analysis engine 250 includes an analysis object separating unit 251, a voice extracting unit 253 and a voice analysis unit 255, and the storage device 270 includes an analysis result storage unit 271 and a contents storage unit 273. The voice analysis engine 250 as an essential component of the layered voice analysis server 140 may be formed of a software development kit. However, the voice analysis engine 250 is not limited in this regard.
[46] First of all, the call setup unit 210 of the layered voice analysis server 140 cooperates with the MSC 120, SMSC 150 and MMSC 160 of the mobile communication network to execute call processing for the voice analysis service of the present invention. For example, in the case that the calling party 110 uses the voice analysis service in the actual mode, the call setup unit 210 establishes a traffic channel between the calling party 110 and the called party 170 to transmit and receive voices of the calling party 110 and the called party 170. And, in the case that the calling party 110 uses the voice analysis service in the exercise mode, the call setup unit 210 establishes a traffic channel with the calling party 110 to receive voice of the calling party 110. And, the call setup unit 210 transmits a voice analysis result of an object for voice analysis and contents corresponding to the voice analysis result as a short message or a multimedia message, thereby providing the voice analysis service after the call ends.
[47] The recording unit 230 records the voice received by the call setup unit 210. The audio file recorded by the recording unit 230 is a 6/8/11 kHz, 8/16-bit mono/stereo non-compressed pulse code modulation (PCM) format wav file. The audio file recorded by the recording unit 230 is stored in the storage device 270.
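For illustration, the non-compressed PCM wav format described here can be produced with Python's standard wave module. The sketch below assumes an 8 kHz, 16-bit mono file (one of the listed combinations) and substitutes a generated tone for the recorded voice.

```python
import math
import struct
import wave

RATE = 8000     # one of the 6/8/11 kHz rates (11 kHz presumably means 11025 Hz)
SAMPWIDTH = 2   # 16-bit samples; 1 would give 8-bit
CHANNELS = 1    # mono; 2 for stereo

with wave.open("recorded_call.wav", "wb") as wav:
    wav.setnchannels(CHANNELS)
    wav.setsampwidth(SAMPWIDTH)
    wav.setframerate(RATE)
    # One second of a 440 Hz tone stands in for the recorded voice.
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * n / RATE)))
        for n in range(RATE))
    wav.writeframes(frames)
```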
[48] Meanwhile, the analysis object separating unit 251 of the voice analysis engine 250 separates the voice of the object for voice analysis from the audio file input through the recording unit 230. For example, in the case that the calling party 110 uses the voice analysis service in the actual mode, the analysis object separating unit 251 separates only the voice of the called party 170 from the audio file input through the recording unit 230. Alternatively, according to service type, in the case that the called party 170 uses the voice analysis service in the actual mode, the analysis object separating unit 251 separates only the voice of the calling party 110 from the audio file input through the recording unit 230.
[49] The voice extracting unit 253 extracts a predetermined voice parameter from the audio file of the object for voice analysis that is separated by the analysis object separating unit 251.
[50] The voice analysis unit 255 draws a voice analysis result using up to eight voice analysis formulas based on the voice parameter extracted by the voice extracting unit 253. The voice analysis result includes an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an embarrassment level or a love level. The voice analysis unit 255 selectively provides a portion or all of the above-mentioned levels according to service type. And, the voice analysis unit 255 may synthesize the above-mentioned levels to calculate another type of level, for example a lie detection level. Preferably, the voice analysis unit 255 provides the above-mentioned levels numerically or in graphic or text form.
[51] Meanwhile, the analysis result storage unit 271 of the storage device 270 stores the conversation between the calling party 110 and the called party 170 that is recorded by the recording unit 230. And, the analysis result storage unit 271 stores the voice analysis result of the object for voice analysis that is drawn by the voice analysis unit 255. At this time, the analysis result storage unit 271 maps and stores information of a user who used the voice analysis service and the date of the voice analysis service together with the voice analysis result and the whole conversation of the object for voice analysis. And, the contents storage unit 273 of the storage device 270 stores predetermined contents that are provided corresponding to the voice analysis result, for example a music or an emoticon. Thus, the whole conversation and the voice analysis result stored in the analysis result storage unit 271 and the predetermined contents stored in the contents storage unit 273 are provided to the user through a short message service, a multimedia message service or a wire or wireless Internet service.
[52] FIG. 3 is a flow chart illustrating a voice analysis service method according to an embodiment of the present invention, and illustrates a method for analyzing the voice of the called party during the call between the calling party and the called party and providing the voice analysis result to the calling party.
[53] As shown in FIG. 3, for voice analysis of the called party 170, first, the calling party 110 connects to the layered voice analysis server 140 using a terminal (S301). A predetermined feature key is used to connect the calling party 110 to the layered voice analysis server 140.
[54] For example, when the calling party 110 inputs '*007 + a telephone number of the called party' into the calling terminal, a call of the calling terminal is routed to the layered voice analysis server 140 via the MSC 120. Alternatively, the calling party 110 may input 'a telephone number of the called party + an Internet connection key' into the calling terminal to connect to the layered voice analysis server 140. However, a method in which the calling party 110 connects to the layered voice analysis server 140 is not limited in this regard, and may be variously modified according to service type.
[55] At this time, in the case that the voice analysis service requires a service subscription, the layered voice analysis server 140 may execute user authentication when the calling party 110 connects a call. Preferably, the user authentication is automatically made based on information of the calling party 110; however, the present invention is not limited in this regard.
[56] And, when the calling party 110 connects a call, the layered voice analysis server 140 may inform the calling party 110 of the voice analysis service. Preferably, this is done in the form of a voice notification, and the notification may include a brief description of the voice analysis service and a description of service charge information.
[57] Thus, when the calling party 110 connects to the layered voice analysis server 140 for voice analysis of the called party 170, the layered voice analysis server 140 connects a call to the called party 170 based on the telephone number of the called party 170 to establish a traffic channel between the calling party 110 and the called party 170 (S303).
[58] After the traffic channel is established between the calling party 110 and the called party 170, the layered voice analysis server 140 separates the audio file of the called party 170 from the conversation between the calling party 110 and the called party 170 (S305). At this time, the layered voice analysis server 140 records and stores the conversation between the calling party 110 and the called party 170, and then separates only the audio file of the called party 170 from the conversation.
[59] Subsequently, the layered voice analysis server 140 extracts a predetermined voice parameter from the separated audio file of the called party 170 (S307). The audio file is a 6/8/11 kHz, 8/16-bit mono/stereo non-compressed pulse code modulation (PCM) format wav file.
[60] And, the layered voice analysis server 140 executes voice analysis based on the extracted voice parameter to draw a voice analysis result (S309). The voice analysis result includes an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an anticipation level, an embarrassment level or a love level. And, the voice analysis result may include a lie detection level (in other words, a reliability level) calculated by synthesizing the above-mentioned levels. In this example, the voice analysis result includes a love level, a concentration level, an anticipation level, and an embarrassment level.
[61] The concentration level analyzed by the layered voice analysis server 140 is drawn in the range of 0 to 100. The range of 0 to 44 means a low concentration level or perplexity, the range of 45 to 65 means a normal state, and the range of 66 to 100 means a high concentration level. The concentration level may be indicated by percentage or classified into very low, low, middle, high and very high according to a predetermined standard, and may be provided in text or a graph.
[62] And, the anticipation level is drawn in the range of 0 to 100. The range of 0 to 30 means a normal state, and the range of 31 to 100 means high anticipation to an opponent and may suggest an attempt of deceit. The anticipation level may be indicated by percentage or classified into low, middle and high according to a predetermined standard, and may be provided in text or a graph.
[63] The embarrassment level is drawn into five classified stages in the range of 0 to 100. The embarrassment level 0 means no embarrassment, the embarrassment level 25 means a slight embarrassment, the embarrassment level 50 means a normal state, the embarrassment level 75 means a considerable embarrassment, and the embarrassment level 100 means a very high embarrassment. The embarrassment level may be indicated by percentage or classified into very low, low, middle, high and very high according to a predetermined standard, and may be provided in text or a graph.
[64] Finally, the love level is drawn in the range of -10 to 50. According to the love detection result, a love level of -10 to 0 or 1 to 10 means that love is not detected from the called party 170. A love level of 11 to 50 means that love is detected from the called party 170. The love level may be provided by percentage. In the case that the love level is -10 to 0, the percentage of the love level is calculated by multiplying the corresponding love level by 0, and in the case that the love level is 1 to 50, the percentage of the love level is calculated by multiplying the corresponding love level by 2.
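The mappings of paragraphs [61] to [64] can be expressed compactly. The sketch below implements the love-level percentage rule exactly as stated and the range classification for the other levels; the band labels are abbreviated from the text.

```python
def love_percent(level: int) -> int:
    """-10..0 is multiplied by 0; 1..50 is multiplied by 2 (paragraph [64])."""
    return level * 0 if level <= 0 else level * 2

def classify(value: int, bands) -> str:
    """Map a 0..100 level onto labelled ranges, e.g. the concentration level."""
    for upper, label in bands:
        if value <= upper:
            return label
    return bands[-1][1]

CONCENTRATION_BANDS = [(44, "low/perplexed"), (65, "normal"), (100, "high")]
ANTICIPATION_BANDS = [(30, "normal"), (100, "high (possible deceit)")]

print(love_percent(-5), love_percent(25))   # 0 50
print(classify(70, CONCENTRATION_BANDS))    # high
print(classify(40, ANTICIPATION_BANDS))     # high (possible deceit)
```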
[65] The voice analysis result drawn by analyzing the voice of the called party 170 is provided in real time to the calling party 110 who is on a call (S311). Preferably, the voice analysis result that is transmitted in real time to the calling party 110 on a call is the love level calculated by percentage among the above-mentioned levels. At this time, the voice analysis result may be provided to the calling party 110 on a call at a predetermined period (for example, at an interval of ten seconds), or provided when the love level exceeds a predetermined critical value. Alternatively, the voice analysis result transmitted in real time to the calling party 110 on a call may be contents corresponding to the love level. In the case that the provided contents are, for example, a music file, the corresponding music may be provided as a background music during the call between the calling party 110 and the called party 170.
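The two real-time delivery policies described here, averaging over a predetermined period and notifying only when a critical value is exceeded, can be sketched as below. The one-sample-per-second stream, the sample values and the critical value of 60 are illustrative assumptions; the ten-second period is the example interval from the text.

```python
PERIOD = 10     # seconds, the example interval from the text
CRITICAL = 60   # assumed critical value

# One love-level sample per second of the call (illustrative values).
love_levels = [12, 30, 44, 58, 62, 70, 66, 40, 35, 80, 55, 61]

def periodic_averages(levels, period):
    """Average the love level over each period, as one delivery policy."""
    for i in range(0, len(levels), period):
        window = levels[i:i + period]
        yield sum(window) / len(window)

def threshold_events(levels, critical):
    """Seconds at which the love level exceeds the critical value."""
    return [(t, v) for t, v in enumerate(levels) if v > critical]

print(list(periodic_averages(love_levels, PERIOD)))
print(threshold_events(love_levels, CRITICAL))
```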
[66] And, after the call ends, the layered voice analysis server 140 provides the calling party 110 with the voice analysis result that is drawn by analyzing the voice of the called party 170 and stored, as a short message or a multimedia message or through a wire or wireless Internet (S313). Preferably, the voice analysis result that is provided as a short message or a multimedia message or through a wire or wireless Internet after the call ends, includes all of the love level, the concentration level, the anticipation level and the embarrassment level. The voice analysis result that is provided as a short message or a multimedia message or through a wire or wireless Internet, is displayed variously, for example, in text or a graph. In particular, in the case that the voice analysis result is provided through a wire or wireless Internet, the layered voice analysis server 140 may provide only the voice analysis result (i.e. graph) of the called party 170 or provide the voice analysis result together with the whole conversation. That is, in the case that the layered voice analysis server 140 provides only the voice analysis result of the called party 170, the calling party 110 can check the change of emotion of the called party 170, but cannot check where the change of emotion of the called party 170 occurs in the conversation. Thus, the layered voice analysis server 140 provides the change of emotion of the called party 170 together with the whole conversation. While replaying the whole conversation, the layered voice analysis server 140 synchronizes the whole conversation with the voice analysis result at the start time of the voice of the called party 170 to provide the voice analysis result in a graph. Its detailed method is described as follows.
[67] The voice analysis service may be charged in proportion to the duration of call or the times of usage. However, because a billing method may be different according to a mobile communication service provider or a separate service provider that provides the voice analysis service, the present invention is not limited in this regard.
[68] FIG. 4 is a flow chart illustrating a voice analysis service method according to another embodiment of the present invention.
[69] The voice analysis service may be provided in an actual mode and an exercise mode according to an object for voice analysis. The actual mode is such that voices of the calling party and the called party are analyzed during an actual call and a voice analysis result is provided. The exercise mode is such that voice of the calling party who speaks by telephone with a virtual opponent is analyzed and a voice analysis result is transmitted to the corresponding calling party. The exercise mode is described with reference to FIG. 4.
[70] For love detection in the exercise mode, the calling party 110 connects to the layered voice analysis server 140 by a terminal (S401). A predetermined feature key may be used to connect the calling party 110 to the layered voice analysis server 140 for love detection in the exercise mode. For example, as shown in FIG. 3, the calling party 110 inputs "a feature code for voice analysis (*007) + a telephone number of the called party" into the terminal. In this case, the layered voice analysis server 140 informs the calling party 110 of an actual mode and an exercise mode, and receives selection of the exercise mode from the calling party 110. Alternatively, the calling party 110 may directly input a feature code for the exercise mode and the telephone number of the called party into the terminal. In this case, the layered voice analysis server 140 may not inform the calling party 110 of the exercise mode. Further, the calling party 110 may connect to the layered voice analysis server 140 for voice analysis in the exercise mode in various manners according to service type.
[71] At this time, in the case that the voice analysis service requires a service subscription, the layered voice analysis server 140 may execute user authentication when the calling party 110 connects a call. Preferably, the user authentication is automatically made based on information of the calling party 110; however, the present invention is not limited in this regard.
[72] And, when the calling party 110 connects a call, the layered voice analysis server 140 may inform the calling party 110 of the voice analysis service. Preferably, the voice analysis service is provided in the form of a voice notification, and the notification may include a brief description of the voice analysis service and service charge information.
[73] After the calling party 110 connects to the layered voice analysis server 140 for voice analysis in the exercise mode, the layered voice analysis server 140 receives selection of an opponent for exercise from the calling party 110 (S403). The opponent for exercise is selected such that voice notification is provided to the calling party 110 and a response thereto is received from the calling party 110. For example, the opponent for exercise may include mother or father, a brother or a sister, a girl friend or a boy friend, a one-sided love interest, or a friend.
[74] After the layered voice analysis server 140 receives the opponent for exercise from the calling party 110, the layered voice analysis server 140 receives selection of an exercising method suitable for the opponent for exercise from the calling party 110 (S405). The exercising method is selected in the same manner as the opponent for exercise. For example, the exercising method may include persuasion, consolation or confession.
[75] For example, the calling party 110 selects a one-sided love interest as the opponent for exercise and confession as the exercising method. This means that the calling party 110 has a person whom he/she loves secretly, wants to confess his/her love to the one-sided love interest, and intends to exercise the confession of his/her love prior to an actual confession. Thus, when the calling party 110 later speaks by telephone with the one-sided love interest, the calling party 110 may reduce a potential mistake or misunderstanding.
[76] After the layered voice analysis server 140 receives the selection of the opponent for exercise and the exercising method, the layered voice analysis server 140 receives voice of the calling party 110 (S407). Preferably, while receiving the voice of the calling party 110, the layered voice analysis server 140 records the voice of the calling party 110.
[77] Subsequently, the layered voice analysis server 140 extracts a predetermined voice parameter from an audio file of the calling party 110 (S409). The audio file of the calling party 110 is a 6/8/11 kHz, 8/16-bit mono/stereo non-compressed pulse code modulation (PCM) format wav file.
[78] And, the layered voice analysis server 140 executes voice analysis based on the extracted voice parameter to draw a voice analysis result (S411). Here, the voice analysis is the same as the voice analysis described with reference to FIG. 3, and thus the detailed description thereof is omitted.
[79] The voice analysis result drawn by analyzing the voice of the calling party 110 is provided in real time to the calling party 110 who is on a call (S413). Preferably, the voice analysis result that is transmitted in real time to the calling party 110 is a love level expressed as a percentage. At this time, the voice analysis result may be provided to the calling party 110 on a call at a predetermined interval (for example, every ten seconds), or provided when the love level exceeds a predetermined critical value. Alternatively, the voice analysis result transmitted in real time to the calling party 110 on a call may be contents corresponding to the love level. The voice analysis result may be provided differently according to service type.
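A minimal sketch of the two real-time delivery policies just described, periodic reporting and threshold-triggered reporting, is given below. The ten-second interval matches the example above, while the critical value of 80 and the analyzer and send interfaces are assumptions.

```python
# Sketch of real-time result delivery: report the love level at a fixed
# interval, or immediately when it exceeds a critical value. The analyzer
# is assumed to yield love levels (percent) as the call proceeds.
import time

def stream_love_level(analyzer, send, interval_s: float = 10.0,
                      threshold: float = 80.0) -> None:
    last_sent = float("-inf")                  # so the first result is sent
    for level in analyzer:
        now = time.monotonic()
        if now - last_sent >= interval_s or level > threshold:
            send(f"Love level: {level:.0f}%")  # e.g. as a tone or text
            last_sent = now
```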
[80] And, after the call ends, the layered voice analysis server 140 provides the calling party 110 with the voice analysis result that was drawn by analyzing the voice of the calling party 110 and stored, as a short message or a multimedia message or through a wire or wireless Internet. Preferably, the voice analysis result provided in this way after the call ends includes all of the love level, the concentration level, the anticipation level and the embarrassment level. The voice analysis result may be displayed in various forms, for example as text or a graph.
[81] The voice analysis service may be charged in proportion to the call duration or the number of uses. However, because the billing method may differ according to the mobile communication service provider or a separate service provider that provides the voice analysis service, the present invention is not limited in this regard.
[82] Meanwhile, after the step S413, in the case that the call with the calling party 110 ends or the calling party 110 requests a mode change, the layered voice analysis server 140 may establish a call between the calling party 110 and the called party 170 corresponding to the opponent for exercise, to provide the voice analysis service in the actual mode. In the case that the calling party 110 input a telephone number of the called party 170 when connecting to the layered voice analysis server 140 in the step S401, the layered voice analysis server 140 checks only whether the calling party 110 intends to use the voice analysis service in the actual mode and establishes a call between the calling party 110 and the called party 170. Alternatively, in the case that the calling party 110 connected to the layered voice analysis server 140 without inputting the telephone number of the called party 170, the layered voice analysis server 140 receives a selection of the called party 170 from the calling party 110 and connects a call between the calling party 110 and the called party 170.
[83] Thus, after a traffic channel is established between the calling party 110 and the called party 170, the layered voice analysis server 140 performs the steps S305 to S313 described with reference to FIG. 3.
[84] FIG. 5 is a flow chart illustrating a voice analysis service method according to another embodiment of the present invention, and describes a method for detecting a false report in an emergency call service. The layered voice analysis server 140 is installed in a public institution such as a police station or a fire station and analyzes voice of an emergency caller to detect a false report.
[85] As shown in FIG. 5, first, the calling party 110 who intends to make an emergency call to a public institution such as a police station or a fire station inputs an emergency call key or an emergency call telephone number into a terminal to make an emergency call. The emergency call of the calling party 110 is routed to the layered voice analysis server 140 (S501).
[86] When the calling party 110 connects to the layered voice analysis server 140 for an emergency call, the layered voice analysis server 140 establishes a traffic channel between a telephone operator and the calling party 110 (S503).
[87] After the traffic channel is established between the telephone operator and the calling party 110, the layered voice analysis server 140 separates an audio file of the calling party 110 from a conversation between the calling party 110 and the telephone operator (S505). At this time, the layered voice analysis server 140 records and stores the conversation between the calling party 110 and the telephone operator, and separates only the audio file of the calling party 110 from the conversation.
[88] Subsequently, the layered voice analysis server 140 extracts a predetermined voice parameter from the separated audio file of the calling party 110 (S507). The audio file is a 6/8/11 kHz, 8/16-bit, mono/stereo uncompressed pulse code modulation (PCM) wav file.
[89] And, the layered voice analysis server 140 executes voice analysis based on the extracted voice parameter to draw a voice analysis result (S509). That is, the layered voice analysis server 140 draws an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an anticipation level, an embarrassment level or a love level, and synthesizes the above-mentioned levels to calculate a lie detection level (in other words, a reliability level).
[90] The layered voice analysis server 140 judges whether a report of the calling party 110 is true or false based on the voice analysis result drawn by analyzing the voice of the calling party 110 (S511). Preferably, the layered voice analysis server 140 draws a lie detection level of the calling party 110 in the range of 0 to 100, and in the case that the lie detection level exceeds 50, judges the report as false.
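For illustration, a minimal sketch of this judgment follows. The description does not disclose how the component levels are synthesized into the lie detection level, so the equal-weighted average below is an assumption; only the 0-100 range and the threshold of 50 come from the paragraph above.

```python
# Sketch of the false-report judgment: synthesize component levels into a
# 0-100 lie detection level and judge the report false above 50. The
# equal weighting is an assumption; the synthesis formula is not disclosed.
def lie_detection_level(levels: dict[str, float]) -> float:
    # levels: e.g. {"excitement": 70, "stress": 65, "conviction": 30, ...}
    return sum(levels.values()) / len(levels)

def is_false_report(levels: dict[str, float]) -> bool:
    return lie_detection_level(levels) > 50
```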
[91] In the case that the report is not false, the layered voice analysis server 140 extracts information of the calling party 110, stores the extracted information into the storage device 270, and provides the calling party 110 with information appropriate to the emergency situation (S513). The information may be provided as a message. The calling party 110 receives the information and deals with the emergency situation before emergency staff or a police officer arrives. Here, the layered voice analysis server 140 preferably maintains the call between the calling party 110 and the telephone operator.
[92] Meanwhile, in the case that the report of the calling party 110 is false, the layered voice analysis server 140 extracts information of the calling party 110, stores the extracted information into the storage device 270, and transmits to the calling party 110 a warning message against the false report based on the information of the calling party 110 (S515).
[93] After the layered voice analysis server 140 transmits the warning message against the false report, the layered voice analysis server 140 judges whether the calling party 110 is a habitual offender based on the information of the calling party 110 (S517). In the case that the calling party 110 is a habitual offender, the layered voice analysis server 140 further transmits to the calling party 110 a warning message including the number of prior false reports and the applicable legal measures (S519). This may make the false reporter aware of the false report and deter habitual false reporting.
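A minimal sketch of this escalating warning flow is shown below; the storage interface, the habitual-offender criterion, and the message wording are assumptions made only for illustration.

```python
# Sketch of the escalating warning flow: every false report draws a warning,
# and a habitual false reporter also receives the count of prior false
# reports and a legal-measures notice. The store interface is hypothetical.
def handle_false_report(caller_id: str, store) -> str:
    count = store.increment_false_reports(caller_id)  # hypothetical method
    message = "Warning: your report has been judged false."
    if count > 1:  # treated here as habitual; the actual criterion may differ
        message += (f" You have filed {count} false reports;"
                    " legal measures may be taken.")
    return message
```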
[94] Hereinafter, an embodiment of method and apparatus for analyzing and displaying an emotional state of an opponent during video communication is described with reference to FIG. 6.
[95] FIG. 6 is a block diagram illustrating a mobile communication server 600 for providing an emotional state analysis service during video communication according to an embodiment of the present invention.
[96] With reference to FIG. 6, the mobile communication server 600 for providing an emotional state analysis service during video communication includes a communication unit 610, a service control unit 620, a video communication unit 630, a voice analysis engine 640, and an image integrating unit 650.
[97] The communication unit 610 receives from a calling terminal a call connection request signal that requests a call service of a specific feature. At this time, the call connection request signal includes a feature code having a specific feature, a receiving telephone number, and an identifier for distinguishing between video communication and audio communication. After the communication unit 610 receives the call connection request signal from the calling terminal, the communication unit 610 connects a call to a called terminal to establish a traffic channel between the calling terminal and the called terminal. The mobile communication server 600 performs a series of operations for providing the call service of the specific feature requested by the calling party, and transmits a result to the calling terminal through the communication unit 610.
[98] The call service of a specific feature may include various services; however, this embodiment focuses on a call service for analyzing and displaying an emotional state of an opponent during video communication. For simple and clear description, the video communication identifier included in the call connection request signal is indicated as '*', and the feature code used to analyze the emotional state of the opponent and display the emotional state on a screen of the calling terminal is indicated as '001'.
[99] The service control unit 620 transmits a control signal to the video communication unit 630 according to the video communication identifier '*' included in the call connection request signal received from the communication unit 610. And, the service control unit 620 transmits a control signal to the voice analysis engine 640 according to the feature code '001' included in the call connection request signal.
[100] The video communication unit 630 requests video communication call connection to the called terminal according to the control signal transmitted from the service control unit 620, and when the video communication call is connected in response to the request, starts video communication between the calling terminal and the called terminal. When the video communication starts by the video communication unit 630, the voice analysis engine 640 extracts voice of the called party and analyzes an emotional state of the called party.
[101] The voice analysis engine 640 includes an analysis object separating unit 641, a voice extracting unit 643 and a voice analysis unit 645.
[102] The analysis object separating unit 641 of the voice analysis engine 640 separates the voice of the called party from a conversation between the calling party and the called party.
[103] The voice extracting unit 643 extracts a predetermined voice parameter from an audio file of an object for voice analysis that is separated by the analysis object separating unit 641.
[104] The voice analysis unit 645 draws a voice analysis result using up to eight voice analysis formulas based on the voice parameter extracted by the voice extracting unit 643. The voice analysis result includes an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an anticipation level, an embarrassment level, or a love level. The voice analysis unit 645 selectively provides a portion or all of the above-mentioned levels according to service type. And, the voice analysis unit 645 may synthesize the above-mentioned levels to calculate another type of level, for example a lie detection level. Preferably, the voice analysis unit 645 provides the above-mentioned levels numerically, graphically, or as text.
[105] And, the voice analysis engine 640 may further include an emotional analysis control unit (not shown) for periodically detecting an emotional analysis start signal or an emotional analysis stop signal from the video communication unit 630 to control the emotional analysis service before operation of the analysis object separating unit 641.
[106] The emotional analysis control unit may stop or restart the operation of the voice analysis engine 640 at the request of the calling party. Therefore, the calling party may selectively request or stop the emotional analysis service during video communication. In other words, the emotional analysis control unit periodically detects an operation start signal or an operation stop signal of the voice analysis engine 640 from the video communication unit 630 to control the operation of the voice analysis engine 640.
[107] For example, the calling party selects a video communication service for emotion analysis, and then presses a '*' key to stop the emotion analysis of the opponent and use only the video communication service. And, the calling party presses the '*' key during video communication to restart the stopped emotional analysis of the opponent. Here, this '*' key is different from the above-mentioned video communication identifier, and has a toggle function such that, after the video communication starts, when the '*' key is pressed during operation of the voice analysis engine 640, the operation of the voice analysis engine 640 stops, and when the '*' key is pressed again, the stopped operation of the voice analysis engine 640 restarts. [108] Alternatively, to control the operation of the voice analysis engine 640, a '1' key may be assigned to start operation of the voice analysis engine 640, a '2' key to stop operation of the voice analysis engine 640, and a '3' key to add contents of the voice analysis engine 640, thereby receiving selections of various service requests from the calling party in real time.
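The following is a minimal sketch of both key schemes just described, the '*' toggle and the dedicated '1'/'2'/'3' keys; the engine interface (start, stop, add_contents) is a hypothetical name chosen for illustration.

```python
# Sketch of in-call key handling for the voice analysis engine: '*' toggles
# analysis on and off, while '1'/'2'/'3' start, stop, and add contents.
# The engine methods are hypothetical; only the key behavior follows the text.
class AnalysisKeyController:
    def __init__(self, engine):
        self.engine = engine
        self.running = True                    # analysis runs from call start

    def on_key(self, key: str) -> None:
        if key == "*":                         # toggle function
            self.running = not self.running
            (self.engine.start if self.running else self.engine.stop)()
        elif key == "1":
            self.running = True
            self.engine.start()
        elif key == "2":
            self.running = False
            self.engine.stop()
        elif key == "3":
            self.engine.add_contents()         # attach contents to the result
```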
[109] The image integrating unit 650 integrates a video transmitted from the video communication unit 630 with the emotional state analysis result (for example, text or a graph indicating various levels) transmitted from the voice analysis unit 645 at a predetermined ratio and transmits the integrated image.
[110] Further, the video transmitted from the video communication unit 630 and the emotional state analysis result (for example, text or a graph indicating various levels) transmitted from the voice analysis unit 645 may each be transmitted to the calling terminal without passing through the image integrating unit 650 and displayed on a divided screen of the calling terminal, so that the image and the emotional state of the opponent appear in separate regions of the screen.
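By way of illustration, a minimal sketch of the integration option follows: the call video occupies the top of the frame and the analysis graphic the bottom, at a fixed ratio. The 80/20 split, the vertical layout, and the equal-width assumption are all illustrative; the predetermined ratio is not specified above.

```python
# Sketch of integrating the call video with the analysis graphic at a fixed
# ratio (assumed 80/20, stacked vertically). Frames are HxWxC numpy arrays.
import numpy as np

def integrate(frame: np.ndarray, graphic: np.ndarray, ratio: float = 0.8):
    """Stack the call video (top) over the analysis graphic (bottom)."""
    h = frame.shape[0]
    video_h = int(h * ratio)                   # assumed 80/20 split

    def resize_rows(img: np.ndarray, new_h: int) -> np.ndarray:
        # crude nearest-neighbor resize along the row axis only
        idx = np.linspace(0, img.shape[0] - 1, new_h).astype(int)
        return img[idx]

    top = resize_rows(frame, video_h)
    bottom = resize_rows(graphic, h - video_h)
    return np.vstack([top, bottom])            # assumes equal widths/channels
```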
[111] FIG. 7 is a block diagram illustrating a mobile communication server 700 for providing an emotional state analysis service during video communication according to another embodiment of the present invention. Here, a component having the same reference number as that of FIG. 6 executes the same function, and thus the detailed description thereof is omitted.
[112] Referring to FIG. 7, the mobile communication server 700 for providing an emotional state analysis service during video communication further includes a contents storage unit 760, an analysis result storage unit 770, and a message generating unit 780.
[113] The contents storage unit 760 stores various contents corresponding to the voice analysis result of the voice analysis unit 645. The voice analysis unit 645 selects contents corresponding to the voice analysis result from the contents storage unit 760. The contents include various visual data, so that the calling party can easily recognize the voice analysis result.
[114] For example, the contents storage unit 760 stores various characters, avatars of various expressions, emoticons, flash or moving images corresponding to the voice analysis result such as a love level or a stress level.
[115] Thus, the voice analysis unit 645 selects as the voice analysis result characters, avatars or multimedia contents that are stored beforehand in the contents storage unit 760, and properly uses the characters, avatars or multimedia contents according to circumstances. Therefore, the calling party can check the emotional state of the opponent using the visual data, such as a graph, a character or an avatar, that is selected from the contents storage unit 760. The calling party can check the emotional state of the opponent and transmit to the opponent moving images or multimedia contents corresponding to the emotional state.
[116] The message generating unit 780 prepares a multimedia message (MMS) including the voice analysis result analyzed by the voice analysis unit 645 and the contents selected from the contents storage unit 760 by the voice analysis unit 645, and transmits the multimedia message to the calling terminal or the called terminal.
[117] Meanwhile, the analysis result storage unit 770 stores the voice analysis result analyzed by the voice analysis unit 645. The voice analysis result stored in the analysis result storage unit 770 is provided through a web service using a wire or wireless Internet. A user may see the voice analysis result through the web service.
[118] FIG. 8 is a flow chart illustrating a method for providing a voice analysis service during video communication according to an embodiment of the present invention.
[119] As shown in FIG. 8, first, the calling party transmits a call for video communication with the called party, and the mobile communication server receives the video communication call transmitted from the calling party (S801). At this time, the call transmitted from the calling party includes the video communication identifier '*' and the feature code '001' for the voice analysis service.
[120] When receiving the video communication call transmitted from the calling party, the mobile communication server starts video communication between the calling party and the called party according to the identifier '*' included in the video communication call (S803).
[121] When video communication starts, the mobile communication server separates voice of the called party from the conversation between the calling party and the called party based on the feature code '001' included in the video communication call (S805).
[122] Subsequently, the mobile communication server analyzes the separated voice of the called party, draws an emotional state of the called party, and then converts the drawn emotional state of the called party to visual data (S807). That is, the mobile communication server extracts a voice parameter from the voice of the called party and executes voice analysis based on the extracted voice parameter to draw the emotional state result of the called party. And, the mobile communication server converts the drawn emotional state result to visual data. The emotional state result includes an excitement level, a conviction level, a stress level, a thought level, an S.O.S (Say or Stop) level, a concentration level, an anticipation level, an embarrassment level, or a love level. The emotional state result is converted to visual data, for example a percentage graph.
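A minimal sketch of this conversion to visual data follows, rendering each level as a text percentage bar; the level names follow the description above, while the rendering style and the example values are illustrative assumptions.

```python
# Sketch of converting an emotional state result to simple visual data:
# one text percentage bar per level. A real service would render a graphic;
# this textual form only illustrates the conversion step.
def to_bar_graph(levels: dict[str, float], width: int = 20) -> str:
    lines = []
    for name, pct in levels.items():
        filled = int(width * pct / 100)
        bar = "#" * filled + "." * (width - filled)
        lines.append(f"{name:>13}: [{bar}] {pct:.0f}%")
    return "\n".join(lines)

# Illustrative values only:
print(to_bar_graph({"love": 75, "concentration": 60,
                    "anticipation": 40, "embarrassment": 20}))
```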
[123] Alternatively, to emphasize a visual aspect of the emotional state result, the mobile communication server may select multimedia contents, for example moving images, avatars or characters as the visual data of the emotional state result.
[124] Next, the mobile communication server integrates the visual data of the emotional state of the called party with a video of the called party at a predetermined ratio, and transmits the integrated image to the calling party (S809).
[125] While the calling party is provided with the analyzed emotional state of the called party during video communication, the calling party may stop the emotional state analysis. The calling party inputs a key for stopping the emotional state analysis into a terminal. The mobile communication server reads the key signal received from the terminal of the calling party and, in the case that the key signal indicates stopping of the emotional state analysis, stops the ongoing voice analysis. And, after the calling party stops the emotional state analysis, the calling party may request the emotional state analysis again. In the same manner, the calling party inputs a key for requesting the emotional state analysis into the terminal. The mobile communication server reads the key signal received from the terminal of the calling party and, in the case that the key signal indicates a request for the emotional state analysis, resumes the stopped emotional state analysis and provides the emotional state analysis result to the calling party.
[126] And, after video communication between the calling party and the called party ends, the mobile communication server may prepare the emotional state analysis result of the called party as an MMS message and transmit the MMS message to the calling party. At this time, the mobile communication server attaches to the MMS message multimedia contents corresponding to the emotional state analysis result of the called party, for example avatars, characters or moving images, and transmits the MMS message to the calling party.
[127] FIG. 9 is a view illustrating an MMS message provided as a voice analysis result according to an embodiment of the present invention.
[128] As shown in FIG. 9, the multimedia message includes call information 900, a final love level 910, other detection levels 920 and a mobile phone Internet service URL 930.
[129] The call information 900 summarizes the details of a voice analysis call between the calling party and the called party, and includes a telephone number of the called party, the date of the call and the duration of the call, and may further include an image of the love level.
[130] The final love level 910 may be displayed as a bar graph and as a percentage as shown in FIG. 9. The bar graph type final love level 910 may have a heart-shaped flash effect or a graph visualizing effect for good visibility, and may be expressed in different colors according to predetermined ranges. This visual effect may differ according to the mobile communication service provider or a separate service provider that provides the voice analysis service.
[131] The other detection levels 920 include an embarrassment level, a concentration level and an anticipation level as a real time analysis result of the love detection. As shown in FIG. 9, each of these levels is indicated in a table of a predetermined shape, with a color or an image displayed according to the set range.
[132] The mobile phone Internet service URL 930 is used for detailed inquiry into the final result and for providing contents. When a user presses a call button (not shown), a wireless Internet browser included in the mobile phone connects to a server corresponding to the URL to look up a detailed detection result.
[133] FIG. 10 is a view illustrating an SMS message provided as a love detection result according to an embodiment of the present invention.
[134] The love detection result provided as an SMS message may be provided in the form of basic information 950a as shown in (a) of FIG. 10, or in the form of detailed information 950b as shown in (b) of FIG. 10.
[135] As shown in (a) of FIG. 10, the basic information type 950a may include a telephone number of the called party as the object for voice analysis, a love level of the corresponding called party toward the calling party, and a schematic display for connection to a detailed inquiry 960.
[136] Meanwhile, as shown in (b) of FIG. 10, the detailed information type 950b includes a love level indicated as a percentage, and an embarrassment level, a concentration level and an anticipation level indicated in text (970). The detailed information type 950b provides a more detailed love detection result than the basic information type 950a shown in (a) of FIG. 10.
[137] In the case that the calling party wants a detailed inquiry from the SMS message, which may be provided as the basic information type 950a or the detailed information type 950b, the calling party presses a call button or an Internet button 980 of the terminal. Then, the calling party is connected to a callback URL included in the SMS message to receive a more detailed love detection result.
[138] FIG. 11 is a view illustrating a process for providing a voice analysis result through a mobile phone Internet according to an embodiment of the present invention, and a user may connect to the callback URL included in the SMS message or MMS message provided after the call ends, or directly connect to a wireless Internet.
[139] First, when the calling party inputs 'his/her telephone number + a predetermined feature key' into a mobile terminal, a wireless Internet browser included in the mobile phone displays a service screen for connection to a love detection menu. Then, the calling party selects "007call" 1000 for a love detection service among predetermined services displayed on the browser, and moves to a subordinate screen.
[140] Alternatively, when the calling party selects a call connection 1050 in the SMS message or MMS message provided as the love detection result, the mobile terminal of the calling party connects to the server, downloads a subordinate screen and displays the subordinate screen to the calling party.
[141] The corresponding subordinate menu screen includes a recent call history menu 1010, a love rank menu 1060, a lovecall menu or a usage guide menu. The subordinate menu may enable the calling party to check information and a love level of the called party to whom the calling party recently made a call.
[142] For example, in the case that the calling party selects the recent call history menu 1010, a recent call history 1020 is arranged and displayed in a predetermined manner. The recent call history 1020 may include a telephone number and a love level of the called party. Meanwhile, in the case that the calling party selects the love rank menu 1060, information of the called parties 1070 is arranged and displayed in descending order of love level.
[143] And, when the calling party selects an arbitrary called party in the recent call history 1020 or the information of called party 1070, the calling party can check a detailed love detection result 1040 of the called party.
[144] FIG. 12 is a view illustrating a screen providing a love detection result through a web service according to an embodiment of the present invention, and shows the result screen that the calling party receives via the web after the love detection call ends.
[145] Through a wire or wireless terminal, the calling party connects to a web server that provides the analysis result of the love detection service of the present invention through the web. After the calling party connects to the web server, the calling party moves to a menu for checking the analysis result to check the detailed analysis result. This connection step corresponds to a general web service, and thus the detailed description is omitted. (a) of FIG. 12 shows a web page displayed when the calling party connects to the web server and selects the recent call history menu.
[146] The recent call history web page includes a telephone number 1100 of the called party to whom the calling party recently made a call, a date of call 1105, a love level 1110 of the called party, a detailed inquiry 1115, and contents 1120. The recent call history web page including the above-mentioned lists may differ according to service type. For example, in the case that the calling party detects love of the called party by an Internet telephone using a notebook computer, an IP address of the notebook computer and an Internet telephone number may be displayed in the telephone number 1100.
[147] In the date of call 1105, the date of a call between the calling party and the called party is displayed, and the total call duration of the calling party may be further displayed. And, in the love level 1110, a love level of the called party corresponding to the telephone number 1100 may be displayed as a predetermined image or value. Meanwhile, in the detailed inquiry 1115, hyperlinked text is displayed, and when the calling party selects the text, the detailed content is displayed as a pop-up window. In the contents 1120, a conversation file, and a music file or a text file that may be provided according to service type, are linked. The contents 1120 may be displayed or not depending on whether contents exist.
[148] When the calling party selects the detailed inquiry 1115, the calling party may check a love level result value 1125, a change of emotion during a call 1130 and a change of love level 1135 of an arbitrary called party through a pop-up window.
[149] As shown in (b) of FIG. 12 illustrating the detailed inquiry screen, in the love level result value 1125, a love level, a concentration level, an anticipation level and an embarrassment level are displayed in the form of a graph or a table 1140, and a final result 1145 is displayed in the form of text or an image. The love level result value 1125 displayed as a graph or the table 1140 and the final result 1145 displayed as text or an image may be displayed differently according to the mobile communication service provider or a separate service provider that provides the love detection service, and thus the display method is not limited in this regard. The contents 1120, the change of emotion during a call 1130 and the change of love level 1135 are described with reference to FIG. 13.
[150] FIG. 13 is a view illustrating a screen providing a love detection result through a web service according to another embodiment of the present invention, and the change of emotion during a call 1130, the change of love level 1135 and the contents 1120 of the detailed inquiry 1115 shown in FIG. 12 are described with reference to FIG. 13.
[151] The change of emotion during a call 1130 of the detailed inquiry 1115 displays, over time, a love level, a concentration level, an anticipation level and an embarrassment level that are detected during a call between the calling party and the called party, and as shown in (a) of FIG. 13, the change of love level 1135 displays, by period, the love level, the concentration level, the anticipation level and the embarrassment level measured during a predetermined period. Preferably, the change of emotion during a call 1130 and the change of love level 1135 are displayed as broken-line graphs 1150; however, the present invention is not limited in this regard.
[152] The change of emotion during a call 1130 and the change of love level 1135 provide the love detection result by time and by a predetermined period, respectively, so that the calling party can easily check the love detection result at an arbitrary time or period. And, as shown in (a) of FIG. 13, a user may designate an arbitrary time or period in the change of emotion during a call 1130 and the change of love level 1135, respectively, and look up the results 1155.
[153] Meanwhile, in the case that the calling party selects the contents 1120 having a hyperlink, the corresponding contents are displayed as a pop-up window as shown in (b) of FIG. 13. The contents 1120 may include a voice recorded file 1160 in which the voice of the object for voice analysis is recorded, or the frequency of detection 1165 corresponding to the analysis result. The voice recorded file 1160 storing a conversation between the calling party and the called party may be erased at the calling party's selection. Preferably, a graph displayed when replaying the voice recorded file 1160 may be the love detection analysis result of the called party as the object for voice analysis.
[154] In the analysis result of the called party, the final love level may be displayed in the form of text, and the concentration level, anticipation level, embarrassment level and stress level may be displayed in the form of a graph. And, the contents 1120 corresponding to the analysis result of the called party may be displayed during replay of the voice recorded file 1160. The contents 1120 may include advice, an atmosphere-setting method or music according to service type, and may be provided in the form of a hyperlink. Meanwhile, in the frequency of detection 1165, the frequency of detection of each level and the final love level may be displayed in the form of a bar graph, and the display method may differ according to service type.
[155] That is, while the calling party listens to the words of the called party in the conversation, the calling party may check, through the graph or the frequency of detection, at which words of the corresponding called party the love level is high or low, and receive the corresponding contents 1120.
[156] Meanwhile, the voice analysis result provided through the web service described with reference to FIG. 13 replays only the voice of the called party and provides the voice analysis result of the called party in the form of a graph. In this embodiment, the calling party can check the change of emotion of the called party, but cannot check which words of the calling party changed the emotion of the called party. Therefore, in another embodiment, the present invention provides the voice analysis result of the called party, in the form of a graph shown at the start time of each voice of the called party, while replaying the whole conversation between the calling party and the called party. For this purpose, a technique is required to synchronize the conversation between the calling party and the called party with the voice analysis result of the called party.
[157] FIG. 14 is a view illustrating an embodiment of a method for synchronizing the whole conversation between the calling party and the called party and the voice analysis result of the called party.
[158] The layered voice analysis server 140 records the whole conversation during the call between the calling party and the called party, separates the voice of the called party, executes love detection analysis in the above-mentioned method, and stores the love detection analysis result. (a) of FIG. 14 shows the conversation between the calling party (A) and the called party (B) on a time axis, and (b) of FIG. 14 shows the separated voice of the called party (B). The layered voice analysis server 140 analyzes the separated voice of the called party (B). The voice analysis is made by units of voice of the called party (B). As shown in FIG. 14, the conversation of the called party (B) includes four voice units, the love detection analysis is made for each voice unit, and the four voice analysis results are divisionally stored as segments (1seg, 2seg, 3seg and 4seg).
[159] When the layered voice analysis server 140 separates the voice of the called party and executes voice analysis while recording the whole conversation, the layered voice analysis server 140 checks a start time and an end time of each voice of the called party based on the call start time, and records a voice start time (time pointer) and a voice continuation time (segment length) into a time stamp field of each of the voice analysis result segments. For example, in the case that the called party starts to respond to a question of the calling party one second after the call start time and finishes the response two seconds after the call start time, the layered voice analysis server 140 records 1 as the voice start time (time pointer) and 1 as the voice continuation time (segment length) into the time stamp field of the corresponding voice analysis result segment of the called party.
[160] As shown in (c) of FIG. 14, in a time stamp field of a first voice analysis result segment of the called party (B), 3 is recorded as the voice start time (time pointer) and 1 is recorded as the voice continuation time (segment length). In a time stamp field of a second voice analysis result segment of the called party (B), 7 is recorded as the voice start time (time pointer) and 1 is recorded as the voice continuation time (segment length). In a time stamp field of a third voice analysis result segment of the called party (B), 11 is recorded as the voice start time (time pointer) and 3 is recorded as the voice continuation time (segment length). In a time stamp field of a fourth voice analysis result segment of the called party (B), 17 is recorded as the voice start time (time pointer) and 2 is recorded as the voice continuation time (segment length).
[161] Whenever the voice of the called party occurs, the layered voice analysis server 140 checks the voice start time and the voice continuation time based on the call start time, and records them into the time stamp field of the corresponding voice analysis result segment in the above-mentioned manner. Therefore, the layered voice analysis server 140 synchronizes the whole conversation with the voice analysis result of the called party based on the time information recorded in the time stamp fields of the voice analysis result segments of the called party, and provides the synchronized result through the web service.
[162] If the calling party replays the whole conversation on the web, the layered voice analysis server 140 checks the elapsed time from the replay start time, and when it reaches the time recorded in the time stamp field of one of the voice analysis result segments of the called party, provides the love detection result of the corresponding voice analysis result segment in the form of a graph. Therefore, the calling party can check how the called party reacts to which words of the calling party.
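By way of illustration, a minimal sketch of this time-stamp synchronization follows. The time pointers and segment lengths mirror the FIG. 14 example; the love level values, the data structure names, and the lookup function are illustrative assumptions.

```python
# Sketch of time-stamp synchronization: each analysis segment carries a
# voice start time (time pointer) and duration (segment length) measured
# from the call start, so the result can be shown in step with replay.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ResultSegment:
    time_pointer: int        # seconds from call start when the voice begins
    segment_length: int      # seconds the voice continues
    love_level: float        # illustrative analysis result for this unit

# The four segments of FIG. 14 (love levels are invented placeholders):
segments = [ResultSegment(3, 1, 62.0), ResultSegment(7, 1, 71.0),
            ResultSegment(11, 3, 55.0), ResultSegment(17, 2, 80.0)]

def segment_at(t: float) -> Optional[ResultSegment]:
    """Return the segment whose graph should be displayed at replay time t."""
    for seg in segments:
        if seg.time_pointer <= t < seg.time_pointer + seg.segment_length:
            return seg
    return None
```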
[163] As such, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description. Industrial Applicability
[164] As described above, during a call between the calling party and the called party, the present invention analyzes an emotional state of the called party and provides the emotional state of the called party to the calling party in real time, thereby exciting the calling party's curiosity beyond simple voice/video communication. Thus, the present invention may remarkably increase call volume in the saturated voice call market, thereby increasing sales. And, the present invention enables the calling party to speak with a virtual opponent prior to an actual voice communication, so that the calling party can check in real time whether he/she expresses his/her emotion properly, thereby reducing a mistake or misunderstanding that may occur during a call with an actual called party.
Claims
[1] A method for providing a voice analysis service in a communication network, the method comprising:
(a) establishing a traffic channel between a calling terminal and a called terminal according to call connection of the calling terminal;
(b) performing layered voice analysis of voice of a called party transmitted from the called terminal to the calling terminal through the established traffic channel in real time to draw an emotional state information of the called party; and
(c) transmitting the drawn emotional state information of the called party to the calling terminal in real time.
[2] The method for providing a voice analysis service according to claim 1, wherein the step (a) includes: receiving a call origination including a feature code for the voice analysis service and a telephone number of the called party from the calling terminal; activating a voice analysis mode based on the feature code; and establishing the traffic channel between the calling terminal and the called terminal using the telephone number of the called party.
[3] The method for providing a voice analysis service according to claim 1 or 2, wherein the emotional state information is any one of a love level representing a degree of love of the called party to a calling party and a reliability level representing a degree of reliability of the called party to the calling party.
[4] The method for providing a voice analysis service according to claim 1, further comprising: after the call ends, preparing the emotional state information of the called party as a message and transmitting the message to the calling terminal.
[5] The method for providing a voice analysis service according to claim 1, wherein the step (b) includes: separating the voice of the called party from voices transmitted and received through the traffic channel; extracting a parameter for voice analysis from the separated voice of the called party; and drawing the emotional state information of the called party using the extracted parameter.
[6] A method for providing a voice analysis service during video communication in a video communication network, the method comprising: (a) establishing a video traffic channel between a calling terminal and a called terminal according to video communication call connection of the calling terminal;
(b) performing layered voice analysis of voice of a called party transmitted from the called terminal to the calling terminal through the video traffic channel in real time to draw an emotional state information of the called party;
(c) converting the drawn emotional state information of the called party to a visual data; and
(d) transmitting and displaying to the calling terminal the converted visual data and a video of the called party to be transmitted to the calling terminal through the video traffic channel.
[7] The method for providing a voice analysis service according to claim 6, wherein the step (d) integrates the converted visual data and the video of the called party into a single image and transmits the integrated image to the calling terminal.
[8] The method for providing a voice analysis service according to claim 6, wherein the step (c) selects multimedia contents corresponding to the drawn emotional state information of the called party.
[9] The method for providing a voice analysis service according to claim 6, further comprising: when receiving a key signal for stopping the voice analysis from the calling terminal, stopping the voice analysis.
[10] A method for providing a voice analysis service in a communication network, the method comprising:
(a) receiving call connection for the voice analysis service from a calling terminal;
(b) notifying a mode information of the voice analysis service to the calling terminal and receiving a selection response to the notification;
(c) in the case that a calling party selects an exercise mode, notifying a virtual opponent and an exercise method to the calling terminal and receiving a selection response to the notification;
(d) receiving voice of the calling party from the calling terminal according to the virtual opponent and exercise method selected by the calling party; and
(e) performing layered voice analysis of the received voice of the calling party to draw a voice analysis result, and transmitting the drawn voice analysis result to the calling terminal in real time.
[11] The method for providing a voice analysis service according to claim 10, further comprising:
(f) after the exercise mode ends, establishing a traffic channel between the calling party and an actual called party corresponding to the virtual opponent;
(g) performing layered voice analysis of voice of the called party transmitted to the calling party through the traffic channel to draw a voice analysis result; and
(h) transmitting the drawn voice analysis result to the calling party in real time.
[12] A system for providing a voice analysis service in a communication network, the system comprising: a switching center for receiving a call origination including a feature code for the voice analysis service and a telephone number of a called party from a calling terminal, and routing the call origination based on the feature code; and a layered voice analysis server for receiving the routed call, establishing a traffic channel between the calling terminal and a called terminal corresponding to the telephone number of the called party, analyzing voice of the called party transmitted to the calling terminal through the established traffic channel to draw an emotional state information in real time, and transmitting the drawn emotional state information to the calling terminal in real time.
[13] The system for providing a voice analysis service according to claim 12, wherein the layered voice analysis server transmits multimedia contents corresponding to the emotional state information to at least any one of the calling terminal and the called terminal.
[14] A mobile communication device for providing a voice analysis service during video communication in a mobile communication network, the mobile communication device comprising: a service control unit for controlling a service operation mode according to a video communication call connection signal from a calling terminal; a video communication unit for establishing a traffic channel between the calling terminal and a called terminal under the control of the service control unit to support video communication; and a voice analysis engine for analyzing voice of a called party transmitted to the calling terminal through the traffic channel under the control of the service control unit to convert an emotional state of the called party to a visual data.
[15] The mobile communication device according to claim 14, further comprising: an image integrating unit for integrating the visual data and a video of the called party into a single image and transmitting the integrated image to the calling terminal.
PCT/KR2007/003304 2006-07-06 2007-07-06 Method and system for providing voice analysis service, and apparatus therefor WO2008004844A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
KR1020060063593A KR20080004813A (en) 2006-07-06 2006-07-06 Reliability detection system for layered voice analysis and the service method for the same
KR10-2006-0063593 2006-07-06
KR1020060077781A KR20080016113A (en) 2006-08-17 2006-08-17 Method and system for love detection service using layered voice analysis
KR10-2006-0077781 2006-08-17
KR10-2007-0039360 2007-04-23
KR20070039360A KR100940088B1 (en) 2007-04-23 2007-04-23 Method and device for displaying emotion of telephonee in video communication

Publications (1)

Publication Number Publication Date
WO2008004844A1 true WO2008004844A1 (en) 2008-01-10

Family

ID=38894764

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2007/003304 WO2008004844A1 (en) 2006-07-06 2007-07-06 Method and system for providing voice analysis service, and apparatus therefor

Country Status (1)

Country Link
WO (1) WO2008004844A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000062279A1 (en) * 1999-04-12 2000-10-19 Amir Liberman Apparatus and methods for detecting emotions in the human voice
US7043008B1 (en) * 2001-12-20 2006-05-09 Cisco Technology, Inc. Selective conversation recording using speech heuristics
JP2004135224A (en) * 2002-10-15 2004-04-30 Fuji Photo Film Co Ltd Mobile telephone set
KR20050072334A (en) * 2004-01-06 2005-07-11 에스케이 텔레콤주식회사 Method for providing polygraph service in mobile communication network
KR20060070982A (en) * 2004-12-21 2006-06-26 주식회사 팬택 Apparatus and method for displaying favor about caller by using character

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010070584A1 (en) * 2008-12-19 2010-06-24 Koninklijke Philips Electronics N.V. Method and system for adapting communications
CN102257566A (en) * 2008-12-19 2011-11-23 皇家飞利浦电子股份有限公司 Method and system for adapting communications
EP2391105A1 (en) * 2010-05-25 2011-11-30 Sony Ericsson Mobile Communications AB Text enhancement system
US8588825B2 (en) 2010-05-25 2013-11-19 Sony Corporation Text enhancement
EP2482532A1 (en) * 2011-01-26 2012-08-01 Alcatel Lucent Enrichment of a communication
WO2013027893A1 (en) * 2011-08-22 2013-02-28 Kang Jun-Kyu Apparatus and method for emotional content services on telecommunication devices, apparatus and method for emotion recognition therefor, and apparatus and method for generating and matching the emotional content using same
CN103259906A (en) * 2012-02-15 2013-08-21 宇龙计算机通信科技(深圳)有限公司 Processing method and terminal for voice call
CN103259906B (en) * 2012-02-15 2016-01-06 宇龙计算机通信科技(深圳)有限公司 The processing method of voice call and terminal

Similar Documents

Publication Publication Date Title
CN111817943B (en) Data processing method and device based on instant messaging application
CN100481851C (en) Avatar control using a communication device
CN101485188A (en) Method and system for providing voice analysis service, and apparatus therefor
CN111683175B (en) Method, device, equipment and storage medium for automatically answering incoming call
CN105827516B (en) Message treatment method and device
CN106600223A (en) Schedule creation method and device
KR102136706B1 (en) Information processing system, reception server, information processing method and program
CN104144108B (en) A kind of message responding method, apparatus and system
JP2006005945A (en) Method of communicating and disclosing feelings of mobile terminal user and communication system thereof
CN102694896B (en) In order to store the method for communicating number, terminal and system
WO2008004844A1 (en) Method and system for providing voice analysis service, and apparatus therefor
KR102263154B1 (en) Smart mirror system and realization method for training facial sensibility expression
CN111261139A (en) Character personification broadcasting method and system
CN112073294A (en) Voice playing method and device of notification message, electronic equipment and medium
CN110536092A (en) Video message leaving method, device, electronic equipment and storage medium
JPWO2017179262A1 (en) Information processing apparatus, information processing method, and program
CN112565913B (en) Video call method and device and electronic equipment
CN113965541B (en) Conversation expression processing method and device
CN109559760B (en) Emotion analysis method and system based on voice information
CN111970295B (en) Multi-terminal-based call transaction management method and device
KR100945162B1 (en) System and method for providing ringback tone
KR100873806B1 (en) Apparatus and method for synchronizing voice call content and voice analysis result
JP2002041279A (en) Agent message system
CN108173802B (en) Communication processing method, device and terminal
KR20080095052A (en) Method and device for displaying emotion of telephonee in video communication

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 200780025409.7

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07768646

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 12009500053

Country of ref document: PH

NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: RU

122 Ep: pct application non-entry in european phase

Ref document number: 07768646

Country of ref document: EP

Kind code of ref document: A1