WO2006106671A1 - Image processing device, image display device, reception device, transmission device, communication system, image processing method, image processing program, and recording medium containing the image processing program - Google Patents

Image processing device, image display device, reception device, transmission device, communication system, image processing method, image processing program, and recording medium containing the image processing program

Info

Publication number
WO2006106671A1
WO2006106671A1 (PCT/JP2006/306297)
Authority
WO
WIPO (PCT)
Prior art keywords
image
information
voice
image processing
display
Prior art date
Application number
PCT/JP2006/306297
Other languages
French (fr)
Japanese (ja)
Inventor
Toshiharu Baba
Original Assignee
Pioneer Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pioneer Corporation filed Critical Pioneer Corporation
Publication of WO2006106671A1 publication Critical patent/WO2006106671A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/205 3D [Three Dimensional] animation driven by audio data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids

Definitions

  • Image processing apparatus, image display apparatus, receiving apparatus, transmitting apparatus, communication system, image processing method, image processing program, and recording medium recording the image processing program
  • the present invention relates to an image processing device that displays an image, an image display device, a receiving device, a transmitting device, a communication system, an image processing method, an image processing program, and a recording medium that records the image processing program.
  • Patent Document 1 Japanese Patent Application Laid-Open No. 2004-15158 (refer to pages 4 to 5 and FIGS. 3 to 4) Disclosure of the Invention
  • An object of the present invention is to provide an image processing device, an image display device, a receiving device, a transmitting device, a communication system, an image processing method, an image processing program, and a recording medium on which the image processing program is recorded, each capable of displaying an appropriate image.
  • The image processing apparatus of the present invention is an image processing apparatus that processes an image displayed on a display unit according to audio information obtained by reception, and includes: voice state recognition means for recognizing a change in the voice state in the audio information; and a display control unit configured to display the image on the display unit and to change the displayed image in accordance with the change in the voice state.
  • The image display device of the present invention is characterized by comprising storage means for storing an image, display means for displaying the image when the audio information is received, and the image processing device of the present invention described above.
  • The receiving device of the present invention includes the above-described image display device of the present invention and a receiving unit capable of receiving the audio information, and the display control unit displays, when the audio information is received by the receiving unit, an image corresponding to the transmission source of the audio information.
  • The transmission device of the present invention includes the above-described image display device of the present invention and a transmission unit capable of transmitting and receiving the audio information, wherein the transmission unit includes calling means for calling a transmission destination to which the audio information is transmitted, the display means displays an image corresponding to the transmission destination in response to a call made by the calling means or a response of the transmission destination to the call, and the image processing apparatus changes the displayed image in accordance with a change in the voice state in the audio information received from the transmission destination in response to the call.
  • The communication system of the present invention is a communication system including transmission/reception terminals capable of transmitting and receiving audio information to and from each other, wherein each of the transmission/reception terminals includes storage means for storing images and transmits and receives the audio information.
  • The image processing method of the present invention is an image processing method for processing an image displayed on display means according to audio information obtained by reception, and is characterized by recognizing a change in the voice state in the audio information, displaying the image on the display means, and changing the displayed image in accordance with the change in the voice state.
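The claimed method amounts to three steps: recognize a change in the voice state of received audio, display an image, and change the displayed image according to that state. A minimal sketch of this flow, assuming hypothetical feature inputs (volume, tempo) and name suffixes for image variants — none of these names or thresholds come from the patent:

```python
# Minimal sketch of the claimed image processing method:
# (1) recognize a voice state from received audio features,
# (2) select the image variant matching that state.
# All function names, thresholds, and variant suffixes are
# illustrative assumptions, not the patented implementation.

def recognize_voice_state(volume: float, tempo: float) -> str:
    """Classify a crude voice state from two signal features."""
    if volume > 0.8:
        return "excited"
    if tempo < 0.3:
        return "calm"
    return "normal"

def process_image(base_image: str, voice_state: str) -> str:
    """Return the image variant matching the recognized state."""
    variants = {
        "excited": base_image + "_smile",
        "calm": base_image + "_neutral",
        "normal": base_image,
    }
    return variants.get(voice_state, base_image)

state = recognize_voice_state(volume=0.9, tempo=0.5)
shown = process_image("caller_face", state)
# shown == "caller_face_smile"
```

In the embodiment described below, the "voice state" is refined into an emotion ID matched against a stored table, and the "variant" is produced by deforming a registered face image rather than swapping files.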
  • An image processing program of the present invention is characterized by causing computing means to function as the above-described image processing apparatus of the present invention.
  • Another image processing program of the present invention is characterized by causing computing means to execute the above-described image processing method of the present invention.
  • A recording medium on which the image processing program of the present invention is recorded is characterized in that the above-described image processing program of the present invention is recorded so as to be readable by computing means.
  • FIG. 1 is an overall perspective view of a portable telephone device according to an embodiment of the present invention.
  • FIG. 2 is a block diagram schematically showing the internal configuration of the portable telephone device.
  • FIG. 3 is a schematic diagram showing an outline of a registration information table recorded in storage means.
  • FIG. 4 is a schematic diagram showing an image of registered image information recorded in a storage means.
  • FIG. 5 is a schematic diagram showing a schematic configuration of a first image template table recorded in a storage means.
  • FIG. 6 is a schematic diagram showing a schematic configuration of a second image template table recorded in a storage means.
  • FIG. 7 is a schematic diagram showing a schematic configuration of an emotion recognition information table recorded in a storage means.
  • FIG. 8 is a block diagram schematically showing a configuration of a processing unit.
  • FIG. 9A is a schematic diagram showing an emotion deformation image transformed by the image processing means.
  • FIG. 9B is a schematic diagram showing another emotion deformation image transformed by the image processing means.
  • FIG. 9C is a schematic diagram showing another emotion deformation image transformed by the image processing means.
  • FIG. 10 is a flowchart showing a photographing process of a mobile phone.
  • FIG. 11 is a flowchart of incoming call processing when a mobile phone is incoming.
  • FIG. 12 is a flowchart of a calling process when a mobile phone is called.
  • FIG. 13A is a schematic diagram showing an example of an eyebrow image taken when a person in the eyebrow image of the first image basic information is in a smiling state in a modification of the mobile phone of the present embodiment.
  • FIG. 13B is a schematic diagram showing an example of an eyebrow image taken when a person in the eyebrow image of the first image basic information is “angry” in a modification of the mobile phone of the present embodiment.
  • FIG. 13C is a schematic diagram showing an example of an eyebrow image captured when a person in the eyebrow image of the first image basic information is in a sad state in a modification of the mobile phone according to the present embodiment.
  • FIG. 14A is a schematic diagram showing an example of a mouth image taken when a person in the mouth image of the second image basic information is in a smiling state in a modification of the mobile phone according to the present embodiment.
  • FIG. 14B is a schematic diagram showing an example of a mouth image taken when a person in the mouth image of the second image basic information is “angry”.
  • FIG. 14C is a schematic diagram showing an example of a mouth image taken when a person in the mouth image of the second image basic information is in a sad state.
  • 185 Image processing means that also functions as image information recognition means and image deformation processing means
  • In FIG. 1, reference numeral 100 denotes a portable telephone device (hereinafter referred to as a mobile phone 100) that also functions as an image display device, a receiving device, a transmitting device, and a transmitting/receiving terminal.
  • The mobile phone 100 communicates, via a network such as a telephone line or an Internet line, with a communication device (not shown) such as another mobile phone or a general telephone device, and is connected so as to be able to make a call with this communication device. A predetermined image is then displayed on the display 110 according to the communication partner.
  • In the present embodiment, the mobile phone 100 is used as the image display device, but the present invention is not limited to this; for example, the image display device may be a device in which a personal computer processes an image and displays it on an image display unit such as a monitor.
  • As shown in FIG. 1, the mobile phone 100 includes an upper casing 100A and a lower casing 100B that can accommodate a circuit board and the like, and the upper casing 100A is rotatably attached to the lower casing 100B by a rotating portion 100C. Note that FIG. 1 illustrates an example in which the upper casing 100A is pivotably attached to the lower casing 100B; for example, the upper casing 100A and the lower casing 100B may instead be integrally formed.
  • The upper casing 100A is provided with a display 110, an audio output unit 120, and a transmission/reception unit 130 (see FIG. 2) serving as a receiving unit and a transmitting unit and having an antenna 130A (see FIG. 1). The lower casing 100B is provided with an operation unit 140, an audio input unit 150, a storage unit 160, a memory 170, a processing unit 180 that also functions as an image processing device, and the like.
  • The circuit board provided in the upper casing 100A and connected to the display 110, the audio output unit 120, and the transmission/reception unit 130 is electrically connected, by a flexible board inserted through the inside of the rotating portion 100C, to the circuit board in the lower casing 100B to which the processing unit 180 is connected.
  • the upper casing 100A is partially formed with a window portion that communicates the inside and the outside, and the camera portion 101 for taking a picture is attached to the window portion so as to face the outside.
  • the camera unit 101 may be provided on the surface of the upper housing 100A opposite to the side on which the display 110 faces, or on another position.
  • the display 110 includes a display area in which various information such as predetermined image information and text information is displayed under the control of the processing unit 180.
  • This image information includes, for example, image information recorded in the storage means 160, TV image data received by a TV receiver (not shown), image data recorded on a recording medium of an external device such as an optical disk, a magnetic disk, or a memory card and read by a drive or driver, and image data from the memory 170.
  • Examples of the display 110 include a liquid crystal display panel, an organic EL (Electro Luminescence) panel, a PDP (Plasma Display Panel), a CRT (Cathode-Ray Tube), an FED (Field Emission Display), and an electrophoretic display panel.
  • the sound output unit 120 converts predetermined sound information into sound and outputs the sound under the control of the processing unit 180.
  • Examples of the voice information include voice information related to the voice of the other speaker in a call, music information recorded in the storage means 160 or the like, and a warning sound stored in the memory 170 or the like.
  • The audio output unit 120 shown in FIG. 1 has a speaker that can be switched between directional and omnidirectional output by the user's operation, so that, for example, the output sound is audible even when the user uses the mobile phone 100 away from the ear. Alternatively, a separate omnidirectional speaker may be provided, for example, on the side surface of the lower casing 100B opposite to the surface on which the operation unit 140 is provided.
  • The transmission/reception unit 130 includes the antenna 130A. Under the control of the processing unit 180, the transmission/reception unit 130 transmits the user's voice information input from the voice input unit 150 via the antenna 130A to a communication device such as another telephone device, and receives voice information transmitted from the other party's communication device via the antenna 130A and outputs it to the processing unit 180. The transmission/reception unit 130 also receives information from, for example, another server on the network and outputs it to the processing unit 180, and transmits information input from the processing unit 180 to a predetermined server or terminal device, in each case under the control of the processing unit 180.
  • the operation unit 140 has various operation buttons and operation knobs.
  • Examples of the input operations performed with these operation buttons and operation knobs include inputting the other party's number when making a call to the other party's mobile phone, browsing the registered information of the other party, starting and ending a call, and setting the acquisition of information from another server.
  • the operation unit 140 appropriately outputs a predetermined signal to the processing unit 180 by a user input operation.
  • the audio input unit 150 is provided on the lower housing 100B side and includes a microphone capable of inputting audio.
  • The voice input unit 150 can be switched between directional and omnidirectional operation via the operation unit 140, so that, for example, voice can be input from the voice input unit 150 even when the user uses the mobile phone 100 away from the ear.
  • The storage means 160 readably stores, for example, the registration information table 10 shown in FIG. 3, the first image template table 20 shown in FIG. 5, the second image template table 30 shown in FIG. 6, and the emotion recognition information table 40 shown in FIG. 7.
  • Specifically, the storage means 160 has a registration information storage area in which the registration information table 10 is recorded, a first image storage area in which the first image template table 20 is recorded, a second image storage area in which the second image template table 30 is recorded, and an emotion recognition information storage area in which the emotion recognition information table 40 is recorded.
  • the storage means 160 may be configured to include a recording area for recording other information in addition to these four storage areas.
  • Alternatively, the registration information storage area, the first image storage area, the second image storage area, the emotion recognition information storage area, and the like may be provided in the memory 170 instead of the storage means 160.
  • Examples of the storage means 160 include a configuration having a drive or driver that can read from and write to a recording medium such as an HD (Hard Disk) or a memory card, or a configuration using a DVD (Digital Versatile Disc), an optical disc, or the like.
  • The registration information table 10 is information used, for example, for the user to confirm information about the other speaker when making a call with that speaker.
  • The registration information table 10 has a table structure in which a plurality of pieces of registration information 11 are recorded, each piece being composed of registration ID information 12, registrant name information 13, telephone number information 14, registration image information 15 as image information, registration detailed information 16, and the like associated as one piece of data.
  • The registration ID information 12 is unique information for specifying the registration information 11 and is set for each piece of registration information 11. For example, when the user registers the registration information 11 in the registration information table 10, the registration ID information 12 is automatically assigned as a serial number.
  • The registrant name information 13 is information relating to the name of the registrant registered as the registration information 11 specified by the registration ID information 12.
  • the registrant name information 13 is recorded in a text format, for example.
  • The telephone number information 14 is information relating to the telephone number of the mobile phone or general telephone of the registrant registered as the registration information 11 specified by the registration ID information 12.
  • the telephone number information 14 is information that is referred to when, for example, an incoming call is received from the registrant to the user's mobile phone 100 or when the user makes a call to the registrant.
  • The registered image information 15 is image information, registered as the registration information 11 specified by the registration ID information 12, relating to an image to be displayed on the display 110 when a call is made to or received from the registrant.
  • the image of the registered image information 15 is an image obtained by capturing a front face image of a registrant, for example, so as to be stored in a predetermined frame.
  • a registered image 50 of the registered image information 15 is image information taken in a state where the registrant's face is housed in a frame 51.
  • the frame 51 is divided by five substantially parallel frame lines 52, and the registrant's face image is arranged so that the eyebrows, eyes, nose, and mouth are arranged in the divided areas.
  • Regions extending a predetermined dimension in the left-right direction from the center line of the frame 51 in the frame area A in which the eyebrows are arranged constitute the right eyebrow region 53 and the left eyebrow region 54, and a region extending a predetermined dimension in the left-right direction from the center line of the frame 51 in the frame area D in which the mouth is arranged constitutes the mouth region 55.
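The region layout described above — horizontal bands bounded by parallel frame lines, each region spanning a fixed half-width either side of the frame's vertical center line — can be sketched as plain rectangle arithmetic. All dimensions, band indices, and names below are illustrative assumptions, not values from the patent:

```python
# Sketch of deriving facial-feature regions from the registration
# frame: five frame lines split the frame into six horizontal bands,
# and each feature region is centered on the frame's vertical center
# line within its band. Dimensions are illustrative only.

FRAME_W, FRAME_H = 120, 180
BANDS = 6                      # five frame lines -> six bands
BAND_H = FRAME_H // BANDS      # height of one band

def band_region(band_index: int, half_width: int):
    """Rectangle (x0, y0, x1, y1) centered on the frame's
    vertical center line within the given horizontal band."""
    cx = FRAME_W // 2
    y0 = band_index * BAND_H
    return (cx - half_width, y0, cx + half_width, y0 + BAND_H)

eyebrow_region = band_region(1, 40)   # frame area A (eyebrows)
mouth_region = band_region(4, 30)     # frame area D (mouth)
# eyebrow_region == (20, 30, 100, 60); mouth_region == (30, 120, 90, 150)
```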
  • In FIG. 4, the frame 51 and the frame lines 52 are illustrated, but these are displayed only as a guide when the registrant is photographed by, for example, the camera unit 101 of the mobile phone 100; the frame 51 and the frame lines 52 are therefore not displayed in the registered image 50 of the registered image information 15 that is actually registered.
  • The registered image 50 may be, for example, a photograph taken with the camera unit 101 provided in the mobile phone 100, or an image taken with a commercially available digital camera device and then registered.
  • The registration detailed information 16 is information relating to detailed items of the registrant registered as the registration information 11 specified by the registration ID information 12.
  • The detailed items recorded in the registration detailed information 16 include, for example, the registrant's address and workplace, the registrant's gender, information on the action to be taken when there is an incoming call from the registrant, the registrant's e-mail address, and basic voice information on the registrant's voice.
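The registration information table described above pairs a serially assigned ID with a name, phone number, registered image, and free-form detail items. A minimal sketch of that structure, assuming hypothetical field and function names:

```python
# Sketch of the registration information table (FIG. 3): each entry
# mirrors registration ID information 12, registrant name information
# 13, telephone number information 14, registration image information
# 15, and registration detailed information 16. Names are illustrative.
from dataclasses import dataclass, field

@dataclass
class RegistrationInfo:
    reg_id: int                 # registration ID (serial number)
    name: str                   # registrant name (text format)
    phone: str                  # telephone number
    image_path: str             # registered image
    details: dict = field(default_factory=dict)  # detail items

table: list[RegistrationInfo] = []

def register(name: str, phone: str, image_path: str, **details):
    """Append a new entry; the ID is assigned automatically
    as a serial number, as the text describes."""
    entry = RegistrationInfo(len(table) + 1, name, phone, image_path, details)
    table.append(entry)
    return entry

alice = register("Alice", "09011112222", "alice.png", gender="F")
# alice.reg_id == 1
```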
  • The first image template table 20 is information in which deformation patterns of the eyebrow images in the right eyebrow region 53 and the left eyebrow region 54, used when the processing unit 180 deforms and edits the registered image 50 recorded in the registered image information 15, are recorded.
  • As shown in FIG. 5, the first image template table 20 has a table structure in which a plurality of pieces of first image template information 21 are recorded, each associating first image ID information 22, first image basic information 23, first image pattern information 24 as deformation amount information and deformation direction information, corresponding emotion information 25, and the like.
  • The first image ID information 22 is unique information that identifies the first image template information 21 and differs for each piece of first image template information 21; for example, a serial number is recorded.
  • The first image basic information 23 is image information serving as the basis of the first image template information 21 specified by the first image ID information 22; for example, a basic image of normal eyebrows is recorded. This basic image is formed in a rectangular shape of the same size as, or similar to, the rectangular shapes of the right eyebrow region 53 and the left eyebrow region 54 of the registered image 50 shown in FIG. 4, and an image of, for example, normal eyebrows is arranged at the approximate center of the rectangle.
  • The first image pattern information 24 records, with respect to the basic image of the first image basic information 23 of the first image template information 21 specified by the first image ID information 22, the deformation direction and the deformation amount (deformation rate) indicating in which direction and by how much each part of the eyebrow image is to be deformed in a predetermined emotional state. For example, in the case of a smiling face, information such as moving the approximate center of the eyebrow upward by, for example, 10 dots and moving both ends of the eyebrow downward by, for example, 20 dots is recorded. Alternatively, the ratios by which each part of the eyebrow is moved upward or downward relative to the vertical dimension of the rectangles of the right eyebrow region 53 and the left eyebrow region 54 may be recorded, or the deformation amount and deformation direction may be recorded in vector format.
  • The corresponding emotion information 25 is information indicating what kind of emotion the deformation rate of the first image pattern information 24 of the first image template information 21 specified by the first image ID information 22 expresses. The corresponding emotion information 25 is recorded, for example, as a numerical value such as "0" indicating "normal" or "no expression", "1" indicating "smile", or "2" indicating "anger"; alternatively, these emotions may be recorded in a text format such as "normal", "smile", or "anger".
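The deformation pattern just described — for a smile, move the center of the eyebrow up about 10 dots and both ends down about 20 dots — amounts to per-control-point offsets applied to the basic image. A sketch under that reading, with illustrative control-point names and the dot values taken from the example in the text (positive dy meaning upward):

```python
# Sketch of applying a first-image-pattern deformation to eyebrow
# control points. A pattern maps a point label to a (dx, dy) offset
# in dots; positive dy moves the point upward. The numbers follow
# the smile example in the text; the labels are assumptions.

SMILE_EYEBROW_PATTERN = {
    "center": (0, 10),       # eyebrow center moves up 10 dots
    "inner_end": (0, -20),   # both ends move down 20 dots
    "outer_end": (0, -20),
}

def deform(points: dict, pattern: dict) -> dict:
    """Offset each named control point by the pattern's (dx, dy)."""
    return {
        label: (x + pattern.get(label, (0, 0))[0],
                y + pattern.get(label, (0, 0))[1])
        for label, (x, y) in points.items()
    }

eyebrow = {"inner_end": (10, 50), "center": (25, 55), "outer_end": (40, 50)}
smiling = deform(eyebrow, SMILE_EYEBROW_PATTERN)
# smiling["center"] == (25, 65); smiling["inner_end"] == (10, 30)
```

The ratio and vector-format variants mentioned in the text would only change how the `(dx, dy)` entries are stored, not how they are applied.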
  • The second image template table 30 is information in which deformation patterns of the mouth image in the mouth region 55, used when the processing unit 180 deforms and edits the registered image 50 recorded in the registered image information 15, are recorded. As shown in FIG. 6, the second image template table 30 has a table structure in which a plurality of pieces of second image template information 31 are recorded, each associating second image ID information 32, second image basic information 33, second image pattern information 34 as deformation amount information and deformation direction information, corresponding emotion information 35, and the like.
  • The second image ID information 32 is unique information that identifies the second image template information 31 and differs for each piece of second image template information 31; for example, a serial number is recorded.
  • The second image basic information 33 is image information serving as the basis of the second image template information 31 specified by the second image ID information 32; for example, a basic image of the mouth at normal times is recorded. This basic image is formed in a rectangular shape of substantially the same size as, or similar to, the rectangular shape of the mouth region 55 of the registered image shown in FIG. 4, and the mouth image is arranged at the approximate center.
  • The second image pattern information 34 records, with respect to the basic image of the second image basic information 33 of the second image template information 31 specified by the second image ID information 32, the deformation direction and the deformation amount (rate of change) indicating in which direction and by how much each part of the mouth image is to be moved in a predetermined emotional state. For example, in the case of a smiling face, information such as moving both ends of the mouth image upward by, for example, 20 dots is recorded. Alternatively, the rates by which each part of the mouth is moved upward, downward, or to the left or right may be recorded, or the information may be recorded in vector format.
  • Like the corresponding emotion information 25 of the first image template information 21, the corresponding emotion information 35 is information indicating what kind of emotion the rate of change of the second image pattern information 34 of the second image template information 31 specified by the second image ID information 32 expresses. The corresponding emotion information 35 is recorded, for example, as a numerical value such as "0" indicating "normal" or "no expression", "1" indicating "smile", or "2" indicating "anger", or may be recorded in a text format.
  • The emotion recognition information table 40 is a data group used when the processing unit 180 recognizes the emotion of the speaker who produced the voice from the state of the other speaker's voice information, for example, the strength or tempo of the voice.
  • The emotion recognition information table 40 has a table structure in which a plurality of pieces of emotion recognition information 41 are recorded, each constructed as one piece of data by associating emotion ID information 42, voice pattern information 43 as voice state change information, emotion information 44, and the like.
  • The emotion ID information 42 is unique information that identifies the emotion recognition information 41.
  • The voice pattern information 43 is information relating to the state of the voice. Specifically, the voice pattern information 43 records phrase information relating to emotional phrases such as "kora" and "hahaha", strength information relating to the strength of the voice, pitch information relating to the highness or lowness of the voice, and tempo information relating to the tempo of the voice.
  • The emotion information 44 is information relating to the emotion corresponding to the voice pattern information 43 of the emotion recognition information 41 specified by the emotion ID information 42. For example, if the phrase "hahaha" is recorded as phrase information together with strong voice strength in the voice pattern information 43, ID information such as "1", indicating loud laughter, is recorded as the emotion information 44.
  • the emotion information 44 may be recorded in a text format such as “smile” or “anger”.
  • the emotion information 44 is associated with the corresponding emotion information 25 of the first image template information 21 and the corresponding emotion information 35 of the second image template information 31 described above.
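The table above pairs voice patterns (phrase, strength, and so on) with emotion IDs that key into the corresponding emotion information of the two image template tables. A sketch of that matching step, where the entries, thresholds, and function name are illustrative assumptions built from the "hahaha"/"kora" examples in the text:

```python
# Sketch of matching received voice features against the emotion
# recognition table: each entry pairs a voice pattern (phrase
# substring plus a minimum strength) with an emotion ID used by the
# image template tables' corresponding-emotion information.
# Entries and thresholds are illustrative, not from the patent.

EMOTION_TABLE = [
    # (phrase substring, minimum strength, emotion ID)
    ("hahaha", 0.7, 1),   # loud laughter -> "smile" (ID 1)
    ("kora",   0.5, 2),   # raised, scolding voice -> "anger" (ID 2)
]

def recognize_emotion(phrase: str, strength: float) -> int:
    """Return the emotion ID of the first matching voice pattern;
    0 ("normal"/"no expression") if nothing matches."""
    for key, min_strength, emotion_id in EMOTION_TABLE:
        if key in phrase.lower() and strength >= min_strength:
            return emotion_id
    return 0

# recognize_emotion("Hahaha, that's great", 0.9) == 1
```

The returned ID would then select the matching first and second image pattern information (eyebrow and mouth deformations) for display.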
  • the memory 170 stores setting items input and operated by the operation unit 140 so that they can be read as appropriate.
  • the memory 170 stores various programs developed on an OS (Operating System) that controls the operation of the entire mobile phone 100.
  • The memory 170 may also be configured to include a drive or driver that can read from and write to a recording medium such as an HD (Hard Disk) or a magneto-optical disk.
  • The processing unit 180 includes various input/output ports (not shown), such as a display control port to which the display 110 is connected, an audio output control port to which the audio output unit 120 is connected, a transmission/reception port to which the transmission/reception unit 130 is connected, an input port to which the operation unit 140 is connected, a voice input control port to which the voice input unit 150 is connected, a storage port to which the storage unit 160 is connected, and a memory port to which the memory 170 is connected. As shown in FIG. 8, the processing unit 180 includes, as various programs, incoming/outgoing call recognition means 181 that also functions as calling means, partner speaker recognition means 182, voice recognition means 183 as voice information recognition means, voice change determination means 184 as voice state recognition means, image processing means 185 that also functions as image information recognition means and image deformation processing means, display control means 186, photographing means 187, and the like.
• The arrival/departure recognition means 181 recognizes incoming calls from other communication devices to the mobile phone 100 and outgoing calls from the mobile phone 100 to other communication devices. Specifically, the arrival/departure recognition means 181 controls the transmission/reception unit 130 to receive incoming call information requesting a call from another communication device. When the arrival/departure recognition means 181 recognizes the incoming call information, it controls the audio output unit 120 to output an incoming sound such as a voice, warning sound, or notification sound, notifying the user that incoming call information has been received.
• When the arrival/departure recognition means 181 recognizes request information indicating that a call is to be made to a predetermined destination communication device by the user's input operation on the operation unit 140, it transmits call information requesting a call to the destination communication device.
• The partner speaker recognition means 182 recognizes the partner speaker of an incoming call and the partner speaker of an outgoing call.
• Specifically, the partner speaker recognition means 182 recognizes the telephone number of the calling party from the incoming call information received by the arrival/departure recognition means 181, and recognizes, from the registration information table 10, the registration information 11 having the telephone number information 14 that matches the recognized telephone number. Furthermore, the partner speaker recognition means 182 recognizes the destination telephone number of the call information transmitted by the arrival/departure recognition means 181, and recognizes, from the registration information table 10, the registration information 11 having the telephone number information 14 that matches the destination telephone number. The partner speaker recognition means 182 then stores the recognized registration information 11 in the memory 170 so that it can be read out as appropriate.
• The voice recognition means 183 recognizes the voice state of the partner speaker from the voice information received by the transmission/reception unit 130, and recognizes the standard voice state of the partner speaker as voice basic information. Specifically, the voice recognition means 183 recognizes the voice information of the partner speaker input via the transmission/reception unit 130. It then recognizes the voice pattern of the partner speaker from the voice state of the voice information, that is, the voice pitch, the voice strength, the tempo at which the voice is emitted, and the like, as the voice basic information. The voice recognition means 183 may also be configured to recognize the partner speaker's gender, basic voice characteristics, and the like as voice basic information based on the contents described in the registration details information 16 of the registration information 11.
• The voice change determination means 184 determines a change in the voice when the voice state of the voice information received by the transmission/reception unit 130 changes from the voice basic information of the partner speaker recognized by the voice recognition means 183.
• Specifically, the voice change determination means 184 detects the voice pattern of the voice state. If the detected voice pattern differs in pitch, strength, tempo, or the like from the voice pattern of the voice basic information, it recognizes, based on the emotion recognition information table 40, the voice pattern information 42 that substantially matches the voice pattern of the received voice, and recognizes the emotion recognition information 41 corresponding to that voice pattern information 42.
• When the voice change determination means 184 recognizes that the voice of the received voice information includes phrase information representing an emotion, such as "kora" or "hahaha", it recognizes the emotion recognition information 41 whose voice pattern information 43 includes that phrase information.
  • the voice change determination means 184 stores the emotion recognition information 41 including the recognized emotion information 44 in the memory 170 so that it can be read out as appropriate.
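The change determination described above, comparing the received voice pattern against the voice basic information, can be sketched as follows. The dictionary representation of a voice pattern (pitch, strength, tempo) and the 20% relative-change threshold are assumptions chosen for illustration; the embodiment only states that differences in pitch, strength, and tempo are detected.

```python
# Illustrative sketch of the voice change determination (means 184): report
# which features of the current voice pattern deviate from the voice basic
# information by more than a relative threshold. Feature names and the
# threshold value are assumptions, not from the embodiment.
def detect_voice_change(basic, current, threshold=0.2):
    """basic/current: dicts with 'pitch', 'strength', 'tempo' (arbitrary units).
    Returns the set of feature names whose relative change exceeds threshold."""
    changed = set()
    for key in ("pitch", "strength", "tempo"):
        base = basic[key]
        if base and abs(current[key] - base) / abs(base) > threshold:
            changed.add(key)
    return changed
```

A non-empty result would then drive the lookup of matching voice pattern information in the emotion recognition information table 40.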
• The image processing means 185 recognizes the registered image information 15 of the registration information 11 corresponding to the partner speaker. Then, based on the emotion information 44 of the emotion recognition information 41 recognized by the voice change determination means 184, it modifies the image of the registered image information 15. Specifically, the image processing means 185 recognizes the right eyebrow region 53 and the left eyebrow region 54 of the registered image 50 recorded in the registered image information 15. Then, based on the first image template table 20, the image processing means 185 searches for the first image template information 21 having first image basic information 23 of an image that substantially matches the eyebrow images of the right eyebrow region 53 and the left eyebrow region 54.
• In this search, for example, the eyebrow image recorded in the first image basic information 23 is overlaid on the right eyebrow region 53 and the left eyebrow region 54 of the registered image information 15, and the first image basic information 23 with the highest degree of overlap of the eyebrow portions is retrieved.
• From the first image template information 21 narrowed down by the search, the image processing means 185 recognizes the first image template information 21 having corresponding emotion information 25 that corresponds to the emotion information 44 recognized by the voice change determination means 184.
• The image processing means 185 then deforms the shape of the eyebrows in the right eyebrow region 53 and the left eyebrow region 54 of the registered image 50 of the registered image information 15 according to the change rate recorded in the first image pattern information 24 of that first image template information 21.
• The image processing means 185 also deforms the mouth image in the same manner as the deformation processing of the eyebrow images in the right eyebrow region 53 and the left eyebrow region 54. That is, the image processing means 185 recognizes the mouth region 55 of the registered image 50. Then, based on the second image template table 30, it searches for the second image template information 31 having second image basic information 33 of an image that substantially matches the mouth image in the mouth region 55. To do this, for example, the mouth image recorded in the second image basic information 33 is overlaid on the mouth region 55 of the registered image information 15, and the second image basic information 33 with the largest degree of overlap of the mouth portion is retrieved.
• From the second image template information 31 narrowed down by the search, the image processing means 185 recognizes the second image template information 31 having corresponding emotion information 35 that corresponds to the emotion information 44 recognized by the voice change determination means 184. Then, the mouth shape of the mouth region 55 of the registered image 50 of the registered image information 15 is deformed according to the change rate recorded in the second image pattern information 34 of that second image template information 31. In addition, the image processing means 185 stores the registered image 50 with the deformed eyebrows and mouth in the memory 170 as an emotion-deformed image so that it can be read out as appropriate.
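The two steps just described, choosing the template whose part image overlaps the registered image's region the most, then displacing the region according to the recorded change rate, can be sketched as follows. Representing a part image as a set of pixel coordinates and the deformation as a vertical point displacement are assumptions made only to keep the example concrete; the embodiment does not specify how overlap or the change rate is encoded.

```python
# Minimal sketch of the image processing means 185: (1) overlap-based template
# search, (2) point displacement by the template's change rate. The tuple
# template format (template_id, pixel_set, change_rate) is hypothetical.
def best_template(region_pixels, templates):
    """templates: list of (template_id, pixel_set, change_rate).
    Returns the template whose pixel set overlaps region_pixels the most."""
    return max(templates, key=lambda t: len(region_pixels & t[1]))

def deform_points(points, change_rate):
    """Move each (x, y) control point vertically by change_rate, emulating
    the eyebrow/mouth deformation according to the recorded change rate."""
    return [(x, y + change_rate) for (x, y) in points]
```

The same pair of operations would be applied once with the first image template table 20 (eyebrows) and once with the second image template table 30 (mouth).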
• The display control means 186 controls the display 110 to display the image of the registered image information 15 of the registration information 11 in its display area. Specifically, when the arrival/departure recognition means 181 recognizes call request information indicating that a call is to be sent to the communication device of the partner speaker by the user's operation of the operation unit 140, the display control means 186 performs control to display the registered image 50 recorded in the registered image information 15 on the display 110. In addition, when the arrival/departure recognition means 181 recognizes that the mobile phone 100 has an incoming call from a communication device such as another mobile phone or a general telephone, and the partner speaker recognition means 182 recognizes the registered image information 15 of the registration information 11 of the partner speaker, the display control means 186 performs control to display the registered image 50 of that registered image information 15 on the display 110.
• The display control means 186 also performs control to display the emotion-deformed image processed by the image processing means 185 in the display area of the display 110, as shown in FIG. 9A to FIG. 9C, for example.
• FIG. 9A shows an example in which the voice change determination means 184 determines that the voice state of the partner speaker's voice information is laughing, and the image processing means 185 has deformed the registered image 50 according to the emotion information 44.
• FIG. 9B shows an example in which the voice change determination means 184 determines that the voice pattern of the partner speaker's voice information is "angry", and the image processing means 185 has deformed the registered image 50 according to the emotion information 44.
• The display control means 186 also controls the display 110 to display the video input from the camera unit 101 as image information in the display area of the display 110.
• When the photographing means 187 recognizes request information for capturing an image by the user's input operation on the operation unit 140, it controls the camera unit 101 so that an image can be captured. In addition, it controls the display 110 to display an image of the shooting range of the camera unit 101. Furthermore, when the photographing means 187 recognizes, by the user's setting input, request information indicating that an image that can be processed by the image processing means 185 is to be captured, it performs control to display the frame 51 and the frame line 52 as illustrated in FIG. 4 on the display 110. When request information indicating that the image is to be captured is recognized by the user's input operation, the image within the shooting range of the camera unit 101 is captured and stored in the storage unit 160 as image information.
  • FIG. 10 is a flowchart showing the photographing process of the mobile phone 100.
• When the processing unit 180 of the mobile phone 100 recognizes shooting request information requesting the shooting of a predetermined video by the camera unit 101 through the user's operation of the operation unit 140 (step S101), it activates the camera unit 101 for shooting (step S102).
• The processing unit 180 then controls the display 110 with the display control means 186 to display a screen asking the user whether to capture a deformable image, that is, an image that can be image-processed according to the voice state of the partner speaker during a call (step S103).
• At this time, the photographing means 187 of the processing unit 180 causes the display control means 186 to control the display 110 so that the video falling within the shooting range of the camera unit 101 is displayed in the display area of the display 110.
• In step S103, when the processing unit 180 recognizes information indicating that a deformable image is to be captured by the user's setting input, the photographing means 187 performs control to display the frame 51 and the frame line 52 as shown in FIG. 4 in the display area of the display 110 (step S104). The user then fits the face of the subject within the frame 51 so that the right eyebrow and the left eyebrow fall within the right eyebrow region 53 and the left eyebrow region 54, respectively, and the mouth falls within the mouth region 55.
• The photographing means 187 of the processing unit 180 then captures the video displayed on the display 110 as image information (step S105).
• When the processing unit 180 recognizes an input indicating that a deformable image is not to be captured in step S103, the photographing means 187 likewise captures the video displayed on the display 110 as image information.
• The processing unit 180 stores the captured image in the storage unit 160 so that it can be read out as appropriate (step S106). Further, when the processing unit 180 recognizes the setting input for capturing a deformable image in step S103, it causes the user to select whether or not to register the captured deformable image in the registration information 11. When the processing unit 180 recognizes information indicating, by the user's setting input, that the image is to be registered in the registered image information 15 of predetermined registration information 11, it records the captured image information as the registered image information 15 of the registration information 11 specified by the user's setting input.
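The guided capture in step S104, accepting an image as deformable only when the eyebrows and mouth sit inside regions 53, 54, and 55 of the frame, can be sketched as a simple containment check. The rectangle representation `(x0, y0, x1, y1)` and the landmark names are assumptions for illustration; the embodiment relies on the user fitting the face visually rather than on an automatic check.

```python
# Hypothetical acceptance check for a "deformable image" capture: each facial
# landmark must fall inside its guide region (53, 54, 55). Region and landmark
# names are illustrative assumptions.
def inside(rect, point):
    """rect: (x0, y0, x1, y1); point: (x, y)."""
    x0, y0, x1, y1 = rect
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1

def is_deformable_capture(regions, landmarks):
    """regions/landmarks: dicts keyed by 'right_eyebrow', 'left_eyebrow', 'mouth'."""
    return all(inside(regions[k], landmarks[k]) for k in regions)
```

Such a check could be run before step S105 to reject frames where the face has drifted out of the frame 51.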
  • FIG. 11 is a flowchart of the incoming call process when a mobile phone receives an incoming call.
• When the processing unit 180 of the mobile phone 100 recognizes, at the arrival/departure recognition means 181, that the mobile phone 100 has received an incoming call from another communication device such as a mobile phone or a general telephone, the partner speaker recognition means 182 recognizes the registration information 11 of the calling partner speaker (step S202).
• Specifically, the partner speaker recognition means 182 of the processing unit 180 recognizes the telephone number of the partner speaker recorded in the incoming call information, and recognizes the registration information 11 having the telephone number information 14 that matches that telephone number.
  • the partner speaker recognition means 182 recognizes the registered image information 15 of the recognized registration information 11 (step S203).
• The processing unit 180 controls the display control means 186 to display, for example, the caller's telephone number in the display area of the display 110. Then, when the processing unit 180 recognizes an operation signal indicating that the user responds to the incoming call by operating the operation unit 140 (step S204), it controls the transmission/reception unit 130 to establish a communication connection with the communication device of the partner speaker through a network such as a telephone line or the Internet so that they can talk to each other.
• When the processing unit 180 recognizes information indicating that the call with the partner speaker is to be terminated by the user's operation of the operation unit 140, or receives information from the communication device of the partner speaker indicating that the call is to be terminated, it controls the transmission/reception unit 130 to cancel the call-ready state and terminate the communication (step S205).
• The processing unit 180 causes the display control means 186 to control the display 110 to display the registered image 50 of the registered image information 15 of the registration information 11 in the display area of the display 110 (step S206).
• When the processing unit 180 recognizes an operation signal indicating that the user responds to the incoming call by operating the operation unit 140 (step S207), it controls the transmission/reception unit 130 to establish a communication connection with the communication device of the partner speaker through a network such as a telephone line or the Internet so that they can talk to each other.
• In step S207, if an operation signal indicating that the user responds to the incoming call cannot be recognized through the operation unit 140, or if an operation signal indicating that the incoming call is rejected is recognized, the reception of the incoming call information is terminated and the incoming call processing ends.
• In step S207, when the processing unit 180 recognizes the operation signal for responding to the incoming call and is connected to the communication device of the partner speaker, the voice recognition means 183 recognizes the voice information of the partner speaker (step S208).
• The voice recognition means 183 of the processing unit 180 analyzes the received voice information and recognizes the voice state of the partner speaker, that is, the voice pitch, strength, the tempo of the partner speaker's speech, and the like, as the voice basic information (step S209).
  • the voice recognition means 183 stores the recognized voice basic information in the memory 170 so that it can be read out as appropriate.
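Recognizing the voice basic information in step S209 amounts to extracting pitch, strength, and timing features from the received audio. The sketch below uses zero-crossing rate as a crude pitch proxy and RMS as the strength measure; these concrete feature choices are assumptions, since the embodiment does not specify the analysis method (a real implementation would more likely use autocorrelation-based pitch estimation).

```python
# Toy sketch of deriving voice basic information from raw audio samples.
# Zero-crossing rate and RMS are stand-in features chosen only to keep the
# example self-contained; they are not taken from the embodiment.
import math

def basic_voice_info(samples, sample_rate=8000):
    """samples: list of floats in [-1, 1]. Returns a pitch proxy (zero
    crossings per second), strength (RMS), and duration in seconds (tempo
    would be derived from utterance timing over several such frames)."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration = len(samples) / sample_rate
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return {"pitch": crossings / duration, "strength": rms, "duration": duration}
```

The resulting dictionary plays the role of the voice basic information stored in the memory 170, against which later frames are compared by the voice change determination means 184.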
• When the processing unit 180 detects that the voice state of the partner speaker has changed while the user is talking with the partner speaker (step S210), that is, when a change in the voice state of the voice information transmitted from the communication device of the partner speaker is detected, the voice change determination means 184 recognizes the emotion information 44 of the partner speaker from the voice state of the received voice information (step S211). Specifically, the voice change determination means 184 analyzes the voice state of the voice information of the partner speaker and compares it with the voice basic information stored in the memory 170.
• From this comparison, the voice change determination means 184 of the processing unit 180 recognizes the changed voice pattern, for example, the amount of change in voice pitch, the amount of change in voice strength, the amount of change in tempo, and the like. The voice change determination means 184 also searches for the voice pattern information 43 matching the voice pattern of the partner speaker based on the emotion recognition information table 40, and recognizes the emotion recognition information 41 having that voice pattern information 43. The voice change determination means 184 further recognizes, in the voice information of the partner speaker, the phrase information recorded in the voice pattern information 43, for example, phrases representing emotions such as "kora" and "hahaha", and recognizes the emotion recognition information 41 corresponding to that phrase information. The voice change determination means 184 then stores the recognized emotion recognition information 41 in the memory 170 so that it can be read out as appropriate.
  • the image processing means 185 of the processing unit 180 reads the registered image 50 recorded in the registered image information 15 and edits the image of the registered image 50 (step S212).
• Specifically, the image processing means 185 of the processing unit 180 reads the first image template table 20 and the second image template table 30. It then recognizes the emotion information 44 of the emotion recognition information 41 recognized in step S211 and stored in the memory 170, and recognizes the first image template information 21 and the second image template information 31 having corresponding emotion information 25 and 35 that correspond to the emotion information 44.
• The image processing means 185 of the processing unit 180 then creates the emotion change image by deforming the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 of the registered image 50 in accordance with the first image pattern information 24 of the first image template information 21 and the second image pattern information 34 of the second image template information 31 (step S213).
• For example, suppose that information indicating a "smiling state" is recorded in the emotion information 44 of the emotion recognition information 41 recognized by the voice change determination means 184 in step S211. In this case, based on the first image pattern information 24 of the first image template information 21 and the second image pattern information 34 of the second image template information 31 having corresponding emotion information 25 and 35 that correspond to the emotion information 44, the images of the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 of the registered image 50 are transformed into a smile image as shown in FIG. 9A, for example.
• Similarly, when information indicating an "angry state" is recorded in the emotion information 44, the images of the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 of the registered image are transformed into an "angry" face image as shown in FIG. 9B, for example.
• Likewise, when information indicating a "sadness state" is recorded in the emotion information 44, the images of the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 of the registered image are transformed into a face image in a "sadness state" as shown in FIG. 9C, for example.
  • the image processing means 185 of the processing unit 180 stores the emotion-changed image deformed as described above in the memory 170 so that it can be read out appropriately.
  • the processing unit 180 controls the display 110 with the display control means 186 to display the emotion change image created at step S212 in the display area of the display 110.
• When the processing unit 180 recognizes an operation signal for ending the call with the partner speaker through the user's operation of the operation unit 140, it releases the communication connection and ends the call. If the call is to be continued, the process returns to step S210 to recognize the voice state of the partner speaker (step S214).
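The per-call loop from step S208 through step S214 can be sketched compactly: while the call continues, recognize the voice state, and whenever it changes from the voice basic information, deform the registered image and display it. All helper callables are injected here, since the embodiment describes the means (183, 184, 185, 186) only functionally; the signatures are assumptions.

```python
# Compact sketch of the incoming-call loop (steps S208-S214). The injected
# callables stand in for the voice change determination means 184, the image
# processing means 185, and the display control means 186.
def call_loop(frames, baseline, detect_change, deform, display):
    """frames: iterable of per-interval voice states; baseline: voice basic
    information. Returns the number of emotion change images displayed
    before the call ends (the iterable is exhausted)."""
    shown = 0
    for state in frames:
        emotion = detect_change(baseline, state)   # voice change determination 184
        if emotion is not None:
            display(deform(emotion))               # image processing 185 + display 186
            shown += 1
    return shown
```

In this sketch exhausting the frame iterable corresponds to the end-of-call branch of step S214; returning to step S210 corresponds to the next loop iteration.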
  • FIG. 12 is a flowchart of the calling process when a mobile phone is called.
• In the following, processing that is substantially the same as the incoming call processing of the mobile phone 100 in FIG. 11 is given the same reference numerals, and its description is omitted or simplified.
• When a call is made from the mobile phone 100, the arrival/departure recognition means 181 of the processing unit 180 first recognizes call request information indicating that a call is to be sent to the communication device of the partner speaker by the user's operation of the operation unit 140 (step S301).
• If the processing unit 180 determines that the registration information 11 of the partner speaker is not recorded in the call request information, that is, the destination telephone number has been directly set and entered by the user's operation of the operation unit 140, it transmits the call information to the destination communication device (step S302). The transmission/reception unit 130 is then controlled to communicate with the communication device of the partner speaker via a network such as a telephone line or the Internet.
• When the processing unit 180 recognizes information indicating that the call with the partner speaker is to be terminated by the user's operation of the operation unit 140, or recognizes information indicating that the communication with the partner speaker is to be ended, it controls the transmission/reception unit 130 to release the call-ready state and terminate the communication (step S205).
• On the other hand, when the processing unit 180 recognizes in step S301 that registration information 11 recorded in the registration information table 10 is specified as the transmission destination in the call request information, it determines whether or not the registered image information 15 is recorded in that registration information 11 (step S304). In step S304, when the registered image information 15 is not recorded, the processing unit 180 executes step S302 and transmits the call information to the telephone number recorded in the telephone number information 14 of the registration information 11.
• In step S304, when the registered image information 15 is recorded in the registration information 11, the processing unit 180 causes the display control means 186 to control the display 110 to display the registered image 50 of the registered image information 15 in the display area of the display 110 (step S305).
  • processing unit 180 controls transmission / reception unit 130 to transmit the transmission information to the telephone number recorded in telephone number information 14 of registration information 11 (step S306).
• When the partner speaker responds, the processing unit 180 controls the transmission/reception unit 130 to establish a communication connection with the communication device of the destination partner speaker through a network such as a telephone line or the Internet so that they can talk to each other. On the other hand, if the partner speaker does not respond to the call information in step S307, the call processing of the mobile phone 100 is terminated.
• In step S307, when the partner speaker responds to the call information, the processing unit 180 performs the processing from step S208 to step S214 of the incoming call processing described above. That is, the processing unit 180 executes step S208, and the voice recognition means 183 recognizes the voice information of the partner speaker. After that, the voice recognition means 183 of the processing unit 180 analyzes the received voice information and recognizes the voice state of the partner speaker, that is, the voice pitch, strength, and the partner speaker's speaking tempo, as the voice basic information. The voice recognition means 183 stores the recognized voice basic information in the memory 170 so that it can be read out as appropriate.
• The processing unit 180 then performs the processing of step S209; when it detects that the voice state of the partner speaker has changed while the user is talking with the partner speaker, it causes the voice change determination means 184 to recognize the partner speaker's emotion information 44 from the voice state of the received voice information. The voice change determination means 184 stores the recognized emotion recognition information 41 in the memory 170 so that it can be read out as appropriate.
• Then, the processing unit 180 performs the processing of step S210, causes the image processing means 185 to read the registered image 50 recorded in the registered image information 15, edits the image of the registered image 50, and creates an emotion change image. Further, the image processing means 185 of the processing unit 180 stores the emotion change image in the memory 170 so that it can be read out as appropriate.
• The processing unit 180 performs the processing of step S213, controls the display 110 with the display control means 186, and displays the emotion change image created in step S212 in the display area of the display 110.
• When the processing unit 180 performs the processing of step S214 and recognizes an operation signal indicating that the call with the partner speaker is to be terminated by the user's operation of the operation unit 140, it releases the communication connection and ends the call. If the call is to be continued, the process returns to step S210 to recognize the voice state of the partner speaker.
• As described above, the mobile phone 100 causes the voice change determination means 184 of the processing unit 180 to recognize the voice pattern of the partner speaker's voice information and determine whether the voice pattern has changed. Then, the image processing means 185 changes the registered image 50 of the registered image information 15 in accordance with the change in the voice pattern. For this reason, the registered image 50 of the registered image information 15 can be transformed, as if a moving image were being reproduced, according to the change in the voice pattern. Also, the voice pattern of the partner speaker changes depending on the partner speaker's emotion. Therefore, the registered image 50 of the registered image information 15 can be transformed according to the partner speaker's emotion, and an appropriate emotion change image corresponding to the emotion of the partner speaker can be displayed on the display 110.
• In addition, the registered image 50 of the registered image information 15 is deformed and displayed as an emotion change image. For this reason, it is not necessary to prepare a plurality of images corresponding to each emotion, so the storage capacity of the storage unit 160 is not consumed and its free capacity can be used effectively.
• Moreover, the image processing means 185 deforms the registered image 50 of the registered image information 15 based on the first image template information 21 and the second image template information 31 to generate the emotion change image. For this reason, only the facial expression can be transformed without modifying the background of the face image displayed in the display area of the display 110, the position of the face image, or the like. Therefore, when the image displayed on the display 110 is switched from the registered image 50 to the emotion change image, or from one emotion change image to another, the images can be switched without shifting and becoming difficult to see.
• Further, based on the eyebrow and mouth change rates recorded in the first image pattern information 24 and the second image pattern information 34, the image processing means 185 deforms the right eyebrow of the right eyebrow region 53, the left eyebrow of the left eyebrow region 54, and the mouth of the mouth region 55 by moving their dots by a predetermined amount, for example. Therefore, the facial expression can be easily changed by subjecting only part of the registered image 50 of the registered image information 15 to the image deformation processing. Since only part of the registered image 50 is processed, the processing load on the processing unit 180 associated with the image processing can be reduced.
• In addition, the image processing means 185 changes the right eyebrow of the right eyebrow region 53, the left eyebrow of the left eyebrow region 54, and the mouth of the mouth region 55, respectively. For this reason, the eyebrows and mouth, the facial parts that can best display emotional changes, are deformed. Therefore, the user can easily confirm the partner speaker's emotion from the emotion change image.
• Furthermore, the image processing means 185 deforms the eyebrows of the right eyebrow region 53 and the left eyebrow region 54 and the mouth of the mouth region 55 based on the eyebrow deformation rate and the mouth deformation rate recorded in the first image pattern information 24 and the second image pattern information 34. For this reason, a predetermined part of the image can be appropriately deformed by moving it in a predetermined direction by a predetermined amount according to the voice pattern of the partner speaker's voice. The image processing means 185 also searches for the first image basic information 23 that substantially matches the eyebrows of the right eyebrow region 53 and the left eyebrow region 54 of the registered image 50, and the second image basic information 33 that substantially matches the mouth of the mouth region 55, so the deformation can be performed using templates that match the actual eyebrows and mouth of the registered image 50.
  • the voice change determination means 184 compares the voice basic information recognized by the voice recognition means 183 with the voice information of the other speaker, and if the voice pattern of the other speaker's voice information differs from the voice pattern of the voice basic information, recognizes the emotion of the other speaker from that voice pattern. Then, the image processing means 185 processes the image according to the emotion of the other speaker. For this reason, the voice change determination means 184 can easily recognize a change in the voice pattern. Therefore, the image processing means 185 can appropriately transform the registered image 50 into an emotion change image corresponding to the change in the other speaker's emotion.
  • the voice recognition means 183 recognizes, as the voice basic information, the voice pattern of the first voice information transmitted from the other speaker's communication device after the mobile phone 100 and that communication device are connected for communication and ready to communicate. For this reason, the voice change determination means 184 can recognize, with the other speaker's emotion at the beginning of the conversation as a reference, how that emotion has since changed, and the display 110 can show how the other speaker's emotion has changed compared to the initial state of the conversation.
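The idea of treating the first voice information after connection as the voice basic information, and flagging later deviations as emotion changes, might be sketched as follows. The pitch and volume features, the tolerances, and the emotion labels are assumptions for illustration; the patent text does not specify the comparison metric.

```python
class VoiceChangeDetector:
    """Compare incoming voice features against a baseline taken at call start."""

    def __init__(self, pitch_tolerance=20.0, volume_tolerance=6.0):
        self.baseline = None                      # set from the first voice information
        self.pitch_tolerance = pitch_tolerance    # Hz (hypothetical threshold)
        self.volume_tolerance = volume_tolerance  # dB (hypothetical threshold)

    def observe(self, pitch_hz, volume_db):
        if self.baseline is None:
            # first voice information received after the call connects:
            # this *is* the voice basic information, so no change yet
            self.baseline = (pitch_hz, volume_db)
            return None
        dp = pitch_hz - self.baseline[0]
        dv = volume_db - self.baseline[1]
        if dp > self.pitch_tolerance and dv > self.volume_tolerance:
            return "anger"      # louder and higher than the baseline
        if dp < -self.pitch_tolerance and dv < -self.volume_tolerance:
            return "sadness"    # quieter and lower than the baseline
        return None             # within tolerance: no emotion change

detector = VoiceChangeDetector()
detector.observe(180.0, -20.0)           # baseline at the start of the call
change = detector.observe(230.0, -10.0)  # markedly louder and higher
```

Every later comparison is relative to the start of the conversation, matching the behavior described above.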
  • the display control means 186 causes the display 110 to display the registered image 50 recorded in the registered image information 15 when an incoming call is received from the other speaker.
  • then, when the voice change determination means 184 determines that the state of the voice information has changed, the image processing means 185 transforms the registered image 50 based on the first image template table 20 and the second image template table 30. For this reason, on a device that talks with the other speaker, such as the mobile phone 100 of the above embodiment, the user can make a call while confirming the emotion of the other speaker. Therefore, even without a function such as a videophone, it is possible to talk with the other party while confirming the other party's emotional changes with a simple configuration, and to provide good call support.
  • the display control means 186 likewise causes the display 110 to display the registered image 50 recorded in the registered image information 15 during outgoing calls for transmitting information to the other party. Then, when the voice change determination means 184 determines that the state of the voice information has changed, the image processing means 185 transforms the registered image 50 based on the first image template table 20 and the second image template table 30. For this reason, as with incoming calls, the user can make a call while checking the feelings of the other speaker, supporting good calls.
  • the photographing means 187 displays the frame 51 and the frame line 52 on the display 110 when photographing the registered image 50 to be recorded in the registered image information 15. For this reason, when capturing the registered image 50, the user can focus on the subject so that the eyes are placed in the frame 51 and the eyebrows and mouth are aligned with the frame line 52. Therefore, the registered image 50 recorded in the registered image information 15 can be easily taken.
  • image information captured using the frame 51 and the frame line 52 is recorded in the registered image information 15. Therefore, the image processing means 185 can easily recognize the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55, and can deform the images of the eyebrows and mouth. Therefore, the registered image 50 can be appropriately edited according to the voice pattern of the other speaker's voice information and displayed on the display 110.
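Because the frame fixes where the eyebrows and mouth sit in every captured registered image, the three regions can be located by fixed coordinates without any face detection. A minimal sketch, in which all coordinate values are hypothetical:

```python
import numpy as np

# Hypothetical fixed coordinates implied by frame 51 and frame line 52:
# capturing with the eyebrows and mouth aligned to the frame means each
# region sits at a known (y0, y1, x0, x1) position in every registered image.
FRAME_REGIONS = {
    "right_eyebrow_53": (40, 55, 30, 70),
    "left_eyebrow_54": (40, 55, 90, 130),
    "mouth_55": (110, 135, 60, 100),
}

def crop_region(image, name):
    """Extract a facial region directly from its fixed frame position."""
    y0, y1, x0, x1 = FRAME_REGIONS[name]
    return image[y0:y1, x0:x1]

registered = np.zeros((160, 160), dtype=np.uint8)  # stand-in for registered image 50
mouth = crop_region(registered, "mouth_55")
```

Skipping detection this way is what makes the regions "easy to recognize" for the image processing means.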
  • the mobile phone 100 causes the image processing means 185 to transform the image recorded in the registered image information 15, and causes the display control means 186 to display it on the display 110. Since the registered image 50 of the registered image information 15 is transformed according to the received voice information, a stably transformed emotion change image can be displayed without the communication means having to send and receive large files such as image information.
  • although the mobile phone 100 has been illustrated as the image processing device and the image display device, the present invention is not limited to this.
  • the present invention may be applied to other electronic devices such as a personal computer, a general telephone device, and a car navigation device.
  • in the case of a personal computer, for example, a configuration may be adopted in which the voice state of voice information input from a microphone, voice information obtained from a network such as the Internet, or voice information obtained from a storage medium such as an optical disk is recognized, a change in the voice pattern is detected, and the image is deformed by the image processing means.
  • the operation unit 140 includes an operation button and an operation knob.
  • the present invention is not limited to this.
  • a keyboard or a mouse connected to the mobile phone 100 may be used, or a touch panel that allows predetermined setting input operations by touching the display 110 may be used.
  • other configurations capable of setting and inputting various setting items, such as voice input operation, and configurations that output signals via a wireless medium such as a remote controller, can also be applied.
  • the arrival/departure recognition means 181 has been shown in a configuration in which, when incoming call information is received, the voice output unit 120 outputs a ringtone such as a voice, a warning sound, or a notification sound to notify the user of the incoming call.
  • the ringtone may be output after the partner speaker recognition means 182 recognizes the other speaker.
  • with this configuration, the ringtone can be changed depending on the caller, and the user can be informed of who the other party is by the ringtone alone.
  • the mobile phone 100 may be configured to include a vibration means, and when the arrival/departure recognition means 181 recognizes an incoming call, the mobile phone 100 vibrates the vibration means to report the incoming information. Further, as described above, the vibration means may be vibrated after the partner speaker recognition means 182 recognizes the other speaker. In this case, the vibration pattern can tell the user who is calling.
  • the voice recognition means 183 may be configured to recognize the gender of the other speaker, the voice basic information, and the like based on the contents described in the registration details information 16 of the registration information 11. In such a configuration, based on the voice basic information recorded in the registration details information 16, the voice recognition means 183 can compare the voice state of the other speaker at the time of an incoming or outgoing call with the information recorded in the voice basic information. Therefore, for example, even when the other speaker is already angry at the time of an incoming or outgoing call, the "anger state" of the other speaker can be displayed on the display 110. Therefore, an image corresponding to the emotion of the other speaker can be displayed more appropriately.
  • the voice basic information may be transmitted from the communication device of the other speaker or the like to the mobile phone 100. With such a configuration, the mobile phone 100 can omit the process of recognizing the voice basic information, which simplifies the configuration and reduces the processing load.
  • the mobile phone 100 may record a plurality of registered images associated with the emotion information 44 in the registered image information 15. Then, when the voice change determination means 184 of the processing unit 180 recognizes a change in the voice pattern of the other speaker's voice information, it recognizes the emotion recognition information 41 having the voice pattern information 43 corresponding to that voice pattern. Then, the display control means 186 performs control to read the registered image corresponding to the emotion information 44 of that emotion recognition information 41 from the plurality of registered images recorded in the registered image information 15 and display it in the display area of the display 110.
  • the image displayed on the display 110 may thus be switched to an image corresponding to the emotion of the other speaker according to a change in the other speaker's voice state. Even in such a configuration, an appropriate emotion change image can be displayed on the display 110 according to the change in the other speaker's emotion, as with the effects described above. Furthermore, since images recorded in advance in the storage unit 160 are switched, it is not necessary to process the image; therefore, the processing load on the processing unit 180 can be reduced and the processing speed can be increased.
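This switching variant amounts to a lookup from the recognized emotion to a pre-recorded image, with no image processing at display time. A minimal sketch, in which the file names and emotion labels are hypothetical:

```python
# Registered image information 15, sketched as a mapping from emotion
# information 44 to a pre-recorded image file (names are hypothetical).
registered_images = {
    "neutral": "registered_50.png",
    "joy": "registered_joy.png",
    "anger": "registered_anger.png",
    "sadness": "registered_sadness.png",
}

def select_display_image(emotion):
    """Pick the stored image for a recognized emotion; fall back to the
    plain registered image when no emotion change was recognized."""
    return registered_images.get(emotion, registered_images["neutral"])

chosen = select_display_image("anger")
```

Because the images are prepared in advance, display-time work reduces to a dictionary lookup, which is why the processing load drops.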
  • the image processing means 185 may be configured to replace a part of the registered image 50, for example the eyebrows and the mouth, with other eyebrow and mouth element images in accordance with the emotion recognition information 41 recognized by the voice change determination means 184.
  • an eyebrow image as an element image as shown in FIG. 13 is recorded in the first image pattern information 24 of the first image template information 21.
  • a mouth image as an element image as shown in FIG. 14 is recorded in the second image pattern information 34 of the second image template information 31.
  • FIG. 13A is an example of an eyebrow image captured when the person in the eyebrow image of the first image basic information 23 is smiling.
  • FIG. 13B is an example of an eyebrow image captured when the person in the eyebrow image of the first image basic information 23 is “angry”.
  • FIG. 13C is an example of an eyebrow image captured when the person of the eyebrow image of the first image basic information 23 is in a sad state.
  • FIG. 14A is an example of a mouth image taken when a person in the mouth image of the second image basic information 33 is smiling.
  • FIG. 14B is an example of a mouth image taken when the person of the mouth image of the second image basic information 33 is “angry”.
  • FIG. 14C is an example of a mouth image taken when the person in the mouth image of the second image basic information 33 is in a sad state.
  • when the voice change determination means 184 determines that the voice pattern of the other speaker's voice information differs from the voice pattern of the voice basic information, it recognizes the emotion recognition information 41 having the voice pattern information 43 corresponding to that voice pattern.
  • the image processing means 185 searches the first image template information 21 and the second image template information 31 for the first image basic information 23 and the second image basic information 33 of eyebrow and mouth images whose shapes substantially coincide with the eyebrows of the right and left eyebrow regions 53 and 54 and the mouth of the mouth region 55 of the registered image 50.
  • from the searched first image template information 21 and second image template information 31, the image processing means 185 recognizes those having the emotion information 25 and 35 corresponding to the emotion information 44 of the emotion recognition information 41 recognized by the voice change determination means 184.
  • the image processing means 185 replaces the images of the left and right eyebrow regions 53 and 54 and the mouth region 55 of the registered image 50 with the images of the first image pattern information 24 and the second image pattern information 34 of the recognized first image template information 21 and second image template information 31. Even in such a configuration, the image information can be switched and displayed according to the other speaker's voice information, and the other speaker's emotion can be easily known.
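The replacement variant can be sketched as pasting a stored element image over the matching region of the registered image. This is an illustrative sketch only; the array sizes and the region position are hypothetical:

```python
import numpy as np

def replace_region(face, element, top_left):
    """Paste an element image (e.g. an "angry" eyebrow from the image
    pattern information) over the matching region of the face image."""
    face = face.copy()                 # leave the stored registered image intact
    y, x = top_left
    h, w = element.shape[:2]
    face[y:y + h, x:x + w] = element   # overwrite just that region
    return face

face = np.zeros((100, 100), dtype=np.uint8)            # stand-in registered image 50
angry_brow = np.full((5, 20), 255, dtype=np.uint8)     # stand-in element image
shown = replace_region(face, angry_brow, top_left=(30, 10))
```

Only the eyebrow and mouth regions are touched, so the rest of the displayed face stays identical across emotion changes.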
  • the image processing means 185 may be configured to similarly deform other parts of the face image, such as the nose, forehead, left and right eyelids, left and right jaws, ears, and the face contour. Further, the color of each of these parts may be changed. For example, if the corresponding emotion information indicates an "angry state", the image processing means 185 may perform an image deformation process that changes the contrast and color, for example by increasing the red intensity of the ears.
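The color change mentioned here, such as reddening the ears for an "angry state", might look like the following sketch; the region coordinates and the strength parameter are hypothetical:

```python
import numpy as np

def redden_region(image_rgb, region, strength=40):
    """Increase the red intensity inside one region (e.g. the ears) to
    suggest an "angry state"; strength is a hypothetical parameter."""
    out = image_rgb.copy()
    y0, y1, x0, x1 = region
    red = out[y0:y1, x0:x1, 0].astype(int) + strength
    out[y0:y1, x0:x1, 0] = np.clip(red, 0, 255).astype(np.uint8)  # avoid overflow
    return out

img = np.full((10, 10, 3), 128, dtype=np.uint8)  # stand-in face image
ear = (2, 6, 0, 3)                               # hypothetical ear region (y0, y1, x0, x1)
angry = redden_region(img, ear)
```

The same pattern extends to contrast changes by scaling, rather than offsetting, the channel values.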
  • the registered image 50 recorded in the registered image information 15 has been described as a front image of the face, taken with the face in the frame 51 and the eyebrow and mouth images in the left and right eyebrow regions 53 and 54 and the mouth region 55, respectively; however, the present invention is not limited to this.
  • a face image taken in profile, or a face image taken from an oblique direction, may also be used.
  • when a profile is used as the registered image, a frame dedicated to profile images can be used, and the subject can be focused so that the profile appears within this frame. Furthermore, a configuration that automatically recognizes the outline of the face may also be adopted.
  • for example, the outline of the face may be recognized from the boundary between the face color and the background color of the registered image in the registered image information 15, and the eyes, mouth, eyebrows, and the like inside the face may be recognized from differences in color.
  • not only images taken from the front but also images taken from various directions can be processed and deformed.
  • an animal image, a doll image, or the like may also be recorded. Even with such a configuration, an emotion change image in which the expression of the animal or doll is changed can be created by the same processing as in the above embodiment and displayed in the display area of the display 110. It is also possible to switch between a plurality of images, for example a plurality of animal images or a plurality of doll images, according to the voice state of the other speaker's voice information.
  • the mobile phone 100 is not limited to the configuration shown, in which it recognizes the voice information of the other speaker transmitted from the communication device and recognizes the voice state pattern of that voice information.
  • for example, the voice state of the voice information may be transmitted from the other speaker's communication device, and the mobile phone 100 may be configured to receive the voice state transmitted from that communication device. This configuration eliminates the need to detect the voice state of the voice information on the mobile phone 100, thereby reducing the processing load and simplifying the configuration.
  • the image information deformed according to the voice pattern of the voice information by the image processing apparatus of the present invention may be transmitted together with the voice information to a receiving device such as a mobile phone.
  • a communication system may be configured in which both the user and the other speaker use the mobile phone 100 of the above embodiment as transmission/reception terminals.
  • in such a case, each user can easily recognize the emotion of the other party, whether on the calling or the receiving side, by viewing the image displayed on the display 110 of the mobile phone 100.
  • an image may be displayed on the display 110 only when a call is made to a communication device such as a general telephone of the other speaker, and the image may be changed according to the voice state of the other speaker.
  • alternatively, a configuration may be adopted in which an image is displayed on the display 110 only when a call is received from the other speaker's communication device, and the image is changed according to the voice state of the other speaker.
  • each function described above has been constructed as a program, but it may be realized in any form, for example by hardware such as a circuit board or an element such as a single IC (Integrated Circuit).
  • the mobile phone 100 causes the voice change determination means 184 of the processing unit 180 to recognize the voice pattern of the other speaker's voice information and to determine whether the voice pattern has changed. Then, the image processing means 185 changes the registered image 50 of the registered image information 15 according to the change of the voice pattern. Therefore, the registered image 50 of the registered image information 15 can be deformed according to the change of the voice pattern, that is, according to the emotion of the other speaker, and an appropriate emotion change image corresponding to the other speaker's emotion can be displayed on the display 110.
  • the present invention can be used for an image processing device that displays an image, an image display device, a receiving device, a transmitting device, a communication system, an image processing method, an image processing program, and a recording medium that records the image processing program.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephone Function (AREA)

Abstract

A mobile telephone (100) causes voice change judgment means of a processing unit (180) to recognize a voice pattern of voice information on a partner speaker. When it is judged that the voice pattern has changed, the mobile telephone (100) causes image processing means to change a registered image of the registered image information according to the change of the voice pattern. Thus, it is possible to change the registered image of the registered image information according to the voice pattern change and to display on a display (110) an appropriate feeling change image corresponding to the partner speaker's emotion.

Description

Specification

Image processing device, image display device, receiving device, transmitting device, communication system, image processing method, image processing program, and recording medium recording the image processing program

Technical Field

[0001] The present invention relates to an image processing device that displays an image, an image display device, a receiving device, a transmitting device, a communication system, an image processing method, an image processing program, and a recording medium that records the image processing program.

Background art

[0002] Conventionally, in a mobile phone provided with a display for displaying images, a configuration is known in which the caller can select the image to be displayed on the display of the receiver's mobile phone (for example, see Patent Document 1).

[0003] In the device described in Patent Document 1, when a call is made from the caller's mobile phone to the receiver's mobile phone, the caller's mobile phone transmits to the receiver's mobile phone an image selected by the caller, according to the caller's emotion, from a plurality of images. The receiver's mobile phone is configured to display on its display the image transmitted together with the incoming call.

[0004] Patent Document 1: Japanese Patent Application Laid-Open No. 2004-15158 (see pages 4 to 5 and FIGS. 3 to 4)

Disclosure of the Invention

Problems to be solved by the invention

[0005] In a configuration such as that of Patent Document 1, the caller selects the image to be displayed on the receiver side when calling the receiver's mobile phone, so only the image chosen at the time of calling can be conveyed. For this reason, the image displayed on the display cannot be changed during a call. Therefore, one problem is that, for example, even if an emotion changes during a call, an image corresponding to that emotion cannot be displayed on the display of the receiver's mobile phone.

[0006] One object of the present invention is to provide an image processing device, an image display device, a receiving device, a transmitting device, a communication system, an image processing method, an image processing program, and a recording medium recording the image processing program that can display an appropriate image.
Means for solving the problem

[0007] The image processing device of the present invention is an image processing device that processes an image displayed on a display means according to voice information obtained by reception, comprising: voice state recognition means for recognizing a change in the voice state of the voice information; and display control means for displaying the image on the display means and changing the displayed image according to the change in the voice state.

[0008] The image display device of the present invention comprises: storage means for storing an image; display means for displaying the image when the voice information is received; and the above-described image processing device of the present invention.

[0009] The receiving device of the present invention comprises the above-described image display device of the present invention and receiving means capable of receiving the voice information, wherein the display control means displays an image corresponding to the transmission source of the voice information when the voice information is received by the receiving means.

[0010] The transmitting device of the present invention comprises the above-described image display device of the present invention and transmitting means capable of transmitting and receiving the voice signal, wherein the transmitting means has calling means for calling a transmission destination to which the voice information is transmitted; the display means displays an image corresponding to the transmission destination in response to a call by the calling means or a response of the transmission destination to the call; and the image processing device changes the displayed image according to a change in the voice state of the voice information received from the transmission destination in response to the call.

[0011] The communication system of the present invention is a communication system comprising transmission/reception terminals capable of transmitting and receiving voice information to and from each other, wherein each of the transmission/reception terminals comprises: storage means for storing an image; transmitting means for transmitting voice information; calling means for calling a transmission destination to transmit voice information; receiving means for receiving voice information; display means for displaying the image when the voice information is received, when a call is made by the calling means, or when the transmission destination responds to the call; and the above-described image processing device of the present invention.

[0012] The image processing method of the present invention is an image processing method for processing an image displayed on a display means according to voice information obtained by reception, comprising: recognizing a change in the voice state of the voice information; displaying the image on the display means; and changing the displayed image according to the change in the voice state.

[0013] The image processing program of the present invention causes a computing means to function as the above-described image processing device of the present invention.

[0014] The image processing program of the present invention causes a computing means to carry out the above-described image processing method of the present invention.

[0015] The recording medium recording the image processing program of the present invention is characterized in that the above-described image processing program of the present invention is recorded so as to be readable by a computing means.

Brief Description of Drawings
[0016] FIG. 1 is an overall perspective view of a portable telephone device according to an embodiment of the present invention.

FIG. 2 is a block diagram schematically showing the internal configuration of the portable telephone device.

FIG. 3 is a schematic diagram showing an outline of the registration information table recorded in the storage means.

FIG. 4 is a schematic diagram showing an image of registered image information recorded in the storage means.

FIG. 5 is a schematic diagram showing a schematic configuration of the first image template table recorded in the storage means.

FIG. 6 is a schematic diagram showing a schematic configuration of the second image template table recorded in the storage means.

FIG. 7 is a schematic diagram showing a schematic configuration of the emotion recognition information table recorded in the storage means.

FIG. 8 is a block diagram schematically showing the configuration of the processing unit.

FIG. 9A is a schematic diagram showing an emotion deformation image transformed by the image processing means.

FIG. 9B is a schematic diagram showing another emotion deformation image transformed by the image processing means.

FIG. 9C is a schematic diagram showing another emotion deformation image transformed by the image processing means.

FIG. 10 is a flowchart showing the photographing process of the mobile phone.

FIG. 11 is a flowchart of the incoming call process when the mobile phone receives a call.

FIG. 12 is a flowchart of the outgoing call process when the mobile phone makes a call.

FIG. 13A is a schematic diagram showing an example of an eyebrow image captured when the person in the eyebrow image of the first image basic information is smiling, in a modification of the mobile phone of the present embodiment.

FIG. 13B is a schematic diagram showing an example of an eyebrow image captured when the person in the eyebrow image of the first image basic information is in an "angry state", in a modification of the mobile phone of the present embodiment.

FIG. 13C is a schematic diagram showing an example of an eyebrow image captured when the person in the eyebrow image of the first image basic information is in a sad state, in a modification of the mobile phone of the present embodiment.

FIG. 14A is a schematic diagram showing an example of a mouth image captured when the person in the mouth image of the second image basic information is smiling, in a modification of the mobile phone of the present embodiment.

FIG. 14B is a schematic diagram showing an example of a mouth image captured when the person in the mouth image of the second image basic information is in an "angry state".

FIG. 14C is a schematic diagram showing an example of a mouth image captured when the person in the mouth image of the second image basic information is in a sad state.
Explanation of symbols

24: first image pattern information serving as deformation amount information and deformation direction information
34: second image pattern information serving as deformation amount information and deformation direction information
50: registered image serving as the image
100: mobile phone functioning as the image display device, the receiving device, the transmitting device, and a transmission/reception terminal
110: display serving as the display means
130: transmission/reception unit serving as the receiving means and the transmitting means
160: storage means
180: processing unit serving as the image processing device
181: arrival/departure recognition means serving as the calling means
183: voice recognition means also functioning as voice information recognition means and standard voice state recognition means
184: voice change determination means serving as the voice state recognition means
185: image processing means also functioning as image information recognition means and image deformation processing means
186: display control means

BEST MODE FOR CARRYING OUT THE INVENTION
[0018] Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
[0019] [Configuration of the Mobile Phone]
FIG. 1 is an overall perspective view of a portable telephone device according to an embodiment of the present invention. FIG. 2 is a block diagram schematically showing the internal configuration of the portable telephone device. FIG. 3 is a schematic diagram outlining the registration information table recorded in the storage means. FIG. 4 is a schematic diagram showing the image of the registered image information recorded in the storage means. FIG. 5 is a schematic diagram showing the schematic configuration of the first image template table recorded in the storage means. FIG. 6 is a schematic diagram showing the schematic configuration of the second image template table recorded in the storage means. FIG. 7 is a schematic diagram showing the schematic configuration of the emotion recognition information table recorded in the storage means. FIG. 8 is a block diagram schematically showing the configuration of the processing unit. FIG. 9A is a schematic diagram showing an emotion-deformed image produced by the deformation processing of the image processing means. FIG. 9B is a schematic diagram showing another emotion-deformed image produced by the deformation processing of the image processing means. FIG. 9C is a schematic diagram showing yet another emotion-deformed image produced by the deformation processing of the image processing means.
[0020] In FIGS. 1 and 2, reference numeral 100 denotes a portable telephone device (hereinafter, mobile phone) that also functions as an image display device, a receiving device, a transmitting device, and a transmitting/receiving terminal. The mobile phone 100 communicates with a communication device (not shown), such as another mobile phone or a fixed-line telephone, serving as a transmitting device, via a network such as a telephone line or the Internet, and establishes a communication connection that allows a call with that communication device. It then displays a predetermined image on the display 110 according to the partner speaker at the other end. Although the mobile phone 100 is used as the image display device in this embodiment, the invention is not limited to this; the image display device may be, for example, a device in which a personal computer processes an image and displays it on an image display unit such as a monitor. The mobile phone 100 comprises an upper casing 100A and a lower casing 100B, each capable of housing a circuit board and the like, and the upper casing 100A is attached to the lower casing 100B so as to be rotatable about a rotating portion 100C. Although FIG. 1 shows an example in which the upper casing 100A is rotatably attached to the lower casing 100B, the invention is not limited to this; the upper casing 100A and the lower casing 100B may, for example, be formed as a single body.

[0021] The upper casing 100A is provided with a display 110, an audio output unit 120, and a transmitting/receiving unit 130 (see FIG. 2) that has an antenna 130A (see FIG. 1) and serves as the receiving means and transmitting means. The lower casing 100B is provided with an operation unit 140, an audio input unit 150, storage means 160, a memory 170, a processing unit 180 that also functions as the image processing device, and the like. The circuit board inside the upper casing 100A, to which the display 110, the audio output unit 120, and the transmitting/receiving unit 130 are connected, is electrically connected, for example by a flexible board routed through the rotating portion 100C, to the circuit board inside the lower casing 100B to which the processing unit 180 is connected. The upper casing 100A also has a window portion that communicates between the inside and the outside, and a camera unit 101 for taking photographs is mounted in this window portion so as to face the outside. The camera unit 101 may instead be provided on the face of the upper casing 100A opposite the side on which the display 110 faces, or at another position.
[0022] Under the control of the processing unit 180, the display 110 provides a display area in which various kinds of information, such as predetermined image information and text information, are displayed. This image information includes, for example, image information recorded in the storage means 160, TV image data received by a TV receiver (not shown), image data recorded on a recording medium such as an optical disk, magnetic disk, or memory card of an external device and read by a drive or driver, and image data from the memory 170. The display 110 may be, for example, a liquid crystal display panel, an organic EL (Electro Luminescence) panel, a PDP (Plasma Display Panel), a CRT (Cathode-Ray Tube), an FED (Field Emission Display), or an electrophoretic display panel.
[0023] Under the control of the processing unit 180, the audio output unit 120 converts predetermined audio information into sound and outputs it. Examples of this audio information include audio information relating to the voice of the partner speaker, music information recorded in the storage means 160 or the like, and warning sounds stored in the memory 170 or the like. The audio output unit 120 shown in FIG. 1 has a speaker that can be switched between directional and omnidirectional operation by the user, so that the sound output from the audio output unit 120 remains audible even when, for example, the user is using the mobile phone 100 without holding it. Alternatively, a separate omnidirectional speaker may be provided, for example, on the face of the lower casing 100B opposite the face on which the operation unit 140 is provided.
[0024] The transmitting/receiving unit 130 includes an antenna 130A. Under the control of the processing unit 180, the transmitting/receiving unit 130 transmits the user's voice information, input from the audio input unit 150, via the antenna 130A to the communication device of the partner speaker, such as a mobile phone or fixed-line telephone. It also receives, via the antenna 130A, voice information transmitted from the partner speaker's communication device and outputs it to the processing unit 180. Further, under the control of the processing unit 180, the transmitting/receiving unit 130 receives information from, for example, another server on the network and outputs it to the processing unit 180, and transmits information input from the processing unit 180 to a predetermined server or terminal device.
[0025] The operation unit 140 has various operation buttons, operation knobs, and the like. Input operations performed with these buttons and knobs include, for example, entering the partner's number when calling the partner speaker's mobile phone, browsing the partner speaker's registered information, starting and ending a call, and configuring the acquisition of information from other servers. The operation unit 140 outputs predetermined signals to the processing unit 180 as appropriate in response to the user's input operations.
[0026] The audio input unit 150 is provided on the lower casing 100B side and includes a microphone capable of picking up voice. The audio input unit 150 can be switched between directional and omnidirectional operation, for example by operating the operation unit 140, so that voice can be input through the audio input unit 150 even when, for example, the user is using the mobile phone 100 without holding it.
[0027] The storage means 160 stores, in a readable manner, the registration information table 10 shown in FIG. 3, the first image template table 20 shown in FIG. 5, the second image template table 30 shown in FIG. 6, the emotion recognition information table 40 shown in FIG. 7, and the like. The storage means 160 has a registration information storage area in which the registration information table 10 is recorded, a first image storage area in which the first image template table 20 is recorded, a second image storage area in which the second image template table 30 is recorded, an emotion recognition information storage area in which the emotion recognition information table 40 is recorded, and so on. The storage means 160 may also have recording areas for other information in addition to these four storage areas, and the registration information storage area, first image storage area, second image storage area, emotion recognition information storage area, and the like may instead be provided in the memory 170. The storage means 160 may be configured, for example, with a drive or driver that stores data readably on a recording medium such as an HD (Hard Disk) or a memory card, or with a DVD (Digital Versatile Disc), an optical disk, or the like.
[0028] The registration information table 10 holds the information used, for example, when the user checks the partner speaker's details while talking with that partner. The registration information table 10 has a table structure recording a plurality of entries of registration information 11, each composed of registration ID information 12, registrant name information 13, telephone number information 14, registered image information 15 serving as the image information, registration detail information 16, and the like, associated with one another as a single data item.
[0029] The registration ID information 12 is unique information for identifying the registration information 11 and is set for each entry of registration information 11. The registration ID information 12 is assigned automatically as a serial number, for example when the user registers the registration information 11 in the registration information table 10.
[0030] The registrant name information 13 is information on the name of the registrant registered as the registration information 11 identified by the registration ID information 12. The registrant name information 13 is recorded, for example, in text format.
[0031] The telephone number information 14 is information on the mobile-phone or fixed-line telephone number of the registrant registered as the registration information 11 identified by the registration ID information 12. The telephone number information 14 is referred to, for example, when the user's mobile phone 100 receives a call from the registrant or when the user places a call to the registrant.
[0032] The registered image information 15 is image information on the image to be displayed on the display 110 when the user's mobile phone 100 receives a call from, or places a call to, the registrant registered as the registration information 11 identified by the registration ID information 12. As shown in FIG. 4, the image of the registered image information 15 is, for example, a frontal face image of the registrant photographed so as to fit within a predetermined frame. In FIG. 4, the registered image 50 of the registered image information 15 is image information photographed with the registrant's face contained within a frame 51. The frame 51 is divided by five substantially parallel frame lines 52, and the registrant's face image is positioned so that the eyebrows, eyes, nose, and mouth each fall within one of the divided regions. Within the frame region A where the eyebrows are positioned, the areas extending from the center line of the frame 51 to positions a predetermined distance away in the left and right directions form a right eyebrow region 53, in which the right eyebrow is positioned, and a left eyebrow region 54, in which the left eyebrow is positioned. Further, within the frame region D where the mouth is positioned, the area bounded by positions shifted a predetermined distance to the left and right of the center line of the frame 51 forms a mouth region 55 in which the mouth is positioned. Although the frame 51 and frame lines 52 are illustrated in FIG. 4, they are displayed only as an aid, for example when the registrant is photographed with the camera unit 101 of the mobile phone 100; the frame 51 and frame lines 52 therefore do not appear in the actually registered image 50 of the registered image information 15. Such a registered image 50 may be one photographed and registered with the camera unit 101 provided in the mobile phone 100, or one photographed separately with a commercially available digital camera or the like and then registered.
[0033] The registration detail information 16 is information on the details of the registrant registered as the registration information 11 identified by the registration ID information 12. The details recorded in the registration detail information 16 include, for example, the registrant's address and workplace, the registrant's sex, information on actions to be taken when a call arrives from the registrant, the registrant's e-mail address, and basic voice information on the registrant's voice.
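The registration information table 10 described in paragraphs [0028] to [0033] can be sketched as a simple data structure. The following is a hypothetical illustration only, not the patented implementation: the class and field names are invented for clarity, and the auto-numbered serial ID follows the behavior described in paragraph [0029].

```python
from dataclasses import dataclass, field

@dataclass
class RegistrationInfo:
    """One entry of registration information 11 in the registration information table 10."""
    registration_id: int    # registration ID information 12 (auto-assigned serial number)
    registrant_name: str    # registrant name information 13 (text format)
    phone_number: str       # telephone number information 14
    registered_image: bytes # registered image information 15 (frontal face photo)
    details: dict = field(default_factory=dict)  # registration detail information 16

class RegistrationTable:
    """Registration information table 10: entries keyed by their serial ID."""
    def __init__(self):
        self._records = {}
        self._next_id = 1

    def register(self, name, phone, image, **details):
        # The registration ID is assigned automatically as a serial number ([0029]).
        rec = RegistrationInfo(self._next_id, name, phone, image, details)
        self._records[rec.registration_id] = rec
        self._next_id += 1
        return rec.registration_id

table = RegistrationTable()
rid = table.register("Taro", "090-1234-5678", b"<jpeg bytes>", address="Tokyo")
print(rid)  # 1
```

The one-record-per-registrant layout, with the telephone number kept alongside the image, is what lets the incoming-call lookup in paragraph [0051] resolve a caller to a face image.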
[0034] The first image template table 20 records the deformation patterns for the eyebrow images of the right eyebrow region 53 and the left eyebrow region 54 used when the processing unit 180 deforms and edits the registered image 50 recorded in the registered image information 15. As shown in FIG. 5, the first image template table 20 has a table structure recording a plurality of entries of first image template information 21, each associating first image ID information 22, first image basic information 23, first image pattern information 24 serving as the deformation amount information and deformation direction information, corresponding emotion information 25, and the like.
[0035] The first image ID information 22 is unique information identifying the first image template information 21. The first image ID information 22 differs for each entry of first image template information 21; for example, serial numbers are recorded.
[0036] The first image basic information 23 is the image information forming the basis of the first image template information 21 identified by the first image ID information 22. The first image basic information 23 records, for example, a basic image of the eyebrow in its normal state. The basic image of the first image basic information is formed as a rectangle of substantially the same size as, or geometrically similar to, the rectangles forming the right eyebrow region 53 and the left eyebrow region 54 of the registered image 50 shown in FIG. 4, and an image of, for example, a normal-state eyebrow is positioned substantially at the center of this rectangle.
[0037] The first image pattern information 24 records, taking the basic image of the first image basic information 23 of the first image template information 21 identified by the first image ID information 22 as a base, the deformation direction and the deformation rate, serving as the deformation amount, that indicate in which direction and by how much each part of the eyebrow image of the first image basic information 23 is to be deformed in a given emotional state. Specifically, taking a smiling-state pattern as an example, the first image pattern information 24 records information such as moving the approximate center of the eyebrow upward by, for example, 10 dots and moving both ends of the eyebrow downward by, for example, 20 dots. The first image pattern information 24 may instead record, for example, the proportions by which each part of the eyebrow is moved upward or downward relative to the vertical dimension of the rectangles of the right eyebrow region 53 and the left eyebrow region 54, or may record the deformation amount and deformation direction in vector format.
[0038] The corresponding emotion information 25 is information indicating which emotion the change rate of the first image pattern information 24 of the first image template information 21 identified by the first image ID information 22 represents. In the corresponding emotion information 25, each emotion is recorded as a numerical value, for example "0" for the "normal" or "expressionless" state, "1" for the "smiling" state, and "2" for the angry state. Alternatively, this emotion information may be recorded in text format, for example as "normal", "smile", or "anger".
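The per-emotion deformation recorded in the first image pattern information 24 can be sketched as a set of displacement vectors applied to named control points of the base eyebrow image. This is a minimal illustration under assumptions: the control-point names and coordinate convention (y increasing downward, so "up" is a negative offset) are invented, while the amounts follow the smile example in paragraph [0037] (center up 10 dots, both ends down 20 dots).

```python
# Hypothetical smile pattern, per paragraph [0037]: eyebrow center moves up
# 10 dots, both ends move down 20 dots. (dx, dy) with y growing downward.
SMILE_EYEBROW_PATTERN = {
    "center": (0, -10),
    "left_end": (0, 20),
    "right_end": (0, 20),
}

def deform(base_points, pattern):
    """Apply a deformation pattern (direction + amount) to base control points."""
    return {
        name: (x + pattern.get(name, (0, 0))[0],
               y + pattern.get(name, (0, 0))[1])
        for name, (x, y) in base_points.items()
    }

# Base ("normal state") eyebrow control points inside the eyebrow rectangle.
base = {"left_end": (0, 50), "center": (30, 40), "right_end": (60, 50)}
print(deform(base, SMILE_EYEBROW_PATTERN))
```

The same mechanism would cover the proportional and vector-format variants mentioned in [0037], by scaling the offsets against the region rectangle instead of using fixed dot counts.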
[0039] The second image template table 30 records the deformation patterns for the mouth image of the mouth region 55 used when the processing unit 180 deforms and edits the registered image 50 recorded in the registered image information 15. As shown in FIG. 6, the second image template table 30 has a table structure recording a plurality of entries of second image template information 31, each associating second image ID information 32, second image basic information 33, second image pattern information 34 serving as the deformation amount information and deformation direction information, corresponding emotion information 35, and the like.
[0040] The second image ID information 32 is unique information identifying the second image template information 31. The second image ID information 32 differs for each entry of second image template information 31; for example, serial numbers are recorded.
[0041] The second image basic information 33 is the image information forming the basis of the second image template information 31 identified by the second image ID information 32. The second image basic information 33 records, for example, a basic image of the mouth in its normal state. This basic image is formed as a rectangle of substantially the same size as, or geometrically similar to, the rectangle forming the mouth region 55 of the registered image shown in FIG. 4, and an image of, for example, a normal-state mouth is positioned substantially at the center of this rectangle.
[0042] The second image pattern information 34 records, taking the second image basic information 33 of the second image template information 31 identified by the second image ID information 32 as a base, the deformation direction and the change rate, serving as the deformation amount, that indicate in which direction and by how much each part of the mouth image of the second image basic information 33 is to be moved in a given emotional state. Taking a smiling state as an example, information such as moving both ends of the mouth image upward by, for example, 20 dots is recorded. The change rate may instead be recorded, for example, as the proportions by which each part of the mouth is moved upward, downward, or sideways relative to the vertical or horizontal dimension of the rectangle of the mouth region 55, or may be recorded, for example, in vector format.
[0043] Like the corresponding emotion information 25 of the first image template information 21, the corresponding emotion information 35 is information indicating which emotion the change rate of the second image pattern information 34 of the second image template information 31 identified by the second image ID information 32 represents. In the corresponding emotion information 35, each emotion may be recorded as a numerical value, for example "0" for the "normal" or "expressionless" state, "1" for the "smiling" state, and "2" for the angry state, or may be recorded in text format.
[0044] The emotion recognition information table 40 is the group of data used when the processing unit 180 recognizes the emotion of the speaker who produced a voice from the state of the partner speaker's voice information, for example the loudness or tempo of the voice. The emotion recognition information table 40 has a table structure recording a plurality of entries of emotion recognition information 41, each constructed as a single data item associating emotion ID information 42, voice pattern information 43 serving as the voice state change information, emotion information 44, and the like.
[0045] The emotion ID information 42 is unique information identifying the emotion recognition information 41. The emotion ID information 42 is provided for each entry of emotion recognition information 41, and each entry holds different information.
[0046] The voice pattern information 43 is information on the voice state of a voice. Specifically, the voice pattern information 43 records, for example, phrase information on words and phrases expressing emotion, such as "kora" (a scolding interjection) or "hahaha", intensity information on the loudness of the voice, pitch information on the highness or lowness of the voice, and tempo information on the tempo of the voice.
[0047] The emotion information 44 is information on the emotion corresponding to the voice pattern information 43 of the emotion recognition information 41 identified by the emotion ID information 42. For example, when the phrase "hahaha" is recorded in the voice pattern information 43 as phrase information combined with a loud voice state, the emotion information 44 records ID information indicating a state of laughing heartily, for example "1". The emotion information 44 may instead be recorded in text format, for example as "smile" or "anger". The emotion information 44 is associated with the corresponding emotion information 25 of the first image template information 21 and the corresponding emotion information 35 of the second image template information 31 described above.
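The emotion recognition information table 40 couples voice-pattern features with an emotion ID that the image template tables can then consume. The following sketch shows one naive way such a lookup could work; it is an assumption-laden illustration, not the patented method: the feature set (a transcript keyword plus a loudness value), the thresholds, and the table contents are all invented for the example, with only the "loud 'hahaha' means laughing, ID 1" mapping taken from paragraph [0047].

```python
# Hypothetical emotion recognition information 41 entries: each couples a
# voice pattern (keyword + minimum loudness) with an emotion ID. Only the
# loud-"hahaha"-to-ID-1 mapping comes from the text; the rest is illustrative.
EMOTION_TABLE = [
    {"emotion_id": 1, "keyword": "hahaha", "min_loudness": 0.7},  # hearty laughter
    {"emotion_id": 2, "keyword": "kora",   "min_loudness": 0.8},  # anger (scolding)
]

def recognize_emotion(transcript, loudness, default=0):
    """Return the emotion ID whose voice pattern matches; 0 (= normal) if none."""
    for entry in EMOTION_TABLE:
        if entry["keyword"] in transcript and loudness >= entry["min_loudness"]:
            return entry["emotion_id"]
    return default

print(recognize_emotion("hahaha that is funny", 0.9))  # 1
print(recognize_emotion("hello", 0.5))                 # 0
```

Because the returned ID matches the corresponding emotion information 25 and 35, the result can be used directly to select the eyebrow and mouth deformation patterns.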
[0048] The memory 170 stores, in an appropriately readable manner, setting items and the like entered via the operation unit 140. The memory 170 also stores various programs that run on the OS (Operating System) controlling the operation of the entire mobile phone 100. The memory 170 may be configured, for example, with a drive or driver that stores data readably on a recording medium such as an HD (Hard Disk) or a magneto-optical disk.
[0049] The processing unit 180 has various input/output ports (not shown), for example a display control port to which the display 110 is connected, an audio output control port to which the audio output unit 120 is connected, a transmitting/receiving port to which the transmitting/receiving unit 130 is connected, an input port to which the operation unit 140 is connected, an audio input control port to which the audio input unit 150 is connected, a storage port to which the storage means 160 is connected, and a memory port to which the memory 170 is connected. As its various programs, the processing unit 180 includes, as shown in FIG. 8, arrival/departure recognition means 181 that also functions as the calling means, partner speaker recognition means 182, voice recognition means 183 serving as the voice information recognition means, voice change determination means 184 serving as the voice state recognition means, image processing means 185 that also functions as the image information recognition means and the image deformation processing means, display control means 186, photographing means 187, and the like.
[0050] The origination/termination recognition means 181 recognizes incoming calls to the mobile phone 100 from other communication devices and outgoing calls from the mobile phone 100 to other communication devices. Specifically, the origination/termination recognition means 181 controls the transmission/reception unit 130 to receive incoming-call information requesting a call from another communication device. When the origination/termination recognition means 181 recognizes incoming-call information, it controls the audio output unit 120 to output a ring tone such as a voice, warning sound, or notification sound to inform the user that incoming-call information has been received. Further, when the origination/termination recognition means 181 recognizes, through an input operation on the user's operation unit 140, request information indicating that a call is to be placed to a predetermined destination communication device, it transmits outgoing-call information requesting a call to that destination communication device.
[0051] The partner speaker recognition means 182 recognizes the calling party or the called party when the origination/termination recognition means 181 receives incoming-call information or transmits outgoing-call information to a destination communication device. Specifically, the partner speaker recognition means 182 recognizes, from the incoming-call information received by the origination/termination recognition means 181, the telephone number of the calling party that transmitted the incoming-call information. It then recognizes, from the registration information table 10, the registration information 11 having telephone number information 14 that matches the recognized telephone number of the calling party. Further, the partner speaker recognition means 182 recognizes the destination telephone number of the outgoing-call information transmitted by the origination/termination recognition means 181, and recognizes, from the registration information table 10, the registration information 11 having telephone number information 14 that matches the destination telephone number. The partner speaker recognition means 182 then stores the recognized registration information 11 in the memory 170 so that it can be read out as needed.
[0052] The voice recognition means 183 recognizes the state of the partner speaker's voice from the partner speaker's voice information received by the transmission/reception unit 130, and recognizes it as basic voice information serving as the partner speaker's standard voice state information. Specifically, the voice recognition means 183 recognizes the partner speaker's voice information input via the transmission/reception unit 130. It then determines the partner speaker's voice pattern from the voice state of the voice information, that is, the pitch of the voice, the strength of the voice, the tempo at which the voice is uttered, and so on, and recognizes these as the basic voice information. The voice recognition means 183 may also be configured to recognize the partner speaker's gender, basic voice characteristics, and the like based on the content described in the registration detail information 16 of the registration information 11 and use these as the basic voice information.
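Paragraph [0052] names three voice-pattern features (pitch, strength, and speaking tempo) but does not say how they are measured. The sketch below shows one crude way such a baseline profile could be computed from raw audio samples; every feature definition here is an assumption for illustration, not the patent's method.

```python
import math

def basic_voice_info(samples, sample_rate):
    """Derive a crude (pitch, strength, tempo) profile from raw audio samples.

    Hypothetical feature definitions -- the patent only names the three
    quantities (pitch, strength, tempo) without specifying how they are
    computed.
    """
    n = len(samples)
    # Strength: root-mean-square amplitude.
    strength = math.sqrt(sum(s * s for s in samples) / n)
    # Pitch proxy: zero-crossing rate in Hz (rises with fundamental frequency).
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    pitch = crossings * sample_rate / (2 * n)
    # Tempo proxy: fraction of 10 ms frames whose peak exceeds the RMS level,
    # i.e. how densely speech fills the interval.
    frame = sample_rate // 100 or 1
    frames = [samples[i:i + frame] for i in range(0, n, frame)]
    active = sum(1 for f in frames if f and max(abs(s) for s in f) > strength)
    tempo = active / len(frames)
    return {"pitch": pitch, "strength": strength, "tempo": tempo}
```

A profile like this, stored once at the start of a call (step S209 below), is what later voice states would be compared against.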
[0053] When the voice of the voice information received by the transmission/reception unit 130 changes from the partner speaker's basic voice information recognized by the voice recognition means 183, the voice change determination means 184 recognizes a change in the partner speaker's emotion from the changed voice information. Specifically, when the voice state of the partner speaker's voice information changes, the voice change determination means 184 detects the voice pattern of that voice state. Then, when the detected voice pattern differs from the voice pattern of the basic voice information in, for example, pitch, strength, or tempo, it recognizes, based on the emotion recognition information table 40, the voice pattern information 42 that substantially matches the voice pattern of the received voice, and recognizes the emotion recognition information 41 corresponding to this voice pattern information 42. Further, when the voice change determination means 184 recognizes that the voice of the received voice information contains phrase information expressing an emotion, such as "kora" ("hey!") or "hahaha", it recognizes the emotion recognition information 41 whose voice pattern information 43 contains this phrase information. The voice change determination means 184 also stores the emotion recognition information 41, including the recognized emotion information 44, in the memory 170 so that it can be read out as needed.
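The emotion recognition information table 40 is described only abstractly. The sketch below assumes a small hypothetical table and shows how the two recognition paths of paragraph [0053] — phrase matching and voice-pattern comparison against the baseline — could be combined. The table contents, threshold, and function names are all illustrative assumptions.

```python
# Hypothetical stand-in for emotion recognition information table 40: each
# entry pairs an emotion with the sign of the expected feature change and
# with trigger phrases. The real table's contents are not disclosed.
EMOTION_TABLE = [
    {"emotion": "smiling", "pitch": +1, "tempo": +1, "phrases": ["hahaha"]},
    {"emotion": "angry", "strength": +1, "pitch": +1, "phrases": ["kora"]},
    {"emotion": "sad", "pitch": -1, "strength": -1, "tempo": -1, "phrases": []},
]

def recognize_emotion(baseline, current, transcript="", threshold=0.2):
    """Compare the current voice pattern with the baseline profile and look
    up the closest entry in the hypothetical emotion table."""
    # Phrase match takes precedence, mirroring paragraph [0053].
    for entry in EMOTION_TABLE:
        if any(p and p in transcript for p in entry["phrases"]):
            return entry["emotion"]
    # Otherwise score each entry by how many of its expected feature
    # changes (with the right sign) exceed the relative threshold.
    deltas = {
        k: (current[k] - baseline[k]) / baseline[k]
        for k in ("pitch", "strength", "tempo")
    }
    best, best_score = None, 0
    for entry in EMOTION_TABLE:
        score = sum(
            1
            for k, sign in entry.items()
            if k in deltas and sign * deltas[k] > threshold
        )
        if score > best_score:
            best, best_score = entry["emotion"], score
    return best  # None means no significant change was detected
```

Returning `None` when nothing exceeds the threshold corresponds to the case where the voice state has not changed and no deformation is triggered.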
[0054] The image processing means 185 recognizes the registered image information 15 of the registration information 11 corresponding to the partner speaker. It then changes the image of the registered image information 15 based on the emotion information 44 of the emotion recognition information 41 recognized by the voice change determination means 184. Specifically, the image processing means 185 recognizes the right eyebrow region 53 and the left eyebrow region 54 of the registered image 50 recorded in the registered image information 15. Based on the first image template table 20, the image processing means 185 then searches for first image template information 21 having first image basic information 23 whose image substantially matches the eyebrow images of the right eyebrow region 53 and the left eyebrow region 54. For this, for example, the eyebrow image recorded in the first image basic information 23 is superimposed on the right eyebrow region 53 and the left eyebrow region 54 of the registered image information 15, and the first image basic information 23 with the greatest degree of overlap of the eyebrow portion is retrieved. Next, from among the first image template information 21 narrowed down by the search, the image processing means 185 recognizes the first image template information 21 having corresponding emotion information 25 that corresponds to the emotion information 44 recognized by the voice change determination means 184. It then deforms the shape of the eyebrows in the right eyebrow region 53 and the left eyebrow region 54 of the registered image 50 of the registered image information 15 according to the change rate recorded in the first image pattern information 24 of this first image template information 21.
[0055] Similarly to the deformation processing of the eyebrow images in the right eyebrow region 53 and the left eyebrow region 54, the image processing means 185 also deforms the mouth image. That is, the image processing means 185 recognizes the mouth region 55 of the registered image 50. Based on the second image template table 30, it then searches for second image template information 31 having second image basic information 33 whose image substantially matches the mouth image of this mouth region 55. For this, for example, the mouth image recorded in the second image basic information 33 is superimposed on the mouth region 55 of the registered image information 15, and the second image basic information 33 with the greatest degree of overlap of the mouth portion is retrieved. Next, from among the second image template information 31 narrowed down by the search, the image processing means 185 recognizes the second image template information 31 having corresponding emotion information 35 that corresponds to the emotion information 44 recognized by the voice change determination means 184. It then deforms the shape of the mouth in the mouth region 55 of the registered image 50 of the registered image information 15 according to the change rate recorded in the second image pattern information 34 of this second image template information 31. The image processing means 185 also stores the registered image 50 with the deformed eyebrows and mouth in the memory 170 as an emotion-deformed image so that it can be read out as needed.
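The template search of paragraphs [0054] and [0055] superimposes each template image on the eyebrow or mouth region and keeps the template with the greatest degree of overlap. A minimal sketch of that selection over binary images follows; the data layout (nested lists of 0/1 pixels, dict-based template records) is an assumption for illustration.

```python
def overlap_degree(template, region):
    """Degree of overlap between a binary template and a same-sized binary
    image region (1 = feature pixel, 0 = background): the fraction of the
    template's feature pixels that also appear in the region."""
    matched = sum(
        t and r
        for trow, rrow in zip(template, region)
        for t, r in zip(trow, rrow)
    )
    total = sum(sum(row) for row in template)
    return matched / total if total else 0.0

def best_template(templates, region):
    """Pick the template whose feature pixels overlap the region most,
    mirroring the search over first image basic information 23."""
    return max(templates, key=lambda tpl: overlap_degree(tpl["image"], region))
```

The same selection would run once with the eyebrow templates (table 20) and once with the mouth templates (table 30).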
[0056] The display control means 186 controls the display 110 to display the image of the registered image information 15 of the registration information 11 in the display area. Specifically, when the origination/termination recognition means 181 recognizes call request information indicating that a call is to be placed to the partner speaker's communication device through the user's operation of the operation unit 140, the display control means 186 controls the display 110 to display the registered image 50 recorded in the registered image information 15 of this partner speaker's registration information 11. Further, when the origination/termination recognition means 181 recognizes that the mobile phone 100 has an incoming call from a communication device such as another mobile phone or a fixed-line telephone, and the partner speaker recognition means 182 recognizes the registered image information 15 of the registration information 11 of the calling partner speaker, the display control means 186 controls the display 110 to display the registered image 50 of this registered image information 15.
[0057] The display control means 186 also controls the display 110 to display the emotion-deformed image processed by the image processing means 185 in the display area of the display 110, as shown, for example, in FIGS. 9A to 9C. Here, FIG. 9A is an image in which the voice change determination means 184 has determined that the voice state of the partner speaker's voice information is, for example, a laughing state, and the image processing means 185 has deformed the registered image 50 according to the corresponding emotion information 44. FIG. 9B is an image in which the voice change determination means 184 has determined that the voice pattern of the partner speaker's voice information is, for example, an "angry state", and the image processing means 185 has deformed the registered image 50 according to the corresponding emotion information 44. FIG. 9C is an image in which the voice change determination means 184 has determined that the voice pattern of the partner speaker's voice information is, for example, a "sad state", and the image processing means 185 has deformed the registered image 50 according to the corresponding emotion information 44. Furthermore, the display control means 186 processes video input from the camera unit 101 as image information and controls the display 110 to display it in the display area.
[0058] When the photographing means 187 recognizes request information indicating that an image is to be photographed through an input operation on the user's operation unit 140, it controls the camera unit 101 so that images can be captured. It also controls the display 110 to display the video within the shooting range of the camera unit 101. Furthermore, when the photographing means 187 recognizes, through the user's setting input, request information indicating that an image processable by the image processing means 185 is to be photographed, it controls the display 110 to display the frame 51 and the frame lines 52 as shown in FIG. 3. Then, when it recognizes request information indicating that a photograph is to be taken through the user's input operation, it captures the video within the shooting range of the camera unit 101 and stores it in the storage means 160 as image information.
[0059] [Operation of the Mobile Phone]
(Photographing Process)
Next, as an operation of the mobile phone 100, the deformable-image photographing process will be described with reference to FIG. 10. FIG. 10 is a flowchart showing the photographing process of the mobile phone 100.
[0060] First, when photographing request information requesting the camera unit 101 to photograph a predetermined video is set and input through the user's operation of the operation unit 140, the processing unit 180 of the mobile phone 100 recognizes this photographing request information (step S101) and activates the camera unit 101 for photographing (step S102).
[0061] Next, the processing unit 180 causes the display control means 186 to control the display 110 to display a screen asking whether the photograph is of a deformable image, that is, an image to be subjected to image processing according to the partner speaker's voice state during a call with the partner speaker (step S103). Then, when the user sets and inputs information in step S103 indicating that a deformable image is not to be photographed, the photographing means 187 of the processing unit 180 causes the display control means 186 to display the video within the shooting range of the camera unit 101 in the display area of the display 110.
[0062] On the other hand, when the processing unit 180 recognizes, through the user's setting input in step S103, information indicating that a deformable image is to be photographed, it causes the photographing means 187 to display the frame 51 and the frame lines 52 as shown in FIG. 4 within the display area of the display 110 (step S104). Then, when the user focuses the camera unit 101 so that the subject's face falls within the frame 51, the right eyebrow and left eyebrow fall within the right eyebrow region 53 and the left eyebrow region 54 respectively, and the mouth falls within the mouth region 55, and inputs information through the operation unit 140 indicating that a photograph is to be taken, the photographing means 187 of the processing unit 180 captures the video displayed on the display 110 as image information (step S105). Likewise, when the processing unit 180 has recognized in step S103 an input indicating that a deformable image is not to be photographed, the photographing means 187 captures the video displayed on the display 110 as image information.
[0063] The processing unit 180 then stores the captured image in the storage means 160 so that it can be read out as needed (step S106). Further, when the processing unit 180 has recognized in step S103 a setting input indicating that a deformable image is to be photographed, it selects whether or not to register the captured deformable image in the registration information 11. Then, when it recognizes, through the user's setting input, information indicating registration in the registered image information 15 of predetermined registration information 11, the processing unit 180 records the captured image information as the registered image information 15 of the predetermined registration information 11 specified by the user's setting input.
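The photographing flow of steps S101 to S106 can be sketched as a single function. The callable parameters below are hypothetical stand-ins for the user's setting inputs, camera unit 101, and storage means 160; the patent does not define any such interface.

```python
def photographing_process(deformable, confirm_shot, camera_capture, storage):
    """Sketch of steps S101-S106 in FIG. 10.

    `deformable` models the user's answer in step S103, `confirm_shot`
    models the shutter input, `camera_capture` stands in for camera unit
    101, and `storage` (a list) stands in for storage means 160.
    """
    overlays = []
    if deformable:
        # Step S104: show frame 51 and frame lines 52 so the eyebrows and
        # mouth land in regions 53, 54, and 55.
        overlays = ["frame 51", "frame lines 52"]
    if not confirm_shot():
        return None                      # the user never takes the shot
    image = camera_capture()             # step S105: capture the displayed video
    record = {"image": image, "overlays": overlays}
    storage.append(record)               # step S106: store as image information
    return record
```

Whether the stored record is then copied into registered image information 15 is a separate, user-driven choice, as paragraph [0063] describes.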
[0064] (Incoming Call Process of the Mobile Phone)
Next, the incoming call process when the mobile phone 100 receives an incoming call from another mobile phone, a fixed-line telephone, or the like will be described with reference to FIG. 11. FIG. 11 is a flowchart of the incoming call process when the mobile phone receives a call.
[0065] When the processing unit 180 of the mobile phone 100 recognizes, at the origination/termination recognition means 181, that the mobile phone 100 has an incoming call from a communication device such as another mobile phone or a fixed-line telephone, that is, that incoming-call information requesting a call has been input from the transmission/reception unit 130 (step S201), it causes the partner speaker recognition means 182 to recognize the registration information 11 of the calling partner speaker (step S202). That is, the partner speaker recognition means 182 of the processing unit 180 recognizes the partner speaker's telephone number recorded in the incoming-call information and recognizes the registration information 11 having telephone number information 14 that matches this partner speaker's telephone number. After step S202, the partner speaker recognition means 182 recognizes the registered image information 15 of the recognized registration information 11 (step S203).
[0066] If there is no registration information 11 for the partner speaker in step S202, or no registered image information 15 for the partner speaker in step S203, the processing unit 180, for example, causes the display control means 186 to display the caller's telephone number in the display area of the display 110. Then, when the processing unit 180 recognizes an operation signal indicating that the user answers the incoming call through operation of the operation unit 140 (step S204), it controls the transmission/reception unit 130 to establish a communication connection with the calling partner speaker's communication device, via a network such as a telephone line or the Internet, in a state in which the two parties can talk to each other. Then, when the processing unit 180 recognizes, through the user's operation of the operation unit 140, information indicating that the call with the partner speaker is to be ended, or recognizes information from the partner speaker's communication device indicating that the call is to be ended, it controls the transmission/reception unit 130 to cancel the call-ready state and ends the communication (step S205).
[0067] On the other hand, when the registration information 11 is recognized by the partner speaker recognition means 182 in step S202 and the registered image information 15 is further recognized in step S203, the processing unit 180 causes the display control means 186 to control the display 110 to display the registered image 50 of the registered image information 15 in the display area of the display 110 (step S206).
[0068] Thereafter, when the processing unit 180 recognizes an operation signal indicating that the user answers the incoming call through operation of the operation unit 140 (step S207), it controls the transmission/reception unit 130 to establish a communication connection with the calling partner speaker's communication device, via a network such as a telephone line or the Internet, in a state in which the two parties can talk to each other. On the other hand, if in step S207 no operation signal indicating that the incoming call is answered can be recognized through the user's operation of the operation unit 140, or an operation signal indicating that the incoming call is rejected is recognized, the reception of the incoming-call information is terminated and the incoming call process ends.
[0069] Then, when the processing unit 180 recognizes in step S207 the operation signal indicating that the incoming call is answered and a communication connection with the partner speaker's communication device is established, the voice recognition means 183 recognizes the calling partner speaker's voice information (step S208). Thereafter, the voice recognition means 183 of the processing unit 180 analyzes the received voice information and recognizes the state of the partner speaker's voice, that is, the pitch and strength of the voice, the tempo of the partner speaker's way of speaking, and so on, as the basic voice information (step S209). The voice recognition means 183 also stores the recognized basic voice information in the memory 170 so that it can be read out as needed.
[0070] Next, when the processing unit 180 detects, while the user is talking with the partner speaker, that the partner speaker's voice state has changed (step S210), that is, when it detects a change in the voice state of the voice information transmitted from the partner speaker's communication device, the voice change determination means 184 recognizes the partner speaker's emotion information 44 from the voice state of the received voice information (step S211). Specifically, the voice change determination means 184 analyzes the voice state of the partner speaker's voice information and compares it with the basic voice information stored in the memory 170. Then, when the voice state of the basic voice information and the voice state of the received voice information differ, the voice change determination means 184 of the processing unit 180 recognizes the pattern of the changed voice, for example, the amount of change in the pitch of the voice, the amount of change in the strength of the voice, the amount of change in the tempo, and so on. Based on the emotion recognition information table 40, the voice change determination means 184 then searches for the voice pattern information 43 that matches the partner speaker's voice pattern and recognizes the emotion recognition information 41 having this voice pattern information 43. Further, the voice change determination means 184 recognizes, in the partner speaker's voice information, phrase information recorded in the voice pattern information 43, for example, phrases expressing emotions such as "kora" ("hey!") or "hahaha", and recognizes the emotion recognition information 41 corresponding to this phrase information. Furthermore, the voice change determination means 184 stores the recognized emotion recognition information 41 in the memory 170 so that it can be read out as needed.
[0071] After step S211, the image processing means 185 of the processing unit 180 reads the registered image 50 recorded in the registered image information 15 and edits the image of the registered image 50 (step S212). Specifically, the image processing means 185 of the processing unit 180 reads the first image template table 20 and the second image template table 30. It then recognizes the emotion information 44 of the emotion recognition information 41 recognized in step S211 and stored in the memory 170, and recognizes the first image template information 21 and the second image template information 31 having corresponding emotion information 25 and 35 that correspond to this emotion information 44.
[0072] Thereafter, the image processing means 185 of the processing unit 180 deforms the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 of the registered image 50 according to the first image pattern information 24 of the first image template information 21 and the second image pattern information 34 of the second image template information 31, and creates an emotion-changed image (step S213).
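The patent records only a "change rate" in the first and second image pattern information 24 and 34 without fixing the geometry of the deformation. One plausible reading, sketched below with hypothetical names, scales a region's control points about an anchor point by per-axis rates; the actual deformation model is not disclosed.

```python
def deform_region(points, anchor, rate):
    """Scale a region's control points away from an anchor point by the
    per-axis change rate (rate_x, rate_y) recorded in the hypothetical
    image pattern information. A rate of (1.0, 1.0) leaves the region
    unchanged; larger or smaller rates stretch or compress it."""
    ax, ay = anchor
    rx, ry = rate
    return [(ax + (x - ax) * rx, ay + (y - ay) * ry) for x, y in points]
```

Under this reading, a "smiling" entry might widen the mouth region 55 (rate_x > 1) while an "angry" entry pulls the inner eyebrow points downward.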
[0073] For example, consider the case in which information indicating a "smiling state" is recorded in the emotion information 44 of the emotion recognition information 41 recognized by the voice change determination means 184 in step S211. In this case, the image processing means 185 deforms the images of the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 of the registered image 50 into, for example, a smiling image as shown in FIG. 9A, according to the first image pattern information 24 of the first image template information 21 and the second image pattern information 34 of the second image template information 31 having corresponding emotion information 25 and 35 that correspond to this emotion information 44. Similarly, when information indicating an "angry state" is recorded in the emotion information 44 of the emotion recognition information 41 recognized by the voice change determination means 184, the images of the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 of the registered image are deformed into an "angry state" face image as shown, for example, in FIG. 9B. Furthermore, when information indicating a "sad state" is recorded in the emotion information 44 of the emotion recognition information 41 recognized by the voice change determination means 184, the images of the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 of the registered image are deformed into a "sad state" face image as shown, for example, in FIG. 9C.
[0074] The image processing means 185 of the processing unit 180 then stores the emotion change image deformed as described above in the memory 170 so that it can be read out as needed.
[0075] Thereafter, the processing unit 180 causes the display control means 186 to control the display 110 and display the emotion change image created in step S212 in the display area of the display 110.
[0076] When the processing unit 180 recognizes an operation signal indicating that the call with the other speaker is to be ended through the user's operation of the operation unit 140, it releases the communication connection and ends the call. If the call is to be continued, the process returns to step S210 and the voice state of the other speaker is recognized again (step S214).
[0077] (Outgoing call processing of the mobile phone)
Next, outgoing call processing, in which the mobile phone 100 places a call to the other speaker, will be described with reference to FIG. 12. FIG. 12 is a flowchart of the outgoing call processing when the mobile phone places a call. In FIG. 12, processing that is substantially the same as the incoming call processing of the mobile phone 100 in FIG. 11 is given the same reference numerals, and its description is omitted or simplified.
[0078] In FIG. 12, the mobile phone 100 first recognizes, at the outgoing/incoming call recognition means 181 of the processing unit, call request information indicating that a call is to be placed to another party's communication device through the user's operation of the operation unit 140 (step S301). If the processing unit 180 determines that the registration information 11 of the called party is not recorded for this call request information, it recognizes the destination telephone number entered through the user's operation of the operation unit 140 and transmits outgoing call information to the destination communication device (step S302). When the called party answers the call (step S303), the processing unit 180 controls the transmission/reception unit 130 to establish a communication connection with the called party's communication device via a network such as a telephone line or the Internet so that the two parties can talk to each other. Then, when the processing unit 180 recognizes information indicating that the call with the other speaker is to be ended through the user's operation of the operation unit 140, or incoming information from the other speaker's communication device indicating that the call is to be ended, it controls the transmission/reception unit 130 to release the call-enabled state and ends the communication (step S205).
[0079] On the other hand, if the processing unit 180 recognizes in step S301 that the registration information 11 of the call destination is recorded in the registration information table 10, it determines whether the registered image information 15 is recorded in this registration information 11 (step S304). If the registered image information 15 is not recorded in step S304, the processing unit 180 executes step S302 and places the call to the telephone number recorded in the telephone number information 14 of the registration information 11.
[0080] If the registered image information 15 is recorded in the registration information 11 in step S304, the processing unit 180 causes the display control means 186 to control the display 110 and display the registered image 50 of the registered image information 15 in the display area of the display 110 (step S305).
[0081] Furthermore, the processing unit 180 controls the transmission/reception unit 130 to transmit outgoing call information to the telephone number recorded in the telephone number information 14 of the registration information 11 (step S306).
[0082] When the called party responds to the outgoing call information transmitted in step S306 (step S307), the processing unit 180 controls the transmission/reception unit 130 to establish a communication connection with the called party's communication device via a network such as a telephone line or the Internet so that the two parties can talk to each other. On the other hand, if the other speaker does not respond to the outgoing call information in step S307, the outgoing call processing of the mobile phone 100 is terminated.
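The outgoing-call branch of FIG. 12 (steps S301–S307) can be summarized as follows. This is an illustrative sketch only: the function and field names (`outgoing_call_actions`, `"image"`, `"number"`) are assumptions, and the actual device of course performs real dialing and display control rather than returning action strings.

```python
def outgoing_call_actions(registry, destination, manual_number, answered):
    """Return the sequence of actions the phone would take for one
    outgoing call, following the branches of steps S301-S307."""
    actions = []
    entry = registry.get(destination)               # S301: look up registration info 11
    if entry is None:
        actions.append(f"dial:{manual_number}")     # S302: dial manually entered number
        return actions
    if entry.get("image") is None:                  # S304: registered image 50 present?
        actions.append(f"dial:{entry['number']}")   # S302: dial number info 14
        return actions
    actions.append("display:registered_image")      # S305: show registered image 50
    actions.append(f"dial:{entry['number']}")       # S306: dial number info 14
    if answered:                                    # S307: called party answered?
        actions.append("emotion_loop")              # continue with steps S208-S214
    else:
        actions.append("end")                       # terminate outgoing call processing
    return actions
```

The sketch makes explicit that the registered image is shown before dialing only when both the registration entry and its image exist, which is the distinction drawn in paragraphs [0079] and [0080].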
[0083] When the other speaker responds to the outgoing call information in step S307, the processing unit 180 performs the processing of steps S208 to S214 of the incoming call processing described above. That is, the processing unit 180 executes step S208, and the voice recognition means 183 recognizes the voice information of the called party. The voice recognition means 183 of the processing unit 180 then analyzes the received voice information and recognizes the state of the other speaker's voice, that is, the pitch and strength of the voice, the tempo of the other speaker's way of speaking, and so on, as basic voice information. The voice recognition means 183 also stores the recognized basic voice information in the memory 170 so that it can be read out as needed.
[0084] Next, the processing unit 180 performs the processing of step S209, detects that the voice state of the other speaker has changed while the user is talking to the other speaker, and causes the voice change determination means 184 to recognize the emotion information 44 of the other speaker from the voice state of the received voice information. The voice change determination means 184 also stores the recognized emotion recognition information 41 in the memory 170 so that it can be read out as needed.
[0085] The processing unit 180 then performs the processing of step S210, causes the image processing means 185 to read the registered image 50 recorded in the registered image information 15 and edit the image of the registered image 50, and creates an emotion change image. The image processing means 185 of the processing unit also stores the emotion change image in the memory 170 so that it can be read out as needed.
[0086] Thereafter, the processing unit 180 performs the processing of step S213, causes the display control means 186 to control the display 110, and displays the emotion change image created in step S212 in the display area of the display 110.
[0087] Then, when the processing unit 180 performs the processing of step S214 and recognizes an operation signal indicating that the call with the other speaker is to be ended through the user's operation of the operation unit 140, it releases the communication connection and ends the call. If the call is to be continued, the process returns to step S210 and the voice state of the other speaker is recognized again.
[0088] [Functions and Effects of the Mobile Phone]
As described above, in the mobile phone 100 of the present embodiment, the voice change determination means 184 of the processing unit 180 recognizes the voice pattern of the other speaker's voice information, and when it is determined that the voice pattern has changed, the image processing means 185 changes the registered image 50 of the registered image information 15 in accordance with the change in the voice pattern. The registered image 50 of the registered image information 15 can therefore be deformed in accordance with changes in the voice pattern as if a moving image were being played back. Moreover, the voice pattern of the other speaker changes with the other speaker's emotion. Accordingly, the registered image 50 of the registered image information 15 can be deformed in accordance with the other speaker's emotion, and an appropriate emotion change image corresponding to the other speaker's emotion can be displayed on the display 110.
[0089] In addition, the registered image 50 of the registered image information 15 is deformed and displayed as the emotion change image. It is therefore unnecessary to prepare a plurality of images corresponding to each emotion, so the free space of the storage means 160 can be used effectively without straining its storage capacity.
[0090] The image processing means 185 also generates the emotion change image by deforming the registered image 50 of the registered image information 15 based on the first image template information 21 and the second image template information 31. Accordingly, only the facial expression can be deformed without altering the background of the face image displayed in the display area of the display 110 or the position at which the face image is arranged. Therefore, when the image displayed on the display 110 is switched from the registered image 50 to an emotion change image, or from one emotion change image to another, the images can be switched smoothly without shifting and becoming hard to see.
[0091] Furthermore, the image processing means 185 deforms the left eyebrow, the right eyebrow, and the mouth of the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 by moving their dots, for example by a predetermined amount, based on the eyebrow and mouth change rates recorded in the first image pattern information 24 and the second image pattern information 34. The facial expression can thus be changed easily by applying image deformation processing to only a part of the registered image 50 of the registered image information 15. Since image processing is performed on only a part of the registered image 50, the processing load on the processing unit associated with the image processing can be reduced.
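The effect of deforming only a small region can be illustrated with a toy helper that shifts the rows of one rectangular region and leaves the rest of the image untouched. This is an assumption-laden sketch: the patent does not specify pixel-level operations, and the function name and 2D-list image representation are illustrative.

```python
def shift_region_rows(image, top, bottom, left, right, shift):
    """Return a copy of `image` (a list of pixel rows) with the pixels of
    the rectangular region [top, bottom) x [left, right) moved up by
    `shift` rows; rows vacated inside the region are filled with 0.
    Pixels outside the region are never touched."""
    out = [row[:] for row in image]
    for r in range(bottom - top):
        src = top + r + shift  # row inside the region the pixels come from
        for c in range(left, right):
            out[top + r][c] = image[src][c] if top <= src < bottom else 0
    return out
```

Only `(bottom - top) * (right - left)` pixels are visited, so the cost scales with the eyebrow or mouth region rather than the whole registered image, mirroring the reduced processing load described above.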
[0092] Furthermore, the image processing means 185 changes the right eyebrow of the right eyebrow region 53, the left eyebrow of the left eyebrow region 54, and the mouth of the mouth region 55 individually. The eyebrows and mouth of the face, which best express changes in emotion, can thus be deformed, so the user can easily confirm the other speaker's emotion from the emotion change image.
[0093] The image processing means 185 also deforms the eyebrows of the right eyebrow region 53 and the left eyebrow region 54 and the mouth of the mouth region 55 based on the eyebrow deformation rate and the mouth deformation rate recorded in the first image pattern information 24 and the second image pattern information 34. A predetermined part of the image can therefore be appropriately moved in a predetermined direction by a predetermined amount and deformed in accordance with the voice pattern of the other speaker's voice. At this time, the image processing means 185 recognizes the first image basic information 23 that substantially matches the eyebrows of the right eyebrow region 53 and the left eyebrow region 54 of the registered image 50, and the second image basic information 33 that substantially matches the mouth of the mouth region 55, and deforms the image based on the deformation rates of the first image pattern information 24 and the second image pattern information 34 corresponding to this first image basic information 23 and second image basic information 33. The emotion change image can thus be brought close to the expression produced by the other speaker's actual emotional change, so the user can confirm the other speaker's emotion more reliably.
[0094] The voice change determination means 184 compares the basic voice information recognized by the voice recognition means 183 with the other speaker's voice information, and when the voice pattern of the other speaker's voice information differs from the voice pattern of the basic voice information, recognizes the other speaker's emotion from that voice pattern. The image processing means 185 then processes the image in accordance with the other speaker's emotion. The voice change determination means 184 can therefore easily recognize changes in the voice pattern, and the image processing means 185 can appropriately deform the registered image 50 into an emotion change image corresponding to the change in the other speaker's emotion.
[0095] The voice recognition means 183 also recognizes, as the basic voice information, the voice pattern of the first voice information transmitted from the other speaker's communication device after the mobile phone 100 and the other speaker's communication device are connected for communication and a call becomes possible. The voice change determination means 184 can therefore recognize, with the other speaker's emotion at the start of the conversation as a baseline, how the other speaker's emotion has changed relative to the state at the start of the conversation, and the display 110 can show how the other speaker's emotion has changed compared with that initial state.
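The baseline comparison of paragraphs [0094] and [0095] can be sketched as follows: the first utterance after connection fixes the basic voice information, and later utterances are compared against it. The class name, the choice of pitch and tempo as features, and the relative-change thresholds are all assumptions for illustration; the patent leaves the concrete voice features and thresholds open.

```python
class VoiceChangeDetector:
    """First observation becomes the basic voice information; subsequent
    observations are flagged as changed when they drift beyond a
    relative tolerance (illustrative thresholds)."""

    def __init__(self, pitch_tol=0.2, tempo_tol=0.2):
        self.baseline = None
        self.pitch_tol = pitch_tol
        self.tempo_tol = tempo_tol

    def observe(self, pitch, tempo):
        """Return True if the voice pattern differs from the baseline."""
        if self.baseline is None:
            self.baseline = (pitch, tempo)  # first utterance = basic voice info
            return False
        p0, t0 = self.baseline
        return (abs(pitch - p0) / p0 > self.pitch_tol or
                abs(tempo - t0) / t0 > self.tempo_tol)
```

Because the baseline is taken per call, the comparison is always relative to how this speaker sounded at the start of this conversation, which is exactly the effect described in paragraph [0095].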
[0096] Furthermore, the display control means 186 causes the display 110 to display the registered image 50 recorded in the registered image information 15 when a call arrives from the other speaker. When the voice change determination means 184 determines that the state of the voice information has changed, the image processing means 185 deforms the registered image 50 based on the first image template table 20 and the second image template table 30. Accordingly, with a device that conducts calls with the other speaker, such as the mobile phone 100 of the above embodiment, the user can talk while confirming the other speaker's emotion. Even without a videophone function, for example, the user can talk with the other speaker while confirming the other speaker's emotional changes with a simple configuration, and good call support can be provided.
[0097] Likewise, the display control means 186 causes the display 110 to display the registered image 50 recorded in the registered image information 15 also when placing an outgoing call, that is, when transmitting information indicating a call to the other speaker. When the voice change determination means 184 determines that the state of the voice information has changed, the image processing means 185 deforms the registered image 50 based on the first image template table 20 and the second image template table 30. As with incoming calls, the user can therefore talk while confirming the other speaker's emotion, and good call support can be provided.
[0098] In addition, the photographing means 187 displays the frame 51 and the frame lines 52 on the display 110 when photographing the registered image 50 to be recorded in the registered image information 15. When photographing the registered image 50, the user can therefore frame the subject so that the face fits within the frame 51 and the eyebrows and mouth are aligned with the frame lines 52. The registered image 50 to be recorded in the registered image information 15 can thus be photographed easily.
[0099] Furthermore, image information photographed using the frame 51 and the frame lines 52 is recorded in the registered image information 15. The image processing means 185 can therefore easily recognize the right eyebrow region 53, the left eyebrow region 54, and the mouth region 55 and apply deformation processing to the eyebrow and mouth images. Accordingly, the registered image 50 can be appropriately edited in accordance with the voice pattern of the other speaker's voice information and displayed on the display 110.
[0100] The mobile phone 100 causes the image processing means 185 to deform the image recorded in the registered image information 15 and causes the display control means 186 to display it on the display 110. A storage area for saving large files such as moving images is therefore unnecessary. Furthermore, since the registered image 50 of the registered image information 15 is deformed in accordance with the received voice information, a stably deformed emotion change image can be displayed without the need to transmit and receive large files such as image information by the communication means.
[0101] [Modifications of the Embodiment]
The present invention is not limited to the embodiment described above, and also includes the modifications described below insofar as the object of the present invention can be achieved.
[0102] That is, although the mobile phone 100 has been illustrated as the image processing device and the image display device in the above embodiment, the invention is not limited to this. For example, the invention may be applied to other electric devices such as a personal computer, an ordinary telephone, or a car navigation device. In the case of a personal computer, for example, the device may be configured to recognize the voice state of voice information input from a microphone, voice information acquired from a network such as the Internet, or voice information acquired from a storage medium such as an optical disc, detect changes in the voice pattern of this voice state, and cause the image processing means to deform the image.
[0103] In the above embodiment, the operation unit 140 is provided with operation buttons and operation knobs, but the invention is not limited to this. For example, it may be a keyboard or mouse connected to the mobile phone 100, or a touch panel that allows predetermined setting input operations by touching the display 110. Furthermore, any configuration that allows various setting items to be set and input can be applied, such as voice input operations or a configuration that outputs signals via a wireless medium such as a remote controller.
[0104] Furthermore, the outgoing/incoming call recognition means 181 has been described as outputting a ring tone such as a voice, warning sound, or notification sound from the audio output unit 120 upon receiving incoming call information to notify the user of the incoming call, but the invention is not limited to this. For example, the ring tone may be output after the partner speaker recognition means 182 recognizes the calling party. With such a configuration, the ring tone can be varied according to the calling party, and the ring tone alone can tell the user who is calling.
[0105] In addition to notifying the user of an incoming call by a ring tone, the incoming call may be notified by vibration or the like. In this configuration, the mobile phone 100 is provided with vibration means, and when the outgoing/incoming call recognition means 181 recognizes an incoming call, it vibrates the vibration means to notify the user of the incoming call information. As above, the vibration means may also be vibrated after the partner speaker recognition means 182 recognizes the calling party. In this case, the vibration pattern can tell the user who is calling.
[0106] The voice recognition means 183 may also be configured to recognize the other speaker's gender, the basic voice information of the voice, and the like based on the contents described in the registration detail information 16 of the registration information 11, as described above. With such a configuration, the voice recognition means 183 can compare the state of the other speaker's voice at the time of an incoming or outgoing call with the information recorded in the basic voice information of the registration detail information 16. Therefore, even when the other speaker is already angry at the time of an incoming or outgoing call, for example, the other speaker's "angry state" can be displayed on the display 110, and an image more appropriately matched to the other speaker's emotion can be displayed.
[0107] Furthermore, the basic voice information may be transmitted from the other speaker's communication device to the mobile phone 100. Even with such a configuration, the mobile phone 100 can dispense with the processing for recognizing the basic voice information, so the configuration is simplified and the processing load is reduced.
[0108] In the above embodiment, the image processing means 185 deforms the registered image to create and display the emotion change image, but the invention is not limited to this. For example, the mobile phone 100 may record, in the registered image information 15, a plurality of registered images associated with the emotion information 44. When the voice change determination means 184 of the processing unit 180 recognizes a change in the voice pattern of the other speaker's voice information, it recognizes the emotion recognition information 41 having the voice pattern information 43 corresponding to that voice pattern. The display control means 186 then reads, from the plurality of registered images recorded in the registered image information 15, the registered image corresponding to the emotion information 44 of the emotion recognition information 41 and controls the display 110 to display it in its display area. In this way, the image displayed on the display 110 may be switched to an image corresponding to the other speaker's emotion in accordance with changes in the other speaker's voice state. Even with such a configuration, an appropriate emotion change image can be displayed on the display 110 in accordance with changes in the other speaker's emotion, as in the functions and effects described above. Furthermore, since images recorded in advance in the storage means 160 are simply switched, there is no need to process the images; the processing load on the processing unit 180 can therefore be reduced and the processing speeded up.
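The switching modification above reduces to a plain lookup. A minimal sketch, with illustrative names and file names that are not taken from the patent:

```python
# One pre-recorded registered image per emotion; switching replaces
# deformation entirely, trading storage for processing load.
REGISTERED_IMAGES = {
    "neutral": "face_neutral.png",
    "smiling": "face_smile.png",
    "angry":   "face_angry.png",
    "sad":     "face_sad.png",
}

def image_for_emotion(emotion):
    """Pick the pre-recorded image for the recognized emotion; fall back
    to the neutral registered image when no match exists."""
    return REGISTERED_IMAGES.get(emotion, REGISTERED_IMAGES["neutral"])
```

The trade-off against the deformation approach of paragraph [0089] is visible here: the lookup does no per-frame image processing, but one stored image per emotion is required.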
[0109] さらに、画像処理手段 185は、音声変化判定手段 184にて認識した感情認識情報 41に応じて、登録画像 50の一部、例えば眉および口を他の眉や口の要素画像に置換する構成としてもよい。この場合、例えば第1画像テンプレート情報 21の第1画像パターン情報 24に、図 13に示すような要素画像としての眉画像を記録する。また第2画像テンプレート情報 31の第2画像パターン情報 34に図 14に示すような要素画像としての口画像を記録する。図 13Aは、第1画像基本情報 23の眉画像の人物が笑顔の状態の時に撮影される眉画像の一例である。また、図 13Bは、第1画像基本情報 23の眉画像の人物が「怒り状態」の時に撮影される眉画像の一例である。また、図 13Cは、第1画像基本情報 23の眉画像の人物が悲しみ状態の時に撮影される眉画像の一例である。図 14Aは、第2画像基本情報 33の口画像の人物が笑顔の状態の時に撮影される口画像の一例である。また、図 14Bは、第2画像基本情報 33の口画像の人物が「怒り状態」の時に撮影される口画像の一例である。また、図 14Cは、第2画像基本情報 33の口画像の人物が悲しみ状態の時に撮影される口画像の一例である。この構成では、音声変化判定手段 184にて、相手話者の音声情報の音声パターンが音声基礎情報の音声パターンと異なると判断すると、その音声パターンに対応する音声パターン情報 43の感情認識情報 41を認識する。そして、画像処理手段 185は、登録画像 50の左右眉領域 53, 54の眉および口領域 55の口画像と形状が略一致する眉画像の第1画像基本情報 23および第2画像基本情報 33を有する第1画像テンプレート情報 21および第2画像テンプレート情報 31を検索する。さらに、画像処理手段 185は、検索した第1画像テンプレート情報 21および第2画像テンプレート情報 31から、音声変化判定手段 184にて認識した感情認識情報 41の感情情報 44に対応する対応感情情報 25, 35を有する第1画像テンプレート情報 21および第2画像テンプレート情報 31を認識する。そして、画像処理手段 185は、登録画像 50の左右眉領域 53, 54および口領域 55の画像を、これらの認識した第1画像テンプレート情報 21および第2画像テンプレート情報 31の第1画像パターン情報 24および第2画像パターン情報 34の画像に置換する。このような構成でも、相手話者の音声情報に応じて画像情報を切り替えて表示することができ、相手話者の感情を容易に知ることができる。 [0109] Furthermore, the image processing means 185 may be configured to replace part of the registered image 50, for example the eyebrows and the mouth, with other eyebrow and mouth element images according to the emotion recognition information 41 recognized by the voice change determination means 184. In this case, for example, eyebrow images as element images such as those shown in FIG. 13 are recorded in the first image pattern information 24 of the first image template information 21, and mouth images as element images such as those shown in FIG. 14 are recorded in the second image pattern information 34 of the second image template information 31.
FIG. 13A is an example of an eyebrow image captured when the person of the eyebrow image of the first image basic information 23 is smiling; FIG. 13B is an example captured when the person is in an "angry state"; and FIG. 13C is an example captured when the person is in a sad state. Likewise, FIG. 14A is an example of a mouth image captured when the person of the mouth image of the second image basic information 33 is smiling; FIG. 14B is an example captured when the person is in an "angry state"; and FIG. 14C is an example captured when the person is in a sad state. In this configuration, when the voice change determination means 184 determines that the voice pattern of the other speaker's voice information differs from the voice pattern of the voice basic information, it recognizes the emotion recognition information 41 of the voice pattern information 43 corresponding to that voice pattern. The image processing means 185 then searches for the first image template information 21 and second image template information 31 whose first image basic information 23 and second image basic information 33 contain an eyebrow image and a mouth image whose shapes substantially match those in the left and right eyebrow regions 53, 54 and the mouth region 55 of the registered image 50.
From the retrieved first image template information 21 and second image template information 31, the image processing means 185 further identifies those having corresponding emotion information 25, 35 that matches the emotion information 44 of the emotion recognition information 41 recognized by the voice change determination means 184. Finally, the image processing means 185 replaces the images in the left and right eyebrow regions 53, 54 and the mouth region 55 of the registered image 50 with the images of the first image pattern information 24 and second image pattern information 34 of the identified first image template information 21 and second image template information 31. Even with this configuration, the displayed image information can be switched according to the other speaker's voice information, and the other speaker's emotion can be easily recognized.
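As a rough illustration of the element-image replacement described in paragraph [0109], the following Python sketch selects eyebrow and mouth element images by a recognized emotion and swaps them into a registered image. All names, data structures, and file names here are illustrative assumptions for this sketch, not identifiers taken from the embodiment.

```python
# Emotion label -> replacement element images. These stand in for the first
# and second image pattern information (24 / 34); the file names are invented.
EMOTION_ELEMENTS = {
    "smile":   {"brows": "brows_smile.png", "mouth": "mouth_smile.png"},
    "anger":   {"brows": "brows_anger.png", "mouth": "mouth_anger.png"},
    "sadness": {"brows": "brows_sad.png",   "mouth": "mouth_sad.png"},
}

def recognize_emotion(voice_pattern, base_pattern):
    """Stand-in for the voice change determination: if the observed voice
    pattern differs from the speaker's baseline, report the emotion
    associated with that pattern; otherwise report no change."""
    if (voice_pattern["pitch"] == base_pattern["pitch"]
            and voice_pattern["volume"] == base_pattern["volume"]):
        return None
    return voice_pattern["emotion"]

def compose_display_image(registered_image, emotion):
    """Replace only the eyebrow and mouth element images of the registered
    image, leaving the rest of the face untouched."""
    composed = dict(registered_image)
    if emotion in EMOTION_ELEMENTS:
        composed["brows"] = EMOTION_ELEMENTS[emotion]["brows"]
        composed["mouth"] = EMOTION_ELEMENTS[emotion]["mouth"]
    return composed

registered = {"face": "face_base.png", "brows": "brows_neutral.png",
              "mouth": "mouth_neutral.png"}
baseline = {"pitch": 180, "volume": 0.6, "emotion": None}
observed = {"pitch": 220, "volume": 0.9, "emotion": "anger"}
display = compose_display_image(registered, recognize_emotion(observed, baseline))
print(display["mouth"])  # mouth_anger.png
```

When the observed pattern matches the baseline, `recognize_emotion` returns `None` and the registered image is displayed unchanged, mirroring the behavior of the configuration above.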
[0110] また、登録画像の眉画像および口画像を変形処理する例を示したが、これに限定されない。例えば、画像処理手段 185は、顔画像の他の部位、例えば鼻、額、左右頰、左右顎、耳など、さらには顔の輪郭なども同様にして変形処理する構成としてもよい。さらに、これらの各部位の色を変形処理してもよい。例えば、対応感情情報に「怒った状態」を示す情報があれば、画像処理手段 185は、例えば耳や頰の赤色の強さを強くして画像変形処理させるなど、コントラストや色彩を変更するなどしてもよい。  [0110] Although an example in which the eyebrow image and the mouth image of the registered image are deformed has been shown, the invention is not limited to this. For example, the image processing means 185 may be configured to deform other parts of the face image in the same way, such as the nose, forehead, left and right cheeks, left and right jaws, and ears, and also the face contour. Furthermore, the color of each of these parts may be modified. For example, if the corresponding emotion information contains information indicating an "angry state," the image processing means 185 may change the contrast or color, for example by intensifying the red of the ears and cheeks during the image deformation processing.
[0111] さらに、登録画像情報 15に記録される登録画像 50は、顔がフレーム 51内に入り、眉画像および口画像がそれぞれ左右眉領域 53, 54および口領域 55内に入った状態の顔の正面画像が撮影されるとしたが、これに限定されない。例えば、横顔が撮影される顔画像が用いられる構成としてもよく、斜めから撮影された顔画像が用いられる構成としてもよい。例えば横顔を登録画像として用いる場合には、横顔専用のフレームを用いてこのフレーム内に横顔が入るように焦点を合わせて撮影してもよい。さらに、顔の輪郭を自動で認識する構成としてもよい。この場合、例えば登録画像情報 15の登録画像の顔の色と風景画像の色との境界から顔の輪郭を認識し、さらに顔の内部の目、口、眉などを色の違いなどから認識する。このような構成では、正面から撮影された画像に限らず、様々な方向から撮影された画像をも画像処理して変形させることができる。 [0111] Furthermore, the registered image 50 recorded in the registered image information 15 is described as a frontal face image captured with the face inside the frame 51 and with the eyebrow image and mouth image inside the left and right eyebrow regions 53, 54 and the mouth region 55, respectively, but the invention is not limited to this. For example, a face image captured in profile may be used, or a face image captured from an oblique direction may be used. When a profile is used as the registered image, for example, a frame dedicated to profile images may be used, and the image may be captured with the focus adjusted so that the profile fits within this frame. Furthermore, the face contour may be recognized automatically. In this case, for example, the face contour is recognized from the boundary between the color of the face and the color of the background in the registered image of the registered image information 15, and the eyes, mouth, eyebrows, and so on inside the face are further recognized from differences in color. With such a configuration, not only images captured from the front but also images captured from various directions can be processed and deformed.
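The automatic contour recognition mentioned above can be illustrated with a toy example: treat the image as a grid of face-colored and background-colored cells and mark the face cells that border the background. This is only a simplified stand-in for the idea of finding the face outline at the color boundary; a real implementation would threshold actual RGB pixel colors rather than precomputed labels.

```python
# Label each cell of a toy image grid as face-colored or background-colored,
# then mark face cells that touch the background: those form the contour.
FACE, BG = "face", "bg"

def contour_pixels(grid):
    """Return the (x, y) positions of face-colored cells that have at least
    one background-colored 4-neighbour (or lie on the image edge)."""
    h, w = len(grid), len(grid[0])
    contour = set()
    for y in range(h):
        for x in range(w):
            if grid[y][x] != FACE:
                continue
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or grid[ny][nx] == BG:
                    contour.add((x, y))
                    break
    return contour

grid = [
    [BG, BG,   BG,   BG],
    [BG, FACE, FACE, BG],
    [BG, FACE, FACE, BG],
    [BG, BG,   BG,   BG],
]
print(sorted(contour_pixels(grid)))  # [(1, 1), (1, 2), (2, 1), (2, 2)]
```

The same neighbour test works regardless of the viewing direction of the photograph, which is why color-boundary recognition extends naturally from frontal images to profile and oblique images.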
[0112] さらには、登録画像情報 15に記録される登録画像 50には、例えば動物の画像や人形の画像などが記録されていてもよい。このような画像でも、上記実施の形態と同様の処理にて動物や人形の表情を変化させた感情変化画像を作成し、ディスプレイ 110の表示領域に表示させることができる。また、相手話者の音声情報の音声状態に応じて、複数の画像、例えば複数の動物の画像、複数の人形の画像などを切り替えて表示させる構成としてもよい。  [0112] Furthermore, the registered image 50 recorded in the registered image information 15 may be, for example, an image of an animal or a doll. Even with such images, an emotion change image in which the expression of the animal or doll is changed can be created by the same processing as in the above embodiment and displayed in the display area of the display 110. It is also possible to switch among a plurality of images, for example images of several animals or several dolls, according to the voice state of the other speaker's voice information.
[0113] そして、上記実施の形態の携帯電話 100では、通信機器から送信される相手話者の音声情報を認識し、この音声情報の音声状態のパターンを認識する構成を示したが、これに限らない。例えば、相手話者の通信機器から音声情報の音声状態を送信する構成とし、携帯電話 100は、相手話者の通信機器から送信される音声状態を受信する構成としてもよい。この構成では、携帯電話 100にて音声情報の音声状態を検出する構成が不要となるので、処理負荷を軽減できるとともに、構成を簡単にできる。さらに、本発明の画像処理装置にて音声情報の音声パターンに応じて変形処理された画像情報が音声情報とともに携帯電話などの受信装置に送信される構成としてもよい。  [0113] In the above embodiment, the mobile phone 100 recognizes the other speaker's voice information transmitted from the communication device and recognizes the voice state pattern of that voice information; however, the invention is not limited to this. For example, the other speaker's communication device may transmit the voice state of the voice information, and the mobile phone 100 may receive the voice state transmitted from the other speaker's communication device. This configuration eliminates the need for the mobile phone 100 to detect the voice state of the voice information, which reduces the processing load and simplifies the configuration. Furthermore, image information deformed according to the voice pattern of the voice information by the image processing apparatus of the present invention may be transmitted together with the voice information to a receiving device such as a mobile phone.
[0114] さらに、利用者および相手話者の双方が上記実施の形態の携帯電話 100を利用した通信システムとする構成としてもよい。このような構成では、利用者および相手話者の双方が上記実施の形態の携帯電話 100をそれぞれ送受信端末として利用することができる。そして、このような携帯電話 100を利用する通信システムでは、各利用者は、ディスプレイ 110に表示される画像をみることで、携帯電話 100の発信元および発信先の相手話者の感情を容易に認識することができる。  [0114] Furthermore, a communication system may be configured in which both the user and the other speaker use the mobile phone 100 of the above embodiment. In such a configuration, both the user and the other speaker can each use the mobile phone 100 of the above embodiment as a transmission and reception terminal. In a communication system using such mobile phones 100, each user can easily recognize the emotion of the speaker at the calling or called end by viewing the image displayed on the display 110.
[0115] また、相手話者の一般電話などの通信機器に発信するときのみにディスプレイ 110に画像を表示させ、相手話者の音声の状態に応じて画像を変化させる構成としてもよく、また相手話者の通信機器から着信したときのみにディスプレイ 110に画像を表示させ、相手話者の音声の状態に応じて画像を変化させる構成としてもよい。  [0115] An image may be displayed on the display 110 only when a call is made to the other speaker's communication device, such as a fixed-line telephone, with the image changed according to the state of the other speaker's voice; alternatively, an image may be displayed on the display 110 only when a call is received from the other speaker's communication device, again with the image changed according to the state of the other speaker's voice.
[0116] 上述した各機能はプログラムとして構築したが、例えば回路基板などのハードウェアあるいは 1つの IC (Integrated Circuit)などの素子にて構成するなどしてもよく、いずれの形態としても利用できる。なお、プログラムや別途記録媒体から読み取らせる構成とすることにより、取扱が容易で、利用の拡大が容易に図れる。  [0116] Each function described above has been constructed as a program, but it may also be implemented, for example, as hardware such as a circuit board or as an element such as a single IC (Integrated Circuit), and can be used in any of these forms. By adopting a configuration in which the functions are read from a program or a separate recording medium, handling is easy and wider use can easily be achieved.
[0117] その他、本発明の実施の際の具体的な構造および手順は、本発明の目的を達成 できる範囲で他の構造などに適宜変更できる。  [0117] In addition, the specific structure and procedure for carrying out the present invention can be appropriately changed to other structures and the like as long as the object of the present invention can be achieved.
[0118] 〔実施の形態の効果〕  [Effects of Embodiment]
上述したように、上記一実施の形態の携帯電話 100は、処理部 180の音声変化判定手段 184にて相手話者の音声情報の音声パターンを認識させ、音声パターンが変化したと判断されると、画像処理手段 185にて音声パターンの変化に応じて登録画像情報 15の登録画像 50を変化させる。このため、登録画像情報 15の登録画像 50を音声パターンの変化に応じて変形させることができ、相手話者の感情に応じて登録画像情報 15の登録画像 50を変形することで、相手話者の感情に応じた適切な感情変化画像をディスプレイ 110に表示させることができる。  As described above, in the mobile phone 100 of the above embodiment, the voice change determination means 184 of the processing unit 180 recognizes the voice pattern of the other speaker's voice information, and when it determines that the voice pattern has changed, the image processing means 185 changes the registered image 50 of the registered image information 15 according to the change in the voice pattern. The registered image 50 of the registered image information 15 can therefore be deformed according to changes in the voice pattern, and by deforming it according to the other speaker's emotion, an appropriate emotion change image corresponding to that emotion can be displayed on the display 110.
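The overall flow summarized above can be sketched minimally: derive a baseline voice state from the first utterance (as in claim 8's standard voice state), then classify later utterances by how far they deviate from it. The features and thresholds below are invented for illustration only and do not reflect any values given in the embodiment.

```python
# Reduce a chunk of speech samples to a crude (volume, pitch) voice state,
# take the first utterance as the speaker's standard state, and classify
# later utterances by their deviation from that standard.

def voice_state(samples):
    """Crude voice state: mean absolute amplitude as "volume" and the
    zero-crossing count as a stand-in for "pitch"."""
    volume = sum(abs(s) for s in samples) / len(samples)
    pitch = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return volume, pitch

def classify_change(state, baseline, vol_margin=0.3, pitch_margin=5):
    """Map the deviation from the baseline state to a coarse emotion label.
    None means the voice is within its normal range, so the registered
    image is left unchanged."""
    vol, pitch = state
    base_vol, base_pitch = baseline
    if vol > base_vol + vol_margin and pitch > base_pitch + pitch_margin:
        return "anger"      # noticeably louder and higher than usual
    if vol < base_vol - vol_margin:
        return "sadness"    # noticeably quieter than usual
    return None

baseline = voice_state([0.5, -0.5, 0.5, -0.5])  # first utterance -> standard
print(classify_change((1.0, 12), baseline))      # anger
```

Using a per-speaker baseline rather than fixed absolute thresholds matches the design choice in the embodiment: what counts as a "changed" voice pattern depends on how that particular speaker normally talks.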
産業上の利用可能性  Industrial applicability
[0119] 本発明は、画像を表示する画像処理装置、画像表示装置、受信装置、送信装置、 通信システム、画像処理方法、画像処理プログラム、画像処理プログラムを記録した 記録媒体に利用できる。 The present invention can be used for an image processing device that displays an image, an image display device, a receiving device, a transmitting device, a communication system, an image processing method, an image processing program, and a recording medium that records the image processing program.

Claims

請求の範囲 The scope of the claims
[1] 表示手段に表示される画像を、受信して得られる音声情報に応じて処理する画像 処理装置であって、  [1] An image processing apparatus that processes an image displayed on a display unit according to audio information obtained by reception,
前記音声情報における音声状態の変化を認識する音声状態認識手段と、 前記画像を前記表示手段に表示させるとともに、前記表示された画像を前記音声 状態の変化に応じて変更させる表示制御手段と、  A voice state recognition unit for recognizing a change in a voice state in the voice information; a display control unit for causing the display unit to display the image and changing the displayed image according to the change in the voice state;
を具備したことを特徴とした画像処理装置。  An image processing apparatus comprising:
[2] 請求項 1に記載の画像処理装置であって、  [2] The image processing device according to claim 1,
前記音声状態の変化に応じて、前記画像を変形する画像変形処理手段を具備し、 前記表示制御手段は、前記音声状態の変化に応じて変形された前記画像を表示 させる  An image deformation processing unit that deforms the image according to the change in the sound state is provided, and the display control unit displays the image deformed according to the change in the sound state.
ことを特徴とする画像処理装置。  An image processing apparatus.
[3] 請求項 2に記載の画像処理装置であって、 [3] The image processing device according to claim 2,
前記画像変形処理手段は、前記音声状態の変化に応じて、前記画像の少なくとも 一部を画像全体に対して相対変形させる  The image deformation processing means relatively deforms at least a part of the image with respect to the entire image according to the change in the sound state.
ことを特徴とする画像処理装置。  An image processing apparatus.
[4] 請求項 3に記載の画像処理装置であって、 [4] The image processing device according to claim 3,
前記音声状態の変化に応じて前記画像の少なくとも一部を変形させる変形量に関 する変形量情報を有し、  Deformation amount information relating to a deformation amount that deforms at least a part of the image according to a change in the sound state;
前記画像変形処理手段は、前記画像の少なくとも一部を前記変形量情報の変形 量だけ相対変形させる  The image deformation processing means relatively deforms at least a part of the image by a deformation amount of the deformation amount information.
ことを特徴とする画像処理装置。  An image processing apparatus.
[5] 請求項 1に記載の画像処理装置であって、 [5] The image processing device according to claim 1,
前記音声状態の変化に応じて前記画像の少なくとも一部を変形させる変形方向に 関する変形方向情報を有し、  Deformation direction information about a deformation direction that deforms at least a part of the image in response to a change in the audio state,
前記画像変形処理手段は、前記画像の少なくとも一部を前記変形方向情報の変 形方向に相対変形させる  The image deformation processing means relatively deforms at least a part of the image in the deformation direction of the deformation direction information.
ことを特徴とする画像処理装置。 An image processing apparatus.
[6] 請求項 1に記載の画像処理装置であって、 [6] The image processing device according to claim 1,
前記画像は、複数の要素画像によって構成され、  The image is composed of a plurality of element images,
前記表示制御手段は、前記音声情報における音声状態が変化すると、前記表示手段に表示された前記画像の少なくとも一部の要素画像を、前記音声状態の変化に応じた他の要素画像に変えて表示させる  When the voice state in the voice information changes, the display control means displays at least some of the element images of the image displayed on the display means by replacing them with other element images corresponding to the change in the voice state
ことを特徴とする画像処理装置。  An image processing apparatus.
[7] 請求項1ないし請求項6のいずれかに記載の画像処理装置であって、  [7] The image processing apparatus according to any one of claims 1 to 6,
前記音声情報の標準的な音声状態を認識する標準音声状態認識手段を具備し、 前記音声状態認識手段は、前記標準的な音声状態と異なる音声状態を認識する ことを特徴とする画像処理装置。  An image processing apparatus comprising: a standard voice state recognition unit that recognizes a standard voice state of the voice information, wherein the voice state recognition unit recognizes a voice state different from the standard voice state.
[8] 請求項 7に記載の画像処理装置であって、 [8] The image processing device according to claim 7,
前記音声情報を認識する音声情報認識手段を備え、  Voice information recognition means for recognizing the voice information;
前記標準音声状態認識手段は、前記音声情報認識手段にて最初に認識した音声 情報における音声状態を標準的な音声状態として認識する  The standard voice state recognition unit recognizes a voice state in voice information first recognized by the voice information recognition unit as a standard voice state.
ことを特徴とする画像処理装置。  An image processing apparatus.
[9] 画像を記憶する記憶手段と、 [9] storage means for storing images;
前記音声情報を受信する際に前記画像を表示する表示手段と、  Display means for displaying the image when receiving the audio information;
請求項1ないし請求項8のいずれかに記載の画像処理装置と、  The image processing apparatus according to any one of claims 1 to 8, and
を具備したことを特徴とした画像表示装置。  An image display device comprising:
[10] 請求項 9に記載の画像表示装置と、 [10] The image display device according to claim 9,
前記音声情報を受信可能な受信手段と、を備え、  Receiving means capable of receiving the voice information,
前記表示制御手段は、前記受信手段により音声情報が受信される際に、その音声 情報の送信元に対応する画像を表示する  The display control unit displays an image corresponding to the transmission source of the audio information when the reception unit receives the audio information.
ことを特徴とする受信装置。  A receiving apparatus.
[11] 請求項 9に記載の画像表示装置と、 [11] The image display device according to claim 9,
前記音声信号を送受信可能な送信手段と、を備え、  Transmission means capable of transmitting and receiving the audio signal,
前記送信手段は、前記音声情報を送信する送信先に対して発呼する発呼手段を 有し、 前記表示手段は、前記発呼手段による発呼または前記発呼に対する前記送信先 の応答に応じて前記送信先に対応する画像を表示し、 The transmission means includes a calling means for calling a transmission destination for transmitting the voice information; The display means displays an image corresponding to the transmission destination according to a call made by the calling means or a response of the transmission destination to the call,
前記画像処理装置は、前記発呼に応答して受信される送信先からの音声情報に おける音声状態の変化に応じて、表示された画像を変更する  The image processing device changes a displayed image in response to a change in sound state in sound information from a transmission destination received in response to the call.
ことを特徴とする送信装置。  A transmission apparatus characterized by the above.
[12] 相互に音声情報を送受信可能な送受信端末を備えた通信システムであって、 前記送受信端末の各々は、  [12] A communication system including a transmission / reception terminal capable of transmitting / receiving voice information to / from each other,
画像を記憶する記憶手段と、  Storage means for storing images;
音声情報を送信する送信手段と、  A transmission means for transmitting voice information;
音声情報を送信するために送信元に対して発呼する発呼手段と、  A calling means for calling a transmission source to transmit voice information;
音声情報を受信する受信手段と、  Receiving means for receiving audio information;
前記音声情報を受信する際に、または前記発呼手段により発呼する際に、または前記発呼に対し前記送信先が応答する際に前記画像を表示する表示手段と、 請求項1ないし請求項8のいずれかに記載の画像処理装置と、  Display means for displaying the image when receiving the voice information, when making a call by the calling means, or when the destination responds to the call; and the image processing apparatus according to any one of claims 1 to 8;
を具備したことを特徴とする通信システム。  A communication system comprising:
[13] 表示手段に表示される画像を、受信して得られる音声情報に応じて処理する画像 処理方法であって、 [13] An image processing method for processing an image displayed on a display means according to audio information obtained by reception,
前記音声情報における音声状態の変化を認識し、  Recognizing a change in voice state in the voice information;
前記画像を前記表示手段に表示させるとともに、前記表示された画像を前記音声 状態の変化に応じて変更させる  The image is displayed on the display means, and the displayed image is changed according to the change in the sound state.
ことを特徴とする画像処理方法。  An image processing method.
[14] 演算手段を請求項1ないし請求項8のいずれかに記載の画像処理装置として機能させる  [14] Causing an arithmetic means to function as the image processing apparatus according to any one of claims 1 to 8,
ことを特徴とする画像処理プログラム。  An image processing program characterized by that.
[15] 請求項13に記載の画像処理方法を演算手段に実施させる  [15] Causing an arithmetic means to carry out the image processing method according to claim 13,
ことを特徴とする画像処理プログラム。  An image processing program characterized by that.
[16] 請求項14または請求項15に記載の画像処理プログラムが演算手段にて読取可能に記録された ことを特徴とする画像処理プログラムを記録した記録媒体。 [16] A recording medium on which the image processing program according to claim 14 or claim 15 is recorded so as to be readable by an arithmetic means.
PCT/JP2006/306297 2005-03-31 2006-03-28 Image processing device, image display device, reception device, transmission device, communication system, image processing method, image processing program, and recording medium containing the image processing program WO2006106671A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2005-103092 2005-03-31
JP2005103092 2005-03-31

Publications (1)

Publication Number Publication Date
WO2006106671A1 true WO2006106671A1 (en) 2006-10-12

Family

ID=37073237

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2006/306297 WO2006106671A1 (en) 2005-03-31 2006-03-28 Image processing device, image display device, reception device, transmission device, communication system, image processing method, image processing program, and recording medium containing the image processing program

Country Status (1)

Country Link
WO (1) WO2006106671A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012138035A (en) * 2010-12-28 2012-07-19 Casio Comput Co Ltd Image display device and program
JP2016167815A (en) * 2011-09-09 2016-09-15 クゥアルコム・インコーポレイテッドQualcomm Incorporated Transmission of feeling as tactile sense feedback

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05265482A (en) * 1992-03-17 1993-10-15 Matsushita Electric Ind Co Ltd Information processor
JPH09138767A (en) * 1995-11-14 1997-05-27 Fujitsu Ten Ltd Communication equipment for feeling expression
JPH10293860A (en) * 1997-02-24 1998-11-04 Nippon Telegr & Teleph Corp <Ntt> Person image display method and device using voice drive
JP2002215180A (en) * 2001-01-17 2002-07-31 Digital Media Lab Inc Communication device
JP2003037826A (en) * 2001-07-23 2003-02-07 Alpine Electronics Inc Substitute image display and tv phone apparatus



Legal Events

121 EP: the EPO has been informed by WIPO that EP was designated in this application
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101)
NENP Non-entry into the national phase (Ref country code: DE)
NENP Non-entry into the national phase (Ref country code: RU)
122 EP: PCT application non-entry in European phase (Ref document number: 06730245; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)