US20140118547A1 - Image processing apparatus, image processing method, and storage medium - Google Patents

Image processing apparatus, image processing method, and storage medium

Info

Publication number
US20140118547A1
Authority
US
United States
Prior art keywords
image
unit
recognition
captured
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US14/146,934
Inventor
Kenji Tsukamoto
Mahoro Anabuki
Masakazu Matsugu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc
Priority to US 14/146,934
Publication of US20140118547A1
Current legal status: Abandoned

Classifications

    • G06V 20/35: Scenes; scene-specific elements; categorising the entire scene, e.g. birthday party or wedding scene
    • G06V 40/10: Recognition of biometric, human-related or animal-related patterns in image or video data; human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V 40/174: Facial expression recognition
    • G06K 9/00684; G06K 9/00302; G06K 9/00362
    • H04N 1/00095: Systems or arrangements for the transmission of the picture signal
    • H04N 1/00286: Connection or combination of a still picture apparatus with a telecommunication apparatus, with a television apparatus with studio circuitry, devices or equipment, e.g. television cameras
    • H04N 1/00299: Connection or combination of a still picture apparatus with a telecommunication apparatus, with a television transmission apparatus, e.g. a videophone, a teletext system or a digital television system
    • H04N 1/00336: Connection or combination of a still picture apparatus with a data reading, recognizing or recording apparatus, with an apparatus performing pattern recognition, e.g. of a face or a geographic feature
    • H04N 7/147: Systems for two-way working between two video terminals, e.g. videophone; communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
    • H04N 7/157: Conference systems defining a virtual conference space and using avatars or agents
    • H04N 7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N 2201/0084: Types of the still picture apparatus; digital still camera
    • H04N 2201/0086: Types of the still picture apparatus; image transceiver
    • H04N 2201/0087: Types of the still picture apparatus; image storage device

Abstract

An image processing apparatus includes an image input unit configured to input an image in which a real space is captured by an image capturing apparatus, an image recognition unit configured to recognize a situation in the real space captured in the input image, an image recording unit configured to record the input image, an image selection unit configured to select an image used for image communication from a plurality of images including images recorded by the image recording unit in the past based on a recognition result of the input image, and an image output unit configured to modify the selected image and output the modified image.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • The present application is a continuation of U.S. patent application Ser. No. 13/323,440, filed Dec. 12, 2011, entitled “IMAGE PROCESSING APPARATUS, IMAGE PROCESSING METHOD, AND STORAGE MEDIUM”, the content of which is expressly incorporated by reference herein in its entirety. Further, the present application claims priority from Japanese Patent Application No. 2010-280848 filed Dec. 16, 2010, which is hereby incorporated by reference herein in its entirety.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention relates to an image processing apparatus, an image processing method, and a storage medium.
  • 2. Description of the Related Art
  • The present invention relates to an image processing apparatus and an image processing method and, in particular, to a technique for modifying an image to match the situation of a remote communication, based on images of daily behavior that are selectively stored using image recognition results as a reference.
  • Communication media referred to as "television telephone" have been put to practical use, in which users can talk with each other while transmitting images captured by cameras at their respective remote locations. Since users can talk over the television telephone as if face-to-face, unlike over an ordinary telephone, it is used for remote conversations with family and friends who live far away.
  • The television telephone feature that "users can talk with each other as if face-to-face" means that "images of the users taken by cameras are always transmitted between them while talking." Some users may therefore feel awkward using the television telephone; many do not want to be photographed when they are not presentable, for example just after waking up or after a bath.
  • In this respect, various methods have been discussed in which images that one user does not mind showing the other user are captured and stored in advance, and an image suitable for the situation of the conversation (the user's facial expression, or the presence or absence of people around the user) is selected from among those images and transmitted to the other user. Such methods are discussed in Japanese Patent Application Laid-Open No. 2005-151231, No. 2008-270912, No. 2008-271609, and No. 2009-246566, for example. According to these methods, if images that one user does not mind showing the other user are prepared in advance, the user can talk to a person in a remote location comfortably (or nearly so), as if face-to-face.
  • However, merely preparing "images that one user does not mind showing the other user" in advance does not ensure that images suitable for the situation of the remote conversation are included among the stored images. In other words, a facial expression or a posture that is not stored in advance may appear during the remote conversation. In addition, a person's appearance changes over time (hair grows, for example), so after a certain period has elapsed since the recording, the stored images may no longer suit the user's current situation.
  • Thus, it may happen that no image suitable for the situation of the conversation can be transmitted. To avoid this, a large number of "images that one user does not mind showing the other user" would have to be prepared and continuously updated. However, no matter how many images are prepared, situations not covered by the stored images can still arise, and continually updating a large number of images takes too much time and effort to be practical.
  • SUMMARY OF THE INVENTION
  • The present invention provides a technique, for use in a television telephone that transmits an image reflecting the user's situation to the conversation partner, for transmitting such an image even in situations not covered by the images captured and stored in advance.
  • According to an aspect of the present invention, an image processing apparatus includes an image input unit configured to input an image in which a real space is captured by an image capturing apparatus, an image recognition unit configured to recognize a situation in the real space captured in the input image, an image recording unit configured to record the input image, an image selection unit configured to select an image used for image communication from a plurality of images including images recorded by the image recording unit in the past based on a recognition result of the input image, and an image output unit configured to modify the selected image and output the modified image.
  • Further features and aspects of the present invention will become apparent from the following detailed description of exemplary embodiments, with reference to the attached drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate exemplary embodiments, features, and aspects of the invention and, together with the description, serve to explain the principles of the invention.
  • FIG. 1 is a block diagram illustrating an example of a configuration of an image processing apparatus according to an exemplary embodiment of the present invention.
  • FIG. 2 is a flow chart illustrating a processing procedure of the image processing apparatus according to the exemplary embodiment of the present invention.
  • FIG. 3 is a block diagram illustrating an example of a configuration of an image processing apparatus according to a modification example.
  • FIG. 4 is a flow chart illustrating a processing procedure of the image processing apparatus according to the modification example.
  • DESCRIPTION OF THE EMBODIMENTS
  • Various exemplary embodiments, features, and aspects of the invention will be described in detail below with reference to the drawings. It should be noted that the relative arrangement of the components, the numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present invention unless it is specifically stated otherwise.
  • An image processing apparatus according to a first exemplary embodiment modifies an image for remote image communication (television telephone) using images (e.g., video images) captured and stored by a camera installed in a typical home environment. The configuration and processing of the image processing apparatus according to the present exemplary embodiment will be described below with reference to the attached drawings.
  • FIG. 1 is a block diagram illustrating a schematic configuration of an image processing apparatus 100 according to the present exemplary embodiment. As illustrated in FIG. 1, the image processing apparatus 100 includes an imaging unit 101, an image input unit 102, an image recognition unit 103, an image determination unit 104, an image recording unit 105, an image selection unit 106, an image output unit 107, and a control unit 108. The image processing apparatus 100 can be used by a user for the remote image communication, but it also has a function to operate even when the user is not performing the remote image communication.
  • The imaging unit 101 includes one or more cameras and captures an image in a real space of a general home environment. The imaging unit 101 may be fixed to a ceiling, placed on a floor, a table, or a television, or incorporated in furniture such as a television, a mirror, a table, and a chair. In a case where a camera is incorporated in a mirror, when a user stands in front of the mirror, a figure reflected in the mirror is captured.
  • If the imaging unit 101 includes a plurality of cameras, the cameras can be distributed around the home environment to capture images of persons appearing at various places in it. The user performs the remote image communication in front of one or more cameras included in the imaging unit 101. Camera parameters of the cameras in the imaging unit 101, such as pan-tilt and zoom, may be fixed or variable.
  • The image input unit 102 receives the image captured by the imaging unit 101 and outputs the image to the image recognition unit 103.
  • The image recognition unit 103 receives the image from the image input unit 102 and recognizes the situation of a person appearing in the image. The situations to be recognized range widely: the existence or absence, identity (who it is), position, facial expression, posture, action, and behavior of a person in the image.
  • Recognition of whether a person exists is realized by detecting, in the image received from the image input unit 102, image features resulting from the person's face and head. Histograms of oriented gradients (HOG), feature quantities in which the gradient directions in a local area are represented by a histogram, are used as the image features.
  • The image feature resulting from a person is determined by collecting a large number of images in which persons are captured and statistically learning the common feature quantities they contain, using an algorithm such as Boosting. If the image feature thus determined is included in the image received from the image input unit 102, the image recognition unit 103 recognizes that "a person exists in the area where the feature is detected"; otherwise, it recognizes that "a person does not exist." A detection sketch follows.
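  • The patent contains no code; as a rough illustration, the following Python sketch runs OpenCV's bundled HOG person detector over a captured frame. Note that OpenCV pairs HOG features with a linear SVM rather than the Boosting-trained classifier described above, and the file name is a placeholder.

```python
import cv2

# HOG person detector; detection windows where the learned person
# feature is found are reported as "a person exists" areas.
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

frame = cv2.imread("captured_frame.png")  # hypothetical frame from the imaging unit
rects, weights = hog.detectMultiScale(frame, winStride=(8, 8))

if len(rects) > 0:
    print("a person exists in area(s):", list(map(tuple, rects)))
else:
    print("a person does not exist")
```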
  • A person is identified by comparing the area where a person was found in the existence-recognition step against the image feature of each person who is an identification candidate (a facial image, for example). If the image feature of a candidate can be detected within the area where the person was found, the person captured in that image area is recognized as that candidate.
  • Recognition of a person's posture starts with a search of the images received from the image input unit 102 for image features, prepared in advance, resulting from human-body parts. The human-body parts are, for example, the face, head, hands, arms, feet, knees, shoulders, waist, chest, navel, and back. Each part presents a different image feature depending on the orientation at which it is captured; for a face, for example, a plurality of image features classified by orientation, such as a frontal face, a profile, and a downward-looking face, is prepared and searched for. HOG feature quantities are again used as the image features.
  • The image recognition unit 103 determines the image feature resulting from each human-body part by collecting a large number of images in which the parts are captured and statistically learning the common feature quantities they contain, using an algorithm such as Boosting. If an image feature thus determined is found in the image received from the image input unit 102, the image recognition unit 103 recognizes that "the human-body part exists at the position where the feature is found."
  • The image recognition unit 103 recognizes the positional relationship between the human-body parts as a posture. For example, if the head, chest, and waist are arranged on a substantially straight line in the direction of gravity, it recognizes the "upright posture." Likewise, the angle formed by the lines connecting a hand, shoulder, and waist can serve as a parameter indicating how far the arm is opened; a sketch of such parameters follows this paragraph. Recognizing a time-series change pattern of posture recognition results amounts to recognizing an action.
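  • A minimal sketch of such positional-relationship parameters, assuming part positions (image coordinates) are already available from the part detector; the coordinates below are hypothetical.

```python
import numpy as np

def angle_at(p, q, r):
    """Angle in degrees at q formed by points p and r, e.g. the
    hand-shoulder-waist angle indicating how far the arm is opened."""
    v1 = np.asarray(p, float) - np.asarray(q, float)
    v2 = np.asarray(r, float) - np.asarray(q, float)
    cos = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def is_upright(head, chest, waist, tol_deg=10.0):
    """Upright posture: head, chest, and waist nearly collinear and
    aligned with the image's vertical (gravity) axis."""
    collinear = angle_at(head, chest, waist) > 180.0 - tol_deg
    dx, dy = np.asarray(waist, float) - np.asarray(head, float)
    return collinear and abs(dx) < abs(dy)

arm_opening = angle_at((350, 180), (300, 160), (310, 300))  # hand, shoulder, waist
```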
  • For recognition of a person's facial expression, the image recognition unit 103 identifies the area of the image received from the image input unit 102 in which an image feature resulting from the person's face is detected; HOG may again be used as the feature quantity. It then recognizes to which group, and to what extent, an image obtained by normalizing the identified area is similar among a plurality of image groups covering several facial expressions.
  • The plurality of image groups refer to a collection of images of facial expressions such as “expressions of positive and violent emotions (delight),” “expressions of positive and quiet emotions (pleasure),” “expressions of negative and violent emotions (anger),” “expressions of negative and quiet emotions (sorrow),” and “neutral expression without a particular emotion.” A discriminant axis for discriminating between, for example, the facial image group class of “expressions of positive and violent emotions (delight)” and that of “expressions of negative and violent emotions (anger),” is produced using linear discriminant analysis (LDA).
  • The image recognition unit 103 determines, using the discriminant axis, to which class the normalized image area is more similar. By repeating this determination across comparisons among a plurality of expression classes, it can recognize to which of the previously prepared expressions, and to what extent, the expression of the person captured in the received image is similar, as sketched below.
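  • A brief sketch of such a pairwise LDA discriminant using scikit-learn; the feature dimensionality and the random data standing in for HOG vectors of labeled face crops are placeholders.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
hog_features = rng.normal(size=(200, 1764))  # stand-in HOG vectors of face crops
labels = rng.integers(0, 2, size=200)        # 0 = "delight" class, 1 = "anger" class

# Discriminant axis between the two expression classes.
lda = LinearDiscriminantAnalysis().fit(hog_features, labels)

query = rng.normal(size=(1, 1764))           # a normalized face area to classify
print("closer class:", lda.predict(query)[0])
print("extent (signed distance to axis):", lda.decision_function(query)[0])
```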
  • A human behavior can be recognized by, for example, listing beforehand the positions and postures of a person recognized from images together with the behavior recognition results for the corresponding recognition times, and referencing that list during recognition. The image recognition unit 103 may recognize objects and the situation of a scene in an image as well as persons.
  • A captured scene is recognized by segmenting the background of the imaging environment and identifying the objects present in it with a general object-recognition technique. The recognition can also draw on the light-source position obtained by light-source estimation, or on three-dimensional shape measurements of the scene obtained by three-dimensional reconstruction in an environment where a plurality of cameras is arranged.
  • When the user is not performing the remote image communication with the image processing apparatus 100, the control unit 108 controls the image recognition unit 103 to transmit the recognition results related to persons, objects, and scenes to the image determination unit 104 along with the captured image. When the user is performing the remote image communication, the control unit 108 instead has the recognition results output to the image selection unit 106, and controls the processing so that the corresponding captured image does not itself become a target image to be transmitted.
  • The image determination unit 104 receives the captured image from the image recognition unit 103. Based on the recognition results received from the image recognition unit 103, it determines whether the image satisfies the predetermined condition for recording in the image recording unit 105: that it is an image "one user does not mind showing a remote-image-communication partner."
  • For example, a positive image, such as one from which a person's smile is recognized, is generally regarded as satisfying the condition, whereas an image in which the person's eyes are closed is regarded as not satisfying it. Alternatively, the image determination unit 104 may determine that any image yielding a recognition result preset by the user, for example the "upright posture," satisfies the condition.
  • The image determination unit 104 can also determine whether an image is similar to recognition results previously selected by the user, for example whether the positional-relationship parameters of the human-body parts in the image resemble the posture of a person in an image the user picked. Further, an image the user selects at his or her discretion, such as one captured at a specific time, may be stored. When the image determination unit 104 determines that a captured image satisfies the condition, the image is transmitted to the image recording unit 105 along with the corresponding recognition results. A sketch of such a decision follows.
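  • A minimal sketch of this recording decision; the recognition-result field names (eyes_closed, smile_score, posture) are hypothetical outputs assumed for the image recognition unit 103.

```python
def satisfies_show_condition(recog, smile_threshold=0.7):
    """True if the frame is one the user 'does not mind showing'."""
    if recog.get("eyes_closed", False):
        return False                               # closed eyes: reject
    if recog.get("smile_score", 0.0) >= smile_threshold:
        return True                                # positive image, e.g. a smile
    return recog.get("posture") == "upright"       # user-preset recognition result

# e.g. satisfies_show_condition({"smile_score": 0.9}) -> True
```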
  • The image recording unit 105 records the image transmitted from the image determination unit 104. At this point, the image recording unit 105 adds metadata such as a label to the image. For example, the recognition result may be taken as the label as it is, or the user may give the image any label name via an interface (not illustrated). A label name once given can be changed later by the user, and images can be added or deleted as needed. The image recorded in the image recording unit 105 is acquired by the image output unit 107.
  • The image selection unit 106 receives the recognition result of the captured image from the image recognition unit 103, acquires images recorded in the image recording unit 105 based on that result, and selects the image to be used for the remote image communication (the image to be transmitted to the communication partner). For example, on receiving the recognition result "Mr. A smiles and waves his hand" from the image recognition unit 103, it searches the images recorded in the image recording unit 105 for one providing the same recognition result, "Mr. A smiles and waves his hand," and selects it if found.
  • If no image provides the same recognition result, the image selection unit 106 searches for an image providing the recognition result "Mr. A smiles" and an image providing the recognition result "Mr. A waves his hand," and selects those instead, as sketched below.
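  • A sketch of this selection logic, assuming each recorded image carries a set of recognition-result labels as metadata (the label strings are illustrative):

```python
def select_images(query_labels, recorded):
    """`recorded` is a chronological list of (label_set, image) pairs
    from the image recording unit 105."""
    exact = [img for labels, img in recorded if labels == query_labels]
    if exact:
        return exact[-1:]                # exact match found: e.g. take the latest
    partial = []                         # otherwise match each label separately
    for label in query_labels:
        hits = [img for labels, img in recorded if label in labels]
        if hits:
            partial.append(hits[-1])
    return partial

# e.g. select_images({"Mr. A smiles", "Mr. A waves his hand"}, recorded)
```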
  • The image output unit 107 generates and outputs an image by modifying the selected images based on the recognition results. For example, if the selected image provides the recognition result "Mr. A waves his hand," only the face part of Mr. A in that image is replaced with the face part of Mr. A in the image providing the recognition result "Mr. A smiles." In this way, the image "Mr. A smiles and waves his hand" can be generated by modification.
  • More specifically, the image recognition unit 103 identifies, in each of the images providing the recognition results "Mr. A waves his hand" and "Mr. A smiles," the area in which the image feature resulting from a person's face (used for expression recognition) is detected. At the same time, it identifies where within each facial area the image feature resulting from Mr. A's face in particular is detected, in order to identify the person. From these two results, the area in which Mr. A's face is detected is extracted from the image providing the recognition result "Mr. A smiles."
  • The extracted area can then be superimposed on the area in which the image feature resulting from Mr. A's face is detected in the image providing the recognition result "Mr. A waves his hand." Accordingly, only the face part of Mr. A in the "waves his hand" image is replaced with the face part of Mr. A in the "smiles" image.
  • The facial area acquired from the "Mr. A waves his hand" image and the one acquired from the "Mr. A smiles" image do not necessarily coincide in size and shape. In that case, the size and shape may be corrected on the premise that the same person's face, at the same actual size, is captured in both image areas, as sketched below.
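  • A sketch of the replacement with size correction, assuming rectangular face areas from the detector; real compositing would blend the seams, which this simple paste does not.

```python
import cv2

def replace_face(base_img, base_rect, donor_img, donor_rect):
    """Paste the donor face area (e.g. from "Mr. A smiles") over the
    base face area (e.g. in "Mr. A waves his hand")."""
    bx, by, bw, bh = base_rect                  # (x, y, w, h) face rectangles
    dx, dy, dw, dh = donor_rect
    donor_face = donor_img[dy:dy + dh, dx:dx + dw]
    resized = cv2.resize(donor_face, (bw, bh))  # reconcile size/shape mismatch
    out = base_img.copy()
    out[by:by + bh, bx:bx + bw] = resized
    return out
```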
  • Although only the selected image is modified in the description above, an image of "Mr. A smiles and waves his hand" that is not recorded in the image recording unit 105 may also be generated by combining the input captured image with the selected image.
  • As another example, if the image output unit 107 receives the recognition result "Mr. B tilts his head rightward by 30 degrees in the image" from the image recognition unit 103, it searches for and references an image providing the recognition result "Mr. B tilts his head rightward in the image." It then measures the tilt angle of Mr. B's neck in that image and, if the measured angle is not 30 degrees, modifies the image so that the angle becomes 30 degrees. In the present exemplary embodiment, such image processing is realized as follows, for example.
  • First, the state in which "the head is tilted rightward by 30 degrees in the image" is one in which the angle formed, in the image, by the straight line connecting the right shoulder and the neck and the straight line connecting the head and the neck is 90 - 30 = 60 degrees. The image recognition unit 103 identifies the areas where the image features resulting from the head, neck, and right shoulder were detected during posture estimation in the image providing the recognition result "Mr. B tilts his head rightward in the image."
  • The image recognition unit 103 clips, from that image, the area where the image feature resulting from the head is detected, and rotates the clipped area in the image around its point of contact with the area where the image feature resulting from the neck is detected. The image can thereby be modified so that the angle formed by the shoulder-neck line and the head-neck line becomes 60 degrees. The hole left by moving the head area may be filled with background pixel values determined by a generally known method, as sketched below.
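  • A sketch of this rotation-and-fill operation with OpenCV; the head mask and neck contact point are assumed to come from the part detection described above, and the sign convention for the angle is hypothetical.

```python
import cv2

def tilt_head(img, head_mask, neck_point, delta_deg):
    """Rotate the head area around the neck contact point, then fill the
    hole left behind with inpainted background pixel values."""
    h, w = img.shape[:2]
    rot = cv2.getRotationMatrix2D(neck_point, -delta_deg, 1.0)
    head_only = cv2.bitwise_and(img, img, mask=head_mask)
    moved = cv2.warpAffine(head_only, rot, (w, h))
    moved_mask = cv2.warpAffine(head_mask, rot, (w, h))
    # The hole: original head pixels no longer covered after the move.
    hole = cv2.bitwise_and(head_mask, cv2.bitwise_not(moved_mask))
    filled = cv2.inpaint(img, hole, 3, cv2.INPAINT_TELEA)
    filled[moved_mask > 0] = moved[moved_mask > 0]
    return filled
```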
  • The image modified by the image output unit 107 is transmitted to the partner of the remote image communication via a communication module (not illustrated) and output via a display near the partner.
  • The configuration of the image processing apparatus 100 according to the present exemplary embodiment is described above.
  • The process procedure performed by the image processing apparatus 100 according to the present exemplary embodiment is described with reference to a flow chart illustrated in FIG. 2. A program code according to the flow chart in FIG. 2 is stored in a memory such as a random access memory (RAM) and a read only memory (ROM) (not illustrated) in the control unit 108, and read and executed by a central processing unit (CPU) (not illustrated). The processing related to transmission and reception of data may be performed directly or performed via a network and is not mentioned in particular herein.
  • When the processing is started in step S200, in step S201, the imaging unit 101 captures an image of a real space. As described above, the real space refers to a home environment in the present exemplary embodiment. If the imaging unit 101 includes a plurality of cameras, each camera captures an image. The captured image is transmitted from the image input unit 102 to the image recognition unit 103.
  • In step S202, the image recognition unit 103 performs processing for recognizing a person, an object, and a scene captured in the image transmitted from the image input unit 102.
  • In step S203, it is confirmed whether the user is performing the remote image communication. For example, if the remote-image-communication function is turned ON and OFF by user operation, this can be confirmed by checking the state of that function. In addition, if the recognition processing executed in step S202 finds that no person is included in the image captured in step S201, it can be confirmed that the user is obviously not performing the remote image communication.
  • Conversely, if the recognition processing executed in step S202 finds that a person is included in the image captured in step S201 by the camera used for the remote image communication, and that the person's action is a conversation, it can be confirmed that the user is performing the remote image communication. Whichever confirmation method is used, if the user is not performing the remote image communication (NO in step S203), the recognition result acquired in step S202 and the corresponding image are transmitted to the image determination unit 104, and the processing proceeds to step S204. If the user is performing the remote image communication (YES in step S203), the recognition result acquired in step S202 is transmitted to the image output unit 107, and the processing proceeds to step S206.
  • In step S204, the image determination unit 104 determines whether the captured image received from the image recognition unit 103 is a target image to be stored in the image recording unit 105, based on the recognition results corresponding to the image. If the imaging unit 101 includes the plurality of cameras, the image determination unit 104 receives an image from each camera and determines whether to store each image.
  • In step S205, the image determined to be stored in step S204 is transmitted to the image recording unit 105 and recorded therein. At this point, the recognition results of the image recognition unit 103 are added to the image as metadata such as a label. The label may be changed later by the user. The image is recorded and then the processing returns to step S201.
  • In step S206, the image selection unit 106 selects the image to be modified from the images recorded in the image recording unit 105, based on the recognition results received from the image recognition unit 103. For example, it selects the image whose recognition results are all the same as, or partially similar to, the received recognition results; alternatively, an image selected in advance by the user for each recognition result may be acquired. If there is a plurality of selectable candidates, the latest stored image, for example, may be selected, or all candidate images may be selected. The selected image is transmitted to the image output unit 107, and the processing proceeds to step S207.
  • In step S207, the image output unit 107 modifies the image for the remote image communication using the image selected in step S206 based on the recognition results acquired in step S202. The image is modified and the processing proceeds to step S208.
  • In step S208, the image modified by the image output unit 107 is transmitted to the remote image communication partner via a communication module (not illustrated) and output via a display near the partner.
  • According to the above-described processing, the image processing apparatus 100 selects, not the image of the user captured during the communication, but a past image of the user that provides the same recognition result, and modifies it into the image transmitted to the remote-image-communication partner. Because only images determined through recognition processing to be ones that "can be shown to a communication partner" are recorded as material for selection and modification, the user shows the partner only images of himself or herself that "can be shown to a communication partner."
  • In other words, images of "the user as he or she can be shown to a communication partner," as determined in advance by the user, are identified by the recognition technique from the regularly captured and stored images of the user's daily behavior. At the time of the remote communication, the situation at that moment (illumination, face orientation, body posture, facial expression, action, and so on) is recognized and measured, and an image matching the result is selected from the recorded images or synthesized by modifying them.
  • Since every material for selection and modification is an image of "the user as he or she can be shown to a communication partner," the modified image both reflects the situation at that moment and remains one the user is willing to show. Thus, according to the present exemplary embodiment, a television telephone that transmits an image reflecting the user's situation can do so even in situations not covered by the previously captured and stored images.
  • An image processing apparatus according to a modification example of the present invention modifies an image for the remote image communication using images captured indoors and outdoors and stored by a digital camera or a digital camcorder. The following describes the configuration and processing procedure of the image processing apparatus according to the present modification example with reference to the attached drawings.
  • FIG. 3 is a block diagram illustrating a schematic configuration of an image processing apparatus 200 according to the present modification example. As illustrated in FIG. 3, the image processing apparatus 200 according to the present modification example includes an imaging unit 101, an image input unit 102, an image recognition unit 103, an image determination unit 104, an image recording unit 105, an image selection unit 106, an image output unit 107, and a control unit 108. The image processing apparatus 200 further includes an image storage unit 206, an environment measurement unit 207, an image communication unit 208, and an image display unit 209. The image processing apparatus 200 is substantially similar in configuration to the image processing apparatus 100, so that the similar components are identified with the same reference numerals, and detailed description thereof is omitted.
  • The imaging unit 101 includes one or more cameras and captures an image in a real space of a general home environment. A user performs the remote image communication in front of one or more cameras included in the imaging unit 101.
  • The image input unit 102 receives the image captured by the imaging unit 101 and outputs the image to the image recognition unit 103.
  • The image storage unit 206 stores images captured by a portable camera device such as a digital camera or a digital camcorder, for example commemorative photos and family images captured while travelling, at family gatherings, and the like. The stored images are output to the image recognition unit 103.
  • The image recognition unit 103 receives the images from the image input unit 102 and the image storage unit 206 and recognizes the persons, objects, and scenes captured in them. It recognizes the names and arrangement of the objects captured in an image, as well as the existence or absence, position (in the image), identity (who it is), facial expression, posture, action, and behavior of a person. Further, it recognizes the imaging environment itself, such as the scene (indoor or outdoor) and the context (school event, public space, personal event, and others).
  • When the user does not perform the remote image communication using the image processing apparatus 200, each recognition result is transmitted to the image determination unit 104 along with the image corresponding thereto. When the user performs the remote image communication using the image processing apparatus 200, each recognition result is output to the image output unit 107. At this point, the corresponding image is not transmitted to the image output unit 107.
  • The image determination unit 104 receives the captured images from the image recognition unit 103 and selects, from among them, the images to be recorded in the image recording unit 105, based on the recognition results received from the image recognition unit 103. The selection is based on whether the situation satisfies the predetermined condition of being "one the user does not mind showing a remote-image-communication partner." For example, the image determination unit 104 selects images the user does not mind showing the communication partner, such as an "image with a smile." Each selected image is transmitted to the image recording unit 105 along with the corresponding recognition result.
  • The image recording unit 105 records the image transmitted from the image determination unit 104. At this point, the image recording unit 105 adds metadata such as a label to the image. For example, the recognition result may be taken as the label as it is, or the user may give the image any label name via an interface (not illustrated). A label name once given can be changed later by the user, and images can be added or deleted as needed. The image recorded in the image recording unit 105 is acquired by the image selection unit 106.
  • The environment measurement unit 207 is arranged near the imaging unit 101 to measure the imaging environment of the imaging unit 101. The environment measurement unit 207 includes an optical sensor, for example, and the optical sensor measures an actual position and brightness of a light source in an imaging range of the imaging unit 101. Alternatively, the environment measurement unit 207 may include a temperature sensor to measure atmospheric temperature. If the imaging unit 101 captures an image of a plurality of environments using a plurality of cameras, the environment measurement unit 207 is provided with a plurality of sensors to measure a plurality of imaging environments. The measurement results of the environment measurement unit 207 are output to the image output unit 107.
  • The image selection unit 106 receives the recognition results of the captured image from the image recognition unit 103 and the measurement results of the imaging environment from the environment measurement unit 207. The image selection unit 106 selects the image recorded in the image recording unit 105 based on the recognition results and the measurement results.
  • The image output unit 107 modifies the selected image to an image used for the remote image communication (image to be transmitted to the communication partner) and outputs the image.
  • The image output unit 107 further modifies the selected image based on the recognition results from the image recognition unit 103. Furthermore, it performs modification so that the modified image would yield measurement results the same as (or similar to) those provided by the environment measurement unit 207. For example, if the measurement items for the imaging environment include information about the lighting environment, the image output unit 107 modifies the lighting state of the image to match the lighting environment of the space measured by the environment measurement unit 207.
  • As a more specific example, suppose an outdoor space in the daytime is captured in the image referenced from the image recording unit 105, while the environment measurement unit 207 is measuring an indoor space at night. In such a case, the image output unit 107 removes the outdoor lighting component from the referenced image and adds a virtual indoor lighting component to it. A crude sketch follows.
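  • The patent does not specify the relighting algorithm. As a crude stand-in, the sketch below shifts the stored image's per-channel color statistics toward a reference frame captured in the measured environment; proper relighting would use the light-source position and brightness from the environment measurement unit 207.

```python
import numpy as np

def match_lighting(stored, reference):
    """Per-channel mean/std transfer between uint8 color images."""
    out = stored.astype(np.float32)
    ref = reference.astype(np.float32)
    for c in range(out.shape[-1]):
        s_mean, s_std = out[..., c].mean(), out[..., c].std() + 1e-6
        r_mean, r_std = ref[..., c].mean(), ref[..., c].std()
        out[..., c] = (out[..., c] - s_mean) * (r_std / s_std) + r_mean
    return np.clip(out, 0, 255).astype(np.uint8)
```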
  • As described above, the image output unit 107 modifies the image into one that would yield the same measurement results as those of the space where the environment measurement unit 207 measures the imaging environment, so that the image seems as if it had been captured in that space. The modified image is transmitted to the image communication unit 208.
  • The image communication unit 208 transmits the image output from the image output unit 107 to the image display unit 209. The transmission may be performed via a wired network or a wireless network such as a cellular phone.
  • The image display unit 209 is a remote image display terminal arranged near the remote image communication partner of the user of the image processing apparatus 200 and displays the image transmitted from the image communication unit 208.
  • The configuration of the image processing apparatus 200 according to the present modification example is described above.
  • The process procedure performed by the image processing apparatus 200 according to the present modification example is described with reference to a flowchart illustrated in FIG. 4. A program code according to the flow chart in FIG. 4 is stored in a memory such as a RAM and a ROM (not illustrated) in the control unit 108 provided in the image processing apparatus 200, and read and executed by a CPU (not illustrated).
  • When the processing is started in step S400, in step S401, the imaging unit 101 captures an image of a real space. The captured image is transmitted from the image input unit 102 to the image recognition unit 103.
  • In step S402, the image recognition unit 103 performs recognition processing on the image transmitted from the image input unit 102. If any image stored in the image storage unit 206 has not yet been processed by the image recognition unit 103, that image is also subjected to the recognition processing. The recognition processing identifies the existence or absence, identity, facial expression, posture, action, and behavior of a person in the image, as well as the objects and scene captured in it.
  • In step S403, it is confirmed whether the user is performing the remote image communication. The confirmation method is similar to that described for step S203 in the first exemplary embodiment. If the user is not performing the remote image communication (NO in step S403), the recognition result acquired in step S402 and the corresponding image are transmitted to the image determination unit 104, and the processing proceeds to step S404. If the user is performing the remote image communication (YES in step S403), the recognition result acquired in step S402 is transmitted to the image output unit 107, and the processing proceeds to step S406.
  • In step S404, the image determination unit 104 determines, based on the corresponding recognition results, whether the captured image received from the image recognition unit 103 is to be stored in the image recording unit 105. The processing in step S404 differs from that in step S204 in that the images stored in the image storage unit 206 are also among those from which the image determination unit 104 selects.
  • In step S405, the image determined to be stored in step S404 is transmitted to the image recording unit 105 and recorded therein. At this point, the recognition results of the image recognition unit 103 are added to the image as a label. The label may be changed later by the user. The image is recorded, and then the processing returns to step S401.
  • In step S406, the image selection unit 106 selects the image to be modified from the images recorded in the image recording unit 105 based on the recognition results received from the image recognition unit 103. The selection method is similar to that described in step S206 in the first exemplary embodiment. The image is selected and transmitted to the image output unit 107.
  • In step S407, the environment measurement unit 207 measures an imaging environment. Environment measurement results are transmitted to the image output unit 107.
  • In step S408, the image output unit 107 modifies the image acquired in step S406 to the image for the remote image communication based on the recognition results acquired in step S402 and the measurement value of the imaging environment acquired in step S407.
  • In step S409, the modified image is transmitted to the partner of the remote image communication via the image communication unit 208 and output via the image display unit 209 near the partner.
  • According to the above-described processing, the image processing apparatus 200 can produce the image transmitted to the remote-image-communication partner by modifying, not the image of the user captured during the communication, but a past image of the user that provides the same recognition result.
  • According to the present modification example, the image for the remote image communication can be modified based in particular on past commemorative photos and family images captured and stored by a portable camera device. In other words, the user can modify the image for the remote image communication using images captured at places other than where the user is at the time of the remote image communication.
  • Aspects of the present invention can also be realized by a computer of a system or apparatus (or devices such as a CPU or a micro processing unit (MPU)) that reads out and executes a program recorded on a memory device to perform the functions of the above-described embodiment(s), and by a method, the steps of which are performed by a computer of a system or apparatus by, for example, reading out and executing a program recorded on a memory device to perform the functions of the above-described embodiment(s). For this purpose, the program is provided to the computer, for example, via a network or from a recording medium of various types serving as the memory device (e.g., a computer-readable storage medium).
  • While the present invention has been described with reference to exemplary embodiments, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all modifications, equivalent structures, and functions.

Claims (1)

What is claimed is:
1. An image processing apparatus comprising:
an image input unit configured to input an image in which a real space is captured by an image capturing apparatus;
an image recognition unit configured to recognize a situation in the real space captured in the input image;
an image recording unit configured to record the input image;
an image selection unit configured to select an image used for image communication from a plurality of images including images recorded by the image recording unit in the past based on a recognition result of the input image; and
an image output unit configured to modify the selected image and output the modified image.
US14/146,934 2010-12-16 2014-01-03 Image processing apparatus, image processing method, and storage medium Abandoned US20140118547A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/146,934 US20140118547A1 (en) 2010-12-16 2014-01-03 Image processing apparatus, image processing method, and storage medium

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2010-280848 2010-12-16
JP2010280848 2010-12-16
US13/323,440 US8644614B2 (en) 2010-12-16 2011-12-12 Image processing apparatus, image processing method, and storage medium
US14/146,934 US20140118547A1 (en) 2010-12-16 2014-01-03 Image processing apparatus, image processing method, and storage medium

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US13/323,440 Continuation US8644614B2 (en) 2010-12-16 2011-12-12 Image processing apparatus, image processing method, and storage medium

Publications (1)

Publication Number Publication Date
US20140118547A1 (en) 2014-05-01

Family

ID=46234515

Family Applications (2)

Application Number Title Priority Date Filing Date
US13/323,440 Active 2032-08-02 US8644614B2 (en) 2010-12-16 2011-12-12 Image processing apparatus, image processing method, and storage medium
US14/146,934 Abandoned US20140118547A1 (en) 2010-12-16 2014-01-03 Image processing apparatus, image processing method, and storage medium

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US13/323,440 Active 2032-08-02 US8644614B2 (en) 2010-12-16 2011-12-12 Image processing apparatus, image processing method, and storage medium

Country Status (2)

Country Link
US (2) US8644614B2 (en)
JP (1) JP5854806B2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105208355A (en) * 2015-10-21 2015-12-30 Hefei Hualing Co., Ltd. Refrigerator data acquisition method and system, and refrigerator
CN106864368A (en) * 2015-12-11 2017-06-20 Robert Bosch GmbH Warning method and system applied to a vehicle
CN108401129A (en) * 2018-03-22 2018-08-14 Guangdong Genius Technology Co., Ltd. Video call method, apparatus and terminal based on a wearable device, and storage medium

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9575641B2 (en) * 2012-03-20 2017-02-21 Adobe Systems Incorporated Content aware image editing
EP2713593B1 (en) * 2012-09-28 2015-08-19 Alcatel Lucent, S.A. Immersive videoconference method and system
US20140153832A1 (en) * 2012-12-04 2014-06-05 Vivek Kwatra Facial expression editing in images based on collections of images
KR101988279B1 (en) * 2013-01-07 2019-06-12 Samsung Electronics Co., Ltd. Operating method of a user function based on face recognition, and electronic device supporting the same
CN105684038B (en) 2013-10-28 2019-06-11 Google LLC Image cache for replacing portions of an image
US10097823B1 (en) * 2015-11-13 2018-10-09 Harmonic, Inc. Failure recovery for real-time audio and video encoding, decoding, and transcoding
KR20200080047A (en) * 2018-12-26 2020-07-06 Samsung Electronics Co., Ltd. Method and wearable device for identifying the hand of the true user
US11157549B2 (en) * 2019-03-06 2021-10-26 International Business Machines Corporation Emotional experience metadata on recorded images

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090087099A1 (en) * 2007-09-28 2009-04-02 Fujifilm Corporation Image processing apparatus, image capturing apparatus, image processing method and recording medium

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001127990A (en) * 1999-11-01 2001-05-11 Mega Chips Corp Information communication system
JP2001169257A (en) * 1999-12-14 2001-06-22 Matsushita Electric Ind Co Ltd Video phone
JP2003309829A (en) * 2002-04-15 2003-10-31 Matsushita Electric Ind Co Ltd Mobile moving picture phone
JP2005151231A (en) * 2003-11-17 2005-06-09 Nippon Telegr & Teleph Corp <Ntt> Video communication method, video communication apparatus, video creation program used for apparatus, and recording medium with program recorded thereon
JP4940695B2 (en) * 2006-02-24 2012-05-30 日本電気株式会社 Videophone device, mobile terminal with videophone, videophone method thereof and communication method of mobile terminal with videophone
JP2008131412A (en) * 2006-11-22 2008-06-05 Casio Hitachi Mobile Communications Co Ltd Video telephone system and program
JP2008270912A (en) 2007-04-16 2008-11-06 Ntt Docomo Inc Control device, mobile communication system, and communication terminal
JP2009246566A (en) * 2008-03-28 2009-10-22 Sony Ericsson Mobile Communications Japan Inc Imaging apparatus, imaging method, imaging control program, and mobile terminal
JP4477079B2 (en) 2008-07-29 2010-06-09 Kenwood Corp Mobile phone equipment

Also Published As

Publication number Publication date
US20120155773A1 (en) 2012-06-21
JP5854806B2 (en) 2016-02-09
JP2012142925A (en) 2012-07-26
US8644614B2 (en) 2014-02-04

Similar Documents

Publication Publication Date Title
US8644614B2 (en) Image processing apparatus, image processing method, and storage medium
US20230156319A1 (en) Autonomous media capturing
JP7077376B2 (en) Image pickup device and its control method
JP4984728B2 (en) Subject collation device and subject collation method
US20050011959A1 (en) Tags and automated vision
EP1793580B1 (en) Camera for automatic image capture having plural capture modes with different capture triggers
WO2017084182A1 (en) Method and apparatus for image processing
CN111294488B (en) Image pickup apparatus, control method thereof, and storage medium
JP6229656B2 (en) Control device and storage medium
KR20050085583A (en) Expression invariant face recognition
CN114500789A (en) Image pickup apparatus, control method therefor, and recording medium
JP6028457B2 (en) Terminal device, server, and program
JP2019110509A (en) Imaging device and method of controlling the same, program, and storage medium
KR102475999B1 (en) Image processing apparatus and method for controling thereof
WO2019124055A1 (en) Image capturing device, control method therefor, program, and storage medium
JP2016038774A (en) Person identification apparatus
JP2015092646A (en) Information processing device, control method, and program
CN110557560B (en) Image pickup apparatus, control method thereof, and storage medium
CN111105039A (en) Information processing apparatus, control method thereof, and memory
JP2019110525A (en) Imaging device and method of controlling the same, program, and storage medium
JP2024045460A (en) Information processing system, information processing device, information processing method, and program
JP4427714B2 (en) Image recognition apparatus, image recognition processing method, and image recognition program
CN111512625B (en) Image pickup apparatus, control method thereof, and storage medium
WO2020115910A1 (en) Information processing system, information processing device, information processing method, and program
JP2005199373A (en) Communication device and communication method

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION