WO2010035524A1 - Intercom system - Google Patents

Intercom system

Info

Publication number
WO2010035524A1
Authority
WO
WIPO (PCT)
Prior art keywords
visitor
unit
face
information
imaging
Prior art date
Application number
PCT/JP2009/054739
Other languages
French (fr)
Japanese (ja)
Inventor
桂 内田
倫 渡邉
博之 高田
Original Assignee
Brother Kogyo Kabushiki Kaisha (ブラザー工業株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Brother Kogyo Kabushiki Kaisha (ブラザー工業株式会社)
Publication of WO2010035524A1 publication Critical patent/WO2010035524A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M11/00Telephonic communication systems specially adapted for combination with other electrical systems
    • H04M11/02Telephonic communication systems specially adapted for combination with other electrical systems with bell or annunciator systems
    • H04M11/025Door telephones

Definitions

  • the present invention relates to an intercom system, and more specifically, to an intercom system including a slave unit installed outdoors and a master unit installed indoors.
  • conventionally, an intercom system including a slave unit installed outside a house and a master unit installed indoors is known.
  • for example, there is an intercom that can capture a visitor with a camera provided in the slave unit and display the captured image of the visitor on a monitor provided in the master unit.
  • the main purpose of such an intercom system is to enable the resident to look at the visitor's face displayed on the monitor, determine who the visitor is, and respond appropriately. Therefore, there are intercoms that ask the visitor to take some action when an image in which the resident can recognize the visitor's face cannot be acquired.
  • Patent Document 1 discloses an intercom device that outputs, from a speaker of the slave unit, a sound instructing the visitor to move to a predetermined position when a normal face cannot be recognized from the captured image (JP 2005-92262 A).
  • the device of Patent Document 1, however, always gives the visitor the same instruction even though there are a plurality of possible reasons why the face cannot be recognized from the captured image. Therefore, even if the visitor acts according to the instruction, the action may not be appropriate. Moreover, when a visitor is deliberately acting so that the face is not recognized, the instruction may simply be ignored. If the resident cannot recognize the visitor's face, the resident may respond without due caution even when the visitor is a malicious person.
  • the present invention has been made to solve the above problems, and an object thereof is to provide an intercom system that, when an image in which the visitor's face is recognizable cannot be acquired, prompts the visitor to take an action appropriate to the cause and gives the resident a warning at the necessary level.
  • according to an aspect of the present invention, there is provided an intercom system including a slave unit installed outdoors and a master unit connected to the slave unit and installed indoors. The slave unit includes imaging means for imaging a predetermined imaging range in front of the slave unit and outputting captured image information that is image information of the imaging range, visitor detection means for detecting a visitor, and slave unit notification means for notifying the visitor of information.
  • the master unit includes image acquisition means for acquiring the captured image information output from the imaging means when a visitor is detected by the visitor detection means; recognition determination means for determining, based on the acquired captured image information, whether or not the visitor's face is recognizable; slave unit notification control means for, when the recognition determination means determines that the visitor's face is not recognizable, controlling the slave unit notification means to notify the visitor of information prompting different actions according to the cause that hinders recognition; master unit notification means for notifying information in the room; and first master unit notification control means for controlling the master unit notification means to notify information prompting different levels of warning depending on the cause that hinders recognition.
  • FIG. 1 is a block diagram showing the electrical configuration of the intercom system 1.
  • FIG. 2 is an external front view of the master unit 20 displaying the registrant mode screen 310.
  • FIG. 3 is an explanatory diagram of the storage areas included in the flash ROM 220 of the master unit 20.
  • FIG. 4 is an explanatory diagram showing an example of the data stored in the face feature storage area 221 of the flash ROM 220.
  • FIG. 5 is a flowchart of the main processing of the master unit 20.
  • FIG. 6 is a flowchart of the recognition determination process performed during the main processing of FIG. 5.
  • FIG. 7 is a flowchart of the detection error process performed during the recognition determination process of FIG. 6.
  • FIG. 8 is a flowchart of the feature extraction error process performed during the recognition determination process of FIG. 6.
  • FIG. 9 is a flowchart of the visitor process performed during the main processing of FIG. 5.
  • FIG. 10 is an external front view of the master unit 20 displaying the caution mode screen 320.
  • FIG. 11 is an external front view of the master unit 20 displaying the warning mode screen 330.
  • the intercom system 1 of the present embodiment includes a child device 10 and a parent device 20 connected via a signal line 30.
  • the slave unit 10 is installed outdoors, and the master unit 20 is installed indoors.
  • the slave unit 10 includes a substantially rectangular parallelepiped casing; a microphone 111, a speaker 112, a camera 113, a call button 114, a human sensor 115, and an infrared LED illumination 117 are provided on the front face of the casing (the surface facing the visitor).
  • the slave unit 10 includes a CPU 101, ROM 102, RAM 103, microphone 111, speaker 112, camera 113, call button 114, human sensor 115, infrared LED illumination 117, and communication device 150. These are all interconnected by a bus.
  • the CPU 101 controls the entire slave unit 10.
  • the ROM 102 stores programs necessary for basic operations of the slave unit 10 and setting values for the programs.
  • the CPU 101 controls the operation of the child device 10 according to a program stored in the ROM 102.
  • the RAM 103 is a storage device for temporarily storing various data.
  • the microphone 111 is a device that converts an input voice of a visitor into a voice signal and outputs the voice signal to the parent device 20 via the communication device 150.
  • the speaker 112 is a device that converts the audio signal input from the parent device 20 into sound and outputs the sound.
  • the camera 113 is, for example, a well-known CCD camera.
  • the camera 113 captures a predetermined imaging range in front of the slave unit 10 and outputs an image signal of the captured image to the master unit 20 via the communication device 150.
  • the predetermined imaging range is set in advance so as to include, for example, the area where the face of a visitor standing facing the front of the slave unit 10 is expected to be located when imaging at the normal magnification.
  • the camera 113 according to the present embodiment can capture images at different resolutions and at different magnifications, that is, at different angles of view using a zoom function, and can capture both still images and moving images.
  • the call button 114 is a button for a visitor to call a room responder. When call button 114 is pressed, a call signal is transmitted to base unit 20 via communication device 150.
  • Human sensor 115 is a sensor that detects the movement of an object in a predetermined area in front of handset 10.
  • the human sensor 115 for example, an infrared sensor that emits infrared rays to an object and detects the object based on a change in the amount of received infrared light can be employed.
  • a human sensor using ultrasonic waves may be adopted as the human sensor 115.
  • the human sensor 115 constantly detects moving objects, and transmits a signal indicating the result to the master unit 20 via the communication device 150.
  • the infrared LED illumination 117 includes an illuminance sensor that detects illuminance. When the detected illuminance is lower than a predetermined threshold, the infrared LED illumination 117 is lit to illuminate the vicinity of the slave unit 10.
  • the communication device 150 is a device that transmits and receives various signals including a control signal, an image signal, and an audio signal to and from the parent device 20 via the signal line 30.
  • the base unit 20 includes a housing 205 having a substantially rectangular parallelepiped shape.
  • a microphone 211 and a speaker 212 are provided at the lower left front of the housing 205.
  • a display monitor 214 is provided at the front left center of the housing 205, and a warning lamp 216 is provided at the upper left.
  • An operation panel 215 is provided on the front right half of the housing 205.
  • the microphone 211 is a device that converts the voice of the room responder into a voice signal and outputs the voice signal to the slave unit 10 via the first communication device 250.
  • the speaker 212 is a device that converts an audio signal input from the child device 10 into sound and outputs the sound.
  • the display monitor 214 is a liquid crystal monitor including a liquid crystal panel and a drive circuit, for example.
  • the display monitor 214 is a display device that displays an image of the imaging area captured by the camera 113 of the slave unit 10.
  • the warning lamp 216 is, for example, an LED lamp. As will be described later, the warning lamp 216 is turned on in order to alert a room responder when a visitor cannot be identified.
  • the operation panel 215 is a display device in which, for example, a liquid crystal panel, a drive circuit that drives the liquid crystal panel, and a touch pad that is an input device capable of detecting an input position are incorporated at least in part.
  • the operation panel 215 displays a visitor information display area 311 for displaying information related to visitors, and various operation buttons for the room attendant to input instructions.
  • the operation buttons include, for example, a registration correction button 312, a corresponding button 313, and a rejection button 314 shown in FIG.
  • a different display is performed on the operation panel 215 depending on the recognition status of the visitor. Details will be described later.
  • the master unit 20 includes a CPU 201, a ROM 202, a RAM 203, a microphone 211, a speaker 212, a display monitor 214, an operation panel 215, a warning lamp 216, a flash ROM 220, a first communication device 250, and a second communication device 260, all of which are connected to each other by a bus.
  • the CPU 201 controls the entire master device 20.
  • the ROM 202 stores a program necessary for causing the master unit 20 to execute various processes including a main process, which will be described later, and setting values for that purpose.
  • CPU 201 controls the operation of parent device 20 in accordance with a program stored in ROM 202.
  • the RAM 203 is a storage device for temporarily storing various data.
  • the microphone 211, speaker 212, display monitor 214, operation panel 215, and warning lamp 216 are as described above.
  • the first communication device 250 is a device that transmits and receives various signals including a control signal, an image signal, and an audio signal to and from the slave unit 10 via the signal line 30.
  • the second communication device 260 is connected to the public telephone line network 5 via the telephone line 50 and is a device that transfers a call with the handset 10 to an external telephone through the public telephone line network 5.
  • the flash ROM 220 will be described with reference to FIGS. 3 and 4.
  • the flash ROM 220 is a nonvolatile semiconductor memory.
  • the flash ROM 220 is provided with a plurality of storage areas, including a face feature storage area 221, a display screen storage area 222, a notification audio storage area 223, a setting information storage area 224, a moving image storage area 225, and a still image storage area 226.
  • in the face feature storage area 221, face feature data, which is data indicating the facial features of a plurality of persons, and related information, which is information related to those persons, are stored in association with each other.
  • the face feature storage area 221 includes, for example, an ID column, a face feature data column, a name column, a relationship column, a previous visit date column, and a memo column.
  • in the ID column, an ID that is unique information for identifying each record is stored.
  • in the face feature data column, for example, numerical data indicating the positions and shapes of the eyebrows, eyes, nose, mouth, and the like are stored as face feature data. In the present embodiment, it is assumed that feature point data corresponding to the eyes, nose, and mouth is adopted as the face feature data.
  • in the name column, the name of the person from whom the face feature data was extracted (hereinafter referred to as a registrant) is stored.
  • in the relationship column, information indicating the relationship between the registrant and the user of the master unit 20 is stored as the relationship.
  • the user of the master unit 20 is, for example, a resident if the intercom system 1 is installed in a house, or an employee if the intercom system 1 is installed in a company. Therefore, for example, information such as a relative of the resident, a friend, or a nearby resident, or an employee of the company, a customer, or a trader is stored as the information indicating the relationship.
  • the previous visit date column stores, as the previous visit date, the date on which the registrant last visited the house or company where the intercom system 1 is installed.
  • in the memo column, a text to be notified to the room responder when the registrant visits is stored as a memo.
  • the name, relationship, previous visit date, and memo stored in the face feature storage area 221 in correspondence with the face feature data constitute the related information of the registrant.
  • the information is registered by the user of the parent device 20 as appropriate.
  • the previous visit date is information that is automatically updated when it is recognized that the registrant has visited, as will be described later.
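The registrant records described above can be sketched as a simple data structure. This is a hypothetical Python illustration: the class and field names, and the use of (x, y) coordinate pairs as feature point data, are assumptions for illustration, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Registrant:
    """One record in the face feature storage area 221 (illustrative)."""
    id: int            # ID column: unique identifier of the record
    face_features: dict  # feature point data for eyes, nose, and mouth
    name: str          # name column
    relationship: str  # relationship to the user of the master unit 20
    last_visit: str = ""  # previous visit date, auto-updated on a recognized visit
    memo: str = ""     # text shown to the room responder on a visit


# Example record, loosely modeled on the registrant with ID "1".
a_san = Registrant(
    id=1,
    face_features={"eyes": [(30, 40), (70, 40)], "nose": [(50, 60)], "mouth": [(50, 80)]},
    name="A",
    relationship="acquaintance",
    last_visit="2008-09-01",
    memo="waiting for return",
)
```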
  • in the display screen storage area 222, screen templates to be displayed on the operation panel 215 of the master unit 20 are stored.
  • on the operation panel 215, for example, as shown in FIG. 2, a screen having operation buttons with which the user of the master unit 20 inputs information related to the visitor and various instructions is displayed.
  • a plurality of display screen templates are stored in the display screen storage area 222.
  • in the notification audio storage area 223, audio data of the audio to be output from the speaker 212 of the master unit 20 or the speaker 112 of the slave unit 10 is stored.
  • Examples of the stored voice data include voice data that is output from the speaker 112 of the child device 10 and prompts a visitor to perform a predetermined action.
  • the setting information stored in the setting information storage area 224 includes, for example, the collation condition and the resolution and magnification used when imaging with the camera 113 of the slave unit 10.
  • the collation condition is information indicating whether collation of face feature data is performed when at least one piece of the feature point data corresponding to the eyes, nose, and mouth has been extracted, or only when all of the feature point data corresponding to the eyes, nose, and mouth have been extracted. In the present embodiment, two resolutions are stored: the normal resolution, which is the default, and a high resolution higher than the normal resolution. Further, three magnifications are stored: the normal magnification, which is the default, a low magnification lower than the normal magnification, and a high magnification higher than the normal magnification.
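The setting information described above (collation condition, two resolutions, three magnifications) can be sketched as follows; the dictionary layout and the concrete pixel and zoom values are illustrative assumptions only.

```python
# Illustrative sketch of the setting information storage area 224.
SETTINGS = {
    # True: collate even if only some of the eyes/nose/mouth feature points
    # were extracted (partial collation); False: collate only when all were.
    "partial_collation": False,
    "resolution": {"normal": (640, 480), "high": (1280, 960)},
    "magnification": {"low": 0.5, "normal": 1.0, "high": 2.0},
}


def imaging_conditions(resolution="normal", magnification="normal"):
    """Return the (resolution, magnification) pair sent to the slave unit."""
    return (SETTINGS["resolution"][resolution],
            SETTINGS["magnification"][magnification])
```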
  • in the moving image storage area 225, moving images of the imaging area are stored. Although details will be described later, in the present embodiment, a moving image captured by the camera 113 is stored in the moving image storage area 225 when the human sensor 115 detects an object whose face area cannot be detected for a predetermined time.
  • the main process shown in FIG. 5 is started when the power of the main unit 20 is turned on, is continuously repeated while the power is on, and ends when the power is turned off.
  • a reset process is performed (S1). Specifically, all the information stored in the RAM 203 is deleted, and all the various flags stored in the flag storage area (not shown) of the RAM 203 are turned off.
  • the flags that are turned off in step S1 are a detection flag, a detection error flag, and a feature extraction error flag.
  • the detection flag is a flag indicating whether or not a moving object is detected by the human sensor 115. Specifically, when the detection flag is ON, it indicates that a moving object has been detected, and when it is OFF, it indicates that it has not been detected.
  • the detection error flag is a flag indicating whether or not a face area has been detected. Specifically, when the detection error flag is ON, it indicates that the face area cannot be detected, that is, there is a detection error, and when it is OFF, it indicates that the face area has been detected.
  • the feature extraction error flag is a flag indicating whether or not all facial feature points have been extracted. Specifically, when the feature extraction error flag is ON, it indicates that not all of the facial feature points could be extracted, that is, there was a feature extraction error; when it is OFF, it indicates that all of the facial feature points could be extracted.
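The three flags and the reset process of step S1 can be sketched as follows; the flag names are illustrative stand-ins for the values the patent stores in the flag storage area of the RAM 203.

```python
# Minimal sketch of the flag storage area and the reset process (step S1).
flags = {
    "detection": False,                 # ON once the human sensor 115 detects a moving object
    "detection_error": False,           # ON when no face area can be detected
    "feature_extraction_error": False,  # ON when not all feature points are extracted
}


def reset_process(flags):
    """Step S1: turn every stored flag OFF."""
    for name in flags:
        flags[name] = False
    return flags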
  • the CPU 201 determines whether or not the detection flag is ON (S2). In the first pass, the detection flag is OFF (S2: NO). In this case, the CPU 201 determines whether or not a signal indicating that the human sensor 115 of the slave unit 10 has detected a moving object, that is, an object that appears to be a visitor, has been input from the slave unit 10 to the master unit 20 (S3). While no moving object is detected, the CPU 201 stands by (S3: NO, S2: NO).
  • when a moving object is detected (S3: YES), the detection flag turned OFF in step S1 is turned ON (S4).
  • the first communication device 250 transmits a control signal for starting imaging by the camera 113 and data of predetermined imaging conditions to the slave unit 10 via the signal line 30 (S5).
  • the data of the predetermined imaging condition transmitted at this time includes the normal resolution that is the default resolution and the data of the normal magnification that is the default magnification, which are stored in the setting information storage area 224 of the flash ROM 220. .
  • the resolution and magnification of the camera 113 are set according to the received normal resolution and normal magnification.
  • after the slave unit 10 receives the imaging start control signal transmitted in step S5, the camera 113 continuously performs imaging.
  • the image signal output from the camera 113 is transmitted to the parent device 20 via the signal line 30 by the communication device 150.
  • processing for converting the image signal received by first communication device 250 into data that can be displayed on display monitor 214 is performed according to a separately executed program. That is, the image captured by the camera 113 of the slave unit 10 can be displayed on the display monitor 214 in real time.
  • in the recognition determination process, the CPU 201 first acquires one frame of the image signal transmitted from the slave unit 10 and received by the first communication device 250, and generates still image data (S100). Based on the generated still image data, it is determined whether or not a face area can be detected (S101). Any known method may be employed for detecting the face area; for example, matching against a face pattern stored in advance or detecting a skin-color region can be employed. If a face area can be detected from the still image data (S101: YES), the face area is detected (S102).
  • next, it is determined whether or not facial feature points can be extracted from the detected face area (S103).
  • it is determined that the facial feature points can be extracted only when feature points corresponding to all of the eyes, nose, and mouth can be extracted (S103: YES).
  • in that case, the feature points of the eyes, nose, and mouth, which are the facial feature points, are extracted from the face area, and numerical data indicating their positions and shapes are obtained as the visitor's face feature data (S104).
  • the facial feature data of the registrant is read in order from the facial feature storage area 221 of the flash ROM 220 and collated with the facial feature data of the visitor (S105).
  • if, as a result of the collation in S105, the face feature data of the current visitor does not match that of any registrant, the visitor is determined to be an unregistered person (S109).
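The recognition determination flow of steps S100 to S109 can be sketched as follows. This is a hedged illustration: the face detector, the feature extractor, and the equality-based match test are stand-ins passed in as parameters, not the patent's actual algorithms.

```python
def recognition_determination(frame, registrants, detect_face, extract_features):
    """Sketch of FIG. 6. Returns one of:
    ('registrant', record), ('unregistered', None),
    ('detection_error', None), ('feature_extraction_error', None)."""
    face = detect_face(frame)                      # S101/S102: find the face area
    if face is None:
        return ("detection_error", None)           # S111 -> detection error process
    features = extract_features(face)              # S103/S104: extract feature points
    if not all(k in features for k in ("eyes", "nose", "mouth")):
        return ("feature_extraction_error", None)  # S112 -> feature extraction error process
    for r in registrants:                          # S105: collate in stored order
        if r["face_features"] == features:         # stand-in match test
            return ("registrant", r)
    return ("unregistered", None)                  # S109: no registrant matched
```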
  • if it is determined in step S101 that the face area cannot be detected from the generated still image (S101: NO), the fact that the visitor's face cannot be identified is attributed to the failure to detect the face area. Therefore, treating this as a detection error, the detection error flag stored in the flag storage area of the RAM 203 is turned ON (S111). Then, the detection error process is performed (S120 and FIG. 7).
  • the detection error process is a process that prompts the visitor to perform a predetermined action so that the face area can be detected.
  • in the detection error process, it is determined whether or not a predetermined time has elapsed since the moving object was detected by the human sensor 115 (S121). Specifically, for example, when the detection flag is turned ON in step S4 of the main process shown in FIG. 5, a timer (not shown) is started to measure the elapsed time, and it is determined whether or not the elapsed time exceeds a predetermined threshold. If the predetermined time has not elapsed (S121: NO), the detection error process ends as it is, because the face area may still be detected within the predetermined time. The process returns to the recognition determination process shown in FIG. 6, and then returns to the main process shown in FIG. 5.
  • if the moving object detected during the predetermined time is the same object (S122: YES), it is possible that the moving object is a visitor but the face area cannot be detected because the visitor is not standing at an appropriate position.
  • in this case, a control signal and data for causing the speaker 112 of the slave unit 10 to output a voice instructing the visitor to take a predetermined action for the case where the face area cannot be detected are transmitted to the slave unit 10 (S123).
  • voice data of a predetermined instruction voice stored in the notification voice storage area 223 of the flash ROM 220 is read, converted into a voice signal, and transmitted to the slave unit 10.
  • for example, a voice instructing the visitor to stand at a position where the camera 113 can photograph the face, such as "Please stand in front of the slave unit," is output.
  • accordingly, by following the voice instruction, the visitor can take an action appropriate to the recognition error that the face area cannot be detected.
  • next, the CPU 201 acquires the image signal transmitted from the slave unit 10, generates moving image data, and stores it in the moving image storage area 225 of the flash ROM 220 (S126).
  • the CPU 201 turns on the warning lamp 216 (S127) to notify the user of the parent device 20 in the room that there is an unidentified visitor. Then, the detection error process ends. The process returns to the recognition determination process shown in FIG. 6, and then returns to the main process shown in FIG.
  • the face area detection error may also be caused by the face being too close to the camera 113 and thus outside the shooting area. Therefore, in step S126, instead of recording, the slave unit 10 may be instructed to capture a still image at the low magnification (wide angle), and a low-magnification still image may be generated from the obtained image signal. By changing the magnification in this way, the room responder may be able to visually recognize the visitor's face.
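The detection error process of FIG. 7 can be sketched as follows; the slave-unit I/O (voice output, recording, warning lamp) is represented by callback parameters, and only the branches described above are modeled.

```python
def detection_error_process(elapsed, threshold, same_object,
                            play_instruction, start_recording, light_lamp):
    """Sketch of FIG. 7 (S121-S127); all names are illustrative."""
    if elapsed <= threshold:  # S121: NO - the face area may still be found
        return "wait"
    if not same_object:       # S122: NO - branch not detailed in the text
        return "wait"
    # S123: instruct the visitor via the speaker 112 of the slave unit 10.
    play_instruction("Please stand in front of the slave unit")
    start_recording()         # S126: store a moving image for later review
    light_lamp()              # S127: warning lamp 216 alerts the room responder
    return "warned"
```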
  • when it is determined in step S103 of the recognition determination process shown in FIG. 6 that at least one of the eye, nose, and mouth feature points cannot be extracted from the face area (S103: NO), the fact that the visitor's face cannot be identified is attributed to the failure to extract the facial feature points. Therefore, treating this as a feature extraction error, the feature extraction error flag stored in the flag storage area of the RAM 203 is turned ON (S112). Then, the feature extraction error process is performed (S130 and FIG. 8).
  • the feature extraction error process is a process that prompts the visitor to perform a predetermined action so that all feature points of the face can be extracted.
  • in the feature extraction error process, first, which region within the face area is concealed is identified (S131). For example, depending on whether the feature points that were not extracted in the recognition determination process correspond to the eyes, nose, or mouth, it can be identified whether the concealed region is the eye area, the nose area, or the mouth area. Alternatively, the concealed region may be identified based on the luminance value of each pixel in the face area.
  • as causes of a region being concealed, for example, when it is identified that the eye area is concealed, a case where the visitor is wearing sunglasses is assumed; when the nose or mouth area is concealed, a case where the visitor is wearing a mask is assumed.
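Step S131's inference of the concealed region from the missing feature points can be sketched as follows; the mapping from missing regions to assumed causes (sunglasses, mask) follows the text above, while the function name and return shape are assumptions.

```python
def identify_concealed_region(extracted):
    """Sketch of S131: infer the concealed regions and a guessed cause
    from the set of regions whose feature points WERE extracted."""
    missing = {r for r in ("eyes", "nose", "mouth") if r not in extracted}
    if missing == {"eyes"}:
        return missing, "sunglasses"      # eye area concealed -> sunglasses assumed
    if missing and missing <= {"nose", "mouth"}:
        return missing, "mask"            # nose/mouth concealed -> mask assumed
    return missing, "unknown"             # nothing missing, or an unmodeled mix
```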
  • partial collation refers to collating a part of the registrant's face feature data against the feature points that could be extracted when not all of the feature points corresponding to the eyes, nose, and mouth can be extracted.
  • the collation condition which is information indicating whether or not partial collation is performed, is set by the user and stored in the setting information storage area 224 of the flash ROM 220 as described above.
  • if the collation condition indicates that partial collation is not to be performed (S132: NO), the visitor is prompted to take a predetermined action for the case where facial feature points cannot be extracted.
  • specifically, a control signal and data for causing the speaker 112 of the slave unit 10 to output a voice prompting this action are transmitted to the slave unit 10 (S133).
  • voice data of a predetermined instruction voice stored in the notification voice storage area 223 of the flash ROM 220 is read, converted into a voice signal, and transmitted to the slave unit 10.
  • an instruction to show the concealed area is given; for example, a voice such as "Please show your eyes" is output. Therefore, by following the voice instruction, the visitor can take an action appropriate to the recognition error that the facial feature points cannot be extracted.
  • an instruction for imaging at a high resolution is transmitted to the slave unit 10 (S141).
  • the CPU 201 sets the high resolution read from the setting information storage area 224 of the flash ROM 220 as the resolution of the camera 113 and transmits a control signal for imaging to the slave unit 10.
  • high-resolution imaging is performed here because, although the facial feature points could not be extracted at the normal resolution, a room responder may be able to visually recognize the visitor's face in a high-resolution image.
  • the CPU 201 acquires one frame of image signal transmitted from the child device 10 to generate a high-resolution still image and stores it in the still image storage area 226 of the flash ROM 220 (S142).
  • in step S141, the CPU 201 may instruct imaging at the high magnification (zoom) stored in the setting information storage area 224 of the flash ROM 220 instead of imaging at the high resolution. If the facial feature points cannot be extracted, it may be because the visitor is too far from the camera 113. Therefore, by changing the magnification in this way, the room responder may be able to visually recognize the visitor's face.
  • the CPU 201 transmits an instruction to return the resolution of the camera 113 to the normal resolution to the slave unit 10 (S143). Specifically, the normal resolution read from the setting information storage area 224 of the flash ROM 220 is set as the resolution of the camera 113 and a control signal for imaging is transmitted to the slave unit 10. Thereafter, the warning lamp 216 is turned on (S144) to notify the user of the parent device 20 in the room that there is an unidentified visitor. Then, the feature extraction error process ends. The process returns to the recognition determination process shown in FIG. 6, and then returns to the main process shown in FIG.
  • if it is determined in step S132 that the collation condition stored in the setting information storage area 224 indicates that partial collation is to be performed (S132: YES), collation with the registrants' face feature data is performed using only the feature points that could be extracted (S135).
  • for example, during the spring pollen season, more people wear masks as a countermeasure against hay fever. It is harsh to instruct such visitors to remove their masks and show their noses and mouths. In such a case, therefore, the collation condition can be set in advance so that partial collation is performed.
  • in the partial collation, for example, when the feature points corresponding to the eyes cannot be extracted and only the feature points corresponding to the nose and mouth can be extracted, the data obtained from the nose and mouth feature points are collated against the nose and mouth data of each registrant's face feature data. If the data does not match that of any registrant (S136: NO), an instruction for imaging at the high resolution is transmitted to the slave unit 10 (S141) as described above, a high-resolution still image is stored in the still image storage area 226 (S142), and the warning lamp 216 is turned on (S144). Then, the feature extraction error process ends, the process returns to the recognition determination process shown in FIG. 6, and then returns to the main process shown in FIG. 5.
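The partial collation of steps S135 and S136 can be sketched as follows; the equality-based match test is a stand-in for whatever similarity measure an actual implementation would use.

```python
def partial_collation(visitor_features, registrants):
    """Sketch of S135/S136: compare only the regions that could be
    extracted against each registrant's stored face feature data.
    Returns the first matching registrant record, or None."""
    for r in registrants:
        if all(r["face_features"].get(region) == data
               for region, data in visitor_features.items()):
            return r  # S136: YES -> visitor candidate (S137), lower reliability
    return None       # S136: NO -> high-resolution image and warning lamp
```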
  • if it is determined in step S136 that, in the collation using only some of the extracted feature points, the visitor's data matches the face feature data of one of the registrants (S136: YES), that registrant is identified as a visitor candidate (S137).
  • the person is identified only as a candidate because the match is based on a part of the facial feature points, so the reliability of the result is not as high as when all of the feature points are used.
  • the feature extraction error process ends, the process returns to the recognition determination process shown in FIG. 6, and further returns to the main process shown in FIG.
  • step S2 it is determined whether or not the detection flag is ON.
  • the process returns to step S2 after step S11, the moving object has already been detected once, and the detection flag is ON (S2: YES).
  • S23 NO
  • the information stored in the RAM 203 is erased, various flags are turned OFF, and the process for the next visitor is performed as described above.
  • If the determination in step S23 is affirmative (S23: YES), the CPU 201 first outputs a ringing tone from the speaker 212 of the master unit 20 and, based on display data generated from the image signal transmitted from the slave unit 10, displays the image captured by the camera 113 on the display monitor 214, for example as shown in FIG. 2 (S201). It is then determined whether or not a visitor or a visitor candidate has been identified, in order to change the display on the operation panel 215 according to the recognition status of the visitor (S202).
  • When a visitor or a visitor candidate has been identified (S202: YES), the CPU 201 displays the registrant mode screen 310 shown in FIG. 2 on the operation panel 215 (S213).
  • The registrant mode screen 310 is provided with a visitor information display area 311, a registration correction button 312, a response button 313, and a reject button 314.
  • The registrant mode screen 310 may be created by inserting information on the visitor or visitor candidate into a template stored in the display screen storage area 222 (see FIG. 3) of the flash ROM 220. The same applies to the other screens described below.
  • The visitor information display area 311 displays information on the visitor or visitor candidate. Specifically, for example, when a visitor has been identified as a registrant, the message "The visitor is the following registrant" is displayed, and when a visitor candidate has been identified, the message "The candidate for the visitor is the following registrant" is displayed. In addition, the registrant's name, relationship, previous visit date, and memo stored as related information in the face feature storage area 221 are read and displayed.
  • FIG. 2 is an example of the registrant mode screen 310 when the registrant with ID "1", among the registrants whose data are stored in the face feature storage area 221 of FIG. 4, has been identified as the visitor. Accordingly, this registrant's related information is displayed: the name "A", the relationship "acquaintance of daddy", the previous visit date "September 1, 2008", and the memo "waiting for daddy to return".
  • The registration correction button 312 is a button for inputting an instruction to shift to a correction screen when the room responder wants to correct the displayed registrant's related information.
  • The response button 313 is a button for inputting an instruction to start a call with the slave unit 10 when the room responder wants to respond to the visitor directly.
  • The reject button 314 is a button for inputting a proxy response instruction when the room responder does not want to respond to the visitor directly.
  • As described above, when a visitor or a visitor candidate has been identified, a real-time image transmitted from the slave unit 10 is displayed on the display monitor 214, and the related information of the visitor or visitor candidate is displayed on the operation panel 215. Therefore, the room responder can easily know whether or not the visitor is a known person and, if so, who the visitor is.
  • When the matching has been performed with only a part of the facial feature points, the visitor is displayed as a candidate. Therefore, even in this case, the room responder can easily guess who the visitor is.
  • When it is determined in step S202 that neither a visitor nor a visitor candidate has been identified (S202: NO), a control signal and data for outputting, from the speaker 112 of the slave unit 10, a notification sound indicating that the connection is being established are transmitted to the slave unit 10 (S203). As a result, for example, a voice message "Connected. Please wait a moment." is output.
  • The caution mode screen 320 is provided with a caution information display area 321, a high-resolution image display area 322, a warning release button 323, and a reject button 314, for example as shown in FIG. 10.
  • The caution information display area 321 displays information prompting caution regarding the visitor. Specifically, for example, the message "Attention! Visitor confirmation is required" is displayed.
  • In the high-resolution image display area 322, the high-resolution still image that was generated when not all feature points could be extracted in the feature extraction error process (S142 in FIG. 8) and stored in the still image storage area 226 of the flash ROM 220 is read and displayed.
  • The warning release button 323 is a button for inputting an instruction to display the response button 313 when the room responder, after checking both the normal-resolution real-time image displayed on the display monitor 214 and the high-resolution still image displayed in the high-resolution image display area 322, wants to respond to the visitor directly.
  • the reject button 314 is as described in connection with the registrant mode screen 310.
  • As described above, when a feature extraction error has occurred, a real-time image transmitted from the slave unit 10 is displayed on the display monitor 214, and a high-resolution still image of the imaging area is displayed on the operation panel 215. Therefore, even if part of the visitor's face is hidden and the visitor cannot be identified by collation with the facial feature data, as shown for example in FIG. 10, the room responder can easily determine, by looking at both screens, whether the visitor really is a registered person and whether or not the visitor is suspicious.
  • If it is determined in step S205 that no feature extraction error has occurred in the recognition determination process (S205: NO), it is determined whether a detection error has occurred (S208). Specifically, if the face area could not be detected and the detection error flag is ON (S111 in FIG. 6), it is determined that a detection error has occurred (S208: YES). In this case, the CPU 201 displays the warning mode screen 330 shown in FIG. 11 on the operation panel 215 (S209).
  • The warning mode screen 330 is provided with a warning information display area 331, a moving image display area 332, a transfer button 333, a warning release button 323, and a reject button 314, for example as shown in FIG. 11.
  • In the warning information display area 331, information warning that the visitor is a person requiring caution is displayed. Specifically, for example, the message "Warning!!! There is a person who needs attention" is displayed.
  • In the moving image display area 332, when the same object has been detected for a predetermined time in the detection error process (S126 in FIG. 7), the moving image stored in the moving image storage area 225 of the flash ROM 220 is reproduced and displayed.
  • the transfer button 333 is a button for inputting an instruction to transfer the audio data input from the slave unit 10 to a predetermined transfer destination telephone.
  • The warning release button 323 and the reject button 314 are as described in relation to the caution mode screen 320 and the registrant mode screen 310, respectively.
  • As described above, when a detection error has occurred, a real-time image transmitted from the slave unit 10 is displayed on the display monitor 214, and the recorded moving image is played back on the operation panel 215. Therefore, even when a person is loitering, as shown for example in FIG. 11, the room responder can confirm whether or not the person is suspicious from both the real-time image from the camera 113 and the recorded video.
  • If it is determined that no detection error has occurred either (S208: NO), the CPU 201 displays an unregistered mode screen (not shown) on the operation panel 215 (S211).
  • The unregistered mode screen is provided with an unregistered person notification area (not shown) in place of the visitor information display area 311 of the registrant mode screen 310 shown in FIG. 2, in which a message indicating that the visitor is an unregistered person is displayed. A response button 313 and a reject button 314 are also provided.
  • In addition, a registration button or the like may be provided for inputting an instruction to shift to a screen for newly registering the visitor's facial feature data and the like in the face feature storage area 221.
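The choice among the four screens described above (S213, S206, S209, S211) follows directly from the determinations of S202, S205, and S208. A minimal sketch of that dispatch follows; the flag names are placeholders, since the embodiment's internal variable names are not given.

```python
# Screen selection mirroring S202 / S205 / S208. Flag names are
# illustrative placeholders, not the embodiment's actual identifiers.

def select_screen(identified, candidate, feature_error, detection_error):
    if identified or candidate:
        return "registrant_mode_310"   # S213: visitor or candidate identified
    if feature_error:
        return "caution_mode_320"      # S206: feature extraction error occurred
    if detection_error:
        return "warning_mode_330"      # S209: face area could not be detected
    return "unregistered_mode"         # S211: recognized but not registered
```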
  • As described above, after the display according to the visitor's recognition status has been performed on the operation panel 215 (S213, S206, S209, or S211), it is determined whether or not a panel operation has been performed, specifically, whether or not an input from the touch pad of the operation panel 215 has been detected (S216). When no panel operation has been performed (S216: NO), it is determined whether or not a predetermined time (for example, 1 minute) has elapsed (S217). For example, the elapsed time may be measured by a timer started when pressing of the call button 114 is detected, and it may be determined whether or not a threshold has been exceeded. While the predetermined time has not elapsed (S217: NO), the CPU 201 returns to the determination of whether or not a panel operation has been performed (S216) and repeats the process.
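The wait loop of S216 and S217 amounts to polling the panel until either an input arrives or the timer started at the call-button press exceeds its threshold. A sketch under the assumption of a simple polling interface; the device's real input mechanism is not specified at this level.

```python
# Sketch of the S216-S217 wait loop: poll the touch panel until an input
# is detected or a preset time (e.g. 1 minute) elapses. The poll function,
# clock, and sleep are stand-ins for the device's actual facilities.
import time

def wait_for_panel_operation(poll_input, timeout_s=60.0,
                             clock=time.monotonic, sleep=time.sleep):
    """Return the panel input, or None if timeout_s elapses first."""
    start = clock()                       # timer started at call-button press
    while clock() - start < timeout_s:    # S217: predetermined time elapsed?
        event = poll_input()              # S216: was the panel touched?
        if event is not None:
            return event
        sleep(0.05)
    return None
```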
  • When a panel operation has been performed (S216: YES), the input instruction is identified from the screen being displayed and the input position detected by the touch pad. First, it is determined whether or not a call start instruction has been given (S221).
  • If a call start instruction has been given (S221: YES), the CPU 201 performs a call start process (S222). Specifically, the CPU 201 separately starts a program for controlling the operation of the master unit 20 related to a call with the slave unit 10. As a result, a communication path between the slave unit 10 and the master unit 20 is formed, and a call can be made between the visitor and the room responder.
  • After the call start process (S222), the visitor process ends, and the process returns to the main process in FIG. 5, where a reset process for the next visitor is performed (S1).
  • If a proxy response instruction has been input by selection of the reject button 314 (S224: YES), the CPU 201 performs a proxy response process (S225). Specifically, a control signal and data for outputting, from the speaker 112 of the slave unit 10, a notification voice indicating that a direct response cannot be given are transmitted to the slave unit 10 (S225). As a result, the slave unit 10 outputs, for example, a voice saying "We are sorry, but we are busy at the moment and cannot respond." Thereafter, the visitor process ends, and the process returns to the main process in FIG. 5, where a reset process for the next visitor is performed (S1).
  • If the instruction input from the operation panel 215 is not a proxy response instruction by selection of the reject button 314 (S224: NO), it is determined whether or not the transfer button 333 has been selected (S227). If the transfer button 333 has been selected (S227: YES), the CPU 201 performs a transfer process (S228). Specifically, a call is made to a telephone number stored in advance as the transfer destination in a predetermined storage area (not shown) of the flash ROM 220.
  • When the corresponding telephone, which may be a mobile phone or a fixed-line phone, answers, it enters a call state with the master unit 20 via the public telephone network 5, and a transferred call between the slave unit 10 and the telephone can then be made via the master unit 20.
  • Thereafter, the visitor process ends, and the process returns to the main process in FIG. 5, where a reset process for the next visitor is performed (S1).
  • If the instruction input from the operation panel 215 is not a transfer instruction by selection of the transfer button 333 (S227: NO), other processing according to the input instruction is performed.
  • For example, when the warning release button 323 is selected on the caution mode screen 320 shown in FIG. 10 or the warning mode screen 330 shown in FIG. 11, processing for displaying the response button 313 (see FIG. 2) in place of the warning release button 323 is performed. After the other processing is completed, the process returns to step S216, and processing according to the operation on the operation panel 215 is performed as described above.
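The branching of S221 through S228 can be summarized as a table from the selected button to the resulting process. The handler names below are placeholders standing in for the processes described above, not the embodiment's routine names.

```python
# Dispatch of panel inputs (S221-S228). Keys follow the button reference
# numerals; the action strings stand in for the described processes.

def handle_panel_input(button):
    actions = {
        "response_313":        "call_start_process_S222",
        "reject_314":          "proxy_response_process_S225",
        "transfer_333":        "transfer_process_S228",
        "warning_release_323": "show_response_button_313",
    }
    return actions.get(button, "other_processing")
```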
  • As described above, in the intercom system 1 of the present embodiment, when a visitor is detected, imaging by the camera 113 of the slave unit 10 is started. A face area is then detected from a still image generated from the image signal acquired from the slave unit 10, and feature points are extracted from the face area.
  • In some cases, however, face recognition cannot be performed by collation with the facial feature data of a plurality of persons stored in advance in the face feature storage area 221 of the flash ROM 220. In many such cases, the room responder cannot recognize the visitor's face even by viewing the image captured by the camera 113.
  • In such a case, an instruction prompting the visitor to act according to the cause hindering recognition is output as voice from the speaker 112, thereby notifying the visitor. The visitor can therefore take an appropriate action for that cause in accordance with the notified instruction.
  • In addition, the screens 310 to 330, which present information prompting different levels of caution depending on the cause hindering recognition, are displayed on the operation panel 215. The room responder can therefore take appropriate precautions for that cause in accordance with the notified information.
  • The cause is mainly either that the visitor's face area cannot be detected or that the visitor's facial features cannot be extracted.
  • In the intercom system 1, the content of the notification to the visitor and the level of caution notified to the room are changed depending on which of these two causes applies, so that appropriate instructions can be given to the visitor and an appropriate warning can be given to the room.
  • Furthermore, the imaging method (for example, the resolution, the angle of view, or still image versus moving image) is changed according to the cause, and the acquired image is displayed on the operation panel 215 of the master unit 20. The room responder can therefore check different images depending on the cause of the face recognition failure when caution is urged, and can take more appropriate precautions.
  • In addition, the facial feature data of the visitor is collated with the facial feature data of the plurality of registrants stored in the face feature storage area 221 of the flash ROM 220, and it is determined whether there is a matching person.
  • Information prompting different levels of caution according to the determination result is notified to the room through the operation panel 215. The room responder can therefore know whether or not the visitor is a person to be wary of, making it easier to avoid a careless response.
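The two-cause policy summarized above can be stated as a small table: the cause hindering recognition selects both the instruction voiced to the visitor and the caution level presented in the room. The message strings below are paraphrases for illustration, not the embodiment's exact wording.

```python
# Cause-dependent notification policy (illustrative paraphrase).
CAUSE_POLICY = {
    # cause                    (voice instruction to visitor,          room caution level)
    "face_area_not_detected": ("please stand in front of the camera", "warning"),
    "features_not_extracted": ("please uncover your face",            "caution"),
}

def notify(cause):
    # Returns what each side of the system would present for this cause.
    instruction, level = CAUSE_POLICY[cause]
    return {"slave_speaker_112": instruction, "operation_panel_215": level}
```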
  • The configuration and processing shown in the above embodiment are merely examples, and it goes without saying that various modifications are possible.
  • In the above embodiment, the facial feature data of the visitor obtained in the recognition determination process (see FIG. 6) is collated with the facial feature data of the registrants stored in the face feature storage area 221 of the flash ROM 220 in an attempt to identify the visitor.
  • the process of step S211 does not need to be performed.
  • the attention mode screen 320 (see FIG. 10) or the warning mode screen 330 (see FIG. 11) is displayed according to the type of error. If the facial feature points have been successfully extracted, the image captured by the camera 113 is simply displayed on the display monitor 214 (S201 in FIG. 9). Even in this case, since the detection of the face area and the extraction of the facial feature points have succeeded, the room responder can recognize the visitor's face by looking at the image displayed on the display monitor 214.
  • Alternatively, a display monitor may be provided in the slave unit 10 to display information prompting different actions depending on the cause when the visitor's face cannot be recognized.
  • In the above embodiment, different screens 310 to 330 are displayed on the operation panel 215 of the master unit 20 depending on the reason why the visitor's face cannot be recognized, thereby notifying the room responder. However, it is not always necessary to notify the room responder by display. Instead, messages prompting different levels of caution, such as "This is an acquaintance of daddy", "Please check the image carefully", or "Be wary, as this seems to be a suspicious person", may be output as voice from the speaker 212 of the master unit 20. Further, the warning lamp 216 may blink at different intervals for each caution level, or a plurality of warning lamps 216 with different emission colors may be provided so that a lamp of a different color is lit for each level.
  • In the above embodiment, a moving object in front of the slave unit 10 is detected by the human sensor 115, but the human sensor 115 does not necessarily have to be provided in the slave unit 10. Instead, a moving object may be detected from changes in the images captured by the camera 113.
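The modification above, detecting a moving object from image change rather than with the human sensor 115, could be realized by simple frame differencing. A minimal sketch follows, with grayscale frames represented as nested lists; both threshold values are illustrative assumptions.

```python
# Illustrative frame-differencing motion detector: count pixels whose
# brightness changed by more than pixel_thresh between two frames, and
# report motion when enough pixels changed. Thresholds are assumptions.

def motion_detected(prev_frame, cur_frame, pixel_thresh=25, count_thresh=4):
    changed = 0
    for prev_row, cur_row in zip(prev_frame, cur_frame):
        for p, c in zip(prev_row, cur_row):
            if abs(p - c) > pixel_thresh:
                changed += 1
    return changed >= count_thresh
```

A production system would typically add noise filtering and background modeling, but the principle of replacing the dedicated sensor with image change is the same.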
  • In the above embodiment, the master unit 20 includes the flash ROM 220, and voice is output from the slave unit 10 by transmitting the instruction voice data and the like stored therein to the slave unit 10. However, the instruction voice data does not necessarily have to be stored in the master unit 20; it may be stored in the slave unit 10 by providing the slave unit with a flash ROM. In that case, only an instruction specifying the data is transmitted from the master unit 20 to the slave unit 10, and the CPU 101 of the slave unit 10 outputs the voice in accordance with the instruction.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Interconnected Communication Systems, Intercoms, And Interphones (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Collating Specific Patterns (AREA)
  • Image Input (AREA)
  • Burglar Alarm Systems (AREA)
  • Alarm Systems (AREA)

Abstract

Provided is an intercom system that includes a sub-unit which is placed outside a house, and a base unit which is placed inside the house. The sub-unit includes an imaging means that images a predetermined area in front of the sub-unit and outputs image information about the image of the area, a visitor detection means that detects a visitor, and a sub-unit presenting means that presents information to the visitor. The base unit includes an image acquisition means that acquires the image information when a visitor is detected, a recognition determination means that determines, from the acquired image information, whether the face of the visitor can be recognized, a sub-unit presenting control means that controls, if it is determined that the face of the visitor cannot be recognized, the sub-unit presenting means to allow the sub-unit presenting means to present information representing a message to prompt the visitor to take another action depending on why the recognition of the visitor's face is hampered, a base-unit presenting means that presents information to a person in the house, and a first base-unit presentation control means that controls the base-unit presenting means to transmit information for giving warning of a level which varies according to why the recognition of the visitor's face is hampered.

Description

Intercom system
The present invention relates to an intercom system, and more specifically, to an intercom system including a slave unit installed outdoors and a master unit installed indoors.
Conventionally, an interphone system including a slave unit installed outside a house and a master unit installed indoors is known. Among such interphones, there are interphones that can capture an image of a visitor with a camera provided in the slave unit and display the captured image on a monitor provided in the master unit. The main purpose of such an intercom system is to enable a resident in the room to look at the visitor's face displayed on the monitor, confirm who the visitor is, and respond appropriately. Accordingly, some interphones ask the visitor for some action when an image from which the resident can recognize the visitor's face cannot be acquired. For example, Patent Literature 1 discloses an intercom device that, when a normal face cannot be recognized from the captured image, outputs from the speaker of the slave unit a voice instructing the visitor to move to a predetermined position.
JP 2005-92262 A
The intercom device described in Patent Literature 1 always gives the same instruction to the visitor even though there are multiple possible causes for a face not being recognizable from the captured image. Therefore, even if the visitor follows the instruction, the response may not be an appropriate one. Moreover, if the visitor is deliberately acting so that his or her face is not recognized, the visitor may not follow the instruction at all. When the visitor's face cannot be recognized, the resident may end up responding without any caution even if the visitor is a malicious person.
The present invention has been made to solve the above problems, and an object thereof is to provide an intercom system capable of giving the visitor appropriate instructions for dealing with the situation, and of prompting the resident to take the necessary caution, when an image from which the visitor's face can be recognized cannot be acquired.
According to the present invention, there is provided an intercom system comprising a slave unit installed outdoors and a master unit connected to the slave unit and installed indoors, wherein the slave unit comprises: imaging means for imaging a predetermined imaging range in front of the slave unit and outputting captured image information, which is information on an image of the imaging range; visitor detection means for detecting a visitor; and slave-unit notification means for notifying the visitor of information; and wherein the master unit comprises: image acquisition means for acquiring the captured image information output from the imaging means when the visitor is detected by the visitor detection means; recognition determination means for determining, based on the captured image information acquired by the image acquisition means, whether or not the visitor's face can be recognized; slave-unit notification control means for controlling the slave-unit notification means, when the recognition determination means determines that the visitor's face cannot be recognized, to notify the visitor of information prompting different actions depending on the cause hindering recognition; master-unit notification means for notifying information to the room; and first master-unit notification control means for controlling the master-unit notification means to notify information prompting different levels of caution depending on the cause.
FIG. 1 is a block diagram showing the electrical configuration of the intercom system 1.
FIG. 2 is an external front view of the master unit 20 displaying the registrant mode screen 310.
FIG. 3 is an explanatory diagram of the storage areas of the flash ROM 220 of the master unit 20.
FIG. 4 is an explanatory diagram showing an example of the data stored in the face feature storage area 221 of the flash ROM 220.
FIG. 5 is a flowchart of the main process of the master unit 20.
FIG. 6 is a flowchart of the recognition determination process executed during the main process of FIG. 5.
FIG. 7 is a flowchart of the detection error process executed during the recognition determination process of FIG. 6.
FIG. 8 is a flowchart of the feature extraction error process executed during the recognition determination process of FIG. 6.
FIG. 9 is a flowchart of the visitor process executed during the main process of FIG. 5.
FIG. 10 is an external front view of the master unit 20 displaying the caution mode screen 320.
FIG. 11 is an external front view of the master unit 20 displaying the warning mode screen 330.
Hereinafter, an embodiment of an intercom system 1 embodying the present invention will be described with reference to the drawings. These drawings are used to explain technical features that the present invention can adopt; the device configurations, the flowcharts of the various processes, and the like described herein are merely illustrative examples and are not intended to be limiting.
First, with reference to FIGS. 1 to 4, the overall configuration of the intercom system 1 according to the present embodiment and the configurations of the slave unit 10 and the master unit 20, which are its components, will be described in order.
The schematic configuration of the intercom system 1 will be described with reference to FIG. 1. As shown in FIG. 1, the intercom system 1 of the present embodiment includes a slave unit 10 and a master unit 20 connected via a signal line 30. In a house, company, building, or the like, the slave unit 10 is installed outdoors and the master unit 20 is installed indoors. Since a call can be made between the slave unit 10 and the master unit 20, a person in the room (hereinafter referred to as a room responder) can respond to a visitor outside without opening the entrance.
The slave unit 10 will be described with reference to FIG. 1. First, the physical configuration of the slave unit 10 will be described. Although not shown in detail, the slave unit 10 has a substantially rectangular parallelepiped housing, and a microphone 111, a speaker 112, a camera 113, a call button 114, and an infrared LED illumination 117 are provided on the front face of the housing (the surface facing the visitor).
The electrical configuration of the slave unit 10 will be described with reference to FIG. 1. As shown in FIG. 1, the slave unit 10 includes a CPU 101, a ROM 102, a RAM 103, a microphone 111, a speaker 112, a camera 113, a call button 114, a human sensor 115, an infrared LED illumination 117, and a communication device 150, all of which are interconnected by a bus.
The CPU 101 controls the entire slave unit 10. The ROM 102 stores programs necessary for the basic operation of the slave unit 10 and setting values for those programs. The CPU 101 controls the operation of the slave unit 10 in accordance with the programs stored in the ROM 102. The RAM 103 is a storage device for temporarily storing various data.
The microphone 111 is a device that converts the input voice of a visitor into a voice signal and outputs it to the master unit 20 via the communication device 150. The speaker 112 is a device that converts a voice signal input from the master unit 20 into voice and outputs it.
The camera 113 is, for example, a well-known CCD camera. The camera 113 captures a predetermined imaging range in front of the slave unit 10 and outputs an image signal of the captured image to the master unit 20 via the communication device 150. The predetermined imaging range is set in advance so as to include, for example, the area where the face of a visitor standing facing the front of the slave unit 10 is expected to be located when imaging at the normal magnification. The camera 113 of the present embodiment is capable of imaging at different resolutions and at different magnifications, that is, at different angles of view using a zoom function, and of capturing both still images and moving images. The call button 114 is a button with which a visitor calls the room responder. When the call button 114 is pressed, a call signal is transmitted to the master unit 20 via the communication device 150.
The human sensor 115 is a sensor that detects the movement of an object within a predetermined area in front of the slave unit 10. As the human sensor 115, for example, an infrared sensor that emits infrared light toward an object and detects the object based on changes in the amount of reflected infrared light received can be employed. Alternatively, a human sensor using ultrasonic waves or the like may be employed as the human sensor 115. In the present embodiment, the human sensor 115 constantly performs detection of moving objects and transmits a signal indicating the result to the master unit 20 via the first communication device 250.
The infrared LED illumination 117 includes an illuminance sensor that detects illuminance, and lights up to illuminate the vicinity of the slave unit 10 when the detected illuminance is lower than a predetermined threshold. The communication device 150 is a device that transmits and receives various signals, including control signals, image signals, and voice signals, to and from the master unit 20 via the signal line 30.
Next, the configuration of the master unit 20 will be described with reference to FIGS. 1 to 4. First, the physical configuration of the master unit 20 will be described with reference to FIG. 2. The master unit 20 has a substantially rectangular parallelepiped housing 205. A microphone 211 and a speaker 212 are provided at the lower left of the front of the housing 205. A display monitor 214 is provided at the left center of the front of the housing 205, and a warning lamp 216 is provided at the upper left. An operation panel 215 is provided on the right half of the front of the housing 205.
 マイク211は、室内対応者の音声を音声信号に変換し、第1通信装置250を介して子機10に出力する機器である。スピーカ212は、子機10から入力された音声信号を音声に変換して出力する機器である。表示モニタ214は、例えば、液晶パネルと駆動回路を備えた液晶モニタである。表示モニタ214は、子機10のカメラ113によって撮像された撮像領域の画像が表示される表示装置である。警告ランプ216は、例えば、LEDランプである。警告ランプ216は、後述するように、来訪者が特定できない場合に、室内対応者に警戒を促すために点灯される。 The microphone 211 is a device that converts the voice of the room responder into a voice signal and outputs the voice signal to the slave unit 10 via the first communication device 250. The speaker 212 is a device that converts an audio signal input from the child device 10 into sound and outputs the sound. The display monitor 214 is a liquid crystal monitor including a liquid crystal panel and a drive circuit, for example. The display monitor 214 is a display device that displays an image of the imaging area captured by the camera 113 of the slave unit 10. The warning lamp 216 is, for example, an LED lamp. As will be described later, the warning lamp 216 is turned on in order to alert a room responder when a visitor cannot be identified.
 操作パネル215は、例えば、液晶パネル、液晶パネルを駆動する駆動回路、および入力位置を検知可能な入力装置であるタッチパッドが少なくとも一部に内蔵された表示装置である。操作パネル215には、例えば、図2に示すように、来訪者に関する情報を表示する来訪者情報表示領域311、および、室内対応者が指示を入力する各種の操作ボタンが表示される。操作ボタンには、例えば、図2に示す登録修正ボタン312、対応ボタン313、および拒否ボタン314がある。室内対応者が指で操作ボタンに触れると、その位置がタッチパッドにより検知され、対応する情報が出力される。本実施形態では、操作パネル215には、来訪者の認識状況に応じて異なる表示が行われるが、詳細については後述する。 The operation panel 215 is a display device in which, for example, a liquid crystal panel, a drive circuit that drives the liquid crystal panel, and a touch pad that is an input device capable of detecting an input position are incorporated at least in part. For example, as shown in FIG. 2, the operation panel 215 displays a visitor information display area 311 for displaying information related to visitors, and various operation buttons for the room attendant to input instructions. The operation buttons include, for example, a registration correction button 312, a corresponding button 313, and a rejection button 314 shown in FIG. When the indoor person touches the operation button with a finger, the position is detected by the touch pad, and the corresponding information is output. In the present embodiment, a different display is performed on the operation panel 215 depending on the recognition status of the visitor. Details will be described later.
The electrical configuration of the master unit 20 will be described with reference to FIG. 1. As shown in FIG. 1, the master unit 20 includes a CPU 201, a ROM 202, a RAM 203, a microphone 211, a speaker 212, a display monitor 214, an operation panel 215, a warning lamp 216, a flash ROM 220, a first communication device 250, and a second communication device 260, all of which are connected to one another by a bus.
The CPU 201 controls the entire master unit 20. The ROM 202 stores programs necessary for causing the master unit 20 to execute various processes, including the main process described later, and setting values therefor. The CPU 201 controls the operation of the master unit 20 in accordance with the programs stored in the ROM 202. The RAM 203 is a storage device for temporarily storing various data.
The microphone 211, speaker 212, display monitor 214, operation panel 215, and warning lamp 216 are as described above. The first communication device 250 is a device that transmits and receives various signals, including control signals, image signals, and audio signals, to and from the slave unit 10 via the signal line 30. The second communication device 260 is connected to the public telephone network 5 via the telephone line 50, and is a device that transfers a call with the slave unit 10 to an external telephone through the public telephone network 5.
The flash ROM 220 will be described with reference to FIGS. 3 and 4. The flash ROM 220 is a nonvolatile semiconductor memory. As shown in FIG. 3, the flash ROM 220 includes a plurality of storage areas including, for example, a face feature storage area 221, a display screen storage area 222, a notification audio storage area 223, a setting information storage area 224, a moving image storage area 225, and a still image storage area 226.
In the face feature storage area 221, face feature data, which is data indicating the facial features of a plurality of persons, and related information, which is information about those persons, are stored in association with each other. As shown in FIG. 4, the face feature storage area 221 includes, for example, an ID column, a face feature data column, a name column, a relationship column, a previous visit date column, and a memo column.
In the ID column, an ID, which is unique information for identifying each data entry, is stored. In the face feature data column, numerical data indicating the positions and shapes of, for example, the eyebrows, eyes, nose, and mouth are stored as face feature data. In the present embodiment, it is assumed that feature point data corresponding to the eyes, nose, and mouth are adopted as the face feature data. In the name column, the name of the person from whom the face feature data was extracted (hereinafter referred to as a registrant) is stored. In the relationship column, information indicating the relationship between the registrant and the user of the master unit 20 is stored. The user of the master unit 20 is, for example, a resident if the intercom system 1 is installed in a house, or an employee if the intercom system 1 is installed in a company. Therefore, for example, information such as a resident's relative, friend, or neighbor, or a company's employee, customer, or regular vendor is stored as the information indicating the relationship.
In the previous visit date column, the date on which the registrant last visited the house or company where the intercom system 1 is installed is stored. In the memo column, a note to be presented to the indoor responder when the registrant visits is stored. The name, relationship, previous visit date, and memo stored in the face feature storage area 221 in association with the face feature data constitute the related information of the registrant. Of the information stored in the face feature storage area 221, all items other than the previous visit date are registered by the user of the master unit 20 as appropriate. The previous visit date, on the other hand, is information that is automatically updated when the registrant is recognized as having visited, as will be described later.
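A record of the face feature storage area 221 can be modeled as follows. This is a minimal sketch; the field names and types are illustrative assumptions, and the embodiment stores the data in flash ROM rather than as Python objects.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class RegistrantRecord:
    """One illustrative entry of the face feature storage area 221."""
    id: int                      # unique ID identifying the entry
    face_features: list          # numeric feature-point data for eyes, nose, mouth
    name: str                    # registrant's name
    relationship: str            # e.g. "relative", "friend", "customer"
    previous_visit: Optional[date] = None  # auto-updated on each recognized visit
    memo: str = ""               # note shown to the indoor responder when the registrant visits

    def mark_visited(self) -> None:
        # Corresponds to the automatic update of the previous visit date (cf. S108):
        # all other fields are entered by the user, but this one is set by the system.
        self.previous_visit = date.today()
```

The `mark_visited` helper reflects the distinction drawn above: every field except `previous_visit` is maintained manually, while `previous_visit` is written only by the recognition process.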
In the display screen storage area 222 shown in FIG. 3, templates of the screens to be displayed on the operation panel 215 of the master unit 20 are stored. Although details will be described later, the operation panel 215 displays, for example, as shown in FIG. 2, a screen having information about the visitor and operation buttons with which the user of the master unit 20 inputs various instructions. In the present embodiment, since there are a plurality of types of screens displayed on the operation panel 215, templates for the plurality of display screens are stored in the display screen storage area 222.
In the notification audio storage area 223, audio data of the sounds to be output from the speaker 212 of the master unit 20 or the speaker 112 of the slave unit 10 is stored. The stored audio data includes, for example, audio data of voice messages, output from the speaker 112 of the slave unit 10, that prompt the visitor to take a predetermined action.
In the setting information storage area 224, various setting information used in the processes described later is stored. The stored setting information includes, for example, a matching condition, the resolution used when the camera 113 of the slave unit 10 captures images, and the magnification. The matching condition is information indicating whether matching of face feature data is performed when at least one of the feature point data corresponding to the eyes, nose, and mouth is available, or only when the feature point data for all of the eyes, nose, and mouth are available. In the present embodiment, two values are stored as the resolution: the normal resolution, which is the default, and a resolution higher than the normal resolution. As the magnification, three values are stored: the normal magnification, which is the default, a low magnification lower than the normal magnification, and a high magnification higher than the normal magnification.
In the moving image storage area 225, moving images of the imaging area are stored. Although details will be described later, in the present embodiment, when the human sensor 115 detects, over a predetermined period of time, an object for which no face area can be detected, a moving image captured by the camera 113 is stored in the moving image storage area 225.
The processes executed in the master unit 20 of the intercom system 1 will now be described with reference to FIGS. 5 to 11. The processes described below are executed by the CPU 201 in accordance with the programs stored in the ROM 202.
The main process shown in FIG. 5 starts when the master unit 20 is powered on, is repeated continuously while the power is on, and ends when the power is turned off. First, a reset process is performed (S1). Specifically, all information stored in the RAM 203 is erased, and all flags stored in the flag storage area (not shown) of the RAM 203 are turned OFF.
In the present embodiment, the flags turned OFF in step S1 are a detection flag, a detection error flag, and a feature extraction error flag. The detection flag indicates whether a moving object has been detected by the human sensor 115: ON indicates that a moving object has been detected, and OFF indicates that none has been detected. The detection error flag indicates whether a face area has been detected: ON indicates that the face area could not be detected, that is, a detection error occurred, and OFF indicates that the face area was detected. The feature extraction error flag indicates whether all the facial feature points have been extracted: ON indicates that not all of the facial feature points could be extracted, that is, a feature extraction error occurred, and OFF indicates that all the facial feature points were extracted.
Subsequently, the CPU 201 determines whether the detection flag is ON (S2). In the first iteration, the detection flag is OFF (S2: NO). In this case, the CPU 201 determines whether a signal indicating that the human sensor 115 of the slave unit 10 has detected a moving object, that is, an object presumed to be a visitor, has been input from the slave unit 10 to the master unit 20 (S3). While no moving object is detected, the CPU 201 stands by (S3: NO, S2: NO).
When a signal indicating that the human sensor 115 has detected a moving object is input to the master unit 20 (S3: YES), the detection flag turned OFF in step S1 is turned ON (S4). Subsequently, the first communication device 250 transmits a control signal for causing the camera 113 to start imaging, together with data on predetermined imaging conditions, to the slave unit 10 via the signal line 30 (S5). The data on the predetermined imaging conditions transmitted at this time includes the normal resolution (the default resolution) and the normal magnification (the default magnification) stored in the setting information storage area 224 of the flash ROM 220.
In the slave unit 10, which has received the control signal and the data on the predetermined imaging conditions from the master unit 20, the resolution and magnification of the camera 113 are set according to the received normal resolution and normal magnification. In the present embodiment, after the slave unit 10 receives the imaging start control signal in step S5, the camera 113 performs imaging continuously. The image signal output from the camera 113 is transmitted by the communication device 150 to the master unit 20 via the signal line 30. In the master unit 20, a process of converting the image signal received by the first communication device 250 into data displayable on the display monitor 214 is performed according to a separately executed program. That is, the image captured by the camera 113 of the slave unit 10 becomes displayable on the display monitor 214 in real time.
After the instruction to start imaging (S5), it is determined whether a visitor or a visitor candidate has already been identified (S7). In the first iteration, no visitor or visitor candidate has been identified yet (S7: NO). In that case, a recognition determination process is performed (S10 and FIG. 6). The recognition determination process will be described below with reference to FIGS. 6 to 8.
As shown in FIG. 6, the CPU 201 first acquires one frame of the image signal transmitted from the slave unit 10 and received by the first communication device 250, and generates still image data (S100). Based on the generated still image data, it is determined whether a face area can be detected (S101). Any known method may be employed for detecting the face area; for example, a method of matching against face patterns stored in advance, or a method of detecting a skin-color region, can be employed. If a face area can be detected from the still image data (S101: YES), the face area is detected (S102).
Subsequently, it is determined whether facial feature points can be extracted (S103). Here, it is determined that the facial feature points can be extracted only when the feature points corresponding to all of the eyes, nose, and mouth can be extracted (S103: YES). In this case, the feature points of the eyes, nose, and mouth are extracted from the face area, and numerical data indicating their positions and shapes are obtained as the visitor's face feature data (S104).
The face feature data of the registrants are read in order from the face feature storage area 221 of the flash ROM 220 and matched against the visitor's face feature data (S105). If, as a result of the matching, it is determined that the current visitor's face feature data does not match that of any registrant (S106: NO), the visitor is determined to be an unregistered person (S109). The recognition determination process then ends, and the process returns to the main process shown in FIG. 5.
On the other hand, if it is determined as a result of the matching that the visitor's face feature data matches the face feature data of one of the registrants (S106: YES), the visitor is determined to be the registrant whose face feature data matched (S107). Then, the previous visit date stored in the face feature storage area 221 as the related information of the registrant identified as the visitor is updated to the current date (S108). As a result, when the same registrant next visits, the latest visit date can be displayed on the registrant mode screen 310 (see FIG. 2) described later. After the date is updated, the recognition determination process ends, and the process returns to the main process shown in FIG. 5.
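The branching of the recognition determination process (S100 to S109) can be summarized as follows. This is a minimal sketch: `detect_face`, `extract_features`, and `matches` are hypothetical helpers standing in for the detection, extraction, and matching methods left open above, and the registrant records are represented as plain dictionaries.

```python
def recognize_visitor(frame, registrants, detect_face, extract_features, matches):
    """Return a (status, registrant) pair mirroring S101-S109.

    detect_face(frame) -> face region or None; extract_features(region) ->
    the visitor's feature data or None; matches(a, b) -> bool.
    """
    # S101/S102: face area detection
    region = detect_face(frame)
    if region is None:
        return ("detection_error", None)          # -> detection error process (S120)
    # S103/S104: feature point extraction (all of eyes, nose, mouth required)
    features = extract_features(region)
    if features is None:
        return ("feature_extraction_error", None)  # -> feature extraction error process (S130)
    # S105/S106: match against each registrant's stored face feature data in order
    for registrant in registrants:
        if matches(features, registrant["face_features"]):
            return ("registered", registrant)      # S107 (previous visit date updated in S108)
    return ("unregistered", None)                  # S109
```

Updating the previous visit date (S108) and the subsequent error-handling branches are performed by the caller depending on the returned status.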
If it is determined in step S101 that no face area could be detected from the generated still image (S101: NO), the failure to recognize the visitor's face is attributed to the failure to detect a face area. Accordingly, on the assumption that a detection error has occurred, the detection error flag stored in the flag storage area of the RAM 203 is turned ON (S111). Then, a detection error process is performed (S120 and FIG. 7). The detection error process is a process that prompts the visitor to take a predetermined action so that the face area can be detected.
As shown in FIG. 7, in the detection error process, it is determined whether a predetermined time has elapsed since the human sensor 115 detected the moving object (S121). Specifically, for example, a timer (not shown) may be started when the detection flag is turned ON in step S4 of the main process shown in FIG. 5 to measure the elapsed time, and it may be determined whether the elapsed time exceeds a predetermined threshold. If the predetermined time has not elapsed (S121: NO), the face area may still be detected within the predetermined time, so the detection error process ends as it is. The process returns to the recognition determination process shown in FIG. 6, and then to the main process shown in FIG. 5.
On the other hand, if it is determined that the predetermined time has elapsed since the visitor was detected (S121: YES), it is determined whether the moving objects detected during the predetermined time are the same object (S122). Specifically, for example, for a plurality of images captured by the camera 113 during the predetermined period, histograms indicating the luminance distribution of each image can be generated, and the determination can be made based on the result of comparing these histograms. If it is determined that the moving objects detected during the predetermined time are not the same object (S122: NO), the detection error process ends as it is. The process returns to the recognition determination process shown in FIG. 6, and then to the main process shown in FIG. 5.
If it is determined that the moving objects detected during the predetermined time are the same object (S122: YES), the moving object is a visitor, but the face area may not have been detected because the visitor is not standing at an appropriate position. Accordingly, a control signal and data for outputting, from the speaker 112 of the slave unit 10, a voice instruction prompting the visitor to take the predetermined action for the case where no face area can be detected are transmitted to the slave unit 10 (S123). Specifically, the audio data of the predetermined instruction voice stored in the notification audio storage area 223 of the flash ROM 220 is read, converted into an audio signal, and transmitted to the slave unit 10.
As a result, the slave unit 10 outputs a voice instructing the visitor to stand at a position where the face can be captured by the camera 113, for example, "If you have business here, please stand in front of the intercom." Thus, by following the voice instruction, the visitor can respond appropriately to the recognition error that the face area cannot be detected.
Further, if it is determined that the moving objects detected during the predetermined time are the same object (S122: YES), there is also a possibility that a suspicious person is loitering and watching the premises. Accordingly, in addition to the voice instruction to the visitor in step S123, recording is performed. However, if recording has already been started in a previous detection error process and is still in progress (S125: YES), the detection error process ends as it is, and the process returns to the recognition determination process shown in FIG. 6. If recording has not yet started (S125: NO), recording for a predetermined time (for example, one minute) is started (S126). Specifically, the CPU 201 acquires the image signal transmitted from the slave unit 10, generates moving image data, and stores it in the moving image storage area 225 of the flash ROM 220.
In this way, when the visitor may be a suspicious person, recording a moving image preserves evidence that may make it possible to identify the person later. Thereafter, the CPU 201 lights the warning lamp 216 (S127) to notify the user of the master unit 20 indoors of the presence of a visitor who cannot be identified. The detection error process then ends. The process returns to the recognition determination process shown in FIG. 6, and then to the main process shown in FIG. 5.
A face area detection error may be caused by the face being too close to the camera 113 and not fitting within the imaging area. Therefore, in step S126, instead of recording, the slave unit 10 may be instructed to capture a still image at the low magnification (wide angle), and a low-magnification still image may be generated from the obtained image signal. By changing the magnification in this way, the indoor responder may become able to recognize the visitor's face by looking at the image.
If it is determined in step S103 of the recognition determination process shown in FIG. 6 that at least one of the eye, nose, and mouth feature points could not be extracted from the face area (S103: NO), the failure to recognize the visitor's face is attributed to the failure to extract the facial feature points. Accordingly, on the assumption that a feature extraction error has occurred, the feature extraction error flag stored in the flag storage area of the RAM 203 is turned ON (S112). Then, a feature extraction error process is performed (S130 and FIG. 8). The feature extraction error process is a process that prompts the visitor to take a predetermined action so that all the facial feature points can be extracted.
As shown in FIG. 8, in the feature extraction error process, first, which region within the face area is hidden is identified (S131). For example, depending on whether the feature points not extracted in the recognition determination process correspond to the eyes, nose, or mouth, it can be identified whether the hidden region is the eye region, the nose region, or the mouth region. Alternatively, the hidden region may be identified based on the luminance value of each pixel in the face area. As for the cases where a region is hidden: when the eye region is identified as hidden, it is assumed that the visitor is wearing sunglasses, for example; when the nose region and mouth region are hidden, it is assumed that the visitor is wearing a mask.
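The first of the two identification methods of S131, inferring the hidden region from which feature points failed to extract, can be sketched as follows. The region names and return strings are illustrative assumptions.

```python
def hidden_regions(extracted_points):
    """Sketch of S131: infer which part of the face is covered from the set of
    feature-point names that were successfully extracted."""
    expected = {"eyes", "nose", "mouth"}
    missing = expected - set(extracted_points)
    if missing == {"eyes"}:
        # Only the eye points failed: sunglasses are assumed.
        return "eye region hidden (sunglasses assumed)"
    if missing == {"nose", "mouth"}:
        # Nose and mouth points failed together: a mask is assumed.
        return "nose and mouth regions hidden (mask assumed)"
    if not missing:
        return "no region hidden"
    return "regions hidden: " + ", ".join(sorted(missing))
```

The result determines which instruction voice is selected later in the process (for example, asking the visitor to uncover the eyes).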
Subsequently, it is determined whether partial matching has been set (S132). Here, partial matching means that, when not all of the feature points corresponding to the eyes, nose, and mouth could be extracted, the extracted portion of the data is matched against the corresponding portion of the registrants' face feature data. The matching condition, which is the information indicating whether partial matching is performed, is set by the user and stored in the setting information storage area 224 of the flash ROM 220, as described above.
If the matching condition stored in the setting information storage area 224 indicates that partial matching is not to be performed (S132: NO), a control signal and data for outputting, from the speaker 112 of the slave unit 10, a voice instruction prompting the visitor to take the predetermined action for the case where the facial feature points cannot be extracted are transmitted to the slave unit 10 (S133). Specifically, the audio data of the predetermined instruction voice stored in the notification audio storage area 223 of the flash ROM 220 is read, converted into an audio signal, and transmitted to the slave unit 10. In the slave unit 10, for example, when the eye region is identified as hidden, a voice instructing the visitor to show the hidden region is output, such as "If you have business here, please uncover your eyes." Thus, by following the voice instruction, the visitor can respond appropriately to the recognition error that the facial feature points cannot be extracted.
 続いて、子機10に対して高解像度で撮像する指示が送信される(S141)。具体的には、CPU201は、フラッシュROM220の設定情報記憶エリア224から読み出した高解像度をカメラ113の解像度として設定し、撮像させるための制御信号を、子機10に対して送信する。ここで高解像度での撮像を行うのは、通常解像度では顔の特徴点が抽出できないが、高解像度になれば、室内対応者が目で見て来訪者の顔を認識できる可能性があるからである。そして、CPU201は、子機10から送信された1フレームの画像信号を取得して高解像度の静止画を生成し、フラッシュROM220の静止画記憶エリア226に記憶させる(S142)。 Subsequently, an instruction for imaging at a high resolution is transmitted to the slave unit 10 (S141). Specifically, the CPU 201 sets the high resolution read from the setting information storage area 224 of the flash ROM 220 as the resolution of the camera 113 and transmits a control signal for imaging to the slave unit 10. The reason why high-resolution imaging is performed here is that facial feature points cannot be extracted at normal resolution, but if the resolution is high, indoor responders may be able to visually recognize the faces of visitors. It is. Then, the CPU 201 acquires one frame of image signal transmitted from the child device 10 to generate a high-resolution still image and stores it in the still image storage area 226 of the flash ROM 220 (S142).
 In step S141, instead of instructing high-resolution imaging, the CPU 201 may instruct imaging at the high magnification (zoom) stored in the setting information storage area 224 of the flash ROM 220. When facial feature points cannot be extracted, the cause may be that the visitor is too far from the camera 113. Changing the magnification in this way may therefore allow the resident to recognize the visitor's face by sight.
 Thereafter, the CPU 201 transmits to the slave unit 10 an instruction to return the resolution of the camera 113 to the normal resolution (S143). Specifically, the normal resolution read from the setting information storage area 224 of the flash ROM 220 is set as the resolution of the camera 113, and a control signal for image capture is transmitted to the slave unit 10. The warning lamp 216 is then turned on (S144) to notify the user of the master unit 20 indoors of the presence of a visitor who could not be identified. The feature extraction error process then ends. Processing returns to the recognition determination process shown in FIG. 6, and further to the main process shown in FIG. 5.
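The sequence of steps S133 and S141 to S144 described above might be sketched as follows. This is only an illustration of the described flow, not an implementation from the patent; the class, the function names, and the string-based "frame" representation are all hypothetical.

```python
# Illustrative sketch of the feature-extraction error flow (S133, S141-S144).
# SlaveUnitStub stands in for the outdoor slave unit 10; all names are assumed.

class SlaveUnitStub:
    def __init__(self):
        self.played = []          # voice prompts sent to speaker 112
        self.captured = []        # resolutions actually used for capture

    def play_voice(self, message):
        self.played.append(message)

    def capture(self, resolution):
        self.captured.append(resolution)
        return f"frame@{resolution}"   # one frame of image data


def feature_extraction_error_process(slave, hidden_region, still_image_store):
    # S133: prompt the visitor to uncover the hidden facial region.
    slave.play_voice(f"Please uncover your {hidden_region} and face the camera.")
    # S141-S142: capture one high-resolution frame and store it as a still image.
    still_image_store.append(slave.capture("high"))
    # S143: return the camera to normal resolution.
    slave.capture("normal")
    # S144: light the warning lamp on the master unit (represented by a flag).
    return {"warning_lamp": True}
```

A zoom-based variant of S141 (as suggested in the text) would simply substitute a magnification setting for the resolution argument.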
 If it is determined in step S132 that the matching condition stored in the setting information storage area 224 indicates that partial matching is to be performed (S132: YES), matching against the registrants' facial feature data is performed using only the subset of feature points that could be extracted (S135). For example, during the spring pollen season, many people wear masks as a countermeasure against hay fever. It would be harsh to instruct such a visitor to remove the mask and show the nose and mouth. In such a case, therefore, the matching condition can be set in advance so that partial matching is performed.
 In partial matching, for example, when the feature points corresponding to the eyes cannot be extracted and only the feature points corresponding to the nose and mouth can be extracted, the data obtained from the nose and mouth feature points is matched against the nose and mouth portions of the registrants' facial feature data. If the data matches none of the registrants (S136: NO), then, as described above, an instruction to capture an image at high resolution is transmitted to the slave unit 10 (S141), the high-resolution still image is stored in the still image storage area 226 (S142), and the warning lamp 216 is turned on (S144). The feature extraction error process then ends, processing returns to the recognition determination process shown in FIG. 6, and further to the main process shown in FIG. 5.
 If, in step S136, matching against the facial feature data using only the subset of extracted feature points results in a determination that the data matches one of the registrants' facial feature data (S136: YES), that registrant is identified as a candidate for the visitor (S137). The registrant is identified only as a "candidate" here because the matching result is based on only a portion of the facial feature points, so its reliability is not as high as when all the feature points are used. The feature extraction error process then ends, processing returns to the recognition determination process shown in FIG. 6, and further to the main process shown in FIG. 5.
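Partial matching (S135 to S137) can be sketched as below. The patent does not specify a feature representation or distance metric, so the coordinate-based features, the average-distance score, and the threshold here are purely illustrative assumptions.

```python
# Hypothetical sketch of partial matching: only the regions that were actually
# extracted from the visitor's face are compared against the corresponding
# portion of each registrant's stored feature data.

def partial_match(extracted, registrants, threshold=1.0):
    """extracted: e.g. {"nose": (x, y), "mouth": (x, y)} when the eye points
    are hidden.  registrants: id -> full per-region feature dict."""
    for reg_id, features in registrants.items():
        # Compare only the regions present in 'extracted' (the partial set).
        dist = 0.0
        for region, (x, y) in extracted.items():
            rx, ry = features[region]
            dist += ((x - rx) ** 2 + (y - ry) ** 2) ** 0.5
        if dist / len(extracted) < threshold:
            return reg_id   # S137: identified only as a *candidate*
    return None             # S136: NO -> high-resolution capture (S141) follows
```

A match found this way is deliberately reported as a candidate, mirroring the reduced reliability noted in the text.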
 In the main process shown in FIG. 5, after the recognition determination process (S10), it is determined whether the call button 114 of the slave unit 10 has been pressed by the visitor (S11). Specifically, it is determined whether a call signal transmitted from the slave unit 10 has been received. If the call button 114 has not been pressed by the visitor (S11: NO), processing returns to step S2 so that processing continues until the visitor presses the call button 114.
 In the subsequent step S2, it is determined whether the detection flag is ON. When processing returns to step S2 after step S11, a moving object has already been detected once, so the detection flag is ON (S2: YES). It is then determined whether a moving object is detected by the human presence sensor 115 (S23). If no moving object is detected here (S23: NO), this means that the once-detected moving object left the detection range of the human presence sensor 115 while the subsequent processing was being performed. In this case, it is therefore assumed that the visitor has left, and processing returns to step S1. The information stored in the RAM 203 is erased, the various flags are turned OFF, and processing for the next visitor is performed as described above.
 On the other hand, if a moving object is detected in step S23 (S23: YES), this means that the moving object remains within the detection range, that is, the visitor is still near the slave unit 10. It is then determined whether a visitor or a visitor candidate was already identified in the preceding processing (S7). If one has already been identified, there is no need to perform the recognition determination process again, so processing proceeds directly to step S11, where it is determined whether the call button 114 has been pressed (S11). If, on the other hand, no visitor or visitor candidate has yet been identified (S7: NO), the recognition determination process is performed again (S10). The same processing is repeated until the call button 114 is pressed by the visitor (S2 to S10).
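The loop around steps S2, S7, S10, S11, and S23 amounts to a small state machine: re-run recognition only while no one has been identified, reset if the moving object leaves, and exit to the visitor process on a button press. The following is a minimal sketch of that logic under an assumed event model; none of the names come from the patent.

```python
# One iteration of the main-loop logic (S2/S23/S7/S10/S11), sketched.
# 'recognize' stands in for the whole recognition determination process (S10).

def main_loop_step(state, sensor_detects, button_pressed, recognize):
    """Returns 'reset', 'visitor', or 'continue'."""
    if state["detection_flag"] and not sensor_detects:
        # S23: NO - the moving object left the detection range: start over (S1).
        state.clear()
        state.update({"detection_flag": False, "identified": None})
        return "reset"
    state["detection_flag"] = True            # S2 bookkeeping
    if state["identified"] is None:           # S7: NO - run recognition again (S10)
        state["identified"] = recognize()
    if button_pressed:                        # S11: YES - enter visitor process (S20)
        return "visitor"
    return "continue"
```

Injecting the sensor reading and recognition result as arguments keeps the sketch testable without real hardware.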
 If, after the recognition determination process (S10), the call button 114 has been pressed (S11: YES), the visitor process is performed (S20 and FIG. 9). The details of the visitor process are described below with reference to FIG. 2 and FIGS. 9 to 11.
 As shown in FIG. 9, in the visitor process the CPU 201 first outputs a ring tone from the speaker 212 of the master unit 20 and, based on display data generated from the image signal transmitted from the slave unit 10, displays the image captured by the camera 113 on the display monitor 214, for example as shown in FIG. 2 (S201). It is further determined whether a visitor or a visitor candidate has been identified, in order to present a display on the operation panel 215 according to the visitor's recognition status (S202).
 As described above, when the visitor has been identified as one of the registrants in the recognition determination process (S107 in FIG. 6), when the visitor has been identified as an unregistered person (S109), or when one of the registrants has been identified as a visitor candidate in the feature extraction error process (S137 in FIG. 8), it is determined that a visitor or a visitor candidate has been identified (S202: YES). In this case, the CPU 201 displays the registrant mode screen 310 shown in FIG. 2 on the operation panel 215 (S213).
 As shown in FIG. 2, the registrant mode screen 310 is provided with, for example, a visitor information display area 311, a registration edit button 312, a respond button 313, and a refuse button 314. The registrant mode screen 310 may be created by inserting information about the visitor or visitor candidate into a template stored in the display screen storage area 222 of the flash ROM 220 (see FIG. 3). The same applies to the other screens described below.
 The visitor information display area 311 displays information about the visitor or visitor candidate. Specifically, for example, when the visitor has been identified as a registrant, the message "The visitor is the following registrant" is displayed; when a visitor candidate has been identified, the message "The visitor candidate is the following registrant" is displayed. In addition, the registrant's name, relationship, date of last visit, and memo, stored as related information in the facial feature storage area 221, are read out and displayed.
 FIG. 2 is an example of the registrant mode screen 310 in the case where, among the registrants whose data is stored in the facial feature storage area 221 of FIG. 4, the registrant with ID "1" has been identified as the visitor. Accordingly, this registrant's related information is displayed: the visitor's name "Mr. A", the relationship "Dad's acquaintance", the date of last visit "September 1, 2008", and the memo "Ask them to wait until Dad gets home".
 The registration edit button 312 is a button by which the resident inputs an instruction to move to an edit screen when he or she wants to correct the displayed registrant's related information. The respond button 313 is a button for inputting an instruction to start a call with the slave unit 10 when the resident wants to respond to the visitor directly. The refuse button 314 is a button for inputting an instruction for a proxy response when the resident does not want to respond to the visitor directly.
 In this way, when a visitor or a visitor candidate has been identified, the real-time image transmitted from the slave unit 10 is displayed on the display monitor 214, and the related information of the visitor or visitor candidate is displayed on the operation panel 215. The resident can thus easily tell whether the visitor is a known person and, if so, who that person is, which makes it easy to judge whether the visitor is someone to be wary of. Further, when matching has been performed using only a portion of the facial feature points, the visitor is displayed as a candidate, so the resident can easily infer who the visitor is.
 If, on the other hand, it is determined in step S202 that no visitor or visitor candidate has been identified (S202: NO), a control signal and data for outputting, from the speaker 112 of the slave unit 10, a notification voice indicating that connection is in progress are transmitted to the slave unit 10 (S203). Here, for example, the voice message "Connecting. Please wait a moment." is output.
 Next, it is determined whether a feature extraction error occurred in the recognition determination process (S205). Specifically, if not all of the facial feature points could be extracted and the feature extraction error flag is ON (S112 in FIG. 6), it is determined that a feature extraction error occurred (S205: YES). In this case, the CPU 201 displays the caution mode screen 320 shown in FIG. 10 on the operation panel 215 (S206).
 As shown in FIG. 10, the caution mode screen 320 is provided with, for example, a caution information display area 321, a high-resolution image display area 322, a warning release button 323, and a refuse button 314. The caution information display area 321 displays information urging caution regarding the visitor. Specifically, for example, the message "Caution! The visitor needs to be checked" is displayed. In the high-resolution image display area 322, the high-resolution still image that is generated in the feature extraction error process when not all feature points can be extracted (S142 in FIG. 8) and stored in the still image storage area 226 of the flash ROM 220 is read out and displayed.
 The warning release button 323 is a button by which the resident, after checking the normal-resolution real-time image displayed on the display monitor 214 and the high-resolution still image displayed in the high-resolution image display area 322, inputs an instruction to display the respond button 313 described above when he or she wants to respond to the visitor directly. The refuse button 314 is as described in connection with the registrant mode screen 310.
 In this way, when the visitor's face has been detected but not all of the facial feature points can be extracted, the real-time image transmitted from the slave unit 10 is displayed on the display monitor 214, and a high-resolution still image of the imaging area is displayed on the operation panel 215. Thus, even when, for example as shown in FIG. 10, part of the visitor's face is hidden and the visitor cannot be identified by matching against the facial feature data, the resident can look at the high-resolution still image and easily determine whether the visitor is in fact a registrant and whether the visitor is suspicious.
 If it is determined in step S205 that no feature extraction error occurred in the recognition determination process (S205: NO), it is determined whether a detection error occurred (S208). Specifically, if the face region could not be detected and the detection error flag is ON (S111 in FIG. 6), it is determined that a detection error occurred (S208: YES). In this case, the CPU 201 displays the warning mode screen 330 shown in FIG. 11 on the operation panel 215 (S209).
 As shown in FIG. 11, the warning mode screen 330 is provided with, for example, a warning information display area 331, a video display area 332, a transfer button 333, a warning release button 323, and a refuse button 314. The warning information display area 331 displays information warning that the visitor is a person requiring attention. Specifically, for example, the message "Warning!!! A person requiring attention is present" is displayed. In the video display area 332, the video that is recorded in the detection error process when the same object is detected for a predetermined time (S126 in FIG. 7) and stored in the video storage area 225 of the flash ROM 220 is played back.
 The transfer button 333 is a button for inputting an instruction to transfer the voice data input from the slave unit 10 to a predetermined transfer-destination telephone. The warning release button 323 and the refuse button 314 are as described in connection with the caution mode screen 320 and the registrant mode screen 310, respectively.
 In this way, when a moving object has been detected for the predetermined time but the visitor's face region cannot be detected, the real-time image transmitted from the slave unit 10 is displayed on the display monitor 214, and the recorded video is displayed on the operation panel 215. Thus, even when, for example as shown in FIG. 11, a person is loitering, the resident can use both the real-time image from the camera 113 and the recorded video to check whether the person is suspicious.
 If it is determined in step S208 that no detection error occurred in the recognition determination process (S208: NO), the CPU 201 displays an unregistered mode screen (not shown) on the operation panel 215 (S211). The unregistered mode screen is provided with an unregistered person notification area (not shown) in place of the visitor information display area 311 of the registrant mode screen 310 shown in FIG. 2, and indicates that the visitor is an unregistered person. Like the registrant mode screen 310, it is also provided with the respond button 313 and the refuse button 314. In addition, a registration button or the like may be provided for inputting an instruction to move to a screen for newly registering the visitor's facial feature data and the like in the facial feature storage area 221.
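The branch structure of steps S202, S205, S208, and S211 (which screen the operation panel shows) reduces to a small dispatch. The sketch below only summarizes that decision order; the screen labels follow the reference numerals in the text, and everything else is an illustrative assumption.

```python
# Which operation-panel screen is shown, per the S202/S205/S208 branch order.

def select_screen(identified, feature_error, detection_error):
    if identified:               # S202: YES - registrant or candidate known
        return "registrant_mode_310"
    if feature_error:            # S205: YES - face found, feature points missing
        return "caution_mode_320"
    if detection_error:          # S208: YES - no face region for a set time
        return "warning_mode_330"
    return "unregistered_mode"   # S211: face recognized but not a registrant
```

Note that the order matters: identification takes precedence over either error flag, matching the flowchart described in the text.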
 As described above, after a display according to the visitor's recognition status has been presented on the operation panel 215 (S213, S206, S209, or S211), it is determined whether a panel operation has been performed, specifically, whether input from the touch pad of the operation panel 215 has been detected (S216). If no panel operation has been performed (S216: NO), it is determined whether a predetermined time (for example, one minute) has elapsed (S217). For example, the elapsed time may be measured by a timer started when the press of the call button 114 is detected, and the determination may be made based on whether a threshold has been exceeded. While the predetermined time has not elapsed (S217: NO), the CPU 201 returns to the determination of whether a panel operation has been performed (S216) and repeats the processing.
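The S216/S217 wait can be realized as a polling loop with a timeout measured from the call-button press. The sketch below is one possible realization; the injected clock and input source are assumptions made so the loop can be exercised without hardware.

```python
# Poll for panel input until a timeout (S216/S217), sketched with injected
# dependencies: get_input() returns an operation or None, now() returns seconds.

def wait_for_panel_input(get_input, now, start, timeout=60.0):
    """Returns the panel operation, or None if the timeout expires first."""
    while True:
        op = get_input()
        if op is not None:                 # S216: YES - a panel operation arrived
            return op
        if now() - start >= timeout:       # S217: YES - nobody answered indoors
            return None
```

Returning None corresponds to ending the visitor process and resetting for the next visitor (S1).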
 If the predetermined time has elapsed (S217: YES), it is presumed that no one is indoors and no response will be made, so the visitor process ends as it is and processing returns to the main process of FIG. 5. In this case, in the main process, a reset process is performed for the next visitor (S1).
 If it is determined that input has been made from the operation panel 215 (S216: YES), it is determined, based on the screen being displayed and the input position detected by the touch pad, whether the input instruction is an instruction to start a call by selection of the respond button 313 (see FIG. 2) (S221). If the respond button 313 has been selected (S221: YES), the CPU 201 performs call start processing (S222). Specifically, the CPU 201 separately starts a program for controlling the operation of the master unit 20 relating to calls with the slave unit 10. A speech path is thereby formed between the slave unit 10 and the master unit 20, enabling a call between the visitor and the resident. After the call start processing (S222), the visitor process ends and processing returns to the main process of FIG. 5, where a reset process is performed for the next visitor (S1).
 If the instruction input from the operation panel 215 is not an instruction to start a call by selection of the respond button 313 (S221: NO), it is determined whether it is an instruction for a proxy response by selection of the refuse button 314 (see FIGS. 2, 10, and 11) (S224). If the refuse button 314 has been selected (S224: YES), the CPU 201 performs proxy response processing (S225). Specifically, a control signal and data for outputting, from the speaker 112 of the slave unit 10, a notification voice indicating that a direct response is not possible are transmitted to the slave unit 10 (S225). As a result, the slave unit 10 outputs, for example, the voice message "We are busy right now, so we are sorry but we cannot answer." The visitor process then ends and processing returns to the main process of FIG. 5, where a reset process is performed for the next visitor (S1).
 If the instruction input from the operation panel 215 is not an instruction for a proxy response by selection of the refuse button 314 (S224: NO), it is determined whether it is a transfer instruction by selection of the transfer button 333 (see FIG. 11) (S227). If the transfer button 333 has been selected (S227: YES), the CPU 201 performs transfer processing (S228). Specifically, a call is placed to a telephone number stored in advance as the transfer destination in a predetermined storage area (not shown) of the flash ROM 220. When the corresponding telephone (which may be a mobile phone or a fixed-line phone) answers and a call connection is established with the master unit 20 via the public telephone network 5, a transferred call between the slave unit 10 and the telephone becomes possible via the master unit 20. The visitor process then ends and processing returns to the main process of FIG. 5, where a reset process is performed for the next visitor (S1).
 If the instruction input from the operation panel 215 is not a transfer instruction by selection of the transfer button 333 (S227: NO), other processing according to the input instruction is performed. For example, if the warning release button 323 is selected on the caution mode screen 320 shown in FIG. 10 or the warning mode screen 330 shown in FIG. 11, processing is performed to display the respond button 313 (see FIG. 2) in place of the warning release button 323. After the other processing ends, processing returns to step S216 and, as described above, processing according to operations on the operation panel 215 is performed.
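The chain of checks in S221 to S228 amounts to dispatching the panel input to one of a few handlers. The sketch below mirrors the buttons 313 (respond), 314 (refuse), and 333 (transfer); the handler bodies and the log list are stand-ins, since the patent describes the resulting behavior rather than code.

```python
# Dispatch of a panel operation (S221/S224/S227), sketched.

def handle_panel_input(op, log):
    if op == "respond":        # S221-S222: open the speech path to slave unit 10
        log.append("call_started")
        return "done"
    if op == "refuse":         # S224-S225: play the proxy-response message
        log.append("proxy_response_played")
        return "done"
    if op == "transfer":       # S227-S228: dial the stored forwarding number
        log.append("call_transferred")
        return "done"
    log.append("other:" + op)  # e.g. warning release -> show the respond button
    return "back_to_S216"
```

"done" corresponds to ending the visitor process and resetting (S1); "back_to_S216" corresponds to returning to the panel-input wait.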
 As described above, in the intercom system 1 of this embodiment, when a visitor is detected, imaging by the camera 113 of the slave unit 10 is started. A face region is then detected from a still image generated from the image signal acquired from the slave unit 10, and feature points are extracted from the face region. If the face region cannot be detected, or if the feature points cannot be extracted, face recognition by matching against the facial feature data of the plurality of persons stored in advance in the facial feature storage area 221 of the flash ROM 220 is impossible. Moreover, in such cases, even if the resident looks at the image captured by the camera 113, the visitor's face often cannot be recognized.
 Accordingly, an instruction prompting the visitor to take an action appropriate to the cause hindering recognition is output as voice from the speaker 112, thereby notifying the visitor, who can then follow the notified instruction and respond appropriately to that cause. In addition, the screens 310 to 330, which present information urging different levels of caution according to the cause hindering recognition, are displayed on the operation panel 215, so the resident can take appropriate precautions according to the cause based on the notified information.
 In particular, when the visitor's face cannot be recognized from the acquired image, the cause is usually either that the visitor's face region cannot be detected or that the visitor's facial features cannot be extracted. According to the intercom system 1, the content of the notification to the visitor and the level of caution notified indoors are changed according to which of these two causes applies, so that appropriate instructions can be given to the visitor and appropriate warnings can be given indoors.
 Further, in the intercom system 1, when the visitor's face cannot be recognized, the imaging method of the camera 113 (for example, resolution, angle of view, still image or video) is changed according to the cause, and the acquired image is displayed on the operation panel 215 of the master unit 20. Therefore, when urged to be cautious, the resident can check a different image depending on the cause of the face recognition failure and can take more appropriate precautions.
 Furthermore, in the intercom system 1, the visitor's facial feature data is matched against the facial feature data of the plurality of registrants stored in the facial feature storage area 221 of the flash ROM 220, and it is determined whether there is a matching person. Information urging different levels of caution according to the determination result is then notified indoors through the operation panel 215. The resident can therefore know whether the visitor is a person to be wary of, which makes it easy to avoid a careless response.
 The configurations and processing shown in the above embodiment are examples, and it goes without saying that various modifications are possible. For example, in the above embodiment, when the visitor's facial feature data is obtained in the recognition determination process (see FIG. 6), identification of the visitor is attempted by matching against the registrants' facial feature data stored in the facial feature storage area 221 of the flash ROM 220. However, identification of the visitor by matching facial feature data need not necessarily be performed. That is, in the recognition determination process of FIG. 6, the processing of steps S104 to S109 may be omitted. In this case, the processing of step S7 of the main process shown in FIG. 5 and of steps S202 and S213 of the visitor process shown in FIG. 9 need not be performed; that is, processing proceeds from step S201 directly to step S203. The processing of step S211 may also be omitted.
 In other words, if a feature extraction error or a detection error occurs, the attention mode screen 320 (see FIG. 10) or the warning mode screen 330 (see FIG. 11) is displayed according to the type of error, whereas if detection of the face area and extraction of the facial feature points both succeed, the image captured by the camera 113 is simply displayed on the display monitor 214 (S201 in FIG. 9). Even in this case, because the face area has been detected and the facial feature points extracted, the person responding indoors can recognize the visitor's face by looking at the image shown on the display monitor 214.
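The screen selection just described reduces to a small dispatch. Note one assumption: the text does not state which error maps to which screen, so the pairing below (detection error → warning mode 330, extraction error → attention mode 320) and the function name are illustrative guesses.

```python
def select_screen(face_detected, features_extracted):
    """Choose what the display monitor 214 shows.

    Assumed mapping: a detection error (no face area found) is treated
    as the more serious case and triggers the warning mode screen 330;
    a feature extraction error triggers the attention mode screen 320;
    full success simply shows the camera image (S201 in FIG. 9).
    """
    if not face_detected:
        return "warning_mode_screen_330"
    if not features_extracted:
        return "attention_mode_screen_320"
    return "camera_image"
```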
 In the above embodiment, when the visitor's face cannot be recognized, information prompting a different action depending on the cause is announced by voice from the speaker 112 of the slave unit 10. However, the visitor need not be notified by voice. Instead, the slave unit 10 may be provided with a display monitor on which, when the visitor's face cannot be recognized, information prompting a different action depending on the cause is displayed.
 In the above embodiment, information prompting different levels of caution is conveyed to the person responding indoors by displaying different screens 310 to 330 on the operation panel 215 of the master unit 20, depending on why the visitor's face cannot be recognized. However, the indoor responder need not be notified by a display. Instead, messages prompting different levels of caution, such as "This is an acquaintance of Dad," "Please check the image carefully," or "This appears to be a suspicious person; please be on your guard," may be output as audio from the speaker 212 of the master unit 20. Alternatively, the warning lamp 216 may be made to blink at a different interval for each caution level, or multiple warning lamps 216 with mutually different emission colors may be provided so that a lamp of a different color is lit for each level.
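The alternatives above amount to a per-level notification table. In this sketch the messages follow the examples given in the text, but the level names, blink intervals, and lamp colors are invented for illustration:

```python
# Assumed caution levels and their notifications.  Messages mirror the
# examples in the text; intervals and colors are illustrative only.
ALERT_TABLE = {
    "known": {
        "message": "This is an acquaintance of Dad.",
        "blink_interval_s": None,   # lamp steady, no blinking
        "lamp_color": "green",
    },
    "caution": {
        "message": "Please check the image carefully.",
        "blink_interval_s": 1.0,
        "lamp_color": "yellow",
    },
    "suspicious": {
        "message": "This appears to be a suspicious person; "
                   "please be on your guard.",
        "blink_interval_s": 0.3,    # faster blink for higher alert
        "lamp_color": "red",
    },
}


def notify_indoors(level):
    """Return (speaker message, blink interval, lamp color) for a level."""
    entry = ALERT_TABLE[level]
    return entry["message"], entry["blink_interval_s"], entry["lamp_color"]
```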
 In the above embodiment, a moving object in front of the slave unit 10 is detected by the motion sensor 115, but the motion sensor 115 need not necessarily be provided in the slave unit 10. Instead, a moving object may be detected from changes between images captured by the camera 113.
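The image-change alternative can be sketched as simple frame differencing, assuming grayscale frames; both thresholds below are illustrative assumptions, not values from the patent:

```python
def frame_difference(prev_frame, cur_frame, pixel_threshold=30,
                     changed_fraction=0.05):
    """Detect a moving object by comparing two grayscale frames.

    Frames are lists of rows of 0-255 pixel values.  Motion is reported
    when more than `changed_fraction` of the pixels differ by more than
    `pixel_threshold`.  A production system would instead use a proper
    background-subtraction method, but the principle is the same.
    """
    total = changed = 0
    for row_prev, row_cur in zip(prev_frame, cur_frame):
        for a, b in zip(row_prev, row_cur):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return total > 0 and changed / total > changed_fraction
```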
 In the above embodiment, the master unit 20 includes the flash ROM 220, and audio output at the slave unit 10 is performed by transmitting the instruction voice data and the like stored there to the slave unit 10. However, the instruction voice data need not be stored in the master unit 20; the slave unit 10 may instead be provided with its own flash ROM in which the data is stored. In that case, only an instruction specifying the data is transmitted from the master unit 20 to the slave unit 10, and the CPU 101 of the slave unit 10 outputs the audio in accordance with that instruction.
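The instruction-only variant can be sketched as a lookup protocol: the master unit sends just an identifier, and the slave unit resolves it against its own storage. The class name, prompt identifiers, and placeholder audio bytes are all assumptions for the sketch:

```python
class SlavePromptStore:
    """Stands in for a flash ROM on the slave unit holding voice data."""

    def __init__(self):
        # Placeholder audio payloads keyed by hypothetical prompt IDs.
        self._prompts = {
            "face_not_visible": b"<voice: please face the camera>",
            "too_dark":         b"<voice: please step into the light>",
        }

    def play(self, prompt_id):
        """Return the audio the slave unit would output for an ID."""
        return self._prompts[prompt_id]


def master_request(store, prompt_id):
    """Model of the master unit transmitting only the ID, not the audio."""
    return store.play(prompt_id)
```

The benefit of this split is that only a short identifier crosses the master-slave link, rather than the full audio data.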

Claims (6)

  1. An intercom system comprising a slave unit installed outdoors and a master unit connected to the slave unit and installed indoors, wherein
     the slave unit comprises:
     imaging means for imaging a predetermined imaging range in front of the slave unit and outputting captured image information, which is information on an image of the imaging range;
     visitor detection means for detecting a visitor; and
     slave-unit notification means for notifying the visitor of information; and
     the master unit comprises:
     image acquisition means for acquiring the captured image information output from the imaging means when the visitor is detected by the visitor detection means;
     recognition determination means for determining, based on the captured image information acquired by the image acquisition means, whether the visitor's face is recognizable;
     slave-unit notification control means for, when the recognition determination means determines that the visitor's face is not recognizable, controlling the slave-unit notification means to notify the visitor of information prompting a different action depending on the cause preventing recognition;
     master-unit notification means for notifying the room of information; and
     first master-unit notification control means for controlling the master-unit notification means to notify information prompting a different level of caution depending on the cause.
  2. The intercom system according to claim 1, wherein the recognition determination means comprises:
     first determination means for determining, based on the captured image information acquired by the image acquisition means, whether a face area, which is an area corresponding to the visitor's face, exists within the imaging range;
     second determination means for determining, when the first determination means determines that the visitor's face area exists within the imaging range, whether visitor face features, which are features of the visitor's face, can be extracted based on the captured image information; and
     cause identification means for identifying, when either the determination result of the first determination means or that of the second determination means is negative, the cause in accordance with the determination result of the first determination means or the second determination means.
  3. The intercom system according to claim 1 or 2, wherein the imaging means is capable of imaging at different resolutions, imaging at different angles of view, and capturing both still images and moving images, and
     the master unit further comprises:
     imaging control means for, when the recognition determination means determines that the visitor's face is not recognizable, controlling the imaging means to change its imaging method depending on the cause;
     image display means for displaying an image; and
     display control means for controlling the image display means to display an image of the imaging range based on the captured image information acquired by the image acquisition means.
  4. The intercom system according to any one of claims 1 to 3, wherein the master unit further comprises:
     face feature extraction means for extracting, when the recognition determination means determines that the visitor's face is recognizable, visitor face features, which are features of the visitor's face, based on the captured image information;
     match determination means for comparing the visitor face features against a plurality of face features stored in face feature storage means, which stores the plurality of face features in association with identification information of a plurality of persons, and determining whether any of the plurality of face features matches the visitor face features; and
     second master-unit notification control means for controlling the master-unit notification means to notify information prompting a different level of caution depending on the determination result of the match determination means.
  5. The intercom system according to claim 4, wherein the master unit further comprises:
     visitor identification means for identifying one of the plurality of persons as the visitor, based on the face features determined by the match determination means to match the visitor face features; and
     third master-unit notification control means for controlling the master-unit notification means to notify information on the visitor identified by the visitor identification means.
  6. The intercom system according to claim 4 or 5, wherein, when part of the visitor face features has been extracted by the face feature determination means, the match determination means performs its determination based on parts of the plurality of face features and the extracted part of the visitor face features, and
     the master unit further comprises:
     candidate identification means for identifying one of the plurality of persons as a candidate for the visitor, based on the parts of the plurality of face features determined by the match determination means to match part of the visitor face features; and
     fourth master-unit notification control means for controlling the master-unit notification means to notify information on the visitor candidate identified by the candidate identification means.
PCT/JP2009/054739 2008-09-23 2009-03-12 Intercom system WO2010035524A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008-243428 2008-09-23
JP2008243428A JP2010080993A (en) 2008-09-23 2008-09-23 Intercom system

Publications (1)

Publication Number Publication Date
WO2010035524A1 true WO2010035524A1 (en) 2010-04-01

Family

ID=42059534

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/054739 WO2010035524A1 (en) 2008-09-23 2009-03-12 Intercom system

Country Status (2)

Country Link
JP (1) JP2010080993A (en)
WO (1) WO2010035524A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017147646A (en) * 2016-02-18 2017-08-24 アイホン株式会社 Intercom system
TWI740475B (en) * 2020-04-29 2021-09-21 中華電信股份有限公司 Guest authentication method and guest authentication device

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013042357A (en) * 2011-08-16 2013-02-28 Misawa Homes Co Ltd Intercom system
JP5518919B2 (en) * 2012-02-29 2014-06-11 株式会社東芝 Face registration device, program, and face registration method
WO2014045670A1 (en) * 2012-09-19 2014-03-27 日本電気株式会社 Image processing system, image processing method, and program
KR101492506B1 (en) * 2013-08-09 2015-02-12 주식회사 시큐인포 Security system of apartment house using home blackbox, and management method thereof
JP6614473B2 (en) * 2015-02-25 2019-12-04 パナソニックIpマネジメント株式会社 Interphone database generation device, interphone indoor device, interphone system, display method, and program using the same
JP2016154327A (en) * 2016-02-01 2016-08-25 パナソニックIpマネジメント株式会社 Intercom system and communication method
JP6773493B2 (en) * 2016-09-14 2020-10-21 株式会社東芝 Detection device, detection method, and detection program
JP6989109B2 (en) * 2017-10-16 2022-01-05 株式会社パロマ Cooker
JP2019083468A (en) * 2017-10-31 2019-05-30 シャープ株式会社 Output control device, intercom slave unit, and intercom system
JP7209206B2 (en) * 2018-06-19 2023-01-20 パナソニックIpマネジメント株式会社 INTERCOM ENTRANCE DEVICE, INTERCOM SYSTEM, CONTROL METHOD, AND PROGRAM
JP6964350B2 (en) * 2018-10-30 2021-11-10 株式会社Receptionist Reception system, reception program and reception method
JP7312061B2 (en) * 2019-09-02 2023-07-20 アイホン株式会社 intercom device
JP7127864B2 (en) * 2020-02-18 2022-08-30 アイメソフト ジェイエスシー Information processing method, information processing device and program
KR102461858B1 (en) * 2021-04-28 2022-11-01 주식회사 대림 Doorphone Capable of Recognizing objects, System Including the Same, and Method of Using the Same

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000163600A (en) * 1998-11-18 2000-06-16 Sintec Co Ltd Face photographing and recognizing method and device
JP2007037088A (en) * 2005-06-24 2007-02-08 Matsushita Electric Ind Co Ltd Intercom device
JP2007048209A (en) * 2005-08-12 2007-02-22 Fujifilm Holdings Corp Crime prevention apparatus, vending machine, crime prevention method, and crime prevention program
JP2007049551A (en) * 2005-08-11 2007-02-22 Fujifilm Holdings Corp Device, method, and program for crime prevention
JP2007257221A (en) * 2006-03-23 2007-10-04 Oki Electric Ind Co Ltd Face recognition system

Also Published As

Publication number Publication date
JP2010080993A (en) 2010-04-08

Similar Documents

Publication Publication Date Title
WO2010035524A1 (en) Intercom system
US8606316B2 (en) Portable blind aid device
CN107408028A (en) Message processing device, control method and program
JP2010079609A (en) Personal authentication device, personal authentication program, and intercom system with personal authentication device
JP2010226541A (en) Reception apparatus, visitor reception method, and visitor reception control program
JP2007037088A (en) Intercom device
JP2008113396A (en) Intercom system and entrance subunit
JP2010074628A (en) Intercom system
JP2010114544A (en) Intercom system, and program and method for receiving visitor
JP2010062797A (en) Intercom system
JP5579565B2 (en) Intercom device
KR20110137469A (en) Intelligent entrance managing apparatus using face detection and entrance managing method thereof
KR101720762B1 (en) Home-network system for Disabled
JP2010152423A (en) Personal authentication device, personal authentication method and personal authentication program
JP4889568B2 (en) Imaging device and portable terminal device
JP2017108343A (en) Intercom system and entrance unit
JP2011035644A (en) Intercom system
JP6804510B2 (en) Detection system and display method of detection system
JP2012204949A (en) Intercom system, intercom outdoor apparatus, control method, and program
JP2007150511A (en) Intercom system
JP2004266714A (en) Personal identification device
CN111479060B (en) Image acquisition method and device, storage medium and electronic equipment
JP2007096831A (en) Interphone system
JP2010200184A (en) Communication device, communication control method, and communication control program
JP2007104380A (en) Doorphone unit

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09815943

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09815943

Country of ref document: EP

Kind code of ref document: A1