CN111405225A

CN111405225A - Method, device and system for realizing visual intercom service of access control and intelligent robot

Info

Publication number: CN111405225A
Application number: CN202010128056.0A
Authority: CN
Inventors: 赵杰; 宋宇; 许楠; 张勇
Original assignee: Beijing Aijieli Technology Development Co ltd
Current assignee: Beijing Aijieli Technology Development Co ltd
Priority date: 2020-02-28
Filing date: 2020-02-28
Publication date: 2020-07-10

Abstract

The invention discloses a method, a device, a system and an intelligent robot for realizing an access control visual intercom service. Door opening or answering control is realized by receiving visitor incoming call information transmitted by an access control host and performing semantic recognition on a voice instruction of a user. Under an answering instruction, the voice data of the user is received, the access control host is connected, and the intercom service with the access control host is carried out using that voice data; at the same time, door opening or hang-up control is realized by performing semantic recognition on the voice data of the user. With this technical scheme, a user can carry out all access control services simply by speaking to the robot, without putting down whatever is in hand and without walking to the wall-mounted access control extension at the door to press keys. The hands of the user are thus freed, the current activity of the user is not interrupted, and the user experience is good.

Description

Method, device and system for realizing visual intercom service of access control and intelligent robot

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to a method, a device and a system for realizing an entrance guard visual intercom service and an intelligent robot.

Background

As residential housing grows and the smart-community concept spreads, access control intercom systems have moved from direct-press voice intercom systems to digital visual intercom systems, which not only improve information confidentiality but are also friendlier to use.

At present, digital video intercom systems mainly include the following two types:

One is the common visual intercom system, which consists of a host, extensions, a UPS power supply, an electrically controlled lock and the like. The host is the control core of the building intercom system; the transmission signals of each extension, the electric lock control signals and the like are all controlled by the host. The extension is an intercom unit with electric lock control and monitoring functions; it is generally installed at the door inside the user's home and is mainly used for intercom conversation between a resident and a visitor. The UPS keeps the building intercom system running during a power failure. The electrically controlled lock mainly consists of an electromagnetic mechanism: when the user presses the electric lock key on the extension, the electromagnetic coil is energized, the electromagnetic mechanism drives the connecting rod, and the gate is opened.

However, since the access control system is generally installed and commissioned before the residential building is delivered, the extension is generally installed at the door inside the user's home and cannot be relocated by the user. When a visitor calls, the user needs to put down what he is doing and walk to the extension to answer the intercom or open the door.

The other is the cloud visual intercom technology, which adds to the common visual intercom system a cloud intercom server and any mobile device (generally a resident's mobile phone or tablet) that uses its session protocol. When the host calls the extension, the host not only establishes a direct connection with the called extension, but also reports the captured audio and video stream and the unique identifier of the called user to the cloud intercom server. The cloud intercom server finds the mobile device registered on the server according to the unique identifier reported by the host and establishes a session with the called user. The user can then talk with the visitor, open the door and so on from the mobile device.

The cloud visual intercom technology solves, to a certain extent, the problem that a user must walk to the door to use the intercom or open the door, but it still has the following problems:

1. The hands of the user are not freed: answering, opening the door, hanging up and similar operations must still be completed by tapping, just as on an indoor unit;

2. Contention for the user's mobile device: when a visitor calls, the mobile device may be handling something important, such as a phone call or a video conference, and the access control call interrupts the user;

3. Uncertainty of the user about the situation at home: since the mobile device is generally carried around, when a visitor calls the user may not be at home or may be unable to determine whether anyone is at home, so the user cannot clearly admit or refuse the visitor.

Disclosure of Invention

The invention provides a method for realizing an access control visual intercom service on one hand, which comprises the following steps:

detecting whether visitor incoming call information transmitted by an access control host is received;

if the visitor incoming call information is received, detecting whether a voice instruction of a user is received;

if a voice instruction of a user is received, performing semantic recognition on the voice instruction;

judging, according to the result of the semantic recognition of the voice instruction, whether the voice instruction is an answering instruction;

if the voice instruction is an answering instruction, receiving voice data of the user, connecting to the access control host, conducting an intercom session with the access control host by using the voice data of the user, and performing semantic recognition on the voice data of the user;

and judging, according to the result of the semantic recognition of the voice data of the user, whether a preset keyword is contained, and executing the corresponding action if a preset keyword is contained.

Preferably, after receiving the visitor incoming call information and before detecting whether a voice instruction of the user is received, the method further comprises the steps of: and starting the visitor management scene, and pushing and displaying information corresponding to the visitor management scene.

Preferably, after receiving the visitor incoming call information and before detecting whether a voice instruction of the user is received, the method further comprises the steps of:

detecting whether a video connection instruction is received;

if the video connection instruction is received, establishing video connection with the access control host;

and acquiring and displaying the video image transmitted by the access control host.

Preferably, the visitor information is judged according to the acquired video image information, and is pushed and displayed.

Preferably, the method further comprises the steps of:

judging whether the voice instruction is door opening according to the result of semantic recognition on the voice instruction;

if the voice command is to open the door, then a door opening control signal is generated and sent to the access control host, and the door opening control signal is used for controlling the access control host to execute a door opening action.

Preferably, the performing semantic recognition on the voice data of the user includes:

judging whether the voice data reaches a preset time length, if so, storing the voice data as an audio file;

and reading the current audio file and the previous audio file, performing semantic recognition, and deleting the previous audio file after a recognition result is obtained.

Preferably, the determining whether the preset keyword is included, and if the preset keyword is included, executing a corresponding action, including:

the preset keywords comprise door opening and hanging up;

if the preset keyword for door opening is contained, generating a door opening control signal and sending the door opening control signal to the access control host, wherein the door opening control signal is used for controlling the access control host to execute a door opening action;

and if the preset keyword for hanging up is contained, generating a hang-up instruction and disconnecting from the access control host.

The second aspect of the present invention further provides an apparatus for implementing a visual intercom service for an access control, including:

the incoming call detection module is used for detecting whether visitor incoming call information transmitted by the access control host is received or not;

the voice instruction detection module is used for detecting whether a voice instruction of a user is received or not when the visitor incoming call information is received;

the first semantic recognition module is used for performing semantic recognition on a voice instruction when the voice instruction of a user is received;

the first judgment module is used for judging, according to the result of the semantic recognition of the voice instruction, whether the voice instruction is an answering instruction;

the intercom module is used for receiving voice data of the user when the voice instruction is an answering instruction, connecting to the access control host, and conducting an intercom session with the access control host by using the voice data of the user;

the second semantic recognition module is used for performing semantic recognition on the voice data of the user when the voice instruction is an answering instruction;

the second judgment module is used for judging whether preset keywords are contained or not according to the result of semantic recognition on the voice data of the user;

and the execution module is used for executing corresponding actions when the semantic recognition result of the voice data of the user contains preset keywords.

The third aspect of the present invention provides an intelligent robot, which comprises a processor and a memory connected with the processor, wherein the memory stores a plurality of instructions, and the instructions can be loaded and executed by the processor, so that the processor can execute the above method for realizing the access control visual intercom service.

The fourth aspect of the present invention provides a system for realizing the access control visual intercom service, which comprises the above intelligent robot, an access control host, an access control extension and a cloud intercom server, wherein the access control host is connected with the access control extension, the access control host is connected with the intelligent robot through the cloud intercom server, and the intelligent robot is configured using the information of the access control extension.

Drawings

Fig. 1 is a schematic flow chart of a method for implementing the visual intercom service of the access control according to the present invention;

Fig. 2 is a schematic flow chart illustrating a method for implementing a visual intercom service of an access control system according to an embodiment of the present invention;

Fig. 3 is a schematic structural diagram of an implementation apparatus of the visual intercom service of the access control according to the present invention;

Fig. 4 is a schematic structural diagram of an intelligent robot according to the present invention;

Fig. 5 is a schematic structural diagram of a system for implementing the visual intercom service of the access control system of the present invention.

Detailed description of the preferred embodiments

In order to better understand the technical solution, the technical solution will be described in detail with reference to the drawings and the specific embodiments.

The method provided by the invention can be implemented in the following terminal environment, and the terminal can comprise one or more of the following components: a processor, a memory, and a display screen. The memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the implementation method of the door control visual intercom service according to the following embodiments.

A processor may include one or more processing cores. The processor connects the various parts of the terminal using various interfaces and lines, and performs the functions of the terminal and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory and by calling the data stored in the memory.

The memory may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory may be used to store instructions, programs, code sets, or instruction sets.

The display screen is used for displaying user interfaces of all the application programs.

In addition, those skilled in the art will appreciate that the above-described terminal configurations are not intended to be limiting, and that the terminal may include more or fewer components, or some components may be combined, or a different arrangement of components. For example, the terminal further includes a radio frequency circuit, an input unit, a sensor, an audio circuit, a power supply, and other components, which are not described herein again.

Example one

As shown in fig. 1, an embodiment of the present invention provides a method for implementing an access control visual intercom service, where the method includes the following steps:

S101, detecting whether visitor incoming call information transmitted by an access control host is received;

S102, if the visitor incoming call information is received, detecting whether a voice instruction of a user is received;

S103, if a voice instruction of a user is received, performing semantic recognition on the voice instruction;

S104, judging, according to the result of the semantic recognition of the voice instruction, whether the voice instruction is an answering instruction;

S105, if the voice instruction is an answering instruction, receiving voice data of the user, connecting to the access control host, conducting an intercom session with the access control host by using the voice data of the user, and performing semantic recognition on the voice data of the user;

S106, judging, according to the result of the semantic recognition of the voice data of the user, whether a preset keyword is contained, and executing the corresponding action if a preset keyword is contained.
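The flow of steps S101 to S106 can be illustrated with a minimal Python sketch. It is not the implementation of the invention: the function names, the action strings and the English keyword spellings are hypothetical, and semantic recognition is replaced by a stand-in that treats each utterance as already transcribed text. The direct door opening branch follows the optional step described elsewhere in this document.

```python
from typing import Iterable, List

# Hypothetical keyword spellings; the source only names answering, door opening and hanging up.
ANSWER, OPEN_DOOR, HANG_UP = "answer", "open the door", "hang up"


def recognize(utterance: str) -> str:
    """Stand-in for semantic recognition: the 'audio' is already text here."""
    return utterance.lower()


def handle_call(incoming_call: bool, command: str, speech: Iterable[str]) -> List[str]:
    """Return the list of actions the robot would take for one visitor call."""
    actions: List[str] = []
    if not incoming_call:              # S101: no visitor incoming call information
        return actions
    if not command:                    # S102: no voice instruction received
        return actions
    intent = recognize(command)        # S103: semantic recognition of the instruction
    if OPEN_DOOR in intent:            # optional step: open the door directly
        actions.append("send_open_door")
        return actions
    if ANSWER not in intent:           # S104: is it an answering instruction?
        return actions
    actions.append("connect_host")     # S105: connect and start the intercom session
    for utterance in speech:
        actions.append("forward_audio")
        text = recognize(utterance)
        if OPEN_DOOR in text:          # S106: preset keyword check
            actions.append("send_open_door")
        if HANG_UP in text:
            actions.append("disconnect_host")
            break
    return actions
```

For example, handle_call(True, "Answer", ["hello", "please hang up"]) yields a connection, two forwarded utterances and a disconnection.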

Specifically, the method provided by the embodiment can be used for indoor intelligent robots.

In the practical application process, a visitor can dial the number of a user room through the access control host to send incoming call information to the corresponding room, and after receiving the incoming call information of the visitor, the intelligent robot can automatically start or jump to a visitor management scene according to a built-in program of the intelligent robot. In the visitor management scenario, human-machine voice communication may be performed.

Based on the received visitor incoming call information, the user can give the robot a voice instruction, such as an instruction to answer or to open the door. Because the user may give different voice instructions, the robot first performs semantic recognition on the instruction so that it can act accordingly. Through semantic recognition, the robot converts the voice instruction of the user into text, analyzes the intention of the user, and generates the instruction corresponding to that intention. If semantic recognition determines that the voice instruction is an answering instruction, the robot starts monitoring the microphone data. After giving the answering instruction, the user speaks the intercom content intended for the visitor; the robot receives this voice data and generates an instruction to connect to the access control host. Once connected, the robot uses the received voice data to carry on the intercom session with the access control host and, at the same time, performs semantic recognition on that voice data. Based on the recognition result, the robot judges whether the voice data contains a preset keyword and, if so, generates the corresponding instruction and executes the corresponding action. In this way, the access control instructions of the user are generated and executed through semantic recognition while the user and the visitor are talking.

For example, when the user considers the intercom session with the visitor finished, the speech of the user may include the words "hang up". Since "hang up" is also a preset keyword in the robot, when the semantic recognition result contains it, the robot generates a hang-up instruction and executes the hang-up action.

In the method, the robot sends the voice data of the user to the access control host computer, and simultaneously separates the voice data for semantic recognition, so that the control of a talkback process is realized in the process of realizing talkback between the user and a visitor. In addition, before the talkback starts and in the talkback process, the semantic recognition technology is used for recognizing the voice command and the voice data of the user, so that the talkback control function is realized.
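The idea of sending the voice data to the access control host while splitting off a copy for semantic recognition can be sketched as a simple fan-out. The sketch below is only an illustration under assumptions: the function name fan_out and the use of two in-process queues are hypothetical, and a real device would feed a network stream and a recognizer instead.

```python
from queue import Queue
from typing import Iterable, Tuple


def fan_out(frames: Iterable[bytes]) -> Tuple[Queue, Queue]:
    """Copy every captured audio frame to both the intercom path and the recognition path."""
    to_host: Queue = Queue()        # frames destined for the access control host
    to_recognizer: Queue = Queue()  # identical frames destined for semantic recognition
    for frame in frames:
        to_host.put(frame)
        to_recognizer.put(frame)
    return to_host, to_recognizer
```

Because both consumers receive the same frames, recognition never delays or alters what the visitor hears.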

With this method, the user can carry out all access control services, such as answering, hanging up and opening the door, as well as the intercom conversation itself, simply by speaking to the robot. The user does not need to put down whatever is in hand, and does not need to walk to the wall-mounted access control extension at the door to press keys. The method therefore frees the hands of the user, does not interrupt the current activity of the user, and gives a good user experience.

In a preferred embodiment of the present invention, after receiving the visitor incoming call information and before detecting whether a voice command of a user is received, the method further includes the steps of:

and starting the visitor management scene, and pushing and displaying information corresponding to the visitor management scene.

In practical application, after receiving visitor incoming call information sent by the access control host, the robot can automatically jump to the visitor management scene. Once the scene is started, the robot can alert the user to the visitor by ringing, by voice, or in a similar form. The user can therefore notice the visitor incoming call information in time and handle it from anywhere indoors, for example by giving a voice instruction to answer or to open the door; after receiving the instruction, the robot executes the corresponding action, which avoids missing the visitor's call.

In another embodiment of the present invention, after receiving the visitor incoming call information and before detecting whether a voice instruction of a user is received, the method further includes the steps of:

detecting whether a video connection instruction is received;

In the embodiment of the invention, the video connection instruction is sent out in a voice form, and after the robot monitors the voice instruction, the instruction content is determined through a semantic recognition technology, and corresponding action is executed. The user sends a voice command to the robot, so that both hands can be better liberated, and the use is more convenient.

In practical application, if incoming call information is received from a visitor without an appointment, the user can give a video connection instruction so that the robot establishes a video connection with the access control host and obtains the video image of the visitor. By checking the video image transmitted by the access control host, the user can decide whether to answer, open the door, or take another action, which further protects the safety of the user.

Further, the visitor information is determined according to the acquired video image information and is pushed and displayed.

Specifically, the robot can evaluate the acquired video image information against locally stored historical information, then determine, push and display visitor information, including the number of visits, the visiting times and other information about the visitor. The user can judge from this information whether the visitor is safe and decide whether to open the door or answer, which improves the safety of the user.
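A lookup of visit counts and visiting times against locally stored history could look like the following Python sketch. It rests on assumptions not stated in the source: that some earlier image-matching step has already produced a visitor identifier, and that the history is kept in memory. The class name VisitorHistory and its methods are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VisitorHistory:
    """In-memory stand-in for the robot's locally stored historical information."""
    visits: Dict[str, List[str]] = field(default_factory=dict)

    def record(self, visitor_id: str, timestamp: str) -> None:
        """Append one visit time for the given (hypothetical) visitor identifier."""
        self.visits.setdefault(visitor_id, []).append(timestamp)

    def summary(self, visitor_id: str) -> str:
        """Build the text that would be pushed and displayed to the user."""
        times = self.visits.get(visitor_id, [])
        if not times:
            return "No previous visits on record."
        return f"{len(times)} previous visit(s); last on {times[-1]}."
```

A visitor who has never been seen produces the "no previous visits" message, which is itself useful information for the user's decision.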

The robot can also send the acquired video image information to a remote public security system end. The public security system end compares the video image information with the images in a pre-established image library of persons on record, judges whether the person in the video image is a person on record, generates visitor information according to the judgment result, and sends the visitor information to the robot end.

The robot receives the visitor information sent by the remote public security system end. The visitor information is either visitor safety information or visitor danger information: if the visitor is a person on record registered by a public security organization, it is visitor danger information; if the visitor is not a person on record, it is visitor safety information. If the feedback is visitor safety information, the robot generates safety prompt information and pushes and displays it; if the feedback is visitor danger information, the robot generates danger prompt information and pushes and displays it. The user can judge from the prompt whether the visitor is safe and decide whether to open the door or answer, which improves safety.
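The mapping from the comparison result to the prompt shown to the user can be written down as a short Python sketch. The enum, the function names and the prompt wording are hypothetical; the source specifies only that a match against the library of persons on record yields danger information and a non-match yields safety information.

```python
from enum import Enum


class VisitorStatus(Enum):
    SAFE = "safe"
    DANGER = "danger"


def classify(is_person_on_record: bool) -> VisitorStatus:
    """A match against the library of persons on record means danger; otherwise safe."""
    return VisitorStatus.DANGER if is_person_on_record else VisitorStatus.SAFE


def prompt_for(status: VisitorStatus) -> str:
    """Build the prompt text that the robot would push and display."""
    if status is VisitorStatus.DANGER:
        return "Danger: the visitor matches a person on record."
    return "Safe: the visitor does not match any person on record."
```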

The invention provides a method for realizing the visual intercom service of entrance guard, which further comprises the following steps:

In the actual application process, if the visitor has a reservation, the user may directly send a door opening instruction, and if the visitor has no reservation, the user may obtain safety information of the visitor by connecting a video and then send the door opening instruction. After receiving a door opening instruction of a user, the robot can generate a door opening control signal and send the door opening control signal to the access control host, wherein the door opening control signal is used for controlling the access control host to execute a door opening action.

In a preferred embodiment of the present invention, the performing semantic recognition on the voice data of the user includes:

After the user sends the answering command, the robot is connected with the entrance guard host, and the user completes the talkback session with the visitor through the robot and the entrance guard host.

To control the access control system, the robot performs semantic recognition on the received voice data of the user while also using that voice data for the intercom session. If the voice data contains a preset keyword, such as door opening or hang-up, the corresponding operation is executed, so that the access control system is controlled during the intercom session.

In this embodiment, when the robot performs semantic recognition on the voice data, each portion of data whose duration reaches the preset time is stored as an audio file; the current audio file and the previous audio file are then recognized together, and the previous audio file is deleted after a result is obtained. The whole of the voice data is recognized in sequence in this way.

Because only two short audio files are processed at a time, this recognition method is fast and accurate and occupies few resources.
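The "current file plus previous file" scheme can be sketched in Python as a two-segment sliding window. The sketch assumes that segments are raw audio bytes that can simply be concatenated, and it uses a hypothetical stand-in recognizer that decodes the bytes as text; neither detail comes from the source.

```python
from typing import Callable, List, Optional


def recognize_stream(segments: List[bytes], recognizer: Callable[[bytes], str]) -> List[str]:
    """Recognize each stored segment together with the segment before it."""
    results: List[str] = []
    previous: Optional[bytes] = None
    for current in segments:
        window = current if previous is None else previous + current
        results.append(recognizer(window))
        previous = current  # the older segment is dropped, mirroring the file deletion
    return results


def fake_recognizer(audio: bytes) -> str:
    """Hypothetical recognizer used only for illustration: treats the bytes as UTF-8 text."""
    return audio.decode("utf-8")
```

Recognizing two adjacent segments together means that a keyword split across a segment boundary still appears whole in one window: with segments b"open the ", b"door " and b"now", the second window reads "open the door ".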

In the embodiment of the present invention, the determining whether the preset keyword is included, and if the preset keyword is included, executing a corresponding action, including:

the preset keywords comprise door opening and hanging up;

With this method, door opening, hang-up and other access control operations can be performed during the intercom session, which noticeably improves the user experience.
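Executing the action that corresponds to a preset keyword can be sketched as a small dispatch table. In the Python sketch below the keyword spellings, the action names and the logging list are hypothetical; a real robot would send control signals to the access control host instead of appending to a list.

```python
from typing import Callable, Dict, List


def make_dispatcher(log: List[str]) -> Callable[[str], bool]:
    """Return a function that runs the action for every preset keyword found in the text."""
    handlers: Dict[str, Callable[[], None]] = {
        "open the door": lambda: log.append("open_door_signal"),  # door opening keyword
        "hang up": lambda: log.append("hang_up"),                 # hang-up keyword
    }

    def dispatch(text: str) -> bool:
        matched = False
        lowered = text.lower()
        for keyword, action in handlers.items():
            if keyword in lowered:
                action()
                matched = True
        return matched

    return dispatch
```

Keeping the keywords in a table makes it straightforward to add further preset keywords without changing the matching logic.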

As shown in fig. 2, as a specific embodiment, the access control service according to the present invention may be implemented according to the following steps:

The robot receives visitor incoming call information, and the user gives an instruction to connect the video. The robot acquires and displays the video image transmitted by the access control host, and the user gives an answering instruction or a door opening instruction according to what the video image shows. If the robot receives a door opening instruction, it sends the door opening instruction to the access control host through an interface provided by the access control manufacturer. If the robot receives an answering instruction, it connects to the access control host, receives the voice data of the user, and starts the intercom session with the access control host using that voice data. At the same time, a copy of the voice data is split off; each time the voice duration reaches 1 second, the voice data is stored as an audio file, and the currently generated audio file and the previous audio file are read together for semantic recognition. After a result is obtained, the audio file that is not the most recently generated one, namely the previous audio file, is deleted. If a preset keyword such as door opening or hang-up is recognized, the corresponding action is executed, that is, the door opening or hang-up control signal is sent to the access control host through the interface provided by the access control manufacturer.
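Cutting the voice data into 1-second pieces comes down to byte arithmetic once an audio format is fixed. The Python sketch below assumes raw PCM audio at 16 kHz, 16-bit, mono; these parameters and the function name split_by_duration are assumptions for illustration, since the source does not state an audio format.

```python
from typing import Iterator


def split_by_duration(pcm: bytes, sample_rate: int = 16000, sample_width: int = 2,
                      channels: int = 1, seconds: float = 1.0) -> Iterator[bytes]:
    """Yield consecutive chunks that each hold `seconds` of raw PCM audio.

    Only complete chunks are yielded; a trailing partial chunk is left out,
    matching the rule that data is stored once its duration reaches the preset time.
    """
    chunk_size = int(sample_rate * sample_width * channels * seconds)
    for start in range(0, len(pcm) - chunk_size + 1, chunk_size):
        yield pcm[start:start + chunk_size]
```

With the assumed format, one second of audio is 16000 * 2 * 1 = 32000 bytes, so a buffer of 64100 bytes yields two full chunks and leaves 100 bytes waiting for more data.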

Example two

As shown in fig. 3, another aspect of the present invention further includes a functional module architecture completely corresponding to and consistent with the foregoing method flow, that is, an embodiment of the present invention further provides an apparatus for implementing an access control visual intercom service, including:

the incoming call detection module 201 is configured to detect whether visitor incoming call information transmitted by the access control host is received;

a voice instruction detection module 202, configured to detect whether a voice instruction of a user is received when the visitor incoming call information is received;

the first semantic recognition module 203 is used for performing semantic recognition on a voice instruction when the voice instruction of a user is received;

a first judging module 204, configured to judge, according to the result of the semantic recognition of the voice instruction, whether the voice instruction is an answering instruction;

an intercom module 205, configured to receive voice data of the user when the voice instruction is an answering instruction, connect to the access control host, and conduct an intercom session with the access control host by using the voice data of the user;

a second semantic recognition module 206, configured to perform semantic recognition on the voice data of the user when the voice instruction is an answering instruction;

a second judging module 207, configured to judge whether a preset keyword is included according to a result of performing the semantic recognition on the voice data of the user;

the executing module 208 is configured to execute a corresponding action when the semantic recognition result of the voice data of the user includes a preset keyword.

The apparatus further comprises a control module, which is used for starting a visitor management scene and pushing and displaying information corresponding to the visitor management scene after the visitor incoming call information is received and before detecting whether a voice instruction of a user is received.

The apparatus further comprises a video connection module, which is used for, after the visitor incoming call information is received and before detecting whether a voice instruction of the user is received,

detecting whether a video connection instruction is received;

The apparatus further comprises a third judging module, which is used for determining the visitor information according to the acquired video image information and pushing and displaying the visitor information.

Further, the first judging module is further configured to judge whether the voice instruction is to open a door according to a result of performing the semantic recognition on the voice instruction;

Further, the second semantic identification module is specifically configured to:

Further, the execution module is specifically configured to:

the preset keywords comprise door opening and hanging up;

The device can carry out the method for realizing the access control visual intercom service provided in the first embodiment; for the specific implementation, refer to the description in the first embodiment, which is not repeated here.

Example three

As shown in fig. 4, an embodiment of the present invention further provides an intelligent robot 300, which includes a processor 301 and a memory 302 connected to the processor 301, where the memory 302 stores a plurality of instructions, and the instructions can be loaded and executed by the processor 301, so that the processor 301 can execute the implementation method of the door access visual intercom service according to the first embodiment.

Example four

As shown in fig. 5, the embodiment of the present invention provides a system for implementing an access control visual intercom service, including the intelligent robot 300 described in the third embodiment, and further including an access control host 400, an access control extension 500 and a cloud intercom server 600, where the access control host 400 is connected to the access control extension 500, the access control host 400 is connected to the intelligent robot 300 through the cloud intercom server 600, and the intelligent robot 300 is configured using the information of the access control extension 500.

In the practical application process, firstly, an intercom module of the access control system is integrated in the intelligent robot, and a user is registered on the cloud intercom server to generate a unique identifier of the user. And the host of the access control system and the intelligent robot are connected with the cloud talkback server.

When the host calls the extension, the host not only connects directly to the called extension, but also reports the captured audio and video stream and the unique identifier of the called user to the cloud intercom server. The cloud intercom server finds the intelligent robot registered on the server according to the unique identifier reported by the host, thereby establishing a connection bridge between the host and the intelligent robot.

If user A wants to use the access control technical scheme provided by the invention, user A first needs to confirm whether the access control intercom installed in the building of his residential community is supported by the access control intercom technology integrated in the intelligent robot, that is, whether the intercom brands supported by the robot include the brand installed in the building. If it is supported, user A only needs to fill in each setting on the access control intercom module page of the intelligent robot according to the configuration information on his indoor extension, and save the settings. After that, when a visitor calls from outside the unit building, user A can control the access control service and talk with the visitor on the intelligent robot using voice alone.
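The lookup performed by the cloud intercom server, from the unique identifier of the called user to the registered intelligent robot, can be sketched as a simple registry. The Python class below is a hypothetical illustration: the class name, method names and the example address format are assumptions, and a real server would also relay the audio and video stream.

```python
from typing import Dict, Optional


class CloudIntercomServer:
    """Minimal registry mapping a user's unique identifier to the registered robot."""

    def __init__(self) -> None:
        self._robots: Dict[str, str] = {}

    def register(self, user_id: str, robot_address: str) -> None:
        """Record which robot (by a hypothetical address string) belongs to the user."""
        self._robots[user_id] = robot_address

    def route_call(self, called_user_id: str) -> Optional[str]:
        """Return the robot to bridge the host to, or None if the user is not registered."""
        return self._robots.get(called_user_id)
```

Returning None for an unregistered identifier leaves the direct host-to-extension connection as the only path, which matches the behavior of the common visual intercom system.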

The technical scheme disclosed by the invention achieves the following beneficial effects. In the method, device, system and intelligent robot for realizing the entrance guard visual intercom service provided by the embodiments of the invention, door opening or answering control in the entrance guard service is realized by receiving the visitor incoming call information transmitted by the entrance guard host and performing semantic recognition on the user's voice instruction. Under an answering instruction, the entrance guard host is connected, and the intercom service with the entrance guard host is carried out using the received voice data of the user; at the same time, door opening or hanging-up control is realized by performing semantic recognition on that voice data. A user can therefore carry out all entrance guard services, such as answering, hanging up, opening the door and talking with the visitor, simply by speaking to the robot. The user does not need to put down whatever is in hand, nor walk to the wall-mounted entrance guard extension by the door to operate it with keys. The user's hands are thus freed, the user's current activity is not interrupted, and the user experience is good.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention. It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method for realizing an entrance guard visual intercom service, characterized by comprising the following steps:

2. The method for implementing the entrance guard visual intercom service according to claim 1, wherein after receiving the incoming call information of the visitor and before detecting whether the voice command of the user is received, the method further comprises the step of: starting a visitor management scene, and pushing and displaying information corresponding to the visitor management scene.

3. The method for implementing video intercom service of entrance guard according to claim 1, wherein after receiving the incoming call information of the visitor and before detecting whether the voice command of the user is received, the method further comprises the steps of:

detecting whether a video connection instruction is received;

4. The method for implementing the entrance guard visual intercom service according to claim 3, wherein visitor information is determined from the acquired video image information and is pushed for display.

5. The method for implementing the door control visual intercom service of claim 1, further comprising the steps of:

6. The method for implementing video intercom service for entrance guard according to claim 1, wherein said performing semantic recognition on said voice data of said user comprises:

7. The method for implementing the entrance guard visual intercom service according to claim 1, wherein judging whether a preset keyword is included and, if the preset keyword is included, performing the corresponding action comprises:

the preset keywords comprise door opening and hanging up;

8. A device for realizing an entrance guard visual intercom service, characterized by comprising:

9. An intelligent robot, comprising a processor and a memory connected to the processor, wherein the memory stores a plurality of instructions, and the instructions can be loaded and executed by the processor, so that the processor can execute the implementation method of the door access visual intercom service according to any one of claims 1 to 7.

10. A system for realizing an entrance guard visual intercom service, characterized by comprising the intelligent robot according to claim 9, and further comprising an entrance guard host, an entrance guard extension and a cloud intercom server, wherein the entrance guard host is connected to the entrance guard extension, the entrance guard host is connected to the intelligent robot through the cloud intercom server, and the intelligent robot is configured using the information of the entrance guard extension.