CN106873773B - Robot interaction control method, server and robot

Robot interaction control method, server and robot

Info

Publication number
CN106873773B
Authority
CN
China
Prior art keywords: information, robot, behavior, data, behavior type
Prior art date
Legal status
Active
Application number
CN201710013365.1A
Other languages
Chinese (zh)
Other versions
CN106873773A (en)
Inventor
何坚强
Current Assignee
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201710013365.1A
Publication of CN106873773A
Application granted
Publication of CN106873773B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Manipulator (AREA)

Abstract

The invention provides a robot interaction control method, a server and a robot. The method comprises the following steps: receiving the audio and video stream of a monitored object captured by the robot's camera unit, or related sensor data, uploaded by the robot; matching a corresponding behavior type from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data; determining a machine behavior instruction corresponding to the behavior type from a preset correspondence database; and sending the machine behavior instruction to the robot to drive the robot to execute it, so that the robot performs a corresponding output behavior as feedback to the monitored object. The method and the equipment provided by the invention greatly improve the intelligence of the robot, making interaction between the robot and people more convenient and richer in form.

Description

Robot interaction control method, server and robot
Technical Field
The invention relates to the technical field of computers, in particular to the technical field of robots, and particularly relates to a robot interaction control method, a server and a robot.
Background
A robot is a general term for a machine capable of automatically carrying out work. Robots are a crystallization of human ingenuity across many disciplines, and among them, intelligent robots with a certain degree of self-awareness have received particular attention.
In the prior art, a robot generally has an instruction receiving unit, an information processing unit and an execution unit, and can execute instructions in a preset or autonomous mode. Such robots are already common on industrial production lines; however, robots that integrate into people's lives and families and communicate smoothly with family members have not yet been reported, and many technical problems remain to be overcome. Existing robots almost always complete preset actions according to preset instructions, and interaction with users relies on human-computer interaction components or interfaces mounted on the robot as hardware, which complete the corresponding preset actions upon receiving the user's instructions or input. Such robots cannot respond automatically to a user's spontaneous information or commands, for example by conversing with a person or handling an ad-hoc instruction. Furthermore, because existing robots still suffer from a single command mode and a lack of adaptability, while certain application fields require the robot to have a degree of human-like capability before it can smoothly enter people's homes, applications such as accompanying children, the elderly and companions cannot yet be realized well.
Disclosure of Invention
Based on this, the primary object of the present invention is to solve at least one of the above problems by providing a robot interaction control method, and to correspondingly provide a server and a robot for carrying out that method.
To achieve this object, the following technical solution is adopted:
the invention discloses a robot interaction control method, which comprises the following steps:
receiving audio and video streams or related sensor data which are uploaded by the robot and belong to a monitored object obtained by a camera unit of the robot;
matching corresponding behavior types from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data;
determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database;
and sending the machine behavior instruction to the robot to drive the robot to execute it, so that the robot performs a corresponding output behavior as feedback to the monitored object.
Further, the step of matching out a corresponding behavior type from a predetermined behavior type library according to the feature data extracted from the audio/video stream includes:
extracting audio characteristic information or keyword information of audio data in the audio and video stream;
and matching the characteristic information or the keyword information with pre-stored audio characteristic information or pre-stored keyword information in a preset behavior type library so as to determine the behavior type corresponding to the audio data.
In one embodiment, the step of matching a corresponding behavior type from a predetermined behavior type library according to the feature data extracted from the audio/video stream includes:
extracting video characteristic frame information of video data in the audio and video stream;
and matching the video characteristic frame information with pre-stored video characteristic frame information in a preset behavior type library so as to determine the behavior type corresponding to the video data.
Further, the video feature frame information is composed of picture information recording behavior actions; a plurality of consecutive pictures within a preset time constitutes the video feature frame information.
In one embodiment, the step of matching a corresponding behavior type from a predetermined behavior type library according to the sensor data includes:
extracting a specific data field or information meaning in the sensor data;
and matching the specific data field or the information meaning with a prestored data field or prestored information meaning in a preset behavior type library so as to determine the behavior type corresponding to the sensor data.
Further, the sensor data is geographic information sensor data. Geographic information data in the geographic information sensor data is extracted and matched against pre-stored geographic information data to determine the behavior type represented by the geographic information sensor data; a machine behavior instruction corresponding to the behavior type, comprising speed information, direction information and route information, is determined from the preset correspondence database; and the machine behavior instruction is sent to the robot to drive the robot to execute it, so that the robot performs a corresponding output behavior and its geographic position relative to the monitored object is changed.
Further, the sensor data is touch screen sensor data, touch operation information in the touch screen sensor data is extracted and matched with pre-stored touch operation information in a preset behavior type library, and accordingly the behavior type corresponding to the touch screen sensor data is determined.
In one embodiment, the preset corresponding relation database comprises a preset behavior type data table and a machine behavior instruction table which has a mapping relation with the preset behavior type data table; the specific steps of determining the machine behavior instruction corresponding to the behavior type from the preset corresponding relation database are as follows:
and searching a preset behavior type data table in a corresponding relation database by taking the behavior type as a keyword, determining the matched preset behavior type, and generating a machine behavior instruction according to the mapping relation mapping of the preset behavior type.
In one embodiment, the machine behavior instructions comprise:
voice instructions to drive the robot voice playing interface to output voice, and/or
Video instructions to drive the robotic video playback interface to output video, and/or
action execution instructions to drive the robot motion unit to output actions.
In one embodiment, before the step of receiving the audio and video stream or related sensor data of the monitored object captured by the robot's camera unit and uploaded by the robot, the method further comprises: receiving biometric information sensor data uploaded by the robot; when the biometric information sensor data matches pre-stored biometric information, determining the identity authority of the user corresponding to the biometric information sensor data; and generating a corresponding preset feedback instruction in response to the identity authority.
Further, the biometric information sensor data comprises at least one of fingerprint information data, iris information data or voice information data, the identity authority of the user is determined by comparing with preset identity authority information, and the corresponding preset feedback authority is opened in response to the identity authority.
In one embodiment, the preset behavior type library and/or the preset correspondence database are set or updated by a cloud server or a user associated with the cloud server.
In one embodiment, the uploading mode is real-time uploading.
The invention also provides a robot interaction control method, which comprises the following steps:
acquiring audio and video stream or related sensor data of a monitored object acquired by a camera unit;
matching corresponding behavior types from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data;
determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database;
and driving the relevant interfaces or units of the robot to make corresponding output behaviors according to the behavior instructions, and taking the corresponding output behaviors as feedback of monitoring objects.
Further, the step of matching out a corresponding behavior type from a predetermined behavior type library according to the feature data extracted from the audio/video stream includes:
extracting audio characteristic information or keyword information of audio data in the audio and video stream;
and matching the characteristic information or the keyword information with pre-stored audio characteristic information or pre-stored keyword information in a preset behavior type library so as to determine the behavior type corresponding to the audio data.
In one embodiment, the step of matching a corresponding behavior type from a predetermined behavior type library according to the feature data extracted from the audio/video stream includes:
extracting video characteristic frame information of video data in the audio and video stream;
and matching the video characteristic frame information with pre-stored video characteristic frame information in a preset behavior type library so as to determine the behavior type corresponding to the video data.
In one embodiment, the step of matching a corresponding behavior type from a predetermined behavior type library according to the sensor data includes:
extracting a specific data field or information meaning in the sensor data;
and matching the specific data field or the information meaning with a prestored data field or prestored information meaning in a preset behavior type library so as to determine the behavior type corresponding to the sensor data.
Further, the sensor data is geographic information sensor data. Geographic information data in the geographic information sensor data is extracted and matched against pre-stored geographic information data to determine the behavior type represented by the geographic information sensor data; a machine behavior instruction corresponding to the behavior type, comprising speed information, direction information and route information, is determined from the preset correspondence database; and the machine behavior instruction is sent to the robot to drive the robot to execute it, so that the robot performs a corresponding output behavior and its geographic position relative to the monitored object is changed.
Further, the sensor data is touch screen sensor data, touch operation information in the touch screen sensor data is extracted and matched with pre-stored touch operation information in a preset behavior type library, and accordingly the behavior type corresponding to the touch screen sensor data is determined.
In one embodiment, the preset corresponding relation database comprises a preset behavior type data table and a machine behavior instruction table which has a mapping relation with the preset behavior type data table; the specific steps of determining the machine behavior instruction corresponding to the behavior type from the preset corresponding relation database are as follows:
and searching a preset behavior type data table in a corresponding relation database by taking the behavior type as a keyword, determining the matched preset behavior type, and generating a machine behavior instruction according to the mapping relation mapping of the preset behavior type.
In one embodiment, the machine behavior instructions comprise:
voice instructions to drive the robot voice playing interface to output voice, and/or
Video instructions to drive the robotic video playback interface to output video, and/or
action execution instructions to drive the robot motion unit to output actions.
In one embodiment, before the step of acquiring the audio and video stream or the related sensing data of the monitored object, which is acquired by the camera unit, the method further comprises the steps of receiving the biometric information sensor data, determining the identity authority of the user corresponding to the biometric information sensor data when the biometric information sensor data is matched with pre-stored biometric information, and generating a corresponding preset feedback instruction in response to the identity authority.
In one embodiment, the biometric information sensor data includes at least one of fingerprint information data, iris information data or voice information data; the identity authority of the user is determined by comparing the data with preset identity authority information, and the corresponding preset feedback authority is opened in response to the identity authority.
In one embodiment, the preset behavior type library and/or the preset corresponding relation database are set by a user or downloaded from a cloud server.
Further, the downloading is performed through bluetooth, a Wi-Fi network, a mobile data network, or a wired network.
The invention provides a robot interaction server, comprising:
the receiving module is used for receiving audio and video streams or related sensor data which are uploaded by the robot and belong to a monitoring object of the robot and acquired by the camera shooting unit of the robot;
the analysis module is used for matching a corresponding behavior type from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data;
the data generation module is used for determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database;
and the sending module is used for sending the behavior instruction to the robot so as to drive the robot to execute the machine behavior instruction, so that the robot can make a corresponding output behavior as feedback to a monitoring object of the robot.
The present invention also provides a robot comprising:
the acquisition module is used for acquiring audio and video stream or related sensor data of a monitored object, which is acquired by the camera unit;
the analysis module is used for matching a corresponding behavior type from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data, and then determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database;
and the execution module is used for driving the relevant interfaces or units of the robot to make corresponding output behaviors according to the behavior instructions, and the output behaviors are used as feedback of the monitored objects.
The method and the equipment provided by the invention greatly improve the intelligence of the robot and make interaction between the robot and people more convenient and richer in form: relying on the robot's audio, video or sensor equipment, certain actions, speech and even expressions of the monitored object can be analyzed and corresponding feedback behaviors produced, so that people can interact with the robot in a variety of ways. Moreover, the robot's feedback behaviors are richer, since feedback can be given through voice, video, or a combination of these with physical actions; even children and the elderly can therefore interact with the robot conveniently, the robot can look after them, and robots entering people's homes becomes realistic.
Drawings
FIG. 1 is a flowchart of a robot interaction control method according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a robot interaction server according to an embodiment of the present invention;
FIG. 3 is a flowchart of a robot interaction control method according to another embodiment of the present invention;
fig. 4 is a schematic structural diagram of a robot according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative only and should not be construed as limiting the invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
As will be appreciated by those skilled in the art, a "terminal" as used herein includes both devices having only a wireless signal receiver without transmit capability and devices having receive and transmit hardware capable of two-way communication over a two-way communication link. Such a device may include: a cellular or other communication device with a single-line or multi-line display, or without a multi-line display; a PCS (Personal Communications Service) device, which may combine voice, data processing, facsimile and/or data communication capabilities; a PDA (Personal Digital Assistant), which may include a radio frequency receiver, a pager, internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; or a conventional laptop and/or palmtop computer or other device having and/or including a radio frequency receiver. As used herein, a "terminal" or "terminal device" may be portable, transportable, installed in a vehicle (aeronautical, maritime and/or land-based), or situated and/or configured to operate locally and/or in a distributed fashion at any other location on earth and/or in space. As used herein, a "terminal device" may also be a communication terminal, a web terminal, or a music/video playing terminal, such as a PDA, an MID (Mobile Internet Device) and/or a mobile phone with a music/video playing function, or a smart TV, a set-top box, etc.
The robot generally comprises an actuating mechanism, a driving device, a detection device, a control system, complex machinery and the like, can be automatically controlled, can be repeatedly programmed, has multiple functions, has several degrees of freedom, can be fixed or moved, and is used in a related automatic system.
An embodiment of the present invention provides a robot interaction control method, as shown in fig. 1, the interaction control method includes the following steps:
S110: receiving the audio and video stream or related sensor data of the monitored object captured by the robot's camera unit and uploaded by the robot.
As the first step in realizing interaction, raw information that is as rich as possible must first be obtained. This raw information comprises the various behaviors or instructions of the user, that is, of the robot's monitored object. It includes audio and video content acquired through the camera mounted on the robot, represented on the robot as audio and video streams; described anthropomorphically, this is the sound the robot "hears" and the images it "sees", for example a microphone mounted on the robot recording the monitored object's voice, or a camera recording a continuous image sequence of the monitored object's actions. The related sensor data is the data collected by the various sensors mounted on the robot, such as distance sensors, temperature sensors, acceleration sensors and odor sensors; these sensors are chosen according to the intended use or function, corresponding to the robot's sense organs, and the sensor data obtained is its "sensation". After the robot acquires the raw data, it uploads the data to the cloud server for further processing, where the powerful data processing capacity and large data resources of the cloud server allow the processing to be performed quickly and accurately. Preferably, the robot uploads data to the cloud server in real time; for a robot in practical use, a good network connection is a basic environmental factor that can reasonably be assumed, and real-time uploading enables a fast communication mechanism between the robot and the cloud server, giving the robot good information processing capacity and quick responsiveness.
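As a rough illustration of this step, the sketch below shows how a robot client might package one capture cycle of audio/video data and sensor readings and upload it to a cloud endpoint in real time. The endpoint URL, field names and the `build_payload`/`upload` helpers are assumptions made for illustration only; the patent does not specify a transport format.

```python
import json
import time
import urllib.request

CLOUD_ENDPOINT = "https://cloud.example.com/robot/upload"  # hypothetical endpoint

def build_payload(robot_id, av_chunk=None, sensor_readings=None):
    """Package one capture cycle (A/V chunk and/or sensor data) for upload."""
    return {
        "robot_id": robot_id,
        "timestamp": time.time(),
        "av_chunk": av_chunk,              # e.g. base64-encoded audio/video frame data
        "sensors": sensor_readings or {},  # e.g. {"temperature": 22.5, "distance_cm": 80}
    }

def upload(payload):
    """POST the payload to the cloud server (real-time upload)."""
    req = urllib.request.Request(
        CLOUD_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    payload = build_payload("robot-001", sensor_readings={"temperature": 22.5})
    print(json.dumps(payload, indent=2))  # upload(payload) would send it to the server
```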
S120: and matching corresponding behavior types from a preset behavior type library according to the characteristic data extracted from the audio and video stream or the sensor data.
After obtaining the audio and video stream or sensor data, the cloud server analyzes and processes it to determine the behavior pattern of the monitored object that the data represents. Each kind of data always contains fundamental feature data that distinguish different behaviors; this feature data can be compiled into a related feature database based on the summarization of people in the relevant field. A given behavior is characterized by one piece of feature data or by several pieces together, and in some cases even the same set of feature data arranged in a different order represents a different behavior pattern. The feature data of each behavior is classified and combined in one database to form the predetermined behavior type library. The cloud server parses the feature data in the audio and video stream or sensor data, searches the predetermined behavior type library for a match, finds the behavior corresponding to the matched feature data, and thereby determines the behavior type corresponding to the audio and video stream or sensor data. It will be appreciated that some behavior actions involve more than one piece of feature data, and that the more complete the feature data, the more accurately it describes the behavior action; the feature data for each behavior in the predetermined behavior type library should therefore be continuously updated and refined, as should the cloud server's ability to parse the raw data.
Preferably, the specific method of matching a corresponding behavior type from the predetermined behavior type library according to feature data extracted from the audio/video stream is as follows: extract audio feature information or keyword information from the audio data in the audio and video stream, and match that feature information or keyword information with pre-stored audio feature information or pre-stored keyword information in the predetermined behavior type library, thereby determining the behavior type corresponding to the audio data. A segment of voice information contains one or more pieces of keyword information such as time, place, person, object or reason, and may also carry audio feature information such as sound frequency, pitch and timbre; a person skilled in the art can suitably extract the audio feature information or keyword information from the audio data in the audio/video stream. Once the audio feature information or keyword information has been obtained, it is compared with data pre-stored in the cloud server, that is, it is looked up among the pre-stored audio feature information or pre-stored keyword information in the predetermined behavior type library. Once a match is found, the pre-stored behavior type corresponding to that pre-stored audio feature information or keyword information is determined, and the cloud server thereby determines the behavior type corresponding to the audio data. With this technical solution, even the elderly or children who do not know how to operate a computer can issue instructions to the robot simply by speaking, realizing communication and interaction with the robot.
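A minimal sketch of keyword-based matching against a predetermined behavior type library, assuming the library is a simple mapping from pre-stored keywords to behavior types; the keywords and type names are invented for illustration, and a real implementation would sit behind speech recognition.

```python
# Hypothetical predetermined behavior type library: pre-stored keywords -> behavior type
BEHAVIOR_TYPE_LIBRARY = {
    ("bring", "cup"): "fetch_object",
    ("play", "music"): "play_audio",
    ("come", "here"): "approach_user",
}

def extract_keywords(transcript):
    """Stand-in for real speech recognition plus keyword extraction."""
    return set(transcript.lower().split())

def match_behavior_type(transcript):
    """Return the first behavior type whose pre-stored keywords all appear in the utterance."""
    words = extract_keywords(transcript)
    for keywords, behavior_type in BEHAVIOR_TYPE_LIBRARY.items():
        if all(k in words for k in keywords):
            return behavior_type
    return None

print(match_behavior_type("Please bring me the cup on the table"))  # -> fetch_object
```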
Preferably, the specific method of matching a corresponding behavior type from the predetermined behavior type library according to feature data extracted from the audio/video stream is as follows: extract video feature frame information from the video data in the audio and video stream, and match it with pre-stored video feature frame information in the predetermined behavior type library, thereby determining the behavior type corresponding to the video data. Video consists of many static pictures over a period of time; each picture is in principle one frame, and consecutive frames form the video. Each image has its characteristic features, such as basic figure positions, line shapes and light-and-shade distribution, and the technology for analyzing picture features is image recognition; the video feature frame information in the video data is therefore extracted by image recognition. Video feature frame information is composed of picture information recording behavior actions and is used to record behavior types with typical features, such as a person walking, running, standing, sitting, making a facial expression or gesturing. Each piece of video feature frame information is a picture recording one moment of an action within a behavior type and contains computer-recognizable information such as the basic figure positions, line shapes or light-and-shade areas representing that action picture. Such video feature frame information is usually multi-frame information: a number of consecutive frames within a preset time, for example ten consecutive pictures in one second, or ten consecutive frames in 0.1 second; the shorter the time and the greater the number, the more accurate the video feature frame information. One of the cloud server's databases is the predetermined behavior type library, which contains multiple behavior types; each behavior type directory includes several pieces of pre-stored video feature frame information (clearly, the more pre-stored video feature frame information there is, the more accurate the corresponding preset behavior type becomes), the pre-stored information being video feature frame information collected in advance and stored in a computer or on the network. When the cloud server detects that the video feature frame information extracted from the audio/video stream matches the pre-stored video feature frame information in a certain behavior type directory, the behavior type corresponding to the video data in the audio/video stream is determined. Of course, matching does not mean being one hundred percent identical to the pre-stored video feature frame information; a certain degree of fuzziness is involved, realized by means of fuzzy algorithms in computer image recognition, and those skilled in computer image recognition can implement the more detailed and specific methods with related technical means. With this technical solution, even the elderly or children who do not know how to operate a computer can issue instructions to the robot through actions, realizing communication and interaction with the robot.
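To illustrate the fuzzy, non-exact matching described above, the sketch below assumes each feature frame has already been reduced to a small numeric feature vector (an assumption; the patent does not fix a representation) and compares a short frame sequence against pre-stored templates with a similarity threshold.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical pre-stored video feature frames: behavior type -> sequence of feature vectors
PRESTORED_FRAMES = {
    "waving":  [[0.9, 0.1, 0.3], [0.8, 0.2, 0.4], [0.9, 0.1, 0.5]],
    "sitting": [[0.1, 0.9, 0.2], [0.1, 0.8, 0.2], [0.2, 0.9, 0.1]],
}

def match_video_behavior(frames, threshold=0.9):
    """Fuzzy-match a sequence of extracted feature frames against each behavior directory."""
    best_type, best_score = None, 0.0
    for behavior_type, templates in PRESTORED_FRAMES.items():
        scores = [cosine_similarity(f, t) for f, t in zip(frames, templates)]
        score = sum(scores) / len(scores) if scores else 0.0
        if score > best_score:
            best_type, best_score = behavior_type, score
    return best_type if best_score >= threshold else None

print(match_video_behavior([[0.88, 0.12, 0.3], [0.82, 0.2, 0.41], [0.9, 0.15, 0.48]]))  # -> waving
```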
As another preferable scheme, the specific method of matching a corresponding behavior type from the predetermined behavior type library according to feature data extracted from the sensor data is as follows: extract a specific data field or information meaning from the sensor data, and match it with a pre-stored data field or pre-stored information meaning in the predetermined behavior type library, thereby determining the behavior type corresponding to the sensor data. Sensor data comes in various types; over different periods there are idle-segment data (data while the sensor is on standby) and working-segment data, and the working-segment data inevitably includes time data, detection content data, intensity data (measuring the degree of the detected content, such as magnitude of force or speed), and so on. The specific data field or information meaning is extracted from the sensor data according to a certain algorithm or processing method, which engineers experienced in sensor data processing can implement. The predetermined behavior type library contains several behavior types, each with one or more corresponding pre-stored data fields or pre-stored information meanings; the cloud server searches these pre-stored data fields or information meanings using the data field or information meaning extracted from the sensor data, and once a match is found the behavior type corresponding to the sensor data is determined. Different types of sensors are fitted according to different functional requirements: in a household, for example, a temperature sensor is needed to detect the room temperature or the body temperature of family members; a gas detection sensor may be installed to detect a natural gas leak in the kitchen; and if the robot is required to accompany and look after a particular family member, it must follow that member, so at least a geographic information sensor is required.
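A minimal sketch of this idea under the assumption that each raw reading is first reduced to an "information meaning" field and then looked up in the library; the sensor names, thresholds and behavior types below are illustrative only.

```python
# Hypothetical pre-stored data fields / information meanings -> behavior type
SENSOR_BEHAVIOR_LIBRARY = {
    ("gas", "above_threshold"):   "gas_leak_alert",
    ("temperature", "fever"):     "body_temperature_alert",
    ("distance", "out_of_range"): "follow_target",
}

def interpret(sensor_type, value):
    """Reduce a raw reading to an 'information meaning' field (illustrative thresholds)."""
    if sensor_type == "gas":
        return "above_threshold" if value > 50 else "normal"
    if sensor_type == "temperature":
        return "fever" if value > 37.5 else "normal"
    if sensor_type == "distance":
        return "out_of_range" if value > 2.0 else "in_range"
    return "unknown"

def match_sensor_behavior(sensor_type, value):
    meaning = interpret(sensor_type, value)
    return SENSOR_BEHAVIOR_LIBRARY.get((sensor_type, meaning))

print(match_sensor_behavior("distance", 3.4))  # -> follow_target
```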
S130: and determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database.
After the behavior type corresponding to the audio/video stream or sensor data has been determined in step S120, it is looked up in the cloud server's database, the object searched being the preset correspondence database. The preset correspondence database contains behavior types and the machine behavior instructions corresponding to them. When a behavior type in the preset correspondence database is matched, the corresponding machine behavior instruction can be obtained and the next operation performed. Preferably, the preset correspondence database includes preset behavior type data, and several pieces of preset behavior type data are combined into a preset behavior type data table; each piece of preset behavior type data corresponds to a machine behavior instruction table, that is, the preset behavior type data and the machine behavior instructions have a mapping relation. This mapping is not one-to-one: several different behavior types may correspond to the same machine behavior instruction. For example, the spoken request "bring a cup" and a gesture expressing the same meaning may both lead to a final machine behavior instruction for fetching a cup. The specific steps of determining the machine behavior instruction corresponding to the behavior type from the preset correspondence database are: using the behavior type as a key, retrieve the preset behavior type data table in the correspondence database, determine the matching preset behavior type, and generate the machine behavior instruction by mapping according to the mapping relation of that preset behavior type. Of course, the determination may also be computed: the robot identifies the keyword information and processes the related information accordingly to obtain a machine behavior instruction. For example, for the behavior "bring the cup to me", the robot knows the specific object "cup" and its position, understands the object "me" and the position of "me", and, by determining its own position and the geographic information of the various scene objects in the environment, computes a machine behavior instruction whose execution completes the task. The richer the preset behavior types and the preset correspondence database, the richer the behaviors the robot can feed back; the preset behavior type library and/or the preset correspondence database therefore need to be continuously updated and supplemented. They may be set, supplemented or updated directly on the cloud server, or set or updated by a user associated with the cloud server, so that the data sources of the preset behavior type library and/or the preset correspondence database become richer and closer to reality.
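The sketch below shows the table lookup described here, assuming both tables are plain key-value mappings (an assumption; the patent only requires a mapping relation). Note how two behavior types map to the same instruction, reflecting the many-to-one mapping mentioned above.

```python
# Hypothetical preset behavior type table and machine behavior instruction table; several
# behavior types may map to the same machine behavior instruction (not one-to-one).
BEHAVIOR_TYPE_TABLE = {
    "fetch_object_by_voice":   "instr_fetch_cup",
    "fetch_object_by_gesture": "instr_fetch_cup",
    "play_audio":              "instr_play_music",
}

MACHINE_INSTRUCTION_TABLE = {
    "instr_fetch_cup":  {"action": "navigate_and_grasp", "target": "cup"},
    "instr_play_music": {"voice": "play", "playlist": "default"},
}

def lookup_instruction(behavior_type):
    """Retrieve the preset behavior type using the behavior type as key, then map it to an instruction."""
    instr_id = BEHAVIOR_TYPE_TABLE.get(behavior_type)
    return MACHINE_INSTRUCTION_TABLE.get(instr_id) if instr_id else None

print(lookup_instruction("fetch_object_by_gesture"))  # same instruction as the voice version
```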
Preferably, the machine behavior instructions include: voice instructions to drive the robot's voice playing interface to output voice, and/or video instructions to drive the robot's video playing interface to output video, and/or action execution instructions to drive the robot's motion unit to output actions. Faced with the many ways humans express themselves, the robot likewise has multiple ways of expressing behavior, including sound. For example, when a person asks the robot to play a certain sound, say by giving the instruction "imitate a duck's quack", the robot receives and uploads the instruction; after the cloud server processes it, a machine instruction to play a duck's quack is output, and this instruction drives the robot to start its sound unit and play the corresponding audio. Or the robot captures a certain expression of the person, such as a smile, and uploads it to the server; the server analyzes it and produces a corresponding machine instruction, which may drive the robot's video playing interface to play a cheerful video or display a smiling-face expression, or produce a corresponding action execution instruction that drives a motion unit of the robot, such as a hand, instructing the arm and palm mechanisms to form a clapping action. These types of machine instruction may be executed individually or simultaneously. For example, when a person says "let's dance" to the robot, the robot uploads the voice information to the server, and the server makes its judgment and issues a machine behavior instruction that may cause the robot's sound system to play music, cause a video playing interface such as a display screen to play a corresponding dance picture or make an expression, and drive the relevant motion units of the robot, such as the leg and hand motion units, to move with the music. The content of these machine behavior instructions can be implemented programmatically by those skilled in the relevant art.
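One way to picture such a combined instruction is a record with optional voice, video and action parts, any subset of which may be present; the class, field names and "let's dance" payload below are illustrative assumptions, not a structure defined by the patent.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class MachineBehaviorInstruction:
    """Illustrative instruction carrying optional voice, video and action parts."""
    voice: Optional[str] = None                   # audio clip / utterance to play
    video: Optional[str] = None                   # video or facial expression to display
    actions: list = field(default_factory=list)   # motion-unit commands

def execute(instr):
    """Drive each interface for which the instruction carries content (they may run together)."""
    if instr.voice:
        print(f"[audio ] playing: {instr.voice}")
    if instr.video:
        print(f"[screen] showing: {instr.video}")
    for act in instr.actions:
        print(f"[motion] executing: {act}")

# "Let's dance": music, a dance animation, and leg/arm movement issued in one instruction.
execute(MachineBehaviorInstruction(
    voice="dance_music.mp3",
    video="dance_animation.mp4",
    actions=["sway_arms", "step_left", "step_right"],
))
```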
S140: sending the machine behavior instruction to the robot to drive the robot to execute it, so that the robot performs a corresponding output behavior as feedback to the monitored object.
After the machine behavior instruction is generated it still resides in the cloud server, but the ultimate aim is for the robot to feed back on a certain behavior of the monitored object; the instruction processed by the cloud server therefore needs to be transmitted to the robot immediately, and the robot executes the corresponding output behavior according to the machine behavior instruction, presenting the corresponding feedback to the monitored object. Execution may follow the code contained in a detailed behavior instruction directly, for example simply playing music or playing a video. Alternatively, using the robot's own computing capacity, the built-in computer may derive the actual executable code for each component from the behavior instruction; for example, for the instruction "fetch a book", the robot calculates, judges and acts according to its own position, the position of the bookcase, the layout of the room it is in, the distances between objects, and so on.
As a preferred scheme, when the sensor data is geographic information sensor data, the cloud server processes it as follows: geographic information data is extracted from the sensor data and matched with geographic information data pre-stored in the server to determine the behavior type it represents; a machine behavior instruction corresponding to that behavior type, comprising speed information, direction information and route information, is determined from the preset correspondence database; and the instruction is sent to the robot to drive it to execute the machine behavior instruction, so that the robot performs the corresponding output behavior and its geographic position relative to the monitored object changes. For example, when the robot is required to be the companion of a certain family member (usually an elderly person or a child), it is set to always stay within a set distance range of the monitored object; when the monitored object moves, the robot moves accordingly and re-establishes the set distance. When the monitored object moves beyond the set distance range, the geographic information sensor on the robot acquires geographic information about the monitored object, such as distance, longitude and latitude, obstacles or relative coordinates. After this information is uploaded to the cloud server, the server performs the corresponding processing or calculation and obtains the machine behavior instruction the robot must execute, containing the required movement speed, direction, and distance or path. Finally, the robot's geographic information (its position or distance relative to the monitored object) changes as the monitored object moves, the distance between them is kept within the set range, and the accompaniment of the monitored object is accomplished.
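A minimal sketch of the follow calculation, assuming planar coordinates, a straight-line route and a fixed follow range; the range, speed and obstacle handling are simplifying assumptions rather than values from the patent.

```python
import math

FOLLOW_RANGE_M = 1.5  # desired maximum distance to the monitored person (assumed value)

def follow_instruction(robot_xy, person_xy, speed_mps=0.5):
    """Compute speed, heading and travel distance so the robot re-enters the follow range."""
    dx, dy = person_xy[0] - robot_xy[0], person_xy[1] - robot_xy[1]
    distance = math.hypot(dx, dy)
    if distance <= FOLLOW_RANGE_M:
        return None  # already within range, no movement needed
    heading_deg = math.degrees(math.atan2(dy, dx))
    travel = distance - FOLLOW_RANGE_M
    return {
        "speed_mps": speed_mps,
        "direction_deg": round(heading_deg, 1),
        "travel_m": round(travel, 2),
        "route": [robot_xy, person_xy],  # simplistic straight-line route, obstacles ignored
    }

print(follow_instruction((0.0, 0.0), (4.0, 3.0)))  # person 5 m away -> move 3.5 m toward them
```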
As another preferred scheme, the sensor data is touch screen sensor data: touch operation information in the touch screen sensor data is extracted and matched with pre-stored touch operation information in the predetermined behavior type library to determine the behavior type corresponding to the touch screen sensor data. A touch screen sensor is arranged on the robot; the familiar touch screen is a typical touch sensor device, and a number of common instruction options are presented on it. When the robot's monitored object selects an instruction on the touch screen, the robot uploads the selection to the cloud server, which extracts the touch operation information it represents and produces the corresponding behavior type, then determines the corresponding machine behavior instruction from that behavior type and delivers it to the robot for execution. This technical solution reduces the difficulty of instruction execution and the processing burden on the cloud server, makes the server's data processing faster and the instruction output more accurate, and ultimately lets the robot respond to the monitored object's instruction more quickly.
Preferably, since the robot is generally a valuable device, identity recognition is a function it often needs. From development through to application, a robot comes into contact with many people: during the debugging stage after assembly it must deal with researchers and debugging staff, and in use it encounters users of different identities. These people should have different permission levels with respect to the robot and, within their level, the instruction modification permission, system update permission or usage permission that the level allows. Therefore, before the step of receiving the audio and video stream or related sensor data of the monitored object captured by the robot's camera unit and uploaded by the robot, the method further comprises: receiving biometric information sensor data uploaded by the robot; when the biometric information sensor data matches pre-stored biometric information, determining the identity authority of the user corresponding to that data; and generating a corresponding preset feedback instruction in response to the identity authority. An identity verification step is thus placed before the robot receives the audio/video stream or sensor data and performs the corresponding operations, so that the robot recognizes the identity of the monitored object and proceeds only after confirming the corresponding authority. For example, for family members the robot obeys commands and performs actions according to the identity authority of the pre-stored biometric information, whereas for a temporary guest, whose biometric information is not among the pre-stored data, the robot will not obey the guest's instructions, or will only perform the actions assigned to temporary guests, such as offering a greeting. Preferably, the biometric information sensor data includes at least one of fingerprint information data, iris information data or voice information data; the user's identity authority is determined by comparison with preset identity authority information, and the corresponding preset feedback authority is opened in response to that authority. Biometric information yields accurate identity verification and is very convenient to acquire.
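The sketch below illustrates the permission idea under the assumption that each biometric reading has already been reduced to a comparable feature key; the record format, user names and authority levels are invented for illustration.

```python
# Hypothetical pre-stored biometric records: feature key -> (user, permission level)
PRESTORED_BIOMETRICS = {
    "fingerprint:8f3a": ("parent",    "full_control"),
    "voiceprint:2c71":  ("child",     "companion_mode"),
    "iris:99ab":        ("developer", "system_update"),
}

def verify_identity(biometric_feature):
    """Match uploaded biometric data against pre-stored records and open the matching authority."""
    record = PRESTORED_BIOMETRICS.get(biometric_feature)
    if record is None:
        return {"user": "guest", "authority": "greeting_only"}  # unknown person: greet, don't obey
    user, authority = record
    return {"user": user, "authority": authority}

print(verify_identity("voiceprint:2c71"))   # -> child, companion_mode
print(verify_identity("fingerprint:0000"))  # -> guest, greeting_only
```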
Adapting to the foregoing method, based on modular thinking, an embodiment of the present invention provides a robot interaction server, as shown in fig. 2, including:
and the receiving module 110 is configured to receive audio and video streams or related sensor data, which are uploaded by the robot and belong to the monitoring object obtained by the camera unit of the robot. After the robot acquires data through its own camera recording interface and various sensors, it uploads the data to the cloud server through the receiving module 110.
The analysis module 120 is configured to match a corresponding behavior type from the predetermined behavior type library according to feature data extracted from the audio/video stream or sensor data. After obtaining the audio and video stream or sensor data through the receiving module 110, the cloud server analyzes and processes it to determine the behavior pattern of the monitored object that the data represents. Each kind of data always contains fundamental feature data that distinguish different behaviors; this feature data can be compiled into a related feature database based on the summarization of people in the relevant field, a given behavior being characterized by one piece of feature data or by several together, and in some cases even the same set of feature data in a different order representing a different behavior pattern. The feature data of each behavior is classified and combined in one database to form the predetermined behavior type library; the cloud server parses the feature data extracted from the audio/video stream or sensor data, searches the predetermined behavior type library for a match, finds the behavior corresponding to the matched feature data, and thereby determines the behavior type corresponding to the audio/video stream or sensor data.
And the data generating module 130 is configured to determine a machine behavior instruction corresponding to the behavior type from a preset correspondence database. After the analysis module 120 determines the behavior type corresponding to the audio/video stream or the sensor data, the data generation module 130 searches in the cloud server database according to the behavior type, and the searched object is a preset corresponding relation database. The preset corresponding relation database comprises behavior types and machine behavior instructions corresponding to the behavior types. When a certain behavior type in the preset corresponding relationship database is matched, the data generation module 130 may obtain a corresponding machine behavior instruction and perform the next operation.
The sending module 140 is configured to send the machine behavior instruction to the robot to drive it to execute the instruction, so that the robot makes a corresponding output behavior as feedback to the monitored object. After the machine behavior instruction is generated it still resides in the cloud server, but the ultimate aim is for the robot to feed back on a certain behavior of the monitored object; the instruction processed by the cloud server is therefore transmitted to the robot immediately, and the robot executes the corresponding output behavior according to the machine behavior instruction, presenting the corresponding feedback to the monitored object.
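As a toy end-to-end view of how the four modules could chain together, the sketch below wires a minimal receive/analyze/generate/send pipeline; the class name, the in-memory dictionaries standing in for the behavior type library and correspondence database, and the print-based dispatch are all assumptions for illustration.

```python
class RobotInteractionServer:
    """Toy pipeline mirroring the four modules described above."""

    def __init__(self, behavior_library, correspondence_db):
        self.behavior_library = behavior_library      # feature data -> behavior type
        self.correspondence_db = correspondence_db    # behavior type -> machine instruction

    def receive(self, uploaded):                      # receiving module 110
        return uploaded.get("features", [])

    def analyze(self, features):                      # analysis module 120
        return next((t for f, t in self.behavior_library.items() if f in features), None)

    def generate(self, behavior_type):                # data generation module 130
        return self.correspondence_db.get(behavior_type)

    def send(self, instruction):                      # sending module 140
        print("dispatching to robot:", instruction)   # stands in for the network call

    def handle(self, uploaded):
        behavior_type = self.analyze(self.receive(uploaded))
        instruction = self.generate(behavior_type)
        if instruction:
            self.send(instruction)

server = RobotInteractionServer(
    behavior_library={"keyword:cup": "fetch_object"},
    correspondence_db={"fetch_object": {"action": "navigate_and_grasp", "target": "cup"}},
)
server.handle({"features": ["keyword:cup"]})
```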
The robot interaction control method and server provided by this embodiment of the invention greatly improve the intelligence of the robot and make human-robot interaction more convenient: by means of the robot's audio, video or sensor equipment, certain actions, speech and even expressions of the robot's monitored object can be analyzed and corresponding feedback behaviors produced, so that people can interact with the robot in a variety of forms.
Another embodiment of the present invention provides a robot interaction control method, as shown in fig. 3, the interaction control method including the steps of:
S210: acquiring the audio and video stream or related sensor data of the monitored object captured by the camera unit.
Computer technology now develops at a remarkable pace, and chips that are small in size yet powerful in performance are available in great variety, so the robot itself can increasingly serve as an integrated body for information reception, information processing and information execution, with the cloud server as a supplement, greatly improving information processing efficiency. In this embodiment, as the first step in realizing interaction, the robot must first acquire raw information that is as rich as possible. This raw information comprises the various behaviors or instructions of the user, that is, of the robot's monitored object. It includes audio and video content acquired through the camera mounted on the robot, represented on the robot as audio and video streams; described anthropomorphically, this is the sound the robot "hears" and the images it "sees", for example a microphone mounted on the robot recording the monitored object's voice, or a camera recording a continuous image sequence of the monitored object's actions. The related sensor data is the data collected by the various sensors mounted on the robot, such as distance sensors, temperature sensors, acceleration sensors and odor sensors; these sensors are chosen according to the intended use or function, corresponding to the robot's sense organs, and the sensor data obtained is its "sensation". After the robot acquires the raw data, it passes the data to its other processing modules for further processing, and the powerful data processing capability of the computer carried by the robot allows the processing to be performed quickly and accurately.
S220: and matching corresponding behavior types from a preset behavior type library according to the feature data extracted from the audio and video stream or the sensor data.
After acquiring the audio and video stream or sensor data, the robot analyzes and processes it to determine the behavior pattern of the monitored object that the data represents. Each kind of data always contains fundamental feature data that distinguish different behaviors; this feature data can be compiled into a related feature database based on the summarization of people in the relevant field, a given behavior being characterized by one piece of feature data or by several together, and in some cases even the same set of feature data in a different order representing a different behavior pattern. The feature data of each behavior is classified and combined in one database to form the predetermined behavior type library. When the robot parses the feature data in the audio/video stream or sensor data, it searches the predetermined behavior type library for a match, finds the behavior corresponding to the matched feature data, and thereby determines the behavior type corresponding to the audio/video stream or sensor data. It will be appreciated that some behavior actions involve more than one piece of feature data, and that the more complete the feature data, the more accurately it describes the behavior action; the feature data for each behavior in the predetermined behavior type library should therefore be continuously updated and refined, as should the robot's ability to parse the raw data.
Preferably, the specific method for matching a corresponding behavior type from the predetermined behavior type library according to the feature data extracted from the audio/video stream is as follows: extract audio characteristic information or keyword information from the audio data in the audio and video stream, and match that characteristic information or keyword information with pre-stored audio characteristic information or pre-stored keyword information in the predetermined behavior type library, thereby determining the behavior type corresponding to the audio data. A segment of voice information contains one or more pieces of keyword information such as time, place, person, object or reason, and can also include audio characteristic information such as sound frequency, tone and timbre; a person skilled in the art can appropriately extract the audio characteristic information or the keyword information from the audio data in the audio/video stream. Once the audio characteristic information or the keyword information is obtained through analysis, it is compared with data pre-stored in the cloud server, that is, it is searched for in the predetermined behavior type library; once a match is found, the pre-stored behavior type corresponding to the pre-stored audio characteristic information or pre-stored keyword information is determined, and the robot thereby determines the behavior type corresponding to the audio data. For example, the robot hears the monitored object say "help me bring the cup on the table". The robot extracts the key information "bring the cup on the table", or further the service object "me", the operation object "cup", the position of the operation object "table" and the operation content "bring". The actual extraction of audio characteristic information or keyword information is not limited to this way and can be adapted to actual needs. The robot then searches the predetermined behavior type library using these keywords, finds the corresponding pre-stored audio characteristic information or pre-stored keyword information, and thereby determines that the behavior type is "bring the cup to me". With this technical scheme, even an elderly person or a child who does not know how to operate a computer can send instructions to the robot through ordinary speech and achieve interactive communication with the robot. A minimal sketch of this keyword matching follows.
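For ease of understanding only, the following Python fragment gives a minimal sketch of the keyword matching described above. The library entries, the names BEHAVIOR_TYPE_LIBRARY, extract_keywords and match_audio_behavior, and the simple overlap score are illustrative assumptions of this description, not features of the invention.

BEHAVIOR_TYPE_LIBRARY = {
    # behavior type -> pre-stored keyword information (assumed entries)
    "bring cup to me": {"bring", "cup", "me", "table"},
    "play national song": {"sing", "national", "song"},
}

def extract_keywords(transcribed_text):
    """Very rough stand-in for real speech-to-text plus keyword extraction."""
    return {w.strip(".,?!\"").lower() for w in transcribed_text.split()}

def match_audio_behavior(transcribed_text):
    """Return the behavior type whose pre-stored keywords best overlap the input."""
    keywords = extract_keywords(transcribed_text)
    best_type, best_score = None, 0
    for behavior_type, stored_keywords in BEHAVIOR_TYPE_LIBRARY.items():
        score = len(keywords & stored_keywords)
        if score > best_score:
            best_type, best_score = behavior_type, score
    return best_type

# e.g. match_audio_behavior("help me bring the cup on the table")
# would return "bring cup to me" under these assumed library entries.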
Preferably, the specific method for matching a corresponding behavior type from the predetermined behavior type library according to the feature data extracted from the audio/video stream is as follows: extract video characteristic frame information from the video data in the audio and video stream, and match that video characteristic frame information with pre-stored video characteristic frame information in the predetermined behavior type library, thereby determining the behavior type corresponding to the video data. Video is composed of many static pictures taken over a period of time; each picture is in principle one frame, and the consecutive frames make up the video. Each image has characteristic features, such as the basic positions of image elements, line shapes and the distribution of light and shade, and the technology for analyzing picture characteristics is image recognition, so the video characteristic frame information in the video data is extracted through image recognition and usually consists of multiple frames. One of the databases on the cloud server is the predetermined behavior type library, which contains multiple behavior types; each behavior type directory includes several pieces of pre-stored video characteristic frame information (clearly, the more pre-stored video characteristic frame information there is, the more accurately the corresponding predetermined behavior type can be identified). When the cloud server detects that the video characteristic frame information extracted from the audio/video stream matches the pre-stored video characteristic frame information in a certain behavior type directory, the behavior type corresponding to the video data in the audio/video stream is determined. For example, when the camera of the robot captures a picture of a crying child who is the monitored object, the robot analyzes the video stream containing the crying child and extracts the video characteristic frame information in it, which may include frames showing the child's face streaming with tears or wearing a crying expression; it compares those frames with the video characteristic frame information pre-stored in its database, matches them with the pre-stored crying facial characteristics of a child, and correspondingly judges that the child is crying. Or, for example, the monitored object makes a gesture toward the robot in which the fingers are held together and the palm sweeps to the left; the robot captures this, parses out the video characteristic frame information describing the gesture, searches its pre-stored behavior type library, finds identical pre-stored video characteristic frame information in the behavior type directory "move to the left", and correspondingly judges that the monitored object wants it to move to the left. In conclusion, this technical scheme enables an elderly person or a child who does not know how to operate a computer to send instructions to the robot through certain actions of their own and achieve communication and interaction with the robot. A simplified sketch of such frame matching follows.
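The fragment below illustrates, under stated assumptions, how extracted video feature frames might be matched against pre-stored feature frames. The frame "features" are treated as plain numeric vectors and the cosine-similarity test is only a placeholder for a real image-recognition pipeline; the names and thresholds are illustrative, not part of the patent text.

import math

def frame_similarity(frame_a, frame_b):
    """Cosine similarity between two feature vectors (assumed representation)."""
    dot = sum(a * b for a, b in zip(frame_a, frame_b))
    norm = math.sqrt(sum(a * a for a in frame_a)) * math.sqrt(sum(b * b for b in frame_b))
    return dot / norm if norm else 0.0

def match_video_behavior(feature_frames, behavior_library, threshold=0.9):
    """behavior_library: {behavior_type: [pre-stored feature frames]}."""
    for behavior_type, stored_frames in behavior_library.items():
        hits = sum(
            1
            for frame in feature_frames
            if any(frame_similarity(frame, s) >= threshold for s in stored_frames)
        )
        # Require most of the extracted frames to match a stored frame of this type.
        if feature_frames and hits / len(feature_frames) > 0.5:
            return behavior_type
    return None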
As another preferred scheme, a specific method of matching a corresponding behavior type from the predetermined behavior type library according to the feature data extracted from the sensor data is as follows: extract a specific data field or information meaning from the sensor data, and match that specific data field or information meaning with a pre-stored data field or pre-stored information meaning in the predetermined behavior type library, thereby determining the behavior type corresponding to the sensor data. The sensor data include various types of data; over different periods of time there are idle-segment data (data produced while the sensor is in a standby state) and working-segment data, and the working-segment data inevitably include time data, detection content data, intensity data (measuring the degree of expression of the detected content, such as force or speed), and so on. Specific data fields or information meanings are extracted from the sensor data according to a certain algorithm or processing method, and technicians skilled in sensor data processing can work out the specific algorithm or processing method. The predetermined behavior type library contains multiple behavior types, each of which includes one or more corresponding pre-stored data fields or pre-stored information meanings; the robot searches the pre-stored data fields or pre-stored information meanings using the data fields or information meanings extracted from the sensor data, and once a match is found, the behavior type corresponding to the sensor data is determined. According to different functional requirements the robot carries different types of sensors: in a home, for example, a temperature sensor may be needed to detect the room temperature or the body temperature of family members; it may also be necessary to detect whether there is a natural gas leak in the kitchen, so a gas detection sensor may be installed; or the robot may be required to accompany and look after a certain family member, in which case it must follow that member and therefore needs at least a geographic information sensor. A small sketch of such field matching follows.
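For illustration only, the following sketch parses a working-segment sensor record into a specific data field and matches it against pre-stored fields in the behavior type library. The record layout, thresholds and behavior names are assumptions introduced here and are not limiting.

SENSOR_BEHAVIOR_LIBRARY = {
    # behavior type -> pre-stored data-field predicate (assumed entries)
    "room too hot": lambda r: r["sensor"] == "temperature" and r["value"] > 30.0,
    "gas leak": lambda r: r["sensor"] == "gas" and r["value"] > 0.05,
}

def match_sensor_behavior(record):
    """record: e.g. {"sensor": "temperature", "time": 1700000000, "value": 31.2}."""
    for behavior_type, predicate in SENSOR_BEHAVIOR_LIBRARY.items():
        if predicate(record):
            return behavior_type
    return None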
S230: determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database.
After the behavior type corresponding to the audio/video stream or the sensor data is determined in step S220, that behavior type is looked up in the robot's database, the object searched being the preset corresponding relation database. The preset corresponding relation database contains behavior types and the machine behavior instructions corresponding to them. When a certain behavior type in the preset corresponding relation database is matched, the corresponding machine behavior instruction is obtained and the next operation can proceed. Preferably, the preset corresponding relation database includes preset behavior type data; a number of preset behavior type data entries combine to form a preset behavior type data table, and each entry corresponds to an entry in a machine behavior instruction table, that is, the preset behavior type data and the machine behavior instructions have a mapping relation. This mapping is not one-to-one: several different behavior types may correspond to the same machine behavior instruction. For example, the voice behavior "bring a cup" and a gesture expressing the same meaning may both map to a final machine behavior instruction for realizing "bring a cup". The specific steps for determining the machine behavior instruction corresponding to the behavior type from the preset corresponding relation database are: using the behavior type as a keyword, retrieve the preset behavior type data table in the corresponding relation database, determine the matched preset behavior type, and generate the machine behavior instruction by mapping according to the mapping relation of that preset behavior type. Of course, the instruction may also be computed: the robot determines the keyword information and processes the related information accordingly to obtain a machine behavior instruction. For a behavior such as "bring the cup to me", for example, the robot knows the specific object "cup" and its position, knows who "me" refers to and where "me" is, and computes the machine behavior instruction from its own position and the geographic information of the various objects in the scene; executing the machine behavior instruction then completes the task. A deeper implementation can be realized by a person skilled in the art through programming. The richer the preset behavior types and the preset corresponding relation database, the richer the behaviors the robot can feed back, so the predetermined behavior type library and/or the preset corresponding relation database need to be continuously updated and supplemented. They can be set, supplemented or updated directly by a user through an interface provided on the robot, or made richer in data sources and closer to reality by connecting to a cloud server for downloads and updates. This technical scheme thus allows the relevant technical personnel to make full use of Internet big data to analyze the details of human behavior, continuously improve the predetermined behavior type library, fully raise the intelligence of the robot and make its personification more complete. A sketch of such a mapping table follows.
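For ease of understanding, the sketch below shows one possible shape of the preset corresponding relation database: a behavior-type table mapped onto a machine-behavior-instruction table, where several behavior types may point to the same instruction. The table contents and identifiers are assumptions of this description.

MACHINE_INSTRUCTION_TABLE = {
    "FETCH_CUP": {"action": "fetch", "object": "cup", "target": "speaker"},
    "PLAY_SONG": {"action": "play_audio", "track": "national_song"},
}

CORRESPONDENCE_DATABASE = {
    # behavior type (keyword) -> machine instruction id; not one-to-one
    "bring cup to me": "FETCH_CUP",
    "gesture: hand to mouth": "FETCH_CUP",   # different behavior, same instruction
    "play national song": "PLAY_SONG",
}

def determine_machine_instruction(behavior_type):
    """Retrieve the instruction mapped to the matched behavior type, if any."""
    instruction_id = CORRESPONDENCE_DATABASE.get(behavior_type)
    return MACHINE_INSTRUCTION_TABLE.get(instruction_id) if instruction_id else None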
Preferably, the machine behavior instructions include: voice instructions that drive the robot voice playing interface to output voice, and/or video instructions that drive the robot video playing interface to output video, and/or action execution instructions that drive the robot motion unit to output actions. Faced with the many ways humans express themselves, the robot likewise has many ways of expressing behavior. It may express itself through sound: for example, when a person asks the robot to play a certain piece of audio by saying "let's sing the national song together", the robot receives the request, processes it, and outputs a machine instruction "play the national song" that drives the robot to start its sound unit and play the corresponding audio. Or the robot captures a certain expression of the person, such as a smile, and through analysis and processing obtains corresponding machine instructions, so that its video playing interface can be driven to play a cheerful video or display a smiling-face expression, or corresponding action execution instructions are obtained that drive a robot motion unit, such as a hand, instructing the arm and palm mechanisms to perform a clapping action. These kinds of machine instructions may operate individually or simultaneously. For example, when a person says "let's dance" to the robot and the robot has judged this voice information, it issues a machine behavior instruction that may cause its sound system to play music, cause a video playing interface such as a display screen to play a corresponding dance picture or make an expression, and drive the relevant motion units of the robot, such as the leg and hand motion units, to move with the music. The content of these machine behavior instructions can be implemented programmatically by a person skilled in the relevant art; an illustrative dispatcher is sketched below.
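The following dispatcher is a non-limiting sketch showing that a single machine behavior instruction may drive the voice interface, the video interface and the motion units alone or together, as in the "let's dance" example. The unit objects, field names and the commented usage are assumptions for illustration.

def execute_machine_instruction(instruction, sound_unit, video_unit, motion_units):
    if "audio" in instruction:
        sound_unit.play(instruction["audio"])            # voice instruction
    if "video" in instruction:
        video_unit.show(instruction["video"])            # video instruction
    for action in instruction.get("actions", []):        # action execution instructions
        motion_units[action["unit"]].move(action["command"])

# e.g. a "dance" instruction could carry audio, video and leg/arm actions at once:
# execute_machine_instruction(
#     {"audio": "music.mp3", "video": "dance.mp4",
#      "actions": [{"unit": "legs", "command": "step"},
#                  {"unit": "arms", "command": "wave"}]},
#     sound_unit, video_unit, motion_units)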
S240: driving the relevant interfaces or units of the robot to make corresponding output behaviors according to the machine behavior instruction, the output behaviors serving as feedback to the monitored object.
After the robot determines the machine behavior instruction, it needs to pass the instruction to its own components, such as the sound playing interface, the video playing interface and/or the related moving components (e.g., the various servo motors). Execution of the behavior instruction may follow directly from the code contained in a detailed behavior instruction, for example a simple instruction to play music or to play a video; alternatively, the actual executable code for each component can be computed by the built-in computer from the behavior instruction using the robot's processing capability. For example, when the behavior instruction is "go to the side of the table in the bedroom", the robot calculates, judges and executes according to its own position, the position of the bedroom, the position of the table in the bedroom, the layout of the whole room it is in, the distances between the objects, and so on, as sketched below.
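Purely as an illustration, the fragment below turns a high-level instruction such as "go to the table in the bedroom" into a heading and distance, using the robot's own position and an assumed simplified map of the room; the straight-line planner is a placeholder for real obstacle avoidance and path planning.

import math

def plan_move(robot_pos, target_name, room_map):
    """room_map: {object name: (x, y)} -- an assumed, simplified world model."""
    target = room_map[target_name]
    dx, dy = target[0] - robot_pos[0], target[1] - robot_pos[1]
    distance = math.hypot(dx, dy)
    heading = math.degrees(math.atan2(dy, dx))
    # A real robot would insert obstacle avoidance and path planning here.
    return {"heading_deg": heading, "distance_m": distance}

# plan_move((0, 0), "bedroom_table", {"bedroom_table": (3.0, 4.0)})
# -> {"heading_deg": 53.13..., "distance_m": 5.0}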
As a preferred scheme, when the sensor data are geographic information sensor data, the robot processes them as follows: extract the geographic information data from the sensor data, match the geographic information data with geographic information data pre-stored in the robot, determine the behavior type represented by the geographic information data, and determine from the preset corresponding relation database the machine behavior instruction corresponding to that behavior type, the machine behavior instruction including speed information, direction information and distance information; the behavior instruction is then sent to the robot to drive it to execute the machine behavior instruction, so that the robot performs the corresponding output behavior and the geographic position of the robot relative to the monitored object changes. For example, when the robot is required to be the companion of a certain family member (usually an elderly person or a child), it is set to always stay within a certain distance range of the monitored object; when the monitored object moves, the robot moves accordingly and re-establishes the set distance range with the monitored object. When the monitored object moves beyond the set distance range, the geographic information sensor on the robot acquires the geographic information of the monitored object, such as distance, longitude and latitude, the presence of obstacles, or relative coordinates; the robot performs the corresponding processing or computation to obtain the machine behavior instruction it needs to execute, which includes the required movement speed, movement direction, movement distance or path, and so on. Finally the geographic information of the robot (its position or distance relative to the monitored object) changes as the monitored object moves, the distance between the robot and the monitored object is continuously kept within a certain range, and the accompanying of the monitored object is accomplished. A simplified sketch of this following behavior is given below.
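The sketch below illustrates the accompanying behavior described above: keep the robot within a set distance of the monitored object and move faster when the object moves faster. The range, speed limits and the simple proportional speed rule are illustrative assumptions, not claimed values.

import math

FOLLOW_RANGE_M = 1.5          # desired maximum separation (assumed)
MAX_SPEED_M_S = 1.0           # robot speed limit (assumed)

def follow_instruction(robot_pos, target_pos, target_speed):
    dx, dy = target_pos[0] - robot_pos[0], target_pos[1] - robot_pos[1]
    separation = math.hypot(dx, dy)
    if separation <= FOLLOW_RANGE_M:
        return None                                   # already close enough
    return {
        "direction_deg": math.degrees(math.atan2(dy, dx)),
        "distance_m": separation - FOLLOW_RANGE_M,
        # move faster when the child runs, slower when the child crawls
        "speed_m_s": min(MAX_SPEED_M_S, max(0.2, 1.2 * target_speed)),
    }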
As another preferred scheme, the sensor data are touch screen sensor data: the touch operation information in the touch screen sensor data is extracted and matched with pre-stored touch operation information in the predetermined behavior type library, thereby determining the behavior type corresponding to the touch screen sensor data. A touch screen sensor is arranged on the robot; an ordinary touch screen, for example, is a typical touch sensor device on which a number of common instruction options are presented. After the robot's monitored object selects an instruction on the touch screen, the robot extracts the touch operation information it represents and generates the corresponding behavior type, then determines the corresponding machine behavior instruction from that behavior type and has the relevant robot components execute it. This technical scheme reduces the difficulty of instruction execution and the burden on the robot's built-in computer, so the robot processes the data faster, the instruction output is more accurate, and the robot ultimately feeds back the monitored object's instruction more quickly. A minimal mapping is sketched below.
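A minimal, non-limiting sketch: a touch on a preset on-screen option maps directly to a behavior type, bypassing audio/video analysis. The option identifiers and behavior names are assumptions of this description.

TOUCH_OPTION_TO_BEHAVIOR = {
    "btn_fetch_water": "bring cup to me",
    "btn_play_music": "play national song",
}

def match_touch_behavior(touched_option_id):
    """Return the behavior type pre-stored for the touched option, if any."""
    return TOUCH_OPTION_TO_BEHAVIOR.get(touched_option_id)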
Preferably, since the robot is generally a valuable device, identity recognition is a function it often needs. From development through deployment the robot comes into contact with many people: in the debugging stage after assembly it is handled by researchers, debugging personnel and so on, and in use it encounters users of different identities. These people should have different permission levels with respect to the robot, and within their level are correspondingly granted the instruction-modification rights, system-update rights or usage rights allowed to that level. Therefore, before the step of receiving the audio and video stream or related sensor data, uploaded by the robot and belonging to the monitored object, obtained by the robot's camera unit, the method further comprises: receiving biometric information sensor data uploaded by the robot; when the biometric information sensor data match pre-stored biometric information, determining the identity authority of the user corresponding to the biometric information sensor data; and generating a corresponding preset feedback instruction in response to that identity authority. An identity verification operation is thus set before the robot receives the audio and video stream or related sensor data and carries out the corresponding operational behavior, so that the robot recognizes the identity of the monitored object and performs subsequent operations only after confirming the corresponding authority. For example, facing family members, the robot obeys commands and performs actions according to the identity authority attached to the pre-stored biometric information; for a visiting guest whose biometric information is not among the pre-stored data, the robot will not obey the guest's instructions, or will act only according to a preset instruction corresponding to guests, such as offering a greeting. Preferably, the biometric information sensor data include at least one of fingerprint information data, iris information data or voice information data; the identity authority of the user is determined by comparison with preset identity authority information, and the corresponding preset feedback authority is opened in response to that identity authority. Accurate identity verification can be obtained through biometric information, and biometric information is very convenient to acquire. A sketch of such a permission check follows.
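For illustration, the fragment below sketches the identity-verification step that precedes normal interaction: compare uploaded biometric data with pre-stored records, look up the user's permission level, and fall back to a preset guest response on no match. The records, permission levels and feedback labels are assumptions of this description.

PRESTORED_BIOMETRICS = {
    "fingerprint:8f3a": {"user": "parent", "level": "full"},
    "voiceprint:21bc": {"user": "child", "level": "basic"},
}

def check_identity(biometric_key):
    record = PRESTORED_BIOMETRICS.get(biometric_key)
    if record is None:
        # Unknown person: do not accept commands, respond with a preset greeting.
        return {"level": "guest", "feedback": "play_greeting"}
    return {"level": record["level"], "user": record["user"],
            "feedback": "accept_commands"}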
In accordance with the foregoing method and following a modular design, another embodiment of the present invention provides a robot, as shown in fig. 4, including:
The acquiring module 210 is configured to acquire the audio and video stream or related sensor data of the monitored object acquired by the camera unit.
The analysis module 220 is configured to match a corresponding behavior type from the predetermined behavior type library according to the feature data extracted from the audio/video stream or the sensor data, and to determine the machine behavior instruction corresponding to that behavior type from the preset corresponding relation database.
The execution module 230 is configured to drive the relevant interfaces or units of the robot to make corresponding output behaviors according to the behavior instruction, the output behaviors serving as feedback to the monitored object.
The robot interaction method and the robot provided by this further embodiment of the invention better realize interaction between robots and humans, ensuring that the robot has a higher level of intelligence, handles problems more efficiently, and reaches a higher degree of personification. In addition, by making reasonable use of cloud big data, the robot continuously updates the types and amount of data it uses for behavior judgment and response, so that the range of behaviors it can respond to becomes wider, deeper and more accurate, and good interaction between human and robot is finally achieved.
To help those skilled in the art understand the implementation of the present invention more easily, the following examples describe how interaction between a human and a robot can be accomplished in practical scenarios.
One scenario is as follows: in a home with a good network connection, a child says to the robot accompanying him: "I want to drink water." The robot captures this voice information through the microphone arranged on its head and immediately sends it to the cloud server through the network; the cloud server sends back a behavior instruction to pour a cup of water for the child. After obtaining the behavior instruction, the robot drives its various moving parts to operate according to the instruction, and finally goes to the water dispenser or drinking bottle, obtains a cup of water, and brings it to the child.
The second scenario is as follows: in a home with a good network connection, a child gets down from a chair in the bedroom and walks to the sofa in the living room. The robot acting as a follower detects the change in the child's position and finds that the distance between itself and the child exceeds a certain range; it immediately sends this information to the cloud server, which after processing judges that the child's position has changed and issues a movement instruction to the robot. Each part of the robot then moves according to the action instruction and follows the child: if the child runs to the living room, the robot travels relatively fast, and if the child crawls, the robot travels relatively slowly, until the distance between them once again falls within the set range.
The third scenario: in a home with a good network connection, a guest arrives unexpectedly one day. The robot, serving as a member of the household, captures the guest's facial image through its camera and immediately uploads it to the server; the server identifies the guest's identity by comparison and sends an action instruction to the robot. If the guest is an acquaintance, the action instruction includes hospitable behavior such as guiding the guest to a designated area to rest; if the guest is a stranger, the action instruction includes a polite greeting played through the audio playing interface, and the robot simultaneously begins recording a series of behavior information about the stranger.
Or, as a further scenario:
in the home, the robot is the companion of a young child. When the child is outdoors, the robot determines its own movement speed and its distance to the child by measuring the distance to the child and the child's movement speed. When the child points to a flower and asks, "What flower is this?", the microphone on the robot acquires the sound information and judges that the child wants to obtain information; the camera unit on the robot observes the direction the child is pointing and determines the object being asked about; the robot then finds the information related to that object in its own database and explains, in the form of voice playback, the relevant information about the flower the child pointed to, such as its name and variety.
It will be understood by those within the art that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. Those skilled in the art will appreciate that the computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, implement the features specified in the block or blocks of the block diagrams and/or flowchart illustrations of the present disclosure.
Those of skill in the art will appreciate that various operations, methods, steps in the processes, acts, or solutions discussed in the present application may be alternated, modified, combined, or deleted. Further, various operations, methods, steps in the flows, which have been discussed in the present application, may be interchanged, modified, rearranged, decomposed, combined, or eliminated. Further, steps, measures, schemes in the various operations, methods, procedures disclosed in the prior art and the present invention can also be alternated, changed, rearranged, decomposed, combined, or deleted.
The foregoing describes only some embodiments of the present invention, and it should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements should also be regarded as falling within the protection scope of the present invention. In summary, the technical solutions provided by the present invention are as follows:
a1, a robot interaction control method, comprising the following steps:
receiving audio and video streams or related sensor data which are uploaded by the robot and belong to a monitored object obtained by a camera unit of the robot;
matching corresponding behavior types from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data;
determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database;
and sending the machine behavior instruction to the robot to drive the robot to execute the machine behavior instruction, so that the robot executes a corresponding output behavior as feedback to a monitored object.
A2, according to the robot interaction control method of A1, the step of matching out a corresponding behavior type from a predetermined behavior type library according to the feature data extracted from the audio/video stream includes:
extracting audio characteristic information or keyword information of audio data in the audio and video stream;
and matching the characteristic information or the keyword information with pre-stored audio characteristic information or pre-stored keyword information in a preset behavior type library so as to determine the behavior type corresponding to the audio data.
A3, according to the robot interaction control method of A1, the step of matching out a corresponding behavior type from a predetermined behavior type library according to the feature data extracted from the audio/video stream includes:
extracting video characteristic frame information of video data in the audio and video stream;
and matching the video characteristic frame information with pre-stored video characteristic frame information in a preset behavior type library so as to determine the behavior type corresponding to the video data.
A4, according to the robot interaction control method of A3, the video feature frame information is composed of picture information recording behavior actions, a plurality of consecutive pieces of such picture information within a predetermined time constituting the video feature frame information.
A5, the method for robot interaction according to A1, wherein the step of matching out corresponding behavior types from a predetermined behavior type library according to sensor data comprises:
extracting a specific data field or information meaning in the sensor data;
and matching the specific data field or the information meaning with a prestored data field or prestored information meaning in a preset behavior type library so as to determine the behavior type corresponding to the sensor data.
A6, according to the robot interaction control method of A5, the sensor data is geographic information sensor data, geographic information data in the geographic information sensor data are extracted, the geographic information data are matched with pre-stored geographic information data, so that a behavior type represented by the geographic information sensor data is determined, a machine behavior instruction corresponding to the behavior type is determined from a preset corresponding relation database, the machine behavior instruction comprises speed information, direction information and route information, the behavior instruction is sent to a robot to drive the robot to execute the machine behavior instruction, the robot executes corresponding output behaviors, and geographic information of the robot relative to a monitored object is changed.
A7, according to the robot interaction control method in A5, the sensor data are touch screen sensor data, touch operation information in the touch screen sensor data is extracted and matched with pre-stored touch operation information in a preset behavior type library, and accordingly the behavior type corresponding to the touch screen sensor data is determined.
A8, according to the robot interaction method A1, the preset corresponding relation database comprises a preset behavior type data table and a machine behavior instruction table which has a mapping relation with the preset behavior type data table; the specific steps of determining the machine behavior instruction corresponding to the behavior type from the preset corresponding relation database are as follows:
and searching a preset behavior type data table in a corresponding relation database by taking the behavior type as a keyword, determining the matched preset behavior type, and generating a machine behavior instruction according to the mapping relation mapping of the preset behavior type.
A9, the robot interaction control method of A1, the machine behavior instructions comprising:
voice instructions to drive the robot voice playing interface to output voice, and/or
video instructions to drive the robot video playing interface to output video, and/or
action execution instructions to drive the robot motion unit to output actions.
A10, according to the robot interaction method A1, before the step of receiving the audio/video stream or the related sensing data which are uploaded by the robot and belong to the monitored object and acquired by the camera unit of the robot, the method further comprises the steps of receiving the biological characteristic information sensor data uploaded by the robot, determining the identity authority of the user corresponding to the biological characteristic information sensor data when the biological characteristic information sensor data is matched with the pre-stored biological characteristic information, and generating a corresponding preset feedback instruction in response to the identity authority.
A11, the robot interaction method according to A10, wherein the biometric sensor data includes at least one of fingerprint information data, iris information data or voice information data, the identity authority of the user is determined by comparing with preset identity authority information, and a corresponding preset feedback authority is opened in response to the identity authority.
A12, according to the robot interaction control method of A1, the predetermined behavior type library and/or the preset corresponding relation database are set or updated by a cloud server or a user associated with the cloud server.
A13, the robot interaction control method according to A1, wherein the uploading mode is real-time uploading.
B14, a robot interaction control method, comprising the following steps:
acquiring audio and video stream or related sensor data of a monitored object acquired by a camera unit;
matching corresponding behavior types from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data;
determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database;
and driving the relevant interfaces or units of the robot to make corresponding output behaviors according to the machine behavior instructions, and taking the corresponding output behaviors as feedback of monitoring objects.
B15, according to the robot interaction control method of B14, the step of matching out the corresponding behavior type from a predetermined behavior type library according to the feature data extracted from the audio/video stream includes:
extracting audio characteristic information or keyword information of audio data in the audio and video stream;
and matching the characteristic information or the keyword information with pre-stored audio characteristic information or pre-stored keyword information in a preset behavior type library so as to determine the behavior type corresponding to the audio data.
B16, according to the robot interaction control method of B14, the step of matching out the corresponding behavior type from a predetermined behavior type library according to the feature data extracted from the audio/video stream includes:
extracting video characteristic frame information of video data in the audio and video stream;
and matching the video characteristic frame information with pre-stored video characteristic frame information in a preset behavior type library so as to determine the behavior type corresponding to the video data.
B17, the method for controlling robot interaction according to B14, wherein the step of matching out the corresponding behavior type from the predetermined behavior type library according to the sensor data comprises:
extracting a specific data field or information meaning in the sensor data;
and matching the specific data field or the information meaning with a prestored data field or prestored information meaning in a preset behavior type library so as to determine the behavior type corresponding to the sensor data.
B18, according to the robot interaction control method of B17, the sensor data is geographic information sensor data, geographic information data in the geographic information sensor data are extracted, the geographic information data are matched with pre-stored geographic information data, so that a behavior type represented by the geographic information sensor data is determined, a machine behavior instruction corresponding to the behavior type is determined from a preset corresponding relation database, the machine behavior instruction comprises speed information, direction information and route information, the behavior instruction is sent to a robot to drive the robot to execute the machine behavior instruction, the robot executes corresponding output behaviors, and geographic information of the robot relative to a monitored object is changed.
B19, according to the robot interaction control method of B17, the sensor data are touch screen sensor data, touch operation information in the touch screen sensor data is extracted and matched with pre-stored touch operation information in a preset behavior type library, and accordingly the behavior type corresponding to the touch screen sensor data is determined.
B20, according to the robot interaction control method of B14, the preset corresponding relation database comprises a preset behavior type data table and a machine behavior instruction table which has a mapping relation with the preset behavior type data table; the specific steps of determining the machine behavior instruction corresponding to the behavior type from the preset corresponding relation database are as follows:
and searching a preset behavior type data table in a corresponding relation database by taking the behavior type as a keyword, determining the matched preset behavior type, and generating a machine behavior instruction according to the mapping relation mapping of the preset behavior type.
B21, the robot interaction control method according to B14, the machine behavior instruction includes:
voice instructions to drive the robot voice playing interface to output voice, and/or
video instructions to drive the robot video playing interface to output video, and/or
action execution instructions to drive the robot motion unit to output actions.
B22, according to the robot interaction control method of B14, before the step of acquiring the audio/video stream or the related sensing data of the monitored object acquired by the camera unit, the method further comprises the steps of receiving the data of the biological characteristic information sensor, determining the identity authority of the user corresponding to the data of the biological characteristic information sensor when the data of the biological characteristic information sensor is matched with the pre-stored biological characteristic information, and generating a corresponding preset feedback instruction in response to the identity authority.
B23, according to the robot interaction control method of B22, the biometric information sensor data include at least one of fingerprint information data, iris information data or voice information data, the identity authority of the user is determined by comparison with preset identity authority information, and a corresponding preset feedback authority is opened in response to the identity authority.
B24, according to the robot interaction control method of B14, the preset behavior type library and/or the preset corresponding relation database are set by a user or downloaded from a cloud server.
B25, according to the robot interaction control method of B24, the downloading is performed through Bluetooth, a Wi-Fi network, a mobile data network or a wired network.
C26, a robot interaction server, comprising:
the receiving module is used for receiving audio and video streams or related sensor data which are uploaded by the robot and belong to a monitoring object of the robot and acquired by the camera shooting unit of the robot;
the analysis module is used for matching a corresponding behavior type from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data;
the data generation module is used for determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database;
and the sending module is used for sending the behavior instruction to the robot so as to drive the robot to execute the machine behavior instruction, so that the robot can make a corresponding output behavior as feedback to a monitoring object of the robot.
D27, a robot, comprising:
the acquisition module is used for acquiring audio and video stream or related sensor data of a monitored object, which is acquired by the camera unit;
the analysis module is used for matching a corresponding behavior type from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data, and then determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database;
and the execution module is used for driving the relevant interfaces or units of the robot to make corresponding output behaviors according to the behavior instructions, and the output behaviors are used as feedback of the monitored objects.

Claims (22)

1. A robot interaction control method is characterized by comprising the following steps:
receiving audio and video streams or related sensor data which are uploaded by the robot and belong to a monitored object obtained by a camera unit of the robot;
matching corresponding behavior types from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data;
the step of matching out the corresponding behavior type from a preset behavior type library according to the feature data extracted from the audio and video stream comprises the following steps:
extracting audio characteristic information or keyword information of audio data in the audio and video stream;
matching the characteristic information or the keyword information with pre-stored audio characteristic information or pre-stored keyword information in a preset behavior type library so as to determine a behavior type corresponding to the audio data;
the matching of the corresponding behavior types from a predetermined behavior type library according to the feature data extracted from the sensor data comprises:
extracting a specific data field or information meaning in the sensor data;
matching the specific data field or the information meaning with a prestored data field or prestored information meaning in a preset behavior type library so as to determine a behavior type corresponding to the sensor data; each of the behavior types includes one or more corresponding pre-stored data fields or information meanings;
determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database; the preset behavior type library and/or the preset corresponding relation database are set or updated by a cloud server or a user associated with the cloud server;
and sending the machine behavior instruction to the robot to drive the robot to execute the machine behavior instruction, so that the robot executes a corresponding output behavior as feedback to a monitored object.
2. The robot interaction control method according to claim 1, wherein the step of matching out a corresponding behavior type from a predetermined behavior type library based on the feature data extracted from the audio-video stream comprises:
extracting video characteristic frame information of video data in the audio and video stream;
and matching the video characteristic frame information with pre-stored video characteristic frame information in a preset behavior type library so as to determine the behavior type corresponding to the video data.
3. The robot interaction control method according to claim 2, wherein the video feature frame information is composed of picture information in which behavior actions are described, and a plurality of consecutive pieces of the picture information in a predetermined time are the video feature frame information.
4. The robot interaction control method according to claim 1, wherein the sensor data is geographic information sensor data, the geographic information data in the geographic information sensor data is extracted, the geographic information data is matched with pre-stored geographic information data to determine a behavior type represented by the geographic information sensor data, a machine behavior instruction corresponding to the behavior type is determined from a preset correspondence database, the machine behavior instruction comprises speed information, direction information and route information, the behavior instruction is sent to a robot to drive the robot to execute the machine behavior instruction, so that the robot executes a corresponding output behavior, and the geographic information of the robot relative to a monitored object is changed.
5. The robot interaction control method according to claim 1, wherein the sensor data is touch screen sensor data, touch operation information in the touch screen sensor data is extracted and matched with pre-stored touch operation information in a predetermined behavior type library, and thus a behavior type corresponding to the touch screen sensor data is determined.
6. The robot interaction control method according to claim 1, wherein the preset correspondence database includes a preset behavior type data table and a machine behavior instruction table having a mapping relationship with the preset behavior type data table; the specific steps of determining the machine behavior instruction corresponding to the behavior type from the preset corresponding relation database are as follows:
and searching a preset behavior type data table in a corresponding relation database by taking the behavior type as a keyword, determining the matched preset behavior type, and generating a machine behavior instruction according to the mapping relation mapping of the preset behavior type.
7. The robot interaction control method of claim 1, wherein the machine behavior instruction comprises:
voice instructions to drive the robot voice playing interface to output voice, and/or
video instructions to drive the robot video playing interface to output video, and/or
action execution instructions to drive the robot motion unit to output actions.
8. The robot interaction control method according to claim 1, further comprising, before the step of receiving the audio/video stream or the related sensing data of the monitored object, which is uploaded by the robot and acquired by the camera unit, receiving biometric information sensor data uploaded by the robot, determining the identity authority of the user corresponding to the biometric information sensor data when the biometric information sensor data matches with pre-stored biometric information, and generating a corresponding preset feedback instruction in response to the identity authority.
9. The robot interaction control method of claim 8, wherein the biometric information sensor data includes at least one of fingerprint information data, iris information data, or voice information data, and an authentication authority of the user is determined by comparing with preset authentication authority information, and a corresponding preset feedback authority is opened in response to the authentication authority.
10. The robot interaction control method according to claim 1, wherein the uploading is real-time uploading.
11. A robot interaction control method is characterized by comprising the following steps:
acquiring audio and video stream or related sensor data of a monitored object acquired by a camera unit;
matching corresponding behavior types from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data;
the step of matching out the corresponding behavior type from a preset behavior type library according to the feature data extracted from the audio and video stream comprises the following steps:
extracting audio characteristic information or keyword information of audio data in the audio and video stream;
matching the characteristic information or the keyword information with pre-stored audio characteristic information or pre-stored keyword information in a preset behavior type library so as to determine a behavior type corresponding to the audio data;
the matching of the corresponding behavior types from a predetermined behavior type library according to the feature data extracted from the sensor data comprises:
extracting a specific data field or information meaning in the sensor data;
matching the specific data field or the information meaning with a prestored data field or prestored information meaning in a preset behavior type library so as to determine a behavior type corresponding to the sensor data; each of the behavior types includes one or more corresponding pre-stored data fields or information meanings;
determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database; the preset behavior type library and/or the preset corresponding relation database are/is downloaded from a cloud server;
and driving the relevant interfaces or units of the robot to make corresponding output behaviors according to the machine behavior instructions, and taking the corresponding output behaviors as feedback of monitoring objects.
12. The robot interaction control method according to claim 11, wherein the step of matching out a corresponding behavior type from a predetermined behavior type library based on the feature data extracted from the audio-video stream comprises:
extracting video characteristic frame information of video data in the audio and video stream;
and matching the video characteristic frame information with pre-stored video characteristic frame information in a preset behavior type library so as to determine the behavior type corresponding to the video data.
13. The robot interaction control method according to claim 11, wherein the sensor data is geographic information sensor data, the geographic information data in the geographic information sensor data is extracted, the geographic information data is matched with pre-stored geographic information data to determine a behavior type represented by the geographic information sensor data, a machine behavior command corresponding to the behavior type is determined from a preset correspondence database, the machine behavior command includes speed information, direction information and route information, the behavior command is sent to the robot to drive the robot to execute the machine behavior command, so that the robot executes a corresponding output behavior, and the geographic information of the robot relative to a monitored object is changed.
14. The robot interaction control method according to claim 11, wherein the sensor data is touch screen sensor data, touch operation information in the touch screen sensor data is extracted and matched with pre-stored touch operation information in a predetermined behavior type library, and thus a behavior type corresponding to the touch screen sensor data is determined.
15. The robot interaction control method according to claim 11, wherein the preset correspondence database includes a preset behavior type data table and a machine behavior instruction table having a mapping relationship with the preset behavior type data table; the specific steps of determining the machine behavior instruction corresponding to the behavior type from the preset corresponding relation database are as follows:
and searching a preset behavior type data table in a corresponding relation database by taking the behavior type as a keyword, determining the matched preset behavior type, and generating a machine behavior instruction according to the mapping relation mapping of the preset behavior type.
16. The robot interaction control method of claim 11, wherein the machine behavior instruction comprises:
voice instructions to drive the robot voice playing interface to output voice, and/or
video instructions to drive the robot video playing interface to output video, and/or
action execution instructions to drive the robot motion unit to output actions.
17. The robot interaction control method according to claim 11, further comprising, before the step of acquiring the audio/video stream or the related sensing data of the monitored object, which is acquired by the camera unit, receiving biometric information sensor data, determining the identity authority of the user corresponding to the biometric information sensor data when the biometric information sensor data matches with pre-stored biometric information, and generating a corresponding preset feedback instruction in response to the identity authority.
18. The robot interaction control method of claim 17, wherein the biometric information sensor data includes at least one of fingerprint information data, iris information data, or voice information data, and an authentication authority of the user is determined by comparing with preset authentication authority information, and a corresponding preset feedback authority is opened in response to the authentication authority.
19. A robot interaction control method according to claim 11, characterized in that the library of predetermined behavior types and/or the database of preset correspondences is set by a user.
20. A robot interaction control method according to claim 19, characterized in that the downloading is performed by bluetooth, a Wi-Fi network, a mobile data network or a wired network.
21. A robot interaction server, comprising:
the receiving module is used for receiving audio and video streams or related sensor data which are uploaded by the robot and belong to a monitoring object of the robot and acquired by the camera shooting unit of the robot;
the analysis module is used for matching a corresponding behavior type from a preset behavior type library according to feature data extracted from the audio and video stream or the sensor data; the step of matching out the corresponding behavior type from a preset behavior type library according to the feature data extracted from the audio and video stream comprises the following steps: extracting audio characteristic information or keyword information of audio data in the audio and video stream; matching the characteristic information or the keyword information with pre-stored audio characteristic information or pre-stored keyword information in a preset behavior type library so as to determine a behavior type corresponding to the audio data; the matching of the corresponding behavior types from a predetermined behavior type library according to the feature data extracted from the sensor data comprises:
extracting a specific data field or information meaning in the sensor data;
matching the specific data field or the information meaning with a prestored data field or prestored information meaning in a preset behavior type library so as to determine a behavior type corresponding to the sensor data; each of the behavior types includes one or more corresponding pre-stored data fields or information meanings;
the data generation module is used for determining a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database; the preset behavior type library and/or the preset corresponding relation database are set or updated by a cloud server or a user associated with the cloud server;
and the sending module is used for sending the behavior instruction to the robot so as to drive the robot to execute the machine behavior instruction, so that the robot can make a corresponding output behavior as feedback to a monitoring object of the robot.
22. A robot, comprising:
an acquisition module, configured to acquire the audio/video stream or related sensor data of a monitored object captured by the camera unit;
an analysis module, configured to match a corresponding behavior type from a preset behavior type library according to feature data extracted from the audio/video stream or the sensor data, and then determine a machine behavior instruction corresponding to the behavior type from a preset corresponding relation database; wherein matching the corresponding behavior type from the preset behavior type library according to feature data extracted from the audio/video stream comprises: extracting audio feature information or keyword information from audio data in the audio/video stream; and matching the feature information or keyword information against pre-stored audio feature information or pre-stored keyword information in the preset behavior type library to determine the behavior type corresponding to the audio data; and wherein matching the corresponding behavior type from the preset behavior type library according to feature data extracted from the sensor data comprises:
extracting a specific data field or information meaning from the sensor data; and
matching the specific data field or information meaning against a pre-stored data field or pre-stored information meaning in the preset behavior type library to determine the behavior type corresponding to the sensor data; wherein each behavior type includes one or more corresponding pre-stored data fields or information meanings; and the preset behavior type library and/or the preset corresponding relation database is downloaded from a cloud server;
and an execution module, configured to drive the relevant interfaces or units of the robot to perform corresponding output behaviors according to the machine behavior instruction, the output behaviors serving as feedback to the monitored object.
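On the robot side (claim 22), the execution module essentially maps the matched machine behavior instruction to the interface or unit that produces the output behavior. The minimal dispatch-table sketch below uses hypothetical instruction names and handlers; which interfaces exist on a given robot is an assumption for the example.

# Illustrative sketch of the execution module in claim 22. The instruction
# names and the interfaces they drive (speaker, motion unit, network push)
# are assumptions, not defined by the patent.
INSTRUCTION_HANDLERS = {
    "PLAY_SOOTHING_AUDIO": lambda: print("speaker: playing soothing audio"),
    "NOTIFY_GUARDIAN":     lambda: print("network: pushing alert to guardian app"),
    "WAVE_ARM":            lambda: print("motion unit: waving arm"),
}

def execute(instruction):
    """Drive the relevant robot interface or unit for the received instruction."""
    handler = INSTRUCTION_HANDLERS.get(instruction)
    if handler is None:
        print("no handler for %r; instruction ignored" % instruction)
        return
    handler()

# Example: the server returned "WAVE_ARM" for a matched greeting behavior.
execute("WAVE_ARM")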
CN201710013365.1A 2017-01-09 2017-01-09 Robot interaction control method, server and robot Active CN106873773B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710013365.1A CN106873773B (en) 2017-01-09 2017-01-09 Robot interaction control method, server and robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710013365.1A CN106873773B (en) 2017-01-09 2017-01-09 Robot interaction control method, server and robot

Publications (2)

Publication Number Publication Date
CN106873773A CN106873773A (en) 2017-06-20
CN106873773B true CN106873773B (en) 2021-02-05

Family

ID=59164756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710013365.1A Active CN106873773B (en) 2017-01-09 2017-01-09 Robot interaction control method, server and robot

Country Status (1)

Country Link
CN (1) CN106873773B (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165057B (en) * 2017-06-28 2021-03-30 华为技术有限公司 Method and device for executing task by intelligent equipment
CN107221332A (en) * 2017-06-28 2017-09-29 上海与德通讯技术有限公司 The exchange method and system of robot
CN107463626A (en) * 2017-07-07 2017-12-12 深圳市科迈爱康科技有限公司 A kind of voice-control educational method, mobile terminal, system and storage medium
CN107908429B (en) * 2017-08-10 2021-07-23 广州真诺电子科技有限公司 Human-computer interaction and programming system applied to robot software engineer
CN107807734B (en) * 2017-09-27 2021-06-15 北京光年无限科技有限公司 Interactive output method and system for intelligent robot
CN107770271A (en) * 2017-10-20 2018-03-06 南方电网科学研究院有限责任公司 Clustered machine people's cloud control method, device and system
CN108124008A (en) * 2017-12-20 2018-06-05 山东大学 A kind of old man under intelligent space environment accompanies and attends to system and method
CN108780361A (en) * 2018-02-05 2018-11-09 深圳前海达闼云端智能科技有限公司 Human-computer interaction method and device, robot and computer readable storage medium
CN108491790A (en) * 2018-03-20 2018-09-04 上海乐愚智能科技有限公司 A kind of determination method, apparatus, storage medium and the robot of object
CN110415688B (en) * 2018-04-26 2022-02-08 杭州萤石软件有限公司 Information interaction method and robot
CN109189363A (en) * 2018-07-24 2019-01-11 上海常仁信息科技有限公司 A kind of robot of human-computer interaction
CN111630475B (en) * 2018-10-19 2024-02-27 深圳配天机器人技术有限公司 Method for controlling robot, server, storage medium and cloud service platform
CN109166386A (en) * 2018-10-25 2019-01-08 重庆鲁班机器人技术研究院有限公司 Children's logical thinking supplemental training method, apparatus and robot
CN110135317A (en) * 2019-05-08 2019-08-16 深圳达实智能股份有限公司 Behavior monitoring and management system and method based on cooperated computing system
CN110497404B (en) * 2019-08-12 2021-12-28 安徽云探索网络科技有限公司 Bionic intelligent decision making system of robot
CN110505309B (en) * 2019-08-30 2022-02-25 苏州博众机器人有限公司 Network communication method, device, equipment and storage medium
CN110695989A (en) * 2019-09-20 2020-01-17 浙江树人学院(浙江树人大学) Audio-visual interaction system for intelligent robot and interaction control method thereof
CN111079116B (en) * 2019-12-29 2020-11-24 钟艳平 Identity recognition method and device based on simulation cockpit and computer equipment
CN111459451A (en) * 2020-03-31 2020-07-28 北京市商汤科技开发有限公司 Interactive object driving method, device, equipment and storage medium
CN111860231A (en) * 2020-07-03 2020-10-30 厦门欧准卫浴有限公司 Universal water module based on household occasions
CN112017030A (en) * 2020-09-01 2020-12-01 中国银行股份有限公司 Intelligent service method and device for bank outlets
CN113625662B (en) * 2021-07-30 2022-08-30 广州玺明机械科技有限公司 Rhythm dynamic control system for data acquisition and transmission of beverage shaking robot

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105912128A (en) * 2016-04-29 2016-08-31 北京光年无限科技有限公司 Smart robot-oriented multimodal interactive data processing method and apparatus
CN106228978A (en) * 2016-08-04 2016-12-14 成都佳荣科技有限公司 A kind of audio recognition method
CN106250400A (en) * 2016-07-19 2016-12-21 腾讯科技(深圳)有限公司 A kind of audio data processing method, device and system
US9552056B1 (en) * 2011-08-27 2017-01-24 Fellow Robots, Inc. Gesture enabled telepresence robot and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105320872A (en) * 2015-11-05 2016-02-10 上海聚虹光电科技有限公司 Robot operation authorization setting method based on iris identification
CN105468145B (en) * 2015-11-18 2019-05-28 北京航空航天大学 A kind of robot man-machine interaction method and device based on gesture and speech recognition
CN106055105A (en) * 2016-06-02 2016-10-26 上海慧模智能科技有限公司 Robot and man-machine interactive system
CN106200962A (en) * 2016-07-08 2016-12-07 北京光年无限科技有限公司 Exchange method and system towards intelligent robot
CN106239511A (en) * 2016-08-26 2016-12-21 广州小瓦智能科技有限公司 A kind of robot based on head movement moves control mode

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9552056B1 (en) * 2011-08-27 2017-01-24 Fellow Robots, Inc. Gesture enabled telepresence robot and system
CN105912128A (en) * 2016-04-29 2016-08-31 北京光年无限科技有限公司 Smart robot-oriented multimodal interactive data processing method and apparatus
CN106250400A (en) * 2016-07-19 2016-12-21 腾讯科技(深圳)有限公司 A kind of audio data processing method, device and system
CN106228978A (en) * 2016-08-04 2016-12-14 成都佳荣科技有限公司 A kind of audio recognition method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"物理人_机器人交互研究与发展现状";熊根良等;《光学 精密工程》;20130215;第21卷(第2期);第356-370页 *

Also Published As

Publication number Publication date
CN106873773A (en) 2017-06-20

Similar Documents

Publication Publication Date Title
CN106873773B (en) Robot interaction control method, server and robot
JP6752870B2 (en) Methods and systems for controlling artificial intelligence devices using multiple wake words
KR102306624B1 (en) Persistent companion device configuration and deployment platform
JP6984004B2 (en) Continuous selection of scenarios based on identification tags that describe the user's contextual environment for the user's artificial intelligence model to run through an autonomous personal companion.
US8583287B2 (en) Robotic system, robot control method and robot control program
US8594845B1 (en) Methods and systems for robotic proactive informational retrieval from ambient context
CN104487208B (en) Robot controls scenario generation method and device
KR100906136B1 (en) Information processing robot
EP3686724A1 (en) Robot interaction method and device
CN105126355A (en) Child companion robot and child companioning system
CN107977625B (en) Intelligent movable equipment capable of finding object and intelligent object finding method
WO2018108176A1 (en) Robot video call control method, device and terminal
CN106325065A (en) Robot interactive behavior control method, device and robot
US11074491B2 (en) Emotionally intelligent companion device
CN106325228A (en) Method and device for generating control data of robot
CN111858861A (en) Question-answer interaction method based on picture book and electronic equipment
CN106325113B (en) Robot controls engine and system
WO2016206643A1 (en) Method and device for controlling interactive behavior of robot and robot thereof
WO2001082646A1 (en) Cellular phone and remote control system
KR20200117712A (en) Artificial intelligence smart speaker capable of sharing feeling and emotion between speaker and user
CN112820265B (en) Speech synthesis model training method and related device
KR102661381B1 (en) Apparatus and method for controlling operation of robot capable of mounting accessory
US20210201139A1 (en) Device and method for measuring a characteristic of an interaction between a user and an interaction device
CN114488879B (en) Robot control method and robot
US20230230293A1 (en) Method and system for virtual intelligence user interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant