WO2020259524A1 - Robot obstacle avoidance method, apparatus, and system - Google Patents


Info

Publication number
WO2020259524A1
WO2020259524A1 (application PCT/CN2020/097871; CN2020097871W)
Authority
WO
WIPO (PCT)
Prior art keywords
robot
information
sample information
moment
time
Prior art date
Application number
PCT/CN2020/097871
Other languages
French (fr)
Chinese (zh)
Inventor
徐学军 (XU Xuejun)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2020259524A1

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • B25J9/1664Programme controls characterised by programming, planning systems for manipulators characterised by motion, path, trajectory planning
    • B25J9/1666Avoiding collision or forbidden zones
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1679Programme controls characterised by the tasks executed
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1694Programme controls characterised by use of sensors other than normal servo-feedback from position, speed or acceleration sensors, perception control, multi-sensor controlled systems, sensor fusion
    • B25J9/1697Vision controlled systems

Definitions

  • This application relates to the field of electronic equipment, and in particular to a method, device and system for robot obstacle avoidance.
  • Robots such as sweeping robots have been widely used in people's daily lives. Because a robot works almost completely autonomously, it often becomes trapped during its work. At present, machine learning methods can be used to train robots to recognize such dilemmas and avoid obstacles.
  • The more common machine learning methods include supervised learning and reinforcement learning.
  • In supervised learning, it is necessary to manually collect a large amount of sample data and manually label these sample data as trapped or untrapped; a supervised learning model is then trained on the labeled sample data to obtain the corresponding model, so that the robot can judge whether it needs to avoid obstacles according to the model.
  • For example, the robot is a cleaning robot, and the sample data is environmental information about where the cleaning robot is located.
  • In the environmental information shown in Fig. 1A, the cleaning robot is trapped by the connection line of a power strip and cannot move freely; therefore, the sample data can be manually labeled as trapped.
  • In Fig. 1B, the cleaning robot is not trapped by the power strip and its connecting line and can move freely; therefore, the sample data can be manually labeled as not trapped.
  • In Fig. 2A, the sweeping robot is wedged in a corner between the wall and a fixed object and cannot continue to move freely; therefore, the sample data can be manually labeled as trapped.
  • In contrast, in the environment shown in Fig. 2B, the sweeping robot is at a position far away from the wall and the fixed object and can continue to move freely; therefore, the sample data can be manually labeled as not trapped.
  • In Fig. 3A, the sweeping robot is stuck in the gap between a fixed object and the ground and cannot continue to move freely; therefore, the sample data can be manually labeled as trapped.
  • In Fig. 3B, the robot is far away from the fixed object and from the gap between the fixed object and the ground, and can move freely; therefore, the sample data can be manually labeled as not trapped.
  • By training a supervised learning model on such labeled data, the corresponding model for obstacle avoidance of the sweeping robot can be obtained.
  • Reinforcement learning refers to a method in which the robot learns in a "trial and error" manner. Based on a large amount of manually collected sample data and its manually labeled trapped or untrapped labels, the robot determines through constant trial and error the behavior that earns the most environmental reward, and records the reward result together with the trial-and-error action that produced it as a reference for later work.
  • In both cases, sample data needs to be collected manually, for example through web crawler searches.
  • The state of the robot corresponding to the collected sample data, such as whether it is in a trapped state, must also be labeled manually, which incurs a large labor cost.
  • In addition, the quality of the collected sample data is difficult to control, which results in low labeling quality, such as labeling errors.
  • Moreover, the sample data collected through web crawlers, such as the above environmental information, are mostly environmental pictures taken from the user's perspective, which cannot intuitively and accurately reflect the environment where the robot is located. All of this leads to inaccurate trained models.
  • The embodiments of the present application provide a robot obstacle avoidance method, apparatus, and system, which solve the problems of high labor cost and low accuracy of trained models when machine learning methods are used for dilemma recognition.
  • In a first aspect, a method for robot obstacle avoidance includes: the robot obtains sample information at N times during its movement, where N is an integer greater than 0. The sample information at each of the N times includes: environmental image information indicating the environment in which the robot is located at that moment, position information indicating the position of the robot in the environment indicated by the environmental image information at that moment, and tag information indicating the state of the robot at that moment. The tag information includes a first tag or a second tag: the first tag indicates that the robot is in an untrapped state, and the second tag indicates that the robot is in a trapped state.
  • The robot sends the sample information at M of the N times to the server, where M is a positive integer less than or equal to N.
  • The robot receives a dilemma recognition model from the server; the dilemma recognition model is trained on the sample information collected by one or more robots. The robot then avoids obstacles during movement according to the dilemma recognition model.
  • In this way, the robot autonomously collects environmental image information and position information and, according to its current state, generates fully correct tag information identifying whether it is trapped, which ensures the accuracy of the sample information while avoiding a large investment of manpower.
  • The robot sends the sample information at the N times to the server and obtains the trained dilemma recognition model, which is used to guide the robot's movement to avoid obstacles. Since the accuracy of the sample information can be guaranteed, the accuracy of the dilemma recognition model is also significantly improved, which can guide the robot to avoid obstacles more accurately.
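As a minimal, hypothetical sketch of the sample information described above (the field names are illustrative and not part of the application), each per-moment record can carry the environmental image, the position, and the self-generated tag:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Sample:
    """One self-labeled record collected by the robot at a single moment."""
    env_image: bytes                # environmental image information at this moment
    position: Tuple[float, float]   # robot position within the imaged environment
    trapped: bool                   # tag: True = second tag (trapped), False = first tag


def collect_samples(observations):
    """Turn N per-moment observations (image, position, trapped?) into samples."""
    return [Sample(img, pos, stuck) for (img, pos, stuck) in observations]


samples = collect_samples([
    (b"img0", (0.0, 0.0), False),
    (b"img1", (0.5, 0.0), False),
    (b"img2", (0.9, 0.1), True),   # the robot recognizes it is trapped at time N
])
```

Because the `trapped` tag comes from the robot's own state recognition rather than from a human annotator, every record is labeled at collection time.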
  • Optionally, the environmental image information at each moment includes: image information of the robot's moving route and of objects around the moving route in the environment where the robot is located at that moment. In this way, the environmental image information can be ensured to include the objects most likely to become obstacles to the robot's movement.
  • Optionally, the position information at each moment includes: relative position information of the robot with respect to its moving route and the objects around the moving route in the environment indicated by the environmental image information at that moment. In this way, the specific location of the robot in the environment at that moment can be determined, so that the robot can accurately predict its location in the environment at the next moment.
  • In one implementation, the robot sends the sample information at M of the N times to the server as follows: the robot determines that the tag information included in the sample information at the Nth time indicates that the robot is in a trapped state at the Nth time; the robot then sends the sample information at the M-1 times before the Nth time together with the sample information at the Nth time to the server.
  • The quality of the trapped sample information, and of the sample information for a period of time before trapping, is higher; in this way, while ensuring that high-quality sample information is transmitted, the total amount of information transmitted is reduced, which reduces the communication load of the system.
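This selection rule can be sketched as follows, under the assumption that samples are plain `(image, position, trapped)` tuples ordered by time with the trapped sample last; only the M-1 samples just before trapping plus the trapped sample itself are uploaded:

```python
def select_window(samples, m):
    """Return the M samples to send: the M-1 samples immediately before the
    trapped moment plus the trapped sample at the Nth time itself."""
    if not samples or not samples[-1][2]:
        raise ValueError("the last sample must carry the trapped tag")
    return samples[-m:]


# ten moments of history; the robot becomes trapped at the last one
history = [(f"img{t}", (float(t), 0.0), False) for t in range(9)]
history.append(("img9", (9.0, 0.0), True))
window = select_window(history, m=4)   # 3 pre-trap samples + the trapped one
```

The earlier, lower-value samples never leave the robot, which is what reduces the communication load.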
  • The robot avoids obstacles during movement according to the dilemma recognition model as follows: the robot obtains the environmental image information and position information at the current moment; according to the environmental image information at the current moment, the position information at the current moment, and the robot's movement direction and speed at the current moment, the robot predicts the environmental image information and position information at the next moment; according to the environmental image information at the next moment, the position information at the next moment, and the dilemma recognition model, the robot determines the probability of being trapped at the next moment; if the robot determines that this probability is greater than a preset threshold, it changes its motion strategy to avoid the obstacle.
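One control step of this loop might be sketched as below; the linear extrapolation of position, the 0.8 threshold, and treating the model as a callable returning a probability are all illustrative assumptions, not details from the application:

```python
def avoid_obstacles_step(model, image_now, pos_now, direction, speed, threshold=0.8):
    """Predict the next-moment state and consult the dilemma recognition model."""
    # predict the next-moment position from current position, direction and speed
    pos_next = (pos_now[0] + direction[0] * speed,
                pos_now[1] + direction[1] * speed)
    image_next = image_now          # assume the scene changes little in one step
    p_trapped = model(image_next, pos_next)   # probability of being trapped
    # change the motion strategy if the predicted risk exceeds the threshold
    return "change_strategy" if p_trapped > threshold else "keep_moving"


# toy stand-in for the trained model: trapped whenever x exceeds 1.0
toy_model = lambda img, pos: 1.0 if pos[0] > 1.0 else 0.0
action = avoid_obstacles_step(toy_model, None, (0.9, 0.0), (1.0, 0.0), 0.2)
```

The key point is that the model is consulted on the *predicted* next-moment state, so the robot can turn away before actually becoming trapped.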
  • a second aspect of the embodiments of the present application provides an obstacle avoidance device, which may include: an acquisition unit, a communication unit, and an obstacle avoidance unit.
  • The acquisition unit is used to acquire sample information at N times during the movement of the robot, where N is an integer greater than 0. The sample information at each of the N times includes: environmental image information indicating the environment where the robot is located at that moment, position information indicating the position of the robot in the environment indicated by the environmental image information at that moment, and tag information indicating the state of the robot at that moment. The tag information includes a first tag or a second tag; the first tag indicates that the robot is in an untrapped state, and the second tag indicates that the robot is in a trapped state. The communication unit is used to send the sample information at M of the N times to the server, where M is a positive integer less than or equal to N. The communication unit is also used to receive the dilemma recognition model from the server.
  • The dilemma recognition model is trained on the sample information collected by one or more robots.
  • Optionally, the environmental image information at each moment includes: image information of the robot's moving route and of objects around the moving route in the environment where the robot is located at that moment.
  • Optionally, the position information at each moment includes: relative position information of the robot with respect to its moving route and the objects around the moving route in the environment indicated by the environmental image information at that moment.
  • The device may further include a determining unit, configured to determine that the tag information included in the sample information at the Nth time indicates that the robot is in a trapped state at the Nth time. The communication unit sends the sample information at M of the N times to the server by sending the sample information at the M-1 times before the Nth time together with the sample information at the Nth time to the server.
  • The acquisition unit is also used to acquire the environmental image information and position information at the current moment and, according to the environmental image information at the current moment, the position information at the current moment, and the robot's movement direction and speed at the current moment, to obtain the environmental image information and position information at the next moment.
  • The obstacle avoidance unit is used to avoid obstacles during the movement of the robot according to the dilemma recognition model: it determines the probability of the robot being trapped at the next moment according to the environmental image information at the next moment, the position information at the next moment, and the dilemma recognition model, and, if it determines that this probability is greater than a preset threshold, changes the robot's motion strategy to avoid the obstacle.
  • a third aspect of the embodiments of the present application provides a robot obstacle avoidance system.
  • the system may include: one or more robots and a server.
  • The robot is used to obtain sample information at N times during its movement and send the sample information at the N times to the server, where N is an integer greater than 0. The sample information at each of the N times includes: environmental image information indicating the environment where the robot is located at that moment, position information indicating the position of the robot in the environment indicated by the environmental image information at that moment, and tag information indicating the state of the robot at that moment.
  • The tag information includes a first tag or a second tag; the first tag indicates that the robot is in an untrapped state, and the second tag indicates that the robot is in a trapped state.
  • The server is used to receive the sample information from the one or more robots, train on the sample information of the one or more robots to obtain a dilemma recognition model, and send the dilemma recognition model to the robots.
  • The robot is also used to receive the dilemma recognition model and avoid obstacles during movement according to it.
  • In one implementation, the sample information received by the server includes first-type sample information and second-type sample information. The server trains on the received sample information of the one or more robots to obtain the dilemma recognition model as follows: the server trains on the first-type sample information to obtain an initial model and verifies the accuracy of the initial model with the second-type sample information; if the verification accuracy is greater than a preset threshold, the initial model is determined to be the dilemma recognition model.
  • Otherwise, the server receives new sample information sent by the robots, continues training based on the new sample information and the initial model, and verifies the resulting model with the second-type sample information until the verification accuracy is greater than the preset threshold; the model whose verification accuracy is greater than the preset threshold is determined to be the dilemma recognition model. Here, if the trapped-or-untrapped state obtained by inputting the environmental image information and position information of a second-type sample into the model is the same as the state indicated by that sample's tag information, the verification of that sample is counted as accurate; if the two states differ, the verification of that sample is counted as inaccurate. The verification accuracy is determined according to the verification result of each sample.
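The verification rule and the retraining loop described above can be sketched as follows; the 0.5 decision cutoff and the hook functions `train_fn` / `get_new_samples` are illustrative assumptions:

```python
def validation_accuracy(model, second_type_samples):
    """Fraction of second-type samples whose predicted trapped state matches
    the robot's own tag; each sample is (image, position, trapped)."""
    correct = sum(1 for (img, pos, trapped) in second_type_samples
                  if (model(img, pos) > 0.5) == trapped)
    return correct / len(second_type_samples)


def train_until_accurate(train_fn, get_new_samples, second_type_samples, threshold):
    """Train, verify, and keep training on newly received sample batches until
    the verification accuracy exceeds the preset threshold."""
    model = train_fn(get_new_samples())
    while validation_accuracy(model, second_type_samples) <= threshold:
        model = train_fn(get_new_samples())
    return model


# toy example: a fixed "model" that reports trapped whenever x > 1.0
toy_model = lambda img, pos: 1.0 if pos[0] > 1.0 else 0.0
val_set = [("i0", (0.0, 0.0), False), ("i1", (2.0, 0.0), True)]
acc = validation_accuracy(toy_model, val_set)
```

Counting one sample as accurate exactly when prediction and self-generated tag agree is what makes the per-sample verification rule of the application computable without any human labeling.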
  • an embodiment of the present application provides a robot.
  • The robot may include a processor, which is configured to connect to a memory and call a program stored in the memory to execute the robot obstacle avoidance method of the first aspect or any possible implementation of the first aspect.
  • An embodiment of the present application provides a computer-readable storage medium, including computer software instructions; when the computer software instructions run on the obstacle avoidance device, the obstacle avoidance device executes the robot obstacle avoidance method of the first aspect or any possible implementation of the first aspect.
  • The embodiments of the present application provide a computer program product which, when run on a computer, causes the computer to execute the robot obstacle avoidance method of the first aspect or any of the possible implementations of the first aspect.
  • The obstacle avoidance device of the second aspect, the robot obstacle avoidance system of the third aspect, the robot of the fourth aspect, the computer-readable storage medium of the fifth aspect, and the computer program product of the sixth aspect provided above are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding methods provided above, which will not be repeated here.
  • Figure 1A is a schematic diagram of an environment during the movement of the robot
  • Figure 1B is a schematic diagram of another environment during the movement of the robot
  • Figure 2A is a schematic diagram of another environment during the movement of the robot
  • Figure 2B is a schematic diagram of another environment during the movement of the robot
  • Figure 3A is a schematic diagram of another environment during the movement of the robot
  • Figure 3B is a schematic diagram of another environment during the movement of the robot
  • FIG. 4 is a simplified schematic diagram of a robot obstacle avoidance system provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of the composition of a robot provided in this application.
  • FIG. 6 is a schematic flowchart of a method for avoiding obstacles for a robot according to an embodiment of the application
  • FIG. 7 is a schematic flowchart of another obstacle avoidance method for a robot according to an embodiment of the application.
  • FIG. 8 is a schematic diagram of the composition of a dilemma recognition model provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of robot obstacle avoidance provided by an embodiment of the application.
  • FIG. 10 is a simplified schematic diagram of an obstacle avoidance device provided by an embodiment of the application.
  • In the following, a sweeping robot is taken as an example of the robot for description.
  • When the robot is in the position shown in Figure 3A, viewed from the angle shown in Figure 3A, people may think that the sweeping robot is stuck in the gap between the fixed object and the ground and cannot continue to move, and may manually tag it as trapped.
  • In fact, the sweeping robot may not be blocked by the gap and may be able to continue working in it. In this way, the tag information of the sample data may be labeled incorrectly.
  • In addition, most of the currently collected sample data are environmental pictures taken from the user's perspective, which cannot intuitively and accurately reflect the environment where the robot is located. All of this leads to inaccurate dilemma recognition models obtained by training.
  • The present application provides a robot obstacle avoidance method whose basic principle is that the robot obtains sample information at N times (N is an integer greater than 0) during its movement and sends the sample information at the N times to the server.
  • the robot receives the dilemma recognition model from the server, where the dilemma recognition model is trained on sample information collected by one or more robots.
  • the robot can avoid obstacles in the process of moving according to the dilemma recognition model.
  • The sample information corresponding to each of the aforementioned N times includes environmental image information of the environment in which the robot is located at that time and position information of the robot's position in the environment indicated by the environmental image information at that time. This information is collected by the robot itself and can accurately identify the environment the robot is in at that time.
  • each sample information also includes tag information used to indicate whether the robot is trapped at that moment.
  • the tag information is also marked by the robot according to its own recognition of the current state, which can also accurately reflect the current state of the robot. In this way, by sending such sample information to the server for the server to train the dilemma recognition model, the accuracy of the dilemma recognition model can be improved.
  • a large amount of labor cost is saved.
  • FIG. 4 is a simplified schematic diagram of a robot obstacle avoidance system provided by an embodiment of this application.
  • the robot obstacle avoidance system may include a robot 401 and a server 402.
  • The aforementioned robot 401 may include one or more robots, such as robots 1 to n shown in FIG. 4.
  • each of the above-mentioned robots 401 can be used to obtain sample information at N times during its own movement, and send the obtained sample information at N times to the server 402 through wireless communication.
  • the sample information at each time may include: environmental image information at that time, location information at that time, and tag information at that time.
  • the environment image information is used to indicate the environment where the robot is at that moment.
  • the location information is used to indicate the location of the robot in the environment indicated by the environment image information at that moment.
  • the tag information is used to indicate whether the robot is trapped at that moment.
  • The above-mentioned server 402 may train on the sample information received from the one or more robots to obtain the dilemma recognition model and send the dilemma recognition model to each robot. In this way, the robot 401 can avoid obstacles during movement according to the received dilemma recognition model.
  • the process in which the server 402 obtains the dilemma recognition model through sample information can be implemented by the following method: the server 402 can divide the obtained sample information into two types, such as the first type of sample information and the second type of sample information.
  • The first type of sample information can be used for training the dilemma recognition model, and the second type of sample information can be used to validate the trained dilemma recognition model to determine its accuracy.
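A minimal sketch of this split, assuming a simple random 80/20 partition (the ratio and the fixed seed are illustrative, not specified by the application):

```python
import random


def split_samples(samples, train_fraction=0.8, seed=0):
    """Randomly partition received samples into first-type (training) and
    second-type (validation) sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


first_type, second_type = split_samples(range(100))
```

Holding the second-type samples out of training is what makes the later verification-accuracy check meaningful.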
  • the process of obtaining the dilemma recognition model by the server 402 will be described in detail in the following embodiments, and will not be repeated here.
  • The aforementioned robot 401 may be an electronic device that can move autonomously, such as a cleaning robot, or a self-service robot deployed in a service industry such as banking.
  • the robot 500 may include an image acquisition module 501, a sensor 502, a processor 503, a memory 504, and a communication module 505.
  • the processor 503 may include one or more processing units.
  • the processor 503 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • the controller may be the nerve center and command center of the robot 500.
  • the controller can generate operation control signals according to the instruction operation code and timing signals to complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 503 for storing instructions and data.
  • the memory in the processor is a cache memory.
  • the memory can store instructions or data that the processor 503 has just used or used cyclically. If the processor needs to use the instruction or data again, it can call it directly from this memory, which avoids repeated access, reduces the waiting time of the processor 503, and improves system efficiency.
  • the processor 503 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
  • the I2C interface is a bidirectional synchronous serial bus, which includes a serial data line (SDA) and a serial clock line (SCL).
  • the processor may include multiple sets of I2C buses. The processor can couple sensors, cameras, etc. separately through different I2C bus interfaces.
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication.
  • the UART interface is usually used to connect the processor 503 and the communication module 505.
  • the processor 503 communicates with the Bluetooth module in the wireless communication module through the UART interface to realize the Bluetooth function.
  • the MIPI interface can be used to connect peripheral devices such as processors and cameras.
  • MIPI interface includes camera serial interface (camera serial interface, CSI) and so on.
  • the processor and the camera communicate through a CSI interface to realize the shooting function of the robot 500.
  • the GPIO interface can be configured through software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface can be used to connect the processor and camera, wireless communication module, sensor module, etc.
  • GPIO interface can also be configured as I2C interface, UART interface, MIPI interface, etc.
  • the USB interface is an interface that complies with the USB standard specifications, and may be a Mini USB interface, a Micro USB interface, or a USB Type-C interface.
  • the USB interface can be used to connect a charger to charge the robot 500, and can also be used to transfer data between the robot 500 and peripheral devices. This interface can also be used to connect other robots 500 and so on.
  • the image collection module 501 can collect image information around the robot 500, for example, taking photos or videos.
  • the robot 500 can implement image acquisition functions through ISP, camera, video codec, GPU, and application processor.
  • The ISP is used to process the data fed back by the camera. For example, when taking a picture, the shutter is opened and light is transmitted through the lens to the camera's photosensitive element, where the light signal is converted into an electrical signal; the photosensitive element transfers the electrical signal to the ISP for processing, and it is converted into an image visible to the naked eye.
  • The ISP can also run algorithm optimizations for image noise and brightness, and can optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be set in the camera.
  • the camera is used to capture still images or videos.
  • the object generates an optical image through the lens and projects it to the photosensitive element.
  • the photosensitive element may be a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • ISP outputs digital image signals to DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats.
  • the robot 500 may include 1 or N cameras, and N is a positive integer greater than one.
  • the sensor 502 can obtain information such as the moving speed, the moving direction, and the distance to surrounding objects of the robot 500.
  • the sensor 502 may include a gyroscope sensor, a speed sensor, an acceleration sensor, a distance sensor, and the like.
  • the gyroscope sensor can be used to determine the movement posture of the robot 500. In some embodiments, the angular velocity of the robot 500 around three axes (i.e., the x, y, and z axes) can be determined by the gyroscope sensor.
  • the gyroscope sensor can be used for shooting anti-shake.
  • the gyroscope sensor detects the angle by which the robot 500 shakes and calculates the distance that the lens module needs to compensate according to the angle, so that the lens can counteract the shake of the robot 500 through reverse movement to achieve image stabilization.
  • the gyro sensor can also be used for navigation and to determine whether the robot 500 is trapped or not.
  • the speed sensor is used to measure the moving speed.
  • The robot 500 uses the speed sensor to measure the moving speed at the current moment, which may be combined with the distance sensor to predict the environment where the robot 500 will be located at the next moment, etc.
  • the acceleration sensor can detect the magnitude of the acceleration of the robot 500 in various directions (generally three axes). When the robot 500 is stationary, the magnitude and direction of gravity can be detected.
  • The distance sensor is used to measure distance; the robot 500 can measure distance by infrared or laser. In some embodiments, in a shooting scene, the robot 500 may use the distance sensor to measure distance to achieve fast focusing.
  • the memory 504 may include external memory and internal memory.
  • the external memory interface can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the robot 500.
  • the external memory card communicates with the processor through the external memory interface to realize the data storage function, for example, to save sample information files in the external memory card.
  • the internal memory may be used to store computer executable program code, the executable program code including instructions.
  • the processor executes various functional applications and data processing of the robot 500 by running instructions stored in the internal memory.
  • the internal memory can include a program storage area and a data storage area. The program storage area can store the operating system and at least one application program required by a function.
  • the data storage area can store data created during the use of the robot 500.
  • the internal memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (UFS), and the like.
  • the wireless communication function of the robot 500 can be implemented by the communication module 505.
  • the robot 500 can realize communication with other devices, such as communication with a server.
  • the communication module 505 may include antenna 1, antenna 2, mobile communication module, wireless communication module, modem processor, baseband processor, and so on.
  • the antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the robot 500 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna can be used in combination with a tuning switch.
  • the mobile communication module can provide wireless communication solutions applied to the robot 500, including 2G/3G/4G/5G.
  • the mobile communication module may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module can receive electromagnetic waves by the antenna 1, and perform processing such as filtering and amplifying the received electromagnetic waves, and then transmitting them to the modem processor for demodulation.
  • the mobile communication module can also amplify the signal modulated by the modem processor, and convert it into an electromagnetic wave for radiation via the antenna 1.
  • at least part of the functional modules of the mobile communication module may be provided in the processor.
  • at least part of the functional modules of the mobile communication module and at least part of the modules of the processor may be provided in the same device.
  • the wireless communication module can provide wireless communication solutions applied to the robot 500, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), infrared (IR), and the like.
  • the wireless communication module may be one or more devices integrating at least one communication processing module.
  • the wireless communication module receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor.
  • the wireless communication module can also receive the signal to be sent from the processor, perform frequency modulation, amplify, and radiate electromagnetic waves through the antenna 2.
  • the antenna 1 of the robot 500 is coupled with the mobile communication module, and the antenna 2 is coupled with the wireless communication module, so that the robot 500 can communicate with the server and other devices through wireless communication technology.
  • the wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • the GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the Beidou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite-based augmentation systems (SBAS).
  • the structure illustrated in this embodiment does not constitute a specific limitation on the robot 500.
  • the robot 500 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the obstacle avoidance methods for the robot provided by the embodiments of the present application can be implemented in the system shown in FIG. 4 and the robot shown in FIG. 5.
  • FIG. 6 is a schematic flowchart of a method for avoiding obstacles for a robot according to an embodiment of the application. Please refer to FIG. 6, the method may include S601-S605.
  • the robot obtains sample information at N times during the movement, where N is an integer greater than 0.
  • the sample information at each time includes environmental image information used to indicate the environment in which the robot is located at that time, position information used to indicate the position of the robot in the environment indicated by the environmental image information at that time, and tag information used to indicate the state of the robot at that time.
  • the tag information includes a first tag or a second tag. The first tag is used to indicate that the robot is in an untrapped state, and the second tag is used to indicate that the robot is in a trapped state.
  • the robot can obtain the environmental image information and position information of the environment where the robot is located through a module provided on the robot.
  • the robot can determine whether it is trapped according to the current state of movement to obtain the tag information at that moment. For example, when the robot cannot continue to move, it is judged that it is trapped at the current moment, and the tag information at that moment can be obtained as the second tag. Otherwise, it is determined that the robot is in an untrapped state at the current moment, and the label information at that moment can be obtained as the first label. In this way, by acquiring this information in real time during the movement of the robot, the sample information at N times during the movement of the robot can be obtained. Wherein, N is an integer greater than zero.
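The automatic labeling described above can be sketched as follows; the tag values, field names, and the simple "can move" test are illustrative assumptions, not part of the embodiment:

```python
# Sketch of per-moment sample collection with automatic labeling.
# FIRST_TAG / SECOND_TAG values and the can_move test are illustrative.
FIRST_TAG = 0   # untrapped state
SECOND_TAG = 1  # trapped state

def collect_sample(image, position, can_move):
    """Bundle one moment's observations with an automatically generated tag."""
    tag = FIRST_TAG if can_move else SECOND_TAG
    return {"image": image, "position": position, "tag": tag}

# Three moments: the robot moves twice, then cannot continue to move.
observations = [("img0", (0, 0), True),
                ("img1", (0, 1), True),
                ("img2", (0, 1), False)]  # stalled: labeled with the second tag
samples = [collect_sample(img, pos, moving) for img, pos, moving in observations]
```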
  • S602 The robot sends the sample information at M of the N times to the server, where M is a positive integer less than or equal to N.
  • the robot can send all acquired samples at N times to the server, that is, M is equal to N.
  • the robot may send the sample information of the moment to the server in real time after acquiring the sample information of the moment, or may send the sample information of the N moments to the server together.
  • the robot can obtain a large amount of sample information in the process of moving, and some of the sample information is of higher quality. Therefore, the robot can send the higher-quality part of the obtained sample information to the server, that is, send the sample information at M of the N times, where M is a positive integer less than N.
  • the robot may send the sample information when it is trapped and the sample information that is not trapped before the robot is trapped to the server. In this way, the information quality can be improved while ensuring sufficient sample information and the system information load can be reduced.
  • S603 The server performs training on the sample information collected from one or more robots to obtain a dilemma recognition model.
  • the server can perform model training based on the received sample information collected by multiple robots to obtain a more accurate dilemma recognition model.
  • the robot receives the dilemma recognition model from the server.
  • the server may send the dilemma recognition model trained based on the sample information obtained by one or more robots to the robot in real time, or it may send the dilemma recognition model to the robot at a predetermined time interval.
  • the robot can receive the dilemma recognition model through the communication module and store it in the memory.
  • S605 The robot avoids obstacles during the movement process according to the dilemma recognition model.
  • the robot can obtain the current direction and speed of movement, and predict the environment image information and location information of the environment where the robot is located at the next moment based on the environment image information and location information at the current moment. So that the robot can determine whether the robot will be trapped in the next moment according to the dilemma recognition model, or determine the probability of the robot being trapped in the next moment. In this way, the robot can determine whether obstacle avoidance is needed at the next moment according to the determined result.
  • the environment image information and position information of the environment in which the robot is located at the next moment are used as input to the dilemma recognition model to determine whether the robot will be trapped in the next moment.
  • the robot determines that the robot will be trapped at the next moment, the robot can avoid obstacles at the current moment to avoid being trapped at the next moment.
  • the environment image information and location information of the environment where the robot is located at the next moment are used as input to the dilemma recognition model to determine the probability of the robot being trapped at the next moment.
  • the robot determines that the probability of the robot being trapped at the next moment is greater than the preset threshold, the robot can perform obstacle avoidance operations.
  • the robot collects environment image information and position information at N times during its movement, and can accurately identify the environment where the robot is located at each of the N times.
  • the robot autonomously judges whether it is in a trapped state to automatically label the environment at the corresponding time according to the judgment result, which can accurately reflect whether the robot is trapped at the corresponding time.
  • the robot sends this information as sample information to the server for training the dilemma recognition model on the server, which can improve the accuracy of the dilemma recognition model.
  • a large amount of labor cost is saved.
  • FIG. 7 is another obstacle avoidance method for a robot provided by an embodiment of this application. As shown in Figure 7, the method may include S701-S706.
  • the robot obtains sample information at N moments in the movement process, where the sample information at each moment of the N moments includes environmental image information at that moment, position information at that moment, and tag information at that moment.
  • the environment image information at that moment is used to indicate the environment where the robot is located at that moment.
  • the position information at this time is used to indicate the position of the robot in the environment indicated by the environment image information at that time.
  • the tag information at this moment is used to indicate the state of the robot at that moment.
  • the tag information includes a first tag or a second tag, the first tag is used to indicate that the robot is in an untrapped state, and the second tag is used to indicate that the robot is in a trapped state.
  • Since the robot is constantly moving while it is working, at different times the robot may be in different environments and at different positions, and whether it is trapped may also differ.
  • the robot can obtain sample information at N times during its movement according to the environment, location and whether it is trapped during the movement. Wherein, N is an integer greater than zero.
  • the following takes the robot acquiring sample information at the first moment in the movement process as an example to illustrate the process of acquiring sample information at a certain moment.
  • the first moment is any one of the above N moments.
  • the robot obtains the environmental image information of the environment where the robot is located at the first moment.
  • the robot 500 may obtain environment image information of the environment in which it is located through an image acquisition module 501 including a camera provided on the robot 500.
  • the environment image information may be a photo of the environment in which the robot is located at the first moment.
  • the robot will use video recording to record the working status or feed back work information to the user during the work. Therefore, in some other embodiments, the robot may use the image at the first moment in the recorded video as the environment image information of the environment in which the robot is located. It is understandable that the video can be decomposed into multiple consecutive images in the time domain. A video corresponds to a period of playing time. Therefore, each image in a video can correspond to a moment in the playing time. Then, each image corresponding to the time can be used as the environment image information of the environment where the robot is at that time. For example, suppose that the video recorded by the robot includes image A at time 1, image B at time 2, and image C at time 3.
  • the robot can use image A as the environment image information A of the environment the robot is in at time 1. Similarly, image B can be used as the environment image information B of the environment where the robot is at another time, such as time 2, and image C as the environment image information C of the environment where the robot is at yet another time, such as time 3.
  • Since the environment image information is directly obtained by the robot, it can more accurately reflect the environment the robot is in.
  • the environmental image information obtained by the robot can include image information of objects on the moving route or in the surrounding area.
  • the moving route of the robot can be determined according to the movement state at the current moment, such as the moving direction and moving speed of the robot at the first moment mentioned above.
  • the movement route of the robot can be planned in advance, and the robot can also obtain information about the movement route by querying the planned movement route in advance.
  • the robot obtains the position information of the robot in the environment at the first moment.
  • the position information of the robot in the environment at the first moment may be relative position information between the robot and other objects in the environment where the robot is located at the first moment.
  • the environment where the robot is located at the first moment may refer to the environment indicated by the environment image information at the first moment.
  • the robot 500 can obtain the relative position information between the robot and other objects in the environment at the first moment through the sensor 502 provided on the robot 500.
  • the relative position information may be distance information between the robot 500 and the aforementioned object.
  • the distance information may include the distance between the robot and objects on or around the moving line at the first moment.
  • the robot 500 may also obtain latitude and longitude information, altitude information, or height information of the robot at the current moment (such as the first moment) through the sensor 502, and this information may also be used as the aforementioned position information. In this way, the robot can more accurately determine its position at that moment based on the position information. Then, at another time (such as the second time), when the acquired position information is the same as the above-mentioned position information, the robot can accurately determine that the environment where it is located at the second time is exactly the same as the environment at the first time.
  • the dilemma recognition model used as the robot's obstacle avoidance guidance is obtained by training on information including the sample information collected by the robot at the first moment. Therefore, the robot can use the dilemma recognition model to make a more accurate judgment about the environment that the robot is in at the second moment (that is, to judge whether the robot will be trapped at the second moment).
  • the robot obtains the tag information indicating the state of the robot at the first moment.
  • the label information may include the above-mentioned first label or the above-mentioned second label.
  • the robot can accurately recognize whether it is trapped at the current moment (such as the first moment). For example, when the robot's wheels are stuck, the robot is surrounded, or the robot spins in a fixed place for a certain period of time, the robot can be considered trapped. The trapped information thus obtained can accurately identify the current state of the robot. Similarly, when the robot is not trapped, for example, when none of the above situations occurs at the first moment, the robot can also accurately determine its current state. After the robot determines whether it is trapped at the first moment, it can obtain the tag information corresponding to its state at the first moment. In effect, the robot automatically tags the environment image information (such as surrounding picture information) and position information (such as distance information) collected at the first moment.
  • After obtaining the environmental image information, the position information, and the tag information at the first moment, the robot can use this information as the sample information at the first moment. Similarly, by repeating the above process during the movement, the sample information at N times during the movement of the robot can be obtained.
  • the robot sends the sample information of M moments out of N moments to the server, where M is a positive integer less than or equal to N.
  • M may be equal to N.
  • the robot can send the acquired sample information at N times to the server. In this way, it can be guaranteed that the server can train based on enough sample information to obtain an accurate dilemma recognition model.
  • the robot is not trapped most of the time when it is moving. Therefore, the number of sample information with the first label may be much larger than the number of sample information with the second label.
  • the untrapped sample information for a period of time before the robot is trapped has high value. Therefore, in this embodiment of the present application, the robot can filter the sample information at the N times before sending it to the server. In other words, the robot can select the sample information at M of the N times and send it to the server, where the robot is trapped at the Nth time and is not trapped at the M-1 times before that time.
  • M may be less than N.
  • the robot can send sample information of M time with higher value among the N time to the server. In this way, it is possible to effectively reduce the amount of communication data between the robot and the server while ensuring that the server can obtain an accurate dilemma recognition model.
  • the tag information included in the sample information at the Nth time is used to indicate that the robot is in a trapped state at the Nth time, that is, the sample information at the Nth time may include a second tag identifying that the robot is trapped at that moment.
  • the robot sends the sample information at the first M-1 times and the sample information at the Nth time to the server.
  • the memory of the robot can be a buffer.
  • the robot can store the sample information acquired during the movement in the buffer, and monitor whether the label information in the sample information acquired at the current time is the second label.
  • If the tag information in the sample information acquired at the current time is the second tag, the robot sends the sample information in the buffer to the server. If the tag information in the sample information obtained at the current moment does not include the second tag, the robot continues to obtain sample information and store it in the buffer until the robot obtains sample information with the second tag. For example, suppose the buffer of the robot can store up to 10 pieces of sample information. If the 10th piece of sample information in the buffer includes the second tag, the robot can send the 10 pieces of sample information in the buffer to the server.
  • Otherwise, the robot can continue to obtain sample information, delete the earliest of the 10 pieces of sample information already in the buffer, store the new sample information in the buffer, and so on, until sample information including the second tag appears.
  • the robot can clear the buffer after sending sample information to the server.
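The buffer behavior described above (keep the most recent samples, flush to the server when a trapped sample arrives, then clear the buffer) can be sketched as follows; the buffer size of 10 follows the example above, while the field names and the `send_to_server` stand-in are illustrative:

```python
from collections import deque

SECOND_TAG = 1   # trapped (illustrative tag value)
sent_batches = []  # stand-in for the uplink to the server

def send_to_server(batch):
    sent_batches.append(list(batch))

# maxlen=10: appending an 11th sample automatically evicts the oldest one.
buffer = deque(maxlen=10)

def on_new_sample(sample):
    buffer.append(sample)
    if sample["tag"] == SECOND_TAG:  # trapped sample: flush the buffer
        send_to_server(buffer)
        buffer.clear()               # clear after sending

for i in range(12):                  # 12 untrapped samples; buffer keeps last 10
    on_new_sample({"id": i, "tag": 0})
on_new_sample({"id": 12, "tag": SECOND_TAG})  # trapped sample triggers the send
```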
  • the robot can send the acquired sample information to the server in various forms.
  • the robot can send multiple sample information to the server together.
  • the robot can send multiple sample information to the server several times.
  • the robot can send multiple sample information to the server at a certain time interval.
  • When communication with the server is interrupted, the robot can store the sample information in the memory provided on the robot. After the robot resumes communication with the server, the robot can send the sample information stored in the memory to the server.
  • the server performs training on sample information collected from one or more robots to obtain a dilemma recognition model.
  • the server may divide the acquired sample information into two parts (such as the first type of sample information and the second type of sample information).
  • the first type of sample information can be used to train and obtain the dilemma recognition model
  • the second type of sample information can be used to detect whether the detection result of the dilemma recognition model is accurate.
  • the environmental image information and position information in each of the Y pieces of second-type sample information can be input into the initial model to check the accuracy of the dilemma recognition model. For example, if the trapped/untrapped state indicated by the result the initial model outputs for a certain piece of second-type sample information is the same as the state indicated by the tag information in that sample information, the verification of the initial model for that sample information is determined to be accurate; if the two states differ, the verification result for that sample information is determined to be inaccurate.
  • the verification accuracy rate of the initial model can be obtained by aggregating the verification results over all of the test samples.
  • If the accuracy rate is greater than a preset threshold, the initial model can be used as the dilemma recognition model. If the accuracy rate is less than the preset threshold, the initial model can be considered not accurate enough, and training needs to be continued.
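As a sketch of this verification step, assuming an illustrative model interface (a callable returning 0 for untrapped, 1 for trapped) and an illustrative preset accuracy threshold:

```python
def verify(model, test_samples, threshold=0.9):
    """Check a candidate model against held-out second-type samples."""
    correct = 0
    for s in test_samples:
        predicted = model(s["image"], s["position"])  # 0 untrapped, 1 trapped
        if predicted == s["tag"]:
            correct += 1
    accuracy = correct / len(test_samples)
    return accuracy, accuracy > threshold  # keep the model only above threshold

# A toy "model" that always predicts untrapped, tried on 4 held-out samples
# of which one is labeled trapped, so 3 of 4 verifications are accurate.
always_untrapped = lambda image, position: 0
tests = [{"image": None, "position": None, "tag": t} for t in (0, 0, 0, 1)]
acc, good_enough = verify(always_untrapped, tests, threshold=0.9)
```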
  • the server can add new sample information from the robot to the training, and further train the above-mentioned initial model to obtain a model with higher accuracy.
  • the server can determine a model with an accuracy rate greater than a preset threshold, and send the model to the robot as a dilemma recognition model.
  • training on the X pieces of first-type sample information may be performed according to a loss function or a cost function.
  • the loss function or cost function is a function that maps the value of a random event or its related random variable to a non-negative real number to express the "risk" or "loss" of the random event.
  • the model parameters can be solved by minimizing the loss function (cost function).
  • the gradient descent method can be used to solve it step by step to obtain the minimized loss function and model parameter values, so as to realize the acquisition of the dilemma recognition model (initial model).
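A minimal sketch of solving model parameters step by step by gradient descent on a loss function; a plain linear model and toy data stand in for the dilemma recognition model, and the learning rate and step count are illustrative:

```python
# Gradient descent on a mean-squared-error loss for a linear stand-in model
# y ≈ w*x + b. The data is generated from y = 2x + 1, so the solver should
# recover w ≈ 2 and b ≈ 1.
def gradient_descent(xs, ys, lr=0.1, steps=200):
    w, b = 0.0, 0.0                    # default (untrained) parameter values
    n = len(xs)
    for _ in range(steps):
        dw = db = 0.0
        for x, y in zip(xs, ys):
            err = (w * x + b) - y      # prediction error for this sample
            dw += 2 * err * x          # d(loss)/dw contribution
            db += 2 * err              # d(loss)/db contribution
        w -= lr * dw / n               # step opposite the averaged gradient
        b -= lr * db / n
    return w, b

w, b = gradient_descent([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```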
  • the dilemma recognition model obtained through training may be a model based on a convolutional neural network.
  • the dilemma recognition model may include an input layer, a convolutional layer, a pooling layer, a fully connected layer, and an output layer. It should be noted that the figure only illustrates two convolutional layers, two pooling layers, and one fully connected layer.
  • the dilemma recognition model in the embodiment of the present application may include multiple convolutional layers, multiple pooling layers, and multiple fully connected layers.
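The layer types named above can be illustrated with a toy forward pass (input image → convolution → pooling → fully connected output); the 4×4 image, the kernel, and the output weights are illustrative only, and a real dilemma recognition model would learn such parameters during training:

```python
# Toy single-channel forward pass: convolution, max pooling, fully connected.
def conv2d(img, kernel):
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(w)] for i in range(h)]

def max_pool(img, size=2):
    return [[max(img[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(img[0]) - size + 1, size)]
            for i in range(0, len(img) - size + 1, size)]

def dense(features, weights, bias=0.0):
    return sum(f * w for f, w in zip(features, weights)) + bias

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [1, 1, 0, 0],
         [1, 1, 0, 0]]
edge_kernel = [[1, -1], [1, -1]]          # crude vertical-edge detector
feature_map = conv2d(image, edge_kernel)  # 3x3 feature map
pooled = max_pool(feature_map)            # 1x1 after one 2x2 pooling window
flat = [v for row in pooled for v in row] # flatten for the fully connected layer
score = dense(flat, [0.5], bias=0.25)     # output layer: a single score
```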
  • the sample information may be sample information obtained during one movement of one robot, sample information obtained during multiple movements of one robot, or sample information obtained during multiple movements of multiple robots.
  • the embodiments of the application are not limited here.
  • the robot receives the dilemma recognition model from the server.
  • the server trains and obtains the dilemma recognition model according to multiple sample information sent by one or more robots, and sends the dilemma recognition model to the robot.
  • the robot 500 can receive the dilemma recognition model through the communication module 505 provided on the robot, and store the dilemma recognition model in the memory 504. To use the stored dilemma recognition model to avoid obstacles, such as performing the following S705-S706.
  • S705 The robot obtains the moving direction and moving speed of the robot at the current moment.
  • the robot can predict the environment the robot will be in at the next moment.
  • the robot can obtain the moving direction and moving speed of the robot at the current moment through a sensor provided on the robot. And get the environmental image information and location information at the current moment.
  • the robot's processor can obtain the environment image information of the environment where the robot is located at the next moment and the location information in the environment according to the robot's current moving direction, moving speed, and current environment image information and location information.
  • Suppose the position of the robot at the current moment is the position shown in (a) of FIG. 9, and the robot is moving in direction 1. According to the first position information of the robot at the current moment (such as the distance information between the robot and object 1 and the distance information between the robot and object 2), together with the moving direction and the moving speed, the second position information of the robot at the next moment can be obtained. Then, according to this position information and the image information of the environment where the robot is located at the current moment, a triangular transformation can be applied to obtain the image information of the environment where the robot is located at the next moment, for example, as shown in (b) in FIG. 9.
  • the robot obtains the environment image information P1 (such as a picture of the environment in which the robot is located) through the image acquisition module, and obtains the position information S1 (such as the distance between the robot and an object in the environment) through the sensor.
  • the robot can obtain the current moving direction and moving speed through sensors.
  • Here, S1 is the distance between the robot and the object in front at the current moment, S2 is the distance between the robot and the object in front at the next moment, t is the time interval between the current moment and the next moment, and V is the moving speed of the robot at the current moment, so that S2 = S1 - V × t.
  • the robot can also calculate and obtain the image information P2 of the environment where the robot is located at the next time by using a triangle transformation based on the distance (such as S1-S2) that it moves from the current time to the next time.
  • the robot obtains the environment image information P2 of the environment where the robot is located at the next time and the position information S2 of the environment where the robot is located at the next time.
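The distance prediction above can be sketched as follows, with illustrative values assumed for S1, V, and t:

```python
# Next-moment distance prediction: the robot moves toward the object ahead,
# so the predicted distance shrinks by V * t (clamped at zero).
def predict_next_distance(s1, v, t):
    """s1: current distance to the object ahead; v: speed; t: time step."""
    s2 = s1 - v * t
    return max(s2, 0.0)  # the distance cannot become negative

s1 = 1.2   # metres to the object ahead at the current moment (illustrative)
v = 0.5    # metres per second (illustrative)
t = 0.4    # seconds between the current and the next moment (illustrative)
s2 = predict_next_distance(s1, v, t)   # 1.2 - 0.5 * 0.4 = 1.0
```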
  • S706 The robot predicts whether it will be trapped at the next moment according to the environment image information, location information, and the dilemma recognition model of the environment in the next moment, and performs obstacle avoidance.
  • the robot can use the image information P2 and the position information S2 of the environment where the robot is located at the next moment as the input of the dilemma recognition model, and obtain the probability that the robot is trapped at the next moment through calculation.
  • the robot can compare the probability of being trapped with a preset threshold. When the robot determines that the probability of being trapped at the next moment is greater than the preset threshold, the robot can change the current motion strategy (such as the direction of motion at the current moment) to avoid Obstacle to avoid the situation that the robot is trapped during the movement. When the robot determines that the probability of being trapped at the next moment is less than the preset threshold, it can be considered that the robot will not be trapped at the next moment, and the robot can continue to execute the current motion strategy and move.
  • the dilemma recognition model can also output a certain prediction result of whether the robot will be trapped at the next moment, that is, it will be trapped at the next moment or will not be trapped at the next moment.
  • If the prediction result is that the robot will be trapped at the next moment, the robot can change the current motion strategy (such as the current motion direction) to avoid obstacles, so as to avoid being trapped during the movement. If the prediction result is that the robot will not be trapped at the next moment, the robot can continue to execute the current motion strategy and keep moving.
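The threshold decision described above can be sketched as follows; the threshold value and the action names are illustrative:

```python
# Compare the model's trapped probability for the next moment against a
# preset threshold and pick the motion strategy accordingly.
PRESET_THRESHOLD = 0.8  # illustrative value

def next_action(trapped_probability, threshold=PRESET_THRESHOLD):
    if trapped_probability > threshold:
        return "change_motion_strategy"   # e.g. change the motion direction
    return "continue_current_strategy"    # keep moving as planned

action = next_action(0.95)  # high trapped probability: avoid the obstacle
```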
  • the process of the robot performing the above S701-S706 is not performed only once, but can be executed cyclically.
  • the robot can continuously obtain sample information at different times and send the sample information to the server, so that the server obtains an updated dilemma recognition model through incremental training based on the updated sample information and the dilemma recognition model.
  • the process of incremental training is similar to that of initial training, except that in incremental training the loaded model is the previously trained model and its weights are the already-trained weights, whereas in initial training the loaded weights are default values. After incremental training is completed, the second type of sample information is used to validate the model.
  • the server can deliver the newly trained model to the robot, so that the robot can recognize the model according to the updated dilemma, and use it to guide the subsequent movement for more effective obstacle avoidance.
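The difference between initial and incremental training described above can be sketched as follows. `train_round` is a stand-in for a real training routine, and the list-of-weights representation is purely illustrative.

```python
def train_round(weights, samples):
    # Placeholder for one round of model training: here we simply append
    # the number of samples seen, so the sketch stays runnable.
    return weights + [len(samples)]

def initial_training(samples):
    # Initial training loads a model with default (untrained) weights.
    default_weights = []
    return train_round(default_weights, samples)

def incremental_training(trained_weights, new_samples):
    # Incremental training loads the previously trained model and its
    # already-trained weights, then continues training on the new samples.
    return train_round(trained_weights, new_samples)
```

The server would run `initial_training` once, then `incremental_training` each time a robot uploads fresh sample information, delivering the updated model back to the robot.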
  • the robot can autonomously acquire the sample information at N moments, and the sample information corresponding to each of the N moments includes the environment image information of the environment where the robot is located at that moment, and the position information of the robot in the environment indicated by that environment image information. Since this information is collected by the robot itself, it accurately identifies the environment the robot is in at that moment.
  • each piece of sample information also includes tag information used to indicate whether the robot is trapped at that moment.
  • the tag information is likewise marked by the robot according to its own recognition of its current state, so it also accurately reflects the current state of the robot.
  • the robot can improve the accuracy of the dilemma recognition model by sending such sample information to the server for the server to train the dilemma recognition model.
  • the server can also perform incremental training on the dilemma recognition model based on the updated sample information to obtain a more accurate dilemma recognition model and send it to the robot. In this way, the robot can continuously obtain sample information and send it to the server during movement, and the updated dilemma recognition model in turn provides more accurate obstacle avoidance guidance for the robot's movement.
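One moment's sample information, as described above, could be represented by a record like the following. The field names and types are assumptions for illustration; the source does not define a concrete data format.

```python
from dataclasses import dataclass
from typing import Tuple

UNTRAPPED = 0  # first tag: robot is not trapped
TRAPPED = 1    # second tag: robot is trapped

@dataclass
class Sample:
    env_image: bytes               # environment image information at this moment
    position: Tuple[float, float]  # robot's position in the imaged environment
    tag: int                       # UNTRAPPED or TRAPPED, set by the robot itself
```

The robot would append one such record per moment during movement and upload batches of them to the server for training.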
  • the robot includes hardware structures and/or software modules corresponding to each function, and together these hardware structures and/or software modules may constitute the robot.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or by computer-software-driven hardware depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
  • the embodiment of the application can divide the functional modules of the robot according to the above method examples.
  • the robot can include an obstacle avoidance device, which may be divided into functional modules corresponding to each function, or two or more functions may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 10 shows a schematic diagram of a possible composition of the obstacle avoidance device involved in the above embodiment.
  • the obstacle avoidance device includes: an acquiring unit 1001, a communication unit 1003, and an obstacle avoidance unit 1004.
  • the acquisition unit 1001 may be the image acquisition module 501 and the sensor 502 shown in FIG. 5
  • the communication unit 1003 may be the communication module 505 shown in FIG. 5
  • the obstacle avoidance unit 1004 may be the processor 503 shown in FIG. 5
  • the obtaining unit 1001 is used to obtain sample information at N times during the movement of the robot, and N is an integer greater than zero.
  • the sample information at each of the N moments includes: environmental image information indicating the environment in which the robot is located at that moment, position information indicating the position of the robot in the environment indicated by the environmental image information at that moment, and tag information indicating the state of the robot at that moment;
  • the tag information includes a first tag or a second tag, the first tag is used to indicate that the robot is in an untrapped state, and the second tag is used to indicate that the robot is in a trapped state.
  • the acquiring unit 1001 may be used to execute S601 of the obstacle avoidance method for the robot shown in FIG. 6.
  • the acquiring unit 1001 may also be used to execute S701 of the robot obstacle avoidance method shown in FIG. 7.
  • the communication unit 1003 is configured to send the sample information of M moments among the N moments to the server, where M is a positive integer less than or equal to N.
  • the communication unit 1003 may be used to execute S602 of the robot obstacle avoidance method shown in FIG. 6 above.
  • the communication unit 1003 may also be used to execute S702 of the obstacle avoidance method for the robot shown in FIG. 7.
  • the communication unit 1003 is also configured to receive a dilemma recognition model from a server, and the dilemma recognition model is trained on sample information collected by one or more robots. Exemplarily, the communication unit 1003 may also be used to execute S604 of the robot obstacle avoidance method shown in FIG. 6 above. The communication unit 1003 may also be used to execute S704 of the robot obstacle avoidance method shown in FIG. 7 above.
  • the obstacle avoidance unit 1004 is used to avoid obstacles during the movement of the robot according to the dilemma recognition model. Exemplarily, the obstacle avoidance unit 1004 may also be used to execute S605 of the robot obstacle avoidance method shown in FIG. 6. The obstacle avoidance unit 1004 may also be used to execute S706 of the robot obstacle avoidance method shown in FIG. 7.
  • the above obstacle avoidance device may further include: a determining unit 1002.
  • the determining unit 1002 is configured to determine, among the sample information of the N moments, that the tag information included in the sample information of the Nth moment indicates that the robot is in a trapped state at the Nth moment.
  • the obstacle avoidance device provided by the embodiment of the present application is used to implement the above-mentioned robot obstacle avoidance method, and therefore can achieve the same effect as the above-mentioned robot obstacle avoidance method.
  • another possible composition of the obstacle avoidance device involved in the foregoing embodiment may include: a collection module, a processing module, and a communication module.
  • the collection module is used to collect information needed by the robot to avoid obstacles.
  • the collection module is used to support the robot to execute S601 in FIG. 6, S701 and S705 in FIG. 7 and/or other processes used in the technology described herein.
  • the processing module is used to control and manage the actions of the robot.
  • the processing module is used to support the robot to execute S605 in FIG. 6, S706 in FIG. 7 and/or other processes used in the technology described herein.
  • the communication module is used to support communication between the robot and other network entities, such as the communication between the server and other functional modules or network entities shown in FIG. 6 or FIG. 7.
  • the robot may also include a storage module for storing sample information of the robot, a dilemma recognition model, and other program codes and data.
  • the obstacle avoidance device provided by the embodiment of the present application is used to implement the above-mentioned robot obstacle avoidance method, and therefore can achieve the same effect as the above-mentioned robot obstacle avoidance method.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are only illustrative; for example, the division of modules or units is only a logical function division, and there may be other divisions in actual implementation: multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place, or they may be distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, and other media that can store program code.

Abstract

An obstacle avoidance method, comprising: a robot (401) acquires sample information at N moments during movement. The sample information of each moment among the N moments comprises: environment image information indicating the environment in which the robot (401) is located at said moment, position information indicating the position in which the robot (401) is located at said moment, and label information indicating whether the robot (401) is in a trapped state at said moment. The robot (401) sends sample information of M moments among the N moments to a server (402). The robot (401) receives from the server (402) a predicament recognition model trained on sample information collected from one or a plurality of robots (401). On the basis of the predicament recognition model, the robot (401) performs obstacle avoidance during movement. The method solves the problems of high labour cost and low accuracy of trained models when machine learning methods are used for predicament recognition. Also disclosed are an obstacle avoidance apparatus and system.

Description

Robot obstacle avoidance method, device, and system
This application claims priority to Chinese patent application No. 201910566538.1, filed with the China National Intellectual Property Administration on June 27, 2019 and entitled "Robot obstacle avoidance method, device, and system", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of electronic devices, and in particular to a robot obstacle avoidance method, device, and system.
Background
Robots (such as sweeping robots) are widely used in people's daily lives. Because a robot's work is almost completely autonomous, it often becomes trapped during operation. At present, machine learning methods can be used to train robots to recognize such predicaments and avoid obstacles.
Among the more common machine learning methods are supervised learning and reinforcement learning. In supervised learning, a large amount of sample data must be collected manually and manually labeled as trapped or not trapped; a supervised learning model is then trained on the labeled sample data to obtain a model by which the robot judges whether obstacle avoidance is needed.
As an example, suppose the robot is a sweeping robot and the sample data is information about the environment it is in. In the environment shown in FIG. 1A, the sweeping robot is entangled in the cables of a power strip and cannot move freely, so this sample can be manually labeled as trapped. In the environment shown in FIG. 1B, the sweeping robot is not caught by the power strip or its cables and can move freely, so this sample can be manually labeled as not trapped. In the environment shown in FIG. 2A, the sweeping robot is wedged into the corner between a wall and a fixed object and cannot continue to move freely, so this sample can be manually labeled as trapped. In the environment shown in FIG. 2B, the sweeping robot is far from the wall and the fixed object and can continue to move freely, so this sample can be manually labeled as not trapped. In the environment shown in FIG. 3A, the sweeping robot is stuck in the gap between a fixed object and the floor and cannot continue to move freely, so this sample can be manually labeled as trapped. Correspondingly, as shown in FIG. 3B, when the robot is far from the fixed object and the gap between it and the floor and can move freely, the sample can be manually labeled as not trapped. Training a supervised learning model on a large number of labeled samples similar to those shown in FIGS. 1A-3B yields a model for the sweeping robot's obstacle avoidance.
Reinforcement learning is a method in which the robot learns by trial and error. Based on a large amount of manually collected sample data and the trapped/not-trapped labels manually assigned to it, the robot determines, through repeated trial and error, the behavior that receives the greatest reward from the environment, and records the reward together with the corresponding trial-and-error action as a reference for later work.
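The trial-and-error reward bookkeeping described above can be illustrated with a standard tabular Q-learning update. This specific update rule is a common textbook example, not the exact method of the source; state and action names are hypothetical.

```python
def q_update(q, state, action, reward, next_best, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step: nudge the stored value of (state, action)
    toward the observed reward plus the discounted best value of the next
    state, so the highest-reward behaviors are retained for later work."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * next_best - old)
    return q
```

Repeating this update over many trials gradually concentrates value on the actions that earn the greatest environment reward.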
Evidently, whether supervised learning or reinforcement learning is used for predicament recognition, a large amount of sample data is required. At present this data must be collected manually, for example via web crawler searches. To distinguish the robot states corresponding to the collected samples (such as whether the robot is trapped), the samples must be manually inspected and labeled, which incurs substantial labor cost. In addition, because there is no uniform standard for sample collection, the quality of the collected samples is hard to control, leading to low labeling quality such as labeling errors. Moreover, the samples collected by web crawlers, such as the environment information mentioned above, are mostly environment pictures taken from a user's perspective and cannot intuitively and accurately reflect the environment the robot is in. All of this makes the trained model inaccurate.
Summary of the Invention
The embodiments of the present application provide a robot obstacle avoidance method, device, and system, solving the problems of high labor cost and low accuracy of trained models when machine learning methods are used for predicament recognition.
To achieve the above objectives, the embodiments of this application adopt the following technical solutions:
A first aspect of the embodiments of the present application provides a robot obstacle avoidance method. The method includes: the robot obtains sample information at N moments during its movement, N being an integer greater than 0, where the sample information at each of the N moments includes environment image information indicating the environment the robot is in at that moment, position information indicating the robot's position at that moment in the environment indicated by the environment image information, and tag information indicating the robot's state at that moment; the tag information includes a first tag or a second tag, the first tag indicating that the robot is not trapped and the second tag indicating that the robot is trapped. The robot sends the sample information of M of the N moments to a server, M being a positive integer less than or equal to N. The robot receives from the server a dilemma recognition model trained on sample information collected by one or more robots, and avoids obstacles during movement according to the dilemma recognition model.
In this way, the robot autonomously collects the environment image information and position information and, based on its own current state, assigns completely correct tag information identifying whether it is trapped, which guarantees the accuracy of the sample information while avoiding a large investment of manpower. The robot sends the sample information of the N moments to the server, and the dilemma recognition model obtained through training is used to guide the robot's obstacle avoidance during movement. Since the accuracy of the sample information is guaranteed, the accuracy of the dilemma recognition model is also greatly improved, so the robot can be guided to avoid obstacles more accurately.
With reference to the first aspect and the foregoing possible implementations, in another possible implementation, the environment image information at each moment includes image information of objects on and around the robot's movement route in the environment the robot is in at that moment. This ensures that the environment image information covers the objects most likely to obstruct the robot's movement.
With reference to the first aspect and the foregoing possible implementations, in another possible implementation, the position information at each moment includes the relative position of the robot with respect to the objects on and around its movement route in the environment indicated by the environment image information at that moment. This makes it possible to determine the robot's specific position in its environment at that moment, so that the robot can accurately predict its position in the environment at the next moment.
With reference to the first aspect and the foregoing possible implementations, in another possible implementation, the robot sends the sample information of M of the N moments to the server as follows: the robot determines that, among the sample information of the N moments, the tag information included in the sample information of the Nth moment indicates that the robot is trapped at the Nth moment; the robot then sends the sample information of the M-1 moments preceding the Nth moment together with the sample information of the Nth moment to the server. Because, for training the dilemma recognition model, the sample information at the trapped moment and during the period immediately before it is of higher quality, this ensures the transmission of high-quality sample information while reducing the total amount of information transmitted and the communication load of the system.
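The M-moment upload rule just described could be sketched as follows; representing samples as plain dicts with a 'tag' key is an illustrative assumption.

```python
TRAPPED = 1  # second tag: robot is trapped

def select_upload_window(samples, m):
    """samples: time-ordered list of per-moment sample dicts, each with a
    'tag' key. If the newest (Nth) sample is tagged trapped, return the last
    m samples -- the Nth moment plus the M-1 moments before it -- for upload;
    otherwise return an empty list (nothing is sent yet)."""
    if samples and samples[-1]["tag"] == TRAPPED:
        return samples[-m:]
    return []
```

Uploading only this window keeps the highest-quality samples (the trapped moment and the moments leading up to it) while reducing the communication load.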
With reference to the first aspect and the foregoing possible implementations, in another possible implementation, the robot avoids obstacles during movement according to the dilemma recognition model as follows: the robot obtains the environment image information and position information of the current moment; based on these, together with its current movement direction and speed, it obtains the environment image information and position information of the next moment; it then determines, from the next-moment environment image information, the next-moment position information, and the dilemma recognition model, the probability that it will be trapped at the next moment; if that probability is greater than a preset threshold, the robot changes its motion strategy to avoid obstacles. By predicting the environment image information and position information of the next moment and combining them with the dilemma recognition model, the robot can judge fairly accurately whether it will be trapped at the next moment and avoid obstacles accordingly, effectively preventing it from becoming trapped during movement.
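Extrapolating the next-moment position from the current position, movement direction, and speed can be sketched with simple dead reckoning. The flat 2-D kinematics and a radian heading are assumptions for illustration; the source does not specify the prediction computation.

```python
import math

def predict_next_position(x, y, heading_rad, speed, dt=1.0):
    """Project the robot's position one time step ahead along its current
    heading at its current speed (straight-line dead reckoning)."""
    return (x + speed * dt * math.cos(heading_rad),
            y + speed * dt * math.sin(heading_rad))
```

The predicted position, together with the corresponding predicted environment image, would then be fed to the dilemma recognition model to estimate the next-moment trapped probability.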
A second aspect of the embodiments of the present application provides an obstacle avoidance device. The device may include an acquisition unit, a communication unit, and an obstacle avoidance unit. The acquisition unit is used to obtain sample information at N moments during the robot's movement, N being an integer greater than 0, where the sample information at each of the N moments includes environment image information indicating the environment the robot is in at that moment, position information indicating the robot's position at that moment in the environment indicated by the environment image information, and tag information indicating the robot's state at that moment; the tag information includes a first tag or a second tag, the first tag indicating that the robot is not trapped and the second tag indicating that the robot is trapped. The communication unit is used to send the sample information of M of the N moments to the server, M being a positive integer less than or equal to N, and is also used to receive from the server a dilemma recognition model trained on sample information collected by one or more robots. The obstacle avoidance unit is used to avoid obstacles during the robot's movement according to the dilemma recognition model.
With reference to the second aspect and the foregoing possible implementations, in another possible implementation, the environment image information at each moment includes image information of objects on and around the robot's movement route in the environment the robot is in at that moment.
With reference to the second aspect and the foregoing possible implementations, in another possible implementation, the position information at each moment includes the relative position of the robot with respect to the objects on and around its movement route in the environment indicated by the environment image information at that moment.
With reference to the second aspect and the foregoing possible implementations, in another possible implementation, a determining unit is used to determine that, among the sample information of the N moments, the tag information included in the sample information of the Nth moment indicates that the robot is trapped at the Nth moment; the communication unit then sends the sample information of the M-1 moments preceding the Nth moment together with the sample information of the Nth moment to the server.
With reference to the second aspect and the foregoing possible implementations, in another possible implementation, the acquisition unit is also used to obtain the environment image information and position information of the current moment and, based on these together with the robot's current movement direction and speed, obtain the environment image information and position information of the next moment. The obstacle avoidance unit determines, from the next-moment environment image information, the next-moment position information, and the dilemma recognition model, the probability that the robot will be trapped at the next moment; if that probability is greater than a preset threshold, it changes the robot's motion strategy to avoid obstacles.
A third aspect of the embodiments of the present application provides a robot obstacle avoidance system. The system may include one or more robots and a server. The robot obtains sample information at N moments during its movement and sends it to the server, where N is an integer greater than 0 and the sample information at each of the N moments includes environment image information indicating the environment the robot is in at that moment, position information indicating the robot's position at that moment in the environment indicated by the environment image information, and tag information indicating the robot's state at that moment; the tag information includes a first tag or a second tag, the first tag indicating that the robot is not trapped and the second tag indicating that the robot is trapped. The server receives the sample information from the one or more robots, trains on it to obtain a dilemma recognition model, and sends the dilemma recognition model to the robot. The robot receives the dilemma recognition model and avoids obstacles during movement according to it.
结合第三方面和上述可能的实现方式，在另一种可能的实现方式中，服务器接收到样本信息包括第一类样本信息和第二类样本信息；服务器用于对接收到的一个或多个机器人的样本信息进行训练，获得困境识别模型，包括：服务器，用于对第一类样本信息进行训练获得初始模型，根据第二类样本信息对初始模型的准确率进行校验，若校验准确率大于预设阈值，则将初始模型确定为困境识别模型，若校验准确率小于预设阈值，则服务器接收机器人发送的新的样本信息，根据新的样本信息以及初始模型继续进行训练，并根据第二类样本信息对继续训练得到的模型进行校验，直到校验准确率大于预设阈值，将校验准确率大于预设阈值的模型确定为困境识别模型；其中，若将第二类样本信息中环境图像信息和位置信息输入初始模型获得的结果所指示的机器人是否被困住的状态，与第二类样本信息中的标签信息所指示的机器人是否被困住的状态相同，则确定样本信息的校验准确；若将第二类样本信息中每一个样本信息的环境图像信息和位置信息输入初始模型获得的结果所指示的机器人是否被困住的状态，与第二类样本信息中的标签信息所指示的机器人是否被困住的状态不同，则确定样本信息的校验不准确；根据第二类样本信息中每个样本信息的校验结果确定校验准确率。这样，服务器可以通过第一类样本信息以及第二类样本信息的训练和校验，获取准确的困境识别模型并发送给机器人用于进行有效的避障。With reference to the third aspect and the foregoing possible implementations, in another possible implementation, the sample information received by the server includes first-type sample information and second-type sample information. The server being configured to train on the received sample information of the one or more robots to obtain the dilemma recognition model includes: the server is configured to train on the first-type sample information to obtain an initial model, and to verify the accuracy of the initial model based on the second-type sample information; if the verification accuracy is greater than a preset threshold, the initial model is determined to be the dilemma recognition model; if the verification accuracy is less than the preset threshold, the server receives new sample information sent by the robots, continues training based on the new sample information and the initial model, and verifies the model obtained by the continued training against the second-type sample information, until the verification accuracy is greater than the preset threshold, at which point the model whose verification accuracy is greater than the preset threshold is determined to be the dilemma recognition model. Here, if the trapped/untrapped state indicated by the result of inputting the environment image information and position information of a piece of second-type sample information into the initial model is the same as the trapped/untrapped state indicated by the label information of that sample, the verification of that sample information is determined to be accurate; if, for a piece of second-type sample information, the state indicated by the result of inputting its environment image information and position information into the initial model differs from the state indicated by its label information, the verification of that sample information is determined to be inaccurate; the verification accuracy is then determined from the verification result of each piece of second-type sample information. In this way, through training and verification on the first-type and second-type sample information, the server can obtain an accurate dilemma recognition model and send it to the robots for effective obstacle avoidance.
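The train-and-verify flow described above can be sketched as follows. This is a minimal illustration under assumed interfaces (a generic `train` callable, dict-shaped samples with `image`, `position`, and `label` keys, and an arbitrary accuracy threshold); it is not the concrete implementation of this application.

```python
# Illustrative sketch of the server-side train/verify loop described above.
# The model interface, sample layout, and threshold value are assumptions.

def verify(model, validation_samples):
    """Return the verification accuracy: the fraction of second-type samples
    whose predicted trapped/untrapped state matches the sample's label."""
    correct = 0
    for sample in validation_samples:
        predicted = model(sample["image"], sample["position"])
        if predicted == sample["label"]:
            correct += 1
    return correct / len(validation_samples)

def train_until_accurate(train, initial_samples, validation_samples,
                         threshold=0.95, get_new_samples=None):
    """Train an initial model on first-type samples, then keep training on
    newly received samples until verification accuracy exceeds the threshold."""
    model = train(None, initial_samples)           # obtain the initial model
    while verify(model, validation_samples) <= threshold:
        new_samples = get_new_samples()            # new samples from robots
        model = train(model, new_samples)          # continue training
    return model
```

In practice `train` would be a real learning procedure (e.g. gradient-based training of a neural network); the loop structure itself mirrors the verification-driven retraining described in the text.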
第四方面，本申请实施例提供一种机器人，该机器人可以包括处理器，用于与存储器相连，调用存储器中存储的程序，以执行如第一方面或第一方面的可能的实现方式中任一种的机器人避障方法。In a fourth aspect, an embodiment of the present application provides a robot. The robot may include a processor configured to be connected to a memory and to call a program stored in the memory, so as to execute the robot obstacle avoidance method of the first aspect or any one of the possible implementations of the first aspect.
第五方面，本申请实施例提供一种计算机可读存储介质，包括：计算机软件指令；当计算机软件指令在避障装置中运行时，使得避障装置执行如第一方面或第一方面的可能的实现方式中任一种的机器人避障方法。In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium, including computer software instructions; when the computer software instructions run on an obstacle avoidance apparatus, the obstacle avoidance apparatus is caused to execute the robot obstacle avoidance method of the first aspect or any one of the possible implementations of the first aspect.
第六方面，本申请实施例提供一种计算机程序产品，当计算机程序产品在计算机上运行时，使得计算机执行如第一方面或第一方面的可能的实现方式中任一种的机器人避障方法。In a sixth aspect, an embodiment of the present application provides a computer program product; when the computer program product runs on a computer, the computer is caused to execute the robot obstacle avoidance method of the first aspect or any one of the possible implementations of the first aspect.
可以理解地，上述提供的第二方面的避障装置、第三方面的机器人避障系统、第四方面的机器人、第五方面的计算机可读存储介质以及第六方面的计算机程序产品均用于执行上文所提供的对应的方法，因此，其所能达到的有益效果可参考上文所提供的对应的方法中的有益效果，此处不再赘述。It can be understood that the obstacle avoidance apparatus of the second aspect, the robot obstacle avoidance system of the third aspect, the robot of the fourth aspect, the computer-readable storage medium of the fifth aspect, and the computer program product of the sixth aspect provided above are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, which will not be repeated here.
附图说明Description of the drawings
图1A为机器人在移动过程中的一种环境示意图;Figure 1A is a schematic diagram of an environment during the movement of the robot;
图1B为机器人在移动过程中的另一种环境示意图;Figure 1B is a schematic diagram of another environment during the movement of the robot;
图2A为机器人在移动过程中的另一种环境示意图;Figure 2A is a schematic diagram of another environment during the movement of the robot;
图2B为机器人在移动过程中的另一种环境示意图;Figure 2B is a schematic diagram of another environment during the movement of the robot;
图3A为机器人在移动过程中的另一种环境示意图;Figure 3A is a schematic diagram of another environment during the movement of the robot;
图3B为机器人在移动过程中的另一种环境示意图;Figure 3B is a schematic diagram of another environment during the movement of the robot;
图4为本申请实施例提供的一种机器人避障系统的简化示意图;4 is a simplified schematic diagram of a robot obstacle avoidance system provided by an embodiment of the application;
图5为本申请提供一种机器人的组成示意图;FIG. 5 is a schematic diagram of the composition of a robot provided in this application;
图6为本申请实施例提供的一种机器人避障方法的流程示意图;FIG. 6 is a schematic flowchart of a method for avoiding obstacles for a robot according to an embodiment of the application;
图7为本申请实施例提供的另一种机器人避障方法的流程示意图;FIG. 7 is a schematic flowchart of another obstacle avoidance method for a robot according to an embodiment of the application;
图8为本申请实施例提供的一种困境识别模型的组成示意图;FIG. 8 is a schematic diagram of the composition of a dilemma recognition model provided by an embodiment of the application;
图9为本申请实施例提供的机器人避障示意图;FIG. 9 is a schematic diagram of robot obstacle avoidance provided by an embodiment of the application;
图10为本申请实施例提供的一种避障装置的简化示意图。FIG. 10 is a simplified schematic diagram of an obstacle avoidance device provided by an embodiment of the application.
具体实施方式Detailed description of embodiments
现有的机器人(如扫地机器人)识别困境的方法，需要人工收集样本数据，这些收集到的样本数据还需要人工的打上标签，因此会带来较大人力成本的投入。而人工打标签的过程也可能会出错。示例性的，以机器人为扫地机器人为例进行说明。例如，当扫地机器人处于如图3A所示的位置时，人从如图3A所示的角度观察，可以认为扫地机器人被卡在固定物体与地面之间的缝隙之内，无法继续移动，此时人工会为其打上被困住的标签。然而扫地机器人可能并未被缝隙阻挡，能够继续在该缝隙内工作，这样，就会出现该样本数据的标签信息标注错误的情况发生。另外，目前收集的样本数据，大多是以用户的角度拍摄的环境图片，并不能直观准确的体现机器人所处环境。这些均会导致训练获得的困境识别模型不准确。Existing methods for a robot (such as a sweeping robot) to recognize a dilemma require manual collection of sample data, and the collected sample data also needs to be manually labeled, which incurs a large labor cost. The manual labeling process may also go wrong. Illustratively, a sweeping robot is taken as an example for description. For example, when the sweeping robot is at the position shown in FIG. 3A, a person observing from the angle shown in FIG. 3A may believe that the sweeping robot is stuck in the gap between a fixed object and the ground and cannot continue to move, and will therefore manually give it a "trapped" label. However, the sweeping robot may not actually be blocked by the gap and may be able to continue working inside it; in this case, the label information of the sample data is marked incorrectly. In addition, most of the sample data currently collected are environment pictures taken from the user's perspective, which cannot intuitively and accurately reflect the environment where the robot is located. All of these lead to an inaccurate dilemma recognition model obtained by training.
本申请提供一种机器人避障方法,其基本原理是:机器人获取在自身移动过程中N个(N为大于0的整数)时刻的样本信息,将N个时刻的样本信息发送给服务器。机器人从服务器接收困境识别模型,其中,困境识别模型由一个或者多个机器人采集的样本信息训练所得。机器人可以根据困境识别模型在移动过程中进行避障。The present application provides a robot obstacle avoidance method, the basic principle of which is that the robot obtains sample information at N (N is an integer greater than 0) time during its movement, and sends the sample information at the N time to the server. The robot receives the dilemma recognition model from the server, where the dilemma recognition model is trained on sample information collected by one or more robots. The robot can avoid obstacles in the process of moving according to the dilemma recognition model.
其中,上述N个时刻的每个时刻对应的样本信息包括该时刻机器人所处环境的环境图像信息、机器人该时刻在环境图像信息指示的环境中的位置的位置信息。这些信息均是由机器人采集得到的,其能够准确的标识机器人当时所在的环境。同时,每个样本信息还包括用于指示机器人该时刻是否处于被困住的状态的标签信息。同样的,标签信息也是机器人根据自身识别出当时的状态标记的,也能准确体现机器人当前的状态。这样,通过将这样的样本信息发送给服务器,用于服务器训练困境识别模型,能够提高困境识别模型的准确性。同时,由于不需要人工收集机器人工作过程中的环境图像信息,也不需要人工对该环境图像信息打标签,因此,节省了大量的人力成本投入。Wherein, the sample information corresponding to each of the aforementioned N times includes environmental image information of the environment in which the robot is located at that time, and position information of the position of the robot in the environment indicated by the environmental image information at that time. This information is collected by the robot, which can accurately identify the environment where the robot is at that time. At the same time, each sample information also includes tag information used to indicate whether the robot is trapped at that moment. Similarly, the tag information is also marked by the robot according to its own recognition of the current state, which can also accurately reflect the current state of the robot. In this way, by sending such sample information to the server for the server to train the dilemma recognition model, the accuracy of the dilemma recognition model can be improved. At the same time, since there is no need to manually collect environmental image information during the robot's working process, and no need to manually label the environmental image information, a large amount of labor cost is saved.
以下结合附图对本申请提供的机器人避障方法进行详细说明。The robot obstacle avoidance method provided by the present application will be described in detail below with reference to the accompanying drawings.
请参考图4,为本申请实施例提供的一种机器人避障系统的简化示意图。如图4所示,该机器人避障系统可以包括:机器人401,以及服务器402。需要说明的是,上述机器人401可以包括一个或多个机器人,如图4所示的机器人1-机器人n。Please refer to FIG. 4, which is a simplified schematic diagram of a robot obstacle avoidance system provided by an embodiment of this application. As shown in FIG. 4, the robot obstacle avoidance system may include a robot 401 and a server 402. It should be noted that the aforementioned robot 401 may include one or more robots, such as robot 1-robot n shown in FIG. 4.
其中,对于上述机器人401中的每个机器人均可以用于,在自身移动过程中,获取N个时刻的样本信息,通过无线通信,分别将各自获得的N个时刻的样本信息发送给服务器402。其中,每个时刻的样本信息可以包括:该时刻的环境图像信息,该时刻的位置信息及该时刻的标签信息。环境图像信息用于指示该时刻机器人所处的环境。位置信息用于指示机器人在该时刻的环境图像信息所指示的环境中的位置。标签信息用于指示该时刻机器人是否被困住的状态。Among them, each of the above-mentioned robots 401 can be used to obtain sample information at N times during its own movement, and send the obtained sample information at N times to the server 402 through wireless communication. Wherein, the sample information at each time may include: environmental image information at that time, location information at that time, and tag information at that time. The environment image information is used to indicate the environment where the robot is at that moment. The location information is used to indicate the location of the robot in the environment indicated by the environment image information at that moment. The tag information is used to indicate whether the robot is trapped at that moment.
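As an illustrative sketch only, the per-moment sample information described above could be laid out in memory as follows. The field names and label encoding are assumptions for exposition, not the concrete data format of this application.

```python
# Hypothetical layout of one moment's sample information: environment image,
# position within that environment, and a trapped/untrapped label.
from dataclasses import dataclass

UNTRAPPED = 0  # "first label": the robot is not trapped
TRAPPED = 1    # "second label": the robot is trapped

@dataclass
class Sample:
    timestamp: float          # the moment the sample was taken
    environment_image: bytes  # image of the robot's surroundings at that moment
    position: tuple           # robot position (x, y) in the imaged environment
    label: int                # UNTRAPPED or TRAPPED
```

A robot would accumulate a list of such records during movement and transmit them to the server over its wireless link.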
上述服务器402,可以根据接收来自一个或多个机器人的样本信息,对接收到的一个或多个机器人的样本信息进行训练,获得困境识别模型,并将困境识别模型发送至每个机器人。以便机器人401能够根据接收到的困境识别模型在移动过程中进行避障。The above-mentioned server 402 may train the received sample information of one or more robots according to the sample information received from one or more robots, obtain the dilemma recognition model, and send the dilemma recognition model to each robot. In this way, the robot 401 can avoid obstacles during the movement according to the received dilemma recognition model.
示例性的,服务器402通过样本信息获取困境识别模型的过程,可以通过以下方法实现:服务器402可以将获得的样本信息分为两类,如称为第一类样本信息和第二类样本信息。第一类样本信息可以用作困境识别模型的训练,第二类样本信息可对训练出的困境识别的模型进行检测,以确定训练出的困境识别模型的准确性。服务器402获取困境识别模型的过程在以下实施例中将详细描述,此处不进行赘述。Exemplarily, the process in which the server 402 obtains the dilemma recognition model through sample information can be implemented by the following method: the server 402 can divide the obtained sample information into two types, such as the first type of sample information and the second type of sample information. The first type of sample information can be used for training of the dilemma recognition model, and the second type of sample information can be used to detect the trained dilemma recognition model to determine the accuracy of the trained dilemma recognition model. The process of obtaining the dilemma recognition model by the server 402 will be described in detail in the following embodiments, and will not be repeated here.
其中,上述机器人401可以是扫地机器人,银行等服务行业提供自助服务的机器人等可以自主移动的电子设备。Wherein, the aforementioned robot 401 may be an electronic device that can move autonomously, such as a cleaning robot, a self-service robot provided by a service industry such as a bank, and the like.
请参考图5,为本申请提供一种机器人500的组成示意图。如图5所示,机器人500可以包括图像采集模块501、传感器502、处理器503、存储器504以及通信模块505。Please refer to FIG. 5, which provides a schematic diagram of the composition of a robot 500 for this application. As shown in FIG. 5, the robot 500 may include an image acquisition module 501, a sensor 502, a processor 503, a memory 504, and a communication module 505.
其中，处理器503可以包括一个或多个处理单元，例如：处理器503可以包括应用处理器(application processor,AP),调制解调处理器,图形处理器(graphics processing unit,GPU),图像信号处理器(image signal processor,ISP),控制器,存储器,视频编解码器,数字信号处理器(digital signal processor,DSP),基带处理器,和/或神经网络处理器(neural-network processing unit,NPU)等。其中，不同的处理单元可以是独立的器件，也可以集成在一个或多个处理器中。The processor 503 may include one or more processing units. For example, the processor 503 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be independent devices or may be integrated in one or more processors.
其中,控制器可以是机器人500的神经中枢和指挥中心。控制器可以根据指令操作码和时序信号,产生操作控制信号,完成取指令和执行指令的控制。The controller may be the nerve center and command center of the robot 500. The controller can generate operation control signals according to the instruction operation code and timing signals to complete the control of fetching and executing instructions.
处理器503中还可以设置存储器,用于存储指令和数据。在一些实施例中,处理器中的存储器为高速缓冲存储器。该存储器可以保存处理器刚用过或循环使用的指令或数据。如果处理器需要再次使用该指令或数据,可从所述存储器中直接调用。避免了重复存取,减少了处理器503的等待时间,因而提高了系统的效率。A memory may also be provided in the processor 503 for storing instructions and data. In some embodiments, the memory in the processor is a cache memory. The memory can store instructions or data that the processor has just used or recycled. If the processor needs to use the instruction or data again, it can be directly called from the memory. Repeated access is avoided, the waiting time of the processor 503 is reduced, and the efficiency of the system is improved.
在一些实施例中,处理器503可以包括一个或多个接口。接口可以包括集成电路(inter-integrated circuit,I2C)接口,通用异步收发传输器(universal asynchronous receiver/transmitter,UART)接口,移动产业处理器接口(mobile industry processor interface,MIPI),通用输入输出(general-purpose input/output,GPIO)接口,用户标识模块(subscriber identity module,SIM)接口,和/或通用串行总线(universal serial bus,USB)接口等。In some embodiments, the processor 503 may include one or more interfaces. Interfaces may include integrated circuit (I2C) interface, universal asynchronous receiver/transmitter (UART) interface, mobile industry processor interface (MIPI), general input and output (general -purpose input/output, GPIO) interface, subscriber identity module (SIM) interface, and/or universal serial bus (universal serial bus, USB) interface, etc.
I2C接口是一种双向同步串行总线，包括一根串行数据线(serial data line,SDA)和一根串行时钟线(serial clock line,SCL)。在一些实施例中，处理器可以包含多组I2C总线。处理器可以通过不同的I2C总线接口分别耦合传感器，摄像头等。The I2C interface is a bidirectional synchronous serial bus, which includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor may include multiple sets of I2C buses. The processor may be coupled to the sensors, the camera, and the like through different I2C bus interfaces.
UART接口是一种通用串行数据总线,用于异步通信。该总线可以为双向通信总线。它将要传输的数据在串行通信与并行通信之间转换。在一些实施例中,UART接口通常被用于连接处理器503与通信模块505。例如:处理器503通过UART接口与无线通信模块中的蓝牙模块通信,实现蓝牙功能。The UART interface is a universal serial data bus used for asynchronous communication. The bus can be a two-way communication bus. It converts the data to be transmitted between serial communication and parallel communication. In some embodiments, the UART interface is usually used to connect the processor 503 and the communication module 505. For example, the processor 503 communicates with the Bluetooth module in the wireless communication module through the UART interface to realize the Bluetooth function.
MIPI接口可以被用于连接处理器与摄像头等外围器件。MIPI接口包括摄像头串行接口(camera serial interface,CSI)等。在一些实施例中,处理器和摄像头通过CSI接口通信,实现机器人500的拍摄功能。The MIPI interface can be used to connect peripheral devices such as processors and cameras. MIPI interface includes camera serial interface (camera serial interface, CSI) and so on. In some embodiments, the processor and the camera communicate through a CSI interface to realize the shooting function of the robot 500.
GPIO接口可以通过软件配置。GPIO接口可以被配置为控制信号,也可被配置为数据信号。在一些实施例中,GPIO接口可以用于连接处理器与摄像头,无线通信模块,传感器模块等。GPIO接口还可以被配置为I2C接口,UART接口,MIPI接口等。The GPIO interface can be configured through software. The GPIO interface can be configured as a control signal or as a data signal. In some embodiments, the GPIO interface can be used to connect the processor and camera, wireless communication module, sensor module, etc. GPIO interface can also be configured as I2C interface, UART interface, MIPI interface, etc.
USB接口是符合USB标准规范的接口,具体可以是Mini USB接口,Micro USB接口,USB Type C接口等。USB接口可以用于连接充电器为机器人500充电,也可以用于机器人500与外围设备之间传输数据。该接口还可以用于连接其他机器人500等。The USB interface is an interface that complies with the USB standard specifications, and can be a Mini USB interface, a Micro USB interface, and a USB Type C interface. The USB interface can be used to connect a charger to charge the robot 500, and can also be used to transfer data between the robot 500 and peripheral devices. This interface can also be used to connect other robots 500 and so on.
图像采集模块501可以采集机器人500周边的图像信息,例如,拍摄照片或拍摄视频等。机器人500可以通过ISP,摄像头,视频编解码器,GPU,以及应用处理器等实现图像采集功能。The image collection module 501 can collect image information around the robot 500, for example, taking photos or videos. The robot 500 can implement image acquisition functions through ISP, camera, video codec, GPU, and application processor.
ISP用于处理摄像头反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将所述电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点,亮度等进行算法优化。ISP还可以对拍摄场景的曝光,色温等参数优化。在一些实施例中,ISP可以设置在摄像头中。ISP is used to process the data fed back from the camera. For example, when taking a picture, the shutter is opened, the light is transmitted to the photosensitive element of the camera through the lens, the light signal is converted into an electrical signal, and the photosensitive element of the camera transfers the electrical signal to the ISP for processing and is converted into an image visible to the naked eye. ISP can also optimize algorithms for image noise and brightness. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP can be set in the camera.
摄像头用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在 一些实施例中,机器人500可以包括1个或N个摄像头,N为大于1的正整数。The camera is used to capture still images or videos. The object generates an optical image through the lens and projects it to the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal. ISP outputs digital image signals to DSP for processing. DSP converts digital image signals into standard RGB, YUV and other formats. In some embodiments, the robot 500 may include 1 or N cameras, and N is a positive integer greater than one.
传感器502可以获取机器人500的移动速度、移动方向以及与周边物体之间的距离等信息。示例性的,传感器502可以包括陀螺仪传感器,速度传感器,加速度传感器,距离传感器等。The sensor 502 can obtain information such as the moving speed, the moving direction, and the distance to surrounding objects of the robot 500. Exemplarily, the sensor 502 may include a gyroscope sensor, a speed sensor, an acceleration sensor, a distance sensor, and the like.
其中,陀螺仪传感器可以用于确定机器人500的运动姿态。在一些实施例中,可以通过陀螺仪传感器确定机器人500围绕三个轴(即,x,y和z轴)的角速度。陀螺仪传感器可以用于拍摄防抖。示例性的,当机器人500在进行图像采集时,陀螺仪传感器检测机器人500抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消机器人500的抖动,实现防抖。陀螺仪传感器还可以用于导航,判断机器人500是否被困等场景。Among them, the gyroscope sensor can be used to determine the movement posture of the robot 500. In some embodiments, the angular velocity of the robot 500 around three axes (ie, x, y, and z axes) can be determined by a gyroscope sensor. The gyroscope sensor can be used for shooting anti-shake. Exemplarily, when the robot 500 is performing image acquisition, the gyroscope sensor detects the angle of the robot 500 shaking, and calculates the distance that the lens module needs to compensate according to the angle, so that the lens can counteract the shaking of the robot 500 through reverse movement to achieve anti-shake . The gyro sensor can also be used for navigation and to determine whether the robot 500 is trapped or not.
速度传感器用于测量移动速度。在一些实施例中，机器人500通过速度传感器测得当前时刻的移动速度，可以结合距离传感器以当前时刻机器人500所处的环境预测下一时刻机器人500所处的环境等。The speed sensor is used to measure the moving speed. In some embodiments, the robot 500 measures its moving speed at the current moment with the speed sensor, and may combine this with the distance sensor to predict, from the environment the robot 500 is in at the current moment, the environment it will be in at the next moment, and so on.
加速度传感器可检测机器人500在各个方向上(一般为三轴)加速度的大小。当机器人500静止时可检测出重力的大小及方向。The acceleration sensor can detect the magnitude of the acceleration of the robot 500 in various directions (generally three axes). When the robot 500 is stationary, the magnitude and direction of gravity can be detected.
距离传感器,用于测量距离。机器人500可以通过红外或激光测量距离。在一些实施例中,拍摄场景,机器人500可以利用距离传感器测距以实现快速对焦。Distance sensor, used to measure distance. The robot 500 can measure the distance by infrared or laser. In some embodiments, when shooting a scene, the robot 500 may use a distance sensor to measure the distance to achieve fast focusing.
存储器504可以包括外部存储器以及内部存储器。外部存储器接口可以用于连接外部存储卡,例如Micro SD卡,实现扩展机器人500的存储能力。外部存储卡通过外部存储器接口与处理器通信,实现数据存储功能。例如将样本信息文件保存在外部存储卡中。The memory 504 may include external memory and internal memory. The external memory interface can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the robot 500. The external memory card communicates with the processor through the external memory interface to realize the data storage function. For example, save the sample information file in an external memory card.
内部存储器可以用于存储计算机可执行程序代码,所述可执行程序代码包括指令。处理器通过运行存储在内部存储器的指令,从而执行机器人500的各种功能应用以及数据处理。内部存储器可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统,至少一个功能所需的应用程序。存储数据区可存储机器人500使用过程中所创建的数据。此外,内部存储器可以包括高速随机存取存储器,还可以包括非易失性存储器,例如至少一个磁盘存储器件,闪存器件,通用闪存存储器(universal flash storage,UFS)等。The internal memory may be used to store computer executable program code, the executable program code including instructions. The processor executes various functional applications and data processing of the robot 500 by running instructions stored in the internal memory. The internal memory can include a program storage area and a data storage area. Among them, the storage program area can store the operating system and at least one application program required by the function. The data storage area can store data created during the use of the robot 500. In addition, the internal memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (UFS), and the like.
机器人500的无线通信功能可以通过通信模块505实现。例如,通过通信模块505,机器人500可实现与其他设备之间的通信,例如与服务器之间的通信。作为一种示例,通信模块505可以包括天线1,天线2,移动通信模块,无线通信模块,调制解调处理器以及基带处理器等。The wireless communication function of the robot 500 can be implemented by the communication module 505. For example, through the communication module 505, the robot 500 can realize communication with other devices, such as communication with a server. As an example, the communication module 505 may include antenna 1, antenna 2, mobile communication module, wireless communication module, modem processor, baseband processor, and so on.
天线1和天线2用于发射和接收电磁波信号。机器人500中的每个天线可用于覆盖单个或多个通信频带。不同的天线还可以复用,以提高天线的利用率。例如:可以将天线1复用为无线局域网的分集天线。在另外一些实施例中,天线可以和调谐开关结合使用。The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the robot 500 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example, antenna 1 can be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna can be used in combination with a tuning switch.
移动通信模块可以提供应用在机器人500上的包括2G/3G/4G/5G等无线通信的解决方案。移动通信模块可以包括至少一个滤波器，开关，功率放大器，低噪声放大器(low noise amplifier,LNA)等。移动通信模块可以由天线1接收电磁波，并对接收的电磁波进行滤波，放大等处理，传送至调制解调处理器进行解调。移动通信模块还可以对经调制解调处理器调制后的信号放大，经天线1转为电磁波辐射出去。在一些实施例中，移动通信模块的至少部分功能模块可以被设置于处理器中。在一些实施例中，移动通信模块的至少部分功能模块可以与处理器的至少部分模块被设置在同一个器件中。The mobile communication module can provide solutions for wireless communication, including 2G/3G/4G/5G, applied to the robot 500. The mobile communication module may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc. The mobile communication module can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves for radiation through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module may be provided in the processor. In some embodiments, at least some functional modules of the mobile communication module and at least some modules of the processor may be provided in the same device.
无线通信模块可以提供应用在机器人500上的包括无线局域网(wireless local area networks,WLAN)(如无线保真(wireless fidelity,Wi-Fi)网络),蓝牙(bluetooth,BT),全球导航卫星系统(global navigation satellite system,GNSS),红外技术(infrared,IR)等无线通信的解决方案。无线通信模块可以是集成至少一个通信处理模块的一个或多个器件。无线通信模块经由天线2接收电磁波,将电磁波信号调频以及滤波处理,将处理后的信号发送到处理器。无线通信模块还可以从处理器接收待发送的信号,对其进行调频,放大,经天线2转为电磁波辐射出去。The wireless communication module can provide applications on the robot 500 including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), and global navigation satellite systems ( Global navigation satellite system, GNSS), infrared technology (infrared, IR) and other wireless communication solutions. The wireless communication module may be one or more devices integrating at least one communication processing module. The wireless communication module receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor. The wireless communication module can also receive the signal to be sent from the processor, perform frequency modulation, amplify, and radiate electromagnetic waves through the antenna 2.
在一些实施例中,机器人500的天线1和移动通信模块耦合,天线2和无线通信模块耦合,使得机器人500可以通过无线通信技术与服务器以及其他设备通信。所述无线通信技术可以包括全球移动通讯系统(global system for mobile communications,GSM),通用分组无线服务(general packet radio service,GPRS),码分多址接入(code division multiple access,CDMA),宽带码分多址(wideband code division multiple access,WCDMA),时分码分多址(time-division code division multiple access,TD-SCDMA),长期演进(long term evolution,LTE),BT,GNSS,WLAN,NFC,FM,和/或IR技术等。所述GNSS可以包括全球卫星定位系统(global positioning system,GPS),全球导航卫星系统(global navigation satellite system,GLONASS),北斗卫星导航系统(beidou navigation satellite system,BDS),准天顶卫星系统(quasi-zenith satellite system,QZSS)和/或星基增强系统(satellite based augmentation systems,SBAS)。In some embodiments, the antenna 1 of the robot 500 is coupled with the mobile communication module, and the antenna 2 is coupled with the wireless communication module, so that the robot 500 can communicate with the server and other devices through wireless communication technology. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), broadband Code division multiple access (wideband code division multiple access, WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC , FM, and/or IR technology, etc. The GNSS may include global positioning system (GPS), global navigation satellite system (GLONASS), Beidou navigation satellite system (BDS), quasi-zenith satellite system (quasi -zenith satellite system, QZSS) and/or satellite-based augmentation systems (SBAS).
可以理解的是,本实施例示意的结构并不构成对机器人500的具体限定。在另一些实施例中,机器人500可以包括比图示更多或更少的部件,或者组合某些部件,或者拆分某些部件,或者不同的部件布置。图示的部件可以以硬件,软件或软件和硬件的组合实现。It can be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the robot 500. In other embodiments, the robot 500 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components. The illustrated components can be implemented in hardware, software, or a combination of software and hardware.
本申请实施例提供的机器人避障方法都可以在如图4所示的系统中以及图5所示的机器人中实现。The obstacle avoidance methods for the robot provided by the embodiments of the present application can be implemented in the system shown in FIG. 4 and the robot shown in FIG. 5.
图6为本申请实施例提供的一种机器人避障方法的流程示意图。请参考图6所示,该方法可以包括S601-S605。FIG. 6 is a schematic flowchart of a method for avoiding obstacles for a robot according to an embodiment of the application. Please refer to FIG. 6, the method may include S601-S605.
S601、机器人获取移动过程中N个时刻的样本信息,N为大于0的整数。S601. The robot obtains sample information at N times during the movement, where N is an integer greater than 0.
其中，每个时刻的样本信息包括用于指示该时刻机器人所处环境的环境图像信息、用于指示机器人该时刻在环境图像信息指示的环境中的位置的位置信息，以及用于指示机器人该时刻的状态的标签信息，标签信息包括第一标签或第二标签，第一标签用于指示机器人处于未被困住状态，第二标签用于指示机器人处于被困住状态。The sample information at each moment includes environment image information indicating the environment where the robot is located at that moment, position information indicating the position of the robot at that moment in the environment indicated by the environment image information, and label information indicating the state of the robot at that moment; the label information includes a first label or a second label, the first label indicating that the robot is in an untrapped state, and the second label indicating that the robot is in a trapped state.
示例性的,机器人可以通过设置在机器人上的模块来获取机器人所处环境的环境图像信息以及位置信息。同时,机器人能够根据当前时刻的移动状态,判断是否被困住,以获得该时刻的标签信息。例如,当机器人无法继续移动时,判断在当前时刻处于被困住的状态,据此可获得该时刻的标签信息为第二标签。反之,则确定机器人在当前时刻处于未被困住的状态,据此可获得该时刻的标签信息为第一标签。这样,机器人在移动过程中,通过实时获取这些信息,便可获得机器人在移动过程中N个时刻的样本信息。其中,N为大于0的整数。Exemplarily, the robot can obtain the environmental image information and position information of the environment where the robot is located through a module provided on the robot. At the same time, the robot can determine whether it is trapped according to the current state of movement to obtain the tag information at that moment. For example, when the robot cannot continue to move, it is judged that it is trapped at the current moment, and the tag information at that moment can be obtained as the second tag. Otherwise, it is determined that the robot is in an untrapped state at the current moment, and the label information at that moment can be obtained as the first label. In this way, by acquiring this information in real time during the movement of the robot, the sample information at N times during the movement of the robot can be obtained. Wherein, N is an integer greater than zero.
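The label assignment in S601 can be sketched as follows. The stall-detection criterion used here (the robot is commanded to move but measures essentially no motion) is an assumed example of how a robot might judge that it cannot continue moving; the actual judgment logic is not limited to this.

```python
# Illustrative sketch of S601's labeling step: derive the first/second label
# from the robot's movement state at the current moment. The threshold and
# the stall criterion are hypothetical assumptions.

FIRST_LABEL = 0   # not trapped
SECOND_LABEL = 1  # trapped

def label_for_moment(commanded_speed, measured_speed, eps=1e-3):
    """Return SECOND_LABEL when the robot is commanded to move but cannot
    (measured speed is essentially zero); otherwise return FIRST_LABEL."""
    if commanded_speed > eps and abs(measured_speed) <= eps:
        return SECOND_LABEL  # robot cannot continue to move: trapped
    return FIRST_LABEL       # robot is moving normally (or idle): not trapped
```

The robot would attach this label, together with the captured environment image and position, to form the sample information for that moment.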
S602、机器人将N个时刻中M个时刻的样本信息发送给服务器,M为小于或等于N的正整数。S602: The robot sends the sample information of M time in N time to the server, where M is a positive integer less than or equal to N.
The robot may send all sample information acquired at the N moments to the server, i.e., M equals N. Exemplarily, the robot may send the sample information for a moment to the server in real time as soon as it is acquired, or it may send the sample information for the N moments to the server together.
It can be understood that the robot may acquire a large amount of sample information while moving, some of which is of higher quality. The robot may therefore send only the higher-quality portion of the acquired sample information to the server, that is, send the sample information of M of the N moments, where M is a positive integer less than N. Exemplarily, the robot may send the sample information from when it was trapped together with the untrapped sample information from before it became trapped. This improves information quality while ensuring sufficient samples, and reduces the system's information load.
S603: The server trains on the sample information collected from one or more robots to obtain a dilemma recognition model.
Since training the dilemma recognition model requires a large amount of sample information, in this embodiment multiple robots may report the sample information they have collected to the server. The server can then perform model training based on the sample information received from multiple robots to obtain a more accurate dilemma recognition model.
S604: The robot receives the dilemma recognition model from the server.
The server may send the dilemma recognition model, trained on the sample information obtained from one or more robots, to the robot in real time, or it may send the model at predetermined time intervals. The robot can receive the dilemma recognition model through its communication module and store it in memory.
S605: The robot avoids obstacles during movement according to the dilemma recognition model.
During movement, the robot can obtain its moving direction and moving speed at the current moment, and predict, from the environment image information and position information at the current moment, the environment image information and position information of its surroundings at the next moment. The robot then uses the dilemma recognition model to determine whether it will be trapped at the next moment, or to determine the probability of being trapped at the next moment. Based on this result, the robot determines whether obstacle avoidance is needed at the next moment.
Exemplarily, in some embodiments, the environment image information and position information of the robot's surroundings at the next moment are used as input to the dilemma recognition model to determine whether the robot will be trapped at the next moment. When the robot determines that it will be trapped at the next moment, it can perform obstacle avoidance at the current moment to avoid being trapped. In other embodiments, the same inputs to the dilemma recognition model yield the probability that the robot will be trapped at the next moment. When the robot determines that this probability is greater than a preset threshold, it can perform an obstacle avoidance operation.
In the robot obstacle avoidance method provided by the embodiments of this application, the robot collects environment image information and position information at N moments during its own movement, which accurately identifies the environment the robot is in at each of the N moments. Meanwhile, the robot autonomously determines whether it is trapped and, based on that determination, automatically labels the environment at the corresponding moment, accurately reflecting whether the robot is trapped at that moment. By sending this information to the server as sample information for training the dilemma recognition model, the accuracy of the model can be improved. Moreover, since there is no need to manually collect environment image information during the robot's operation, nor to manually label that information, substantial labor costs are saved.
Refer to FIG. 7, which shows another robot obstacle avoidance method provided by an embodiment of this application. As shown in FIG. 7, the method may include S701-S706.
S701: The robot acquires sample information at N moments during movement, where the sample information at each of the N moments includes the environment image information at that moment, the position information at that moment, and the label information at that moment.
The environment image information at a moment indicates the environment the robot is in at that moment. The position information at that moment indicates the robot's position within the environment indicated by that moment's environment image information. The label information at that moment indicates the robot's state at that moment. The label information includes a first label or a second label: the first label indicates that the robot is in an untrapped state, and the second label indicates that the robot is in a trapped state.
Since the robot moves continuously while working, at different moments it may be in different environments and at different positions, and whether it is trapped may also differ. To allow the server to obtain enough sample information to train the dilemma recognition model, the robot can acquire sample information at N moments during its movement according to its environment, its position, and whether it is trapped, where N is an integer greater than 0.
Exemplarily, the following takes the robot acquiring sample information at a first moment during movement as an example to illustrate how the sample information for a given moment is acquired. The first moment is any one of the above N moments.
During movement, the robot acquires the environment image information of its surroundings at the first moment.
Exemplarily, with reference to FIG. 5, the robot 500 may obtain environment image information of its surroundings through an image acquisition module 501, including a camera, provided on the robot 500.
In some embodiments, the environment image information may be a photo of the robot's surroundings taken by the robot at the first moment.
Generally, a robot records video while working to log its working state or to feed work information back to the user. Therefore, in other embodiments, the robot may use the image at the first moment in a recorded video as the environment image information of its surroundings at that moment. It can be understood that a video can be decomposed into multiple temporally consecutive images. A video corresponds to a period of playback time, so each image in a video corresponds to a moment within that playback time. Each image corresponding to a moment can then serve as the environment image information of the robot's surroundings at that moment. For example, suppose the video recorded by the robot includes image A at moment 1, image B at moment 2, and image C at moment 3. The robot can use image A as environment image information 1 of its surroundings at the first moment. Similarly, image B serves as environment image information B of the robot's surroundings at another moment, such as the second moment, and image C as environment image information C at yet another moment, such as the third moment.
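The correspondence between recorded video frames and moments can be sketched by indexing frames with the frame rate. A minimal illustration (the 30 fps value is an assumption for the example; the application does not fix a frame rate):

```python
FPS = 30  # assumed frame rate of the recorded video

def frame_index_for_moment(t_seconds, fps=FPS):
    """Map a playback time to the frame used as that moment's environment image."""
    return int(t_seconds * fps)

# Moments 1 s, 2 s, and 3 s into the video map to images A, B, and C.
frames = {frame_index_for_moment(1): "image A",
          frame_index_for_moment(2): "image B",
          frame_index_for_moment(3): "image C"}
print(frames[30], frames[60], frames[90])  # image A image B image C
```

Each stored frame index thus stands in for one moment's environment image information.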
Note that because the environment image information is acquired directly by the robot, it reflects the robot's surroundings more accurately. Moreover, during movement, objects on or near the robot's moving route are the most likely to obstruct its movement, so the environment image information acquired by the robot may include image information of objects on or near that route. In this way, the environment image information provides a more accurate reference for whether the robot may become trapped. The robot's moving route can be determined from its moving direction and moving speed at the current moment, such as the first moment above. In some scenarios, the robot's route is planned in advance, in which case the robot can also obtain route information by querying the pre-planned route.
During movement, the robot acquires the position information of the robot within its surroundings at the first moment.
The position information of the robot within its surroundings at the first moment may be the relative position information between the robot and other objects in its surroundings at that moment. The robot's surroundings at the first moment may refer to the environment indicated by the environment image information at the first moment.
The robot's surroundings differ at different moments, so the distance between the robot and the same object also differs at different moments. Exemplarily, with reference to FIG. 5, the robot 500 may, at the first moment, obtain the relative position information between itself and other objects in its surroundings through a sensor 502 provided on the robot 500. For example, the relative position information may be distance information between the robot 500 and those objects, which may include the distance at the first moment between the robot and objects on or near its moving route.
Of course, in some embodiments the robot 500 may also obtain, through the sensor 502, the latitude and longitude information and the height or altitude information of the robot at the current moment (such as the first moment); this information may also serve as the above position information. In this way, the robot can determine its position at that moment more accurately. Then, at another moment (such as a second moment), when the acquired position information is the same as the above position information, the robot can accurately determine that its environment at the second moment is exactly the same as its environment at the first moment. Since the dilemma recognition model that guides the robot's obstacle avoidance is trained on information including the sample information collected by the robot at the first moment, the robot can use the model to judge its environment at the second moment more accurately (that is, to judge whether it will be trapped at the second moment).
During movement, the robot acquires the label information indicating its state at the first moment. For example, the label information may include the above first label or the above second label.
It can be understood that the robot can accurately recognize whether it is trapped at the current moment (such as the first moment). For example, when the robot's wheels are stuck, it is surrounded, or it circles in one fixed spot for a certain period of time, the robot can be considered trapped. The trapped information thus obtained accurately identifies the robot's current state. Likewise, when the robot is not trapped, for example when none of the above situations occur at the first moment, the robot can accurately determine its current state. After the robot determines that it is trapped or untrapped at the first moment, it can obtain the label information corresponding to its state at that moment. We can then consider that the robot has automatically attached the corresponding label to the environment image information (such as pictures of its surroundings) and position information (such as distance information) collected at the first moment.
After obtaining the environment image information, position information, and label information of the first moment, the robot can use this information as the sample information for the first moment. Similarly, by repeating the above process during movement, the sample information at N moments during the robot's movement can be obtained.
S702: The robot sends the sample information of M of the N moments to the server, where M is a positive integer less than or equal to N.
In some embodiments, M may equal N. The robot may send the sample information acquired at all N moments to the server. This ensures the server can train on enough sample information to obtain an accurate dilemma recognition model.
It can be understood that the robot is untrapped most of the time while moving, so the number of samples bearing the first label may far exceed the number bearing the second label. In addition, the untrapped sample information from the period just before the robot becomes trapped is of higher value. Therefore, in this embodiment of the application, the robot may filter the sample information of the N moments before sending it to the server. That is, the robot may select the sample information of M of the N moments and send it to the server, where the robot is trapped at the Nth moment and untrapped at the M-1 moments preceding it.
Therefore, in other embodiments, M may be less than N. The robot may send the server the sample information of the M higher-value moments among the N moments. This effectively reduces the amount of data communicated between the robot and the server while still ensuring the server can obtain an accurate dilemma recognition model.
Exemplarily, the robot determines that, among the sample information of the N moments, the label information included in the sample information of the Nth moment indicates that the robot is trapped at the Nth moment; that is, the sample information of the Nth moment may include the second label identifying the robot as trapped at that moment. The robot then sends the server the sample information of the M-1 moments preceding the Nth moment together with the sample information of the Nth moment.
For example, the robot's memory may be a buffer. The robot may store the sample information acquired during movement in the buffer and monitor whether the label information in the sample information acquired at the current moment is the second label. If it is, the robot sends the sample information in the buffer to the server. If the label information in the sample information acquired at the current moment does not include the second label, the robot continues to acquire and store sample information in the buffer until it obtains sample information bearing the second label. For example, suppose the robot's buffer can store at most 10 samples. If the 10th sample in the buffer includes the second label, the robot can send all 10 samples in the buffer to the server together. If none of the 10 stored samples includes the second label, the robot continues acquiring samples, deleting the earliest of the 10 samples already in the buffer and storing the new sample, and so on, until a sample including the second label appears.
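The buffer behavior just described — keep the most recent samples, evict the oldest when full, and flush everything to the server as soon as a second-label sample arrives — can be sketched with a bounded deque. The 10-sample capacity follows the example above; the `send` callback is a placeholder for the robot-to-server upload:

```python
from collections import deque

SECOND_LABEL = 1  # trapped

class SampleBuffer:
    def __init__(self, capacity=10, send=print):
        self.buf = deque(maxlen=capacity)  # oldest sample dropped automatically
        self.send = send                   # stand-in for the upload to the server

    def add(self, sample):
        self.buf.append(sample)
        if sample["label"] == SECOND_LABEL:  # trapped sample: flush buffer
            self.send(list(self.buf))
            self.buf.clear()                 # optional clear after sending

sent = []
buf = SampleBuffer(capacity=10, send=sent.append)
for i in range(12):                          # 12 untrapped samples: nothing sent yet
    buf.add({"t": i, "label": 0})
buf.add({"t": 12, "label": SECOND_LABEL})    # trapped sample triggers the upload
print(len(sent), len(sent[0]))  # 1 10
```

One upload of 10 samples is sent: the trapped sample plus the 9 most recent untrapped samples before it, which are exactly the higher-value samples described above.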
Optionally, the robot may clear the buffer after sending the sample information to the server.
When the robot and the server are communicating normally, the robot may send the acquired sample information to the server in various forms. For example, the robot may send multiple samples to the server at once, send them in several batches, or send them at certain time intervals.
When the robot and the server cannot communicate, for example when either is offline, the robot may store the sample information in memory provided on the robot. Once communication with the server is restored, the robot can send the stored sample information to the server.
S703: The server trains on the sample information collected from one or more robots to obtain the dilemma recognition model.
Exemplarily, the server may divide the acquired sample information into two parts (such as first-type sample information and second-type sample information). The first-type sample information may be used to train the dilemma recognition model, and the second-type sample information may be used to check whether the model's detection results are accurate.
For example, suppose the sample information obtained by the server includes X first-type samples and Y second-type samples, where X and Y are integers greater than 0 and X+Y=N. Taking the environment image information and position information in each sample as input, the server trains on the X first-type samples, with the probability that the robot will be trapped in the current environment (or the concrete trapped/untrapped result) as output, comparing the output with the label information included in the sample. This training cycle repeats until the comparison results converge to the expected level, yielding an initial model. After the initial model is obtained from the X first-type samples, the environment image information and position information of each of the Y second-type samples can be input into the initial model to check its accuracy.
For example, if, for a given second-type sample, the trapped/untrapped state indicated by the model output matches the state indicated by that sample's label information, the check for that sample is deemed accurate; if the two states differ, the check result for that sample is deemed inaccurate. Since there can be multiple second-type samples, aggregating the model's accuracy over all such checks yields the model's validation accuracy. When this accuracy is greater than a preset threshold, the initial model can be used as the dilemma recognition model. When it is less than the preset threshold, the initial model is considered insufficiently accurate and training must continue: the server can add new sample information from robots to the training set and further train the initial model to obtain a more accurate model. The server thus determines a model whose accuracy exceeds the preset threshold and sends that model to the robot as the dilemma recognition model.
Exemplarily, training on the X first-type samples may be performed according to a loss function or cost function. A loss (cost) function maps a random event, or the values of its associated random variables, to a non-negative real number representing the "risk" or "loss" of that event. In this embodiment of the application, the model can be obtained by minimizing the loss (cost) function. When solving for the minimum of the loss function, gradient descent can be used to iterate step by step toward the minimized loss and the corresponding model parameter values, thereby obtaining the dilemma recognition model (initial model).
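The train-then-validate procedure above can be illustrated end to end. As a stand-in for the full model, the sketch below trains a tiny logistic classifier by gradient descent on the log loss over first-type samples, then measures accuracy on second-type samples; the two-value feature vectors, learning rate, and threshold are all invented for illustration and are not from this application:

```python
import math

def predict(w, b, x):
    """Probability of 'trapped' for feature vector x (logistic stand-in for the model)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, epochs=200, lr=0.5):
    """Minimize log loss by gradient descent over (features, label) pairs."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in samples:
            g = predict(w, b, x) - y                  # gradient of log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

# Toy data: (distance-to-obstacle, passage-width) -> trapped label (1) or not (0).
first_type = [([0.9, 0.8], 0), ([0.8, 0.9], 0), ([0.1, 0.2], 1), ([0.2, 0.1], 1)]
second_type = [([0.7, 0.7], 0), ([0.15, 0.15], 1)]

w, b = train(first_type)                              # training phase (X first-type samples)
correct = sum((predict(w, b, x) > 0.5) == bool(y) for x, y in second_type)
accuracy = correct / len(second_type)                 # validation phase (Y second-type samples)
print(accuracy)
```

If `accuracy` clears the preset threshold the model would be accepted; otherwise more robot samples would be added and training would continue, as described above.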
Exemplarily, the dilemma recognition model obtained through training may be a model based on a convolutional neural network. For example, as shown in FIG. 8, the dilemma recognition model may include an input layer, convolutional layers, pooling layers, a fully connected layer, and an output layer. Note that the figure illustrates only two convolutional layers, two pooling layers, and one fully connected layer; the dilemma recognition model in this embodiment of the application may include multiple convolutional layers, multiple pooling layers, and multiple fully connected layers.
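The shape bookkeeping through a stack like the one in FIG. 8 can be illustrated by computing feature-map sizes layer by layer with the standard convolution/pooling size formulas. The 32x32 input, 3x3 kernels, and 2x2 pooling are invented for illustration; the application fixes none of these values:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a pooling layer."""
    return (size - kernel) // stride + 1

# A 32x32 input through conv(3x3) -> pool(2x2) -> conv(3x3) -> pool(2x2),
# mirroring the two-conv, two-pool stack illustrated in FIG. 8.
s = 32
for layer in ["conv", "pool", "conv", "pool"]:
    s = conv_out(s, 3) if layer == "conv" else pool_out(s)
    print(layer, s)
# The final s*s*channels activations are flattened into the fully connected layer.
```

This prints the shrinking feature-map sizes (30, 15, 13, 6), showing how the fully connected layer's input width is determined by the conv/pool stack.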
Note that the greater the variety and quantity of the sample information, the more accurate the dilemma recognition model obtained by training. Therefore, the sample information may be acquired during a single movement of one robot, during multiple movements of one robot, or during multiple movements of multiple robots; this embodiment of the application places no limit here.
S704: The robot receives the dilemma recognition model from the server.
The server trains the dilemma recognition model on the multiple samples sent by one or more robots and delivers the model to the robot. With reference to FIG. 5, the robot 500 may receive the dilemma recognition model through the communication module 505 provided on the robot and store it in the memory 504, so as to use the stored model for obstacle avoidance, for example by performing S705-S706 below.
S705: The robot obtains its moving direction and moving speed at the current moment.
During movement, the robot can predict the environment it will be in at the next moment. Exemplarily, the robot may obtain its current moving direction and moving speed through sensors provided on the robot, and obtain the environment image information and position information at the current moment. The robot's processor can then derive, from the current moving direction, moving speed, environment image information, and position information, the environment image information of the robot's surroundings at the next moment and its position information within that environment.
Assume the robot's position at the current moment is as shown in (a) of FIG. 9, and the robot is moving along direction 1. Then, from the robot's first position information at the current moment, for example the distance information between the robot and object 1 and between the robot and object 2, together with the moving direction and moving speed, the robot's second position information at the next moment can be obtained. From this position information and the image information of the robot's current surroundings, a triangular transformation yields the environment image information of the robot's surroundings at the second moment, for example as shown in (b) of FIG. 9.
For example, at the current moment the robot obtains, through its image acquisition module, the environment image information P1 on its moving route (such as a photo of its surroundings) and, through its sensors, the position information S1 (such as the robot's distance from objects in its surroundings). The robot also obtains its current moving direction and moving speed through the sensors. The position information S2 at the next moment can then be computed from the formula S1 - S2 = t * V, where S1 is the robot's distance from the object ahead at the current moment, S2 is the robot's distance from the object ahead at the next moment, t is the time interval between the current moment and the next moment, and V is the robot's moving speed at the current moment.
另外,机器人还可以根据从当前时刻到下一时刻移动的距离(如S1-S2),利用三角转换,计算获取机器人在下一时刻所处环境的图像信息P2。In addition, the robot can also calculate and obtain the image information P2 of the environment where the robot is located at the next time by using a triangle transformation based on the distance (such as S1-S2) that it moves from the current time to the next time.
这样，机器人就获取了下一时刻所处环境的环境图像信息P2以及机器人在下一时刻所处环境中的位置信息S2。In this way, the robot obtains the environment image information P2 of its environment at the next moment, as well as its position information S2 in that environment at the next moment.
S706、机器人根据下一时刻所处环境的环境图像信息、位置信息以及困境识别模型预测下一时刻是否会被困住，并进行避障。S706: The robot predicts whether it will be trapped at the next moment based on the environment image information and position information for the next moment together with the dilemma recognition model, and performs obstacle avoidance accordingly.
结合上述S705中的示例，机器人可以将下一时刻机器人所处环境图像信息P2以及位置信息S2作为困境识别模型的输入，通过计算，获取机器人在下一时刻是被困住的概率。机器人可以将该被困住的概率与预设阈值对比，当机器人确定下一时刻被困住的概率大于预设阈值时，机器人可以改变当前的运动策略(如当前时刻的运动方向等)进行避障，以避免机器人在移动的过程中被困住的情况出现。当机器人确定下一时刻被困住的概率小于预设阈值时，可以认为下一时刻机器人不会被困住，那么机器人就可以继续执行当前的运动策略，进行移动。Following the example in S705 above, the robot can use the environment image information P2 and the position information S2 for the next moment as inputs to the dilemma recognition model and, through calculation, obtain the probability that the robot will be trapped at the next moment. The robot can compare this probability with a preset threshold. When the robot determines that the probability of being trapped at the next moment is greater than the preset threshold, the robot can change its current motion strategy (such as its direction of motion at the current moment) to avoid obstacles, so as to prevent the robot from being trapped while moving. When the robot determines that the probability of being trapped at the next moment is less than the preset threshold, it can be considered that the robot will not be trapped at the next moment, and the robot can continue to execute its current motion strategy and keep moving.
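The threshold comparison described here can be illustrated as follows. This is a minimal sketch: the threshold value and the function name are illustrative assumptions, and the trapped probability is taken as a plain input rather than computed by the patent's actual dilemma recognition model.

```python
def avoidance_decision(trap_probability: float, threshold: float = 0.5) -> str:
    """Compare the model's trapped-probability output with a preset threshold.

    If the probability of being trapped at the next moment exceeds the
    threshold, the robot changes its motion strategy (e.g. its heading);
    otherwise it keeps executing the current strategy.
    """
    if trap_probability > threshold:
        return "change_strategy"        # avoid the predicted trap
    return "keep_current_strategy"

print(avoidance_decision(0.8))  # change_strategy
print(avoidance_decision(0.2))  # keep_current_strategy
```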
示例性的，结合图9，假设当前时刻机器人处于如图9中的(a)所示的环境下，当机器人根据困境识别模型判断下一时刻会被困住，那么机器人就可以在该时刻改变原有的运动策略，例如将原有的沿方向1进行移动改变为如图9中的(c)所示的沿方向2进行移动，以避免下一时刻机器人被困住。Exemplarily, with reference to FIG. 9, assume that at the current moment the robot is in the environment shown in (a) of FIG. 9. When the robot determines, according to the dilemma recognition model, that it will be trapped at the next moment, the robot can change its original motion strategy at that moment, for example, changing from moving in direction 1 to moving in direction 2 as shown in (c) of FIG. 9, so as to avoid being trapped at the next moment.
需要说明的是，以上S706是以困境识别模型输出为下一时刻被困住的概率为例进行说明的。本申请实施例中，困境识别模型还可以输出下一时刻机器人是否会被困住的确定的预测结果，即下一时刻会被困住或下一时刻不会被困住。当机器人根据困境识别模型确定下一时刻会被困住时，机器人可以改变当前的运动策略(如当前时刻的运动方向等)进行避障，以避免机器人在移动的过程中被困住的情况出现。当机器人根据困境识别模型确定下一时刻不会被困住时，机器人可以继续执行当前的运动策略，进行移动。It should be noted that the above S706 is described by taking, as an example, the case where the dilemma recognition model outputs the probability of being trapped at the next moment. In the embodiments of the present application, the dilemma recognition model may also output a definite prediction of whether the robot will be trapped at the next moment, that is, either that it will be trapped or that it will not be trapped at the next moment. When the robot determines, according to the dilemma recognition model, that it will be trapped at the next moment, the robot can change its current motion strategy (such as its current direction of motion) to avoid obstacles, so as to prevent the robot from being trapped while moving. When the robot determines, according to the dilemma recognition model, that it will not be trapped at the next moment, the robot can continue to execute its current motion strategy and keep moving.
本申请实施例提供的机器人避障方法中，机器人执行上述S701-S706的过程并非是单程的，而是可以循环的。例如机器人可以不断的获取不同时刻的样本信息并将该样本信息发送给服务器，以便服务器根据该更新的样本信息与困境识别模型，经过增量训练获取更新的困境识别模型。其中，增量训练的过程同初次训练的过程类似，只是在增量训练的过程中，加载的模型是之前训练好的模型，加载的模型的权值也是训练好的。而在初次训练的时候，加载的模型的权值是默认值。增量训练完成后，使用第二类样本信息检测模型，如果检测出来的准确率比之前高，就说明增量训练出来的模型质量高。此时服务器可以将新训练出的模型下发给机器人，以便机器人能够根据更新后的困境识别模型，用于对之后的移动进行指导，以进行更有效的避障。In the robot obstacle avoidance method provided by the embodiments of the present application, the process in which the robot performs the above S701-S706 is not a one-off process but can be cyclic. For example, the robot can continuously obtain sample information at different moments and send this sample information to the server, so that the server obtains an updated dilemma recognition model through incremental training based on the updated sample information and the existing dilemma recognition model. The process of incremental training is similar to the initial training process, except that in incremental training the loaded model is the previously trained model and its weights are the trained weights, whereas in initial training the weights of the loaded model are default values. After the incremental training is completed, the model is tested with the second type of sample information; if the detected accuracy is higher than before, the incrementally trained model is of higher quality. The server can then deliver the newly trained model to the robot, so that the robot can use the updated dilemma recognition model to guide its subsequent movement and avoid obstacles more effectively.
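The incremental-training loop described here can be sketched in framework-agnostic form. This is an illustrative sketch under stated assumptions: the callables `train_fn` and `accuracy_fn` and the deploy decision are hypothetical names introduced for illustration, not APIs from the patent.

```python
def incremental_update(model, new_samples, held_out_samples,
                       train_fn, accuracy_fn, previous_accuracy):
    """One round of incremental training.

    Unlike initial training (which starts from default weights), the
    loaded model already carries trained weights, so training simply
    continues from them on the newly collected samples.  The updated
    model is sent to the robot only if its accuracy on the held-out
    (second-class) samples is higher than before.
    """
    updated = train_fn(model, new_samples)             # continue from trained weights
    accuracy = accuracy_fn(updated, held_out_samples)  # validate on held-out samples
    if accuracy > previous_accuracy:
        return updated, accuracy, True                 # deploy to the robot
    return model, previous_accuracy, False             # keep serving the old model
```

A server-side loop would call `incremental_update` each time a robot uploads a new batch of sample information, deploying only models that improve the held-out accuracy.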
这样，机器人可以自主获取N个时刻的样本信息，而上述N个时刻的每个时刻对应的样本信息包括该时刻机器人所处环境的环境图像信息、机器人该时刻在环境图像信息指示的环境中的位置的位置信息。由于这些信息均是由机器人采集得到的，因此能够准确的标识机器人当时所在的环境。同时，每个样本信息还包括用于指示机器人该时刻是否处于被困住的状态的标签信息。同样的，标签信息也是机器人根据自身识别出当时的状态标记的，也能准确体现机器人当前的状态。机器人通过将这样的样本信息发送给服务器，用于服务器训练困境识别模型，能够提高困境识别模型的准确性。同时，由于不需要人工收集机器人工作过程中的环境图像信息，也不需要人工对该环境图像信息打标签，因此，节省了大量的人力成本投入。服务器还可以根据更新的样本信息对该困境识别模型进行增量训练以获取更准确的困境识别模型并发送给机器人，这样，机器人就能够在移动过程中不断的获取样本信息并发送给服务器，以此获取更新困境识别模型对机器人的移动过程进行更加准确的避障指导。In this way, the robot can autonomously acquire sample information at N moments, and the sample information corresponding to each of the N moments includes the environment image information of the environment in which the robot is located at that moment and the position information of the robot's position, at that moment, in the environment indicated by the environment image information. Since this information is collected by the robot itself, it can accurately identify the environment in which the robot was located at that time. In addition, each piece of sample information also includes label information indicating whether the robot was in a trapped state at that moment. Similarly, the label information is marked by the robot according to its own recognition of its state at the time, and therefore also accurately reflects the robot's state. By sending such sample information to the server for training the dilemma recognition model, the robot can improve the accuracy of the dilemma recognition model. At the same time, since there is no need to manually collect environment image information during the robot's operation, nor to manually label that image information, a large amount of labor cost is saved. The server can also perform incremental training on the dilemma recognition model based on updated sample information to obtain a more accurate dilemma recognition model and send it to the robot. In this way, the robot can continuously obtain sample information during movement and send it to the server, thereby obtaining an updated dilemma recognition model that provides more accurate obstacle avoidance guidance for the robot's movement.
以上说明主要从机器人的角度对本申请实施例提供的方案进行了介绍。可以理解的是,机器人为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块,这些执行各个功能相应的硬件结构和/或软件模块可以构成一个机器人。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。The above description mainly introduces the solution provided by the embodiment of the present application from the perspective of a robot. It can be understood that, in order to realize the above-mentioned functions, the robot includes hardware structures and/or software modules corresponding to each function, and these hardware structures and/or software modules corresponding to each function may constitute a robot. Those skilled in the art should easily realize that in combination with the algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software-driven hardware depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
本申请实施例可以根据上述方法示例对机器人进行功能模块的划分，例如，机器人可以包括避障装置，该避障装置可以对应各个功能划分各个功能模块，也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现，也可以采用软件功能模块的形式实现。需要说明的是，本申请实施例中对模块的划分是示意性的，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式。In the embodiments of the present application, the robot may be divided into functional modules according to the above method examples. For example, the robot may include an obstacle avoidance device, in which each functional module may be divided to correspond to each function, or two or more functions may be integrated into one processing module. The above integrated modules may be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of modules in the embodiments of the present application is illustrative and is merely a division of logical functions; other division manners are possible in actual implementation.
在采用对应各个功能划分各个功能模块的情况下，图10示出了上述实施例中涉及的避障装置的一种可能的组成示意图，如图10所示，该避障装置包括：获取单元1001，通信单元1003，避障单元1004。例如，获取单元1001也可以是如图5所示图像采集模块501以及传感器502，通信单元1003也可以是如图5所示的通信模块505，避障单元1004也可以是如图5所示的处理器503。In the case where each functional module is divided to correspond to each function, FIG. 10 shows a schematic diagram of a possible composition of the obstacle avoidance device involved in the above embodiments. As shown in FIG. 10, the obstacle avoidance device includes: an acquiring unit 1001, a communication unit 1003, and an obstacle avoidance unit 1004. For example, the acquiring unit 1001 may be the image acquisition module 501 and the sensor 502 shown in FIG. 5, the communication unit 1003 may be the communication module 505 shown in FIG. 5, and the obstacle avoidance unit 1004 may be the processor 503 shown in FIG. 5.
其中,获取单元1001,用于获取机器人移动过程中N个时刻的样本信息,N为大于0的整数。其中,N个时刻中每个时刻的样本信息包括:用于指示该时刻机器人所处环境的环境图像信息、用于指示机器人该时刻在环境图像信息指示的环境中的位置的位置信息,以及用于指示机器人该时刻的状态的标签信息;标签信息包括第一标签或第二标签,第一标签用于指示机器人处于未被困住状态,第二标签用于指示机器人处于被困住状态。示例性的,获取单元1001可以用于执行上述图6所示机器人避障方法的S601。获取单元1001还可以用于执行上述图7所示机器人避障方法的S701。Wherein, the obtaining unit 1001 is used to obtain sample information at N times during the movement of the robot, and N is an integer greater than zero. Among them, the sample information at each of the N times includes: environmental image information indicating the environment in which the robot is located at that instant, position information indicating the position of the robot in the environment indicated by the environmental image information at that instant, and Tag information indicating the state of the robot at that moment; the tag information includes a first tag or a second tag, the first tag is used to indicate that the robot is in an untrapped state, and the second tag is used to indicate that the robot is in a trapped state. Exemplarily, the acquiring unit 1001 may be used to execute S601 of the obstacle avoidance method for the robot shown in FIG. 6. The acquiring unit 1001 may also be used to execute S701 of the robot obstacle avoidance method shown in FIG. 7.
通信单元1003，用于将N个时刻中M个时刻的样本信息发送给服务器，M为小于或等于N的正整数。示例性的，通信单元1003可以用于执行上述图6所示机器人避障方法的S602。通信单元1003还可以用于执行上述图7所示机器人避障方法的S702。The communication unit 1003 is configured to send the sample information at M of the N moments to the server, where M is a positive integer less than or equal to N. Exemplarily, the communication unit 1003 may be used to execute S602 of the robot obstacle avoidance method shown in FIG. 6 above. The communication unit 1003 may also be used to execute S702 of the robot obstacle avoidance method shown in FIG. 7.
通信单元1003,还用于从服务器接收困境识别模型,困境识别模型由一个或者多个机器人采集的样本信息训练所得。示例性的,通信单元1003还可以用于执行上述图6所示机器人避障方法的S604。通信单元1003还可以用于执行上述图7所示机器人避障方法的S704。The communication unit 1003 is also configured to receive a dilemma recognition model from a server, and the dilemma recognition model is trained on sample information collected by one or more robots. Exemplarily, the communication unit 1003 may also be used to execute S604 of the robot obstacle avoidance method shown in FIG. 6 above. The communication unit 1003 may also be used to execute S704 of the robot obstacle avoidance method shown in FIG. 7 above.
避障单元1004,用于根据困境识别模型在机器人移动过程中进行避障。示例性的,避障单元1004还可以用于执行上述图6所示机器人避障方法的S605。避障单元1004还可以用于执行上述图7所示机器人避障方法的S706。The obstacle avoidance unit 1004 is used to avoid obstacles during the movement of the robot according to the dilemma recognition model. Exemplarily, the obstacle avoidance unit 1004 may also be used to execute S605 of the robot obstacle avoidance method shown in FIG. 6. The obstacle avoidance unit 1004 may also be used to execute S706 of the robot obstacle avoidance method shown in FIG. 7.
进一步的,上述避障装置,还可以包括:确定单元1002。Further, the above obstacle avoidance device may further include: a determining unit 1002.
确定单元1002，用于确定N个时刻的样本信息中，第N个时刻的样本信息包括的标签信息用于指示机器人在第N个时刻处于被困住的状态。The determining unit 1002 is configured to determine that, among the sample information at the N moments, the label information included in the sample information at the Nth moment indicates that the robot is in a trapped state at the Nth moment.
需要说明的是,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。本申请实施例提供的避障装置,用于执行上述机器人避障方法,因此可以达到与上述机器人避障方法相同的效果。It should be noted that all relevant content of the steps involved in the foregoing method embodiments can be cited in the functional description of the corresponding functional module, and will not be repeated here. The obstacle avoidance device provided by the embodiment of the present application is used to implement the above-mentioned robot obstacle avoidance method, and therefore can achieve the same effect as the above-mentioned robot obstacle avoidance method.
在采用集成的单元的情况下,上述实施例中所涉及的避障装置的另一种可能的组成中可以包括:采集模块、处理模块和通信模块。In the case of using an integrated unit, another possible composition of the obstacle avoidance device involved in the foregoing embodiment may include: a collection module, a processing module, and a communication module.
采集模块用于采集机器人避障需要的信息,例如,采集模块用于支持机器人执行图6中的S601,图7中的S701、S705和/或用于本文所描述的技术的其它过程。处理模块用于对机器人的动作进行控制管理,例如,处理模块用于支持机器人执行图6中的S605,图7中的S706和/或用于本文所描述的技术的其它过程。通信模块用于支持机器人与其他网络实体的通信,例如与图6或图7中示出服务器以及其他功能模块或网络实体之间的通信。机器人还可以包括存储模块,用于存储机器人的样本信息、困境识别模型以及其他程序代码和数据。The collection module is used to collect information needed by the robot to avoid obstacles. For example, the collection module is used to support the robot to execute S601 in FIG. 6, S701 and S705 in FIG. 7 and/or other processes used in the technology described herein. The processing module is used to control and manage the actions of the robot. For example, the processing module is used to support the robot to execute S605 in FIG. 6, S706 in FIG. 7 and/or other processes used in the technology described herein. The communication module is used to support communication between the robot and other network entities, such as the communication between the server and other functional modules or network entities shown in FIG. 6 or FIG. 7. The robot may also include a storage module for storing sample information of the robot, a dilemma recognition model, and other program codes and data.
本申请实施例提供的避障装置,用于执行上述机器人避障方法,因此可以达到与上述机器人避障方法相同的效果。The obstacle avoidance device provided by the embodiment of the present application is used to implement the above-mentioned robot obstacle avoidance method, and therefore can achieve the same effect as the above-mentioned robot obstacle avoidance method.
通过以上的实施方式的描述，所属领域的技术人员可以清楚地了解到，为描述的方便和简洁，仅以上述各功能模块的划分进行举例说明，实际应用中，可以根据需要而将上述功能分配由不同的功能模块完成，即将装置的内部结构划分成不同的功能模块，以完成以上描述的全部或者部分功能。From the description of the above embodiments, those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example. In practical applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above.
在本申请所提供的几个实施例中，应该理解到，所揭露的装置和方法，可以通过其它的方式实现。例如，以上所描述的装置实施例仅仅是示意性的，例如，模块或单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如多个单元或组件可以结合或者可以集成到另一个装置，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，装置或单元的间接耦合或通信连接，可以是电性，机械或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative; the division of modules or units is merely a division of logical functions, and other division manners are possible in actual implementation, for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是一个物理单元或多个物理单元,即可以位于一个地方,或者也可以分布到多个不同地方。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separate, and the components displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place, or they may be distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。In addition, the functional units in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个可读取存储介质中。基于这样的理解，本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来，该软件产品存储在一个存储介质中，包括若干指令用以使得一个设备(可以是单片机，芯片等)或处理器(processor)执行本申请各个实施例方法的全部或部分步骤。而前述的存储介质包括：U盘、移动硬盘、只读存储器(Read-Only Memory，ROM)、随机存取存储器(Random Access Memory，RAM)、磁碟或者光盘等各种可以存储程序代码的介质。If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
以上,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何在本申请揭露的技术范围内的变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。The above are only specific implementations of this application, but the protection scope of this application is not limited to this, and any changes or substitutions within the technical scope disclosed in this application should be covered within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (12)

  1. 一种机器人避障方法,其特征在于,所述方法包括:An obstacle avoidance method for a robot, characterized in that the method includes:
    所述机器人获取所述机器人移动过程中N个时刻的样本信息，N为大于0的整数；其中，N个时刻中每个时刻的样本信息包括：用于指示该时刻所述机器人所处环境的环境图像信息、用于指示所述机器人该时刻在所述环境图像信息指示的环境中的位置的位置信息，以及用于指示所述机器人该时刻的状态的标签信息；所述标签信息包括第一标签或第二标签，所述第一标签用于指示所述机器人处于未被困住状态，所述第二标签用于指示所述机器人处于被困住状态；the robot obtains sample information at N moments during the movement of the robot, where N is an integer greater than 0; the sample information at each of the N moments includes: environment image information indicating the environment in which the robot is located at that moment, position information indicating the position of the robot at that moment in the environment indicated by the environment image information, and label information indicating the state of the robot at that moment; the label information includes a first label or a second label, the first label indicating that the robot is in an untrapped state, and the second label indicating that the robot is in a trapped state;
    所述机器人将N个时刻中M个时刻的样本信息发送给服务器,M为小于或等于N的正整数;The robot sends the sample information at M times out of N times to the server, where M is a positive integer less than or equal to N;
    所述机器人从所述服务器接收困境识别模型,所述困境识别模型由一个或者多个机器人采集的样本信息训练所得;The robot receives a dilemma recognition model from the server, and the dilemma recognition model is trained on sample information collected by one or more robots;
    所述机器人根据所述困境识别模型在移动过程中进行避障。The robot avoids obstacles during the movement process according to the dilemma recognition model.
  2. 根据权利要求1所述的方法,其特征在于,The method according to claim 1, wherein:
    每个时刻的环境图像信息包括:在该时刻所述机器人所处环境中,所述机器人的移动线路上及移动路线周边物体的图像信息。The image information of the environment at each moment includes: image information of objects on and around the moving route of the robot in the environment where the robot is located at that moment.
  3. 根据权利要求1或2所述的方法,其特征在于,The method of claim 1 or 2, wherein:
    每个时刻的位置信息包括:该时刻的环境图像信息指示的环境中,所述机器人与所述机器人的移动线路上及移动线路周边的物体的相对位置信息。The position information at each time includes: relative position information of the robot and the objects on and around the moving line of the robot in the environment indicated by the environmental image information at the time.
  4. 根据权利要求1-3中任一项所述的方法，其特征在于，所述机器人将N个时刻中M个时刻的样本信息发送给服务器，包括：The method according to any one of claims 1 to 3, wherein the robot sending the sample information at M of the N moments to the server comprises:
    所述机器人确定所述N个时刻的样本信息中，第N个时刻的样本信息包括的标签信息用于指示所述机器人在所述第N个时刻处于被困住的状态；the robot determines that, among the sample information at the N moments, the label information included in the sample information at the Nth moment indicates that the robot is in a trapped state at the Nth moment;
    所述机器人将所述第N个时刻的前M-1个时刻的样本信息和所述第N个时刻的样本信息发送给所述服务器。the robot sends the sample information of the M-1 moments before the Nth moment and the sample information of the Nth moment to the server.
  5. 根据权利要求1-4中任一项所述的方法,其特征在于,所述机器人根据所述困境识别模型在移动过程中进行避障,包括:The method according to any one of claims 1 to 4, wherein the robot avoiding obstacles during movement according to the dilemma recognition model comprises:
    所述机器人获取当前时刻的环境图像信息,及当前时刻的位置信息;The robot obtains the environment image information at the current moment and the location information at the current moment;
    所述机器人根据所述当前时刻的环境图像信息,所述当前时刻的位置信息,以及当前时刻所述机器人的移动方向和移动速度,获取下一时刻的环境图像信息以及下一时刻的位置信息;The robot obtains the environment image information at the next time and the location information at the next time according to the environment image information at the current time, the location information at the current time, and the moving direction and speed of the robot at the current time;
    所述机器人根据所述下一时刻的环境图像信息,所述下一时刻的位置信息,以及所述困境识别模型确定下一时刻所述机器人被困住的概率;The robot determines the probability of the robot being trapped at the next time according to the environmental image information at the next time, the location information at the next time, and the dilemma recognition model;
    所述机器人确定下一时刻所述机器人被困住的概率大于预设阈值,所述机器人改变运动策略,进行避障。The robot determines that the probability of the robot being trapped at the next moment is greater than a preset threshold, and the robot changes its motion strategy to avoid obstacles.
  6. 一种避障装置,其特征在于,应用于机器人,所述装置包括:获取单元、通信单元以及避障单元;An obstacle avoidance device, characterized by being applied to a robot, the device comprising: an acquisition unit, a communication unit, and an obstacle avoidance unit;
    所述获取单元，用于获取所述机器人移动过程中N个时刻的样本信息，N为大于0的整数；其中，N个时刻中每个时刻的样本信息包括：用于指示该时刻所述机器人所处环境的环境图像信息、用于指示所述机器人该时刻在所述环境图像信息指示的环境中的位置的位置信息，以及用于指示所述机器人该时刻的状态的标签信息；所述标签信息包括第一标签或第二标签，所述第一标签用于指示所述机器人处于未被困住状态，所述第二标签用于指示所述机器人处于被困住状态；the acquiring unit is configured to obtain sample information at N moments during the movement of the robot, where N is an integer greater than 0; the sample information at each of the N moments includes: environment image information indicating the environment in which the robot is located at that moment, position information indicating the position of the robot at that moment in the environment indicated by the environment image information, and label information indicating the state of the robot at that moment; the label information includes a first label or a second label, the first label indicating that the robot is in an untrapped state, and the second label indicating that the robot is in a trapped state;
    所述通信单元，用于将N个时刻中M个时刻的样本信息发送给服务器，M为小于或等于N的正整数；the communication unit is configured to send the sample information at M of the N moments to the server, where M is a positive integer less than or equal to N;
    所述通信单元,还用于从所述服务器接收困境识别模型,所述困境识别模型由一个或者多个机器人采集的样本信息训练所得;The communication unit is further configured to receive a dilemma recognition model from the server, where the dilemma recognition model is trained on sample information collected by one or more robots;
    所述避障单元,用于根据所述困境识别模型在所述机器人移动过程中进行避障。The obstacle avoidance unit is used to avoid obstacles during the movement of the robot according to the dilemma recognition model.
  7. 根据权利要求6所述的装置,其特征在于,The device according to claim 6, wherein:
    每个时刻的环境图像信息包括:在该时刻所述机器人所处环境中,所述机器人的移动线路上及移动路线周边物体的图像信息。The image information of the environment at each moment includes: image information of objects on and around the moving route of the robot in the environment where the robot is located at that moment.
  8. 根据权利要求6或7所述的装置,其特征在于,The device according to claim 6 or 7, characterized in that:
    每个时刻的位置信息包括:该时刻的环境图像信息指示的环境中,所述机器人与所述机器人的移动线路上及移动线路周边的物体的相对位置信息。The position information at each time includes: relative position information of the robot and the objects on and around the moving line of the robot in the environment indicated by the environmental image information at the time.
  9. 根据权利要求6-8中任一项所述的装置,其特征在于,所述装置还包括:确定单元;The device according to any one of claims 6-8, wherein the device further comprises: a determining unit;
    所述确定单元，用于确定所述N个时刻的样本信息中，第N个时刻的样本信息包括的标签信息用于指示所述机器人在所述第N个时刻处于被困住的状态；the determining unit is configured to determine that, among the sample information at the N moments, the label information included in the sample information at the Nth moment indicates that the robot is in a trapped state at the Nth moment;
    所述通信单元，用于将N个时刻中M个时刻的样本信息发送给服务器，包括：the communication unit being configured to send the sample information at M of the N moments to the server comprises:
    所述通信单元，用于将所述第N个时刻的前M-1个时刻的样本信息和所述第N个时刻的样本信息发送给所述服务器。the communication unit is configured to send the sample information of the M-1 moments before the Nth moment and the sample information of the Nth moment to the server.
  10. 根据权利要求6-9中任一项所述的装置,其特征在于,The device according to any one of claims 6-9, wherein:
    所述获取单元，还用于获取当前时刻的环境图像信息，及当前时刻的位置信息，根据所述当前时刻的环境图像信息，所述当前时刻的位置信息，以及当前时刻所述机器人的移动方向和移动速度，获取下一时刻的环境图像信息以及下一时刻的位置信息；the acquiring unit is further configured to obtain environment image information at the current moment and position information at the current moment, and to obtain environment image information at the next moment and position information at the next moment according to the environment image information at the current moment, the position information at the current moment, and the moving direction and moving speed of the robot at the current moment;
    所述避障单元,用于根据所述困境识别模型在所述机器人移动过程中进行避障,包括:The obstacle avoidance unit is configured to avoid obstacles during the movement of the robot according to the dilemma recognition model, including:
    所述避障单元，用于根据所述下一时刻的环境图像信息，所述下一时刻的位置信息，以及所述困境识别模型确定下一时刻所述机器人被困住的概率，确定下一时刻所述机器人被困住的概率大于预设阈值，改变所述机器人运动策略，进行避障。the obstacle avoidance unit is configured to determine the probability that the robot will be trapped at the next moment according to the environment image information at the next moment, the position information at the next moment, and the dilemma recognition model, and, upon determining that the probability that the robot will be trapped at the next moment is greater than a preset threshold, change the motion strategy of the robot to avoid obstacles.
  11. 一种机器人避障系统,其特征在于,包括:一个或多个机器人,以及服务器;A robot obstacle avoidance system is characterized by comprising: one or more robots and a server;
    所述机器人，用于获取所述机器人移动过程中N个时刻的样本信息，将所述N个时刻的样本信息发送给所述服务器；其中，N为大于0的整数，N个时刻中每个时刻的样本信息包括：用于指示该时刻所述机器人所处环境的环境图像信息、用于指示所述机器人该时刻在所述环境图像信息指示的环境中的位置的位置信息，以及用于指示所述机器人该时刻的状态的标签信息；所述标签信息包括第一标签或第二标签，所述第一标签用于指示所述机器人处于未被困住状态，所述第二标签用于指示所述机器人处于被困住状态；the robot is configured to obtain sample information at N moments during the movement of the robot and to send the sample information at the N moments to the server, where N is an integer greater than 0; the sample information at each of the N moments includes: environment image information indicating the environment in which the robot is located at that moment, position information indicating the position of the robot at that moment in the environment indicated by the environment image information, and label information indicating the state of the robot at that moment; the label information includes a first label or a second label, the first label indicating that the robot is in an untrapped state, and the second label indicating that the robot is in a trapped state;
    所述服务器，用于接收来自所述一个或多个机器人的样本信息，对接收到的所述一个或多个机器人的样本信息进行训练，获得困境识别模型，将所述困境识别模型发送至所述机器人；the server is configured to receive sample information from the one or more robots, perform training on the received sample information of the one or more robots to obtain a dilemma recognition model, and send the dilemma recognition model to the robot;
    所述机器人,还用于接收所述困境识别模型,根据所述困境识别模型在移动过程中进行避障。The robot is also used for receiving the dilemma recognition model, and avoiding obstacles in the moving process according to the dilemma recognition model.
  12. 根据权利要求11所述的系统,其特征在于,所述服务器接收到样本信息包括第一类样本信息和第二类样本信息;The system according to claim 11, wherein the sample information received by the server includes a first type of sample information and a second type of sample information;
    所述服务器用于对接收到的所述一个或多个机器人的样本信息进行训练,获得困境识别模型,包括:The server is configured to train the received sample information of the one or more robots to obtain a dilemma recognition model, including:
    所述服务器，用于对所述第一类样本信息进行训练获得初始模型，根据所述第二类样本信息对所述初始模型的准确性进行校验，获得校验准确率，若所述校验准确率大于预设阈值，则将所述初始模型确定为所述困境识别模型，若所述校验准确率小于所述预设阈值，则所述服务器接收所述机器人发送的新的样本信息，对所述新的样本信息以及所述初始模型继续进行训练，并根据所述第二类样本信息对继续训练得到的模型的准确性进行校验，直到当前获得的模型的校验准确率大于所述预设阈值，将该模型确定为所述困境识别模型；the server is configured to perform training on the first type of sample information to obtain an initial model, verify the accuracy of the initial model according to the second type of sample information, and obtain a verification accuracy rate; if the verification accuracy rate is greater than a preset threshold, the initial model is determined to be the dilemma recognition model; if the verification accuracy rate is less than the preset threshold, the server receives new sample information sent by the robot, continues training based on the new sample information and the initial model, and verifies, according to the second type of sample information, the accuracy of the model obtained by the continued training, until the verification accuracy rate of the currently obtained model is greater than the preset threshold, whereupon that model is determined to be the dilemma recognition model;
    其中,若将所述第二类样本信息中包括的样本信息的环境图像信息和位置信息输入所述初始模型获得的结果所指示的所述机器人是否被困住的状态,与所述样本信息中的标签信息所指示的所述机器人是否被困住的状态相同,则确定所述初始模型准确;若将所述第二类样本信息中包括的样本信息的环境图像信息和位置信息输入所述初始模型获得的结果所指示的所述机器人是否被困住的状态,与所述第二类样本信息中的标签信息所指示的所述机器人是否被困住的状态不同,则确定所述初始模型不准确;根据所述第二类样本信息中每个样本信息对所述初始模型的准确性的校验结果确定所述初始模型的校验准确率。Wherein, if the environment image information and position information of the sample information included in the second type of sample information are input into the state of whether the robot is trapped indicated by the result obtained by the initial model, it is the same as in the sample information Whether the robots are trapped in the same state indicated by the tag information, it is determined that the initial model is accurate; if the environmental image information and position information of the sample information included in the second type of sample information are input into the initial The state of whether the robot is trapped indicated by the result obtained by the model is different from the state of whether the robot is trapped indicated by the tag information in the second type of sample information, then it is determined that the initial model is not Accurate; the verification accuracy rate of the initial model is determined according to the verification result of each sample information in the second type of sample information on the accuracy of the initial model.
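The iterative train-and-verify procedure recited in claim 12 can be sketched as follows. This is an illustrative reading only, not the patented implementation: the names `Sample`, `train`, and `fetch_new` are hypothetical stand-ins for the first-type/second-type sample handling and the server-side training step described above.

```python
# Illustrative sketch of the claim-12 loop: train on first-type samples,
# verify on second-type samples, and keep training on new samples until
# the verification accuracy rate exceeds the preset threshold.
# All names here are hypothetical, not taken from the patent.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Sample:
    image: object      # environment image information
    position: object   # position information
    trapped: bool      # label information: whether the robot was trapped

def verification_accuracy(model: Callable, validation: List[Sample]) -> float:
    """Fraction of second-type samples whose predicted trapped-or-not
    state matches the state indicated by the label information."""
    correct = sum(1 for s in validation
                  if model(s.image, s.position) == s.trapped)
    return correct / len(validation)

def fit_dilemma_model(train: Callable,
                      first_type: List[Sample],
                      second_type: List[Sample],
                      fetch_new: Callable[[], List[Sample]],
                      threshold: float = 0.95) -> Callable:
    """Train an initial model on first-type samples, then continue
    training on newly received samples until the verification accuracy
    rate on second-type samples is greater than the threshold."""
    model = train(first_type)
    while verification_accuracy(model, second_type) <= threshold:
        # Server receives new sample information and continues training
        # from the current model (warm start).
        model = train(fetch_new(), warm_start=model)
    return model
```

Note that the claim leaves the training algorithm itself open; any classifier that maps (image, position) to a trapped-or-not state fits the `train` slot in this sketch.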
PCT/CN2020/097871 2019-06-27 2020-06-24 Robot obstacle avoidance method, apparatus, and system WO2020259524A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910566538.1 2019-06-27
CN201910566538.1A CN110370273B (en) 2019-06-27 2019-06-27 Robot obstacle avoidance method, device and system

Publications (1)

Publication Number Publication Date
WO2020259524A1 true WO2020259524A1 (en) 2020-12-30

Family

ID=68250960

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/097871 WO2020259524A1 (en) 2019-06-27 2020-06-24 Robot obstacle avoidance method, apparatus, and system

Country Status (2)

Country Link
CN (1) CN110370273B (en)
WO (1) WO2020259524A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110370273B (en) * 2019-06-27 2021-04-09 华为技术有限公司 Robot obstacle avoidance method, device and system
CN111339901B (en) * 2020-02-21 2023-06-09 北京容联易通信息技术有限公司 Image-based intrusion detection method and device, electronic equipment and storage medium
CN111633686B (en) * 2020-05-19 2022-04-12 华为技术有限公司 Robot safety protection method and device and robot
CN112286191A (en) * 2020-10-28 2021-01-29 深圳拓邦股份有限公司 Robot escaping control method and device and sweeping robot
CN114518762B (en) * 2022-04-20 2022-07-22 长沙小钴科技有限公司 Robot obstacle avoidance device, obstacle avoidance control method and robot

Citations (7)

Publication number Priority date Publication date Assignee Title
JP2013169221A (en) * 2012-02-17 2013-09-02 Sharp Corp Self-propelled cleaner
CN103822625A (en) * 2013-12-01 2014-05-28 兰州大学 Line-tracking navigation method and device for intelligent robot
JP2014203146A (en) * 2013-04-02 2014-10-27 株式会社Ihi Method and device for guiding robot
CN105511478A (en) * 2016-02-23 2016-04-20 百度在线网络技术(北京)有限公司 Robot cleaner, control method applied to same and terminal
CN107808123A (en) * 2017-09-30 2018-03-16 杭州迦智科技有限公司 The feasible area detecting method of image, electronic equipment, storage medium, detecting system
CN109696913A (en) * 2018-12-13 2019-04-30 中国航空工业集团公司上海航空测控技术研究所 A kind of sweeping robot intelligent barrier avoiding system and method based on deep learning
CN110370273A (en) * 2019-06-27 2019-10-25 华为技术有限公司 A kind of Obstacle Avoidance, device and system

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
CN104793620B (en) * 2015-04-17 2019-06-18 中国矿业大学 The avoidance robot of view-based access control model feature binding and intensified learning theory
CN108968811A (en) * 2018-06-20 2018-12-11 四川斐讯信息技术有限公司 A kind of object identification method and system of sweeping robot
CN108921119B (en) * 2018-07-12 2021-10-26 电子科技大学 Real-time obstacle detection and classification method
CN108958263A (en) * 2018-08-03 2018-12-07 江苏木盟智能科技有限公司 A kind of Obstacle Avoidance and robot
CN109446970B (en) * 2018-10-24 2021-04-27 西南交通大学 Transformer substation inspection robot road scene recognition method based on deep learning
CN109934121B (en) * 2019-02-21 2023-06-16 江苏大学 Orchard pedestrian detection method based on YOLOv3 algorithm

Also Published As

Publication number Publication date
CN110370273B (en) 2021-04-09
CN110370273A (en) 2019-10-25

Similar Documents

Publication Publication Date Title
WO2020259524A1 (en) Robot obstacle avoidance method, apparatus, and system
RU2637838C2 (en) Method to control unmanned air vehicle and device for it
US20220392359A1 (en) Adaptive object detection
US10514708B2 (en) Method, apparatus and system for controlling unmanned aerial vehicle
US20220019244A1 (en) Position-based control of unmanned aerial vehicles
US8417293B2 (en) Electronic device, method of controlling the same, and program
WO2022206179A1 (en) Positioning method and apparatus
US20210389764A1 (en) Relative image capture device orientation calibration
US20210362337A1 (en) Intelligent robot device
CN110021158A (en) A kind of camera shooting meter system and method based on deep learning
JP6098104B2 (en) Auxiliary imaging device and program
CN111566639A (en) Image classification method and device
JPWO2020039897A1 (en) Station monitoring system and station monitoring method
CN109040590A (en) The device and method that real-time control camera takes pictures and returns photo and position
CN207166600U (en) A kind of panorama camera
CN115170990A (en) Artificial intelligent edge computing system and method for unmanned aerial vehicle airborne pod
EP3919374B1 (en) Image capturing method
KR102387642B1 (en) Drone based aerial photography measurement device
WO2021164387A1 (en) Early warning method and apparatus for target object, and electronic device
CN112313944A (en) Image processing method, device, equipment and storage medium
Plata et al. A Recognition Method for Cassava Phytoplasma Disease (CPD) Real-Time Detection based on Transfer Learning Neural Networks
KR20090085990A (en) System and method of providing messenger service between apparatus picturing image
US11830247B2 (en) Identifying antenna communication issues using unmanned aerial vehicles
EP4351129A1 (en) Video management system, video management method, reading device, and information processing device
CN210781097U (en) Map collecting device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20830556

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20830556

Country of ref document: EP

Kind code of ref document: A1