CN110276302B - Method and system for taking elevator by robot


Info

Publication number
CN110276302B
CN110276302B (application CN201910551001.8A)
Authority
CN
China
Prior art keywords
elevator
detection result
feature map
preset area
resolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910551001.8A
Other languages
Chinese (zh)
Other versions
CN110276302A
Inventor
张雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongguancun Technology Leasing Co ltd
Original Assignee
Shanghai Mrobot Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Mrobot Technology Co ltd
Priority to CN201910551001.8A
Publication of CN110276302A
Application granted
Publication of CN110276302B
Legal status: Active

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B66HOISTING; LIFTING; HAULING
    • B66BELEVATORS; ESCALATORS OR MOVING WALKWAYS
    • B66B5/00Applications of checking, fault-correcting, or safety devices in elevators
    • B66B5/0006Monitoring devices or performance analysers
    • B66B5/0012Devices monitoring the users of the elevator system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/52Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V20/53Recognition of crowd images, e.g. recognition of crowd congestion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Abstract

The invention belongs to the field of robots and discloses a method and a system for a robot to board an elevator. The method comprises the following steps: when the elevator door is opened, acquiring an image of the elevator interior; inputting the image into a pre-trained neural network model to obtain a target detection result and a preset area detection result, wherein the preset area is the area the robot needs to occupy when entering the elevator; and judging, according to the target detection result and the preset area detection result, whether the elevator can be entered, entering the elevator if so, and otherwise releasing the elevator and calling an elevator again. The invention judges whether the elevator can be entered by acquiring an image of the elevator interior and detecting whether the preset area is occupied and by what; when the elevator cannot be entered, it is released immediately, which avoids occupying elevator passengers' time and improves the robot's friendly interaction capability.

Description

Method and system for taking elevator by robot
Technical Field
The invention belongs to the technical field of robots, and particularly relates to a method and a system for taking an elevator by a robot.
Background
In recent years, with the development of robot technology and the deepening of artificial-intelligence research, intelligent mobile robots play an increasingly important role in human life and are widely applied in many fields. In some applications the robot works across floors, for example delivering items between floors, and it then often needs to take an elevator to do its work.
Elevator space, however, is limited. If, on entering, there is not enough room left to accommodate the robot, or the robot's travel path is blocked, the robot has to exit the elevator. Entering an elevator and backing out midway wastes riding time and leaves the robot somewhat short of friendly interaction capability.
Disclosure of Invention
The invention aims to provide a method and a system for a robot to board an elevator that avoid wasting elevator-riding time and improve the robot's friendly interaction capability.
The technical scheme provided by the invention is as follows:
in one aspect, a method of using a robot to board an elevator is provided, comprising:
when the elevator door is opened, acquiring an image in the elevator;
inputting the image into a pre-trained neural network model to obtain a target detection result and a preset area detection result, wherein the preset area is an area which needs to be occupied when the robot enters the elevator;
and judging whether the elevator can be entered or not according to the target detection result and a preset area detection result, if so, entering the elevator, otherwise, releasing the elevator, and calling the elevator again.
Further preferably, the determining whether the elevator can be entered according to the target detection result and a preset area detection result specifically includes:
judging whether the preset area is occupied or not according to the detection result of the preset area, and if not, judging that the elevator can be accessed;
if the preset area is occupied according to the detection result of the preset area, judging whether the occupied object is an object or not according to the target detection result;
if the occupied object is an object, judging that the elevator cannot be accessed;
if the occupied object is not an object, outputting avoidance information, waiting for a preset time, acquiring a new image in the elevator again, inputting the new image into the neural network model, acquiring a target detection result and a preset area detection result again, judging whether the elevator can be entered according to the acquired target detection result and the preset area detection result again, and judging that the elevator cannot be entered when the number of times of repeatedly acquiring the image in the elevator is greater than a preset threshold value.
Further preferably, the inputting the image into a pre-trained neural network model, and the obtaining the target detection result and the preset region detection result specifically includes:
inputting the images into a pre-trained neural network model, and extracting feature maps of different layers;
carrying out feature fusion and resolution improvement on the feature maps of different layers;
and outputting a target detection result and a preset region detection result according to the feature map obtained after feature fusion and resolution improvement.
Further preferably, the inputting the image into a pre-trained neural network model, and the extracting feature maps of different layers specifically includes:
inputting the image into a pre-trained neural network model, extracting a first feature map from the last layer, and extracting a second feature map from a preceding layer, wherein the resolution of the second feature map is twice that of the first feature map;
the performing feature fusion and resolution improvement on the feature maps of different layers specifically includes:
processing the first feature map according to a preset step so that its resolution is doubled, obtaining a third feature map;
performing 1 × 1 convolution processing on the second feature map, and then fusing the second feature map with the third feature map to obtain a fourth feature map;
processing the fourth feature map multiple times according to the preset step, raising its resolution multiple times to obtain a fifth feature map;
the outputting of the target detection result and the preset area detection result according to the feature map obtained after feature fusion and resolution improvement specifically comprises:
and outputting a target detection result and a preset area detection result according to the fifth feature map.
Further preferably, the processing of the first feature map according to the preset step to double its resolution and obtain a third feature map specifically includes:
deconvolving the first feature map;
performing channel separation on the deconvolved first feature map;
performing convolution processing on the channel-separated first feature map with two different convolution kernels;
and performing channel fusion on the feature map obtained after the convolution processing to obtain a third feature map.
In another aspect, there is also provided a system for a robot to board an elevator, comprising:
the image acquisition module is used for acquiring an image in the elevator when the elevator door is opened;
the detection module is used for inputting the image into a pre-trained neural network model to obtain a target detection result and a preset area detection result, wherein the preset area is an area which needs to be occupied when the robot enters the elevator;
and the processing module is used for judging whether the elevator can be entered or not according to the target detection result and a preset area detection result, entering the elevator if the elevator can be entered, releasing the elevator if the elevator can not be entered, and calling the elevator again.
Further preferably, the processing module comprises:
the processing unit is used for judging whether the preset area is occupied or not according to the detection result of the preset area, and if not, judging that the elevator can be entered;
the processing unit is further configured to determine whether an occupied object is an object according to the target detection result if the preset area is determined to be occupied according to the preset area detection result;
the processing unit is further used for judging that the elevator cannot be accessed if the occupied object is an object;
the processing unit is further used for outputting avoidance information if the occupied object is not an object, obtaining a new image in the elevator again after waiting for a preset time, inputting the new image into the neural network model, obtaining a target detection result and a preset area detection result again, judging whether the elevator can be entered according to the obtained target detection result and the preset area detection result again, and judging that the elevator cannot be entered when the number of times of repeatedly obtaining the image in the elevator is larger than a preset threshold value.
Further preferably, the detection module comprises:
the characteristic extraction unit is used for inputting the images into a pre-trained neural network model and extracting characteristic graphs of different layers;
the resolution improving unit is used for carrying out feature fusion and resolution improvement on the feature maps of different layers;
and the task output unit is used for outputting a target detection result and a preset area detection result according to the feature map obtained after feature fusion and resolution improvement.
Further preferably, the feature extraction unit is further configured to input the image into a pre-trained neural network model, extract a first feature map from the last layer, and extract a second feature map from a preceding layer, where the resolution of the second feature map is twice that of the first feature map;
the resolution increasing unit includes:
the resolution improving subunit is configured to process the first feature map according to a preset step so as to double its resolution and obtain a third feature map;
the feature fusion subunit is configured to perform 1 × 1 convolution processing on the second feature map and then fuse the second feature map with the third feature map to obtain a fourth feature map;
the resolution improving subunit is further configured to process the fourth feature map multiple times according to the preset step, raising its resolution multiple times to obtain a fifth feature map;
and the task output unit is further used for outputting a target detection result and a preset area detection result according to the fifth feature map.
Further preferably, the resolution improving subunit is further configured to deconvolve the first feature map, perform channel separation on the deconvolved first feature map, perform convolution processing on the channel-separated parts with two different convolution kernels, and perform channel fusion on the resulting feature maps to obtain a third feature map.
Compared with the prior art, the method and the system for a robot to board an elevator have the following beneficial effects. The invention judges whether the elevator can be entered by acquiring an image of the elevator interior and detecting whether the preset area is occupied and by what; when the elevator cannot be entered, it is released immediately, which avoids occupying elevator passengers' time and improves the robot's friendly interaction capability. In addition, because the judgment is made from an image by a neural network model, the invention avoids the erroneous space judgment a depth sensor can produce when an occluded part of an object of unknown type goes undetected, and so improves judgment accuracy.
Drawings
The above characteristics, technical features, advantages and implementations of the method and the system for a robot to board an elevator are further explained below, in a clearly understandable way, through a description of preferred embodiments with reference to the accompanying drawings.
Fig. 1 is a schematic flow diagram of a first embodiment of a method of the invention for a robot to board an elevator;
fig. 2 is a schematic flow diagram of a second embodiment of a method of the invention for a robot to board an elevator;
fig. 3 is a schematic flow diagram of a third embodiment of a method of the invention for a robot to board an elevator;
fig. 4 is a schematic flow diagram of a fourth embodiment of a method of the invention for a robot to board an elevator;
FIG. 5 is a flow chart illustrating an example of feature fusion and resolution enhancement for a feature map according to the present invention;
fig. 6 is a block diagram schematically illustrating the construction of one embodiment of a system for a robot to board an elevator of the present invention.
Description of the reference numerals
10: image acquisition module; 20: detection module; 30: processing module.
Detailed Description
In order to illustrate the embodiments of the invention or the technical solutions in the prior art more clearly, the following description refers to the accompanying drawings. The drawings described below are obviously only some examples of the invention; a person skilled in the art can derive other drawings and embodiments from them without inventive effort.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
For the sake of simplicity, the drawings schematically show only the parts relevant to the invention; they do not represent the actual structure of a product. In addition, to keep the drawings concise and understandable, where several components in a drawing have the same structure or function, only one of them is illustrated or labeled. In this document, "one" does not mean "only one"; it can also mean "more than one".
According to a first embodiment provided by the present invention, as shown in fig. 1, a method of riding an elevator by a robot includes:
s100, when the elevator door is opened, obtaining an image in the elevator;
s200, inputting the image into a pre-trained neural network model to obtain a target detection result and a preset area detection result, wherein the preset area is an area which needs to be occupied when the robot enters the elevator;
s300, judging whether the elevator can be entered or not according to the target detection result and the preset area detection result, if so, entering the elevator, otherwise, releasing the elevator, and calling the elevator again.
Specifically, when the robot needs to take an elevator, it first travels to the elevator and calls it. When the elevator door opens, the robot, facing the elevator, acquires an image of the elevator interior through a camera mounted on the robot. The camera can be mounted on the robot or installed inside the elevator: after the robot calls the elevator, the server dispatches a target elevator to the floor where the robot waits, and when the target elevator arrives at that floor, the server controls the camera inside the elevator to acquire the interior image.
After the image of the elevator interior is obtained, it is recognized by the pre-trained neural network model, which outputs a target detection result and a preset area detection result. A target is a relatively common object or person; in a hospital setting, for example, targets are people and objects common in hospitals such as hospital beds, carts and wheelchairs. The preset area is the position the robot needs to occupy to ride in the elevator. Since the size of the robot is known, a preset area large enough to accommodate the robot, i.e. larger than the robot, can be defined inside the elevator. When entering the elevator the robot can only move forward or backward, not sideways, so the preset area is a fixed area inside the elevator.
The neural network model is an improvement on the Objects as Points basic architecture: a MobileNetV3 structure extracts features in the backbone part; an added intermediate module raises the resolution of the feature maps extracted by MobileNetV3; and the resolution-enhanced feature map is fed into a multi-task output module that produces the detection results. In other words, a region-judgment branch is added on top of Objects as Points.
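For illustration, a minimal PyTorch sketch of this three-part structure might look as follows. The class names, channel counts, layer choices and backbone split points are assumptions (the split indices refer to torchvision's MobileNetV3-Small), not the patent's actual implementation; the patent's own resolution-raising step is sketched separately further below.

```python
# Illustrative skeleton only; names and sizes are assumptions.
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v3_small

class RobotElevatorNet(nn.Module):
    def __init__(self, num_target_classes: int):
        super().__init__()
        feats = mobilenet_v3_small(weights=None).features
        # Split torchvision's MobileNetV3-Small into a stride-16 stage (the
        # "second feature map" source) and a stride-32 stage (the "first").
        self.to_16 = nn.Sequential(*feats[:9])   # 1/16 resolution, 48 channels
        self.to_32 = nn.Sequential(*feats[9:])   # 1/32 resolution, 576 channels
        # Intermediate module: raises resolution from 1/32 to 1/4. Plain
        # deconvolutions stand in here for the patent's preset step
        # (deconv + channel split + 3x3/5x5 conv + channel fusion).
        self.intermediate = nn.Sequential(
            nn.ConvTranspose2d(576, 128, 2, stride=2),  # 1/32 -> 1/16
            nn.ConvTranspose2d(128, 128, 2, stride=2),  # 1/16 -> 1/8
            nn.ConvTranspose2d(128, 128, 2, stride=2),  # 1/8  -> 1/4
        )
        # Multi-task output: Objects as Points-style center heatmap for targets,
        # plus the added region-judgment branch (per-cell occupied/free logits).
        self.target_head = nn.Conv2d(128, num_target_classes, 1)
        self.region_head = nn.Conv2d(128, 1, 1)

    def forward(self, img: torch.Tensor):
        f16 = self.to_16(img)
        f32 = self.to_32(f16)
        feat = self.intermediate(f32)
        return self.target_head(feat), self.region_head(feat)

net = RobotElevatorNet(num_target_classes=3)
heatmap, occupancy = net(torch.randn(1, 3, 224, 224))
print(heatmap.shape, occupancy.shape)  # (1, 3, 56, 56) and (1, 1, 56, 56)
```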
After the neural network model is built, a large number of images of elevator interiors are acquired, and the targets and the occupancy state of the preset area in each image are labeled manually. The labeled images are divided into a training set and a test set; the constructed model is trained on the training set and then tested on the test set; according to the test results, data enhancement is applied to the specific training samples the model handles poorly; the enhanced data set is split again into a training set and a test set, and the model is trained and tested anew. These steps of data-set enhancement, training and testing are repeated until the limiting detection performance is reached, yielding the trained neural network model.
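As a rough illustration of this train-and-test cycle, a sketch follows. The loss functions, the data-loader format and the rule for selecting samples to augment are all assumptions, since the description does not specify them:

```python
# Hedged sketch of one train-then-test round; losses and the sample-selection
# heuristic are assumptions, not the patent's prescription.
import torch
import torch.nn as nn

def train_and_find_hard_samples(model, train_loader, test_loader, epochs=10):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    heat_loss = nn.MSELoss()            # stand-in for a center-heatmap loss
    occ_loss = nn.BCEWithLogitsLoss()   # per-cell occupied/free classification
    model.train()
    for _ in range(epochs):
        for img, heat_gt, occ_gt in train_loader:
            heat, occ = model(img)
            loss = heat_loss(heat, heat_gt) + occ_loss(occ, occ_gt)
            opt.zero_grad()
            loss.backward()
            opt.step()
    # Test pass: rank samples by error; the worst ones indicate which kinds of
    # training data to enhance before the next round of training and testing.
    model.eval()
    errors = []
    with torch.no_grad():
        for idx, (img, heat_gt, occ_gt) in enumerate(test_loader):
            heat, occ = model(img)
            err = (heat_loss(heat, heat_gt) + occ_loss(occ, occ_gt)).item()
            errors.append((err, idx))
    return sorted(errors, reverse=True)
```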
After the target detection result and the preset area detection result are obtained, whether the elevator can be entered is judged from them. For example, if the preset area is detected as unoccupied, the robot can enter the elevator directly; if the preset area is detected as occupied by a hospital bed, the robot cannot enter, and it must release this elevator and call another one.
The invention judges whether the elevator can be entered by acquiring an image of the elevator interior and detecting whether the preset area is occupied and by what; when the elevator cannot be entered, it is released immediately, which avoids occupying elevator passengers' time and improves the robot's friendly interaction capability. In addition, because the judgment is made from an image by a neural network model, the invention avoids the erroneous space judgment a depth sensor can produce when an occluded part of an object of unknown type goes undetected, and so improves judgment accuracy.
According to a second embodiment provided by the present invention, as shown in fig. 2, a method for a robot to board an elevator, based on the first embodiment, the determining whether the elevator can be entered according to the target detection result and the preset area detection result specifically comprises:
judging whether the preset area is occupied or not according to the detection result of the preset area, and if not, judging that the elevator can be accessed;
if the preset area is occupied according to the detection result of the preset area, judging whether the occupied object is an object or not according to the target detection result;
if the occupied object is an object, judging that the elevator cannot be accessed;
if the occupied object is not an object, outputting avoidance information, waiting for a preset time, acquiring a new image in the elevator again, inputting the new image into the neural network model, acquiring a target detection result and a preset area detection result again, judging whether the elevator can be entered according to the target detection result and the preset area detection result which are acquired again, and judging that the elevator cannot be entered when the number of times of repeatedly acquiring the image in the elevator is larger than a preset threshold value.
Specifically, after the target detection result and the preset area detection result are obtained, the state of the preset area is judged from the preset area detection result; if the preset area is unoccupied, the robot directly judges that it can enter the elevator.
If, according to the preset area detection result, at least 10% of the preset area (a ratio that can be set to suit actual conditions) is classified as occupied, the object occupying the preset area is analysed from the target detection result. If the occupant is not a pedestrian, for example a hospital bed, the robot cannot enter the elevator; it releases the elevator directly, calls another elevator, and starts a new round of detection and judgment.
If a pedestrian occupies the preset area, the robot makes a voice broadcast prompting the pedestrian to make room. After waiting a preset time (for example 5 seconds), it captures a new image of the elevator interior, runs the new image through the neural network model to obtain a new target detection result and preset area detection result, and judges whether the pedestrian has moved aside. If the pedestrian has moved aside, the robot judges that it can enter and drives into the elevator.
If the pedestrian still has not moved aside, the voice prompt is broadcast again and, after the preset waiting time, the whole judgment process, i.e. steps S100, S200 and S300, is executed again. When the number of repetitions of the whole judgment process exceeds a preset threshold (for example 2 or 3 times), it indicates that the pedestrians in the elevator cannot, or do not wish to, make room; the robot then releases the elevator directly, gives this elevator up, and calls an elevator again.
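Purely as an illustration, the judgment loop above could be sketched as follows; every helper name (capture_image, detect, broadcast_avoidance_prompt and so on) is hypothetical, and the wait time, occupancy ratio and retry threshold merely echo the examples given above:

```python
# Hypothetical sketch of the judgment loop (steps S100-S300 with retries).
import time

WAIT_SECONDS = 5       # example preset waiting time
MAX_RETRIES = 3        # example preset threshold for repeated attempts
OCCUPIED_RATIO = 0.10  # example: >= 10% of preset-area cells means "occupied"

def try_to_board(robot, elevator, model):
    for _ in range(MAX_RETRIES + 1):
        image = robot.capture_image()                   # S100
        targets, area = model.detect(image)             # S200: two results
        if area.occupied_fraction() < OCCUPIED_RATIO:   # S300: area is free
            robot.enter(elevator)
            return True
        if targets.occupant_is_object(area):            # e.g. a hospital bed
            break                                       # no point in waiting
        # The occupant is a pedestrian: ask them to make room, then re-check.
        robot.broadcast_avoidance_prompt()
        time.sleep(WAIT_SECONDS)
    elevator.release()             # give the elevator back to passengers
    robot.call_elevator_again()
    return False
```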
In this scheme, when the preset area is occupied by a pedestrian, repeated detection and judgment reveal whether the pedestrian is willing to cooperate; when the pedestrian is not, the elevator is released directly, so that neither the pedestrian's time nor the robot's own time is wasted.
According to a third embodiment provided by the present invention, as shown in fig. 3, a method for a robot to board an elevator, based on the first embodiment or the second embodiment, the step S200 of inputting the image into a pre-trained neural network model, and the obtaining of the target detection result and the preset area detection result specifically includes:
s210, inputting the image into a pre-trained neural network model, and extracting feature maps of different layers;
s220, performing feature fusion and resolution improvement on the feature maps of different layers;
and S230, outputting a target detection result and a preset region detection result according to the feature map obtained after feature fusion and resolution improvement.
Specifically, after the acquired image of the elevator interior is input into the trained neural network model, feature maps of different layers are extracted through MobileNetV3. In a convolutional neural network, high-level feature maps have stronger semantics but lower resolution and map global and contour features, while low-level feature maps have weaker semantics but higher resolution and map local and detail features.
Feature fusion of the feature maps of different layers combines the features of the low-level feature map with those of the high-level feature map so that they complement one another. The resolution of the fused feature map is then raised, and the target detection result and the preset area detection result are obtained from the resolution-enhanced feature map. Fusing the features and raising the resolution lets the network handle both the detection of targets in the image and the detection of areas in the image well.
Illustratively, the preset area detection result is acquired as follows. Suppose the resolution of the resolution-enhanced feature map is 1/n of that of the acquired elevator-interior image (the original image), i.e. each point of the final feature map corresponds to an n × n area of the original image. Classifying every point of the final feature map then amounts to classifying every n × n area of the original image as occupied or not; reading off the occupancy of the n × n areas lying inside the preset area yields the preset area detection result.
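As a concrete sketch of this mapping (the shapes, threshold and rectangle below are made-up example values, not from the patent):

```python
# Sketch of reading the preset-area result off the final feature map.
import torch

n = 4                                    # each feature-map point covers an n x n patch
occ_logits = torch.randn(1, 1, 56, 56)   # region-branch output for a 224 x 224 image
assert occ_logits.shape[-1] * n == 224   # the feature map is 1/n of the original
occupied = torch.sigmoid(occ_logits)[0, 0] > 0.5  # True = point classified occupied

# The preset area is a fixed rectangle inside the elevator, so its cell
# coordinates (rows r0:r1, cols c0:c1) are known in advance.
r0, r1, c0, c1 = 30, 56, 20, 36
fraction = occupied[r0:r1, c0:c1].float().mean().item()
print(f"{fraction:.0%} of the preset area is classified as occupied")
```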
In this scheme, the complementary fusion of the features in the low-level feature map with those in the high-level feature map improves both the efficiency and the accuracy of target detection.
According to a fourth embodiment of the present invention, as shown in fig. 4, a method for a robot to board an elevator, comprises:
s100, when the elevator door is opened, acquiring an image in the elevator;
s211, inputting the image into a pre-trained neural network model, extracting a first feature map from the last layer, and extracting a second feature map from a preceding layer, wherein the resolution of the second feature map is twice that of the first feature map;
s221, processing the first feature map according to a preset step, so that the resolution of the first feature map is doubled, and a third feature map is obtained;
s222, fusing the second feature map and the third feature map after performing 1 × 1 convolution processing on the second feature map to obtain a fourth feature map;
s223, processing the fourth feature map multiple times according to the preset step, raising its resolution multiple times to obtain a fifth feature map;
s231, outputting a target detection result and a preset area detection result according to the fifth feature map, wherein the preset area is an area which needs to be occupied when the robot enters the elevator;
s300, judging whether the elevator can be entered or not according to the target detection result and the preset area detection result, if so, entering the elevator, otherwise, releasing the elevator, and calling the elevator again.
Specifically, in such network architectures the resolution generally decreases and the number of channels increases as the depth grows. As shown in fig. 5, suppose the feature maps of the last layer extracted from MobileNetV3 have a resolution of 1/32 of the original image; each feature map of this last layer is defined as a first feature map. The feature maps of the layer whose resolution is 1/16 of the original image are then selected, and each of them is defined as a second feature map.
The first feature map is then processed according to a preset step so that its resolution is doubled, giving a third feature map whose resolution is 1/16 of that of the original image.
A 1 × 1 convolution transformation is applied to the second feature map so that its number of channels matches that of the third feature map, and the convolution-transformed second feature map is fused with the third feature map to obtain a fourth feature map.
The fourth feature map is then put through the resolution-enhancement processing of the preset step twice, giving a fifth feature map whose resolution is 1/4 of that of the original image. From the fifth feature map, the objects in the image are detected through the Objects as Points infrastructure, and region classification detection is then carried out through the added region-judgment branch.
In the region classification detection based on the fifth feature map, every point of the fifth feature map is classified as occupied or not. One point of the fifth feature map corresponds to one 4 × 4 area of the original image, so the original image is in effect divided into 4 × 4 areas, and classifying every point of the fifth feature map classifies every 4 × 4 area of the original image. This yields the occupancy of the areas of the original image, from which the occupancy of the preset area, i.e. the preset area detection result, is read off.
Preferably, in step S221, the processing of the first feature map according to a preset step to double its resolution and obtain a third feature map specifically includes:
deconvolving the first feature map;
performing channel separation on the deconvolved first feature map;
performing convolution processing on the channel-separated first feature map with two different convolution kernels;
and performing channel fusion on the feature map obtained after the convolution processing to obtain a third feature map.
Specifically, the preset step comprises deconvolution, channel separation, convolution processing and channel fusion, as shown in fig. 5. The first feature map, whose resolution is 1/32 of that of the original image, is first deconvolved and then channel-separated; a 3 × 3 convolution and a 5 × 5 convolution are applied to the separated parts respectively, and channel fusion finally yields the third feature map, whose resolution has been doubled to 1/16 of that of the original image.
The fourth feature map, obtained by applying a 1 × 1 convolution to the second feature map and fusing the result with the third feature map, likewise has a resolution of 1/16 of that of the original image. The fourth feature map is then resolution-enhanced twice according to the preset step, so the resulting fifth feature map has a resolution of 1/4 of that of the original image. Raising the fifth feature map to 1/4 of the original resolution makes the region division of the original image reasonable: it avoids the loss of region-classification accuracy that too coarse or too fine a division would cause, while still allowing for detection by the Objects as Points algorithm.
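A compact PyTorch sketch of the preset step and this fusion pipeline follows. The patent fixes the operation sequence and the resolution ratios; the channel widths (576, 48, 128), the even channel split and the element-wise addition used for the fusion are assumptions:

```python
# Sketch of the preset step and fusion pipeline; widths and the additive
# fusion are assumptions.
import torch
import torch.nn as nn

class PresetStep(nn.Module):
    """Doubles resolution: deconvolution, channel separation, parallel 3x3 and
    5x5 convolutions on the two parts, then channel fusion."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.deconv = nn.ConvTranspose2d(in_ch, out_ch, 2, stride=2)  # x2 upsample
        half = out_ch // 2
        self.conv3 = nn.Conv2d(half, half, 3, padding=1)
        self.conv5 = nn.Conv2d(out_ch - half, out_ch - half, 5, padding=2)

    def forward(self, x):
        x = self.deconv(x)
        a, b = torch.chunk(x, 2, dim=1)                          # channel separation
        return torch.cat([self.conv3(a), self.conv5(b)], dim=1)  # channel fusion

class FusionUpsampler(nn.Module):
    """first (1/32) -> preset step -> third (1/16); 1x1-convolved second (1/16)
    fused with third -> fourth; two more preset steps -> fifth (1/4)."""
    def __init__(self, c_first: int, c_second: int, c_mid: int):
        super().__init__()
        self.step1 = PresetStep(c_first, c_mid)
        self.lateral = nn.Conv2d(c_second, c_mid, 1)   # match channel counts
        self.step2 = PresetStep(c_mid, c_mid)
        self.step3 = PresetStep(c_mid, c_mid)

    def forward(self, first, second):
        third = self.step1(first)               # 1/32 -> 1/16
        fourth = third + self.lateral(second)   # fusion, assumed element-wise add
        return self.step3(self.step2(fourth))   # 1/16 -> 1/8 -> 1/4

first = torch.randn(1, 576, 7, 7)     # last-layer map, 1/32 of a 224 x 224 image
second = torch.randn(1, 48, 14, 14)   # earlier-layer map, 1/16 of the image
fifth = FusionUpsampler(576, 48, 128)(first, second)
print(fifth.shape)  # torch.Size([1, 128, 56, 56]), i.e. 1/4 of the original
```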
It should be understood that, in the foregoing embodiments, the sequence numbers of the steps do not mean the execution sequence, and the execution sequence of the steps should be determined by functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present invention.
According to a fifth embodiment provided by the present invention, as shown in fig. 6, a system for a robot to board an elevator, includes:
the image acquisition module 10 is used for acquiring an image in the elevator when the elevator door is opened;
the detection module 20 is configured to input the image into a pre-trained neural network model, and obtain a target detection result and a preset area detection result, where the preset area is an area that the robot needs to occupy when entering the elevator;
and the processing module 30 is used for judging whether the elevator can be entered according to the target detection result and a preset area detection result, if so, entering the elevator, otherwise, releasing the elevator, and calling the elevator again.
Specifically, this embodiment is a device embodiment corresponding to the method embodiment, and specific effects are referred to the above embodiments, which are not described in detail herein.
According to a sixth embodiment of the present invention, in the system for boarding an elevator by a robot, on the basis of the fifth embodiment, the processing module 30 includes:
the processing unit is used for judging whether the preset area is occupied or not according to the detection result of the preset area, and if not, judging that the elevator can be entered;
the processing unit is also used for judging whether the occupied object is an object according to the target detection result if the preset area is judged to be occupied according to the preset area detection result;
the processing unit is also used for judging that the elevator cannot be accessed if the occupied object is an object;
and the processing unit is further used for outputting avoidance information if the occupied object is not an object, acquiring a new image in the elevator again after waiting for a preset time, inputting the new image into the neural network model, acquiring a target detection result and a preset area detection result again, judging whether the elevator can be entered according to the acquired target detection result and preset area detection result again, and judging that the elevator cannot be entered when the number of times of repeatedly acquiring the image in the elevator is greater than a preset threshold value.
Specifically, this embodiment is a device embodiment corresponding to the method embodiment, and specific effects are referred to the above embodiments, which are not described in detail herein.
According to a seventh embodiment of the present invention, in the system for boarding an elevator by a robot, based on the fifth or sixth embodiment, the detection module 20 includes:
the characteristic extraction unit is used for inputting the images into a pre-trained neural network model and extracting characteristic graphs of different layers;
the resolution improving unit is used for carrying out feature fusion and resolution improvement on the feature maps of different layers;
and the task output unit is used for outputting a target detection result and a preset area detection result according to the feature map obtained after feature fusion and resolution improvement.
Specifically, this embodiment is a device embodiment corresponding to the method embodiment, and specific effects are referred to the above embodiments, which are not described in detail herein.
According to an eighth embodiment of the present invention, there is provided a system for a robot to board an elevator, which comprises, in addition to the seventh embodiment,
the feature extraction unit is further configured to input the image into a pre-trained neural network model, extract a first feature map from the last layer, and extract a second feature map from a preceding layer, where the resolution of the second feature map is twice that of the first feature map;
the resolution increasing unit includes:
the resolution improving subunit is configured to process the first feature map according to a preset step so as to double its resolution and obtain a third feature map;
the feature fusion subunit is configured to perform 1 × 1 convolution processing on the second feature map and then fuse the second feature map with the third feature map to obtain a fourth feature map;
the resolution improving subunit is further configured to process the fourth feature map multiple times according to the preset step, raising its resolution multiple times to obtain a fifth feature map;
and the task output unit is further used for outputting a target detection result and a preset area detection result according to the fifth feature map.
Preferably, the resolution improving subunit is further configured to deconvolve the first feature map, perform channel separation on the deconvolved first feature map, perform convolution processing on the channel-separated parts with two different convolution kernels, and perform channel fusion on the resulting feature maps to obtain a third feature map.
Specifically, this embodiment is a device embodiment corresponding to the method embodiment, and specific effects are referred to the above embodiments, which are not described in detail herein.
It should be noted that the above embodiments can be freely combined as needed. The foregoing describes only preferred embodiments of the invention; it should be pointed out that a person skilled in the art can make various improvements and modifications without departing from the principle of the invention, and such improvements and modifications shall also fall within the protection scope of the invention.

Claims (8)

1. A method of using a robot to board an elevator, comprising:
when the elevator door is opened, acquiring an image in the elevator;
inputting the image into a pre-trained neural network model to obtain a target detection result and a preset area detection result, wherein the preset area is an area which needs to be occupied when the robot enters the elevator;
judging whether the elevator can be entered or not according to the target detection result and a preset area detection result, if so, entering the elevator, otherwise, releasing the elevator, and calling the elevator again;
the step of judging whether the elevator can be accessed according to the target detection result and the preset area detection result specifically comprises the following steps:
judging whether the preset area is occupied or not according to the detection result of the preset area, and if not, judging that the elevator can be accessed;
if the preset area is occupied according to the detection result of the preset area, judging whether the occupied object is an object or not according to the target detection result;
if the occupied object is an object, judging that the elevator cannot be accessed;
if the occupied object is not an object, outputting avoidance information, waiting for a preset time, acquiring a new image in the elevator again, inputting the new image into the neural network model, acquiring a target detection result and a preset area detection result again, judging whether the elevator can be entered according to the acquired target detection result and the preset area detection result again, and judging that the elevator cannot be entered when the number of times of repeatedly acquiring the image in the elevator is greater than a preset threshold value.
2. The method of claim 1, wherein the inputting the image into a pre-trained neural network model, and the obtaining the target detection result and the preset area detection result specifically comprises:
inputting the images into a pre-trained neural network model, and extracting feature maps of different layers;
carrying out feature fusion and resolution improvement on the feature maps of different layers;
and outputting a target detection result and a preset region detection result according to the feature map obtained after feature fusion and resolution improvement.
3. The method of claim 2, wherein the inputting the image into a pre-trained neural network model and the extracting the feature maps of different layers specifically comprises:
inputting the image into a pre-trained neural network model, extracting a first feature map from the last layer, and extracting a second feature map from a preceding layer, wherein the resolution of the second feature map is twice that of the first feature map;
the performing feature fusion and resolution improvement on the feature maps of different layers specifically includes:
processing the first feature map according to a preset step so that its resolution is doubled, obtaining a third feature map;
performing 1 × 1 convolution processing on the second feature map, and then fusing the second feature map with the third feature map to obtain a fourth feature map;
processing the fourth feature map multiple times according to the preset step, raising its resolution multiple times to obtain a fifth feature map;
the outputting of the target detection result and the preset area detection result according to the feature map obtained after feature fusion and resolution improvement specifically comprises:
and outputting a target detection result and a preset area detection result according to the fifth feature map.
4. The method of claim 3, wherein the processing of the first feature map according to the preset step to double its resolution and obtain a third feature map specifically comprises:
deconvolving the first feature map;
performing channel separation on the deconvolved first feature map;
performing convolution processing on the channel-separated first feature map with two different convolution kernels;
and performing channel fusion on the feature map obtained after the convolution processing to obtain a third feature map.
5. A system for a robot to board an elevator, comprising:
the image acquisition module is used for acquiring an image in the elevator when the elevator door is opened;
the detection module is used for inputting the image into a pre-trained neural network model to obtain a target detection result and a preset area detection result, wherein the preset area is an area which needs to be occupied when the robot enters the elevator;
the processing module is used for judging whether the elevator can be entered or not according to the target detection result and a preset area detection result, if so, entering the elevator, and if not, releasing the elevator and calling the elevator again;
the processing module comprises:
the processing unit is used for judging whether the preset area is occupied or not according to the detection result of the preset area, and if not, judging that the elevator can be entered;
the processing unit is further configured to determine whether an occupied object is an object according to the target detection result if the preset area is determined to be occupied according to the preset area detection result;
the processing unit is further used for judging that the elevator cannot be accessed if the occupied object is an object;
the processing unit is further used for outputting avoidance information if the occupied object is not an object, obtaining a new image in the elevator again after waiting for a preset time, inputting the new image into the neural network model, obtaining a target detection result and a preset area detection result again, judging whether the elevator can be entered according to the obtained target detection result and the preset area detection result again, and judging that the elevator cannot be entered when the number of times of repeatedly obtaining the image in the elevator is larger than a preset threshold value.
6. The system of claim 5, wherein the detection module comprises:
the characteristic extraction unit is used for inputting the images into a pre-trained neural network model and extracting characteristic graphs of different layers;
the resolution improving unit is used for carrying out feature fusion and resolution improvement on the feature maps of different layers;
and the task output unit is used for outputting a target detection result and a preset area detection result according to the feature map obtained after feature fusion and resolution improvement.
7. The system for a robot to board an elevator of claim 6, wherein
the feature extraction unit is further configured to input the image into a pre-trained neural network model, extract a first feature map from the last layer, and extract a second feature map from a preceding layer, where the resolution of the second feature map is twice that of the first feature map;
the resolution increasing unit includes:
the resolution improving subunit is configured to process the first feature map according to a preset step so as to double its resolution and obtain a third feature map;
the feature fusion subunit is configured to perform 1 × 1 convolution processing on the second feature map and then fuse the second feature map with the third feature map to obtain a fourth feature map;
the resolution improving subunit is further configured to process the fourth feature map multiple times according to the preset step, raising its resolution multiple times to obtain a fifth feature map;
and the task output unit is further used for outputting a target detection result and a preset area detection result according to the fifth feature map.
8. The system for a robot to board an elevator of claim 7, wherein
the resolution improving subunit is further configured to deconvolve the first feature map, perform channel separation on the deconvolved first feature map, perform convolution processing on the channel-separated parts with two different convolution kernels, and perform channel fusion on the resulting feature maps to obtain a third feature map.
CN201910551001.8A, priority date 2019-06-24, filing date 2019-06-24: Method and system for taking elevator by robot. Granted as CN110276302B (Active).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910551001.8A 2019-06-24 2019-06-24 Method and system for taking elevator by robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910551001.8A 2019-06-24 2019-06-24 Method and system for taking elevator by robot

Publications (2)

Publication Number Publication Date
CN110276302A 2019-09-24
CN110276302B 2022-11-25

Family

ID=67961774

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910551001.8A Active CN110276302B (en) 2019-06-24 2019-06-24 Method and system for taking elevator by robot

Country Status (1)

Country Link
CN (1) CN110276302B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20210063121A (en) * 2019-11-22 2021-06-01 엘지전자 주식회사 Robot and method for controlling same
CN111161740A (en) * 2019-12-31 2020-05-15 中国建设银行股份有限公司 Intention recognition model training method, intention recognition method and related device
CN111153300B (en) * 2019-12-31 2022-01-07 深圳优地科技有限公司 Ladder taking method and system for robot, robot and storage medium
CN112537705B (en) * 2020-03-31 2023-04-11 深圳优地科技有限公司 Robot elevator taking scheduling method and device, terminal equipment and storage medium
CN111847152B (en) * 2020-06-30 2022-05-24 深圳优地科技有限公司 Robot elevator taking determination method and device, electronic equipment and medium
CN111874764B (en) * 2020-09-28 2021-02-02 上海木承智能医疗科技有限公司 Robot scheduling method, server and storage medium
CN112529953B (en) * 2020-12-17 2022-05-03 深圳市普渡科技有限公司 Elevator space state judgment method and device and storage medium
CN113911864A (en) * 2021-10-13 2022-01-11 北京云迹科技有限公司 Control method for robot to board elevator and related equipment
CN115893138A (en) * 2022-10-27 2023-04-04 合肥瓦力觉启机器人科技有限公司 Wireless interaction system in AGV dolly elevator car

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005018382A (en) * 2003-06-25 2005-01-20 Matsushita Electric Works Ltd Autonomous mobile robot for getting on and off elevator
JP2011088721A (en) * 2009-10-23 2011-05-06 Fuji Heavy Ind Ltd Autonomous travel robot and control system of autonomous travel robot
CN102821272A (en) * 2012-08-16 2012-12-12 安徽中科智能高技术有限责任公司 Video monitoring system with elevator invalid request signal removing function
CN106965198A (en) * 2017-03-30 2017-07-21 上海木爷机器人技术有限公司 Robot control method and device
CN108439099A (en) * 2018-04-26 2018-08-24 珠海亿智电子科技有限公司 A kind of elevator intelligent control method based on space and load detecting
CN109059882A (en) * 2018-08-07 2018-12-21 北京云迹科技有限公司 Interior space detection method and system
CN109095299A (en) * 2018-08-23 2018-12-28 北京云迹科技有限公司 Robot boarding method and device based on Internet of Things
CN109867180A (en) * 2017-12-01 2019-06-11 株式会社日立大厦系统 Elevator device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5572018B2 (en) * 2010-07-08 2014-08-13 株式会社日立製作所 Autonomous mobile equipment riding elevator system


Also Published As

Publication number Publication date
CN110276302A (en) 2019-09-24

Similar Documents

Publication Publication Date Title
CN110276302B (en) Method and system for taking elevator by robot
CN106845487B (en) End-to-end license plate identification method
JP7012880B2 (en) Target detection method and equipment, equipment and storage media
CN108229523B (en) Image detection method, neural network training method, device and electronic equipment
CN108776819A (en) A kind of target identification method, mobile terminal and computer readable storage medium
CN109284670A (en) A kind of pedestrian detection method and device based on multiple dimensioned attention mechanism
US10303981B1 (en) Learning method and testing method for R-CNN based object detector, and learning device and testing device using the same
CN111105030B (en) Active zero bypass and weight pruning in neural networks for vehicle sensing systems
CN110738160A (en) human face quality evaluation method combining with human face detection
CN110543838A (en) Vehicle information detection method and device
EP4047561A1 (en) Method for recognizing an emotion of a driver, apparatus, device, medium and vehicle
US11804026B2 (en) Device and a method for processing data sequences using a convolutional neural network
CN113032114A (en) Neural network system and operation method thereof
CN111310650A (en) Vehicle riding object classification method and device, computer equipment and storage medium
CN112348845A (en) System and method for parking space detection and tracking
CN116152744A (en) Dynamic detection method and device for electric vehicle, computer equipment and storage medium
Velez et al. Embedded platforms for computer vision-based advanced driver assistance systems: a survey
WO2022019355A1 (en) Disease diagnosis method using neural network trained by using multi-phase biometric image, and disease diagnosis system performing same
CN114387554A (en) Vehicle personnel overload identification method, device, equipment and readable medium
Jain et al. Ensembled Neural Network for Static Hand Gesture Recognition
CN111062311B (en) Pedestrian gesture recognition and interaction method based on depth-level separable convolution network
CN116363542A (en) Off-duty event detection method, apparatus, device and computer readable storage medium
JP2023529239A (en) A Computer-implemented Method for Multimodal Egocentric Future Prediction
Rakhmonov et al. Airy YOLOv5 for Disabled Sign Detection
CN112507933B (en) Saliency target detection method and system based on centralized information interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 200335 402 rooms, No. 33, No. 33, Guang Shun Road, Shanghai

Patentee after: Shanghai Zhihui Medical Technology Co.,Ltd.

Address before: 200335 402 rooms, No. 33, No. 33, Guang Shun Road, Shanghai

Patentee before: SHANGHAI MROBOT TECHNOLOGY Co.,Ltd.

Address after: 200335 402 rooms, No. 33, No. 33, Guang Shun Road, Shanghai

Patentee after: Shanghai zhihuilin Medical Technology Co.,Ltd.

Address before: 200335 402 rooms, No. 33, No. 33, Guang Shun Road, Shanghai

Patentee before: Shanghai Zhihui Medical Technology Co.,Ltd.

CP03 Change of name, title or address

Address after: 202150 room 205, zone W, second floor, building 3, No. 8, Xiushan Road, Chengqiao Town, Chongming District, Shanghai (Shanghai Chongming Industrial Park)

Patentee after: Shanghai Noah Wood Robot Technology Co.,Ltd.

Address before: 200335 402 rooms, No. 33, No. 33, Guang Shun Road, Shanghai

Patentee before: Shanghai zhihuilin Medical Technology Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20230804

Address after: 610, Floor 6, Block A, No. 2, Lize Middle Second Road, Chaoyang District, Beijing 100102

Patentee after: Zhongguancun Technology Leasing Co.,Ltd.

Address before: 202150 room 205, zone W, second floor, building 3, No. 8, Xiushan Road, Chengqiao Town, Chongming District, Shanghai (Shanghai Chongming Industrial Park)

Patentee before: Shanghai Noah Wood Robot Technology Co.,Ltd.