CN107066507A - A semantic map construction method based on a cloud robot hybrid cloud architecture - Google Patents

A semantic map construction method based on a cloud robot hybrid cloud architecture Download PDF

Info

Publication number
CN107066507A
CN107066507A
Authority
CN
China
Prior art keywords
robot
module
picture
object identification
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710019515.XA
Other languages
Chinese (zh)
Other versions
CN107066507B (en)
Inventor
王怀民
丁博
刘惠
李艺颖
史佩昌
车慧敏
胡奔
包慧
彭维崑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201710019515.XA priority Critical patent/CN107066507B/en
Publication of CN107066507A publication Critical patent/CN107066507A/en
Application granted granted Critical
Publication of CN107066507B publication Critical patent/CN107066507B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features

Abstract

The invention discloses a semantic map construction method based on a cloud robot hybrid cloud architecture, with the aim of striking an appropriate balance between improving object recognition accuracy and shortening recognition time. The technical scheme is to build a hybrid cloud consisting of a robot, a private cloud node and a public cloud node. The private cloud node obtains, through the ROS message mechanism, the environment pictures taken by the robot together with its odometry and pose data, and uses SLAM to draw the geometric map of the environment in real time from the odometry and pose data. The private cloud node performs object recognition on the environment pictures and uploads objects that may have been recognized incorrectly to the public cloud node for re-identification. The private cloud node maps the object class labels returned by the public cloud node onto the SLAM map and marks each label at the corresponding map position to complete the construction of the semantic map. The invention reduces the robot's local computational load, minimizes request response time and improves the accuracy of object recognition.

Description

A semantic map construction method based on a cloud robot hybrid cloud architecture
Technical field
The present invention relates to the field of robot distributed computing, and in particular to a method that uses cloud computing as a supporting capability and realizes cloud-robot-based semantic map construction by building a hybrid cloud architecture.
Background art
A robot's perception data may come from multiple modalities such as vision, force, touch, infrared, ultrasound and laser radar. Robot semantic map construction means that the robot uses these perception data to perceive and understand its surrounding environment; the core concern is how to sift and filter the perception data, analyze and synthesize them, and extract high-level semantic information usable for autonomous decision-making (such as object names and their locations), which is embodied as labels attached to objects on a geometric map. Semantic information can be acquired through key technologies such as radio frequency identification, auditory technology and vision technology; most current research concentrates on the latter, and the present invention is likewise based on vision. The vision-based semantic map construction process is both computation-intensive and knowledge-intensive. It can be divided into the following two links executed in parallel. (1) Building the geometric map, generally by means of sensors carried by the robot such as an RGB-D (Red Green Blue-Depth) camera or laser radar, using the Simultaneous Localization and Mapping algorithm SLAM. The SLAM algorithm was first proposed by Smith, Self and Cheeseman in 1988, and the related ideas were published in the paper "Estimating uncertain spatial relationships in robotics" (Smith R, Self M, Cheeseman P. Estimating uncertain spatial relationships in robotics [M]//Autonomous Robot Vehicles. Springer New York, 1990: 167-193). Such algorithms usually rely on statistical filtering or image matching and are solved by repeated iteration, so this is a typical computation-intensive task. (2) Obtaining object label information. An image perceived by the robot in real time is likely to contain many objects, so the image must first be segmented, and the objects in the segmented image are then recognized by methods such as machine learning. This process likewise involves a large amount of image computation and is also a typical computation-intensive task. At the same time, recognition accuracy depends heavily on the knowledge preset in the robot, and this knowledge usually exists in forms such as models trained by machine learning, so it is also a knowledge-intensive task. If the environment is open and dynamic and cannot be predicted accurately, the accuracy of the robot's semantic map will face a severe challenge.
Cloud robotics uses back-end infrastructure such as cloud computing and big data to enhance the robot's ability to perform tasks in complex environments, and therefore offers a new way of thinking about computation- and knowledge-intensive tasks such as robot semantic map construction: (1) carry out map construction using the powerful computing resources of the cloud, offloading local computation; (2) more importantly, use the rich empirical knowledge in the cloud to break through the limitation of the robot's local knowledge, so that the robot can extend its knowledge based on cloud intelligence and recognize objects in the environment better. The clouds a robot can use include the private cloud and the public cloud. A private cloud is a cloud built for exclusive use by one client and can therefore provide the most effective control over data, security and quality of service. A public cloud generally refers to a cloud that a third-party provider makes available to clients, usually over the Internet; it is low-cost and in some cases free, and the core attribute of a public cloud is shared resource services.
At present, the mainstream approach to cloud-robot semantic map construction is to use a private cloud: an object knowledge base is provided for the robot in the private cloud, and the robot uploads pictures to the private cloud to obtain object class labels, thereby overcoming the robot's own limitations in computing and knowledge resources, improving operating efficiency and shortening request response time. However, such private-cloud object recognition is either matched against individual object instances or based on recognition algorithms with relatively low accuracy; the knowledge must still be preset in advance, and unfamiliar objects cannot be recognized. In other words, although private cloud resources are controllable and request response time can be shortened, the knowledge is still limited.
If an open object recognition cloud service based on Internet big data, such as CloudSight, is applied to the object recognition link of semantic map construction, the robot can exploit knowledge rooted in Internet big data and improve recognition accuracy in complex open environments. CloudSight provides an open API: the user uploads a picture and CloudSight returns an object label or a description of the object. Backed by massive Internet image data, CloudSight can currently recognize more than forty million kinds of articles, and it gives good recognition results even for articles photographed under poor lighting and from poor angles. However, such Internet public cloud services follow a best-effort model: their performance is uncontrollable and the request response time is long, which is unacceptable for many robot applications, especially those that interact directly with the physical world. In other words, although the public cloud has rich knowledge from Internet big data, its resources are uncontrollable and its request response time is long.
In summary, private-cloud-based schemes perform better in request response time and can recognize familiar objects, but they require training in advance, their knowledge is limited, and they cannot recognize unfamiliar objects in open environments. Public-cloud-based schemes exploit Internet big data and bring broader intelligence to object recognition, recognizing unfamiliar objects without prior training, but their recognition delay and request response time are long. From this analysis it can be seen that, when building a robot semantic map, how the robot can accurately identify unfamiliar objects in an open environment while keeping recognition delay as short as possible is the technical problem that a semantic map construction method needs to solve.
Summary of the invention
The technical problem to be solved by the present invention is to provide a semantic map construction method based on a cloud robot hybrid cloud architecture, using a hybrid cloud composed of a public cloud and a private cloud, so that the robot can extend its recognition capability with the cloud and an appropriate balance is reached between improving object recognition accuracy and shortening recognition time. A hybrid cloud refers to the combination of a public cloud and a private cloud: it integrates the advantages of both and coordinates them well.
The technical scheme of the present invention is to build a hybrid cloud environment consisting of a robot, a private cloud node and a public cloud node. The private cloud node obtains, through the ROS (Robot Operating System) message mechanism, the environment picture data taken by the robot together with the robot's odometry and pose data, and uses the Simultaneous Localization and Mapping algorithm SLAM to draw the geometric map of the environment in real time from the odometry and pose data. The private cloud node then performs object recognition on the environment pictures taken by the robot using an object recognition module based on Faster R-CNN (Faster Region Convolutional Neural Network), and uploads objects that may have been recognized incorrectly to the public cloud node for re-identification. Finally the private cloud node maps the object class labels returned by the public cloud node onto the SLAM map and marks them at the corresponding positions to complete the construction of the semantic map.
The present invention comprises the following steps:
In the first step, the robot hybrid cloud environment is built. It consists of a robot compute node, a private cloud node and a public cloud node. The robot compute node is a robot hardware device capable of running software programs (such as an unmanned aerial vehicle, an unmanned vehicle or a humanoid robot). The private cloud node is a computing device with good computing capability and controllable resources that can run computation-intensive or knowledge-intensive robot applications. The public cloud node is a computing device with abundant storage resources that can provide services externally. The robot compute node and the private cloud node are interconnected through network equipment, and the private cloud node accesses the public cloud node through the Internet.
Besides the operating system Ubuntu (e.g. version 14.04) and the robot middleware ROS (e.g. the Indigo version, which consists of ROS message publishers, ROS message subscribers, the ROS communication protocol and so on), the robot compute node is also equipped with a perception data acquisition module.
In ROS, message passing uses a publish/subscribe mechanism based on topics: messages are classified by topic, a publisher publishes messages on a certain topic, and all subscribers that have subscribed to that topic receive the messages. A ROS topic uniquely identifies one class of messages and is composed of letters, digits and the "/" symbol. A minimal sketch of this mechanism is given below.
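The following sketch shows ROS topic publish/subscribe in Python with rospy; the node name, the use of the /sem_map topic and the std_msgs/String payload are illustrative assumptions, since the patent does not fix the message types here.

    import rospy
    from std_msgs.msg import String

    def on_sem_map(msg):
        # Every subscriber of the /sem_map topic receives this callback.
        rospy.loginfo("received semantic map update: %s", msg.data)

    def main():
        rospy.init_node("sem_map_demo")
        pub = rospy.Publisher("/sem_map", String, queue_size=10)
        rospy.Subscriber("/sem_map", String, on_sem_map)
        rate = rospy.Rate(1)  # publish once per second
        while not rospy.is_shutdown():
            pub.publish(String(data="semantic map payload"))
            rate.sleep()

    if __name__ == "__main__":
        main()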
ROS contains many software packages that provide auxiliary support for autonomous robot behaviour. The tf library in ROS is a collection of software packages that provides coordinate transformation and orientation calculation: given the description of the current coordinate frame in a reference frame, tf can transform the coordinates of any point in the current frame into coordinates in the reference frame; given the historical and current gyroscope data, tf can obtain the robot's current orientation by integration.
The rbx1 package in ROS provides angle conversion functions. The orientation in the robot's pose information is represented as a quaternion (a higher-order complex number consisting of one real unit and three imaginary units). When calculating the position of an object on the SLAM map, the robot's rotation angle is needed (i.e. the angle rotated about each axis in a given axis order), so the quaternion representation of the orientation must be converted into a rotation-angle representation. The conversion from a quaternion to rotation angles is a classical mathematical transformation; the related mathematics was proposed by Hamilton as early as 1843 and is widely used for object positioning in graphics and imaging.
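As an illustration, this quaternion-to-rotation-angle conversion can be done with the euler_from_quaternion helper of the standard ROS tf.transformations module (the rbx1 package wraps similar utilities); this is a sketch under that assumption, not the patent's own code.

    import math
    from tf.transformations import euler_from_quaternion

    def quaternion_to_rotation_angle(qx, qy, qz, qw):
        # Convert a quaternion orientation into Euler angles and return the
        # rotation about the z axis (yaw) in degrees.
        roll, pitch, yaw = euler_from_quaternion([qx, qy, qz, qw])
        return math.degrees(yaw)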
The pose information of the robot refers to the robot's position and posture; the mileage information refers to the distance the robot has travelled; the depth information refers to the distance from the recognized object position to the plane of the robot node's camera.
The perception data acquisition module collects the robot's pose information, mileage information, depth information and the colour environment pictures taken by the camera, and publishes these data, through the ROS publish/subscribe mechanism, to the perception data receiving module on the private cloud node.
Besides the operating system Ubuntu and the robot middleware ROS, the private cloud node is also equipped with a perception data receiving module, a SLAM map-building module, a Faster R-CNN-based object recognition module, a collaborative recognition decision module, a picture uploading module and a semantic annotation module.
The perception data receiving module uses the ROS publish/subscribe mechanism to subscribe to the pose information, mileage information, depth information and colour environment pictures published by the perception data acquisition module. After receiving them, it forwards the pose information to the SLAM map-building module and the semantic annotation module, sends the mileage information to the SLAM map-building module, sends the colour environment pictures to the Faster R-CNN-based object recognition module, and sends the depth information to the semantic annotation module.
The SLAM map-building module uses the robot pose information and mileage information received from the perception data receiving module to draw the environment SLAM map in real time in a completely unknown environment without prior knowledge, and sends the SLAM map to the semantic annotation module. The SLAM map is the two-dimensional geometric map drawn by the SLAM algorithm.
The Faster R-CNN-based object recognition module uses the colour environment pictures received from the perception data receiving module and recognizes objects in them with the Faster R-CNN model, obtaining each object's position in the colour environment picture, the object picture, the object recognition class label and the object confidence score.
The Faster R-CNN model is an object recognition engine implemented with convolutional neural networks (CNN), published in 2015 by Shaoqing Ren, Kaiming He, Ross Girshick et al. (Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks [C]//Advances in Neural Information Processing Systems. 2015: 91-99). The Faster R-CNN model is trained in a supervised way on a set of training pictures: feature extraction and object segmentation are carried out on the training pictures, and the parameters of the Faster R-CNN softmax classifier are adjusted continuously with samples of known class labels, so that the recognition class labels output by the classifier match the true class labels of the training samples as often as possible; that is, the classifier parameters are adjusted to make the classification effect as good as possible. After training is complete, the model performs object recognition on colour pictures and outputs the object position, the object recognition class label and the object confidence score. The object position describes the object's location in the environment picture by the top-left coordinate of the object together with its length and width. The object recognition class label is either a specific category label (such as apple or cup) or the label "other object class", which covers every object outside the specific categories: it only asserts that the region is an object without being able to say which specific kind of object it is. The confidence score is computed by the Faster R-CNN softmax classifier and characterizes the reliability of the Faster R-CNN recognition result. The recognition confidence score lies between 0 and 1; the larger the score, the more reliable the recognition result.
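For illustration only, the following sketch uses torchvision's off-the-shelf Faster R-CNN to show the shape of the recognition output (boxes, labels, scores); the patent's own model is trained on PASCAL VOC 2007 plus downloaded pictures, whereas this pretrained checkpoint uses COCO categories, so it is not the patent's detector.

    import torch
    import torchvision

    # Pretrained detector; only illustrates the output structure, not the
    # patent's own trained model.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    model.eval()

    def detect(image_tensor):
        # image_tensor: float tensor of shape 3 x H x W with values in [0, 1]
        with torch.no_grad():
            out = model([image_tensor])[0]
        return out["boxes"], out["labels"], out["scores"]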
The Faster R-CNN-based object recognition module trains the Faster R-CNN model with the well-known PASCAL VOC 2007 data set from the object recognition field and with a number of pictures containing various objects downloaded at random from the Internet (for example object pictures from the well-known picture and video website www.flickr.com). PASCAL VOC stands for Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes Challenge; it is the standardized data set provided by that image object recognition competition, of which the 2007 edition is the most representative and is abbreviated PASCAL VOC 2007.
The Faster R-CNN-based object recognition module applies the Faster R-CNN model to the colour environment picture received from the perception data receiving module and outputs, for each object in the picture, the object recognition position, the object picture, the object recognition class label and the object confidence score. The object recognition class label and the object confidence score are obtained directly from the Faster R-CNN model. The object recognition position describes the object's location in the environment picture by the pixel coordinates of its bounding box; the coordinates of the center point of the bounding box are taken as the object recognition position. The object picture is the picture cropped out according to the object position output by the Faster R-CNN model. The Faster R-CNN-based object recognition module sends the recognition result (object recognition class label, confidence score, object recognition position and object picture) to the collaborative recognition decision module.
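A small sketch of the recognition result produced by the module, assuming a bounding box given as (x, y, width, height) in pixels; the field names are illustrative, not taken from the patent.

    from dataclasses import dataclass

    @dataclass
    class Detection:
        label: str          # object recognition class label, e.g. "chair" or "other object class"
        confidence: float   # softmax confidence score in [0, 1]
        bbox: tuple         # (x_top_left, y_top_left, width, height) in pixels

    def recognition_position(det):
        # The object recognition position is the center point of the bounding box.
        x, y, w, h = det.bbox
        return (x + w / 2.0, y + h / 2.0)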
The collaborative recognition decision module judges, according to a confidence threshold, whether the recognition result received from the Faster R-CNN-based object recognition module is correct, and decides whether the object needs to be uploaded to the open cloud for re-identification. The confidence threshold is set by the user and is the optimal value selected by experiments on the PASCAL VOC 2007 data set: the selectable range of the threshold is 0 to 1, and the experimental step size is 0.1, i.e. the optimal value is chosen from [0, 0.1, 0.2, ..., 1]. The experiments show that the collaborative recognition decision module improves recognition accuracy most when the confidence threshold is set to 0.7, so a confidence threshold of 0.7 is recommended.
If the object recognition class label in the recognition result is not "other object class" and the score is greater than or equal to the confidence threshold, the collaborative recognition decision module considers the recognition correct and sends the object recognition position and the object recognition class label to the semantic annotation module. In the following two cases of recognition error or inability to recognize, the collaborative recognition decision module sends the object picture to the picture uploading module so that the public cloud re-identifies it, and finally sends the object recognition class label returned by the public cloud, together with the object recognition position, to the semantic annotation module. The two cases are: 1) if the confidence score in the recognition result is lower than the threshold, the Faster R-CNN-based object recognition module is considered insufficiently confident about this object, i.e. the object has probably been recognized incorrectly; 2) if the object recognition class label is "other object class", the category of the object has probably not been trained and the Faster R-CNN-based object recognition module cannot assign it to a specific category, i.e. it cannot identify the object's category. A sketch of this decision rule follows.
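A sketch of the collaborative recognition decision rule with the recommended threshold of 0.7; upload_to_public_cloud stands in for the picture uploading module and is a hypothetical callable.

    CONFIDENCE_THRESHOLD = 0.7

    def decide(label, confidence, object_picture, upload_to_public_cloud):
        # Accept the private-cloud result only for a specific category with a
        # sufficiently high confidence score.
        if label != "other object class" and confidence >= CONFIDENCE_THRESHOLD:
            return label
        # Otherwise the object may have been recognized incorrectly or not at
        # all: re-identify it on the public cloud.
        return upload_to_public_cloud(object_picture)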
The picture uploading module receives from the collaborative recognition decision module the object pictures judged to be recognized incorrectly or not recognizable, forwards these object pictures to the public cloud node for re-identification, and sends the object recognition class labels obtained after re-identification by the public cloud node back to the collaborative recognition decision module.
The semantic annotation module receives the SLAM map from the SLAM map-building module, receives the pose information and depth information from the perception data receiving module, receives the object recognition position and object recognition class label from the collaborative recognition decision module, and stores them in a "label-position" table. Using the "label-position" table, the semantic annotation module marks each object recognition class label at the corresponding position of the SLAM map, completing the construction of the semantic map, and finally publishes the semantic map to the robot compute node. A semantic map is a SLAM map to which object recognition positions and object recognition class labels have been added, so that both the robot and people can understand the distribution of objects in the scene; it is the basis on which the robot realizes autonomous behaviour.
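A minimal sketch of the "label-position" table and of assembling the resulting semantic map, assuming the labels are kept as (label, (x, y)) pairs in SLAM map coordinates; the data structure is illustrative.

    label_position_table = []   # e.g. [("blue and white robot", (2.10, 1.41))]

    def annotate(label, map_xy):
        # Store one entry of the "label-position" table.
        label_position_table.append((label, map_xy))

    def build_semantic_map(slam_map):
        # The semantic map is the SLAM geometric map plus the accumulated labels.
        return {"geometry": slam_map, "labels": list(label_position_table)}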
The public cloud node uses CloudSight, an open cloud service based on Internet big data, to recognize the object pictures received from the picture uploading module and return the recognition label of the object in each picture. In 2014 the CloudSight company released the object recognition cloud service CloudSight (https://www.cloudsightapi.com). CloudSight provides an open API: the user uploads a picture and CloudSight returns the object recognition class label or a description of the object. CloudSight provides a POST method and a GET method. The user first needs an API KEY; the API KEY is a string password or digital certificate that grants access to the application programming interface. The POST method is how the client uploads data to the CloudSight cloud service; the GET method is how the client retrieves data from CloudSight: with GET the client obtains the token used for identity and security authentication and then obtains the recognition class label of the object picture.
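The POST/GET flow described above could look roughly like the sketch below, written with the Python requests library; the endpoint paths follow the text, but the authorization header format and the request/response field names (remote_image_url, token, name, status) are assumptions about the CloudSight API rather than facts stated in the patent.

    import time
    import requests

    API_KEY = "YOUR_CLOUDSIGHT_API_KEY"   # obtained when applying for the API

    def identify(image_url):
        # Step 1: POST an image request (here by remote URL) and read the token.
        post = requests.post(
            "https://api.cloudsightapi.com/image_requests",
            headers={"Authorization": "CloudSight %s" % API_KEY},
            data={"image_request[remote_image_url]": image_url,
                  "image_request[locale]": "en-US"},
        )
        token = post.json().get("token")
        # Step 2: GET the response by token until the recognition label is ready.
        while True:
            resp = requests.get(
                "https://api.cloudsightapi.com/image_responses/%s" % token,
                headers={"Authorization": "CloudSight %s" % API_KEY},
            )
            body = resp.json()
            if body.get("status") != "not completed":
                return body.get("name")   # recognition class label or description
            time.sleep(1)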
In the second step, the private cloud node subscribes to the ROS messages related to perception data; the robot compute node perceives the environment, publishes the ROS messages related to perception data and subscribes to the ROS messages related to the semantic map. The concrete steps are as follows:
2.1 Through the ROS publish/subscribe mechanism, the perception data receiving module of the private cloud node subscribes, from the robot compute node, to the pose information on the topic /tf, the mileage information on the topic /odom, the depth information on the topic /camera/depth/points and the colour environment pictures on the topic /camera/rgb/image_raw.
2.2 The Faster R-CNN-based object recognition module trains the Faster R-CNN model with the PASCAL VOC 2007 object recognition data set and with the "other object class" composed of a number of pictures containing various objects downloaded at random from the Internet.
2.3 The collaborative recognition decision module of the private cloud node receives the confidence threshold from the keyboard.
2.4 The ROS message subscriber on the robot compute node subscribes to the messages on the /sem_map topic.
2.5 The robot compute node moves, perceives the environment through its hardware sensors, accelerometer and gyroscope, and publishes the data through the ROS mechanism (a minimal sketch of such an acquisition node follows this list). The concrete steps are as follows:
2.5.1 The perception data acquisition module obtains data from the distance sensor, produces the mileage information and publishes it on the /odom topic;
2.5.2 The perception data acquisition module obtains the distance moved by the robot, the acceleration at each moment and the gyroscope angle from the distance sensor, the accelerometer and the gyroscope. It obtains the robot's initial position and the elapsed movement time and, using the tf library of ROS, computes from this information the robot's current position coordinates (position) and orientation (orientation) on the SLAM map, producing the pose information, which it publishes on the /tf topic;
2.5.3 The perception data acquisition module obtains data from the vision sensor. The vision sensor data consist of the depth of each pixel in the image from the vision sensor and its RGB (Red Green Blue) colour value, the latter making up the colour environment picture that was taken. From the vision sensor data the perception data acquisition module produces the depth information of the robot from the objects in front of it and the colour environment picture, and publishes them on the /camera/depth/points and /camera/rgb/image_raw topics respectively.
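A minimal sketch of the perception data acquisition node is given below; the message types (nav_msgs/Odometry, sensor_msgs/Image, sensor_msgs/PointCloud2 and a tf broadcast) follow common ROS conventions, and the sensor reads are stubbed out, since the patent does not specify them.

    import rospy
    import tf
    from nav_msgs.msg import Odometry
    from sensor_msgs.msg import Image, PointCloud2

    def read_odometry():
        return Odometry()        # stub: would wrap the distance sensor / wheel encoders

    def read_rgb_image():
        return Image()           # stub: would wrap the Kinect RGB frame

    def read_depth_cloud():
        return PointCloud2()     # stub: would wrap the Kinect depth point cloud

    def acquisition_node():
        rospy.init_node("perception_data_acquisition")
        odom_pub = rospy.Publisher("/odom", Odometry, queue_size=10)
        rgb_pub = rospy.Publisher("/camera/rgb/image_raw", Image, queue_size=10)
        depth_pub = rospy.Publisher("/camera/depth/points", PointCloud2, queue_size=10)
        broadcaster = tf.TransformBroadcaster()
        rate = rospy.Rate(10)
        while not rospy.is_shutdown():
            odom_pub.publish(read_odometry())
            rgb_pub.publish(read_rgb_image())
            depth_pub.publish(read_depth_cloud())
            # Pose published as a tf transform from the odom frame to the robot base.
            broadcaster.sendTransform((0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 1.0),
                                      rospy.Time.now(), "base_link", "odom")
            rate.sleep()

    if __name__ == "__main__":
        acquisition_node()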
In the third step, the perception data receiving module on the private cloud node obtains the pose information and mileage information of the robot compute node and sends them to the SLAM map-building module, which builds the SLAM map. The concrete steps are as follows:
3.1 The perception data receiving module on the private cloud node receives the messages on the /tf and /odom topics through the ROS message mechanism, obtains the pose information and mileage information of the robot compute node, sends them to the SLAM map-building module, and sends the pose information of the robot compute node to the semantic annotation module.
3.2 The SLAM map-building module of the private cloud node receives the pose information and mileage information of the robot compute node, draws the geometric map of the environment, i.e. the SLAM map, with the Simultaneous Localization and Mapping algorithm, and sends the SLAM map to the semantic annotation module.
In the fourth step, the perception data receiving module on the private cloud node obtains the colour environment picture of the robot compute node and sends it to the Faster R-CNN-based object recognition module, which performs object recognition and sends the recognition result to the semantic annotation module. The concrete steps are as follows:
4.1 The perception data receiving module receives the messages on the /camera/rgb/image_raw topic through the ROS message mechanism, obtains the colour environment picture currently taken by the robot node, and sends it to the Faster R-CNN-based object recognition module.
4.2 The Faster R-CNN-based object recognition module receives the colour environment picture of the robot compute node and performs object recognition, as follows:
4.2.1 The Faster R-CNN-based object recognition module receives the colour environment picture from the perception data receiving module.
4.2.2 The Faster R-CNN-based object recognition module uses the Faster R-CNN model to carry out picture feature extraction and object segmentation, computes the object recognition position from the object position, crops the object picture out of the colour environment picture, and sends the object recognition position, object picture, object recognition class label and recognition confidence score to the collaborative recognition decision module. The concrete steps are as follows:
4.2.2.1 The Faster R-CNN model extracts features from the picture inside the sliding window and judges whether it is an object. If it is, the Faster R-CNN model performs object segmentation, returns the object position in the colour environment picture, the object recognition class label and the corresponding recognition confidence score, and step 4.2.2.2 is performed; otherwise step 5.5 is performed.
4.2.2.2 The Faster R-CNN-based object recognition module obtains the object position in the colour environment picture, the object recognition class label and the corresponding recognition confidence score returned by the Faster R-CNN model, computes the object recognition position from the object position and crops the object picture out of the colour environment picture. It then sends the object recognition position, object picture, object recognition class label and recognition confidence score to the collaborative recognition decision module.
4.2.3 The collaborative recognition decision module judges whether the object recognition is correct. If it is correct, the collaborative recognition decision module sends the result to the semantic annotation module and the fifth step is performed; otherwise it sends the incorrectly recognized object to the picture uploading module, which uploads it to the public cloud node. The concrete steps are as follows:
4.2.3.1 The collaborative recognition decision module receives the object picture, object recognition position, object recognition class label and recognition confidence score from the Faster R-CNN-based object recognition module and judges them. If the object recognition class label is not "other object class" and the confidence score is greater than or equal to the confidence threshold, the module judges the recognition correct and performs step 4.2.3.4.
4.2.3.2 If the object recognition class label is "other object class" or the recognition confidence score is lower than the confidence threshold, the collaborative recognition decision module judges the recognition incorrect and sends the object picture to the picture uploading module.
4.2.3.3 The picture uploading module re-identifies the object picture with the public cloud CloudSight and sends the object recognition class label to the collaborative recognition decision module. The concrete steps are as follows:
4.2.3.3.1 The picture uploading module applies for a CloudSight API KEY.
4.2.3.3.2 The picture uploading module uses the HTTP POST method to upload the URL (resource descriptor) of the object picture together with the API KEY to https://api.cloudsightapi.com/image_requests and sends the request.
4.2.3.3.3 The picture uploading module obtains the token used for identity and security authentication with the HTTP GET method.
4.2.3.3.4 The picture uploading module obtains the recognition class label of the object picture by accessing https://api.cloudsightapi.com/image_responses/[token] and sends the object recognition class label to the collaborative recognition decision module.
4.2.3.4 After receiving the object recognition class label, the collaborative recognition decision module sends the object recognition class label and the object recognition position to the semantic annotation module.
In the fifth step, after the semantic annotation module on the private cloud node has received the depth information, pose information, object recognition class label and object recognition position of the robot compute node, it computes the position coordinates of the object on the SLAM map and marks the object recognition class label at the corresponding position of the SLAM map, completing the construction of the semantic map. The concrete steps are as follows:
5.1 The perception data receiving module receives the messages on the /camera/depth/points topic through the ROS message mechanism, obtains the depth information of the robot node, and sends it to the semantic annotation module.
5.2 The semantic annotation module receives the pose information of the robot node from the perception data receiving module, including the robot's position coordinates (x0, y0) (in a Cartesian coordinate system) and the orientation γ representing its 3D posture. Using the rbx1 package of ROS, the robot orientation γ is converted into the corresponding rotation angle β.
5.3 The semantic annotation module receives the depth information of the robot compute node from the perception data receiving module, receives the object recognition class label and object recognition position (x2, y2) from the collaborative recognition decision module, and computes the viewing angle α from the robot node to the object and the depth d.
The viewing angle is obtained from the object's recognition position and the camera's detection angle range, using the proportional relations of trigonometric functions. Let b be half of the horizontal pixel width of the picture; then the horizontal pixel distance a from the object center to the picture center is

a = |x2 - b|

If the horizontal detection angle range of the vision sensor is θ (i.e. θ/2 on each side of the sensor axis), the viewing angle α from the robot to the object center is

α = arctan( tan(θ/2) · a / b )

By combining the angle from the robot node to the object, the distance from the robot node to the object is obtained from trigonometric relations. If the depth value obtained by the robot node is D, the horizontal depth d from the robot node to the object is

d = D / cos α
5.4 The semantic annotation module combines the object recognition position in the environment image, the viewing angle and depth from the robot node to the object, and the position coordinates and rotation angle of the robot node, and computes the object's position coordinates on the SLAM map with trigonometric relations, marking the object class label at the corresponding position of the SLAM map. The method of computing the object's position on the SLAM map with trigonometric relations is:

Let the robot's position coordinates be (x0, y0), its rotation angle be β, the viewing angle be α and the horizontal depth be d; then the position coordinates (x1, y1) of the object on the SLAM map are

for an object on the left side of the camera:
x1 = x0 + d · cos(β + α), y1 = y0 + d · sin(β + α)

for an object on the right side of the camera:
x1 = x0 + d · cos(β - α), y1 = y0 + d · sin(β - α)
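A sketch of the trigonometric calculations of steps 5.3 and 5.4, with angles in degrees; the left/right formulas follow the reconstruction above and agree with the worked example in the embodiment (robot at (0.21, 0.18), beta = 50.02, alpha = 17.03, d = 2.25, object on the right of the camera, giving (2.10, 1.41)).

    import math

    def viewing_angle(x2, b, theta_deg):
        # Horizontal angle between the camera axis and the object center, where b
        # is half the horizontal pixel width and theta_deg is the sensor's
        # horizontal detection angle range.
        a = abs(x2 - b)
        return math.degrees(math.atan(math.tan(math.radians(theta_deg / 2.0)) * a / b))

    def horizontal_depth(D, alpha_deg):
        # Distance from the robot to the object, given the sensor depth value D.
        return D / math.cos(math.radians(alpha_deg))

    def object_map_position(x0, y0, beta_deg, alpha_deg, d, object_on_left):
        # Position of the object on the SLAM map from the robot position, its
        # rotation angle beta, the viewing angle alpha and the depth d.
        angle = beta_deg + alpha_deg if object_on_left else beta_deg - alpha_deg
        return (x0 + d * math.cos(math.radians(angle)),
                y0 + d * math.sin(math.radians(angle)))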
5.5 The Faster R-CNN-based object recognition module judges whether the sliding window has reached the lower right corner. If it has, the colour environment picture has been fully recognized and step 5.6 is performed; otherwise the sliding window is moved to the next position and step 4.2.2.1 is performed.
5.6 The semantic annotation module publishes the semantic map as a ROS message on the /sem_map topic.
In the sixth step, the robot compute node receives the ROS messages related to the semantic map through the ROS message mechanism. The concrete steps are as follows:
6.1 The ROS message subscriber receives, through the ROS message mechanism, the messages published by the semantic annotation module of the private cloud node on the /sem_map topic, and obtains the semantic map.
6.2 The robot judges whether it has received the instruction "the traversal of the whole environment has been completed" from the user. If it has, the seventh step is performed; if not, step 2.5 is performed.
In the seventh step, the process ends.
The following beneficial effects can be achieved with the present invention:
(1) The private cloud builds the geometric environment map collaboratively with the robot, using the powerful computing capability of the cloud and relieving the robot's local computational load.
(2) Object recognition in the environment is carried out with the hybrid cloud architecture, which extends the robot's capability with the cloud while minimizing request response time. Thanks to the collaborative recognition decision module, objects known to the private cloud are recognized on the private cloud, whose resources are controllable and whose request response time is short; objects that the private cloud may have recognized incorrectly are screened out and uploaded to the public cloud for re-identification, which labels them using knowledge rooted in Internet big data, extending the robot's recognition capability and improving the accuracy of object recognition, so that an appropriate balance is reached between improving object recognition accuracy and shortening recognition time.
(3) The mapping of "object label - map position" is realized by data fusion, completing the construction of the semantic map. Data of different spatial dimensions are fused and calculated, the object labels are marked at the corresponding positions of the map, the semantic map is built, and the robot's understanding of the environment is made visible.
Brief description of the drawings
Fig. 1 is the overall logical structure diagram of the robot hybrid cloud environment built in the first step of the present invention.
Fig. 2 is the software deployment diagram on the robot hybrid cloud environment built in the first step of the present invention.
Fig. 3 is the overall flow chart of the present invention.
Fig. 4 is the flow chart of the robot compute node, the private cloud node and the public cloud node collaboratively building the semantic map in the fourth and fifth steps of the present invention.
Fig. 5 is a schematic diagram of the calculation of the viewing angle and depth from the robot node to the object in step 5.3 of the present invention.
Fig. 6 is a schematic diagram of the calculation of the object's position on the SLAM map in step 5.4 of the present invention. Fig. 6(a) is the calculation for an object on the left side of the camera, and Fig. 6(b) is the calculation for an object on the right side of the camera.
Embodiment
Fig. 1 shows the robot hybrid cloud environment built in the first step of the present invention, consisting of a robot compute node, a private cloud node and a public cloud node. The robot compute node is a robot hardware device capable of running software programs (such as an unmanned aerial vehicle, an unmanned vehicle or a humanoid robot). The private cloud node is a computing device with good computing capability and controllable resources that can run computation-intensive or knowledge-intensive robot applications. The public cloud node is a computing device with abundant storage resources that can provide services externally. The robot compute node and the private cloud node are interconnected through network equipment, and the private cloud accesses the public cloud through the Internet.
Fig. 2 shows the software deployment on the robot compute node and the private cloud node of the present invention. The robot compute node is a robot hardware device that can run software programs and move in the environment, carrying sensors such as a camera and a laser radar. Both the robot compute node and the private cloud node are equipped with the operating system Ubuntu and the robot middleware ROS. In addition, the robot compute node is equipped with a perception data acquisition module, and the private cloud node is equipped with a perception data receiving module, a SLAM map-building module, a Faster R-CNN-based object recognition module, a collaborative recognition decision module, a picture uploading module and a semantic annotation module. The public cloud uses the CloudSight object recognition cloud service.
The embodiment of the present invention is illustrated below by taking the wheeled robot TurtleBot building a semantic map as an example. The TurtleBot of this example is fitted with the Microsoft vision sensor Kinect, which can capture environment images and measure distances in real time. The private cloud is deployed on a server. Because both the robot and the private cloud have limited knowledge, the robot's recognition capability is extended with Internet big data through the public cloud, so that the hybrid cloud can minimize request response time while extending object recognition capability.
The embodiment of the present invention, shown in Fig. 3, is as follows:
In the first step, the robot hybrid cloud system is built. It consists of a TurtleBot wheeled robot, a server and the big-data-based open cloud service CloudSight, interconnected through network equipment. The TurtleBot wheeled robot is equipped with the operating system Ubuntu 14.04, the ROS Indigo robot middleware and the perception data acquisition module. The server is equipped with the operating system Ubuntu, the robot middleware ROS, the perception data receiving module, the SLAM map-building module, the Faster R-CNN-based object recognition module, the collaborative recognition decision module, the picture uploading module and the semantic annotation module. The public cloud uses the CloudSight object recognition cloud service (https://www.cloudsightapi.com).
In the second step, the private cloud node subscribes to the ROS messages related to perception data; the TurtleBot perceives the environment, publishes the ROS messages related to perception data and subscribes to the ROS messages related to the semantic map.
2.1 Through the ROS publish/subscribe mechanism, the perception data receiving module of the private cloud node subscribes, from the robot compute node, to the pose information on the topic /tf, the mileage information on the topic /odom, the depth information on the topic /camera/depth/points and the colour environment pictures on the topic /camera/rgb/image_raw.
2.2 The Faster R-CNN-based object recognition module trains the Faster R-CNN model with the PASCAL VOC 2007 object recognition data set and with the "other object class" composed of a number of pictures containing various objects downloaded at random from the Internet.
2.3 The collaborative recognition decision module of the private cloud node receives the confidence threshold 0.7 from the keyboard.
2.4 The ROS message subscriber on the TurtleBot subscribes to the messages on the /sem_map topic.
2.5 The TurtleBot wheeled robot moves, perceives the environment through its hardware sensors, accelerometer and gyroscope, and publishes the data through the ROS mechanism. The concrete steps are as follows:
2.5.1 The perception data acquisition module obtains data from the distance sensor, produces the mileage information and publishes it on the /odom topic;
2.5.2 The perception data acquisition module obtains the distance moved by the TurtleBot, the acceleration at each moment and the gyroscope angle from the distance sensor, the accelerometer and the gyroscope. It obtains the TurtleBot's initial position and the elapsed travel time and, using the tf library of ROS, computes from this information the robot's current position coordinates and orientation on the SLAM map, producing the pose information, which it publishes on the /tf topic;
2.5.3 The perception data acquisition module produces the depth information of the robot from the objects in front of it and the colour environment picture from the Kinect vision sensor data, and publishes them on the /camera/depth/points and /camera/rgb/image_raw topics respectively.
In the third step, the perception data receiving module on the private cloud node obtains the pose information and mileage information of the TurtleBot wheeled robot and sends them to the SLAM map-building module, which builds the SLAM map. The concrete steps are as follows:
3.1 The perception data receiving module on the private cloud node receives the messages on the /tf and /odom topics through the ROS message mechanism, obtains the pose information and mileage information of the robot compute node, sends them to the SLAM map-building module, and sends the pose information of the robot compute node to the semantic annotation module.
3.2 The SLAM map-building module of the private cloud node receives the pose information and mileage information of the robot compute node, draws the geometric map of the environment, i.e. the SLAM map, with the Simultaneous Localization and Mapping algorithm, and sends the SLAM map to the semantic annotation module.
In the fourth step, the perception data receiving module on the private cloud node obtains the colour environment picture of the robot compute node and sends it to the Faster R-CNN-based object recognition module, which performs object recognition and sends the recognition result to the semantic annotation module. The concrete steps are as follows:
4.1 The perception data receiving module receives the messages on the /camera/rgb/image_raw topic through the ROS message mechanism, obtains the colour environment picture currently taken by the robot node, and sends it to the Faster R-CNN-based object recognition module.
4.2 The Faster R-CNN-based object recognition module receives the colour environment picture of the robot compute node and performs object recognition, as follows:
4.2.1 The Faster R-CNN-based object recognition module receives the colour environment picture from the perception data receiving module. The picture contains a robot and a chair.
4.2.2 The Faster R-CNN-based object recognition module uses the Faster R-CNN model to carry out picture feature extraction and object segmentation, computes the object recognition position from the object position, crops the object picture out of the colour environment picture, and sends the object recognition position, object picture, object recognition class label and recognition confidence score to the collaborative recognition decision module. The concrete steps are as follows:
4.2.2.1 The Faster R-CNN model extracts features from the picture inside the sliding window and judges whether it is an object. The judgement is that the robot in the colour environment picture is an object, and the Faster R-CNN model returns the object position in the colour environment picture, the object recognition class label and the corresponding recognition confidence score. The Faster R-CNN model sets the object recognition class label of the robot to "person", with a recognition confidence score of 0.551.
4.2.2.2 The Faster R-CNN-based object recognition module obtains the object position in the colour environment picture, the object recognition class label and the corresponding recognition confidence score returned by the Faster R-CNN model, computes the object recognition position from the object position and crops the object picture out of the colour environment picture. It then sends the object recognition position, object picture, object recognition class label and recognition confidence score to the collaborative recognition decision module.
4.2.3 The collaborative recognition decision module judges whether the object recognition is correct. If it is correct, the collaborative recognition decision module sends the result to the semantic annotation module and the fifth step is performed; otherwise it sends the incorrectly recognized object to the picture uploading module, which uploads it to the public cloud node. The concrete steps are as follows:
4.2.3.1 The collaborative recognition decision module receives the object picture, object recognition position, object recognition class label and recognition confidence score of the robot from the Faster R-CNN-based object recognition module and judges them. If the object recognition class label is not "other object class" and the confidence score is greater than or equal to the confidence threshold, the module judges the recognition correct, sends the object recognition position and object recognition class label to the semantic annotation module, and performs the fifth step.
4.2.3.2 Because the recognition confidence score of the robot, 0.551, is lower than the threshold 0.7, the collaborative recognition decision module judges this recognition incorrect and sends the object picture to the picture uploading module.
4.2.3.3 The picture uploading module re-identifies the object picture with the public cloud CloudSight and sends the object recognition class label to the collaborative recognition decision module. The concrete steps are as follows:
4.2.3.3.1 The picture uploading module applies for a CloudSight API KEY.
4.2.3.3.2 The picture uploading module uses the HTTP POST method to upload the URL (resource descriptor) of the object picture together with the API KEY to https://api.cloudsightapi.com/image_requests and sends the request.
4.2.3.3.3 The picture uploading module obtains the token used for identity and security authentication with the HTTP GET method.
4.2.3.3.4 The picture uploading module obtains the recognition class label of the object picture, "blue and white robot", by accessing https://api.cloudsightapi.com/image_responses/[token], and sends the object recognition class label to the collaborative recognition decision module.
4.2.3.4 After receiving the object recognition class label, the collaborative recognition decision module sends the object recognition class label and the object recognition position to the semantic annotation module.
In the fifth step, after the semantic annotation module on the private cloud node has received the depth information, pose information, object recognition class label and object recognition position of the robot compute node, it computes the position coordinates of the object on the SLAM map and marks the object recognition class label at the corresponding position of the SLAM map, completing the construction of the semantic map. The concrete steps are as follows:
5.1 The perception data receiving module receives the messages on the /camera/depth/points topic through the ROS message mechanism, obtains the depth information of the robot node, and sends it to the semantic annotation module.
5.2 The semantic annotation module receives the pose information of the robot node from the perception data receiving module, including the robot's position coordinates (0.21, 0.18) and the orientation [0.8128, 0.3430, 0.4073, 0.2362]^T representing its 3D posture. Using the rbx1 package of ROS, the robot orientation [0.8128, 0.3430, 0.4073, 0.2362]^T is converted into the corresponding rotation angle 50.02°.
5.3 The semantic annotation module receives the depth information of the robot compute node from the perception data receiving module and the object recognition class label and object recognition position (391, 105) from the collaborative recognition decision module, and computes the viewing angle and depth from the robot node to the object.
As shown in Figure 5, half of the horizontal pixel width of the colored environment picture is 250, and the horizontal pixel distance from the object center to the picture center is

a = |391 - 250| = 141

The horizontal detection range of the Kinect vision sensor is 57 degrees (i.e., 28.5 degrees to each side of the sensor center), so the visual angle from the robot to the object center is

α = tan⁻¹(tan(28.5°)·141/250) = 17.03°

Let d be the depth from the robot node to the object. The depth information obtained by the robot node is 2.15 meters and the visual angle from the robot node to the object is 17.03°, so the depth from the robot to the object is

d = 2.15/cos(17.03°) = 2.25 (meters)
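A short sketch of this visual-angle and depth computation (step 5.3, claims 9 and 10), reproducing the worked numbers above; the function and parameter names are illustrative.

    # alpha = arctan(tan(theta/2) * a / b), d = D / cos(alpha)
    import math

    def object_angle_deg(x2, b, theta_deg=57.0):
        a = abs(x2 - b)                              # horizontal pixels from the picture center
        return math.degrees(math.atan(math.tan(math.radians(theta_deg / 2.0)) * a / b))

    def object_depth(D, alpha_deg):
        return D / math.cos(math.radians(alpha_deg))

    alpha = object_angle_deg(391, 250)               # about 17.03 degrees, as in the text
    d = object_depth(2.15, alpha)                    # about 2.25 meters
    print(round(alpha, 2), round(d, 2))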
5.4 The semantic annotation module combines the object identification position in the environment picture, the visual angle and depth from the robot node to the object, and the position coordinates and rotation angle of the robot node, computes the position coordinates of the object on the SLAM map using trigonometric relations, and annotates the object result class label at the corresponding position of the SLAM map. The steps for computing the position of the object on the SLAM map using trigonometric relations are:
From steps 5.2 and 5.3 above, the visual angle α from the robot to the object is 17.03°, the depth d from the robot to the object is 2.25 meters, the rotation angle β of the robot is 50.02°, and the position coordinates of the robot are (0.21, 0.18). The object is located on the right side of the camera, as shown in Fig. 6(b), so its position coordinates (x1, y1) on the map are

x1 = x0 + d·cos(β - α) = 0.21 + 2.25·cos(50.02° - 17.03°) = 2.10
y1 = y0 + d·sin(β - α) = 0.18 + 2.25·sin(50.02° - 17.03°) = 1.41
After the computation, the object result class label (blue and white robot) is annotated at the position with SLAM map coordinates (2.10, 1.41).
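A sketch of this trigonometric computation (step 5.4, claim 11), reproducing the worked coordinates; function and parameter names are illustrative.

    # Map position of an object from the robot position, rotation angle beta, visual angle alpha and depth d.
    import math

    def object_map_position(x0, y0, beta_deg, alpha_deg, d, side="right"):
        angle_deg = beta_deg + alpha_deg if side == "left" else beta_deg - alpha_deg
        angle = math.radians(angle_deg)
        return x0 + d * math.cos(angle), y0 + d * math.sin(angle)

    # Worked example from the text: object on the right side of the camera -> about (2.10, 1.41)
    x1, y1 = object_map_position(0.21, 0.18, 50.02, 17.03, 2.25, side="right")
    print(round(x1, 2), round(y1, 2))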
5.5 The Faster R-CNN model judges whether the sliding window has reached the lower right corner. Since it has not yet reached the lower right corner, i.e., object identification in the colored environment picture is not yet complete, the Faster R-CNN model moves the sliding window to the next position, performs step 4.2.2.1 and determines whether there is an object; if there is no object, it keeps moving the sliding window until an object lies within the sliding window. It then continues with step 4.2.2.2, recognizes the next object, a chair, and annotates it on the SLAM map. If the sliding window has slid to the lower right corner, the identification of the objects in the colored environment picture (the blue and white robot and the chair) is complete, and step 5.6 is performed.
5.6 The semantic annotation module publishes the semantic map as a ROS message on the /sem_map topic.
Step six: the robot computing node receives the semantic-map-related ROS messages via the ROS message mechanism. The specific steps are as follows:
6.1 The ROS message subscriber on the TurtleBot wheeled robot receives, via the ROS message mechanism, the messages published by the semantic annotation module of the private cloud node on the /sem_map topic, and obtains the semantic map.
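A minimal rospy sketch of this subscription; the patent only names the /sem_map topic, so the message type nav_msgs/OccupancyGrid and the callback name are assumptions.

    # Subscribe to the semantic map published by the semantic annotation module (step 6.1).
    import rospy
    from nav_msgs.msg import OccupancyGrid   # assumed message type for the semantic map

    def on_semantic_map(msg):
        rospy.loginfo("semantic map received: %d x %d cells", msg.info.width, msg.info.height)

    rospy.init_node("semantic_map_subscriber")
    rospy.Subscriber("/sem_map", OccupancyGrid, on_semantic_map)
    rospy.spin()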
6.2 The robot determines whether it has received the instruction "traversal of the whole environment completed" from the user. If it has, go to step seven; otherwise perform step 2.5: the TurtleBot keeps moving and publishing odometry messages, pose messages, depth messages and colored environment pictures; the hybrid cloud node keeps receiving the perception data, and the Faster R-CNN-based object identification, collaborative recognition decision, semantic annotation and other modules keep receiving messages, identifying objects and annotating the semantic map, continuously enriching the semantic map.
Step seven: terminate.

Claims (11)

1. A semantic map construction method based on a cloud robot hybrid cloud architecture, characterized by comprising the following steps:
Step one: build the robot hybrid cloud environment, which consists of a robot computing node, a private cloud node and a public cloud node; the robot computing node is a robot hardware device capable of running software programs, the private cloud node is a computing device capable of running compute-intensive or knowledge-intensive robot applications, and the public cloud node is a computing device with abundant storage resources that can provide services externally; the robot computing node and the private cloud node are interconnected through network equipment, and the private cloud node accesses the public cloud node via the Internet;
The robot computing node is equipped, in addition to the operating system Ubuntu, the robot middleware ROS and a ROS message subscriber, with a perception data acquisition module; the perception data acquisition module collects the robot's pose information, odometry information, depth information and the colored environment pictures taken by the camera, and publishes these data via the ROS publish/subscribe mechanism to the perception data receiving module on the private cloud node;
The private cloud node is equipped, in addition to the operating system Ubuntu and the robot middleware ROS, with a perception data receiving module, a SLAM building module, a Faster R-CNN-based object identification module, a collaborative recognition decision module, a picture uploading module and a semantic annotation module;
The perception data receiving module subscribes to the pose information, odometry information, depth information and colored environment pictures from the perception data acquisition module via the ROS publish/subscribe mechanism; after receiving the pose information, odometry information, depth information and colored environment pictures, it forwards the pose information to the SLAM building module and the semantic annotation module, sends the odometry information to the SLAM building module, sends the colored environment pictures to the Faster R-CNN-based object identification module, and sends the depth information to the semantic annotation module;
The SLAM building module uses the robot pose information and odometry information received from the perception data receiving module to draw the environment SLAM map in real time in an unfamiliar environment with no prior knowledge at all, and sends the SLAM map to the semantic annotation module; the SLAM map is the two-dimensional geometric map drawn with the SLAM algorithm;
The Faster R-CNN-based object identification module uses the colored environment pictures received from the perception data receiving module and identifies objects based on the Faster R-CNN model, obtaining the position of each object in the colored environment picture, the object picture, the object identification class label and the object confidence score; the Faster R-CNN-based object identification module trains the Faster R-CNN model; it applies the Faster R-CNN model to the colored environment picture transmitted from the perception data receiving module to perform object identification, and outputs the object identification position, object picture, object identification class label and object confidence score of each object in the colored environment picture, where the object identification class label and object confidence score are obtained after the Faster R-CNN model performs object identification; the Faster R-CNN-based object identification module sends the identification results, namely the object identification class labels, confidence scores, object identification positions and object pictures, to the collaborative recognition decision module;
The collaborative recognition decision module judges, according to the confidence threshold, whether the identification result received from the Faster R-CNN-based object identification module is correct, and decides whether the picture needs to be uploaded to the public cloud for re-identification; if the object identification class label in the identification result is not "other object types" and the score is greater than or equal to the confidence threshold, the collaborative recognition decision module considers the object identification correct and sends the object identification position and object identification class label to the semantic annotation module; when the object identification is wrong or the object cannot be identified, the collaborative recognition decision module sends the object picture to the picture uploading module for re-identification by the public cloud, and finally sends the object identification class label and object identification position returned by the public cloud to the semantic annotation module;
The picture uploading module receives from the collaborative recognition decision module the object pictures judged to be wrongly identified or unidentifiable, forwards these object pictures to the public cloud node for re-identification, and sends the object identification class labels obtained after re-identification by the public cloud node to the collaborative recognition decision module;
The semantic annotation module receives the SLAM map from the SLAM building module, receives the pose information and depth information from the perception data receiving module, and receives the object identification positions and object identification class labels from the collaborative recognition decision module, saving them in a "label-position" table; according to the "label-position" table, the semantic annotation module annotates the object identification class labels at the corresponding positions of the SLAM map, completing the construction of the semantic map, and finally distributes the semantic map to the robot computing node;
The public cloud node uses CloudSight, an open cloud service based on Internet big data, to identify the object pictures received from the picture uploading module and provides the identification labels of the objects in the pictures; CloudSight provides a POST method and a GET method; the POST method is the method by which the client uploads data to the cloud service CloudSight, and the GET method is the method by which the client obtains data from the cloud service CloudSight;
Step two: the private cloud node subscribes to the perception-data-related ROS messages; the robot computing node perceives the environment, publishes the perception-data-related ROS messages and subscribes to the semantic-map-related ROS messages, with the following specific steps:
2.1 Through the ROS-based publish/subscribe mechanism, the perception data receiving module of the private cloud node subscribes from the robot computing node to the pose information on the topic named /tf, the odometry information on the topic named /odom, the depth information on the topic named /camera/depth/points, and the colored environment pictures on /camera/rgb/image_raw;
2.2 The Faster R-CNN-based object identification module trains the Faster R-CNN model;
2.3 The collaborative recognition decision module of the private cloud node receives the confidence threshold from the keyboard;
2.4 The ROS message subscriber on the robot computing node subscribes to the messages on the /sem_map topic;
2.5 The robot computing node moves, perceives the environment through the hardware sensors, accelerometer and gyroscope to obtain data, and publishes the data through the ROS mechanism, with the following specific steps:
2.5.1 The perception data acquisition module obtains data from the range sensor and publishes the resulting odometry information on the /odom topic;
2.5.2 The perception data acquisition module obtains the distance moved by the robot, the acceleration at each moment and the gyroscope angle from the range sensor, accelerometer and gyroscope; the perception data acquisition module obtains the initial position of the robot and computes the travel time, uses the tf library in ROS to compute from the above information the robot's current position coordinates and orientation on the SLAM map to generate the pose information, and publishes it on the /tf topic;
2.5.3 The perception data acquisition module obtains data from the vision sensor; the vision sensor data comprise the depth of each pixel in the image from the vision sensor and the color information, i.e., the RGB color values, the latter appearing as the colored environment picture taken; from the vision sensor data the perception data acquisition module produces the depth information of the robot relative to the objects in front of it and the colored environment picture, and publishes them on the /camera/depth/points and /camera/rgb/image_raw topics respectively;
Step three: the perception data receiving module on the private cloud node obtains the pose information and odometry information of the robot computing node and sends them to the SLAM building module, and the SLAM building module builds the SLAM map, with the following specific steps:
3.1 The perception data receiving module on the private cloud node receives the messages on the /tf and /odom topics via the ROS message mechanism, obtains the pose information and odometry information of the robot computing node, sends the pose information and odometry information of the robot computing node to the SLAM building module, and sends the pose information of the robot computing node to the semantic annotation module;
3.2 The SLAM building module of the private cloud node receives the pose information and odometry information of the robot computing node, draws the geometric map of the environment, i.e., the SLAM map, using the simultaneous localization and mapping (SLAM) algorithm, and sends the SLAM map to the semantic annotation module;
Step four: the perception data receiving module on the private cloud node obtains the colored environment pictures of the robot computing node and sends them to the Faster R-CNN-based object identification module, and the Faster R-CNN-based object identification module performs object identification and then sends the identification results to the semantic annotation module, with the following specific steps:
4.1 The perception data receiving module receives the messages on the /camera/rgb/image_raw topic via the ROS message mechanism, obtains the colored environment picture currently taken by the robot node, and sends it to the Faster R-CNN-based object identification module;
4.2 The Faster R-CNN-based object identification module receives the colored environment picture of the robot computing node and performs object identification, with the following specific steps:
4.2.1 The Faster R-CNN-based object identification module receives the colored environment picture from the perception data receiving module;
4.2.2 The Faster R-CNN-based object identification module performs picture feature extraction and object segmentation using the Faster R-CNN model, computes the object identification position from the object position and crops the object picture from the colored environment picture, and sends the object identification position, object picture, object identification result class label and recognition confidence score to the collaborative recognition decision module, with the following specific steps:
4.2.2.1 The Faster R-CNN model performs feature extraction on the picture in the sliding window and determines whether there is an object; if so, the Faster R-CNN model performs object segmentation, returns the object position in the colored environment picture, the object identification result class label and the corresponding recognition confidence score, and step 4.2.2.2 is performed; otherwise step 5.5 is performed;
4.2.2.2 The Faster R-CNN-based object identification module obtains the object position in the colored environment picture, the object identification result class label and the corresponding recognition confidence score returned by the Faster R-CNN model, computes the object identification position from the object position and crops the object picture from the colored environment picture, and sends the object identification position, object picture, object identification result class label and recognition confidence score to the collaborative recognition decision module;
4.2.3 The collaborative recognition decision module judges whether the object identification is correct; if it is correct, the collaborative recognition decision module sends the result to the semantic annotation module and step five is performed; otherwise the collaborative recognition decision module sends the wrongly identified object to the picture uploading module, and the picture uploading module further uploads it to the public cloud node, with the following specific steps:
4.2.3.1 The collaborative recognition decision module receives the object picture, object identification position, object identification class label and recognition confidence score from the Faster R-CNN-based object identification module and evaluates them; if the object identification class label is not "other object types" and the confidence score is greater than or equal to the confidence threshold, the collaborative recognition decision module judges the object identification to be correct and performs step 4.2.3.4;
4.2.3.2 If the object identification class label is "other object types" or the recognition confidence score is less than the confidence threshold, the collaborative recognition decision module judges the object identification to be wrong and sends the object picture to the picture uploading module;
4.2.3.3 The picture uploading module uses the public cloud CloudSight to re-identify the object picture and sends the object identification class label to the collaborative recognition decision module;
4.2.3.4 After receiving the object identification result class label, the collaborative recognition decision module sends the object identification class label and object identification position to the semantic annotation module;
Step five: the semantic annotation module on the private cloud node receives the depth information and pose information of the robot computing node, the object identification class label and the object identification position, computes the position coordinates of the object on the SLAM map, and annotates the object identification class label at the corresponding position of the SLAM map, completing the construction of the semantic map, with the following specific steps:
5.1 The perception data receiving module receives the messages on the /camera/depth/points topic via the ROS message mechanism, obtains the depth information of the robot node, and sends the depth information of the robot node to the semantic annotation module;
5.2 The semantic annotation module receives the pose information of the robot node from the perception data receiving module, including the robot position coordinates (x0, y0) and the orientation γ representing the 3D pose, and converts the robot orientation γ into the corresponding rotation angle β using the rbx1 package in ROS;
5.3 The semantic annotation module receives the depth information of the robot computing node from the perception data receiving module and the object identification class label and object identification position (x2, y2) from the collaborative recognition decision module, and computes the visual angle α and depth d from the robot node to the object;
5.4 The semantic annotation module combines the object identification position in the environment picture, the visual angle and depth from the robot node to the object, and the position coordinates and rotation angle of the robot node, computes the position coordinates of the object on the SLAM map using trigonometric relations, and annotates the object result class label at the corresponding position of the SLAM map;
5.5 The Faster R-CNN-based object identification module judges whether the sliding window has reached the lower right corner; if so, i.e., identification of the colored environment picture is complete, step 5.6 is performed; otherwise the sliding window is moved to the next position and step 4.2.2.1 is performed;
5.6 The semantic annotation module publishes the semantic map as a ROS message on the /sem_map topic;
Step six: the robot computing node receives the semantic-map-related ROS messages via the ROS message mechanism, with the following specific steps:
6.1 The ROS message subscriber receives, via the ROS message mechanism, the messages published by the semantic annotation module of the private cloud node on the /sem_map topic, and obtains the semantic map;
6.2 The robot determines whether it has received the instruction "traversal of the whole environment completed" sent by the user; if it has, go to step seven; if not, perform step 2.5;
Step seven: terminate.
2. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterized in that the object position describes the position of the object in the environment picture by the top-left corner coordinates of the object and the length and width of the object; the object identification class labels comprise specific category labels and the "other object types" label covering objects outside the specific categories; "other object types" stands for "all other objects", covering a wide variety of other kinds of objects that can only be determined to be objects but whose features cannot be attributed to any specific object type; the confidence score is computed by the Softmax classifier of Faster R-CNN and characterizes the reliability of the Faster R-CNN recognition result; the recognition confidence score lies between 0 and 1, and the larger the score, the more reliable the recognition result; the object identification position describes the position of the object in the environment picture by the pixel coordinates of the object bounding box, taking the coordinates of the center point of the object bounding box as the object identification position coordinates; the object picture is the picture cropped according to the object position output by the Faster R-CNN model; the semantic map is the SLAM map to which the object identification positions and object identification class labels have been added.
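The records implied by this claim can be sketched as small data structures: a detection produced by the Faster R-CNN-based module and an entry of the "label-position" table kept by the semantic annotation module. The class and field names below are illustrative assumptions, not terms of the patent.

    from dataclasses import dataclass

    @dataclass
    class Detection:
        x: int          # top-left corner of the object bounding box (pixels)
        y: int
        width: int      # length and width of the object in the picture
        height: int
        label: str      # object identification class label, possibly "other object types"
        score: float    # recognition confidence score in [0, 1]

        def identification_position(self):
            # Center of the bounding box, used as the object identification position.
            return (self.x + self.width // 2, self.y + self.height // 2)

    @dataclass
    class LabelPositionEntry:
        label: str      # object identification class label
        map_xy: tuple   # (x1, y1) position of the object on the SLAM map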
3. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterized in that the Faster R-CNN-based object identification module trains the Faster R-CNN model with the PASCAL VOC2007 data set from the object recognition field and several pictures containing various objects downloaded at random from the Internet.
4. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 3, characterized in that the several pictures containing various objects downloaded at random from the Internet are object pictures from the photo-sharing website www.flickr.com.
5. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterized in that the selectable range of the confidence threshold is 0 to 1, and when selecting the threshold the experimental step size is set to 0.1, i.e., the optimal confidence threshold setting is selected from [0, 0.1, 0.2, ..., 1].
6. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 5, characterized in that the confidence threshold is set to 0.7.
7. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterized in that the cases of wrong object identification or failure to identify include: 1) the confidence score in the object identification result is less than the threshold, which indicates that the Faster R-CNN-based object identification module identified the object incorrectly; 2) the object identification class label is "other object types", which indicates that the Faster R-CNN-based object identification module cannot identify the category to which the object belongs.
8. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterized in that in step 4.2.3.3 the method by which the picture uploading module uses the public cloud CloudSight to re-identify the object picture is:
4.2.3.3.1 The picture uploading module applies for a CloudSight API KEY;
4.2.3.3.2 Using the HTTP POST method, the picture uploading module uploads the URL, i.e., the resource descriptor, of the object picture together with the applied-for API KEY to https://api.cloudsightapi.com/image_requests and sends the request;
4.2.3.3.3 The picture uploading module obtains the token used for identity and security authentication via the HTTP GET method;
4.2.3.3.4 The picture uploading module obtains the recognition result class label of the object picture by accessing https://api.cloudsightapi.com/image_responses/[token].
9. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterized in that in step 5.3 the method by which the semantic annotation module computes the visual angle α from the robot node to the object center is:
α = tan⁻¹(tan(θ/2)·a/b)
where θ is the horizontal detection angular range of the vision sensor, i.e., the detection range is θ/2 to the left and to the right of the sensor center; b is half of the horizontal pixel width of the picture; and a is the horizontal pixel distance from the object center to the picture center, a = |x2 - b|.
10. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterized in that in step 5.3 the method by which the semantic annotation module computes the depth d from the robot node to the object is:
d = D/cos α
where D is the depth information obtained by the robot node; the depth information represents the distance from the object identification position to the camera plane.
11. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterized in that in step 5.4 the method by which the semantic annotation module computes the position of the object on the SLAM map using trigonometric relations is: computing the position of the object on the SLAM map means computing the position coordinates (x1, y1) of the object on the SLAM map,
For an object on the left side of the camera:
x1 = x0 + d·cos(α + β)
y1 = y0 + d·sin(α + β)
For an object on the right side of the camera:
x1 = x0 + d·cos(β - α)
y1 = y0 + d·sin(β - α)
where β is the rotation angle of the robot and (x0, y0) are the position coordinates of the robot.
