CN107066507B - Semantic map construction method based on a cloud-robot hybrid cloud architecture - Google Patents

Semantic map construction method based on a cloud-robot hybrid cloud architecture

Info

Publication number
CN107066507B
CN107066507B · CN201710019515.XA · CN201710019515A
Authority
CN
China
Prior art keywords
module
robot
picture
object identification
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710019515.XA
Other languages
Chinese (zh)
Other versions
CN107066507A (en)
Inventor
王怀民
丁博
刘惠
李艺颖
史佩昌
车慧敏
胡奔
包慧
彭维崑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201710019515.XA priority Critical patent/CN107066507B/en
Publication of CN107066507A publication Critical patent/CN107066507A/en
Application granted granted Critical
Publication of CN107066507B publication Critical patent/CN107066507B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Evolutionary Biology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Remote Sensing (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a semantic map construction method based on a cloud-robot hybrid cloud architecture, whose objective is to reach an appropriate balance between improving the accuracy of object recognition and shortening recognition time. The technical solution is to build a hybrid cloud composed of a robot, a private cloud node and a public cloud node. Based on the ROS message mechanism, the private cloud node obtains the environment pictures taken by the robot together with its odometry and pose data, and uses SLAM to draw a geometric map of the environment in real time from the odometry and pose data. The private cloud node performs object recognition on the environment pictures, and objects that may have been recognized incorrectly are uploaded to the public cloud node for recognition. The private cloud node maps the object category labels returned by the public cloud node onto the SLAM map and marks each label at the corresponding position on the map, completing the construction of the semantic map. The invention reduces the local computational load of the robot, minimizes request response time and improves the accuracy of object recognition.

Description

Semantic map construction method based on a cloud-robot hybrid cloud architecture
Technical field
The present invention relates to the field of distributed computing for robots, and in particular to a method for cloud-robot semantic map construction that uses cloud computing as a reinforcing support and is realized by building a hybrid cloud architecture.
Background technique
The perception data sources of a robot may cover multiple dimensions such as vision, force, touch, infrared, ultrasound and lidar. Robot semantic map construction means that the robot, based on these perception data, recognizes and understands its surrounding environment; the core concern is how to separate the valuable from the worthless in the perception data, eliminate the false and retain the true, and then synthesize and analyze the data to extract the high-level semantic information used in autonomous robot decision making (such as object names and their positions), which is embodied as labels attached to the objects in the environment on a geometric map. Semantic information can be acquired through key technologies such as radio frequency identification, auditory technology and vision technology; at present most research concentrates on the latter, and the present invention is also based on vision technology. The vision-based semantic map construction process of a robot is both computation-intensive and knowledge-intensive. It can be divided into the following two links executed in parallel. (1) Constructing the geometric map, generally by means of sensors carried by the robot such as an RGB-D (Red Green Blue-Depth) camera or lidar, using the simultaneous localization and mapping algorithm SLAM (Simultaneous Localization and Mapping). The SLAM algorithm was first proposed by Smith, Self and Cheeseman in 1988, with the related ideas published in the paper "Estimating uncertain spatial relationships in robotics" (Smith R, Self M, Cheeseman P. Estimating uncertain spatial relationships in robotics[M]//Autonomous Robot Vehicles. Springer New York, 1990: 167-193). Such algorithms usually use statistical filtering or image matching techniques and solve by many iterations, which is a typical computation-intensive task. (2) Adding object label information. The image perceived by the robot in real time may contain multiple objects, so the image must first be segmented and then the objects in the segmented image recognized by machine learning methods. This process likewise involves a large amount of image computation and is a typical computation-intensive task. At the same time, recognition accuracy depends heavily on the knowledge preset in the robot; this knowledge usually exists in the form of models trained by machine learning, so it is also a knowledge-intensive task. If the openness and dynamics of the environment cannot be predicted accurately, the accuracy of the robot's semantic map faces a severe challenge.
Cloud robotics uses back-end infrastructure such as cloud computing and big data to improve a robot's ability to carry out tasks in complex environments, and can therefore bring new solution ideas to computation- and knowledge-intensive tasks such as robot semantic map construction: (1) performing map construction with the powerful computing resources of the cloud, offloading local computation; (2) more importantly, using the rich knowledge in the cloud to break through the limitation of the robot's local knowledge, so that the robot extends its knowledge based on the wisdom of the cloud and gains better recognition capability for objects in the environment. The clouds a robot can use include private clouds (Private Cloud) and public clouds (Public Cloud). A private cloud is a cloud built for the exclusive use of one customer and can therefore provide the most effective control over data, security and quality of service. A public cloud usually refers to a cloud that a third-party provider makes available to customers, generally accessed over the Internet, low in cost and in some cases free; the core attribute of a public cloud is shared resource services.
At present, the mainstream approach to cloud-robot semantic map construction usually uses a private cloud: an object knowledge base is provided for the robot in the private cloud, and the robot uploads pictures to the private cloud to obtain object recognition class labels, thereby overcoming the limitations of the robot's own computing and knowledge resources, improving operating efficiency and shortening request response time. However, the accuracy of such private-cloud object recognition, whether based on matching object instances or on object recognition algorithms, is still relatively low, the knowledge must still be preset in advance, and unfamiliar objects cannot be recognized. That is, although the resources of a private cloud are controllable and request response time can be shortened, its knowledge remains limited.
If open object recognition cloud services built on Internet big data, such as CloudSight, are applied to the object recognition link of semantic map construction, the robot can use the rich knowledge derived from Internet big data to improve recognition accuracy in complex open environments. CloudSight provides an open API: the user uploads a picture and CloudSight returns an object label or a description of the object. Based on its knowledge of massive Internet image data, CloudSight can currently recognize more than forty million kinds of items, and it achieves good recognition results even for items photographed under poor lighting conditions and angles. However, such public Internet cloud services follow a "best effort" model: their performance is uncontrollable and their request response times are long, which is unacceptable for many robot applications, especially when they interact directly with the physical world. That is, although a public cloud has the rich knowledge of Internet big data, its resources are uncontrollable and its request response time is long.
In conclusion, the private-cloud-based scheme performs well in request response time and can recognize familiar objects, but it needs to be trained in advance, its knowledge is limited, and it cannot recognize unfamiliar objects in open environments; the public-cloud-based scheme uses Internet big data knowledge, has broad intelligence for recognizing objects in the environment and can recognize unfamiliar objects without prior training, but its recognition delay is large and its request response time is long. Based on the above analysis, the technical problem that a semantic map construction method needs to solve is how, when building a robot semantic map, the robot can accurately recognize unfamiliar objects in an open environment while shortening the recognition delay as much as possible.
Summary of the invention
The technical problem to be solved by the present invention is to provide a semantic map construction method based on a cloud-robot hybrid cloud architecture, in which a hybrid cloud combining a public cloud and a private cloud allows the robot to use the cloud to extend its recognition capability and to reach an appropriate balance between improving object recognition accuracy and shortening recognition time. A hybrid cloud refers to a combination of a public cloud and a private cloud that integrates the advantages of both and achieves good coordination between the two.
The technical scheme of the present invention is to build a hybrid cloud environment composed of a robot, a private cloud node and a public cloud node. Based on the message mechanism of ROS (Robot Operating System), the private cloud node obtains the environment picture data captured by the robot together with the robot's odometry and pose data, and uses the simultaneous localization and mapping algorithm SLAM to draw a geometric map of the environment in real time from the odometry and pose data. The private cloud node then performs object recognition on the environment pictures captured by the robot using an object recognition module based on Faster R-CNN (Faster Region-based Convolutional Neural Network), and objects that may have been recognized incorrectly are uploaded to the public cloud node for recognition. Finally, the private cloud node maps the object category labels returned by the public cloud node onto the SLAM map and marks each object category label at the corresponding position on the map, completing the construction of the semantic map.
The present invention includes the following steps:
In the first step, a robot hybrid cloud environment is constructed, composed of a robot computing node, a private cloud node and a public cloud node. The robot computing node is a robot hardware device capable of running software programs (such as an unmanned aerial vehicle, an unmanned ground vehicle or a humanoid robot); the private cloud node is computing equipment with good computing capability and controllable resources that can run computation-intensive or knowledge-intensive robot applications; the public cloud node is computing equipment with rich storage resources that can provide services externally. The robot computing node and the private cloud node are interconnected through network equipment, and the private cloud node accesses the public cloud node over the Internet.
Besides the operating system Ubuntu (e.g. version 14.04) and the robot middleware ROS (e.g. the Indigo version, consisting of ROS message publishers, ROS message subscribers, the ROS communication protocol, etc.), the robot computing node is also equipped with a perception data acquisition module.
In ROS, message passing uses a topic-based publish/subscribe mechanism: messages are classified by topic, the publisher of a message publishes it on a certain topic, and all subscribers that have subscribed to that topic receive the message. A ROS topic uniquely identifies one kind of message; ROS topic names consist of letters, digits and the "/" symbol.
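As an illustration of this mechanism, the following is a minimal sketch of a subscriber node, assuming Python with rospy (ROS Indigo); the topic name /camera/rgb/image_raw and the Image message type are taken from the description below, everything else is illustrative and not part of the invention.

```python
#!/usr/bin/env python
# Minimal sketch of the topic-based ROS publish/subscribe mechanism described above.
import rospy
from sensor_msgs.msg import Image

def on_image(msg):
    # Called for every color environment picture published on the subscribed topic.
    rospy.loginfo("received %dx%d environment picture" % (msg.width, msg.height))

if __name__ == "__main__":
    rospy.init_node("perception_data_receiver")            # subscriber side (private cloud node)
    rospy.Subscriber("/camera/rgb/image_raw", Image, on_image)
    rospy.spin()                                            # keep processing incoming messages
```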
ROS contains many software packages that provide auxiliary support for autonomous robot action. The tf library is a collection of software packages in ROS that provides coordinate transformation and orientation calculation functions. Given the description of the current coordinate system in a reference coordinate system, the tf library can transform the coordinates of any point in the current coordinate system into coordinates in the reference coordinate system; given the historical and current data of a gyroscope, the tf library can obtain the robot's current orientation by integration.
The rbx1 package in ROS can provide angle conversion functions. The orientation in the robot pose information is represented as a quaternion (a quaternion is a kind of higher-order complex number composed of one real unit and three imaginary units). When calculating the position of an object on the SLAM map, the robot's rotation angle is needed (that is, the angle of rotation about each axis in a certain axis order), so the quaternion representation of the orientation is converted into a rotation-angle representation. The conversion from a quaternion to a rotation-angle representation is a classical mathematical transformation; the related mathematics was first proposed by Hamilton in 1843 and is commonly used for object positioning in graphics and image science.
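A small sketch of this quaternion-to-rotation-angle conversion is shown below. Instead of the rbx1 package named above, it uses tf.transformations, which ships with the ROS Python libraries; the (x, y, z, w) quaternion ordering and the example quaternion are assumptions of this sketch.

```python
# Sketch: convert a quaternion orientation into a planar rotation angle (yaw) in degrees.
import math
from tf.transformations import euler_from_quaternion

def rotation_angle_deg(quaternion):
    """Return the yaw (rotation about the vertical axis) of an (x, y, z, w) quaternion, in degrees."""
    roll, pitch, yaw = euler_from_quaternion(quaternion)
    return math.degrees(yaw)

# Example with an arbitrary unit quaternion representing a rotation of about 45 degrees about z:
# print(rotation_angle_deg([0.0, 0.0, 0.3826834, 0.9238795]))
```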
The pose information of the robot refers to the robot's position and posture; the mileage (odometry) information refers to the distance the robot has travelled; the depth information refers to the distance from the recognized object position to the plane of the robot node's camera.
The perception data acquisition module collects the robot's pose information, mileage information, depth information and the color environment pictures taken by the camera, and uses the ROS publish/subscribe mechanism to publish these data to the perception data receiving module on the private cloud node.
Besides the operating system Ubuntu and the robot middleware ROS, the private cloud node is also equipped with a perception data receiving module, a SLAM map building module, a Faster R-CNN-based object recognition module, a collaborative recognition decision module, a picture uploading module and a semantic labeling module.
The perception data receiving module uses the ROS publish/subscribe mechanism to subscribe to the pose information, mileage information, depth information and color environment pictures from the perception data acquisition module. After receiving them, the perception data receiving module forwards the pose information to the SLAM map building module and the semantic labeling module, sends the mileage information to the SLAM map building module, sends the color environment pictures to the Faster R-CNN-based object recognition module, and sends the depth information to the semantic labeling module.
The SLAM map building module uses the robot pose information and mileage information received from the perception data receiving module to draw the environment SLAM map in real time in an unfamiliar environment with no prior knowledge, and sends the SLAM map to the semantic labeling module. The SLAM map refers to the two-dimensional geometric map drawn with the SLAM algorithm.
The Faster R-CNN-based object recognition module uses the color environment pictures received from the perception data receiving module to recognize objects with the Faster R-CNN model, obtaining, for each object in the color environment picture, the object position, the object picture, the object recognition class label and the object picture confidence score.
The Faster R-CNN model is an object recognition engine based on convolutional neural networks (CNN) proposed in 2015 by Shaoqing Ren, Kaiming He, Ross Girshick et al. (Ren S, He K, Girshick R, et al. Faster R-CNN: Towards real-time object detection with region proposal networks[C]//Advances in Neural Information Processing Systems. 2015: 91-99). The Faster R-CNN model is trained in a supervised way on a training sample set of pictures: feature extraction and object segmentation are performed on the training pictures, and the parameters of the softmax classifier of Faster R-CNN are continuously adjusted with samples of known class labels, so that the object recognition class labels output after classification match the true class labels of the training samples as often as possible, i.e. the classifier parameters are adjusted to make the classification effect as good as possible. After the Faster R-CNN model is trained, it performs object recognition on color images and outputs object positions, object recognition class labels and object confidence scores. The object position describes where the object lies in the environment picture, given by the coordinates of the top-left corner of the object together with the object's width and height. The object recognition class labels include specific category labels (such as apple, cup, etc.) and an "other objects" category label for objects outside the specific categories. "Other objects" stands for "all other objects": it covers a wide variety of object types for which the model can only determine that they are objects but cannot indicate the features of any specific object type. The confidence score is computed by the softmax classifier of Faster R-CNN and characterizes the reliability of the Faster R-CNN recognition result. The recognition confidence score lies between 0 and 1; the larger the score, the more reliable the recognition result.
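As a minimal sketch of how such a softmax classifier turns per-class scores into the confidence score used here, consider the following; the class names and score values are made-up illustrations, not actual Faster R-CNN outputs.

```python
# Softmax turns raw class scores into probabilities; the maximum probability is the confidence score.
import numpy as np

def softmax(scores):
    e = np.exp(scores - np.max(scores))   # subtract the max for numerical stability
    return e / e.sum()

class_scores = np.array([2.1, 0.3, -1.0])         # e.g. scores for [person, chair, other objects]
probs = softmax(class_scores)
label_index, confidence = int(np.argmax(probs)), float(np.max(probs))
print(label_index, round(confidence, 3))          # the confidence score lies between 0 and 1
```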
The Faster R-CNN-based object recognition module trains the Faster R-CNN model with the well-known PASCAL VOC2007 data set from the object recognition field together with a number of pictures containing various objects downloaded at random from the Internet (such as object pictures from the well-known picture and video website www.flickr.com). The full name of PASCAL VOC is Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes Challenge; the challenge is a competition that provides a standardized data set for recognizing objects in images, of which the data set of the 2007 competition is the most representative, abbreviated PASCAL VOC2007.
The Faster R-CNN-based object recognition module applies the Faster R-CNN model to the color environment pictures passed from the perception data receiving module, and outputs, for each object in the color environment picture, the object recognition position, the object picture, the object recognition class label and the object confidence score. The object recognition class label and object confidence score are obtained from the recognition performed by the Faster R-CNN model. The object recognition position refers to the pixel coordinates of the object's bounding box and describes where the object lies in the environment picture; the coordinates of the center point of the bounding box are taken as the coordinates of the object recognition position. The object picture refers to the picture cropped out according to the object position output by the Faster R-CNN model. The Faster R-CNN-based object recognition module sends the recognition results (including the object recognition class label, confidence score, object recognition position and object picture) to the collaborative recognition decision module.
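The derivation of the object recognition position (bounding-box center) and the object picture from the object position can be sketched as follows; the box format (x, y, width, height) with (x, y) the top-left corner and the (height, width, 3) array layout are assumptions of this sketch.

```python
# Sketch: compute the object recognition position and crop the object picture from the environment picture.
import numpy as np

def crop_object(environment_picture, box):
    x, y, w, h = box
    center = (x + w / 2.0, y + h / 2.0)                      # object recognition position
    object_picture = environment_picture[y:y + h, x:x + w]   # cropped object picture
    return center, object_picture

frame = np.zeros((480, 640, 3), dtype=np.uint8)              # placeholder color environment picture
center, patch = crop_object(frame, (120, 80, 200, 150))
print(center, patch.shape)                                    # (220.0, 155.0) (150, 200, 3)
```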
The collaborative recognition decision module judges, according to a confidence threshold, whether the recognition result received from the Faster R-CNN-based object recognition module is correct, and decides whether the object needs to be uploaded to the public cloud and recognized again. The confidence threshold is set by the user and is the optimal value selected experimentally on the PASCAL VOC2007 data set: the selectable range of the confidence threshold is 0 to 1, and when selecting the threshold the experimental step size is set to 0.1, i.e. the optimal setting of the confidence threshold is chosen from [0, 0.1, 0.2, ..., 1]. Experiments show that the collaborative recognition decision module improves recognition accuracy most when the confidence threshold is set to 0.7, so setting the confidence threshold to 0.7 is recommended.
If, in a recognition result, the object recognition class label is not "other objects" and the score is greater than or equal to the confidence threshold, the collaborative recognition decision module considers the recognition of that object correct and sends the object recognition position and object recognition class label to the semantic labeling module. In the following two cases of incorrect or failed recognition, the collaborative recognition decision module sends the object picture to the picture uploading module so that the object is recognized again by the public cloud, and finally the collaborative recognition decision module sends the object recognition class label returned by the public cloud together with the object recognition position to the semantic labeling module. The two cases of incorrect or failed recognition are: 1) the confidence score in the object recognition result is below the threshold, in which case the confidence of the Faster R-CNN-based object recognition module in the result for this object is considered too low, i.e. the module has recognized the object incorrectly; 2) the object recognition class label is "other objects", which indicates that the object's category may not have been trained and the Faster R-CNN-based object recognition module has difficulty assigning it to a specific category, i.e. the module cannot identify the object's category.
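The decision rule just described can be sketched as a small function; the function and field names are illustrative, only the rule itself (specific label plus sufficient confidence keeps the private-cloud result, anything else is uploaded) comes from the description.

```python
# Sketch of the collaborative recognition decision: accept the private-cloud result or upload to the public cloud.
CONFIDENCE_THRESHOLD = 0.7   # recommended value from the description

def decide(label, confidence):
    """Return 'accept' to use the private-cloud result, 'upload' to re-recognize on the public cloud."""
    if label != "other objects" and confidence >= CONFIDENCE_THRESHOLD:
        return "accept"
    return "upload"

print(decide("chair", 0.83))            # accept: specific label with sufficient confidence
print(decide("person", 0.55))           # upload: confidence below the threshold
print(decide("other objects", 0.91))    # upload: category unknown to the local model
```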
The picture uploading module receives from the collaborative recognition decision module the object pictures judged to be incorrectly recognized or unrecognizable, forwards these object pictures to the public cloud node to be recognized again, and sends the object recognition class labels obtained from the public cloud node's re-recognition back to the collaborative recognition decision module.
The semantic labeling module receives the SLAM map from the SLAM map building module, receives the pose information and depth information from the perception data receiving module, receives the object recognition positions and object recognition class labels from the collaborative recognition decision module, and stores them in a "label-position" table. According to the "label-position" table, the semantic labeling module marks each object recognition class label at the corresponding position on the SLAM map, completing the construction of the semantic map, and finally publishes the semantic map to the robot computing node. The semantic map refers to the SLAM map with object recognition positions and object recognition class labels added; it enables both the robot and people to understand the distribution of objects in the scene and is the basis on which the robot realizes autonomous behaviour.
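A minimal sketch of the "label-position" table and of marking its entries onto the map follows; representing the annotations as a dictionary alongside the SLAM grid, and the second (label, coordinates) pair, are assumptions made only for illustration.

```python
# Sketch of the "label-position" table kept by the semantic labeling module.
label_position_table = []                     # entries of (object recognition class label, map coordinates)

def record(label, map_xy):
    label_position_table.append((label, map_xy))

def annotate(slam_map_annotations):
    """Copy every recorded label onto the semantic map's annotation layer."""
    for label, (x, y) in label_position_table:
        slam_map_annotations[(round(x, 2), round(y, 2))] = label
    return slam_map_annotations

record("blue and white robot", (2.10, 1.41))  # coordinates from the embodiment below
record("chair", (1.35, 0.62))                 # made-up coordinates for a second object
print(annotate({}))
```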
The public cloud node uses CloudSight, an open cloud service based on Internet big data, to recognize the object pictures received from the picture uploading module and provides recognition labels for the objects in the pictures. In 2014 the CloudSight company released the object recognition cloud service CloudSight (https://www.cloudsightapi.com). CloudSight provides an open API: the user uploads a picture and CloudSight returns an object recognition class label or a description of the object. CloudSight provides a POST method and a GET method. The user first needs to apply for an API KEY, a character string password or digital certificate granting the right to access the application programming interface; the POST method is the method by which the client uploads data to the CloudSight cloud service; the GET method is the method by which the client obtains data from the CloudSight cloud service, with which the client can obtain from CloudSight a token used for identity and security authentication and then obtain the recognition result class label of the object picture.
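A hedged sketch of this POST/GET request flow using the Python requests library is given below. The endpoint URLs come from the description above; the header, payload and response field names ("Authorization", "remote_image_url", "token", "status", "name") are assumptions of this sketch and would need to be checked against the current CloudSight API documentation.

```python
# Sketch of the CloudSight request flow: POST the object picture's URL, then GET the result by token.
import time
import requests

API_KEY = "YOUR_CLOUDSIGHT_API_KEY"            # placeholder, obtained when applying for an API KEY

def recognize(picture_url):
    # Step 1 (POST): submit the object picture's URL for recognition.
    post = requests.post(
        "https://api.cloudsightapi.com/image_requests",
        headers={"Authorization": "CloudSight %s" % API_KEY},
        data={"image_request[remote_image_url]": picture_url},
    )
    token = post.json().get("token")
    # Step 2 (GET): poll for the recognition result class label using the token.
    for _ in range(10):
        get = requests.get(
            "https://api.cloudsightapi.com/image_responses/%s" % token,
            headers={"Authorization": "CloudSight %s" % API_KEY},
        )
        body = get.json()
        if body.get("status") != "not completed":
            return body.get("name")            # the object recognition class label
        time.sleep(1)
    return None
```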
In the second step, the private cloud node subscribes to the perception-data-related ROS messages; the robot computing node perceives the environment, publishes the perception-data-related ROS messages and subscribes to the semantic-map-related ROS messages. The specific steps are as follows:
2.1 The perception data receiving module of the private cloud node subscribes, through the publish/subscribe mechanism of ROS, to the messages published by the robot computing node: the pose information on the topic named /tf, the mileage information on the topic named /odom, the depth information on the topic named /camera/depth/points and the color environment pictures on the topic named /camera/rgb/image_raw.
2.2 The Faster R-CNN-based object recognition module trains the Faster R-CNN model with the PASCAL VOC2007 data set from the object recognition field and with the "other objects" category formed from a number of pictures containing various objects downloaded at random from the Internet.
2.3 The collaborative recognition decision module of the private cloud node receives the confidence threshold from the keyboard.
2.4 The ROS message subscriber on the robot computing node subscribes to the messages on the /sem_map topic.
2.5 The robot computing node moves, perceives the environment through its hardware sensors, accelerometer and gyroscope to obtain data, and publishes the data through the ROS mechanism. The specific steps are as follows:
2.5.1 The perception data acquisition module obtains data from the distance sensor, generates the mileage information and publishes it on the /odom topic;
2.5.2 The perception data acquisition module obtains the robot's travelled distance, the acceleration at each moment and the gyroscope angle from the distance sensor, the accelerometer and the gyroscope. The perception data acquisition module obtains the robot's initial position and computes the travel time, then uses the tf library in ROS to calculate from the above information the robot's current position coordinates (position) and orientation (orientation) on the SLAM map, generates the pose information and publishes it on the /tf topic;
2.5.3 The perception data acquisition module obtains data from the visual sensor; the visual sensor data include the depth information, i.e. the distance of each pixel in the image from the visual sensor, and the RGB (Red Green Blue) color values, the latter being the captured color environment picture. From the visual sensor data the perception data acquisition module generates the depth information of the robot with respect to the object in front of it and the color environment picture, and publishes them on the /camera/depth/points and /camera/rgb/image_raw topics respectively.
In the third step, the perception data receiving module on the private cloud node obtains the pose information and mileage information of the robot computing node and sends them to the SLAM map building module, which establishes the SLAM map. The specific steps are as follows:
3.1 The perception data receiving module on the private cloud node uses the ROS message mechanism to receive the messages on the /tf and /odom topics, obtains the pose information and mileage information of the robot computing node, sends the pose information and mileage information of the robot computing node to the SLAM map building module, and sends the pose information of the robot computing node to the semantic labeling module.
3.2 The SLAM map building module of the private cloud node receives the pose information and mileage information of the robot computing node, draws the environment geometric map, i.e. the SLAM map, using the simultaneous localization and mapping algorithm SLAM, and sends the SLAM map to the semantic labeling module.
In the fourth step, the perception data receiving module on the private cloud node obtains the color environment pictures of the robot computing node and sends them to the Faster R-CNN-based object recognition module, which performs object recognition and sends the recognition results to the semantic labeling module. The specific steps are as follows:
4.1 The perception data receiving module uses the ROS message mechanism to receive the messages on the /camera/rgb/image_raw topic, obtains the color environment picture currently captured by the robot node, and sends it to the Faster R-CNN-based object recognition module.
4.2 The Faster R-CNN-based object recognition module receives the color environment picture of the robot computing node and performs object recognition. The specific steps are as follows:
4.2.1 The Faster R-CNN-based object recognition module receives the color environment picture from the perception data receiving module.
4.2.2 The Faster R-CNN-based object recognition module performs picture feature extraction and object segmentation with the Faster R-CNN model, calculates the object recognition position from the object position, crops the object picture out of the color environment picture, and sends the object recognition position, object picture, object recognition result class label and recognition confidence score to the collaborative recognition decision module. The specific steps are as follows:
4.2.2.1 The Faster R-CNN model performs feature extraction on the picture inside the sliding window and judges whether it is an object. If so, the Faster R-CNN model performs object segmentation, returns the object position in the color environment picture, the object recognition result class label and the corresponding recognition confidence score, and step 4.2.2.2 is executed; otherwise step 5.5 is executed.
4.2.2.2 The Faster R-CNN-based object recognition module obtains the object position in the color environment picture, the object recognition result class label and the corresponding recognition confidence score returned by the Faster R-CNN model, calculates the object recognition position from the object position and crops the object picture out of the color environment picture. The Faster R-CNN-based object recognition module then sends the object recognition position, object picture, object recognition result class label and recognition confidence score to the collaborative recognition decision module.
4.2.3 The collaborative recognition decision module judges whether the object recognition is correct. If correct, the collaborative recognition decision module sends the result to the semantic labeling module and the fifth step is executed; otherwise the collaborative recognition decision module sends the incorrectly recognized object to the picture uploading module, which uploads it further to the public cloud node. The specific steps are as follows:
4.2.3.1 The collaborative recognition decision module receives the object picture, object recognition position, object recognition class label and recognition confidence score from the Faster R-CNN-based object recognition module and judges them. If the object recognition class label is not "other objects" and the confidence score is greater than or equal to the confidence threshold, the collaborative recognition decision module decides that the object recognition is correct and executes step 4.2.3.4.
4.2.3.2 If the object recognition class label is "other objects" or the recognition confidence score is below the confidence threshold, the collaborative recognition decision module decides that the object recognition is incorrect and sends the object picture to the picture uploading module.
4.2.3.3 The picture uploading module has the object picture recognized again by the public cloud CloudSight and sends the object recognition class label to the collaborative recognition decision module. The specific steps are as follows:
4.2.3.3.1 The picture uploading module applies for a CloudSight API KEY.
4.2.3.3.2 The picture uploading module uses the HTTP POST method to upload the URL address (resource descriptor) of the object picture and the API KEY it has applied for to https://api.cloudsightapi.com/image_requests, issuing the request.
4.2.3.3.3 The picture uploading module uses the HTTP GET method to obtain the token used for identity and security authentication.
4.2.3.3.4 The picture uploading module obtains the recognition result class label of the object picture by accessing https://api.cloudsightapi.com/image_responses/[token], and sends the object recognition class label to the collaborative recognition decision module.
4.2.3.4 After the collaborative recognition decision module receives the object recognition result class label, it sends the object recognition class label and the object recognition position to the semantic labeling module.
In the fifth step, after the semantic labeling module on the private cloud node receives the depth information, pose information, object recognition class label and object recognition position of the robot computing node, it calculates the position coordinates of the object on the SLAM map and marks the object recognition class label at the corresponding position of the SLAM map, completing the construction of the semantic map. The specific steps are as follows:
5.1 The perception data receiving module uses the ROS message mechanism to receive the messages on the /camera/depth/points topic, obtains the depth information of the robot node, and sends the depth information of the robot node to the semantic labeling module.
5.2 The semantic labeling module receives the pose information of the robot node from the perception data receiving module, including the robot's position coordinates (x0, y0) (in a Cartesian coordinate system) and the orientation γ representing the 3D posture. Using the rbx1 package in ROS, the robot orientation γ is converted into the corresponding rotation angle β.
5.3 The semantic labeling module receives the depth information of the robot computing node from the perception data receiving module, receives the object recognition class label and the object recognition position (x2, y2) from the collaborative recognition decision module, and calculates the robot node's visual angle α to the object and the depth d.
The visual angle is calculated by combining the object recognition position with the camera's detection angle range, according to the proportional relationships of trigonometric functions. If half of the picture's horizontal pixel width is b, then the horizontal pixel distance a from the object center to the picture center is
a = |x2 - b|
If the horizontal detection angle range of the visual sensor is θ (i.e. θ/2 on each side of the sensor's center line), the visual angle α from the robot to the object center is
α = tan⁻¹((tan(θ/2)) · a / b)
Then, by combining the robot node's angle to the object, the depth from the robot node to the object is calculated according to trigonometric relationships. If the depth information obtained by the robot node is D, the horizontal depth d from the robot node to the object is
d = D / cos α
5.4 The semantic labeling module combines the object recognition position in the environment image, the robot node's visual angle and depth to the object, and the robot node's position coordinates and rotation angle, calculates the position coordinates of the object on the SLAM map using trigonometric relationships, and marks the object result class label at the corresponding position of the SLAM map. The method of calculating the position of the object on the SLAM map using trigonometric relationships is:
If the position coordinates of the robot are (x0, y0), then the position coordinates (x1, y1) of the object on the SLAM map are
for an object on the left side of the camera:
x1 = x0 + d·cos(β + α), y1 = y0 + d·sin(β + α)
for an object on the right side of the camera:
x1 = x0 + d·cos(β - α), y1 = y0 + d·sin(β - α)
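The whole computation of steps 5.2-5.4 can be sketched in a few lines; the numbers in the usage example reproduce the worked TurtleBot/Kinect embodiment further below (θ = 57 degrees, picture half-width b = 250), and only serve as an illustration.

```python
# Sketch: from the object's pixel position, the measured depth and the robot pose,
# compute where the object lies on the SLAM map.
import math

def object_map_position(x2, b, theta_deg, D, beta_deg, x0, y0, left_of_camera):
    a = abs(x2 - b)                                           # pixel offset from the picture center
    alpha = math.degrees(math.atan(math.tan(math.radians(theta_deg / 2.0)) * a / b))
    d = D / math.cos(math.radians(alpha))                     # horizontal depth to the object
    angle = beta_deg + alpha if left_of_camera else beta_deg - alpha
    x1 = x0 + d * math.cos(math.radians(angle))
    y1 = y0 + d * math.sin(math.radians(angle))
    return round(x1, 2), round(y1, 2)

# Object at pixel x2 = 391, depth D = 2.15 m, robot at (0.21, 0.18) rotated 50.02 degrees,
# object on the right side of the camera -> about (2.10, 1.40), matching the embodiment's
# (2.10, 1.41) up to rounding of the intermediate values.
print(object_map_position(391, 250, 57, 2.15, 50.02, 0.21, 0.18, left_of_camera=False))
```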
5.5 Judge whether the sliding window has reached the lower right corner. If so, the Faster R-CNN-based object recognition module has finished recognizing the color environment picture, and step 5.6 is executed; otherwise the sliding window is moved to the next position and step 4.2.2.1 is executed.
5.6 The semantic labeling module publishes the semantic map as a ROS message on the /sem_map topic.
In the sixth step, the robot computing node receives the semantic-map-related ROS messages using the ROS message mechanism. The specific steps are as follows:
6.1 The ROS message subscriber uses the ROS message mechanism to receive the messages published by the semantic labeling module of the private cloud node on the /sem_map topic, obtaining the semantic map.
6.2 The robot determines whether it has received the user's instruction "the traversal of the entire environment is complete"; if so, go to the seventh step; if not, execute step 2.5.
In the seventh step, the process ends.
Using the present invention, the following beneficial effects can be achieved:
(1) The private cloud cooperates with the robot to construct the environment geometric map, using the powerful computing capability of the cloud and relieving the robot's local computational load.
(2) Object recognition in the environment is carried out with the hybrid cloud architecture, which extends the robot's capability with the cloud while minimizing request response time. Based on the collaborative recognition decision module, objects known to the private cloud can be recognized on the private cloud side, whose resources are controllable and whose request response time is short; objects that the private cloud may have recognized incorrectly are screened out and uploaded to the public cloud to be recognized again, where the rich knowledge of Internet big data is used to label the objects, extending the robot's recognition capability and improving the accuracy of object recognition, so that an appropriate balance is reached between improving object recognition accuracy and shortening recognition time.
(3) The mapping of "object label - map position" is realized based on data fusion, completing the construction of the semantic map. Data of different spatial dimensions are fused and calculated, and object labels are marked at the corresponding positions on the map, constructing a semantic map that shows the robot's understanding of the environment.
Brief description of the drawings
Fig. 1 is the overall logical structure diagram of the robot hybrid cloud environment constructed in the first step of the present invention.
Fig. 2 is the software deployment diagram on the robot hybrid cloud environment constructed in the first step of the present invention.
Fig. 3 is the overall flow chart of the present invention.
Fig. 4 is the flow chart of the robot computing node, private cloud node and public cloud node cooperatively establishing the semantic map in the fourth and fifth steps of the present invention.
Fig. 5 is the calculation schematic diagram of the robot node's visual angle and depth to the object in step 5.3 of the present invention.
Fig. 6 is the calculation schematic diagram of the object's position on the SLAM map in step 5.4 of the present invention. Fig. 6(a) is the calculation schematic diagram for an object on the left side of the camera; Fig. 6(b) is the calculation schematic diagram for an object on the right side of the camera.
Specific embodiment
Fig. 1 shows the robot hybrid cloud environment constructed in the first step of the present invention; it is composed of the robot computing node, the private cloud node and the public cloud node. The robot computing node is a robot hardware device capable of running software programs (such as an unmanned aerial vehicle, an unmanned ground vehicle or a humanoid robot); the private cloud node is computing equipment with good computing capability and controllable resources that can run computation-intensive or knowledge-intensive robot applications; the public cloud node is computing equipment with rich storage resources that can provide services externally. The robot computing node and the private cloud node are interconnected through network equipment, and the private cloud accesses the public cloud over the Internet.
Fig. 2 is the software deployment diagram on the robot computing node and the private cloud node of the present invention. The robot computing node is a robot hardware device that can run software programs and move in the environment, equipped with sensors such as a camera and lidar; both the robot computing node and the private cloud node are equipped with the operating system Ubuntu and the robot middleware ROS. In addition, the robot computing node is equipped with the perception data acquisition module, and the private cloud node is equipped with the perception data receiving module, the SLAM map building module, the Faster R-CNN-based object recognition module, the collaborative recognition decision module, the picture uploading module and the semantic labeling module. The public cloud uses the CloudSight object recognition cloud service.
A specific embodiment of the present invention is illustrated below with a TurtleBot wheeled robot constructing a semantic map. The TurtleBot in this example is equipped with the Microsoft visual sensor Kinect, which can capture environment images and measure distances in real time. The private cloud is deployed on a server; since both the robot and the private cloud have knowledge limitations, the public cloud is used to extend the robot's recognition capability based on Internet big data, and the hybrid cloud minimizes request response time while extending the object recognition capability.
A specific embodiment of the present invention, as shown in Fig. 3, is as follows:
In the first step, the robot hybrid cloud system is constructed; it is composed of the TurtleBot wheeled robot, the server and the big-data-based open cloud service CloudSight, the three being interconnected through network equipment. The TurtleBot wheeled robot is equipped with the operating system Ubuntu 14.04, the robot operating system middleware ROS Indigo and the perception data acquisition module. The server is equipped with the operating system Ubuntu, the robot middleware ROS, the perception data receiving module, the SLAM map building module, the Faster R-CNN-based object recognition module, the collaborative recognition decision module, the picture uploading module and the semantic labeling module. The public cloud uses the CloudSight object recognition cloud service (https://www.cloudsightapi.com).
In the second step, the private cloud node subscribes to the perception-data-related ROS messages; the TurtleBot perceives the environment, publishes the perception-data-related ROS messages and subscribes to the semantic-map-related ROS messages.
2.1 The perception data receiving module of the private cloud node subscribes, through the publish/subscribe mechanism of ROS, to the messages published by the robot computing node: the pose information on the topic named /tf, the mileage information on the topic named /odom, the depth information on the topic named /camera/depth/points and the color environment pictures on the topic named /camera/rgb/image_raw.
2.2 The Faster R-CNN-based object recognition module trains the Faster R-CNN model with the PASCAL VOC2007 data set from the object recognition field and with the "other objects" category formed from a number of pictures containing various objects downloaded at random from the Internet.
2.3 The collaborative recognition decision module of the private cloud node receives the confidence threshold 0.7 from the keyboard.
2.4 The ROS message subscriber on the TurtleBot subscribes to the messages on the /sem_map topic.
2.5 The TurtleBot wheeled robot moves, perceives the environment through its hardware sensors, accelerometer and gyroscope to obtain data, and publishes the data through the ROS mechanism. The specific steps are as follows:
2.5.1 The perception data acquisition module obtains data from the distance sensor, generates the mileage information and publishes it on the /odom topic;
2.5.2 The perception data acquisition module obtains the TurtleBot's travelled distance, the acceleration at each moment and the gyroscope angle from the distance sensor, the accelerometer and the gyroscope. The perception data acquisition module obtains the TurtleBot's initial position and computes the travel time, then uses the tf library in ROS to calculate from the above information the robot's current position coordinates and orientation on the SLAM map, generates the pose information and publishes it on the /tf topic;
2.5.3 From the Kinect visual sensor data the perception data acquisition module generates the depth information of the robot with respect to the object in front of it and the color environment picture, and publishes them on the /camera/depth/points and /camera/rgb/image_raw topics respectively.
In the third step, the perception data receiving module on the private cloud node obtains the pose information and mileage information of the TurtleBot wheeled robot and sends them to the SLAM map building module, which establishes the SLAM map. The specific steps are as follows:
3.1 The perception data receiving module on the private cloud node uses the ROS message mechanism to receive the messages on the /tf and /odom topics, obtains the pose information and mileage information of the robot computing node, sends the pose information and mileage information of the robot computing node to the SLAM map building module, and sends the pose information of the robot computing node to the semantic labeling module.
3.2 The SLAM map building module of the private cloud node receives the pose information and mileage information of the robot computing node, draws the environment geometric map, i.e. the SLAM map, using the simultaneous localization and mapping algorithm SLAM, and sends the SLAM map to the semantic labeling module.
In the fourth step, the perception data receiving module on the private cloud node obtains the color environment pictures of the robot computing node and sends them to the Faster R-CNN-based object recognition module, which performs object recognition and sends the recognition results to the semantic labeling module. The specific steps are as follows:
4.1 The perception data receiving module uses the ROS message mechanism to receive the messages on the /camera/rgb/image_raw topic, obtains the color environment picture currently captured by the robot node, and sends it to the Faster R-CNN-based object recognition module.
4.2 The Faster R-CNN-based object recognition module receives the color environment picture of the robot computing node and performs object recognition. The specific steps are as follows:
4.2.1 The Faster R-CNN-based object recognition module receives the color environment picture from the perception data receiving module. The picture contains a robot and a chair.
4.2.2 The Faster R-CNN-based object recognition module performs picture feature extraction and object segmentation with the Faster R-CNN model, calculates the object recognition position from the object position, crops the object picture out of the color environment picture, and sends the object recognition position, object picture, object recognition result class label and recognition confidence score to the collaborative recognition decision module. The specific steps are as follows:
4.2.2.1 The Faster R-CNN model performs feature extraction on the picture inside the sliding window and judges whether it is an object. Having judged that it is an object, the Faster R-CNN model returns the object position of the robot in the color environment picture, the object recognition result class label and the corresponding recognition confidence score. The Faster R-CNN model sets the object recognition result class label of the robot to "person", with a recognition confidence score of 0.551.
4.2.2.2 The Faster R-CNN-based object recognition module obtains the object position in the color environment picture, the object recognition result class label and the corresponding recognition confidence score returned by the Faster R-CNN model, calculates the object recognition position from the object position and crops the object picture out of the color environment picture. The Faster R-CNN-based object recognition module then sends the object recognition position, object picture, object recognition result class label and recognition confidence score to the collaborative recognition decision module.
4.2.3 The collaborative recognition decision module judges whether the object recognition is correct. If correct, the collaborative recognition decision module sends the result to the semantic labeling module and the fifth step is executed; otherwise the collaborative recognition decision module sends the incorrectly recognized object to the picture uploading module, which uploads it further to the public cloud node. The specific steps are as follows:
4.2.3.1 The collaborative recognition decision module receives from the Faster R-CNN-based object recognition module the object picture, object recognition position, object recognition class label and recognition confidence score of the robot and judges them. If the object recognition class label is not "other objects" and the confidence score is greater than or equal to the confidence threshold, the collaborative recognition decision module decides that the object recognition is correct, sends the object recognition position and object recognition class label to the semantic labeling module, and executes the fifth step.
4.2.3.2 Since the recognition confidence score of the robot is below the threshold 0.7, the collaborative recognition decision module decides that this object has been recognized incorrectly and sends the object picture to the picture uploading module.
4.2.3.3 The picture uploading module has the object picture recognized again by the public cloud CloudSight and sends the object recognition class label to the collaborative recognition decision module. The specific steps are as follows:
4.2.3.3.1 The picture uploading module applies for a CloudSight API KEY.
4.2.3.3.2 The picture uploading module uses the HTTP POST method to upload the URL address (resource descriptor) of the object picture and the API KEY it has applied for to https://api.cloudsightapi.com/image_requests, issuing the request.
4.2.3.3.3 The picture uploading module uses the HTTP GET method to obtain the token used for identity and security authentication.
4.2.3.3.4 The picture uploading module obtains the recognition result class label of the object picture, "blue and white robot", by accessing https://api.cloudsightapi.com/image_responses/[token], and sends the object recognition class label to the collaborative recognition decision module.
4.2.3.4 After the collaborative recognition decision module receives the object recognition result class label, it sends the object recognition class label and the object recognition position to the semantic labeling module.
In the fifth step, after the semantic labeling module on the private cloud node receives the depth information, pose information, object recognition class label and object recognition position of the robot computing node, it calculates the position coordinates of the object on the SLAM map and marks the object recognition class label at the corresponding position of the SLAM map, completing the construction of the semantic map. The specific steps are as follows:
5.1 The perception data receiving module uses the ROS message mechanism to receive the messages on the /camera/depth/points topic, obtains the depth information of the robot node, and sends the depth information of the robot node to the semantic labeling module.
5.2 The semantic labeling module receives the pose information of the robot node from the perception data receiving module, including the robot's position coordinates (0.21, 0.18) and the orientation [0.8128, 0.3430, 0.4073, 0.2362]T representing the 3D posture. Using the rbx1 package in ROS, the robot orientation [0.8128, 0.3430, 0.4073, 0.2362]T is converted into the corresponding rotation angle 50.02°.
5.3 The semantic labeling module receives the depth information of the robot computing node from the perception data receiving module, receives the object recognition class label and the object recognition position (391, 105) from the collaborative recognition decision module, and calculates the robot node's visual angle and depth to the object.
As shown in Fig. 5, half of the horizontal pixel width of the colour environment picture is 250, and the horizontal pixel distance between the object centre and the picture centre is
a = |391 − 250| = 141
The horizontal detection range of the Kinect visual sensor is 57 degrees (i.e. 28.5 degrees to each side of the sensor centre line), so the viewing angle from the robot to the object centre is
α = tan⁻¹(tan(28.5°) · 141/250) = 17.03°
Let d denote the depth from the robot node to the object. The depth information obtained by the robot node is 2.15 metres and the robot node's viewing angle to the object is 17.03°, so the depth from the robot to the object is
d = 2.15 / cos(17.03°) = 2.25 (metres)
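The same viewing-angle and depth computation, written out as a short Python sketch with the values of this example (57° is the Kinect horizontal detection range given above):

import math

half_width = 250                            # half of the picture's horizontal pixel width
x_obj = 391                                 # horizontal pixel coordinate of the object centre
a = abs(x_obj - half_width)                 # 141 pixels between object centre and picture centre
alpha = math.degrees(math.atan(math.tan(math.radians(28.5)) * a / half_width))   # ~17.03 degrees
D = 2.15                                    # depth reported by the robot node, in metres
d = D / math.cos(math.radians(alpha))       # ~2.25 metres from robot to object
print(round(alpha, 2), round(d, 2))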
5.4 The semantic labelling module combines the object identification position in the environment picture, the viewing angle and depth from the robot node to the object, and the position coordinates and rotation angle of the robot node, computes the position coordinates of the object on the SLAM map using trigonometric relations, and labels the object's result class label at the corresponding position on the SLAM map. The steps for computing the object's position on the SLAM map with trigonometric relations are:
From steps 5.2 and 5.3 above, the viewing angle α from the robot to the object is 17.03°, the depth d from the robot to the object is 2.25 metres, the rotation angle β of the robot is 50.02°, and the position coordinates of the robot are (0.21, 0.18). The object lies on the right side of the camera, so, as shown in Fig. 6(b), the position coordinates (x1, y1) of the object on the map are
x1 = x0 + d·cos(β − α) = 0.21 + 2.25·cos(50.02° − 17.03°) ≈ 2.10
y1 = y0 + d·sin(β − α) = 0.18 + 2.25·sin(50.02° − 17.03°) ≈ 1.41
After the calculation is complete, the object's result class label (blue and white robot) is labelled at the position with coordinates (2.10, 1.41) on the SLAM map.
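A short sketch of this trigonometric labelling step with the same numbers; β − α applies because the object is on the right side of the camera (an object on the left side would use β + α):

import math

x0, y0 = 0.21, 0.18                         # robot position coordinates on the SLAM map
beta, alpha, d = 50.02, 17.03, 2.25         # rotation angle and viewing angle in degrees, depth in metres

theta = math.radians(beta - alpha)          # object on the right side of the camera
x1 = x0 + d * math.cos(theta)               # ~2.10
y1 = y0 + d * math.sin(theta)               # ~1.41
print(round(x1, 2), round(y1, 2))           # map position at which the class label is written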
5.5 The Faster R-CNN model judges whether the sliding window has reached the lower-right corner. Since it has not yet reached the lower-right corner, i.e. object identification in the colour environment picture is not complete, the Faster R-CNN model moves the sliding window to the next position and executes step 4.2.2.1, judging whether the window contains an object; if not, the sliding window keeps moving until it does contain an object. Step 4.2.2.2 is then executed to identify the next object, a chair, which is labelled on the SLAM map. Once the sliding window has slid to the lower-right corner, the identification of the objects in the colour environment picture is complete and step 5.6 is executed.
5.6 The semantic labelling module publishes the semantic map as a ROS message on the /sem_map topic.
Step 6: the robot computing node receives the semantic-map-related ROS messages using the ROS message mechanism. The specific steps are as follows:
6.1 The ROS message subscriber on the TurtleBot wheeled robot uses the ROS message mechanism to receive the messages published by the semantic labelling module of the private cloud node on the /sem_map topic and obtains the semantic map.
6.2 The robot determines whether it has received the instruction "traversal of the entire environment is complete" from the user. If so, go to Step 7; otherwise execute step 2.5: the TurtleBot keeps moving and publishes odometry messages, pose messages, depth messages and colour environment pictures, while the perception data receiving, Faster R-CNN based object identification, collaborative recognition decision and semantic labelling modules on the hybrid cloud node keep receiving messages, performing object identification and labelling the semantic map, continuously enriching the semantic map.
Step 7: end.

Claims (11)

1. A semantic map construction method based on a cloud robot hybrid cloud architecture, characterised by comprising the following steps:
Step 1: construct the robot hybrid cloud environment, which consists of robot computing nodes, a private cloud node and a public cloud node; a robot computing node is a piece of robot hardware equipment capable of running software programs, the private cloud node is a computing device capable of running computation-intensive or knowledge-intensive robot applications, and the public cloud node is a computing device with abundant storage resources that can provide services externally; the robot computing nodes and the private cloud node are interconnected through network equipment, and the private cloud node accesses the public cloud node through the Internet;
In addition to the operating system Ubuntu, the robot middleware ROS and a ROS message subscriber, the robot computing node is equipped with a perception data acquisition module; the perception data acquisition module collects the robot's pose information, odometry information, depth information and the colour environment pictures taken by the camera, and publishes these data, using the ROS publish/subscribe mechanism, to the perception data receiving module on the private cloud node;
In addition to the operating system Ubuntu and the robot middleware ROS, the private cloud node is equipped with a perception data receiving module, a SLAM mapping module, a Faster R-CNN based object identification module, a collaborative recognition decision module, a picture uploading module and a semantic labelling module;
The perception data receiving module subscribes to the pose information, odometry information, depth information and colour environment pictures from the perception data acquisition module using the ROS publish/subscribe mechanism; after receiving the pose information, odometry information, depth information and colour environment pictures, it forwards the pose information to the SLAM mapping module and the semantic labelling module, sends the odometry information to the SLAM mapping module, sends the colour environment pictures to the Faster R-CNN based object identification module, and sends the depth information to the semantic labelling module;
The SLAM mapping module uses the robot pose information and odometry information received from the perception data receiving module to draw the SLAM map of the environment in real time in an unfamiliar environment without any prior knowledge, and sends the SLAM map to the semantic labelling module; the SLAM map is the two-dimensional geometric map drawn with the SLAM algorithm;
The Faster R-CNN based object identification module uses the colour environment pictures received from the perception data receiving module and identifies objects with the Faster R-CNN model, obtaining the position, object picture, object identification class label and object picture confidence score of each object in the colour environment picture; the Faster R-CNN based object identification module trains the Faster R-CNN model; for each colour environment picture transmitted from the perception data receiving module, the Faster R-CNN based object identification module performs object identification with the Faster R-CNN model and outputs, for every object in the picture, the object identification position, object picture, object identification class label and object confidence score, where the object identification class label and object confidence score are obtained by the Faster R-CNN model during object identification; the Faster R-CNN based object identification module sends the recognition result, i.e. the object identification class label, confidence score, object identification position and object picture, to the collaborative recognition decision module;
The collaborative recognition decision module judges, according to the confidence threshold, whether the recognition result received from the Faster R-CNN based object identification module is correct, and decides whether it needs to be uploaded to the public cloud and identified again; if the object identification class label in the recognition result is not "other object types" and the score is greater than or equal to the confidence threshold, the collaborative recognition decision module regards the object as correctly identified and sends the object identification position and object identification class label to the semantic labelling module; if the object is misidentified or cannot be identified, the collaborative recognition decision module sends the object picture to the picture uploading module so that the public cloud identifies it again, and finally the collaborative recognition decision module sends the object identification class label returned by the public cloud and the object identification position to the semantic labelling module;
The picture uploading module receives from the collaborative recognition decision module the object pictures judged by the collaborative recognition decision module to be misidentified or unidentifiable, forwards these object pictures to the public cloud node to be identified again, and sends the object identification class labels obtained after re-identification by the public cloud node to the collaborative recognition decision module;
The semantic labelling module receives the SLAM map from the SLAM mapping module, receives the pose information and depth information from the perception data receiving module, receives the object identification positions and object identification class labels from the collaborative recognition decision module, and saves them in a "label-position" table; according to the "label-position" table, the semantic labelling module labels the object identification class labels at the corresponding positions on the SLAM map, completes the construction of the semantic map, and finally distributes the semantic map to the robot computing nodes;
The public cloud node uses the Internet-based, big-data-backed public cloud service CloudSight to identify the object pictures received from the picture uploading module and provides the identification labels of the objects in the pictures; CloudSight provides a POST method and a GET method; the POST method is the method by which a client uploads data to the cloud service CloudSight, and the GET method is the method by which a client obtains data from the cloud service CloudSight;
Step 2: the private cloud node subscribes to the perception-data-related ROS messages; the robot computing node perceives the environment, publishes the perception-data-related ROS messages and subscribes to the semantic-map-related ROS messages; the specific steps are as follows:
2.1 The perception data receiving module of the private cloud node subscribes, through the ROS publish/subscribe mechanism, to the messages from the robot computing node on the pose information topic /tf, the odometry information topic /odom, the depth information topic /camera/depth/points and the colour environment picture topic /camera/rgb/image_raw;
2.2 The Faster R-CNN based object identification module trains the Faster R-CNN model;
2.3 The collaborative recognition decision module of the private cloud node receives the confidence threshold from the keyboard;
2.4 The ROS message subscriber on the robot computing node subscribes to the messages on the /sem_map topic;
2.5 The robot computing node moves, perceives the environment through hardware sensors, an accelerometer and a gyroscope to obtain data, and publishes the data through the ROS mechanism; the specific steps are as follows:
2.5.1 The perception data acquisition module obtains data from the range sensor, generates odometry information and publishes it on the /odom topic;
2.5.2 The perception data acquisition module obtains from the range sensor, accelerometer and gyroscope the distance the robot has moved, the acceleration at each moment and the gyroscope angle; the perception data acquisition module obtains the robot's initial position and computes the travel time, uses the tf library in ROS to calculate from this information the robot's current position coordinates and orientation on the SLAM map, generates the pose information, and publishes it on the /tf topic;
2.5.3 The perception data acquisition module obtains data from the visual sensor; the visual sensor data include, for each pixel in the image, the depth information, i.e. the distance of the pixel from the visual sensor, and the colour information, i.e. the RGB colour value, the latter appearing as the colour environment picture that is taken; from the visual sensor data the perception data acquisition module generates the depth information of the robot from the objects in front and the colour environment picture, and publishes them on the /camera/depth/points and /camera/rgb/image_raw topics respectively;
Step 3: the perception data receiving module on the private cloud node obtains the pose information and odometry information of the robot computing node and sends the pose information and odometry information of the robot computing node to the SLAM mapping module, and the SLAM mapping module builds the SLAM map; the specific steps are as follows:
3.1 The perception data receiving module on the private cloud node receives the messages on the /tf and /odom topics using the ROS message mechanism, obtains the pose information and odometry information of the robot computing node, sends the pose information and odometry information of the robot computing node to the SLAM mapping module, and sends the pose information of the robot computing node to the semantic labelling module;
3.2 The SLAM mapping module of the private cloud node receives the pose information and odometry information of the robot computing node, draws the geometric map of the environment, i.e. the SLAM map, using the simultaneous localisation and mapping algorithm SLAM, and sends the SLAM map to the semantic labelling module;
Step 4: the perception data receiving module on the private cloud node obtains the colour environment picture of the robot computing node and sends it to the Faster R-CNN based object identification module; after the Faster R-CNN based object identification module performs object identification it sends the recognition result to the collaborative recognition decision module, and the collaborative recognition decision module sends the object identification class label and object identification position in the recognition result to the semantic labelling module; the specific steps are as follows:
4.1 The perception data receiving module receives the messages on the /camera/rgb/image_raw topic using the ROS message mechanism, obtains the colour environment picture currently taken by the robot node, and sends it to the Faster R-CNN based object identification module;
4.2 The Faster R-CNN based object identification module receives the colour environment picture of the robot computing node and performs object identification; the specific steps are as follows:
4.2.1 The Faster R-CNN based object identification module receives the colour environment picture from the perception data receiving module;
4.2.2 The Faster R-CNN based object identification module performs picture feature extraction and object segmentation using the Faster R-CNN model, computes the object identification position from the object position and crops the object picture from the colour environment picture, and sends the object identification position, object picture, object identification class label and recognition confidence score to the collaborative recognition decision module; the specific steps are as follows:
4.2.2.1 The Faster R-CNN model extracts features from the picture inside the sliding window and judges whether it contains an object; if so, the Faster R-CNN model performs object segmentation, returns the object position in the colour environment picture, the object identification class label and the corresponding recognition confidence score, and step 4.2.2.2 is executed; otherwise step 5.5 is executed;
4.2.2.2 The Faster R-CNN based object identification module obtains the object position in the colour environment picture, the object identification class label and the corresponding recognition confidence score returned by the Faster R-CNN model, computes the object identification position from the object position and crops the object picture from the colour environment picture, and the Faster R-CNN based object identification module sends the object identification position, object picture, object identification class label and recognition confidence score to the collaborative recognition decision module;
4.2.3 The collaborative recognition decision module judges whether the object identification is correct; if correct, the collaborative recognition decision module sends the result to the semantic labelling module and Step 5 is executed; otherwise the collaborative recognition decision module sends the misidentified object to the picture uploading module, and the picture uploading module further uploads it to the public cloud node; the specific steps are as follows:
4.2.3.1 The collaborative recognition decision module receives the object picture, object identification position, object identification class label and recognition confidence score from the Faster R-CNN based object identification module and judges them; if the object identification class label is not "other object types" and the confidence score is greater than or equal to the confidence threshold, the collaborative recognition decision module decides that the object identification is correct and executes 4.2.3.4;
4.2.3.2 If the object identification class label is "other object types" or the recognition confidence score is less than the confidence threshold, the collaborative recognition decision module decides that the object is misidentified and sends the object picture to the picture uploading module;
4.2.3.3 The picture uploading module has the object picture identified again by the public cloud CloudSight and sends the object identification class label to the collaborative recognition decision module;
4.2.3.4 After the collaborative recognition decision module receives the object identification class label, it sends the object identification class label and the object identification position to the semantic labelling module;
Step 5: after the semantic labelling module on the private cloud node receives the depth information, pose information, object identification class label and object identification position of the robot computing node, it computes the position coordinates of the object on the SLAM map and labels the object identification class label at the corresponding position on the SLAM map, completing the construction of the semantic map; the specific steps are as follows:
5.1 The perception data receiving module receives the messages on the /camera/depth/points topic using the ROS message mechanism, obtains the depth information of the robot node, and sends the depth information of the robot node to the semantic labelling module;
5.2 The semantic labelling module receives the pose information of the robot node from the perception data receiving module, including the robot's position coordinates (x0, y0) and the orientation γ describing its 3D pose, and converts the robot orientation γ into the corresponding rotation angle β using the rbx1 package in ROS;
5.3 The semantic labelling module receives the depth information of the robot computing node from the perception data receiving module, receives the object identification class label and the object identification position (x2, y2) from the collaborative recognition decision module, and computes the viewing angle α and depth d from the robot node to the object;
5.4 The semantic labelling module combines the object identification position in the environment picture, the viewing angle and depth from the robot node to the object, and the position coordinates and rotation angle of the robot node, computes the position coordinates of the object on the SLAM map using trigonometric relations, and labels the object identification class label at the corresponding position on the SLAM map;
5.5 The Faster R-CNN based object identification module judges whether the sliding window has reached the lower-right corner; if so, identification of the colour environment picture is complete and step 5.6 is executed; otherwise the sliding window is moved to the next position and step 4.2.2.1 is executed;
5.6 The semantic labelling module publishes the semantic map as a ROS message on the /sem_map topic;
Step 6: the robot computing node receives the semantic-map-related ROS messages using the ROS message mechanism; the specific steps are as follows:
6.1 The ROS message subscriber receives, using the ROS message mechanism, the messages published by the semantic labelling module of the private cloud node on the /sem_map topic and obtains the semantic map;
6.2 The robot determines whether it has received the instruction "traversal of the entire environment is complete" issued by the user; if so, go to Step 7; if not, execute step 2.5;
Step 7: end.
2. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterised in that the object position describes the position of an object in the environment picture by the coordinates of the object's top-left corner and the object's length and width; the object identification class labels include specific category labels and the "other object types" label covering objects outside the specific categories; "other object types" stands for "all other objects" and covers a wide variety of other kinds of objects for which it can only be determined that they are objects, without indicating that they belong to any specific object type; the confidence score is computed by the Softmax classifier of Faster R-CNN and characterises the reliability of the Faster R-CNN recognition result; the recognition confidence score lies between 0 and 1, and a larger score indicates a more reliable recognition result; the object identification position describes the position of the object in the environment picture by the pixel coordinates of the object's bounding box, taking the coordinates of the centre point of the bounding box as the object identification position coordinates; the object picture is the picture cropped according to the object position output by the Faster R-CNN model; the semantic map is the SLAM map to which the object identification positions and object identification class labels have been added.
3. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterised in that the Faster R-CNN based object identification module trains the Faster R-CNN model using the PASCAL VOC2007 object recognition data set and several pictures containing various objects downloaded at random from the Internet.
4. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 3, characterised in that the several pictures containing various objects downloaded at random from the Internet are the object pictures on the website www.flickr.com.
5. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterised in that the selectable range of the confidence threshold is 0 to 1, and the threshold is selected with a test step of 0.1, i.e. the optimal confidence threshold is chosen from [0, 0.1, 0.2, …, 1].
6. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 5, characterised in that the confidence threshold is set to 0.7.
7. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterised in that the cases of misidentification or failure to identify include: 1) the confidence score in the object identification result is less than the threshold, which means that the Faster R-CNN based object identification module has misidentified the object; 2) the object identification class label is "other object types", which means that the Faster R-CNN based object identification module cannot identify the category to which the object belongs.
8. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterised in that the method by which the picture uploading module in 4.2.3.3 has the object picture identified again by the public cloud CloudSight is:
4.2.3.3.1 The picture uploading module applies for a CloudSight API KEY;
4.2.3.3.2 The picture uploading module uses the HTTP POST method to upload the URL address, i.e. the resource descriptor, of the object picture and the applied-for API KEY to https://api.cloudsightapi.com/image_requests to issue the request;
4.2.3.3.3 The picture uploading module obtains the token used for identity and security authentication using the HTTP GET method;
4.2.3.3.4 The picture uploading module obtains the recognition result class label of the object picture by accessing https://api.cloudsightapi.com/image_responses/[token].
9. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterised in that the method by which the semantic labelling module in step 5.3 computes the viewing angle α from the robot node to the object centre is:
α = tan⁻¹(tan(θ/2) · a/b)
where θ is the horizontal detection angle range of the visual sensor, i.e. the detection range extends θ/2 to each side of the sensor centre line, b is half of the picture's horizontal pixel width, and a is the horizontal pixel distance between the object centre and the picture centre, a = |x2 − b|.
10. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterised in that the method by which the semantic labelling module in step 5.3 computes the depth d from the robot node to the object is:
d = D / cos α
where D is the depth information obtained by the robot node, the depth information denoting the distance from the object identification position to the plane of the camera.
11. The semantic map construction method based on a cloud robot hybrid cloud architecture according to claim 1, characterised in that the method by which the semantic labelling module in step 5.4 computes the position of the object on the SLAM map using trigonometric relations is: computing the position of the object on the SLAM map means computing the object's position coordinates (x1, y1) on the SLAM map,
For an object on the left side of the camera: x1 = x0 + d·cos(β + α), y1 = y0 + d·sin(β + α)
For an object on the right side of the camera: x1 = x0 + d·cos(β − α), y1 = y0 + d·sin(β − α)
where β is the rotation angle of the robot and (x0, y0) are the position coordinates of the robot.
CN201710019515.XA 2017-01-10 2017-01-10 A kind of semantic map constructing method based on cloud robot mixing cloud framework Active CN107066507B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710019515.XA CN107066507B (en) 2017-01-10 2017-01-10 A kind of semantic map constructing method based on cloud robot mixing cloud framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710019515.XA CN107066507B (en) 2017-01-10 2017-01-10 A kind of semantic map constructing method based on cloud robot mixing cloud framework

Publications (2)

Publication Number Publication Date
CN107066507A CN107066507A (en) 2017-08-18
CN107066507B true CN107066507B (en) 2019-09-17

Family

ID=59597929

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710019515.XA Active CN107066507B (en) 2017-01-10 2017-01-10 A kind of semantic map constructing method based on cloud robot mixing cloud framework

Country Status (1)

Country Link
CN (1) CN107066507B (en)

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107766430B (en) * 2017-09-18 2020-12-25 北京春鸿科技有限公司 File searching method and system
US10723027B2 (en) * 2017-09-26 2020-07-28 Toyota Research Institute, Inc. Robot systems incorporating cloud services systems
CN107632606A (en) * 2017-09-27 2018-01-26 哈工大机器人(合肥)国际创新研究院 Mobile Robotics Navigation and localization method of the big region based on Slam and Tag labels
CN110869926B (en) * 2017-11-07 2024-01-19 谷歌有限责任公司 Sensor tracking and updating based on semantic state
CN108092859B (en) * 2017-11-30 2021-02-02 山东大学 Home service robot system in intelligent environment and working method
CN109905423B (en) * 2017-12-08 2022-11-08 上海仪电(集团)有限公司中央研究院 Intelligent management system
CN109974719A (en) * 2017-12-28 2019-07-05 周秦娜 A kind of control method and device of the mobile robot environment sensing based on cloud computing
CN108107897B (en) * 2018-01-11 2021-04-16 驭势科技(北京)有限公司 Real-time sensor control method and device
CN108363387B (en) * 2018-01-11 2021-04-16 驭势科技(北京)有限公司 Sensor control method and device
CN108401462A (en) * 2018-01-30 2018-08-14 深圳前海达闼云端智能科技有限公司 Information processing method and system, cloud processing device and computer program product
CN108381519A (en) * 2018-02-24 2018-08-10 上海理工大学 A kind of intelligent robot leg system
CN108445504A (en) * 2018-03-20 2018-08-24 迪比(重庆)智能科技研究院有限公司 A kind of multi-rotor unmanned aerial vehicle indoor navigation method
CN108776474B (en) * 2018-05-24 2022-03-15 中山赛伯坦智能科技有限公司 Robot embedded computing terminal integrating high-precision navigation positioning and deep learning
CN109002037B (en) * 2018-06-27 2021-03-23 中国人民解放军国防科技大学 Multi-robot collaborative path following method based on deep learning
CN109272493A (en) * 2018-08-28 2019-01-25 中国人民解放军火箭军工程大学 A kind of monocular vision odometer method based on recursive convolution neural network
CN109186606B (en) * 2018-09-07 2022-03-08 南京理工大学 Robot composition and navigation method based on SLAM and image information
WO2020053611A1 (en) 2018-09-12 2020-03-19 Toyota Motor Europe Electronic device, system and method for determining a semantic grid of an environment of a vehicle
WO2020103109A1 (en) * 2018-11-22 2020-05-28 深圳市大疆创新科技有限公司 Map generation method and device, drone and storage medium
CN109737974B (en) * 2018-12-14 2020-11-27 中国科学院深圳先进技术研究院 3D navigation semantic map updating method, device and equipment
CN109782902A (en) * 2018-12-17 2019-05-21 中国科学院深圳先进技术研究院 A kind of operation indicating method and glasses
CN109816686A (en) * 2019-01-15 2019-05-28 山东大学 Robot semanteme SLAM method, processor and robot based on object example match
CN109945871B (en) * 2019-03-15 2021-03-02 中山大学 Multi-unmanned platform synchronous positioning and map construction method under condition of limited communication bandwidth and distance
CN110166522B (en) * 2019-04-01 2021-08-24 腾讯科技(深圳)有限公司 Server identification method and device, readable storage medium and computer equipment
CN110263675B (en) * 2019-06-03 2024-02-20 武汉联一合立技术有限公司 Garbage target identification system and method of community security robot
CN110363816B (en) * 2019-06-25 2023-05-26 广东工业大学 Mobile robot environment semantic mapping method based on deep learning
CN110599533B (en) * 2019-09-20 2023-06-27 湖南大学 Quick monocular depth estimation method suitable for embedded platform
CN110703747B (en) * 2019-10-09 2021-08-03 武汉大学 Robot autonomous exploration method based on simplified generalized Voronoi diagram
CN112711249B (en) * 2019-10-24 2023-01-03 科沃斯商用机器人有限公司 Robot positioning method and device, intelligent robot and storage medium
CN110807782B (en) * 2019-10-25 2021-08-20 中山大学 Map representation system of visual robot and construction method thereof
CN113296495B (en) * 2020-02-19 2023-10-20 苏州宝时得电动工具有限公司 Path forming method and device of self-mobile equipment and automatic working system
CN111354045A (en) * 2020-03-02 2020-06-30 清华大学 Visual semantic and position sensing method and system based on infrared thermal imaging
CN111737278B (en) * 2020-08-05 2020-12-04 鹏城实验室 Method, system, equipment and storage medium for simultaneous positioning and mapping
CN113301645A (en) * 2020-10-15 2021-08-24 阿里巴巴集团控股有限公司 Signal source position obtaining and position calibration method, equipment and storage medium
US11964398B2 (en) 2021-01-28 2024-04-23 Micropharmacy Corporation Systems and methods for autonomous robot distributed processing


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105844283A (en) * 2015-01-16 2016-08-10 阿里巴巴集团控股有限公司 Method for identifying category of image, image search method and image search device
US10223638B2 (en) * 2015-06-24 2019-03-05 Baidu Online Network Technology (Beijing) Co., Ltd. Control system, method and device of intelligent robot based on artificial intelligence
CN105563484A (en) * 2015-12-08 2016-05-11 深圳前海达闼云端智能科技有限公司 Cloud robot system, robot and robot cloud platform
CN106052674A (en) * 2016-05-20 2016-10-26 青岛克路德机器人有限公司 Indoor robot SLAM method and system
CN106066179A (en) * 2016-07-27 2016-11-02 湖南晖龙股份有限公司 A kind of robot location based on ROS operating system loses method for retrieving and control system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Toward QoS-aware Cloud Robotic Applications: A Hybrid Architecture and Its Implementation; Yiying Li et al.; IEEE International Conference on Ubiquitous Intelligence and Computing; 2016-07-21; 33-40
Implementation and Optimization of a Robot SLAM Architecture Based on Cloud Computing; Tan Jiefu et al.; Software; 2015-10-15; Vol. 36, No. 10; 17-20, 25

Also Published As

Publication number Publication date
CN107066507A (en) 2017-08-18

Similar Documents

Publication Publication Date Title
CN107066507B (en) A kind of semantic map constructing method based on cloud robot mixing cloud framework
JP7408792B2 (en) Scene interaction methods and devices, electronic equipment and computer programs
EP2701818B1 (en) Generating and joining shared experience
TWI615776B (en) Method and system for creating virtual message onto a moving object and searching the same
CN110908504B (en) Augmented reality museum collaborative interaction method and system
Bahri et al. Accurate object detection system on hololens using yolo algorithm
CN108230437A (en) Scene reconstruction method and device, electronic equipment, program and medium
CN110235120A (en) System and method for the conversion between media content item
Li et al. Camera localization for augmented reality and indoor positioning: a vision-based 3D feature database approach
CN102663722A (en) Moving object segmentation using depth images
CN112070782B (en) Method, device, computer readable medium and electronic equipment for identifying scene contour
Alvey et al. Simulated photorealistic deep learning framework and workflows to accelerate computer vision and unmanned aerial vehicle research
WO2022052782A1 (en) Image processing method and related device
US20230251094A1 (en) Navigation video generation and acquisition methods and apparatuses, server, device, and medium
Maher et al. Realtime human-UAV interaction using deep learning
Zhao et al. Vivid: Augmenting vision-based indoor navigation system with edge computing
Dang et al. Perfc: An efficient 2d and 3d perception software-hardware framework for mobile cobot
Park et al. AR room: Real-time framework of camera location and interaction for augmented reality services
Wong et al. Visual gaze analysis of robotic pedestrians moving in urban space
Ishigaki et al. Real-time 3D reconstruction for mixed reality telepresence using multiple depth sensors
WO2019207875A1 (en) Information processing device, information processing method, and program
CN116469142A (en) Target positioning and identifying method, device and readable storage medium
Hu et al. Computer vision for sight: Computer vision techniques to assist visually impaired people to navigate in an indoor environment
CN112785083B (en) Arrival time estimation method and device, electronic equipment and storage medium
De Paolis et al. Augmented Reality, Virtual Reality, and Computer Graphics: 6th International Conference, AVR 2019, Santa Maria al Bagno, Italy, June 24–27, 2019, Proceedings, Part II

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant