CN115014344A - Method for positioning equipment on map, server and mobile robot - Google Patents

Method for positioning equipment on map, server and mobile robot

Info

Publication number
CN115014344A
CN115014344A (application number CN202210384292.8A)
Authority
CN
China
Prior art keywords
electronic device
map
interactive
instruction
electronic equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210384292.8A
Other languages
Chinese (zh)
Inventor
Not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Akobert Robot Co ltd
Shenzhen Akobot Robot Co ltd
Original Assignee
Shanghai Akobert Robot Co ltd
Shenzhen Akobot Robot Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Akobert Robot Co ltd, Shenzhen Akobot Robot Co ltd filed Critical Shanghai Akobert Robot Co ltd
Priority to CN202210384292.8A priority Critical patent/CN115014344A/en
Publication of CN115014344A publication Critical patent/CN115014344A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01C: MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20: Instruments for performing navigational calculations
    • G01C21/206: Instruments for performing navigational calculations specially adapted for indoor navigation
    • G01C21/26: Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00, specially adapted for navigation in a road network
    • G01C21/28: Navigation in a road network with correlation of data from several navigational instruments
    • G01C21/30: Map- or contour-matching
    • G01C21/32: Structuring or formatting of map data

Abstract

The application provides a method for positioning a device on a map, a method for cooperative operation among multiple devices, a server, a mobile robot, a first electronic device, a second electronic device, a third electronic device, and a computer-readable storage medium. The first electronic device, the second electronic device, and the third electronic device share the same map and visual positioning data set, so that interaction among the multiple devices can be realized.

Description

Method for positioning equipment on map, server and mobile robot
The present application is a divisional application filed in accordance with Article 42 of the Patent Law. The parent application has application number 201980000670.4 and an application date of May 9, 2019, and entered the Chinese national phase of the PCT international application on May 21, 2019; the corresponding PCT international application is PCT/CN2019/086282, filed on May 9, 2019; the title of the invention is: Method for positioning equipment on map, server and mobile robot.
Technical Field
The present application relates to the field of map positioning technologies, and in particular, to a method for positioning devices on a map, a method for performing cooperative operation among multiple devices, a server, a mobile robot, a first electronic device, and a second electronic device.
Background
With the development of science and technology and the improvement of living standards, intelligent household appliances have become widely used. For example, a mobile robot is a machine that performs specific work automatically: it can accept human commands, run pre-programmed routines, and act according to principles formulated with artificial intelligence technology. Mobile robots can be used indoors or outdoors, in industry or in the home; they can replace security patrols, clean floors in place of people, and also serve as family companions or assist with office work.
Using VSLAM (Visual Simultaneous Localization and Mapping) technology, a mobile robot can construct an indoor map of environments such as a shopping mall, an office, or a residence from the images captured by its camera. However, the map constructed by the mobile robot cannot be shared with other intelligent household appliances, and those appliances therefore cannot interact with the mobile robot.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present application aims to provide a method for positioning devices on a map, a method for cooperative operation among multiple devices, a server, a mobile robot, a first electronic device, a second electronic device, a third electronic device, and a computer-readable storage medium, so as to solve the problem in the prior art that multiple devices cannot be positioned on a map constructed by one of them.
To achieve the above and other related objects, a first aspect of the present application provides a method for locating a device on a map, for locating at least one first electronic device, the method comprising the steps of: acquiring an image shot by at least one camera device of the first electronic equipment arranged in a physical space; determining positioning feature information matched with the visual positioning data set in the image based on a map and the visual positioning data set which are constructed in advance and correspond to the physical space; wherein the set of visual positioning data and map are constructed by movement of at least one second electronic device within the physical space; and determining the position of the corresponding first electronic equipment on the map based on the association relation between the matched positioning characteristic information in the visual positioning data set and the coordinate information marked in the map.
The second aspect of the present application further provides a method for cooperative operation among multiple devices, where the multiple devices include a first electronic device and a second electronic device, the cooperative operation method including the following steps: acquiring multimedia data including an image captured by a camera device of the first electronic device, and identifying from the multimedia data an interactive instruction for interacting with the second electronic device; determining position information of the first electronic device and/or the second electronic device based on a preset map and the interactive instruction, wherein the map is marked with coordinate information determined by the camera device of the first electronic device and/or the camera device of the second electronic device based on the images each has captured; and sending the interactive instruction to the second electronic device so that the second electronic device executes an input operation generated based on at least one piece of determined coordinate information.
The third aspect of the present application further provides a method for cooperative operation among multiple devices, where the multiple devices include a first electronic device and a third electronic device, and the cooperative operation method includes the following steps: acquiring an interactive instruction from the third electronic equipment; the interactive instruction comprises coordinate information of first electronic equipment for executing corresponding interactive operation on a map; wherein the coordinate information is determined based on an image captured by an image capturing device of the first electronic apparatus and marked in the map; and sending the interaction instruction to the first electronic equipment corresponding to the coordinate information so that the first electronic equipment can execute the interaction operation.
The fourth aspect of the present application further provides a server, including: interface means for data communication with at least one first electronic device and at least one second electronic device; the storage device is used for storing the images shot by the first electronic equipment, the map and the visual positioning data set of the physical space where each piece of first electronic equipment is located and at least one program, wherein the images are obtained by the interface device; processing means, coupled to the storage means and the interface means, for executing the at least one program to coordinate the storage means and the interface means to perform the method according to any of the first aspects of the present application.
A fifth aspect of the present application also provides a mobile robot including: interface means for data communication with at least one first electronic device; the storage device is used for storing the images shot by the first electronic equipment, the map and the visual positioning data set of the physical space where each first electronic equipment is located and at least one program; the mobile device is used for moving in a physical space where the first electronic equipment is located; processing means, coupled to said storage means and said interface means, for executing said at least one program to coordinate said storage means and said interface means to perform the method of: acquiring an image shot by at least one camera device of the first electronic equipment arranged in the physical space; determining positioning feature information in the image that matches the set of visual positioning data based on the map and the set of visual positioning data; wherein the map and set of visual positioning data are constructed by the mobile robot moving at least once within the physical space; and determining the position of the corresponding first electronic equipment on the map based on the association relation between the matched positioning characteristic information in the visual positioning data set and the coordinate information marked in the map.
A sixth aspect of the present application further provides a server, including: interface means for communicating with a first electronic device and a second electronic device; the storage device is used for storing multimedia data containing images from the first electronic equipment, a map corresponding to a physical space where the first electronic equipment and the second electronic equipment are located and at least one program; wherein the map is marked with coordinate information determined by the camera of the first electronic equipment and/or the camera of the second electronic equipment based on the images shot respectively; processing means, coupled to the storage means and the interface means, for executing the at least one program to coordinate the storage means and the interface means to perform the method according to any of the second aspects of the present application.
A seventh aspect of the present application further provides a second electronic device provided with a camera device, including: interface means for communicating with at least one first electronic device; storage means for storing multimedia data containing images from the first electronic device, a map of the physical space in which each first electronic device is located, and at least one program, wherein the map is marked with coordinate information determined by the camera device of the first electronic device and/or the camera device of the second electronic device based on the images each has captured; and processing means, coupled to the storage means and the interface means, for executing the at least one program to coordinate the storage means and the interface means in performing the following method: identifying, from the multimedia data, an interactive instruction issued via the first electronic device; and determining position information of the first electronic device and/or the second electronic device in the map based on a preset map and the interactive instruction, and executing an input operation generated based on at least one piece of determined position information.
An eighth aspect of the present application further provides a first electronic device provided with a camera device, including: interface means for communicating with at least one second electronic device; storage means for storing multimedia data containing images captured by the camera device, a map corresponding to the physical space in which each second electronic device is located, and at least one program, wherein the map is marked with position information determined by the camera device of the first electronic device and/or the camera device of the second electronic device based on the images each has captured; and processing means, coupled to the storage means and the interface means, for executing the at least one program to coordinate the storage means and the interface means in performing the following method: identifying, from the multimedia data, an interactive instruction for interacting with the second electronic device; and determining coordinate information of the first electronic device and/or the second electronic device in the map based on a preset map and the interactive instruction, and executing an input operation generated based on at least one piece of determined coordinate information.
The ninth aspect of the present application further provides a server, including: interface means for communicating with a first electronic device and a third electronic device; the storage device is used for storing a map of a physical space where the first electronic equipment is located and at least one program; wherein the map is marked with position information determined by the camera of the first electronic equipment based on the shot image; processing means, coupled to the storage means and the interface means, for executing the at least one program to coordinate the storage means and the interface means to perform the method according to any of the third aspects of the present application.
The tenth aspect of the present application also provides a method for cooperation among multiple devices, where the multiple devices include a first electronic device and a third electronic device, the method including the steps of: generating an interactive instruction based on the input operation of the recognition user or the recognized voice data; the interactive instruction comprises coordinate information of first electronic equipment for executing corresponding interactive operation on a map; wherein the coordinate information is determined based on an image captured by an image capturing device of the first electronic apparatus and marked in the map; and sending the interaction instruction to the first electronic equipment corresponding to the coordinate information so that the first electronic equipment can execute the interaction operation.
The eleventh aspect of the present application also provides a third electronic device, comprising: interface means for communicating with a first electronic device; the storage device is used for storing a map of a physical space where the first electronic equipment is located and at least one program; wherein the map is marked with coordinate information determined by the camera of the first electronic equipment based on the shot image; processing means, coupled to said storage means and said interface means, for executing said at least one program to coordinate said storage means and said interface means to perform the method of: generating an interactive instruction based on the recognized input operation of the user or the recognized voice data; the interactive instruction comprises coordinate information of first electronic equipment for executing corresponding interactive operation on a map; wherein the coordinate information is determined based on an image captured by an image capturing device of the first electronic apparatus and marked in the map; and sending the interaction instruction to the first electronic equipment corresponding to the coordinate information so that the first electronic equipment can execute the interaction operation.
A twelfth aspect of the present application also provides a computer-readable storage medium storing a computer program for locating a device on a map, the computer program for locating a device on a map implementing the method for locating a device on a map according to the first aspect when being executed.
A thirteenth aspect of the present application also provides a computer-readable storage medium storing a computer program for cooperation between multiple devices, the computer program for cooperation between multiple devices implementing the method for cooperation between multiple devices described in the third aspect or the method for cooperation between multiple devices described in the tenth aspect when executed.
As described above, the method for positioning devices on a map, the method for cooperative operation among multiple devices, the server, the mobile robot, the first electronic device, the second electronic device, the third electronic device, and the computer-readable storage medium of the present application have the following advantages: a second electronic device equipped with a camera device moves within an indoor or outdoor physical space and constructs a map and a visual positioning data set of that physical space; the map is shared, through a server, with a first electronic device equipped with a camera device and with a third electronic device; the first electronic device can match the images it captures against the map and the visual positioning data set, thereby positioning itself on the map. Meanwhile, because the first electronic device, the second electronic device, and the third electronic device share the same map and the same visual positioning data set, interaction among multiple devices can be realized, providing a good user experience.
Drawings
Fig. 1 is a flow chart illustrating a method for locating a device on a map according to an embodiment of the present invention.
Fig. 2 is a schematic flow chart of step S11 in one embodiment of the method for locating a device on a map according to the present application.
Fig. 3 is a schematic flow chart of step S12 in one embodiment of the method for locating a device on a map according to the present application.
Fig. 4 is a flowchart illustrating a method for cooperative operation among multiple devices according to an embodiment of the present disclosure.
Fig. 5 is a schematic flow chart of an embodiment of step S20 in the method for cooperative operation among multiple devices according to the present application.
Fig. 6 is a schematic flow chart of an embodiment of step S21 in the method for cooperative operation among multiple devices according to the present application.
Fig. 7 is a flowchart illustrating a method for cooperative operation among multiple devices according to an embodiment of the present application.
Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application.
Fig. 9 is a schematic structural diagram of a mobile robot according to an embodiment of the present invention.
Fig. 10 is a schematic structural diagram of a server according to an embodiment of the present application.
Fig. 11 is a schematic structural diagram of a second electronic device according to an embodiment of the present application.
Fig. 12 is a schematic structural diagram of a first electronic device according to an embodiment of the present disclosure.
Fig. 13 is a schematic structural diagram of a server according to an embodiment of the present application.
Fig. 14 is a schematic structural diagram of a third electronic device according to an embodiment of the present application.
Detailed Description
The embodiments of the present application are described below by way of specific examples, and other advantages and effects of the present application will be readily apparent to those skilled in the art from the disclosure herein.
Although the terms first, second, etc. may be used herein to describe various elements in some instances, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first electronic device may be referred to as a second electronic device, and similarly a second electronic device may be referred to as a first electronic device, without departing from the scope of the various described embodiments. The first electronic device and the second electronic device each describe a device, but they are not the same device unless the context clearly indicates otherwise. Similar considerations apply to a first security camera, a second security camera, and the like.
Also, as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms "comprises", "comprising", "includes", and/or "including", when used in this specification, specify the presence of stated features, steps, operations, elements, components, items, species, and/or groups, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, items, species, and/or groups thereof. The terms "or" and "and/or" as used herein are to be construed as inclusive, meaning any one or any combination. Thus, "A, B or C" or "A, B and/or C" means any of the following: A; B; C; A and B; A and C; B and C; A, B and C. An exception to this definition occurs only when a combination of elements, functions, steps or operations is inherently mutually exclusive in some way.
At present, intelligent household appliances such as mobile robots and intelligent monitoring cameras are widely used. For example, an intelligent monitoring camera can be deployed in indoor environments such as banks, shopping malls, or residences so that those environments can be captured, recorded, analyzed, or monitored in real time. A mobile robot is a machine that performs specific work automatically: it can accept human commands, run pre-programmed routines, and act according to principles formulated with artificial intelligence technology. Mobile robots can be used indoors or outdoors, in industry or in the home; they can replace security patrols, clean floors in place of people, and also serve as family companions or assist with office work.
Generally, the mobile robot may be provided with at least one camera device for capturing images of its operating environment. The camera device may be arranged on the top surface of the robot (for example, at the central region of the top surface, or at the front or rear end of the top surface relative to that central region) and inclined at an angle in the range of 10° to 60° with respect to the horizontal direction, or on a side surface, or at the junction of the top surface and a side surface. Using the captured images, the robot performs VSLAM (Visual Simultaneous Localization and Mapping); according to the constructed map, the mobile robot can plan routes for work such as patrolling and cleaning. Generally, the mobile robot caches the map constructed during its operation in a local storage space, or uploads it to a separate server or cloud for storage, and other intelligent household appliances cannot interact with the mobile robot.
For example, in an office environment, an intelligent monitoring camera is installed at the doorway to capture, record, or monitor the doorway and its surroundings in real time, or to give an early warning about possible abnormal situations, such as suspicious people entering or leaving. Meanwhile, a mobile robot placed at an indoor position is used to clean the office floor. The mobile robot constructs an indoor map during its operation, uploads the indoor map to a corresponding server, and the map can be presented as a picture in the user's APP. Because the images, videos, or other data acquired by the intelligent monitoring camera are stored locally or uploaded to another, separate server, the intelligent monitoring camera cannot share the indoor map with the mobile robot and cannot interact with the mobile robot based on the indoor map. In a specific scenario, for example, the intelligent monitoring camera cannot know its own position; when it detects suspicious people entering or leaving, it can only remind the user of the situation, but cannot tell the user the indoor position where the situation occurred.
In view of the above, the present application provides a method for locating a device on a map, which is used for locating at least one first electronic device on the map.
The first electronic device is a device provided with a camera device. The camera device may be a camera, such as a dome camera, a semi-dome camera, or a bullet camera. The first electronic device may be a mobile device that can be displaced within the physical space depicted by the map, the displacement being produced under the control of the device's own intelligent control system; such mobile devices include, but are not limited to, vehicles in cruise mode, unmanned aerial vehicles, family-companion mobile robots, cleaning robots, and patrol mobile robots, which may move autonomously or passively based on a pre-constructed map, as well as devices carried on a mobile smart terminal, an in-vehicle terminal, and the like. The first electronic device may also be an electronic device that is mounted in the physical space depicted by the map and cannot be displaced in two-dimensional or three-dimensional space, for example an intelligent monitoring camera or a video recording camera. Such a device may be completely fixed in the physical space, or may rotate on the pan-tilt head on which it is mounted, thereby monitoring or recording video over a wider field of view.
The physical space is the actual space in which the navigation path along which the second electronic device performs its navigation movement is located, and at least one first electronic device is also arranged in this actual space. The physical space may be an indoor space or an outdoor space; indoor spaces include, but are not limited to, residences, offices, banks, and shopping malls, and outdoor spaces include, but are not limited to, amusement parks and parking lots. In some embodiments, the second electronic device is an unmanned aerial vehicle, the physical space in which its navigation path lies during navigation flight may be an outdoor aerial environment, and correspondingly, first electronic devices such as intelligent monitoring cameras and patrol robots are arranged in that physical space. In other embodiments, the second electronic device is an in-vehicle terminal on a vehicle; when the vehicle it navigates travels through a tunnel where positioning cannot be obtained, or over a road surface where network signals are weak and navigation is still required, the corresponding tunnel or road surface constitutes a two-dimensional plane or the like within the corresponding physical space, and correspondingly, first electronic devices such as vehicles equipped with camera devices are arranged in that physical space. In still other embodiments, the second electronic device is a sweeping robot, the physical space in which its navigation path lies is an indoor or outdoor space, and correspondingly, first electronic devices such as intelligent monitoring cameras and smart terminals are arranged in that physical space.
The displacement refers to the distance between the position of the equipment at the last moment and the position of the equipment at the next moment, the distance has a direction, and the direction is from the position of the equipment at the last moment to the position of the equipment at the next moment. The displacement is used to describe a change in position of the device, including, but not limited to, displacement of the device based on linear motion, reciprocating motion, helical motion, circular motion, and the like.
The navigation movement operation refers to a control operation process of the second electronic device performing autonomous movement according to the sensed obstacle information in the physical space and the determined current position information. The second electronic device needs to navigate by means of the built map and the visual positioning data set thereof in the navigation moving process, for example, after the sweeping robot receives an instruction of a user, the second electronic device completes work in a designated place under the navigation of the map and the visual positioning data thereof. For another example, the vehicle navigates on a road such as a tunnel where satellite positioning cannot be obtained according to the map and the visual positioning data thereof.
In some examples, during the navigation movement the second electronic device constructs a map of the corresponding physical space and its set of visual positioning data based on sensed information including visual positioning information, movement sensing information, and the positions traversed during the movement. In still other examples, the second electronic device obtains a persistently usable map and set of visual positioning data based on multiple navigation movement operations performed in the same physical space.
The map is a symbolic model, constructed according to certain mathematical rules through modeling, symbolization, and abstraction, that reflects the physical space in which the first electronic device and the second electronic device are actually located; it may also be called a graphic mathematical model, and includes, but is not limited to, a grid map, a topological map, and the like. The map includes coordinate information corresponding to the starting position of the second electronic device and coordinate information corresponding to obstacles sensed during its movement. Through the geographic information described by the map, each piece of visual positioning information in the visual positioning data set includes coordinate information of the position at which the corresponding key frame image was captured, so as to correspond to a position in the map.
The set of visual positioning data is a set of visual positioning information. The visual positioning information includes a key frame image, positioning feature information in the key frame image, coordinate information of at least part of the positioning feature information in the map, coordinate information corresponding to the key frame image, and the like; the positioning feature information includes, but is not limited to, feature points, feature lines, and the like. The positioning feature information is described, for example, by a descriptor. For example, based on the Scale-Invariant Feature Transform (SIFT), positioning feature information is extracted from a plurality of key frame images, and a gray-value sequence describing the positioning feature information is obtained from the image blocks containing the positioning feature information in those key frame images; this gray-value sequence is the descriptor. As another example, the descriptor describes the positioning feature information by encoding the brightness information around it: with the positioning feature information as the center, a number of points, for example but not limited to 256 or 512, are sampled on a circle around it, the sampled points are compared in pairs to obtain the brightness relationships between them, and these brightness relationships are converted into a binary string or another encoding format.
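As an illustrative aid (not part of the claimed subject matter), the following Python sketch shows one conventional way to obtain feature points and 256-bit binary descriptors of the kind described above, using OpenCV's ORB implementation; the file name and the feature count are assumptions made only for the example.

```python
# Hedged illustration only: one possible way to extract positioning feature
# points and binary descriptors from a key frame image, using OpenCV's ORB
# (oriented FAST keypoints plus BRIEF-style 256-bit binary descriptors).
# The file name "keyframe.png" and the feature count are illustrative assumptions.
import cv2

keyframe = cv2.imread("keyframe.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=500)              # up to 500 feature points
keypoints, descriptors = orb.detectAndCompute(keyframe, None)

# Each descriptor row is 32 bytes = 256 bits, i.e. 256 pairwise brightness
# comparisons around the feature point, matching the binary encoding described above.
print(len(keypoints), descriptors.shape if descriptors is not None else None)
```

Descriptors obtained in this way can then be stored in the visual positioning data set together with the coordinate information of the corresponding key frame image.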
It should be understood that a frame is the smallest single image unit of a video or animation, appearing as a grid or a mark on the time axis of animation software. A key frame image is analogous to an original drawing in two-dimensional animation and refers to the frame in which a key action in the motion or change of an object occurs. The camera device continuously captures images of the surroundings while the second electronic device moves, and adjacent frames are therefore highly similar. Comparing adjacent frames may not clearly reveal the course of the device's motion, whereas comparing key frame images reveals it much more distinctly.
The map and visual positioning data set constructed in the above examples are stored on a storage medium of the second electronic device or on a storage medium of a server that communicates with the second electronic device. The server includes a server cluster based on a cloud architecture (also called a cloud server) or a single server. The cloud service end includes a public cloud service end and a private cloud service end, where the public or private cloud service end provides Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), Infrastructure-as-a-Service (IaaS), and the like. The cloud service end is, for example, an Alibaba Cloud computing service platform, an Amazon cloud computing service platform, a Baidu cloud computing platform, a Tencent cloud computing platform, or the like.
Referring to fig. 1, fig. 1 is a schematic flowchart of a method for positioning a device on a map according to an embodiment of the present application (hereinafter referred to as the positioning method), which is used to position at least one first electronic device in the map. The positioning method is mainly performed by a server in cooperation with data provided by the first electronic device and the second electronic device, or performed by the second electronic device in cooperation with data provided by the first electronic device.
Here, for convenience of describing the implementation of the scheme, execution by the server is taken as an example. In fact, the second electronic device can also execute the method; the difference from execution by the server is that the second electronic device does not need to send the map and visual positioning data set it has constructed to other devices through network communication, and it directly or indirectly receives the images provided by the first electronic device to realize the positioning of the first electronic device. The positioning method comprises the following steps:
in step S10, an image captured by an image capturing device of at least one of the first electronic devices disposed in a physical space is acquired.
One or more first electronic devices can be arranged in the physical space. A first electronic device can capture images of the physical space through its camera device and then transmit the images to the server by way of network communication; after receiving an image, the server performs subsequent processing on it. The server is a cloud server, a physical server, or the like. The network communication mode may be, for example, a WLAN (Wireless Local Area Network), a cellular network, and so on.
Here, the server may preset a time interval at which the first electronic device captures images, and then obtain the still images captured by the camera device of the first electronic device at the times defined by that interval. Alternatively, the server may preset a plurality of fixed times at which the first electronic device captures images, and then acquire the still images captured by the camera device at those fixed times. Of course, in some embodiments the camera device may also record video; since a video is composed of image frames, the server may continuously or discontinuously extract image frames from the acquired video and then select one frame as the image.
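Purely as a minimal sketch of this acquisition behavior, and assuming an OpenCV-accessible camera plus a placeholder upload routine (neither of which is specified by the application), periodic capture could look like the following:

```python
# Hedged sketch: capture a frame from the first electronic device's camera at a
# preset interval and hand it to an (assumed) upload routine. The device index,
# interval, frame count, and upload function are illustrative assumptions.
import time
import cv2

def capture_frames(device_index=0, interval_s=5.0, max_frames=10, upload=print):
    cap = cv2.VideoCapture(device_index)
    try:
        for _ in range(max_frames):
            ok, frame = cap.read()        # frame is a BGR image (NumPy array)
            if ok:
                upload(frame.shape)       # e.g. send the image to the server
            time.sleep(interval_s)        # preset capture interval
    finally:
        cap.release()
```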
In step S11, positioning feature information in the image that matches the set of visual positioning data is determined based on a pre-constructed map and set of visual positioning data corresponding to the physical space.
The set of visual positioning data and the map are constructed by at least one second electronic device moving within the physical space. In some examples, based on the foregoing description, the set of visual positioning data and the map may be constructed by a single second electronic device performing one or more navigation movements in the physical space. Taking a sweeping robot as the second electronic device, during its operation at home it uses VSLAM technology to capture indoor images through its camera device and constructs a map of the interior, such as of the living room, the study, a bedroom, or the whole home. Taking an unmanned aerial vehicle as the second electronic device, the vehicle uses VSLAM technology to construct a map of the physical space. Taking a vehicle with a cruise function as the second electronic device, the vehicle can construct a map of a tunnel, according to VSLAM technology, on a tunnel road where positioning cannot be obtained or on a road surface where network signals are weak and navigation is still needed. Taking a navigation robot or guidance robot in a hotel as the second electronic device, the robot may, after receiving a semantic instruction from a customer, perform a navigation or guidance service for the customer based on VSLAM technology. In other examples, the map and the set of visual positioning data are constructed by a plurality of second electronic devices each performing navigation movement operations within the physical space. In each of the scene examples above, a plurality of second electronic devices can be arranged; each uploads the map and visual positioning data set it constructed during its own navigation movement operations to the server, and the server fuses them into a single map and visual positioning data set convenient for subsequent use. For example, the server integrates the coordinate information in maps acquired at different times into unified coordinate information in the map available for subsequent use, and integrates the visual positioning information in visual positioning data sets acquired at different times into unified visual positioning information in the visual positioning data set available for subsequent use, and so on.
The image is matched against the pre-constructed map of the physical space and the visual positioning data set, by an image matching algorithm, manual comparison, or the like, so as to determine the positioning feature information in the image that matches the visual positioning data set. Here, in some examples, the server uses the frequency-domain distribution of gray values in the key frame images as a matching index: it matches the frequency-domain distribution of the acquired image against this index to obtain a plurality of candidate key frame images, and then matches the acquired image one by one against each candidate key frame image using the positioning feature information in the visual positioning data set, thereby determining the positioning feature information in the image that matches the visual positioning data set.
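A minimal sketch of such a coarse frequency-domain matching index is given below; the image resolution, distance measure, and candidate count are assumptions for illustration, not the application's prescribed algorithm.

```python
# Hedged sketch of using a gray-value frequency-domain distribution as a coarse
# matching index: each key frame is summarized by the magnitude spectrum of its
# gray values, and the key frames whose spectra are closest to the query image's
# spectrum are kept as candidates for the finer feature-level matching.
import cv2
import numpy as np

def spectrum_signature(gray, size=(64, 64)):
    small = cv2.resize(gray, size)
    mag = np.abs(np.fft.fft2(small))
    return mag / (mag.sum() + 1e-9)               # normalized magnitude spectrum

def candidate_keyframes(query_gray, keyframe_grays, top_k=5):
    q = spectrum_signature(query_gray)
    dists = [np.abs(q - spectrum_signature(kf)).sum() for kf in keyframe_grays]
    order = np.argsort(dists)[:top_k]
    return [int(i) for i in order]                 # indices of candidate key frames
```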
In still other examples, referring to fig. 2, fig. 2 is a schematic flow chart of an embodiment of step S11 in the method for locating a device on a map according to the present application, and as shown in the drawing, step S11 further includes the following steps:
in step S111, candidate location feature information in the image is extracted.
In step S112, the location feature information of the image is selected from the candidate location feature information by image matching.
Here, the server is preconfigured with an extraction algorithm that extracts positioning feature information in the same manner as the second electronic device extracts it from key frame images, and it extracts the candidate positioning feature information in the image based on this extraction algorithm. The extraction algorithm includes, but is not limited to, algorithms based on at least one of texture, shape, and spatial-relationship features. Extraction algorithms based on texture features include texture feature analysis using at least one of a gray-level co-occurrence matrix, a checkerboard feature method, a random field model method, and the like; extraction algorithms based on shape features include at least one of a Fourier shape description method, a shape quantitative measurement method, and the like; extraction algorithms based on spatial-relationship features are exemplified by the mutual spatial positions or relative directional relationships among a plurality of image blocks divided from the image, these relationships including, but not limited to, connection/adjacency, overlapping, inclusion/containment, and the like.
The server matches the candidate positioning feature information fs1 in the image with the positioning feature information fs2 of the corresponding key frame image in the visual positioning data set by using an image matching technique, thereby obtaining the matched positioning feature information fs1'.
Here, the server performs image matching by image retrieval according to a matching index established in advance for the set of visual positioning data. In some examples, region localization is performed on the image captured by the first electronic device to obtain at least one candidate region; a candidate region satisfying a given condition is then determined and taken as the target region; region normalization processing is performed on the target region; and the positioning feature information corresponding to the normalized target region is obtained and taken as the positioning feature information corresponding to the image. In still other examples, matching may be performed using a matching algorithm such as SIFT or FLANN (Fast Library for Approximate Nearest Neighbors). In other examples, the descriptors in the visual positioning data set that describe the positioning feature information are used to construct a matching index of at least one layer, and this matching index is used to match the descriptors of the positioning feature information in the image; the matching index includes at least one of matching conditions on the positioning feature information itself, matching conditions on the spatial relationships between pieces of positioning feature information, and the like. For example, a matching index of at least one layer is constructed; descriptors of the positioning feature information in the image are matched using a first matching condition set based on SIFT, and the descriptors satisfying the first matching condition are extracted; the spatial relationships of these extracted descriptors are then further matched using matching conditions based on the size, rotation, and scale invariance of the image, thereby obtaining the positioning feature information in the image that matches the positioning feature information of a key frame image in the visual positioning data set.
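The following sketch illustrates the SIFT/FLANN style of descriptor matching mentioned above, assuming an OpenCV environment; the ratio-test threshold is an assumed first-layer matching condition, and the snippet is not asserted to be the application's exact matching index.

```python
# Hedged sketch: match SIFT descriptors of the first electronic device's image
# against those of a candidate key frame with a FLANN nearest-neighbour search,
# keeping matches that satisfy a ratio-test matching condition. The threshold
# and FLANN parameters are illustrative assumptions.
import cv2

def match_to_keyframe(image_gray, keyframe_gray, ratio=0.75):
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(image_gray, None)      # candidate features fs1
    kp2, des2 = sift.detectAndCompute(keyframe_gray, None)   # key frame features fs2
    if des1 is None or des2 is None:
        return []
    flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
    knn = flann.knnMatch(des1, des2, k=2)
    # Keep only matches whose best distance is clearly better than the second best.
    good = [p[0] for p in knn if len(p) == 2 and p[0].distance < ratio * p[1].distance]
    return [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]  # matched fs1'
```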
It should be noted that, according to the actual image processing algorithm design, the aforementioned extraction and matching manners may be performed in an interleaving manner, for example, when extracting part of the positioning feature information in the image, matching is performed according to the extracted positioning feature information, so as to reduce the amount of computation.
In step S12, a location of the corresponding first electronic device on the map is determined based on the association relationship between the matching positioning feature information in the set of visual positioning data and the coordinate information marked in the map.
Based on the matched positioning feature information, a pixel position deviation between the positioning feature information in the acquired image and the corresponding positioning feature information in the key frame image can be obtained. In some examples, the pixel position deviation is a pixel deviation between the geometric shapes corresponding to the pieces of positioning feature information, such as deviations between edges or deviations between points, obtained by averaging. In still other examples, the pixel position deviation may also be a pixel deviation obtained by averaging the deviations between the individual matched feature points.
In order to mark the location of the first electronic device on the map according to the obtained pixel location deviation, referring to fig. 3, which is a schematic flow chart of S12 in an embodiment of the method for locating a device on a map according to the present application, as shown in the figure, based on the pixel location deviation between two matching locating feature information and the association relationship between the locating feature information and the coordinate information in the set of visual locating data, the step of marking the location of the corresponding first electronic device on the map includes:
in step S121, based on the pixel position deviation between the two pieces of matching positioning feature information and the association relationship, position deviation information between the position of the first electronic device and the coordinate information corresponding to the positioning feature information matched in the set of visual positioning data is determined.
Here, in some specific examples, the server reconstructs a 3D solid model of the commonly captured object based on the matched image P2 and the key frame image P1, and determines, according to the coordinate information in the visual positioning information to which the key frame image P1 belongs, the position and attitude deviation in the physical space between the first electronic device and the second electronic device corresponding to the pixel position deviation. In other specific examples, the server determines the coordinate information of the key frame image in the map based on the pixel position deviation and parameter information, provided by the second electronic device, related to the capture of the key frame image, and then determines the position and attitude deviation between the first electronic device and the second electronic device in the physical space based on that coordinate information and the pixel position deviation. The parameter information includes, for example, the deflection angle of the main optical axis of the camera device relative to the plane of movement, the height of the camera device above the plane of movement, and the like.
In step S122, the position information of the first electronic device in the map is determined based on the position deviation information, and marked on the map. After determining the position and attitude deviation of the first electronic device relative to the second electronic device, the position of the first electronic device is marked on the map. Thereby enabling the location of the first electronic device to be marked on a map provided by the second electronic device.
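As one hedged way to realize the combined effect of steps S121 and S122, assuming the matched positioning feature information carries known 3D coordinate information in the map and the intrinsic parameters K of the first electronic device's camera device are available, the device position to be marked on the map could be estimated with a PnP solution such as the following:

```python
# Hedged sketch: estimate the first electronic device's position in the map frame
# by solving a PnP problem, assuming each matched positioning feature has known 3D
# coordinate information in the map (from the visual positioning data set) and the
# first device's camera intrinsics K are known. One possible realization only,
# not necessarily the exact computation of the application.
import cv2
import numpy as np

def locate_first_device(map_points_3d, image_points_2d, K):
    obj = np.asarray(map_points_3d, dtype=np.float64)    # N x 3, map coordinates
    img = np.asarray(image_points_2d, dtype=np.float64)  # N x 2, pixel coordinates
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, None)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)
    cam_position = (-R.T @ tvec).ravel()                 # camera centre in map frame
    return cam_position                                   # (x, y, z) to mark on the map
```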
In some practical scenarios, the map needs to be updated, and the above method therefore further comprises a step of updating the map and the set of visual positioning data.
In some specific scenarios, the map is updated for the purpose of tracking the latest location of a first electronic device, wherein the first electronic device is an autonomously moving mobile robot. The server may repeatedly perform the above examples according to an image provided by the first electronic device during movement to mark the latest location of the first electronic device in the map.
In still other particular scenarios, the purpose of updating the map is to provide a map and its set of visual positioning data that are persistently available to the first electronic device and the second electronic device.
In some specific examples, the server updates the map and set of visual positioning data based on visual positioning information and obstacle information collected from the first electronic device during movement within the physical space.
Here, on the one hand, the map and the set of visual positioning data provided by the second electronic device are constructed based on the physical space through which the second electronic device has navigated, so the completeness of the map data is related to the spatial extent over which the second electronic device has moved. On the other hand, the positioning feature information in the visual positioning data set is strongly influenced by the environment; for example, positioning feature information obtained under natural light may differ from that obtained under artificial light. Therefore, while constructing the map marked with the first electronic device, the server fuses the visual positioning information collected as the first electronic device moves within the physical space into the visual positioning data set, according to the determined position of the first electronic device in the map, so as to update the map and the visual positioning data set; it also fuses the obstacle information collected by the first electronic device during its movement into the map, thereby obtaining a map and a set of visual positioning data available for continued use.
The fusion refers to integrating maps, and visual positioning data sets, constructed at different times. The integration of the map includes any one of the following: integrating the coordinate information of obstacles collected by the first electronic device at different times into the coordinate information of a unified map; or integrating the coordinate information of obstacles collected by the first electronic device at a single time into the map. The integration of the map further includes removing geographic positions that have not recently been recorded in the map, for example removing the coordinate information of the geographic positions of obstacles determined to have been only temporarily placed.
The integration of the set of visual positioning data includes any one of the following: integrating the visual positioning information collected by the first electronic device at different times into unified visual positioning information for the map; or integrating the visual positioning information collected by the first electronic device at the current time into the visual positioning data set. The integration of the set of visual positioning data further includes removing from the set visual positioning information that has not been updated recently, for example removing visual positioning information determined to reflect temporarily placed obstacles.
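A minimal sketch of such integration and pruning of visual positioning information is given below; the record fields and the staleness threshold are assumptions made for illustration.

```python
# Hedged sketch of fusing newly collected visual positioning information into the
# existing set and pruning records that have not been re-observed recently
# (e.g. features of temporarily placed obstacles). The record fields and the
# staleness threshold are illustrative assumptions.
import time
from dataclasses import dataclass, field

@dataclass
class VisualPositioningRecord:
    coordinate: tuple                  # coordinate information in the map
    descriptors: list                  # descriptors of the positioning features
    last_seen: float = field(default_factory=time.time)

def fuse(dataset, new_records, stale_after_s=30 * 24 * 3600):
    now = time.time()
    dataset.extend(new_records)                        # integrate newly collected info
    return [r for r in dataset if now - r.last_seen <= stale_after_s]  # prune stale info
```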
In some specific examples, the server updates the map and set of visual positioning data based on visual positioning information collected from the first electronic device at a fixed location within a physical space.
Similar to the foregoing example, a first electronic device that is fixedly installed can still provide visual positioning information that facilitates persistent use across different periods of time and different lighting environments. Upon determining the location of such a first electronic device, the server adds the visual positioning information corresponding to that location to the visual positioning data set, and performs a fusion operation based on the images of different environments provided by the first electronic device, so as to provide visual positioning information available for persistent use at that location.
In other specific examples, the server may obtain images of a plurality of first electronic devices, and in combination with the first two specific examples, the server performs an update operation on the map and the set of visual positioning data according to data from the plurality of first electronic devices. And will not be described in detail herein.
By utilizing the map and the visual positioning data set obtained after the updating operation, the server can position the first electronic device more quickly and reduce the calculation amount.
In still other practical scenarios, when acquiring the image provided by the first electronic device, the server may further acquire device information of the first electronic device for classifying it under a device class label. The device class label is used to classify the entity type of the first electronic device; the classification may be based on aspects such as the function, performance, and interactive capability of the first electronic device, and there may be one or more device class labels. The device information includes, but is not limited to, the device brand, manufacturer information, the device model, and the like. For example, according to the device information of the first electronic device, the server classifies it under one of a preset intelligent monitoring camera label, cleaning robot label, intelligent terminal label, and video conference terminal label.
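Illustratively, and only as an assumed keyword rule set rather than the application's classification logic, device information could be mapped to a preset device class label as follows:

```python
# Hedged sketch: classify a first electronic device under one of the preset device
# class labels from its reported device information. The keyword table is an
# illustrative assumption, not an exhaustive rule set.
PRESET_LABELS = {
    "camera": "intelligent monitoring camera",
    "vacuum": "cleaning robot",
    "phone": "intelligent terminal",
    "conference": "video conference terminal",
}

def device_class_label(device_info: dict) -> str:
    model = (device_info.get("model", "") + " " + device_info.get("type", "")).lower()
    for keyword, label in PRESET_LABELS.items():
        if keyword in model:
            return label
    return "unknown"
```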
And the server side sends the map marked with each first electronic device to a third electronic device so that the third electronic device can display the map. The third electronic device is a device with certain communication capability and a display device, and can be used for displaying the map marked with the first electronic device. In some embodiments, the third electronic device may be a mobile terminal, and the mobile terminal includes a smart terminal, a multimedia device, or a streaming media device, for example, a mobile phone, a tablet computer, a notebook computer, or the like. As mentioned above, while the position of each first electronic device is displayed, the corresponding device type tag marked at the position of each first electronic device may also be displayed, so that the user may distinguish the function, performance, or interactive capability of each first electronic device.
According to the above method for positioning devices on a map, a second electronic device equipped with a camera device moves within an indoor or outdoor physical space and constructs a map and a visual positioning data set of that physical space; the images captured by the first electronic device are matched or compared against the map and the visual positioning data set by the server (or by the second electronic device), so that the first electronic device is positioned on the map; and the map carrying this positioning can conveniently be shared among the first electronic device, the second electronic device, and the third electronic device.
The application further provides a method for cooperative operation among multiple devices by using a map, constructed as in any of the preceding examples, that contains a first electronic device; the multiple devices include the first electronic device and a second electronic device. Both the first electronic device and the second electronic device are located in the physical space described by the map in order to perform the cooperative operation. The cooperative operation denotes a process in which the first electronic device and the second electronic device interact based on coordinate information of either or both of them provided by the map. The method is applicable to processes carried out by means of interactive instructions generated by a first electronic device and executed with a second electronic device at a different location of the physical space depicted by the map. To facilitate this interaction, the device (or system) performing the method is at least preconfigured with the interactive instructions, preset for performing the respective cooperative operation, of the corresponding first electronic device and/or the corresponding second electronic device. The device (or system) is exemplified by any one of, or a cooperating combination of, a server, a first electronic device, or a second electronic device. The server may be the server used for constructing the map, or another server capable of executing the cooperative operation method according to the map. The first electronic device may be the first electronic device referred to in the aforementioned map construction method. The second electronic device may be the second electronic device referred to in the aforementioned map construction method; for convenience of distinction, other first electronic devices that do not generate the interactive instruction may also be referred to as second electronic devices.
Referring to fig. 4, fig. 4 is a schematic flowchart illustrating a method for cooperative operation among multiple devices according to an embodiment of the present application, where the method for cooperative operation includes:
in step S20, multimedia data including an image captured by its camera is acquired from the first electronic device, and an interaction instruction for interacting with the second electronic device is identified from the multimedia data.
Here, the first electronic device may record a video to obtain multimedia data including images, or capture a single image at intervals and use it as the multimedia data, and send the multimedia data to the device executing this step (such as the first electronic device itself, a server, or a second electronic device; this is not repeated in subsequent steps); that device then performs a recognition operation to obtain the interactive instruction. The multimedia data includes image data (images for short), voice data, video data containing both images and voice data, and the like.
The manner of identifying the interactive instruction includes, but is not limited to: 1) identifying an interactive instruction from at least one image. For example, the device executing this step classifies at least one image in the acquired multimedia data using a posture recognition classifier obtained through prior machine training, thereby obtaining the corresponding interactive instruction. For another example, the device executing this step is preconfigured with the image features corresponding to each interactive instruction in an interactive instruction set, and identifies at least one image in the multimedia data using those image features, so as to determine the corresponding interactive instruction from the identification result. For another example, the device executing this step identifies characters in the image and matches the identified characters against the instruction keywords in a preconfigured interactive instruction set to obtain the corresponding interactive instruction.
2) An interactive instruction is identified from the voice data. For example, the device executing this step performs semantic recognition on voice data in the acquired multimedia data by using a semantic translator obtained through machine training in advance, converts the voice data into text data, and matches the text data based on instruction keywords in a preconfigured interaction instruction set to obtain a corresponding interaction instruction.
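For both recognition routes above (characters recognized in an image, or text converted from voice data), the final keyword-matching step can be illustrated by the following Python sketch. The instruction names, the keyword lists, and the already-transcribed input text are hypothetical stand-ins for whatever OCR or speech recognizer is actually used.

```python
# Minimal sketch: match recognized text (from OCR or speech-to-text) against
# instruction keywords in a preconfigured interaction instruction set.
# The instruction names and keywords are hypothetical.

INSTRUCTION_KEYWORDS = {
    "start_cleaning": ["clean", "sweep", "vacuum"],
    "open_surveillance_video": ["open video", "show camera", "surveillance"],
    "return_to_dock": ["go home", "dock", "charge"],
}

def match_instruction(recognized_text):
    """Return the interactive instruction whose keywords appear in the text, or None."""
    text = recognized_text.lower()
    for instruction, keywords in INSTRUCTION_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return instruction
    return None

print(match_instruction("please clean the living room"))   # -> start_cleaning
print(match_instruction("open video from the hallway"))     # -> open_surveillance_video
```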
In some embodiments, referring to fig. 5, fig. 5 is a flow chart illustrating an implementation manner of S20 in the method for cooperative operation among multiple devices according to the present application; as shown in the figure, the step of identifying, from the multimedia data, an interaction instruction for interacting with the second electronic device includes:
in step S201, an interactive instruction is recognized from an image in the multimedia data, or an interactive instruction is recognized from voice data in the multimedia data.
In step S202, based on a preset instruction set of at least one second electronic device, the second electronic device corresponding to the interaction instruction is determined.
Here, the apparatus for performing this step recognizes the interactive instruction from the image or from the voice data in the same or similar manner as the aforementioned interactive instruction recognition, and will not be described in detail here.
Here, the device performing this step is preconfigured with an instruction set of at least one second electronic device (i.e. the aforementioned interaction instruction set). In some examples, the instruction set includes an interactive instruction uniquely corresponding to a certain second electronic device, and the device executing the step determines, according to the correspondence, the second electronic device corresponding to the acquired interactive instruction. In still other examples, the device performing this step selects a plurality of second electronic devices that can execute the interaction instruction provided by the first electronic device based on a preset instruction set of at least one second electronic device, feeds the selected second electronic devices back to the first electronic device, performs communication interaction with the first electronic device by using multimedia data, and determines the second electronic device that executes the interaction instruction.
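Step S202 can be illustrated by the following sketch: given a preconfigured instruction set per second electronic device, it returns the devices able to execute the recognized interactive instruction, and notes the confirmation path when several qualify. The device identifiers and instruction names are hypothetical.

```python
# Minimal sketch: determine which second electronic device(s) correspond to an
# interactive instruction based on preset per-device instruction sets.
# Device identifiers and instruction names are hypothetical.

DEVICE_INSTRUCTION_SETS = {
    "cleaning_robot_1": {"start_cleaning", "return_to_dock"},
    "security_cam_kitchen": {"open_surveillance_video", "record_area"},
    "cleaning_robot_2": {"start_cleaning"},
}

def candidate_devices(instruction):
    """Return all second electronic devices whose instruction set contains the instruction."""
    return [dev for dev, instructions in DEVICE_INSTRUCTION_SETS.items()
            if instruction in instructions]

matches = candidate_devices("start_cleaning")
if len(matches) == 1:
    print("unique target:", matches[0])
else:
    # Several devices qualify; the executing device would feed this list back to the
    # first electronic device and confirm the target through further interaction.
    print("candidates to confirm with the user:", matches)
```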
In step S21, location information of the first electronic device and/or the second electronic device is determined based on a preset map and the interaction instruction; the map is marked with coordinate information determined from the images respectively captured by the camera of the first electronic device and/or the camera of the second electronic device.
In some scenarios, the interactive instruction is used to instruct the second electronic device to perform a corresponding operation based on the location information of the first electronic device, and for this purpose, the device performing this step determines the location information of the first electronic device based on a preset map and the interactive instruction. In some specific examples, the map is marked with a location of the first electronic device, and the location information of the first electronic device is determined based on the marked location on the map. In another specific example, the location of the first electronic device is not marked on the map, or the current location of the first electronic device in the map is not determined, and the device performing this step performs the aforementioned positioning method based on the acquired image in the multimedia data to determine the current location of the first electronic device. In other words, the apparatus performing this step performs the following steps: acquiring an image shot by a camera of the first electronic equipment based on the interactive instruction; determining positioning feature information in the image that matches the set of visual positioning data based on the map and the set of visual positioning data; and determining the position information of the corresponding first electronic equipment in the map based on the association relation between the matched positioning characteristic information in the visual positioning data set and the coordinate information marked in the map. The execution process of the above steps is similar to the corresponding execution process in the aforementioned positioning method, and is not described in detail here.
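The positioning sub-steps quoted above (acquire an image, match its positioning feature information against the visual positioning data set, then read the associated coordinate information) can be sketched as follows. The descriptor format, the nearest-neighbour matching, and all data values are simplifying assumptions for illustration, not the concrete visual-positioning algorithm of the application.

```python
# Minimal sketch: locate a device on the map by matching image feature descriptors
# against a visual positioning data set that associates descriptors with map coordinates.
# Descriptors are toy 3-vectors; a real system would use ORB/SIFT-like features.

import math

# Each entry: (descriptor, (x, y) coordinate marked in the map) -- hypothetical data.
VISUAL_POSITIONING_DATA = [
    ([0.9, 0.1, 0.3], (2.0, 5.5)),
    ([0.2, 0.8, 0.4], (7.5, 1.0)),
    ([0.5, 0.5, 0.9], (4.2, 3.3)),
]

def locate(image_descriptors, max_distance=0.3):
    """Return the map coordinates associated with the best-matching stored descriptor."""
    best = None
    for desc in image_descriptors:
        for stored_desc, coord in VISUAL_POSITIONING_DATA:
            d = math.dist(desc, stored_desc)
            if best is None or d < best[0]:
                best = (d, coord)
    if best is not None and best[0] <= max_distance:
        return best[1]
    return None  # no sufficiently similar positioning feature found

print(locate([[0.88, 0.12, 0.31]]))  # -> (2.0, 5.5)
```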
For example, a user makes a cleaning gesture toward an intelligent monitoring camera (corresponding to the first electronic device) mounted on a wall of room R1; the intelligent monitoring camera captures an image or video containing the cleaning gesture and provides it to the cleaning robot (corresponding to the second electronic device); the cleaning robot recognizes the cleaning instruction (corresponding to the interaction instruction) from the image, together with the position of the intelligent monitoring camera on the map, thereby determining the target position at which to perform the corresponding cleaning operation.
In other scenarios, the interactive instruction is used to instruct the second electronic device to perform a corresponding operation based on the location information of the second electronic device; for this purpose, the device performing this step determines the location information of the second electronic device based on a preset map and the interactive instruction. In some specific examples, the map is marked with the location of the second electronic device, and the location information is determined based on the mark on the map. In another specific example, the second electronic device has been registered with the device performing this step but its location is not marked on the map, or its current location in the map still needs to be determined; in this case the device performing this step performs the aforementioned positioning method based on the obtained interactive instruction to determine the current location of the second electronic device. In other words, the device performing this step performs the following steps: acquiring an image captured by the camera of the second electronic device based on the interactive instruction; determining, based on the map and the visual positioning data set, the positioning feature information in the image that matches the visual positioning data set; and determining the position information of the corresponding second electronic device in the map based on the pixel position deviation between the matched pair of positioning feature information and the association relationship between the positioning feature information and the coordinate information in the visual positioning data set. The execution process of these steps is the same as or similar to the corresponding process in the aforementioned positioning method and is not described in detail here.
For example, the second electronic device is a security camera, and the first electronic device is a terminal device with a camera and a display screen (such as a video television or a cleaning robot). A user sends the terminal device multimedia data containing an interactive instruction, described by a gesture image or by voice data, for opening a surveillance video; the terminal device parses out the interactive instruction, locates the corresponding security camera through the preset instruction set on the one hand, and locates the position of the security camera on the map on the other hand, thereby determining the security camera that is to execute the interactive instruction and its position on the map.
In still other scenarios, the interactive instruction is used to instruct the second electronic device to perform a corresponding operation based on both the location information of the first electronic device and the location information of the second electronic device; for this purpose, the device performing this step determines the locations of the first electronic device and the second electronic device based on a preset map and the interactive instruction. In some specific examples, the map is marked with the locations of the first electronic device and the second electronic device, and both locations are determined based on the marks on the map. In another specific example, the devices have been registered with the device performing this step but their locations are not marked on the map, or their current locations in the map still need to be determined; in this case the device performing this step performs the aforementioned positioning method based on the obtained interactive instruction to determine the current locations of the first electronic device and the second electronic device. In other words, the device performing this step performs the following steps: respectively acquiring the images captured by the cameras of the first electronic device and the second electronic device based on the interactive instruction; determining, based on the map and the visual positioning data set, the positioning feature information in each image that matches the visual positioning data set; and determining the position information of the corresponding second electronic device and first electronic device in the map based on the pixel position deviation between the matched pairs of positioning feature information and the association relationship between the corresponding positioning feature information and the coordinate information in the visual positioning data set. The execution process of these steps is the same as or similar to the corresponding process in the aforementioned positioning method and is not described in detail here.
For example, the first electronic device and the second electronic device are two cleaning robots. A user makes a gesture command in front of the camera of the first electronic device; the server identifies from it an interaction command instructing a second electronic device located within a preset range of the first electronic device to move toward the first electronic device and clean. The server determines, through the preset instruction set, which second electronic devices can execute the command, and also determines the position on the map of the second electronic device located within the preset range, thereby determining the second electronic device that executes the interaction command and its position on the map.
In step S22, an interaction instruction is issued to the second electronic device for the second electronic device to perform a corresponding operation based on the determined at least one coordinate information. Here, the device executing the step sends the obtained interactive instruction to the corresponding second electronic device, so that the second electronic device executes corresponding interactive operation.
Taking the cleaning robot and the monitoring camera as an example, the cleaning robot plans a navigation path from its current position to the position of the monitoring camera based on the corresponding interactive instruction and the position of the monitoring camera, and navigates to the corresponding position to execute the cleaning operation.
Taking the terminal device and the security camera as examples, the security camera receives the corresponding interactive instruction and feeds back the captured real-time video to the terminal device, so that the user can check the video conveniently.
Taking the two cleaning robots cooperating with cleaning as an example, the server provides the interactive instruction and the position of at least the first electronic device in the map to the second electronic device, and the second electronic device generates a corresponding navigation route based on the obtained interactive instruction and the position and executes navigation movement and cleaning operation.
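For the navigation examples above, the following is a minimal sketch of planning a route on the shared map, assuming the map can be reduced to an occupancy grid; breadth-first search and the sample grid are used only for illustration and are not the navigation algorithm claimed by the application.

```python
# Minimal sketch: plan a path on an occupancy grid from the second electronic
# device's position to a target position (e.g. the first electronic device's position).
# The grid, start, and goal are hypothetical; 0 = free cell, 1 = obstacle.

from collections import deque

GRID = [
    [0, 0, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

def plan_path(start, goal):
    """Breadth-first search returning a list of grid cells from start to goal."""
    rows, cols = len(GRID), len(GRID[0])
    queue, parent = deque([start]), {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and GRID[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # goal unreachable

print(plan_path((0, 0), (2, 3)))
```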
In some embodiments, the cooperative operation method further includes a step of marking on the map the position information of the second electronic device during its operation, and displaying that position information on the first electronic device or on a third electronic device sharing the map.
Here, the operations performed by the second electronic device and its position in the map can be displayed on the map in real time or retrospectively. In some examples, the first electronic device has a display screen viewable by the user, who can learn how the second electronic device is performing by viewing the map, for example by viewing the movement route of the second electronic device, or by viewing a surveillance video provided by the second electronic device, and so on.
In still other examples, the displayable content may also be displayed on a third electronic device that can share the map. The third electronic device includes, but is not limited to: electronic devices used by the user, such as a smart terminal, a personal computer, or a network control center, whether or not located within the physical space of the map. This is not repeated here.
According to the method for cooperative operation among multiple devices provided by the application, the device executing the method acquires and identifies the multimedia data captured by the camera of the first electronic device and sends an interaction instruction to the second electronic device whose position information has been determined from the map, so that the second electronic device executes a response operation based on the coordinate information. This method enables multiple devices to interact with one another and provides a good user experience.
The application also provides a method for cooperation among multiple devices, wherein the multiple devices comprise a first electronic device and a third electronic device. Referring to fig. 7, a schematic flow chart of a method for cooperative operation among multiple devices according to an embodiment of the present application is shown, where as shown in the drawing, the cooperative operation method includes:
in step S30, an interactive instruction is acquired from the third electronic device; the interactive instruction includes the coordinate information of the first electronic device that is to execute the corresponding interactive operation on the map, where the coordinate information is determined based on an image captured by the camera of the first electronic device and is marked in the map.
When the third electronic device displays the map, the device executing the step first determines the current position of the first electronic device by executing the positioning method. In other words, the apparatus performing this step performs the following steps: acquiring an image shot by a camera of the first electronic equipment based on the interactive instruction; determining positioning feature information in the image that matches the set of visual positioning data based on the map and the set of visual positioning data; and determining the position information of the corresponding first electronic equipment in the map based on the association relation between the matched positioning characteristic information in the visual positioning data set and the coordinate information marked in the map. The execution process of the above steps is similar to the corresponding execution process in the aforementioned positioning method, and is not described in detail here.
Here, the third electronic device may generate an interactive instruction based on a recognized input operation or on recognized voice data of the user, and send the interactive instruction to the device that performs this step (such as the first electronic device, the server, or the third electronic device itself; this is not repeated subsequently). The interaction instruction includes the coordinate information of the first electronic device that executes the corresponding interactive operation on the map, where the coordinate information is determined based on an image captured by the camera of the first electronic device and is marked in the map. Taking the first electronic device as a cleaning robot as an example, the cleaning robot captures an image through its camera during operation, compares the captured image with the map, confirms its own position, and sends the position to the device executing this step, which marks the position in the map. The recognition operations include, but are not limited to: 1) recognizing an input operation of the user, where the interactive instruction corresponding to the input operation is determined based on the mapping relationship between the input operation and the instruction set of the first electronic device to which the input operation points. For example, a map marked with the position of the first electronic device is displayed on a touch screen of the third electronic device; the user makes a gesture operation on the touch screen starting from the position of the first electronic device on the map, and the third electronic device recognizes the interaction instruction corresponding to the gesture operation according to a preset mapping relationship between the instruction set of the first electronic device and gesture operations. 2) Recognizing voice data. For example, the device executing this step performs semantic recognition on the voice data in the acquired multimedia data using a semantic translator obtained through prior machine training, converts the voice data into text data, and matches the text data against instruction keywords in a preconfigured interaction instruction set to obtain the corresponding interaction instruction.
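The first recognition mode above (a user input operation starting at a device's position marker on the displayed map) can be sketched as follows; the marker coordinates, the hit-test radius, and the gesture-to-instruction mapping are hypothetical and stand in for whatever UI framework the third electronic device uses.

```python
# Minimal sketch: resolve a touch gesture on the displayed map into an interactive
# instruction for the first electronic device whose marker the gesture starts on.
# Marker positions (map pixels), hit radius, and gesture mapping are hypothetical.

import math

DEVICE_MARKERS = {
    "cleaning_robot_1": (120, 340),
    "security_cam_kitchen": (480, 95),
}

GESTURE_TO_INSTRUCTION = {
    "cleaning_robot_1": {"drag": "clean_dragged_area", "double_tap": "return_to_dock"},
    "security_cam_kitchen": {"double_tap": "open_surveillance_video"},
}

def resolve_gesture(start_point, gesture_type, hit_radius=40):
    """Return (device, instruction) if the gesture starts near a marked device."""
    for device, marker in DEVICE_MARKERS.items():
        if math.dist(start_point, marker) <= hit_radius:
            instruction = GESTURE_TO_INSTRUCTION.get(device, {}).get(gesture_type)
            if instruction:
                return device, instruction
    return None

print(resolve_gesture((118, 335), "drag"))  # -> ('cleaning_robot_1', 'clean_dragged_area')
```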
In some embodiments, the interaction instructions include: interactive instructions generated based on user input operations on the map presented by the third electronic device. As mentioned above, the third electronic device is a device with a certain communication capability and a display unit, and can be used to display the map marked with the first electronic device. The third electronic device acquires the user's input operation on the map, generates an interactive instruction according to the input operation, and sends the interactive instruction to the device performing this step. The device performing this step may be the third electronic device itself, or a device other than the first electronic device and the third electronic device.
In step S31, the interaction instruction is sent to the first electronic device corresponding to the coordinate information, so that the first electronic device performs the interaction operation.
Here, in some scenarios, the interactive instruction is used to instruct the first electronic device to perform a corresponding operation based on its own location information. In other scenarios, the interactive instruction is used to instruct the first electronic device to perform a corresponding operation based on the location information of a destination. In still other scenarios, the interactive instruction is used to instruct the first electronic device to perform a corresponding operation based on both its own location information and the location information of the destination.
Taking the third electronic device as a mobile phone as an example, the mobile phone displays a map marked with the first electronic device on its display interface. The user can perform input operations (including but not limited to tapping, dragging, and sliding) on the display interface, for example tapping the position marker (corresponding to the coordinate information) of a cleaning robot (corresponding to the first electronic device) on the map and setting a target cleaning area. The mobile phone captures these input operations, generates a cleaning instruction (corresponding to the interactive instruction) from the input operations and the position marker, and sends the instruction to the cleaning robot. After receiving the instruction, the cleaning robot moves to the target cleaning area to perform the cleaning operation, or cleans a cleaning area constructed around its own position based on the instruction.
Taking the third electronic device as a remote controller with a display interface as an example, the remote controller displays a map marked with the first electronic device on its display interface. The user can operate the remote controller via a touch screen or keys, for example selecting one of the position markers (corresponding to the coordinate information) on the map, the marker corresponding to a monitoring camera (corresponding to the first electronic device). The remote controller generates an instruction (corresponding to the interaction instruction) for recording video of a designated area and sends it to the monitoring camera, which starts recording video of the designated area after receiving the instruction.
Taking the third electronic device as a notebook computer as an example, the notebook computer displays a map marked with a cleaning robot (corresponding to the first electronic device) on its display screen. The notebook computer obtains the user's voice data, recognizes it, and obtains an interactive instruction. The voice data contains semantic information indicating the position (corresponding to the coordinate information) of the first electronic device designated by the user and semantic information indicating the interactive operation designated by the user. The notebook computer sends the interactive instruction to the cleaning robot, and the cleaning robot executes the interactive operation after receiving it.
In some embodiments, the method further includes a step of marking the corresponding position information on the map and displaying it on a third electronic device sharing the map while the first electronic device performs the interactive operation.
Here, in some examples, the first electronic device acquires an interaction instruction and performs a corresponding interaction operation according to the determined own coordinate information, and during the interaction operation, sends own position information to a device performing the step, and the device marks the position information of the first electronic device on a map and sends the marked map to the third electronic device for displaying by the third electronic device. In still other examples, the first electronic device performs a corresponding interactive operation according to the acquired interactive instruction and the coordinate information of the first electronic device, and during the interactive operation, sends the position information of the first electronic device to the third electronic device, and the third electronic device marks and displays the position information of the first electronic device on a map.
For example, a cleaning robot (corresponding to the first electronic device) located at position a in area A acquires a cleaning instruction (corresponding to the interactive instruction) for cleaning area B, executes the cleaning operation corresponding to the instruction according to its determined coordinate information, and, while cleaning area B, sends its position information to a mobile phone (corresponding to the third electronic device); the mobile phone marks and displays the position of the cleaning robot in area B on its display interface.
According to the method for the cooperative operation among the multiple devices, the server side obtains the interaction instruction sent by the third electronic device, identifies the interaction instruction and sends the interaction instruction to the first electronic device with the position information determined from the map, so that the first electronic device executes the interaction operation generated based on the coordinate information. The method for the cooperative operation among the multiple devices can enable the multiple devices to interact with each other, and user experience is good.
The application also provides a server. Referring to fig. 8, which is a schematic structural diagram of the server according to an embodiment of the present application, as shown in the figure, the server 40 includes an interface device 400, a storage device 401, and a processing device 402, wherein: the interface device 400 is used for data communication with at least one first electronic device and at least one second electronic device; the storage device 401 is configured to store the images captured by the first electronic devices, the map and the visual positioning data set of the physical space where each first electronic device is located, and at least one program, where the images are acquired through the interface device; and the processing device 402 is connected to the storage device 401 and the interface device 400 and is configured to execute the at least one program, so as to coordinate the storage device 401 and the interface device 400 in executing the above method for positioning a device on a map.
The interface apparatus 400 performs data communication with at least one first electronic device and at least one second electronic device by means of wireless communication. The storage device 401 is configured to store the captured image from the first electronic device, the map and the set of visual positioning data of the physical space in which the first electronic device is located, and at least one program. The storage device 401 may include at least one software module stored in the storage device 401 in the form of software or Firmware (Firmware). The software module is used for storing images shot by the first electronic equipment, maps of physical spaces where the first electronic equipment is located, visual positioning data sets and various programs which can be executed by the first electronic equipment and the second electronic equipment. In an exemplary embodiment, for example, the software module in the storage device 401 stores a path planning program of the sweeping robot; accordingly, the processing device 402 is configured to execute the program, so as to control the cleaning robot to perform the cleaning operation.
The server provided by the application receives the map and the visual positioning data set of a physical space, such as an indoor or outdoor space, constructed by a second electronic device with a camera while moving in that space, and matches or compares images with the map and the visual positioning data set, so that a first electronic device can be positioned on the map.
Referring to fig. 9, which is a schematic structural diagram of the mobile robot in an embodiment of the present application, as shown in the figure, the mobile robot 50 includes an interface device 500, a storage device 501, a moving device 502, and a processing device 503; the devices are disposed on a circuit board of the mobile robot 50 and are directly or indirectly electrically connected to one another to implement data transmission or interaction.
The interface device 500 is used for data communication with at least one first electronic device. The interface device 500 may communicate data with at least one first electronic device by way of wireless communication.
The storage device 501 is used for storing the image captured by the first electronic device, the map and the visual positioning data set of the physical space where the first electronic device is located, and at least one program. The storage 501 may comprise at least one software module stored in the storage 501 in the form of software or Firmware (Firmware). The software module is used for storing images shot by the first electronic equipment, a map of a physical space where each first electronic equipment is located, a visual positioning data set and various programs which can be executed by the mobile robot, such as a path planning program of the mobile robot; accordingly, the processing device 503 is configured to execute the program, so as to control the mobile robot to perform the work.
The moving device 502 is used to move the mobile robot within the physical space. In some embodiments, the moving device 502 includes at least one drive unit, such as a left wheel drive unit for driving the left drive wheel of the mobile robot and a right wheel drive unit for driving the right drive wheel. The drive unit may contain one or more processors (CPUs) or micro control units (MCUs) dedicated to controlling the drive motors. For example, the micro control unit converts information or data provided by the processing device 503 into an electrical signal for controlling a drive motor, and controls the rotation speed, steering, and the like of the drive motor according to that signal so as to adjust the moving speed and direction of the mobile robot; the information or data includes, for example, a deflection angle determined by the processing device 503. The processor in the drive unit may be shared with the processor in the processing device 503 or may be provided independently; for example, the drive unit acts as a slave processing device, the processing device 503 acts as the master device, and the drive unit performs movement control based on the control of the processing device 503. The drive unit receives data provided by the processing device 503 via a program interface and controls the drive wheels based on the movement control instructions provided by the processing device 503.
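As an illustration of how such a micro control unit could turn the processing device's motion data into per-wheel commands, here is a minimal sketch assuming a differential-drive model; the wheel base, the use of linear/angular velocity as the input, and the clamping limits are assumptions for illustration, not parameters disclosed by the application.

```python
# Minimal sketch: convert a desired linear velocity v (m/s) and angular velocity w (rad/s)
# into left/right drive-wheel speeds for a differential-drive robot.
# WHEEL_BASE and MAX_WHEEL_SPEED are hypothetical values.

WHEEL_BASE = 0.25       # distance between the two drive wheels, in meters
MAX_WHEEL_SPEED = 0.5   # per-wheel speed limit, in m/s

def wheel_speeds(v, w):
    """Return (left, right) wheel speeds implementing the requested motion."""
    left = v - w * WHEEL_BASE / 2.0
    right = v + w * WHEEL_BASE / 2.0
    # Clamp to the motor limits while keeping the left/right ratio roughly intact.
    scale = max(1.0, abs(left) / MAX_WHEEL_SPEED, abs(right) / MAX_WHEEL_SPEED)
    return left / scale, right / scale

print(wheel_speeds(0.3, 0.8))   # gentle left turn while moving forward
print(wheel_speeds(0.0, 2.0))   # rotate in place
```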
The processing device 503 is connected to the storage device 501 and the interface device 500, and is configured to execute the at least one program, so as to coordinate the storage device 501 and the interface device 500 to execute the following methods: acquiring an image shot by at least one camera device of the first electronic equipment arranged in the physical space; determining positioning feature information in the image that matches the set of visual positioning data based on the map and the set of visual positioning data; wherein the map and set of visual positioning data are constructed by at least one movement of the mobile robot within the physical space; and determining the position of the corresponding first electronic equipment on the map based on the association relation between the matched positioning characteristic information in the visual positioning data set and the coordinate information marked in the map.
One or more first electronic devices may be arranged in the physical space. A first electronic device captures images of the physical space through its camera and transmits them to the server by way of network communication; after receiving an image, the server performs subsequent processing on it. Here, the server may preset a time interval at which the first electronic device captures images, and then obtain the still images captured by the camera of the first electronic device at that preset interval. Alternatively, the server may preset a number of fixed times at which the first electronic device captures images, and then acquire the still images captured at those fixed times. Of course, in some embodiments the camera may also capture video; since a video consists of image frames, the server may continuously or discontinuously extract image frames from the acquired video and select individual frames as images.
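The interval-based acquisition described above can be sketched as follows; `capture_frame` is a hypothetical callable standing in for the first electronic device's camera API, and the interval and count values are arbitrary.

```python
# Minimal sketch: acquire still images from a camera at a preset time interval.
# capture_frame() is a hypothetical stand-in for the device camera API.

import time

def capture_frame():
    """Placeholder for the camera of the first electronic device."""
    return {"timestamp": time.time(), "pixels": b"..."}

def acquire_images(interval_s=5.0, count=3):
    """Collect `count` still images, one every `interval_s` seconds."""
    images = []
    for i in range(count):
        images.append(capture_frame())
        if i < count - 1:
            time.sleep(interval_s)
    return images

frames = acquire_images(interval_s=0.1, count=3)  # short interval just for the demo
print(len(frames), "images acquired")
```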
Based on the foregoing description, the set of visual positioning data and map may be constructed by the mobile robot performing one or more navigational movements in the physical space. In some embodiments, the mobile robot is a sweeping robot. The floor sweeping robot is also called an autonomous cleaner, an automatic floor sweeping machine, an intelligent dust collector and the like, is one of intelligent household appliances, and can complete cleaning, dust collection and floor wiping work. Specifically, the floor sweeping robot can be controlled by a person (an operator holds a remote controller by hand or through an APP loaded on an intelligent terminal) or automatically complete floor cleaning work in a room according to a certain set rule, and can clean floor impurities such as hair, dust and debris on the floor. In some examples, the sweeping robot is capable of performing cleaning work while building an indoor map during the operation work. Taking the second electronic device as an example of the sweeping robot, during the operation and work of the sweeping robot at home, the VSLAM technology is used to capture an indoor image through the camera device of the sweeping robot and construct a map of the interior, such as a living room, a study room, a bedroom, or the whole home. In other examples, the map and the set of visual positioning data are constructed by a plurality of the mobile robots performing respective navigational movement operations within the physical space. In each scene example, a plurality of mobile robots can be arranged, each mobile robot uploads the constructed map and visual positioning data set to the server according to the navigation moving operation of each mobile robot, and the map and visual positioning data set are fused together by the server to obtain the map and visual positioning data set convenient for subsequent execution. For example, the server integrates the coordinate information in the maps acquired at different times into the unified coordinate information in the map available for subsequent use, and integrates the visual positioning information in the visual positioning data sets acquired at different times into the unified visual positioning information in the visual positioning data set available for subsequent use, and the like.
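The fusion of maps built by several mobile robots into one coordinate frame can be sketched as follows, assuming the relative pose between two robots' map origins is already known (for example, from matched visual positioning features). The transform values are hypothetical, and the sketch shows only the coordinate-unification step, not the full fusion procedure.

```python
# Minimal sketch: express coordinates from robot B's map in robot A's (unified) frame,
# given a known 2D rigid transform (rotation theta, translation tx, ty) between the frames.
# The transform parameters and sample points are hypothetical.

import math

def to_unified_frame(point_b, theta, tx, ty):
    """Rotate by theta and translate (tx, ty) to map a point from frame B into frame A."""
    x, y = point_b
    xa = math.cos(theta) * x - math.sin(theta) * y + tx
    ya = math.sin(theta) * x + math.cos(theta) * y + ty
    return xa, ya

# Points marked in robot B's map (e.g. positions of first electronic devices).
points_in_b = [(1.0, 0.0), (2.5, 1.5)]
unified = [to_unified_frame(p, theta=math.pi / 2, tx=3.0, ty=-1.0) for p in points_in_b]
print(unified)   # coordinates now usable alongside robot A's map
```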
The processing device 503, the storage device 501, and the interface device 500 may be electrically connected through one or more communication buses or signal lines. In some embodiments, the processing device 503 comprises an integrated circuit chip having signal processing capability, or comprises a general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), discrete gate or transistor logic, discrete hardware components, or the like, which may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application; the general-purpose processor may be a microprocessor or any conventional processor. In some embodiments, the storage device 501 may include Random Access Memory (RAM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The storage device 501 is used for storing a program, and the processing device 503 executes the program after receiving an execution instruction.
For the specific steps and processes of the method executed by the storage device 501 and the interface device 500 by the processing device 503, please refer to the above embodiments, which are not described herein again.
In some embodiments, the interface device 500 is further connected to a third electronic device capable of sharing the map, and the processing device 503 transmits the map marked with the first electronic device to the corresponding third electronic device through the interface device 500 so that the third electronic device displays the map. The third electronic device is a device with certain communication capability and a display device, and can be used for displaying the map marked with the first electronic device. In some embodiments, the third electronic device may be a mobile terminal, and the mobile terminal includes a smart terminal, a multimedia device, or a streaming media device, for example, a mobile phone, a tablet computer, a notebook computer, or the like.
In certain embodiments, the processing device 503 performs the step of determining the positioning feature information in the image matching the set of visual positioning data based on the pre-constructed map of the physical space and the set of visual positioning data, including: extracting candidate positioning characteristic information in the image; and selecting the positioning characteristic information of the image from each candidate positioning characteristic information through image matching. Please refer to the above embodiments for specific steps and processes, which are not described herein.
In some embodiments, the processing device 503 performs the step of determining the position of the corresponding first electronic device on the map based on the association relationship between the matching positioning feature information in the set of visual positioning data and the coordinate information marked in the map, including: determining position deviation information between the position of the first electronic device and coordinate information corresponding to the matched positioning feature information in the visual positioning data set based on pixel position deviation between the two matched positioning feature information; and determining the position information of the first electronic equipment in the map based on the position deviation information, and marking the position information on the map. Please refer to the above embodiments for specific steps and processes, which are not described herein.
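The deviation step quoted above can be illustrated by the following deliberately simplified sketch, which assumes a locally constant meters-per-pixel scale so that the pixel offset between the live feature and the stored feature translates linearly into a map offset; real visual positioning would instead use camera intrinsics and geometry, so the scale factor and all values here are purely illustrative.

```python
# Minimal sketch: estimate a device's map position from the pixel deviation between a
# feature observed in its image and the same feature stored in the visual positioning
# data set. Assumes a locally constant meters-per-pixel scale (a strong simplification).

METERS_PER_PIXEL = 0.004   # hypothetical local scale of the camera view

def estimate_position(stored_coord, observed_px, stored_px):
    """Shift the stored map coordinate by the scaled pixel deviation."""
    dx_px = observed_px[0] - stored_px[0]
    dy_px = observed_px[1] - stored_px[1]
    return (stored_coord[0] + dx_px * METERS_PER_PIXEL,
            stored_coord[1] + dy_px * METERS_PER_PIXEL)

# Feature stored at map coordinate (4.2, 3.3), seen at pixel (660, 410) versus (640, 400) stored.
print(estimate_position((4.2, 3.3), (660, 410), (640, 400)))  # ~ (4.28, 3.34)
```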
In certain embodiments, the processing device 503 also performs the step of updating the map and set of visual positioning data. In certain embodiments, the processing device 503 performs the step of updating the set of map and visual positioning data comprising: updating the map and set of visual positioning data based on visual positioning information collected from the first electronic device during movement within physical space; or updating the map and set of visual positioning data based on visual positioning information collected from the first electronic device at a fixed location within physical space; or updating the map and set of visual positioning data based on visual positioning information from the first electronic device collected both during movement within physical space and at fixed locations within physical space.
In some embodiments, the first electronic device comprises: a device mounted in a fixed position within the physical space, and a device movable within the physical space.
In some embodiments, a device category label for each first electronic device location is also marked on the map; wherein the device category label is at least used to determine location information of the respective first electronic device in a map.
The application provides a mobile robot that constructs the map and the visual positioning data set of the physical space in which it is located while moving during its own operation, and shares the map, through the server, with first electronic devices having cameras and with third electronic devices; a first electronic device can then be positioned on the map by matching or comparing the images it captures with the map and the visual positioning data set.
Therefore, the present application further provides a server. Referring to fig. 10, which is a schematic structural diagram of the server in an embodiment of the present application, as shown in the figure, the server includes an interface device 600, a storage device 601, and a processing device 602, wherein: the interface device 600 is used for communicating with a first electronic device and a second electronic device; the storage device 601 is configured to store multimedia data including images from the first electronic device, a map corresponding to the physical space where the first electronic device and the second electronic device are located, and at least one program, where the map is marked with coordinate information determined based on the images captured by the camera of the first electronic device, by the camera of the second electronic device, or by both; and the processing device 602 is connected to the storage device 601 and the interface device 600 and is configured to execute the at least one program so as to coordinate the storage device 601 and the interface device 600 in executing a method for cooperative operation among multiple devices.
The interface device 600 performs data communication with at least one first electronic device and at least one second electronic device by means of wireless communication. The storage device 601 is used for storing multimedia data containing images sent from the first electronic device, and at least one program. The multimedia data includes image data, voice data, video data including the image data and the voice data, and the like. The map is marked with coordinates in an image shot by the camera shooting device of the first electronic equipment, coordinates in an image shot by the camera shooting device of the second electronic equipment, and coordinates in images shot by the camera shooting devices of the first electronic equipment and the second electronic equipment. The storage device 601 may include at least one software module stored in the storage device 601 in the form of software or Firmware (Firmware). The software module is used for storing images shot by the first electronic equipment, maps of physical spaces where the first electronic equipment is located, visual positioning data sets and various programs which can be executed by the first electronic equipment and the second electronic equipment.
With the server provided by the application, the server acquires and identifies the multimedia data captured by the camera of the first electronic device and sends an interaction instruction to the second electronic device whose position information has been determined on the map, so that the second electronic device executes the response operation generated based on the coordinate information. This method of cooperative operation among multiple devices enables the devices to interact with one another and provides a good user experience.
Therefore, the present application further provides a second electronic device configured with a camera. Referring to fig. 11, which is a schematic structural diagram of the second electronic device in an embodiment of the present application, as shown in the figure, the second electronic device includes an interface device 700, a storage device 701, and a processing device 702, wherein: the interface device 700 is used for communicating with at least one first electronic device; the storage device 701 is configured to store multimedia data including images from the first electronic device, a map of the physical space where each first electronic device is located, and at least one program, where the map is marked with coordinate information determined from the images captured by the camera of the first electronic device and/or the camera of the second electronic device itself; and the processing device 702 is connected to the storage device 701 and the interface device 700 and is configured to execute the at least one program so as to coordinate the storage device 701 and the interface device 700 in performing the following method: identifying, from the multimedia data, an interaction instruction sent by the first electronic device; and determining the position information of the first electronic device and/or the second electronic device itself in the map based on a preset map and the interactive instruction, and executing the corresponding operation generated based on at least one piece of determined position information.
In some embodiments, the step, executed by the processing device 702, of identifying the interaction instruction issued by the first electronic device from the multimedia data includes: identifying an interactive instruction from an image in the multimedia data, or identifying an interactive instruction from voice data in the multimedia data.
In certain embodiments, the storage device 701 also stores a visual positioning data set corresponding to the map; the step, executed by the processing device 702, of determining the position information of the first electronic device and/or the second electronic device itself in the map based on the map includes: acquiring, based on the interaction instruction, the images captured by the camera of the first electronic device and/or the camera of the second electronic device itself; determining, based on the map and the visual positioning data set, the positioning feature information in the corresponding image that matches the visual positioning data set; and determining the position information of the corresponding first electronic device and/or the second electronic device itself in the map based on the association relationship between the matched positioning feature information in the visual positioning data set and the coordinate information marked in the map.
In some embodiments, the processing device 702 further performs the step of marking the corresponding position information on the map during the operation and displaying it on the first electronic device.
In some embodiments, the processing means 702 also displays the map on a third electronic device through the interface means 700.
With the second electronic device provided by the application, the multimedia data captured by its camera can be sent to the server, so that the server recognizes the data and sends an interactive instruction to the first electronic device whose position information has been determined on the map, enabling the first electronic device to execute the response operation generated based on the coordinate information. This method of cooperative operation among multiple devices enables the devices to interact with one another and provides a good user experience.
Therefore, the present application further provides a first electronic device configured with a camera. Referring to fig. 12, which is a schematic structural diagram of the first electronic device in an embodiment of the present application, as shown in the figure, the first electronic device includes an interface device 800, a storage device 801, and a processing device 802, wherein: the interface device 800 is used for communicating with at least one second electronic device; the storage device 801 is configured to store multimedia data including images captured by its camera, a map corresponding to the physical space where each second electronic device is located, and at least one program, where the map is marked with position information determined from the images captured by the camera of the second electronic device and/or the camera of the first electronic device itself; and the processing device 802 is connected to the storage device 801 and the interface device 800 and is configured to execute the at least one program so as to coordinate the storage device 801 and the interface device 800 in performing the following method: identifying, from the multimedia data, an interaction instruction for interacting with the second electronic device; and determining the coordinate information of the second electronic device and/or the first electronic device itself in the map based on a preset map and the interaction instruction, and executing the corresponding operation generated based on at least one piece of determined coordinate information.
In some embodiments, the step, executed by the processing device 802, of identifying from the multimedia data an interaction instruction for interacting with the second electronic device includes: identifying an interactive instruction from an image in the multimedia data, or identifying an interactive instruction from voice data in the multimedia data; and determining, based on a preset instruction set of at least one second electronic device, the second electronic device corresponding to the interaction instruction.
In some embodiments, the storage device 801 further stores a visual positioning data set corresponding to the map; the step, executed by the processing device 802, of determining the coordinate information of the second electronic device and/or the first electronic device itself in the map based on the map and the interactive instruction includes: acquiring, based on the interaction instruction, the images captured by the camera of the second electronic device and/or the camera of the first electronic device itself; determining, based on the map and the visual positioning data set, the positioning feature information in the corresponding image that matches the visual positioning data set; and determining the coordinate information of the corresponding second electronic device and/or the first electronic device itself in the map based on the association relationship between the matched positioning feature information in the visual positioning data set and the coordinate information marked in the map.
In some embodiments, the processing device 802 further performs the step of marking the corresponding position information on the map during the operation and displaying it on the second electronic device.
In some embodiments, the processing means 802 also displays the map on a third electronic device through the interface means 800.
With the first electronic device provided by the application, the multimedia data captured by its camera is sent to the server, so that the server recognizes the data and sends an interactive instruction to the second electronic device whose position information has been determined on the map, enabling the second electronic device to execute the response operation generated based on the coordinate information. This method of cooperative operation among multiple devices enables the devices to interact with one another and provides a good user experience.
Referring to fig. 13, a schematic structural diagram of the server according to the present application in an embodiment is shown, and as shown in the figure, the server includes an interface device 900, a storage device 901, and a processing device 902, where: the interface device 900 is used for communicating with the first electronic device and the third electronic device; the storage device 901 is configured to store a map of a physical space where the first electronic device is located, and at least one program; wherein the map is marked with position information determined by an image pickup device of the first electronic equipment based on the picked-up image; the processing device 902 is connected to the storage device 901 and the interface device 900, and is configured to execute the at least one program, so as to coordinate the storage device 901 and the interface device 900 to execute a method for performing cooperative operation between multiple devices.
According to the server side, the server side obtains the interaction instruction sent by the third electronic device, identifies the interaction instruction and sends the interaction instruction to the first electronic device of which the position information is determined on the map, so that the first electronic device executes interaction operation generated based on the coordinate information. The method for the cooperative operation among the multiple devices can enable the multiple devices to interact with each other, and user experience is good.
Referring to fig. 14, a schematic structural diagram of the third electronic device according to an embodiment of the present application is shown. As illustrated, the third electronic device comprises an interface device 100, a storage device 101, and a processing device 102, wherein: the interface device 100 is used for communicating with a first electronic device; the storage device 101 is configured to store a map of the physical space where the first electronic device is located and at least one program, wherein the map is marked with coordinate information determined based on images captured by the camera of the first electronic device; and the processing device 102, connected to the storage device 101 and the interface device 100, is configured to execute the at least one program so as to coordinate the storage device 101 and the interface device 100 in performing the following method: acquiring an interactive instruction from the third electronic device, wherein the interactive instruction comprises coordinate information, in the map, of the first electronic device that is to execute the corresponding interactive operation, the coordinate information being determined based on an image captured by an image capturing device of the first electronic device and marked in the map; and sending the interactive instruction to the first electronic device corresponding to the coordinate information, so that the first electronic device executes the interactive operation.
In some embodiments, the interactive instruction includes an interactive instruction generated based on a user input operation on the presented map. In some examples, the first electronic device acquires the interactive instruction and performs the corresponding interactive operation according to its own determined coordinate information; during the interactive operation, it sends its position information to the device performing this step, and that device marks the position of the first electronic device on the map and sends the marked map to the third electronic device for display. In still other examples, the first electronic device performs the corresponding interactive operation according to the acquired interactive instruction and its own coordinate information; during the interactive operation, it sends its position information to the third electronic device, which marks and displays the position of the first electronic device on the map. In some embodiments, the processing device 102 further performs a step of marking the corresponding position information on the map and displaying it during the interactive operation performed by the first electronic device.
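By way of a hedged example only, the sketch below shows how a user input operation on the presented map might be turned into an interactive instruction: a gesture starting from the marked position of the first electronic device is mapped, through a preset gesture-to-instruction table, to an operation and a destination coordinate. The gesture names and instruction codes are assumptions for illustration.

```python
# Illustrative sketch only: generating an interactive instruction from a
# user input operation on the map presented by the third electronic device.

GESTURE_TO_INSTRUCTION = {
    "drag": "GOTO",         # drag from the device's marked position to a target
    "double_tap": "CLEAN",  # double tap on a region of the map
}

def instruction_from_input(gesture: str, start_coord, end_coord):
    """start_coord is the marked position of the first electronic device on
    the map; end_coord is where the user's gesture ends."""
    op = GESTURE_TO_INSTRUCTION.get(gesture)
    if op is None:
        return None
    return {"op": op, "target_coord": start_coord, "destination": end_coord}

print(instruction_from_input("drag", (2.0, 3.5), (5.0, 5.0)))
```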
With the third electronic device provided by the present application, the third electronic device sends the interactive instruction, directly or indirectly, to the first electronic device whose position information has been determined on the map, so that the first electronic device executes the interactive operation with the third electronic device based on the coordinate information. This method of cooperative operation among multiple devices allows the devices to interact with one another and provides a good user experience.
The present application further provides a computer-readable storage medium storing a computer program for locating a device on a map, which, when executed, implements the method for locating a device on a map described in the above embodiments with reference to fig. 1 to 3.
The present application also provides a computer-readable and writable storage medium storing a computer program for cooperative operation among multiple devices, which, when executed, implements the method for cooperative operation among multiple devices described in the above embodiments with reference to fig. 4 to 6.
The present application also provides a computer-readable and writable storage medium storing a computer program for cooperative operation among multiple devices, which, when executed, implements the method for cooperative operation among multiple devices described in the above embodiment with reference to fig. 7.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on such understanding, the part of the technical solution of the present application that substantially contributes over the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application.
In the embodiments provided herein, the computer-readable and writable storage medium may include read-only memory, random-access memory, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, a USB flash drive, a removable hard disk, or any other medium that can be used to store the desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable and writable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are intended to be non-transitory, tangible storage media. Disk and disc, as used in this application, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers.
In one or more exemplary aspects, the functions described in the computer program for locating a device on a map or in the computer program for cooperative operation among multiple devices described herein may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. The steps of a method or algorithm disclosed herein may be embodied in a processor-executable software module, which may reside on a tangible, non-transitory computer-readable and writable storage medium. Such a tangible, non-transitory computer-readable and writable storage medium may be any available medium that can be accessed by a computer.
The flowcharts and block diagrams in the above-described figures of the present application illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in the flowcharts or block diagrams may represent a module, segment, or portion of code comprising one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special-purpose hardware-based systems that perform the specified functions or acts, or by combinations of special-purpose hardware and computer instructions.
The method for positioning a device on a map, the method for cooperative operation among multiple devices, the server, the mobile robot, the first electronic device, the second electronic device, and the third electronic device provided by the present application have the following beneficial effects: the second electronic device, equipped with a camera, moves in an indoor or outdoor physical space and constructs a map and a visual positioning data set of that space; the map is shared, through the server, with the first electronic device, which is also equipped with a camera, and with the third electronic device; the first electronic device can then match the images it captures against the map and the visual positioning data set, thereby locating itself on the map. Moreover, because the first electronic device, the second electronic device, and the third electronic device share the same map and the same visual positioning data set, interaction among the multiple devices can be realized, and the user experience is good.
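To make the sharing arrangement concrete, the following minimal sketch (an assumption about data layout, not the disclosed implementation) keeps the map markers and the visual positioning data set in one server-side store that every device reads from and writes to.

```python
# Illustrative sketch only: a shared map and visual positioning data set
# held at the server and accessed by the first, second and third devices.

class SharedMapStore:
    def __init__(self):
        self.map_markers = {}          # device_id -> (x, y) marked in the map
        self.visual_positioning = []   # list of (descriptor, (x, y)) pairs

    def publish(self, device_id, coord):
        """A device that has localized itself marks its position in the map."""
        self.map_markers[device_id] = coord

    def snapshot(self):
        """Any device fetches the same shared map markers and data set."""
        return dict(self.map_markers), list(self.visual_positioning)

store = SharedMapStore()
store.publish("second_electronic_device", (1.0, 2.0))  # mapping robot marks itself
markers, dataset = store.snapshot()                    # e.g. for display on the third device
print(markers)
```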
The above embodiments are merely illustrative of the principles and utilities of the present application and are not intended to limit the application. Any person skilled in the art can modify or change the above-described embodiments without departing from the spirit and scope of the present application. Accordingly, it is intended that all equivalent modifications or changes which can be made by those skilled in the art without departing from the spirit and technical concepts disclosed in the present application shall be covered by the claims of the present application.

Claims (20)

1. A method for interoperation among multiple devices, wherein the multiple devices comprise a first electronic device and a third electronic device, the method comprising:
acquiring an interactive instruction from the third electronic device, wherein the interactive instruction comprises coordinate information, in the map, of the first electronic device that is to execute the corresponding interactive operation, the coordinate information being determined, and marked in the map, based on an image captured by an image capturing device of the first electronic device; and
sending the interactive instruction to the first electronic device corresponding to the coordinate information, so that the first electronic device executes the interactive operation.
2. The method for interoperation among multiple devices according to claim 1, wherein the interactive instruction comprises: interactive instructions generated based on user input operations on a map presented by the third electronic device.
3. The method according to claim 1, wherein the interactive instruction is used for instructing the first electronic device to perform a corresponding operation based on its own location information and/or based on destination location information.
4. The method of claim 1, further comprising a step of marking, on the map, location information corresponding to the first electronic device during the interactive operation, and displaying the location information on a third electronic device sharing the map.
5. The method of claim 1, wherein the step of acquiring an interactive instruction from the third electronic device comprises:
acquiring an image captured by a camera of the first electronic device based on the interactive instruction;
determining, based on the map and a visual positioning data set, positioning feature information in the image that matches the visual positioning data set; and
determining the position information of the corresponding first electronic device in the map based on the association between the matched positioning feature information in the visual positioning data set and the coordinate information marked in the map.
6. A server, comprising:
an interface device, configured to communicate with a first electronic device and a third electronic device;
a storage device, configured to store a map of the physical space where the first electronic device is located and at least one program, wherein the map is marked with position information determined based on images captured by the camera of the first electronic device; and
a processing device, connected to the storage device and the interface device, and configured to execute the at least one program so as to coordinate the storage device and the interface device in performing the method of any one of claims 1 to 5.
7. A method for interoperation among multiple devices, wherein the multiple devices comprise a first electronic device and a third electronic device, the method comprising:
generating an interactive instruction based on a recognized input operation of a user or recognized voice data, wherein the interactive instruction comprises coordinate information, in a map, of the first electronic device that is to execute the corresponding interactive operation, the coordinate information being determined based on an image captured by an image capturing device of the first electronic device and marked in the map; and
sending the interactive instruction to the first electronic device corresponding to the coordinate information, so that the first electronic device executes the interactive operation.
8. The method of claim 7, wherein the step of generating an interactive instruction based on the recognized input operation of the user or the recognized voice data comprises: determining the interactive instruction corresponding to the input operation or the recognized voice data based on a mapping relationship between the input operation and the instruction set of the first electronic device to which the user's input operation or recognized voice data points.
9. The method of claim 8, wherein the step of determining the interactive instruction corresponding to the input operation or the recognized voice data comprises: displaying, on a touch screen of the third electronic device, a map marked with the position of the first electronic device; receiving, on the touch screen, a gesture operation by the user that starts from the position of the first electronic device on the map; and determining, by the third electronic device, the interactive instruction corresponding to the gesture operation according to a preset mapping relationship between the instruction set of the first electronic device and gesture operations.
10. The method for interoperation among multiple devices according to claim 8, wherein the step of determining the interactive instruction corresponding to the input operation or the recognized voice data comprises: performing semantic recognition on the voice data in the acquired multimedia data by using a semantic translator obtained through machine training in advance, converting the voice data into text data, matching the text data against instruction keywords in a pre-configured interactive instruction set, and thereby determining the interactive instruction corresponding to the voice data.
11. The method according to claim 7, wherein the interactive instruction is used for instructing the first electronic device to perform a corresponding operation based on its own location information and/or based on destination location information.
12. The method for interoperation among multiple devices according to claim 7, further comprising a step of marking and displaying, on the map, position information corresponding to the first electronic device during the interactive operation.
13. A third electronic device, comprising:
an interface device, configured to communicate with a first electronic device;
a storage device, configured to store a map of the physical space where the first electronic device is located and at least one program, wherein the map is marked with coordinate information determined based on images captured by the camera of the first electronic device; and
a processing device, connected to the storage device and the interface device, and configured to execute the at least one program so as to coordinate the storage device and the interface device in performing the following steps:
generating an interactive instruction based on a recognized input operation of a user or recognized voice data, wherein the interactive instruction comprises coordinate information, in the map, of the first electronic device that is to execute the corresponding interactive operation, the coordinate information being determined based on an image captured by an image capturing device of the first electronic device and marked in the map; and
sending the interactive instruction to the first electronic device corresponding to the coordinate information, so that the first electronic device executes the interactive operation.
14. The third electronic device of claim 13, wherein the interactive instructions comprise interactive instructions generated based on user input operations on the presented map.
15. The third electronic device of claim 13, wherein the step of generating an interactive instruction based on the recognized input operation of the user or the recognized voice data comprises: determining the interactive instruction corresponding to the input operation or the recognized voice data based on a mapping relationship between the input operation and the instruction set of the first electronic device to which the user's input operation or recognized voice data points.
16. The third electronic device of claim 15, wherein the step of determining the interactive instruction corresponding to the input operation or the recognized voice data comprises: displaying, on a touch screen of the third electronic device, a map marked with the position of the first electronic device; receiving, on the touch screen, a gesture operation by the user that starts from the position of the first electronic device on the map; and determining, by the third electronic device, the interactive instruction corresponding to the gesture operation according to a preset mapping relationship between the instruction set of the first electronic device and gesture operations.
17. The third electronic device of claim 15, wherein the step of determining the interactive instruction corresponding to the input operation or the recognized voice data comprises: performing semantic recognition on the voice data in the acquired multimedia data by using a semantic translator obtained through machine training in advance, converting the voice data into text data, matching the text data against instruction keywords in a pre-configured interactive instruction set, and thereby determining the interactive instruction corresponding to the voice data.
18. The third electronic device of claim 13, wherein the processing device further performs the step of marking and displaying the corresponding location information on the map during the interactive operation performed by the first electronic device.
19. The third electronic device according to claim 13, wherein the interactive instruction is used to instruct the first electronic device to perform a corresponding operation based on its own location information and/or based on destination location information.
20. A computer-readable storage medium, in which a computer program for interoperation among multiple devices is stored, wherein the computer program, when executed, implements the method for interoperation among multiple devices of any one of claims 1 to 5 or the method for interoperation among multiple devices of any one of claims 7 to 12.
CN202210384292.8A 2019-05-09 2019-05-09 Method for positioning equipment on map, server and mobile robot Pending CN115014344A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210384292.8A CN115014344A (en) 2019-05-09 2019-05-09 Method for positioning equipment on map, server and mobile robot

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN202210384292.8A CN115014344A (en) 2019-05-09 2019-05-09 Method for positioning equipment on map, server and mobile robot
CN201980000670.4A CN110268225B (en) 2019-05-09 2019-05-09 Method for cooperative operation among multiple devices, server and electronic device
PCT/CN2019/086282 WO2020223975A1 (en) 2019-05-09 2019-05-09 Method of locating device on map, server, and mobile robot

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201980000670.4A Division CN110268225B (en) 2019-05-09 2019-05-09 Method for cooperative operation among multiple devices, server and electronic device

Publications (1)

Publication Number Publication Date
CN115014344A true CN115014344A (en) 2022-09-06

Family

ID=67912057

Family Applications (3)

Application Number Title Priority Date Filing Date
CN201980000670.4A Active CN110268225B (en) 2019-05-09 2019-05-09 Method for cooperative operation among multiple devices, server and electronic device
CN202210384292.8A Pending CN115014344A (en) 2019-05-09 2019-05-09 Method for positioning equipment on map, server and mobile robot
CN202210383301.1A Pending CN115060262A (en) 2019-05-09 2019-05-09 Method for positioning equipment on map, server and mobile robot

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201980000670.4A Active CN110268225B (en) 2019-05-09 2019-05-09 Method for cooperative operation among multiple devices, server and electronic device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210383301.1A Pending CN115060262A (en) 2019-05-09 2019-05-09 Method for positioning equipment on map, server and mobile robot

Country Status (2)

Country Link
CN (3) CN110268225B (en)
WO (1) WO2020223975A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827351B (en) * 2020-01-09 2020-04-14 西南交通大学 Automatic generation method of voice tag of new target for robot audio-visual collaborative learning
CN111399432A (en) * 2020-03-26 2020-07-10 上海有个机器人有限公司 Robot remote monitoring method, intelligent equipment, cloud server and monitoring system
EP4152772A4 (en) * 2020-05-15 2024-02-28 Beijing Xiaomi Mobile Software Co Ltd Map acquiring method and device for internet of things device
CN111637893A (en) * 2020-06-10 2020-09-08 中国电子科技集团公司第五十四研究所 Cooperative positioning method based on machine vision
CN111739089B (en) * 2020-08-18 2021-03-09 佛山隆深机器人有限公司 System for judging stacking area of product in closed or semi-closed space
CN112261362A (en) * 2020-09-29 2021-01-22 厦门盈趣科技股份有限公司 Security and protection floor sweeping robot, linkage security and protection method and storage medium
CN112162559B (en) * 2020-09-30 2021-10-15 杭州海康机器人技术有限公司 Method, device and storage medium for multi-robot mixing
CN113535728B (en) * 2021-07-21 2024-03-08 山东新一代信息产业技术研究院有限公司 Map storage modeling method and electronic device for indoor and outdoor general use of robot
CN113916244A (en) * 2021-10-08 2022-01-11 江苏眸视机器人科技有限公司 Method and device for setting inspection position, electronic equipment and readable storage medium
CN115439536B (en) * 2022-08-18 2023-09-26 北京百度网讯科技有限公司 Visual map updating method and device and electronic equipment
CN115731736A (en) * 2022-12-03 2023-03-03 中邮科通信技术股份有限公司 System and method for realizing indoor parking positioning navigation based on AI vision technology

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9026248B1 (en) * 2011-05-06 2015-05-05 Google Inc. Methods and systems for multirobotic management
CN103249142B (en) * 2013-04-26 2016-08-24 东莞宇龙通信科技有限公司 A kind of localization method, system and mobile terminal
CN104424635A (en) * 2013-08-30 2015-03-18 联想(北京)有限公司 Information processing method, system and equipment
CN105865419A (en) * 2015-01-22 2016-08-17 青岛通产软件科技有限公司 Autonomous precise positioning system and method based on ground characteristic for mobile robot
CN105136144A (en) * 2015-08-05 2015-12-09 中科新松有限公司 Mall navigation system and mall navigation method
CN109690438A (en) * 2016-07-07 2019-04-26 深圳市大疆创新科技有限公司 For using the method and system of machine readable code control loose impediment
CN107223269B (en) * 2016-12-29 2021-09-28 达闼机器人有限公司 Three-dimensional scene positioning method and device
CN108459595A (en) * 2017-06-16 2018-08-28 炬大科技有限公司 A kind of method in mobile electronic device and the mobile electronic device
CN107544515A (en) * 2017-10-10 2018-01-05 苏州中德睿博智能科技有限公司 Multirobot based on Cloud Server builds figure navigation system and builds figure air navigation aid
WO2019232806A1 (en) * 2018-06-08 2019-12-12 珊口(深圳)智能科技有限公司 Navigation method, navigation system, mobile control system, and mobile robot
CN109460020A (en) * 2018-10-31 2019-03-12 北京猎户星空科技有限公司 Robot map sharing method, device, robot and system

Also Published As

Publication number Publication date
WO2020223975A1 (en) 2020-11-12
CN110268225A (en) 2019-09-20
CN115060262A (en) 2022-09-16
CN110268225B (en) 2022-05-10

Similar Documents

Publication Publication Date Title
CN110268225B (en) Method for cooperative operation among multiple devices, server and electronic device
US11816907B2 (en) Systems and methods for extracting information about objects from scene information
CN113284240B (en) Map construction method and device, electronic equipment and storage medium
Martin-Martin et al. Jrdb: A dataset and benchmark of egocentric robot visual perception of humans in built environments
Paya et al. A state-of-the-art review on mapping and localization of mobile robots using omnidirectional vision sensors
JP6144826B2 (en) Interactive and automatic 3D object scanning method for database creation
Davison et al. MonoSLAM: Real-time single camera SLAM
AU2017300937A1 (en) Estimating dimensions for an enclosed space using a multi-directional camera
CN108419446A (en) System and method for the sampling of laser depth map
JP7236565B2 (en) POSITION AND ATTITUDE DETERMINATION METHOD, APPARATUS, ELECTRONIC DEVICE, STORAGE MEDIUM AND COMPUTER PROGRAM
CN111220148A (en) Mobile robot positioning method, system and device and mobile robot
CN108789421B (en) Cloud robot interaction method based on cloud platform, cloud robot and cloud platform
CA3069813C (en) Capturing, connecting and using building interior data from mobile devices
Li et al. Novel indoor mobile robot navigation using monocular vision
Vallone et al. Danish airs and grounds: A dataset for aerial-to-street-level place recognition and localization
Gao et al. Complete and accurate indoor scene capturing and reconstruction using a drone and a robot
KR20210140766A (en) Digital reconstruction methods, devices and systems for traffic roads
Yang et al. Seeing as it happens: Real time 3D video event visualization
Steenbeek CNN based dense monocular visual SLAM for indoor mapping and autonomous exploration
Høilund et al. Improving stereo camera depth measurements and benefiting from intermediate results
Show et al. 3D Mapping and Indoor Navigation for an Indoor Environment of the University Campus
Salah et al. Summarizing large scale 3D mesh for urban navigation
Yudin et al. Cloudvision: Dnn-based visual localization of autonomous robots using prebuilt lidar point cloud
Chourey Cooperative human-robot search in a partially-known environment using multiple UAVs
Pirker et al. Histogram of Oriented Cameras-A New Descriptor for Visual SLAM in Dynamic Environments.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination