WO2022121606A1 - Method and system for obtaining identification information of a device or its user in a scene - Google Patents

Method and system for obtaining identification information of a device or its user in a scene

Info

Publication number
WO2022121606A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
user
camera
location information
spatial location
Prior art date
Application number
PCT/CN2021/129727
Other languages
English (en)
Chinese (zh)
Inventor
方俊
李江亮
牛旭恒
Original Assignee
北京外号信息技术有限公司
Priority date
Filing date
Publication date
Priority claimed from CN202011442020.6A external-priority patent/CN114663491A/zh
Priority claimed from CN202011440905.2A external-priority patent/CN112528699B/zh
Priority claimed from CN202011440875.5A external-priority patent/CN112581630B/zh
Application filed by 北京外号信息技术有限公司
Publication of WO2022121606A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06KGRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods

Definitions

  • the present invention relates to the field of information interaction, and in particular, to a method and system for obtaining identification information of a device or its user in a scene.
  • in some scenarios, sensors such as cameras and radars are deployed in the scene to sense, locate, and track the people or equipment that appear in the scene.
  • although these sensors can sense the position or movement of people or equipment in the scene, they cannot obtain the identification information of these people or equipment, which makes it difficult to provide services for these people or equipment.
  • although facial recognition technology can be used to identify people, it raises user privacy concerns and may carry legal risks.
  • in addition, these sensors can usually only realize one-way information transmission (that is, collecting relevant information in the scene), and cannot provide information to users in the scene based on this information (for example, the user's real-time location information), such as navigation information, instruction information, or commercial promotion information.
  • to provide such services, on-site manual service is usually adopted, which requires setting up consultation desks and arranging service personnel at a certain density in the venue; this is costly and offers low flexibility.
  • One aspect of the present invention relates to a method for obtaining identification information of a device or its user in a scene in which one or more sensors and one or more visual markers are deployed, the sensors being capable of being used for sensing or determining the location information of devices or users in the scene. The method includes: receiving information sent by the device, the information including the identification information of the device or its user and the spatial location information of the device, wherein the device determines its spatial location information by scanning a visual marker; identifying the device or its user within the sensing range of the sensor based on the spatial location information of the device; and associating the identification information of the device or its user with the device or its user within the sensing range of the sensor, in order to provide services to the device or its user.
  • Another aspect of the present invention relates to a system for obtaining identification information of a device in a scene or a user thereof, the system comprising: one or more sensors deployed in the scene, the sensors being capable of sensing or determining location information of devices or users in the scene; one or more visual markers deployed in the scene; and a server configured to implement the methods described in the embodiments of the present application.
  • Another aspect of the present invention relates to a storage medium in which a computer program is stored, the computer program, when executed by a processor, being usable to implement the method described in the embodiments of the present application.
  • Another aspect of the present invention relates to an electronic device comprising a processor and a memory, wherein the memory stores a computer program that, when executed by the processor, can be used to implement the method described in the embodiments of the present application.
  • With the solution of the present invention, not only can the position or movement of people or equipment in the scene be sensed, but the identification information of these people or equipment can also be obtained, and services can be provided to the corresponding people or equipment through the identification information.
  • Not only can the location information of users in the scene be collected or monitored, but information, such as navigation information, instruction information, and business promotion information, can also be provided to users based on their real-time location information.
  • Figure 1 shows an exemplary visual sign;
  • Figure 2 shows an optical communication device that can be used as a visual sign;
  • Figure 3 shows a system for obtaining identification information of a device in a scene or its user, according to one embodiment;
  • Figure 4 shows a method for obtaining identification information of a device in a scene or a user thereof, according to one embodiment;
  • Figure 5 shows a method for providing a service to a device in a scene or its user, according to one embodiment;
  • Figure 6 shows a method for providing information to a user in a scene through a device (here, glasses are used as an example), according to one embodiment;
  • Figure 7 shows a system for providing information to a user in a scene through glasses, according to one embodiment;
  • Figure 8 shows a method for providing information to a user in a scene through glasses, according to one embodiment;
  • Figure 9 shows a user interaction system according to one embodiment;
  • Figure 10 shows a user interaction method according to one embodiment;
  • Figure 11 shows a first user and a virtual object associated with the first user as observed by a second user through his device, according to one embodiment;
  • Figure 12 shows the actual image observed by a user through his cell phone screen, according to one embodiment.
  • Visual signs refer to signs that can be recognized by the human eye or electronic devices, which can have various forms.
  • visual markers may be used to convey information that can be obtained by smart devices (eg, cell phones, smart glasses, etc.).
  • the visual sign may be an optical communication device capable of emitting encoded optical information, or the visual sign may be a graphic with encoded information, such as a two-dimensional code (eg, QR code, applet code), barcode, or the like.
  • Figure 1 shows an exemplary visual sign with a specific black and white pattern.
  • FIG. 2 shows an optical communication device 100 that can be used as a visual sign, which includes three light sources (respectively, a first light source 101, a second light source 102, and a third light source 103).
  • the optical communication device 100 also includes a controller (not shown in FIG. 2 ) for selecting a corresponding driving mode for each light source according to the information to be communicated.
  • the controller can use different driving signals to control the light-emitting manner of each light source, so that when the optical communication device 100 is photographed by a device with an imaging function, the imaging of the light sources can present different appearances (e.g., different colors, patterns, or brightness).
  • by analyzing the imaging of the light sources, the driving mode of each light source at that moment can be determined, so as to parse out the information transmitted by the optical communication device 100 at that moment.
  • each visual sign may be assigned identification information (ID), which is used by the manufacturer, manager, or user of the visual sign, etc., to uniquely identify or distinguish the visual sign.
  • the user can use a device to capture an image of the visual sign to obtain the identification information transmitted by the visual sign, so as to access a corresponding service based on the identification information, for example, visiting a web page associated with the identification information, or obtaining other information associated with the identification information (e.g., the position or attitude information of the visual sign corresponding to the identification information).
  • the devices mentioned herein can be, for example, devices carried or controlled by users (eg, mobile phones, tablet computers, smart glasses, AR glasses, smart helmets, smart watches, cars, etc.), or machines that can move autonomously (eg, drones, driverless cars, robots, etc.).
  • the device can acquire an image containing the visual sign through the image acquisition device on it, and, by analyzing the imaging of the visual sign in the image, can identify the information transmitted by the visual sign and determine its position or attitude information relative to the visual sign.
  • Sensors capable of sensing the location of objects may be various sensors capable of sensing or determining location information of objects in a scene, such as cameras, radars (eg, lidars, millimeter-wave radars), wireless signal transceivers, and the like.
  • a target in a scene can be a person or an object in the scene.
  • a camera is used as an example of a sensor for description.
  • Figure 3 illustrates a system for obtaining identification information of a device in a scene or its user, which can be used to provide services or information to a user in the scene through the device, according to one embodiment.
  • the system includes a visual sign 301, a camera 302, and a server (not shown in Figure 3).
  • User 303 is in the scene and carries device 304 .
  • the device 304 has an image capture device on it and can identify the visual sign 301 through the image capture device.
  • device 304 may be a cell phone carried by the user.
  • device 304 may be glasses worn by the user. The glasses themselves may have the ability to directly access the network; for example, the glasses may access the network via Wi-Fi, a telecommunications network, or the like.
  • the glasses may also not have the ability to directly access the network, but may indirectly access the network through a connection (eg, a Bluetooth connection or a wired connection) between it and the user's other devices (eg, a mobile phone, a watch, etc.).
  • the visual sign 301 and the camera 302 are each installed in the real scene in a specific position and attitude (hereinafter collectively referred to as "pose").
  • the server may obtain the respective pose information of the camera and the visual marker, and may obtain relative pose information between the camera and the visual marker based on the respective pose information of the camera and the visual marker.
  • the server may also directly obtain the relative pose information between the camera and the visual marker. In this way, the server can obtain a transformation matrix between the camera coordinate system and the visual sign coordinate system, and the transformation matrix may include, for example, a rotation matrix R and a displacement vector t between the two coordinate systems.
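  • By way of illustration, the following is a minimal sketch (Python/NumPy, with hypothetical pose values) of composing such a transformation matrix between the camera coordinate system and the visual sign coordinate system from the respective poses of the camera and the visual sign in the scene; the particular pose representation used here is an assumption and is not prescribed by the present description.
```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a rotation matrix R and a translation vector t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical poses of the camera and of the visual sign in the scene coordinate system.
T_scene_camera = make_transform(np.eye(3), np.array([2.0, 0.0, 3.0]))
T_scene_marker = make_transform(np.eye(3), np.array([0.0, 1.0, 3.0]))

# Relative pose of the visual sign in the camera coordinate system:
# its upper-left 3x3 block is the rotation matrix R and its last column holds the displacement vector t.
T_camera_marker = np.linalg.inv(T_scene_camera) @ T_scene_marker
R, t = T_camera_marker[:3, :3], T_camera_marker[:3, 3]
print(R, t)
```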
  • the camera may be a camera installed in a fixed position and having a fixed orientation, but it is understood that the camera may also be a camera that can move (for example, the position or direction can be changed), as long as its current pose information can be determined.
  • the current pose information of the camera can be set by the server, and the movement of the camera can be controlled based on the pose information, or the movement of the camera can be controlled by the camera itself or other devices, and the current pose information of the camera can be sent to the server.
  • more than one camera may be included in the system, and more than one visual sign may also be included.
  • a scene coordinate system (which may also be referred to as a real world coordinate system) may be established for the real scene; the transformation matrix between the camera coordinate system and the scene coordinate system may be determined based on the pose information of the camera in the real scene, and the transformation matrix between the visual landmark coordinate system and the scene coordinate system may be determined based on the pose information of the visual landmark in the real scene.
  • having a relative pose between the camera and the visual sign means that a relative pose objectively exists between the two; it does not require the system to pre-store or use this relative pose information. For example, the respective pose information of the camera and the visual sign may be stored in the system, and the relative pose of the two need not be calculated or used.
  • Cameras can be used to track objects in a real scene, which can be stationary or moving, such as people, stationary objects, movable objects, etc. in the scene.
  • a camera can be used to track the position of a person or object in a real scene by various methods in the prior art.
  • the location information of objects in the scene can be determined in combination with scene information (eg, information on the plane on which a person or object in the scene is located).
  • the position information of the target can be determined according to the position of the target in the field of view of the camera and the depth information of the target.
  • the position information of the target can be determined according to the position of the target in the field of view of each camera.
  • the system may have multiple visual signs or multiple cameras, and the fields of view of the multiple cameras may be continuous or discontinuous.
  • Fig. 4 shows a method for obtaining identification information of a device in a scene or a user thereof according to an embodiment.
  • the method can be implemented using the system shown in Fig. 3 and can include the following steps:
  • Step 401 Receive information sent by the device, where the information includes identification information of the device or its user and spatial location information of the device.
  • the information sent by the device may be various information, such as alarm information, help information, service request information, and so on.
  • the identification information of the device or its user can be any information that can be used to identify or distinguish the device or its user, such as device ID information, the device's phone number, account information for an application on the device, the user's name or nickname, the user's identity information, the user's account information, etc.
  • the user 303 may use the device 304 to determine the spatial location information of the device 304 by scanning the visual markers 301 deployed in the scene.
  • the user 303 may send information to the server through the device 304; the information may include the spatial position information of the device 304, which may be the spatial position information of the device 304 relative to the visual sign 301 or the spatial position information of the device 304 in the scene.
  • For example, an image of the visual sign 301 may be captured using the device 304; the identification information of the visual sign 301 and the spatial position information of the device 304 relative to the visual sign 301 may be determined by analyzing the captured image of the visual sign 301; the position and attitude information of the visual sign 301 in space may be determined based on the identification information of the visual sign 301; and the spatial position information of the device 304 in the scene may then be determined based on the position and attitude information of the visual sign 301 in space and the spatial position information of the device 304 relative to the visual sign 301.
  • the device 304 can send the identification information of the visual marker 301 and the spatial position information of the device 304 relative to the visual marker 301 to the server, so that the server can determine the spatial position information of the device 304 in the scene.
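  • As an illustrative sketch of this computation (Python/NumPy; the pose of the visual marker, the relative position of the device, and the coordinate conventions are hypothetical assumptions), the spatial position of the device in the scene can be obtained by composing the pose of the visual marker in the scene with the position of the device relative to the visual marker:
```python
import numpy as np

# Hypothetical pose of visual marker 301 in the scene coordinate system
# (e.g., looked up by the server based on the marker's identification information).
R_scene_marker = np.eye(3)
t_scene_marker = np.array([5.0, 2.0, 1.5])

# Hypothetical position of device 304 relative to the visual marker,
# as estimated by the device from its captured image of the marker.
p_device_in_marker = np.array([0.3, -0.1, 2.0])

# Position of the device in the scene coordinate system.
p_device_in_scene = R_scene_marker @ p_device_in_marker + t_scene_marker
print(p_device_in_scene)
```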
  • the device 304 can also be used to scan the visual marker 301 to determine the gesture information of the device 304 relative to the visual marker 301 or the gesture information of the device 304 in the scene, and the gesture information can be sent to the server.
  • the spatial position information and attitude information of the device may be the spatial position information and attitude information of the device when scanning the visual sign, or the real-time position information and attitude information at any moment after scanning the visual sign.
  • a device can determine its initial spatial position information and attitude information when scanning a visual sign, and can then use various sensors built into the device (e.g., an acceleration sensor, a magnetic sensor, an orientation sensor, a gravity sensor, a gyroscope, a camera, etc.) to measure or track its position change and/or attitude change by methods known in the art (e.g., inertial navigation, visual odometry, SLAM, VSLAM, SFM, etc.), so as to determine the real-time position and/or attitude of the device.
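  • A minimal dead-reckoning sketch of this idea follows (Python/NumPy, with hypothetical displacement values); it only illustrates accumulating sensor-estimated displacements onto the initial position obtained by scanning a visual sign, and omits attitude tracking and drift correction:
```python
import numpy as np

# Initial spatial position of the device in the scene, obtained by scanning a visual sign (hypothetical).
position = np.array([5.0, 2.0, 1.5])

# Hypothetical displacement estimates (in metres) produced by the device's built-in sensors
# (e.g., inertial navigation or visual odometry) after the visual sign was scanned.
displacements = [
    np.array([0.10, 0.00, 0.0]),
    np.array([0.12, 0.02, 0.0]),
    np.array([0.09, 0.05, 0.0]),
]

# Accumulate the displacements to maintain the real-time position of the device.
for d in displacements:
    position = position + d
print(position)  # real-time spatial position after the tracked movement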
  • the spatial location information of the device received by the server may be coordinate information, but is not limited thereto; any information that can be used to derive the spatial location of the device belongs to spatial location information.
  • the spatial location information of the device received by the server may be an image of a visual sign captured by the device, and the server may determine the spatial location of the device according to the image.
  • any information that can be used to derive a device's pose is pose information, which in one embodiment may be an image of a visual landmark captured by the device.
  • Step 402 Identify the device or its user in the image captured by the camera based on the spatial location information of the device.
  • the imaging position of the device or its user in the image captured by the camera may be determined based on the spatial position information of the device, and the device or the user in the image captured by the camera may be identified according to the imaging position.
  • the imaging position of the user in the image captured by the camera can be determined based on the spatial location information of the device. Since the user usually scans the visual sign while holding the device or wearing the device, the spatial position of the user can be inferred according to the spatial position of the device, and then the imaging position of the user in the image captured by the camera can be determined according to the spatial position of the user. The imaging position of the device in the image captured by the camera can also be determined according to the spatial position of the device, and then the imaging position of the user can be inferred according to the imaging position of the device.
  • the imaging position of the device in the image captured by the camera can be determined based on the spatial location information of the device.
  • a pre-established mapping relationship between one or more spatial positions in the scene (not necessarily all) and one or more imaging positions in the image captured by the camera, together with the spatial location information of the device, may be used to determine the imaging position of the device or its user in the image captured by the camera. For example, for a hall scene, several spatial positions on the floor of the hall can be selected and their imaging positions in the image captured by the camera determined; a mapping relationship between these spatial positions and the imaging positions can then be established, and the imaging position corresponding to a given spatial position can be deduced based on this mapping relationship.
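  • The following is a minimal sketch of such a mapping for a planar (floor) scene (Python/NumPy, with hypothetical calibration points); it fits a homography from floor coordinates to image coordinates, which is one possible realization of the mapping relationship, assumed here purely for illustration:
```python
import numpy as np

def fit_homography(src_pts, dst_pts):
    """Estimate a 3x3 homography mapping src_pts to dst_pts (lists of (x, y), at least 4 points) via DLT."""
    A = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def map_point(H, pt):
    """Apply the homography to a single 2D point."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return np.array([x / w, y / w])

# Hypothetical calibration: several floor positions in the hall (scene coordinates, metres)
# and their imaging positions in the image captured by the camera (pixels).
floor_xy = [(0, 0), (4, 0), (4, 6), (0, 6), (2, 3)]
image_uv = [(110, 620), (830, 605), (760, 190), (180, 200), (470, 390)]
H = fit_homography(floor_xy, image_uv)

# Deduce the imaging position corresponding to a device standing at (1.0, 2.5) on the floor.
print(map_point(H, (1.0, 2.5)))
```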
  • the imaging position of the device or its user in the image captured by the camera may be determined based on the spatial position information of the device and the pose information of the camera, where the pose information of the camera may be its pose information in the scene or its pose information relative to the visual markers.
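  • As a sketch of this projection-based variant (Python/NumPy; the pinhole camera model, intrinsics, and pose values are assumptions not prescribed by the description), the device's spatial position can be projected into the camera image to obtain an expected imaging position, and the detected person closest to that position can then be selected:
```python
import numpy as np

# Hypothetical pinhole intrinsics and camera pose in the scene coordinate system.
fx, fy, cx, cy = 900.0, 900.0, 640.0, 360.0
R_scene_camera = np.eye(3)
t_scene_camera = np.array([0.0, 0.0, 0.0])

def expected_imaging_position(p_scene):
    """Project a spatial position (scene coordinates) into pixel coordinates of the camera image."""
    p_cam = R_scene_camera.T @ (np.asarray(p_scene) - t_scene_camera)
    return np.array([fx * p_cam[0] / p_cam[2] + cx, fy * p_cam[1] / p_cam[2] + cy])

# Imaging positions of persons detected in the camera image (hypothetical pixel coordinates).
detections = {"person_a": np.array([710.0, 420.0]), "person_b": np.array([200.0, 380.0])}

# Select the detection closest to the imaging position expected from the device's spatial position.
expected = expected_imaging_position([0.5, 0.2, 4.0])
closest = min(detections, key=lambda name: np.linalg.norm(detections[name] - expected))
print(expected, closest)
```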
  • the device or its user can be identified in the image according to the imaging position. For example, a device or user closest to the imaging position may be selected, or a device or user whose distance from the imaging position satisfies a predetermined condition may be selected.
  • in one embodiment, the spatial location information of the device may be compared with the spatial location information of one or more devices or users determined according to the tracking result of the camera.
  • a camera can be used to determine the spatial position of a person or object in a real scene through various methods in the prior art. For example, in the case of using a single monocular camera, the location information of objects in the scene can be determined in combination with scene information (eg, information on the plane on which a person or object in the scene is located). For the case of using a binocular camera, the position information of the target can be determined according to the position of the target in the field of view of the camera and the depth information of the target.
  • the position information of the target can be determined according to the position of the target in the field of view of each camera.
  • the spatial location information of one or more users may also be determined by using images captured by a camera in combination with lidar and the like.
  • in one embodiment, the device may send its real-time spatial location information (e.g., satellite positioning information or location information obtained through the device's sensors) to the server, the camera tracks the locations of the multiple users or devices in the scene, and the device or its user is identified by comparing the real-time spatial location information received from the device with the locations of the multiple users or devices tracked by the camera.
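  • A minimal sketch of this comparison follows (Python/NumPy; the track identifiers, positions, and distance threshold are hypothetical, the threshold standing in for a predetermined matching condition):
```python
import numpy as np

# Hypothetical positions of users currently tracked by the camera, in scene coordinates,
# keyed by internal track identifiers assigned by the tracker.
tracked_positions = {
    "track_1": np.array([3.2, 4.1, 0.0]),
    "track_2": np.array([7.8, 1.5, 0.0]),
    "track_3": np.array([3.0, 9.9, 0.0]),
}

def identify_track(device_position, tracks, max_distance=1.0):
    """Return the tracked target closest to the position reported by the device,
    provided the distance satisfies the predetermined condition (here, a maximum distance)."""
    best_id, best_dist = None, float("inf")
    for track_id, pos in tracks.items():
        dist = np.linalg.norm(pos - device_position)
        if dist < best_dist:
            best_id, best_dist = track_id, dist
    return best_id if best_dist <= max_distance else None

# Real-time spatial position reported by the device (hypothetical).
reported = np.array([3.0, 4.3, 0.0])
print(identify_track(reported, tracked_positions))  # -> "track_1"
```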
  • in one embodiment, feature information of the device user may be determined based on the information sent by the device, feature information of the multiple users may be collected by the camera, and the device user may be identified by comparing the feature information of the multiple users with the feature information of the device user.
  • in one embodiment, one or more cameras whose field of view can cover the device or its user may first be determined, and then the imaging position of the device or its user in the images captured by the one or more cameras is determined.
  • Step 403 Associate the identification information of the device or its user with the device or its user in the image captured by the camera, so as to use the identification information to provide a service to the device or its user.
  • the received identification information of the device or its user may be associated with the device or its user in the image.
  • in this way, the device's ID information, phone number, or account information of an application on the device, or the user's name or nickname, identity information, account information, and so on, can be known.
  • the identification information can be used to provide various services to the device or its user, such as navigation service, explanation service, information display service, and so on. In one embodiment, the above information may be provided visually, audibly, or the like.
  • a virtual object may be superimposed on a display medium of a device (eg, a mobile phone or glasses), and the virtual object may be, for example, an icon (eg, a navigation icon), a picture, a text, and the like.
  • the steps in the method shown in FIG. 4 may be implemented by the server in the system shown in FIG. 3 , but it is understood that one or more of these steps may also be implemented by other devices.
  • the device or its user in the scene can also be tracked through the camera to obtain its real-time position information and/or attitude information, or the device itself can be used to obtain its real-time position information and/or attitude information.
  • services can be provided to the device or its user based on the location and/or attitude information.
  • information can be sent, based on the identification information, to the corresponding device or user in the field of view of the camera; the information is, for example, navigation information, explanation information, instruction information, advertisement information, and so on.
  • One or more visual signs and one or more cameras are deployed in a smart factory scenario where robots are used to deliver goods.
  • the camera is used to track the position of the robot, and navigation instructions are sent to the robot according to the tracked position.
  • each robot may be made to scan a visual sign, for example, when entering the scene or the camera's field of view, and send its position information and identification information. In this way, the identification information of each robot within the field of view of the camera can be easily determined, so as to send each robot a travel instruction or a navigation instruction based on its current position and the work task to be completed.
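  • Purely as an illustration of this dispatch logic (Python, with hypothetical robot identifiers, tracked positions, and task goals; the actual instruction format is not specified in the present description):
```python
import math

# Identification information and current tracked positions of robots within the camera's
# field of view, associated as described above (hypothetical positions on the factory floor, in metres).
robots = {"robot_7": (2.0, 3.0), "robot_9": (8.0, 1.0)}

# Goal position of the work task currently assigned to each robot (hypothetical).
tasks = {"robot_7": (6.0, 3.0), "robot_9": (8.0, 7.0)}

def travel_instruction(robot_id):
    """Build a simple travel instruction (heading and distance) from the robot's tracked position."""
    x, y = robots[robot_id]
    gx, gy = tasks[robot_id]
    dx, dy = gx - x, gy - y
    return {"robot": robot_id,
            "heading_deg": math.degrees(math.atan2(dy, dx)),
            "distance_m": math.hypot(dx, dy)}

for robot_id in robots:
    print(travel_instruction(robot_id))
```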
  • information related to a virtual object may be sent to the device; the virtual object may be, for example, a picture, characters, numbers, an icon, a video, a three-dimensional model, etc., and the information related to the virtual object may include the spatial location information of the virtual object.
  • the virtual object can be presented on the display medium of the device.
  • the device may present the virtual object at an appropriate location on its display medium based on the device's or user's spatial location information and/or gesture information.
  • the virtual object may be presented on the display medium of the user equipment in an augmented reality or mixed reality manner, for example.
  • the virtual object is a video image or a dynamic three-dimensional model generated by video capture of live characters.
  • the virtual object may be a video image generated by real-time video capture of service personnel, and the video image may be presented on the display medium of the user equipment, so as to provide services to the user.
  • the spatial position of the video image can be set so that it can be presented on the display medium of the user equipment in the manner of augmented reality or mixed reality.
  • in one embodiment, based on the identification information, the device or user within the field of view of the camera that sent certain information, such as service request information, alarm information, help information, comment information, and the like, can be identified.
  • a virtual object associated with the device or the user may be set according to the information, wherein the spatial location information of the virtual object may be determined based on the location information of the device or the user, and the spatial location of the virtual object may change accordingly as the location of the device or the user changes.
  • the content of the virtual object may be updated according to new information received from the device or user (eg, a new comment by the user).
  • Fig. 5 shows a method for providing a service to a device or a user in a scene according to one embodiment.
  • the method can be implemented using the system shown in Fig. 3 and can include the following steps:
  • Step 501 Receive information sent by the device, where the information includes identification information of the device or its user and spatial location information of the device.
  • Step 502 Identify the device or its user in the image captured by the camera based on the spatial location information of the device.
  • Step 503 Mark the device or its user in the image captured by the camera.
  • the device or user can be marked using a variety of methods; for example, the image of the device or user can be framed, a particular icon can be presented adjacent to the image of the device or user, or the image of the device or user can be highlighted.
  • the imaging area of the marked device or user can be enlarged, or the camera can be made to shoot for the marked device or user.
  • the device or user can be continuously tracked through a camera, and real-time spatial location information and/or gesture information of the device or user can be determined.
  • Step 504 Associate the identification information of the device or its user with the device or its user in the image captured by the camera, so as to use the identification information to provide services to the device or its user.
  • in this way, a person observing the image captured by the camera can know that the device or user currently needs service and can know the current location of the device or user, so that various required services, such as explanation services, navigation services, consulting services, or help services, can be conveniently provided to the device or user.
  • the help desk deployed in the scenario can be replaced, and any user in the scenario can be provided with the services they need in a convenient and low-cost manner.
  • the service may be provided to the user through a device carried or controlled by the user, such as a mobile phone, smart glasses, a vehicle, and the like.
  • the service may be provided visually, audibly, etc. through a telephony function, an application (APP), etc. on the device.
  • the steps in the method shown in FIG. 5 may be implemented by the server in the system shown in FIG. 3 , but it is understood that one or more of these steps may also be implemented by other devices.
  • Fig. 6 shows a method for providing information to a user in a scene through a device (here, glasses are taken as an example) according to an embodiment; the method can be implemented using the system shown in Fig. 3 and can include the following steps:
  • Step 601 Receive information sent by the glasses, where the information includes spatial position information of the glasses.
  • the user may use the glasses to determine the spatial position information of the glasses by scanning the visual landmarks deployed in the scene.
  • the user can send information to the server through the glasses.
  • the glasses can also be used to scan the visual markers to determine the gesture information of the glasses relative to the visual markers or the gesture information of the glasses in the scene, and the gesture information can be sent to the server.
  • the information sent by the glasses may also include information related to the glasses or their users, such as service request information, help information, alarm information, identification information (such as phone numbers, APP account information) etc.
  • the glasses themselves may be capable of direct access to the network.
  • the glasses may not have the ability to directly access the network, but indirectly access the network through a connection between it and, for example, the user's mobile phone.
  • in this case, the server may receive the information sent by the glasses through an intermediate device such as a mobile phone.
  • Step 602 Identify the user of the glasses in the image captured by the camera based on the spatial position information of the glasses.
  • the user's identification information can be associated with the user in order to provide services to the user using the identification information.
  • Step 603 Track the user through the camera and update the spatial location information of the user.
  • a camera may be used to track the user and update the imaging position of the user, and determine the spatial position information of the user based on the updated imaging position.
  • Various visual tracking methods known in the art can be used to track the user in the field of view of the camera and update the imaging position of the user.
  • the camera can remain stationary or move while tracking the user.
  • multiple cameras may be used, which may have a continuous field of view or a discontinuous field of view. Where the field of view is discontinuous, the user's characteristics can be recorded so that the user can be re-identified and tracked when re-entering the field of view of one or more cameras.
  • a pre-established mapping relationship between one or more spatial positions in the scene (not necessarily all) and one or more imaging positions in the image captured by the camera, together with the user's imaging position, may be used to determine the user's spatial location information.
  • the spatial position information of the user may be determined based on the pose information of the camera and the imaging position. For example, in the case of using a depth camera or a multi-lens camera, the direction of the user relative to the camera can be determined based on the imaging position, and the depth information can be used to determine the distance of the user relative to the camera, so as to determine the position of the user relative to the camera; the spatial position information of the user can then be further determined based on the pose information of the camera.
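  • The following minimal sketch (Python/NumPy, with assumed pinhole intrinsics and camera pose) illustrates this back-projection from an imaging position and depth information to the user's position in the scene:
```python
import numpy as np

# Hypothetical camera intrinsics (pinhole model) and camera pose in the scene coordinate system.
fx, fy, cx, cy = 800.0, 800.0, 640.0, 360.0
R_scene_camera = np.eye(3)                   # camera orientation in the scene
t_scene_camera = np.array([0.0, 0.0, 2.5])   # camera position in the scene

def user_position_from_pixel(u, v, depth):
    """Back-project an imaging position (u, v) with a measured depth (distance along the optical
    axis, e.g., from a depth camera) into scene coordinates using the pose information of the camera."""
    # Position of the user relative to the camera, from the imaging position and the depth.
    p_camera = np.array([(u - cx) / fx * depth, (v - cy) / fy * depth, depth])
    # Position of the user in the scene, from the pose information of the camera.
    return R_scene_camera @ p_camera + t_scene_camera

print(user_position_from_pixel(700.0, 400.0, depth=4.0))
```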
  • the distance of the user relative to the camera may be estimated based on the imaging size of the user, and the spatial position information of the user may be determined based on the pose information of the camera and the imaging position.
  • the distance of the user relative to the camera may be determined by using a lidar or the like installed on the camera, and the spatial position information of the user may be determined based on the pose information of the camera and the imaging position.
  • the multiple cameras can be used to jointly determine the spatial location information of the user.
  • the spatial position information of the user may be determined based on the pose information of the camera, the imaging position, and optional other information (eg, coordinate information of the ground in the scene).
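  • As a sketch of the variant that uses ground-plane information (Python/NumPy; the downward-looking camera pose, intrinsics, and ground height are hypothetical assumptions), the user's spatial position can be obtained by intersecting the viewing ray through the imaging position with the ground plane:
```python
import numpy as np

# Hypothetical intrinsics and pose of a camera mounted 3 m above the ground, looking straight down.
fx, fy, cx, cy = 800.0, 800.0, 640.0, 360.0
R_scene_camera = np.array([[1.0, 0.0, 0.0],
                           [0.0, -1.0, 0.0],
                           [0.0, 0.0, -1.0]])
t_scene_camera = np.array([0.0, 0.0, 3.0])

def user_position_on_ground(u, v, ground_z=0.0):
    """Intersect the viewing ray through imaging position (u, v) with the ground plane z = ground_z."""
    ray_camera = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    ray_scene = R_scene_camera @ ray_camera               # ray direction in scene coordinates
    s = (ground_z - t_scene_camera[2]) / ray_scene[2]     # scale so that the ray reaches the ground
    return t_scene_camera + s * ray_scene

print(user_position_on_ground(700.0, 500.0))
```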
  • the user's gesture information may also be determined based on the tracking result of the user by the camera.
  • Step 604 Provide information to the user through the user's glasses based on the user's spatial location information.
  • the user can be provided with various required information, such as navigation information, instruction information, tutorial information, advertising information, other information related to location-based services, and the like.
  • the above information may be provided visually, audibly, or the like.
  • a virtual object may be superimposed on the display medium of the glasses, and the virtual object may be, for example, an icon (eg, a navigation icon), a picture, a text, or the like.
  • the glasses themselves may have the ability to directly access the network, so that the glasses may directly receive indication information from the server.
  • the glasses may not have the ability to directly access the network, but may indirectly access the network through a connection between them and, for example, the user's mobile phone; in this case, the glasses may receive the indication information from the server through an intermediate device such as the mobile phone.
  • information may be further provided to the user in combination with the attitude information of the glasses or of their user.
  • the posture information of the glasses or the user thereof may be determined by the glasses, or the posture information of the user may be determined by the user image captured by the camera, and the posture information may include the orientation information of the user.
  • the posture information of the glasses can be obtained through their built-in sensors (for example, a gravity sensor, a magnetic sensor, an orientation sensor, etc.), for example, by tracking changes from an initial posture or by direct measurement with the built-in sensors.
  • the server may directly receive the gesture information from the glasses, or receive the gesture information through an intermediate device such as a mobile phone.
  • the steps in the method shown in FIG. 6 may be implemented by the server in the system shown in FIG. 3 , but it is understood that one or more of these steps may also be implemented by other devices.
  • FIG. 7 shows a system for providing information to a user in a scene through glasses, including a visual sign 701, a camera 702, and a server (not shown in FIG. 7), according to one embodiment.
  • a user 703 is in the scene and carries glasses 704 and a mobile phone 705 .
  • the mobile phone 705 can recognize the visual sign 701 through the image capture device on it, so the glasses 704 may not have an image capture device, or, although there is an image capture device on them, that image capture device may not have the ability to recognize the visual sign 701.
  • FIG. 8 illustrates a method of providing information to a user in a scene through glasses, which may be implemented using the system shown in FIG. 7 , according to one embodiment.
  • the method includes the following steps (part of the steps are similar to the steps in FIG. 6 , and will not be repeated here, but it can be understood that the content described for each step in FIG. 6 can also be applied to the corresponding steps in FIG. 8 ):
  • Step 801 Receive information sent by the user's mobile phone, where the information includes spatial location information of the mobile phone.
  • the user can use the mobile phone to determine the spatial location information of the mobile phone by scanning the visual landmarks deployed in the scene.
  • the gesture information of the mobile phone can also be determined by scanning the visual sign, and the gesture information can be sent to the server.
  • Step 802 Identify the user of the mobile phone in the image captured by the camera based on the spatial location information of the mobile phone.
  • the user's identification information can be associated with the user in order to provide services to the user using the identification information.
  • Step 803 Track the user through the camera and update the spatial location information of the user.
  • the user's gesture information can also be determined.
  • Step 804 Provide information to the user through the user's glasses based on the user's spatial location information.
  • the glasses themselves may have the ability to directly access the network, so that the glasses may directly receive indication information from the server.
  • the glasses may not have the ability to directly access the network, but may indirectly access the network through a connection between them and, for example, the user's mobile phone; in this case, the glasses may receive the indication information from the server through an intermediate device such as the mobile phone.
  • the server may first send first information to the user's mobile phone, and the mobile phone may then send second information (which may be the same as or different from the first information) to the glasses based on the first information, so as to provide the user with location-based services through the glasses.
  • information may be further provided to the user in combination with the attitude information of the glasses or of their user.
  • the user may also not use the glasses, but only use the cell phone.
  • information may be provided to the user through the user's mobile phone based on the user's spatial location information.
  • the information may be further provided to the user in combination with the gesture information of the mobile phone or its user.
  • the gesture information of the user can be determined through the mobile phone, or the user's gesture information can be determined through the user image captured by the camera.
  • the gesture information of the mobile phone can be obtained through its built-in sensor.
  • a device for scanning a visual sign to determine its spatial location information may be referred to as a "position acquisition device", and a device for providing information to a user may be referred to as an "information receiving device”.
  • the location obtaining device and the information receiving device may be the same device, such as the user's mobile phone or the user's glasses; the location obtaining device and the information receiving device may also be different devices, for example, the user's mobile phone and the user's glasses, respectively.
  • Figure 9 illustrates a user interaction system including a visual sign 901, a camera 902, and a server (not shown in Figure 9) according to one embodiment.
  • the camera and the visual sign are each deployed in a real scene with a specific position and attitude (hereinafter collectively referred to as "pose"), and the scene also has a first user 903 and a second user 905 who carry a first device 904 respectively. and the second device 906.
  • the first device 904 and the second device 906 have image capture devices on them and can identify the visual sign 901 through the image capture devices.
  • the first device 904 and the second device 906 may be, for example, mobile phones, glasses and other devices.
  • FIG. 10 shows a user interaction method according to one embodiment, which can be implemented using the above-mentioned system, and can include the following steps:
  • Step 1001 Receive information sent by a first device of a first user, where the information includes spatial location information of the first device and identification information of the first user or the first device.
  • the first user may use the first device to determine the spatial location information of the first device by scanning the visual markers deployed in the scene.
  • the first device can also be used to scan the visual marker to determine the gesture information of the first device relative to the visual marker or the gesture information of the first device in the scene, and the gesture information can be sent to the server.
  • Step 1002 Identify the first user in the image captured by the camera based on the spatial location information of the first device.
  • Step 1003 Associate the identification information of the first user or the first device with the first user in the image captured by the camera.
  • Step 1004 Track the first user through the camera and update the spatial location information of the first user.
  • the gesture information of the user or the device may also be determined based on the tracking result of the user or the device by the camera.
  • Step 1005 Set relevant information of the first virtual object associated with the first user, the relevant information including content information and spatial position information, wherein the spatial position information of the first virtual object is set according to the spatial position information of the first user.
  • the spatial position of the first virtual object may be configured to be at a predetermined distance above the first user.
  • the content information of the first virtual object is information used to describe the content of the virtual object, which may include, for example, pictures, characters, numbers, icons, animations, videos, three-dimensional models, etc. contained in the virtual object, and may also include the virtual object's shape information, color information, size information, posture information, etc.
  • the content information of the first virtual object may be set according to the information from the first user or the first device identified by the identification information of the first user or the first device.
  • the content information of the first virtual object may be, for example, the occupation, identity, gender, age, name, nickname, etc. of the first user.
  • the spatial location information of the first virtual object may change accordingly as the location of the first user changes, and the content information of the virtual object may be updated according to new information received from the first user or the first device (e.g., a new comment by the user), for example, by updating the textual content of the virtual object.
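  • A minimal sketch of such an update loop follows (Python; the content, the data layout, and the height offset above the user are assumed values for illustration only):
```python
# First virtual object associated with the first user (hypothetical content and layout).
virtual_object = {
    "content": {"text": "Hello, I am user 903", "icon": "speech_bubble"},
    "position": None,  # spatial position in scene coordinates
}

HEIGHT_OFFSET = 0.3  # assumed predetermined distance above the first user, in metres

def on_user_position_update(user_position):
    """Called whenever camera tracking updates the first user's spatial position."""
    x, y, z = user_position
    virtual_object["position"] = (x, y, z + HEIGHT_OFFSET)

def on_new_user_message(text):
    """Called when new information (e.g., a new comment) is received from the first user or first device."""
    virtual_object["content"]["text"] = text

on_user_position_update((3.2, 4.1, 1.7))
on_new_user_message("Looking for the pick-up point")
print(virtual_object)
```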
  • the pose information of the virtual object may also be set, and the pose information of the virtual object may be set based on the pose information of the device or user associated therewith, but may also be set in other ways.
  • Step 1006 Send the relevant information of the first virtual object to the second device of the second user.
  • Information about the first virtual object can be used by the second device to render the first virtual object on its display medium based on its position information and/or attitude information (e.g., in an augmented or mixed reality manner).
  • the location information and attitude information of the second device may be determined in various feasible ways.
  • the second device may determine its position information and/or gesture information by scanning the visual landmarks.
  • the location information and/or posture information of the second device may be determined through the tracking result of the second device or its user by the camera.
  • the second device may also use various sensors built in it to determine its position information and/or attitude information.
  • the second device may use point cloud information of the scene to determine its position information and/or pose information.
  • after obtaining the spatial position information of the first virtual object and the position and attitude information of the second device, the first virtual object can be superimposed at a suitable position in the real scene presented on the display medium of the second device.
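  • As an illustrative sketch of this superimposition (Python/NumPy; the second device's intrinsics and pose values are assumptions), the spatial position of the first virtual object can be projected into display coordinates of the second device:
```python
import numpy as np

# Hypothetical intrinsics of the second device's display/camera and its pose in the scene,
# e.g., obtained by scanning a visual sign or from the camera tracking result.
fx, fy, cx, cy = 1000.0, 1000.0, 540.0, 960.0
R_scene_device = np.eye(3)
t_scene_device = np.array([1.0, 0.0, 1.6])

def project_to_display(point_scene):
    """Project the spatial position of a virtual object into pixel coordinates on the display
    medium of the device, returning None if the object lies behind the device."""
    # Transform the point from scene coordinates into the device coordinate system.
    p = R_scene_device.T @ (np.asarray(point_scene) - t_scene_device)
    if p[2] <= 0:
        return None
    return (fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy)

# Spatial position of the first virtual object, e.g., just above the first user (hypothetical).
print(project_to_display([1.5, 0.2, 4.0]))
```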
  • the gesture of the superimposed first virtual object may be further determined.
  • the user of the second device may perform various interactive operations on the first virtual object.
  • a second virtual object may also be set for the second user of the second device in a similar manner, and the content information and spatial location information of the second virtual object may be sent to the first device of the first user or to other devices of the first user.
  • the first device and the other devices may be, for example, a mobile phone and glasses, respectively
  • the content information and the spatial location information of the second virtual object can be used by the first device or the other devices to present the second virtual object on its display medium based on its location information and/or attitude information.
  • the steps in the method shown in FIG. 10 may be implemented by the server in the system shown in FIG. 9 , but it is understood that one or more of these steps may also be implemented by other devices.
  • the virtual object may be, for example, an icon containing text, wherein the text is "pick-up, XXX of XX company".
  • the spatial position of the virtual object is associated with the spatial position of the first user and can move as the first user moves.
  • Figure 12 shows an actual image observed by a user through his cell phone screen, the image including multiple users, each user having an associated virtual object, according to one embodiment.
  • a camera is used as an example of a sensor for description, but it can be understood that the embodiments herein are also applicable to any other sensor that can sense or determine the target position, such as lidar, millimeter-wave radar, wireless signal transceivers, etc.
  • the devices involved in the embodiments of the present application may be any devices carried or controlled by the user (e.g., mobile phones, tablet computers, smart glasses, AR glasses, smart helmets, smart watches, vehicles, etc.), or various machines that can move autonomously (e.g., unmanned aerial vehicles, unmanned vehicles, robots, etc.), on which image acquisition devices are installed.
  • the glasses in this application may be AR glasses, smart glasses, or any other glasses that can be used to present information to the user.
  • the glasses in this application also include glasses formed by adding components or inserts to ordinary optical glasses, for example, glasses formed by adding a display device to ordinary optical glasses.
  • the present invention may be implemented in the form of a computer program.
  • the computer program can be stored in various storage media (eg, hard disk, optical disk, flash memory, etc.), and when the computer program is executed by the processor, can be used to implement the method of the present invention.
  • the present invention may be implemented in the form of an electronic device.
  • the electronic device includes a processor and a memory, and the memory stores a computer program that, when executed by the processor, can be used to implement the method of the present invention.
  • references herein to "various embodiments," "some embodiments," "one embodiment," or "an embodiment," etc. mean that a particular feature, structure, or property described in connection with the embodiment is included in at least one embodiment.
  • appearances of the phrases "in various embodiments," "in some embodiments," "in one embodiment," or "in an embodiment" in various places throughout this document are not necessarily referring to the same embodiment.
  • the particular features, structures, or properties may be combined in any suitable manner in one or more embodiments.
  • particular features, structures, or properties shown or described in connection with one embodiment may be combined, in whole or in part, with the features, structures, or properties of one or more other embodiments without limitation, as long as the combination is not illogical or non-functional.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Electromagnetism (AREA)
  • General Health & Medical Sciences (AREA)
  • Toxicology (AREA)
  • Artificial Intelligence (AREA)
  • Position Input By Displaying (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention relates to a method and system for obtaining identification information of a device or its user in a scene. One or more sensors and one or more visual markers are deployed in the scene, and the sensors can be used to sense or determine location information of a device or user in the scene. The method comprises the following steps: receiving information sent by a device, the information comprising identification information of the device or its user and spatial location information of the device, the device determining its spatial location information by scanning a visual marker; identifying the device or its user within the sensing range of a sensor according to the spatial location information of the device; and associating the identification information of the device or its user with the device or its user within the sensing range of the sensor, so as to provide a service to the device or its user.
PCT/CN2021/129727 2020-12-08 2021-11-10 Method and system for obtaining identification information of a device or its user in a scene WO2022121606A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
CN202011442020.6A CN114663491A (zh) 2020-12-08 2020-12-08 用于向场景中的用户提供信息的方法和系统
CN202011440905.2A CN112528699B (zh) 2020-12-08 2020-12-08 用于获得场景中的设备或其用户的标识信息的方法和系统
CN202011440905.2 2020-12-08
CN202011440875.5A CN112581630B (zh) 2020-12-08 一种用户交互方法和系统
CN202011442020.6 2020-12-08
CN202011440875.5 2020-12-08

Publications (1)

Publication Number Publication Date
WO2022121606A1 (fr)

Family

ID=81973104

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/129727 WO2022121606A1 (fr) 2020-12-08 2021-11-10 Method and system for obtaining identification information of a device or its user in a scene

Country Status (2)

Country Link
TW (1) TWI800113B (fr)
WO (1) WO2022121606A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012182685A (ja) * 2011-03-01 2012-09-20 Wham Net Service Corp 入下山届システム
CN103646565A (zh) * 2013-12-24 2014-03-19 苏州众天力信息科技有限公司 一种基于微信的车辆寻车二维码的位置信息存储及其寻找方法
CN111256701A (zh) * 2020-04-26 2020-06-09 北京外号信息技术有限公司 一种设备定位方法和系统
CN111814752A (zh) * 2020-08-14 2020-10-23 上海木木聚枞机器人科技有限公司 室内定位实现方法、服务器、智能移动设备、存储介质
CN112528699A (zh) * 2020-12-08 2021-03-19 北京外号信息技术有限公司 用于获得场景中的设备或其用户的标识信息的方法和系统
CN112581630A (zh) * 2020-12-08 2021-03-30 北京外号信息技术有限公司 一种用户交互方法和系统

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9430752B2 (en) * 2012-11-02 2016-08-30 Patrick Soon-Shiong Virtual planogram management, systems, and methods

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012182685A (ja) * 2011-03-01 2012-09-20 Wham Net Service Corp 入下山届システム
CN103646565A (zh) * 2013-12-24 2014-03-19 苏州众天力信息科技有限公司 一种基于微信的车辆寻车二维码的位置信息存储及其寻找方法
CN111256701A (zh) * 2020-04-26 2020-06-09 北京外号信息技术有限公司 一种设备定位方法和系统
CN111814752A (zh) * 2020-08-14 2020-10-23 上海木木聚枞机器人科技有限公司 室内定位实现方法、服务器、智能移动设备、存储介质
CN112528699A (zh) * 2020-12-08 2021-03-19 北京外号信息技术有限公司 用于获得场景中的设备或其用户的标识信息的方法和系统
CN112581630A (zh) * 2020-12-08 2021-03-30 北京外号信息技术有限公司 一种用户交互方法和系统

Also Published As

Publication number Publication date
TWI800113B (zh) 2023-04-21
TW202223749A (zh) 2022-06-16

Similar Documents

Publication Publication Date Title
US20210019854A1 (en) Location Signaling with Respect to an Autonomous Vehicle and a Rider
KR102366293B1 (ko) 디지털 트윈을 이용한 증강현실 기반 현장 모니터링 시스템 및 방법
CN107782314B (zh) 一种基于扫码的增强现实技术室内定位导航方法
US20180196417A1 (en) Location Signaling with Respect to an Autonomous Vehicle and a Rider
KR102289745B1 (ko) 실시간 현장 작업 모니터링 방법 및 시스템
CN105408938B (zh) 用于2d/3d空间特征处理的系统
CN105409212B (zh) 具有多视图图像捕捉和深度感测的电子设备
US20170323458A1 (en) Camera for Locating Hidden Objects
US20180196415A1 (en) Location Signaling with Respect to an Autonomous Vehicle and a Rider
EP3848674B1 (fr) Signalisation d'emplacement par rapport à un véhicule autonome et un passager
EP2974509B1 (fr) Communicateur d'informations personnelles
JP6896688B2 (ja) 位置算出装置、位置算出プログラム、位置算出方法、及びコンテンツ付加システム
CN110392908A (zh) 用于生成地图数据的电子设备及其操作方法
CN112528699B (zh) 用于获得场景中的设备或其用户的标识信息的方法和系统
TWI750822B (zh) 用於為目標設置可呈現的虛擬對象的方法和系統
WO2022121606A1 (fr) Procédé et système d'obtention d'informations d'identification de dispositif ou d'utilisateur de celui-ci dans un scénario
WO2021057886A1 (fr) Procédé et système de navigation basés sur un appareil de communication optique, et dispositif et support associés
CN112581630B (zh) 一种用户交互方法和系统
CN112581630A (zh) 一种用户交互方法和系统
CN114663491A (zh) 用于向场景中的用户提供信息的方法和系统
TWI759764B (zh) 基於光通信裝置疊加虛擬物件的方法、電子設備以及電腦可讀取記錄媒體
US20220084258A1 (en) Interaction method based on optical communication apparatus, and electronic device
CN114827338A (zh) 用于在设备的显示媒介上呈现虚拟对象的方法和电子装置
CN112561953A (zh) 用于现实场景中的目标识别与跟踪的方法和系统
CN111752293A (zh) 用于对能够自主移动的机器进行导引的方法和电子设备

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21902313

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21902313

Country of ref document: EP

Kind code of ref document: A1