WO2023248091A1 - System and method of recognition based on ontological components - Google Patents

System and method of recognition based on ontological components

Info

Publication number
WO2023248091A1
Authority
WO
WIPO (PCT)
Prior art keywords
ontological
interest
components
component
applying
Prior art date
Application number
PCT/IB2023/056290
Other languages
English (en)
Inventor
Bryan Martin
Wrushabh Warshe
Dylan KEATING
Laurent Juppe
Sherif Esmat Omar ABUELWAFA
Yann Majewski
Lionel LE CARLUER
Danae Blondel
Original Assignee
Applications Mobiles Overview Inc.
Overview Sas
Priority date: 2022-06-20
Filing date: 2023-06-17
Publication date: 2023-12-28
Application filed by Applications Mobiles Overview Inc. and Overview Sas
Publication of WO2023248091A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06V — IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 — Scenes; Scene-specific elements
    • G06V20/60 — Type of objects
    • G06V20/64 — Three-dimensional objects
    • G06V20/653 — Three-dimensional objects by matching three-dimensional models, e.g. conformal mapping of Riemann surfaces

Definitions

  • the present disclosure illustrates implementations of the technology focusing on associating an object-to-be-inferred to an object-of-interest.
  • the system may be a prediction and/or confidence engine aiming at identifying objects based on ontological components of the object.
  • the system generates a dataset of ontological component representations, also referred to as unique information.
  • QR codes, barcodes and other conventional codes or markers comply with known formats and are easily recognizable using simple imaging techniques. Such codes or markers are easily reproducible. Although quite useful in many applications, these codes and markers do not provide sufficient unicity to allow authentication of the objects to which they are applied.
  • a computer-implemented method for generating unique information associated with an object-of-interest comprising: obtaining ontological components associated with an object-of-interest; applying a routing operation to the ontological components, the routing operation providing a list of connections between each of the ontological components and encoding operations; applying the encoding operations to each of the ontological components, the encoding operation being configured to obtain unique information including (i) unique feature information, (ii) unique encoding operation information, and (iii) unique confidence information, the unique information being associated with the object-of-interest; and adding the unique information corresponding to the object-of-interest to a unique information database.
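  • As a rough illustration of this claimed flow (obtain components, route them to encoding operations, encode, store), the following Python sketch composes the steps with placeholder encoders and a list standing in for the unique information database; the function and variable names are illustrative assumptions, not the patented implementation.

```python
import numpy as np

def generate_unique_information(components, encoders, routing, database):
    """components: dict of modality name -> raw data (e.g. a 2D image or a 3D point cloud).
    routing: dict of modality name -> list of encoder names (the "list of connections").
    encoders: dict of encoder name -> callable returning (feature_vector, confidence)."""
    unique_info = []
    for modality, data in components.items():
        for encoder_name in routing.get(modality, []):
            features, confidence = encoders[encoder_name](data)
            unique_info.append({
                "feature_info": features,       # (i) unique feature information
                "encoding_info": encoder_name,  # (ii) unique encoding operation information
                "confidence_info": confidence,  # (iii) unique confidence information
            })
    database.append({"object_id": len(database), "unique_info": unique_info})
    return unique_info

# Toy usage with placeholder encoders and an in-memory "database".
database = []
components = {"point_cloud": np.random.rand(1024, 3), "image": np.random.rand(64, 64, 3)}
encoders = {"enc3d": lambda x: (x.mean(axis=0), 0.9), "enc2d": lambda x: (x.reshape(-1)[:16], 0.8)}
routing = {"point_cloud": ["enc3d"], "image": ["enc2d"]}
generate_unique_information(components, encoders, routing, database)
```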
  • the object-of-interest is a non-synthetic object-of-interest.
  • At least one of the ontological components is captured by an imaging system.
  • the method further comprises capturing a plurality of representations of the object-of-interest, each representation being captured from a corresponding point of view, each representation being associated with feature data, the feature data including information about 3D coordinates of the corresponding point of view in a global coordinate system, and/or 3D coordinates of a set of feature points of the object in the global coordinate system.
  • the method further comprises applying a setting configuration to the imaging system such that capture setting parameters thereof match pre-determined capture settings, the capture settings including one or more of extrinsic parameters of the imaging system, intrinsic parameters of the imaging system, position, orientation, aiming point, focal length, focal length range, pixel size, sensor size, position of a principal point on the object-of-interest, and a lens distortion of the imaging system.
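  • A minimal sketch of how such capture settings might be represented and checked is given below, assuming a simple pinhole camera model; the parameter names and the tolerance check are illustrative assumptions rather than the patent's own configuration scheme.

```python
import numpy as np

def intrinsic_matrix(focal_length_mm, pixel_size_mm, principal_point_px):
    """Build a pinhole intrinsic matrix from focal length, pixel size and principal point."""
    f_px = focal_length_mm / pixel_size_mm
    cx, cy = principal_point_px
    return np.array([[f_px, 0.0, cx],
                     [0.0, f_px, cy],
                     [0.0, 0.0, 1.0]])

def settings_match(current, target, tol=1e-3):
    """True when every pre-determined capture setting is matched within a tolerance."""
    return all(abs(current[key] - target[key]) <= tol for key in target)

target = {"focal_length_mm": 4.25, "pixel_size_mm": 0.0014}
current = {"focal_length_mm": 4.25, "pixel_size_mm": 0.0014}
print(settings_match(current, target))
print(intrinsic_matrix(4.25, 0.0014, (960.0, 540.0)))
```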
  • the method further comprises storing the ontological components on an ontological component database.
  • the method further comprises applying a weight and/or a bias to at least one of the ontological components.
  • the bias has a constant value defining a depth and/or an influence on the at least one ontological component on which the bias is applied.
  • the weight defines a depth and/or an influence on the at least one ontological component on which the weight is applied.
  • the noise is based on a random parameter.
  • the noise is selected from a White Gaussian noise, a Voronoi noise, and a Fractal noise.
  • the noise is constrained such that a perturbation associated with the augmented ontological component is within a finite envelope.
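  • The following sketch illustrates, under the assumption of a 3D point-cloud component, how White Gaussian noise driven by a random parameter could be clipped so that the perturbation stays within a finite envelope; the clipping bound is an illustrative choice, not the patent's constraint.

```python
import numpy as np

def add_bounded_gaussian_noise(points, sigma=0.01, max_offset=0.02, rng=None):
    """points: (N, 3) array representing a 3D point-cloud ontological component."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(0.0, sigma, size=points.shape)   # noise based on a random parameter
    noise = np.clip(noise, -max_offset, max_offset)     # perturbation kept within a finite envelope
    return points + noise

augmented = add_bounded_gaussian_noise(np.random.rand(2048, 3), sigma=0.005, max_offset=0.01)
```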
  • At least one of the one or more geometric transformations is based on a random parameter.
  • the routing operation determines modalities of 2D images and 3D point clouds as a plurality of ontological components for application by the encoding operations.
  • the routing operation applies a first influence level to the depth and/or color of the object-of-interest; and the influence parameter causes a second influence level, greater than the first influence level, to be applied to the depth and/or color of the object-of-interest in a subsequent routing operation.
  • the influence parameter is related to a partial 3D representation of the object-of-interest.
  • the method further comprises combining two ontological components before the encoding operations by (i) removing correlation data therefrom and/or (ii) fusing the two ontological components to a lower dimensional common space.
  • the method further comprises applying a first encoding operation to a first ontological component; applying a second encoding operation to a second ontological component; and after the first and second encoding operations, combining the first and second ontological components by (i) removing correlation data therefrom and/or (ii) fusing the first and second ontological components to a lower dimensional common space.
  • the method further comprises applying a first encoding operation to a first ontological component; applying a second encoding operation to a second ontological component; applying a third encoding operation to a third ontological component; and after the first and second encoding operations, combining the first and second ontological components by (i) removing correlation data therefrom and/or (ii) fusing the first and second ontological components to a lower dimensional common space; after the first, second and third encoding operations, combining the combined first and second ontological components with the third ontological component by (i) removing correlation data therefrom and/or (ii) fusing the combined first and second ontological components with the third ontological component to a lower dimensional common space.
  • adding the unique information corresponding to the object-of-interest to the unique information database comprises adding ID information related to the object-of-interest.
  • a computer-implemented method for obtaining ontological components of an object-of-interest, each component being captured by a sensor device.
  • the object-of-interest is a subset of a plurality of objects.
  • the ontological components are captured from a corresponding point of view, each component being associated with feature data, the feature data comprising information concerning color, depth, heat, two dimensions (2D), three dimensions (3D), a continuous function, 3D descriptors, spatial distribution, geometric attributes, scale invariant features, shape descriptors, and/or motion.
  • the ontological component is a one-dimensional (1D) representation.
  • the ontological component is a mesh.
  • the at least one influence parameter is applied to the ontological component.
  • Examples of the one or more influence parameters include a weight or a bias.
  • a weight decides the influence the input value will have on the output value.
  • Biases, which are constant, do not have incoming connections but have outgoing connections with their own weights.
  • the at least one influence parameter is specified.
  • the geometric transformation may apply one or more geometric transformations on the ontological component.
  • one or more geometric transformations include, but are not limited to, changing the size of the ontological component, applying a rotation to the ontological component, applying shifting and/or translation to the ontological component, scaling the ontological component, rotating the ontological component, translating the ontological component, applying a scaling translation to the ontological component, shearing the ontological component, applying a non-uniform scaling to the ontological component, forming a matrix representation of the ontological component, applying a 3D affine transformation to the ontological component, applying matrix operations to the ontological component, applying a reflection to the ontological component, dilating the ontological component, applying a tessellation to the ontological component, applying a projection to the ontological component, or some combination thereof. It is to be noted that any one or more of these transformations may be applied and, in case of more than one transformation, the order in which these transformations may be applied should not limit the scope of the present disclosure.
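  • As a hedged illustration of a few of these transformations (rotation, non-uniform scaling and translation expressed as a single 3D affine transformation), the sketch below applies a randomly parameterized affine matrix to a point-cloud component; it is not the patent's transformation pipeline.

```python
import numpy as np

def random_affine(rng, max_translation=0.5):
    """Compose a random rotation about z, a non-uniform scaling and a translation into one 4x4 affine."""
    angle = rng.uniform(0.0, 2.0 * np.pi)            # random rotation angle between 0 and 360 degrees
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    scale = np.diag(rng.uniform(0.8, 1.2, size=3))   # non-uniform scaling
    affine = np.eye(4)
    affine[:3, :3] = rotation @ scale
    affine[:3, 3] = rng.uniform(-max_translation, max_translation, size=3)  # translation
    return affine

def apply_affine(points, affine):
    """Apply a 4x4 affine transformation to an (N, 3) point-cloud component."""
    homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
    return (homogeneous @ affine.T)[:, :3]

rng = np.random.default_rng(0)
transformed = apply_affine(np.random.rand(1024, 3), random_affine(rng))
```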
  • the routing method is further determined by the at least one of the plurality of ontological components.
  • the routing method is specified.
  • the encoding operation may be pre- configured.
  • the encoding operation may be pre-configured to differentiate the at least one ontological component modality, such as, but not limited to, a complete object, a partial object, the at least one resolution, a plurality of ontological component modalities, 3D information, 2D information, 1D information, raw data, intermediate representations, and/or feature vectors.
  • the encoding operation provides unique data such as, but not limited to, a feature vector, values with delta and/or confidence values.
  • the ontological component modalities, augmentation parameters, influence parameters, fusion methods and encoder operation parameters are associated with the unique data provided by the encoding operation associated with the object-of-interest.
  • the unique data and any data associated with the object-of- interest defines a dataset.
  • the dataset and database are dynamic.
  • a dataset is used to determine ontological components, a routing strategy, an encoding operation, influence parameters and a data fusion method.
  • ID information is used to obtain a dataset to determine ontological components, a routing strategy, an encoding operation, influence parameters and a data fusion method.
  • subsequent to a dataset being used to determine ontological components, a routing strategy, an encoding operation, influence parameters and a data fusion method, a closest-match dataset to the dataset obtained from the encoding operation is retrieved from the database.
  • FIGs. 1A-B are illustrative of an environment for executing a method of ontological recognition according to an embodiment of the present technology.
  • FIG. 2A depicts a high-level functional block diagram of the method of ontological recognition to obtain unique object information, in accordance with the embodiments of the present disclosure.
  • FIG. 4 depicts augmentation operations, in accordance with various embodiments of the present disclosure.
  • FIG. 6 depicts data fusion method 2, in accordance with various embodiments of the present disclosure.
  • FIG. 7 depicts data fusion method 3, in accordance with various embodiments of the present disclosure.
  • The functions of a processor may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software.
  • the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared.
  • the processor may be a general purpose processor, such as a central processing unit (CPU) or a processor dedicated to a specific purpose, such as a digital signal processor (DSP).
  • modules may be represented herein as any combination of flowchart elements or other elements indicating the specified functionality and performance of process steps and/or textual description. Such modules may be executed by hardware that is expressly or implicitly shown. Moreover, it should be understood that these modules may, for example include, without being limitative, computer program logic, computer program instructions, software, stack, firmware, hardware circuitry or any combinations thereof that are configured to provide the required capabilities and specified functionality.
  • Ontological components herein will include, but are not limited to, color, color transformations, depth, heat, 2D images, partial 2D images, 3D point clouds, partial 3D point clouds, a mesh, a continuous function, 3D descriptors, shape descriptors, spatial distribution, geometric attributes, scale invariant features, and/or motion.
  • Referring to FIGs. 1A-B, there is shown a device 10 suitable for use in accordance with at least some embodiments of the present technology. It is to be expressly understood that the device 10 as depicted is merely an illustrative implementation of the present technology. In some cases, what are believed to be helpful examples of modifications to the device 10 may also be set forth below. This is done merely as an aid to understanding, and, again, not to define the scope or set forth the bounds of the present technology. These modifications are not an exhaustive list, and, as a person skilled in the art would understand, other modifications are likely possible.
  • FIGs. 1A-B provide a schematic representation of a device 10 configured for generating and/or processing a three-dimensional (3D) point cloud in accordance with an embodiment of the present technology.
  • the device 10 comprises a computing unit 100 that may receive captured images of an object to be characterized.
  • the computing unit 100 may be configured to generate the 3D point cloud as a representation of the object to be characterized.
  • the computing unit 100 is described in greater detail hereinbelow.
  • the computing unit 100 may be implemented by any of a conventional personal computer, a controller, and/or an electronic device (e.g., a server, a controller unit, a control device, a monitoring device etc.) and/or any combination thereof appropriate to the relevant task at hand.
  • the computing unit 100 comprises various hardware components including one or more single or multi-core processors collectively represented by a processor 110, a solid-state drive 150, a random access memory (RAM) 130, a dedicated memory 140 and an input/output interface 160.
  • the computing unit 100 may be a computer specifically designed to operate a machine learning algorithm (MLA) and/or deep learning algorithms (DLA).
  • the computing unit 100 may be a generic computer system.
  • the computing unit 100 may be an "off-the-shelf" generic computer system. In some embodiments, the computing unit 100 may also be distributed amongst multiple systems. The computing unit 100 may also be specifically dedicated to the implementation of the present technology. As a person skilled in the art of the present technology may appreciate, multiple variations as to how the computing unit 100 is implemented may be envisioned without departing from the scope of the present technology.
  • Communication between the various components of the computing unit 100 may be enabled by one or more internal and/or external buses 170 (e.g. a PCI bus, universal serial bus, IEEE 1394 "Firewire" bus, SCSI bus, Serial-ATA bus, ARINC bus, etc.), to which the various hardware components are electronically coupled.
  • the input/output interface 160 may provide networking capabilities such as wired or wireless access.
  • the input/output interface 160 may comprise a networking interface such as, but not limited to, one or more network ports, one or more network sockets, one or more network interface controllers and the like. Multiple examples of how the networking interface may be implemented will become apparent to the person skilled in the art of the present technology.
  • the networking interface may implement specific physical layer and data link layer standard such as Ethernet, Fibre Channel, Wi-Fi or Token Ring.
  • the specific physical layer and the data link layer may provide a base for a full network protocol stack, allowing communication among small groups of computers on the same local area network (LAN) and large-scale network communications through routable protocols, such as Internet Protocol (IP).
  • the device 10 comprises an imaging system 18 that may be configured to capture Red-Green-Blue (RGB) images.
  • the device 10 may be referred to as the "imaging mobile device" 10.
  • the imaging system 18 may comprise image sensors such as, but not limited to, Charge-Coupled Device (CCD) or Complementary Metal Oxide Semiconductor (CMOS) sensors and/or digital cameras. Imaging system 18 may convert an optical image into an electronic or digital image and may send captured images to the computing unit 100. In the same or other embodiments, the imaging system 18 may be a single-lens camera providing RGB pictures.
  • the device 10 comprises depth sensors to acquire RGB-Depth (RGBD) pictures.
  • any device suitable for generating a 3D point cloud may be used as the imaging system 18 including but not limited to depth sensors, 3D scanners or any other suitable devices.
  • the device 10 may communicatively access an external imaging system 19 such as, but not limited to, a camera, a video camera, a microscope, an endoscope, Charge-Coupled Device (CCD) or Complementary Metal Oxide Semiconductor (CMOS) sensors, and/or digital cameras.
  • imaging system 19 may send captured data to the computing unit 100.
  • imaging system 19 may convert an optical image into an electronic or digital image and may send captured images to the computing unit 100.
  • the imaging system 19 may be a single-lens camera providing RGB pictures.
  • the device 10 comprises depth sensors to acquire RGB-Depth (RGBD) pictures.
  • any device suitable for generating a 3D point cloud may be used as the imaging system 19 including but not limited to depth sensors, 3D scanners or any other suitable devices.
  • the device 10 may comprise an Inertial Sensing Unit (ISU) 14 configured to be used in part by the computing unit 100 to determine a position of the imaging system 18 and/or the device 10. Therefore, the computing unit 100 may determine a set of coordinates describing the location of the imaging system 18, and thereby the location of the device 10, in a coordinate system based on the output of the ISU 14. Generation of the coordinate system is described hereinafter.
  • the ISU 14 may comprise 3-axis accelerometer(s), 3-axis gyroscope(s), and/or magnetometer(s) and may provide velocity, orientation, and/or other position related information to the computing unit 100.
  • the device 10 may include a screen or display 16 capable of rendering color images, including 3D images.
  • the display 16 may be used to display live images captured by the imaging system 18, 3D point clouds, Augmented Reality (AR) images, Graphical User Interfaces (GUIs), program output, etc.
  • display 16 may comprise and/or be housed with a touchscreen to permit users to input data via some combination of virtual keyboards, icons, menus, or other Graphical User Interfaces (GUIs).
  • display 16 may be implemented using a Liquid Crystal Display (LCD) display or a Light Emitting Diode (LED) display, such as an Organic LED (OLED) display.
  • display 16 may be remotely communicatively connected to the device 10 via a wired or a wireless connection (not shown), so that outputs of the computing unit 100 may be displayed at a location different from the location of the device 10.
  • the display 16 may be operationally coupled to, but housed separately from, other functional units and systems in device 10.
  • the device 10 may be, for example, an iPhone or mobile phone from Apple or a Galaxy mobile phone or tablet from Samsung, or any other mobile device whose features are similar or equivalent to the aforementioned features.
  • the device may be, for example and without being limitative, a handheld computer, a personal digital assistant, a cellular phone, a network device, a camera, a smart phone, an enhanced general packet radio service (EGPRS) mobile phone, a network base station, a media player, a navigation device, an e-mail device, a game console, or a combination of two or more of these data processing devices or other data processing devices.
  • the device 10 may comprise a memory 12 communicatively connected to the computing unit 100 and configured to store without limitation data, captured images, depth values, sets of coordinates of the device 10, 3D point clouds, and raw data provided by ISU 14 and/or the imaging system 18.
  • the memory 12 may be embedded in the device 10 as in the illustrated embodiment of Figure 2 or located in an external physical location.
  • the computing unit 100 may be configured to access a content of the memory 12 via a network (not shown) such as a Local Area Network (LAN) and/or a wireless connection such as a Wireless Local Area Network (WLAN).
  • the device 10 may also include a power system (not depicted) for powering the various components.
  • the power system may include a power management system, one or more power sources (e.g., battery, alternating current (AC)), a recharging system, a power failure detection circuit, a power converter or inverter and any other components associated with the generation, management and distribution of power in mobile or non-mobile devices.
  • the device 10 may also be suitable for generating the 3D point cloud based on images of the object. Such images may have been captured by the imaging system 18. As an example, the device 10 may generate the 3D point cloud according to the teachings of Patent Cooperation Treaty Patent Publication No. 2020/240497, the disclosure of which is incorporated by reference herein in its entirety. In summary, it is contemplated that the device 10 may perform the operations and steps of methods described in the present disclosure. More specifically, the device 10 may be suitable for capturing images of the object to be characterized, generating a 3D point cloud including data points and representative of the object, and executing methods for characterization of the 3D point cloud.
  • the device 10 is communicatively connected (e.g. via any wired or wireless communication link including, for example, 4G, LTE, Wi-Fi, or any other suitable connection) to an external computing device 23 (e.g. a server) adapted to perform some or all of the methods for characterization of the 3D point cloud.
  • operation of the computing unit 100 may be shared with the external computing device 23.
  • Referring to FIGs. 2A-B, there is depicted a flow diagram of a computer-implemented method 200 suitable for use in accordance with at least some embodiments of the present technology. It is to be expressly understood that the method 200 as depicted is merely an illustrative implementation of the present technology. In some cases, what are believed to be helpful examples of modifications to the method 200 may also be set forth below. This is done merely as an aid to understanding, and, again, not to define the scope or set forth the bounds of the present technology. These modifications are not an exhaustive list, and, as a person skilled in the art would understand, other modifications are likely possible.
  • the method of ontological recognition 200 may be applied to non-synthetic objects and/or synthetic objects.
  • synthetic objects may comprise, for example, synthetic 3D models, CAD, 3D models acquired from industrial oriented 3D software, or medical oriented 3D software, and/or non-specialized 3D software, 3D models generated by processes such as RGB photogrammetry, RGB-D photogrammetry, and/or other reconstruction techniques from real objects.
  • a "non-synthetic object" may refer to any object in the real world.
  • Non-synthetic objects are not synthesized using computer rendering techniques; rather, they are scanned or captured by any non-limiting means, such as a suitable sensor (e.g., a camera, an optical sensor, a depth sensor or the like), to generate or reconstruct a 3D point cloud representation of the non-synthetic 3D object using any "off-the-shelf" technique, including but not limited to photogrammetry, machine learning based techniques, depth maps or the like.
  • a non-synthetic 3D object may be any real-world object such as a computer screen, a table, a chair, a coffee mug, a mechanical component on an assembly line, or any type of inanimate object or entity.
  • a non-synthetic 3D object may also be an animate entity such as an animal, a plant, a human, or a portion thereof.
  • the capture settings may comprise parameters such as, for example, extrinsic and/or intrinsic parameters of the imaging system 18, 19, position, orientation, aiming point, focal length, focal length range, pixel size, sensor size, position of the principal point, and/or lens distortion.
  • the virtual capture module 204 may comprise a virtual camera and/or cameras acquiring a virtual/synthetic representation of the object-of-interest. More specifically, a plurality of representations of the object-of-interest have been captured, each representation having been captured by a virtual camera from a corresponding point of view, each representation being associated with feature data, the feature data including information about 3D coordinates of the corresponding point of view in a global coordinate system, and 3D coordinates of a set of feature points of the object in the global coordinate system.
  • operation of the ontological component database 205 is further determined by the at least one influence parameter available from the influence parameter module 209.
  • influence parameter may be a weight or bias.
  • a weight controls the signal strength, and/or decides the influence the input value will have on the output value.
  • Biases, which are constant, do not have incoming connections but have outgoing connections with their own corresponding weights.
  • An illustrative example of an embodiment may be to determine that 3D representations will possess greater influence in subsequent operations.
  • Another illustrative example may be that color is determined to possess greater influence, and depth is determined to possess lesser influence in subsequent operations.
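  • A minimal sketch of such influence parameters, assuming they act as per-modality scalar weights and constant biases applied to encoded components, is shown below; the representation of modalities as a dictionary of feature vectors is an assumption made for illustration only.

```python
import numpy as np

def apply_influence(encoded_components, weights, biases):
    """encoded_components: dict of modality -> feature vector.
    weights/biases: dicts of modality -> scalar influence parameters (defaults: 1.0 and 0.0)."""
    return {
        modality: weights.get(modality, 1.0) * features + biases.get(modality, 0.0)
        for modality, features in encoded_components.items()
    }

features = {"color": np.random.rand(128), "depth": np.random.rand(128), "3d": np.random.rand(128)}
# Mirrors the example above: color given greater influence, depth given lesser influence.
influenced = apply_influence(features, weights={"color": 2.0, "depth": 0.5}, biases={"3d": 0.1})
```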
  • operation of the ontological component database 205 is prespecified. For example, if the object-of-interest is a cube, it could be prespecified that the ontological component database 205 may not consider curvature or angles other than 90°.
  • the method 200 continues with the augmentation module 206 providing data augmentation of each ontological component.
  • said augmentation involves various types of noise, and/or geometric transformation, and/or a sampling strategy.
  • augmentation module 500 incorporates noise generation module 501, geometric transformation generator module 502, and sampling strategy module 503.
  • the augmentation module 206 may provide various types of noise.
  • the noise generator may generate the noise based on a random parameter.
  • the noise added by the noise generator is constrained such that perturbation associated with the augmented ontological component is within a finite envelope.
  • the noise generated by the noise generator may include, but is not limited to, White Gaussian noise, Voronoi noise and/or Fractal noise.
  • common 2D-related noise (e.g., Salt-and-Pepper, film grain, fixed-pattern, Perlin, simplex and/or Poisson noise) is applicable to the surface of a 3D model and may be generated by the noise generator.
  • the augmentation module 206 may provide geometric transformation wherein the geometric transformation may apply one or more geometric transformations on the ontological component.
  • the geometric transformation may include, but is not limited to, changing the size of the ontological component, applying a rotation to the ontological component, applying shifting and/or translation to the ontological component, scaling the ontological component, rotating the ontological component, translating the ontological component, applying a scaling translation to the ontological component, shearing the ontological component, applying a non-uniform scaling to the ontological component, forming a matrix representation of the ontological component, applying a 3D affine transformation to the ontological component, applying matrix operations to the ontological component, applying a reflection to the ontological component, dilating the ontological component, applying a tessellation to the ontological component, applying a projection to the ontological component, or some combination thereof. It is to be noted that any one or more of these transformations may be applied and, in case of more than one transformation, the order in which these transformations may be applied should not limit the scope of the present disclosure.
  • the augmentation module 206 may apply one or more geometric transformations based on a random parameter.
  • the geometric transformation random parameter associated with applying the rotation may be a random angle between 0° and 360°.
  • the geometric transformation random parameter associated with the shifting parameters may be conditioned according to a pre-defined world/scene maximum size while avoiding intersections between each 3D object's own bounding box.
  • the geometric transformation random parameter may have different values.
  • the augmentation module 206 may apply a sampling strategy, such as, but not limited to, farthest point sampling, random sampling, and/or feature-based sampling.
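  • The farthest point sampling strategy mentioned above can be sketched as follows: starting from a random seed point, the point farthest from all points selected so far is added at each step. This is a generic formulation of the technique, not the module's specific implementation.

```python
import numpy as np

def farthest_point_sampling(points, n_samples, rng=None):
    """Iteratively select the point farthest from all points selected so far."""
    rng = np.random.default_rng() if rng is None else rng
    selected = [int(rng.integers(len(points)))]          # random seed point
    distances = np.full(len(points), np.inf)
    for _ in range(n_samples - 1):
        distances = np.minimum(distances, np.linalg.norm(points - points[selected[-1]], axis=1))
        selected.append(int(np.argmax(distances)))
    return points[selected]

sampled = farthest_point_sampling(np.random.rand(5000, 3), 512)
```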
  • operation of the augmentation module 206 is further determined by the at least one influence parameter available from the influence parameter module 209.
  • An illustrative example of an embodiment may be to determine the probability, percentage or intensity of the added noise, or a combination thereof.
  • a specified percentage of the selected noise could be added to each point in a 3D point cloud.
  • Another representative example may be to add a specified percentage or intensity of noise to each point of a 3D point cloud based on a probability, for example, every 10th point.
  • a further example may be to determine the probability, percentage or intensity of a geometric transformation, wherein, concerning the removal of planes, the intensity of plane removal may be specified as a percentage of removal at each point within a 3D point cloud and/or as a probability of the number of points to receive plane removal, e.g. every 100th point.
  • the augmentation module 206 may be omitted.
  • the method 200 continues with the routing module 207, wherein the routing module 207 defines a predefined list of connections between the ontological component database 205 and the encoding operations module 208.
  • operation of the routing module 207 is determined by the at least one ontological component.
  • the routing module could provide the modalities of 3D point clouds and exclude all other ontological components.
  • the routing module 207 could provide the modalities of partial 2D images and partial 3D point clouds and exclude all other ontological components.
  • the routing module 207 is further determined by the at least one plurality of ontological components.
  • the routing module 207 may determine modalities of 2D images and 3D point clouds as a plurality of ontological components available to the encoding block.
  • Another illustrative example is to determine the modalities of color, depth, and 3D point clouds as a plurality of ontological components available to the encoding block.
  • the routing module 207 determines a first plurality of 2D images and 3D point clouds available to the encoding block along with a second plurality of color, depth and partial 2D images as available to the encoding block.
  • the routing module 207 is further determined by the at least one influence parameter available from the influence parameter module 209.
  • An illustrative example of an embodiment may be to determine that, regardless of previous influence operations, partial 3D representations will possess greater influence in subsequent operations.
  • Another illustrative example may be that a plurality of depth and color is determined to possess greater influence in subsequent operations, and 3D representations is determined to possess lesser influence in subsequent operations.
  • Another illustrative example may be that a plurality of depth and color is determined to possess no influence in subsequent operations, and a plurality of 3D descriptors and 2D images is determined to possess lesser influence in subsequent operations.
  • operation of the routing module 207 is further determined by the data fusion module 210.
  • Other aspects and embodiments of the data fusion module 210 will be provided in detail in later sections of the present disclosure.
  • the routing module 207 is specified. For example, if the object-of- interest is a cube, it could be determined that the routing module 207 may not consider curvature or angles other than 90°.
  • the routing module 207 comprises the at least one channel.
  • the routing module may comprise multiple channels each possessing the at least one ontological component.
  • a further illustrative example depicts the routing module 207 comprising 2 channels (not shown) in which channel 1 possesses 3D point clouds and channel 2 possesses 2D images.
  • the routing module comprises multiple channels each possessing the at least one plurality of ontological components.
  • An illustrative example embodies the routing module 207 defining 2 channels; channel 1 possessing a plurality of 3D point clouds and 2D images; and channel 2 possessing color.
  • a further illustrative example embodies the routing module 207 defining 2 channels; channel 1 possessing the plurality of 3D point clouds and 2D images; and channel 2 possessing the plurality of color and depth.
  • a further illustrative example embodies the routing module 207 defining 3 channels; channel 1 possessing the plurality of 3D point clouds, 2D images, heat and partial 2D images; channel 2 possessing the plurality of color and depth; and channel 3 possessing curvature.
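  • A possible data-structure sketch for such channels is given below, with each channel bundling one or more modalities and naming the encoders it connects them to; the channel and encoder names are hypothetical, and the exclusion of unrouted modalities mirrors the examples above.

```python
# Hypothetical channel definitions mirroring the 3-channel example above.
ROUTING_CHANNELS = [
    {"modalities": ["3d_point_cloud", "2d_image", "heat", "partial_2d_image"],  # channel 1
     "encoders": ["encoder_1"]},
    {"modalities": ["color", "depth"], "encoders": ["encoder_2"]},              # channel 2
    {"modalities": ["curvature"], "encoders": ["encoder_3"]},                   # channel 3
]

def route(components, channels):
    """Return (encoder name, selected components) connections; modalities that no
    channel provides are excluded from the encoding operations."""
    connections = []
    for channel in channels:
        selected = {m: components[m] for m in channel["modalities"] if m in components}
        if selected:
            connections.extend((encoder, selected) for encoder in channel["encoders"])
    return connections
```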
  • the method 200 continues with encoding operations, at the encoding operations module 208.
  • the encoding operations module 208 generates a dense representation of the object-of-interest such as, but not limited to, a vector, vectors, a feature vector, feature vectors, an array of numeric values, a confidence value, an array of confidence values, a value with a delta, and/or an array of values with a delta.
  • the encoding operations module 208 may comprise the at least one encoder, neural network, any learned network, machine learning algorithm, and/or array of encoding operations.
  • the encoding operations module 208 may comprise a plurality of encoders, neural networks, any learned networks, machine learning algorithm, and/or array of encoding operations.
  • An illustrative example of an embodiment is depicted in FIG. 3, wherein encoding operations 408 comprise a plurality of encoders from 1 to ‘n’.
  • the encoding operations module 208 may be pre-configured.
  • the encoding operations module 208 may be pre-configured to differentiate the at least one ontological component, such as, but not limited to, a complete object, a partial object, the at least one resolution, a plurality of ontological component modalities, 3D information, 2D information, 1D information, raw data, intermediate representations, and/or feature vectors.
  • encoder 4081 may be pre-configured to possess ontological components 4051 ‘a’, made available from database 405, by routing method 407.
  • encoder 4082 may possess ontological component, 4052 ‘b’; and encoder 4083 may possess 4052 ‘b’ and 4053 ‘x’.
  • the encoding operations module 208 may be pre-configured to differentiate the at least one plurality of ontological components.
  • encoder 4081 may be pre-configured to possess ontological components 4051 ‘a’ and 4052 ‘b’, made available from 405, by routing method 407.
  • encoder 4082 may possess ontological components 4051 ‘a’, 4052 ‘b’ and 4053 ‘x’; and encoder 4083 may possess 4052 ‘b’ and 4053 ‘x’.
  • the encoding operations module 208 is trained.
  • the encoding operations module 208 is retrained for each object- of-interest.
  • operation of the encoding operations module 208 is further determined by the at least one influence parameter available from the influence parameter module 209.
  • Examples of the one or more influence parameters include a weight or a bias.
  • An illustrative example of an embodiment may be to determine that, regardless of previous influence operations, partial 3D representations will possess greater influence in subsequent operations.
  • Another illustrative example may be that a plurality of depth and color is determined to possess greater influence in subsequent operations, and 3D representations is determined to possess lesser influence in subsequent operations.
  • Another illustrative example may be that a plurality of depth and color is determined to possess no influence in subsequent operations, and a plurality of 3D descriptors and 2D images is determined to possess lesser influence in subsequent operations.
  • a further illustrative example of an embodiment may be to determine that ontological component 4051 ‘a’ will possess greater influence in subsequent operations.
  • Another illustrative example may be that ontological components 4051 ‘a’ and 4053 ‘x’ are determined to possess greater influence in subsequent operations, and that ontological component 4052 ‘b’ is determined to possess lesser influence in subsequent operations.
  • the at least one influence parameter is specified.
  • the encoding operations module 208 is further determined by the data fusion module 210.
  • Other aspects and embodiments of the data fusion module 210 will be provided in detail in later sections of the present disclosure.
  • the encoding operations module 208 further implements at least one loopback routine.
  • 4081a depicts an encoder configured to comprise loopback routine 409.
  • loopback routine 409 utilizes random parameters. Non-limiting examples of random parameters may be sampling method, orientation and/or neural network randomization.
  • the number of cycles of the loopback routine 409 is specified.
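  • The loopback routine could be sketched as repeating an encoding for a specified number of cycles with a randomized sampling of the input and aggregating the results; treating the spread of the per-cycle results as a confidence proxy is an assumption made for illustration, not the patent's definition of the routine.

```python
import numpy as np

def encode_with_loopback(points, encoder, n_cycles=8, n_samples=512, rng=None):
    """Repeat the encoding for a specified number of cycles, randomizing the sampling
    of the input each time, then aggregate the per-cycle results."""
    rng = np.random.default_rng() if rng is None else rng
    results = []
    for _ in range(n_cycles):                                           # specified number of cycles
        idx = rng.choice(len(points), size=n_samples, replace=False)    # random sampling parameter
        results.append(encoder(points[idx]))
    results = np.stack(results)
    return results.mean(axis=0), results.std(axis=0)                    # aggregate + per-dimension spread

mean_features, spread = encode_with_loopback(np.random.rand(4096, 3),
                                             encoder=lambda x: x.mean(axis=0))
```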
  • the method 200 continues with the unique information database 211 storing information associated with an object-of-interest.
  • the information comprises information obtained from the non-synthetic object-of-interest 201, the synthetic object-of-interest 202, the capture module 203, the virtual capture module 204, the ontological component database 205, the augmentation module 206, the routing module 207, the encoding operations module 208, the influence parameter module 209, and/or the data fusion module 210.
  • the information stored in the unique information database 211 is associated to a specified object-of-interest.
  • the information stored in the unique information database 211 obtained from non-synthetic object-of-interest 201, synthetic object-of-interest 202 comprises ID information.
  • ID information may include, but is not limited to text, numbers, brand information, serial number, model information, time, date, location, and/or distance.
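  • One way to picture a record of the unique information database 211, bundling such ID information with the dense representations and configuration data described in this section, is the following dataclass sketch; the field names are illustrative assumptions rather than the patent's schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class UniqueInformationRecord:
    object_id: str                                   # ID information (e.g. serial number, brand)
    feature_vectors: List[Any]                       # dense representations from the encoders
    confidence_values: List[float]
    ontological_components: List[str]                # modalities used (color, depth, 3D, ...)
    augmentation_params: Dict[str, Any] = field(default_factory=dict)
    influence_params: Dict[str, float] = field(default_factory=dict)
    routing_config: Dict[str, Any] = field(default_factory=dict)
    fusion_method: str = "none"

record = UniqueInformationRecord(
    object_id="SN-000123", feature_vectors=[[0.1, 0.8]], confidence_values=[0.92],
    ontological_components=["3d_point_cloud", "color"])
```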
  • the information stored in the unique information database 211 is obtained from the capture module 203 and/or the virtual capture module 204.
  • the information stored in the unique information database 211 is obtained from the ontological component database 205.
  • Information from the ontological component database 205 may include the ontological components associated with an object-of-interest. Further information obtained from the ontological component database 205 may include influence parameters.
  • the information stored in the unique information database 211 is obtained from the augmentation module 206.
  • the information obtained from the augmentation module 206 is associated with a specified object-of-interest.
  • Information obtained from the augmentation module 206 may include the augmentation operations such as noise types, geometric transformations and/or a sampling strategy. Further information obtained from the augmentation module 206 may include influence parameters and data fusion methods.
  • the information stored in the unique information database 211 is obtained from the routing module 207.
  • the information obtained from the routing module 207 is associated with an object of interest.
  • the information stored in the unique information database 211 is obtained from the encoding operations module 208.
  • the information obtained from the encoding operations module 208 is associated with an object-of-interest.
  • Information obtained from the encoding operations module 208 may include, but is not limited to, dense representations, the at least one encoder, the at least one plurality of encoders, the pre-configuration of encoders, the pre-configuration of encoders to differentiate the at least one ontological component and/or to differentiate the at least one plurality of ontological components, training data, loopback routines and/or associated loopback routine parameters.
  • Further information obtained from the encoding operations module 208 may include influence parameters and a data fusion operation.
  • the information stored in the unique information database 211 is obtained from the influence parameter module 209.
  • the information stored in the unique information database 211 is obtained from the data fusion module 210.
  • influence parameters obtained from the influence parameter module 209 are specified.
  • method 600 depicts input-level fusion.
  • Method 600 may apply a fusion operation 603 to remove correlation data contained in ontological components 601 and 602, and/or fuse data at a lower dimensional common space.
  • Method 600 may utilize operations such as, but not limited to, principal component analysis (PCA), canonical correlation analysis and/or independent component analysis.
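  • A compact sketch of such input-level fusion using principal component analysis is shown below: two ontological components are concatenated, decorrelated by centering and SVD, and projected to a lower dimensional common space. The target dimensionality is an arbitrary illustrative choice.

```python
import numpy as np

def pca_fuse(component_a, component_b, n_dims=16):
    """component_a, component_b: (N, Da) and (N, Db) feature arrays for the same N samples."""
    joint = np.hstack([component_a, component_b])             # concatenate the two modalities
    centered = joint - joint.mean(axis=0)                     # remove per-feature mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)   # principal (decorrelated) directions
    return centered @ vt[:n_dims].T                           # project to a lower dimensional common space

fused = pca_fuse(np.random.rand(200, 64), np.random.rand(200, 128))
```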
  • each encoder may utilize a linear and/or nonlinear function to combine different modalities into a single representation, enabling an encoder to process a joint or shared representation of each modality. Differing modalities may be combined simultaneously into a single shared representation, and/or one or multiple modalities may be combined in a step-wise fashion as illustrated in 700.
  • ID data associated with a non-synthetic object-of-interest 201a and/or a synthetic object-of-interest 202a is supplied to the unique information database 211.
  • the closest-match ID data contained within the unique information database 211 is determined.
  • the closest-match data may be utilized to configure the ontological component database 205, the routing module 207 and the encoding operations module 208.
  • the closest-match data is utilized to configure the capture module 203 and/or the virtual capture module 204.
  • the augmentation module 206 may be omitted.
  • the closest-match data is further utilized to determine influence parameters in the modules 205, 207 and 208.
  • the closest-match data is further utilized to determine fusion operations in the routing module 207 and in the encoding operations module 208.
  • the dense representation of the object-to-be-inferred generated by the encoding operations module 208 is stored in the unique information database 211 with the associated configuration data of method 200.
  • the combination of one or more of the capture module 203, the virtual capture module 204, the ontological component database 205, the augmentation module 206, the routing module 207, the encoding operations module 208, the influence parameter module 209, and/or the data fusion module 210 may be utilized to determine a confidence and/or prediction for a non-synthetic object-to-be-inferred 201a and/or the synthetic object-to-be-inferred 202a at the confidence and/or prediction module 212.
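  • The confidence and/or prediction step could be sketched, under the assumption that dense representations are compared by cosine similarity, as a nearest-neighbour search over the unique information database; the similarity-based confidence score is an illustrative assumption, not the patent's scoring rule.

```python
import numpy as np

def predict(query_features, database):
    """database: list of dicts with 'object_id' and 'feature_vector' (1D numpy arrays)."""
    best_id, best_score = None, -1.0
    for record in database:
        stored = record["feature_vector"]
        score = float(np.dot(query_features, stored) /
                      (np.linalg.norm(query_features) * np.linalg.norm(stored) + 1e-12))
        if score > best_score:
            best_id, best_score = record["object_id"], score
    return best_id, best_score   # prediction (closest match) and a similarity-based confidence

database = [{"object_id": "SN-000123", "feature_vector": np.random.rand(64)},
            {"object_id": "SN-000456", "feature_vector": np.random.rand(64)}]
prediction, confidence = predict(np.random.rand(64), database)
```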

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The disclosed systems, components and methods illustrate implementations of the technology focusing on associating an object-to-be-inferred with an object-of-interest. In one context, the system may be a prediction and/or confidence engine aiming to identify objects based on ontological components of the object-of-interest. In order to provide sufficient reliability, the system generates a dataset of ontological component representations, also referred to as unique information.
PCT/IB2023/056290 2022-06-20 2023-06-17 System and method of recognition based on ontological components WO2023248091A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP22179939.8 2022-06-20
EP22179939 2022-06-20

Publications (1)

Publication Number Publication Date
WO2023248091A1 (fr) 2023-12-28

Family

ID=82492661

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2023/056290 WO2023248091A1 (fr) 2022-06-20 2023-06-17 System and method of recognition based on ontological components

Country Status (1)

Country Link
WO (1) WO2023248091A1 (fr)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130153651A1 (en) * 2011-12-20 2013-06-20 Elena A. Fedorovskaya Encoding information in illumination patterns
US20210158017A1 (en) * 2017-10-05 2021-05-27 Applications Mobiles Overview Inc. Method for 3d object recognition based on 3d primitives


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23741770

Country of ref document: EP

Kind code of ref document: A1