TWI822423B - Computing apparatus and model generation method - Google Patents

Computing apparatus and model generation method

Info

Publication number
TWI822423B
Authority
TW
Taiwan
Prior art keywords
time point
correlation
processor
sensing
coordinate system
Prior art date
Application number
TW111140954A
Other languages
Chinese (zh)
Other versions
TW202405757A (en)
Inventor
杜宇威
張鈞凱
Original Assignee
杜宇威
Priority date
Filing date
Publication date
Application filed by 杜宇威
Priority to US18/353,852 (US20240029350A1)
Application granted
Publication of TWI822423B
Publication of TW202405757A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/10 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration
    • G01C21/12 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning
    • G01C21/16 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation
    • G01C21/165 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments
    • G01C21/1656 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00 by using measurements of speed or acceleration executed aboard the object being navigated; Dead reckoning by integrating acceleration or speed, i.e. inertial navigation combined with non-inertial navigation instruments with passive imaging devices, e.g. cameras
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C21/00 Navigation; Navigational instruments not provided for in groups G01C1/00 - G01C19/00
    • G01C21/20 Instruments for performing navigational calculations

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Monitoring And Testing Of Nuclear Reactors (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Separation Using Semi-Permeable Membranes (AREA)
  • Feedback Control In General (AREA)
  • Image Analysis (AREA)

Abstract

A computing apparatus and a model generation method are provided. In the method, sensing data is fused to determine depth information of multiple sensing points, moving trajectories of one or more pixels in the image data are tracked according to the image data and the inertial measurement data through the visual inertial odometry (VIO) algorithm, and those sensing points are mapped into a coordinate system according to the depth information and the moving trajectories through the simultaneous localization and mapping (SLAM) algorithm, to generate a three-dimensional environment model. An object is set on the three-dimensional environment model through a setting operation. The shopping information of the object is provided.

Description

Computing device and model generation method

The present invention relates to spatial modeling technology, and in particular to a computing device and a model generation method.

To simulate a real environment, the space of the real environment can be scanned to produce a simulated environment that resembles it. Simulated environments can be used in applications such as games, home furnishing, and robot movement. Notably, the sensing data obtained by scanning the space may contain errors, which in turn distort the simulated environment.

Embodiments of the present invention provide a computing device and a model generation method that can compensate for these errors and thereby improve the fidelity of the simulated environment.

The model generation method of an embodiment of the present invention includes the following. Multiple sensing data are fused to determine depth information of multiple sensing points. The sensing data include image data and inertial measurement data. A movement trajectory of one or more pixels in the image data is tracked according to the image data and the inertial measurement data through a Visual Inertial Odometry (VIO) algorithm. The sensing points are mapped to a coordinate system according to the depth information and the movement trajectory through a Simultaneous Localization And Mapping (SLAM) algorithm, to generate a three-dimensional environment model. Positions in the three-dimensional environment model are defined by the coordinate system.

The computing device of an embodiment of the present invention includes a memory and a processor. The memory stores program code. The processor is coupled to the memory. The processor loads the program code so that the computing device is configured to fuse multiple sensing data to determine depth information of multiple sensing points, to track a movement trajectory of one or more pixels in the image data according to the image data and the inertial measurement data through the visual inertial odometry algorithm, and to map the sensing points to a coordinate system according to the depth information and the movement trajectory through the simultaneous localization and mapping algorithm, to generate a three-dimensional environment model. The sensing data include image data and inertial measurement data. Positions in the three-dimensional environment model are defined by the coordinate system.

Based on the above, the computing device and model generation method of the present invention use the VIO and SLAM algorithms to estimate the positions of sensing points in the environment and build a three-dimensional environment model accordingly. This can improve the accuracy of position estimation and the fidelity of the three-dimensional model.

To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.

FIG. 1 is a schematic diagram of a model generation system 1 according to an embodiment of the present invention. Referring to FIG. 1, the model generation system 1 includes (but is not limited to) a mobile device 10 and a computing device 30.

The mobile device 10 may be a mobile phone, a tablet computer, a scanner, a robot, a wearable device, a self-propelled vehicle, or a vehicle-mounted system. The mobile device 10 includes (but is not limited to) multiple sensors 11.

The sensor 11 may be an image capture device, a LiDAR, a Time-of-Flight (ToF) detector, an Inertial Measurement Unit (IMU), an accelerometer, a gyroscope, or an electronic compass. In one embodiment, the sensor 11 is used to obtain sensing data. The sensing data include image data and inertial sensing data. The image data may be one or more images and the sensed intensities of their pixels. The inertial sensing data may be attitude, three-axis acceleration, angular velocity, or displacement.

The computing device 30 may be a mobile phone, a tablet computer, a desktop computer, a notebook computer, a server, or an intelligent assistant device. The computing device 30 is communicatively connected to the mobile device 10. For example, data may be transmitted or received through Wi-Fi, Bluetooth, infrared, or other wireless transmission technologies, or through internal circuit wiring, Ethernet, an optical fiber network, Universal Serial Bus (USB), or other wired transmission technologies, possibly by means of an additional communication transceiver (not shown). The computing device 30 includes (but is not limited to) a memory 31 and a processor 32.

The memory 31 may be any type of fixed or removable Random Access Memory (RAM), Read Only Memory (ROM), flash memory, Hard Disk Drive (HDD), Solid-State Drive (SSD), or similar component. In one embodiment, the memory 31 is used to store program code, software modules, data (for example, sensing data or three-dimensional models), or files, the details of which are described in subsequent embodiments.

The processor 32 is coupled to the memory 31. The processor 32 may be a Central Processing Unit (CPU), or another programmable general-purpose or special-purpose microprocessor, a Digital Signal Processor (DSP), a programmable controller, an Application-Specific Integrated Circuit (ASIC), another similar component, or a combination of the above. In one embodiment, the processor 32 is used to execute all or part of the operations of the computing device 30, and can load and execute the program code, software modules, files, and/or data stored in the memory 31. In one embodiment, the processor 32 performs all or part of the operations of the embodiments of the present invention. In some embodiments, the software modules or program code recorded in the memory 31 may also be implemented by physical circuits.

In some embodiments, the mobile device 10 and the computing device 30 may be integrated into a single standalone device.

In the following, the method of the embodiments of the present invention is described with reference to the devices and components of the model generation system 1. Each step of the method may be adjusted according to the implementation and is not limited to what is described here.

FIG. 2 is a flow chart of a model generation method according to an embodiment of the present invention. Referring to FIG. 2, the processor 32 of the computing device 30 fuses multiple sensing data to determine depth information of multiple sensing points (step S210). Specifically, scanning the surrounding environment with the sensor 11 forms multiple sensing points. The depth information of a sensing point may be the distance between the sensor 11 and the sensing point. In one embodiment, the processor 32 may segment the images in the image data into multiple image blocks. For example, the processor 32 can identify objects in an image (for example, walls, ceilings, floors, or cabinets) through image feature comparison or a deep learning model, and segment the image into image blocks according to the outline of the area where each object is located. The processor 32 can then determine the depth information corresponding to those image blocks. For example, the processor 32 can extract features through a deep learning model and predict the depth information of an image block or of the object it belongs to based on those features. A deep learning model/algorithm analyzes training samples to learn patterns from them, and then uses those patterns to make predictions on unseen data. In general, depth information is related to the relative size and pose of an object in the scene. The deep learning model is the machine learning model constructed through this learning, and it is used to make inferences on the data to be evaluated (for example, an image region). As another example, the processor 32 can compare an image region with feature information, stored in the memory 31, of objects located at different positions. The processor 32 may determine the depth information based on the position whose degree of similarity is higher than a corresponding threshold.
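As a minimal sketch of the last example above (comparing an image region against stored feature information and accepting a match only above a threshold), the following Python fragment may help. It is illustrative only and is not the disclosed implementation: the feature vectors, the use of cosine similarity, the function names, and the threshold value of 0.9 are all assumptions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def lookup_depth(region_features, references, threshold=0.9):
    """Return the depth of the best-matching stored reference.

    references: list of (feature_vector, depth) pairs (hypothetical format).
    Returns None when no reference is more similar than the threshold.
    """
    best_depth, best_score = None, threshold
    for ref_features, depth in references:
        score = cosine_similarity(region_features, ref_features)
        if score > best_score:
            best_depth, best_score = depth, score
    return best_depth

# Hypothetical stored references: feature vector and known depth in meters.
REFERENCES = [([1.0, 0.0, 0.0], 1.5), ([0.0, 1.0, 0.0], 3.0)]
near_match = lookup_depth([0.9, 0.1, 0.0], REFERENCES)  # close to the first reference
no_match = lookup_depth([1.0, 1.0, 0.0], REFERENCES)    # equally far from both
```

Returning `None` when nothing clears the threshold leaves the caller free to fall back on another depth source, such as the depth sensor of the next embodiment.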

In another embodiment, the sensor 11 is a depth sensor or a distance sensor. The processor 32 can determine the depth information of multiple sensing points in the environment based on the sensing data of the depth sensor or distance sensor.

The processor 32 tracks the movement trajectory of one or more pixels in the image data according to the image data and the inertial measurement data through a Visual Inertial Odometry (VIO) algorithm (step S220). Specifically, VIO is a technique that performs state measurement using one or more image capture devices and one or more IMUs. The state here refers to the attitude, velocity, or other physical quantities of the carrier of the sensor 11 (for example, the mobile device 10) in particular degrees of freedom. Because an image capture device captures photons over a certain exposure time to obtain a two-dimensional (2D) image, the image data it produces during low-speed motion records rich environmental information. However, image data is easily affected by the environment and suffers from scale ambiguity. In contrast, an IMU senses its own angular velocity and acceleration. Although inertial measurement data is limited in kind and accumulates large errors, it is not affected by the environment. In addition, inertial measurement data is expressed in definite physical units, which makes up for the shortcomings of image data. Integrating image data and inertial measurement data yields more accurate inertial navigation.
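The cumulative error of inertial data mentioned above can be illustrated with a short, hedged Python sketch. It assumes a one-dimensional motion, a stationary device, and a constant accelerometer bias of 0.05 m/s²; the function name and all values are hypothetical and serve only to show why inertial data alone drifts.

```python
def integrate_imu(accelerations, dt, v0=0.0, p0=0.0):
    """Dead-reckon a 1-D position by integrating acceleration samples twice."""
    velocity, position = v0, p0
    positions = []
    for a in accelerations:
        velocity += a * dt          # first integration: acceleration -> velocity
        position += velocity * dt   # second integration: velocity -> position
        positions.append(position)
    return positions

BIAS = 0.05                         # assumed constant sensor bias in m/s^2
samples = [BIAS] * 100              # the device is at rest; only the bias is measured
drift = integrate_imu(samples, dt=0.1)
```

After 10 seconds the estimated position has drifted by about 2.5 m although the device never moved, and the drift roughly quadruples when the elapsed time doubles. This is the kind of error that the image data is meant to bound.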

FIG. 3 is a schematic diagram of inertial navigation according to an embodiment of the present invention. Referring to FIG. 3, the processor 32 can determine the position difference of an object in the image data between time point T1 and time point T2. Time point T1 is earlier than time point T2. The object occupies some of the pixels in the image. The processor 32 can identify the object, determine its position in the image, and define it as a landmark L. The processor 32 can compare the positions at which the image capture device 112 captures the same object at the two different time points T1 and T2.

Next, the processor 32 can determine the movement trajectory from time point T1 to time point T2 based on the initial position at time point T1 and the position difference. The initial position is determined from the inertial measurement data at time point T1 (obtained through the IMU 111). For example, integrating the inertial measurements of the IMU 111 yields the initial position. The processor 32 may further convert the position of the landmark L from the sensing coordinate system to the world coordinate system WC. There are many data fusion methods for VIO, for example loosely coupled and tightly coupled methods. A loosely coupled algorithm estimates the pose separately from the image data and from the inertial measurement data, and then fuses the two pose estimates. A tightly coupled algorithm fuses the image data and the inertial measurement data directly, constructs motion and observation equations from the fused data, and performs state estimation accordingly.
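The relationship between the initial position, the position difference, and the conversion to the world coordinate system can be sketched as follows. This is an illustrative two-dimensional simplification, not the disclosed implementation: it assumes a static landmark, no sensor rotation between T1 and T2, and positions already expressed in metric units; the function names are hypothetical.

```python
import math

def sensor_to_world(point_s, pose):
    """Transform a 2-D point from the sensor frame to the world frame.

    pose = (x, y, theta): sensor position and heading in the world frame.
    """
    x, y, theta = pose
    px, py = point_s
    c, s = math.cos(theta), math.sin(theta)
    return (x + c * px - s * py, y + s * px + c * py)

def advance(initial_position, position_difference):
    """Position at T2 = initial position at T1 + displacement."""
    return (initial_position[0] + position_difference[0],
            initial_position[1] + position_difference[1])

# Landmark L as seen in the sensor frame at T1 and at T2 (assumed values).
landmark_t1 = (2.0, 1.0)
landmark_t2 = (1.0, 1.0)

# For a static landmark and no rotation, the sensor moved opposite
# to the apparent shift of the landmark.
shift = (landmark_t1[0] - landmark_t2[0], landmark_t1[1] - landmark_t2[1])
start = (0.0, 0.0)              # initial position, e.g. from IMU integration
end = advance(start, shift)

world_t1 = sensor_to_world(landmark_t1, (start[0], start[1], 0.0))
world_t2 = sensor_to_world(landmark_t2, (end[0], end[1], 0.0))
```

Both observations land on the same world coordinate, which is the consistency a VIO pipeline relies on when it ties image observations to the estimated trajectory.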

Referring to FIG. 2, the processor 32 maps the sensing points to a coordinate system according to the depth information and the movement trajectory through a Simultaneous Localization And Mapping (SLAM) algorithm, to generate a three-dimensional (3D) environment model (step S230). Specifically, the SLAM algorithm uses coordinate transformation to convert the depth information of sensing points observed at different times and from different positions into a single coordinate system, thereby producing a complete three-dimensional model of the environment. Positions in the three-dimensional environment model are defined by this coordinate system.
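The coordinate transformation described above can be sketched under a pinhole camera assumption. The source does not specify a camera model, so the intrinsics (`fx`, `fy`, `cx`, `cy`), the pose representation (a 3x3 rotation matrix plus a translation), and the function names are all illustrative assumptions.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Pixel (u, v) with known depth -> 3-D point in the camera frame (pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

def to_world(point_c, rotation, translation):
    """Apply a rigid transform: world = R * point + t."""
    return tuple(
        sum(rotation[i][j] * point_c[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
INTRINSICS = dict(fx=500.0, fy=500.0, cx=320.0, cy=240.0)  # assumed values

# The same physical point observed from two camera positions:
# first at the origin, then after moving 0.4 m along the x axis.
p_first = to_world(backproject(420, 240, 2.0, **INTRINSICS), IDENTITY, (0.0, 0.0, 0.0))
p_second = to_world(backproject(320, 240, 2.0, **INTRINSICS), IDENTITY, (0.4, 0.0, 0.0))
```

With correct poses both observations map to the same world point, so repeated observations accumulate into one consistent model; with erroneous poses they would not, which motivates the corrections discussed next.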

However, an unbiased, highly accurate three-dimensional environment model relies on unbiased movement trajectories and depth information, whereas the various sensors 11 usually have errors of differing degrees. In addition, noise is usually present in real environments, so a SLAM algorithm must consider not only a mathematically unique solution but also the interaction with the physical concepts related to the result. Notably, in each subsequent iteration of building the three-dimensional model, the measured distance and direction/attitude carry a predictable series of errors. These errors are usually caused by the limited accuracy of the sensor 11 and by other noise from the environment, and they appear as errors in points or features of the three-dimensional environment model. As time passes and the motion changes, localization and mapping errors accumulate and degrade the accuracy of the map itself.

In one embodiment, the processor 32 may match a first correlation at a first time point with a second correlation at a second time point. The first time point is earlier than the second time point. The first correlation is the correlation between the sensing data at the first time point and the corresponding positions in the three-dimensional environment model, and the second correlation is the correlation between the sensing data at the second time point and the corresponding positions in the three-dimensional environment model. In other words, each correlation relates the sensing data at a specific time point to the corresponding landmarks. The SLAM framework resolves the deviations of the various sensing data by solving an iterative mathematical problem, for example by forming motion equations and observation equations from the sensing data (as the state).

The processor 32 may correct the positions of the sensing points in the coordinate system according to the matching result between the first correlation and the second correlation. To compensate for these errors, the processor 32 may match the current three-dimensional environment model against the previous three-dimensional environment model. One example is a loop closure algorithm, which detects that a location in the three-dimensional environment model has been visited before. Other examples are probabilistic SLAM algorithms such as Kalman filtering, particle filtering (a Monte Carlo method), and scan matching of range data. Through these algorithms, the processor 32 can gradually optimize past and present trajectory positions and depth information by comparing current (for example, second time point) and past (for example, first time point) sensing data. Recursive optimization yields accurate estimates of each point in the environment. As described above, the algorithm of the embodiments of the present invention can form a closed loop, and only then can it accumulate a complete and accurate three-dimensional environment model as the trajectory advances. Conversely, if no closed loop is formed, the errors may keep accumulating and growing, eventually making earlier and later data inconsistent and producing a useless three-dimensional environment model.
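To make the effect of closing a loop concrete, the following Python sketch spreads the accumulated end-point error linearly back along a two-dimensional trajectory once the device is known to have returned to its starting location. This linear distribution is a deliberately simple stand-in chosen for illustration; it is not the pose-graph or probabilistic method a real SLAM system would use, and the function name and data are hypothetical.

```python
def distribute_loop_error(trajectory):
    """Correct a drifted 2-D trajectory whose last pose should equal its first.

    The end-point error is distributed linearly over the poses, so early
    poses move little and late poses move the most.
    """
    n = len(trajectory) - 1
    if n < 1:
        return list(trajectory)
    err_x = trajectory[-1][0] - trajectory[0][0]
    err_y = trajectory[-1][1] - trajectory[0][1]
    return [
        (x - err_x * i / n, y - err_y * i / n)
        for i, (x, y) in enumerate(trajectory)
    ]

# A square walk that should end where it started, with accumulated drift.
drifted = [(0.0, 0.0), (1.0, 0.1), (1.1, 1.1), (0.1, 1.2), (0.2, 0.2)]
corrected = distribute_loop_error(drifted)
```

After correction the last pose coincides with the first again, and intermediate poses shift in proportion to how much drift had accumulated when they were recorded.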

In one embodiment, the processor 32 may minimize the error of the positions of the sensing points in the coordinate system through an optimization algorithm according to the first correlation and the second correlation, and estimate the positions of the sensing points in the coordinate system through a filtering algorithm according to the second correlation. The optimization algorithm converts the SLAM state estimation into an error term and minimizes that error term, for example with Newton's method, the Gauss-Newton method, or the Levenberg-Marquardt method. The filtering algorithm is, for example, Kalman filtering, extended Kalman filtering, or particle filtering. The optimization algorithm can take into account sensing data from different time points, whereas the filtering algorithm accounts for noise in the current sensing data.
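As an illustration of turning state estimation into an error term and minimizing it, the sketch below applies the Gauss-Newton method to a small problem: estimating a 2-D position from range measurements to known landmarks. The problem setup, the landmark coordinates, and the function name are assumptions made for the example; the source names the method but does not give a formulation.

```python
import math

def gauss_newton_position(landmarks, ranges, guess, iterations=10):
    """Estimate a 2-D position by minimizing the sum of squared range residuals.

    Residual for landmark i: r_i = ||x - l_i|| - d_i.
    Each iteration solves the normal equations (J^T J) delta = -J^T r.
    """
    x, y = guess
    for _ in range(iterations):
        a11 = a12 = a22 = b1 = b2 = 0.0
        for (lx, ly), d in zip(landmarks, ranges):
            dx, dy = x - lx, y - ly
            dist = math.hypot(dx, dy)
            if dist == 0.0:
                continue                        # Jacobian undefined at the landmark
            r = dist - d                        # residual
            jx, jy = dx / dist, dy / dist       # Jacobian row of the residual
            a11 += jx * jx
            a12 += jx * jy
            a22 += jy * jy
            b1 += jx * r
            b2 += jy * r
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-12:
            break                               # degenerate geometry, stop updating
        x += -(a22 * b1 - a12 * b2) / det       # closed-form 2x2 solve
        y += -(-a12 * b1 + a11 * b2) / det
    return x, y

LANDMARKS = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
TRUE_POSITION = (1.0, 1.0)
RANGES = [math.hypot(TRUE_POSITION[0] - lx, TRUE_POSITION[1] - ly) for lx, ly in LANDMARKS]
estimate = gauss_newton_position(LANDMARKS, RANGES, guess=(2.0, 2.0))
```

Because the measurements here are noise-free, the iteration converges to the true position; with noisy measurements it would converge to the least-squares compromise between them.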

Unlike the prior art, which uses only an optimization algorithm or only a filtering algorithm, the embodiments of the present invention combine the two. The relative weight of the optimization algorithm and the filtering algorithm depends on the software and hardware resources of the computing device 30 and on the required accuracy of the predicted position. For example, if the software and hardware resources or the accuracy requirement are low, the filtering algorithm is weighted more heavily than the optimization algorithm. If the software and hardware resources or the accuracy requirement are high, the optimization algorithm is weighted more heavily than the filtering algorithm.
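The source does not state how the two algorithms are weighted, so the following Python sketch shows only one possible reading: a scalar optimization share in [0, 1] derived from hypothetical resource and accuracy levels, used to blend a batch least-squares estimate (which uses both time points) with a Kalman update (which uses only the current measurement). Every function name, the averaging rule, and the numeric values are assumptions.

```python
def kalman_update(x, p, z, r):
    """One scalar Kalman measurement update.

    x, p: prior estimate and its variance; z, r: measurement and its variance.
    """
    k = p / (p + r)                 # Kalman gain
    return x + k * (z - x), (1.0 - k) * p

def optimization_estimate(measurements):
    """Least-squares estimate of a static scalar: the mean of all measurements."""
    return sum(measurements) / len(measurements)

def optimization_share(resource_level, accuracy_level):
    """Assumed rule: share of the optimization result, from levels in [0, 1]."""
    return min(1.0, max(0.0, (resource_level + accuracy_level) / 2.0))

def fused_position(z_first, z_second, prior_variance, noise_variance,
                   resource_level, accuracy_level):
    """Blend the optimization and filtering estimates of one coordinate."""
    opt = optimization_estimate([z_first, z_second])      # uses both time points
    filt, _ = kalman_update(z_first, prior_variance,      # uses the current one
                            z_second, noise_variance)
    w = optimization_share(resource_level, accuracy_level)
    return w * opt + (1.0 - w) * filt

low = optimization_share(0.2, 0.2)    # constrained device: filter dominates
high = optimization_share(0.9, 0.9)   # capable device: optimization dominates
blended = fused_position(2.0, 2.4, 1.0, 0.5, 0.9, 0.9)
```

Other readings are equally plausible, for example running the optimizer only on every n-th frame; the blend above is simply the shortest one to write down.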

In one embodiment, the processor 32 may receive a setting operation. The setting operation may be obtained through an input device such as a touch panel, a mouse, or a keyboard, for example as a swipe, press, or click. The processor 32 can place an object in the three-dimensional environment model according to the setting operation. Depending on the application, the object is, for example, furniture, a picture frame, or a home appliance. The processor 32 can move the object according to the setting operation and place it at a designated position in the three-dimensional environment model. The processor 32 can then provide shopping information for the object through a display (not shown), for example the item name, price, shipping method, and payment options. The processor 32 can also connect to an online store server through a communication transceiver (not shown) and complete the shopping process accordingly.

In one application scenario, the mobile device 10 can quickly scan a space and capture all of its dimensions, so that the user can arrange furniture directly in the three-dimensional environment model without any manual measurement. Embodiments of the present invention can also provide a Software as a Service (SaaS) system that lets users preview arrangements or adjust placement with reference to the actual space, and a shopping program loaded on the computing device 30 can add products to a shopping cart for direct purchase. In addition, cloud connectivity allows users to help one another arrange spaces remotely, and thereby become the largest online home furnishing community. The rapid modeling of the embodiments of the present invention is not limited to furniture arrangement and can also be applied elsewhere.

In summary, in the computing device and model generation method of the present invention, data from sensors such as the LiDAR, camera, and IMU of a mobile phone or other portable mobile device are fused to obtain depth information, the VIO algorithm tracks the movement trajectories of different pixels in the camera images, and the depth information and movement trajectories are optimized within the SLAM framework to obtain accurate estimates of each sensing point in the environment.

Although the present invention has been disclosed above through embodiments, they are not intended to limit the present invention. Anyone with ordinary knowledge in the technical field may make minor changes and refinements without departing from the spirit and scope of the present invention. The scope of protection of the present invention is therefore defined by the appended claims.

1: model generation system
10: mobile device
11: sensor
30: computing device
31: memory
32: processor
S210~S230: steps
T1, T2: time points
111: IMU
112: image capture device
L: landmark
WC: world coordinate system

FIG. 1 is a schematic diagram of a model generation system according to an embodiment of the present invention.
FIG. 2 is a flow chart of a model generation method according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of inertial navigation according to an embodiment of the present invention.


Claims (6)

A model generation method, adapted for a computing device, wherein the computing device comprises a processor, the method comprising:
fusing, via the processor, a plurality of sensing data to determine depth information of a plurality of sensing points, wherein the sensing data comprise image data and inertial measurement data, and wherein the step of determining the depth information of the sensing points comprises:
identifying, via the processor and according to the image data, a plurality of object image blocks matched to a plurality of objects;
inputting, via the processor, the object image blocks to a deep learning model to output features and depth information of the objects, thereby determining the depth information of the sensing points corresponding to the objects;
tracking, via the processor, a movement trajectory of at least one pixel in the image data through a visual inertial odometry (VIO) algorithm according to the image data and the inertial measurement data; and
mapping, via the processor, the sensing points to a coordinate system through a simultaneous localization and mapping (SLAM) algorithm according to the depth information and the movement trajectory, to generate a three-dimensional environment model, wherein positions in the three-dimensional environment model are defined by the coordinate system, and wherein the step of mapping the sensing points to the coordinate system comprises:
matching, via the processor, a first correlation at a first time point with a second correlation at a second time point, wherein the first time point is earlier than the second time point, the first correlation is a correlation between the sensing data at the first time point and corresponding positions in the three-dimensional environment model, and the second correlation is a correlation between the sensing data at the second time point and corresponding positions in the three-dimensional environment model; and
correcting, via the processor, positions of the sensing points in the coordinate system according to a matching result between the first correlation and the second correlation, wherein the step of correcting the positions of the sensing points in the coordinate system according to the matching result comprises:
minimizing, via the processor, an error of the positions of the sensing points in the coordinate system through an optimization algorithm according to the first correlation and the second correlation; and
estimating, via the processor, the positions of the sensing points in the coordinate system through a filtering algorithm according to the second correlation, wherein respective weights of the optimization algorithm and the filtering algorithm are related to resources of the computing device and an accuracy of the predicted positions, wherein when the resources or the accuracy requirement is lower, the weight of the filtering algorithm is higher than that of the optimization algorithm, and when the resources or the accuracy requirement is higher, the weight of the optimization algorithm is higher than that of the filtering algorithm.

The model generation method of claim 1, wherein the step of fusing the sensing data comprises:
dividing, via the processor, the image data into a plurality of image blocks;
determining, via the processor, depth information corresponding to the image blocks,
and the step of tracking the movement trajectory of the at least one pixel in the image data comprises:
determining, via the processor, a position difference of an object in the image data between a third time point and a fourth time point, wherein the third time point is earlier than the fourth time point; and
determining, via the processor, a movement trajectory from the third time point to the fourth time point according to an initial position at the third time point and the position difference, wherein the initial position is determined according to the inertial measurement data at the third time point.
The model generation method of claim 1, further comprising:
receiving, via the processor, a setting operation;
setting, via the processor, an object in the three-dimensional environment model according to the setting operation; and
providing, via the processor, shopping information of the object.

A computing device, comprising:
a memory, configured to store program code; and
a processor, coupled to the memory and configured to load and execute the program code to perform:
fusing a plurality of sensing data to determine depth information of a plurality of sensing points, wherein the sensing data comprise image data and inertial measurement data, and wherein the step of determining the depth information of the sensing points comprises:
identifying, according to the image data, a plurality of object image blocks matched to a plurality of objects;
inputting the object image blocks to a deep learning model to output features and depth information of the objects, thereby determining the depth information of the sensing points corresponding to the objects;
tracking a movement trajectory of at least one pixel in the image data through a visual inertial odometry algorithm according to the image data and the inertial measurement data; and
mapping the sensing points to a coordinate system through simultaneous localization and mapping according to the depth information and the movement trajectory, to generate a three-dimensional environment model, wherein positions in the three-dimensional environment model are defined by the coordinate system, and wherein the step of mapping the sensing points to the coordinate system comprises:
matching a first correlation at a first time point with a second correlation at a second time point, wherein the first time point is earlier than the second time point, the first correlation is a correlation between the sensing data at the first time point and corresponding positions in the three-dimensional environment model, and the second correlation is a correlation between the sensing data at the second time point and corresponding positions in the three-dimensional environment model; and
correcting positions of the sensing points in the coordinate system according to a matching result between the first correlation and the second correlation, wherein the step of correcting the positions of the sensing points in the coordinate system according to the matching result comprises:
minimizing an error of the positions of the sensing points in the coordinate system through an optimization algorithm according to the first correlation and the second correlation; and
estimating the positions of the sensing points in the coordinate system through a filtering algorithm according to the second correlation, wherein respective weights of the optimization algorithm and the filtering algorithm are related to resources of the computing device and an accuracy of the predicted positions, wherein when the resources or the accuracy requirement is lower, the weight of the filtering algorithm is higher than that of the optimization algorithm, and when the resources or the accuracy requirement is higher, the weight of the optimization algorithm is higher than that of the filtering algorithm.

The computing device of claim 6, wherein the processor is further configured to perform:
dividing the image data into a plurality of image blocks;
determining depth information corresponding to the image blocks;
determining a position difference of an object in the image data between a third time point and a fourth time point, wherein the third time point is earlier than the fourth time point; and
determining a movement trajectory from the third time point to the fourth time point according to an initial position at the third time point and the position difference, wherein the initial position is determined according to the inertial measurement data at the third time point.

The computing device of claim 6, wherein the processor is further configured to perform:
receiving a setting operation;
setting an object in the three-dimensional environment model according to the setting operation; and
providing shopping information of the object.
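
As an illustration only, two of the claimed steps can be sketched in Python: the weighting between the filtering estimate and the optimization estimate, and the trajectory built from an initial position and a position difference. The claims do not specify how the weights are computed, so everything here is an assumption: the function names (`filter_weight`, `blend_position`, `trajectory_segment`), the normalization of resources and accuracy requirement to [0, 1], and the linear blending rule are hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def filter_weight(resources: float, accuracy_requirement: float) -> float:
    """Weight given to the filtering estimate, in [0, 1].

    Assumption: both inputs are normalized to [0, 1]. Lower resources or a
    lower accuracy requirement push the weight toward the filter; higher
    values push it toward the optimizer.
    """
    for value in (resources, accuracy_requirement):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be normalized to [0, 1]")
    demand = 0.5 * (resources + accuracy_requirement)  # assumed linear rule
    return 1.0 - demand


def blend_position(filtered: Vec3, optimized: Vec3, w_filter: float) -> Vec3:
    """Convex combination of the two position estimates of one sensing point."""
    w_opt = 1.0 - w_filter
    return tuple(w_filter * f + w_opt * o for f, o in zip(filtered, optimized))


def trajectory_segment(initial_position: Vec3, position_difference: Vec3) -> List[Vec3]:
    """Trajectory between two time points: the initial position (assumed to
    come from inertial measurement data) and that position shifted by the
    position difference observed in the image data."""
    end = tuple(p + d for p, d in zip(initial_position, position_difference))
    return [initial_position, end]


if __name__ == "__main__":
    w = filter_weight(resources=0.2, accuracy_requirement=0.3)
    print("filter weight:", w, "optimizer weight:", 1.0 - w)
    print(blend_position((0.0, 0.0, 0.0), (2.0, 4.0, 6.0), w_filter=0.25))
    print(trajectory_segment((1.0, 2.0, 0.0), (0.5, -1.0, 0.0)))
```

A convex combination is used so that the two weights always sum to one; an implementation could equally switch between the two algorithms or adjust how often the optimizer runs, which the claims leave open.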
TW111140954A 2022-07-22 2022-10-27 Computing apparatus and model generation method TWI822423B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/353,852 US20240029350A1 (en) 2022-07-22 2023-07-17 Computing apparatus and model generation method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202263391333P 2022-07-22 2022-07-22
US63/391,333 2022-07-22

Publications (2)

Publication Number Publication Date
TWI822423B true TWI822423B (en) 2023-11-11
TW202405757A TW202405757A (en) 2024-02-01

Family

ID=86689530

Family Applications (2)

Application Number Title Priority Date Filing Date
TW111140954A TWI822423B (en) 2022-07-22 2022-10-27 Computing apparatus and model generation method
TW111211774U TWM637241U (en) 2022-07-22 2022-10-27 Computing apparatus and model generation system

Family Applications After (1)

Application Number Title Priority Date Filing Date
TW111211774U TWM637241U (en) 2022-07-22 2022-10-27 Computing apparatus and model generation system

Country Status (2)

Country Link
CN (1) CN117437348A (en)
TW (2) TWI822423B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI822423B (en) * 2022-07-22 2023-11-11 杜宇威 Computing apparatus and model generation method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109556611A (en) * 2018-11-30 2019-04-02 广州高新兴机器人有限公司 A kind of fusion and positioning method based on figure optimization and particle filter
US20220137223A1 (en) * 2020-10-30 2022-05-05 Faro Technologies, Inc. Simultaneous localization and mapping algorithms using three-dimensional registration
CN114608554A (en) * 2022-02-22 2022-06-10 北京理工大学 Handheld SLAM equipment and robot instant positioning and mapping method
TWI768776B (en) * 2021-03-19 2022-06-21 國立臺灣大學 Indoor positioning system and indoor positioning method
TWM637241U (en) * 2022-07-22 2023-02-01 杜宇威 Computing apparatus and model generation system

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109556611A (en) * 2018-11-30 2019-04-02 广州高新兴机器人有限公司 A kind of fusion and positioning method based on figure optimization and particle filter
US20220137223A1 (en) * 2020-10-30 2022-05-05 Faro Technologies, Inc. Simultaneous localization and mapping algorithms using three-dimensional registration
TWI768776B (en) * 2021-03-19 2022-06-21 國立臺灣大學 Indoor positioning system and indoor positioning method
CN114608554A (en) * 2022-02-22 2022-06-10 北京理工大学 Handheld SLAM equipment and robot instant positioning and mapping method
TWM637241U (en) * 2022-07-22 2023-02-01 杜宇威 Computing apparatus and model generation system

Also Published As

Publication number Publication date
TWM637241U (en) 2023-02-01
CN117437348A (en) 2024-01-23
TW202405757A (en) 2024-02-01

Similar Documents

Publication Publication Date Title
US11704833B2 (en) Monocular vision tracking method, apparatus and non-transitory computer-readable storage medium
US20210190497A1 (en) Simultaneous location and mapping (slam) using dual event cameras
CN111156998B (en) Mobile robot positioning method based on RGB-D camera and IMU information fusion
US20200226782A1 (en) Positioning method, positioning apparatus, positioning system, storage medium, and method for constructing offline map database
Panahandeh et al. Vision-aided inertial navigation based on ground plane feature detection
CN102622762B (en) Real-time camera tracking using depth maps
US11132810B2 (en) Three-dimensional measurement apparatus
US8976172B2 (en) Three-dimensional scanning using existing sensors on portable electronic devices
CN105143907B (en) Alignment system and method
Sola et al. Fusing monocular information in multicamera SLAM
US10157478B2 (en) Enabling use of three-dimensional locations of features with two-dimensional images
US12062210B2 (en) Data processing method and apparatus
TW202208879A (en) Pose determination method, electronic device and computer readable storage medium
EP3090410A1 (en) Methods and systems for generating a map including sparse and dense mapping information
WO2022247548A1 (en) Positioning method, apparatus, electronic device, and storage medium
CN104848861A (en) Image vanishing point recognition technology based mobile equipment attitude measurement method
Liu et al. Enabling context-aware indoor augmented reality via smartphone sensing and vision tracking
TWI822423B (en) Computing apparatus and model generation method
TW202238449A (en) Indoor positioning system and indoor positioning method
US20240029350A1 (en) Computing apparatus and model generation method
CN113610702B (en) Picture construction method and device, electronic equipment and storage medium
US20240069203A1 (en) Global optimization methods for mobile coordinate scanners
JP2023503750A (en) ROBOT POSITIONING METHOD AND DEVICE, DEVICE, STORAGE MEDIUM
Irmisch et al. Simulation framework for a visual-inertial navigation system
CN105741260A (en) Action positioning device and positioning method thereof