TWI799051B - Automatic guided vehicle and method for forking pallet

Info

Publication number: TWI799051B
Application number: TW111100044A
Authority: TW
Inventors: 李永仁; 洪瑞志; 范妏瑄
Original assignee: 財團法人工業技術研究院
Priority date: 2022-01-03
Filing date: 2022-01-03
Publication date: 2023-04-11
Also published as: TW202327977A

Abstract

An automatic guided vehicle (AGV) and a method for forking a pallet are provided. The method includes: capturing a depth map by a depth camera, wherein the depth map includes the pallet; obtaining a plurality of images respectively corresponding to a plurality of depths from the depth map; inputting the plurality of images to a machine learning model to generate 3-dimensional location information of the pallet; and controlling a driving wheel and a fork of the AGV to move the pallet according to the 3-dimensional location information. The invention includes an automatic guided vehicle (AGV) for forking a pallet.

Description

Automatic guided vehicle for forking pallets and method for forking pallets

The invention relates to an automatic guided vehicle (AGV) for moving pallets and a method for forking pallets.

With the trend toward smart factories, automatic pallet-forking technology can extend the autonomous operation of automatic guided vehicles, effectively saving the labor and time that logistics personnel spend carrying or shelving goods, and thereby accelerating automation in the logistics industry and in traditional industries. In recent years, AI machine learning technology has matured considerably. Many models for image recognition are now available, their recognition rates and recognition speeds have reached a solid baseline, and the models can be continuously optimized and corrected for a variety of applications.

Most existing automatic pallet-forking technologies have the AGV travel to a designated position by means of AGV positioning, and use a touch sensor and an infrared sensor to judge whether the pallet can be forked successfully. Other approaches identify the pallet using a 3D vision camera together with a computational model; however, the equipment cost is high, and the pallet must be placed in a specific protruding manner before it can be identified and forked. For an AGV to move a pallet autonomously, the AGV must be able to identify the pallet. However, when the logistics warehouse is cramped and cluttered, traditional machine vision technology cannot accurately determine the position of the pallet. In addition, the material of the pallet also affects the object recognition result.

The invention provides an automatic guided vehicle for moving a pallet and a method for forking a pallet, which can determine the position of the pallet from a depth map.

An automatic guided vehicle for forking a pallet according to the invention includes a fork, a driving wheel, a depth camera, a storage medium and a processor. The depth camera captures a depth map, wherein the depth map includes the pallet. The storage medium stores a machine learning model. The processor is coupled to the fork, the driving wheel, the depth camera and the storage medium, and is configured to: obtain a plurality of images respectively corresponding to a plurality of depths from the depth map; input the plurality of images into the machine learning model to generate three-dimensional positioning information of the pallet; and control the driving wheel and the fork according to the three-dimensional positioning information to move the pallet.

In an embodiment of the invention, the plurality of images includes a first image, wherein the machine learning model detects the first image to generate a first bounding box and a second bounding box corresponding to the pallet, wherein the processor calculates a first distance between a center point of the first image and the first bounding box and calculates a second distance between the center point and the second bounding box, and wherein the processor selects the first bounding box from the first bounding box and the second bounding box in response to the first distance being less than the second distance, and generates the three-dimensional positioning information according to the selected first bounding box.

In an embodiment of the invention, the machine learning model detects the first image to generate a first confidence corresponding to the first bounding box and a second confidence corresponding to the second bounding box, wherein the processor compares the first confidence with the second confidence in response to the first distance being the same as the second distance, and wherein the processor selects the first bounding box from the first bounding box and the second bounding box in response to the first confidence being greater than the second confidence, and generates the three-dimensional positioning information according to the selected first bounding box.

In an embodiment of the invention, the plurality of images includes a first image corresponding to a first depth, wherein the first image includes a pixel, wherein the processor obtains from the depth map an abscissa value of the pixel, an ordinate value of the pixel and a distance between the pixel and the depth camera, and wherein the processor determines, according to the abscissa value, the ordinate value and the distance, that the pixel corresponds to the first depth.

In an embodiment of the invention, the processor calculates a second distance between a coordinate origin and the pixel according to the abscissa value and the ordinate value, calculates the arccosine of the ratio of the second distance to the distance, and multiplies the distance by the arccosine value to calculate the first depth.

In an embodiment of the invention, the processor controls the driving wheel to reduce an angle between the bay of the pallet and the optical axis of the depth camera in response to the angle being greater than an angle threshold.

In an embodiment of the invention, the AGV further includes a speed sensor. The speed sensor is coupled to the processor, wherein the speed sensor obtains a linear velocity and an angular velocity of the AGV, and wherein the processor controls the driving wheel according to the linear velocity, the angular velocity and the three-dimensional positioning information.

In an embodiment of the invention, the AGV further includes a driven wheel. A first axle of the driving wheel and a second axle of the driven wheel are separated by a distance, and the processor controls the driving wheel according to the distance.

In an embodiment of the invention, adjacent depths among the plurality of depths are separated from each other by a preset spacing.

In an embodiment of the invention, the processor determines the preset spacing according to the length of the pallet.

In an embodiment of the invention, the depth camera includes at least one of the following: a camera with an RGBD lens, a camera with an infrared lens, or a camera with multiple lenses.

In an embodiment of the invention, the AGV further includes a transceiver. The transceiver is coupled to the processor, wherein the processor, in response to receiving a command through the transceiver, controls the driving wheel to move the AGV to a designated location and captures the depth map at the designated location.

In an embodiment of the invention, the machine learning model is a YOLO model.

A method for forking a pallet according to the invention is adapted to an automatic guided vehicle, wherein the method includes: capturing a depth map by a depth camera, wherein the depth map includes the pallet; obtaining a plurality of images respectively corresponding to a plurality of depths from the depth map; inputting the plurality of images into a machine learning model to generate three-dimensional positioning information of the pallet; and controlling a driving wheel and a fork of the AGV according to the three-dimensional positioning information to move the pallet.

Based on the above, the AGV of the invention can detect the three-dimensional positioning information of a pallet and automatically move the pallet according to that information, thereby saving the factory considerable labor and time costs. By combining a depth camera with AI machine learning, the invention allows logistics operators to effectively identify the position and angle of a pallet with low equipment cost and few requirements on how the environment is set up. Through layered processing of the depth image and AI deep learning, the invention can avoid failures to identify a pallet caused by changes in the environment, lighting, pallet material and the like.

100: automatic guided vehicle

110: processor

120: storage medium

121: machine learning model

130: transceiver

140: depth camera

141: optical axis

150: fork

160: wheels

161: driving wheel

162: driven wheel

170: speed sensor

200: pallet

210: bay

211: central axis

300, 400, 500: bounding box

310, 410, O: center point

61, 62: axle

71, 72: image

A1, A2, d, d2, d3, D: distance

d1: depth

H: height

L: length

m: preset spacing

P0: coordinate origin

P1: pixel

R: radius of rotation

W: width

x_J, x_K, z_J, z_K: coordinate values

δ, δ_J, δ_K: yaw angle

θ: angle

S301, S302, S303, S304, S305, S306, S307, S308, S309, S310, S311, S312, S313, S314, S315, S901, S902, S903, S904: steps

FIG. 1 is a schematic diagram of an automatic guided vehicle for forking pallets according to an embodiment of the invention.

FIG. 2 is a schematic diagram of the wheel configuration of the AGV according to an embodiment of the invention.

FIG. 3 is a flowchart of a method for moving a pallet according to an embodiment of the invention.

FIG. 4 is a schematic diagram of the radius of rotation and the yaw angle of the AGV according to an embodiment of the invention.

FIG. 5 is a schematic diagram of the AGV and a pallet according to an embodiment of the invention.

FIG. 6 is a schematic diagram of determining the depth of an object according to an embodiment of the invention.

FIGS. 7 and 8 are schematic diagrams of images corresponding to different depths according to an embodiment of the invention.

FIG. 9 is a flowchart of a method for forking a pallet according to an embodiment of the invention.

To make the content of the invention easier to understand, the following embodiments are given as examples by which the invention can actually be practiced. In addition, wherever possible, elements, components and steps that use the same reference numerals in the drawings and embodiments represent the same or similar parts.

FIG. 1 is a schematic diagram of an automatic guided vehicle (AGV) 100 for forking pallets according to an embodiment of the invention. The AGV 100 may include a processor 110, a storage medium 120, a transceiver 130, a depth camera 140, a fork 150 and wheels 160. In an embodiment, the AGV 100 may further include a speed sensor 170. The AGV 100 is, for example, a forklift or a fork truck.

The processor 110 is, for example, a central processing unit (CPU), or another programmable general-purpose or special-purpose micro control unit (MCU), microprocessor, digital signal processor (DSP), programmable controller, application specific integrated circuit (ASIC), graphics processing unit (GPU), image signal processor (ISP), image processing unit (IPU), arithmetic logic unit (ALU), complex programmable logic device (CPLD), field programmable gate array (FPGA), another similar component, or a combination of the above. The processor 110 may be coupled to the storage medium 120, the transceiver 130, the depth camera 140, the fork 150, the wheels 160 and the speed sensor 170, and may access and execute the modules and applications stored in the storage medium 120.

The storage medium 120 is, for example, any type of fixed or removable random access memory (RAM), read-only memory (ROM), flash memory, hard disk drive (HDD), solid state drive (SSD), a similar component, or a combination of the above, and is used to store modules or applications executable by the processor 110. In this embodiment, the storage medium 120 may store a plurality of modules including a machine learning model 121, whose functions are described below.

The transceiver 130 transmits and receives signals wirelessly or over a wire. The transceiver 130 may also perform operations such as low-noise amplification, impedance matching, frequency mixing, up or down frequency conversion, filtering, amplification and the like.

The depth camera 140 is used to capture a depth map of an object. The depth camera 140 may include a camera with an RGBD lens, a camera with an infrared lens, a camera with multiple lenses or the like; the invention is not limited thereto.

The fork 150 is used to fork the pallet. When the AGV 100 is to move a pallet, the processor 110 may control the fork 150 to be inserted into the bay of the pallet. The processor 110 may control the fork 150 to move vertically so as to move the pallet toward or away from the ground.

The speed sensor 170 may include an accelerometer, a gyroscope or the like. The speed sensor 170 senses the linear velocity and the angular velocity of the AGV 100 and reports them to the processor 110.

The wheels 160 may include a driving wheel 161 and driven wheels 162. FIG. 2 is a schematic diagram of the wheel configuration of the AGV according to an embodiment of the invention. In this embodiment, the ground on which the AGV 100 travels is assumed to be the XZ plane of a Cartesian coordinate system, with the Y axis perpendicular to the ground. The AGV 100 may have a single driving wheel 161 at the front and two driven wheels 162 at the rear. The axle 61 of the driving wheel 161 and the axle 62 of the driven wheels 162 are separated by a distance D. The processor 110 may be coupled to the driving wheel 161 and control its rotational speed and yaw angle, thereby controlling the moving speed and moving direction of the AGV 100. The forklift of this embodiment has three wheels, one of which is the driving wheel and directly controls the direction of travel, so its dynamics are the same as those of a tricycle. A four-wheeled forklift would have two driving wheels, and its dynamics would be the same as those of an ordinary car.

FIG. 3 is a flowchart of a method for moving a pallet according to an embodiment of the invention, wherein the method may be implemented by the AGV 100 shown in FIG. 1. In step S301, the processor 110 may receive a command through the transceiver 130, wherein the command instructs the AGV 100 to move to a designated location. For example, a user who wants to transport a pallet located at the designated location may operate a terminal device (for example, a computer or a smartphone) to send a command corresponding to the designated location to the AGV 100.

In step S302, the processor 110 may control the driving wheel 161 according to the command to move the AGV 100 to the designated location. The processor 110 may control the movement of the AGV 100 according to the method disclosed in Taiwan Patent Publication No. I671610. FIG. 4 is a schematic diagram of the radius of rotation R and the yaw angle δ of the AGV 100 according to an embodiment of the invention. Assuming that the coordinates of the AGV 100 are (x_J, z_J) and its yaw angle is δ_J, the processor 110 may control the rotational speed and yaw angle of the driving wheel 161 according to equations (1), (2), (3), (4) and (5), so as to move the AGV 100 to the coordinates (x_K, z_K) with the yaw angle δ_K.

Here R is the radius of rotation of the AGV 100, ω is the angular velocity of the AGV 100, v is the linear velocity of the AGV 100, D is the distance between the axle 61 of the driving wheel 161 and the axle 62 of the driven wheels 162, and Δt is the time the AGV 100 takes to move from the coordinates (x_J, z_J) to the coordinates (x_K, z_K). The processor 110 may adjust the linear velocity v by controlling the rotational speed of the driving wheel 161, and may adjust the angular velocity ω or the radius of rotation R by controlling the rotational speed and the yaw angle of the driving wheel 161.
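As a rough illustration of how a pose update could be computed from v, ω, D and Δt, the sketch below uses a textbook kinematic model. It is an assumption for illustration only and is not necessarily the same as equations (1) to (5); the function names, the heading convention (measured from the Z axis) and the bicycle-style steering relation tan(steer) = D / R are all hypothetical.

```python
import math


def turning_radius(v: float, omega: float) -> float:
    """Radius of the circular path for linear velocity v and angular velocity omega."""
    if omega == 0.0:
        return math.inf  # no rotation: the path is a straight line
    return v / omega


def steering_for_radius(wheelbase: float, radius: float) -> float:
    """Steering angle in radians for a non-zero turning radius.

    Assumes the bicycle-style relation tan(steer) = D / R, where the
    wheelbase D is the distance between the two axles.
    """
    return math.atan(wheelbase / radius)


def advance_pose(x: float, z: float, heading: float,
                 v: float, omega: float, dt: float) -> tuple:
    """One Euler integration step on the XZ ground plane.

    heading is measured from the Z axis (an assumed convention).
    """
    x_next = x + v * dt * math.sin(heading)
    z_next = z + v * dt * math.cos(heading)
    heading_next = heading + omega * dt
    return x_next, z_next, heading_next
```

Calling `advance_pose` repeatedly with a small `dt` traces an approximately circular arc of radius `turning_radius(v, omega)`.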

Returning to FIG. 3, after the AGV 100 has moved to the designated location, in step S303 the processor 110 may capture a depth map through the depth camera 140 and obtain, from the depth map, a plurality of images respectively corresponding to a plurality of depths, wherein the depth map may include the pallet. FIG. 5 is a schematic diagram of the AGV 100 and a pallet 200 according to an embodiment of the invention. The pallet 200 has a length L, a width W and a height H. The pallet 200 may include a bay 210 for receiving the fork 150. The depth camera 140 may capture a depth map that includes the pallet 200. The processor 110 may determine the plurality of depths according to a preset spacing m. Adjacent depths among the plurality of depths may be separated from each other by the preset spacing m.

For example, assume that the optical axis 141 of the depth camera 140 is parallel to the Z axis. According to the preset spacing m, the processor 110 may sample from the depth map an image corresponding to depth Z=0, an image corresponding to depth Z=m, an image corresponding to depth Z=2m, an image corresponding to depth Z=3m and so on, each corresponding to a different depth. To avoid the case in which the pallet 200 appears in none of the sampled images, the processor 110 may determine the preset spacing m according to the size of the pallet 200. In an embodiment, the processor 110 may determine the preset spacing m according to equations (6) and (7), where L is the length of the pallet 200, d is the offset distance of the pallet 200, α is a coefficient (for example, α=3), and θ is the angle between the optical axis 141 of the depth camera 140 and the central axis 211 of the bay 210, wherein the central axis 211 of the bay 210 may be parallel to the short side of the pallet 200 (that is, the side of length W).

d = L · sin θ ... (6)

m = α · d ... (7)
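Equations (6) and (7) can be evaluated directly. The sketch below is a minimal illustration; the function names and the `count` parameter are hypothetical, and θ is assumed to be given in radians.

```python
import math


def preset_spacing(pallet_length: float, theta_rad: float, alpha: float = 3.0) -> float:
    """Preset spacing m between adjacent sampled depths."""
    d = pallet_length * math.sin(theta_rad)  # equation (6): offset distance
    return alpha * d                         # equation (7)


def sample_depths(m: float, count: int) -> list:
    """Depths Z = 0, m, 2m, ... at which images are sampled (count is illustrative)."""
    return [k * m for k in range(count)]
```

For a pallet of length 1.2 with θ = 30° and α = 3, `preset_spacing` gives approximately 1.8 in the same unit as the length. Note that θ = 0 yields m = 0, so a caller would need to handle that case separately.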

Every pixel in an image that the processor 110 extracts from the depth map corresponds to the same depth. FIG. 6 is a schematic diagram of determining the depth of an object according to an embodiment of the invention. Take the pixel P1 corresponding to a vertex of the pallet 200 as an example, and assume that the optical axis 141 of the depth camera 140 is parallel to the Z axis. If the processor 110 photographs the vertex of the pallet 200 through the depth camera 140, the depth map generated by the depth camera 140 may include the abscissa value (that is, the X coordinate value) of the pixel P1 corresponding to the vertex, the ordinate value (that is, the Y coordinate value) of the pixel P1, and the distance d3 between the pixel P1 and the depth camera 140, as shown in FIG. 6. Based on equations (8) and (9), the processor 110 may determine from the abscissa value of the pixel P1, the ordinate value of the pixel P1 and the distance d3 that the pixel P1 corresponds to the depth d1. Therefore, if the processor 110 wants to extract an image corresponding to the depth d1, the processor 110 may extract the pixel P1 from the depth map to generate that image. An image extracted from the depth map by the processor 110 may be parallel to the XY plane.
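The layering step can be sketched as follows. This is only an illustration under stated assumptions: the depth map is represented as a list of rows holding a per-pixel depth (or `None` where no depth is available), and a pixel is assigned to a layer when its depth lies within a hypothetical `tolerance` of the layer's depth. Neither the data layout nor the tolerance comes from the description above.

```python
def slice_depth_map(depth_map, depths, tolerance):
    """Return one binary image per target depth.

    depth_map: list of rows; each entry is a depth value or None.
    depths:    target depths, for example [0, m, 2*m, 3*m].
    tolerance: assumed half-thickness of each layer.
    """
    layers = []
    for target in depths:
        layer = [
            [1 if (z is not None and abs(z - target) <= tolerance) else 0
             for z in row]
            for row in depth_map
        ]
        layers.append(layer)
    return layers
```

Each returned layer keeps only the pixels near one depth, so every pixel in a layer corresponds to (approximately) the same depth and the layer can be treated as an image parallel to the XY plane.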

Here d2 is the distance between the pixel P1 and the coordinate origin P0, and d3 is the distance between the pixel P1 and the depth camera 140. A typical depth camera sets the abscissa value and the ordinate value of the optical axis to zero. That is, the coordinate origin P0 may be located on the optical axis 141 of the depth camera 140.
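One geometric reading of these definitions is sketched below. It assumes that P0 is the point on the optical axis lying in the same plane as P1 perpendicular to that axis, and that the X and Y coordinate values share units with d3, so that d1, d2 and d3 form a right triangle. Under that assumption d1 = d3 · sin(arccos(d2 / d3)), which equals √(d3² − d2²). This is an illustrative interpretation and may differ from equations (8) and (9); the function name is hypothetical.

```python
import math


def pixel_depth(x: float, y: float, d3: float) -> float:
    """Depth d1 along the optical axis for a pixel at (x, y) with camera distance d3."""
    if d3 <= 0.0:
        raise ValueError("d3 must be positive")
    d2 = math.hypot(x, y)  # distance from P1 to the origin P0 on the optical axis
    if d2 > d3:
        raise ValueError("d2 cannot exceed d3 in a right triangle")
    angle = math.acos(d2 / d3)  # angle between the line of sight and the XY plane
    return d3 * math.sin(angle)
```

For instance, a pixel at (3, 0) with d3 = 5 gives a depth of about 4, the familiar 3-4-5 triangle.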

Returning to FIG. 3, after the plurality of images respectively corresponding to the plurality of depths have been obtained, in step S304 the processor 110 may input the plurality of images into the machine learning model 121 to generate at least one bounding box corresponding to the pallet 200, wherein the at least one bounding box is used to generate the three-dimensional positioning information of the pallet 200. The machine learning model 121 is, for example, an object detection model including a YOLO model, but the invention is not limited thereto. FIGS. 7 and 8 are schematic diagrams of images corresponding to different depths according to an embodiment of the invention. Assume that the images extracted from the depth map by the processor 110 include an image 71 and an image 72 corresponding to different depths, wherein the image 71 corresponds to depth Z=2m and the image 72 corresponds to depth Z=3m. The processor 110 may input the image 71 into the machine learning model 121 to generate a bounding box 300 and a bounding box 400. In addition, the machine learning model 121 may generate a confidence score corresponding to the bounding box 300 and a confidence score corresponding to the bounding box 400. Likewise, the processor 110 may input the image 72 into the machine learning model 121 to generate a bounding box 500 and a confidence score corresponding to the bounding box 500.

Returning to FIG. 3, in step S305 the processor 110 may select one image from the plurality of images. The selected image may include the pallet 200 and may be used to determine the three-dimensional positioning information of the pallet 200. In an embodiment, the processor 110 may select the image according to the confidence of the bounding boxes corresponding to each image. Specifically, the processor 110 may select, from the plurality of images, the image that contains the bounding box with the highest confidence as the selected image. Taking FIGS. 7 and 8 as an example, assume that the confidence of the bounding box 300 in the image 71 is greater than the confidence of the bounding box 400, and is also greater than the confidence of the bounding box 500 in the image 72. Accordingly, the processor 110 may select the image 71 from the image 71 and the image 72 as the selected image.
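The selection in step S305 can be sketched as a maximum over per-image confidence scores. The dictionary layout and the names below are hypothetical; images for which the model produced no bounding box are simply skipped.

```python
from typing import Dict, List, Optional


def select_image(confidences_by_image: Dict[str, List[float]]) -> Optional[str]:
    """Return the name of the image holding the highest-confidence bounding box."""
    best_name = None
    best_conf = float("-inf")
    for name, confidences in confidences_by_image.items():
        if confidences and max(confidences) > best_conf:
            best_name = name
            best_conf = max(confidences)
    return best_name  # None when no image has any bounding box
```

With illustrative scores of 0.9 and 0.6 for the two boxes in image 71 and 0.8 for the box in image 72, image 71 is selected.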

In step S306, the processor 110 may determine whether the selected image contains multiple bounding boxes. If the selected image contains multiple bounding boxes, the method proceeds to step S308. If the selected image contains only a single bounding box, the method proceeds to step S307. Take the image 71 as an example. Assuming that the image 71 is the selected image, since the image 71 contains two bounding boxes, namely the bounding box 300 and the bounding box 400, the processor 110 may determine that the selected image contains multiple bounding boxes and accordingly execute step S308. Now take the image 72 as an example. Assuming that the image 72 is the selected image, since the image 72 contains one bounding box 500, the processor 110 may determine that the selected image does not contain multiple bounding boxes and accordingly execute step S307.

In step S307, the processor 110 may generate the three-dimensional positioning information of the pallet 200 according to the bounding box in the selected image, wherein the three-dimensional positioning information may include the abscissa value (that is, the X coordinate value), the ordinate value (that is, the Y coordinate value) and the depth (that is, the Z coordinate value) of the pallet 200. Taking the image 72 of FIG. 8 as an example, assume that the image 72 is the selected image. Since the processor 110 knows that the image 72 corresponds to depth Z=3m, the processor 110 may determine that the depth of the pallet 200 is Z=3m. The processor 110 may also determine the abscissa value and the ordinate value of the pallet 200 according to the bounding box 500. After obtaining the abscissa value, the ordinate value and the depth of the pallet 200, the processor 110 may generate three-dimensional positioning information containing those three values.
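A minimal sketch of step S307 follows. How the X and Y values are derived from the bounding box is not spelled out above, so taking the center of the box is an assumption made here for illustration; the `Box` type and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in image coordinates (hypothetical layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def locate_pallet(box: Box, layer_depth: float) -> tuple:
    """Combine the box center (assumed X, Y) with the depth of the selected image."""
    x = (box.x_min + box.x_max) / 2.0
    y = (box.y_min + box.y_max) / 2.0
    return (x, y, layer_depth)
```

The depth component comes from the selected image itself, since every pixel in that image corresponds to the same depth.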

In step S308, the processor 110 may calculate the distance between each bounding box (or the center point of each bounding box) and the center point of the image to generate a plurality of distances, and determine whether the plurality of distances are the same. If the distances are all the same, the flow proceeds to step S310. If the distances are not the same, the flow proceeds to step S309. Taking image 71 as an example, the processor 110 may calculate the distance A1 between the center point 310 of bounding box 300 and the center point O of image 71, and the distance A2 between the center point 410 of bounding box 400 and the center point O of image 71. If the distance A1 is the same as the distance A2, the flow proceeds to step S310. If the distance A1 differs from the distance A2, the flow proceeds to step S309.
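The distance comparison of step S308 can be sketched as follows. The Euclidean distance, the tuple layout of a box, and the tolerance `tol` used to decide whether two floating-point distances are "the same" are assumptions; the text only says that the distances are compared.

```python
import math


def center_distance(box, image_size):
    """Distance from the box center to the image center.

    box = (x_min, y_min, x_max, y_max); image_size = (width, height).
    """
    cx = (box[0] + box[2]) / 2.0
    cy = (box[1] + box[3]) / 2.0
    ox, oy = image_size[0] / 2.0, image_size[1] / 2.0
    return math.hypot(cx - ox, cy - oy)


def all_same(values, tol=1e-6):
    """True if every value equals the first one within the tolerance."""
    return all(math.isclose(v, values[0], abs_tol=tol) for v in values)


def step_after_s308(boxes, image_size):
    """Return the label of the next step: S310 if all distances match, else S309."""
    distances = [center_distance(b, image_size) for b in boxes]
    return "S310" if all_same(distances) else "S309"


# Illustrative boxes standing in for bounding boxes 300 and 400 in a 640x480 image.
box_300 = (250, 200, 330, 260)
box_400 = (400, 200, 480, 280)
print(step_after_s308([box_300, box_400], (640, 480)))
```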

In step S309, the processor 110 may select, from the plurality of bounding boxes, the bounding box closest to the center point of the image as the selected bounding box, and may generate the three-dimensional positioning information of the pallet 200 according to the selected bounding box. Taking image 71 as an example, the processor 110 may select bounding box 300 from bounding box 300 and bounding box 400 as the selected bounding box in response to the distance A1 corresponding to bounding box 300 being smaller than the distance A2 corresponding to bounding box 400. Accordingly, the processor 110 may generate the three-dimensional positioning information of the pallet 200 according to bounding box 300.

In step S310, the processor 110 may determine whether the confidence levels respectively corresponding to the plurality of bounding boxes are the same. If the confidence levels are the same, the flow proceeds to step S312. If the confidence levels are not the same, the flow proceeds to step S311. Taking image 71 as an example, if the confidence level of bounding box 300 is the same as that of bounding box 400, the flow proceeds to step S312. If the confidence level of bounding box 300 differs from that of bounding box 400, the flow proceeds to step S311.

In step S311, the processor 110 may select, from the plurality of bounding boxes, the bounding box with the greatest confidence level as the selected bounding box, and may generate the three-dimensional positioning information of the pallet 200 according to the selected bounding box. If every bounding box in the image is at the same distance from the center point of the image, the processor 110 cannot choose a better bounding box on the basis of distance. The processor 110 therefore selects the bounding box with the greatest confidence level as the selected bounding box. Taking image 71 as an example, the processor 110 may select bounding box 300 from bounding box 300 and bounding box 400 as the selected bounding box in response to the confidence level of bounding box 300 being greater than that of bounding box 400. Accordingly, the processor 110 may generate the three-dimensional positioning information of the pallet 200 according to bounding box 300.

In step S312, the processor 110 may select the selected bounding box from the plurality of bounding boxes according to a preset rule (or randomly), and generate the three-dimensional positioning information of the pallet 200 according to the selected bounding box. The preset rule is, for example, "select the bounding box with the smallest abscissa value as the selected bounding box". Taking image 71 as an example, the processor 110 may select bounding box 300 from bounding box 300 and bounding box 400 as the selected bounding box in response to the abscissa value of bounding box 300 being smaller than that of bounding box 400. Accordingly, the processor 110 may generate the three-dimensional positioning information of the pallet 200 according to bounding box 300.
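Steps S306 through S312 form a tie-breaking cascade: distance first, then confidence level, then a preset rule. The sketch below illustrates that cascade under stated assumptions: the `Candidate` record, the tolerance `tol`, and the precomputed `distance` field are hypothetical, and the preset rule is the "smallest abscissa value" example from the text. When more than two boxes are present and several share the minimum distance, the text does not say how to break the tie; this sketch simply returns the first such box.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    distance: float    # distance from the box to the image center (step S308)
    confidence: float  # confidence level reported by the detector
    x: float           # abscissa value of the box, used by the preset rule


def _all_same(values: List[float], tol: float) -> bool:
    return all(math.isclose(v, values[0], abs_tol=tol) for v in values)


def select_box(cands: List[Candidate], tol: float = 1e-6) -> Candidate:
    """Pick the selected bounding box following steps S306 to S312."""
    if not cands:
        raise ValueError("at least one bounding box is required")
    if len(cands) == 1:
        # S306 -> S307: a single bounding box needs no tie-breaking.
        return cands[0]
    if not _all_same([c.distance for c in cands], tol):
        # S308 -> S309: choose the box closest to the image center.
        return min(cands, key=lambda c: c.distance)
    if not _all_same([c.confidence for c in cands], tol):
        # S310 -> S311: distances tie, so choose the greatest confidence level.
        return max(cands, key=lambda c: c.confidence)
    # S312: distances and confidence levels tie, so apply the preset rule
    # (here: smallest abscissa value).
    return min(cands, key=lambda c: c.x)


boxes = [
    Candidate("300", distance=50.0, confidence=0.9, x=100.0),
    Candidate("400", distance=50.0, confidence=0.7, x=400.0),
]
print(select_box(boxes).name)
```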

To make it easier to identify the pallet 200 or to insert the fork 150 into the fork opening 210, the processor 110 may determine whether to move the AGV 100 according to the angle θ between the central axis 211 of the fork opening 210 and the optical axis 141 of the depth camera 140. Specifically, in step S313, the processor 110 may determine whether the angle θ between the central axis 211 of the fork opening 210 and the optical axis 141 of the depth camera 140 is greater than an angle threshold. If the angle θ is greater than the angle threshold, the flow proceeds to step S314. If the angle θ is less than or equal to the angle threshold, the flow proceeds to step S315.

In step S314, the processor 110 may control the driving wheel 161 to move the AGV 100 so as to reduce the angle θ until the angle θ is less than or equal to the angle threshold. The processor 110 may then re-execute the flow of FIG. 3 to attempt to fork the pallet 200.
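The branch in step S313 and the repositioning loop in step S314 can be sketched as below. This is a simulation only: the text does not say how much each movement reduces θ, so `correction_per_move_deg` is a hypothetical parameter, and the angle is treated as a non-negative magnitude in degrees.

```python
def next_step(angle_deg: float, threshold_deg: float) -> str:
    """Step S313: go to S314 if the angle exceeds the threshold, else S315."""
    return "S314" if angle_deg > threshold_deg else "S315"


def realign(angle_deg: float, threshold_deg: float, correction_per_move_deg: float):
    """Simulate step S314: keep moving until the angle is within the threshold.

    Returns (final_angle_deg, number_of_moves). The fixed correction per move
    is an assumption made for illustration.
    """
    if threshold_deg < 0 or correction_per_move_deg <= 0:
        raise ValueError("threshold must be >= 0 and correction must be > 0")
    moves = 0
    while next_step(angle_deg, threshold_deg) == "S314":
        angle_deg = max(0.0, angle_deg - correction_per_move_deg)
        moves += 1
    return angle_deg, moves


print(next_step(25.0, 10.0))
print(realign(25.0, 10.0, 5.0))
```

After the loop ends, the flow described above restarts from the beginning of FIG. 3 rather than jumping straight to step S315, because the pallet must be detected again from the new position.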

In step S315, the processor 110 may control the driving wheel 161 and the fork 150 according to the three-dimensional positioning information of the pallet 200 to move the pallet 200. Specifically, the processor 110 may control the driving wheel 161 to move the AGV 100 so that the fork 150 is inserted into the fork opening 210. The processor 110 may then raise the fork 150 to lift the pallet 200.

FIG. 9 is a flowchart of a method for forking a pallet according to an embodiment of the present invention, wherein the method may be implemented by the AGV 100 shown in FIG. 1. In step S901, a depth map is captured by a depth camera, wherein the depth map includes the pallet. In step S902, a plurality of images respectively corresponding to a plurality of depths are obtained from the depth map. In step S903, the plurality of images are input to a machine learning model to generate three-dimensional positioning information of the pallet. In step S904, the driving wheel and the fork of the AGV are controlled according to the three-dimensional positioning information to move the pallet.
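Step S902 can be illustrated with a simple slicing sketch. This passage does not define how a pixel is assigned to an image, so the rule used here (a binary image per target depth, marking pixels whose depth lies within `half_width` of that depth) is an assumption, as are the function name `slice_depth_map` and the sample values. The sketch also assumes the depth map already stores a Z value per pixel.

```python
from typing import Dict, List


def slice_depth_map(
    depth_map: List[List[float]],
    depths: List[float],
    half_width: float,
) -> Dict[float, List[List[int]]]:
    """Build one binary image per target depth.

    A pixel is 1 in the image for depth d if its value lies within
    half_width of d, and 0 otherwise (assumed layering rule).
    """
    layers = {}
    for d in depths:
        layers[d] = [
            [1 if abs(z - d) <= half_width else 0 for z in row]
            for row in depth_map
        ]
    return layers


# Illustrative 2x2 depth map in meters, sliced at 1 m and 3 m.
depth_map = [[1.0, 2.9], [3.1, 5.0]]
layers = slice_depth_map(depth_map, depths=[1.0, 3.0], half_width=0.25)
print(layers[3.0])
```

Each resulting image would then be passed to the machine learning model in step S903, and the depth key of the image in which the pallet is detected supplies the Z value of the three-dimensional positioning information.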

To sum up, the AGV of the present invention can automatically move to a designated location according to a command and detect whether a pallet is present at that location. If the angle between the fork opening of the pallet and the fork of the AGV is too large, the AGV can move to reduce the angle so that the fork can pick up the pallet. Using an image layering algorithm, the AGV can extract a plurality of images from the depth map generated by the depth camera and determine the position of the fork opening of the pallet from these images, thereby improving its recognition accuracy for pallets in different environments and for pallets made of different materials. The AGV can then control its wheels and fork according to the position of the fork opening to move the pallet automatically.

S901, S902, S903, S904: steps

Claims

An automatic guided vehicle for forking a pallet, comprising: a fork; a driving wheel; a depth camera for capturing a depth map, wherein the depth map includes the pallet; a storage medium storing a machine learning model; and a processor coupled to the fork, the driving wheel, the depth camera, and the storage medium, and configured to: obtain a plurality of images respectively corresponding to a plurality of depths from the depth map; input the plurality of images into the machine learning model to generate three-dimensional positioning information of the pallet; and control the driving wheel and the fork according to the three-dimensional positioning information to move the pallet.

The automatic guided vehicle as claimed in claim 1, wherein the plurality of images comprises a first image, wherein the machine learning model detects the first image to generate a first bounding box and a second bounding box corresponding to the pallet, wherein the processor calculates a first distance between a center point of the first image and the first bounding box and calculates a second distance between the center point and the second bounding box, and wherein the processor selects the first bounding box from the first bounding box and the second bounding box in response to the first distance being less than the second distance, and generates the three-dimensional positioning information according to the selected first bounding box.

The automatic guided vehicle as claimed in claim 2, wherein the machine learning model detects the first image to generate a first confidence level corresponding to the first bounding box and a second confidence level corresponding to the second bounding box, wherein the processor compares the first confidence level with the second confidence level in response to the first distance being the same as the second distance, and wherein the processor selects the first bounding box from the first bounding box and the second bounding box in response to the first confidence level being greater than the second confidence level, and generates the three-dimensional positioning information according to the selected first bounding box.

The automatic guided vehicle as claimed in claim 1, wherein the plurality of images comprises a first image corresponding to a first depth, wherein the first image comprises a pixel, wherein the processor obtains, from the depth map, an abscissa value of the pixel, an ordinate value of the pixel, and a distance between the pixel and the depth camera, and wherein the processor determines that the pixel corresponds to the first depth according to the abscissa value, the ordinate value, and the distance.

The automatic guided vehicle as claimed in claim 4, wherein the processor calculates a second distance between a coordinate origin and the pixel according to the abscissa value and the ordinate value, calculates an arccosine of a ratio of the second distance and the distance, and multiplies the distance by the arccosine to calculate the first depth.

The automatic guided vehicle as claimed in claim 1, wherein the processor controls the driving wheel to reduce an angle between a fork opening of the pallet and an optical axis of the depth camera in response to the angle being greater than an angle threshold.

The automatic guided vehicle as claimed in claim 1, further comprising: a speed sensor coupled to the processor, wherein the speed sensor obtains a linear velocity and an angular velocity of the automatic guided vehicle, and wherein the processor controls the driving wheel according to the linear velocity, the angular velocity, and the three-dimensional positioning information.

The automatic guided vehicle as claimed in claim 7, further comprising: a driven wheel, wherein a first axle of the driving wheel is spaced apart from a second axle of the driven wheel by a distance, and wherein the processor controls the driving wheel according to the distance.

The automatic guided vehicle as claimed in claim 1, wherein adjacent ones of the plurality of depths are spaced apart from each other by a preset distance.

The automatic guided vehicle as claimed in claim 9, wherein the processor determines the preset distance according to a length of the pallet.

The automatic guided vehicle as claimed in claim 1, wherein the depth camera comprises at least one of the following: a camera with an RGBD lens, a camera with an infrared lens, or a camera with multiple lenses.

The automatic guided vehicle as claimed in claim 1, further comprising: a transceiver coupled to the processor, wherein the processor, in response to receiving a command through the transceiver, controls the driving wheel to move to a designated location and captures the depth map at the designated location.

The automatic guided vehicle as claimed in claim 1, wherein the machine learning model is a YOLO model.

A method for forking a pallet, suitable for an automatic guided vehicle, wherein the method comprises: capturing a depth map using a depth camera, wherein the depth map includes the pallet; obtaining a plurality of images respectively corresponding to a plurality of depths from the depth map; inputting the plurality of images into a machine learning model to generate three-dimensional positioning information of the pallet; and controlling a driving wheel and a fork of the automatic guided vehicle according to the three-dimensional positioning information to move the pallet.