TWI832270B - Method for detecting road condition, electronic device and storage medium - Google Patents


Info

Publication number: TWI832270B
Application number: TW111120339A
Authority: TW (Taiwan)
Prior art keywords: network, training, feature maps, image, recognition result
Other languages: Chinese (zh)
Other versions: TW202349279A (en)
Inventors: 簡士超, 郭錦斌
Original assignee: 鴻海精密工業股份有限公司
Application filed by 鴻海精密工業股份有限公司
Priority to TW111120339A
Publication of TW202349279A
Application granted
Publication of TWI832270B


Abstract

The present application provides a method for detecting road conditions, an electronic device and a storage medium. The method includes: acquiring an image in front of a vehicle as a detection image; inputting the detection image into a trained semantic segmentation model, which includes a backbone network and a head network; and extracting features from the detection image using the backbone network to obtain a plurality of feature maps. The plurality of feature maps are input into the head network; a first segmentation network of the head network processes the plurality of feature maps to obtain a first recognition result, and a second segmentation network of the head network processes the plurality of feature maps to obtain a second recognition result. Whether the vehicle can continue driving is determined based on the first recognition result and the second recognition result. The present application can improve the safety of autonomous driving.

Description

Road condition detection method, electronic device and computer-readable storage medium

The present application relates to the field of computer vision, and in particular to a road condition detection method, an electronic device and a computer-readable storage medium.

In autonomous driving, environmental perception is an extremely important technology. Currently, most environmental perception functions are achieved with semantic segmentation methods based on deep learning, in which a deep-learning segmentation model identifies the objects in an image. However, such a method can only recognize predefined object categories, such as roads, pedestrians and cars. Real road scenes are extremely complex: if an object of an unknown category appears in the scene, the trained model often misclassifies it or fails to recognize it at all, so the vehicle may drive straight into the object and cause a traffic accident.

In view of the above, it is necessary to provide a road condition detection method, an electronic device and a computer-readable storage medium that address the problem of a deployed model being unable to identify objects of unknown categories, which leaves the vehicle unable to make an appropriate decision, thereby avoiding traffic accidents.

An embodiment of the present application provides a road condition detection method. The method includes: acquiring an image in front of a vehicle as a detection image; inputting the detection image into a trained semantic segmentation model, where the semantic segmentation model includes a backbone network and a head network; performing feature extraction on the detection image using the backbone network to obtain a plurality of feature maps; inputting the plurality of feature maps into the head network, where a first segmentation network of the head network processes the plurality of feature maps to obtain a first recognition result, and a second segmentation network of the head network processes the plurality of feature maps to obtain a second recognition result; and determining, based on the first recognition result and the second recognition result, whether the vehicle can continue driving.
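The claimed flow, one backbone feeding two segmentation heads, can be sketched schematically. Every name and toy return value below is an illustrative assumption, not the patent's actual implementation.

```python
# Schematic sketch of the claimed detection flow: one backbone feeding two
# segmentation heads. All names and the toy return values are illustrative
# assumptions, not the patent's actual code.

def backbone(detection_image):
    # Stand-in for feature extraction: returns "feature maps".
    return ["feature_map_1", "feature_map_2"]

def first_segmentation(feature_maps):
    # Multi-class head: the set of recognized object categories.
    return {"vehicle", "tree"}

def second_segmentation(feature_maps):
    # Binary road head: "lane" or "non-lane".
    return "lane"

def detect_road_condition(detection_image):
    feature_maps = backbone(detection_image)
    first_result = first_segmentation(feature_maps)
    second_result = second_segmentation(feature_maps)
    return first_result, second_result
```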

In an optional implementation, the method further includes constructing a semantic segmentation model and completing its training, which includes: obtaining training images; inputting the training images into the backbone network for feature extraction to obtain a plurality of training feature maps; inputting the plurality of training feature maps into the head network, where the first segmentation network processes the plurality of training feature maps to obtain a first training result for the training images; calculating a first loss value of the first segmentation network from the first training result and a preset first expected result using a preset loss function; processing the plurality of training feature maps with the second segmentation network to obtain a second training result for the training images; calculating a second loss value of the second segmentation network from the second training result and a preset second expected result using the preset loss function; and adjusting the parameters of the semantic segmentation model according to the first loss value and the second loss value to obtain the trained semantic segmentation model.

In an optional implementation, obtaining the first training result and the second training result includes: performing upsampling and deconvolution on the plurality of training feature maps with the first segmentation network to obtain a plurality of first training feature maps of the same size as the training image; classifying each first training feature map pixel by pixel into first preset pixel categories with a first softmax layer to obtain, for each pixel, the probability of each category; selecting the category with the maximum probability as the category of that pixel; and outputting the first training result, where the first preset pixel categories include a plurality of predefined object categories. It further includes: performing upsampling and deconvolution on the plurality of training feature maps with the second segmentation network to obtain a plurality of second training feature maps of the same size as the detection image; classifying each training feature map pixel by pixel into second preset pixel categories with a second softmax layer to obtain, for each pixel, the probability of each category; selecting the category with the maximum probability as the category of that pixel; and outputting the second training result, where the second preset pixel categories include two predefined road categories: lane and non-lane.

In an optional implementation, adjusting the parameters of the semantic segmentation model according to the first loss value and the second loss value to obtain the trained semantic segmentation model includes: adding the first loss value and the second loss value to obtain the loss value of the semantic segmentation model; and adjusting the parameters of the semantic segmentation model by gradient descent to minimize the loss value of the semantic segmentation model, thereby obtaining the trained semantic segmentation model.

In an optional implementation, the method further includes: using the encoding network of a segnet network as the backbone network; using the decoding network of the segnet network as the first segmentation network in the head network; and adding the decoding network of another segnet network as the second segmentation network in the head network.

In an optional implementation, the first segmentation network of the head network processing the plurality of feature maps to obtain the first recognition result includes: inputting the detection image into the backbone network for convolution and max-pooling operations to obtain a plurality of feature maps of the detection image; performing upsampling and deconvolution on the plurality of feature maps with the first segmentation network to obtain a plurality of first feature maps of the same size as the detection image; classifying each first feature map into the first preset pixel categories with the first softmax layer and outputting the category information of each pixel in the detection image; and determining the categories of all objects in the detection image from the category information of each pixel, with the categories of all objects in the detection image serving as the first recognition result.

In an optional implementation, the second segmentation network of the head network processing the plurality of feature maps to obtain the second recognition result includes: performing upsampling and deconvolution on the plurality of feature maps with the second segmentation network to obtain a plurality of second feature maps of the same size as the detection image; and classifying each second feature map into the second preset pixel categories with the second softmax layer to determine the road category corresponding to the detection image as the second recognition result.

In an optional implementation, determining whether the vehicle can continue driving based on the first recognition result and the second recognition result includes: if the first recognition result indicates that all object categories in the detection image have been recognized, determining whether the vehicle can continue driving from the categories of all objects in the first recognition result; or, if the first recognition result indicates that the detection image contains an unrecognizable object and the second recognition result indicates that the road category is lane, determining that the vehicle can continue driving; or, if the first recognition result indicates that the detection image contains an unrecognizable object and the second recognition result indicates that the road category is non-lane, determining that the vehicle cannot continue driving.
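The three-branch rule above can be written as a small decision function. The "unknown" marker and the pedestrian check standing in for "decide from the recognized categories" are assumptions made purely for illustration.

```python
# Hedged sketch of the drive/no-drive rule described above; the "unknown"
# marker and the pedestrian check are illustrative assumptions.
UNKNOWN = "unknown"

def can_continue_driving(first_result, second_result):
    # first_result: set of object categories from the first head;
    # second_result: "lane" or "non-lane" from the second head.
    if UNKNOWN not in first_result:
        # Branch 1: everything recognized - decide from the categories
        # themselves (simplified here to: stop only for a pedestrian).
        return "pedestrian" not in first_result
    # Branches 2 and 3: an unrecognizable object is present, so fall
    # back on the road-class head.
    return second_result == "lane"
```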

An embodiment of the present application also provides an electronic device. The electronic device includes a processor and a memory, and the processor is configured to execute a computer program stored in the memory to implement the road condition detection method.

An embodiment of the present application also provides a computer-readable storage medium that stores at least one instruction; when executed by a processor, the at least one instruction implements the road condition detection method.

With the technical solution provided by the embodiments of the present application, when the preset model cannot recognize the category of an object in front of the vehicle, that category can be redefined and retrained, ensuring that the self-driving vehicle can recognize it again. Combining the two complementary recognition passes makes the recognition result more accurate and thus improves the safety of autonomous driving.

5: Electronic device

501: Memory

502: Processor

503: Computer program

504: Communication bus

101-106: Steps

Figure 1 is a flow chart of a road condition detection method provided by an embodiment of the present application.

Figure 2 is a structural diagram of the semantic segmentation model provided by an embodiment of the present application.

Figure 3 is a schematic diagram of the first recognition result provided by an embodiment of the present application.

Figure 4 is a schematic diagram of the second recognition result provided by an embodiment of the present application.

Figure 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.

In order to understand the above objects, features and advantages of the present application more clearly, the present application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the specific embodiments described here are only used to explain the present application and are not intended to limit it.


Many specific details are set forth in the following description to facilitate a full understanding of the present application; the described embodiments are only some, rather than all, of the embodiments of the present application. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the scope of protection of this application.

Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features concerned. Features qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of some embodiments of the present application, words such as "exemplary" or "for example" are used to present an example, illustration or explanation; any embodiment or design so described is not to be construed as preferred or more advantageous than other embodiments or designs. Rather, the use of such words is intended to present the relevant concepts in a concrete manner.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of the present application. The terminology used in this specification is for the purpose of describing specific embodiments only and is not intended to limit the application.

Referring to Figure 1, Figure 1 is a flow chart of a road condition detection method provided by an embodiment of the present application. The method is applied to an electronic device (for example, the electronic device 5 shown in Figure 5), which can be any electronic product capable of human-computer interaction, such as a personal computer, tablet computer, smartphone, personal digital assistant (PDA), game console, Internet Protocol television (IPTV) or smart wearable device.

The electronic device is a device that can automatically perform numerical calculations and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs) and embedded devices.

The electronic device may also include network equipment and/or user equipment. The network equipment includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing.

The network in which the electronic device operates includes, but is not limited to, the Internet, wide area networks, metropolitan area networks, local area networks and virtual private networks (VPNs).

The method specifically includes the following steps.

101. Acquire an image in front of the vehicle as the detection image.

In at least one embodiment of the present application, a camera installed inside or outside the vehicle captures the area in front of the vehicle (for example, its field of view), and the captured image is used as the detection image.

In other embodiments, images extracted with OpenCV from dash-cam video may also be used as detection images. The way the detection image is obtained is not specifically limited in this application.

102. Construct a semantic segmentation model and complete its training.

In at least one embodiment of the present application, the structure of the semantic segmentation model is shown in Figure 2. The semantic segmentation model includes a backbone network and a head network, where the head network includes a first segmentation network and a second segmentation network.

In at least one embodiment of the present application, completing the training of the semantic segmentation model includes: obtaining training images; inputting the training images into the backbone network for feature extraction to obtain a plurality of training feature maps; inputting the plurality of training feature maps into the head network, where the first segmentation network processes the plurality of training feature maps to obtain a first training result for the training images; calculating a first loss value of the first segmentation network from the first training result and a preset first expected result using a preset loss function; processing the plurality of training feature maps with the second segmentation network to obtain a second training result for the training images; calculating a second loss value of the second segmentation network from the second training result and a preset second expected result using the preset loss function; and adjusting the parameters of the semantic segmentation model according to the first loss value and the second loss value to obtain the trained semantic segmentation model.
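The training computation, one backbone, two heads, two losses that are summed, can be sketched at a shape level. The tiny linear layers and the squared-error stand-in for the preset loss function below are assumptions made only to keep the sketch self-contained.

```python
# Toy two-head training computation: one backbone, two heads, two losses
# that are summed. The tiny linear layers and the squared-error stand-in
# for the preset loss function are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
W_backbone = rng.normal(size=(4, 8))   # toy encoder weights
W_head1 = rng.normal(size=(8, 19))     # first head: 19 object categories
W_head2 = rng.normal(size=(8, 2))      # second head: lane / non-lane

def forward(x):
    features = np.tanh(x @ W_backbone)        # "training feature maps"
    return features @ W_head1, features @ W_head2

def model_loss(x, expected_1, expected_2):
    out1, out2 = forward(x)
    loss1 = ((out1 - expected_1) ** 2).mean()  # first loss value
    loss2 = ((out2 - expected_2) ** 2).mean()  # second loss value
    return loss1 + loss2                       # the two losses are summed
```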

In at least one embodiment of the present application, obtaining the training images includes: using images from the PASCAL Visual Object Classes (VOC) dataset, using images from the Cityscapes dataset, or using self-captured road images as training images for the semantic segmentation model. This application places no specific limitation here; for example, images of various road scenes can serve as training images, containing different detection targets such as vehicles, pedestrians, trees and roadblocks.

In at least one embodiment of the present application, if self-captured road images are used as training images, the method includes performing data augmentation on them to enlarge the training set, where the data augmentation includes flipping, rotating, scaling and shifting the training sample images. In this embodiment, applying data augmentation to the self-captured road images increases the number of training images and improves the robustness of the semantic segmentation model.
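A minimal flip / rotate / shift augmentation of the kind described can be written with numpy alone; the scaling variant is omitted here, and the grayscale single-channel input is a simplifying assumption.

```python
# Minimal flip / rotate / shift augmentation of the kind described, using
# numpy only; rescaling is omitted and the input is assumed grayscale.
import numpy as np

def augment(image):
    # image: H x W array (a grayscale stand-in for a road photo).
    flipped = np.fliplr(image)            # horizontal flip
    rotated = np.rot90(image)             # 90-degree rotation
    shifted = np.roll(image, 1, axis=1)   # one-pixel horizontal shift
    return [flipped, rotated, shifted]
```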

In at least one embodiment of the present application, inputting the training images into the backbone network for feature extraction to obtain a plurality of training feature maps includes: using the encoding network of a segnet network as the backbone network of the semantic segmentation model, where the encoding network of the segnet network includes convolution layers, batch normalization (BN) layers, ReLU activation layers and max-pooling layers; inputting the training image into the convolution layers to extract its feature values through convolution operations; standardizing the feature values through the BN layers to compute the current learning rate; and, after processing by the ReLU activation layers and max-pooling layers, outputting a plurality of training feature maps.
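One encoder stage of the described kind (convolution, batch-norm-style standardization, ReLU, 2x2 max pooling) can be illustrated in plain numpy for a single channel; the loop-based convolution and per-map standardization are simplifications of the real layers.

```python
# Single-channel sketch of one segnet-style encoder stage:
# convolution -> batch-norm-style standardization -> ReLU -> 2x2 max pool.
import numpy as np

def conv2d(image, kernel):
    # Valid (no-padding) 2-D cross-correlation, written out explicitly.
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

def encoder_stage(image, kernel):
    x = conv2d(image, kernel)                      # convolution
    x = (x - x.mean()) / (x.std() + 1e-5)          # BN-like standardization
    x = np.maximum(x, 0.0)                         # ReLU activation
    h, w = (x.shape[0] // 2) * 2, (x.shape[1] // 2) * 2
    x = x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))  # max pool
    return x
```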

In at least one embodiment of the present application, inputting the plurality of training feature maps into the head network, where the first segmentation network processes them to obtain the first training result for the training images, includes: using the decoding network of the segnet network as the first segmentation network in the head network of the semantic segmentation model, where the decoding network of the segnet network includes upsampling layers, convolution layers and a first softmax layer. In this embodiment, the plurality of training feature maps are input into the upsampling layers, which enlarge them to the same size as the training image; the upsampled training feature maps are then input into the convolution layers for convolution operations, yielding the first training feature maps; and the first training feature maps are input into the first softmax layer for classification into the first preset pixel categories, yielding the probability A_ik of each pixel classification in the training image, where A_ik represents the probability that the i-th pixel of the training image belongs to the k-th category. The category with the maximum probability is selected as the category of that pixel, the category information of each pixel in the training image is output as the first training result, and the categories of all objects in the training image are determined from the category information of each pixel.

In this embodiment, the semantic segmentation model is trained on training images with corresponding pixel-category annotations, and the pixel categories can be predetermined. For example, the first preset pixel categories predicted by the first softmax layer include 19 predefined object categories, such as vehicles, pedestrians, trees, roadblocks, street lights and buildings. Suppose the pixel categories include vehicle (k=0), pedestrian (k=1), tree (k=2), roadblock (k=3), street light (k=4) and building (k=5), and that after classification by the first softmax layer the probability values for the i-th pixel are A_i0=0.94, A_i1=0.23, A_i2=0.13, A_i3=0.03, A_i4=0.02 and A_i5=0.01. The maximum probability value is 0.94, which corresponds to k=0, so the object category is confirmed as vehicle. In this example, by computing and comparing the classification probabilities of the i-th pixel, the i-th pixel is determined to be a vehicle.
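The per-pixel argmax in the example above can be reproduced directly; the probability values are taken from the text's own example (they are illustrative and need not sum to 1).

```python
# Reproducing the numeric example above: the pixel is assigned the class
# with the maximum softmax probability (values taken from the text).
pixel_probs = {
    "vehicle": 0.94,       # k = 0
    "pedestrian": 0.23,    # k = 1
    "tree": 0.13,          # k = 2
    "roadblock": 0.03,     # k = 3
    "street light": 0.02,  # k = 4
    "building": 0.01,      # k = 5
}
predicted = max(pixel_probs, key=pixel_probs.get)
```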

In at least one embodiment of the present application, the first loss value of the first segmentation network is calculated from the first training result and the preset first expected result using a preset loss function, where LOSS represents the first loss, y represents the preset first expected result, and ŷ represents the first training result; the loss formula itself appears only as an equation image (111120339-A0305-02-0011-1) in the original publication.
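Since the patent's loss formula survives only as an image, its exact form is unknown here. Per-pixel cross-entropy is a common choice for softmax segmentation heads, so the sketch below is offered purely as an assumption about what LOSS(y, ŷ) could look like, not as the patent's actual function.

```python
# Assumed cross-entropy stand-in for the patent's (image-only) loss formula:
# y is the one-hot expected result, y_hat the predicted probabilities.
import math

def cross_entropy(y, y_hat, eps=1e-12):
    # Small eps guards against log(0) for zero-probability entries.
    return -sum(t * math.log(p + eps) for t, p in zip(y, y_hat))
```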

In at least one embodiment of the present application, the second segmentation network processing the plurality of training feature maps to obtain the second training result for the training images includes: adding the decoding network of another segnet network as the second segmentation network in the head network of the semantic segmentation model, where the added decoding network includes upsampling layers, convolution layers and a second softmax layer; inputting the plurality of training feature maps into the upsampling layers, which enlarge them to the same size as the training image; inputting the upsampled training feature maps into the convolution layers for convolution operations to obtain the second training feature maps; and finally inputting the second training feature maps into the second softmax layer for classification into the second preset pixel categories, yielding the probability A_bq of each pixel classification in the training image, where A_bq represents the probability that the b-th pixel of the training image belongs to the q-th category. The category with the maximum probability is selected as the category of that pixel, and the road category corresponding to the training image is determined as the second training result.

In this embodiment, the second preset pixel categories include two predefined road categories, lane and non-lane; that is, the second softmax layer predicts two classes. For example, suppose the pixel categories are lane (q=10) and non-lane (q=15), and that after classification by the second softmax layer the probability values for the b-th pixel are A_b10=0.86 and A_b15=0.33. The maximum probability value is 0.86, which corresponds to q=10, so the category is determined to be lane. In this example, by computing and comparing the classification probabilities of the b-th pixel, the road category of the b-th pixel is found to be lane. In this embodiment, if an object in the training image is recognized as the lane category, the object is not an obstacle; if it is recognized as the non-lane category, the object is an obstacle.
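The same maximum-probability rule applies to the binary road head, using the example values from the text (A_b10 = 0.86 for lane, A_b15 = 0.33 for non-lane); the obstacle flag mirrors the lane/non-lane interpretation given above.

```python
# Argmax over the binary road head, with the example probabilities from
# the text; "non-lane" is interpreted as an obstacle, as described above.
road_probs = {"lane": 0.86, "non-lane": 0.33}
road_class = max(road_probs, key=road_probs.get)
is_obstacle = (road_class == "non-lane")
```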

In at least one embodiment of the present application, the method of calculating the second loss value of the second segmentation network with the preset loss function, based on the second training result and the preset second expected result, is similar to the above method of calculating the first loss value of the first segmentation network with the preset loss function, and is not described again here.

In at least one embodiment of the present application, adjusting the parameters of the semantic segmentation model according to the first loss value and the second loss value to obtain the trained semantic segmentation model includes: adding the first loss value and the second loss value to obtain the loss value of the semantic segmentation model; and adjusting the parameters of the semantic segmentation model by a gradient descent method so that the loss value of the semantic segmentation model is minimized, obtaining the trained semantic segmentation model. In this embodiment, the gradient descent algorithm used includes Stochastic Gradient Descent or Mini-batch Gradient Descent; the present application does not specifically limit this. In this embodiment, adjusting the parameters of the semantic segmentation model includes adjusting the learning rate of the semantic segmentation model or the number of training iterations over the training images.
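The parameter update described above — sum the two branch losses and minimize by gradient descent — can be sketched with NumPy on a toy parameter vector. The two quadratic losses are placeholders standing in for the first and second segmentation networks' loss values, not the patent's actual loss function:

```python
import numpy as np

def total_loss(params):
    # Placeholder branch losses standing in for the first and second
    # segmentation networks' loss values; the model's loss is their sum.
    first_loss = np.sum((params - 1.0) ** 2)
    second_loss = np.sum((params + 0.5) ** 2)
    return first_loss + second_loss

def grad(params):
    # Analytic gradient of the placeholder total loss.
    return 2 * (params - 1.0) + 2 * (params + 0.5)

params = np.zeros(3)
lr = 0.1           # learning rate, one of the adjustable hyperparameters
for _ in range(100):  # iterative training steps
    params -= lr * grad(params)  # gradient descent update

# The sum of the two quadratics is minimized at the midpoint 0.25.
print(np.allclose(params, 0.25))  # True
```

In practice the update would be Stochastic or Mini-batch Gradient Descent over batches of training images, but the descent step itself has this form.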

103. The detection image is input into the backbone network of the trained semantic segmentation model for feature extraction to obtain a plurality of feature maps.

In at least one embodiment of the present application, the detection image is input into the convolution layer of the backbone network, where a convolution operation extracts the feature values of the detection image; the feature values are normalized by the BN layer and the current learning rate is calculated, and after processing by the ReLU activation layer and the max-pooling layer, a plurality of feature maps are output.
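The layer sequence named above can be illustrated with NumPy on a single-channel map. This is a simplified sketch: batch normalization is reduced to zero-mean/unit-variance normalization, the 3x3 kernel is a made-up example, and the image is a toy ramp, none of which come from the patent:

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive 'valid' 2-D convolution (cross-correlation) of map x with kernel k."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def batch_norm(x, eps=1e-5):
    # Simplified BN: normalize to zero mean and unit variance.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def relu(x):
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = (np.arange(36, dtype=float) ** 2).reshape(6, 6)      # toy detection image
kernel = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], float)  # example 3x3 kernel
feature_map = max_pool(relu(batch_norm(conv2d_valid(image, kernel))))
print(feature_map.shape)  # (2, 2)
```

A real backbone stacks many such conv/BN/ReLU/pooling stages and outputs multiple feature maps per image; this shows one pass of each named operation.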

104. The plurality of feature maps are input into the head network, and the first segmentation network of the head network processes the plurality of feature maps and outputs a first recognition result.

In at least one embodiment of the present application, inputting the plurality of feature maps into the head network, processing the plurality of feature maps by the first segmentation network of the head network, and outputting the first recognition result includes: inputting the detection image into the backbone network for convolution and max-pooling operations to obtain a plurality of feature maps of the detection image; performing upsampling and deconvolution on the plurality of feature maps by the first segmentation network to obtain a plurality of feature maps with the same size as the detection image; classifying the plurality of feature maps with the first softmax layer according to first preset pixel categories, and outputting the category information of each pixel in the detection image; and determining the categories of all objects in the detection image according to the category information of each pixel, taking the categories of all objects in the detection image as the first recognition result. For example, as shown in FIG. 3, FIG. 3 is a schematic diagram of the first recognition result provided by an embodiment of the present application. The figure shows the recognition result of the detection image obtained through the first segmentation network: the object categories in the detection image are classified pixel by pixel to obtain the object categories in the detection image.
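The upsampling step above — enlarging a pooled feature map back to the size of the detection image — can be sketched as nearest-neighbour repetition with NumPy. This is an illustrative stand-in for the upsampling/deconvolution layers, with an assumed integer scale factor:

```python
import numpy as np

def upsample_nearest(feature_map, scale):
    """Enlarge a 2-D feature map by an integer factor via nearest-neighbour repetition."""
    return np.repeat(np.repeat(feature_map, scale, axis=0), scale, axis=1)

small = np.array([[1.0, 2.0],
                  [3.0, 4.0]])     # a 2x2 feature map after pooling
big = upsample_nearest(small, 4)    # back to an 8x8 "image-sized" map
print(big.shape)  # (8, 8)
```

Once every feature map is restored to image size, the softmax layer can assign a category to each pixel position, as the paragraph describes.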

105. The second segmentation network of the head network processes the plurality of feature maps and outputs a second recognition result.

In at least one embodiment of the present application, processing the plurality of feature maps by the second segmentation network of the head network and outputting the second recognition result includes: performing upsampling and deconvolution on the plurality of feature maps by the second segmentation network to obtain a plurality of feature maps with the same size as the detection image; and classifying the plurality of feature maps with the second softmax layer according to second preset pixel categories, determining the road category corresponding to the detection image as the second recognition result, the road category being lane or non-lane. For example, as shown in FIG. 4, FIG. 4 is a schematic diagram of the second recognition result provided by an embodiment of the present application. The figure shows the recognition result of the detection image obtained through the second segmentation network: the object categories in the detection image are classified pixel by pixel to determine the road category corresponding to the detection image. In this embodiment, the lane is regarded as a non-obstacle and the non-lane is regarded as an obstacle.

The process of obtaining the first recognition result may refer to the process of obtaining the first training result described above; similarly, the process of obtaining the second recognition result may refer to the process of obtaining the second training result described above, and the descriptions are not repeated here.

It should be noted that the first segmentation network and the second segmentation network process the received feature maps at the same time. When the first segmentation network obtains the first recognition result, the recognized categories are judged to determine the next operation of the vehicle; when the first recognition result shows an unrecognizable category, the second recognition result is called, and the next operation of the vehicle is determined according to the second recognition result.

106. Whether the vehicle can continue driving is determined based on the first recognition result and the second recognition result.

In at least one embodiment of the present application, determining whether the vehicle can continue driving based on the first recognition result and the second recognition result includes: if the first recognition result indicates that all object categories in the detection image have been recognized, determining whether the vehicle can continue driving according to the categories of all objects in the first recognition result; or, if the first recognition result indicates that an unrecognizable object exists in the detection image and the second recognition result indicates that the road category is lane, regarding the road ahead of the vehicle as free of obstacles and determining that the vehicle can continue driving; or, if the first recognition result indicates that an unrecognizable object exists in the detection image and the second recognition result indicates that the road category is non-lane, regarding an obstacle as present ahead of the vehicle and determining that the vehicle cannot continue driving.

In at least one embodiment of the present application, when the first recognition result is obtained through the first segmentation network, if the first recognition result cannot identify an object category, the second recognition result is called to determine whether the vehicle can continue driving. For example, if there is a stroller in front of the vehicle and the stroller category was not included when the first segmentation network was trained, the first segmentation network cannot recognize the stroller in front of the vehicle; the second recognition result is then called, and since it indicates that the road category is non-lane, an obstacle is regarded as present ahead of the vehicle and it is determined that the vehicle cannot continue driving.
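The fallback logic described above can be summarized as a small decision function. The labels "unknown" and "obstacle" and the function name are illustrative conventions introduced here, not identifiers from the patent:

```python
def can_continue_driving(first_result, second_result):
    """first_result: set of object labels from the first segmentation network,
    with "unknown" marking an unrecognizable object (an assumed convention);
    second_result: "lane" or "non-lane" from the second segmentation network.
    Returns True when the vehicle may keep driving."""
    if "unknown" not in first_result:
        # All object categories recognized: decide from the first result alone,
        # e.g. stop only when a recognized obstacle label is present.
        return "obstacle" not in first_result
    # Unrecognizable object: fall back to the second (lane / non-lane) result.
    return second_result == "lane"

# The stroller example: the first network has never seen a stroller, so it
# reports an unknown object; the second network labels the region non-lane.
print(can_continue_driving({"road", "unknown"}, "non-lane"))  # False
```

This mirrors the three branches of step 106: trust the first result when it is complete, otherwise let the lane/non-lane result decide.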

The above are only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Those of ordinary skill in the art may make improvements without departing from the creative concept of the present application, and such improvements all fall within the protection scope of the present application.

As shown in FIG. 5, FIG. 5 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. The electronic device 5 includes a memory 501, at least one processor 502, a computer program 503 stored in the memory 501 and executable on the at least one processor 502, and at least one communication bus 504.

Those skilled in the art will understand that the schematic diagram shown in FIG. 5 is only an example of the electronic device 5 and does not constitute a limitation of the electronic device 5; the electronic device 5 may include more or fewer components than shown, a combination of certain components, or different components. For example, the electronic device 5 may also include input/output devices, network access devices, and the like.

The at least one processor 502 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The at least one processor 502 may be a microprocessor or any conventional processor. The at least one processor 502 is the control center of the electronic device 5 and connects the various parts of the entire electronic device 5 through various interfaces and lines.

The memory 501 may be used to store the computer program 503; the at least one processor 502 implements the various functions of the electronic device 5 by running or executing the computer program 503 stored in the memory 501 and calling the data stored in the memory 501. The memory 501 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and an application program required for at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created according to the use of the electronic device 5 (such as audio data). In addition, the memory 501 may include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.

If the modules/units integrated in the electronic device 5 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the present application may implement all or part of the processes in the above method embodiments through a computer program instructing the relevant hardware; the computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a flash drive, a removable hard disk, a magnetic disk, an optical disk, computer memory, and read-only memory (ROM).

It is obvious to those skilled in the art that the present application is not limited to the details of the above exemplary embodiments and that the present application can be implemented in other specific forms without departing from the spirit or essential characteristics of the present application. The embodiments should therefore be regarded in all respects as illustrative and not restrictive, and the scope of the present application is defined by the appended claims rather than by the above description; all changes falling within the meaning and range of equivalents of the claims are therefore intended to be embraced by the present application. Any reference sign in a claim should not be construed as limiting the claim concerned.

Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solutions of the present application. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present application may be modified or equivalently replaced without departing from the spirit and scope of the technical solutions of the present application.

101-106: Steps

Claims (9)

1. A road condition detection method, comprising: acquiring an image in front of a vehicle as a detection image; inputting the detection image into a trained semantic segmentation model, the semantic segmentation model comprising a backbone network and a head network; extracting features from the detection image with the backbone network to obtain a plurality of feature maps; using a decoding network of a segnet network as a first segmentation network in the head network, and adding a decoding network of a segnet network as a second segmentation network in the head network; inputting the plurality of feature maps into the head network, the first segmentation network of the head network processing the plurality of feature maps to obtain a first recognition result, and the second segmentation network of the head network processing the plurality of feature maps to obtain a second recognition result; and determining, based on the first recognition result and the second recognition result, whether the vehicle can continue driving, comprising: if the first recognition result indicates that all object categories in the detection image have been recognized, determining whether the vehicle can continue driving according to the categories of all objects in the first recognition result; or, if the first recognition result indicates that an unrecognizable object exists in the detection image and the second recognition result indicates that the road category is lane, determining that the vehicle can continue driving; or, if the first recognition result indicates that an unrecognizable object exists in the detection image and the second recognition result indicates that the road category is non-lane, determining that the vehicle cannot continue driving.

2. The road condition detection method of claim 1, further comprising constructing a semantic segmentation model and completing the training of the semantic segmentation model, comprising: acquiring training images; inputting the training images into the backbone network for feature extraction to obtain a plurality of training feature maps; inputting the plurality of training feature maps into the head network, the first segmentation network processing the plurality of training feature maps to obtain a first training result of the training images; calculating a first loss value of the first segmentation network with a preset loss function according to the first training result and a preset first expected result; the second segmentation network processing the plurality of training feature maps to obtain a second training result of the training images; calculating a second loss value of the second segmentation network with the preset loss function according to the second training result and a preset second expected result; and adjusting the parameters of the semantic segmentation model according to the first loss value and the second loss value to obtain the trained semantic segmentation model.

3. The road condition detection method of claim 2, wherein the first training result and the second training result are obtained by: performing upsampling and deconvolution on the plurality of training feature maps with the first segmentation network to obtain a plurality of first training feature maps with the same size as the training image, classifying each first training feature map with a first softmax layer according to first preset pixel categories to obtain the classification probability of each pixel in the training image, selecting the category corresponding to the maximum probability value as the category of the pixel, and outputting the first training result, the first preset pixel categories comprising a plurality of predefined object categories; and performing upsampling and deconvolution on the plurality of training feature maps with the second segmentation network to obtain a plurality of second training feature maps with the same size as the detection image, classifying each training feature map with a second softmax layer according to second preset pixel categories to obtain the classification probability of each pixel in the training image, selecting the category corresponding to the maximum probability value as the category of the pixel, and outputting the second training result, the second preset pixel categories comprising two predefined road categories: lane or non-lane.

4. The road condition detection method of claim 2, wherein adjusting the parameters of the semantic segmentation model according to the first loss value and the second loss value to obtain the trained semantic segmentation model comprises: adding the first loss value and the second loss value to obtain the loss value of the semantic segmentation model; and adjusting the parameters of the semantic segmentation model by a gradient descent method so that the loss value of the semantic segmentation model is minimized, obtaining the trained semantic segmentation model.

5. The road condition detection method of any one of claims 1 to 4, further comprising: using an encoding network of a segnet network as the backbone network.

6. The road condition detection method of claim 5, wherein the first segmentation network of the head network processing the plurality of feature maps to obtain the first recognition result comprises: inputting the detection image into the backbone network for convolution and max-pooling operations to obtain a plurality of feature maps of the detection image; performing upsampling and deconvolution on the plurality of feature maps with the first segmentation network to obtain a plurality of first feature maps with the same size as the detection image; classifying each first feature map with the first softmax layer according to the first preset pixel categories and outputting the category information of each pixel in the detection image; and determining the categories of all objects in the detection image according to the category information of each pixel, taking the categories of all objects in the detection image as the first recognition result.

7. The road condition detection method of claim 5, wherein the second segmentation network of the head network processing the plurality of feature maps to obtain the second recognition result comprises: performing upsampling and deconvolution on the plurality of feature maps with the second segmentation network to obtain a plurality of second feature maps with the same size as the detection image; and classifying each second feature map with the second softmax layer according to the second preset pixel categories, determining the road category corresponding to the detection image as the second recognition result.

8. An electronic device, comprising a processor and a memory, the processor being configured to execute a computer program stored in the memory to implement the road condition detection method of any one of claims 1 to 7.

9. A computer-readable storage medium storing at least one instruction which, when executed by a processor, implements the road condition detection method of any one of claims 1 to 7.
TW111120339A 2022-05-31 Method for detecting road condition, electronic device and storage medium TWI832270B (en)


Publications (2)

Publication Number Publication Date
TW202349279A (en) 2023-12-16
TWI832270B (en) 2024-02-11


Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114492732A (en) 2021-12-24 2022-05-13 苏州安智汽车零部件有限公司 Lightweight model distillation method for automatic driving visual inspection

