WO2021088505A1 - Target attribute detection, neural network training, and intelligent driving method and apparatus - Google Patents

Target attribute detection, neural network training, and intelligent driving method and apparatus

Info

Publication number
WO2021088505A1
WO2021088505A1 (application PCT/CN2020/114109)
Authority
WO
WIPO (PCT)
Prior art keywords
attribute
target
image
feature
lane line
Prior art date
Application number
PCT/CN2020/114109
Other languages
English (en)
French (fr)
Inventor
林培文
程光亮
石建萍
Original Assignee
北京市商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司 filed Critical 北京市商汤科技开发有限公司
Priority to KR1020217016723A (publication KR20210087496A)
Priority to JP2021533200A (publication JP2022513781A)
Publication of WO2021088505A1

Classifications

    • G06V 20/56: Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/588: Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06T 7/10: Segmentation; Edge detection
    • G06T 2207/20081: Training; Learning
    • G06T 2207/30256: Lane; Road marking

Definitions

  • This application relates to computer vision processing technology, including but not limited to a target attribute detection method, a neural network training method, an intelligent driving method, apparatuses, electronic equipment, a computer storage medium, and a computer program.
  • The recognition of target attributes in images has gradually become a research hotspot.
  • For example, the recognition of lane line attributes is conducive to lane division, path planning, and collision warning.
  • How to accurately identify target attributes in images is therefore a technical problem that urgently needs to be solved.
  • In view of this, the embodiments of the present application aim to provide a technical solution for target attribute detection.
  • An embodiment of the present application provides a target attribute detection method, the method including: performing semantic segmentation on an image to be processed to determine a mask map of the image to be processed, the mask map representing the position of a target in the image to be processed; determining, according to the mask map, the attribute features of the target in an attribute feature map of the image to be processed, the attribute feature map representing the attributes of the image to be processed; and determining the attributes of the target according to the attribute features of the target.
  • An embodiment of the present application also provides a neural network training method, including: determining, according to an annotated mask map of a sample image, the attribute features of a target in an attribute feature map of the sample image, where the annotated mask map represents the position of the target in the sample image and the attribute feature map represents the attributes of the sample image; determining the attributes of the target according to the attribute features of the target; and adjusting the network parameter values of the neural network according to the difference between the determined attributes of the target and the annotated attributes of the target, and the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation of the sample image.
  • An embodiment of the present application also provides an intelligent driving method, including: detecting, by any of the foregoing target attribute detection methods, the lane line attributes in a road image acquired by an intelligent driving device; and instructing the intelligent driving device to drive on the road corresponding to the road image according to the detected lane line attributes.
  • the embodiment of the present application also provides a target attribute detection device, the device includes a first processing module, a second processing module, and a third processing module, wherein:
  • the first processing module is configured to perform semantic segmentation on the image to be processed and determine a mask map of the image to be processed, the mask map representing the position of the target in the image to be processed;
  • the second processing module is configured to determine, according to the mask map, the attribute features of the target in the attribute feature map of the image to be processed; the attribute feature map of the image to be processed represents the attributes of the image to be processed;
  • the third processing module is configured to determine the attributes of the target according to the attribute features of the target.
  • the embodiment of the present application also provides a neural network training device, the device includes a fourth processing module, a fifth processing module, and an adjustment module, wherein:
  • the fourth processing module is configured to determine, according to the annotated mask map of the sample image, the attribute features of the target in the attribute feature map of the sample image; the annotated mask map represents the position of the target in the sample image, and the attribute feature map of the sample image represents the attributes of the sample image;
  • the fifth processing module is configured to determine the attributes of the target according to the attribute features of the target;
  • the adjustment module is configured to adjust the network parameter values of the neural network according to the difference between the determined attributes of the target and the annotated attributes of the target, and the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation of the sample image.
  • the embodiment of the present application also provides an intelligent driving device, including a detection module and an indication module, wherein:
  • the detection module is configured to detect, using any of the foregoing target attribute detection methods, the lane line attributes in the road image acquired by the intelligent driving device;
  • the indicating module is configured to instruct the intelligent driving device to drive on the road corresponding to the road image according to the detected lane line attributes.
  • An embodiment of the present application also proposes an electronic device, including a processor and a memory configured to store a computer program that can run on the processor; when the processor runs the computer program, it executes any one of the above target attribute detection methods, neural network training methods, or intelligent driving methods.
  • An embodiment of the present application also proposes a computer storage medium on which a computer program is stored; when the computer program is executed by a processor, any one of the above target attribute detection methods, neural network training methods, or intelligent driving methods is implemented.
  • An embodiment of the present application also proposes a computer program including computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes steps for implementing any one of the above target attribute detection methods, neural network training methods, or intelligent driving methods.
  • In the target attribute detection method, neural network training method, intelligent driving method, apparatuses, electronic equipment, computer storage medium, and computer program proposed in the embodiments of this application, semantic segmentation is performed on the image to be processed to determine a mask map of the image to be processed, the mask map characterizing the position of the target in the image to be processed; according to the mask map, the attribute features of the target are determined in the attribute feature map of the image to be processed, the attribute feature map representing the attributes of the image to be processed; and the attributes of the target are determined according to the attribute features of the target.
  • The target attribute detection method provided by the embodiments of the present application thus divides target attribute detection into two steps: first, the position of the target is determined from the image to be processed; then the attribute features of the target are determined by combining the position of the target with the attribute feature map of the image to be processed, and the attributes of the target are determined from those attribute features. Compared with determining the features of the region where the target is located directly from the pixels at the target's position in the image to be processed, this avoids a separate feature extraction step for classification, and the extracted attribute features are more discriminative, so the target's category can be distinguished more accurately.
  • FIG. 1 is a flowchart of a target attribute detection method according to an embodiment of the application
  • FIG. 2 is a flowchart of lane line attribute detection according to an embodiment of the application
  • Fig. 3 is a flowchart of a neural network training method according to an embodiment of the application.
  • Fig. 4 is a flowchart of a smart driving method according to an embodiment of the application.
  • FIG. 5 is a schematic diagram of the composition structure of a target attribute detection device according to an embodiment of the application.
  • FIG. 6 is a schematic diagram of the composition structure of a neural network training device according to an embodiment of the application.
  • FIG. 7 is a schematic diagram of the composition structure of a smart driving device according to an embodiment of the application.
  • FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the application.
  • In this application, the terms "comprising", "including", or any other variants thereof are intended to cover non-exclusive inclusion, so that a method or device including a series of elements includes not only the explicitly stated elements but also other elements not explicitly listed, or elements inherent to the implementation of the method or device. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other related elements in the method or device that includes that element (such as steps in the method or units in the device; for example, a unit may be part of a circuit, part of a processor, part of a program or software, etc.).
  • The target attribute detection method, neural network training method, and intelligent driving method provided in the embodiments of the application include a series of steps, but they are not limited to the explicitly recorded steps; they may also include steps that need to be performed to obtain relevant information or to process based on that information.
  • Similarly, the target attribute detection device, neural network training device, and intelligent driving device provided in the embodiments of the present application include a series of modules, but are not limited to the explicitly recorded modules; they may also include modules needed to obtain relevant information or to process based on that information.
  • the embodiments of the present application can be applied to a computer system composed of a terminal and a server, and can be operated with many other general-purpose or special-purpose computing system environments or configurations.
  • The terminal can be a thin client, a thick client, a handheld or laptop device, a microprocessor-based system, a set-top box, a programmable consumer electronic product, a network personal computer, a small computer system, etc., and the server can be a server computer system, a small computer system, a large computer system, a distributed cloud computing environment including any of the above systems, etc.
  • Electronic devices such as terminals and servers can be described in the general context of computer system executable instructions (such as program modules) executed by a computer system.
  • program modules may include routines, programs, object programs, components, logic, data structures, etc., which perform specific tasks or implement specific abstract data types.
  • the computer system/server can be implemented in a distributed cloud computing environment. In the distributed cloud computing environment, tasks are executed by remote processing equipment linked through a communication network.
  • program modules may be located on a storage medium of a local or remote computing system including a storage device.
  • In the related art, target attributes in images can be detected with target classification methods or semantic segmentation methods. The target classification approach extracts the target region from the image and inputs the region image to a target classification network, which obtains the attributes of the target through classification.
  • The main problem of the target classification approach is that the target occupies a small image area, so the extracted features have low discriminability.
  • The semantic segmentation approach predicts the attributes of each pixel of the target in the image and then determines the attribute of the entire target by taking the mode, i.e., among the attributes of the target's pixels, the attribute that appears most frequently is taken as the attribute of the entire target.
  • The main problem of the semantic segmentation approach is that a target's attribute belongs to the target as a whole; predicting attributes pixel by pixel breaks this whole-target relationship and lowers the accuracy of the identified attribute.
  • In view of the above problems, the embodiments of the present application propose a target attribute detection method, which can be applied to scenarios such as image classification, lane line attribute recognition, and automatic driving.
  • Fig. 1 is a flowchart of a target attribute detection method according to an embodiment of the application. As shown in Fig. 1, the process may include:
  • Step 101: Perform semantic segmentation on the image to be processed to determine a mask map of the image to be processed, the mask map representing the position of the target in the image to be processed.
  • the image to be processed is an image that requires target attribute recognition.
  • the target in the image to be processed may be a lane line or other targets.
  • The image to be processed can be obtained from a local storage area or from the network, and its format can be Joint Photographic Experts Group (JPEG), Bitmap (BMP), Portable Network Graphics (PNG), or another format; it should be noted that the format and source of the image to be processed are merely examples, and the embodiments of the present application do not limit them.
  • The number of targets in the image to be processed is not limited; there may be one target or multiple targets in the image to be processed. For example, when the target is a lane line, there can be multiple lane lines in the image to be processed. The position of each target in the image to be processed is represented by the mask map obtained in step 101.
  • In practical implementation, the image to be processed can be input into a trained semantic segmentation network, and the mask map of the image to be processed is extracted from the image to be processed by the semantic segmentation network.
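The mask extraction described above can be sketched as follows. This is an illustrative reconstruction, not the patent's implementation: the segmentation network itself is replaced by precomputed per-pixel class scores, and the `mask_from_segmentation` helper is our own name.

```python
import numpy as np

def mask_from_segmentation(scores: np.ndarray) -> np.ndarray:
    """Turn per-pixel class scores (C, H, W) from a semantic segmentation
    network into a mask map (H, W): each pixel holds the index of its most
    probable class, with 0 conventionally reserved for background."""
    return np.argmax(scores, axis=0)

# Toy example: 3 classes (background + 2 lane lines) on a 2x4 image.
scores = np.zeros((3, 2, 4))
scores[0] = 0.5                # background wins everywhere by default
scores[1, :, 1] = 0.9          # class 1 dominates column 1
scores[2, :, 3] = 0.9          # class 2 dominates column 3
mask = mask_from_segmentation(scores)
print(mask.tolist())           # [[0, 1, 0, 2], [0, 1, 0, 2]]
```

The resulting mask map directly encodes each target's position, which is what step 102 consumes.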
  • Step 102: Determine, according to the mask map, the attribute features of the target in the attribute feature map of the image to be processed; the attribute feature map of the image to be processed represents the attributes of the image to be processed.
  • The attributes of the image to be processed can represent characteristics of the image such as color, texture, and surface roughness, and can be derived from the attributes of each pixel of the image; the attributes of a pixel can represent information such as the pixel's color.
  • The attribute features of the target can characterize the target's color, texture, surface roughness, and other characteristics.
  • The attribute features of the target can be expressed as a feature map with a set number of channels, and the number of channels can be chosen according to the effect of target attribute recognition, for example, 5, 6, or 7.
  • Since the mask map characterizes the position of the target in the image to be processed, the attribute features of the target can be determined in the attribute feature map of the image to be processed according to the mask map.
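The idea of using the mask map to pick out the target's features from the attribute feature map can be sketched as follows; the array shapes and the `target_attribute_features` helper are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def target_attribute_features(attr_map: np.ndarray, mask: np.ndarray,
                              target_id: int) -> np.ndarray:
    """Select, from the attribute feature map (C, H, W), the feature vectors
    of the pixels that the mask map (H, W) assigns to the given target.
    Returns an array of shape (C, N), where N is the target's pixel count."""
    return attr_map[:, mask == target_id]

attr_map = np.arange(2 * 2 * 3).reshape(2, 2, 3).astype(float)  # C=2, H=2, W=3
mask = np.array([[0, 1, 1],
                 [0, 0, 1]])   # target 1 occupies three pixels
feats = target_attribute_features(attr_map, mask, target_id=1)
print(feats.shape)             # (2, 3): 2 channels, 3 pixels of target 1
```

Because different targets cover different numbers of pixels, the selected features have variable length, which motivates the fixed-length conversion described later.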
  • Step 103: Determine the attributes of the target according to the attribute features of the target.
  • the attributes of the target can represent the color, size, shape and other information of the target in the image to be processed.
  • For example, the attributes of a lane line can represent information such as the lane line's color, line width, and line type.
  • the attributes of each target in the image to be processed can be obtained by executing step 103.
  • the attribute characteristics of the target can be input into the trained target classification network, and the target classification network is used to classify the attribute characteristics of the target to obtain the attributes of the target in the image to be processed.
  • Steps 101 to 103 can be implemented by a processor in an electronic device, and the processor can be at least one of an Application-Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field-Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor.
  • It can be seen that, in the embodiments of the present application, the semantic segmentation method is first used to obtain the mask map of the image to be processed; the attribute features of the target are then determined according to the mask map, and the attributes of the target are determined from those features.
  • That is, the target attribute detection method provided in the embodiments of the application divides target attribute detection into two steps: first determine the position of the target from the image to be processed, then determine the attribute features of the target by combining that position with the attribute feature map of the image to be processed, and finally determine the attributes of the target according to its attribute features.
  • Compared with determining the features of the target's region directly from the pixels at the target's position, the first step avoids a separate feature extraction for classification, so the attribute features extracted by the method are more discriminative and the target's category can be determined more accurately. In addition, the method classifies the target as a whole; compared with detecting target attributes only through semantic segmentation, detecting attributes from the whole target yields more accurate results.
  • In some embodiments, the attribute features of the target can be converted into a feature of a preset length, and the attributes of the target can then be determined according to this converted fixed-length feature; the preset length can be set according to the actual application scenario.
  • In some embodiments, when there are multiple targets in the image to be processed, target classification can be performed directly on the multiple targets according to their attribute features, obtaining the attributes of the multiple targets.
  • In an implementation, the points corresponding to the attribute features of the target can be divided into k parts, where k is an integer greater than or equal to 1, and the average value of the attribute features corresponding to the points in each part is calculated, yielding k average values. These steps are repeated n times, with a different value of k in any two executions, where k is less than the maximum possible number of points corresponding to the target's attribute features and n is an integer greater than 1; the obtained average values are used to form the feature of the preset length.
  • For the attribute features of each target, the division is performed n times; the i-th division of the points of the target's attribute features yields Ki parts, where i ranges from 1 to n and Ki is the value of k in the i-th division. The Ki parts may be of equal or unequal length. The Ki parts obtained in the i-th division are average-pooled to obtain the mean of the attribute features corresponding to the points in each part; the features of lengths K1 through Kn are then concatenated to obtain a feature of length P, where P represents the preset length.
  • The value of k can be set according to actual conditions; for example, if the maximum possible number of points of the target's attribute features is 30, then the value of k is less than or equal to 30.
  • In some embodiments, the target attribute detection method is a lane line attribute detection method, the image to be processed is a road image, and the target is a lane line.
  • In an implementation, feature extraction can be performed on the road image to determine a region feature map of the road image and an attribute feature map of the road image; according to the region feature map, the mask map of the lane lines in the road image is determined; according to the mask map of the lane lines, the attribute features of the lane lines are determined in the attribute feature map of the road image; and the attributes of the lane lines are determined according to their attribute features.
  • Here, the region feature map of the road image represents the positions of the lane lines in the road image, so the mask map of the lane lines can be obtained from the region feature map of the road image.
  • Figure 2 is a flowchart of lane line attribute detection in an embodiment of this application.
  • Referring to Fig. 2, road images can be input to a trained semantic segmentation network, which outputs the lane line segmentation results; the lane line segmentation results can be expressed as the region feature map of the road image, and the semantic segmentation network can also produce the attribute feature map of the road image. In this way, the mask map of the lane lines is obtained from the region feature map of the road image, and the attribute features of each lane line are then obtained from the attribute feature map of the road image according to the lane line mask map.
  • In practical applications, lane lines usually differ in length and angle, so the attribute features obtained for different lane lines usually differ in length. Since target classification needs to be performed on features of the same length, the attribute features of lane lines of different lengths can be converted into features of the same length in advance.
  • In an implementation, the attribute features of each lane line can be input to a fixed-length feature extraction module, which performs the following steps: divide the points corresponding to the attribute features of each lane line into k parts, where k is an integer greater than or equal to 1; calculate the average value of the lane line's attribute features corresponding to the points in each part, obtaining k average values; repeat these steps n times, with a different value of k in any two executions, where k is less than the maximum possible number of points corresponding to the target's attribute features and n is an integer greater than 1; the obtained average values form a feature of the preset length.
  • For example, the attribute features of a lane line can first be pooled as a whole to obtain 1 feature value; the points of the lane line's attribute features are then divided six more times: into two parts (averaging each part yields 2 feature values), into three parts (3 feature values), into six parts (6 feature values), into eight parts (8 feature values), into ten parts (10 feature values), and into twelve parts (12 feature values). Concatenating the 1, 2, 3, 6, 8, 10, and 12 feature values gives attribute features of the same length for every lane line (all of length 1 + 2 + 3 + 6 + 8 + 10 + 12 = 42).
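The 42-length construction above can be sketched as follows for a single feature channel. This is a minimal sketch, assuming one value per point of the lane line; `np.array_split` is used so that, as the text allows, the parts may be of equal or unequal length.

```python
import numpy as np

# The division counts described above: 1, 2, 3, 6, 8, 10, 12 parts,
# giving 1 + 2 + 3 + 6 + 8 + 10 + 12 = 42 averaged values.
DIVISIONS = (1, 2, 3, 6, 8, 10, 12)

def fixed_length_feature(values: np.ndarray,
                         divisions=DIVISIONS) -> np.ndarray:
    """Convert a 1-D attribute feature of arbitrary length into a
    fixed-length feature by repeatedly splitting it into k parts and
    averaging each part."""
    pooled = []
    for k in divisions:
        # np.array_split tolerates lengths not divisible by k, so the
        # parts may be of equal or unequal length.
        pooled.extend(part.mean() for part in np.array_split(values, k))
    return np.asarray(pooled)

lane_feature = np.linspace(0.0, 1.0, num=37)   # a lane line with 37 points
fixed = fixed_length_feature(lane_feature)
print(fixed.shape)  # (42,)
```

Whatever the lane line's original length, the output always has length 42, which is what the classification network expects.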
  • Then, the features of the same length can be input to a trained target classification network, which performs target classification on the input features to obtain the attributes of each lane line.
  • In an implementation, the target classification network may include two fully connected layers: the input of the first fully connected layer is the fixed-length attribute feature (for example, of length 42) corresponding to each target, and the first fully connected layer has 256 nodes; the second fully connected layer has 128 nodes, and the attributes of each target are output on the basis of the second fully connected layer.
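A minimal sketch of such a classifier follows. The weights here are random placeholders (in the patent's setting they would be learned), and the final projection from the 128-node layer to attribute scores is our assumption, since the text does not specify how the second layer's output maps to attribute labels.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# 42 -> 256 -> 128 sizes follow the text; num_classes is illustrative.
num_classes = 4
W1, b1 = rng.standard_normal((256, 42)) * 0.1, np.zeros(256)
W2, b2 = rng.standard_normal((128, 256)) * 0.1, np.zeros(128)
W3, b3 = rng.standard_normal((num_classes, 128)) * 0.1, np.zeros(num_classes)

def classify(feature_42: np.ndarray) -> int:
    """Run a fixed-length feature through two fully connected layers
    (256 and 128 nodes) and return the index of the highest-scoring
    attribute class."""
    h1 = relu(W1 @ feature_42 + b1)
    h2 = relu(W2 @ h1 + b2)
    scores = W3 @ h2 + b3
    return int(np.argmax(scores))

attr = classify(rng.standard_normal(42))
print(0 <= attr < num_classes)  # True
```

Each lane line's 42-length feature is classified independently, so any number of lane lines in the image can be handled in turn.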
  • It should be noted that FIG. 2 merely illustrates one specific application scenario of target attribute detection, namely lane line attribute recognition; the embodiments of the present application are not limited to performing attribute detection on lane lines, and attribute detection can also be performed on other types of targets.
  • In some embodiments, the lane lines in the road image may be determined according to the road image, the determined mask map of the lane lines in the road image, and the determined attributes of the lane lines.
  • In some embodiments, the foregoing target attribute detection method is executed by a neural network, and the neural network is obtained by training with sample images, annotated mask maps of the sample images, and annotated attributes of the targets in the sample images.
  • the sample image is a predetermined image containing a target, for example, the target may be a lane line or other target.
  • The sample image can be obtained from a local storage area or from the network, and its format can be JPEG, BMP, PNG, or another format; it should be noted that the format and source of the sample image are merely examples, and the embodiments of the present application do not limit them.
  • The number of targets in the sample image is not limited; there may be one target or multiple targets in the sample image. For example, when the target is a lane line, there can be multiple lane lines in the sample image.
  • The annotated mask map of the sample image can be set in advance. Since the annotated mask map represents the position of the target in the sample image, the attribute features of the target can be determined in the attribute feature map of the sample image according to the annotated mask map; this in turn helps the trained neural network determine the attribute features of the target and, further, determine the target's attributes according to those attribute features.
  • Fig. 3 is a flowchart of a neural network training method according to an embodiment of the application. As shown in Fig. 3, the process may include:
  • Step 301: Determine, according to the annotated mask map of the sample image, the attribute features belonging to the target in the attribute feature map of the sample image; the annotated mask map represents the position of the target in the sample image; the attribute feature map of the sample image represents the attributes of the sample image.
  • the attributes of the sample image can represent characteristics of the image such as color, texture, and surface roughness.
  • the attributes of the sample image can be derived from the attributes of each pixel of the sample image; the attribute of a pixel can represent information such as the pixel's color.
  • the attribute features of the target can characterize characteristics such as the target's color, texture, and surface roughness.
  • the attribute features of the target can be expressed as a feature map with a set number of channels; the number of channels can be set according to the effect of target attribute recognition, for example, to 5, 6, or 7.
  • Step 302: Determine the attributes of the target according to the attribute features of the target.
  • Step 303: Adjust the network parameter values of the neural network according to the difference between the determined attributes of the target and the annotated attributes of the target, and the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation of the sample image.
  • in this step, for example, the loss of the initial neural network can be calculated based on the difference between the determined attributes of the target and the annotated attributes of the target, as well as the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation of the sample image; the network parameters of the neural network are then adjusted according to the loss.
  • Step 304: Judge whether the processing of the sample image by the neural network after the network parameter adjustment meets the preset requirement; if not, repeat steps 301 to 304; if so, execute step 305.
  • the preset requirement may be that the loss of the neural network after the network parameter adjustment is less than a set loss value; in the embodiments of the present application, the set loss value may be preset according to actual needs.
  • Step 305: Use the neural network after the network parameter adjustment as the trained neural network.
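  Steps 301 to 305 can be sketched as the following training loop; the loss terms and the `network`/`adjust` callables below are illustrative placeholders, not the actual network implementation:

```python
def step_loss(pred_attrs, annotated_attrs, pred_mask, annotated_mask):
    """Step 303's loss: the attribute difference plus the mask
    difference (illustrative L1 differences)."""
    attr_loss = sum(abs(p - a) for p, a in zip(pred_attrs, annotated_attrs))
    mask_loss = sum(abs(p - a) for p, a in zip(pred_mask, annotated_mask))
    return attr_loss + mask_loss

def train(network, samples, set_loss_value, adjust, max_rounds=1000):
    """Steps 301-305: adjust the network until its loss on the samples
    is below the set loss value, then return the trained network."""
    for _ in range(max_rounds):
        total = 0.0
        for image, annotated_mask, annotated_attrs in samples:
            # Steps 301-302: during training, attribute features are
            # taken under the annotated mask and then classified.
            pred_attrs, pred_mask = network(image, annotated_mask)
            total += step_loss(pred_attrs, annotated_attrs,
                               pred_mask, annotated_mask)
        if total < set_loss_value:  # step 304: preset requirement met
            return network          # step 305: training finished
        adjust(total)               # step 303: adjust network parameters
    return network
```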
  • steps 301 to 305 can be implemented by a processor in an electronic device.
  • the aforementioned processor can be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
  • in a specific implementation, the aforementioned neural network is used for lane line attribute detection, the sample image is a road sample image, and the target is a lane line. In this case, first, the attribute features belonging to the lane line in the attribute feature map of the road sample image can be determined according to the annotated mask map of the road sample image, where the annotated mask map represents the position of the lane line in the road sample image; then, the attributes of the lane line are determined according to the attribute features of the lane line; finally, the network parameter values of the neural network can be adjusted according to the difference between the determined attributes of the lane line and the annotated attributes of the lane line, as well as the difference between the annotated mask map of the road sample image and the mask map of the lane line determined according to the region feature map of the road sample image (the mask map of the lane line can be detected by a semantic segmentation network).
  • in the above neural network training process, the attribute features of the target are first determined according to the annotated mask map, and the attributes of the target are then determined. Since the segmentation network within the neural network has not yet been trained well during the training phase, using an untrained network to predict the mask map of the lane lines would prevent the subsequent classification network in the neural network from converging; therefore, the annotated mask map is used to determine the target's attribute features in the training phase.
  • in the above training process, determining the target's attributes is likewise divided into two steps: first, the attribute features of the target are determined according to the annotated mask map, and then the attributes of the target are determined according to those attribute features. Compared with determining the features of the target's region from the pixels at the target's position in the image to be processed and classifying the target on those features, more discriminative attribute features can be extracted, so the classification is learned better and the trained neural network detects targets with higher accuracy. Moreover, during training the target is classified as a whole when its attributes are determined; compared with a scheme that performs target attribute detection only through semantic segmentation, detecting the attributes from the target as a whole yields the target's attributes more accurately, which also gives the trained neural network higher accuracy in detecting targets.
  • the embodiment of the present application also proposes a smart driving method, which can be applied to smart driving equipment.
  • the smart driving equipment includes, but is not limited to, self-driving vehicles, Advanced Driving Assistant System (ADAS) vehicles, ADAS-equipped robots, etc.
  • Fig. 4 is a flowchart of a smart driving method according to an embodiment of the application. As shown in Fig. 4, the process may include:
  • Step 401: When the target attribute detection method is a lane line attribute detection method and the image to be processed is a road image, use any of the foregoing target attribute detection methods to detect the lane line attributes in the road image acquired by the smart driving device.
  • lane line attributes include, but are not limited to, the line type, line color, line width, etc.; the line type can be single, double, solid, or dashed, and the line color can be, for example, white, yellow, or blue, or a combination of two colors.
  • Step 402: Instruct the smart driving device to drive on the road corresponding to the road image according to the detected attributes of the lane line.
  • in practical applications, the smart driving device can be controlled directly (in automatic driving and robots), or instructions can be sent to the driver, who then controls the vehicle (for example, a vehicle equipped with ADAS).
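  As an illustration of how a detected attribute might be turned into a driving indication, a hypothetical rule (not a rule from the description):

```python
def lane_change_indication(line_type):
    """Map a detected line type to a driving indication: crossing a
    solid line is disallowed, a dashed line permits a lane change.
    The rule and the return strings are illustrative assumptions."""
    if "solid" in line_type:
        return "keep lane"
    if "dashed" in line_type:
        return "lane change permitted"
    return "no indication"

print(lane_change_indication("single solid"))  # → keep lane
```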
  • it can be seen that, based on the lane line attribute detection method, the lane line attributes can be obtained, which helps assist vehicle driving and improves driving safety.
  • an embodiment of the present application proposes a target attribute detection device.
  • FIG. 5 is a schematic diagram of the composition structure of a target attribute detection device according to an embodiment of the application. As shown in FIG. 5, the device includes: a first processing module 501, a second processing module 502, and a third processing module 503, wherein:
  • the first processing module 501 is configured to perform semantic segmentation on the image to be processed, and determine a mask image of the image to be processed, the mask image representing the position of the target in the image to be processed;
  • the second processing module 502 is configured to determine the attribute characteristics of the target in the attribute characteristic map of the image to be processed according to the mask map; the attribute characteristic map of the image to be processed characterizes the characteristics of the image to be processed Attributes;
  • the third processing module 503 is configured to determine the attribute of the target according to the attribute characteristics of the target.
  • the third processing module 503 is configured to: convert the attribute features of the target into a feature of a preset length; and determine the attributes of the target according to the converted attribute feature of the preset length.
  • the third processing module 503, in terms of converting the attribute features of the target into a feature of a preset length, is configured to: divide the points corresponding to the attribute features of the target into k parts; calculate the average of the target's attribute features corresponding to the points in each part to obtain k averages; repeat the above steps n times, where the value of k differs between any two executions, k is less than the maximum possible number of points corresponding to the target's attribute features, and n is an integer greater than 1; and use the obtained averages to form a feature of the preset length.
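  The conversion to a preset length described above amounts to multi-scale average pooling. A minimal sketch, assuming one target's attribute feature is given as a flat list of per-point values and that its length is at least the largest k (the function name and the integer-boundary split are illustrative choices):

```python
def fixed_length_feature(values, k_values):
    """Convert a variable-length list of per-point attribute features
    into a fixed-length feature: for each k, split the points into k
    contiguous parts and average each part, so the output length is
    sum(k_values). Assumes len(values) >= max(k_values), matching the
    requirement that k stays below the number of points."""
    feature = []
    n = len(values)
    for k in k_values:
        # Integer boundaries give k non-empty, near-equal parts.
        bounds = [i * n // k for i in range(k + 1)]
        for start, end in zip(bounds, bounds[1:]):
            part = values[start:end]
            feature.append(sum(part) / len(part))
    return feature
```

  With k values 1, 2, 3, 6, 8, 10, 12 (the concrete divisions used in the description's lane line example), every lane line's attribute feature maps to a feature of length 1 + 2 + 3 + 6 + 8 + 10 + 12 = 42, regardless of the input length.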
  • the target attribute detection device is a lane line attribute detection device, and the image to be processed is a road image;
  • the first processing module 501 is configured to: perform feature extraction on the road image to determine a region feature map of the road image and an attribute feature map of the road image; and determine a mask map of the lane lines in the road image according to the region feature map of the road image;
  • the second processing module 502 is configured to: determine the attribute feature of the lane line in the attribute feature map of the road image according to the mask map of the lane line in the road image;
  • the third processing module 503 is configured to determine the attribute of the lane line according to the attribute characteristics of the lane line.
  • the third processing module 503 is further configured to, after determining the attributes of the lane lines, determine the lane lines in the road image according to the road image, the determined mask map of the lane lines in the road image, and the determined attributes of the lane lines.
  • the target attribute detection device is implemented based on a neural network, and the neural network is trained using a sample image, an annotated mask map of the sample image, and annotated attributes of the target of the sample image.
  • the first processing module 501, the second processing module 502, and the third processing module 503 can all be implemented by a processor in an electronic device.
  • the aforementioned processors can be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
  • FIG. 6 is a schematic diagram of the composition structure of a neural network training device according to an embodiment of the application. As shown in FIG. 6, the device includes: a fourth processing module 601, a fifth processing module 602, and an adjustment module 603, wherein,
  • the fourth processing module 601 is configured to determine, according to the annotated mask map of the sample image, the attribute features belonging to the target in the attribute feature map of the sample image; the annotated mask map represents the position of the target in the sample image; the attribute feature map of the sample image characterizes the attributes of the sample image;
  • the fifth processing module 602 is configured to determine the attribute of the target according to the attribute characteristics of the target;
  • the adjustment module 603 is configured to adjust the network parameter values of the neural network according to the difference between the determined attributes of the target and the annotated attributes of the target, and the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation of the sample image.
  • the fifth processing module 602 is configured to: convert the attribute features of the target into a feature of a preset length; and determine the attributes of the target according to the converted attribute feature of the preset length.
  • the fifth processing module 602, in terms of converting the attribute features of the target into a feature of a preset length, is configured to: divide the points corresponding to the attribute features of the target into k parts; calculate the average of the target's attribute features corresponding to the points in each part to obtain k averages; repeat the above steps n times, where the value of k differs between any two executions, k is less than the maximum possible number of points corresponding to the target's attribute features, and n is an integer greater than 1; and use the obtained averages to form a feature of the preset length.
  • the neural network is used for lane line attribute detection, the sample image is a road sample image, and the target is a lane line;
  • the fourth processing module 601 is configured to determine, according to the annotated mask map of the road sample image, the attribute features belonging to the lane line in the attribute feature map of the road sample image; the annotated mask map represents the position of the lane line in the road sample image;
  • the fifth processing module 602 is configured to: determine the attribute of the lane line according to the attribute characteristics of the lane line;
  • the adjustment module 603 is configured to adjust the network parameter values of the neural network according to the difference between the determined attributes of the lane line and the annotated attributes of the lane line, and the difference between the annotated mask map of the road sample image and the mask map of the lane line determined according to the region feature map of the road sample image.
  • the fourth processing module 601, the fifth processing module 602, and the adjustment module 603 can all be implemented by a processor in an electronic device.
  • the aforementioned processors can be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
  • FIG. 7 is a schematic diagram of the composition structure of an intelligent driving device according to an embodiment of the application. As shown in FIG. 7, the device includes: a detection module 701 and an indication module 702, wherein,
  • the detection module 701 is configured to, when the target attribute detection method is a lane line attribute detection method and the image to be processed is a road image, use any of the above-mentioned target attribute detection methods to detect the lane line attributes in the road image acquired by the smart driving device;
  • the indicating module 702 is configured to instruct the intelligent driving device to drive on the road corresponding to the road image according to the detected attributes of the lane line.
  • both the detection module 701 and the indication module 702 can be implemented by a processor in a smart driving device.
  • the above-mentioned processors can be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
  • the functional modules in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit; the integrated unit can be realized in the form of hardware or in the form of a software functional module.
  • if the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to enable a computer device (which can be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the method described in this embodiment.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, and other media that can store program code.
  • the computer program instructions corresponding to any target attribute detection method, neural network training method, or smart driving method in this embodiment can be stored on storage media such as optical disks, hard disks, and USB flash drives.
  • FIG. 8 shows an electronic device 80 provided by an embodiment of the present application, which may include: a memory 81 and a processor 82; wherein,
  • the memory 81 is configured to store computer programs and data
  • the processor 82 is configured to execute a computer program stored in the memory to implement any target attribute detection method or any one of the above neural network training methods or any one of the above intelligent driving methods in the foregoing embodiments.
  • the aforementioned memory 81 may be a volatile memory, such as a RAM; or a non-volatile memory, such as a ROM, a flash memory, a hard disk drive (Hard Disk Drive, HDD), or a solid-state drive (Solid-State Drive, SSD); or a combination of the above types of memories, and it provides instructions and data to the processor 82.
  • the aforementioned processor 82 may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor. It can be understood that, for different devices, the electronic components used to implement the above processor functions may also be others, which is not specifically limited in the embodiments of the present application.
  • the embodiments of the present application also propose a computer program, including computer-readable code; when the computer-readable code runs in an electronic device, the processor in the electronic device executes steps for implementing any of the above-mentioned target attribute detection methods, neural network training methods, or intelligent driving methods.
  • the functions or modules contained in the apparatus provided in the embodiments of the present application can be used to execute the methods described in the above method embodiments.
  • the technical solution of the present invention, in essence, or the part that contributes to the existing technology, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions to enable a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the embodiments of the present invention.
  • the embodiments of the application provide a target attribute detection method, a neural network training method, and an intelligent driving method, as well as an apparatus, an electronic device, a computer storage medium, and a computer program. The target attribute detection method includes: performing semantic segmentation on the image to be processed to determine a mask map of the image to be processed, the mask map characterizing the position of the target in the image to be processed; determining, according to the mask map, the attribute features belonging to the target in the attribute feature map of the image to be processed, the attribute feature map characterizing the attributes of the image to be processed; and determining the attributes of the target according to the attribute features of the target. In this way, the attribute features of the target can be determined from the more discriminative mask map obtained by semantic segmentation, and thus the accuracy of target attribute detection can be improved.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

This embodiment discloses a target attribute detection method, a neural network training method, a smart driving method, an apparatus, an electronic device, a computer storage medium, and a computer program. The target attribute detection method includes: performing semantic segmentation on an image to be processed to determine a mask map of the image to be processed, the mask map representing the position of a target in the image to be processed; determining, according to the mask map, the attribute features belonging to the target in an attribute feature map of the image to be processed, the attribute feature map representing the attributes of the image to be processed; and determining the attributes of the target according to the attribute features of the target.

Description

Target attribute detection, neural network training, and smart driving methods and apparatuses
Cross-reference to related applications
This application is based on, and claims priority to, the Chinese patent application with application number 201911081216.4 filed on November 7, 2019, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to computer vision processing technology, and relates to, but is not limited to, target attribute detection, neural network training, and smart driving methods, apparatuses, electronic devices, computer storage media, and computer programs.
Background
With the continuous development of computer vision technology, recognizing the attributes of targets in images has gradually become a research focus; for example, recognizing lane line attributes helps with lane division, path planning, collision warning, and so on. In the related art, how to accurately recognize target attributes in an image is a technical problem to be solved urgently.
Summary
The embodiments of the present application aim to provide a technical solution for target attribute detection.
An embodiment of the present application provides a target attribute detection method, the method including:
performing semantic segmentation on an image to be processed to determine a mask map of the image to be processed, the mask map representing the position of a target in the image to be processed;
determining, according to the mask map, the attribute features belonging to the target in an attribute feature map of the image to be processed; the attribute feature map of the image to be processed represents the attributes of the image to be processed;
determining the attributes of the target according to the attribute features of the target.
An embodiment of the present application further provides a neural network training method, including:
determining, according to an annotated mask map of a sample image, the attribute features belonging to a target in an attribute feature map of the sample image; the annotated mask map represents the position of the target in the sample image; the attribute feature map of the sample image represents the attributes of the sample image;
determining the attributes of the target according to the attribute features of the target;
adjusting network parameter values of the neural network according to the difference between the determined attributes of the target and the annotated attributes of the target, and the difference between the annotated mask map and a mask map of the sample image determined after semantic segmentation of the sample image.
An embodiment of the present application further provides a smart driving method, including:
detecting lane line attributes in a road image acquired by a smart driving device by using any one of the target attribute detection methods described above;
instructing the smart driving device to drive on the road corresponding to the road image according to the detected lane line attributes.
An embodiment of the present application further provides a target attribute detection apparatus, the apparatus including a first processing module, a second processing module, and a third processing module, wherein:
the first processing module is configured to perform semantic segmentation on an image to be processed to determine a mask map of the image to be processed, the mask map representing the position of a target in the image to be processed;
the second processing module is configured to determine, according to the mask map, the attribute features belonging to the target in an attribute feature map of the image to be processed; the attribute feature map of the image to be processed represents the attributes of the image to be processed;
the third processing module is configured to determine the attributes of the target according to the attribute features of the target.
An embodiment of the present application further provides a neural network training apparatus, the apparatus including a fourth processing module, a fifth processing module, and an adjustment module, wherein:
the fourth processing module is configured to determine, according to an annotated mask map of a sample image, the attribute features belonging to a target in an attribute feature map of the sample image; the annotated mask map represents the position of the target in the sample image; the attribute feature map of the sample image represents the attributes of the sample image;
the fifth processing module is configured to determine the attributes of the target according to the attribute features of the target;
the adjustment module is configured to adjust network parameter values of the neural network according to the difference between the determined attributes of the target and the annotated attributes of the target, and the difference between the annotated mask map and a mask map of the sample image determined after semantic segmentation of the sample image.
An embodiment of the present application further provides a smart driving apparatus, including a detection module and an indication module, wherein:
the detection module is configured to detect lane line attributes in a road image acquired by a smart driving device by using any one of the target attribute detection methods described above;
the indication module is configured to instruct the smart driving device to drive on the road corresponding to the road image according to the detected lane line attributes.
An embodiment of the present application further provides an electronic device, including a processor and a memory configured to store a computer program capable of running on the processor; wherein the processor is configured to, when running the computer program, execute any one of the target attribute detection methods, neural network training methods, or smart driving methods described above.
An embodiment of the present application further provides a computer storage medium on which a computer program is stored; when the computer program is executed by a processor, any one of the target attribute detection methods, neural network training methods, or smart driving methods described above is implemented.
An embodiment of the present application further provides a computer program including computer-readable code; when the computer-readable code runs in an electronic device, a processor in the electronic device executes steps for implementing any one of the target attribute detection methods, neural network training methods, or smart driving methods described above.
In the target attribute detection method, neural network training method, smart driving method, apparatus, electronic device, computer storage medium, and computer program provided by the embodiments of the present application, semantic segmentation is performed on an image to be processed to determine a mask map of the image to be processed, the mask map representing the position of a target in the image to be processed; according to the mask map, the attribute features belonging to the target in an attribute feature map of the image to be processed are determined, the attribute feature map representing the attributes of the image to be processed; and the attributes of the target are determined according to the attribute features of the target. In this way, the target attribute detection method provided by the embodiments of the present application divides target attribute detection into two steps: first, the position of the target is determined from the image to be processed; then, the attribute features of the target are determined based on the target's position in the image to be processed combined with the attribute feature map of the image, and the attributes of the target are determined from those attribute features. Compared with determining the features of the target's region from the pixels at the target's position and classifying the target on those features, this first avoids the feature extraction otherwise required for classification; moreover, the attribute features extracted by the method are more discriminative, so the target's category can be determined more accurately.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
Brief description of the drawings
The drawings herein are incorporated into and constitute a part of this specification; they illustrate embodiments consistent with the present application and, together with the specification, serve to explain the technical solutions of the present application.
FIG. 1 is a flowchart of a target attribute detection method according to an embodiment of the present application;
FIG. 2 is a flowchart of lane line attribute detection according to an embodiment of the present application;
FIG. 3 is a flowchart of a neural network training method according to an embodiment of the present application;
FIG. 4 is a flowchart of a smart driving method according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a target attribute detection apparatus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a neural network training apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a smart driving apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed description
The present application is further described in detail below with reference to the drawings and embodiments. It should be understood that the embodiments provided here are merely intended to explain the present application, not to limit it. In addition, the embodiments provided below are some, not all, of the embodiments for implementing the present application; where no conflict arises, the technical solutions recorded in the embodiments of the present application may be combined in any manner.
It should be noted that, in the embodiments of the present application, the terms "comprise", "include", and any variants thereof are intended to cover a non-exclusive inclusion, so that a method or apparatus including a series of elements includes not only the explicitly recorded elements but also other elements not explicitly listed, or elements inherent to implementing the method or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other related elements in the method or apparatus including that element (for example, steps in a method or units in an apparatus; a unit may be part of a circuit, part of a processor, part of a program or software, and so on).
For example, the target attribute detection method, neural network training method, and smart driving method provided by the embodiments of the present application include a series of steps, but are not limited to the recorded steps; likewise, the target attribute detection apparatus, neural network training apparatus, and smart driving apparatus provided by the embodiments of the present application include a series of modules, but are not limited to the explicitly recorded modules, and may further include modules needed to acquire related information or to perform processing based on the information.
The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the term "at least one" herein means any one of multiple items or any combination of at least two of multiple items; for example, including at least one of A, B, and C may mean including any one or more elements selected from the set consisting of A, B, and C.
The embodiments of the present application may be applied to a computer system composed of a terminal and a server, and may operate together with numerous other general-purpose or special-purpose computing system environments or configurations. Here, the terminal may be a thin client, a thick client, a handheld or laptop device, a microprocessor-based system, a set-top box, a programmable consumer electronic product, a network personal computer, a small computer system, and so on; the server may be a server computer system, a small computer system, a large computer system, a distributed cloud computing environment including any of the above systems, and so on.
Electronic devices such as terminals and servers may be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, target programs, components, logic, data structures, and so on, which perform specific tasks or implement specific abstract data types. The computer system/server may be implemented in a distributed cloud computing environment in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media including storage devices.
In the related art, for attribute detection of targets such as lane lines, target classification methods and semantic segmentation methods are generally used. A target classification method extracts the region where the target is located from the image, inputs the image of that region into a target classification network, and obtains the target's attributes through classification; its main problem is that the target occupies a small image region with low distinguishability, so the target classification network can hardly learn useful discriminative features, resulting in low accuracy of the recognized attributes. A semantic segmentation method predicts the attribute of every pixel of the target in the image and then decides the attribute of the whole target by taking the mode, that is, among the attributes of the target's pixels, the attribute that appears most often is taken as the attribute of the whole target; its main problem is that the target's attribute is a whole with respect to the entire target, and the semantic segmentation method breaks this whole relationship, again resulting in low accuracy of the recognized attributes.
In view of the above technical problems, some embodiments of the present application propose a target attribute detection method; the embodiments of the present application can be applied to scenarios such as image classification, lane line attribute recognition, and automatic driving.
FIG. 1 is a flowchart of a target attribute detection method according to an embodiment of the present application. As shown in FIG. 1, the process may include:
Step 101: Perform semantic segmentation on the image to be processed to determine a mask map of the image to be processed, the mask map representing the position of the target in the image to be processed.
Here, the image to be processed is an image on which target attribute recognition needs to be performed; for example, the target in the image to be processed may be a lane line or another target.
Exemplarily, the image to be processed may be obtained from a local storage area or from a network, and its format may be Joint Photographic Experts Group (JPEG), Bitmap (BMP), Portable Network Graphics (PNG), or another format; it should be noted that the format and source of the image to be processed are merely examples here, and the embodiments of the present invention do not limit them.
The embodiments of the present application do not limit the number of targets in the image to be processed; there may be one or more targets. Exemplarily, when the target is a lane line, there may be multiple lane lines in the image to be processed.
Obviously, when there are multiple targets in the image to be processed, the mask map obtained in step 101 represents the position of each target in the image to be processed.
In practical applications, the image to be processed may be input into a trained semantic segmentation network, in which the mask map of the image to be processed is extracted from the image to be processed.
Step 102: According to the mask map, determine the attribute features belonging to the target in the attribute feature map of the image to be processed; the attribute feature map of the image to be processed represents the attributes of the image to be processed.
In the embodiments of the present application, the attributes of the image to be processed may represent characteristics of the image such as color, texture, and surface roughness, and may be derived from the attributes of each pixel of the image; the attribute of a pixel may represent information such as the pixel's color. Likewise, the attribute features of the target may represent characteristics such as the target's color, texture, and surface roughness. Exemplarily, the attribute features of the target may be expressed as a feature map with a set number of channels, which may be set according to the effect of target attribute recognition; for example, the number of channels is set to 5, 6, or 7.
Obviously, since the mask map can represent the position of the target in the image to be processed, the attribute features belonging to the target in the attribute feature map of the image to be processed can be determined according to the mask map.
Step 103: Determine the attributes of the target according to the attribute features of the target.
Here, the attributes of the target may represent information such as the target's color, size, and shape in the image to be processed; for example, when the target is a lane line, the lane line's attributes may represent its color, line width, line type, and so on.
It can be seen that, when there are multiple targets in the image to be processed, the attributes of each target can be obtained by performing step 103.
In practical applications, the attribute features of the target may be input into a trained target classification network, which classifies the attribute features to obtain the attributes of the target in the image to be processed.
In practical applications, steps 101 to 103 may be implemented by a processor in an electronic device; the processor may be at least one of an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a Central Processing Unit (CPU), a controller, a microcontroller, and a microprocessor.
In the embodiments of the present application, a semantic segmentation method is first used to obtain the mask map of the image to be processed; the attribute features of the target are then determined according to the mask map, and the attributes of the target are determined in turn. It can be seen that the target attribute detection method provided by the embodiments of the present application divides target attribute detection into two steps: first, the position of the target is determined from the image to be processed; then, the attribute features of the target are determined based on the target's position combined with the attribute feature map of the image, and the attributes of the target are determined from those attribute features. Compared with determining the features of the target's region from the pixels at the target's position and classifying the target on those features, this first avoids the feature extraction otherwise required for classification, and the extracted attribute features are more discriminative, so the target's category can be determined more accurately. In addition, the method classifies the target as a whole; compared with a scheme that performs target attribute detection only through semantic segmentation, detecting the attributes from the target as a whole allows the target's attributes to be obtained accurately.
As for the implementation of determining the attributes of the target according to the attribute features of the target, exemplarily, the attribute features of the target may be converted into a feature of a preset length, and the attributes of the target may be determined according to the converted attribute feature of the preset length; in the embodiments of the present application, the preset length may be set according to the actual application scenario.
Specifically, there may be one or more targets in the image to be processed; the attribute features of the one or more targets are converted into features of the preset length, and target classification is performed on the one or more targets according to the preset-length features to obtain the attributes of the one or more targets.
In addition, when there are multiple targets in the image to be processed and the attribute features of the multiple targets have the same length, target classification may be performed on the multiple targets directly according to their attribute features to obtain the attributes of the multiple targets.
Understandably, some target classification methods need to be implemented on the basis of features of the same length; therefore, in the embodiments of the present application, converting attribute features of indefinite length into features of a preset length facilitates target classification and, in turn, obtaining the targets' attributes.
As for the implementation of converting the attribute features of the target into a feature of a preset length, exemplarily, the points corresponding to the attribute features of the target may be divided into k parts, where k is an integer greater than or equal to 1; the average of the target's attribute features corresponding to the points in each part is calculated to obtain k averages; the above steps are repeated n times, with a different value of k in any two executions, k smaller than the maximum possible number of points corresponding to the target's attribute features, and n an integer greater than 1; the obtained averages are used to form a feature of the preset length.
Specifically, the attribute features of each target are divided n times; the i-th division of the pixel points of the target's attribute features yields K_i parts, where i ranges from 1 to n and K_i denotes the value of k in the i-th division. In the embodiments of the present application, the K_i parts may or may not be of equal length. Uniform pooling is performed on each of the K_i parts obtained in the i-th division to obtain the average of the target's attribute features corresponding to the points in each part; the features of length K_1 through length K_n can then be concatenated to obtain a feature of length P, where

    P = K_1 + K_2 + ... + K_n

and P denotes the preset length.
In the embodiments of the present application, k may be set according to the actual situation; exemplarily, if the maximum possible number of pixel points of the target's attribute features is 30, the value of k is less than or equal to 30.
In a specific example, the target attribute detection method is a lane line attribute detection method, the image to be processed is a road image, and the target is a lane line. In this case, feature extraction may be performed on the road image to determine a region feature map of the road image and an attribute feature map of the road image; a mask map of the lane lines in the road image is determined according to the region feature map of the road image; the attribute features belonging to the lane lines in the attribute feature map of the road image are determined according to the mask map of the lane lines in the road image; and the attributes of the lane lines are determined according to the attribute features of the lane lines.
In the embodiments of the present application, the region feature map of the road image represents the positions of the lane lines in the road image; therefore, the mask map of the lane lines in the road image can be obtained according to the region feature map of the road image.
FIG. 2 is a flowchart of lane line attribute detection according to an embodiment of the present application. Referring to FIG. 2, in the embodiments of the present application, the road image may be input into a trained semantic segmentation network, with which a lane line segmentation result can be obtained; in FIG. 2, the lane line segmentation result may be expressed as the region feature map of the road image. The semantic segmentation network also yields the attribute feature map of the road image. In this way, the mask map of the lane lines can be obtained from the region feature map of the road image, and the attribute features belonging to the lane lines in the attribute feature map of the road image can be obtained from the mask map of the lane lines and the attribute feature map of the road image.
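The FIG. 2 pipeline can be sketched as follows; this is a minimal sketch assuming the region feature map is reduced to a per-pixel lane id map (0 for background) and that `segment` and `classify` stand for the trained segmentation and classification networks, none of which are specified in this form by the description:

```python
def lane_masks(region_map):
    """Group pixel positions by lane id; id 0 marks background."""
    masks = {}
    for position, lane_id in region_map.items():
        if lane_id != 0:
            masks.setdefault(lane_id, []).append(position)
    return masks

def detect_lane_attributes(road_image, segment, classify):
    """FIG. 2 pipeline: the segmentation network yields a region map
    (per-pixel lane id) and an attribute feature map; the attribute
    features under each lane's mask are then classified."""
    region_map, attribute_map = segment(road_image)
    return {
        lane_id: classify([attribute_map[p] for p in positions])
        for lane_id, positions in lane_masks(region_map).items()
    }
```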
In practical applications, the lengths and angles of lane lines are usually different; therefore, the lengths of the attribute features of the individual lane lines obtained in the embodiments of the present application are usually different. When the target classification process needs to be implemented on the basis of features of the same length, the attribute features of lane lines of different lengths may first be converted into features of the same length.
In a specific implementation, referring to FIG. 2, the attribute features of each lane line may be input into a fixed-length feature extraction module, which may be configured to perform the following steps: divide the points corresponding to each lane line's attribute features into k parts, where k is an integer greater than or equal to 1; calculate the average of the lane line's attribute features corresponding to the points in each part to obtain k averages; repeat the above steps n times, with a different value of k in any two executions, k smaller than the maximum possible number of points corresponding to the target's attribute features, and n an integer greater than 1; and use the obtained averages to form a feature of the preset length.
In a specific example, the attribute features of one lane line may first be pooled directly to obtain 1 feature value; the pixels of that lane line's attribute features are then divided six further times: into two parts, averaging the pixel values of each part to obtain 2 feature values; into three parts to obtain 3 feature values; into six parts to obtain 6 feature values; into eight parts to obtain 8 feature values; into ten parts to obtain 10 feature values; and into twelve parts to obtain 12 feature values. The resulting 1, 2, 3, 6, 8, 10, and 12 feature values are combined to obtain a feature of length 42, that is, the preset-length feature of that lane line.
That is to say, for the attribute features of each lane line, a pixel attribute feature of the same length (42 in each case) can be obtained through the fixed-length feature extraction module.
Referring to FIG. 2, after the fixed-length feature extraction module has converted the attribute features of the individual lane lines into features of the same length, those features may be input into a trained target classification network, which classifies the input features to obtain the attributes of each lane line.
In a specific example, the target classification network may include two fully connected layers, where the input of the first fully connected layer is the pixel attribute features of the same length (for example, length 42) corresponding to each target; the first fully connected layer has 256 nodes, the second fully connected layer has 128 nodes, and the second fully connected layer outputs the attributes of each target.
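The two fully connected layers described above can be sketched as follows; this is a minimal illustrative sketch in which the weight and bias values are placeholders to be supplied by training, and a ReLU activation is assumed for simplicity (the description does not specify the activation):

```python
def dense(x, weights, biases):
    """One fully connected layer with a ReLU: y = max(0, W x + b)."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]

def classify_lane_feature(feature_42, w1, b1, w2, b2):
    """Two fully connected layers as described: the 42-dimensional
    fixed-length feature feeds a 256-node layer, then a 128-node
    layer whose output encodes the attribute prediction."""
    return dense(dense(feature_42, w1, b1), w2, b2)
```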
It should be noted that FIG. 2 merely illustrates one specific application scenario of target attribute detection, namely lane line attribute recognition; the embodiments of the present application are not limited to performing attribute detection on lane lines, and attribute detection can also be performed on other types of targets.
Further, after the attributes of the lane lines are determined, the lane lines in the road image can be determined according to the road image, the determined mask map of the lane lines in the road image, and the determined attribute features of the lane lines.
Understandably, determining the lane lines of the road image helps assist vehicle driving and improves driving safety.
In some embodiments of the present application, the above target attribute detection method is executed by a neural network, and the neural network is trained using sample images, annotated mask maps of the sample images, and annotated attributes of the targets in the sample images.
Here, a sample image is a predetermined image containing a target; for example, the target may be a lane line or another target.
Exemplarily, the sample image may be obtained from a local storage area or from a network, and its format may be JPEG, BMP, PNG, or another format; it should be noted that the format and source of the sample image are merely examples here, and the embodiments of the present invention do not limit them.
The embodiments of the present application do not limit the number of targets in the sample image; there may be one or more targets. Exemplarily, when the target is a lane line, there may be multiple lane lines in the sample image.
In practical applications, the annotated mask map of the sample image may be set in advance. Obviously, since the annotated mask map of the sample image represents the position of the target in the sample image, the attribute features belonging to the target in the attribute feature map of the sample image can be determined according to the annotated mask map; this, in turn, helps the trained neural network determine the target's attribute features and, further, determine the target's attributes according to those attribute features.
The training process of the above neural network is exemplarily described below with reference to the drawings.
FIG. 3 is a flowchart of a neural network training method according to an embodiment of the present application. As shown in FIG. 3, the process may include:
Step 301: Determine, according to the annotated mask map of the sample image, the attribute features belonging to the target in the attribute feature map of the sample image; the annotated mask map represents the position of the target in the sample image; the attribute feature map of the sample image represents the attributes of the sample image.
In the embodiments of the present application, the attributes of the sample image may represent characteristics of the image such as color, texture, and surface roughness, and may be derived from the attributes of each pixel of the sample image; the attribute of a pixel may represent information such as the pixel's color. Likewise, the attribute features of the target may represent characteristics such as the target's color, texture, and surface roughness. Exemplarily, the attribute features of the target may be expressed as a feature map with a set number of channels, which may be set according to the effect of target attribute recognition; for example, the number of channels is set to 5, 6, or 7.
Step 302: Determine the attributes of the target according to the attribute features of the target.
The implementation of this step has already been explained in the foregoing description and is not repeated here.
Step 303: Adjust the network parameter values of the neural network according to the difference between the determined attributes of the target and the annotated attributes of the target, and the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation of the sample image.
As for the implementation of this step, exemplarily, the loss of the initial neural network may be calculated according to the difference between the determined attributes of the target and the annotated attributes of the target, as well as the difference between the annotated mask map and the mask map of the sample image determined after semantic segmentation; the network parameters of the neural network are then adjusted according to the loss.
Step 304: Judge whether the processing of the sample image by the neural network after the network parameter adjustment meets the preset requirement; if not, repeat steps 301 to 304; if so, execute step 305.
In some embodiments of the present application, the preset requirement may be that the loss of the neural network after the network parameter adjustment is less than a set loss value; in the embodiments of the present application, the set loss value may be preset according to actual needs.
Step 305: Use the neural network after the network parameter adjustment as the trained neural network.
In practical applications, steps 301 to 305 may be implemented by a processor in an electronic device; the processor may be at least one of an ASIC, a DSP, a DSPD, a PLD, an FPGA, a CPU, a controller, a microcontroller, and a microprocessor.
在一个具体的实施方式中,上述神经网络用于车道线属性检测,样本图像为道路样本图像,目标为车道线;如此,首先,可以根据所述道路样本图像的标注的掩模图,确定所述道路样本图像的属性特征图中属于所述车道线的属性特征,所述道路样本图像的标注的掩模图表征所述车道线在所述道路样本图像中的位置;然后,可以根据所述车道线的属性特征,确定所述车道线的属性;最后,可以根据确定的所述车道线的属性和所述车道线的标注的属性之间的差异,以及所述道路样本图像的标注的掩模图和根据所述道路样本图像的区域特征图确定的所述车道线的掩模图(可以通过语义分割网络来检测车道线的掩模图)之间的差异,调整所述神经网络的网络参数值。
可以看出,在上述神经网络训练过程中,首先根据标注的掩模图确定目标的属性特征,进而确定目标的属性,由于在训练阶段神经网络中的分割网络还没有训练好,使用没有训练好的网络来预测车道线的掩模图会导致后续的神经网络中的分类网络无法收敛,因此,在训练阶段是用标注的掩模图来确定目标的属性特征。
在上述神经网络训练过程中,在确定目标属性时也分为两个步骤:首先根据标注的掩模图确定目标的属性特征,然后根据目标的属性特征确定目标的属性。相比于根据目标在待处理图像中的位置处的像素确定目标所在区域的特征、再根据确定的特征对目标进行分类的方案来说,这种方式可以提取到更多更具有判别性的属性特征,从而更好地学习分类,使得训练完成的神经网络检测目标的准确性更高;另外,在神经网络的训练过程中,在确定目标的属性时也是对目标的整体进行分类,与仅通过语义分割进行目标属性检测的方案相比,从目标的整体进行目标属性的检测,可以更准确地得出目标属性,这同样可以使得训练完成的神经网络检测目标的准确率更高。
在前述实施例提出的目标属性检测方法的基础上,本申请实施例还提出了一种智能行驶方法,可以应用于智能行驶设备中,这里,智能行驶设备包括但不限于自动驾驶车辆、装有高级驾驶辅助系统(Advanced Driving Assistant System,ADAS)的车辆、装有ADAS的机器人等。
图4为本申请实施例的智能行驶方法的流程图,如图4所示,该流程可以包括:
步骤401:在目标属性检测方法为车道线属性检测方法,且待处理图像为道路图像的情况下,利用上述任意一种目标属性检测方法检测智能行驶设备获取的道路图像中的车道线属性。
这里,车道线属性包括但不限于线的类型、线的颜色、线宽等,线的类型可以是单线、双线、实线或虚线;线的颜色可以是白色、黄色或蓝色,或者两种颜色的组合等。
步骤402:根据检测到的车道线属性,指示智能行驶设备在所述道路图像对应的道路上行驶。
在实际应用中,可以直接控制智能行驶设备(例如自动驾驶车辆以及机器人)行驶,也可以向驾驶员发送指令,由驾驶员来控制车辆(例如装有ADAS的车辆)行驶。
可以看出,基于车道线属性检测方法,可以得出车道线属性,有利于对车辆驾驶提供帮助,提高车辆驾驶的安全性。
本领域技术人员可以理解,在具体实施方式的上述方法中,各步骤的撰写顺序并不意味着严格的执行顺序而对实施过程构成任何限定,各步骤的具体执行顺序应当以其功能和可能的内在逻辑确定。
在前述实施例提出的目标属性检测方法的基础上,本申请实施例提出了一种目标属性检测装置。
图5为本申请实施例的目标属性检测装置的组成结构示意图,如图5所示,所述装置包括:第一处理模块501、第二处理模块502和第三处理模块503,其中,
第一处理模块501,配置为对待处理图像进行语义分割,确定所述待处理图像的掩模图,所述掩模图表征所述待处理图像中的目标的位置;
第二处理模块502,配置为根据所述掩模图,确定所述待处理图像的属性特征图中属于所述目标的属性特征;所述待处理图像的属性特征图表征所述待处理图像的属性;
第三处理模块503,配置为根据所述目标的属性特征,确定所述目标的属性。
本申请的一些实施例中,所述第三处理模块503,配置为:将所述目标的属性特征转化为预设长度的特征;根据转化后的预设长度的所述目标的属性特征,确定所述目标的属性。
本申请的一些实施例中,所述第三处理模块503,配置为在将所述目标的属性特征转化为预设长度的特征方面,用于:将所述目标的属性特征对应的点分为k份;计算每一份中的点对应的所述目标的属性特征的平均值,得到k个平均值;重复执行上述步骤n次,且任意两次执行的过程中k的取值不同,且k小于目标的属性特征对应的点的可能的最大数量,n为大于1的整数;利用得到的平均值构成预设长度的特征。
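上述把不定长属性特征转化为预设长度特征的过程,与多尺度池化的思路类似,可以用如下示意性代码说明(k的具体取值为假设,这里取k∈{2, 4, 8, 12, 16},拼接后长度恰为前文两层全连接示例中的42):

```python
import numpy as np

# 示意性实现(假设):对目标属性特征对应的、数量不定的点,
# 分别按 n 个互不相同的 k 值分成 k 份并对每一份取平均值,
# 再把所有平均值拼接成定长特征。按文中要求,各 k 值互不相同,
# 且 k 应小于点的可能的最大数量。
def to_fixed_length(points: np.ndarray, ks=(2, 4, 8, 12, 16)) -> np.ndarray:
    """points: 形状 (N,) 的目标属性特征序列,N 可变;返回定长特征。"""
    out = []
    for k in ks:
        # 将 N 个点近似均分为 k 份,对每一份取平均值,得到 k 个平均值
        for part in np.array_split(points, k):
            out.append(part.mean())
    return np.asarray(out)          # 长度为 sum(ks),与 N 无关

feature = to_fixed_length(np.linspace(0.0, 1.0, 57))   # N=57 仅为演示
```

对于任意点数N(只要N不小于最大的k),该函数都会输出长度相同的特征,从而可以送入输入长度固定的分类网络。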
本申请的一些实施例中,所述目标属性检测装置为车道线属性检测装置,所述待处理图像为道路图像;
所述第一处理模块501,配置为:对所述道路图像进行特征提取,确定所述道路图像的区域特征图以及所述道路图像的属性特征图;根据所述道路图像的区域特征图,确定所述道路图像中的车道线的掩模图;
所述第二处理模块502,配置为:根据所述道路图像中的车道线的掩模图,确定所述道路图像的属性特征图中属于车道线的属性特征;
所述第三处理模块503,配置为:根据所述车道线的属性特征,确定所述车道线的属性。
本申请的一些实施例中,所述第三处理模块503,还配置为在确定所述车道线的属性之后,根据所述道路图像、确定的所述道路图像中的车道线的掩模图以及确定的所述车道线的属性,确定所述道路图像中的车道线。
本申请的一些实施例中,所述目标属性检测装置是基于神经网络实现的,所述神经网络采用样本图像、所述样本图像的标注的掩模图以及所述样本图像的目标的标注的属性训练得到。
实际应用中,第一处理模块501、第二处理模块502和第三处理模块503均可以利用电子设备中的处理器实现,上述处理器可以为ASIC、DSP、DSPD、PLD、FPGA、CPU、控制器、微控制器、微处理器中的至少一种。
图6为本申请实施例的神经网络训练装置的组成结构示意图,如图6所示,所述装置包括:第四处理模块601、第五处理模块602和调整模块603,其中,
第四处理模块601,配置为根据样本图像的标注的掩模图,确定所述样本图像的属性特征图中属于目标的属性特征;所述标注的掩模图表征所述目标在所述样本图像中的位置;所述样本图像的属性特征图表征所述样本图像的属性;
第五处理模块602,配置为根据所述目标的属性特征,确定所述目标的属性;
调整模块603,配置为根据确定的所述目标的属性和所述目标的标注的属性之间的差异,以及所述标注的掩模图和对所述样本图像进行语义分割后确定的所述样本图像的掩模图之间的差异,调整所述神经网络的网络参数值。
本申请的一些实施例中,所述第五处理模块602,配置为:将所述目标的属性特征转化为预设长度的特征;根据转化后的预设长度的所述目标的属性特征,确定所述目标的属性。
本申请的一些实施例中,所述第五处理模块602,配置为在将所述目标的属性特征转化为预设长度的特征方面,用于:将所述目标的属性特征对应的点分为k份;计算每一份中的点对应的所述目标的属性特征的平均值,得到k个平均值;重复执行上述步骤n次,且任意两次执行的过程中k的取值不同,且k小于目标的属性特征对应的点的可能的最大数量,n为大于1的整数;利用得到的平均值构成预设长度的特征。
本申请的一些实施例中,所述神经网络用于车道线属性检测,所述样本图像为道路样本图像,所述目标为车道线;
所述第四处理模块601,配置为:根据所述道路样本图像的标注的掩模图,确定所述道路样本图像的属性特征图中属于所述车道线的属性特征,所述道路样本图像的标注的掩模图表征所述车道线在所述道路样本图像中的位置;
所述第五处理模块602,配置为:根据所述车道线的属性特征,确定所述车道线的属性;
所述调整模块603,配置为:根据确定的所述车道线的属性和所述车道线的标注的属性之间的差异,以及所述道路样本图像的标注的掩模图和根据所述道路样本图像的区域特征图确定的所述车道线的掩模图之间的差异,调整所述神经网络的网络参数值。
实际应用中,第四处理模块601、第五处理模块602和调整模块603均可以利用电子设备中的处理器实现,上述处理器可以为ASIC、DSP、DSPD、PLD、FPGA、CPU、控制器、微控制器、微处理器中的至少一种。
图7为本申请实施例的智能行驶装置的组成结构示意图,如图7所示,所述装置包括:检测模块701和指示模块702,其中,
检测模块701,配置为在所述目标属性检测方法为车道线属性检测方法,且所述待处理图像为道路图像的情况下,利用上述任意一种目标属性检测方法,检测智能行驶设备获取的道路图像中的车道线属性;
指示模块702,配置为根据检测到的车道线属性,指示智能行驶设备在所述道路图像对应的道路上行驶。
实际应用中,检测模块701和指示模块702均可以利用智能行驶设备中的处理器实现,上述处理器可以为ASIC、DSP、DSPD、PLD、FPGA、CPU、控制器、微控制器、微处理器中的至少一种。
另外,在本实施例中的各功能模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。
所述集成的单元如果以软件功能模块的形式实现并非作为独立的产品进行销售或使用时,可以存储在一个计算机可读取存储介质中,基于这样的理解,本实施例的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)或processor(处理器)执行本实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
具体来讲,本实施例中的任意一种目标属性检测方法、神经网络训练方法或智能行驶方法对应的计算机程序指令可以被存储在光盘,硬盘,U盘等存储介质上,当存储介质中的与任意一种目标属性检测方法、神经网络训练方法或智能行驶方法对应的计算机程序指令被一电子设备读取或被执行时,实现前述实施例的任意一种目标属性检测方法或上述任意一种神经网络训练方法或上述任意一种智能行驶方法。
基于前述实施例相同的技术构思,参见图8,其示出了本申请实施例提供的一种电子设备80,可以包括:存储器81和处理器82;其中,
所述存储器81,配置为存储计算机程序和数据;
所述处理器82,配置为执行所述存储器中存储的计算机程序,以实现前述实施例的任意一种目标属性检测方法或上述任意一种神经网络训练方法或上述任意一种智能行驶方法。
在实际应用中,上述存储器81可以是易失性存储器(volatile memory),例如RAM;或者非易失性存储器(non-volatile memory),例如ROM,快闪存储器(flash memory),硬盘(Hard Disk Drive,HDD)或固态硬盘(Solid-State Drive,SSD);或者上述种类的存储器的组合,并向处理器82提供指令和数据。
上述处理器82可以为ASIC、DSP、DSPD、PLD、FPGA、CPU、控制器、微控制器、微处理器中的至少一种。可以理解地,对于不同的设备,用于实现上述处理器功能的电子器件还可以为其它,本申请实施例不作具体限定。
本申请实施例还提出了一种计算机程序,包括计算机可读代码,当所述计算机可读代码在电子设备中运行时,所述电子设备中的处理器执行用于实现上述任意一种目标属性检测方法或上述任意一种神经网络训练方法或上述任意一种智能行驶方法。
在一些实施例中,本申请实施例提供的装置具有的功能或包含的模块可以用于执行上文方法实施例描述的方法,其具体实现可以参照上文方法实施例的描述,为了简洁,这里不再赘述。
上文对各个实施例的描述倾向于强调各个实施例之间的不同之处,其相同或相似之处可以互相参考,为了简洁,本文不再赘述。
本申请所提供的各方法实施例中所揭露的方法,在不冲突的情况下可以任意组合,得到新的方法实施例。
本申请所提供的各产品实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的产品实施例。
本申请所提供的各方法或设备实施例中所揭露的特征,在不冲突的情况下可以任意组合,得到新的方法实施例或设备实施例。
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端(可以是手机,计算机,服务器,空调器,或者网络设备等)执行本发明各个实施例所述的方法。
上面结合附图对本发明的实施例进行了描述,但是本发明并不局限于上述的具体实施方式,上述的具体实施方式仅仅是示意性的,而不是限制性的,本领域的普通技术人员在本发明的启示下,在不脱离本发明宗旨和权利要求所保护的范围情况下,还可做出很多形式,这些均属于本发明的保护之内。
工业实用性
本申请实施例提供了一种目标属性检测方法、神经网络训练方法及智能行驶方法、装置、电子设备、计算机存储介质和计算机程序,该目标属性检测方法包括:对待处理图像进行语义分割,确定所述待处理图像的掩模图,所述掩模图表征所述待处理图像中的目标的位置;根据所述掩模图,确定所述待处理图像的属性特征图中属于所述目标的属性特征;所述待处理图像的属性特征图表征所述待处理图像的属性;根据所述目标的属性特征,确定所述目标的属性。如此,在本申请实施例中,由于无需提取图像中目标所在区域,而是可以根据通过语义分割得出的更加具有区分性的掩模图确定目标的属性特征,因而,可以提高目标属性检测的准确率。

Claims (21)

  1. 一种目标属性检测方法,所述方法包括:
    对待处理图像进行语义分割,确定所述待处理图像的掩模图,所述掩模图表征所述待处理图像中的目标的位置;
    根据所述掩模图,确定所述待处理图像的属性特征图中属于所述目标的属性特征;所述待处理图像的属性特征图表征所述待处理图像的属性;
    根据所述目标的属性特征,确定所述目标的属性。
  2. 根据权利要求1所述的方法,其中,所述根据所述目标的属性特征,确定所述目标的属性,包括:
    将所述目标的属性特征转化为预设长度的特征;
    根据转化后的预设长度的所述目标的属性特征,确定所述目标的属性。
  3. 根据权利要求2所述的方法,其中,所述将所述目标的属性特征转化为预设长度的特征,包括:
    将所述目标的属性特征对应的点分为k份;
    计算每一份中的点对应的所述目标的属性特征的平均值,得到k个平均值;
    重复执行上述步骤n次,且任意两次执行的过程中k的取值不同,且k小于目标的属性特征对应的点的可能的最大数量,n为大于1的整数;
    利用得到的平均值构成所述预设长度的特征。
  4. 根据权利要求1-3任一项所述的方法,其中,所述目标属性检测方法为车道线属性检测方法,所述待处理图像为道路图像;
    所述对待处理图像进行语义分割,确定所述待处理图像的掩模图,包括:
    对所述道路图像进行特征提取,确定所述道路图像的区域特征图以及所述道路图像的属性特征图;
    根据所述道路图像的区域特征图,确定所述道路图像中的车道线的掩模图;
    根据所述掩模图,确定所述待处理图像的属性特征图中属于所述目标的属性特征,包括:
    根据所述道路图像中的车道线的掩模图,确定所述道路图像的属性特征图中属于车道线的属性特征;
    根据所述目标的属性特征,确定所述目标的属性,包括:
    根据所述车道线的属性特征,确定所述车道线的属性。
  5. 根据权利要求4所述的方法,其中,在确定所述车道线的属性之后,所述方法还包括:
    根据所述道路图像、确定的所述道路图像中的车道线的掩模图以及确定的所述车道线的属性,确定所述道路图像中的车道线。
  6. 根据权利要求1-5任一所述的方法,其中,所述目标属性检测方法由神经网络执行,所述神经网络采用样本图像、所述样本图像的标注的掩模图以及所述样本图像的目标的标注的属性训练得到。
  7. 一种神经网络的训练方法,包括:
    根据样本图像的标注的掩模图,确定所述样本图像的属性特征图中属于目标的属性特征;所述标注的掩模图表征所述目标在所述样本图像中的位置;所述样本图像的属性特征图表征所述样本图像的属性;
    根据所述目标的属性特征,确定所述目标的属性;
    根据确定的所述目标的属性和所述目标的标注的属性之间的差异,以及所述标注的掩模图和对所述样本图像进行语义分割后确定的所述样本图像的掩模图之间的差异,调整所述神经网络的网络参数值。
  8. 根据权利要求7所述的方法,其中,所述根据所述目标的属性特征,确定所述目标的属性,包括:
    将所述目标的属性特征转化为预设长度的特征;
    根据转化后的预设长度的所述目标的属性特征,确定所述目标的属性。
  9. 根据权利要求8所述的方法,其中,所述将所述目标的属性特征转化为预设长度的特征,包括:
    将所述目标的属性特征对应的点分为k份;
    计算每一份中的点对应的所述目标的属性特征的平均值,得到k个平均值;
    重复执行上述步骤n次,且任意两次执行的过程中k的取值不同,且k小于目标的属性特征对应的点的可能的最大数量,n为大于1的整数;
    利用得到的平均值构成所述预设长度的特征。
  10. 根据权利要求7-9任一所述的方法,其中,所述神经网络用于车道线属性检测,所述样本图像为道路样本图像,所述目标为车道线;
    所述根据样本图像的标注的掩模图,确定所述样本图像的属性特征图中属于目标的属性特征,包括:
    根据所述道路样本图像的标注的掩模图,确定所述道路样本图像的属性特征图中属于所述车道线的属性特征,所述道路样本图像的标注的掩模图表征所述车道线在所述道路样本图像中的位置;
    所述根据所述目标的属性特征,确定所述目标的属性,包括:
    根据所述车道线的属性特征,确定所述车道线的属性;
    所述根据确定的所述目标的属性和所述目标的标注的属性之间的差异,以及所述标注的掩模图和对所述样本图像进行语义分割后确定的所述样本图像的掩模图之间的差异,调整所述神经网络的网络参数值,包括:
    根据确定的所述车道线的属性和所述车道线的标注的属性之间的差异,以及所述道路样本图像的标注的掩模图和根据所述道路样本图像的区域特征图确定的所述车道线的掩模图之间的差异,调整所述神经网络的网络参数值。
  11. 一种智能行驶方法,包括:
    利用权利要求4-6任一所述的方法,检测智能行驶设备获取的道路图像中的车道线属性;
    根据检测到的车道线属性,指示智能行驶设备在所述道路图像对应的道路上行驶。
  12. 一种目标属性检测装置,所述装置包括第一处理模块、第二处理模块和第三处理模块,其中,
    第一处理模块,配置为对待处理图像进行语义分割,确定所述待处理图像的掩模图,所述掩模图表征所述待处理图像中的目标的位置;
    第二处理模块,配置为根据所述掩模图,确定所述待处理图像的属性特征图中属于所述目标的属性特征;所述待处理图像的属性特征图表征所述待处理图像的属性;
    第三处理模块,配置为根据所述目标的属性特征,确定所述目标的属性。
  13. 根据权利要求12所述的装置,其中,所述第三处理模块,配置为:将所述目标的属性特征转化为预设长度的特征;根据转化后的预设长度的所述目标的属性特征,确定所述目标的属性。
  14. 根据权利要求13所述的装置,其中,所述第三处理模块,配置为在将所述目标的属性特征转化为预设长度的特征方面,用于:
    将所述目标的属性特征对应的点分为k份;计算每一份中的点对应的所述目标的属性特征的平均值,得到k个平均值;重复执行上述步骤n次,且任意两次执行的过程中k的取值不同,且k小于目标的属性特征对应的点的可能的最大数量,n为大于1的整数;利用得到的平均值构成所述预设长度的特征。
  15. 根据权利要求12至14任一项所述的装置,其中,所述目标属性检测装置为车道线属性检测装置,所述待处理图像为道路图像;
    所述第一处理模块,配置为:对所述道路图像进行特征提取,确定所述道路图像的区域特征图以及所述道路图像的属性特征图;根据所述道路图像的区域特征图,确定所述道路图像中的车道线的掩模图;
    所述第二处理模块,配置为:根据所述道路图像中的车道线的掩模图,确定所述道路图像的属性特征图中属于车道线的属性特征;
    所述第三处理模块,配置为:根据所述车道线的属性特征,确定所述车道线的属性。
  16. 根据权利要求15所述的装置,其中,所述第三处理模块,还配置为在确定所述车道线的属性之后,根据所述道路图像、确定的所述道路图像中的车道线的掩模图以及确定的所述车道线的属性,确定所述道路图像中的车道线。
  17. 一种神经网络训练装置,其中,所述装置包括第四处理模块、第五处理模块和调整模块,其中,
    第四处理模块,配置为根据样本图像的标注的掩模图,确定所述样本图像的属性特征图中属于目标的属性特征;所述标注的掩模图表征所述目标在所述样本图像中的位置;所述样本图像的属性特征图表征所述样本图像的属性;
    第五处理模块,配置为根据所述目标的属性特征,确定所述目标的属性;
    调整模块,配置为根据确定的所述目标的属性和所述目标的标注的属性之间的差异,以及所述标注的掩模图和对所述样本图像进行语义分割后确定的所述样本图像的掩模图之间的差异,调整所述神经网络的网络参数值。
  18. 一种智能行驶装置,包括检测模块和指示模块,其中,
    检测模块,配置为利用权利要求4-6任一所述的方法,检测智能行驶设备获取的道路图像中的车道线属性;
    指示模块,配置为根据检测到的车道线属性,指示智能行驶设备在所述道路图像对应的道路上行驶。
  19. 一种电子设备,包括处理器和配置为存储能够在处理器上运行的计算机程序的存储器;其中,
    所述处理器配置为运行所述计算机程序时,执行权利要求1至6任一项所述的目标属性检测方法或权利要求7至10任一项所述的神经网络训练方法或权利要求11所述的智能行驶方法。
  20. 一种计算机存储介质,其上存储有计算机程序,其中,该计算机程序被处理器执行时实现权利要求1至6任一项所述的目标属性检测方法或权利要求7至10任一项所述的神经网络训练方法或权利要求11所述的智能行驶方法。
  21. 一种计算机程序,包括计算机可读代码,当所述计算机可读代码在电子设备中运行时,所述电子设备中的处理器执行用于实现权利要求1至6任一项所述的目标属性检测方法或权利要求7至10任一项所述的神经网络训练方法或权利要求11所述的智能行驶方法。
PCT/CN2020/114109 2019-11-07 2020-09-08 目标属性检测、神经网络训练及智能行驶方法、装置 WO2021088505A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217016723A KR20210087496A (ko) 2019-11-07 2020-09-08 객체 속성 검출, 신경망 훈련 및 지능형 주행 방법, 장치
JP2021533200A JP2022513781A (ja) 2019-11-07 2020-09-08 ターゲット属性検出、ニューラルネットワークトレーニング及びインテリジェント走行方法、装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911081216.4A CN112785595B (zh) 2019-11-07 2019-11-07 目标属性检测、神经网络训练及智能行驶方法、装置
CN201911081216.4 2019-11-07

Publications (1)

Publication Number Publication Date
WO2021088505A1 true WO2021088505A1 (zh) 2021-05-14

Family

ID=75747824

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/114109 WO2021088505A1 (zh) 2019-11-07 2020-09-08 目标属性检测、神经网络训练及智能行驶方法、装置

Country Status (4)

Country Link
JP (1) JP2022513781A (zh)
KR (1) KR20210087496A (zh)
CN (1) CN112785595B (zh)
WO (1) WO2021088505A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115661556A (zh) * 2022-10-20 2023-01-31 南京领行科技股份有限公司 一种图像处理方法、装置、电子设备及存储介质

Families Citing this family (1)

Publication number Priority date Publication date Assignee Title
WO2023146286A1 (ko) * 2022-01-28 2023-08-03 삼성전자 주식회사 이미지의 화질을 개선하기 위한 전자 장치 및 방법

Citations (5)

Publication number Priority date Publication date Assignee Title
CN105718870A (zh) * 2016-01-15 2016-06-29 武汉光庭科技有限公司 自动驾驶中基于前向摄像头的道路标线提取方法
CN107729880A (zh) * 2017-11-15 2018-02-23 北京小米移动软件有限公司 人脸检测方法及装置
CN108764137A (zh) * 2018-05-29 2018-11-06 福州大学 基于语义分割的车辆行驶车道定位方法
CN110163069A (zh) * 2019-01-04 2019-08-23 深圳市布谷鸟科技有限公司 用于辅助驾驶的车道线检测方法
US20190311202A1 (en) * 2018-04-10 2019-10-10 Adobe Inc. Video object segmentation by reference-guided mask propagation

Family Cites Families (11)

Publication number Priority date Publication date Assignee Title
JP3352655B2 (ja) * 1999-09-22 2002-12-03 富士重工業株式会社 車線認識装置
JP4292250B2 (ja) * 2004-07-02 2009-07-08 トヨタ自動車株式会社 道路環境認識方法及び道路環境認識装置
JP5664152B2 (ja) * 2009-12-25 2015-02-04 株式会社リコー 撮像装置、車載用撮像システム及び物体識別装置
JP6569280B2 (ja) * 2015-04-15 2019-09-04 日産自動車株式会社 路面標示検出装置及び路面標示検出方法
CN105260699B (zh) * 2015-09-10 2018-06-26 百度在线网络技术(北京)有限公司 一种车道线数据的处理方法及装置
CN105956122A (zh) * 2016-05-03 2016-09-21 无锡雅座在线科技发展有限公司 对象属性的确定方法和装置
JP6802756B2 (ja) * 2017-05-18 2020-12-16 株式会社デンソーアイティーラボラトリ 認識システム、共通特徴量抽出ユニット、及び認識システム構成方法
US10679351B2 (en) * 2017-08-18 2020-06-09 Samsung Electronics Co., Ltd. System and method for semantic segmentation of images
CN108229386B (zh) * 2017-12-29 2021-12-14 百度在线网络技术(北京)有限公司 用于检测车道线的方法、装置和介质
KR102541561B1 (ko) * 2018-02-12 2023-06-08 삼성전자주식회사 차량의 주행을 위한 정보를 제공하는 방법 및 그 장치들
CN110414428A (zh) * 2019-07-26 2019-11-05 厦门美图之家科技有限公司 一种生成人脸属性信息识别模型的方法

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN105718870A (zh) * 2016-01-15 2016-06-29 武汉光庭科技有限公司 自动驾驶中基于前向摄像头的道路标线提取方法
CN107729880A (zh) * 2017-11-15 2018-02-23 北京小米移动软件有限公司 人脸检测方法及装置
US20190311202A1 (en) * 2018-04-10 2019-10-10 Adobe Inc. Video object segmentation by reference-guided mask propagation
CN108764137A (zh) * 2018-05-29 2018-11-06 福州大学 基于语义分割的车辆行驶车道定位方法
CN110163069A (zh) * 2019-01-04 2019-08-23 深圳市布谷鸟科技有限公司 用于辅助驾驶的车道线检测方法

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN115661556A (zh) * 2022-10-20 2023-01-31 南京领行科技股份有限公司 一种图像处理方法、装置、电子设备及存储介质
CN115661556B (zh) * 2022-10-20 2024-04-12 南京领行科技股份有限公司 一种图像处理方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN112785595A (zh) 2021-05-11
JP2022513781A (ja) 2022-02-09
CN112785595B (zh) 2023-02-28
KR20210087496A (ko) 2021-07-12

Similar Documents

Publication Publication Date Title
CN108038474B (zh) 人脸检测方法、卷积神经网络参数的训练方法、装置及介质
US20230087526A1 (en) Neural network training method, image classification system, and related device
WO2023138300A1 (zh) 目标检测方法及应用其的移动目标跟踪方法
CN108805170B (zh) 形成用于全监督式学习的数据集
US11416710B2 (en) Feature representation device, feature representation method, and program
CN107833213B (zh) 一种基于伪真值自适应法的弱监督物体检测方法
JP6897335B2 (ja) 学習プログラム、学習方法および物体検知装置
US8620026B2 (en) Video-based detection of multiple object types under varying poses
TW201926140A (zh) 影像標註方法、電子裝置及非暫態電腦可讀取儲存媒體
US20170032247A1 (en) Media classification
WO2022142855A1 (zh) 回环检测方法、装置、终端设备和可读存储介质
US9245206B2 (en) Image processing apparatus and image processing method
WO2021088505A1 (zh) 目标属性检测、神经网络训练及智能行驶方法、装置
WO2015042891A1 (zh) 图像语义分割的方法和装置
CN111583180B (zh) 一种图像的篡改识别方法、装置、计算机设备及存储介质
US20160162757A1 (en) Multi-class object classifying method and system
WO2021088504A1 (zh) 路口检测、神经网络训练及智能行驶方法、装置和设备
JP2023507248A (ja) 物体検出および認識のためのシステムおよび方法
US8630483B2 (en) Complex-object detection using a cascade of classifiers
WO2021000674A1 (zh) 细胞图片识别方法、系统、计算机设备及可读存储介质
CN112287905A (zh) 车辆损伤识别方法、装置、设备及存储介质
EP4332910A1 (en) Behavior detection method, electronic device, and computer readable storage medium
WO2021098346A1 (zh) 人体朝向检测方法、装置、电子设备和计算机存储介质
CN113269150A (zh) 基于深度学习的车辆多属性识别的系统及方法
CN112766128A (zh) 交通信号灯检测方法、装置和计算机设备

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 20217016723

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2021533200

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20884679

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20884679

Country of ref document: EP

Kind code of ref document: A1