WO2021174863A1 - Method for training vehicle model-year recognition model and method for recognizing vehicle model year - Google Patents


Info

Publication number
WO2021174863A1
Authority
WO
WIPO (PCT)
Prior art keywords
vehicle
interest
region
image
feature
Prior art date
Application number
PCT/CN2020/121514
Other languages
French (fr)
Chinese (zh)
Inventor
叶丹丹
晋兆龙
邹文艺
Original Assignee
苏州科达科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 苏州科达科技股份有限公司
Publication of WO2021174863A1 publication Critical patent/WO2021174863A1/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Definitions

  • The present invention relates to the technical field of vehicle recognition, and in particular to a method for training a vehicle model-year recognition model and a method for recognizing a vehicle model year.
  • Vehicles have become an indispensable means of transportation in modern life. Because vehicles serve as an important carrier of people and goods, and are frequently involved in unlawful activity, monitoring and recognizing vehicle information has become an important task for intelligent transportation and safe cities. Intelligent analysis of vehicle data can, on the one hand, facilitate traffic management, such as license plate recognition at parking lot checkpoints; on the other hand, it can effectively assist traffic control, such as capturing and recording information on illegal or wanted vehicles, and tracking vehicles involved in traffic accidents and crimes.
  • CNN: Convolutional Neural Network
  • ResNet: Deep Residual Network
  • The embodiments of the present invention provide a method for training a vehicle model-year recognition model and a method for recognizing a vehicle model year, so as to solve the problem of insufficient recognition accuracy.
  • An embodiment of the present invention provides a method for training a vehicle model-year recognition model, including:
  • fusing the region of interest with the vehicle sample image and inputting the result into a classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest;
  • wherein the vehicle model-year recognition model includes the feature extraction module and the classification module;
  • and updating the parameters of the feature extraction module and the classification module to optimize the vehicle model-year recognition model.
  • In the method for training a vehicle model-year recognition model provided by the embodiment of the present invention, at least two sets of features of the vehicle sample image are extracted by a feature extraction module, and the regions of interest corresponding to the at least two sets of features, together with their score values, are obtained;
  • the regions of interest fused with the vehicle sample image are input into the classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest; and based on the label information of the vehicle sample image and the loss function value, the parameters of the feature extraction module and the classification module are updated to optimize the vehicle model-year recognition model.
  • The method extracts at least two sets of features, inputs the regions of interest together with the vehicle sample image into the classification module, and optimizes the recognition model according to the loss function. This not only enriches the hierarchy of the extracted features, but also updates the parameters of the recognition model with the corresponding loss function, thereby improving recognition accuracy.
  • Obtaining the region of interest corresponding to each set of features and its score value based on the at least two sets of features includes:
  • generating a region of interest corresponding to each set of features, together with its score value.
  • The training method provided by the embodiment of the present invention uses each set of features to generate a plurality of candidate regions corresponding to that set of features, and generates, based on the multiple candidate regions, the region of interest corresponding to each set of features and its score value. This can accurately screen out the region of interest corresponding to each set of features, which provides a basis for subsequent training.
  • Generating a region of interest corresponding to each set of features and its score value based on the multiple candidate regions includes:
  • The training method determines the region of interest as the candidate region with the highest score value by calculating the score value of each candidate region, which further improves the accuracy of the region of interest and provides a basis for subsequent training.
  • Fusing the region of interest with the vehicle sample image and inputting the result into a classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest includes:
  • inputting the fusion of the region of interest and the vehicle sample image into a classification module, wherein the output of the classification module is the model-year classification of the vehicle sample image;
  • and segmenting the overall feature of the vehicle sample image to obtain the whole-image classification feature of the sample image and the features of the regions of interest.
  • The region of interest and the vehicle sample image are fused and then input into a classification module to obtain the model-year classification of the vehicle sample image, wherein the region of interest is the candidate region with the highest score for each set of features.
  • Fusing the vehicle sample image with the highest-scoring candidate region of each set of features before inputting it into the classification module can improve the accuracy of the classification module and shorten the classification time.
  • Calculating the loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the feature obtained by fusing the whole image with the regions of interest, and the score value of each region of interest includes:
  • calculating the loss function value as a weighted sum of the following terms:
  • Loss 1, the component loss function;
  • Loss 2, the fusion loss function;
  • Loss 3, the whole-image loss function; and the level loss functions corresponding to each region of interest.
  • The training method provided by the embodiment of the present invention calculates these loss functions and the level loss functions using the fusion feature, all the features of the regions of interest, and the classification of the sample image, and sums them with certain weights. The resulting loss function value can accurately reflect the gap between the classification produced by the vehicle model-year recognition model and the actual classification; through this gap, the parameters of the model can be further optimized, which further improves the classification accuracy of the model.
  • An embodiment of the present invention provides a method for recognizing the model year of a vehicle, including:
  • inputting the target vehicle image into the vehicle model-year recognition model to obtain the model year of the target vehicle image; wherein the vehicle model-year recognition model is obtained by the training method according to any one of claims 1-6.
  • The model year of the target vehicle image is obtained by inputting the target vehicle image into the vehicle model-year recognition model for classification, wherein the vehicle model-year recognition model is jointly trained, with its parameters optimized against the loss function value, using the sample images and at least two sets of features of each sample image, which ensures the accuracy of model-year recognition for the target vehicle image.
  • An embodiment of the present invention provides a training device for a vehicle model-year recognition model, including:
  • a first acquisition module, configured to acquire a vehicle sample image with label information, wherein the label information includes the vehicle brand and model year in the vehicle sample image;
  • a first feature extraction module, configured to input the vehicle sample image into the feature extraction module to obtain at least two sets of features of the vehicle sample image;
  • a scoring module, configured to obtain, based on the at least two sets of features, the region of interest corresponding to each set of features and its score value;
  • a second feature extraction module, configured to fuse the region of interest with the vehicle sample image and input the result into the classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest;
  • wherein the vehicle model-year recognition model includes the feature extraction module and the classification module;
  • a calculation module, configured to calculate the loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the feature obtained by fusing the whole image with the regions of interest, and the score value of each region of interest;
  • a parameter optimization module, configured to update the parameters of the feature extraction module and the classification module based on the label information of the vehicle sample image and the loss function value, so as to optimize the vehicle model-year recognition model.
  • In the training device for a vehicle model-year recognition model provided by the embodiment of the present invention, at least two sets of features of the vehicle sample image are extracted by a feature extraction module, and the regions of interest corresponding to the at least two sets of features, together with their score values, are obtained;
  • the regions of interest fused with the vehicle sample image are input into the classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest;
  • the parameters of the feature extraction module and the classification module are updated to optimize the vehicle model-year recognition model.
  • The device extracts at least two sets of features, inputs the regions of interest together with the vehicle sample image into the classification module, and optimizes the recognition model according to the loss function. This not only enriches the hierarchy of the extracted features, but also updates the parameters of the recognition model with the corresponding loss function, thereby improving recognition accuracy.
  • An embodiment of the present invention provides a vehicle model-year recognition device, including:
  • a second acquisition module, used to acquire the target vehicle image;
  • a recognition module, used to input the target vehicle image into a vehicle model-year recognition model to obtain the model year of the target vehicle image; wherein the vehicle model-year recognition model is obtained by training with the training method described in the first aspect or any one of its implementations.
  • The model year of the target vehicle image is obtained by inputting the target vehicle image into the vehicle model-year recognition model for classification, wherein the vehicle model-year recognition model is jointly trained, with its parameters optimized against the loss function value, using the sample images and at least two sets of features of each sample image, which ensures the accuracy of model-year recognition for the target vehicle image.
  • An embodiment of the present invention provides an electronic device, including:
  • a memory and a processor, the memory and the processor being communicatively connected to each other, wherein computer instructions are stored in the memory, and the processor executes the computer instructions to perform the method described in the first aspect or any one of its implementations.
  • Fig. 1 is a flowchart of a method for training a vehicle model-year recognition model according to an embodiment of the present invention;
  • Fig. 2 is a complete flowchart of a method for training a vehicle model-year recognition model according to an embodiment of the present invention;
  • Fig. 3 is a flowchart of a method for recognizing the model year of a vehicle according to an embodiment of the present invention;
  • Fig. 4 is a structural block diagram of a training device for a vehicle model-year recognition model according to an embodiment of the present invention;
  • Fig. 5 is a structural block diagram of a vehicle model-year recognition device according to an embodiment of the present invention;
  • Fig. 6 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present invention;
  • Fig. 7 is a schematic diagram of the composition of a vehicle model-year recognition model according to an embodiment of the present invention.
  • Embodiments of a method for training a vehicle model-year recognition model and of a method for recognizing a vehicle model year are provided.
  • The steps may be executed in a computer system that executes such instructions, and, although a logical order is shown in the flowchart, in some cases the steps shown or described may be executed in an order different from the one here.
  • Fig. 1 is a flowchart of a method for training a vehicle model-year recognition model according to an embodiment of the present invention. As shown in Fig. 1, the process includes the following steps:
  • The label information includes the vehicle brand and model year in the vehicle sample image.
  • In a specific embodiment, vehicle images of 10,216 model-year classes are collected from vehicle checkpoint surveillance videos and highway cameras, and the vehicle images are annotated, where the vehicle images include cars, trucks, and buses.
  • The label information includes the front/rear orientation of the vehicle in the image, the major brand, the sub-brand, the manufacturer, and the model year; the sample image is scaled to 256×256, then cropped to 224×224, and mean-variance normalized to obtain the vehicle sample image with label information.
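The preprocessing described above (center-cropping the 256×256 image to 224×224 and mean-variance normalization) can be sketched as follows; the per-channel ImageNet-style statistics are a placeholder assumption, since the source does not specify the exact mean and variance values:

```python
import numpy as np

# Hypothetical ImageNet-style channel statistics; the patent does not state the values used.
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

def preprocess(image_256: np.ndarray) -> np.ndarray:
    """Center-crop a 256x256x3 uint8 image to 224x224 and mean-variance normalize it."""
    h, w, _ = image_256.shape
    top, left = (h - 224) // 2, (w - 224) // 2
    crop = image_256[top:top + 224, left:left + 224, :].astype(np.float64) / 255.0
    return (crop - MEAN) / STD
```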
  • S12: Input the vehicle sample image into a feature extraction module to obtain at least two sets of features of the vehicle sample image.
  • In a specific embodiment, three sets of features of the vehicle sample image are extracted. As shown in Figure 7, the lightweight neural network SqueezeNet is selected as the feature extraction module, and three sets of features of different scales are extracted from the Fire2 module (the second module of the SqueezeNet network), the Fire5 module (the fifth module of the SqueezeNet network), and the Fire9 module (the ninth module of the SqueezeNet network).
  • Because the number of vehicle sample images is large, the output size of the last convolutional layer of the SqueezeNet network is changed from 512×13×13 to 1024×7×7.
  • Optionally, the three sets of features with different scales can also be extracted from other Fire modules of the lightweight neural network SqueezeNet, preferably from the Fire2, Fire5, and Fire9 modules. Optionally, a Back Propagation (BP) neural network, a Learning Vector Quantization (LVQ) neural network, or a Hopfield neural network can also be selected as the feature extraction module to perform feature extraction on the vehicle sample image.
  • Optionally, the number of feature sets extracted from the vehicle sample image can also be selected according to actual needs, such as 4 sets or 5 sets; in specific embodiments, 3 sets are preferred.
  • S13: Based on the at least two sets of features, obtain a region of interest corresponding to each set of features and its score value.
  • In a specific embodiment, the three sets of features with different scales are input into the Region Proposal Network (RPN), and rectangular boxes with base sizes of 24×24, 32×32, and 86×86 are generated; each base size is scaled at the ratios 1:3, 2:3, and 1:1, so that 9 rectangular boxes are obtained for each set of features. Each rectangular box carries the information content corresponding to its set of features and an information-content score.
  • The 9 rectangular boxes corresponding to each set of features are processed with the non-maximum suppression (NMS) algorithm to retain the rectangular box with the highest information-content score of each set, together with its score value, as the region of interest corresponding to that set of features and its score value.
  • Optionally, the information-content scores of the rectangular boxes can also be sorted and screened to obtain the rectangular box with the highest score; optionally, a Region-CNN (R-CNN) network can also be selected to generate the region of interest for each set of features.
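The box selection described above (suppress overlapping boxes, keep the highest-scoring survivors) can be sketched as a greedy NMS pass; the IoU threshold of 0.5 is an assumption, since the source does not state one:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS; returns indices of kept boxes, highest score first."""
    order = np.argsort(scores)[::-1]
    keep = []
    while len(order) > 0:
        best = order[0]
        keep.append(int(best))
        # Suppress remaining boxes that overlap the kept box too much.
        order = np.array([i for i in order[1:] if iou(boxes[best], boxes[i]) < iou_thresh])
    return keep
```

The first index returned is the highest-scoring box of the group, which matches the patent's rule of retaining the box with the highest information-content score per set of features.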
  • S14: Fuse the region of interest with the vehicle sample image and input the result into a classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest.
  • The vehicle model-year recognition model includes the feature extraction module and the classification module.
  • In a specific embodiment, the deep residual network ResNet50 is selected as the classification module. The regions of interest, that is, the highest-scoring rectangular boxes corresponding to the three sets of features, are bilinearly interpolated to 224×224 and input into the deep residual network ResNet50 together with the sample image. The overall features of the sample image and of the regions of interest are acquired before the fully connected (FC) layer of ResNet50, and from these overall features the whole-image classification feature of the sample image, the feature of each region of interest, and the feature obtained by fusing the whole image with the regions of interest are cut out.
  • Optionally, the ResNeXt network, ResNet101, or other residual networks of the same type can also be selected as the classification module.
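The bilinear interpolation of a region of interest to 224×224 mentioned above can be sketched in plain NumPy; this is a generic bilinear resize under standard assumptions, not the patent's exact implementation:

```python
import numpy as np

def bilinear_resize(roi: np.ndarray, out_h: int = 224, out_w: int = 224) -> np.ndarray:
    """Bilinearly interpolate an HxWxC region of interest to out_h x out_w."""
    h, w, _ = roi.shape
    ys = np.linspace(0, h - 1, out_h)      # fractional source rows
    xs = np.linspace(0, w - 1, out_w)      # fractional source columns
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]          # vertical interpolation weights
    wx = (xs - x0)[None, :, None]          # horizontal interpolation weights
    top = roi[y0][:, x0] * (1 - wx) + roi[y0][:, x1] * wx
    bot = roi[y1][:, x0] * (1 - wx) + roi[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy
```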
  • S15: Calculate a loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the feature obtained by fusing the whole image with the regions of interest, and the score value of each region of interest.
  • In a specific embodiment, the overall features obtained before the fully connected (FC) layer are segmented into the whole-image classification feature and the overall features of the regions of interest; the whole-image classification feature and the overall features of the regions of interest are fused and input into the fully connected (FC) layer, and the loss Loss 2 is calculated to obtain the fusion loss function value. The whole-image feature of the sample image is obtained from the fully connected (FC) layer and, by connecting a softmax layer, the whole-image loss function value Loss 3 corresponding to the sample image is obtained. After the component features of the regions of interest, segmented from the overall features obtained before the fully connected (FC) layer, are input into the fully connected (FC) layer and the softmax layer, the component loss function value Loss 1 corresponding to the regions of interest is obtained. The overall features of the regions of interest are input into the fully connected (FC) layer and processed with log softmax, and the resulting loss function values of the regions of interest, together with their corresponding information-content scores, are used to calculate the level loss functions corresponding to the regions of interest.
  • Optionally, log softmax can be replaced by other loss calculation methods, such as NLLLoss or cross-entropy softmax.
  • The level loss functions corresponding to the (three sets of) regions of interest are summed with the other losses with certain weights, and the parameters of the SqueezeNet network and the ResNet50 network are updated until the number of updates reaches a threshold or the loss function value falls within a preset range, so as to obtain the vehicle model-year recognition model.
  • The vehicle model-year recognition model is composed of the SqueezeNet network and the ResNet50 network.
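The weighted summation of the loss terms described above can be sketched as follows; the actual weights are not given in the source, so equal weights are used here as a placeholder assumption:

```python
def total_loss(component_loss, fusion_loss, whole_image_loss, level_losses, weights=None):
    """Weighted sum of Loss 1 (component), Loss 2 (fusion), Loss 3 (whole image)
    and the level losses of the regions of interest.

    The patent only says the terms are summed with 'a certain weight';
    equal weights of 1.0 are a placeholder, not the values used in the patent."""
    terms = [component_loss, fusion_loss, whole_image_loss] + list(level_losses)
    if weights is None:
        weights = [1.0] * len(terms)
    return sum(w * t for w, t in zip(weights, terms))
```

With three regions of interest this sums six terms, matching the three level losses plus Loss 1, Loss 2, and Loss 3 in the embodiment above.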
  • In the method for training a vehicle model-year recognition model provided by the embodiment of the present invention, at least two sets of features of the vehicle sample image are extracted by a feature extraction module, and the regions of interest corresponding to the at least two sets of features, together with their score values, are obtained;
  • the regions of interest fused with the vehicle sample image are input into the classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest;
  • the parameters of the feature extraction module and the classification module are updated to optimize the vehicle model-year recognition model.
  • The method extracts at least two sets of features, inputs the regions of interest together with the vehicle sample image into the classification module, and optimizes the recognition model according to the loss function. This not only enriches the hierarchy of the extracted features, but also updates the parameters of the recognition model with the corresponding loss function, thereby improving recognition accuracy.
  • Fig. 2 is a complete flowchart of a method for training a vehicle model-year recognition model according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:
  • The label information includes the vehicle brand and model year in the vehicle sample image.
  • S22: Input the vehicle sample image into a feature extraction module to obtain at least two sets of features of the vehicle sample image.
  • S23: Based on the at least two sets of features, obtain a region of interest corresponding to each set of features and its score value.
  • In an optional embodiment, step S23 may include the following steps:
  • Each set of features is input into the Region Proposal Network (RPN) to obtain multiple rectangular boxes corresponding to each set of features, each rectangular box corresponding to a score value; the region corresponding to each rectangular box is a candidate region.
  • S232: Based on the multiple candidate regions, generate a region of interest corresponding to each set of features and its score value.
  • In a specific embodiment, the multiple rectangular boxes and their score values are subjected to non-maximum suppression (NMS) processing or to sorting and screening processing to obtain the rectangular box with the highest score of each set of features; the rectangular box with the highest score is the region of interest.
  • S24: Fuse the region of interest with the vehicle sample image and input the result into a classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest.
  • The vehicle model-year recognition model includes the feature extraction module and the classification module.
  • In an optional embodiment, step S24 may include the following steps:
  • S241: Fuse the region of interest with the vehicle sample image and input the result into a classification module, wherein the output of the classification module is the model-year classification of the vehicle sample image.
  • In a specific embodiment, the regions of interest and the vehicle sample image are input into ResNet50 together.
  • S242: Extract the output of the last pooling layer of the classification module to obtain the overall features of the vehicle sample image.
  • In a specific embodiment, the features of the sample image and the overall features of the regions of interest are obtained from the output of the last pooling layer before the fully connected (FC) layer of the deep residual network ResNet50.
  • S243: Segment the overall features of the vehicle sample image to obtain the whole-image classification feature of the sample image and the features of the regions of interest.
  • In a specific embodiment, the whole-image classification feature of the sample image and the feature corresponding to each region of interest are cut out from the overall features.
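The segmentation of the pooled overall features into a whole-image feature and per-ROI features can be sketched as below; the batch layout `[whole image, roi_1, ..., roi_n]` is an assumption, since the source only says the features are "cut out" from the overall features:

```python
import numpy as np

def split_pooled_features(pooled: np.ndarray, num_rois: int = 3):
    """Split the pooled classifier output for a batch laid out as
    [whole image, roi_1, ..., roi_n] into the whole-image classification
    feature, the per-ROI features, and their concatenation (the 'fused' feature).

    The batch layout and the use of concatenation for fusion are assumptions."""
    whole = pooled[0]
    rois = [pooled[1 + i] for i in range(num_rois)]
    fused = np.concatenate([whole] + rois)  # whole image fused with the ROIs
    return whole, rois, fused
```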
  • S25: Calculate a loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the feature obtained by fusing the whole image with the regions of interest, and the score value of each region of interest.
  • In an optional embodiment, step S25 may include the following steps:
  • In a specific embodiment, the overall features obtained before the fully connected (FC) layer are segmented into the whole-image classification feature and the overall features of the regions of interest, and the whole-image classification feature is fused with the overall features of the regions of interest to obtain the fusion feature.
  • The fusion feature is input into the fully connected (FC) layer and a loss function is calculated to obtain the fusion loss function value.
  • The features of the regions of interest are input into the fully connected (FC) layer and the softmax layer to obtain the component loss function value corresponding to the regions of interest.
  • S254: Calculate the whole-image loss function value by using the whole-image classification feature of the sample image.
  • In a specific embodiment, the whole-image feature corresponding to the sample image is segmented from the common classification features of the vehicle sample image and the regions of interest obtained after the fully connected (FC) layer; the whole-image feature is used as the whole-image classification feature, and the whole-image loss function value corresponding to it is obtained after connecting a softmax layer.
  • S255: Calculate the level loss function value corresponding to each region of interest by using the feature of each region of interest and its corresponding score value.
  • In a specific embodiment, the features of the regions of interest segmented from the overall features obtained before the fully connected (FC) layer are input into the fully connected (FC) layer and processed with log softmax; the resulting loss function values of the (three sets of) regions of interest, together with the corresponding information-content scores, are used to calculate the level loss, obtaining the level loss function corresponding to each of the (three sets of) regions of interest.
  • S256: Calculate the loss function value based on the level loss function values, the component loss function value, the fusion loss function value, and the whole-image loss function value.
  • In a specific embodiment, the level loss function values, the component loss function value, the fusion loss function value, and the whole-image loss function value are summed with certain weights to obtain the loss function value.
  • In a specific embodiment, the loss function is calculated as a weighted sum of the individual loss terms:
  • Loss = w1·Loss 1 + w2·Loss 2 + w3·Loss 3 + Σi w(3+i)·Loss 4(i)
  • where Loss 1 is the component loss function value, Loss 2 is the fusion loss function value, Loss 3 is the whole-image loss function value, and the Loss 4(i) terms are the level loss function values corresponding to the regions of interest.
  • In a specific embodiment, the regions of interest are in three sets; therefore, the loss function is calculated as:
  • Loss = w1·Loss 1 + w2·Loss 2 + w3·Loss 3 + w4·Loss 4(1) + w5·Loss 4(2) + w6·Loss 4(3)
  • where Loss 1 is the component loss function value, Loss 2 is the fusion loss function value, Loss 3 is the whole-image loss function value, and Loss 4(1), Loss 4(2), and Loss 4(3) are the level loss function values corresponding to the three regions of interest.
  • Fig. 3 is a flowchart of a method for recognizing the model year of a vehicle according to an embodiment of the present invention. As shown in Fig. 3, the method includes the following steps:
  • In a specific embodiment, the target vehicle image can be obtained from a vehicle checkpoint or a road camera, and the vehicle image can be of any type, including cars, trucks, and buses.
  • The vehicle model-year recognition model includes a feature extraction module and a classification module.
  • In a specific embodiment, the lightweight neural network SqueezeNet is selected as the feature extraction module and the deep residual network ResNet50 is selected as the classification module;
  • the lightweight neural network SqueezeNet is used to perform feature extraction, and then the deep residual network ResNet50 is used for classification to obtain the model year of the target vehicle image.
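The final classification step (a softmax over the classifier output, then taking the top class) can be sketched as follows; `class_names` is a hypothetical label list standing in for the model's 10,216 model-year classes:

```python
import numpy as np

def classify(logits: np.ndarray, class_names):
    """Softmax over the classifier logits and return the top model-year label
    with its probability. The label list is illustrative, not from the patent."""
    z = logits - logits.max()              # shift for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    idx = int(np.argmax(probs))
    return class_names[idx], float(probs[idx])
```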
  • the model year of the target vehicle image is obtained by inputting the target vehicle image into the vehicle model-year recognition model for classification, wherein the vehicle model-year recognition model is obtained by jointly training on at least two sets of features of the sample image together with the sample image itself, and by optimizing the parameters with the loss function value; this ensures the accuracy of the model-year recognition of the target vehicle image.
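The two-stage recognition pipeline described above (feature extraction followed by classification) can be sketched as follows; `extract_features` and `classify` are stand-ins for the trained SqueezeNet and ResNet50 modules, and the toy implementations at the bottom are purely hypothetical:

```python
def recognize_model_year(image, extract_features, classify):
    """Two-stage recognition: feature extraction, then classification.

    extract_features / classify stand in for the trained SqueezeNet
    feature extractor and ResNet50 classifier. classify is assumed to
    return a {label: score} mapping; the highest-scoring label wins.
    """
    features = extract_features(image)
    scores = classify(features)
    return max(scores, key=scores.get)

# Toy stand-ins for the two trained modules (not the real networks):
def fake_extract(img):
    return sum(img)  # pretend "features" of the image

def fake_classify(feat):
    return {"BrandA-2018": feat * 0.1, "BrandA-2019": feat * 0.2}

label = recognize_model_year([1, 2, 3], fake_extract, fake_classify)
```

Here the toy classifier scores "BrandA-2019" highest, so that label is returned.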
  • Fig. 4 is a training device for a vehicle model recognition model according to an embodiment of the present invention, as shown in Fig. 4, including:
  • the first obtaining module 41 is configured to obtain a sample vehicle image with label information; wherein the label information includes the vehicle brand and year model in the vehicle sample image;
  • the first feature extraction module 42 is configured to input the vehicle sample image into the feature extraction module to obtain at least two sets of features of the vehicle sample image;
  • the scoring module 43 is configured to obtain the region of interest corresponding to each group of features and its score value based on the at least two groups of features;
  • the second feature extraction module 44 is configured to merge the region of interest and the vehicle sample image and input it into the classification module to obtain the entire image classification feature of the sample image, the feature of the region of interest, and the entire image Features fused with the region of interest; wherein, the vehicle year model recognition model includes the feature extraction module and the classification module;
  • the calculation module 45 is configured to calculate the loss function value according to the entire-image classification feature of the sample image, the features of the regions of interest, the feature obtained by fusing the entire image with the regions of interest, and the score value of each region of interest;
  • the parameter optimization module 46 is configured to update the parameters of the feature extraction module and the classification module based on the annotation information of the vehicle sample image and the loss function value, so as to optimize the vehicle year recognition model.
  • the training device for the vehicle model-year recognition model provided by the embodiment of the present invention extracts at least two sets of features of the vehicle sample image through a feature extraction module, and obtains the region of interest corresponding to each set of features and its score value;
  • the regions of interest fused with the vehicle sample image are input into the classification module to obtain the entire-image classification feature of the sample image, the feature of each region of interest, and the feature obtained by fusing the entire image with the regions of interest;
  • the parameters of the feature extraction module and the classification module are then updated to obtain the vehicle model-year recognition model.
  • the device extracts at least two sets of features, inputs the regions of interest together with the vehicle sample image into the classification module, and optimizes the recognition model according to the loss function; this not only improves the hierarchy of feature extraction but also derives a corresponding loss function for updating the parameters of the recognition model, thereby improving the accuracy of recognition.
  • Fig. 5 is a vehicle year model recognition device according to an embodiment of the present invention, as shown in Fig. 5, including:
  • the second acquisition module 51 is used to acquire an image of a target vehicle
  • the recognition module 52 is used to input the target vehicle image into a vehicle model-year recognition model to obtain the model year of the target vehicle image; wherein the vehicle model-year recognition model is trained using the training method of the vehicle model-year recognition model shown in FIG. 1 or FIG. 2.
  • the model year of the target vehicle image is obtained by inputting the target vehicle image into the vehicle model-year recognition model for classification, wherein the vehicle model-year recognition model is obtained by jointly training on at least two sets of features of the sample image together with the sample image itself, and by optimizing the parameters with the loss function value; this ensures the accuracy of the model-year recognition of the target vehicle image.
  • the embodiment of the present invention also provides an electronic device having the training device for the vehicle year model recognition model shown in FIG. 4 and the vehicle year model recognition device shown in FIG. 5.
  • FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • the electronic device may include: at least one processor 61, such as a CPU (Central Processing Unit); at least one communication interface 63; a memory 64; and at least one communication bus 62.
  • the communication bus 62 is used to implement connection and communication between these components.
  • the communication interface 63 may include a display screen (Display) and a keyboard (Keyboard); optionally, the communication interface 63 may also include a standard wired interface and a wireless interface.
  • the memory 64 may be a high-speed volatile RAM (Random Access Memory), or a non-volatile memory, such as at least one disk memory.
  • the memory 64 may also be at least one storage device located far away from the aforementioned processor 61.
  • the processor 61 may be combined with the devices described in FIG. 4 and FIG. 5, the memory 64 stores application programs, and the processor 61 calls the program code stored in the memory 64 to execute any of the above method steps.
  • the communication bus 62 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • the communication bus 62 can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 6, but it does not mean that there is only one bus or one type of bus.
  • the memory 64 may include a volatile memory, such as a random-access memory (RAM); the memory may also include a non-volatile memory, such as a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 64 may also include a combination of the above types of memory.
  • the processor 61 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
  • the processor 61 may further include a hardware chip.
  • the aforementioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the memory 64 is also used to store program instructions.
  • the processor 61 can call program instructions to implement the training method of the vehicle model-year recognition model shown in the embodiments of FIG. 1 to FIG. 2 of the present application and/or the vehicle model-year recognition method shown in FIG. 3.
  • the embodiment of the present invention also provides a non-transitory computer storage medium storing computer-executable instructions, where the computer-executable instructions can execute the training method of the vehicle model-year recognition model in any of the above-mentioned method embodiments and/or the method for recognizing the vehicle model year.
  • the storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD), etc.; the storage medium may also include a combination of the foregoing types of memory.


Abstract

A method for training a vehicle model-year recognition model and a method for recognizing a vehicle model year, relating to the technical field of vehicle recognition. The training method comprises: acquiring a vehicle sample image having labeling information (S11); inputting the vehicle sample image into a feature extraction module, so as to obtain at least two groups of features (S12); on the basis of the at least two groups of features, obtaining a region of interest corresponding to each group of features and a score value thereof (S13); fusing the regions of interest and the vehicle sample image and then inputting same into a classification module, so as to obtain whole image classification features, features of the regions of interest, and features of the whole image fused with the regions of interest (S14); calculating loss function values according to the described three kinds of features and the score value of each region of interest (S15); and on the basis of the labeling information and the loss function values, updating parameters of the feature extraction module and the classification module, so as to optimize a vehicle model year recognition model (S16). The training method improves the accuracy of a recognition model, and provides a foundation for subsequent application in a recognition method.

Description

Method for training a vehicle model-year recognition model and method for recognizing a vehicle model year
This application claims priority to the prior Chinese patent application No. CN202010137345.7, filed with the China National Intellectual Property Administration on March 5, 2020, the content of which is incorporated into this application by reference in its entirety.
Technical field

The present invention relates to the technical field of vehicle recognition, and in particular to a method for training a vehicle model-year recognition model and a method for recognizing a vehicle model year.
Background

Vehicles have become an indispensable means of transportation in modern life. As an important carrier and tool of everyday activity, the monitoring and recognition of vehicle information has become an important topic for intelligent transportation and safe cities. Intelligent analysis of vehicle data can, on the one hand, facilitate traffic management, such as license plate recognition at parking lot checkpoints; on the other hand, it can effectively assist traffic control, such as the capture and information recording of illegal vehicles and vehicles with cloned plates, and the tracking of vehicles involved in traffic accidents or crimes.

Convolutional neural networks (CNNs) have been widely used in image pattern recognition, including vehicle attribute recognition, because they are robust to target translation, scaling, tilt, and other deformations to a certain degree; many experts and scholars have published on this technology.

Among them, to overcome the problems of low learning efficiency and stalled accuracy caused by increasing the depth of a convolutional neural network (CNN), the deep residual network (ResNet) was proposed in 2015 and quickly applied to the field of vehicle model-year recognition. Typically, during recognition, ResNet is used twice: the first pass takes the sample image as input and produces the whole-image features and regional features; the second pass takes the regional features as input and produces the region-of-interest features, which are finally classified to obtain the classification of the sample image. In the course of research on deep residual networks, the inventors found that using ResNet twice to obtain the region-of-interest features increases the training time and the amount of computation, and that obtaining the classification of the sample image only from the region-of-interest features ignores features at other levels, causing inaccurate recognition.
Summary

In view of this, embodiments of the present invention provide a method for training a vehicle model-year recognition model and a method for recognizing a vehicle model year, to solve the problem of insufficiently accurate recognition.
According to a first aspect, an embodiment of the present invention provides a method for training a vehicle model-year recognition model, including:

acquiring a vehicle sample image with annotation information, where the annotation information includes the vehicle brand and model year in the vehicle sample image;

inputting the vehicle sample image into a feature extraction module to obtain at least two sets of features of the vehicle sample image;

obtaining, based on the at least two sets of features, a region of interest corresponding to each set of features and its score value;

fusing the regions of interest with the vehicle sample image and inputting the result into a classification module to obtain the entire-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the entire image with the regions of interest, where the vehicle model-year recognition model includes the feature extraction module and the classification module;

calculating a loss function value according to the entire-image classification feature of the sample image, the features of the regions of interest, the feature obtained by fusing the entire image with the regions of interest, and the score value of each region of interest;

updating the parameters of the feature extraction module and the classification module based on the annotation information of the vehicle sample image and the loss function value, so as to optimize the vehicle model-year recognition model.
In the method for training a vehicle model-year recognition model provided by the embodiment of the present invention, at least two sets of features of the vehicle sample image are extracted through the feature extraction module, and the regions of interest corresponding to the at least two sets of features and their score values are obtained; the regions of interest fused with the vehicle sample image are input into the classification module to obtain the entire-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the entire image with the regions of interest; and the parameters of the feature extraction module and the classification module are updated based on the annotation information of the vehicle sample image and the loss function value, so as to optimize the vehicle model-year recognition model. By extracting at least two sets of features, inputting the regions of interest together with the vehicle sample image into the classification module, and optimizing the recognition model according to the loss function, the method not only improves the hierarchy of feature extraction but also derives a corresponding loss function for updating the parameters of the recognition model, thereby improving the accuracy of recognition.
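The training procedure summarized above can be sketched as a single training step. All of the callables below are stand-ins for the patent's modules, and their concrete signatures are assumptions made for illustration only:

```python
def training_step(sample, label, feature_extractor, propose_roi,
                  classifier, compute_loss, apply_update):
    """One training iteration, following the steps described above.

    1. extract at least two groups of features from the sample image
    2. obtain one region of interest and its score per feature group
    3. fuse the ROIs with the sample image and classify, yielding the
       whole-image feature, per-ROI features, and the fused feature
    4. compute the loss value from those features and the ROI scores,
       then update the module parameters
    All callables are hypothetical stand-ins for the patent's modules.
    """
    feature_groups = feature_extractor(sample)           # >= 2 groups
    rois, scores = zip(*(propose_roi(f) for f in feature_groups))
    whole_feat, roi_feats, fused_feat = classifier(sample, rois)
    loss = compute_loss(whole_feat, roi_feats, fused_feat, scores)
    apply_update(loss, label)                            # parameter update
    return loss
```

The sketch only fixes the data flow between the modules; the internals of each stand-in are left open.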
With reference to the first aspect, in a first implementation of the first aspect, obtaining, based on the at least two sets of features, the region of interest corresponding to each set of features and its score value includes:

generating, using each set of features, multiple candidate regions corresponding to that set of features;

generating, based on the multiple candidate regions, the region of interest corresponding to each set of features and its score value.
In the training method provided by the embodiment of the present invention, multiple candidate regions corresponding to each set of features are generated from that set of features, and the region of interest corresponding to each set of features and its score value are generated based on the multiple candidate regions; this can accurately screen out the region of interest corresponding to each set of features and provides a basis for subsequent training.
With reference to the first implementation of the first aspect, in a second implementation of the first aspect, generating the region of interest corresponding to each set of features and its score value based on the multiple candidate regions includes:

calculating the score value of each candidate region;

determining the candidate region with the highest score value as the region of interest.
In the training method provided by the embodiment of the present invention, the score value of each candidate region is calculated and the candidate region with the highest score value is determined as the region of interest, which further improves the accuracy of the region of interest and provides a basis for subsequent training.
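The selection of the highest-scoring candidate region described above amounts to a simple argmax. A minimal sketch, assuming each candidate is represented as a (region, score) pair (the box representation itself is an assumption):

```python
def select_region_of_interest(candidates):
    """Pick the candidate region with the highest score value.

    candidates: list of (region, score) pairs produced for one feature
    group; `region` can be any box representation, e.g. (x1, y1, x2, y2).
    Returns the chosen region together with its score.
    """
    return max(candidates, key=lambda c: c[1])
```

Among three candidate boxes scoring 0.4, 0.9, and 0.6, the one scoring 0.9 is selected as the region of interest.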
With reference to the first aspect, in a third implementation of the first aspect, fusing the regions of interest with the vehicle sample image and inputting the result into the classification module to obtain the entire-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the entire image with the regions of interest includes:

fusing the regions of interest with the vehicle sample image and inputting the result into the classification module, where the output of the classification module is the model-year classification of the vehicle sample image;

extracting the output of the last pooling layer of the classification module to obtain the overall features of the vehicle sample image;

segmenting the overall features of the vehicle sample image to obtain the entire-image classification feature of the sample image and the features of the regions of interest.
In the training method provided by the embodiment of the present invention, the regions of interest fused with the vehicle sample image are input into the classification module to obtain the model-year classification of the vehicle sample image, where each region of interest is the candidate region with the highest score value for its set of features; fusing the vehicle sample image with these highest-scoring candidate regions before inputting them into the classification module can improve the accuracy of the classification module and also shorten the classification time.
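The segmentation of the last pooling layer's output can be sketched as below, under the assumption (not stated explicitly in the patent) that the fused input yields one pooled feature vector for the whole image followed by one per region of interest:

```python
def split_pooled_features(pooled, num_rois):
    """Split the last pooling layer's output into the entire-image
    feature and one feature per region of interest.

    pooled: list of per-part feature vectors; by assumption, the first
    entry comes from the whole image and the remaining num_rois entries
    come from the fused regions of interest.
    """
    if len(pooled) != 1 + num_rois:
        raise ValueError("expected one whole-image part plus num_rois ROI parts")
    whole_image_feature = pooled[0]
    roi_features = pooled[1:]
    return whole_image_feature, roi_features
```

For three regions of interest, a four-part pooled output splits into one entire-image feature and three ROI features.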
With reference to the first aspect, in a fourth implementation of the first aspect, calculating the loss function value according to the entire-image classification feature of the sample image, the features of the regions of interest, the feature obtained by fusing the entire image with the regions of interest, and the score value of each region of interest includes:

fusing the entire-image feature of the sample image with the features of the regions of interest to obtain a fused feature;

calculating a fusion loss function value based on the fused feature;

calculating a component loss function value using the features of the regions of interest;

calculating an entire-image loss function value using the entire-image classification feature of the sample image;

calculating, using the feature of each region of interest and its corresponding score value, the level loss function value corresponding to each region of interest;

calculating the loss function value based on each level loss function value, the component loss function value, the fusion loss function value, and the entire-image loss function value.
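The patent does not give the formula of the level loss. One common choice for tying region scores to their usefulness for classification, sketched here purely as an assumption, is a pairwise hinge ranking loss over the regions of interest:

```python
def level_loss(scores, confidences, margin=0.0):
    """Pairwise hinge ranking loss over regions of interest.

    Encourages the ordering of ROIs by score to match their ordering by
    classification confidence: whenever region i is more confidently
    classified than region j, its score should also be higher.
    This concrete formula is an assumption; the patent only states that
    a level loss is computed from each ROI's feature and score value.
    """
    loss = 0.0
    n = len(scores)
    for i in range(n):
        for j in range(n):
            if confidences[i] > confidences[j]:
                # hinge penalty when the score ordering disagrees
                loss += max(0.0, margin - (scores[i] - scores[j]))
    return loss
```

When the score ordering already agrees with the confidence ordering, the loss is zero; swapping the scores of two regions produces a positive penalty.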
With reference to the fourth implementation of the first aspect, in a fifth implementation of the first aspect, the loss function value is calculated using the following formula:

Loss = w1·Loss1 + w2·Loss2 + w3·Loss3 + w4(1)·Loss4(1) + w4(2)·Loss4(2) + w4(3)·Loss4(3)

where Loss1 is the component loss function, Loss2 is the fusion loss function, Loss3 is the entire-image loss function, w1 through w4(3) are the weights of the respective terms, and Loss4(1), Loss4(2), and Loss4(3) are the level loss functions corresponding to the three groups of regions of interest.
In the training method provided by the embodiment of the present invention, the loss function and the level loss functions are calculated using the fused feature, the features of all the regions of interest, and the classification of the sample image, and are summed with certain weights; this accurately reflects the gap between the classification given by the vehicle model-year recognition model and the actual classification, and the gap can be used to further optimize the parameters of the model, further improving its classification accuracy.
According to a second aspect, an embodiment of the present invention provides a method for recognizing a vehicle model year, including:

acquiring a target vehicle image;

inputting the target vehicle image into a vehicle model-year recognition model to obtain the model year of the target vehicle image, where the vehicle model-year recognition model is trained using the training method according to any one of claims 1-6.
In the method for recognizing a vehicle model year provided by the embodiment of the present invention, the model year of the target vehicle image is obtained by inputting the target vehicle image into the vehicle model-year recognition model for classification, where the vehicle model-year recognition model is obtained by jointly training on at least two sets of features of the sample image together with the sample image and optimizing the parameters with the loss function value, which ensures the accuracy of the model-year recognition of the target vehicle image.
According to a third aspect, an embodiment of the present invention provides a training device for a vehicle model-year recognition model, including:

a first acquisition module, configured to acquire a vehicle sample image with annotation information, where the annotation information includes the vehicle brand and model year in the vehicle sample image;

a first feature extraction module, configured to input the vehicle sample image into the feature extraction module to obtain at least two sets of features of the vehicle sample image;

a scoring module, configured to obtain, based on the at least two sets of features, the region of interest corresponding to each set of features and its score value;

a second feature extraction module, configured to fuse the regions of interest with the vehicle sample image and input the result into the classification module to obtain the entire-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the entire image with the regions of interest, where the vehicle model-year recognition model includes the feature extraction module and the classification module;

a calculation module, configured to calculate the loss function value according to the entire-image classification feature of the sample image, the features of the regions of interest, the feature obtained by fusing the entire image with the regions of interest, and the score value of each region of interest;

a parameter optimization module, configured to update the parameters of the feature extraction module and the classification module based on the annotation information of the vehicle sample image and the loss function value, so as to optimize the vehicle model-year recognition model.
The training device for a vehicle model-year recognition model provided by the embodiment of the present invention extracts at least two sets of features of the vehicle sample image through the feature extraction module and obtains the regions of interest corresponding to the at least two sets of features and their score values; the regions of interest fused with the vehicle sample image are input into the classification module to obtain the entire-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the entire image with the regions of interest; and the parameters of the feature extraction module and the classification module are updated based on the annotation information of the vehicle sample image and the loss function value, so as to optimize the vehicle model-year recognition model. By extracting at least two sets of features, inputting the regions of interest together with the vehicle sample image into the classification module, and optimizing the recognition model according to the loss function, the device not only improves the hierarchy of feature extraction but also derives a corresponding loss function for updating the parameters of the recognition model, thereby improving the accuracy of recognition.
According to a fourth aspect, an embodiment of the present invention provides a device for recognizing a vehicle model year, including:

a second acquisition module, configured to acquire a target vehicle image;

a recognition module, configured to input the target vehicle image into a vehicle model-year recognition model to obtain the model year of the target vehicle image, where the vehicle model-year recognition model is trained using the training method described in the first aspect or any implementation of the first aspect.
The device for recognizing a vehicle model year provided by the embodiment of the present invention obtains the model year of the target vehicle image by inputting the target vehicle image into the vehicle model-year recognition model for classification, where the vehicle model-year recognition model is obtained by jointly training on at least two sets of features of the sample image together with the sample image and optimizing the parameters with the loss function value, which ensures the accuracy of the model-year recognition of the target vehicle image.
According to a fifth aspect, an embodiment of the present invention provides an electronic device, including:

a memory and a processor communicatively connected to each other, where the memory stores computer instructions, and the processor executes the computer instructions to perform the method for training a vehicle model-year recognition model described in the first aspect or any implementation of the first aspect, or the method for recognizing a vehicle model year described in the second aspect or any implementation of the second aspect.
Description of the Drawings
To explain the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the specific embodiments or the prior art are briefly introduced below. The drawings described below illustrate some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be derived from them without creative effort.
Fig. 1 is a flowchart of a training method for a vehicle model-year recognition model according to an embodiment of the present invention;
Fig. 2 is a complete flowchart of the training method for a vehicle model-year recognition model according to an embodiment of the present invention;
Fig. 3 is a flowchart of a method for recognizing a vehicle model year according to an embodiment of the present invention;
Fig. 4 is a structural block diagram of a training device for a vehicle model-year recognition model according to an embodiment of the present invention;
Fig. 5 is a structural block diagram of a device for recognizing a vehicle model year according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of the composition of a vehicle model-year recognition model according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are a part, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those skilled in the art without creative effort fall within the protection scope of the present invention.
According to the embodiments of the present invention, a training method for a vehicle model-year recognition model and a method for recognizing a vehicle model year are provided. It should be noted that the steps shown in the flowcharts of the drawings may be executed in a computer system, for example one executing a set of computer-executable instructions, and although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from that given here.
This embodiment provides a training method for a vehicle model-year recognition model, which can be applied to the above-mentioned electronic device. Fig. 1 is a flowchart of the training method according to an embodiment of the present invention; as shown in Fig. 1, the process includes the following steps:
S11: Obtain vehicle sample images with annotation information.
The annotation information includes the vehicle brand and model year in each vehicle sample image.
Specifically, about 1.2 million vehicle images covering 10,216 model-year classes are collected from vehicle checkpoint surveillance videos and highway cameras, and the images are annotated. The vehicle images cover three types: cars, trucks, and buses; the annotation information includes the vehicle orientation (front or rear view), major brand, sub-brand, manufacturer, and model year of the vehicle in the image. Each sample image is scaled to 256*256, cropped to 224*224, and normalized by mean-variance processing to form the annotated vehicle sample images.
S12: Input the vehicle sample images into a feature extraction module to obtain at least two groups of features of each vehicle sample image.
In a specific embodiment, three groups of features are extracted from each vehicle sample image. As shown in Fig. 7, the lightweight neural network SqueezeNet is selected as the feature extraction module, and three groups of features at different scales are extracted from the Fire2 module (the second Fire module of SqueezeNet), the Fire5 module (the fifth), and the Fire9 module (the ninth). Because the number of vehicle sample images is large, to save time the output size of the last convolutional layer of SqueezeNet is changed from 512*13*13 to 1024*7*7.
Optionally, the three groups of multi-scale features may also be extracted from other Fire modules of SqueezeNet, although extraction from the Fire2, Fire5, and Fire9 modules is preferred. Optionally, a Back Propagation (BP) neural network, a Learning Vector Quantization (LVQ) neural network, or a Hopfield neural network may be selected as the feature extraction module to extract features from the vehicle sample images. Optionally, the number of feature groups extracted may also be chosen according to actual needs, for example 4 or 5 groups; in the specific embodiment, 3 groups are preferred.
S13: Based on the at least two groups of features, obtain a region of interest corresponding to each group of features and its score value.
In a specific embodiment, as shown in Fig. 7, the three groups of multi-scale features are respectively fed into a Region Proposal Network (RPN) to obtain a series of rectangular boxes of sizes 24*24, 32*32, and 86*86, scaled at aspect ratios of 1:3, 2:3, and 1:1, so that 9 rectangular boxes are obtained for each group of features. Each box carries the information content of its group of features and a corresponding information score. For the 9 boxes of each group, a non-maximum suppression (NMS) algorithm is applied to retain the box with the highest information score, together with its score value, as the region of interest corresponding to that group of features.
Optionally, the boxes may instead be ranked by their information scores to obtain the highest-scoring box; optionally, a Region-CNN (R-CNN) network may be selected to generate the region of interest for each group of features.
S14: Fuse the regions of interest with the vehicle sample image and input the result into a classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature obtained by fusing the whole image with the regions of interest.
The vehicle model-year recognition model includes the feature extraction module and the classification module.
In a specific embodiment, as shown in Fig. 7, the deep residual network Resnet50 is selected as the classification module. Each region of interest, that is, the highest-scoring rectangular box for each of the three groups of features, is bilinearly interpolated to 224*224 and input into Resnet50 together with the sample image. The sample image features and the overall features of the regions of interest are taken from before the fully connected (FC) layer of Resnet50, and the whole-image classification feature of the sample image, the feature of each region of interest, and the feature fusing the whole image with the regions of interest are split out of these overall features. Optionally, a ResNeXt network, Resnet101, or another residual network of the same type may be selected as the classification module.
S15: Calculate the loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the feature fusing the whole image with the regions of interest, and the score value of each region of interest.
In a specific embodiment, as shown in Fig. 7, the whole-image classification feature split from the overall features obtained before the fully connected (FC) layer is fused with the overall features of the regions of interest, input into an FC layer, and the loss Loss_2 is calculated, giving the fusion loss function value. The whole-image feature of the sample image, obtained after the FC layer, is fed into a softmax layer to obtain the whole-image loss function value Loss_3 corresponding to the sample image. The part features of the regions of interest, split from the overall features obtained before the FC layer, are input into an FC layer and a softmax layer to obtain the part loss function value Loss_1 corresponding to the regions of interest. The overall features of the regions of interest, likewise split from the features before the FC layer, are input into an FC layer and processed with log softmax; the resulting loss values of the (three groups of) regions of interest are combined with their corresponding information scores to compute a rank loss, giving the rank loss function values Loss_rank^1, Loss_rank^2, and Loss_rank^3 corresponding to the (three groups of) regions of interest.
Optionally, the log softmax may be replaced by other loss calculation methods, such as NLLLoss or cross entropy with softmax.
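One plausible form of the rank loss described above is a pairwise hinge that encourages the RPN informativeness scores to follow the same ordering as each region's log-softmax confidence on the true class, as in NTS-Net-style fine-grained training. The patent does not give the exact formula, so `rank_loss` below is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def rank_loss(info_scores, region_logits, target, margin=1.0):
    # info_scores: (K,) RPN informativeness scores for K regions of one sample.
    # region_logits: (K, C) classifier logits for the same K regions.
    # target: index of the true class (here, the model-year label).
    conf = F.log_softmax(region_logits, dim=-1)[:, target]  # (K,) confidences
    loss = info_scores.new_zeros(())
    k = info_scores.size(0)
    for i in range(k):
        for j in range(k):
            if conf[i] > conf[j]:
                # Region i is more useful, so its score should exceed
                # region j's score by at least the margin.
                loss = loss + F.relu(margin - (info_scores[i] - info_scores[j]))
    return loss
```

When the score ordering already matches the confidence ordering with sufficient margin, every hinge term is zero; misordered pairs contribute a penalty proportional to the violation.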
S16: Based on the annotation information of the vehicle sample images and the loss function value, update the parameters of the feature extraction module and the classification module to optimize the vehicle model-year recognition model.
In a specific embodiment, as shown in Fig. 7, the fusion loss function value, the whole-image loss function value of the sample image, the part loss function value of the regions of interest, and the rank loss function values of the three (groups of) regions of interest are summed with certain weights, and the parameters of the SqueezeNet network and the Resnet50 network are updated until either the number of updates reaches a threshold or the loss function value settles within a preset range, thereby obtaining the vehicle model-year recognition model. The vehicle model-year recognition model is composed of the SqueezeNet network and the Resnet50 network.
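The update loop can be sketched as below. The weight values, optimizer choice, and `compute_losses` interface are placeholders; the embodiment says only that the four terms are summed with certain weights and that training stops at an update threshold or when the loss enters a preset range.

```python
import torch

def train(squeezenet, resnet, loader, compute_losses,
          max_updates=100_000, loss_floor=1e-3, lr=1e-3):
    # Both networks are optimized jointly against the weighted loss sum.
    params = list(squeezenet.parameters()) + list(resnet.parameters())
    optim = torch.optim.SGD(params, lr=lr, momentum=0.9)
    w = {"part": 1.0, "fusion": 1.0, "whole": 1.0, "rank": 1.0}  # placeholders
    for step, batch in enumerate(loader):
        losses = compute_losses(squeezenet, resnet, batch)  # dict of 4 terms
        total = sum(w[k] * v for k, v in losses.items())
        optim.zero_grad()
        total.backward()
        optim.step()
        # Stop at the update threshold or once the loss is small enough.
        if step + 1 >= max_updates or total.item() < loss_floor:
            break
    return squeezenet, resnet
```

`compute_losses` stands in for the whole forward pass of Fig. 7 (feature extraction, ROI selection, classification, and the four loss terms).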
In the training method for a vehicle model-year recognition model provided by this embodiment of the present invention, at least two groups of features of each vehicle sample image are extracted by the feature extraction module, and the regions of interest corresponding to those feature groups and their score values are obtained; the regions of interest and the vehicle sample image are fused and input into the classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature fusing the whole image with the regions of interest; and based on the annotation information of the vehicle sample images and the loss function value, the parameters of the feature extraction module and the classification module are updated to optimize the vehicle model-year recognition model. By extracting at least two groups of features, fusing the regions of interest with the sample image before classification, and optimizing the recognition model against the loss function, the method not only makes feature extraction more hierarchical but also derives a corresponding loss function for updating the model parameters, thereby improving recognition accuracy.
Fig. 2 is a complete flowchart of the training method for a vehicle model-year recognition model according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:
S21: Obtain vehicle sample images with annotation information.
The annotation information includes the vehicle brand and model year in each vehicle sample image.
For details, refer to S11 described with respect to Fig. 1, which is not repeated here.
S22: Input the vehicle sample images into the feature extraction module to obtain at least two groups of features of each vehicle sample image.
For details, refer to S12 shown in Fig. 1, which is not repeated here.
S23: Based on the at least two groups of features, obtain the region of interest corresponding to each group of features and its score value.
For details, refer to S13 shown in Fig. 1, which is not repeated here.
Optionally, step S23 may include the following steps:
S231: Using each group of features, generate multiple candidate regions corresponding to that group of features.
Specifically, each group of features is input into the Region Proposal Network (RPN) to obtain multiple rectangular boxes corresponding to that group of features, each box with a score value; the area covered by a rectangular box is a candidate region.
S232: Based on the multiple candidate regions, generate the region of interest corresponding to each group of features and its score value.
Specifically, the rectangular boxes and their score values are processed by non-maximum suppression (NMS) or by ranking to obtain the highest-scoring box of each group; the highest-scoring box is the region of interest.
Optionally, step S232 may include:
(1) calculating the score value of each candidate region; and
(2) determining the candidate region with the highest score value as the region of interest.
S24: Fuse the regions of interest with the vehicle sample image and input the result into the classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature fusing the whole image with the regions of interest.
The vehicle model-year recognition model includes the feature extraction module and the classification module.
For details, refer to S14 shown in Fig. 1, which is not repeated here.
Optionally, step S24 may include the following steps:
S241: Fuse the regions of interest with the vehicle sample image and input the result into the classification module.
The output of the classification module is the model-year classification of the vehicle sample image.
Specifically, the regions of interest and the vehicle sample image are input into Resnet50 together.
S242: Extract the output of the last pooling layer of the classification module to obtain the overall features of the vehicle sample image.
Specifically, the sample image features and the overall features of the regions of interest are obtained from the output of the last pooling layer, that is, the layer before the fully connected (FC) layer of the deep residual network Resnet50.
S243: Split the overall features of the vehicle sample image to obtain the whole-image classification feature of the sample image and the features of the regions of interest.
Specifically, the whole-image classification feature of the sample image and the feature corresponding to each region of interest are cut out of the overall features.
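The split in S242 and S243 amounts to indexing the pooled pre-FC features of the joint batch. The batch layout (whole image first, then K regions) is an assumption consistent with the fusion step above.

```python
import torch

# Pooled pre-FC output of the joint batch: 1 whole image + 3 ROIs, 2048-dim each.
overall = torch.randn(1 + 3, 2048)

whole_image_feat = overall[0]     # whole-image classification feature
roi_feats = overall[1:]           # one feature per region of interest
# Concatenate whole-image and ROI features for the fusion branch.
fused_feat = torch.cat([whole_image_feat, roi_feats.flatten()])
```

`whole_image_feat`, `roi_feats`, and `fused_feat` then feed the whole-image, part, and fusion loss branches respectively.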
S25: Calculate the loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the feature fusing the whole image with the regions of interest, and the score value of each region of interest.
For details, refer to S15 shown in Fig. 1, which is not repeated here.
Optionally, step S25 may include the following steps:
S251: Fuse the whole-image classification feature of the sample image with the features of the regions of interest to obtain a fusion feature.
Specifically, the whole-image classification feature and the overall features of the regions of interest are split from the overall features obtained before the fully connected (FC) layer, and the whole-image classification feature is fused with those overall features to obtain the fusion feature.
S252: Calculate the fusion loss function value based on the fusion feature.
Specifically, the fusion feature is input into an FC layer and the loss function is calculated to obtain the fusion loss function value.
S253: Calculate the part loss function value using the features of all the regions of interest.
Specifically, the features of the regions of interest, split from the overall features obtained before the FC layer, are input into an FC layer and a softmax layer to obtain the part loss function value corresponding to the regions of interest.
S254: Calculate the whole-image loss function value using the whole-image classification feature of the sample image.
Specifically, to reduce computational complexity, the whole-image feature corresponding to the sample image is split from the joint classification features of the vehicle sample image and the regions of interest obtained after the FC layer; this whole-image feature is used as the whole-image classification feature and fed into a softmax layer to obtain the corresponding whole-image loss function value.
S255: Calculate the rank loss function value corresponding to each region of interest using the feature of each region of interest and its corresponding score value.
Specifically, the features of the regions of interest, split from the overall features obtained before the FC layer, are input into an FC layer and processed with log softmax; the resulting loss function values of the (three groups of) regions of interest are combined with their corresponding information scores to compute a rank loss, giving the rank loss functions corresponding to the (three groups of) regions of interest.
S256: Calculate the loss function value based on the rank loss function values, the part loss function value, the fusion loss function value, and the whole-image loss function value.
Specifically, the rank loss function values, the part loss function value, the fusion loss function value, and the whole-image loss function value are summed with certain weights to obtain the loss function value.
As an optional implementation of the embodiments of the present invention, the loss function is calculated by the following formula:

Loss = Loss_1 + Loss_2 + Loss_3 + Σ_{i=1}^{N} Loss_rank^i

where Loss_1 is the part loss function value, Loss_2 is the fusion loss function value, Loss_3 is the whole-image loss function value, and Loss_rank^1, ..., Loss_rank^N are the rank loss function values corresponding to the N regions of interest.
In a specific embodiment, the regions of interest are in three groups; therefore, the loss function is calculated by:

Loss = Loss_1 + Loss_2 + Loss_3 + Loss_rank^1 + Loss_rank^2 + Loss_rank^3

where Loss_1 is the part loss function value, Loss_2 is the fusion loss function value, Loss_3 is the whole-image loss function value, and Loss_rank^1, Loss_rank^2, and Loss_rank^3 are the rank loss function values corresponding to the three regions of interest.
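The loss combination can be written out executably. Unit weights are shown; the embodiment states only that the terms are summed with certain weights, so the weighting here is a placeholder.

```python
# Loss = Loss_1 + Loss_2 + Loss_3 + sum_i Loss_rank^i, with placeholder
# unit weights on every term.
def total_loss(loss_part, loss_fusion, loss_whole, rank_losses):
    return loss_part + loss_fusion + loss_whole + sum(rank_losses)
```

With PyTorch scalar tensors in place of the floats, the same function yields a differentiable total for the backward pass.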
Fig. 3 is a flowchart of a method for recognizing a vehicle model year according to an embodiment of the present invention. As shown in Fig. 3, the method includes the following steps:
S31: Acquire a target vehicle image.
Specifically, the target vehicle image can be obtained from a vehicle checkpoint or a highway camera, and the vehicle in the image can be of any type: car, truck, or bus.
S32: Input the target vehicle image into the vehicle model-year recognition model to obtain the model year of the target vehicle image.
Specifically, the vehicle model-year recognition model includes a feature extraction module and a classification module; preferably, the lightweight neural network SqueezeNet is selected as the feature extraction module and the deep residual network Resnet50 as the classification module. After the target vehicle image is input into the vehicle model-year recognition model, features are extracted by SqueezeNet and then classified by Resnet50 to obtain the model year of the target vehicle image.
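Inference then reduces to a single forward pass and an argmax over the model-year classes. `model` (the trained SqueezeNet + RPN + Resnet50 pipeline wrapped as one module) and `year_classes` are hypothetical names used for illustration.

```python
import torch

@torch.no_grad()
def recognize_year(model, image_tensor, year_classes):
    # image_tensor: preprocessed (3, H, W) image; add a batch dimension.
    logits = model(image_tensor.unsqueeze(0))   # (1, num_classes)
    idx = logits.argmax(dim=1).item()
    return year_classes[idx]
```

The returned string is the predicted model-year class label for the target vehicle image.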
In the method for recognizing a vehicle model year provided by this embodiment of the present invention, the target vehicle image is classified by the vehicle model-year recognition model to obtain its model year. Because the model is trained jointly on at least two groups of features of the sample images together with the sample images themselves, and its parameters are optimized against the loss function value, the accuracy of model-year recognition for the target vehicle image can be ensured.
Fig. 4 shows a training device for a vehicle model-year recognition model according to an embodiment of the present invention. As shown in Fig. 4, the device includes:
a first acquisition module 41, configured to obtain vehicle sample images with annotation information, where the annotation information includes the vehicle brand and model year in each vehicle sample image;
a first feature extraction module 42, configured to input the vehicle sample images into the feature extraction module to obtain at least two groups of features of each vehicle sample image;
a scoring module 43, configured to obtain, based on the at least two groups of features, the region of interest corresponding to each group of features and its score value;
a second feature extraction module 44, configured to fuse the regions of interest with the vehicle sample image and input the result into the classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the feature fusing the whole image with the regions of interest, where the vehicle model-year recognition model includes the feature extraction module and the classification module;
a calculation module 45, configured to calculate the loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the feature fusing the whole image with the regions of interest, and the score value of each region of interest; and
a parameter optimization module 46, configured to update, based on the annotation information of the vehicle sample images and the loss function value, the parameters of the feature extraction module and the classification module to optimize the vehicle model-year recognition model.
In the training device for a vehicle model-year recognition model provided by this embodiment of the present invention, at least two groups of features of each vehicle sample image are extracted by the feature extraction module, and the regions of interest corresponding to those feature groups and their score values are obtained; the regions of interest and the vehicle sample image are fused and input into the classification module to obtain the whole-image classification feature of the sample image, the feature of each region of interest, and the feature fusing the whole image with the regions of interest; and based on the annotation information of the vehicle sample images and the loss function value, the parameters of the feature extraction module and the classification module are updated to obtain the vehicle model-year recognition model. By extracting at least two groups of features, fusing the regions of interest with the sample image before classification, and optimizing the recognition model against the loss function, the device not only makes feature extraction more hierarchical but also derives a corresponding loss function for updating the model parameters, thereby improving recognition accuracy.
Fig. 5 shows a device for recognizing a vehicle model year according to an embodiment of the present invention. As shown in Fig. 5, the device includes:
a second acquisition module 51, configured to acquire a target vehicle image; and
a recognition module 52, configured to input the target vehicle image into the vehicle model-year recognition model to obtain the model year of the target vehicle image, where the vehicle model-year recognition model is trained by the training method shown in Fig. 1 or Fig. 2.
In the device for recognizing a vehicle model year provided by this embodiment of the present invention, the target vehicle image is classified by the vehicle model-year recognition model to obtain its model year. Because the model is trained jointly on at least two groups of features of the sample images together with the sample images themselves, and its parameters are optimized against the loss function value, the accuracy of model-year recognition for the target vehicle image can be ensured.
An embodiment of the present invention further provides an electronic device having the training apparatus for the vehicle model-year recognition model shown in Fig. 4 and the vehicle model-year recognition apparatus shown in Fig. 5.
Referring to Fig. 6, a schematic structural diagram of an electronic device according to an embodiment of the present invention, the electronic device may include: at least one processor 61, such as a CPU (Central Processing Unit); at least one communication interface 63; a memory 64; and at least one communication bus 62, where the communication bus 62 implements connection and communication between these components. The communication interface 63 may include a display (Display) and a keyboard (Keyboard), and optionally may further include a standard wired interface and a wireless interface. The memory 64 may be a high-speed RAM (Random Access Memory, a volatile random access memory) or a non-volatile memory, such as at least one disk memory. Optionally, the memory 64 may also be at least one storage device located remotely from the aforementioned processor 61. The processor 61 may be combined with the apparatuses described in Fig. 4 and Fig. 5; the memory 64 stores an application program, and the processor 61 calls the program code stored in the memory 64 to execute any of the above method steps.
The communication bus 62 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is drawn in Fig. 6, but this does not mean that there is only one bus or one type of bus.
The memory 64 may include a volatile memory, for example a random-access memory (RAM); it may also include a non-volatile memory, for example a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 64 may also include a combination of the above types of memory.
The processor 61 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
The processor 61 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
Optionally, the memory 64 is further configured to store program instructions. The processor 61 can call the program instructions to implement the training method for the vehicle model-year recognition model shown in the embodiments of Figs. 1-2 of the present application and/or the vehicle model-year recognition method shown in Fig. 3.
An embodiment of the present invention further provides a non-transitory computer storage medium storing computer-executable instructions that can execute the training method for the vehicle model-year recognition model and/or the vehicle model-year recognition method in any of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or the like; the storage medium may also include a combination of the above types of memory.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the present invention, and such modifications and variations all fall within the scope defined by the appended claims.

Claims (10)

  1. A training method for a vehicle model-year recognition model, characterized by comprising:
    acquiring a vehicle sample image with annotation information, wherein the annotation information includes the vehicle brand and the model year of the vehicle in the vehicle sample image;
    inputting the vehicle sample image into a feature extraction module to obtain at least two sets of features of the vehicle sample image;
    obtaining, based on the at least two sets of features, a region of interest corresponding to each set of features and its score value;
    fusing the regions of interest with the vehicle sample image and inputting the result into a classification module to obtain a whole-image classification feature of the sample image, the features of the regions of interest, and a fused feature of the whole image and the regions of interest, wherein the vehicle model-year recognition model includes the feature extraction module and the classification module;
    calculating a loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the fused feature of the whole image and the regions of interest, and the score value of each region of interest;
    updating the parameters of the feature extraction module and the classification module based on the annotation information of the vehicle sample image and the loss function value, so as to optimize the vehicle model-year recognition model.
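The data flow of claim 1 can be sketched end to end. The following NumPy sketch is a minimal illustration, not the patented implementation: the feature extractor, the scoring rule, and the classifier are random stand-ins (hypothetical placeholders), and the gradient update is only indicated in a comment, but the pipeline — features, scored ROIs, joint classification, summed loss — follows the claim:

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(image, n_groups=2):
    # Stand-in for the feature extraction module: one map per feature group.
    return [rng.standard_normal((8, 8)) for _ in range(n_groups)]

def propose_roi(feature_map):
    # Stand-in scoring: each cell is a candidate region; the best cell wins.
    scores = np.abs(feature_map)               # hypothetical scoring rule
    idx = np.unravel_index(scores.argmax(), scores.shape)
    return idx, float(scores[idx])

def classify(image, rois, n_classes=4):
    # Stand-in classification module over the fused (image + ROIs) input.
    whole_logits = rng.standard_normal(n_classes)
    roi_logits = [rng.standard_normal(n_classes) for _ in rois]
    fused_logits = whole_logits + sum(roi_logits)   # naive fusion
    return whole_logits, roi_logits, fused_logits

def cross_entropy(logits, label):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return float(-np.log(p[label]))

image, label = rng.standard_normal((64, 64)), 2    # annotated sample image
features = extract_features(image)
rois = [propose_roi(f)[0] for f in features]

whole_logits, roi_logits, fused_logits = classify(image, rois)
loss = (cross_entropy(whole_logits, label)                  # whole-image loss
        + sum(cross_entropy(l, label) for l in roi_logits)  # part loss
        + cross_entropy(fused_logits, label))               # fusion loss
# ...plus the level losses over the ROI scores (claims 5-6); the total loss
# would then drive a gradient update of both modules' parameters.
print(loss)
```

In the real model each stand-in would be a trained network, but the shape of one training step stays the same.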
  2. The method according to claim 1, characterized in that obtaining, based on the at least two sets of features, a region of interest corresponding to each set of features and its score value comprises:
    generating, from each set of features separately, multiple candidate regions corresponding to that set of features;
    generating, based on the multiple candidate regions, the region of interest corresponding to each set of features and its score value.
  3. The method according to claim 2, characterized in that generating, based on the multiple candidate regions, the region of interest corresponding to each set of features and its score value comprises:
    calculating the score value of each candidate region;
    determining the candidate region with the highest score value as the region of interest.
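Claims 2-3 reduce ROI selection to an argmax over candidate scores. A small sketch, where the (x, y, w, h) candidate boxes and their scores are hypothetical values, not from the patent:

```python
import numpy as np

def select_roi(candidates, scores):
    # Claim 3: the candidate region with the highest score becomes the ROI.
    best = int(np.argmax(scores))
    return candidates[best], float(scores[best])

# Hypothetical candidate boxes (x, y, w, h) and their score values.
candidates = [(0, 0, 32, 32), (16, 8, 48, 48), (8, 24, 40, 40)]
scores = np.array([0.31, 0.87, 0.55])

roi, roi_score = select_roi(candidates, scores)
print(roi, roi_score)   # → (16, 8, 48, 48) 0.87
```

The selected region and its score are both kept, because the score feeds the level loss of claims 5-6.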
  4. The method according to claim 1, characterized in that fusing the regions of interest with the vehicle sample image and inputting the result into a classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the fused feature of the whole image and the regions of interest comprises:
    fusing the regions of interest with the vehicle sample image and inputting the result into the classification module, wherein the output of the classification module is the model-year classification of the vehicle sample image;
    extracting the output of the last pooling layer of the classification module to obtain the overall features of the vehicle sample image;
    splitting the overall features of the vehicle sample image to obtain the whole-image classification feature of the sample image and the features of the regions of interest.
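One way to read claim 4 is that the whole image and the ROI crops pass through the classifier as a single batch, so the last pooling layer emits one vector per input, and splitting that batch recovers the individual features. A sketch under that assumption, with hypothetical sizes:

```python
import numpy as np

# The classifier sees the whole image plus N ROI crops as one batched input,
# so the last pooling layer yields (1 + N) feature vectors. Splitting that
# batch recovers the whole-image feature and each ROI feature.
n_rois, feat_dim = 2, 6                       # hypothetical sizes
pooled = np.arange((1 + n_rois) * feat_dim, dtype=float)
pooled = pooled.reshape(1 + n_rois, feat_dim)  # stand-in pooling-layer output

whole_image_feature = pooled[0]   # whole-image classification feature
roi_features = pooled[1:]         # one feature per region of interest
print(whole_image_feature.shape, roi_features.shape)  # → (6,) (2, 6)
```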
  5. The method according to claim 1, characterized in that calculating a loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the fused feature of the whole image and the regions of interest, and the score value of each region of interest comprises:
    fusing the whole-image feature of the sample image with the features of the regions of interest to obtain a fused feature;
    calculating a fusion loss function value based on the fused feature;
    calculating a part loss function value using the features of the regions of interest;
    calculating a whole-image loss function value using the whole-image classification feature of the sample image;
    calculating, using the feature of each region of interest and its corresponding score value, the level loss function value corresponding to each region of interest;
    calculating the loss function value based on each of the level loss function values, the part loss function value, the fusion loss function value, and the whole-image loss function value.
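The level loss of claim 5 ties each ROI's proposal score to how informative its feature actually is for the true class. The claim does not fix the exact form; a common realization of such a ranking constraint is a pairwise hinge (regions that classify the true class better should also be scored higher), which the following hypothetical sketch uses:

```python
def level_loss(score, prob_true, margin=0.0):
    # Pairwise hinge: if region j gives the true class a higher probability
    # than region i, region i must not be scored above region j.
    # This hinge form is an assumption, not taken from the patent text.
    loss = 0.0
    n = len(score)
    for i in range(n):
        for j in range(n):
            if prob_true[i] < prob_true[j]:   # region j is more informative
                loss += max(0.0, margin + score[i] - score[j])
    return loss

# Two ROIs: the second is more informative but scored lower -> positive loss.
print(level_loss([0.9, 0.4], [0.2, 0.8]))  # → 0.5
```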
  6. The method according to claim 5, characterized in that the loss function value is calculated using the following formula:
    Loss = Loss_1 + Loss_2 + Loss_3 + Σ_i Loss_rank(i)
    where Loss_1 is the part loss function value, Loss_2 is the fusion loss function value, Loss_3 is the whole-image loss function value, and the Loss_rank(i) are the level loss function values corresponding to the respective regions of interest.
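Under an additive reading of claim 6 (the original filing renders the formula as images, so the exact form is an assumption recovered from the claim text), the total loss is simply the sum of the four kinds of terms:

```python
def total_loss(part_loss, fusion_loss, whole_loss, level_losses):
    # Claim 6, additive reading (assumption): total loss is the sum of the
    # part, fusion, and whole-image losses plus every ROI's level loss.
    return part_loss + fusion_loss + whole_loss + sum(level_losses)

print(total_loss(1.0, 0.5, 0.25, [0.125, 0.125]))  # → 2.0
```

Weighted variants (a coefficient per term) would fit the claim text equally well; the unweighted sum is the simplest instance.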
  7. A method for recognizing the model year of a vehicle, characterized by comprising:
    acquiring a target vehicle image;
    inputting the target vehicle image into a vehicle model-year recognition model to obtain the model year of the target vehicle image, wherein the vehicle model-year recognition model is trained by the training method for the vehicle model-year recognition model according to any one of claims 1-6.
  8. A training apparatus for a vehicle model-year recognition model, characterized by comprising:
    a first acquisition module, configured to acquire a vehicle sample image with annotation information, wherein the annotation information includes the vehicle brand and the model year of the vehicle in the vehicle sample image;
    a first feature extraction module, configured to input the vehicle sample image into a feature extraction module to obtain at least two sets of features of the vehicle sample image;
    a scoring module, configured to obtain, based on the at least two sets of features, a region of interest corresponding to each set of features and its score value;
    a second feature extraction module, configured to fuse the regions of interest with the vehicle sample image and input the result into a classification module to obtain the whole-image classification feature of the sample image, the features of the regions of interest, and the fused feature of the whole image and the regions of interest, wherein the vehicle model-year recognition model includes the feature extraction module and the classification module;
    a calculation module, configured to calculate a loss function value according to the whole-image classification feature of the sample image, the features of the regions of interest, the fused feature of the whole image and the regions of interest, and the score value of each region of interest;
    a parameter optimization module, configured to update the parameters of the feature extraction module and the classification module based on the annotation information of the vehicle sample image and the loss function value, so as to optimize the vehicle model-year recognition model.
  9. An apparatus for recognizing the model year of a vehicle, characterized by comprising:
    a second acquisition module, configured to acquire a target vehicle image;
    a recognition module, configured to input the target vehicle image into a vehicle model-year recognition model to obtain the model year of the target vehicle image, wherein the vehicle model-year recognition model is trained by the training method for the vehicle model-year recognition model according to any one of claims 1-6.
  10. An electronic device, characterized by comprising:
    a memory and a processor communicatively connected to each other, wherein the memory stores computer instructions, and the processor executes the computer instructions so as to perform the training method for the vehicle model-year recognition model according to any one of claims 1-6 or the vehicle model-year recognition method according to claim 7.
PCT/CN2020/121514 2020-03-05 2020-10-16 Method for training vehicle model-year recognition model and method for recognizing vehicle model year WO2021174863A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010137345.7A CN111340026B (en) 2020-03-05 2020-03-05 Training method for vehicle model-year recognition model and vehicle model-year recognition method
CN202010137345.7 2020-03-05

Publications (1)

Publication Number Publication Date
WO2021174863A1 true WO2021174863A1 (en) 2021-09-10

Family

ID=71184648

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/121514 WO2021174863A1 (en) 2020-03-05 2020-10-16 Method for training vehicle model-year recognition model and method for recognizing vehicle model year

Country Status (2)

Country Link
CN (1) CN111340026B (en)
WO (1) WO2021174863A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114022745A (en) * 2021-11-05 2022-02-08 光大科技有限公司 Neural network model training method and device

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340026B (en) 2020-03-05 2022-07-01 苏州科达科技股份有限公司 Training method for vehicle model-year recognition model and vehicle model-year recognition method
CN111783654B (en) * 2020-06-30 2022-09-09 苏州科达科技股份有限公司 Vehicle re-identification method and apparatus, and electronic device
CN111767954A (en) * 2020-06-30 2020-10-13 苏州科达科技股份有限公司 Vehicle fine-grained identification model generation method, system, equipment and storage medium
CN112101246A (en) * 2020-09-18 2020-12-18 济南博观智能科技有限公司 Vehicle identification method, device, equipment and medium
CN113298139B (en) * 2021-05-21 2024-02-27 广州文远知行科技有限公司 Image data optimization method, device, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100303342A1 (en) * 2009-06-02 2010-12-02 Yahoo! Inc. Finding iconic images
CN106548145A (en) * 2016-10-31 2017-03-29 北京小米移动软件有限公司 Image-recognizing method and device
CN108090429A (en) * 2017-12-08 2018-05-29 浙江捷尚视觉科技股份有限公司 Face bayonet model recognizing method before a kind of classification
CN111340026A (en) * 2020-03-05 2020-06-26 苏州科达科技股份有限公司 Training method for vehicle model-year recognition model and vehicle model-year recognition method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105590102A (en) * 2015-12-30 2016-05-18 中通服公众信息产业股份有限公司 Front car face identification method based on deep learning
CN108681707A (en) * 2018-05-15 2018-10-19 桂林电子科技大学 Wide-angle model recognizing method and system based on global and local Fusion Features
CN109359666B (en) * 2018-09-07 2021-05-28 佳都科技集团股份有限公司 Vehicle type recognition method based on multi-feature fusion neural network and processing terminal
CN109934177A (en) * 2019-03-15 2019-06-25 艾特城信息科技有限公司 Pedestrian recognition methods, system and computer readable storage medium again

Also Published As

Publication number Publication date
CN111340026B (en) 2022-07-01
CN111340026A (en) 2020-06-26

Similar Documents

Publication Publication Date Title
WO2021174863A1 (en) Method for training vehicle model-year recognition model and method for recognizing vehicle model year
CN111062413B (en) Road target detection method and device, electronic equipment and storage medium
CN108960055B (en) Lane line detection method based on local line segment mode characteristics
CN110532946B (en) Method for identifying axle type of green-traffic vehicle based on convolutional neural network
CN115239644B (en) Concrete defect identification method, device, computer equipment and storage medium
CN113269267B (en) Training method of target detection model, target detection method and device
CN110309765B (en) High-efficiency detection method for video moving target
CN113255605A (en) Pavement disease detection method and device, terminal equipment and storage medium
CN111191604A (en) Method, device and storage medium for detecting integrity of license plate
CN110599453A (en) Panel defect detection method and device based on image fusion and equipment terminal
CN113240623A (en) Pavement disease detection method and device
CN107578048A (en) A kind of long sight scene vehicle checking method based on vehicle rough sort
CN108229473A (en) Vehicle annual inspection label detection method and device
Mijić et al. Traffic sign detection using yolov3
CN117218622A (en) Road condition detection method, electronic equipment and storage medium
Liu et al. Multi-lane detection by combining line anchor and feature shift for urban traffic management
CN107871315B (en) Video image motion detection method and device
CN110991421B (en) Bayonet snap image vehicle detection method, computer storage medium and electronic equipment
Agarwal et al. Camera-based smart traffic state detection in india using deep learning models
Liu et al. UCN-YOLOv5: Traffic sign target detection algorithm based on deep learning
CN116740495A (en) Training method and defect detection method for defect detection model of road and bridge tunnel
CN115512330A (en) Object detection method based on image segmentation and laser radar point cloud completion
CN110728229A (en) Image processing method, device, equipment and storage medium
CN111126271B (en) Bayonet snap image vehicle detection method, computer storage medium and electronic equipment
CN115311630A (en) Method and device for generating distinguishing threshold, training target recognition model and recognizing target

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20923207

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20923207

Country of ref document: EP

Kind code of ref document: A1