CN113657409A - Vehicle loss detection method, device, electronic device and storage medium - Google Patents
- Publication number: CN113657409A (application CN202110937282.8A)
- Authority: CN (China)
- Prior art keywords: network, swin, damage, target image, transformer
- Legal status: Pending (an assumption, not a legal conclusion; no legal analysis has been performed)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a vehicle loss detection method and device, electronic equipment and a storage medium. The method comprises the following steps: acquiring a target image; inputting the target image into a network model whose backbone network comprises a Swin Transformer network, the backbone network being used to predict the damage position coordinates and damage categories of the target image based on the Swin Transformer network; and determining a damage detection result according to the damage position coordinates and damage categories. By using the Swin Transformer network as the backbone network, the embodiment of the invention detects more accurately than CNN-based approaches and locates and identifies damaged parts more effectively. Extracting features with a Swin Transformer backbone exploits the spatial relations among image pixels and the weighted selection of features, achieving better feature extraction and utilization. At the same time, the Swin Transformer retains CNN characteristics such as locality, translation invariance and residual learning, so its performance can exceed CNN methods while avoiding the heavy computation and large memory consumption of other vision Transformer schemes.
Description
Technical Field
The embodiment of the invention relates to machine learning technology, and in particular to a vehicle loss detection method and device, electronic equipment and a storage medium.
Background
With the rapid development of society, vehicles have become an indispensable means of transportation, and the growing number of vehicles inevitably increases the incidence of traffic accidents. After a traffic accident, an insurance company usually performs damage assessment at the accident site, i.e., determines the vehicle damage by inspecting pictures taken on site, which serves as the basis for the insurer's claim settlement. This assessment consumes substantial human resources, and its results are highly subjective. Deep-learning-based vehicle damage detection systems have therefore begun to gradually replace manual assessment, accurately detecting the vehicle damage type from one or more pictures.
Existing target detectors are mainly implemented based on CNNs. However, CNN-based image analysis is not accurate enough.
Disclosure of Invention
The invention provides a vehicle loss detection method, apparatus, electronic device and storage medium, aiming to improve the accuracy of vehicle damage detection.
In a first aspect, an embodiment of the present invention provides a vehicle loss detection method, including:
acquiring a target image;
inputting the target image into a network model, wherein a backbone network of the network model comprises a Swin Transformer network (also called a hierarchical vision Transformer network), and the backbone network is used for predicting the damage position coordinates and the damage categories of the target image based on the Swin Transformer network;
and determining a damage detection result according to the damage position coordinate and the damage category.
In a second aspect, an embodiment of the present invention further provides a vehicle loss detection apparatus, including:
the image acquisition module is used for acquiring a target image;
the detection module is used for inputting the target image into a network model, wherein a backbone network of the network model comprises a Swin Transformer network and is used for predicting the damage position coordinates and the damage categories of the target image based on the Swin Transformer network;
and the detection result determining module is used for determining a damage detection result according to the damage position coordinate and the damage category.
In a third aspect, an embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the vehicle loss detection method according to the embodiment of the present application.
In a fourth aspect, embodiments of the present invention further provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform a vehicle loss detection method as shown in the embodiments of the present application.
According to the vehicle loss detection method provided by the embodiment of the invention, a target image is acquired; the target image is input into a network model whose backbone network comprises a Swin Transformer network, used for predicting the damage position coordinates and damage categories of the target image; and a damage detection result is determined according to the damage position coordinates and damage categories. Compared with current CNN-based vehicle loss detection, which is not accurate enough, the embodiment of the invention uses the Swin Transformer network as the backbone network, detects more accurately, and locates and identifies damaged parts more effectively. Extracting features with a Swin Transformer backbone exploits the spatial relations among image pixels and the weighted selection of features, achieving better feature extraction and utilization. At the same time, the Swin Transformer retains CNN characteristics such as locality, translation invariance and residual learning, so its performance can exceed CNN methods while avoiding the heavy computation and large memory consumption of other vision Transformer schemes. The Swin Transformer blocks in the Swin Transformer support a wide range of vehicle types, work in field environments with complex photographing backgrounds, enable efficient damage assessment of damaged vehicle parts, and, through the self-attention mechanism, optimize damage assessment efficiency.
Drawings
FIG. 1 is a flow chart of a vehicle loss detection method according to a first embodiment of the present invention;
fig. 2 is a schematic structural diagram of a Swin Transformer network according to a first embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a Swin Transformer block according to a first embodiment of the present invention;
fig. 4 is a flowchart of a vehicle loss detection method in the second embodiment of the invention;
fig. 5 is a schematic structural view of a vehicle loss detection apparatus in a third embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device in the fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a vehicle loss detection method according to an embodiment of the present invention, where the present embodiment is applicable to a vehicle loss detection situation, the method may be executed by an electronic device, and the electronic device may be a computer device or a terminal, and specifically includes the following steps:
and step 110, acquiring a target image.
The target image is an image for vehicle loss detection. The user can take a picture of the damaged vehicle through the handheld terminal, and the picture obtained through taking the picture is used as a target image. The pre-captured image may also be imported to a computer device as a target image.
Step 120, inputting the target image into a network model, wherein a backbone network of the network model comprises a Swin Transformer network, and the Swin Transformer backbone is used for predicting the damage position coordinates and the damage categories of the target image.
The structure of the Swin Transformer network is shown in FIG. 2 and includes a patch partition layer and four stages. Each stage includes a linear embedding layer (or, in the later stages, a patch merging layer) and Swin Transformer blocks, and each stage performs one down-sampling.
Illustratively, an input 224 × 224 target image is divided into a set of non-overlapping patches by the patch partition layer, where each patch has a size of 4 × 4. Since the target image has 3 color channels, each patch has a feature dimension of 4 × 4 × 3 = 48, and the number of patches is H/4 × W/4.
In stage 1, a linear embedding layer first changes the feature dimension of the partitioned patches to C, and the result is fed into the Swin Transformer blocks. Stages 2 to 4 operate in the same way: a patch merging layer first merges adjacent 2 × 2 patches, so the number of patches becomes H/8 × W/8 and the feature dimension becomes 4C, and so on. The feature vector of the target image is processed through the four stages to obtain the vehicle damage category and the damaged position information. In the Swin Transformer network, the size of each patch is preset, and the number of patches is determined by that size.
The patch partition layer divides the image into a plurality of patches and obtains a feature vector for each patch. Stages 1 to 4 perform image recognition on these feature vectors to obtain the damage position coordinates and damage category of the target image. Stage 1 processes the feature vector of the target image patch by patch. Stage 2 merges the patches from stage 1, giving H/8 × W/8 patches, and processes the feature vectors within the merged patches. By analogy, each subsequent stage merges the patches of the previous stage and processes the feature vectors of the merged patches. After stage 4 produces the feature vector of the target image, the feature vector is mapped into a neural network head for image recognition.
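Illustratively, the patch bookkeeping described above can be sketched in a few lines of Python (a minimal sketch, not from the patent; the embedding dimension C = 96 and the projection of merged patches from 4C down to 2C are common Swin Transformer choices assumed here):

```python
def swin_stage_shapes(h=224, w=224, patch=4, c=96, stages=4):
    """Track (patches_h, patches_w, feature_dim) through the four stages.

    Patch partition: each 4x4 patch of a 3-channel image is flattened to
    4 * 4 * 3 = 48 values; the stage-1 linear embedding maps 48 -> C.
    Patch merging (stages 2-4): 2x2 neighbouring patches are concatenated
    (giving 4C features, as in the text) and, as is typical, a linear
    layer projects them to 2C; the shapes below track the projected dim.
    """
    ph, pw, dim = h // patch, w // patch, c
    shapes = [(ph, pw, dim)]
    for _ in range(stages - 1):
        ph, pw, dim = ph // 2, pw // 2, dim * 2
        shapes.append((ph, pw, dim))
    return shapes

shapes = swin_stage_shapes()
# (56, 56, 96) -> (28, 28, 192) -> (14, 14, 384) -> (7, 7, 768)
```

For a 224 × 224 input, the first two entries correspond to the H/4 × W/4 and H/8 × W/8 patch counts mentioned in the text.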
Optionally, inputting the target image into the network model includes: convolving the image through convolution layers to obtain convolution data; and using the convolution data as the input of the Swin Transformer network.
Optionally, convolution layers are provided before the patch partition layer, and the target image is convolved by these layers. Illustratively, two 3 × 3 convolution layers are configured; the target image is convolved by the two 3 × 3 convolution layers and converted into convolution data, which is then input into the patch partition layer.
Convolving the image first not only reduces the complexity of subsequent computation but also improves model accuracy; using two 3 × 3 convolution layers further improves convolution efficiency.
After the convolution data is input into the patch partition layer, it is divided into a set of non-overlapping patches that serve as the input features of the Swin Transformer network.
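Illustratively, the effect of the optional convolution stem on spatial size can be checked with the standard convolution output formula (a sketch; the patent does not specify stride or padding, so stride 1 and padding 1 are assumed here, which preserve the 224 × 224 size):

```python
def conv2d_out(n, k=3, stride=1, pad=1):
    """Output spatial size of a square convolution:
    floor((n + 2*pad - k) / stride) + 1."""
    return (n + 2 * pad - k) // stride + 1

# Two stride-1, padding-1 3x3 convolutions keep a 224x224 image at 224x224,
# so the patch partition layer that follows still sees an H/4 x W/4 grid.
size = 224
for _ in range(2):
    size = conv2d_out(size)
```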
The Swin Transformer backbone is formed by stacking Swin Transformer blocks in each stage. The input features are transformed in feature dimension by the linear embedding layer. The Swin Transformer network reuses features by merging adjacent patches of the input.
As shown in FIG. 3, each Swin Transformer block consists of a shifted-window-based MSA (multi-head self-attention) module followed by a two-layer MLP (multi-layer perceptron). A LayerNorm (LN) layer is applied before each MSA module and each MLP, and a residual connection is applied after each MSA module and MLP. The MSA module divides the input into non-overlapping windows and performs self-attention within each window, so its computational complexity is linear in the image size.
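The linear-versus-quadratic complexity claim above can be illustrated with the usual operation counts for global and windowed multi-head self-attention (a rough sketch following the standard Swin Transformer analysis; the constant factors and the window size M = 7 are assumptions, not part of the patent text):

```python
def global_msa_ops(h, w, c):
    """Approximate operations for self-attention over all h*w tokens:
    projections cost 4*n*c^2 and the attention itself 2*n^2*c, so the
    total grows quadratically with the number of tokens n."""
    n = h * w
    return 4 * n * c * c + 2 * n * n * c

def window_msa_ops(h, w, c, m=7):
    """Approximate operations when attention is restricted to m x m
    windows: the attention term becomes 2*m^2*n*c, so the total grows
    only linearly with the number of tokens."""
    n = h * w
    return 4 * n * c * c + 2 * m * m * n * c

# On a 56 x 56 token grid with C = 96, windowed attention is far cheaper,
# and doubling H and W scales its cost by exactly 4 (linear in image area).
g = global_msa_ops(56, 56, 96)
win = window_msa_ops(56, 56, 96)
```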
Optionally, the Swin Transformer network includes a plurality of Swin Transformer blocks, and each Swin Transformer block includes a plurality of MSA layers;
the input of the MSA layer is provided with a first convolution layer; the output of the MSA layer is provided with a second convolutional layer.
For each MSA layer, a first convolution layer is set at its input for dimensionality reduction, and a second convolution layer is set at its output for dimensionality restoration. Illustratively, both the first and the second convolution layer may be 1 × 1 convolution layers; correspondingly, the input of the MSA layer is provided with a 1 × 1 convolution layer and the output of the MSA layer is provided with a 1 × 1 convolution layer. Providing convolution layers at the input and output of each MSA layer improves the efficiency of feature operations and increases computation speed.
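A parameter count shows why such a 1 × 1 bottleneck can save computation (a hedged sketch: the reduction ratio r = 4 is a hypothetical choice not stated in the patent, and biases are omitted):

```python
def msa_params(c):
    """Weights of one MSA layer at width C: the Q, K, V and output
    projections together cost roughly 4 * C^2 (biases omitted)."""
    return 4 * c * c

def msa_params_with_bottleneck(c, r=4):
    """Hypothetical variant: a first 1x1 convolution reduces C -> C/r,
    the MSA runs at the reduced width C/r, and a second 1x1 convolution
    restores C/r -> C."""
    cr = c // r
    return c * cr + 4 * cr * cr + cr * c

full = msa_params(768)                     # e.g. a 768-channel stage
reduced = msa_params_with_bottleneck(768)  # bottlenecked alternative
```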
Optionally, the backbone network is connected to a neck network, and the neck network includes:
feature Pyramid Networks (FPN) and Balanced Feature Pyramid Networks (BFP).
The feature pyramid network extracts features from the image at every scale and generates a multi-scale feature representation in which the feature maps at all levels carry strong semantic information, including some high-resolution feature maps.
The features from stages 1 to 4 are processed by size, i.e., from the bottom to the top of the feature pyramid network; the feature pyramid network extracts features at each level, generates the multi-scale feature representation, and fuses the features, so the features at each level carry semantic information. Feature fusion is performed through the feature pyramid network, while the balanced feature pyramid network strengthens the balanced multi-level semantic features through deep integration; features are thus enhanced by the balanced feature pyramid network.
The neck network connects the backbone network and the head network, so that the features output by the backbone can be applied to the head more efficiently, improving data processing efficiency.
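The top-down fusion of the feature pyramid and the averaging step of the balanced feature pyramid can be illustrated with a one-dimensional toy model (purely illustrative; real FPN/BFP modules operate on 2-D feature maps with learned 1 × 1 and 3 × 3 convolutions, which are omitted here):

```python
def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a 1-D feature list."""
    return [v for v in feat for _ in range(2)]

def downsample2x(feat):
    """2x downsampling by keeping every other element."""
    return feat[::2]

def fpn(levels):
    """Top-down pathway: upsample the coarser level and add it to the
    finer one. `levels` is ordered fine -> coarse; lengths halve."""
    out = [levels[-1]]
    for feat in reversed(levels[:-1]):
        top = upsample2x(out[0])
        out.insert(0, [a + b for a, b in zip(feat, top)])
    return out

def bfp(levels):
    """Balanced feature pyramid: resize all levels to the middle
    resolution and average them into one balanced feature."""
    target = len(levels[len(levels) // 2])
    resized = []
    for feat in levels:
        while len(feat) > target:
            feat = downsample2x(feat)
        while len(feat) < target:
            feat = upsample2x(feat)
        resized.append(feat)
    return [sum(col) / len(levels) for col in zip(*resized)]

levels = [[1, 1, 1, 1], [2, 2], [4]]  # fine -> coarse toy pyramid
fused = fpn(levels)                   # [[7, 7, 7, 7], [6, 6], [4]]
balanced = bfp(levels)
```

In a full BFP the balanced feature would then be refined and redistributed back to every level; that refinement step is omitted from this sketch.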
Step 130, determining a damage detection result according to the damage position coordinates and the damage category.
After the Swin Transformer network outputs the damage position coordinates and damage categories through forward propagation in step 120, the final damage detection result can be screened out using the Soft-NMS (soft non-maximum suppression) algorithm.
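Illustratively, the Soft-NMS screening step can be sketched as follows (a minimal single-class version with Gaussian decay; the sigma and score-threshold values are illustrative assumptions, not taken from the patent):

```python
import math

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def soft_nms(dets, sigma=0.5, score_thresh=0.1):
    """Gaussian Soft-NMS: rather than discarding every box that overlaps
    the current best one (hard NMS), decay its score by
    exp(-IoU^2 / sigma) and drop it only if the decayed score falls
    below score_thresh. `dets` is a list of (box, score) pairs."""
    dets = sorted(dets, key=lambda d: d[1], reverse=True)
    keep = []
    while dets:
        best = dets.pop(0)
        keep.append(best)
        survivors = []
        for box, score in dets:
            score *= math.exp(-iou(best[0], box) ** 2 / sigma)
            if score > score_thresh:
                survivors.append((box, score))
        dets = sorted(survivors, key=lambda d: d[1], reverse=True)
    return keep

dets = [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.7)]
kept = soft_nms(dets)  # the overlapping box survives with a decayed score
```

With hard NMS at a typical IoU threshold the second box would be discarded outright; Soft-NMS keeps it with a reduced score, which helps when two damage regions genuinely overlap.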
According to the vehicle loss detection method provided by the embodiment of the invention, a target image is acquired; the target image is input into a network model whose backbone network comprises a Swin Transformer network, used for predicting the damage position coordinates and damage categories of the target image; and a damage detection result is determined according to the damage position coordinates and damage categories. Compared with current CNN-based vehicle loss detection, which is not accurate enough, the embodiment of the invention uses the Swin Transformer network as the backbone network, detects more accurately, and locates and identifies damaged parts more effectively. Extracting features with a Swin Transformer backbone exploits the spatial relations among image pixels and the weighted selection of features, achieving better feature extraction and utilization. At the same time, the Swin Transformer retains CNN characteristics such as locality, translation invariance and residual learning, so its performance can exceed CNN methods while avoiding the heavy computation and large memory consumption of other vision Transformer schemes. The Swin Transformer blocks in the Swin Transformer support a wide range of vehicle types, work in field environments with complex photographing backgrounds, enable efficient damage assessment of damaged vehicle parts, and, through the self-attention mechanism, optimize damage assessment efficiency.
Example two
Fig. 4 is a flowchart of a vehicle loss detection method according to a second embodiment of the present invention, which further elaborates the above embodiment: before the target image is acquired in step 110, the method further includes training the Swin Transformer network. The first embodiment provides a method for vehicle loss detection using a Swin Transformer network as the backbone network; this embodiment provides the training procedure for that network. The method can be implemented through the following steps:
and step 210, marking the vehicle loss historical picture according to a marking criterion, and configuring the damage type of the vehicle loss historical picture.
The damage categories and labeling criteria can be jointly determined by claims-settlement personnel and algorithm engineers. The damage categories cover vehicle damage of varying severity that requires reimbursement. The labeling criteria include rules for special cases, such as overlapping damage of various kinds, uncertainty about whether a mark is damage, and uncertainty about the damage type. The damage categories include: scratches, dents, wrinkles, dead folds, tears, missing parts, and the like.
Historical pictures of vehicle body damage are labeled in batches based on the damage categories. Optionally, the labeling may be manual: each damage instance appearing in a picture is marked with a rectangular box, and its damage category is recorded. Pictures in which the damage category is difficult to distinguish are removed, and a vehicle body damage database is constructed.
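A record in such a database might look like the following sketch (the field names and file names are hypothetical illustrations, not from the patent; only the category list comes from the text above):

```python
# Damage categories listed in the text; the snake_case identifiers are
# an illustrative encoding choice.
DAMAGE_CATEGORIES = ["scratch", "dent", "wrinkle", "dead_fold", "tear", "missing"]

def make_annotation(image_id, x1, y1, x2, y2, category):
    """One rectangular damage mark: box corners plus a damage category."""
    if category not in DAMAGE_CATEGORIES:
        raise ValueError(f"unknown damage category: {category}")
    if x2 <= x1 or y2 <= y1:
        raise ValueError("box must have positive width and height")
    return {"image_id": image_id, "bbox": [x1, y1, x2, y2], "category": category}

ann = make_annotation("car_0001.jpg", 120, 80, 340, 210, "scratch")
```

Validating boxes and categories at labeling time matches the step of removing pictures whose damage type cannot be distinguished.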
Step 220, training the Swin Transformer network according to the labeled vehicle loss historical pictures.
Optionally, one part of the images in the body damage database is used as a training set and another part as a test set.
All pictures in the training set undergo data enhancement operations such as random cropping, random rotation, and random changes of saturation, hue and contrast; they are then scaled to 896 × 896 pixels and input into the Swin Transformer for training. The training process takes the vehicle damage images and their damage-category labels as input to train the Swin Transformer network. The network is tested on the test set every epoch, and the model parameters with the highest detection mAP are saved. The Swin Transformer network is optimized through multiple iterations.
Optionally, training the Swin Transformer network according to the labeled vehicle loss historical image includes:
and in the training process, performing regression calculation of the Swin transducer network according to the distance punishment damage function.
IoU, also called Intersection over Union, is the ratio of the intersection to the union of the predicted bounding box and the real bounding box. Networks are usually trained with an IoU-based bounding-box localization loss. However, the accuracy obtained this way is low. Therefore, the regression calculation of the Swin Transformer network is performed with the distance-penalized loss function (DIoU loss), improving the localization accuracy of the predicted box. The DIoU loss can still provide a movement direction for the bounding box when it does not overlap the target box. In addition, the DIoU loss converges faster than the IoU loss, and in cases where one box contains the other in the horizontal or vertical direction, the DIoU loss achieves fast regression.
Illustratively, the distance-penalized loss function (DIoU loss) is used to perform the bounding-box regression calculation of the Swin Transformer network. The DIoU loss L_DIoU can be calculated by the following formula:

L_DIoU = 1 − IoU + ρ²(b, b^gt) / c²

where b and b^gt denote the center points of the predicted box and the real box respectively, ρ²(b, b^gt) is the squared Euclidean distance between the two center points, c is the diagonal length of the minimum enclosing region containing both the predicted box and the real box, and IoU is the intersection-over-union of the predicted box and the real box.
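The formula above can be implemented directly for axis-aligned boxes (a minimal sketch of the standard DIoU loss for a single box pair; a batched, differentiable implementation would normally use a tensor library):

```python
def diou_loss(pred, gt):
    """DIoU loss for boxes given as (x1, y1, x2, y2):
    L_DIoU = 1 - IoU + rho^2(b, b_gt) / c^2,
    where rho is the distance between box centers and c is the diagonal
    of the smallest box enclosing both."""
    # IoU term.
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    iou = inter / (area_p + area_g - inter)
    # Squared distance between the two box centers (rho^2).
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2
    # Squared diagonal of the minimum enclosing box (c^2).
    ex1, ey1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    ex2, ey2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return 1.0 - iou + rho2 / c2
```

Note that for non-overlapping boxes the IoU term is zero for every position, but the ρ²/c² term still changes as the predicted box moves, which is exactly the "movement direction" property described above.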
Optionally, training the Swin Transformer network according to the labeled vehicle loss historical image includes:
in the training process, data enhancement is carried out according to the vehicle loss historical picture; and training the Swin Transformer network by using the data-enhanced car loss historical picture.
During training, different data enhancement methods can be adopted for the vehicle loss historical pictures, combined with trying different types of optimizers, learning-rate decay strategies, regularization techniques, and the like. In addition, multi-scale training is run for enough epochs for the loss values of the model on the training set and the test set to converge, and the model parameters with the highest mAP on the test set are saved. One complete forward and backward pass of the entire data set through the neural network is called an epoch.
In addition, a small number of image conditions, such as mosaic patterns and dim light, can cause false detections, so mosaic augmentation and random changes of image saturation are added to the data enhancement.
Step 240, inputting the target image into a network model, wherein a backbone network of the network model comprises a Swin Transformer network, and the Swin Transformer backbone is used for predicting the damage position coordinates and the damage category of the target image.
Step 250, determining a damage detection result according to the damage position coordinates and the damage category.
The vehicle loss detection method provided by the embodiment of the application can train the network more efficiently, so that the trained network is more accurate.
EXAMPLE III
Fig. 5 is a schematic structural diagram of a vehicle loss detection apparatus according to a third embodiment of the present invention. The present embodiment is applicable to vehicle loss detection; the apparatus may be deployed in an electronic device, which may be a computer device or a terminal, and specifically includes: an image acquisition module 310, a detection module 320, and a detection result determination module 330.
An image acquisition module 310 for acquiring a target image;
the detection module 320 is configured to input the target image into a network model, where a backbone network of the network model includes a Swin Transformer network, and the backbone network is used for predicting the damage position coordinates and the damage category of the target image based on the Swin Transformer network;
and a detection result determining module 330, configured to determine a damage detection result according to the damage position coordinate and the damage category.
On the basis of the foregoing embodiment, the detection module 320 is configured to:
convolving the image through the convolution layer to obtain convolution data;
and taking the convolution data as an input of a Swin Transformer network.
On the basis of the above embodiment, the Swin Transformer network includes a plurality of Swin Transformer blocks, and each Swin Transformer block includes a plurality of MSA layers;
the input of the MSA layer is provided with a first convolution layer;
the output of the MSA layer is provided with a second convolutional layer.
Specifically, the input of the MSA layer is provided with a 1 × 1 convolution layer, and the output of the MSA layer is provided with a 1 × 1 convolution layer.
On the basis of the above embodiment, the backbone network is connected to a neck network, and the neck network includes:
a feature map pyramid network and a balanced feature pyramid network.
On the basis of the above embodiment, the training device further comprises a training module. The training module is used for:
marking the vehicle loss historical picture according to a marking criterion, and configuring the damage category of the vehicle loss historical picture;
and training the Swin Transformer network according to the labeled vehicle loss historical pictures.
On the basis of the above embodiment, the training module is configured to:
and in the training process, performing the regression calculation of the Swin Transformer network according to a distance-penalized loss function (DIoU loss).
On the basis of the above embodiment, the training module is configured to:
in the training process, data enhancement is carried out according to the vehicle loss historical picture;
and training the Swin Transformer network by using the data-enhanced car loss historical picture.
In the vehicle loss detection apparatus provided by the embodiment of the invention, the image acquisition module 310 acquires a target image; the detection module 320 inputs the target image into a network model whose backbone network comprises a Swin Transformer network, used for predicting the damage position coordinates and damage categories of the target image; and the detection result determination module 330 determines a damage detection result according to the damage position coordinates and damage categories. Compared with current CNN-based vehicle loss detection, which is not accurate enough, the embodiment of the invention uses the Swin Transformer network as the backbone network, detects more accurately, and locates and identifies damaged parts more effectively. Extracting features with a Swin Transformer backbone exploits the spatial relations among image pixels and the weighted selection of features, achieving better feature extraction and utilization. At the same time, the Swin Transformer retains CNN characteristics such as locality, translation invariance and residual learning, so its performance can exceed CNN methods while avoiding the heavy computation and large memory consumption of other vision Transformer schemes. The Swin Transformer blocks in the Swin Transformer support a wide range of vehicle types, work in field environments with complex photographing backgrounds, enable efficient damage assessment of damaged vehicle parts, and, through the self-attention mechanism, optimize damage assessment efficiency.
The vehicle loss detection device provided by the embodiment of the invention can execute the vehicle loss detection method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 6 is a schematic structural diagram of an electronic apparatus according to a fourth embodiment of the present invention, as shown in fig. 6, the electronic apparatus includes a processor 40, a memory 41, an input device 42, and an output device 43; the number of the processors 40 in the electronic device may be one or more, and one processor 40 is taken as an example in fig. 6; the processor 40, the memory 41, the input device 42 and the output device 43 in the electronic apparatus may be connected by a bus or other means, and the bus connection is exemplified in fig. 6.
The memory 41, as a computer-readable storage medium, may be used to store software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the vehicle loss detection method in the embodiment of the present invention (for example, the image acquisition module 310, the detection module 320, the detection result determination module 330, and the training module in the vehicle loss detection apparatus). The processor 40 executes various functional applications of the electronic device and data processing by executing software programs, instructions, and modules stored in the memory 41, that is, implements the vehicle loss detection method described above.
The memory 41 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 41 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 41 may further include memory located remotely from the processor 40, which may be connected to the electronic device through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 42 is operable to receive input numeric or character information and to generate key signal inputs relating to user settings and function controls of the electronic apparatus. The output device 43 may include a display device such as a display screen.
Example Five
An embodiment of the present invention further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a vehicle loss detection method, the method comprising:
acquiring a target image;
inputting the target image into a network model, wherein a backbone network of the network model comprises a Swin Transformer network, and the backbone network is used for predicting the damage position coordinates and the damage categories of the target image based on the Swin Transformer network;
and determining a damage detection result according to the damage position coordinate and the damage category.
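The three steps above can be sketched in code. This is a minimal illustration only; the function, class, and field names are hypothetical and not taken from the patent:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class DamageDetection:
    box: Tuple[float, float, float, float]  # damage position coordinates (x1, y1, x2, y2)
    category: str                           # damage category, e.g. "scratch" or "dent"
    score: float                            # model confidence

def detect_vehicle_damage(image, model: Callable) -> List[DamageDetection]:
    # Step 1: the target image is acquired (here, passed in by the caller).
    # Step 2: the network model, whose backbone is assumed to be a Swin
    # Transformer, predicts damage position coordinates and damage categories.
    boxes, categories, scores = model(image)
    # Step 3: the damage detection result is determined from the predictions.
    return [DamageDetection(b, c, s) for b, c, s in zip(boxes, categories, scores)]
```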
On the basis of the above embodiment, the inputting the target image into the network model includes:
convolving the target image through a convolution layer to obtain convolution data;
and taking the convolution data as an input of a Swin Transformer network.
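As a sketch of this step, a non-overlapping convolution with stride equal to kernel size (the usual form of a patch-embedding stem ahead of a Swin backbone) can be written in plain NumPy; the function name and the 4-pixel patch size are assumptions for illustration:

```python
import numpy as np

def conv_patch_embed(image, weight, patch=4):
    """Non-overlapping 'convolution' (stride == kernel == patch) that turns an
    H x W x C image into a sequence of token embeddings for the Swin backbone."""
    H, W, C = image.shape
    h, w = H // patch, W // patch
    # cut the image into h*w patches of patch*patch*C values each
    patches = image[:h * patch, :w * patch].reshape(h, patch, w, patch, C)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(h * w, patch * patch * C)
    # linear projection == the convolution's weight applied per patch
    return patches @ weight  # (num_tokens, embed_dim)
```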
On the basis of the above embodiment, the Swin Transformer network includes a plurality of Swin Transformer blocks, and each Swin Transformer block includes a plurality of MSA (multi-head self-attention) layers;
the input of the MSA layer is provided with a first convolution layer; and
the output of the MSA layer is provided with a second convolution layer.
Specifically, both the first convolution layer at the input of the MSA layer and the second convolution layer at the output of the MSA layer are 1 × 1 convolution layers.
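A minimal single-head sketch of an MSA layer wrapped by the two 1 × 1 convolutions may look as follows. On a flattened token sequence, a 1 × 1 convolution reduces to a per-token linear map, which is how it is modeled here; all weight names are hypothetical:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def msa_with_pointwise_convs(x, w_in, wq, wk, wv, w_out):
    """Single-head self-attention over tokens x (num_tokens, dim),
    preceded and followed by 1x1 convolutions (per-token linear maps)."""
    x = x @ w_in                        # first 1x1 conv at the MSA input
    q, k, v = x @ wq, x @ wk, x @ wv    # query / key / value projections
    d = q.shape[-1]
    attn = softmax(q @ k.T / np.sqrt(d))  # scaled dot-product attention
    out = attn @ v
    return out @ w_out                  # second 1x1 conv at the MSA output
```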
On the basis of the above embodiment, the backbone network is connected to a neck network, and the neck network includes:
a feature map pyramid network and a balanced feature pyramid network.
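The balancing step of a balanced feature pyramid network can be sketched as: rescale all pyramid levels to one intermediate resolution, average them into a single integrated feature, then add the integrated feature back to every level. The nearest-neighbor resizing below is a simplification of the interpolation/pooling actually used, and the function names are illustrative:

```python
import numpy as np

def nearest_resize(f, size):
    """Nearest-neighbor resize of a square H x W x C feature map to size x size."""
    H, W, _ = f.shape
    ys = np.arange(size) * H // size
    xs = np.arange(size) * W // size
    return f[ys][:, xs]

def balanced_feature_pyramid(levels):
    """Rescale every level to the middle level's resolution, average them,
    and strengthen each original level with the integrated average."""
    mid = levels[len(levels) // 2].shape[0]
    integrated = np.mean([nearest_resize(f, mid) for f in levels], axis=0)
    return [f + nearest_resize(integrated, f.shape[0]) for f in levels]
```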
On the basis of the above embodiment, before acquiring the target image, the method further includes:
marking the vehicle loss history picture according to a marking criterion, and configuring the damage category of the vehicle loss history picture;
and training the Swin Transformer network according to the marked vehicle loss history picture.
On the basis of the above embodiment, the training of the Swin Transformer network according to the marked vehicle loss history picture includes:
in the training process, performing regression calculation of the Swin Transformer network according to a distance-penalty loss function.
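One widely used distance-penalty regression loss is the DIoU loss, which augments the IoU term with a penalty on the normalized distance between box centers; whether the patent intends exactly DIoU is an assumption, and the sketch below illustrates the general form:

```python
def diou_loss(pred, target):
    """DIoU-style box regression loss for (x1, y1, x2, y2) boxes:
    1 - IoU + (center distance)^2 / (enclosing-box diagonal)^2."""
    # intersection area
    x1 = max(pred[0], target[0]); y1 = max(pred[1], target[1])
    x2 = min(pred[2], target[2]); y2 = min(pred[3], target[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    iou = inter / (area_p + area_t - inter + 1e-9)
    # squared distance between box centers (the distance penalty)
    cp = ((pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2)
    ct = ((target[0] + target[2]) / 2, (target[1] + target[3]) / 2)
    d2 = (cp[0] - ct[0]) ** 2 + (cp[1] - ct[1]) ** 2
    # squared diagonal of the smallest enclosing box (normalizer)
    ex1 = min(pred[0], target[0]); ey1 = min(pred[1], target[1])
    ex2 = max(pred[2], target[2]); ey2 = max(pred[3], target[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + 1e-9
    return 1.0 - iou + d2 / c2
```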
On the basis of the above embodiment, the training of the Swin Transformer network according to the marked vehicle loss history picture includes:
in the training process, performing data enhancement according to the vehicle loss history pictures;
and training the Swin Transformer network by using the data-enhanced vehicle loss history pictures.
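A minimal data-enhancement sketch (the specific transforms are assumptions, not taken from the patent) could randomly flip the picture horizontally, adjusting the annotation boxes accordingly, and jitter its brightness:

```python
import random
import numpy as np

def augment(image, boxes):
    """Hypothetical augmentation for detection training:
    random horizontal flip plus brightness jitter; boxes are (x1, y1, x2, y2)."""
    H, W = image.shape[:2]
    if random.random() < 0.5:
        image = image[:, ::-1]  # mirror the picture left-right
        # mirror each box: x coordinates reflect around the image width
        boxes = [(W - x2, y1, W - x1, y2) for (x1, y1, x2, y2) in boxes]
    scale = random.uniform(0.8, 1.2)          # brightness jitter factor
    image = np.clip(image * scale, 0, 255)    # keep valid pixel range
    return image, boxes
```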
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also execute the relevant operations in the vehicle loss detection method provided by any embodiment of the present invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus necessary general-purpose hardware, and certainly can also be implemented entirely by hardware, but in many cases the former is the preferred implementation. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a flash memory (FLASH), a hard disk, or an optical disk of a computer, and includes instructions for enabling an electronic device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the vehicle loss detection apparatus, the included units and modules are only divided according to the functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.
Claims (10)
1. A vehicle loss detection method, characterized by comprising:
acquiring a target image;
inputting the target image into a network model, wherein a backbone network of the network model comprises a Swin Transformer network, and the backbone network is used for predicting the damage position coordinates and the damage categories of the target image based on the Swin Transformer network;
and determining a damage detection result according to the damage position coordinate and the damage category.
2. The method of claim 1, wherein inputting the target image to a network model comprises:
convolving the target image through a convolution layer to obtain convolution data;
and taking the convolution data as an input of a Swin Transformer network.
3. The method of claim 1, wherein the Swin Transformer network comprises a plurality of Swin Transformer blocks, wherein the Swin Transformer blocks comprise a plurality of MSA layers;
the input of the MSA layer is provided with a first convolution layer;
the output of the MSA layer is provided with a second convolutional layer.
4. The method of claim 1, wherein the backbone network is connected to a neck network, the neck network comprising:
a feature map pyramid network and a balanced feature pyramid network.
5. The method of claim 1, further comprising, prior to acquiring the target image:
marking the vehicle loss history picture according to a marking criterion, and configuring the damage category of the vehicle loss history picture;
and training the Swin Transformer network according to the marked vehicle loss history picture.
6. The method of claim 5, wherein the training of the Swin Transformer network according to the marked vehicle loss history picture comprises:
in the training process, performing regression calculation of the Swin Transformer network according to a distance-penalty loss function.
7. The method of claim 5, wherein the training of the Swin Transformer network according to the marked vehicle loss history picture comprises:
in the training process, performing data enhancement according to the vehicle loss history pictures;
and training the Swin Transformer network by using the data-enhanced vehicle loss history pictures.
8. A vehicle loss detection apparatus, characterized by comprising:
the image acquisition module is used for acquiring a target image;
the detection module is used for inputting the target image into a network model, wherein a backbone network of the network model comprises a Swin Transformer network, and the backbone network is used for predicting the damage position coordinates and the damage categories of the target image based on the Swin Transformer network;
and the detection result determining module is used for determining a damage detection result according to the damage position coordinate and the damage category.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the vehicle loss detection method as claimed in any one of claims 1 to 7 when executing the program.
10. A storage medium containing computer executable instructions for performing the vehicle loss detection method of any one of claims 1-7 when executed by a computer processor.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110937282.8A CN113657409A (en) | 2021-08-16 | 2021-08-16 | Vehicle loss detection method, device, electronic device and storage medium |
PCT/CN2022/070984 WO2023019875A1 (en) | 2021-08-16 | 2022-01-10 | Vehicle loss detection method and apparatus, and electronic device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110937282.8A CN113657409A (en) | 2021-08-16 | 2021-08-16 | Vehicle loss detection method, device, electronic device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113657409A true CN113657409A (en) | 2021-11-16 |
Family
ID=78491076
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110937282.8A Pending CN113657409A (en) | 2021-08-16 | 2021-08-16 | Vehicle loss detection method, device, electronic device and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113657409A (en) |
WO (1) | WO2023019875A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114152441A (en) * | 2021-12-13 | 2022-03-08 | 山东大学 | Rolling bearing fault diagnosis method and system based on shift window converter network |
CN114445690A (en) * | 2022-01-30 | 2022-05-06 | 百度在线网络技术(北京)有限公司 | License plate detection method, model training method, device, medium, and program product |
CN114627292A (en) * | 2022-03-08 | 2022-06-14 | 浙江工商大学 | Industrial shielding target detection method |
CN114898155A (en) * | 2022-05-18 | 2022-08-12 | 平安科技(深圳)有限公司 | Vehicle damage assessment method, device, equipment and storage medium |
CN114972771A (en) * | 2022-06-22 | 2022-08-30 | 平安科技(深圳)有限公司 | Vehicle loss assessment and claim settlement method and device, electronic equipment and storage medium |
WO2023019875A1 (en) * | 2021-08-16 | 2023-02-23 | 平安科技(深圳)有限公司 | Vehicle loss detection method and apparatus, and electronic device and storage medium |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116343043B (en) * | 2023-03-30 | 2023-11-21 | 南京审计大学 | Remote sensing image change detection method with multi-scale feature fusion function |
CN117611600B (en) * | 2024-01-22 | 2024-03-29 | 南京信息工程大学 | Image segmentation method, system, storage medium and device |
CN118537708A (en) * | 2024-07-26 | 2024-08-23 | 四川航空股份有限公司 | Hole detection image damage identification model based on improved convolutional neural network and application system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109657716A (en) * | 2018-12-12 | 2019-04-19 | 天津卡达克数据有限公司 | A kind of vehicle appearance damnification recognition method based on deep learning |
CN110728236A (en) * | 2019-10-12 | 2020-01-24 | 创新奇智(重庆)科技有限公司 | Vehicle loss assessment method and special equipment thereof |
CN111667011A (en) * | 2020-06-08 | 2020-09-15 | 平安科技(深圳)有限公司 | Damage detection model training method, damage detection model training device, damage detection method, damage detection device, damage detection equipment and damage detection medium |
CN111666990A (en) * | 2020-05-27 | 2020-09-15 | 平安科技(深圳)有限公司 | Vehicle damage characteristic detection method and device, computer equipment and storage medium |
CN112966709A (en) * | 2021-01-27 | 2021-06-15 | 中国电子进出口有限公司 | Deep learning-based fine vehicle type identification method and system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113657409A (en) * | 2021-08-16 | 2021-11-16 | 平安科技(深圳)有限公司 | Vehicle loss detection method, device, electronic device and storage medium |
- 2021-08-16: CN application CN202110937282.8A (publication CN113657409A), status Pending
- 2022-01-10: WO application PCT/CN2022/070984 (publication WO2023019875A1)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109657716A (en) * | 2018-12-12 | 2019-04-19 | 天津卡达克数据有限公司 | A kind of vehicle appearance damnification recognition method based on deep learning |
CN110728236A (en) * | 2019-10-12 | 2020-01-24 | 创新奇智(重庆)科技有限公司 | Vehicle loss assessment method and special equipment thereof |
CN111666990A (en) * | 2020-05-27 | 2020-09-15 | 平安科技(深圳)有限公司 | Vehicle damage characteristic detection method and device, computer equipment and storage medium |
CN111667011A (en) * | 2020-06-08 | 2020-09-15 | 平安科技(深圳)有限公司 | Damage detection model training method, damage detection model training device, damage detection method, damage detection device, damage detection equipment and damage detection medium |
CN112966709A (en) * | 2021-01-27 | 2021-06-15 | 中国电子进出口有限公司 | Deep learning-based fine vehicle type identification method and system |
Non-Patent Citations (1)
Title |
---|
ZE LIU et al.: "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows", arXiv, pages 1 - 13 *
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023019875A1 (en) * | 2021-08-16 | 2023-02-23 | 平安科技(深圳)有限公司 | Vehicle loss detection method and apparatus, and electronic device and storage medium |
CN114152441A (en) * | 2021-12-13 | 2022-03-08 | 山东大学 | Rolling bearing fault diagnosis method and system based on shift window converter network |
CN114445690A (en) * | 2022-01-30 | 2022-05-06 | 百度在线网络技术(北京)有限公司 | License plate detection method, model training method, device, medium, and program product |
CN114627292A (en) * | 2022-03-08 | 2022-06-14 | 浙江工商大学 | Industrial shielding target detection method |
CN114627292B (en) * | 2022-03-08 | 2024-05-14 | 浙江工商大学 | Industrial shielding target detection method |
CN114898155A (en) * | 2022-05-18 | 2022-08-12 | 平安科技(深圳)有限公司 | Vehicle damage assessment method, device, equipment and storage medium |
CN114898155B (en) * | 2022-05-18 | 2024-05-28 | 平安科技(深圳)有限公司 | Vehicle damage assessment method, device, equipment and storage medium |
CN114972771A (en) * | 2022-06-22 | 2022-08-30 | 平安科技(深圳)有限公司 | Vehicle loss assessment and claim settlement method and device, electronic equipment and storage medium |
CN114972771B (en) * | 2022-06-22 | 2024-06-28 | 平安科技(深圳)有限公司 | Method and device for vehicle damage assessment and claim, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2023019875A1 (en) | 2023-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113657409A (en) | Vehicle loss detection method, device, electronic device and storage medium | |
CN113609896A (en) | Object-level remote sensing change detection method and system based on dual-correlation attention | |
CN111259710B (en) | Parking space structure detection model training method adopting parking space frame lines and end points | |
CN115631344B (en) | Target detection method based on feature self-adaptive aggregation | |
CN109034136A (en) | Image processing method, device, picture pick-up device and storage medium | |
CN115496923B (en) | Multi-mode fusion target detection method and device based on uncertainty perception | |
CN113487610B (en) | Herpes image recognition method and device, computer equipment and storage medium | |
CN114519853B (en) | Three-dimensional target detection method and system based on multi-mode fusion | |
CN113850136A (en) | Yolov5 and BCNN-based vehicle orientation identification method and system | |
CN114842035A (en) | License plate desensitization method, device and equipment based on deep learning and storage medium | |
CN115797336A (en) | Fault detection method and device of photovoltaic module, electronic equipment and storage medium | |
CN112613434A (en) | Road target detection method, device and storage medium | |
CN110909656B (en) | Pedestrian detection method and system integrating radar and camera | |
CN115953744A (en) | Vehicle identification tracking method based on deep learning | |
CN111626241A (en) | Face detection method and device | |
CN116543217A (en) | Small target classification recognition and pose estimation method with similar structure | |
CN115100469A (en) | Target attribute identification method, training method and device based on segmentation algorithm | |
US11361589B2 (en) | Image recognition method, apparatus, and storage medium | |
CN117911827A (en) | Multi-mode target detection method, device, equipment and storage medium | |
Dong et al. | Intelligent pixel-level pavement marking detection using 2D laser pavement images | |
CN111241891B (en) | Face image cutting method and device and computer readable storage medium | |
CN113537397B (en) | Target detection and image definition joint learning method based on multi-scale feature fusion | |
CN115115947A (en) | Remote sensing image detection method and device, electronic equipment and storage medium | |
CN115131762A (en) | Vehicle parking method, system and computer readable storage medium | |
CN111178158A (en) | Method and system for detecting cyclist |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||