CN112215244A - Cargo image detection method, device, equipment and storage medium - Google Patents


Info

Publication number
CN112215244A
CN112215244A (application CN202011204728.8A)
Authority
CN
China
Prior art keywords
network
trained
image detection
retrained
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011204728.8A
Other languages
Chinese (zh)
Inventor
刘阳
陈志强
李元景
张丽
邢宇翔
张良
戴诗语
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongfang Vision Technology Jiangsu Co ltd
Tsinghua University
Nuctech Co Ltd
Original Assignee
Tongfang Vision Technology Jiangsu Co ltd
Tsinghua University
Nuctech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongfang Vision Technology Jiangsu Co ltd, Tsinghua University, Nuctech Co Ltd filed Critical Tongfang Vision Technology Jiangsu Co ltd
Priority to CN202011204728.8A
Publication of CN112215244A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent

Abstract

The disclosure provides a cargo image detection method, device, equipment and storage medium, and relates to the technical field of security inspection. The method comprises the following steps: acquiring an image detection model to be trained, wherein the image detection model to be trained comprises a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is connected to each of the category detection branch networks; iteratively updating network parameters of the image detection model to be trained to obtain a trained image detection model; determining, according to the trained image detection model, a category detection branch network to be optimized and retrained; iteratively updating network parameters of the category detection branch network to be optimized and retrained to obtain an optimized and retrained image detection model; and obtaining a final image detection model from the trained image detection model and/or the optimized and retrained image detection model to detect the cargo image. The method improves the detection accuracy of the multi-category cargo image detection model and enhances the configurability and maintainability of the overall network.

Description

Cargo image detection method, device, equipment and storage medium
Technical Field
The disclosure relates to the technical field of security inspection, in particular to a cargo image detection method, a cargo image detection device, cargo image detection equipment and a readable storage medium.
Background
With the development of artificial intelligence technology, deep neural networks based on deep learning have been widely applied to classification and target detection tasks. As research has deepened, the performance of deep neural networks has steadily improved. Target detection based on deep neural networks is now widely used in the security inspection field for inspecting large containers, mainly to identify and detect prohibited articles potentially concealed inside them, thereby assisting manual inspection and greatly improving inspection efficiency.
Deep learning is a data-driven learning mechanism, and prohibited-article identification and detection fall within the scope of supervised learning. Training a deep neural network detection model requires a large amount of image data with annotation information. In the related art, a model based on the Faster Region-based Convolutional Neural Network (Faster R-CNN) is used to perform target detection and classification on cargo scanning images so as to identify and detect prohibited articles potentially concealed in a container. When there are many categories of prohibited articles, it is difficult to ensure that such a model achieves a good detection effect on all categories, so the detection accuracy is low.
As described above, how to improve the accuracy of the cargo image detection model is a problem that urgently needs to be solved.
The above information disclosed in this Background section is only for enhancement of understanding of the background of the disclosure, and therefore may contain information that does not constitute prior art already known to a person of ordinary skill in the art.
Disclosure of Invention
The invention aims to provide a cargo image detection method, a cargo image detection device, cargo image detection equipment and a readable storage medium, which can improve the detection accuracy of a cargo image detection model.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a cargo image detection method, including: acquiring an image detection model to be trained, wherein the image detection model to be trained comprises a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is respectively connected with the plurality of category detection branch networks; iteratively updating the network parameters of the image detection model to be trained to obtain a trained image detection model; determining a class detection branch network to be optimized and retrained according to the trained image detection model; iteratively updating the network parameters of the class detection branch network to be optimized and retrained to obtain an optimized and retrained image detection model; and obtaining a final image detection model according to the trained image detection model and/or the optimized retrained image detection model to detect the cargo image.
According to an embodiment of the present disclosure, each of the plurality of class detection branch networks includes a region extraction network and a classification regression network connected to each other; before the iteratively updating the network parameters of the image detection model to be trained, the method further includes: configuring a network training hyper-parameter of the image detection model to be trained, wherein the network training hyper-parameter comprises a plurality of class detection branch weight coefficients corresponding to the plurality of class detection branch networks, a plurality of region extraction weight coefficients corresponding to the plurality of region extraction networks and a plurality of classification regression weight coefficients corresponding to the plurality of classification regression networks; the iteratively updating the network parameters of the image detection model to be trained comprises: obtaining a plurality of region extraction loss functions corresponding to the plurality of region extraction networks and a plurality of classification regression loss functions corresponding to the plurality of classification regression networks; obtaining a total loss function of the image detection model to be trained according to the plurality of class detection branch weight coefficients, the plurality of region extraction loss functions, the plurality of region extraction weight coefficients, the plurality of classification regression loss functions and the plurality of classification regression weight coefficients; and iteratively updating the network parameters of the image detection model to be trained through a back propagation process based on the total loss function of the image detection model to be trained.
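The weighted total loss described in the paragraph above can be sketched in plain Python. The patent does not give an explicit formula, so this sketch assumes (as one plausible reading) that each branch contributes its branch weight times the weighted sum of its region-extraction loss and its classification-regression loss; all names (`branch_weights`, `rpn_losses`, and so on) are illustrative, not from the patent.

```python
def total_loss(branch_weights, rpn_weights, rpn_losses, cls_weights, cls_losses):
    """Total loss of the model to be trained: for each category detection
    branch i, add branch_weights[i] * (rpn_weights[i] * rpn_losses[i]
    + cls_weights[i] * cls_losses[i])."""
    return sum(
        bw * (rw * rl + cw * cl)
        for bw, rw, rl, cw, cl in zip(
            branch_weights, rpn_weights, rpn_losses, cls_weights, cls_losses
        )
    )

# Example with two category detection branches and equal weights.
loss = total_loss(
    branch_weights=[1.0, 1.0],
    rpn_weights=[1.0, 1.0],
    rpn_losses=[0.5, 0.25],
    cls_weights=[1.0, 1.0],
    cls_losses=[0.25, 0.5],
)
print(loss)  # 1.5
```

In a real implementation the per-branch weight coefficients would be the configurable network training hyper-parameters mentioned above, letting the operator emphasize or de-emphasize individual prohibited-article categories during joint training.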
According to an embodiment of the present disclosure, the trained image detection model includes a trained feature extraction network and a plurality of trained category detection branch networks; the determining the class detection branch network to be optimized and retrained according to the trained image detection model comprises the following steps: determining the class detection branch network to be optimized and retrained from the trained class detection branch networks, and obtaining the class detection branch network which does not need to be retrained; the iteratively updating the network parameters of the class detection branch network to be optimized and retrained comprises: acquiring a total loss function of the class detection branch network to be optimized and retrained; iteratively updating the network parameters of the class detection branch network to be optimized and retrained through a back propagation process based on the total loss function of the class detection branch network to be optimized and retrained, and obtaining the network parameters of the class detection branch network to be optimized and retrained, wherein the network parameters of the trained feature extraction network and the network parameters of the class detection branch network which does not need to be retrained are not iteratively updated.
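The selective retraining above amounts to updating only the parameters of the chosen branch while the backbone and the other branches stay frozen. A minimal sketch, with a toy dictionary standing in for a real parameter registry (all names hypothetical):

```python
def trainable_parameters(model, branches_to_retrain):
    """Collect only the parameters of the category detection branches to be
    optimized and retrained; the trained feature extraction network and the
    branches that do not need retraining are excluded (i.e., frozen)."""
    params = []
    for name in branches_to_retrain:
        params.extend(model["branches"][name])
    return params

# Hypothetical parameter registry for a two-branch model.
model = {
    "backbone": ["feat.conv1.w", "feat.conv2.w"],   # stays frozen
    "branches": {
        "knife": ["knife.rpn.w", "knife.cls.w"],
        "gun": ["gun.rpn.w", "gun.cls.w"],          # stays frozen
    },
}
print(trainable_parameters(model, ["knife"]))
# ['knife.rpn.w', 'knife.cls.w']
```

In a framework such as PyTorch the same effect is typically achieved by disabling gradient tracking on the frozen parameters before building the optimizer, so back-propagation only updates the branch being optimized.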
According to an embodiment of the present disclosure, the obtaining an optimized retrained image detection model includes: and obtaining the optimized and retrained image detection model based on the network parameters of the trained feature extraction network and the network parameters of the optimized and retrained class detection branch network.
According to an embodiment of the present disclosure, after the determining, according to the trained image detection model, of the class detection branch network to be optimized and retrained, and before the iterative updating of the network parameters of the class detection branch network to be optimized and retrained, the method further includes: initializing the network parameters of the class detection branch network to be optimized and retrained by loading, in combination, one or more pre-trained models.
According to an embodiment of the present disclosure, the obtaining a final image detection model according to the trained image detection model and/or the optimized retrained image detection model to detect the cargo image includes: splitting the trained image detection model to obtain a trained feature extraction network and a plurality of trained category detection branch networks; splitting the optimized and retrained image detection model to obtain an optimized and retrained class detection branch network; acquiring a final feature extraction network as the trained feature extraction network; determining a final plurality of class detection branch networks from the plurality of trained class detection branch networks and the optimized retrained class detection branch network; and combining the final feature extraction network and the final multiple category detection branch networks to obtain the final image detection model so as to detect the cargo image.
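The split-and-merge procedure above can be illustrated with a small Python sketch: keep the trained feature extraction network, and for each category take the optimized retrained branch when one exists, otherwise the originally trained branch. The dictionary layout and names are assumptions for illustration only.

```python
def merge_final_model(trained, retrained, retrained_names):
    """Build the final image detection model: the trained backbone plus,
    per category, either the retrained branch or the trained one."""
    final_branches = dict(trained["branches"])       # start from trained branches
    for name in retrained_names:
        final_branches[name] = retrained["branches"][name]  # swap in retrained ones
    return {"backbone": trained["backbone"], "branches": final_branches}

# Toy models: only the "gun" branch was optimized and retrained.
trained = {"backbone": "feat_v1",
           "branches": {"knife": "knife_v1", "gun": "gun_v1"}}
retrained = {"backbone": "feat_v1",
             "branches": {"gun": "gun_v2"}}
final = merge_final_model(trained, retrained, ["gun"])
print(final["branches"])
# {'knife': 'knife_v1', 'gun': 'gun_v2'}
```

Because the backbone is never touched during branch retraining, the merged model is guaranteed to be consistent: every branch, old or new, was trained against the same feature extraction network.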
According to an embodiment of the present disclosure, the method further comprises: acquiring the cargo image; extracting the features of the cargo image through the final feature extraction network to obtain a feature map of the cargo image; and processing the characteristic diagram of the cargo image through the final class detection branch network to obtain a predicted target position and a predicted target class for detecting the cargo image.
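The inference flow just described, one shared feature extraction pass followed by every category detection branch, can be sketched as follows. The lambdas are toy stand-ins for the real networks; the tuple format `(class, box, score)` is an assumption, not specified by the patent.

```python
def detect(image, backbone, branches):
    """Run the final model: extract the feature map once, then apply every
    category detection branch network to the same feature map."""
    feature_map = backbone(image)
    results = []
    for branch in branches:
        results.extend(branch(feature_map))   # predicted positions and classes
    return results

# Toy stand-ins for the trained sub-networks.
backbone = lambda img: [p * 2 for p in img]                        # fake feature map
knife_branch = lambda fm: [("knife", (0, 0, 10, 10), 0.9)] if fm else []
gun_branch = lambda fm: []                                         # nothing detected
print(detect([1, 2, 3], backbone, [knife_branch, gun_branch]))
# [('knife', (0, 0, 10, 10), 0.9)]
```

Sharing one backbone pass across all branches is what lets the method serve many prohibited-article categories without the per-category GPU cost of fully separate detection models criticized in the Background section.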
According to still another aspect of the present disclosure, there is provided a cargo image detection apparatus including: the training device comprises a to-be-trained model acquisition module, a to-be-trained model acquisition module and a training module, wherein the to-be-trained model acquisition module is used for acquiring a to-be-trained image detection model, the to-be-trained image detection model comprises a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is respectively connected with the plurality of category detection branch networks; the integral model training module is used for iteratively updating the network parameters of the image detection model to be trained to obtain a trained image detection model; the branch configuration module is used for determining a class detection branch network to be optimized and retrained according to the trained image detection model; the branch optimization module is used for iteratively updating the network parameters of the class detection branch network to be optimized and retrained to obtain an optimized and retrained image detection model; and the model merging module is used for obtaining a final image detection model according to the trained image detection model and/or the optimized retrained image detection model so as to detect the cargo image.
According to an embodiment of the present disclosure, each of the plurality of class detection branch networks includes a region extraction network and a classification regression network connected to each other; the device further comprises: a parameter configuration module, configured to configure a network training hyper-parameter of the to-be-trained image detection model, where the network training hyper-parameter includes a plurality of class detection branch weight coefficients corresponding to the plurality of class detection branch networks, a plurality of region extraction weight coefficients corresponding to the plurality of region extraction networks, and a plurality of classification regression weight coefficients corresponding to the plurality of classification regression networks; the integral model training module comprises: a first loss calculation module, configured to obtain a plurality of region extraction loss functions corresponding to the plurality of region extraction networks and a plurality of classification regression loss functions corresponding to the plurality of classification regression networks; obtaining a total loss function of the image detection model to be trained according to the plurality of class detection branch weight coefficients, the plurality of region extraction loss functions, the plurality of region extraction weight coefficients, the plurality of classification regression loss functions and the plurality of classification regression weight coefficients; and the first training module is used for iteratively updating the network parameters of the image detection model to be trained through a back propagation process based on the total loss function of the image detection model to be trained.
According to an embodiment of the present disclosure, the trained image detection model includes a trained feature extraction network and a plurality of trained category detection branch networks; the branch configuration module is further configured to: determining the class detection branch network to be optimized and retrained from the trained class detection branch networks, and obtaining the class detection branch network which does not need to be retrained; the branch optimization module includes: the second loss calculation module is used for acquiring a total loss function of the class detection branch network to be optimized and retrained; and the second training module is used for iteratively updating the network parameters of the class detection branch network to be optimized and retrained through a back propagation process based on the total loss function of the class detection branch network to be optimized and retrained to obtain the network parameters of the class detection branch network to be optimized and retrained, wherein the network parameters of the trained feature extraction network and the network parameters of the class detection branch network which do not need to be retrained are not iteratively updated.
According to an embodiment of the present disclosure, the branch optimization module is further configured to: and obtaining the optimized and retrained image detection model based on the network parameters of the trained feature extraction network and the network parameters of the optimized and retrained class detection branch network.
According to an embodiment of the present disclosure, the apparatus further comprises: and the network initialization module is used for initializing the network parameters of the class detection branch network to be optimized and retrained by combining and loading one or more pre-training models.
According to an embodiment of the present disclosure, the model merging module includes: the first model splitting module is used for splitting the trained image detection model to obtain a trained feature extraction network and a plurality of trained category detection branch networks; the second model splitting module is used for splitting the optimized and retrained image detection model to obtain an optimized and retrained class detection branch network; the model freezing module is used for acquiring a final feature extraction network as the trained feature extraction network; determining a final plurality of class detection branch networks from the plurality of trained class detection branch networks and the optimized retrained class detection branch network; and combining the final feature extraction network and the final multiple category detection branch networks to obtain the final image detection model so as to detect the cargo image.
According to an embodiment of the present disclosure, the apparatus further comprises: the initial image acquisition module is used for acquiring the cargo image; the characteristic extraction module is used for extracting the characteristics of the cargo image through the final characteristic extraction network to obtain a characteristic diagram of the cargo image; and the region classification module is used for processing the characteristic diagram of the cargo image through the final class detection branch network to obtain a predicted target position and a predicted target class for detecting the cargo image.
According to yet another aspect of the present disclosure, there is provided an apparatus comprising: a memory, a processor and executable instructions stored in the memory and executable in the processor, the processor implementing any of the methods described above when executing the executable instructions.
According to yet another aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement any of the methods described above.
The cargo image detection method provided by the embodiment of the disclosure includes obtaining an image detection model to be trained, which includes a feature extraction network and a plurality of category detection branch networks respectively connected to the feature extraction network, iteratively updating network parameters of the image detection model to be trained to obtain a trained image detection model, determining a category detection branch network to be optimized and retrained according to the trained image detection model, iteratively updating network parameters of the category detection branch network to be optimized and retrained to obtain an optimized and retrained image detection model, and obtaining a final image detection model according to the trained image detection model and/or the optimized and retrained image detection model to detect cargo images, so that the detection accuracy of the multi-category cargo image detection model can be improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
Fig. 1 shows a schematic diagram of a system architecture in an embodiment of the disclosure.
Fig. 2 shows a flowchart of a cargo image detection method in an embodiment of the present disclosure.
Fig. 3 shows a flowchart of a branch optimization method of a cargo image detection model in an embodiment of the present disclosure.
Fig. 4A shows an overall flowchart of multi-category cargo image detection model training in an embodiment of the present disclosure.
Fig. 4B shows a schematic diagram of a training process of a cargo image detection model in an embodiment of the present disclosure.
Fig. 5 shows a cargo image detection model branch optimization flow diagram in an embodiment of the present disclosure.
Fig. 6A shows a flowchart of a method for cargo image detection by using a cargo image detection model in an embodiment of the present disclosure.
Fig. 6B shows a cargo image detection model freezing flow diagram in an embodiment of the present disclosure.
Fig. 7A is a schematic overall flowchart illustrating cargo image detection by a cargo image detection model in an embodiment of the present disclosure.
Fig. 7B shows a schematic flowchart of cargo image detection by using a cargo image detection model in the embodiment of the present disclosure.
Fig. 8 shows a block diagram of a cargo image detection apparatus in an embodiment of the present disclosure.
Fig. 9 shows a block diagram of another cargo image detection apparatus in an embodiment of the present disclosure.
Fig. 10 shows a schematic structural diagram of an electronic device in an embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, apparatus, steps, etc. In other instances, well-known structures, methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the disclosure.
Furthermore, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present disclosure, "a plurality" means at least two, e.g., two, three, etc., unless explicitly specifically limited otherwise. The symbol "/" generally indicates that the former and latter associated objects are in an "or" relationship.
In the present disclosure, unless otherwise expressly specified or limited, the terms "connected" and the like are to be construed broadly, e.g., as meaning electrically connected or in communication with each other; may be directly connected or indirectly connected through an intermediate. The specific meaning of the above terms in the present disclosure can be understood by those of ordinary skill in the art as appropriate.
In some related art, a single model is used to detect multiple categories of prohibited articles. When poor detection of a certain category is reported from the field, or false-detection images of a certain category are returned for model optimization and improvement, the whole model has to be retrained. Not only are the computation and time costs very high, but optimizing one category may also affect the detection performance of the other prohibited-article categories. In other words, it is difficult to ensure that all prohibited-article categories are improved in a single training run, and it is difficult to individually optimize the detection performance of a small number of categories within the model.
In other related art, to ease maintenance and updating of the prohibited-article detection module, a separate detection model is designed for each prohibited-article category; during maintenance and updating, only the models with poor detection or high false-alarm rates need to be retrained and optimized. However, designing a separate detection model for each category leads to an excessive number of detection models, which greatly increases the time cost of training and detection as well as the demand for computing resources. Moreover, because each prohibited-article detection model is trained independently and there is no interaction among categories, mutual false alarms between categories occur frequently. To improve the detection performance for prohibited articles in large containers, the detection models in other related art can be cascaded with a classification network; although this relieves the problem of insufficient classification performance to some extent, the cascaded classification network may overfit and may require multiple rounds of training.
On the other hand, the demand for computing resources is also an important factor to consider in container prohibited-article detection. Some related art runs the prohibited-article detection modules in parallel to detect multiple categories of prohibited articles, but this approach places a high demand on computing resources. Other related art changes the overall detection framework to a serial method, which reduces the number of Graphics Processing Units (GPUs) needed for parallel computing but requires longer algorithm processing time, greatly reducing detection efficiency.
To solve the above problems in prohibited-article detection for large containers, the present disclosure provides a cargo image detection method that uses a single cargo image detection model to detect multiple categories of prohibited articles, while allowing individual optimization training of a small number of prohibited-article categories within the model. This preserves the maintainability of the multiple prohibited-article categories, reduces the consumption of computing resources such as GPUs, and effectively improves both the efficiency and the accuracy of multi-category prohibited-article detection.
Fig. 1 illustrates an exemplary system architecture 10 to which the cargo image detection method or cargo image detection apparatus of the present disclosure may be applied.
As shown in FIG. 1, the system architecture 10 may include a terminal device 102, a network 104, a server 106, and an image-checking workstation 108. The terminal device 102 may be any of various electronic devices having a display screen and supporting input and output, including but not limited to smartphones, tablets, laptop computers, desktop computers, and the like. The network 104 provides the medium for communication links between the terminal device 102 and the server 106, between the image-checking workstation 108 and the server 106, and between the image-checking workstation 108 and the terminal device 102. The network 104 may include various connection types, such as wired or wireless communication links or fiber-optic cables. The server 106 may be a single server or a server cluster providing various services, for example a background processing server (merely as an example) that supports a user in building a deep-learning-based cargo image classification model with the terminal device 102. The image-checking workstation 108 may be a workstation used in the cargo inspection processes of customs, ports, and the like, and may include an X-ray container inspection device, an X-ray article inspection machine, a detector, a computer connected to the detector, and so on.
A user may use the terminal device 102 to interact with the server 106 via the network 104 to send or receive data. For example, a user may use the terminal device 102 to configure, via the network 104, the network structure of a cargo image detection model to be processed on the server 106. As another example, the user may view on the terminal device 102 information about the training of the cargo image detection model on the server 106, such as the number of iterations and convergence conditions. The background processing server 106 may analyze and process the received cargo image data and feed the cargo image classification results back to the terminal device. The image-checking workstation 108 may also interact with the server 106 via the network 104 to send or receive data. For example, the server 106 sends the detection result of a cargo image to the image-checking workstation 108 through the network 104, or the image-checking workstation 108 sends labeled cargo images for training the cargo image detection model to the server 106 through the network 104. After the image-checking workstation 108 obtains a radiation image by scanning the cargo, the radiation image data may be uploaded through the network 104 to the database server 106 for storage.
It should be understood that the numbers of terminal devices, networks, servers, and image-checking workstations in fig. 1 are merely illustrative. There may be any number of terminal devices, networks, servers, and image-checking workstations, as required by the implementation.
FIG. 2 is a flow chart illustrating a cargo image detection method according to an exemplary embodiment. The method shown in fig. 2 may be applied to, for example, a server side of the system, and may also be applied to a terminal device of the system.
Referring to fig. 2, a cargo image detection method 20 provided by the embodiment of the present disclosure may include the following steps.
In step S202, an image detection model to be trained is obtained, where the image detection model to be trained includes a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is connected to each of the plurality of category detection branch networks. The feature extraction network may also be referred to as a basic network (backbone); for example, a Visual Geometry Group (VGG) network from the University of Oxford may be used as the feature extraction network, and the VGG network may take various configurations according to different settings of its weight layers, such as convolutional layers and max-pooling layers. The feature extraction network can be connected to the plurality of category detection branch networks respectively, and each category detection branch network can perform classification and region box regression after a region pooling layer connected to a Region Proposal Network (RPN), as in the Faster R-CNN network.
In some embodiments, for example, each of the plurality of category detection branch networks may include a region extraction network and a classification regression network connected to each other, and each category detection branch network may be configured to detect a different category of forbidden articles; for example, different category detection branch networks may output different forbidden-article classification results. A forbidden-article branch may have multiple fine classes at stage one (e.g., the region proposal stage) that then correspond to multiple stage-two branches (e.g., the classification and region box regression stage), and a given stage-two branch may likewise correspond to multiple fine classes; alternatively, stage one of a category branch may correspond to a single class, with stage two corresponding to the same class.
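The architecture described above (one shared feature extraction network feeding several per-category branches, each with a region proposal stage and a classification/regression stage) can be sketched structurally as follows. This is a minimal illustration with invented names and toy stand-in functions, not the patent's actual deep network:

```python
# Minimal structural sketch (hypothetical names): a shared feature extractor
# feeds several per-category branches, each made of a region-proposal stage
# and a classification / box-regression stage.

class CategoryBranch:
    """One forbidden-article category branch: stage one (RPN) + stage two."""
    def __init__(self, name, propose, classify):
        self.name = name
        self.propose = propose    # stage one: feature map -> candidate regions
        self.classify = classify  # stage two: region -> (class score, box)

    def detect(self, feature_map):
        regions = self.propose(feature_map)
        return [self.classify(r) for r in regions]

class MultiBranchDetector:
    """Shared backbone connected to several category detection branches."""
    def __init__(self, backbone, branches):
        self.backbone = backbone
        self.branches = branches

    def detect(self, image):
        feature_map = self.backbone(image)
        # every branch consumes the same shared feature map
        return {b.name: b.detect(feature_map) for b in self.branches}

# Toy stand-ins so the sketch runs end to end (not real networks)
backbone = lambda img: [v * 2 for v in img]        # fake feature extractor
propose = lambda fmap: [fmap[:2]]                  # fake region proposals
classify = lambda region: {"score": sum(region), "box": region}

detector = MultiBranchDetector(backbone, [
    CategoryBranch("knife", propose, classify),
    CategoryBranch("gun", propose, classify),
])
result = detector.detect([1, 2, 3])
```

The point of the structure is that branches share one feature computation but are otherwise independent, which is what later makes per-branch locking and retraining possible.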
In step S204, the network parameters of the image detection model to be trained are iteratively updated to obtain the trained image detection model. The plurality of category detection branch networks may be trained first, that is, the branch networks to be trained are all of the category detection branch networks, and the network parameters of the backbone may be trained together with them; in other words, the network parameters of the backbone and of each category detection branch network are updated iteratively, so as to obtain the trained feature extraction network and the trained plurality of category detection branch networks.
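As an illustration of what "iteratively updating network parameters" means in step S204, a toy gradient-descent loop on a one-dimensional loss is shown below. The real model updates all backbone and branch parameters by back propagation; this sketch only shows the iteration pattern:

```python
# Toy illustration of iterative parameter updating: plain gradient descent
# on a 1-D loss L(p) = (p - 3)^2, whose gradient is 2 * (p - 3).
# The real model does this jointly over all network parameters.

def train(param, grad_fn, lr=0.1, iterations=50):
    for _ in range(iterations):
        param -= lr * grad_fn(param)   # one iterative update step
    return param

p = train(10.0, lambda p: 2 * (p - 3))   # converges toward the minimum p = 3
```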
In some embodiments, for example, a network training hyper-parameter of the image detection model to be trained may be configured before the training in step S204. The network training hyper-parameter may include a plurality of category detection branch weight coefficients corresponding to the plurality of category detection branch networks, a plurality of region extraction weight coefficients corresponding to the plurality of region extraction networks, and a plurality of classification regression weight coefficients corresponding to the plurality of classification regression networks. For example, when n forbidden-article branches to be trained are determined, stage-one and stage-two loss weight coefficients [(w_11, w_12), (w_21, w_22), …, (w_n1, w_n2)] and per-branch loss weight coefficients [k_1, k_2, …, k_n] may be set, e.g., all set to 0.2, 0.5, or 1, or different values may be set for different loss weight coefficients, which may be adjusted before each training session.
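Such a hyper-parameter configuration could be represented as plain per-branch weight tables. The category names and values below are purely illustrative, not from the patent:

```python
# Illustrative configuration of the training hyper-parameters described above:
# one (stage-one, stage-two) weight pair per branch, plus one overall
# branch weight k_i per branch. Category names are hypothetical.

stage_weights = {               # (w_i1, w_i2): stage-one / stage-two loss weights
    "knife": (1.0, 1.0),
    "gun": (0.5, 0.5),
    "explosive": (0.2, 1.0),
}
branch_weights = {"knife": 1.0, "gun": 0.5, "explosive": 0.2}   # k_i per branch

# Weights may be adjusted before each training session, e.g. to emphasise
# a branch that performed poorly in the previous round:
branch_weights["explosive"] = 1.0
```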
In some embodiments, for example, a plurality of region extraction loss functions corresponding to the plurality of region extraction networks and a plurality of classification regression loss functions corresponding to the plurality of classification regression networks may be obtained; a total loss function of the image detection model to be trained may be obtained from the plurality of category detection branch weight coefficients, the plurality of region extraction loss functions, the plurality of region extraction weight coefficients, the plurality of classification regression loss functions, and the plurality of classification regression weight coefficients; and the network parameters of the image detection model to be trained may be iteratively updated through back propagation based on that total loss function. For example, large-container images with labeled information may be used for training. Taking a single forbidden-article classification branch as an example, the stage-one loss of a single forbidden-article branch is denoted loss_11, where the cross-entropy loss for foreground/background classification is denoted L_cls1 and the regression loss for bounding-box position regression is denoted L_reg1. The stage-two loss is denoted loss_12, where the classification cross-entropy loss is denoted L_cls2 and the position regression loss is denoted L_reg2. The losses of the two stages are then respectively:

loss_11 = L_cls1 + α · L_reg1

loss_12 = L_cls2 + β · L_reg2

where α and β are scaling factors used to balance the regression losses against the classification losses.
In step S206, a category detection branch network to be optimized and retrained is determined according to the trained image detection model. After the parameters of all category branch networks are updated through training, it may turn out that some category branches achieve a good detection effect, i.e., high classification accuracy, while other category branches perform poorly. The category detection branch networks whose detection effect is poor are selected as the category detection branch networks to be optimized and retrained. Meanwhile, a data set for network model training can be obtained, which may include a large number of images of multiple classes of forbidden articles together with their corresponding labels.
In step S208, the network parameters of the category detection branch network to be optimized and retrained are iteratively updated to obtain an optimized, retrained image detection model. The branch network to be optimized and retrained may be one branch or multiple branches, which is not limited here. To avoid changing the network parameters of the category branches whose detection effect is already good, the basic network parameters are locked during training, that is, fixed at their current values; only the network parameters of the category detection branch networks that need training are iteratively updated, and the other category branches are not loaded during training, so the poorly performing branch networks can be improved in a targeted manner. The optimized, retrained image detection model can then be obtained from the trained feature extraction network parameters and the network parameters of the optimized, retrained category detection branch network.
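A framework-agnostic sketch of this selective retraining follows. Parameter values and group names are invented, and gradients are faked; a real implementation would use a deep-learning framework's parameter-freezing facilities (e.g. marking parameters non-trainable) rather than plain Python dicts:

```python
# Selective-retraining sketch (hypothetical values): parameters are grouped,
# each group carries a "trainable" flag, and an update step only touches
# trainable groups. The backbone and well-performing branches stay fixed.

params = {
    "backbone": {"values": [0.5, -0.25], "trainable": False},  # locked
    "branch_a": {"values": [0.25, 0.75], "trainable": False},  # good effect, locked
    "branch_b": {"values": [1.0, -0.5],  "trainable": True},   # poor effect, retrain
}

def sgd_step(params, grads, lr=0.5):
    """Update only the trainable parameter groups."""
    for name, group in params.items():
        if not group["trainable"]:
            continue                       # frozen: keep the fixed value
        group["values"] = [v - lr * g
                           for v, g in zip(group["values"], grads[name])]

grads = {name: [1.0, 1.0] for name in params}   # fake gradients for illustration
sgd_step(params, grads)
```

After the step, only `branch_b` has moved; the locked groups are bit-for-bit unchanged, which is exactly the guarantee the patent wants when retraining one branch.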
In step S210, a final image detection model is obtained, according to the trained image detection model and/or the optimized, retrained image detection model, to detect the cargo image. According to actual detection requirements, the feature extraction network from the overall model training is connected both to the optimized, retrained category detection branch networks and to the well-performing branch networks from the overall training that were not retrained separately, and these are combined into the final image detection model for detecting the cargo image. When splitting and regrouping among multiple different models, the same backbone network parameters must be used across the models, and the backbone network and each category detection branch network can be recombined after the split.
According to the training method of the cargo image detection model provided by the embodiments of the present disclosure, an image detection model to be trained, comprising a feature extraction network and a plurality of category detection branch networks each connected to the feature extraction network, is obtained; the network parameters of this model are iteratively updated to obtain the trained image detection model; the category detection branch network to be optimized and retrained is determined from the trained model; the network parameters of that branch network are iteratively updated to obtain the optimized, retrained image detection model; and the final image detection model is then obtained from the trained image detection model and/or the optimized, retrained image detection model to detect the cargo image. In this way, category branches with poor detection effect can be trained in a targeted manner without affecting the branches whose detection effect is good, improving the detection accuracy of the multi-category cargo image detection model as well as the configurability and maintainability of the whole network.
FIG. 3 is a flow diagram illustrating a method for branch optimization of a cargo image inspection model according to an exemplary embodiment. The method shown in fig. 3 may be applied to, for example, a server side of the system, and may also be applied to a terminal device of the system.
Referring to fig. 3, a method 30 provided by an embodiment of the present disclosure may include the following steps.
In step S302, the category detection branch network to be optimized and retrained is determined from the plurality of trained category detection branch networks, and the category detection branch networks that do not need retraining are obtained. The overall-trained image detection model comprises the trained feature extraction network and a plurality of trained category detection branch networks. Category branch networks whose detection effect is already good need not be retrained; these become the category detection branch networks that do not need training, and those branches can be locked. For example, the gradients of the network parameters of the branch networks not being trained, and of the overall-trained feature extraction network, are set to 0, so that these parameters are not changed during the iterative updates.
In step S304, the network parameters of the category detection branch network to be optimized and retrained are initialized by loading one or more pre-trained models in combination. The network parameters in a pre-trained model can be used as the initial network parameters of the category detection branch network to be trained.
In some embodiments, for example, network parameters may be loaded into the multi-category detection model through a pre-trained model; whether to train the backbone and each forbidden-article category branch in the network model may be configured, along with the number of training iterations for the backbone and for the entire multi-category detection network model. The pre-trained model may be a relatively generic model previously trained on a large data set (e.g., COCO), or a model previously trained in a similar project; using a pre-trained model tends to reduce the time required for subsequent retraining of a new model.
In other embodiments, for example, a model to be trained has a large number of required parameters, some of which may be loaded from a pre-trained model A, others from a pre-trained model B, others from a pre-trained model C, and so on. For example, when there are multiple category branches, that is, multiple stage-one and stage-two branches connected to the backbone in the cargo image detection model, and the backbone parameters of model A and model B are consistent, the parameters of the corresponding category branch a may be loaded from the pre-trained model A, the parameters of the corresponding category branch b from the pre-trained model B, and so on; when combining them into an overall model, model A is used as the pre-trained model from which the backbone parameters are loaded for initialization. The number of training iterations for the backbone and for the whole multi-category detection network model is then set. If only some of the category detection branch networks are to be trained, the backbone and the other category detection branch networks need to be locked.
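The combined loading above might be sketched as follows. Parameter values are placeholders, and the model/branch names follow the example in the text; a real implementation would merge framework state dicts rather than plain lists:

```python
# Sketch of initialising one model's parameters from several pre-trained
# models: the backbone and branch "a" from model A, branch "b" from model B.
# Values are placeholders standing in for real weight tensors.

pretrained_a = {"backbone": [1, 2], "branch_a": [3, 4], "branch_b": [9, 9]}
pretrained_b = {"backbone": [1, 2], "branch_b": [5, 6]}   # same backbone as A

# Backbones must be consistent before branch parameters can be mixed.
assert pretrained_a["backbone"] == pretrained_b["backbone"]

model = {
    "backbone": pretrained_a["backbone"],   # initialised from model A
    "branch_a": pretrained_a["branch_a"],   # initialised from model A
    "branch_b": pretrained_b["branch_b"],   # initialised from model B
}
```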
In step S306, the total loss function of the class detection branch network to be optimized and retrained is obtained.
In some embodiments, for example, only the losses of the category branches to be optimized and retrained may be counted into the total loss function. For example, when n forbidden-article branches to be trained are determined, the total loss function L_total can be expressed as:

L_total = Σ_{i=1}^{n} k_i · (w_i1 · loss_i1 + w_i2 · loss_i2)

where k_i ∈ [k_1, k_2, …, k_n] and (w_i1, w_i2) ∈ [(w_11, w_12), (w_21, w_22), …, (w_n1, w_n2)] are the per-branch and per-stage loss weight coefficients, and loss_i1 and loss_i2 are the stage-one and stage-two losses of branch i.
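A minimal sketch of this weighted total loss follows. Branch names and loss values are invented, and only the branches being retrained contribute, as described above:

```python
# Sketch of the total loss L_total = sum_i k_i * (w_i1 * loss_i1 + w_i2 * loss_i2),
# summed only over the branches selected for retraining. Names are hypothetical.

def total_loss(branch_losses, stage_weights, branch_weights):
    """branch_losses: {name: (loss_i1, loss_i2)} for branches being retrained."""
    total = 0.0
    for name, (l1, l2) in branch_losses.items():
        w1, w2 = stage_weights[name]            # (w_i1, w_i2)
        total += branch_weights[name] * (w1 * l1 + w2 * l2)   # k_i * (...)
    return total

losses = {"knife": (2.0, 4.0), "gun": (1.0, 3.0)}     # fake per-stage losses
stage_w = {"knife": (1.0, 0.5), "gun": (0.5, 1.0)}
branch_w = {"knife": 1.0, "gun": 0.5}

L = total_loss(losses, stage_w, branch_w)
# knife: 1.0 * (1.0*2.0 + 0.5*4.0) = 4.0 ; gun: 0.5 * (0.5*1.0 + 1.0*3.0) = 1.75
```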
In step S308, the network parameters of the category detection branch network to be optimized and retrained are iteratively updated through back propagation, based on its total loss function, to obtain the optimized, retrained network parameters, while the network parameters of the trained feature extraction network and of the category detection branch networks not being retrained are not updated. When the set number of training iterations is reached, training of the multi-category image detection model is finished.
According to the training method of the cargo image detection model provided by the embodiments of the present disclosure, multiple forbidden-article category branches are trained in a single training run, and category branches with poor detection effect can be selected for targeted optimization training. This can effectively improve the efficiency and accuracy of intelligent identification and detection of forbidden articles in large-container cargo, overcome the high maintenance and update costs of detection modules in prior-art large-container contraband detection, and reduce both false alarms between forbidden-article categories and the demand on computing resources. When a given forbidden-article detection branch is trained, forbidden articles of other categories serve as negative samples; this interaction between categories can significantly reduce false alarms between forbidden-article categories.
FIG. 4A is a schematic diagram illustrating an overall process flow of multi-category cargo image detection model training according to an exemplary embodiment. As shown in fig. 4A, first, a class branch to be trained is configured (S4002), then, network parameters of the class branch to be trained are initialized by loading a corresponding pre-training model (S4004), then, the class branch to be trained is trained through a large container image with labeled information, a loss function of the class branch to be trained is calculated to obtain a total loss function (S4006), and then, the network parameters of the class branch to be trained are updated based on the total loss function (S4008).
FIG. 4B is a schematic diagram illustrating a cargo image detection model training process according to an exemplary embodiment. As shown in fig. 4B, a data set for training the cargo image detection model is first acquired, and a cargo image detection deep network model comprising a plurality of forbidden-article category branches is then trained with it, where the basic network and the forbidden-article detection branches that need training are configured. During training, a training image is input into the basic network, which extracts features of the training image (S402); the output feature vectors are then input into each forbidden-article category branch to be trained. For example, in the category branch of forbidden article 1, the region proposal network of forbidden article 1 extracts candidate regions of forbidden article 1 (S4042), followed by classification and predicted rectangular-box regression (S4062); in the category branch of forbidden article 2, the region proposal network of forbidden article 2 extracts candidate regions of forbidden article 2 (S4044), followed by classification and predicted rectangular-box regression (S4064); and so on, until in the category branch of forbidden article n, the region proposal network of forbidden article n extracts candidate regions of forbidden article n (S4046), followed by classification and predicted rectangular-box regression (S4066). Finally, the classification and region-prediction losses of each branch are obtained from the labels of the training image, the total loss function is calculated, back propagation is performed (S408), and the network parameters of the forbidden-article detection branches that need training are updated.
FIG. 5 is a schematic diagram illustrating a cargo image detection model branch optimization flow according to an exemplary embodiment. The Faster R-CNN method performs excellently in image detection, but it is difficult to ensure that all categories obtain a good detection effect. In the detection of forbidden articles in large containers, the detection rate and the false alarm rate are two important indexes for measuring the detection effect of the model. A Faster R-CNN-based large-container inspection method achieves identification and detection of forbidden articles, and when the detection rate and false alarms cannot meet the usage requirements, the corresponding detection module needs to be maintained and updated. As shown in fig. 5, the category branch of forbidden article 1 is the branch that needs training, and the remaining branches, forbidden article 2 through forbidden article n, do not need training. When the category branch to be trained is configured, the basic network (S502) and the region proposal networks, classification networks, and box regression networks of forbidden articles 2 through n are locked (S5044, S5046, S5064, S5066); the loss function of the forbidden article 1 category branch is then calculated as the total loss function, and the network parameters of the forbidden article 1 category branch are updated by back propagation (S508). Without retraining the basic network, only the relevant branches need updating and maintenance; this improves the identification and detection effect of the corresponding forbidden articles, has a certain general applicability, and improves applicability and accuracy while reducing the false alarm rate.
Fig. 6A is a flowchart illustrating a method for cargo image detection using a cargo image detection model according to an exemplary embodiment. The method shown in fig. 6A may be applied to, for example, a server side of the system, and may also be applied to a terminal device of the system.
Referring to fig. 6A, a method 60 provided by embodiments of the present disclosure may include the following steps.
In step S602, the trained image detection model is split to obtain the trained feature extraction network and the plurality of trained category detection branch networks. The cargo image detection model obtained through the overall training in the above embodiments can be split into a backbone module, one module per forbidden-article category branch, and other modules, which facilitates the subsequent loading of network parameters and merging of models. The trained feature extraction network may be taken as the final feature extraction network.
In step S604, the optimized retrained image detection model is split to obtain an optimized retrained class detection branch network. According to the cargo image detection model obtained by optimization retraining in the embodiment, the forbidden class detection branch network after optimization retraining is determined.
In some embodiments, for example, the trained network model may be split, and mainly includes modules such as a backbone and each class of forbidden article branch, which facilitates later-stage network parameter loading and model merging, for example, splitting the model a into modules such as a backbone and forbidden article branches a, B, c, d, and e, and splitting the model B into modules such as a backbone and forbidden article branches a, B, and c.
In step S606, the final plurality of category detection branch networks are determined from the plurality of trained category detection branch networks and the optimized, retrained category detection branch networks. Several optimized, retrained forbidden-article category branches are selected as components of the multi-category detection model according to actual usage requirements. For example, if a model A and a model B are obtained by training, the backbones of the two detection models are consistent, model A has trained the forbidden-article branches a, b, c, d, and e, and model B has trained only the branches a, b, and c, then model B can be regarded as an independent optimization of branches a, b, and c of model A; the backbone of model A can thus be connected to branches a, b, and c of model B and branches d and e of model A to form the final image detection model.
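The assembly described in this example can be sketched with placeholder parameter values (the model/branch names follow the text; the string weights stand in for real parameter tensors):

```python
# Sketch of assembling the final model: the shared backbone from model A plus
# the better-trained copy of each forbidden-article branch. Values are
# placeholders for real network parameters.

model_a = {"backbone": "wA", "a": "a0", "b": "b0", "c": "c0", "d": "d0", "e": "e0"}
model_b = {"backbone": "wA", "a": "a1", "b": "b1", "c": "c1"}  # a, b, c re-optimised

assert model_a["backbone"] == model_b["backbone"]   # same backbone required

final = {"backbone": model_a["backbone"]}
final.update({k: model_b[k] for k in ("a", "b", "c")})  # optimised branches from B
final.update({k: model_a[k] for k in ("d", "e")})       # untouched branches from A

# Every branch must be loaded exactly once (no duplicated branches).
assert sorted(final) == sorted(model_a)
```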
In step S608, the final image detection model is obtained by combining the final feature extraction network and the final plurality of category detection branch networks, to detect the cargo image. Parameters of the corresponding trained forbidden-article branches can be loaded according to the branch configuration and the configuration file. For example, the parameters of branches d and e from model A, the parameters of branches a, b, and c from model B, and the backbone parameters (identical in model A and model B) are selected and loaded into the deep network model. When loading the model parameters, it must be ensured that the forbidden-article branches finally loaded into the deep network model are not duplicated. The deep network model loaded with the network parameters is then frozen and merged into one model file that can be called when the algorithm identifies and detects forbidden articles in large containers.
In some embodiments, for example, FIG. 6B illustrates a model freezing flow. As shown in fig. 6B, the category branches for which training is completed are first configured (S6002); the trained model is then decomposed (S6004) so that the modules to be loaded can be selected according to the training situation and the detection requirements; the appropriate branch modules are selected and their parameters loaded (S6006); finally, the computation-graph definitions and the model weights loaded with parameters are merged into a single model file (S6008) for loading and use by the subsequent detection algorithm. For example, the network model may be frozen into a .pb file, an algorithm model file that can be loaded and used by the subsequent detection algorithm.
In step S610, a cargo image is acquired. The image of the container to be detected can be preprocessed, and the goods part in the carriage in the image can be intercepted.
In step S612, the features of the cargo image are extracted through the final feature extraction network, and a feature map of the cargo image is obtained.
In step S614, the feature map of the cargo image is processed through the final category detection branch networks, and the predicted target positions and predicted target categories for the cargo image are obtained. Forbidden articles of each category are identified and detected separately, each branch being responsible only for its own category; non-maximum suppression is then performed on each forbidden-article branch independently, the target boxes meeting the score-threshold condition are merged, and finally the detection boxes are output to complete the identification and detection of forbidden articles.
In some embodiments, for example, each category branch may contain a detection task for a single category of contraband or for multiple categories. When a category branch is trained to recognize only a single class A, the final result of the branch comprises the predicted box positions and class scores of class-A targets; when the category branch is trained to recognize multiple classes, its final result comprises the predicted box positions and scores for those classes, and the results of all category branches are merged to form the prediction results for all categories.
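The per-branch post-processing described above (score thresholding, non-maximum suppression per branch, then merging all branches' results) can be sketched as follows. Box format (x1, y1, x2, y2), thresholds, and category names are assumptions for illustration:

```python
# Sketch of per-branch post-processing: each branch's detections are filtered
# by score threshold and non-maximum suppression independently, then the
# surviving boxes of all branches are merged into the final result.

def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(dets, score_thr=0.5, iou_thr=0.5):
    """dets: list of (box, score); keep high-scoring, non-overlapping boxes."""
    dets = sorted((d for d in dets if d[1] >= score_thr), key=lambda d: -d[1])
    kept = []
    for box, score in dets:
        if all(iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score))
    return kept

branch_outputs = {   # hypothetical per-branch detections: (box, score)
    "knife": [((0, 0, 10, 10), 0.9), ((1, 1, 11, 11), 0.8), ((50, 50, 60, 60), 0.3)],
    "gun":   [((20, 20, 30, 30), 0.7)],
}
final = {cat: nms(dets) for cat, dets in branch_outputs.items()}
```

In the toy data, the second knife box overlaps the first heavily and is suppressed, and the third falls below the score threshold, so each branch contributes one surviving detection to the merged result.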
According to the cargo image detection method provided by the embodiments of the present disclosure, the forbidden-article category branch networks can be combined and split, and suitable trained category detection branches can be selected according to actual needs and merged into a single model, which reduces the demand on computing resources while ensuring the maintainability of multiple forbidden-article categories. Each category branch can be added or deleted through the configuration file, one model can contain several category branches, and one category branch can contain training and detection for several fine classes, thereby realizing intelligent identification and detection of multiple categories of forbidden articles in large-container cargo and improving applicability and accuracy while reducing the false alarm rate.
Fig. 7A is a schematic diagram illustrating an overall multi-category cargo image detection process according to an exemplary embodiment. As shown in fig. 7A, firstly, a rich training data set is obtained (S7002) and a deep network model is trained (S7004), secondly, each category branch is configured and network parameters are loaded to realize the freezing of the model (S7006), finally, the intelligent identification and detection of forbidden articles in large container cargos (S7008) are carried out, and the process is finished (S7010).
Fig. 7B is a schematic diagram illustrating a cargo image detection process according to an exemplary embodiment. As shown in fig. 7B, after the deep network model is trained, the basic network (backbone) is used as a feature extractor to extract features of the container cargo image to be detected (S702); for each forbidden-article detection branch, the stage-one and stage-two processing is performed (S7042, S7044, S7046), and detection and identification results for that category of forbidden article are obtained (S7062, S7064, S7066). Finally, the detection and identification results of the various forbidden-article categories are merged to obtain the final detection and identification result for the container cargo image, thereby realizing intelligent identification and detection of multiple categories of forbidden articles in large-container cargo.
Fig. 8 is a block diagram illustrating a cargo image detection apparatus according to an exemplary embodiment. The apparatus shown in fig. 8 may be applied to, for example, a server side of the system, and may also be applied to a terminal device of the system.
Referring to fig. 8, the apparatus 80 provided in the embodiment of the present disclosure may include a model to be trained obtaining module 802, an overall model training module 804, a branch configuration module 806, a branch optimization module 808, and a model merging module 810.
The to-be-trained model obtaining module 802 may be configured to obtain an image detection model to be trained, where the image detection model to be trained includes a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is connected to the plurality of category detection branch networks respectively.
The integral model training module 804 may be configured to iteratively update network parameters of the image detection model to be trained, so as to obtain the trained image detection model.
The branch configuration module 806 may be configured to determine a class detection branch network to be optimized for retraining according to the trained image detection model.
The branch optimization module 808 may be configured to iteratively update network parameters of the class detection branch network to be optimized and retrained, so as to obtain an optimized and retrained image detection model.
The model merge module 810 may be configured to obtain a final image detection model from the trained image detection model and/or the optimized retrained image detection model to detect the cargo image.
The specific implementation of each module in the apparatus provided in the embodiment of the present disclosure may refer to the content in the foregoing method, and is not described herein again.
Fig. 9 is a block diagram illustrating another cargo image detection device according to an exemplary embodiment. The apparatus shown in fig. 9 may be applied to, for example, a server side of the system, and may also be applied to a terminal device of the system.
Referring to fig. 9, an apparatus 90 provided in an embodiment of the present disclosure may include a to-be-trained model obtaining module 902, a parameter configuration module 903, an overall model training module 904, a branch configuration module 906, a network initialization module 907, a branch optimization module 908, a model merging module 910, an initial image obtaining module 912, a feature extraction module 914 and a region classification module 916, where the overall model training module 904 may include a first loss calculation module 9042 and a first training module 9044, the branch optimization module 908 may include a second loss calculation module 9082 and a second training module 9084, and the model merging module 910 may include a first model splitting module 9102, a second model splitting module 9104 and a model freezing module 9106.
The to-be-trained model obtaining module 902 may be configured to obtain an image detection model to be trained, where the image detection model to be trained includes a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is connected to the plurality of category detection branch networks respectively. Each of the plurality of class detection branch networks includes a region extraction network and a classification regression network connected to each other.
The parameter configuration module 903 may be configured to configure a network training hyper-parameter of the image detection model to be trained, where the network training hyper-parameter includes a plurality of class detection branch weight coefficients corresponding to a plurality of class detection branch networks, a plurality of region extraction weight coefficients corresponding to a plurality of region extraction networks, and a plurality of classification regression weight coefficients corresponding to a plurality of classification regression networks.
The overall model training module 904 may be configured to iteratively update network parameters of the image detection model to be trained, so as to obtain a trained image detection model. The trained image detection model comprises a trained feature extraction network and a plurality of trained class detection branch networks.
The first loss calculation module 9042 may be configured to obtain a plurality of region extraction loss functions corresponding to the plurality of region extraction networks and a plurality of classification regression loss functions corresponding to the plurality of classification regression networks; and obtaining a total loss function of the image detection model to be trained according to the plurality of class detection branch weight coefficients, the plurality of region extraction loss functions, the plurality of region extraction weight coefficients, the plurality of classification regression loss functions and the plurality of classification regression weight coefficients.
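The weighted combination computed by the first loss calculation module 9042 can be written compactly. The following is a minimal sketch, assuming the total loss is the branch-weighted sum of each branch's weighted region extraction loss and weighted classification regression loss; the exact combination formula is not spelled out in this excerpt.

```python
def total_loss(region_losses, cls_losses, branch_w, region_w, cls_w):
    """Hypothetical total loss over K category detection branches:
    L = sum_k branch_w[k] * (region_w[k] * region_losses[k]
                             + cls_w[k] * cls_losses[k])."""
    return sum(
        bw * (rw * rl + cw * cl)
        for rl, cl, bw, rw, cw in zip(
            region_losses, cls_losses, branch_w, region_w, cls_w)
    )
```

Under this reading, raising a branch weight coefficient biases training toward that cargo category, and setting it to zero effectively removes the branch from the back-propagated gradient.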
The first training module 9044 may be configured to iteratively update network parameters of the image detection model to be trained through a back propagation process based on a total loss function of the image detection model to be trained.
The branch configuration module 906 may be configured to determine a class detection branch network to be optimized for retraining according to the trained image detection model.
The branch configuration module 906 may also be configured to determine a class detection branch network to be optimally retrained from the plurality of trained class detection branch networks and obtain a class detection branch network that does not need to be retrained.
The network initialization module 907 may be configured to initialize the network parameters of the class detection branch network to be optimized and retrained by loading one or more pre-trained models in combination.
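Loading "one or more pre-trained models in combination" can be pictured as merging several pre-trained weight dictionaries and copying any matching entries into the branch to be retrained. This is a hedged sketch: the key-matching rule, the dictionary format, and the precedence order are assumptions, not disclosed details.

```python
def init_from_pretrained(branch_params, pretrained_models):
    """Initialize branch parameters from one or more pre-trained models.
    Later models in the list take precedence on overlapping keys;
    parameters with no pre-trained counterpart keep their current values."""
    merged = {}
    for model in pretrained_models:
        merged.update(model)  # later models overwrite earlier ones
    for key in branch_params:
        if key in merged:
            branch_params[key] = merged[key]
    return branch_params
```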
The branch optimization module 908 may be configured to iteratively update network parameters of the class detection branch network to be optimized and retrained, so as to obtain an optimized and retrained image detection model.
The branch optimization module 908 may also be configured to obtain an optimized retrained image detection model based on the network parameters of the trained feature extraction network and the network parameters of the optimized retrained class detection branch network.
The second loss calculation module 9082 may be configured to obtain a total loss function of the class detection branch network to be optimized and retrained.
The second training module 9084 may be configured to iteratively update the network parameters of the class detection branch network to be optimized and retrained through a back propagation process based on a total loss function of the class detection branch network to be optimized and retrained, to obtain the network parameters of the class detection branch network to be optimized and retrained, where the network parameters of the trained feature extraction network and the network parameters of the class detection branch network that do not need to be retrained are not iteratively updated.
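The selective update performed by the second training module 9084 — only the branch being retrained receives gradient updates, while the feature extraction network and the remaining branches stay frozen — is commonly implemented by toggling a per-parameter trainable flag (in PyTorch, for example, `requires_grad`). The following framework-neutral sketch uses hypothetical parameter names.

```python
def set_trainable(params, branch_to_retrain):
    """Mark only the selected branch's parameters as trainable; the
    feature extraction network and all other branches remain frozen,
    so back propagation leaves their values untouched."""
    for name, param in params.items():
        param["trainable"] = name.startswith(branch_to_retrain + ".")
    return params
```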
The model merging module 910 may be configured to obtain a final image detection model according to the trained image detection model and/or the optimized retrained image detection model to detect the cargo image.
The first model splitting module 9102 may be configured to split the trained image detection model to obtain a trained feature extraction network and a plurality of trained category detection branch networks.
The second model splitting module 9104 may be configured to split the optimized retrained image detection model to obtain an optimized retrained class detection branch network.
The model freezing module 9106 may be configured to obtain a final feature extraction network as the trained feature extraction network; determining a final plurality of class detection branch networks from the plurality of trained class detection branch networks and the optimized retrained class detection branch network; and combining the final feature extraction network with the final multiple category detection branch networks to obtain a final image detection model for detecting the cargo image.
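The split-and-recombine step performed by modules 9102, 9104 and 9106 can be sketched as dictionary surgery: keep the trained feature extraction network, start from the trained branches, and substitute every branch that was optimized by retraining. Representing a model as nested dicts is an assumption made purely for illustration.

```python
def merge_models(trained, retrained, replaced):
    """Assemble the final model: the trained feature extraction network,
    the trained branches, and, for every name in `replaced`, the
    corresponding optimized retrained branch instead."""
    final_branches = dict(trained["branches"])
    for name in replaced:
        final_branches[name] = retrained["branches"][name]
    return {"backbone": trained["backbone"], "branches": final_branches}
```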
The initial image acquisition module 912 may be used to acquire images of cargo.
The feature extraction module 914 may be configured to extract features of the cargo image through the final feature extraction network, so as to obtain a feature map of the cargo image.
The region classification module 916 may be configured to process the feature map of the cargo image through the final class detection branch network, so as to obtain a predicted target position and a predicted target class for detecting the cargo image.
The specific implementation of each module in the apparatus provided in the embodiment of the present disclosure may refer to the content in the foregoing method, and is not described herein again.
Fig. 10 shows a schematic structural diagram of an electronic device in an embodiment of the present disclosure. It should be noted that the apparatus shown in fig. 10 is only an example of a computer system, and should not bring any limitation to the function and the scope of the application of the embodiments of the present disclosure.
As shown in fig. 10, the apparatus 1000 includes a Central Processing Unit (CPU) 1001 that can perform various appropriate actions and processes in accordance with a program stored in a Read-Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. The RAM 1003 also stores various programs and data necessary for the operation of the apparatus 1000. The CPU 1001, the ROM 1002, and the RAM 1003 are connected to each other via a bus 1004. An input/output (I/O) interface 1005 is also connected to the bus 1004.
The following components are connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN card or a modem. The communication section 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1010 as necessary, so that a computer program read out therefrom is installed into the storage section 1008 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication part 1009 and/or installed from the removable medium 1011. The above-described functions defined in the system of the present disclosure are executed when the computer program is executed by a Central Processing Unit (CPU) 1001.
It should be noted that the computer readable media shown in the present disclosure may be computer readable signal media or computer readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present disclosure may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprising a to-be-trained model acquisition module, an overall model training module, a branch configuration module, a branch optimization module and a model merging module. The names of these modules do not, in some cases, constitute a limitation of the modules themselves; for example, the to-be-trained model acquisition module may also be described as "a module for acquiring an image detection model to be trained".
As another aspect, the present disclosure also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquire an image detection model to be trained, wherein the image detection model to be trained comprises a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is respectively connected with the plurality of category detection branch networks; iteratively update network parameters of the image detection model to be trained to obtain a trained image detection model; determine a class detection branch network to be optimized and retrained according to the trained image detection model; iteratively update network parameters of the class detection branch network to be optimized and retrained to obtain an optimized retrained image detection model; and obtain a final image detection model according to the trained image detection model and/or the optimized retrained image detection model to detect the cargo image.
Exemplary embodiments of the present disclosure are specifically illustrated and described above. It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities described herein; on the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A cargo image detection method, comprising:
acquiring an image detection model to be trained, wherein the image detection model to be trained comprises a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is respectively connected with the plurality of category detection branch networks;
iteratively updating the network parameters of the image detection model to be trained to obtain a trained image detection model;
determining a class detection branch network to be optimized and retrained according to the trained image detection model;
iteratively updating the network parameters of the class detection branch network to be optimized and retrained to obtain an optimized and retrained image detection model;
and obtaining a final image detection model according to the trained image detection model and/or the optimized retrained image detection model to detect the cargo image.
2. The method of claim 1, wherein each of the plurality of class detection branch networks comprises an interconnected region extraction network and a classification regression network;
before the iteratively updating the network parameters of the image detection model to be trained, the method further includes:
configuring a network training hyper-parameter of the image detection model to be trained, wherein the network training hyper-parameter comprises a plurality of class detection branch weight coefficients corresponding to the plurality of class detection branch networks, a plurality of region extraction weight coefficients corresponding to the plurality of region extraction networks and a plurality of classification regression weight coefficients corresponding to the plurality of classification regression networks;
the iteratively updating the network parameters of the image detection model to be trained comprises:
obtaining a plurality of region extraction loss functions corresponding to the plurality of region extraction networks and a plurality of classification regression loss functions corresponding to the plurality of classification regression networks;
obtaining a total loss function of the image detection model to be trained according to the plurality of class detection branch weight coefficients, the plurality of region extraction loss functions, the plurality of region extraction weight coefficients, the plurality of classification regression loss functions and the plurality of classification regression weight coefficients;
and iteratively updating the network parameters of the image detection model to be trained through a back propagation process based on the total loss function of the image detection model to be trained.
3. The method of claim 1, wherein the trained image detection model comprises a trained feature extraction network and a plurality of trained class detection branch networks;
the determining the class detection branch network to be optimized and retrained according to the trained image detection model comprises the following steps:
determining the class detection branch network to be optimized and retrained from the trained class detection branch networks, and obtaining the class detection branch network which does not need to be retrained;
the iteratively updating the network parameters of the class detection branch network to be optimized and retrained comprises:
acquiring a total loss function of the class detection branch network to be optimized and retrained;
iteratively updating the network parameters of the class detection branch network to be optimized and retrained through a back propagation process based on the total loss function of the class detection branch network to be optimized and retrained, and obtaining the network parameters of the class detection branch network to be optimized and retrained, wherein the network parameters of the trained feature extraction network and the network parameters of the class detection branch network which does not need to be retrained are not iteratively updated.
4. The method of claim 3, wherein obtaining the optimized retrained image detection model comprises:
and obtaining the optimized and retrained image detection model based on the network parameters of the trained feature extraction network and the network parameters of the optimized and retrained class detection branch network.
5. The method of claim 1, wherein after the determining a class detection branch network to be optimized and retrained according to the trained image detection model, and before the iteratively updating network parameters of the class detection branch network to be optimized and retrained, the method further comprises:
and initializing the network parameters of the class detection branch network to be optimized and retrained by loading one or more pre-trained models in combination.
6. The method of claim 1, wherein obtaining a final image detection model from the trained image detection model and/or the optimized retrained image detection model to detect cargo images comprises:
splitting the trained image detection model to obtain a trained feature extraction network and a plurality of trained category detection branch networks;
splitting the optimized and retrained image detection model to obtain an optimized and retrained class detection branch network;
acquiring a final feature extraction network as the trained feature extraction network;
determining a final plurality of class detection branch networks from the plurality of trained class detection branch networks and the optimized retrained class detection branch network;
and combining the final feature extraction network and the final multiple category detection branch networks to obtain the final image detection model so as to detect the cargo image.
7. The method of claim 6, further comprising: acquiring the cargo image;
extracting the features of the cargo image through the final feature extraction network to obtain a feature map of the cargo image;
and processing the characteristic diagram of the cargo image through the final class detection branch network to obtain a predicted target position and a predicted target class for detecting the cargo image.
8. A cargo image detection apparatus, comprising:
the training device comprises a to-be-trained model acquisition module, a to-be-trained model acquisition module and a training module, wherein the to-be-trained model acquisition module is used for acquiring a to-be-trained image detection model, the to-be-trained image detection model comprises a feature extraction network and a plurality of category detection branch networks, and the feature extraction network is respectively connected with the plurality of category detection branch networks;
the integral model training module is used for iteratively updating the network parameters of the image detection model to be trained to obtain a trained image detection model;
the branch configuration module is used for determining a class detection branch network to be optimized and retrained according to the trained image detection model;
the branch optimization module is used for iteratively updating the network parameters of the class detection branch network to be optimized and retrained to obtain an optimized and retrained image detection model;
and the model merging module is used for obtaining a final image detection model according to the trained image detection model and/or the optimized retrained image detection model so as to detect the cargo image.
9. An apparatus, comprising: a memory, a processor, and executable instructions stored in the memory and executable on the processor, wherein the processor implements the method according to any one of claims 1-7 when executing the executable instructions.
10. A computer-readable storage medium having stored thereon computer-executable instructions, which when executed by a processor, implement the method of any one of claims 1-7.
CN202011204728.8A 2020-11-02 2020-11-02 Cargo image detection method, device, equipment and storage medium Pending CN112215244A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011204728.8A CN112215244A (en) 2020-11-02 2020-11-02 Cargo image detection method, device, equipment and storage medium


Publications (1)

Publication Number Publication Date
CN112215244A true CN112215244A (en) 2021-01-12

Family

ID=74057958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011204728.8A Pending CN112215244A (en) 2020-11-02 2020-11-02 Cargo image detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112215244A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113065533A (en) * 2021-06-01 2021-07-02 北京达佳互联信息技术有限公司 Feature extraction model generation method and device, electronic equipment and storage medium
CN113095383A (en) * 2021-03-30 2021-07-09 广州图匠数据科技有限公司 Auxiliary sale material identification method and device
CN113592864A (en) * 2021-09-28 2021-11-02 广东电网有限责任公司惠州供电局 Transformer monitoring method, device, system and medium based on convolutional neural network
CN113821674A (en) * 2021-11-23 2021-12-21 北京中超伟业信息安全技术股份有限公司 Intelligent cargo supervision method and system based on twin neural network
CN115512188A (en) * 2022-11-24 2022-12-23 苏州挚途科技有限公司 Multi-target detection method, device, equipment and medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210463A (en) * 2019-07-03 2019-09-06 中国人民解放军海军航空大学 Radar target image detecting method based on Precise ROI-Faster R-CNN
CN111860259A (en) * 2020-07-10 2020-10-30 东莞正扬电子机械有限公司 Training and using method, device, equipment and medium of driving detection model



Similar Documents

Publication Publication Date Title
CN112215244A (en) Cargo image detection method, device, equipment and storage medium
US10635979B2 (en) Category learning neural networks
EP3940591A1 (en) Image generating method, neural network compression method, and related apparatus and device
CN108229341B (en) Classification method and device, electronic equipment and computer storage medium
CN111539942B (en) Method for detecting face depth tampered image based on multi-scale depth feature fusion
CN107609506B (en) Method and apparatus for generating image
Chen et al. Seeking multi-thresholds directly from support vectors for image segmentation
CN113139543B (en) Training method of target object detection model, target object detection method and equipment
CN113361593B (en) Method for generating image classification model, road side equipment and cloud control platform
CN111881944A (en) Method, electronic device and computer readable medium for image authentication
CN108491825A (en) information generating method and device
CN110827236A (en) Neural network-based brain tissue layering method and device, and computer equipment
CN111291715B (en) Vehicle type identification method based on multi-scale convolutional neural network, electronic device and storage medium
CN116486296A (en) Target detection method, device and computer readable storage medium
CN112766284A (en) Image recognition method and device, storage medium and electronic equipment
CN111461152B (en) Cargo detection method and device, electronic equipment and computer readable medium
Mallios et al. Vehicle damage severity estimation for insurance operations using in-the-wild mobile images
CN110807159B (en) Data marking method and device, storage medium and electronic equipment
CN110879821A (en) Method, device, equipment and storage medium for generating rating card model derivative label
CN114237182A (en) Robot scheduling method and system
CN108596200A (en) The method and apparatus of Medical Images Classification
CN113963322B (en) Detection model training method and device and electronic equipment
CN117611932B (en) Image classification method and system based on double pseudo tag refinement and sample re-weighting
CN113392924B (en) Identification method of acoustic-electric imaging log and related equipment
CN115565152B (en) Traffic sign extraction method integrating vehicle-mounted laser point cloud and panoramic image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination