CN116664883A - Cargo image recognition method and system based on convolutional neural network - Google Patents

Cargo image recognition method and system based on convolutional neural network

Info

Publication number
CN116664883A
CN116664883A (application CN202310551013.7A)
Authority
CN
China
Prior art keywords
image
neural network
convolutional neural
cargo
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310551013.7A
Other languages
Chinese (zh)
Inventor
林阿勇
武博
祁锋
吴天愉
符莉婷
陈文�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hainan Port And Channel Logistics Co ltd
Original Assignee
Hainan Port And Channel Logistics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hainan Port And Channel Logistics Co ltd filed Critical Hainan Port And Channel Logistics Co ltd
Priority to CN202310551013.7A priority Critical patent/CN116664883A/en
Publication of CN116664883A publication Critical patent/CN116664883A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; Using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Using clustering, e.g. of similar faces in social networks
    • G06V10/763 Non-hierarchical techniques, e.g. based on statistics of modelling distributions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; Using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Using neural networks

Abstract

The invention discloses a cargo image recognition method and system based on a convolutional neural network, belonging to the technical field of artificial intelligence. X-ray images of a container to be inspected are acquired; a trained 2D convolutional neural network model based on the U-Net architecture performs image segmentation on each X-ray image to be processed, yielding a plurality of segmented images; similar image detection based on BOW (bag of words) and K-means then matches each segmented image against the standard pictures in a cargo database, and the match with the highest similarity is taken as the recognition result. By combining a segmentation model with a similar image detection model, the segmentation effect in the logistics cargo counting scene is greatly improved, and cargo recognition accuracy is improved accordingly.

Description

Cargo image recognition method and system based on convolutional neural network
Technical Field
The invention belongs to the technical field of artificial intelligence, and particularly relates to a cargo image recognition method and system based on a convolutional neural network.
Background
Because of cost limitations, logistics companies can assign only very few staff to cargo inventory, so an image-based automatic logistics judging system is particularly important.
Patent CN 115035129A discloses a method for identifying goods, comprising: acquiring, via monitoring equipment, an image to be processed containing at least one cargo; processing the image through a trained instance segmentation model to generate an instance segmentation result for the cargo; generating a cargo-pile outline from the instance segmentation result; and matching the outline against data in a cargo database to obtain the cargo information. The segmentation model consists of a Detector module and a mask branch (BlendMask module); the Detector network uses the FCOS (Fully Convolutional One-Stage Object Detection) algorithm, which detects objects in a per-pixel prediction manner. FCOS employs multi-level detection, such as an FPN (Feature Pyramid Network), to detect targets of different sizes at feature layers of different scales.
However, this method has the following drawbacks: it targets intelligent cargo inspection for warehouse management, the images are shot directly by monitoring equipment, and the goods it segments and matches (steel bars, steel coils, rubber, crops) are numerous in quantity but simple in structure. This scene differs greatly from logistics cargo, and when the segmentation network is applied to logistics cargo counting, the segmentation effect is poor and the accuracy is low.
Disclosure of Invention
In view of the defects of the prior art, the invention aims to provide a cargo image recognition method and system based on a convolutional neural network, to solve the problems of poor segmentation effect and low accuracy in existing recognition methods.
To achieve the above object, in a first aspect, the present invention provides a cargo image recognition method based on a convolutional neural network, the method comprising:
acquiring an X-ray image of a container to be inspected;
performing image segmentation on each X-ray image to be processed by adopting a trained 2D convolutional neural network model based on a U-Net architecture to obtain a plurality of segmented images;
and performing similar image detection based on BOW and K-means, matching each segmented image with the standard pictures in a cargo database, and taking the picture with the highest similarity as the recognition result.
Preferably, the detection of similar images based on BOW and K-means is specifically as follows:
firstly, generating characteristic points and descriptors of each image in an image database by using a SIFT algorithm;
clustering the feature points in the image library by using a K-means algorithm, wherein the number of clustering centers is K; combining the clustering centers to form a dictionary, and calculating the TF-IDF weight of each visual word according to the IDF principle to represent how important the visual word is for distinguishing images;
counting the number of times that each word in the dictionary appears in the feature set of each image in the image database, and representing each image as a histogram;
after the histogram vector of each image is obtained, constructing an inverted list of features to the images, and rapidly indexing the related candidate images through the inverted list;
and for the image to be detected, computing its SIFT features, converting them into a frequency histogram according to the TF-IDF weights, and judging the similarity between histogram vectors over the indexed candidate images.
Preferably, the 2D convolutional neural network model based on the U-Net architecture comprises:
an encoder consisting of a plurality of convolution layers and a pooling layer for extracting features of an input image;
a decoder composed of a plurality of deconvolution layers and convolution layers for upsampling the feature map extracted in the encoder and restoring to the size of the input image, and outputting a predicted segmentation result through the convolution layers;
a jump connection is added between the encoder and the decoder for fusing the shallow features and the deep features.
Preferably, the encoder is VGG or ResNet.
Preferably, the output of each convolution and transpose convolution layer in the U-Net architecture based 2D convolutional neural network model uses batch normalization for accelerating convergence and yielding better results.
Preferably, the method further comprises:
counting the quantity and the types of the goods to be detected, performing data matching with a declared goods list, and reminding or warning if the goods are inconsistent.
To achieve the above object, in a second aspect, the present invention provides a cargo image recognition system based on a convolutional neural network, including: a processor and a memory; the memory is used for storing computer execution instructions; the processor is configured to execute the computer-executable instructions such that the method of the first aspect is performed.
Preferably, a Spring Boot application is deployed on the server with Nginx as a reverse proxy; the application listens only on the local loopback address of the virtual machine, so that it can be accessed only through the Nginx proxy, and picture files can be uploaded to a server-specified folder.
Preferably, the system further comprises:
the accelerator is used for generating X-rays, and the X-rays have different degrees of energy loss after passing through the container to be detected;
the detector is used for receiving the X-rays, converting the X-rays into electric signals with different voltages according to the degree of energy loss and sending the electric signals to the image acquisition module;
the image acquisition module is used for converting the electric signals into image information through an image processing algorithm;
and the transmission and scanning module is used for controlling the relative movement of the inspected container and the X-ray source so as to obtain perspective images of the inspected container at different visual angles.
Preferably, the detector employs a photon counting X-ray detector.
In general, the above technical solutions conceived by the present invention have the following beneficial effects compared with the prior art:
the invention provides a cargo image recognition method and system based on a convolutional neural network, which are characterized in that an X-ray image of a container to be detected is acquired, a trained 2D convolutional neural network model based on a U-Net architecture is adopted to carry out image segmentation on each X-ray image to be processed, a plurality of segmented images are obtained, each segmented image is matched with a standard picture in a cargo database based on similar image detection of BOW and K-means, the highest similarity is used as a recognition result, the segmented model is combined with a similar image detection model, the object cargo checking scene is faced, the segmentation effect is greatly improved, and the recognition accuracy is further improved.
Drawings
Fig. 1 is a flowchart of a cargo image recognition method based on a convolutional neural network.
Fig. 2 is a schematic X-ray image of a container under inspection provided in an embodiment of the present invention.
FIG. 3 is a schematic diagram of a similar image detection principle based on BOW and K-means according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As shown in fig. 1, the invention provides a cargo image recognition method based on a convolutional neural network, which comprises the following steps:
as shown in fig. 2, an X-ray image of the container under inspection is acquired;
performing image segmentation on each X-ray image to be processed by adopting a trained 2D convolutional neural network model based on a U-Net architecture to obtain a plurality of segmented images;
and performing similar image detection based on BOW and K-means, matching each segmented image with the standard pictures in a cargo database, and taking the picture with the highest similarity as the recognition result.
Preferably, the X-ray image to be processed is padded before segmentation, to ensure that the image size meets the requirements of the convolution operations. The padding mode is 'CONSTANT' with a fill value of 0.
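The constant-value zero padding described above can be sketched with NumPy; the array size and the choice of padding to a multiple of 16 (so that four max-pooling stages divide evenly) are illustrative assumptions, not values from the patent:

```python
import numpy as np

# A to-be-processed X-ray image whose sides are not multiples of 16;
# a U-Net with 4 pooling stages halves the spatial size four times,
# so each side should be divisible by 2**4 = 16.
image = np.ones((250, 300), dtype=np.float32)

def pad_to_multiple(img, multiple=16):
    """Zero-pad ('CONSTANT' mode, fill value 0) so both sides divide `multiple`."""
    h, w = img.shape
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    return np.pad(img, ((0, pad_h), (0, pad_w)),
                  mode="constant", constant_values=0)

padded = pad_to_multiple(image)
```

The padded image can then be fed to the segmentation network, and the padding region cropped away from the predicted mask afterwards.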
Preferably, the 2D convolutional neural network model based on the U-Net architecture comprises:
an encoder, which is composed of a plurality of convolution layers and a pooling layer, is used to extract features of an input image.
The encoder comprises 4 stages, each comprising two convolutional layers followed by a max-pooling layer. Each convolutional layer is implemented by a conv2d-with-batch-normalization helper function, i.e. convolution followed by Batch Normalization (BN). The convolutional layers extract features of the input image, while the max-pooling layers reduce the spatial size of the image. The number of filters in the convolutional layers doubles as the encoder goes deeper, increasing from 64 to 1024. After the output of the last encoder layer, two convolutional layers are connected to form an intermediate layer.
And a decoder composed of a plurality of deconvolution layers and convolution layers for upsampling the feature map extracted in the encoder and restoring to the size of the input image, and outputting a predicted segmentation result through the convolution layers.
The decoder comprises 4 stages, each comprising a deconvolution layer followed by two convolutional layers. Each deconvolution layer is implemented by a deconv2d-with-batch-normalization helper function, i.e. deconvolution followed by batch normalization. The deconvolution layers expand the spatial dimensions of the feature map and recover the spatial detail of the image. The number of filters in the convolutional layers halves as the decoder goes deeper, decreasing from 512 to 64.
At each stage of the decoder, the output of the corresponding encoder stage is cropped and concatenated with the deconvolution output of the current stage. These jump connections help combine the high-level features extracted by the encoder with the spatial information recovered by the decoder, thereby improving model performance.
At the final layer of the decoder, the output feature map is converted into a prediction using a 1x1 convolutional layer (also with batch normalization); the number of filters here equals nlabels, the number of classes.
Finally, the function returns a prediction tensor pred, which can be used to compute the loss function and optimize the model during training.
A jump connection is added between the encoder and the decoder for fusing the shallow features and the deep features.
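The layer counts above (four encoder stages with filters doubling from 64 to 1024, four decoder stages halving back to 64, then a 1x1 output layer with nlabels filters) can be traced as a shape walk-through. This is bookkeeping only, not an implementation of the network; the 256x256 input size and the function name are illustrative assumptions:

```python
def unet_shapes(h, w, base_filters=64, stages=4):
    """Trace feature-map sizes through the U-Net encoder, middle and decoder."""
    trace, f = [], base_filters
    # Encoder: two convs then a 2x2 max-pool per stage; filters double per stage.
    for _ in range(stages):
        trace.append(("enc_conv", h, w, f))
        h, w = h // 2, w // 2          # max pooling halves the spatial size
        f *= 2
    trace.append(("middle", h, w, f))   # 1024 filters for base 64 and 4 stages
    # Decoder: deconv doubles spatial size; conv filters halve per stage.
    for _ in range(stages):
        h, w = h * 2, w * 2
        f //= 2
        trace.append(("dec_conv", h, w, f))
    trace.append(("output_1x1", h, w, "nlabels"))
    return trace

layers = unet_shapes(256, 256)
```

Walking the trace confirms that the decoder restores the input resolution while the skip connections pair stages of equal spatial size.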
In this embodiment, a cross-entropy loss function is used as the objective function for training the U-Net, measuring the difference between the predicted result and the real label.
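Per-pixel cross entropy is the mean negative log-likelihood of the true class. A minimal NumPy sketch (the probabilities and labels are illustrative values, not taken from the patent):

```python
import numpy as np

def cross_entropy(pred_probs, labels):
    """Mean negative log-likelihood of the true class for each pixel."""
    eps = 1e-12  # guard against log(0)
    picked = pred_probs[np.arange(labels.size), labels]
    return -np.mean(np.log(picked + eps))

# Two pixels, three cargo classes: softmax outputs and true labels.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])
loss = cross_entropy(probs, labels)
```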
Preferably, the encoder is VGG or ResNet.
Preferably, the output of each convolution and transpose convolution layer in the U-Net architecture based 2D convolutional neural network model uses batch normalization for accelerating convergence and yielding better results.
Batch normalization helps reduce internal covariate shift by normalizing the input to each layer to the same mean and variance, so that each layer can learn features more independently. In addition, batch normalization allows higher learning rates to be used, further speeding up the training process.
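The normalization step itself can be sketched in NumPy; the learnable scale and shift parameters (gamma, beta) present in a full BN layer are omitted for brevity, and the tensor shape is an illustrative assumption:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize a batch of feature maps per channel; x has (N, H, W, C) layout."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

# A batch of 8 feature maps, 16x16, 4 channels, far from zero mean/unit variance.
x = np.random.default_rng(0).normal(5.0, 3.0, size=(8, 16, 16, 4))
y = batch_norm(x)
```

After normalization each channel has approximately zero mean and unit variance, which is what lets subsequent layers train with larger learning rates.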
Preferably, the present invention performs similar image detection based on BOW and K-means. As shown in fig. 3, feature points and descriptors of each image in the image database are first generated using the SIFT algorithm. The feature points in the image library are then clustered with a K-means algorithm, the number of clustering centers being K, and the clustering centers are combined into a dictionary; according to the IDF principle, the TF-IDF weight of each visual word is calculated to represent how important that word is for distinguishing images. For each image in the database, the number of times each dictionary word appears in its feature set is counted, representing each image as a histogram. After the histogram vector of each image is obtained, an inverted list from features to images is constructed, through which the related candidate images are rapidly indexed. For the image to be detected, its SIFT features are computed and converted into a frequency histogram according to the TF-IDF weights, and the similarity between histogram vectors is judged over the indexed results.
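The pipeline above (descriptors, K-means dictionary, TF-IDF-weighted histograms, similarity ranking) can be sketched end-to-end. A real system would use SIFT descriptors (e.g. from OpenCV); here random 128-dimensional vectors stand in for them, and the tiny K, image count, and cosine similarity measure are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(42)

def kmeans(points, k, iters=20):
    """Plain K-means: returns the k cluster centers (the visual dictionary)."""
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(points[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = points[labels == j].mean(axis=0)
    return centers

def bow_histogram(desc, centers):
    """Count how often each visual word (nearest center) occurs in an image."""
    d = np.linalg.norm(desc[:, None] - centers[None], axis=2)
    return np.bincount(d.argmin(axis=1), minlength=len(centers)).astype(float)

# Stand-in "SIFT descriptors" for 3 database images (128-dim, like SIFT).
db_descs = [rng.normal(size=(40, 128)) for _ in range(3)]
centers = kmeans(np.vstack(db_descs), k=8)
hists = np.array([bow_histogram(d, centers) for d in db_descs])

# IDF weighting: words occurring in fewer database images weigh more.
idf = np.log(len(hists) / (1.0 + (hists > 0).sum(axis=0)))
tfidf = hists * idf

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Query with the descriptors of image 0: it should match itself best.
q = bow_histogram(db_descs[0], centers) * idf
best = int(np.argmax([cosine(q, h) for h in tfidf]))
```

The inverted list described in the text would restrict the final loop to candidate images sharing at least one visual word with the query, instead of scanning the whole database.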
Preferably, the method further comprises:
counting the quantity and the types of the goods to be detected, performing data matching with a declared goods list, and reminding or warning if the goods are inconsistent.
A table is built in the MySQL database for storing recognition results. The table contains a special mark field (SI); each SI corresponds to a standard picture of a logistics company and can represent that company's container characteristics or cargo characteristics.
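The declared-list check described above can be sketched as a simple count comparison; the field names, cargo types, and counts are illustrative assumptions, not from the patent:

```python
def check_against_declaration(recognized, declared):
    """Compare recognized cargo counts with the declared goods list.

    Returns a list of warning strings; an empty list means everything matches.
    """
    warnings = []
    for cargo, count in declared.items():
        found = recognized.get(cargo, 0)
        if found != count:
            warnings.append(f"{cargo}: declared {count}, detected {found}")
    for cargo in recognized.keys() - declared.keys():
        warnings.append(f"{cargo}: detected but not declared")
    return warnings

declared = {"steel coil": 4, "rubber": 10}
recognized = {"steel coil": 4, "rubber": 8, "unknown box": 1}
alerts = check_against_declaration(recognized, declared)
```

In the deployed system such warnings would trigger the reminder or alarm mentioned above and could be written to the MySQL results table.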
The invention provides a cargo image recognition system based on a convolutional neural network, which comprises: a processor and a memory; the memory is used for storing computer execution instructions; the processor is configured to execute the computer-executable instructions such that the method of the first aspect is performed.
Preferably, a Spring Boot application is deployed on the server with Nginx as a reverse proxy; the application listens only on the local loopback address of the virtual machine, so that it can be accessed only through the Nginx proxy, and picture files can be uploaded to a server-specified folder.
The following configuration is added to the Spring Boot configuration file:
server.address=127.0.0.1
server.port=8080
together with an upload property giving the specific path of the folder for uploaded files.
Nginx is installed on the virtual machine, the program is packaged into a jar and uploaded to the server, and the Nginx configuration file, usually located at /etc/nginx/sites-available/default, is edited on the virtual machine. The following is added within the server block:
location / {
proxy_pass http://127.0.0.1:8080;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
After configuration, Nginx is restarted, and the pictures uploaded by users through the web page are saved to the designated folder on the virtual machine.
Preferably, the system further comprises:
the accelerator is used for generating X-rays, and the X-rays have different degrees of energy loss after passing through the container to be detected;
the detector is used for receiving the X-rays, converting the X-rays into electric signals with different voltages according to the degree of energy loss and sending the electric signals to the image acquisition module;
the image acquisition module is used for converting the electric signals into image information through an image processing algorithm;
and the transmission and scanning module is used for controlling the relative movement of the inspected container and the X-ray source so as to obtain perspective images of the inspected container at different visual angles.
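The energy loss described above follows the Beer-Lambert law: the transmitted intensity is I = I0 * exp(-mu * t), where mu is the material's linear attenuation coefficient and t its thickness. A sketch with illustrative coefficients (the numeric values are assumptions for demonstration, not measured data):

```python
import math

def transmitted_fraction(mu_per_cm, thickness_cm):
    """Fraction of X-ray intensity surviving a material layer (Beer-Lambert law)."""
    return math.exp(-mu_per_cm * thickness_cm)

# Illustrative linear attenuation coefficients at some fixed X-ray energy.
materials = {"air": 0.0, "rubber": 0.18, "steel": 3.0}
profile = {m: transmitted_fraction(mu, 2.0) for m, mu in materials.items()}
```

Denser cargo transmits less intensity, which is why the detector output voltage varies with the goods in the container and yields the contrast visible in the X-ray image.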
Preferably, the detector employs a photon counting X-ray detector.
It will be readily appreciated by those skilled in the art that the foregoing description is merely a preferred embodiment of the invention and is not intended to limit the invention, but any modifications, equivalents, improvements or alternatives falling within the spirit and principles of the invention are intended to be included within the scope of the invention.

Claims (10)

1. The cargo image recognition method based on the convolutional neural network is characterized by comprising the following steps of:
acquiring an X-ray image of a container to be inspected;
performing image segmentation on each X-ray image to be processed by adopting a trained 2D convolutional neural network model based on a U-Net architecture to obtain a plurality of segmented images;
and performing similar image detection based on BOW and K-means, matching each segmented image with the standard pictures in a cargo database, and taking the picture with the highest similarity as the recognition result.
2. The method of claim 1, wherein the BOW and K-means based similar image detection is specifically as follows:
firstly, generating characteristic points and descriptors of each image in an image database by using a SIFT algorithm;
clustering the feature points in the image library by using a K-means algorithm, wherein the number of clustering centers is K; combining the clustering centers to form a dictionary, and calculating the TF-IDF weight of each visual word according to the IDF principle to represent how important the visual word is for distinguishing images;
counting the number of times that each word in the dictionary appears in the feature set of each image in the image database, and representing each image as a histogram;
after the histogram vector of each image is obtained, constructing an inverted list of features to the images, and rapidly indexing the related candidate images through the inverted list;
and for the image to be detected, computing its SIFT features, converting them into a frequency histogram according to the TF-IDF weights, and judging the similarity between histogram vectors over the indexed candidate images.
3. The method of claim 1, wherein the U-Net architecture based 2D convolutional neural network model comprises:
an encoder consisting of a plurality of convolution layers and a pooling layer for extracting features of an input image;
a decoder composed of a plurality of deconvolution layers and convolution layers for upsampling the feature map extracted in the encoder and restoring to the size of the input image, and outputting a predicted segmentation result through the convolution layers;
a jump connection is added between the encoder and the decoder for fusing the shallow features and the deep features.
4. The method of claim 3, wherein the encoder is VGG or ResNet.
5. The method of claim 3, wherein the output of each convolution and transpose convolution layer in the U-Net architecture based 2D convolutional neural network model uses batch normalization for accelerating convergence and yielding better results.
6. The method of any one of claims 1 to 5, further comprising:
counting the quantity and the types of the goods to be detected, performing data matching with a declared goods list, and reminding or warning if the goods are inconsistent.
7. A cargo image recognition system based on convolutional neural network, comprising: a processor and a memory;
the memory is used for storing computer execution instructions;
the processor for executing the computer-executable instructions such that the method of any one of claims 1 to 6 is performed.
8. The system of claim 7, wherein a Spring Boot application is deployed on the server with Nginx as a reverse proxy; the Spring Boot application listens only on the local loopback address of the virtual machine, so that it can be accessed only through the Nginx proxy, and picture files are able to be uploaded to a server-specified folder.
9. The system of claim 7 or 8, wherein the system further comprises:
the accelerator is used for generating X-rays, and the X-rays have different degrees of energy loss after passing through the container to be detected;
the detector is used for receiving the X-rays, converting the X-rays into electric signals with different voltages according to the degree of energy loss and sending the electric signals to the image acquisition module;
the image acquisition module is used for converting the electric signals into image information through an image processing algorithm;
and the transmission and scanning module is used for controlling the relative movement of the inspected container and the X-ray source so as to obtain perspective images of the inspected container at different visual angles.
10. The system of claim 9, wherein the detector employs a photon counting X-ray detector.
CN202310551013.7A 2023-05-12 2023-05-12 Cargo image recognition method and system based on convolutional neural network Pending CN116664883A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310551013.7A CN116664883A (en) 2023-05-12 2023-05-12 Cargo image recognition method and system based on convolutional neural network


Publications (1)

Publication Number Publication Date
CN116664883A (en) 2023-08-29

Family

ID=87719852

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310551013.7A Pending CN116664883A (en) 2023-05-12 2023-05-12 Cargo image recognition method and system based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN116664883A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104751163A (en) * 2013-12-27 2015-07-01 同方威视技术股份有限公司 Fluoroscopy examination system and method for carrying out automatic classification recognition on goods
CN106156118A (en) * 2015-04-07 2016-11-23 阿里巴巴集团控股有限公司 Picture analogies degree computational methods based on computer system and system thereof
CN109948565A (en) * 2019-03-26 2019-06-28 浙江啄云智能科技有限公司 A kind of not unpacking detection method of the contraband for postal industry
CN113963161A (en) * 2021-11-11 2022-01-21 杭州电子科技大学 System and method for segmenting and identifying X-ray image based on ResNet model feature embedding UNet
CN115512283A (en) * 2021-06-21 2022-12-23 顺丰科技有限公司 Parcel image processing method and device, computer equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination