AU2019101133A4 - Fast vehicle detection using augmented dataset based on RetinaNet - Google Patents

Fast vehicle detection using augmented dataset based on RetinaNet

Info

Publication number
AU2019101133A4
Authority
AU
Australia
Prior art keywords
retinanet
detection
layer
network
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2019101133A
Inventor
Yaxin Bo
Ziwei Liu
Buwei WU
Tianjian Yang
Fanghong Zhu
Huayang ZhuGe
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bo Yaxin Miss
Zhu Fanghong Miss
Original Assignee
Bo Yaxin Miss
Zhu Fanghong Miss
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bo Yaxin Miss, Zhu Fanghong Miss filed Critical Bo Yaxin Miss
Priority to AU2019101133A priority Critical patent/AU2019101133A4/en
Application granted granted Critical
Publication of AU2019101133A4 publication Critical patent/AU2019101133A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/30Noise filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/32Normalisation of the pattern dimensions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/08Detecting or categorising vehicles

Abstract

Abstract This invention lies in the field of computer vision and artificial intelligence. It is a video-based detection system for various kinds of objects, especially cars, built on RetinaNet. The invention consists of the following steps. Initially, we employ datasets to train convolutional networks. Then, we feed the training data into the convolutional neural network in batches and continually adjust the parameters of the network, such as the base learning rate, weights, padding, stride, and input, using back-propagation to drive the model toward optimal performance. Finally, the test data are put into the trained neural network and different kinds of objects are recognized accurately. (Figure 3 shows the overall procedure: picture acquisition from the VOC dataset, initialization of the neural network, model training based on ResNet and MobileNet, parameter adjustment until the requirement is reached, and prediction on the testing set.)

Description

TITLE
Fast vehicle detection using augmented dataset based on RetinaNet
FIELD OF THE INVENTION
This invention is in the field of computer vision and artificial intelligence and provides vehicle detection using an augmented dataset based on RetinaNet, powered by deep learning.
BACKGROUND
With the development of intelligent monitoring and transportation systems, vehicle target detection has become an important component of intelligent transportation, and it is widely used in fields such as vehicle identity information fusion, vehicle inspection, detection of illegal vehicle behavior, and vehicle tracking. Before the era of deep learning, computer vision research usually relied on traditional target detection models to accomplish this task. For example, the traditional HOG algorithm can capture local shape information well and has good invariance to geometric and photometric changes. However, it has difficulty dealing with occlusion, large variations in posture, and changes in object orientation.
With the development of deep learning theory and applications, it was found that the Convolutional Neural Network (CNN) can autonomously extract image features and greatly reduce losses due to angle, illumination, deformation, and other factors, making it more adaptable to complex scenes. R-CNN selects many candidate boxes by the selective search method, then performs a convolutional network operation on each candidate box separately, extracts features, and finally feeds the convolved features into an SVM classifier and a bounding-box regressor. However, because it trains different components in different stages, the testing process is cumbersome, slow, and requires a lot of memory.
Therefore, this invention chooses the RetinaNet model as the basic structure, which can detect the types of objects photographed in most environments. In the training phase, we use the Pascal VOC 2007 and COCO datasets, carry out data augmentation, and adjust parameters reasonably to accomplish target detection in complex scenes.
The RetinaNet model is designed to take advantage of an efficient feature pyramid network and uses anchor boxes to overcome the class imbalance problem of the original one-stage detectors. At the same time, the cross-entropy loss used in the original training task is replaced by the focal loss, so its detection accuracy becomes higher and target objects can be quickly identified.
SUMMARY
To address the situation that current technology cannot deliver both precision and speed when detecting objects (the extreme foreground-background class imbalance encountered when training dense detectors), and to deal with problems in robot perception and avoid the errors that arise as a network's ability to describe pictures strengthens with ever more convolutional layers, we propose an invention that provides an object detection method based on deep learning. We conduct experiments with two types of networks, ResNet and MobileNet, giving full play to the superiority of both; the networks extract the image's local semantic features to produce a precise description of image features. Considering the advantages of both models, this invention significantly improves the training process and overcomes obstacles such as overfitting. Not only do we apply the new networks, but we also compare the two models and analyze their performance.
The framework of our deep learning object detection method for vehicles comprises: collecting images of people and automobiles, training convolutional networks, optimizing parameters, and testing object detection.
To build the image database for our detection, we collect image data from the Pascal VOC dataset (the dataset consists of a series of images; each image has a corresponding annotation file that provides the bounding box and class label of each object). We also delete useless and unrelated image data to keep the quantity of image data balanced.
Our convolutional neural network is a sequence of layers. Figure 1 displays the architecture of our network, which has 5 convolutional layers followed by one fully connected layer.
The input layer performs preprocessing operations on the data, such as subtracting the average value, normalizing, and reducing dimensions.
The convolutional layer perceives each local feature of the image by computing the output of neurons that are connected to local regions in the input.
The activation layer, a ReLU layer, applies a nonlinear mapping to the output of the convolutional layers.
The pooling layer performs a down-sampling operation along the spatial dimensions (width, height). Max pooling compresses the data and the number of parameters. Meanwhile, max pooling also controls overfitting and efficiently increases the fault tolerance of the model.
The fully connected layer connects every node in it with all the nodes in the previous layer to gather all the features extracted by the previous layers. As a result, the activation can be computed with a matrix multiplication.
The softmax layer is used in the process of multi-classification, mapping the outputs of multiple neurons to the interval (0, 1); these values can be understood as the class probabilities used to conduct multi-classification.
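For illustration, a minimal NumPy sketch of this mapping (the max-subtraction is a standard numerical-stability trick, not part of the original description):

```python
import numpy as np

def softmax(y):
    """Map raw scores to probabilities in (0, 1) that sum to 1."""
    e = np.exp(y - np.max(y))   # subtract the max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # [0.659 0.242 0.099]
```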
To optimize the parameters, we first feed the dataset into the network in batches for training to reduce the loss function. Then we optimize the model by introducing three nets: the Feature Pyramid Net, the Classification Subnet, and the Box Regression Subnet. With this addition, we strengthen the use of the features generated by ResNet to obtain a more expressive feature map that contains multi-scale target region information. Additionally, we adopt gradient descent as the optimization algorithm and the focal loss to counter class imbalance.
Lastly, the trained classifier and locator are capable of identifying images; the results are presented through locating and classifying.
DESCRIPTION OF DRAWINGS
Figure 1 is the Feature Pyramid Network.
Figure 2 shows Layer1 to Layer5 of ResNet 50.
Figure 3 shows the procedure of the project.
Figure 4 - Figure 7 show the results of training.
DESCRIPTION OF PREFERRED EMBODIMENT
Network design
Table 1 shows the structure of our convolutional neural network. Our network architecture is inspired by the RetinaNet model. The network is composed of a backbone network (a feature pyramid network (FPN) based on ResNet) and sub-networks (a Classification Subnet and a Box Regression Subnet).
There are some parameters related to the calculation of the convolution layer:
1) Input: the input image that needs to be convolved
2) Filter: the convolution kernel in a CNN; in this invention we mainly use 3×3 and 1×1 convolution kernels.
3) Stride: step size of window sliding during convolution
4) Zero-padding: zero-padding has two modes. “Valid” means no padding. “Same” means the output image is the same size as the input image (for stride 1). In this program we use the “Same” mode, as sketched below.
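As a sketch of how kernel size, stride, and padding determine the output size (the helper below is illustrative, following the usual convolution arithmetic rather than any code in the original):

```python
import math

def conv_output_size(n, k, s, padding):
    """Spatial output size for input n, kernel k, stride s."""
    if padding == "same":     # pad so the output is ceil(n / s)
        return math.ceil(n / s)
    if padding == "valid":    # no padding at all
        return math.floor((n - k) / s) + 1
    raise ValueError(padding)

print(conv_output_size(224, 3, 1, "same"))   # 224: size preserved
print(conv_output_size(224, 3, 1, "valid"))  # 222: shrinks by k - 1
```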
Table 1: specific structure of ResNet at different depths

Layer name | Output size | 18-layer | 34-layer | 50-layer | 101-layer | 152-layer
Conv1 | 112×112 | 7×7, 64, stride 2 (all depths)
Conv2_x | 56×56 | 3×3 max pool, stride 2 (all depths), then: [3×3, 64; 3×3, 64] ×2 | [3×3, 64; 3×3, 64] ×3 | [1×1, 64; 3×3, 64; 1×1, 256] ×3 | ×3 | ×3
Conv3_x | 28×28 | [3×3, 128; 3×3, 128] ×2 | [3×3, 128; 3×3, 128] ×4 | [1×1, 128; 3×3, 128; 1×1, 512] ×4 | ×4 | ×8
Conv4_x | 14×14 | [3×3, 256; 3×3, 256] ×2 | [3×3, 256; 3×3, 256] ×6 | [1×1, 256; 3×3, 256; 1×1, 1024] ×6 | ×23 | ×36
Conv5_x | 7×7 | [3×3, 512; 3×3, 512] ×2 | [3×3, 512; 3×3, 512] ×3 | [1×1, 512; 3×3, 512; 1×1, 2048] ×3 | ×3 | ×3
(output) | 1×1 | average pool, 1000-d fully connected, softmax (all depths)

(In the 101- and 152-layer columns, “×N” repeats the same bottleneck block shown in the 50-layer column.)
1. Backbone Net
In this invention, we use ResNet, one of the most widely used CNN feature-extraction networks, as the backbone net of the model. Based on the bottleneck block, ResNet 50, 101, and 152 are constructed in the same way. A mainstream neural network is composed of an input layer, hidden layers, and an output layer. As Table 1 suggests, each layer of the network (like Conv2_x, Conv3_x, etc.) is composed of several blocks, and each block is composed of 2 or 3 sub-layers. “×2”, “×3”, etc. refer to the number of blocks a layer contains, and “[3×3, 64]” means the sub-layer has 64 convolutional kernels of size 3×3.
Here we take ResNet 50 as an example: it is constructed of five layers, fifty sub-layers in all. Due to its size, we only illustrate the input layer, the first hidden layer, and the output layer; the structure of the other hidden layers is similar to the first.
(1) Convolutional Layer
Firstly, the input layer:
The input data of the input layer is the original [224×224×1] image, which is convolved with [7×7×1] convolution kernels; each convolution of the original image generates a new pixel. The convolution kernel moves in both the x-axis and y-axis directions of the original image with a step size of 2 pixels, so the convolution of the original image generates [112×112] pixel layers. There are 64 convolution kernels, so the depth is 64. We choose ReLU as the nonlinear activation function in the convolutional layers. The ReLU function is

ReLU(x) = max(0, x) = { x, x ≥ 0; 0, x < 0 }    (2)

ReLU can alleviate the vanishing gradient problem and reduce training time, which greatly speeds up the convergence of the model. The convolved pixel layers are processed by the ReLU unit, and the size of the data is still [64×112×112].
Then we use a max pooling layer with [3×3] filters applied with a stride of 2, which downsamples every depth slice in the input by 2 along both width and height, discarding about 75% of the activations; each operation takes the max over the 9 numbers in a 3×3 region. The depth dimension remains unchanged. Thus the input volume of size [64×112×112] is pooled into an output volume of size [64×56×56].
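The stem arithmetic above can be checked with the standard output-size formula; the padding values in this sketch follow the common ResNet implementation, which the text does not spell out:

```python
def out_size(n, k, s, p):
    """floor((n + 2p - k) / s) + 1 for one spatial dimension."""
    return (n + 2 * p - k) // s + 1

h = out_size(224, k=7, s=2, p=3)  # 7x7 conv, stride 2, padding 3 -> 112
h = out_size(h, k=3, s=2, p=1)    # 3x3 max pool, stride 2, padding 1 -> 56
print(h)  # 56
```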
Secondly, the hidden layers:
The first hidden layer, Conv2_x, has 3 blocks, 9 sub-layers in all, each block being

[1×1, 64; 3×3, 64; 1×1, 256] ×3

The input pixels of 64×56×56 are convolved by 64 convolution kernels of 1×1 and processed by the ReLU function, then by 64 kernels of 3×3 and 256 kernels of 1×1, each followed by ReLU. This process is repeated three times. After it, the network produces 256×56×56 output pixels.
Since the 1×1 convolution kernel has a size of only 1×1, the relationship between a pixel and its surrounding pixels does not need to be considered. It is mainly used to adjust the number of channels: pixel values on different channels are combined linearly and then passed through a nonlinear operation, which realizes the ascending and descending of the channel dimension. These pixel layers are then processed by the ReLU unit, and the size is still [256×56×56].
The other layers, Conv3_x, Conv4_x, and Conv5_x, have a structure similar to Conv2_x, but the first block of each layer has a 3×3 convolution kernel with stride 2, which decreases the number of pixels by 75%; this differs from the other blocks.
The difference between the residual network and an ordinary network is the introduction of the jump (skip) connection, which enables the information of the previous residual block to flow into the next residual block without obstruction, improves the information flow, and avoids the vanishing gradient and degradation problems caused by overly deep networks. Instead of directly fitting the expected feature mapping with multiple stacked layers, we explicitly use them to fit a residual mapping.
(Figure: the residual block.)
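A minimal PyTorch sketch of the bottleneck residual block described above (the patent gives no code, so details such as batch normalization placement follow the standard ResNet design and are assumptions here):

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """1x1 -> 3x3 -> 1x1 bottleneck with a jump (skip) connection."""
    def __init__(self, in_ch=64, mid_ch=64, out_ch=256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),              # 1x1 reduces channels
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1, bias=False),  # 3x3 convolution
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, 1, bias=False),             # 1x1 restores channels
            nn.BatchNorm2d(out_ch))
        # project the shortcut when channel counts differ
        self.skip = (nn.Conv2d(in_ch, out_ch, 1, bias=False)
                     if in_ch != out_ch else nn.Identity())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # the skip path lets information flow past the stacked layers, so the
        # body only needs to fit the residual F(x) = H(x) - x
        return self.relu(self.body(x) + self.skip(x))

x = torch.randn(1, 64, 56, 56)
print(Bottleneck()(x).shape)  # torch.Size([1, 256, 56, 56])
```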
(2) Output Layer
After finishing convolution, the model uses the fully connected layer to reshape the image matrix. The output data of the last hidden layer is fed into the fully connected layer, and the data is reshaped from [2048×7×7] to [1000×1×1].
The fully connected layer has 2048 nodes, each of which has full connections to all activations in the input.
The final output is the high-level feature of the input image, which corresponds to the probability of each label for the input image through the Softmax function:

Softmax(y_i) = exp(y_i) / Σ_j exp(y_j)    (3)

where y_i means the value of the i-th element. The Softmax classification model is used as the last layer, after the fully connected layer, and outputs the probability of each category of objects, valued between 0 and 1.
The fully connected layer is equivalent to an inner product between neural nodes and mainly involves forward propagation and back propagation. Forward propagation corresponds to formula (4), which calculates the output value of a neural node; back propagation corresponds to formula (5), which calculates the error term of each neural node:

y = wᵀx + b    (4)

∂L/∂x = w (∂L/∂y),  ∂L/∂w = x (∂L/∂y)ᵀ    (5)

Here y ∈ R^{m×1} represents the output of the neural nodes, x ∈ R^{n×1} represents the input, w ∈ R^{n×m} represents the weights, b represents the bias, and L denotes the loss at this layer of neural nodes.
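A NumPy sketch of formulas (4) and (5) for a single fully connected layer, with the shapes defined above (the random values are purely illustrative):

```python
import numpy as np

n, m = 4, 3
x = np.random.randn(n, 1)       # input,  x in R^{n x 1}
w = np.random.randn(n, m)       # weight, w in R^{n x m}
b = np.random.randn(m, 1)       # bias

y = w.T @ x + b                 # forward pass, formula (4)

dL_dy = np.random.randn(m, 1)   # error term arriving from the next layer
dL_dx = w @ dL_dy               # error propagated to the input, formula (5)
dL_dw = x @ dL_dy.T             # gradient of the weights,       formula (5)
print(y.shape, dL_dx.shape, dL_dw.shape)  # (3, 1) (4, 1) (4, 3)
```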
2. Feature Pyramid Net (FPN)
Figure 1 shows the structure of the FPN. The FPN naturally utilizes the hierarchical features of a CNN to generate a pyramid with strong semantic information at all scales. Low-level features carry little semantic information but accurate target locations, while high-level features carry rich semantic information but rough target locations. The FPN therefore integrates feature maps of different layers through a bottom-up pathway, a top-down pathway, and lateral connections, making it easy to identify small targets.
In this way, starting from a single-scale input image, a feature pyramid with strong semantic information at all scales is constructed rapidly and without significant cost.
1. Bottom-up pathway.
The feed-forward calculation of the CNN is the bottom-up pathway. After each convolution the feature map usually becomes smaller and smaller; the feature layers whose outputs share the same size are said to belong to the same network stage.
2. Top-down pathway and lateral connections.
To combine with the high-resolution low-level features, the more abstract, semantically stronger high-level feature map is upsampled and then merged, through a lateral connection, with the feature map of the previous stage, so that the high-level features are enhanced. The two feature maps joined by a lateral connection have the same spatial dimensions, which allows the underlying location details to be exploited.
The FPN (Feature Pyramid Network) algorithm uses the high resolution of low-level features and the rich semantic information of high-level features at the same time, and achieves its prediction effect by fusing the features of these different layers. Moreover, prediction is made separately on each feature layer after fusion, which differs from the conventional feature-fusion approach.
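A condensed PyTorch sketch of the top-down pathway with lateral connections (the channel counts and nearest-neighbour upsampling are assumptions in line with the FPN paper, not details stated here):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiniFPN(nn.Module):
    """Fuse backbone maps C3-C5 into pyramid maps P3-P5, 256 channels each."""
    def __init__(self, in_channels=(512, 1024, 2048), out_ch=256):
        super().__init__()
        # 1x1 lateral convs bring every level to a common channel count
        self.lateral = nn.ModuleList([nn.Conv2d(c, out_ch, 1) for c in in_channels])
        # 3x3 convs smooth each merged map
        self.smooth = nn.ModuleList([nn.Conv2d(out_ch, out_ch, 3, padding=1)
                                     for _ in in_channels])

    def forward(self, feats):  # feats = [C3, C4, C5], fine to coarse
        laterals = [lat(f) for lat, f in zip(self.lateral, feats)]
        # top-down pathway: upsample the coarser map, add it to the level below
        for i in range(len(laterals) - 1, 0, -1):
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:], mode="nearest")
        return [sm(l) for sm, l in zip(self.smooth, laterals)]

feats = [torch.randn(1, 512, 28, 28), torch.randn(1, 1024, 14, 14),
         torch.randn(1, 2048, 7, 7)]
for p in MiniFPN()(feats):
    print(p.shape)  # 256-channel maps at 28x28, 14x14, and 7x7
```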
3. Subnet
The main network part of RetinaNet uses the FPN structure, with two sub-networks for different tasks: one is the class subnet and the other is the box subnet.
The parameters of the classification sub-network and the regression sub-network are separate, but their structures are similar. Both are small FCN networks that take the pyramid levels as input and then apply four 3×3 convolutional layers whose filter count equals the number of channels in the pyramid layer (256 in the paper), with a ReLU activation after each convolutional layer. This is followed by a 3×3 convolutional layer with K×A filters (K is the number of target classes, A is the number of anchors per location), and the activation function is sigmoid.
The reason for using binary classifications is that the implementation of the loss layer combines the sigmoid operation for computing “p” with the loss computation, resulting in greater numerical stability.
RetinaNet uses a special initialization for the final layer of the classification subnet, so that the training output starts close to π = 0.01, which is closer to the real situation of an overwhelming background. The authors demonstrate that this initialization strategy is important here and in later experiments. Such an initialization strategy relies on multiple sigmoid classifiers: if softmax were used, it would be impossible to make the output of all categories equal π = 0.01 for an anchor.
RetinaNet separates the classification subnet from the bounding-box regression subnet, and the initialization method described above is applied largely to the classification part.
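A sketch of the classification subnet head as described: four 3×3, 256-channel convolutions with ReLU, then a 3×3 convolution with K·A filters under a sigmoid. The bias initialization to the prior π = 0.01 follows the RetinaNet paper; the class and anchor counts below are illustrative defaults:

```python
import math
import torch
import torch.nn as nn

def class_subnet(num_classes=20, num_anchors=9, ch=256, prior=0.01):
    """Classification head applied to every pyramid level (weights shared)."""
    layers = []
    for _ in range(4):                    # four 3x3, 256-filter convs + ReLU
        layers += [nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True)]
    out = nn.Conv2d(ch, num_classes * num_anchors, 3, padding=1)
    # bias chosen so every sigmoid output starts near pi = 0.01,
    # matching the overwhelming-background prior described above
    nn.init.constant_(out.bias, -math.log((1 - prior) / prior))
    return nn.Sequential(*layers, out)

head = class_subnet()
p3 = torch.randn(1, 256, 28, 28)
scores = torch.sigmoid(head(p3))          # per-anchor, per-class scores
print(scores.shape)  # torch.Size([1, 180, 28, 28]), 180 = K * A
```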
Procedure
Step 1: Data Acquisition
In the data collection process of this project, we use the existing VOC dataset, related pictures collected from the Internet, and manual photographs. After collecting the data, we filtered the collected images, eliminated noise and images that did not match the types in the project, and resized the remaining images to [224×224] pixels. The aim of this project is to recognize animals (birds, cats, cows, dogs, horses, sheep), vehicles (aircraft, bicycles, boats, buses, cars, motorcycles, trains), and indoor items (bottles, chairs, dining tables, potted plants, sofas, TVs). Each kind of picture needs almost 5000 images. When we cannot collect sufficient data, we rotate images to generate new data and complete the acquisition, as sketched below. In addition, each type of picture requires 4000 images to maintain the balance of the image data.
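A sketch of the rotation-based augmentation mentioned above (PIL-based; the directory layout and angle set are assumptions for illustration):

```python
from pathlib import Path
from PIL import Image

def augment_by_rotation(src_dir, dst_dir, angles=(90, 180, 270)):
    """Save rotated copies of every image to enlarge a scarce class."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    for path in Path(src_dir).glob("*.jpg"):
        img = Image.open(path)
        for angle in angles:
            rotated = img.rotate(angle, expand=True)
            rotated.save(dst / f"{path.stem}_rot{angle}.jpg")

# augment_by_rotation("data/cars", "data/cars_augmented")
```

Note that any bounding-box annotations would need to be rotated consistently with the images.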
Step 2: Data Preprocessing
(1) Convert the image data format from ‘jpg’ to ‘mat’.
(2) Transform the form of the data: to facilitate later processing, the corresponding data of the original pictures must be obtained in a way that preserves the sequence of the pictures.
(3) Dimension reduction: the three dimensions are merged into one dimension to achieve normalization.
(4) Divide the image data: we split the images, using four-fifths of them for training; a sketch follows below.
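A sketch of steps (1)-(4), assuming the four-fifths portion is used for training (the exact split procedure is not spelled out above):

```python
import numpy as np
from PIL import Image

def preprocess(paths, size=(224, 224)):
    """Resize, normalize, flatten, then split 4/5 train and 1/5 test."""
    data = []
    for p in paths:
        img = Image.open(p).resize(size)
        arr = np.asarray(img, dtype=np.float32) / 255.0  # normalize
        arr -= arr.mean()              # remove the average value
        data.append(arr.reshape(-1))   # merge all dimensions into one
    data = np.stack(data)
    split = int(len(data) * 4 / 5)     # four-fifths for training
    return data[:split], data[split:]
```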
Step 3: Training and Optimization
(1) Gradient Descent
In deep learning, the entire neural network can be viewed as a complex nonlinear function that serves as a fitted model of the training samples. Using gradient descent is the process of finding the minimum of the objective function. Gradient descent comes in three forms, depending on how much data each update uses: batch gradient descent (BGD), mini-batch gradient descent (MBGD), and stochastic gradient descent (SGD). This project uses the SGD algorithm for regression. The algorithm also supports linear classifiers under convex loss functions, such as support vector machines and logistic regression. The algorithm logic is as follows:
For a single sample (x, y) with prediction h_θ(x), the objective function is

J(θ) = (1/2) (h_θ(x) − y)²

Taking the partial derivative of the objective function with respect to θ_j gives

∂J/∂θ_j = (h_θ(x) − y) x_j

and the parameter update is

θ_j ← θ_j − η (h_θ(x) − y) x_j

where η is the learning rate.
This optimization algorithm works by making a prediction each time it sees a training instance and repeating the iteration several times. This process can be used to find the coefficients of the model that produce the smallest error on the training data. Since each iteration optimizes the loss on a single randomly chosen training sample rather than on all training data, the update speed of each round of parameters is greatly accelerated. The algorithm has been successfully applied to large-scale and sparse machine learning problems, which are often encountered in text classification and natural language processing.
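A NumPy sketch of the per-sample update loop just described, using a linear model with squared error for illustration:

```python
import numpy as np

def sgd(X, y, lr=0.01, epochs=50):
    """Stochastic gradient descent: one update per training sample."""
    rng = np.random.default_rng(0)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):   # random sample order each epoch
            err = X[i] @ theta - y[i]       # error on this single sample
            theta -= lr * err * X[i]        # step against the gradient
    return theta

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # bias column + feature
print(sgd(X, np.array([2.0, 3.0, 4.0])))  # converges toward [1. 1.]
```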
(2) Focal Loss function
The focal loss is modified from the standard cross-entropy loss. It makes the model focus more on hard-to-classify samples during training by down-weighting easily classified samples.
Object detection algorithms can be divided into two main categories: two-stage detectors and one-stage detectors. The former refers to detection algorithms that require a region proposal stage, like Faster RCNN and RFCN; such algorithms achieve high accuracy but run at a slower speed. The latter refers to detection algorithms similar to YOLO and SSD that do not require region proposals and regress directly; such algorithms are fast, but their accuracy is not as good as the former. The accuracy of one-stage detectors is lower than that of two-stage detectors mainly because of the imbalance of sample classes. Therefore, for the class imbalance problem, the research scientist Kaiming He proposed a new loss function: focal loss.
FL(p_t) = −α_t (1 − p_t)^γ log(p_t)

(In the experiments, the workable range of α is very wide; generally, when γ is increased, α needs to be reduced a little.)
Focal loss has two important properties:
1. When a sample is misclassified and p_t is small, the modulation factor (1 − p_t)^γ tends to 1, meaning the loss barely changes compared with the original cross entropy. When p_t tends to 1 (the classification is correct and the sample is easy to classify), the modulation factor tends to 0, so the sample's contribution to the total loss is small.
2. When γ = 0, focal loss reduces to the traditional cross-entropy loss. When γ is increased, the effect of the modulation factor also increases.
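A NumPy sketch of the focal loss for binary classification (α = 0.25 and γ = 2 are the defaults reported for RetinaNet; they are not specified above):

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-7):
    """FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), binary case."""
    p_t = np.where(y == 1, p, 1 - p)         # probability of the true class
    a_t = np.where(y == 1, alpha, 1 - alpha)
    return -a_t * (1 - p_t) ** gamma * np.log(p_t + eps)

# an easy, well-classified example is strongly down-weighted ...
print(focal_loss(np.array([0.95]), np.array([1])))  # ~3e-05
# ... while a hard, misclassified one keeps most of its loss
print(focal_loss(np.array([0.05]), np.array([1])))  # ~0.68
```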
The results of training can be seen in Figure 4 - Figure 7.
Step 4: Testing
We adjust the parameters of the network constantly in order to reach optimal performance. Then we put the test set into the network and obtain the mAP as the result.
Besides, some parameters are fixed; the test batch size is 4391. The table below shows the results. We reach an optimal mAP of 0.654.
Threshold \ Epoch | 01 | 05 | 09 | 13 | 17
0.4 | 0.5282 | 0.5722 | 0.5765 | 0.5861 | 0.5871
0.5 | 0.4776 | 0.5396 | 0.5510 | 0.5668 | 0.5684
0.6 | 0.4128 | 0.4979 | 0.5208 | 0.5420 | 0.5468
CLAIMS

Claims (3)

1. A fast vehicle detection method using an augmented dataset based on RetinaNet, wherein, in the training stage, augmented datasets are used for the deep learning: we use a large number of sample patterns and carry out reasonable adjustments of parameters; consequently, the results can be highly accurate.
2. The fast vehicle detection method using an augmented dataset based on RetinaNet of claim 1, wherein a class of efficient models called MobileNets is introduced, which applies depthwise separable convolutions that not only reduce the computational complexity of the model but also greatly reduce its size.
3. The fast vehicle detection method using an augmented dataset based on RetinaNet of claim 1, wherein the state-of-the-art FPN-based one-stage detector RetinaNet, with the focal loss function, is implemented; the model ensures detection speed and improves detection accuracy in the case of class imbalance, which is also conducive to small-target detection; the model can be used in areas such as vehicle-mounted detection and spam detection.
AU2019101133A 2019-09-30 2019-09-30 Fast vehicle detection using augmented dataset based on RetinaNet Ceased AU2019101133A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2019101133A AU2019101133A4 (en) 2019-09-30 2019-09-30 Fast vehicle detection using augmented dataset based on RetinaNet

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2019101133A AU2019101133A4 (en) 2019-09-30 2019-09-30 Fast vehicle detection using augmented dataset based on RetinaNet

Publications (1)

Publication Number Publication Date
AU2019101133A4 (en) 2019-10-31

Family

ID=68342021

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2019101133A Ceased AU2019101133A4 (en) 2019-09-30 2019-09-30 Fast vehicle detection using augmented dataset based on RetinaNet

Country Status (1)

Country Link
AU (1) AU2019101133A4 (en)


Cited By (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111079543A (en) * 2019-11-20 2020-04-28 浙江工业大学 Efficient vehicle color identification method based on deep learning
CN111079543B (en) * 2019-11-20 2022-02-15 浙江工业大学 Efficient vehicle color identification method based on deep learning
CN110986949A (en) * 2019-12-04 2020-04-10 日照职业技术学院 Path identification method based on artificial intelligence platform
CN110988839B (en) * 2019-12-25 2023-10-10 中南大学 Quick identification method for wall health condition based on one-dimensional convolutional neural network
CN110988839A (en) * 2019-12-25 2020-04-10 中南大学 Method for quickly identifying health condition of wall based on one-dimensional convolutional neural network
CN110988872A (en) * 2019-12-25 2020-04-10 中南大学 Method for rapidly identifying health state of wall body detected by unmanned aerial vehicle-mounted through-wall radar
CN110988872B (en) * 2019-12-25 2023-10-03 中南大学 Rapid identification method for detecting wall health state by unmanned aerial vehicle through-wall radar
CN111242122A (en) * 2020-01-07 2020-06-05 浙江大学 Lightweight deep neural network rotating target detection method and system
CN111242122B (en) * 2020-01-07 2023-09-08 浙江大学 Lightweight deep neural network rotating target detection method and system
CN111415338A (en) * 2020-03-16 2020-07-14 城云科技(中国)有限公司 Method and system for constructing target detection model
CN113536829A (en) * 2020-04-13 2021-10-22 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Goods static identification method of unmanned retail container
CN113536824A (en) * 2020-04-13 2021-10-22 南京行者易智能交通科技有限公司 Improvement method of passenger detection model based on YOLOv3 and model training method
CN113536824B (en) * 2020-04-13 2024-01-12 南京行者易智能交通科技有限公司 Improved method of passenger detection model based on YOLOv3 and model training method
CN111738056B (en) * 2020-04-27 2023-11-03 浙江万里学院 Heavy truck blind area target detection method based on improved YOLO v3
CN111738056A (en) * 2020-04-27 2020-10-02 浙江万里学院 Heavy truck blind area target detection method based on improved YOLO v3
CN111612722B (en) * 2020-05-26 2023-04-18 星际(重庆)智能装备技术研究院有限公司 Low-illumination image processing method based on simplified Unet full-convolution neural network
CN111612722A (en) * 2020-05-26 2020-09-01 星际(重庆)智能装备技术研究院有限公司 Low-illumination image processing method based on simplified Unet full-convolution neural network
CN111814604A (en) * 2020-06-23 2020-10-23 浙江理工大学 Pedestrian tracking method based on twin neural network
CN111814863A (en) * 2020-07-03 2020-10-23 南京信息工程大学 Detection method for light-weight vehicles and pedestrians
CN112528862A (en) * 2020-12-10 2021-03-19 西安电子科技大学 Remote sensing image target detection method based on improved cross entropy loss function
CN112528862B (en) * 2020-12-10 2023-02-10 西安电子科技大学 Remote sensing image target detection method based on improved cross entropy loss function
CN113033604A (en) * 2021-02-03 2021-06-25 淮阴工学院 Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN113033604B (en) * 2021-02-03 2022-11-15 淮阴工学院 Vehicle detection method, system and storage medium based on SF-YOLOv4 network model
CN113112489B (en) * 2021-04-22 2022-11-15 池州学院 Insulator string-dropping fault detection method based on cascade detection model
CN113112489A (en) * 2021-04-22 2021-07-13 池州学院 Insulator string-dropping fault detection method based on cascade detection model
CN113192646B (en) * 2021-04-25 2024-03-22 北京易华录信息技术股份有限公司 Target detection model construction method and device for monitoring distance between different targets
CN113192646A (en) * 2021-04-25 2021-07-30 北京易华录信息技术股份有限公司 Target detection model construction method and different target distance monitoring method and device
WO2023280082A1 (en) * 2021-07-07 2023-01-12 (美国)动力艾克斯尔公司 Handle inside-out visual six-degree-of-freedom positioning method and system
CN113421252B (en) * 2021-07-07 2024-04-19 南京思飞捷软件科技有限公司 Improved convolutional neural network-based vehicle body welding defect detection method
CN113421252A (en) * 2021-07-07 2021-09-21 南京思飞捷软件科技有限公司 Actual detection method for vehicle body welding defects based on improved convolutional neural network
CN113537375A (en) * 2021-07-26 2021-10-22 深圳大学 Diabetic retinopathy grading method based on multi-scale cascade
CN113553977B (en) * 2021-07-30 2023-02-10 国电汉川发电有限公司 Improved YOLO V5-based safety helmet detection method and system
CN113553977A (en) * 2021-07-30 2021-10-26 国电汉川发电有限公司 Improved YOLO V5-based safety helmet detection method and system
CN113723278B (en) * 2021-08-27 2023-11-03 上海云从汇临人工智能科技有限公司 Training method and device for form information extraction model
CN113723278A (en) * 2021-08-27 2021-11-30 上海云从汇临人工智能科技有限公司 Training method and device of form information extraction model
CN113869766A (en) * 2021-10-11 2021-12-31 吉林大学 Intelligent detection modeling method for alloy plate blanking quality
CN113869766B (en) * 2021-10-11 2024-04-09 吉林大学 Intelligent detection modeling method for blanking quality of alloy plate
CN113989265A (en) * 2021-11-11 2022-01-28 哈尔滨市科佳通用机电股份有限公司 Speed sensor bolt loss fault identification method based on deep learning
CN116310850B (en) * 2023-05-25 2023-08-15 南京信息工程大学 Remote sensing image target detection method based on improved RetinaNet
CN116310850A (en) * 2023-05-25 2023-06-23 南京信息工程大学 Remote sensing image target detection method based on improved RetinaNet
CN116778176A (en) * 2023-06-30 2023-09-19 哈尔滨工程大学 SAR image ship trail detection method based on frequency domain attention
CN116778176B (en) * 2023-06-30 2024-02-09 哈尔滨工程大学 SAR image ship trail detection method based on frequency domain attention

Similar Documents

Publication Publication Date Title
AU2019101133A4 (en) Fast vehicle detection using augmented dataset based on RetinaNet
JP7289918B2 (en) Object recognition method and device
Hong et al. Multimodal GANs: Toward crossmodal hyperspectral–multispectral image segmentation
CN110378381B (en) Object detection method, device and computer storage medium
Wang et al. Regional parallel structure based CNN for thermal infrared face identification
US10896342B2 (en) Spatio-temporal action and actor localization
CN110263786B (en) Road multi-target identification system and method based on feature dimension fusion
WO2018089158A1 (en) Natural language object tracking
Cadena et al. Pedestrian graph: Pedestrian crossing prediction based on 2d pose estimation and graph convolutional networks
CN111368972B (en) Convolutional layer quantization method and device
CN111723829B (en) Full-convolution target detection method based on attention mask fusion
CN111401517B (en) Method and device for searching perceived network structure
CN107767416B (en) Method for identifying pedestrian orientation in low-resolution image
US20220148291A1 (en) Image classification method and apparatus, and image classification model training method and apparatus
CN110222718B (en) Image processing method and device
CN111052151A (en) Video motion localization based on attention suggestions
CN112614119A (en) Medical image region-of-interest visualization method, device, storage medium and equipment
Haider et al. Human detection in aerial thermal imaging using a fully convolutional regression network
CN110263731B (en) Single step human face detection system
Raparthi et al. Machine Learning Based Deep Cloud Model to Enhance Robustness and Noise Interference
CN111461221A (en) Multi-source sensor fusion target detection method and system for automatic driving
Fang et al. Multi-channel feature fusion networks with hard coordinate attention mechanism for maize disease identification under complex backgrounds
Zhu et al. Indoor scene segmentation algorithm based on full convolutional neural network
CN111275732B (en) Foreground object image segmentation method based on depth convolution neural network
Kiran et al. Edge preserving noise robust deep learning networks for vehicle classification

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry