CN113706584A - Streetscape flow information acquisition method based on computer vision - Google Patents

Streetscape flow information acquisition method based on computer vision

Info

Publication number
CN113706584A
CN113706584A (application number CN202111026783.7A)
Authority
CN
China
Prior art keywords
frame
detection
video
target
tracking
Prior art date
Legal status
Pending
Application number
CN202111026783.7A
Other languages
Chinese (zh)
Inventor
王峥
吴东鹏
黄秀君
Current Assignee
Hohai University HHU
Original Assignee
Hohai University HHU
Priority date
Filing date
Publication date
Application filed by Hohai University HHU
Priority to CN202111026783.7A
Publication of CN113706584A
Legal status: Pending

Classifications

    • G06T 7/248: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments, involving reference images or patches
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06T 7/277: Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G06T 7/74: Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T 2207/10016: Video; image sequence
    • G06T 2207/10024: Color image
    • G06T 2207/20016: Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; pyramid transform
    • G06T 2207/20081: Training; learning
    • G06T 2207/20084: Artificial neural networks [ANN]

Abstract

The invention discloses a street view flow information acquisition method based on computer vision, which comprises: detecting and identifying objects in the street view with the target detection algorithm YOLOv5 for each frame of the video; extracting the appearance features of every detected object to assist the matching between detection frames and prediction frames; predicting the position of each detected target in the next frame with a Kalman filtering algorithm; computing a cost matrix from the extracted appearance features and motion features and using the Hungarian algorithm to perform cascade matching of the detection frames so that a tracking serial number is allocated to every identified object; maintaining the appearance features and tracking serial numbers of detected objects with data structures such as a prototype library, and judging whether a detected object appears in the video for the first time; and cropping a small image of every object detected for the first time, saving it to a specified path, counting the number of distinct objects appearing in each category, and displaying the motion track of each object in the video. The invention achieves real-time acquisition of image information of common street view objects together with counting statistics for the different categories of objects.

Description

Streetscape flow information acquisition method based on computer vision
Technical Field
The invention belongs to the interdisciplinary technical field of multi-target tracking, and particularly relates to a street view flow information acquisition method based on computer vision.
Background
Video target tracking is an important task in computer vision. It is the process of continuously inferring the state of a target throughout a video sequence: the target is located in every frame of the video to generate its motion track, and the complete target region in which the tracked target appears is provided at every moment. Video tracking technology is very widely used in the field of computer vision. The invention combines and improves target tracking and target detection techniques, alleviating the slow inference and tracking-loss problems of common street view flow information collection techniques.
The target detection algorithms commonly used in street view traffic information collection are usually based on the R-CNN family (R-CNN, Fast R-CNN, etc.): target candidate frames are first generated by the algorithm, and the generated candidate frames are then classified, regressed, and screened. The resulting problems are that inference is slow, the real-time requirements of video detection are difficult to meet, and frame-skipping processing of the video is often required. To address this, the latest one-stage algorithm YOLOv5 is adopted, which offers higher inference speed, higher precision, and better robustness, and also improves the accuracy of detecting small target objects.
Meanwhile, the tracking algorithm commonly adopted in street view flow information collection is the SORT algorithm; because it cannot make full use of image information, track-id switching can occur after a tracked target is temporarily occluded. To address this phenomenon, the invention extracts image information from the tracked target's detection frame with a simple CNN convolutional network and adds the extracted image information to the data-association cascade, thereby improving the precision of the whole algorithm.
Disclosure of Invention
The invention aims to provide a street view flow information acquisition method based on computer vision, which solves the problems in the prior art.
To realize these functions, the invention adopts the following technical scheme:
the street view flow information acquisition method based on computer vision is characterized by comprising the following steps:
S1: identifying more than ten kinds of common street view objects appearing in each frame of the video with the YOLOv5 algorithm, marking each detected object in the video with frames of different colors according to its category, and displaying the detected category and confidence at the upper left corner of the object;
S2: extracting the appearance features of the detected object, storing them as a low-dimensional vector, and providing a basis for data association;
S3: predicting the position of each object in the next frame with a Kalman filtering algorithm to generate a prediction frame;
S4: performing cascade matching between the prediction frames and the detection frames with the Hungarian algorithm, and allocating a tracking serial number to each detection frame;
S5: cropping a small image of each object appearing in the video for the first time, saving it to a specified path, and counting the number of objects of each category that appear.
As a further optimization, the specific process of step S1 is as follows:
s11: the input end adopts three methods of Mosaic data enhancement, self-adaptive anchor frame calculation and self-adaptive picture scaling to preprocess the input image data:
(1) Mosaic data enhancement: training images are spliced by random scaling, random cropping, and random arrangement, which enriches the background of detected objects; because the data of four pictures is processed in a single batch-normalization calculation, the mini-batch size does not need to be large and a single GPU can achieve a good effect, which both enriches the data set and enhances the detection accuracy of small target objects;
(2) Adaptive anchor frame calculation: in YOLOv3 and YOLOv4, the initial anchor box values for different training sets are computed by a separate program; YOLOv5 embeds this function into the code and adaptively computes the optimal anchor frame values for each training set at every training run;
(3) Adaptive picture scaling: the picture-scaling idea of earlier YOLO algorithms is changed: the fewest possible black edges are added adaptively to the original image, reducing the black borders at the two ends of the image height, which reduces the amount of computation during inference and improves the target detection speed, as in the sketch below.
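As referenced above, the following is a minimal Python sketch of such adaptive (letterbox-style) scaling using OpenCV; the target size of 640, the padding value of 114, and the stride of 32 are illustrative assumptions, not values prescribed by this document.

```python
import cv2

def letterbox(img, new_size=640, pad_value=114, stride=32):
    """Resize with the aspect ratio unchanged, then pad only to the next
    multiple of the network stride: the 'fewest black edges' idea."""
    h, w = img.shape[:2]
    scale = min(new_size / h, new_size / w)            # uniform scale factor
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_LINEAR)

    pad_h = (stride - new_h % stride) % stride         # minimal vertical padding
    pad_w = (stride - new_w % stride) % stride         # minimal horizontal padding
    top, bottom = pad_h // 2, pad_h - pad_h // 2
    left, right = pad_w // 2, pad_w - pad_w // 2
    return cv2.copyMakeBorder(resized, top, bottom, left, right,
                              cv2.BORDER_CONSTANT, value=(pad_value,) * 3)
```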
S12: the Backbone adopts a Focus structure and a CSP structure:
(1) Focus structure: the input picture is sliced: a value is taken from every other pixel, yielding four complementary sub-pictures with no information loss; the two-dimensional image information is thereby concentrated into the channel space, widening the input channels by a factor of 4 (the spliced picture has 12 channels instead of the original RGB three channels); a convolution is then applied to the new picture, finally producing a 2x-downsampled feature map with no information loss (see the sketch after this list);
(2) CSP structure: unlike the YOLOv4 algorithm, YOLOv5 designs two CSP structures: the CSP1_X structure is applied to the Backbone network, and the CSP2_X structure is applied to the Neck.
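The slicing in the Focus structure, referenced in (1) above, can be written compactly in PyTorch: every other pixel is taken in both directions, the four complementary sub-images are concatenated on the channel axis (3 to 12 channels), and a convolution follows. The kernel size and output channel count in this sketch are assumptions for illustration.

```python
import torch
import torch.nn as nn

class Focus(nn.Module):
    """2x spatial downsampling with no information loss: four pixel-wise
    slices are stacked on the channel axis before a single convolution."""
    def __init__(self, in_ch=3, out_ch=64, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch * 4, out_ch, k, stride=1, padding=k // 2)

    def forward(self, x):                                # x: (N, 3, H, W)
        patches = [x[..., ::2, ::2], x[..., 1::2, ::2],
                   x[..., ::2, 1::2], x[..., 1::2, 1::2]]
        return self.conv(torch.cat(patches, dim=1))      # (N, out_ch, H/2, W/2)
```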
S13: the Neck adopts an FPN + PAN structure, i.e., a bottom-up feature pyramid containing two PAN layers is added after the FPN layer. In combination, the FPN layer conveys strong semantic features from top to bottom while the feature pyramid conveys strong localization features from bottom to top; the two structures cooperate to aggregate parameters from different backbone layers into the different detection layers.
S14: the output end adopts GIOU_Loss as the loss function of the Bounding box:

GIOU_Loss = 1 - GIOU = 1 - ( IOU - (Area(C) - Area(A ∪ B)) / Area(C) )

where A and B are the two boxes being compared, C is their minimum circumscribed rectangle, and IOU is the intersection-over-union ratio, whose value equals the overlap area divided by the union area and which is the standard criterion for evaluating the precision of a target detection algorithm.
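As an illustration of the loss above, the following sketch computes GIOU_Loss for two axis-aligned boxes; the (x1, y1, x2, y2) box format and the epsilon guard are assumptions made for the example, not requirements of the method.

```python
def giou_loss(box_a, box_b, eps=1e-7):
    """GIOU_Loss = 1 - GIoU, with GIoU = IoU - (area(C) - area(A U B)) / area(C),
    where C is the minimum rectangle enclosing both boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    iou = inter / (union + eps)

    # area of the minimum circumscribed rectangle C
    c_area = ((max(ax2, bx2) - min(ax1, bx1))
              * (max(ay2, by2) - min(ay1, by1)))
    giou = iou - (c_area - union) / (c_area + eps)
    return 1.0 - giou
```

Two identical boxes give a loss of 0, and the loss grows toward 2 as the boxes move apart, which is what makes GIOU_Loss usable even when the boxes do not overlap.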
As a further optimization, the specific process of step S2 is as follows: a relatively simple CNN with a small amount of computation is adopted to extract the appearance features of the detected object (the area covered by its detection frame), represented by a 128-dimensional vector; after each frame is detected and tracked, the appearance features of the object are extracted and stored. The appearance features are kept in the data structure gallery, i.e.

R_i = { r_k^(i) , k = 1, …, L_k },  L_k = 100

where i indicates the tracking serial number and the gallery stores, for each tracked target i, only the appearance features extracted in at most the 100 frames before the current time.
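The gallery above can be kept as one fixed-length buffer of appearance vectors per tracking serial number. Below is a minimal Python sketch of this bookkeeping, assuming NumPy arrays for the 128-dimensional features; the helper name and the L2 normalization are illustrative assumptions rather than part of the patent.

```python
from collections import defaultdict, deque

import numpy as np

L_K = 100  # at most the features of the last 100 frames are kept per target

# tracking serial number i -> deque of recent 128-d appearance vectors (R_i)
gallery = defaultdict(lambda: deque(maxlen=L_K))

def update_gallery(track_id, feature):
    """Store the appearance vector of track_id, discarding the oldest entry
    once more than L_K features have accumulated (hypothetical helper)."""
    feature = np.asarray(feature, dtype=np.float32)
    gallery[track_id].append(feature / (np.linalg.norm(feature) + 1e-12))
```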
Specifically, step S3 includes:
(1) the state of the track at time t is predicted based on its state at time t-1:
x' = Fx (1)
P' = FPF^T + Q (2)
(2) The predicted position is updated based on detection.
y = z - Hx' (3)
S = HP'H^T + R (4)
K = P'H^T S^(-1) (5)
x = x' + Ky (6)
P = (I - KH)P' (7)
In formulas 1 and 2, F is the state transition matrix, x is the mean of the track at time t-1, Q is the system noise matrix, F^T is the transpose of the state transition matrix, y is the mean error between the detection and the track, S is the innovation (noise) covariance, and I is the identity matrix.
In formula 3, z is the mean vector of the detection and contains no velocity components, i.e., z = [cx, cy, r, h]; H is the measurement matrix that maps the track's mean vector x' into the detection space; the formula computes the mean error between the detection and the track;
in formula 4, R is the noise matrix of the detector, a 4 × 4 diagonal matrix whose diagonal entries are the noise of the two center-point coordinates and of the width and height; it is initialized with arbitrary values, and the width-height noise is generally set larger than the center-point noise.
Formula 5 computes the Kalman gain K, which estimates how much weight to give to the error;
equations 6 and 7 obtain the updated mean vector x and covariance matrix P.
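A NumPy sketch of the predict/update recursion in equations (1)-(7) follows; the function names are illustrative, and the concrete F, H, Q, R matrices (built from the 8-dimensional state and 4-dimensional measurement described above) are left to the caller.

```python
import numpy as np

def kf_predict(x, P, F, Q):
    """Equations (1)-(2): propagate the track's mean and covariance."""
    x_pred = F @ x                                   # x' = F x
    P_pred = F @ P @ F.T + Q                         # P' = F P F^T + Q
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H, R):
    """Equations (3)-(7): correct the prediction with the matched detection z."""
    y = z - H @ x_pred                               # innovation (eq. 3)
    S = H @ P_pred @ H.T + R                         # innovation covariance (eq. 4)
    K = P_pred @ H.T @ np.linalg.inv(S)              # Kalman gain (eq. 5)
    x = x_pred + K @ y                               # updated mean (eq. 6)
    P = (np.eye(x_pred.shape[0]) - K @ H) @ P_pred   # updated covariance (eq. 7)
    return x, P
```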
As a further optimization, the specific process of step S4 is as follows:
(1) The Mahalanobis distance of the motion features is computed as the cost function

d^(1)(i, j) = (d_j - y_i)^T S_i^(-1) (d_j - y_i)

where i denotes the tracking serial number and j the detection frame number; (y_i, S_i) denotes the projection of the i-th Kalman filter distribution into the measurement space, y_i being the mean and S_i the covariance. Because the distance is measured against the measured values (detection frames), the measurement must be performed in the same spatial distribution. The Mahalanobis distance expresses the uncertainty between state estimates by measuring the deviation, in standard deviations, between the Kalman filter's mean tracked position and the detection box, i.e., d^(1)(i, j) is the Mahalanobis distance (uncertainty) between the i-th track distribution and the j-th detection box; d_j denotes the motion feature vector of detection j, and S_i^(-1) denotes the inverse of its projected covariance.
(2) The minimum cosine distance between the 128-dimensional vector (appearance feature) extracted by the small CNN and the appearance features of the previous 100 frames stored in the gallery is computed to obtain the cost matrix

d^(2)(i, j) = min{ 1 - r_j^T r_k^(i) | r_k^(i) ∈ R_i }

and cascade matching of the prediction frames and detection frames is realized with the Hungarian algorithm; r_j^T is the transposed appearance vector of the detection frame with index j, and r_k^(i) ∈ R_i denotes one of the appearance features stored in the gallery of the tracked target i.
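To make the association step concrete, the sketch below combines the squared Mahalanobis distance d^(1) and the minimum cosine distance d^(2) into one cost matrix and solves it with SciPy's Hungarian solver. The dictionary fields, the weighting factor lam, and the assumption that appearance vectors are L2-normalized are illustrative choices, not specified by the patent.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def association_cost(tracks, detections, lam=0.5):
    """Cost matrix C[i, j] = lam * d1(i, j) + (1 - lam) * d2(i, j)."""
    C = np.zeros((len(tracks), len(detections)), dtype=np.float32)
    for i, trk in enumerate(tracks):
        S_inv = np.linalg.inv(trk["S"])            # projected covariance S_i^-1
        feats = np.stack(trk["gallery"])           # stored 128-d vectors, unit norm
        for j, det in enumerate(detections):
            diff = det["z"] - trk["y"]             # offset in measurement space
            d1 = float(diff @ S_inv @ diff)        # squared Mahalanobis distance
            d2 = float(np.min(1.0 - feats @ det["feature"]))  # min cosine distance
            C[i, j] = lam * d1 + (1.0 - lam) * d2
    return C

def cascade_match(tracks, detections):
    """Hungarian assignment of detections to tracks over the cost matrix."""
    rows, cols = linear_sum_assignment(association_cost(tracks, detections))
    return list(zip(rows.tolist(), cols.tolist()))
```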
As a further optimization, the specific process of step S5 is as follows: a queue stores 'category-tracking serial number' pairs; after a tracking serial number has been matched to a detection frame, the data stored in the queue is checked to see whether the detected object is already present; if the object appears in the video sequence for the first time, a small image of that object's frame is cropped with opencv and saved to a specified location.
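A small sketch of this first-appearance bookkeeping is given below; the output directory layout, the file naming, and the use of a plain queue of (category, tracking serial number) pairs are assumptions for illustration.

```python
import os
from collections import deque

import cv2

seen = deque()   # queue of (category, tracking serial number) pairs already reported

def report_if_new(frame, box, category, track_id, out_dir="snapshots"):
    """Crop and save the detection the first time a (category, id) pair appears."""
    if (category, track_id) in seen:
        return False                                # object was seen before
    seen.append((category, track_id))
    x1, y1, x2, y2 = map(int, box)
    os.makedirs(os.path.join(out_dir, category), exist_ok=True)
    cv2.imwrite(os.path.join(out_dir, category, f"{track_id}.jpg"),
                frame[y1:y2, x1:x2])                # small image of the object
    return True
```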
Compared with the prior art, the invention has the following beneficial effects:
according to the method, the YOLOv5 with high performance in reasoning speed and precision is combined with a multi-target tracking algorithm, so that the monocular camera collects image information of common objects in the street view in real time, classifies and counts the identified objects, and reports the small pictures appearing in the video for the first time. The method has higher reasoning speed, higher precision and robustness, and simultaneously improves the accuracy of detecting the small target object.
Meanwhile, the invention extracts the image information of the tracking target detection frame by using a simple CNN convolutional network, and the extracted image information is added in the data cascade, thereby improving the precision of the whole algorithm.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of the idea of step five tracking;
FIG. 3 is a frame occurring in a video;
FIG. 4 is the result of video annotation after the present invention has been run;
FIG. 5 shows the result of the image information of a common object in the street view video collected by the present invention;
fig. 6 shows the statistics of the number of objects in each category.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As shown in fig. 1, each unprocessed video frame is first passed through the YOLOv5 algorithm for object recognition, which yields the detection frame of every object in the current frame. A Kalman filtering algorithm predicts, from the information of the previous frame's detection frame, the position of the object's detection frame in the current frame, producing a prediction frame. The image region inside each detection frame is processed by a lightweight CNN to obtain a 128-dimensional vector representing the object's appearance features. The Mahalanobis distance of the motion features represented by the prediction frames and the minimum cosine distance of the appearance features are computed to obtain a cost matrix, the Hungarian algorithm computes the best match, and each detection frame is assigned a tracking serial number and a detection category, which are displayed at its upper-left corner. The small detection-frame image of an object that appears in the video for the first time is saved to the corresponding path, the number of objects appearing in each category is counted, and the statistics are displayed on the left side of the image in real time.
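The per-frame loop of fig. 1 might be organized as in the sketch below. The detector, extractor, tracker, and counter objects, along with the attribute names on the returned tracks, are placeholders standing in for the YOLOv5 detector, the lightweight CNN, the Kalman/Hungarian tracker, and the statistics module described above; none of them is a real library API.

```python
import cv2

def run(video_path, detector, extractor, tracker, counter):
    """One pass over the video: detect, embed, predict, associate, report."""
    cap = cv2.VideoCapture(video_path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        dets = detector(frame)                            # boxes, classes, scores
        feats = [extractor(frame, d.box) for d in dets]   # 128-d appearance vectors
        tracks = tracker.step(dets, feats)                # Kalman predict + cascade match
        for trk in tracks:
            counter.report_if_new(frame, trk)             # crop + per-category count
            x1, y1, x2, y2 = map(int, trk.box)
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.putText(frame, f"{trk.category} #{trk.track_id}", (x1, y1 - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        cv2.imshow("streetscape", frame)
        if cv2.waitKey(1) == 27:                          # Esc stops playback
            break
    cap.release()
    cv2.destroyAllWindows()
```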
The invention relates to a street view flow information acquisition method based on computer vision, which comprises the following steps:
the method comprises the following steps: and performing target recognition on each frame of the original video by using a YOLOv5 algorithm to obtain a prediction frame, distinguishing different types of objects by using prediction frames with different colors, and displaying the category information and the confidence coefficient of the detected object at the upper left corner of the detection frame. Compared with the previous generation YOLO algorithm, the YOLO 5 algorithm improves the network structure and training skills so as to obtain higher inference speed and detection accuracy, and meanwhile, the detection accuracy can further enhance the tracking accuracy of the tracking algorithm.
Step two: an 8-dimensional vector represents the motion state of each detected object's detection frame, and the Kalman algorithm predicts the position of the object's detection frame in the next frame from the change of its motion state in the previous frame.
Step three: a simple CNN network extracts the appearance features of the detected object, which are stored in the data structure gallery. These appearance features effectively alleviate the ID_Switch phenomenon among tracked objects and greatly improve the accuracy of the tracking algorithm.
Step four: the Mahalanobis distance of the motion states and the minimum cosine distance of the appearance features are computed to obtain a cost matrix, the Hungarian algorithm performs cascade matching on the cost matrix, and each detection frame is assigned a corresponding tracking serial number.
Step five: by recording 'category-tracking serial number' data pairs, as shown in fig. 2, it is determined whether each detected object appears in the video for the first time; if so, the corresponding region picture is cropped with opencv and saved to the corresponding path, as shown in fig. 5, and the number of distinct objects of each category appearing in the video is recorded, as shown in figs. 4 and 6. If not, the object is associated with the center point of its position in the previous frame, and opencv is used to present the object's motion track in the video.
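The trajectory display in step five can be approximated by keeping a short history of box centers per tracking serial number and joining them with line segments, as in the sketch below; the history length of 30 points and the drawing color are assumptions.

```python
from collections import defaultdict, deque

import cv2

trails = defaultdict(lambda: deque(maxlen=30))   # tracking serial number -> centers

def draw_trail(frame, track_id, box, color=(0, 0, 255)):
    """Append the current box center and draw the polyline of recent centers."""
    x1, y1, x2, y2 = map(int, box)
    trails[track_id].append(((x1 + x2) // 2, (y1 + y2) // 2))
    pts = list(trails[track_id])
    for p, q in zip(pts, pts[1:]):
        cv2.line(frame, p, q, color, 2)          # link consecutive frame centers
    return frame
```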
The above embodiments are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modifications made on the basis of the technical scheme according to the technical idea of the present invention fall within the protection scope of the present invention.

Claims (6)

1. The street view flow information acquisition method based on computer vision is characterized by comprising the following steps:
S1: identifying more than ten kinds of common street view objects appearing in each frame of the video, marking each detected object in the video with frames of different colors according to its category to obtain detection frames, and displaying the detected category and confidence at the upper left corner of the object;
s2: extracting appearance characteristics of the detected object and providing a basis for the associated data;
s3: predicting the position of the next frame of object to generate a prediction frame;
S4: performing cascade matching between the prediction frames and the detection frames, and allocating a tracking serial number to each detection frame;
s5: and intercepting the small image of the object appearing in the video for the first time, storing the small image to a specified path, and counting the number of various objects appearing.
2. The street view traffic information collection method based on computer vision as claimed in claim 1, wherein the step S1 is specifically performed by:
s11: the input end adopts three methods of Mosaic data enhancement, self-adaptive anchor frame calculation and self-adaptive picture scaling to preprocess the input image data:
(1) Mosaic data enhancement: training images are spliced by random scaling, random cropping, and random arrangement, which enriches the data set and enhances the detection precision of small target objects;
(2) Adaptive anchor frame calculation: in YOLOv3 and YOLOv4, the initial anchor box values for different training sets are computed by a separate program; YOLOv5 embeds this function into the code and adaptively computes the optimal anchor frame values for each training set at every training run;
(3) Adaptive picture scaling: the fewest possible black edges are added to the original image in a self-adaptive manner, so that the black borders at the two ends of the image height are reduced and the amount of computation during inference is reduced, i.e., the target detection speed is improved;
s12: backbone: focus structure, CSP structure:
(1) focus structure: clipping an input picture through a slicing operation;
(2) CSP structure: unlike the YOLOv4 algorithm, YOLOv5 designs two CSP structures: the CSP1_X structure is applied to the Backbone network, and the CSP2_X structure is applied to the Neck;
S13: the Neck adopts an FPN + PAN structure, i.e., a bottom-up feature pyramid containing two PAN layers is added after the FPN layer. In combination, the FPN layer conveys strong semantic features from top to bottom while the feature pyramid conveys strong localization features from bottom to top; the two structures cooperate to aggregate parameters from different backbone layers into the different detection layers.
S14: and the output end adopts GIOU _ Loss as a Loss function of the Bounding box.
3. The street view traffic information collection method based on computer vision as claimed in claim 1, wherein the step S2 is specifically performed by: adopting a CNN to extract the appearance features of the area covered by the detection frame of the detected object and representing them by a 128-dimensional vector; after each frame is detected and tracked, extracting and storing the appearance features of the object once, using the data structure gallery, i.e.

R_i = { r_k^(i) , k = 1, …, L_k },  L_k = 100

wherein L_k indicates that at most the appearance features of the target from the 100 frames before the current time can be stored in the gallery, i represents the tracking serial number, and r_k^(i) represents a stored appearance feature of the target, i.e., a 128-dimensional vector of the target box extracted by the CNN.
4. The street view traffic information collection method based on computer vision as claimed in claim 1, wherein the step S3 is specifically performed by:
(1) predicting the state of the track at time t based on its state at time t-1;
(2) the predicted position is updated based on detection.
5. The street view traffic information collection method based on computer vision as claimed in claim 1, wherein the step S4 is specifically performed by: computing a cost matrix from the motion features obtained through Kalman filtering and the 128-dimensional appearance feature vectors extracted by the small CNN, and realizing cascade matching of the prediction frames and detection frames with the Hungarian algorithm.
6. The street view traffic information collection method based on computer vision as claimed in claim 1, wherein the step S5 is specifically performed by: and storing a 'category-tracking sequence number' pair by using a queue, checking whether the detected object exists in the data stored in the queue after the tracking sequence number is matched for the detection frame, and if the object appears in the video sequence for the first time, intercepting a small image of the frame of the object by using opencv and storing the small image to a specified position.
CN202111026783.7A 2021-09-02 2021-09-02 Streetscape flow information acquisition method based on computer vision Pending CN113706584A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111026783.7A CN113706584A (en) 2021-09-02 2021-09-02 Streetscape flow information acquisition method based on computer vision

Publications (1)

Publication Number Publication Date
CN113706584A true CN113706584A (en) 2021-11-26

Family

ID=78657419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111026783.7A Pending CN113706584A (en) 2021-09-02 2021-09-02 Streetscape flow information acquisition method based on computer vision

Country Status (1)

Country Link
CN (1) CN113706584A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114445393A (en) * 2022-02-07 2022-05-06 无锡雪浪数制科技有限公司 Bolt assembly process detection method based on multi-vision sensor
CN114724359A (en) * 2022-03-07 2022-07-08 重庆亲禾智千科技有限公司 Deepstream-based road event detection method
CN116563769A (en) * 2023-07-07 2023-08-08 南昌工程学院 Video target identification tracking method, system, computer and storage medium
CN116563769B (en) * 2023-07-07 2023-10-20 南昌工程学院 Video target identification tracking method, system, computer and storage medium

Similar Documents

Publication Publication Date Title
CN109829398B (en) Target detection method in video based on three-dimensional convolution network
CN113706584A (en) Streetscape flow information acquisition method based on computer vision
Srivatsa et al. Salient object detection via objectness measure
CN113011329A (en) Pyramid network based on multi-scale features and dense crowd counting method
CN113763427B (en) Multi-target tracking method based on coarse-to-fine shielding processing
CN111079739A (en) Multi-scale attention feature detection method
CN113609896A (en) Object-level remote sensing change detection method and system based on dual-correlation attention
CN115841649A (en) Multi-scale people counting method for urban complex scene
CN113850136A (en) Yolov5 and BCNN-based vehicle orientation identification method and system
CN110084284A (en) Target detection and secondary classification algorithm and device based on region convolutional neural networks
CN111553337A (en) Hyperspectral multi-target detection method based on improved anchor frame
CN115393635A (en) Infrared small target detection method based on super-pixel segmentation and data enhancement
CN110738076A (en) People counting method and system in images
CN101610412B (en) Visual tracking method based on multi-cue fusion
LU500512B1 (en) Crowd distribution form detection method based on unmanned aerial vehicle and artificial intelligence
CN113408550B (en) Intelligent weighing management system based on image processing
CN114463724A (en) Lane extraction and recognition method based on machine vision
CN114422720A (en) Video concentration method, system, device and storage medium
CN114155278A (en) Target tracking and related model training method, related device, equipment and medium
CN112926426A (en) Ship identification method, system, equipment and storage medium based on monitoring video
CN112668662A (en) Outdoor mountain forest environment target detection method based on improved YOLOv3 network
CN116052090A (en) Image quality evaluation method, model training method, device, equipment and medium
CN113723181B (en) Unmanned aerial vehicle aerial photographing target detection method and device
CN115482256A (en) Lightweight target detection and automatic tracking method based on semantic segmentation
CN115512263A (en) Dynamic visual monitoring method and device for falling object

Legal Events

Date Code Title Description
PB01 Publication