WO2019085377A1 - Target tracking hardware implementation system and method - Google Patents

Target tracking hardware implementation system and method

Info

Publication number
WO2019085377A1
WO2019085377A1 (PCT/CN2018/080595)
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
kcf
feature
training
Prior art date
Application number
PCT/CN2018/080595
Other languages
English (en)
French (fr)
Inventor
贾希杰
吴迪
孙寒泊
Original Assignee
北京深鉴智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京深鉴智能科技有限公司
Publication of WO2019085377A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30232 Surveillance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/62 Extraction of image or video features relating to a temporal dimension, e.g. time-based feature extraction; Pattern tracking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Definitions

  • The present invention relates to computer vision, and more particularly to a method and system for real-time, high-resolution, multi-target tracking.
  • Target tracking is an important topic in the field of computer vision.
  • Its main task is to design a discriminative classifier that can distinguish the target from its surroundings.
  • High-speed trackers based on Kernelized Correlation Filters (KCF) are a notable recent development.
  • The KCF algorithm uses a circulant matrix to construct training samples for the classifier (a ridge regression problem), and avoids the costly matrix inversion by carrying out the computation in the discrete Fourier domain, which reduces the algorithm's computational and storage complexity and increases its speed.
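To make the closed-form training concrete, here is a minimal Python sketch, not the patented hardware implementation: it assumes a linear kernel and a single-channel patch (the function names and the Gaussian label are illustrative), solving the ridge regression over all cyclic shifts element-wise in the Fourier domain:

```python
import numpy as np

def gaussian_response(h, w, sigma=2.0):
    """Desired regression label: a 2-D Gaussian, rolled so its peak
    sits at (0, 0), i.e. at zero shift."""
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((ys - h // 2) ** 2 + (xs - w // 2) ** 2) / (2 * sigma ** 2))
    return np.roll(g, (-(h // 2), -(w // 2)), axis=(0, 1))

def kcf_train(x, y, lam=1e-4):
    """Ridge regression over all cyclic shifts of patch x (linear
    kernel), solved element-wise in the Fourier domain: no matrix
    inversion is needed, only per-frequency divisions."""
    xf = np.fft.fft2(x)
    kf = xf * np.conj(xf) / x.size          # kernel autocorrelation of x
    alphaf = np.fft.fft2(y) / (kf + lam)    # training coefficients
    return alphaf

def kcf_detect(alphaf, x, z):
    """Dense response map of a test patch z against the template x."""
    kzf = np.fft.fft2(z) * np.conj(np.fft.fft2(x)) / x.size
    return np.real(np.fft.ifft2(alphaf * kzf))
```

On the hardware described here, the same element-wise multiplications and divisions would map onto parallel pipelines; the Python form only shows the data flow.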
  • According to a first aspect of the invention, a target tracking hardware implementation system is provided, comprising: an intercept scaling module, configured to crop a target image and several scales of images to be detected from the video in an external storage module and normalize them to a specified size by scaling; a feature extraction module, configured to extract features from the normalized images; a feature management module, configured to access and update the matching templates based on the extracted image features; and a kernelized correlation filter (KCF) calculation module, configured to compute the KCF response of each image to be detected from the extracted image features and the updated matching template, selecting the image with the largest KCF response as the tracked target.
  • The video may be high-resolution video.
  • The feature extraction module may further be configured to extract histogram of oriented gradients (HOG) features and to apply normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting to the extracted feature vectors.
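As a rough sketch of that post-processing chain (the shapes, the PCA-via-SVD shortcut, and all names are assumptions; a real HOG extractor would supply `fmap`):

```python
import numpy as np

def postprocess_features(fmap, n_components=8):
    """fmap: (H, W, C) per-cell descriptors (e.g. HOG bins).
    Normalize each cell, reduce the channel dimension by PCA, and
    apply a Hann window to suppress the boundary effect that the
    circulant training samples introduce."""
    h, w, c = fmap.shape
    # 1. L2-normalize each cell's descriptor
    norms = np.linalg.norm(fmap, axis=2, keepdims=True) + 1e-8
    fmap = fmap / norms
    # 2. PCA over channels: project onto the top principal components
    flat = fmap.reshape(-1, c)
    flat = flat - flat.mean(axis=0)
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    reduced = (flat @ vt[:n_components].T).reshape(h, w, n_components)
    # 3. Hann-window weighting over the spatial dimensions
    window = np.outer(np.hanning(h), np.hanning(w))[:, :, None]
    return reduced * window
```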
  • The feature management module may use a ping-pong buffer structure to store and access the image features.
  • The intercept scaling module may further be configured to: take the previous frame from the video as the training image frame and the current frame as the frame to be detected; extract the target position from the training image frame and, based on it, derive candidate positions at several scales in the frame to be detected; crop the target image from the training image frame and normalize it to a specified size by scaling; and crop an image to be detected at each candidate position and normalize all of them to the same size by scaling.
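The multi-scale cropping and normalization could look like the following sketch; the nearest-neighbour resize stands in for the hardware scaler, and all names and parameters are assumptions:

```python
import numpy as np

def resize_nn(img, out_h, out_w):
    """Nearest-neighbour resize (a stand-in for the hardware scaler)."""
    h, w = img.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return img[rows][:, cols]

def crop_scaled_patches(frame, cx, cy, base, scales=(0.95, 1.0, 1.05), size=64):
    """Cut patches at several scales around centre (cx, cy) and
    normalize each to size x size, clamping windows to the frame."""
    h, w = frame.shape[:2]
    patches = []
    for s in scales:
        half = int(base * s) // 2
        y0, y1 = max(0, cy - half), min(h, cy + half)
        x0, x1 = max(0, cx - half), min(w, cx + half)
        patches.append(resize_nn(frame[y0:y1, x0:x1], size, size))
    return np.stack(patches)
```

Because every patch is scaled to the same fixed size, the downstream feature and KCF stages see a constant workload regardless of the input resolution, which is what lets the system keep a stable processing speed on high-resolution video.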
  • The feature extraction module may further be configured to extract the feature vector of the normalized target image and the feature vectors of the normalized images to be detected, and to send them to the feature management module for storage.
  • The feature management module may further be configured to read the historical training matching template from the external storage module and to update the feature training matching template with the feature vector of the target image.
  • The KCF calculation module may further be configured to: generate KCF training coefficients from the target-image feature vector read from the feature management module, using the discrete Fourier transform, thereby updating the KCF training-coefficient matching template; compute, from the updated feature training matching template and the feature vectors of the images to be detected read from the feature management module, the KCF response of each image to be detected using the updated KCF training-coefficient matching template; and take the image with the largest KCF response as the tracked target, converting the result into the tracked target's size and offset distance.
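The scale selection and offset conversion can be sketched as follows (a hedged illustration: the wrap-around convention assumes the response map comes from an FFT whose zero shift sits at index 0, as in the training sketch above):

```python
import numpy as np

def peak_to_offset(response):
    """Convert the response peak index into a signed offset:
    indices past the midpoint wrap around to negative shifts."""
    h, w = response.shape
    dy, dx = np.unravel_index(np.argmax(response), response.shape)
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

def select_scale(responses, scales):
    """Pick the scale whose response map has the largest peak, and
    return that scale together with the peak's signed offset."""
    peaks = [r.max() for r in responses]
    best = int(np.argmax(peaks))
    return scales[best], peak_to_offset(responses[best])
```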
  • According to a second aspect of the invention, a target tracking method is provided, comprising: the intercept scaling module crops a target image and several scales of images to be detected from the video in the external storage module and normalizes them to a specified size by scaling; the feature extraction module extracts features from the normalized images; the feature management module accesses and updates the matching templates based on the extracted image features; and the kernelized correlation filter (KCF) calculation module computes the KCF response of each image to be detected from the extracted image features and the updated matching template, selecting the image with the largest KCF response as the tracked target.
  • The video may be high-resolution video.
  • The step of extracting features from the normalized images may further include extracting histogram of oriented gradients (HOG) features and applying normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting to the extracted feature vectors.
  • The method of the second aspect of the invention may further comprise using a ping-pong buffer structure to store and access the image features.
  • The step of cropping a target image and several scales of images to be detected from the video in the external storage module and normalizing them to a specified size by scaling may further include: taking the previous frame from the video as the training image frame and the current frame as the frame to be detected; extracting the target position from the training image frame and, based on it, deriving candidate positions at several scales in the frame to be detected; cropping the target image from the training image frame and normalizing it to a specified size by scaling; and cropping an image to be detected at each candidate position and normalizing all of them to the same size by scaling.
  • The step of extracting features from the normalized images may further include extracting and storing the feature vector of the normalized target image and the feature vectors of the normalized images to be detected.
  • The step of accessing and updating the matching template based on the extracted image features may further include reading the historical training matching template from the external storage module and updating the feature training matching template with the feature vector of the target image.
  • The step of computing the KCF response of each image to be detected from the extracted image features and the updated matching template, and selecting the image with the largest KCF response as the tracked target, may further include: generating KCF training coefficients from the feature vector of the target image, using the discrete Fourier transform, thereby updating the KCF training-coefficient matching template; computing, from the updated feature training matching template and the feature vectors of the images to be detected, the KCF response of each image to be detected using the updated KCF training-coefficient matching template; and taking the image with the largest KCF response as the tracked target, converting the result into the tracked target's size and offset distance.
  • The method according to the second aspect of the invention may further comprise traversing multiple targets in parallel using multiple sets of computing resources until all targets have been tracked.
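A software analogue of traversing targets with multiple sets of computing resources might look like this sketch (the `track_one` callable is a placeholder, not part of the patent):

```python
from concurrent.futures import ThreadPoolExecutor

def track_all(targets, frame, track_one, n_workers=4):
    """Traverse targets in parallel, mirroring several hardware
    compute pipelines; track_one(target, frame) returns the updated
    state for a single target."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(lambda t: track_one(t, frame), targets))
```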
  • The method according to the second aspect of the present invention may further comprise taking the current frame as the new training image frame, reading the next frame from the video as the frame to be detected, and traversing all image frames in the video in this way until the video ends.
  • According to a third aspect of the invention, a computer readable medium records instructions executable by a processor; when executed by the processor, they cause it to perform a target tracking method comprising the following operations: cropping a target image and several scales of images to be detected from the video in an external storage module and normalizing them to a specified size by scaling; extracting features from the normalized images; accessing and updating the matching templates based on the extracted image features; and computing the kernelized correlation filter (KCF) response of each image to be detected from the extracted features and the updated matching template, selecting the image with the largest KCF response as the tracked target.
  • The present invention combines the advantages of the existing KCF algorithm and implements a real-time, high-resolution, multi-target tracking method and system on a parallel hardware platform, such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC), giving it the advantages of small size, low power consumption, and strong real-time performance.
  • FIG. 1 is a schematic diagram illustrating a target tracking method according to a preferred embodiment of the present invention.
  • FIG. 2 is a schematic block diagram illustrating a target tracking hardware implementation system according to a preferred embodiment of the present invention.
  • FIG. 3 is a flowchart of a more general target tracking method according to the present invention.
  • FIG. 1 is a schematic diagram for explaining a target tracking method in accordance with a preferred embodiment of the present invention. As shown in FIG. 1, the present invention provides a real-time high-resolution multi-target tracking method.
  • Step 1: Obtain the previous frame of the video as the training image frame, and obtain the current frame as the frame to be detected.
  • Preferably, the video is high-resolution video, providing better picture quality and more image detail.
  • Step 2: Extract the target position from the training image frame and, based on it, derive candidate positions at several scales in the frame to be detected.
  • Step 3: Crop the target image from the training image frame and normalize it to a specified size by scaling; from the frame to be detected, crop an image to be detected at each candidate position and normalize them to the same size by scaling.
  • Step 4: Extract features from the normalized target image and from each normalized image to be detected.
  • Preferably, histogram of oriented gradients (HOG) features are extracted, and the feature vectors are normalized, reduced by principal component analysis (PCA), and weighted with a Hanning window to reduce the boundary effect of the circulant matrix.
  • Step 5: Compute the KCF training coefficients of the target image features.
  • Step 6: Import the historical training matching template, update the feature training matching template with the target image features, update the KCF training-coefficient matching template with the KCF training coefficients of the target image features, and export the updated templates.
  • Step 7: From the feature training matching template and the KCF training-coefficient matching template of the target image, compute the KCF response matrix of the features of the images to be detected at each scale.
  • Step 8: Among the images to be detected at each scale, take the scale of the image with the largest response as the size of the tracked target.
  • Step 9: Convert the offset of the maximum response within the feature map into the offset distance of the tracked target.
  • Step 10: Switch to the next target and repeat steps 2 through 9 until all targets have been traversed serially.
  • Alternatively, the targets are computed in parallel until the computation for all targets is complete.
  • Step 11: Take the current frame as the training image frame, read the next frame from the video stream as the frame to be detected, and repeat steps 2 through 10 until the video ends.
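The per-frame structure of steps 2 through 11 can be condensed into a small loop skeleton; the `step` callable, standing in for steps 2 through 9 on one target, is an assumption for illustration only:

```python
def run_tracker(frames, targets, step):
    """Per-frame loop of the method: the previous frame trains the
    templates, the current frame is searched; step(target,
    train_frame, test_frame) performs steps 2-9 for one target and
    returns its updated state. All targets are traversed each frame,
    and all frames are traversed until the video ends."""
    for train_frame, test_frame in zip(frames, frames[1:]):
        targets = [step(t, train_frame, test_frame) for t in targets]
    return targets
```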
  • The invention also provides a real-time, high-resolution, multi-target tracking hardware implementation system.
  • FIG. 2 is a schematic block diagram illustrating a target tracking hardware implementation system according to a preferred embodiment of the present invention.
  • The target tracking hardware implementation system 200 according to a preferred embodiment of the present invention includes an intercept scaling module 210 for cropping a target image and several scales of images to be detected from the video in the external storage module 300 and normalizing them to a specified size by scaling.
  • The feature extraction module 220 is configured to extract features from the normalized images.
  • The feature management module 230 is configured to access and update the matching templates based on the extracted image features.
  • The kernelized correlation filter (KCF) calculation module 240 is configured to compute the KCF response of each image to be detected from the extracted image features and the updated matching template, selecting the image with the largest KCF response as the tracked target.
  • The target tracking hardware implementation system 200 and its supporting or external modules 100, 300, 400, 500, and 201 according to a preferred embodiment of the present invention are described in detail below with reference to the method of FIG. 1 and the block diagram of FIG. 2.
  • The general-purpose processor 100 receives video data from the video source 400 and stores it in the external storage module 300, generates the hardware scheduling policy of the tracking system, sends instructions and parameters to the receiving control module 201 of the tracking system, and receives the results of the KCF calculation module 240, which it formats and outputs to the display device 500.
  • The receiving control module 201 receives the instructions and parameters sent by the general-purpose processor 100 and thereby controls the working modes of the intercept scaling module 210 and the feature management module 230.
  • The intercept scaling module 210 reads an image frame from the external storage module 300 according to the instructions and parameters sent by the receiving control module 201, crops the image, and normalizes it to a specified size by scaling.
  • The feature extraction module 220 extracts features from the normalized images.
  • Preferably, HOG features are extracted, and the feature vectors undergo normalization, PCA dimensionality reduction, and Hanning-window weighting to improve the classification performance of the KCF classifier and to reduce system computation and storage complexity.
  • The feature management module 230 manages the feature vectors.
  • It stores the feature-vector matrices produced by the feature extraction module 220, stores and updates the feature training matching template, stores and reads the KCF training-coefficient matching template, and sequentially sends the stored feature vectors and the feature training matching template to the KCF calculation module 240. It also writes the matching templates back to the external storage module 300 so that the system can switch between the matching templates of multiple targets.
  • Preferably, the feature vectors are stored using a ping-pong strategy so that feature extraction and the KCF calculation can execute as a pipeline.
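A minimal software model of the ping-pong strategy (an illustration only; in hardware the two banks would be separate RAMs):

```python
class PingPongBuffer:
    """Two banks: while the consumer (the KCF calculation) reads one
    bank, the producer (feature extraction) writes the other; swap()
    flips the roles, so the two stages run as a pipeline."""

    def __init__(self):
        self.banks = [None, None]
        self.write_idx = 0

    def write(self, data):
        self.banks[self.write_idx] = data

    def read(self):
        return self.banks[1 - self.write_idx]

    def swap(self):
        self.write_idx = 1 - self.write_idx
```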
  • The KCF calculation module 240 computes the KCF responses of the image features on the basis of the discrete Fourier transform. Following the instructions of the receiving control module 201: in the training step, it reads the target image features from the feature management module 230, computes and updates the KCF training coefficients, and updates the KCF training-coefficient matching template; in the detection step, it reads the feature training matching template of the target image and the feature-vector matrices of the images to be detected from the feature management module 230, computes the KCF response matrix of each image's features using the KCF training-coefficient matching template, takes the scale of the image with the largest response among the images to be detected at each scale as the size of the tracked target, converts the offset of the maximum response within the feature map into the offset distance of the tracked target, and sends the result back to the general-purpose processor 100.
  • The external storage module 300 stores the training image frame and the frame to be detected, preferably using a ping-pong strategy, together with a historical training matching template for each tracked target.
  • The real-time, high-resolution, multi-target tracking method and system proposed by the present invention have the following beneficial effects.
  • The computing system proposed by the present invention has a pipeline structure between modules.
  • It can fully exploit the parallelism of the devices to improve computational efficiency and meet high-speed real-time requirements.
  • Because sizes are normalized by scaling, the proposed method maintains a stable processing speed for input video of different resolutions and supports high-resolution video well.
  • The proposed method supports multi-target tracking: a single system can traverse the targets serially, or the parallelism of the device resources can be used to instantiate several systems that traverse the targets in parallel, achieving higher-speed real-time multi-target tracking.
  • The target tracking method 300 begins at step S310, in which a target image and several scales of images to be detected are cropped from the video in the external storage module and normalized to a specified size by scaling.
  • Preferably, the video can be high-resolution video.
  • Preferably, step S310 is performed by the intercept scaling module 210 of FIG. 2. More specifically, step S310 may further include: taking the previous frame from the video as the training image frame and the current frame as the frame to be detected; extracting the target position from the training image frame and, based on it, deriving candidate positions at several scales in the frame to be detected; cropping the target image from the training image frame and normalizing it to a specified size by scaling; and cropping an image to be detected at each candidate position and normalizing all of them to the same size by scaling.
  • In step S320, features are extracted from the normalized images.
  • Preferably, step S320 is performed by the feature extraction module 220 of FIG. 2. More specifically, step S320 may further include extracting and storing the feature vector of the normalized target image and the feature vectors of the normalized images to be detected. The extracted features may be histogram of oriented gradients (HOG) features, with the extracted feature vectors undergoing normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting.
  • Those skilled in the art will understand that, for example, singular value decomposition (SVD) may replace PCA to the same or similar effect. Any other way of obtaining the features of the normalized target image and the images to be detected likewise falls within the details of step S320 and within the scope of the claimed invention.
  • In step S330, the matching template is accessed and updated based on the extracted image features.
  • Preferably, step S330 is performed by the feature management module 230 of FIG. 2. More specifically, step S330 may further include reading the historical training matching template from the external storage module 300 of FIG. 2 and updating the feature training matching template with the feature vector of the target image.
  • In step S340, the kernelized correlation filter (KCF) response of each image to be detected is computed from the extracted image features and the updated matching template, and the image with the largest KCF response is selected as the tracked target.
  • In steps S330 and S340, a ping-pong buffer structure may be used to store and access the image features.
  • Preferably, step S340 is performed by the KCF calculation module 240 of FIG. 2. More specifically, step S340 further includes: generating KCF training coefficients from the feature vector of the target image on the basis of the discrete Fourier transform, thereby updating the KCF training-coefficient matching template; computing the KCF response of the feature vector of each image to be detected from the updated feature training matching template and the feature vectors of the images to be detected, using the updated KCF training-coefficient matching template; and taking the image with the largest KCF response as the tracked target, converting the result into the tracked target's size and offset distance.
  • Regarding step S340, those of ordinary skill in the art will understand that, although the preferred embodiment uses the specific manner described above, any other way of computing the KCF response and thereby selecting the most responsive image to be detected as the tracked target also falls within the details of step S340 and within the scope of the claimed invention.
  • Preferably, multiple sets of computing resources can traverse multiple targets in parallel until all targets have been tracked.
  • In step S310, the previous frame and the current frame may be read as the training image frame and the frame to be detected, respectively.
  • The method may then advance to the next frame of the video: the current frame becomes the training image frame and the next frame is read from the video as the frame to be detected, so that all image frames in the video are traversed in turn through steps S310 to S340 of the method 300 until the video ends.
  • The method 300 then ends.
  • Non-transitory computer readable media include various types of tangible storage media.
  • Examples of non-transitory computer readable media include magnetic recording media (such as floppy disks, magnetic tapes, and hard disk drives), magneto-optical recording media (such as magneto-optical disks), CD-ROM (compact disc read-only memory), CD-R, CD-R/W, and semiconductor memory (such as ROM, PROM (programmable ROM), EPROM (erasable PROM), flash ROM, and RAM (random access memory)).
  • The programs may also be provided to a computer via various types of transitory computer readable media.
  • Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. A transitory computer readable medium can provide a program to a computer via a wired communication path, such as a wire or an optical fiber, or via a wireless communication path.
  • According to embodiments of the invention, a computer program or a computer readable medium records instructions executable by a processor; when executed by the processor, they cause it to perform a target tracking method comprising the following operations: cropping a target image and several scales of images to be detected from the video in the external storage module and normalizing them to a specified size by scaling; extracting features from the normalized images; accessing and updating the matching templates based on the extracted image features; and computing the kernelized correlation filter (KCF) response of each image to be detected from the extracted features and the updated matching template, selecting the image with the largest KCF response as the tracked target.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a target tracking hardware implementation system and method. A target tracking hardware implementation system (200) according to the invention comprises: an intercept scaling module (210) for cropping a target image and several scales of images to be detected from the video in an external storage module (300) and normalizing them to a specified size by scaling; a feature extraction module (220) for extracting features from the normalized images; a feature management module (230) for accessing and updating the matching templates based on the extracted image features; and a KCF calculation module (240) for computing the KCF response of each image to be detected from the extracted features and the updated matching template, selecting the image with the largest KCF response as the tracked target. The invention combines the advantages of the existing KCF algorithm and is implemented on a parallel hardware platform, giving it the advantages of small size, low power consumption, and strong real-time performance.

Description

Target tracking hardware implementation system and method
Technical Field
The present invention relates to computer machine vision, and more particularly to a method and system for real-time, high-resolution multi-target tracking.
Background
Target tracking is an important topic in the field of computer machine vision. Its main task is to design a discriminative classifier that can distinguish a target from its surroundings. Among recent advances, high-speed trackers based on Kernelized Correlation Filters (KCF) stand out. The KCF algorithm uses circulant matrices to construct training samples for training the classifier (a ridge regression problem), and avoids costly matrix inversion by computing in the discrete Fourier domain, thereby reducing the computational and storage complexity of the algorithm and increasing its speed.
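To make the circulant-matrix trick concrete, the ridge-regression training and detection steps that KCF performs in the discrete Fourier domain can be sketched in one dimension as follows. This is an illustrative NumPy sketch of the published KCF formulation, not the patent's hardware implementation; the function names are hypothetical.

```python
import numpy as np

def gaussian_correlation(x, z, sigma=0.5):
    """Gaussian kernel evaluated between z and every cyclic shift of x.

    The cross-correlation of all shifts is computed at once via the DFT,
    which is exactly the circulant-matrix trick KCF relies on."""
    c = np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(z)).real
    d = np.sum(x ** 2) + np.sum(z ** 2) - 2.0 * c
    return np.exp(-np.maximum(d, 0.0) / (sigma ** 2 * x.size))

def kcf_train(x, y, lam=1e-4):
    """Ridge regression in the Fourier domain:
    alpha_hat = y_hat / (k_hat + lambda), with no matrix inversion."""
    k = gaussian_correlation(x, x)
    return np.fft.fft(y) / (np.fft.fft(k) + lam)

def kcf_detect(alpha_hat, x, z):
    """Response over all cyclic shifts of the search window z."""
    k = gaussian_correlation(x, z)
    return np.fft.ifft(alpha_hat * np.fft.fft(k)).real
```

Training fits the filter so that shifts of the target reproduce a Gaussian label peaked at zero shift; at detection time, the location of the response peak gives the target's displacement.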
Existing KCF algorithms are still implemented in software on general-purpose processors such as CPUs, GPUs, or ARM chips. However, serial or multi-threaded computation on a general-purpose processor offers limited parallelism, so the algorithm depends heavily on clock frequency: on a high-performance CPU or GPU, size and power consumption are large, while on an embedded ARM processor the clock frequency is too low and performance falls far short of real time. Moreover, the disadvantage of serial computation becomes even more pronounced in real-time multi-target tracking tasks.
Summary of the Invention
An object of the present invention is to provide a real-time, high-resolution multi-target tracking method and system capable of tracking multiple targets in real time in high-resolution video.
According to a first aspect of the present invention, a target tracking hardware implementation system is provided, comprising: a crop-and-scale module for cropping a target image and images to be detected at multiple scales from the video in an external storage module, and normalizing them to a specified size by scaling; a feature extraction module for extracting features from the normalized images; a feature management module for accessing and updating matching templates based on the extracted image features; and a kernelized correlation filter (KCF) computation module for computing the KCF response of each image to be detected based on the extracted image features and the updated matching templates, and selecting the image to be detected with the largest KCF response as the tracked target.
Preferably, in the target tracking hardware implementation system according to the first aspect of the present invention, the video may be high-resolution video.
Preferably, in the target tracking hardware implementation system according to the first aspect of the present invention, the feature extraction module may be further configured to extract histogram of gradient (HOG) features, and to apply normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting to the extracted feature vectors.
Preferably, in the target tracking hardware implementation system according to the first aspect of the present invention, the feature management module may use a ping-pong buffer structure to store and access the image features.
Preferably, in the target tracking hardware implementation system according to the first aspect of the present invention, the crop-and-scale module may be further configured to: obtain the previous frame from the video as the training image frame and the current frame as the detection image frame; extract the target position from the training image frame, and extract detection positions at multiple scales from the detection image frame based on the target position; crop the target image from the training image frame and normalize it to a specified size by scaling; and crop an image to be detected at each detection position in the detection image frame and normalize them to the same size by scaling.
Preferably, in the target tracking hardware implementation system according to the first aspect of the present invention, the feature extraction module may be further configured to extract the feature vectors of the normalized target image and of the normalized images to be detected, and send them to the feature management module for storage. The feature management module may be further configured to read the historical training matching templates from the external storage module and update the feature training matching template with the feature vector of the target image. The KCF computation module may be further configured to, on the basis of the discrete Fourier transform, compute KCF training coefficients from the feature vector of the target image read from the feature management module, thereby updating the KCF training-coefficient matching template; to compute the KCF response of the feature vector of each image to be detected using the updated KCF training-coefficient matching template, based on the updated feature training matching template and the feature vectors of the images to be detected read from the feature management module; and to take the image to be detected with the largest KCF response as the tracked target, converting the result into the size and offset distance of the tracked target.
According to a second aspect of the present invention, a target tracking method is provided, comprising: a crop-and-scale module cropping a target image and images to be detected at multiple scales from the video in an external storage module, and normalizing them to a specified size by scaling; a feature extraction module extracting features from the normalized images; a feature management module accessing and updating matching templates based on the extracted image features; and a kernelized correlation filter (KCF) computation module computing the KCF response of each image to be detected based on the extracted image features and the updated matching templates, and selecting the image to be detected with the largest KCF response as the tracked target.
Preferably, in the method according to the second aspect of the present invention, the video may be high-resolution video.
Preferably, in the method according to the second aspect of the present invention, the step of extracting features from the normalized images may further comprise: extracting histogram of gradient (HOG) features, and applying normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting to the extracted feature vectors.
Preferably, the method according to the second aspect of the present invention may further comprise: using a ping-pong buffer structure to store and access the image features.
Preferably, in the method according to the second aspect of the present invention, the step of cropping a target image and images to be detected at multiple scales from the video in the external storage module and normalizing them to a specified size by scaling may further comprise: obtaining the previous frame from the video as the training image frame and the current frame as the detection image frame; extracting the target position from the training image frame, and extracting detection positions at multiple scales from the detection image frame based on the target position; cropping the target image from the training image frame and normalizing it to a specified size by scaling; and cropping an image to be detected at each detection position in the detection image frame and normalizing them to the same size by scaling.
Preferably, in the method according to the second aspect of the present invention, the step of extracting features from the normalized images may further comprise: extracting the feature vectors of the normalized target image and of the normalized images to be detected, and storing them. The step of accessing and updating the matching templates based on the extracted image features may further comprise: reading the historical training matching templates from the external storage module, and updating the feature training matching template with the feature vector of the target image. The step of computing the KCF response of each image to be detected based on the extracted image features and the updated matching templates and selecting the image to be detected with the largest KCF response as the tracked target may further comprise: on the basis of the discrete Fourier transform, computing KCF training coefficients from the read feature vector of the target image, thereby updating the KCF training-coefficient matching template; computing the KCF response of the feature vector of each image to be detected using the updated KCF training-coefficient matching template, based on the read updated feature training matching template and the read feature vectors of the images to be detected; and taking the image to be detected with the largest KCF response as the tracked target, converting the result into the size and offset distance of the tracked target.
Preferably, the method according to the second aspect of the present invention may further comprise: using multiple sets of computing resources to traverse multiple targets in parallel until the tracking of all targets is complete.
Preferably, the method according to the second aspect of the present invention may further comprise: taking the current frame as the training image frame, reading the next frame from the video as the detection image frame, and traversing all image frames in the video in turn to execute the method until the video ends.
According to a third aspect of the present invention, a computer readable medium is provided for recording instructions executable by a processor, wherein the instructions, when executed by the processor, cause the processor to perform a target tracking method comprising the following operations: cropping a target image and images to be detected at multiple scales from the video in an external storage module, and normalizing them to a specified size by scaling; extracting features from the normalized images; accessing and updating matching templates based on the extracted image features; and computing the kernelized correlation filter (KCF) response of each image to be detected based on the extracted image features and the updated matching templates, and selecting the image to be detected with the largest KCF response as the tracked target.
The present invention combines the advantages of the existing KCF algorithm and implements a real-time, high-resolution multi-target tracking method and system on a parallel hardware platform, such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC), offering small size, low power consumption, and high real-time performance.
Brief Description of the Drawings
The present invention is described below with reference to the accompanying drawings and embodiments. In the drawings:
Fig. 1 is a schematic diagram illustrating a target tracking method according to a preferred embodiment of the present invention;
Fig. 2 is a schematic block diagram illustrating a target tracking hardware implementation system according to a preferred embodiment of the present invention;
Fig. 3 is a flowchart of a more general target tracking method according to the present invention.
Detailed Description
The drawings are for illustration only and are not to be construed as limiting the present invention. The technical solution of the present invention is further described below with reference to the drawings and embodiments.
Fig. 1 is a schematic diagram illustrating a target tracking method according to a preferred embodiment of the present invention. As shown in Fig. 1, the present invention provides a real-time, high-resolution multi-target tracking method.
The method of the preferred embodiment shown in Fig. 1 comprises the following specific steps:
Step 1: from the video, obtain the previous frame as the training image frame and the current frame as the detection image frame. Preferably, the video may be high-resolution video, providing better picture quality and more image detail;
Step 2: extract the target position from the training image frame, and extract detection positions at multiple scales from the detection image frame based on the target position;
Step 3: crop the target image from the training image frame and normalize it to a specified size by scaling; crop an image to be detected at each detection position in the detection image frame and normalize them to the same size by scaling;
Step 4: extract features from the normalized target image and from each normalized image to be detected. Preferably, extract histogram of gradient (HOG) features, and apply normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting to the feature vectors to reduce the boundary effects of the circulant matrix;
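As an illustration of the Hanning-window weighting in step 4, a minimal NumPy sketch could look like the following. The helper name is hypothetical; the patent does not prescribe this exact code.

```python
import numpy as np

def hann_weight(feat):
    """Weight an (H, W, C) feature map with a 2-D Hanning window so the
    wrap-around edges introduced by the cyclic shifts fade toward zero,
    reducing the circulant boundary effect mentioned in step 4."""
    h, w = feat.shape[:2]
    window = np.outer(np.hanning(h), np.hanning(w))
    return feat * window[:, :, None]
```

The window leaves the center of the feature map almost unchanged while attenuating the borders, where the artificial wrap-around content lives.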
Step 5: compute the KCF training coefficients of the target image features;
Step 6: import the historical training matching templates, update the feature training matching template with the target image features, and update the KCF training-coefficient matching template with the KCF training coefficients of the target image features. Export the updated training templates;
Step 7: compute the KCF response matrix of the features of each image to be detected at each scale, according to the feature training matching template and the KCF training-coefficient matching template of the target image;
Step 8: among the images to be detected at each scale, take the size of the image with the largest response as the size of the tracked target;
Step 9: convert the offset distance of the maximum response within the features into the offset distance of the tracked target;
Step 10: switch to another target and repeat steps 2 to 9 until all targets have been traversed serially. Preferably, use multiple sets of computing resources to compute the targets in parallel until the computation for all targets is complete;
Step 11: set the current frame as the training image frame, read the next frame from the video stream as the detection image frame, and repeat steps 2 to 10 until the video ends.
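Steps 7 to 9, picking the scale whose response peaks highest and converting the peak's position in the response map into a pixel offset, can be sketched as follows. This is an illustrative Python sketch; the function name and the `cell_size` parameter (pixels per HOG cell) are assumptions, not values from the patent.

```python
import numpy as np

def select_scale_and_offset(responses, scales, cell_size=4):
    """Given one KCF response map per candidate scale, return the best
    scale and the target's pixel offset (dy, dx)."""
    # Step 8: the scale whose response map peaks highest wins.
    best = max(range(len(responses)), key=lambda i: responses[i].max())
    r = responses[best]
    # Step 9: locate the peak; peaks past the map midpoint correspond
    # to negative (wrapped-around) shifts of the cyclic response.
    dy, dx = np.unravel_index(int(r.argmax()), r.shape)
    if dy > r.shape[0] // 2:
        dy -= r.shape[0]
    if dx > r.shape[1] // 2:
        dx -= r.shape[1]
    # Feature-cell offsets scale back up to image pixels.
    return scales[best], (dy * cell_size, dx * cell_size)
```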
The present invention also provides a real-time, high-resolution multi-target tracking hardware implementation system. Fig. 2 is a schematic block diagram illustrating a target tracking hardware implementation system according to a preferred embodiment of the present invention. As shown in Fig. 2, the target tracking hardware implementation system 200 according to the preferred embodiment of the present invention comprises: a crop-and-scale module 210 for cropping a target image and images to be detected at multiple scales from the video in an external storage module 300, and normalizing them to a specified size by scaling; a feature extraction module 220 for extracting features from the normalized images; a feature management module 230 for accessing and updating matching templates based on the extracted image features; and a kernelized correlation filter (KCF) computation module 240 for computing the KCF response of each image to be detected based on the extracted image features and the updated matching templates, and selecting the image to be detected with the largest KCF response as the tracked target.
The target tracking hardware implementation system 200 according to the preferred embodiment of the present invention and its supporting or external modules 100, 300, 400, 500, and 201 are described in detail below with reference to the method of Fig. 1 and the block diagram of Fig. 2.
General-purpose processor 100: receives video data from the video source 400 and stores it in the external storage module 300, generates the hardware scheduling strategy of the tracking system, sends instructions and parameters to the receiving control module 201 of the tracking system, receives the computation results of the KCF computation module 240, and, after arranging them, outputs them for display on the display device 500.
Receiving control module 201: receives the instructions and parameters sent from the general-purpose processor 100 and accordingly controls the working modes of the crop-and-scale module 210 and the feature management module 230.
Crop-and-scale module 210: according to the instructions and parameters from the receiving control module 201, reads image frames from the external storage module 300, crops images, and normalizes them to a specified size by scaling.
Feature extraction module 220: extracts features from the normalized images. Preferably, extracts HOG features and applies feature normalization, PCA dimensionality reduction, and Hanning-window weighting to the feature vectors, improving the classification performance of the KCF classifier and reducing the computational and storage complexity of the system.
Feature management module 230: manages the feature vectors. This includes storing the feature-vector matrices computed by the feature extraction module 220, storing and updating the feature training matching template, storing and reading the KCF training-coefficient matching template, transferring the stored feature vectors and the feature training matching template in order to the KCF computation module 240 for computation, and storing the matching templates in the external storage module 300 to switch matching templates between multiple targets. Preferably, a ping-pong strategy is used to store the feature vectors, so that feature extraction and KCF computation are pipelined.
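The ping-pong (double-buffer) strategy used by the feature management module can be illustrated in software. The sketch below is a hypothetical Python analogue of the hardware buffer, not the patent's actual design: while the KCF stage reads this frame's features from one bank, the extraction stage writes the next frame's features into the other, and the banks swap roles each frame.

```python
class PingPongBuffer:
    """Two alternating storage banks enabling a producer (feature
    extraction) and a consumer (KCF computation) to run as a pipeline."""

    def __init__(self):
        self.banks = [None, None]
        self.write_idx = 0  # bank currently owned by the producer

    def write(self, features):
        # Producer fills its bank while the consumer reads the other.
        self.banks[self.write_idx] = features

    def swap(self):
        # End of frame: banks exchange roles.
        self.write_idx ^= 1

    def read(self):
        # Consumer reads the bank most recently filled and swapped in.
        return self.banks[self.write_idx ^ 1]
```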
KCF computation module 240: computes the KCF response of image features on the basis of the discrete Fourier transform, according to the instructions of the receiving control module 201. In the training step, it reads the target image features from the feature management module 230, computes the KCF training coefficients, and updates the KCF training-coefficient matching template. In the detection step, it reads the feature training matching template of the target image and the feature-vector matrices of the images to be detected from the feature management module 230, computes the KCF response matrices of the features of the images to be detected using the KCF training-coefficient matching template, takes the scale of the image with the largest response among the images to be detected at each scale as the size of the tracked target, converts the offset distance of the maximum response within the features into the offset distance of the tracked target, and transfers the result back to the general-purpose processor 100.
External storage module 300: stores the training image frames and detection image frames, preferably using a ping-pong storage strategy, and stores the historical training matching templates of each tracked target.
Based on the above technical solution, the real-time, high-resolution multi-target tracking method and system proposed by the present invention have the following beneficial effects.
1. In the computing system proposed by the present invention, the modules form a pipelined structure. When implemented on hardware devices rich in parallel resources, such as FPGAs or ASICs, it can fully exploit the parallel computing advantages of the device, improve computing efficiency, and meet the requirements of high-speed real-time computation, while the general-purpose processor only needs to perform system scheduling, effectively reducing processor load and power consumption as well as the dependence on processor clock frequency.
2. Through scaling and size normalization, the computing method proposed by the present invention maintains a stable processing speed for input video of different resolutions, providing good support for high-resolution video.
3. When tracking multiple targets, the computing method proposed by the present invention can either use a single system to traverse the targets serially, or exploit the parallelism of the device resources to instantiate multiple systems and traverse the targets in parallel, achieving higher-speed real-time multi-target tracking.
From the above description, a more general target tracking method can be summarized.
Fig. 3 is a flowchart of a more general target tracking method according to the present invention. As shown in Fig. 3, the target tracking method 300 according to the present invention begins at step S310, in which a target image and images to be detected at multiple scales are cropped from the video in the external storage module and normalized to a specified size by scaling. According to the preferred embodiment, the video may be high-resolution video.
In the preferred embodiment, step S310 is performed by the crop-and-scale module 210 of Fig. 2. More specifically, step S310 may further comprise: obtaining the previous frame from the video as the training image frame and the current frame as the detection image frame; extracting the target position from the training image frame and extracting detection positions at multiple scales from the detection image frame based on the target position; cropping the target image from the training image frame and normalizing it to a specified size by scaling; and cropping an image to be detected at each detection position in the detection image frame and normalizing them to the same size by scaling.
Those of ordinary skill in the art should understand that, although the above specific steps are used in the preferred embodiment, the present invention does not exclude the possibility of using other specific steps to crop and scale the target image and the images to be detected. Therefore, if there are other ways to crop and scale the target image and the images to be detected, such ways and specific steps also belong to the specific details of step S310 and fall within the scope claimed by the present invention.
Next, in step S320, features are extracted from the normalized images.
In the preferred embodiment, step S320 is performed by the feature extraction module 220 of Fig. 2. More specifically, step S320 may further comprise: extracting the feature vectors of the normalized target image and of the normalized images to be detected, and storing them. Moreover, the extracted features may be histogram of gradient (HOG) features, and the extracted feature vectors may undergo normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting. Those of ordinary skill in the art should understand that, although specific means such as feature vectors, HOG, PCA, and Hanning windows are used for optimization in the preferred embodiment, other means may be substituted, for example singular value decomposition (SVD) in place of PCA, to achieve the same or similar purposes. Therefore, if there are other ways to extract features from the normalized target image and images to be detected, such ways and specific steps also belong to the specific details of step S320 and fall within the scope claimed by the present invention.
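The PCA step (or the SVD substitution mentioned above) amounts to projecting the feature vectors onto their leading principal directions. A minimal NumPy sketch, with a hypothetical function name, is:

```python
import numpy as np

def pca_project(feats, n_components):
    """Project (N, D) feature vectors onto their top principal
    components. The components are obtained via SVD of the centered
    data, which yields the same subspace as classical PCA."""
    centered = feats - feats.mean(axis=0)
    # Right singular vectors (rows of vt) are the principal directions,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

Reducing, say, 31-channel HOG cells to a handful of components shrinks both the FFT workload and the template storage of the later KCF stage.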
In step S330, the matching templates are accessed and updated based on the extracted image features.
In the preferred embodiment, step S330 is performed by the feature management module 230 of Fig. 2. More specifically, step S330 may further comprise: reading the historical training matching templates from the external storage module 300 of Fig. 2, and updating the feature training matching template with the feature vector of the target image.
Then, in step S340, the kernelized correlation filter (KCF) response of each image to be detected is computed based on the extracted image features and the updated matching templates, and the image to be detected with the largest KCF response is selected as the tracked target.
In steps S330 and S340, a ping-pong buffer structure may be used to store and access the image features.
In the preferred embodiment, step S340 is performed by the KCF computation module 240 of Fig. 2. More specifically, step S340 further comprises: on the basis of the discrete Fourier transform, computing KCF training coefficients from the read feature vector of the target image, thereby updating the KCF training-coefficient matching template; computing the KCF response of the feature vector of each image to be detected using the updated KCF training-coefficient matching template, based on the read updated feature training matching template and the read feature vectors of the images to be detected; and taking the image to be detected with the largest KCF response as the tracked target, converting the result into the size and offset distance of the tracked target.
Similarly, those of ordinary skill in the art should understand that, although the above specific ways are used in the preferred embodiment, if there are other ways to compute the KCF response and thereby select the image with the largest response as the tracked target, such ways and specific steps also belong to the specific details of step S340 and fall within the scope claimed by the present invention.
When the hardware implementation system 200 of Fig. 2 is used to execute the method 300, multiple sets of computing resources may be used to traverse multiple targets in parallel until the tracking of all targets is complete.
Further, in step S310, the previous frame and the current frame may be read as the training and detection image frames, respectively. After all targets in the current frame have been tracked, the method may proceed to the next frame of the video: the current frame is taken as the training image frame, and the next frame is read from the video as the detection image frame. In this way, all image frames in the video are traversed in turn, executing steps S310 to S340 of the method 300 until the video ends. The method 300 may then end as well.
Those of ordinary skill in the art should recognize that the method of the present invention may be implemented as a computer program. As described above in conjunction with Fig. 3, the method according to the above embodiments may execute one or more programs, including instructions that cause a computer or processor to perform the algorithms described in conjunction with the drawings. These programs may be stored using various types of non-transitory computer readable media and provided to a computer or processor. Non-transitory computer readable media include various types of tangible storage media. Examples of non-transitory computer readable media include magnetic recording media (such as floppy disks, magnetic tapes, and hard disk drives), magneto-optical recording media (such as magneto-optical disks), CD-ROM (compact disc read-only memory), CD-R, CD-R/W, and semiconductor memories (such as ROM, PROM (programmable ROM), EPROM (erasable PROM), flash ROM, and RAM (random access memory)). Further, these programs may be provided to a computer using various types of transitory computer readable media. Examples of transitory computer readable media include electrical signals, optical signals, and electromagnetic waves. Transitory computer readable media can provide programs to a computer via wired communication paths such as electric wires and optical fibers, or via wireless communication paths.
Therefore, according to the present invention, a computer program or a computer readable medium can also be proposed, for recording instructions executable by a processor, wherein the instructions, when executed by the processor, cause the processor to perform a target tracking method comprising the following operations: cropping a target image and images to be detected at multiple scales from the video in an external storage module, and normalizing them to a specified size by scaling; extracting features from the normalized images; accessing and updating matching templates based on the extracted image features; and computing the kernelized correlation filter (KCF) response of each image to be detected based on the extracted image features and the updated matching templates, and selecting the image to be detected with the largest KCF response as the tracked target.
Various embodiments and implementations of the present invention have been described above. However, the spirit and scope of the present invention are not limited thereto. Those skilled in the art will be able to make further applications according to the teachings of the present invention, and all such applications fall within the scope of the present invention.
That is, the above embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not limitations on its implementation. For those of ordinary skill in the art, other changes or variations in different forms can be made on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Any modification, substitution, or improvement made within the spirit and principles of the present invention shall be included within the scope of the claims of the present invention.

Claims (15)

  1. A target tracking hardware implementation system, comprising:
    a crop-and-scale module for cropping a target image and images to be detected at multiple scales from the video in an external storage module, and normalizing them to a specified size by scaling;
    a feature extraction module for extracting features from the normalized images;
    a feature management module for accessing and updating matching templates based on the extracted image features; and
    a kernelized correlation filter (KCF) computation module for computing the KCF response of each image to be detected based on the extracted image features and the updated matching templates, and selecting the image to be detected with the largest KCF response as the tracked target.
  2. The target tracking hardware implementation system according to claim 1, wherein the video is high-resolution video.
  3. The target tracking hardware implementation system according to claim 1, wherein the feature extraction module is further configured to extract histogram of gradient (HOG) features, and to apply normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting to the extracted feature vectors.
  4. The target tracking hardware implementation system according to claim 1, wherein the feature management module uses a ping-pong buffer structure to store and access the image features.
  5. The target tracking hardware implementation system according to claim 1, wherein the crop-and-scale module is further configured to:
    obtain the previous frame from the video as the training image frame and the current frame as the detection image frame;
    extract the target position from the training image frame, and extract detection positions at multiple scales from the detection image frame based on the target position; and
    crop the target image from the training image frame and normalize it to a specified size by scaling, and crop an image to be detected at each detection position in the detection image frame and normalize them to the same size by scaling.
  6. The target tracking hardware implementation system according to claim 1, wherein:
    the feature extraction module is further configured to extract the feature vectors of the normalized target image and of the normalized images to be detected, and send them to the feature management module for storage,
    the feature management module is further configured to read the historical training matching templates from the external storage module and update the feature training matching template with the feature vector of the target image, and
    the KCF computation module is further configured to, on the basis of the discrete Fourier transform, compute KCF training coefficients from the feature vector of the target image read from the feature management module, thereby updating the KCF training-coefficient matching template; to compute the KCF response of the feature vector of each image to be detected using the updated KCF training-coefficient matching template, based on the updated feature training matching template and the feature vectors of the images to be detected read from the feature management module; and to take the image to be detected with the largest KCF response as the tracked target, converting the result into the size and offset distance of the tracked target.
  7. A target tracking method, comprising:
    cropping a target image and images to be detected at multiple scales from the video in an external storage module, and normalizing them to a specified size by scaling;
    extracting features from the normalized images;
    accessing and updating matching templates based on the extracted image features; and
    computing the kernelized correlation filter (KCF) response of each image to be detected based on the extracted image features and the updated matching templates, and selecting the image to be detected with the largest KCF response as the tracked target.
  8. The method according to claim 7, wherein the video is high-resolution video.
  9. The method according to claim 7, wherein the step of extracting features from the normalized images further comprises: extracting histogram of gradient (HOG) features, and applying normalization, principal component analysis (PCA) dimensionality reduction, and Hanning-window weighting to the extracted feature vectors.
  10. The method according to claim 7, further comprising: using a ping-pong buffer structure to store and access the image features.
  11. The method according to claim 7, wherein the step of cropping a target image and images to be detected at multiple scales from the video in the external storage module and normalizing them to a specified size by scaling further comprises:
    obtaining the previous frame from the video as the training image frame and the current frame as the detection image frame;
    extracting the target position from the training image frame, and extracting detection positions at multiple scales from the detection image frame based on the target position; and
    cropping the target image from the training image frame and normalizing it to a specified size by scaling, and cropping an image to be detected at each detection position in the detection image frame and normalizing them to the same size by scaling.
  12. The method according to claim 7, wherein:
    the step of extracting features from the normalized images further comprises: extracting the feature vectors of the normalized target image and of the normalized images to be detected, and storing them,
    the step of accessing and updating the matching templates based on the extracted image features further comprises: reading the historical training matching templates from the external storage module, and updating the feature training matching template with the feature vector of the target image, and
    the step of computing the KCF response of each image to be detected based on the extracted image features and the updated matching templates and selecting the image to be detected with the largest KCF response as the tracked target further comprises: on the basis of the discrete Fourier transform, computing KCF training coefficients from the read feature vector of the target image, thereby updating the KCF training-coefficient matching template; computing the KCF response of the feature vector of each image to be detected using the updated KCF training-coefficient matching template, based on the read updated feature training matching template and the read feature vectors of the images to be detected; and taking the image to be detected with the largest KCF response as the tracked target, converting the result into the size and offset distance of the tracked target.
  13. The method according to claim 7, further comprising: using multiple sets of computing resources to traverse multiple targets in parallel until the tracking of all targets is complete.
  14. The method according to claim 11, further comprising: taking the current frame as the training image frame, reading the next frame from the video as the detection image frame, and traversing all image frames in the video in turn to execute the method until the video ends.
  15. A computer readable medium for recording instructions executable by a processor, wherein the instructions, when executed by the processor, cause the processor to perform a target tracking method comprising the following operations:
    cropping a target image and images to be detected at multiple scales from the video in an external storage module, and normalizing them to a specified size by scaling;
    extracting features from the normalized images;
    accessing and updating matching templates based on the extracted image features; and
    computing the kernelized correlation filter (KCF) response of each image to be detected based on the extracted image features and the updated matching templates, and selecting the image to be detected with the largest KCF response as the tracked target.
PCT/CN2018/080595 2017-11-03 2018-03-27 Target tracking hardware implementation system and method WO2019085377A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711067602.9A CN109753846A (zh) 2017-11-03 2017-11-03 Target tracking hardware implementation system and method
CN201711067602.9 2017-11-03

Publications (1)

Publication Number Publication Date
WO2019085377A1 true WO2019085377A1 (zh) 2019-05-09

Family

ID=66327445

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/080595 WO2019085377A1 (zh) 2017-11-03 2018-03-27 目标跟踪硬件实现系统和方法

Country Status (3)

Country Link
US (1) US10810746B2 (zh)
CN (1) CN109753846A (zh)
WO (1) WO2019085377A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582266A (zh) * 2020-03-30 2020-08-25 西安电子科技大学 Configurable target tracking hardware acceleration control method, system, storage medium, and application
CN111754548A (zh) * 2020-06-29 2020-10-09 西安科技大学 Multi-scale correlation-filter target tracking method and apparatus based on response discrimination
CN112541468A (zh) * 2020-12-22 2021-03-23 中国人民解放军国防科技大学 Target tracking method based on dual-template response fusion

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110276383B (zh) * 2019-05-31 2021-05-14 北京理工大学 Kernelized correlation filter target localization method based on a multi-channel memory model
CN110555866A (zh) * 2019-08-07 2019-12-10 北京首贝科技发展有限公司 Infrared target tracking method with an improved KCF feature descriptor
KR102164754B1 (ko) * 2019-08-27 2020-10-13 중앙대학교 산학협력단 Object tracking method and apparatus using improved KCF
CN110706252B (zh) * 2019-09-09 2020-10-23 西安理工大学 Robot kernelized correlation filter tracking algorithm guided by a motion model
CN112991382B (zh) * 2019-12-02 2024-04-09 中国科学院国家空间科学中心 Heterogeneous visual target tracking system and method based on the PYNQ framework
CN111160365A (zh) * 2019-12-06 2020-05-15 南京航空航天大学 UAV target tracking method based on combining a detector and a tracker
CN110991565A (zh) * 2019-12-24 2020-04-10 华北理工大学 KCF-based target tracking optimization algorithm
CN111161323B (zh) * 2019-12-31 2023-11-28 北京理工大学重庆创新中心 Correlation-filtering-based target tracking method and system for complex scenes
CN111105444B (zh) * 2019-12-31 2023-07-25 哈尔滨工程大学 Continuous tracking method suitable for target grasping by underwater robots
CN111400069B (zh) * 2020-03-23 2024-01-26 北京经纬恒润科技股份有限公司 Method, apparatus, and system for implementing a KCF tracking algorithm
CN111814734B (zh) * 2020-07-24 2024-01-26 南方电网数字电网研究院有限公司 Method for identifying the state of a disconnector switch
CN112070802B (zh) * 2020-09-02 2024-01-26 合肥英睿系统技术有限公司 Target tracking method, apparatus, device, and computer-readable storage medium
CN112561958B (zh) * 2020-12-04 2023-04-18 武汉华中天经通视科技有限公司 Method for determining tracking loss in correlation-filter image tracking
CN112528817B (zh) * 2020-12-04 2024-03-19 重庆大学 Neural-network-based visual detection and tracking method for an inspection robot
CN113658216A (zh) * 2021-06-24 2021-11-16 北京理工大学 Remote-sensing target tracking method based on multi-level adaptive KCF, and electronic device
CN114596332A (zh) * 2022-04-26 2022-06-07 四川迪晟新达类脑智能技术有限公司 Method, system, device, and storage medium for enhancing feature information of a tracked target
CN116109975B (zh) * 2023-02-08 2023-10-20 广州宝立科技有限公司 Image processing method for power-grid safe-operation monitoring and intelligent video surveillance system
CN116069801B (zh) * 2023-03-06 2023-06-30 山东华夏高科信息股份有限公司 Method, apparatus, and medium for generating structured data from traffic video

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106023248A (zh) * 2016-05-13 2016-10-12 上海宝宏软件有限公司 Real-time video tracking method
CN106650592A (zh) * 2016-10-05 2017-05-10 北京深鉴智能科技有限公司 Target tracking system
WO2017088050A1 (en) * 2015-11-26 2017-06-01 Sportlogiq Inc. Systems and methods for object tracking and localization in videos with adaptive image representation

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1916538A3 (en) * 2006-10-27 2011-02-16 Panasonic Electric Works Co., Ltd. Target moving object tracking device
MXPA06013614A (es) * 2006-11-24 2007-12-06 Global Sight S A De C V Sistemas de transmision de datos en forma remota y digital y localizacion satelital desde terminales moviles o fijas con camaras de vigilancia urbana para reconocimiento facial, detector de disparos, captura de personal de seguridad publica y persona
KR101554643B1 (ko) * 2009-02-17 2015-09-21 삼성전자주식회사 이동통신 단말기에서 화상통화 시 자동으로 이모티콘을 전송하기 위한 장치 및 방법
KR101480348B1 (ko) * 2013-05-31 2015-01-09 삼성에스디에스 주식회사 사람 검출 장치 및 방법과 사람 계수 장치 및 방법
CN104424634B (zh) * 2013-08-23 2017-05-03 株式会社理光 对象跟踪方法和装置
US9210542B2 (en) * 2013-12-09 2015-12-08 Nec Europe Ltd. Method and computer system for detecting crowds in a location tracking system
US9552633B2 (en) * 2014-03-07 2017-01-24 Qualcomm Incorporated Depth aware enhancement for stereo video
CN104200237B (zh) * 2014-08-22 2019-01-11 浙江生辉照明有限公司 一种基于核化相关滤波高速自动多目标跟踪方法
US9646389B2 (en) * 2014-08-26 2017-05-09 Qualcomm Incorporated Systems and methods for image scanning
JP5794599B1 (ja) * 2014-12-22 2015-10-14 株式会社 テクノミライ デジタルファインドセキュリティシステム、方法及びプログラム
US9905006B2 (en) * 2015-02-12 2018-02-27 Toshiba Medical Systems Corporation Medical image processing apparatus, medical image processing method, and medical imaging system
US20170054449A1 (en) * 2015-08-19 2017-02-23 Texas Instruments Incorporated Method and System for Compression of Radar Signals
US10026004B2 (en) * 2016-07-08 2018-07-17 Conduent Business Services, Llc Shadow detection and removal in license plate images
CN106570893A (zh) * 2016-11-02 2017-04-19 中国人民解放军国防科学技术大学 一种基于相关滤波的快速稳健视觉跟踪方法
CN106651913A (zh) * 2016-11-29 2017-05-10 开易(北京)科技有限公司 基于相关滤波和颜色直方图统计的目标跟踪方法及adas系统
CN106887011B (zh) * 2017-01-20 2019-11-15 北京理工大学 一种基于cnn和cf的多模板目标跟踪方法
CN107197199A (zh) * 2017-05-22 2017-09-22 哈尔滨工程大学 一种智能监控装置及目标跟踪方法
US10902615B2 (en) * 2017-11-13 2021-01-26 Qualcomm Incorporated Hybrid and self-aware long-term object tracking

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017088050A1 (en) * 2015-11-26 2017-06-01 Sportlogiq Inc. Systems and methods for object tracking and localization in videos with adaptive image representation
CN106023248A (zh) * 2016-05-13 2016-10-12 上海宝宏软件有限公司 Real-time video tracking method
CN106650592A (zh) * 2016-10-05 2017-05-10 北京深鉴智能科技有限公司 Target tracking system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582266A (zh) * 2020-03-30 2020-08-25 西安电子科技大学 Configurable target tracking hardware acceleration control method, system, storage medium, and application
CN111582266B (zh) * 2020-03-30 2023-04-07 西安电子科技大学 Configurable target tracking hardware acceleration control method, system, storage medium, and application
CN111754548A (zh) * 2020-06-29 2020-10-09 西安科技大学 Multi-scale correlation-filter target tracking method and apparatus based on response discrimination
CN111754548B (zh) * 2020-06-29 2023-10-03 西安科技大学 Multi-scale correlation-filter target tracking method and apparatus based on response discrimination
CN112541468A (zh) * 2020-12-22 2021-03-23 中国人民解放军国防科技大学 Target tracking method based on dual-template response fusion
CN112541468B (zh) * 2020-12-22 2022-09-06 中国人民解放军国防科技大学 Target tracking method based on dual-template response fusion

Also Published As

Publication number Publication date
US10810746B2 (en) 2020-10-20
US20190139232A1 (en) 2019-05-09
CN109753846A (zh) 2019-05-14

Similar Documents

Publication Publication Date Title
WO2019085377A1 (zh) Target tracking hardware implementation system and method
US11450146B2 (en) Gesture recognition method, apparatus, and device
CN112580416A (zh) Video tracking based on deep Siamese networks and Bayesian optimization
US9892315B2 (en) Systems and methods for detection of behavior correlated with outside distractions in examinations
US20140003663A1 (en) Method of detecting facial attributes
Yang et al. Deformable convolution and coordinate attention for fast cattle detection
Nguyen et al. Yolo based real-time human detection for smart video surveillance at the edge
JP2013131209A (ja) Construction of face feature vectors
US20200285859A1 (en) Video summary generation method and apparatus, electronic device, and computer storage medium
Liu et al. Real-time facial expression recognition based on cnn
CN113989696B (zh) Target tracking method and apparatus, electronic device, and storage medium
WO2013122009A1 (ja) Reliability acquisition device, reliability acquisition method, and reliability acquisition program
CN106338711A (zh) Voice direction-finding method and system based on a smart device
Zhang et al. Mask-MRNet: A deep neural network for wind turbine blade fault detection
Chen et al. Learning to count with back-propagated information
CN113158904B (zh) Siamese-network target tracking method and apparatus based on dual-mask template updating
Qian et al. An image classification algorithm based on SVM
Valencia et al. Hardware Performance Evaluation of Different Computing Devices on YOLOv5 Ship Detection Model
TC et al. Parallelization of face detection engine
TW201327421A (zh) Image capture device capable of simplifying image feature sets and control method thereof
Zhang et al. A review of small target detection based on deep learning
Chen et al. Real-time spatio-temporal action localization in 360 videos
US20160180143A1 (en) Eye tracking with 3d eye position estimations and psf models
CN110210306B (zh) Face tracking method and camera
WO2022227916A1 (zh) Image processing method, image processor, electronic device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18873147

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13.08.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18873147

Country of ref document: EP

Kind code of ref document: A1