CN110490906A - Real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network - Google Patents

Real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network

Info

Publication number
CN110490906A
Authority
CN
China
Prior art keywords
network
lstm
layer
shot
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910771090.7A
Other languages
Chinese (zh)
Inventor
王彩玲
臧振飞
蒋国平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Posts and Telecommunications
Priority to CN201910771090.7A
Publication of CN110490906A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08 Learning methods
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence


Abstract

A real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory (LSTM) network. For the video sequence to be tracked, each pair of consecutive frames is taken as the network input. The Siamese convolutional network extracts features from the two consecutive input frames: the convolution operations yield appearance and semantic features at different levels, and a fully connected layer concatenates the high- and low-level features into a deep feature. The deep feature is then passed to an LSTM network containing two LSTM units for sequence modeling; the LSTM forget gates selectively activate the target features at different positions in the sequence, and the output gates emit the state information of the current target. Finally, a fully connected layer receiving the LSTM output emits the predicted position coordinates of the target in the current frame and updates the search region of the next-frame target. Tracking speed is greatly increased while tracking stability and accuracy are maintained, so that real-time tracking performance is substantially improved.

Description

Real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network
Technical field
The invention belongs to the technical field of computer vision and visual target tracking, and in particular relates to a real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network.
Background art
Visual target tracking is one of the most important problems in the field of computer vision and has wide application scenarios such as security surveillance, smart homes, and smart cities. Its main task is, given the position and state of a tracked target in the first frame of a video sequence, to locate the same target in each subsequent frame from the second frame onward. Although target tracking has been widely studied, it remains difficult to guarantee the stability, accuracy, and speed of visual target tracking because of disturbing factors that may occur during tracking, such as target deformation, background occlusion, motion blur, and illumination variation.
Current visual target tracking methods can be divided into two major classes by basic framework: traditional methods based on hand-crafted features and deep-learning methods based on deep features. By modeling approach they can likewise be divided into two major classes: generative methods, represented by template matching, and discriminative methods, represented by frame-by-frame detection. The greatest advantages of the traditional methods based on hand-crafted features are that the algorithms are compact, structurally simple, computationally efficient, easy to debug, and easy to adapt. Their disadvantages are equally obvious: methods using hand-crafted traditional features mostly achieve lower precision, their feature extraction is unreliable, and they struggle to fully exploit the target's appearance for the tracking process. Therefore, after deep learning came to dominate the ImageNet image recognition and detection competitions, the target tracking field also began to adopt deep neural networks as its framework. In the VOT challenge held in 2015, Hyeonseob Nam et al. proposed MDNet, which performs discriminative tracking with convolutional features; it reached a surprising 94.8% precision on the test sequences, winning the challenge while substantially leading the other methods, and thereby opened up broad research into applying deep learning to target tracking.
Although extracting target appearance features with a convolutional neural network can achieve relatively high tracking precision, the appearance of the tracked target within a video sequence may change in ways that are difficult to anticipate. Most existing deep-network-based methods therefore add online fine-tuning during tracking: when tracking a sequence, hundreds of different positive and negative samples are extracted around the target in each frame, and as subsequent frames are processed, the weights of the fully connected layers behind the convolutional network that are responsible for the classification output are continually updated on the basis of all extracted samples. As is well known, the parameter scale of a deep neural network is enormous, and any small modification of the parameters forces the entire network to search again for the optimum under the current state. Because the computation of this process is vastly larger than that of traditional methods, the time it costs is very long; trackers based on deep features therefore generally suffer from slow tracking speed and have difficulty reaching real time.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the deficiencies of the prior art and to provide a real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network. A long short-term memory network part containing two LSTM units replaces the multilayer fully connected layers that conventional trackers use to implement the online fine-tuning (online-finetune) strategy for appearance features, so that the frame-to-frame sequential relationships characteristic of video data are effectively exploited and modeled as sequences; tracking speed is greatly increased while tracking stability and accuracy are maintained, improving the real-time performance of tracking.
The present invention provides a real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network, comprising the following steps:
Step S1: for the video sequence to be tracked, take each pair of consecutive frames as the network input;
Step S2: extract features from the two consecutive input frames with the Siamese convolutional network; the convolution operations yield appearance and semantic features at different levels, after which a fully connected layer concatenates the high- and low-level features into a deep feature;
Step S3: pass the deep feature to a long short-term memory network containing two LSTM units for sequence modeling; the LSTM forget gates selectively activate the target features at different positions in the sequence, and the output gates emit the state information of the current target;
Step S4: a fully connected layer receiving the LSTM output emits the predicted position coordinates of the target in the current frame and updates the search region of the next-frame target.
As a further technical solution of the present invention, in step S2 the Siamese convolutional network is composed of two parallel convolutional networks that are identical in number of layers, structure, kernel size, pooling mode, and padding stride and that share their weights. The network layers comprise a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a fourth convolutional layer, a fifth convolutional layer, and a third pooling layer. The kernel size and channel count of the first convolutional layer are 11*11*96, those of the second convolutional layer are 5*5*256, and those of the third, fourth, and fifth convolutional layers are 3*3*384; the filter size of the first, second, and third pooling layers is 3*3. The first convolutional layer and the first, second, and third pooling layers use the valid padding mode, while the second, third, fourth, and fifth convolutional layers use the same padding mode. Input images are resized by the Siamese convolutional network to 227*227*3.
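For illustration, a minimal PyTorch sketch of one shared-weight branch of such a Siamese convolutional network is given below. The kernel sizes, channel counts, padding modes, and 227*227*3 input follow the description above; the stride values and the ReLU activations are assumptions in the style of AlexNet, since Table 1 of the embodiment is not reproduced in this text.

```python
# Sketch of one branch of the Siamese convolutional network; both frames are
# processed by the same branch so that the two sub-networks share weights.
# Assumed (not stated above): conv1 stride 4, pooling stride 2, ReLU activations.
import torch
import torch.nn as nn

class SiameseConvBranch(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),         # conv1, valid padding
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),              # pool1, valid
            nn.Conv2d(96, 256, kernel_size=5, padding='same'),  # conv2, same padding
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),              # pool2, valid
            nn.Conv2d(256, 384, kernel_size=3, padding='same'), # conv3, same
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding='same'), # conv4, same
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding='same'), # conv5, same
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),              # pool3, valid
        )

    def forward(self, prev_frame, curr_frame):
        # Shared weights: the same layers process both consecutive frames.
        f_prev = self.features(prev_frame).flatten(1)
        f_curr = self.features(curr_frame).flatten(1)
        # Concatenate ("cascade") the features of the two frames for the
        # fully connected layer that follows.
        return torch.cat([f_prev, f_curr], dim=1)

x = torch.randn(1, 3, 227, 227)
print(SiameseConvBranch()(x, x).shape)  # torch.Size([1, 27648]) with these strides
```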
Further, the long short-term memory network part includes two LSTM units: the first LSTM receives the convolutional feature input from the fully connected layer, and the second LSTM takes as input the concatenation of the output of the first LSTM and the features of the Siamese convolutional network part. Sequence data modeling is performed jointly over continuous, independent tracking video sequences, so that the output corresponding to each sequence state is computed separately for the same target under different conditions within the same sequence.
Further, the video sequence data sets include the ILSVRC2016 video object detection data set, the Amsterdam Library of Ordinary Videos, and non-natural video sequences. The ILSVRC2016 video object detection data set contains 3862 video sequences, 1122397 images, 1731913 annotated target bounding boxes, and 7911 target trajectories. The Amsterdam Library of Ordinary Videos contains 314 video sequences and 148319 images, each video sequence containing one specific target. The non-natural video sequences are constructed by artificial synthesis: 478807 static images selected from the ImageNet data set are combined with a data augmentation strategy.
Further, the feed-forward computation of the two LSTM units is

z_t = h(W_z x_t + R_z y_{t-1} + b_z)
i_t = σ(W_i x_t + R_i y_{t-1} + P_i ⊙ c_{t-1} + b_i)
f_t = σ(W_f x_t + R_f y_{t-1} + P_f ⊙ c_{t-1} + b_f)
c_t = i_t ⊙ z_t + f_t ⊙ c_{t-1}
o_t = σ(W_o x_t + R_o y_{t-1} + P_o ⊙ c_t + b_o)
y_t = o_t ⊙ h(c_t)

where t is the frame index; x_t and y_{t-1} are the feature vectors of the current input frame and the previous output frame, respectively; W, R, and P are the weight coefficient matrices of the input, recurrent (previous-output), and peephole connections; b is a bias vector; h is the hyperbolic tangent function; σ is the sigmoid function; and ⊙ denotes point-wise multiplication. z is the overall input of the LSTM unit, i is the input gate passed between the cells of the LSTM, o is the output gate of each LSTM cell, f is the forget gate, c is the cell state at each moment in the sequence, and y is the overall output of the LSTM. The forward pass produces the output vector y_t, which stores the target state of the current frame, and the cell state c_t of the LSTM while the current frame is processed; both y_t and c_t are taken as inputs to the cell when the subsequent frame is processed, achieving forward propagation over the sequence data.
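For illustration, a minimal NumPy sketch of a single forward step of this peephole LSTM cell is given below, directly following the equations above (the vanilla LSTM formulation of Greff et al., cited in the non-patent literature below). The dimensions and random initialization are illustrative assumptions.

```python
# One forward step of a peephole LSTM cell, mirroring the equations above.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, y_prev, c_prev, P):
    """x_t: input vector; y_prev, c_prev: previous output and cell state."""
    z = np.tanh(P['Wz'] @ x_t + P['Rz'] @ y_prev + P['bz'])                     # block input
    i = sigmoid(P['Wi'] @ x_t + P['Ri'] @ y_prev + P['pi'] * c_prev + P['bi'])  # input gate
    f = sigmoid(P['Wf'] @ x_t + P['Rf'] @ y_prev + P['pf'] * c_prev + P['bf'])  # forget gate
    c = i * z + f * c_prev                                                      # new cell state
    o = sigmoid(P['Wo'] @ x_t + P['Ro'] @ y_prev + P['po'] * c + P['bo'])       # output gate
    y = o * np.tanh(c)                                                          # block output
    return y, c

d_in, d_hid = 8, 16   # illustrative sizes
rng = np.random.default_rng(0)
P = {}
for g in 'zifo':
    P['W' + g] = rng.normal(scale=0.1, size=(d_hid, d_in))   # input weights W
    P['R' + g] = rng.normal(scale=0.1, size=(d_hid, d_hid))  # recurrent weights R
    P['b' + g] = np.zeros(d_hid)                             # biases b
    if g != 'z':
        P['p' + g] = np.zeros(d_hid)                         # peephole weights P
y, c = np.zeros(d_hid), np.zeros(d_hid)
for t in range(3):  # y_t and c_t are fed back when the next frame is processed
    y, c = lstm_step(rng.normal(size=d_in), y, c, P)
```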
Further, when the video sequence to be tracked is input, the target position in the first frame of the sequence is given as a pair of top-left and bottom-right coordinates.
The present invention replaces the arrangement found in most other deep-learning-based methods, in which multilayer fully connected layers receive the appearance features extracted by the convolutional layers and an online fine-tuning (online-finetune) strategy is then applied to improve robustness against changes in target appearance. Instead, a long short-term memory network part containing two LSTM units replaces the multilayer fully connected layers and performs sequence modeling to handle the sequential relationships characteristic of video data. Against the defect noted in the background art, that deep-learning-based trackers are generally slow and poorly suited to real time, the method greatly increases tracking speed while maintaining tracking stability and accuracy, so that real-time tracking performance is substantially improved.
Brief description of the drawings
Fig. 1 is a schematic flow diagram of the method of the invention.
Specific embodiments
Referring to Fig. 1, the present embodiment provides a real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network, comprising the following steps:
Step S1: for the video sequence to be tracked, take each pair of consecutive frames as the network input;
Step S2: extract features from the two consecutive input frames with the Siamese convolutional network; the convolution operations yield appearance and semantic features at different levels, after which a fully connected layer concatenates the high- and low-level features into a deep feature;
Step S3: pass the deep feature to a long short-term memory network containing two LSTM units for sequence modeling; the LSTM forget gates selectively activate the target features at different positions in the sequence, and the output gates emit the state information of the current target;
Step S4: a fully connected layer receiving the LSTM output emits the predicted position coordinates of the target in the current frame and updates the search region of the next-frame target.
When performing real-time visual target tracking based on the Siamese convolutional network and the long short-term memory network, the deep neural network for executing the tracking task is built first. The network mainly comprises two parts: a Siamese convolutional neural network part that resizes the input images and extracts convolutional features, and a long short-term memory network part that receives the target's appearance and semantic features at different levels and performs sequence modeling on the features from consecutive frames of the same video.
The Siamese convolutional network part is composed of two parallel convolutional networks that are identical in number of layers, structure, kernel size, pooling mode, and padding stride and that share their weights; the specific network structure and parameters are shown in Table 1:
Table 1: network structure and parameters
Here, Layer denotes each network layer from the first convolutional layer Conv1, which receives the original image input, to the last pooling layer; Size denotes the kernel or filter size and channel count of the current convolutional or pooling layer; Stride denotes the filter stride of the current network layer; and Padding denotes the padding mode used by the current network layer, either same mode or valid mode.
When two consecutive frames from the same video sequence are input into the network, the network first resizes both images to 227*227*3; the feature dimensions and sizes output after the convolutional computation of each layer in the Siamese convolutional network part are shown in Table 2:
Table 2: feature dimensions and sizes of each network layer
Here, Layer denotes the different network layers of the Siamese convolutional network part, and Size denotes the feature dimensions and channel count after the input image has been processed by the corresponding network layer.
The long short-term memory network part is composed of two LSTM units: the first LSTM receives the convolutional feature input from the fully connected layer, and the second LSTM takes as input the concatenation of the output of the first LSTM and the features of the Siamese convolutional network part; sequence data modeling is performed jointly over continuous, independent tracking video sequences.
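A minimal sketch of this two-unit arrangement follows, using PyTorch's LSTMCell purely to illustrate the wiring (LSTMCell omits the peephole connections of the LSTM equations in this description, and the layer widths are illustrative assumptions).

```python
# Two-unit LSTM part: the first unit consumes the fully connected layer's
# features; the second consumes the first unit's output concatenated with the
# convolutional features; a final fully connected layer regresses the box.
import torch
import torch.nn as nn

class TwoUnitLSTM(nn.Module):
    def __init__(self, feat_dim=1024, hidden=512):
        super().__init__()
        self.lstm1 = nn.LSTMCell(feat_dim, hidden)
        self.lstm2 = nn.LSTMCell(hidden + feat_dim, hidden)
        self.fc_out = nn.Linear(hidden, 4)  # predicted box coordinates

    def forward(self, feat, state1=None, state2=None):
        h1, c1 = self.lstm1(feat, state1)
        h2, c2 = self.lstm2(torch.cat([h1, feat], dim=1), state2)
        return self.fc_out(h2), (h1, c1), (h2, c2)

feat = torch.randn(1, 1024)          # features of one consecutive-frame pair
box, s1, s2 = TwoUnitLSTM()(feat)    # states s1, s2 are carried to the next frame
print(box.shape)                     # torch.Size([1, 4])
```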
After the deep neural network for executing the tracking task has been built, training of the network weight parameters begins. End-to-end training is adopted, and the video sequence data sets used for training include:
1) the ILSVRC2016 video object detection data set, containing 3862 video sequences, 1122397 images, 1731913 annotated target bounding boxes, and 7911 target trajectories;
2) the Amsterdam Library of Ordinary Videos (ALOV 300+), containing 314 video sequences and 148319 images, each video sequence containing one specific target;
3) non-natural video sequences: to increase the variety of target species in the training data, different static images are selected from the ImageNet data set, and artificial synthesis together with a data augmentation strategy is used to construct non-natural video sequences containing 478807 images, as sketched below.
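The patent text does not detail the synthesis procedure itself; the sketch below shows one common way, stated purely as an assumption, to turn a labelled static image into a pseudo frame pair by randomly translating and scaling the target crop, imitating inter-frame motion.

```python
# Illustrative synthesis of a pseudo "video" frame pair from one static image.
import random
from PIL import Image

def synth_frame_pair(img, box, max_shift=0.1, max_scale=0.1):
    """img: PIL image; box: (x1, y1, x2, y2) target annotation."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    # "Previous frame": the crop centered on the labelled target.
    prev_crop = img.crop(box)
    # "Current frame": the same target under a small random translation and
    # scale change, imitating the target motion between two real video frames.
    dx = random.uniform(-max_shift, max_shift) * w
    dy = random.uniform(-max_shift, max_shift) * h
    s = 1.0 + random.uniform(-max_scale, max_scale)
    new_box = (x1 + dx, y1 + dy, x1 + dx + w * s, y1 + dy + h * s)
    curr_crop = img.crop(tuple(round(v) for v in new_box))
    return prev_crop, curr_crop, new_box
```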
Compared with a single-layer LSTM, two LSTM layers can handle more feature detail, capture more complex variations of the target features, and process longer sequence data. The forward pass is computed as follows:

z_t = h(W_z x_t + R_z y_{t-1} + b_z)
i_t = σ(W_i x_t + R_i y_{t-1} + P_i ⊙ c_{t-1} + b_i)
f_t = σ(W_f x_t + R_f y_{t-1} + P_f ⊙ c_{t-1} + b_f)
c_t = i_t ⊙ z_t + f_t ⊙ c_{t-1}
o_t = σ(W_o x_t + R_o y_{t-1} + P_o ⊙ c_t + b_o)
y_t = o_t ⊙ h(c_t)

where t denotes the frame index; x_t and y_{t-1} are the feature vectors of the current input frame and the previous output frame, respectively; W, R, and P denote the weight coefficient matrices of the input, recurrent (previous-output), and peephole connections; b denotes a bias vector; h is the hyperbolic tangent function; σ is the sigmoid function; and ⊙ denotes point-wise multiplication. z denotes the overall input of the LSTM unit, i denotes the input gate passed between the cells of the LSTM, o denotes the output gate of each LSTM cell, f represents the forget gate, c denotes the cell state at each moment in the sequence, and y denotes the overall output of the LSTM. The forward pass produces the output vector y_t, which stores the target state of the current frame, and the cell state c_t of the LSTM while the current frame is processed; both y_t and c_t are taken as inputs to the cell when the subsequent frame is processed, achieving forward propagation over the sequence data. Because the long short-term memory network, unlike other deep networks used for tracking, requires no online fine-tuning during tracking, real-time performance and tracking speed are significantly improved.
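Tying the parts together, the following sketch shows the overall tracking loop of steps S1 to S4, reusing the SiameseConvBranch and TwoUnitLSTM sketches above. The crop_fn helper (crop the search region and resize to 227*227) and the rule of taking the previous predicted box as the next search region are illustrative assumptions.

```python
# Sketch of the tracking loop over one video sequence (steps S1-S4).
import torch

def track(frames, init_box, conv_net, lstm_net, crop_fn):
    """conv_net, lstm_net: e.g. SiameseConvBranch and TwoUnitLSTM above (their
    feature widths must be made to match); crop_fn(frame, box): assumed helper
    that crops the search region around box and resizes it to 1x3x227x227."""
    boxes = [torch.tensor(init_box, dtype=torch.float32)]
    state1 = state2 = None          # LSTM states persist over the whole sequence
    prev = frames[0]
    with torch.no_grad():
        for curr in frames[1:]:
            region = boxes[-1]                            # current search region
            x_prev = crop_fn(prev, region)                # S1: consecutive frame pair
            x_curr = crop_fn(curr, region)
            feat = conv_net(x_prev, x_curr)               # S2: shared-weight features
            box, state1, state2 = lstm_net(feat, state1, state2)  # S3: sequence model
            boxes.append(box.squeeze(0))                  # S4: box -> next search region
            prev = curr
    return boxes
```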
Experiments are described below. From methods proposed in recent years, several trackers were chosen, based respectively on traditional features and deep features and of both generative and discriminative type; on the two target tracking test sets VOT2014 and VOT2016, experimental comparisons of tracking accuracy and speed were carried out between the traditional discriminative correlation filter tracker DSST, the other chosen trackers, and the present invention. The results are shown in Table 3:
Table 3: test results
Here, Methods lists the trackers participating in the comparison. Accuracy denotes the intersection over union of the predicted position and the ground-truth position of the target and reflects tracking accuracy: the larger the Accuracy value, the higher the accuracy. Speed denotes the average tracking rate over the entire test set in frames per unit time (one second): the larger the Speed value, the faster the tracker. Note that fields marked "/" indicate that the method had not yet been published when that test set was released and therefore did not take part in the experiments on that test set.
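For reference, the Accuracy measure is the intersection over union (IoU) of the predicted and ground-truth boxes; a minimal implementation, assuming the corner format (x1, y1, x2, y2), is:

```python
def iou(a, b):
    """a, b: boxes as (x1, y1, x2, y2); returns the overlap ratio in [0, 1]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 0.142857...
```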
As can be seen from the table, the present invention outperforms most of the compared methods in accuracy, i.e., precision, while its tracking speed, i.e., rapidity, substantially exceeds the other methods participating in the experiments; the effectiveness of the invention is thereby demonstrated.
The basic principles, main features, and advantages of the invention have been shown and described above. Those skilled in the art should understand that the present invention is not limited by the above specific embodiments; the descriptions in the above embodiments and in the specification merely illustrate the principles of the invention. Without departing from the spirit and scope of the invention, the invention admits various changes and improvements, all of which fall within the scope of the claimed invention. The scope of protection of the invention is defined by the appended claims and their equivalents.

Claims (6)

1. A real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network, characterized in that it comprises the following steps:
Step S1: for the video sequence to be tracked, take each pair of consecutive frames as the network input;
Step S2: extract features from the two consecutive input frames with the Siamese convolutional network, the convolution operations yielding appearance and semantic features at different levels, after which a fully connected layer concatenates the high- and low-level features into a deep feature;
Step S3: pass the deep feature to a long short-term memory network containing two LSTM units for sequence modeling, the LSTM forget gates selectively activating the target features at different positions in the sequence and the output gates emitting the state information of the current target;
Step S4: a fully connected layer receiving the LSTM output emits the predicted position coordinates of the target in the current frame and updates the search region of the next-frame target.
2. The real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network according to claim 1, characterized in that in step S2 the Siamese convolutional network is composed of two parallel convolutional networks that are identical in number of layers, structure, kernel size, pooling mode, and padding stride and that share their weights; the network layers comprise a first convolutional layer, a first pooling layer, a second convolutional layer, a second pooling layer, a third convolutional layer, a fourth convolutional layer, a fifth convolutional layer, and a third pooling layer; the kernel size and channel count of the first convolutional layer are 11*11*96, those of the second convolutional layer are 5*5*256, and those of the third, fourth, and fifth convolutional layers are 3*3*384; the filter size of the first, second, and third pooling layers is 3*3; the first convolutional layer and the first, second, and third pooling layers use the valid padding mode, and the second, third, fourth, and fifth convolutional layers use the same padding mode; input images are resized by the Siamese convolutional network to 227*227*3.
3. The real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network according to claim 1, characterized in that the long short-term memory network part includes two LSTM units, wherein the first LSTM receives the convolutional feature input from the fully connected layer and the second LSTM takes as input the concatenation of the output of the first LSTM and the features of the Siamese convolutional network part; sequence data modeling is performed jointly over continuous, independent tracking video sequences, so that the output corresponding to each sequence state is computed separately for the same target under different conditions within the same sequence.
4. The real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network according to claim 1, characterized in that the video sequence data sets comprise the ILSVRC2016 video object detection data set, the Amsterdam Library of Ordinary Videos, and non-natural video sequences; the ILSVRC2016 video object detection data set contains 3862 video sequences, 1122397 images, 1731913 annotated target bounding boxes, and 7911 target trajectories; the Amsterdam Library of Ordinary Videos contains 314 video sequences and 148319 images, each video sequence containing one specific target; and the non-natural video sequences are constructed by artificial synthesis from 478807 static images selected from the ImageNet data set.
5. The real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network according to claim 1, characterized in that the feed-forward computation of the two LSTM units is

z_t = h(W_z x_t + R_z y_{t-1} + b_z)
i_t = σ(W_i x_t + R_i y_{t-1} + P_i ⊙ c_{t-1} + b_i)
f_t = σ(W_f x_t + R_f y_{t-1} + P_f ⊙ c_{t-1} + b_f)
c_t = i_t ⊙ z_t + f_t ⊙ c_{t-1}
o_t = σ(W_o x_t + R_o y_{t-1} + P_o ⊙ c_t + b_o)
y_t = o_t ⊙ h(c_t)

where t is the frame index; x_t and y_{t-1} are the feature vectors of the current input frame and the previous output frame, respectively; W, R, and P are the weight coefficient matrices of the input, recurrent (previous-output), and peephole connections; b is a bias vector; h is the hyperbolic tangent function; σ is the sigmoid function; and ⊙ denotes point-wise multiplication; z is the overall input of the LSTM unit, i is the input gate passed between the cells of the LSTM, o is the output gate of each LSTM cell, f is the forget gate, c is the cell state at each moment in the sequence, and y is the overall output of the LSTM; the forward pass produces the output vector y_t, which stores the target state of the current frame, and the cell state c_t of the LSTM while the current frame is processed, and both y_t and c_t are taken as inputs to the cell when the subsequent frame is processed, achieving forward propagation over the sequence data.
6. The real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network according to claim 1, characterized in that, when the video sequence to be tracked is input, the target position in the first frame of the sequence is given as a pair of top-left and bottom-right coordinates.
CN201910771090.7A 2019-08-20 2019-08-20 Real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network Pending CN110490906A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910771090.7A 2019-08-20 2019-08-20 Real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network


Publications (1)

Publication Number Publication Date
CN110490906A 2019-11-22

Family

ID=68551599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910771090.7A Pending 2019-08-20 Real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network

Country Status (1)

Country Link
CN (1) CN110490906A (en)



Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326837A (en) * 2016-08-09 2017-01-11 北京旷视科技有限公司 Object tracking method and apparatus
US20180129934A1 (en) * 2016-11-07 2018-05-10 Qualcomm Incorporated Enhanced siamese trackers
CN108229292A (en) * 2017-07-28 2018-06-29 北京市商汤科技开发有限公司 target identification method, device, storage medium and electronic equipment
CN108320297A (en) * 2018-03-09 2018-07-24 湖北工业大学 A kind of video object method for real time tracking and system
CN108520530A (en) * 2018-04-12 2018-09-11 厦门大学 Method for tracking target based on long memory network in short-term
CN108665482A (en) * 2018-04-18 2018-10-16 南京邮电大学 A kind of visual target tracking method based on VGG depth networks
CN109446889A (en) * 2018-09-10 2019-03-08 北京飞搜科技有限公司 Object tracking method and device based on twin matching network
CN109711316A (en) * 2018-12-21 2019-05-03 广东工业大学 A kind of pedestrian recognition methods, device, equipment and storage medium again

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
K. Greff et al., "LSTM: A Search Space Odyssey," IEEE Transactions on Neural Networks and Learning Systems *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110992378B (en) * 2019-12-03 2023-05-16 湖南大学 Dynamic updating vision tracking aerial photographing method and system based on rotor flying robot
CN110992378A (en) * 2019-12-03 2020-04-10 湖南大学 Dynamic update visual tracking aerial photography method and system based on rotor flying robot
CN111882580A (en) * 2020-07-17 2020-11-03 元神科技(杭州)有限公司 Video multi-target tracking method and system
CN111882580B (en) * 2020-07-17 2023-10-24 元神科技(杭州)有限公司 Video multi-target tracking method and system
CN111899283A (en) * 2020-07-30 2020-11-06 北京科技大学 Video target tracking method
CN111899283B (en) * 2020-07-30 2023-10-17 北京科技大学 Video target tracking method
CN112037263A (en) * 2020-09-14 2020-12-04 山东大学 Operation tool tracking system based on convolutional neural network and long-short term memory network
CN112037263B (en) * 2020-09-14 2024-03-19 山东大学 Surgical tool tracking system based on convolutional neural network and long-term and short-term memory network
CN112149616A (en) * 2020-10-13 2020-12-29 西安电子科技大学 Figure interaction behavior recognition method based on dynamic information
CN112149616B (en) * 2020-10-13 2023-10-20 西安电子科技大学 Character interaction behavior recognition method based on dynamic information
CN112336381B (en) * 2020-11-07 2022-04-22 吉林大学 Echocardiogram end systole/diastole frame automatic identification method based on deep learning
CN112336381A (en) * 2020-11-07 2021-02-09 吉林大学 Echocardiogram end systole/diastole frame automatic identification method based on deep learning
CN112465028A (en) * 2020-11-27 2021-03-09 南京邮电大学 Perception vision security assessment method and system
CN112465028B (en) * 2020-11-27 2023-11-14 南京邮电大学 Perception visual safety assessment method and system
CN112530553A (en) * 2020-12-03 2021-03-19 中国科学院深圳先进技术研究院 Method and device for estimating interaction force between soft tissue and tool
CN112971769A (en) * 2021-02-04 2021-06-18 杭州慧光健康科技有限公司 Home personnel tumble detection system and method based on biological radar
CN113298142A (en) * 2021-05-24 2021-08-24 南京邮电大学 Target tracking method based on deep space-time twin network
CN113298142B (en) * 2021-05-24 2023-11-17 南京邮电大学 Target tracking method based on depth space-time twin network

Similar Documents

Publication Publication Date Title
CN110490906A (en) Real-time visual target tracking method based on a Siamese convolutional network and a long short-term memory network
CN111553193B (en) Visual SLAM closed-loop detection method based on lightweight deep neural network
CN109829436B (en) Multi-face tracking method based on depth appearance characteristics and self-adaptive aggregation network
CN108133188A (en) A kind of Activity recognition method based on motion history image and convolutional neural networks
CN104217214B (en) RGB D personage's Activity recognition methods based on configurable convolutional neural networks
CN105701460B (en) A kind of basketball goal detection method and apparatus based on video
CN108520530A (en) Method for tracking target based on long memory network in short-term
CN109146921A (en) A kind of pedestrian target tracking based on deep learning
CN109816689A (en) A kind of motion target tracking method that multilayer convolution feature adaptively merges
CN109410242A (en) Method for tracking target, system, equipment and medium based on double-current convolutional neural networks
CN108171112A (en) Vehicle identification and tracking based on convolutional neural networks
CN109871781A (en) Dynamic gesture identification method and system based on multi-modal 3D convolutional neural networks
CN106651830A (en) Image quality test method based on parallel convolutional neural network
CN108681774A (en) Based on the human body target tracking method for generating confrontation network negative sample enhancing
CN108932479A (en) A kind of human body anomaly detection method
CN105512680A (en) Multi-view SAR image target recognition method based on depth neural network
CN109522855A (en) In conjunction with low resolution pedestrian detection method, system and the storage medium of ResNet and SENet
CN110991274B (en) Pedestrian tumbling detection method based on Gaussian mixture model and neural network
CN111160294B (en) Gait recognition method based on graph convolution network
CN113239801B (en) Cross-domain action recognition method based on multi-scale feature learning and multi-level domain alignment
CN108596327A (en) A kind of seismic velocity spectrum artificial intelligence pick-up method based on deep learning
CN107729993A (en) Utilize training sample and the 3D convolutional neural networks construction methods of compromise measurement
Xiao et al. Few-shot object detection with self-adaptive attention network for remote sensing images
CN107025420A (en) The method and apparatus of Human bodys' response in video
CN104298974A (en) Human body behavior recognition method based on depth video sequence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191122