WO2018086607A1 - Target tracking method, electronic device and storage medium - Google Patents

Target tracking method, electronic device and storage medium

Info

Publication number
WO2018086607A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
target
feature vector
candidate
region
Prior art date
Application number
PCT/CN2017/110577
Other languages
English (en)
Chinese (zh)
Inventor
唐矗
Original Assignee
纳恩博(北京)科技有限公司
Priority date
Filing date
Publication date
Application filed by 纳恩博(北京)科技有限公司
Publication of WO2018086607A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Definitions

  • the present invention relates to the field of electronic technologies, and in particular, to a target tracking method, an electronic device, and a storage medium.
  • Visual tracking technology based on online learning has become a research hotspot in recent years.
  • Such methods extract a feature template from the tracking target specified in the initial frame image, without any prior offline learning.
  • The trained model is then used to track the target in subsequent video frames.
  • During tracking, the model is updated according to the tracking status to accommodate changes in the target's pose.
  • Methods of this type require no offline training and can track any object specified by the user, so they are highly versatile.
  • the embodiments of the present invention provide a target tracking method, an electronic device, and a storage medium to solve the technical problems of prior-art online-learning visual tracking methods, namely that it is impossible to determine whether the tracking target has been lost and that it is difficult to recover the tracking target after it is lost.
  • the present invention provides the following technical solutions through an embodiment of the present invention:
  • a target tracking method is applied to an electronic device, wherein the electronic device has an image acquisition unit, and the image acquisition unit is configured to collect image data, and the method includes:
  • a candidate target having the highest similarity with the tracking target among the plurality of candidate targets is determined as the tracking target.
  • the determining a tracking target in the initial frame image of the image data comprises:
  • the extracting a plurality of candidate targets in the subsequent frame image of the image data comprises:
  • the plurality of candidate targets are determined within the ith image block.
  • the calculating the similarity between each candidate target and the tracking target comprises:
  • the calculating the first color feature vector of the first candidate target and calculating the second color feature vector of the tracking target comprises:
  • the calculating a color feature vector of each region in the first mask image; and calculating a color feature vector of each region in the second mask image comprises:
  • W is a positive integer
  • the projection weight of the first pixel on each of the main colors is calculated based on the following equation:
  • the first pixel is any one pixel in the first region or the second region, the nth main color is any one of the W main colors, w_n is the projection weight of the first pixel on the nth main color, I_r, I_g, and I_b are the RGB values of the first pixel, and R_n, G_n, and B_n are the RGB values of the nth main color.
  • the calculating the similarity between each candidate target and the tracking target comprises:
  • the determining the plurality of candidate targets in the ith image block comprises:
  • the calculating the similarity between each candidate target and the tracking target comprises:
  • the present invention provides the following technical solutions through an embodiment of the present invention:
  • a first determining unit configured to determine a tracking target in an initial frame image of the image data
  • An extracting unit configured to extract a plurality of candidate targets in a subsequent frame image of the image data,
  • the subsequent frame image is any frame image subsequent to the initial frame image;
  • a calculating unit configured to calculate a similarity between each candidate target and the tracking target
  • the second determining unit is configured to determine, as the tracking target, a candidate target that has the highest similarity with the tracking target among the plurality of candidate targets.
  • the first determining unit includes:
  • a first determining subunit configured to acquire a user's selection operation after outputting the initial frame image through the display screen; determining the tracking target in the initial frame image based on a user's selection operation;
  • a second determining subunit configured to acquire feature information for describing the tracking target; and determining the tracking target in the initial frame image based on the feature information.
  • the extracting unit comprises:
  • a first determining subunit configured to determine an (i-1)th bounding frame of the tracking target in the (i-1)th frame image, wherein the (i-1)th frame image belongs to the image data, i is an integer greater than or equal to 2, and the (i-1)th frame image is the initial frame image when i is equal to 2;
  • a second determining subunit configured to determine an ith image block in the ith frame image based on the (i-1)th bounding frame, wherein the ith frame image is the subsequent frame image, the center of the ith image block is the same as the center of the (i-1)th bounding frame, and the area of the ith image block is larger than the area of the (i-1)th bounding frame;
  • a third determining subunit configured to determine the plurality of candidate targets within the ith image block.
  • the calculating unit comprises:
  • a first selection sub-unit configured to select a first candidate target from the plurality of candidate targets, wherein the first candidate target is any one of the plurality of candidate targets;
  • a first calculation subunit configured to calculate a first color feature vector of the first candidate target, and calculate a second color feature vector of the tracking target;
  • a second calculation subunit configured to calculate the distance between the first color feature vector and the second color feature vector, wherein the distance is the similarity between the first candidate target and the tracking target.
  • the first calculating subunit is further configured to:
  • the first calculating subunit is further configured to:
  • determining W main colors, W being a positive integer; calculating a projection weight of each pixel in the first region of the first mask image on each of the main colors, the first region being any one of the M regions in the first mask image; calculating a projection weight of each pixel in the second region of the second mask image on each of the main colors, the second region being any one of the M regions in the second mask image; obtaining, based on the projection weights of each pixel in the first region on the main colors, a W-dimensional color feature vector corresponding to that pixel, and obtaining, based on the projection weights of each pixel in the second region on the main colors, a W-dimensional color feature vector corresponding to that pixel; normalizing the W-dimensional color feature vector corresponding to each pixel in the first region to obtain the color feature vector of that pixel, and normalizing the W-dimensional color feature vector corresponding to each pixel in the second region to obtain the color feature vector of that pixel; adding up the color feature vectors of all pixels in the first region to obtain the color feature vector of the first region, and adding up the color feature vectors of all pixels in the second region to obtain the color feature vector of the second region.
  • the first calculating subunit is further configured to calculate a projection weight of the first pixel on each n main colors based on the following equation:
  • the first pixel is any one pixel in the first region or the second region, the nth main color is any one of the W main colors, w_n is the projection weight of the first pixel on the nth main color, I_r, I_g, and I_b are the RGB values of the first pixel, and R_n, G_n, and B_n are the RGB values of the nth main color.
  • the calculating unit comprises:
  • a second selection subunit configured to select a first candidate target from the plurality of candidate targets, wherein the first candidate target is any one of the plurality of candidate targets;
  • a normalization subunit configured to normalize an image of the first candidate target and an image of the tracking target to the same size;
  • a first input subunit configured to input an image of the tracking target into a first convolutional network of a first deep neural network for feature calculation, to obtain a feature vector of the tracking target, wherein the first deep neural network is based on the Siamese structure;
  • a second input subunit configured to input an image of the first candidate target into a second convolution network of the first depth neural network to perform feature calculation, to obtain a feature vector of the first candidate target,
  • the second convolution network and the first convolution network share a convolution layer parameter;
  • a third input subunit configured to input a feature vector of the tracking target and a feature vector of the first candidate target into a first fully connected network of the first deep neural network to perform a similarity calculation, to obtain the similarity between the first candidate target and the tracking target.
  • the third determining subunit is further configured to:
  • the calculating unit comprises:
  • Extracting a subunit configured to extract a feature vector of the first candidate target from the feature vectors of the plurality of candidate targets, wherein the first candidate target is any one of the plurality of candidate targets;
  • a fourth input subunit configured to input an image of the tracking target into a fourth convolutional network of the second deep neural network to perform feature calculation, to obtain a feature vector of the tracking target, wherein the fourth convolutional network and the third convolutional network share convolutional layer parameters;
  • a fifth input subunit configured to input a feature vector of the tracking target and a feature vector of the first candidate target into a second fully connected network of the second deep neural network to perform a similarity calculation, to obtain the similarity between the first candidate target and the tracking target.
  • the present invention provides the following technical solutions through an embodiment of the present invention:
  • An electronic device comprising: a processor and a memory for storing a computer program executable on the processor, wherein the processor is operative to perform the steps of the method described above when the computer program is run.
  • the present invention provides the following technical solutions through an embodiment of the present invention:
  • a computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the method described above.
  • a tracking target is determined in an initial frame image of the image data; a plurality of candidate targets are extracted in a subsequent frame image of the image data; the similarity between each candidate target and the tracking target is calculated; and the candidate target with the highest similarity is determined as the tracking target.
  • each candidate target of each subsequent frame image is compared with the tracking target in the initial frame image, and the candidate target with the highest similarity is determined as the tracking target, thereby tracking the target.
  • the tracking method in the embodiments of the present invention can therefore reliably determine whether the tracking target has been lost; moreover, it does not need to maintain a tracking template, which avoids the error amplification caused by continuously updating such a template, makes it easier to recover a lost tracking target, and thereby improves the robustness of the tracking system.
  • FIG. 1 is a flowchart of a target tracking method according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of an initial frame image in an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an initial tracking target in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of an image of a second frame in an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of candidate objects determined in a second frame image according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of a first deep neural network according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of a second deep neural network according to an embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
  • the embodiments of the present invention provide a target tracking method and device to solve the technical problems of prior-art online-learning visual tracking methods, namely that it is impossible to determine whether the tracking target has been lost and that it is difficult to retrieve the tracking target after it is lost.
  • a target tracking method is applied to an electronic device, wherein the electronic device has an image acquisition unit configured to acquire image data, the method comprising: determining a tracking target in the initial frame image of the image data; extracting a plurality of candidate targets in a subsequent frame image, the subsequent frame image being any frame image after the initial frame image; calculating the similarity between each candidate target and the tracking target; and determining the candidate target with the highest similarity to the tracking target as the tracking target.
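  • By way of illustration only, the overall flow of steps S101 to S104 can be sketched as the following loop; the callables pick_target, find_candidates, and similarity stand for the operations described in the steps below and are not names taken from the patent.

```python
def track_sequence(frames, pick_target, find_candidates, similarity):
    """S101: pick the tracking target in the initial frame; then, for each
    subsequent frame, S102 extract candidates, S103 score each one against the
    initial target, and S104 keep the most similar candidate as the target."""
    target_image, box = pick_target(frames[0])        # S101: user selection or detection
    trajectory = [box]
    for frame in frames[1:]:
        candidates = find_candidates(frame, box)      # S102: search block around previous box
        best = max(candidates, key=lambda c: similarity(c.image, target_image))  # S103 + S104
        box = best.box                                 # used to place the next search block
        trajectory.append(box)
    return trajectory
```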
  • the embodiment provides a target tracking method, which is applied to an electronic device; the electronic device may be a ground robot (for example, a balance vehicle) or a drone (for example, a multi-rotor drone or a fixed-wing drone), and the specific form of the electronic device is not limited in this embodiment.
  • the electronic device has an image acquisition unit (for example, a camera), and the image acquisition unit is configured to collect image data.
  • the target tracking method includes:
  • Step S101 Determine a tracking target in the initial frame image of the image data.
  • step S101 includes:
  • an image acquired by the image acquisition unit (for example, initial frame image 300) may be output through a display screen provided on the electronic device, and a selection operation performed by the user is acquired; for example, when the display is a touch screen, the user's selection operation is acquired through the touch screen, and the tracking target (i.e., initial tracking target 000) is then determined in the initial frame image 300 based on the selection operation.
  • the feature information for describing the tracking target is acquired, and the tracking target (ie, the initial tracking target 000) is determined in the initial frame image 300 in conjunction with a saliency detection or an object detection algorithm.
  • the image 311 of the initial tracking target 000 can be extracted and saved for backup, and the image 311 is the image in the first bounding frame 310.
  • Step S102 Extract a plurality of candidate targets in the subsequent frame image of the image data, where the subsequent frame image is any frame image subsequent to the initial frame image.
  • step S102 includes:
  • an (i-1)th bounding frame of the tracking target is determined in the (i-1)th frame image (the (i-1)th frame image belongs to the image data, i is an integer greater than or equal to 2, and when i equals 2 the (i-1)th frame image is the initial frame image); based on the (i-1)th bounding frame, an ith image block is determined in the ith frame image, where the ith frame image is the subsequent frame image, the center of the ith image block is the same as the center of the (i-1)th bounding frame, and the area of the ith image block is larger than the area of the (i-1)th bounding frame; and a plurality of candidate targets are determined within the ith image block.
  • FIG. 2 is an initial frame image including a plurality of person targets, and the tracking target to be tracked is a person in the first bounding box 310.
  • FIG. 4 is a second frame image in which the position or posture of each character object is changed.
  • the bounding frame of the tracking target (i.e., initial tracking target 000), namely the first bounding frame 310, is determined in the initial frame image 300; the bounding frame is generally rectangular and just encloses the tracking target.
  • based on the position of the first bounding frame 310 (its position in the initial frame image 300 is taken to be the same as its position in the second frame image 400), a second image block 420 is determined in the second frame image 400.
  • the center of the second image block 420 is the same as the center of the first bounding frame 310, but the area of the second image block 420 is larger than that of the first bounding frame 310.
  • there may be a plurality of targets in the second image block 420, among which is the tracking target determined in the initial frame image 300 (i.e., initial tracking target 000); methods such as saliency analysis or target detection can be used to find these targets in the second image block 420.
  • the plurality of targets found in the second image block 420 are determined as candidate targets (i.e., candidate target 401, candidate target 402, candidate target 403, and candidate target 404). Further, based on steps S103 to S104, the tracking target is determined from among the candidate targets, that is, the initial tracking target 000 is identified in the second frame image.
  • specific implementations of steps S103 to S104 will be described in detail later.
  • for the third frame image, the bounding frame of the tracking target in the second frame image 400 (i.e., the second bounding frame) is first determined; based on the second bounding frame, an image block (i.e., a third image block) is determined in the third frame image, where the center of the third image block is the same as the center of the second bounding frame and the area of the third image block is larger than the area of the second bounding frame.
  • there may be multiple targets in the third image block, among which is the tracking target determined in the initial frame image; saliency analysis or target detection may be used to determine these targets in the third image block.
  • the targets so found are determined as candidate targets. Further, based on steps S103 to S104, the tracking target is determined from among the candidate targets, that is, the initial tracking target 000 is identified in the third frame image.
  • similarly, a fourth image block is determined in the fourth frame image and a plurality of candidate targets are determined within it; the tracking target (i.e., initial tracking target 000) is then determined from the candidate targets based on steps S103 to S104.
  • in this way, a plurality of candidate targets are determined in each frame image, and the tracking target (i.e., initial tracking target 000) is determined from the candidate targets based on steps S103 to S104, thereby identifying and tracking the tracking target.
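  • As an illustration, the following sketch shows one way to crop such a search block around the previous bounding frame; the box format (center x, center y, width, height), the scale factor of 2, and the function name are assumptions made for illustration, since the description above only requires the block to share the previous bounding frame's center and to have a larger area.

```python
import numpy as np

def search_block(frame, box, scale=2.0):
    """Crop the ith image block: same center as the previous bounding frame
    `box` = (cx, cy, w, h), but `scale` times larger per side, clipped to the
    frame borders."""
    cx, cy, w, h = box
    bw, bh = w * scale, h * scale
    H, W = frame.shape[:2]
    x0, x1 = int(max(cx - bw / 2, 0)), int(min(cx + bw / 2, W))
    y0, y1 = int(max(cy - bh / 2, 0)), int(min(cy + bh / 2, H))
    return frame[y0:y1, x0:x1], (x0, y0)

# usage: block, offset = search_block(frame, prev_box); candidate targets are
# then found inside `block`, e.g. by saliency analysis or an object detector.
```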
  • images of each candidate target are extracted and saved for backup.
  • the image 421 of the candidate target 401, the image 422 of the candidate target 402, the image 423 of the candidate target 403, and the image of the candidate target 404 are extracted and saved.
  • Step S103 Calculate the similarity between the candidate target and the tracking target.
  • the similarity of each candidate target to the tracking target is calculated.
  • the tracking target is an initial tracking target 000 (shown in FIG. 3) determined in the initial frame image 300
  • the candidate target is from an ith image block in the ith frame image
  • the ith frame image is a subsequent frame image (i.e., any frame image after the initial frame image).
  • the candidate target includes the candidate target 401, the candidate target 402, the candidate target 403, and the candidate target 404 determined in the second frame image 400.
  • the target re-identification algorithm can be used to calculate the similarity between each candidate target and the tracking target.
  • the following three embodiments are available for step S103.
  • Manner 1 Calculate the similarity between each candidate target and the tracking target by using a color feature based target re-identification algorithm.
  • step S103 includes:
  • first, the color feature vector of the initial tracking target 000 (the tracking target determined in the initial frame image 300, as shown in FIG. 5) is calculated; then the color feature vector of the candidate target 401 is calculated; and finally the distance between the color feature vector of the initial tracking target 000 and the color feature vector of the candidate target 401 is calculated, which represents the similarity between the candidate target 401 and the initial tracking target 000.
  • similarly, the similarities between the initial tracking target 000 and each of the candidate target 402, the candidate target 403, and the candidate target 404 are calculated.
  • the distance between the first color feature vector and the second color feature vector may be calculated based on the Euclidean distance formula.
  • the calculating the first color feature vector of the first candidate target and calculating the second color feature vector of the tracking target comprises:
  • Principal component (saliency) segmentation is performed on the image of the first candidate target to obtain a first mask image, and on the image of the tracking target to obtain a second mask image; the first mask image and the second mask image are scaled to the same size; the first mask image is equally divided into M regions, and the second mask image is equally divided into M regions, M being a positive integer; the color feature vector of each region in the first mask image is calculated, and the color feature vector of each region in the second mask image is calculated; the color feature vectors of the regions in the first mask image are connected in sequence to obtain the first color feature vector, and the color feature vectors of the regions in the second mask image are connected in sequence to obtain the second color feature vector.
  • specifically, the image 311 of the initial tracking target 000 may first be subjected to principal component segmentation to obtain a second mask image; in the mask image, only the principal component region keeps the pixel values of the original image, and all other regions have a pixel value of 0.
  • the image 311 of the initial tracking target 000 is rectangular and just encloses the initial tracking target 000.
  • the second mask image is scaled to a preset size and then equally divided into four regions (halved top-to-bottom and left-to-right), and the color feature vector of each of the four regions is calculated.
  • the color feature vectors of the four regions are connected in sequence (if each region's color feature vector is a 10-dimensional vector, the connection yields a 40-dimensional vector), and after normalization the color feature vector of the tracking target (i.e., initial tracking target 000), that is, the second color feature vector, is obtained.
  • similarly, the image 421 of the candidate target 401 may be subjected to principal component segmentation to obtain a first mask image; the image 421 is rectangular and just encloses the candidate target 401.
  • the first mask image is then also scaled to the preset size (the same size as the second mask image) and equally divided into four regions (halved top-to-bottom and left-to-right); the color feature vector of each of the four regions is calculated separately, and the four color feature vectors are connected in sequence (if each region's color feature vector is a 10-dimensional vector, the connection yields a 40-dimensional vector).
  • in this way, the color feature vector of the candidate target 401 (i.e., the first color feature vector) is obtained.
  • the color feature vector of the candidate target 402, the color feature vector of the candidate target 403, and the color feature vector of the candidate target 404 are respectively calculated.
  • the calculating a color feature vector of each region in the first mask image; and calculating a color feature vector of each region in the second mask image includes:
  • W main colors are determined, W being a positive integer; the projection weight of each pixel in the first region of the first mask image on each of the main colors is calculated, the first region being any one of the M regions in the first mask image; the projection weight of each pixel in the second region of the second mask image on each of the main colors is calculated, the second region being any one of the M regions in the second mask image; based on the projection weights of each pixel in the first region on the main colors, a W-dimensional color feature vector corresponding to that pixel is obtained, and based on the projection weights of each pixel in the second region on the main colors, a W-dimensional color feature vector corresponding to that pixel is obtained; the W-dimensional color feature vector corresponding to each pixel in the first region is normalized to obtain the color feature vector of that pixel, and the W-dimensional color feature vector corresponding to each pixel in the second region is normalized to obtain the color feature vector of that pixel; the color feature vectors of all pixels in the first region are added up to obtain the color feature vector of the first region, and the color feature vectors of all pixels in the second region are added up to obtain the color feature vector of the second region.
  • for example, when M is 4, the first mask image is equally divided into four regions (halved top-to-bottom and left-to-right), any one of which may serve as the first region; likewise, the second mask image is equally divided into four regions.
  • to calculate the color feature vector of each region in the second mask image, any one of the four regions (i.e., the second region) is first selected, and the projection weight of each pixel in the second region on each of the main colors is calculated.
  • from its projection weights, each pixel yields a 10-dimensional color feature vector (taking W equal to 10 as an example), which is then normalized and taken as the color feature vector of that pixel; after the color feature vectors of all pixels in the second region are obtained, they are added up to obtain the color feature vector of the second region. Based on this method, the color feature vector of each of the four regions in the second mask image can be calculated.
  • the projection weight of the first pixel on the nth main color can be calculated based on the following equation:
  • the first pixel is any one pixel in the first region or the second region, the nth main color is any one of the W main colors, w_n is the projection weight of the first pixel on the nth main color, I_r, I_g, and I_b are the RGB values of the first pixel, and R_n, G_n, and B_n are the RGB values of the nth main color.
  • for example, with the above 10 main colors, n is the index of a main color; if the second main color is yellow, then w_2 is the projection weight of the pixel on yellow, R_2, G_2, and B_2 are the RGB values of yellow, and I_r, I_g, and I_b are the RGB values of the pixel.
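  • Since the projection-weight equation itself is not reproduced above, the sketch below substitutes an assumed weight (the normalized inverse of the RGB distance between the pixel and each main color) purely as a stand-in; the W main colors, the per-pixel normalization, the per-region summation, the 2-by-2 region split (M equal to 4), the concatenation, and the distance-based comparison follow the description above.

```python
import numpy as np

def pixel_weights(pixel, main_colors, eps=1e-6):
    """W-dimensional projection of one RGB pixel onto the W main colors.
    ASSUMED weight: inverse Euclidean distance in RGB space, normalized so the
    W weights sum to 1 (the patent's exact equation is not reproduced here)."""
    d = np.linalg.norm(main_colors - pixel, axis=1)
    w = 1.0 / (d + eps)
    return w / w.sum()

def region_feature(region_pixels, main_colors):
    """Add up the normalized per-pixel vectors over one region."""
    f = np.zeros(len(main_colors))
    for p in region_pixels:
        f += pixel_weights(p, main_colors)
    return f

def color_feature(mask_image, mask, main_colors):
    """Split the mask image into 4 equal regions (2 x 2), compute each region's
    color feature vector, concatenate them, and normalize the result."""
    H, W, _ = mask_image.shape
    feats = []
    for rows in (slice(0, H // 2), slice(H // 2, H)):
        for cols in (slice(0, W // 2), slice(W // 2, W)):
            pix = mask_image[rows, cols][mask[rows, cols]]  # principal-component pixels only
            feats.append(region_feature(pix.astype(float), main_colors))
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-6)

# the similarity of a candidate and the tracking target is then the distance of their
# vectors, e.g. np.linalg.norm(color_feature(a, ma, C) - color_feature(b, mb, C))
```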
  • Manner 2 Calculate the similarity between each candidate target and the tracking target by using the first deep neural network. In this case, step S103 includes:
  • a first candidate target is selected from the plurality of candidate targets, where the first candidate target is any one of the plurality of candidate targets, and the image of the first candidate target and the image of the tracking target are normalized to the same size;
  • the image of the tracking target is input through the first input terminal 611 into the first convolutional network 601 of the first deep neural network for feature calculation, and the feature vector of the tracking target is obtained, where the first deep neural network is based on the Siamese structure; the image of the first candidate target is input through the second input terminal 612 into the second convolutional network 602 of the first deep neural network for feature calculation, and the feature vector of the first candidate target is obtained, where the second convolutional network 602 and the first convolutional network 601 share convolutional layer parameters, that is, their convolutional layer parameters are the same; the feature vector of the tracking target and the feature vector of the first candidate target are then input into the first fully connected network 603 of the first deep neural network to obtain the similarity between the first candidate target and the tracking target, the outputs of the first convolutional network 601 and the second convolutional network 602 automatically serving as the inputs of the first fully connected network 603.
  • the first deep neural network needs to be trained offline (as shown in FIG. 6); it includes a first convolutional network 601, a second convolutional network 602, a first fully connected network 603, a first input terminal 611, a second input terminal 612, and a first output terminal 621, where the first convolutional network 601 and the second convolutional network 602 form the two branches of a Siamese structure, each branch adopting the network structure of AlexNet before FC6.
  • the first convolutional network 601 and the second convolutional network 602 both contain a plurality of convolutional layers, and the convolutional layers of the two networks are shared, with identical parameters.
  • the images input into the first convolutional network 601 and the second convolutional network 602 need to be normalized to the same size; the normalized image of the tracking target is input into the first convolutional network 601 to obtain the feature vector of the tracking target, and the normalized image of the first candidate target is input into the second convolutional network 602 to obtain the feature vector of the first candidate target.
  • the first convolutional network 601 and the second convolutional network 602 are connected to the first fully connected network 603.
  • the first fully connected network 603 includes a plurality of fully connected layers for calculating the distance between the feature vectors input from the two branches, i.e., the similarity between the first candidate target and the tracking target.
  • the parameters of the first deep neural network are obtained through offline learning, and the method of training the first deep neural network is the same as that of a general convolutional neural network; after offline training is finished, the first deep neural network can be used in the tracking system.
  • the image 421 of the candidate target 401 and the image 311 of the initial tracking target 000 may be first normalized to the same size;
  • the image 311 of the initial tracking target 000 is input into the first convolutional network 601 to obtain the feature vector of the initial tracking target 000, and the image 421 of the candidate target 401 is input into the second convolutional network 602 to obtain the feature vector of the candidate target 401;
  • the feature vector of the initial tracking target 000 and the feature vector of the candidate target 401 are then input into the first fully connected network 603 to obtain the similarity between the candidate target 401 and the initial tracking target 000.
  • similarly, the image 422 of the candidate target 402 is normalized together with the image 311 of the initial tracking target 000 to the same size; the image 311 of the initial tracking target 000 is input into the first convolutional network 601 and the image 422 of the candidate target 402 is input into the second convolutional network 602, and the similarity between the candidate target 402 and the initial tracking target 000 is obtained.
  • the similarity of the candidate target 403 and the initial tracking target 000, and the similarity of the candidate target 404 and the initial tracking target 000 can be obtained.
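  • A minimal PyTorch-style sketch of a Siamese similarity network of the kind described above is given below; the convolutional branch is a small stand-in rather than AlexNet before FC6, and the layer sizes and names are illustrative assumptions, but the essential points are kept: the two branches share their convolutional parameters, and a fully connected head maps the pair of feature vectors to a similarity score.

```python
import torch
import torch.nn as nn

class SiameseSimilarity(nn.Module):
    def __init__(self, feat_dim=128):
        super().__init__()
        # one convolutional branch applied to both inputs => shared parameters
        self.branch = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # fully connected head: pair of feature vectors -> similarity score
        self.head = nn.Sequential(
            nn.Linear(2 * feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, target_img, candidate_img):
        f_t = self.branch(target_img)       # feature vector of the tracking target
        f_c = self.branch(candidate_img)    # feature vector of the candidate target
        return self.head(torch.cat([f_t, f_c], dim=1))

# usage: both images are normalized to the same size before the forward pass
net = SiameseSimilarity()
score = net(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))  # similarity score in [0, 1]
```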
  • Manner 3 Using a deep neural network, simultaneously generating candidate targets and calculating the similarity between each candidate target and the tracking target.
  • a second deep neural network, as shown in FIG. 7, may be utilized.
  • the second deep neural network may be trained offline and is based on the Siamese structure; it includes a third convolutional network 604, a fourth convolutional network 605, an RPN (Region Proposal Network) 607, a second fully connected network 606, a third input terminal 613, a fourth input terminal 614, and a second output terminal 622.
  • the output of the third convolutional network 604 is input to the RPN network 607, and the fourth convolutional network 605 and the RPN network 607 are simultaneously connected to the second fully connected network 606.
  • the third convolutional network 604 includes a plurality of convolutional layers for performing feature calculation on the ith image block and is used to obtain a feature map of the ith image block; the RPN network 607 is configured to extract, based on the feature map of the ith image block, a plurality of candidate targets from the ith image block and to calculate the feature vector of each candidate target.
  • the second deep neural network shown in FIG. 7 differs from the first deep neural network shown in FIG. 6 mainly in its lower half: the third convolutional network 604 in FIG. 7 takes the ith image block as input, and an RPN network 607 is added, which extracts candidate targets on the feature map obtained after the ith image block is processed by the third convolutional network 604.
  • the RPN network 607 performs its calculation directly on the feature map produced by the third convolutional network 604, locates the candidate targets on that feature map, and directly acquires the feature vector of each candidate target from the corresponding positions on the feature map; each candidate feature vector, together with the feature vector corresponding to the initial tracking target 000, is then input into the second fully connected network 606 to calculate the similarity.
  • the ith image block may be input into the third convolution network 604 of the second depth neural network through the fourth input terminal 614 to perform feature calculation, and obtain a feature map of the ith image block;
  • the feature map of the block is input to the RPN network 607 of the second depth neural network for feature calculation, a plurality of candidate targets are extracted, and feature vectors of each candidate target are calculated.
  • the second image block 420 can be input into the third convolutional network 604 of the second deep neural network to obtain a feature map of the second image block 420, and the feature map of the second image block 420 is then input into the RPN network 607;
  • a plurality of candidate targets (i.e., candidate target 401, candidate target 402, candidate target 403, and candidate target 404) are extracted, and the feature vector of each candidate target can also be obtained.
  • step S103 includes:
  • the convolutional layers in the fourth convolutional network 605 and the third convolutional network 604 share convolutional layer parameters, i.e., their convolutional layer parameters are the same.
  • besides the third convolutional network 604 and the RPN network 607, the second deep neural network further includes the fourth convolutional network 605 and the second fully connected network 606.
  • the RPN network 607 is configured to extract a plurality of candidate targets based on the feature map output by the third convolutional network 604, to calculate the feature vector of each candidate target, and to input the feature vectors of the candidate targets into the second fully connected network 606 in sequence; the fourth convolutional network 605 is configured to calculate the feature vector of the tracking target and to output it to the second fully connected network 606; and the second fully connected network 606 is configured to calculate the similarity between the first candidate target and the tracking target based on the feature vector of the first candidate target and the feature vector of the tracking target.
  • the feature vectors of the candidate targets can be obtained through the calculation of the third convolutional network 604 and the RPN network 607; the image 311 corresponding to the initial tracking target 000 is input into the fourth convolutional network 605 of the second deep neural network, and the similarity between the candidate target 401 and the initial tracking target 000 can then be calculated by the second fully connected network 606.
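  • A rough sketch of the shared-backbone idea behind this second network follows, under stated assumptions: torchvision's roi_align is used here in place of a learned RPN, and the candidate boxes are assumed to be given, whereas the network described above generates them itself; what is kept is that candidate feature vectors are pooled directly from the feature map of the search block, the target image passes through a backbone with shared parameters, and a fully connected head scores each candidate against the target.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class SharedBackboneScorer(nn.Module):
    def __init__(self, feat_dim=64):
        super().__init__()
        # backbone shared by the search block and the target image (total stride 4)
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Sequential(nn.Linear(2 * feat_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, search_block, target_img, boxes):
        fmap = self.backbone(search_block)                      # feature map of the image block
        # pool one feature vector per candidate box directly from the feature map
        cand = roi_align(fmap, [boxes], output_size=4, spatial_scale=0.25)
        cand = self.pool(cand).flatten(1)                       # (num_boxes, feat_dim)
        tgt = self.pool(self.backbone(target_img)).flatten(1)   # (1, feat_dim)
        pairs = torch.cat([tgt.expand(cand.size(0), -1), cand], dim=1)
        return self.head(pairs).squeeze(1)                      # one similarity per candidate

# usage: boxes are (x1, y1, x2, y2) in search-block pixel coordinates
net = SharedBackboneScorer()
scores = net(torch.rand(1, 3, 128, 128), torch.rand(1, 3, 64, 64),
             torch.tensor([[10., 10., 60., 90.], [40., 20., 100., 110.]]))
```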
  • Step S104 Determine a candidate target having the highest similarity with the tracking target among the plurality of candidate targets as the tracking target.
  • the candidate with the highest similarity can be used as the tracking target.
  • for example, if the candidate target 402 has the highest similarity with the initial tracking target 000, the candidate target 402 is taken as the tracking target and tracking continues.
  • the above description mainly takes the second frame image 400 as an example: for each candidate target in the second image block 420 of the second frame image 400, the similarity between that candidate target and the initial tracking target 000 is calculated, and the candidate target with the highest similarity is taken as the tracking target in the second frame image.
  • the same applies to subsequent frame images (for example, the third frame image, the fourth frame image, the fifth frame image, and so on): the similarity between each candidate target in the frame image and the initial tracking target 000 is calculated, and the candidate target with the highest similarity is taken as the tracking target in that frame image.
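  • Step S104 then reduces to an argmax over the per-candidate similarities, as in the small illustration below (the candidate identifiers and scores are made-up values):

```python
# similarities computed in step S103, keyed by candidate identifier (made-up values)
similarities = {"candidate_401": 0.31, "candidate_402": 0.87,
                "candidate_403": 0.12, "candidate_404": 0.44}

# S104: the candidate with the highest similarity becomes the tracking target
tracking_target = max(similarities, key=similarities.get)
print(tracking_target)  # -> candidate_402
```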
  • the target tracking method in the embodiments of the present invention can thus reliably determine whether the target has been lost; it does not need to maintain a tracking template, which avoids the error amplification caused by continuously updating such a template, makes it easier to recover the tracking target, and thereby improves the robustness of the tracking system.
  • the embodiment provides an electronic device, which has an image acquisition unit, and the image acquisition unit is configured to collect image data.
  • the electronic device includes:
  • the first determining unit 801 is configured to determine a tracking target in the initial frame image of the image data
  • the extracting unit 802 is configured to extract a plurality of candidate targets in the subsequent frame image of the image data, where the subsequent frame image is any frame image subsequent to the initial frame image;
  • the calculating unit 803 is configured to calculate a similarity between the candidate target and the tracking target;
  • the second determining unit 804 is configured to determine a candidate target that has the highest similarity with the tracking target among the plurality of candidate targets as the tracking target.
  • the first determining unit 801 includes:
  • a first determining subunit configured to acquire a user's selection operation after outputting the initial frame image through the display screen; determining a tracking target in the initial frame image based on the user's selection operation;
  • a second determining subunit configured to acquire feature information for describing the tracking target; and determining a tracking target in the initial frame image based on the feature information.
  • the extracting unit 802 includes:
  • a first determining subunit configured to determine an (i-1)th bounding frame of the tracking target in the (i-1)th frame image, wherein the (i-1)th frame image belongs to the image data, i is an integer greater than or equal to 2, and when i is equal to 2 the (i-1)th frame image is the initial frame image;
  • a second determining subunit configured to determine an ith image block in the ith frame image based on the (i-1)th bounding frame, wherein the ith frame image is a subsequent frame image, the center of the ith image block is the same as the center of the (i-1)th bounding frame, and the area of the ith image block is larger than the area of the (i-1)th bounding frame;
  • a third determining subunit configured to determine a plurality of candidate targets within the ith image block.
  • the calculating unit 803 includes:
  • a first selection sub-unit configured to select a first candidate target from the plurality of candidate targets, wherein the first candidate target is any one of the plurality of candidate targets;
  • a first calculation subunit configured to calculate a first color feature vector of the first candidate target and calculate a second color feature vector of the tracking target
  • a second calculating subunit configured to calculate a distance between the first color feature vector and the second color feature vector, wherein the distance is the similarity between the first candidate target and the tracking target.
  • the first computing subunit is further configured to:
  • the first computing subunit is further configured to:
  • W main colors are determined, W being a positive integer; the projection weight of each pixel in the first region of the first mask image on each of the main colors is calculated, the first region being any one of the M regions in the first mask image; the projection weight of each pixel in the second region of the second mask image on each of the main colors is calculated, the second region being any one of the M regions in the second mask image; based on the projection weights of each pixel in the first region on the main colors, a W-dimensional color feature vector corresponding to that pixel is obtained, and based on the projection weights of each pixel in the second region on the main colors, a W-dimensional color feature vector corresponding to that pixel is obtained; the W-dimensional color feature vector corresponding to each pixel in the first region is normalized to obtain the color feature vector of that pixel, and the W-dimensional color feature vector corresponding to each pixel in the second region is normalized to obtain the color feature vector of that pixel; the color feature vectors of all pixels in the first region are added up to obtain the color feature vector of the first region, and the color feature vectors of all pixels in the second region are added up to obtain the color feature vector of the second region.
  • the first calculating subunit is further configured to calculate a projection weight of the first pixel on each n main colors based on the following equation:
  • wherein the first pixel is any one pixel in the first region or the second region, the nth main color is any one of the W main colors, w_n is the projection weight of the first pixel on the nth main color, I_r, I_g, and I_b are the RGB values of the first pixel, and R_n, G_n, and B_n are the RGB values of the nth main color.
  • the calculating unit 803 includes:
  • a second selection subunit configured to select a first candidate target from the plurality of candidate targets, wherein the first candidate target is any one of the plurality of candidate targets;
  • a normalization subunit configured to normalize an image of the first candidate target and an image of the tracking target to the same size;
  • a first input subunit configured to input an image of the tracking target into a first convolution network of the first depth neural network for feature calculation, to obtain a feature vector of the tracking target, wherein the first depth neural network is based on the Siamese structure;
  • a second input subunit configured to input an image of the first candidate target into a second convolution network of the first depth neural network to perform feature calculation, to obtain a feature vector of the first candidate target;
  • a third input subunit configured to input the feature vector of the tracking target and the feature vector of the first candidate target into the first fully connected network of the first deep neural network for similarity calculation, to obtain the similarity between the first candidate target and the tracking target.
  • the third determining subunit is further configured to:
  • the ith image block is input into a third convolutional network of the second deep neural network to perform feature calculation, and the feature map of the ith image block is obtained, wherein the second deep neural network is based on the Siamese structure; the feature map of the ith image block is input into the RPN network of the second deep neural network, a plurality of candidate targets are extracted, and the feature vectors of the plurality of candidate targets are obtained.
  • the calculating unit 803 includes:
  • Extracting a subunit configured to extract a feature vector of the first candidate target from the feature vectors of the plurality of candidate targets, wherein the first candidate target is any one of the plurality of candidate targets;
  • a fourth input subunit configured to input an image of the tracking target into a fourth convolution network of the second depth neural network for feature calculation, to obtain a feature vector of the tracking target;
  • a fifth input subunit configured to input the feature vector of the tracking target and the feature vector of the first candidate target into the second fully connected network of the second deep neural network to perform similarity calculation, to obtain the similarity between the first candidate target and the tracking target.
  • the electronic device introduced in this embodiment is the electronic device used to implement the target tracking method in the embodiments of the present invention; therefore, based on the target tracking method introduced in the embodiments of the present invention, those skilled in the art can understand the specific implementation of the electronic device of this embodiment and its various variations, so how the electronic device implements the method is not described in detail here.
  • any electronic device used to implement the target tracking method in the embodiments of the present invention falls within the scope of the present invention.
  • in practical applications, the first determining unit 801, the extracting unit 802, the calculating unit 803, and the second determining unit 804 may all run on the electronic device, and may be implemented by a central processing unit (CPU), a micro processing unit (MPU), a digital signal processor (DSP), or a field programmable gate array (FPGA) located on the electronic device.
  • the candidate targets of each subsequent frame image are compared with the tracking target in the initial frame image, and the candidate target with the highest similarity among the candidate targets is determined as the tracking target, thereby tracking the target.
  • the processing of each frame after the initial frame can therefore be regarded as determining whether the target has been lost, which gives a reliable judgment of target loss; moreover, no tracking template needs to be maintained, which avoids the error amplification caused by continuously updating a tracking template, makes it easier to recover the tracking target, and thereby improves the robustness of the tracking system.
  • the electronic device includes: a processor and a memory for storing a computer program executable on the processor, wherein the processor is configured to perform the steps of the above method when running the computer program.
  • the memory may be implemented by any type of volatile or non-volatile storage device, or a combination thereof.
  • the non-volatile memory may be a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Ferromagnetic Random Access Memory (FRAM), a Flash Memory, a magnetic surface memory, an optical disc, or a Compact Disc Read-Only Memory (CD-ROM); the magnetic surface memory may be a disk memory or a tape memory.
  • the volatile memory can be a random access memory (RAM) that acts as an external cache.
  • RAM: Random Access Memory
  • SRAM: Static Random Access Memory
  • SSRAM: Synchronous Static Random Access Memory
  • DRAM: Dynamic Random Access Memory
  • SDRAM: Synchronous Dynamic Random Access Memory
  • DDRSDRAM: Double Data Rate Synchronous Dynamic Random Access Memory
  • ESDRAM: Enhanced Synchronous Dynamic Random Access Memory
  • SLDRAM: SyncLink Dynamic Random Access Memory
  • DRRAM: Direct Rambus Random Access Memory
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the above method may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
  • the above processor may be a general purpose processor, a digital signal processor (DSP), or other programmable logic device, discrete gate or transistor logic device, discrete hardware component, or the like.
  • the processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention.
  • a general purpose processor can be a microprocessor or any conventional processor or the like.
  • the steps of the method disclosed in the embodiment of the present invention may be directly implemented as a hardware decoding processor, or may be performed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a storage medium, the storage medium being located in the memory, the processor reading the information in the memory, and completing the steps of the foregoing methods in combination with the hardware thereof.
  • Embodiments of the present invention also provide a computer readable storage medium, for example a memory storing a computer program, where the computer program may be executed by a processor of the electronic device described above to complete the steps of the foregoing methods.
  • the computer readable storage medium may be a memory such as FRAM, ROM, programmable read only memory PROM, EPROM, EEPROM, Flash Memory, magnetic surface memory, optical disk, or CD-ROM; or may include one or any combination of the above memories.
  • Various equipment may be used to store data into a computer readable storage medium.
  • embodiments of the present invention can be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising the instruction device.
  • the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • the embodiments of the invention have the advantages of reliably determining whether the tracking target has been lost and of not needing to maintain a tracking template, which avoids the error amplification caused by continuously updating such a template, makes it easier to recover the tracking target, and thereby improves the robustness of the tracking system.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to a target tracking method, an electronic device, and a storage medium. The electronic device comprises an image acquisition unit configured to collect image data. The method is applied to the electronic device and comprises: determining, in an initial frame image of the image data, a target to be tracked (S101); extracting a plurality of candidate targets from a subsequent frame image of the image data, the subsequent frame image being any frame image following the initial frame image (S102); calculating the similarities between the candidate targets and the target to be tracked (S103); and determining, among the plurality of candidate targets, the candidate target having the highest similarity to the target to be tracked as the target to be tracked (S104). The method solves the prior-art technical problems that, in an online-learning visual tracking method, it is impossible to determine whether a tracking target has been lost and it is difficult to find the tracking target again after it is lost.
PCT/CN2017/110577 2016-11-11 2017-11-10 Procédé de suivi de cible, dispositif électronique et support d'informations WO2018086607A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611041675.6A CN106650630B (zh) 2016-11-11 2016-11-11 一种目标跟踪方法及电子设备
CN201611041675.6 2016-11-11

Publications (1)

Publication Number Publication Date
WO2018086607A1 true WO2018086607A1 (fr) 2018-05-17

Family

ID=58811573

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/110577 WO2018086607A1 (fr) 2016-11-11 2017-11-10 Procédé de suivi de cible, dispositif électronique et support d'informations

Country Status (2)

Country Link
CN (1) CN106650630B (fr)
WO (1) WO2018086607A1 (fr)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110335289A (zh) * 2019-06-13 2019-10-15 河海大学 一种基于在线学习的目标跟踪方法
CN110544268A (zh) * 2019-07-29 2019-12-06 燕山大学 一种基于结构光及SiamMask网络的多目标跟踪方法
CN110570460A (zh) * 2019-09-06 2019-12-13 腾讯云计算(北京)有限责任公司 目标跟踪方法、装置、计算机设备及计算机可读存储介质
CN110766720A (zh) * 2019-09-23 2020-02-07 盐城吉大智能终端产业研究院有限公司 一种基于深度学习的多摄像头车辆跟踪系统
CN110889718A (zh) * 2019-11-15 2020-03-17 腾讯科技(深圳)有限公司 方案筛选方法、方案筛选装置、介质以及电子设备
CN111105436A (zh) * 2018-10-26 2020-05-05 曜科智能科技(上海)有限公司 目标跟踪方法、计算机设备及存储介质
CN111428539A (zh) * 2019-01-09 2020-07-17 成都通甲优博科技有限责任公司 目标跟踪方法及装置
CN111598928A (zh) * 2020-05-22 2020-08-28 郑州轻工业大学 一种基于具有语义评估和区域建议的突变运动目标跟踪方法
CN111783878A (zh) * 2020-06-29 2020-10-16 北京百度网讯科技有限公司 目标检测方法、装置、电子设备以及可读存储介质
CN111814905A (zh) * 2020-07-23 2020-10-23 上海眼控科技股份有限公司 目标检测方法、装置、计算机设备和存储介质
CN111914890A (zh) * 2020-06-23 2020-11-10 北京迈格威科技有限公司 图像之间的图像块匹配方法、图像配准方法和产品
CN112037256A (zh) * 2020-08-17 2020-12-04 中电科新型智慧城市研究院有限公司 目标跟踪方法、装置、终端设备及计算机可读存储介质
US20210271892A1 (en) * 2019-04-26 2021-09-02 Tencent Technology (Shenzhen) Company Limited Action recognition method and apparatus, and human-machine interaction method and apparatus
CN113538507A (zh) * 2020-04-15 2021-10-22 南京大学 一种基于全卷积网络在线训练的单目标跟踪方法
CN114491131A (zh) * 2022-01-24 2022-05-13 北京至简墨奇科技有限公司 对候选图像进行重新排序的方法、装置和电子设备

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650630B (zh) * 2016-11-11 2019-08-23 纳恩博(北京)科技有限公司 一种目标跟踪方法及电子设备
CN107346413A (zh) * 2017-05-16 2017-11-14 北京建筑大学 一种街景影像中的交通标志识别方法和系统
CN109214238B (zh) * 2017-06-30 2022-06-28 阿波罗智能技术(北京)有限公司 多目标跟踪方法、装置、设备及存储介质
CN107168343B (zh) * 2017-07-14 2020-09-15 灵动科技(北京)有限公司 一种行李箱的控制方法及行李箱
CN107292284B (zh) * 2017-07-14 2020-02-28 成都通甲优博科技有限责任公司 目标重检测方法、装置及无人机
US10592786B2 (en) * 2017-08-14 2020-03-17 Huawei Technologies Co., Ltd. Generating labeled data for deep object tracking
CN107481265B (zh) * 2017-08-17 2020-05-19 成都通甲优博科技有限责任公司 目标重定位方法及装置
CN108230359B (zh) * 2017-11-12 2021-01-26 北京市商汤科技开发有限公司 目标检测方法和装置、训练方法、电子设备、程序和介质
CN108229456B (zh) * 2017-11-22 2021-05-18 深圳市商汤科技有限公司 目标跟踪方法和装置、电子设备、计算机存储介质
CN108171112B (zh) * 2017-12-01 2021-06-01 西安电子科技大学 基于卷积神经网络的车辆识别与跟踪方法
CN108133197B (zh) * 2018-01-05 2021-02-05 百度在线网络技术(北京)有限公司 用于生成信息的方法和装置
CN110163029B (zh) * 2018-02-11 2021-03-30 中兴飞流信息科技有限公司 一种图像识别方法、电子设备以及计算机可读存储介质
CN108416780B (zh) * 2018-03-27 2021-08-31 福州大学 一种基于孪生-感兴趣区域池化模型的物体检测与匹配方法
CN108491816A (zh) * 2018-03-30 2018-09-04 百度在线网络技术(北京)有限公司 在视频中进行目标跟踪的方法和装置
CN108665485B (zh) * 2018-04-16 2021-07-02 华中科技大学 一种基于相关滤波与孪生卷积网络融合的目标跟踪方法
CN108596957B (zh) * 2018-04-26 2022-07-22 北京小米移动软件有限公司 物体跟踪方法及装置
CN108898620B (zh) * 2018-06-14 2021-06-18 厦门大学 基于多重孪生神经网络与区域神经网络的目标跟踪方法
CN109118519A (zh) * 2018-07-26 2019-01-01 北京纵目安驰智能科技有限公司 基于实例分割的目标Re-ID方法、系统、终端和存储介质
CN109614907B (zh) * 2018-11-28 2022-04-19 安徽大学 基于特征强化引导卷积神经网络的行人再识别方法及装置
CN109685805B (zh) * 2019-01-09 2021-01-26 银河水滴科技(北京)有限公司 一种图像分割方法及装置
CN111428535A (zh) * 2019-01-09 2020-07-17 佳能株式会社 图像处理装置和方法及图像处理系统
CN111524159A (zh) * 2019-02-01 2020-08-11 北京京东尚科信息技术有限公司 图像处理方法和设备、存储介质和处理器
CN110147768B (zh) * 2019-05-22 2021-05-28 云南大学 一种目标跟踪方法及装置
CN112347817B (zh) * 2019-08-08 2022-05-17 魔门塔(苏州)科技有限公司 一种视频目标检测与跟踪方法及装置
CN112800811B (zh) * 2019-11-13 2023-10-13 深圳市优必选科技股份有限公司 一种色块追踪方法、装置及终端设备
CN111178284A (zh) * 2019-12-31 2020-05-19 珠海大横琴科技发展有限公司 基于地图数据的时空联合模型的行人重识别方法及系统
CN111524162B (zh) * 2020-04-15 2022-04-01 上海摩象网络科技有限公司 一种跟踪目标的找回方法、设备及手持相机
CN113273174A (zh) * 2020-09-23 2021-08-17 深圳市大疆创新科技有限公司 待跟随目标的确定方法、装置、系统、设备及存储介质

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090019149A1 (en) * 2005-08-02 2009-01-15 Mobixell Networks Content distribution and tracking
AU2011265494A1 (en) * 2011-12-22 2013-07-11 Canon Kabushiki Kaisha Kernalized contextual feature
CN103218798A (zh) * 2012-01-19 2013-07-24 索尼公司 图像处理设备和方法
CN103339655A (zh) * 2011-02-03 2013-10-02 株式会社理光 图像捕捉装置、图像捕捉方法及计算机程序产品
CN103679743A (zh) * 2012-09-06 2014-03-26 索尼公司 目标跟踪装置和方法,以及照相机
CN105184778A (zh) * 2015-08-25 2015-12-23 广州视源电子科技股份有限公司 一种检测方法及装置
CN106650630A (zh) * 2016-11-11 2017-05-10 纳恩博(北京)科技有限公司 一种目标跟踪方法及电子设备

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090019149A1 (en) * 2005-08-02 2009-01-15 Mobixell Networks Content distribution and tracking
CN103339655A (zh) * 2011-02-03 2013-10-02 株式会社理光 图像捕捉装置、图像捕捉方法及计算机程序产品
AU2011265494A1 (en) * 2011-12-22 2013-07-11 Canon Kabushiki Kaisha Kernalized contextual feature
CN103218798A (zh) * 2012-01-19 2013-07-24 索尼公司 图像处理设备和方法
CN103679743A (zh) * 2012-09-06 2014-03-26 索尼公司 目标跟踪装置和方法,以及照相机
CN105184778A (zh) * 2015-08-25 2015-12-23 广州视源电子科技股份有限公司 一种检测方法及装置
CN106650630A (zh) * 2016-11-11 2017-05-10 纳恩博(北京)科技有限公司 一种目标跟踪方法及电子设备

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111105436A (zh) * 2018-10-26 2020-05-05 曜科智能科技(上海)有限公司 目标跟踪方法、计算机设备及存储介质
CN111105436B (zh) * 2018-10-26 2023-05-09 曜科智能科技(上海)有限公司 目标跟踪方法、计算机设备及存储介质
CN111428539A (zh) * 2019-01-09 2020-07-17 成都通甲优博科技有限责任公司 目标跟踪方法及装置
US20210271892A1 (en) * 2019-04-26 2021-09-02 Tencent Technology (Shenzhen) Company Limited Action recognition method and apparatus, and human-machine interaction method and apparatus
US11710351B2 (en) * 2019-04-26 2023-07-25 Tencent Technology (Shenzhen) Company Limited Action recognition method and apparatus, and human-machine interaction method and apparatus
CN110335289A (zh) * 2019-06-13 2019-10-15 河海大学 一种基于在线学习的目标跟踪方法
CN110335289B (zh) * 2019-06-13 2022-08-05 河海大学 一种基于在线学习的目标跟踪方法
CN110544268A (zh) * 2019-07-29 2019-12-06 燕山大学 一种基于结构光及SiamMask网络的多目标跟踪方法
CN110544268B (zh) * 2019-07-29 2023-03-24 燕山大学 一种基于结构光及SiamMask网络的多目标跟踪方法
CN110570460A (zh) * 2019-09-06 2019-12-13 腾讯云计算(北京)有限责任公司 目标跟踪方法、装置、计算机设备及计算机可读存储介质
CN110570460B (zh) * 2019-09-06 2024-02-13 腾讯云计算(北京)有限责任公司 目标跟踪方法、装置、计算机设备及计算机可读存储介质
CN110766720A (zh) * 2019-09-23 2020-02-07 盐城吉大智能终端产业研究院有限公司 一种基于深度学习的多摄像头车辆跟踪系统
CN110889718A (zh) * 2019-11-15 2020-03-17 腾讯科技(深圳)有限公司 方案筛选方法、方案筛选装置、介质以及电子设备
CN110889718B (zh) * 2019-11-15 2024-05-14 腾讯科技(深圳)有限公司 方案筛选方法、方案筛选装置、介质以及电子设备
CN113538507B (zh) * 2020-04-15 2023-11-17 南京大学 一种基于全卷积网络在线训练的单目标跟踪方法
CN113538507A (zh) * 2020-04-15 2021-10-22 南京大学 一种基于全卷积网络在线训练的单目标跟踪方法
CN111598928B (zh) * 2020-05-22 2023-03-10 郑州轻工业大学 一种基于具有语义评估和区域建议的突变运动目标跟踪方法
CN111598928A (zh) * 2020-05-22 2020-08-28 郑州轻工业大学 一种基于具有语义评估和区域建议的突变运动目标跟踪方法
CN111914890B (zh) * 2020-06-23 2024-05-14 北京迈格威科技有限公司 图像之间的图像块匹配方法、图像配准方法和产品
CN111914890A (zh) * 2020-06-23 2020-11-10 北京迈格威科技有限公司 图像之间的图像块匹配方法、图像配准方法和产品
CN111783878A (zh) * 2020-06-29 2020-10-16 北京百度网讯科技有限公司 目标检测方法、装置、电子设备以及可读存储介质
CN111783878B (zh) * 2020-06-29 2023-08-04 北京百度网讯科技有限公司 目标检测方法、装置、电子设备以及可读存储介质
CN111814905A (zh) * 2020-07-23 2020-10-23 上海眼控科技股份有限公司 目标检测方法、装置、计算机设备和存储介质
CN112037256A (zh) * 2020-08-17 2020-12-04 中电科新型智慧城市研究院有限公司 目标跟踪方法、装置、终端设备及计算机可读存储介质
CN114491131B (zh) * 2022-01-24 2023-04-18 北京至简墨奇科技有限公司 对候选图像进行重新排序的方法、装置和电子设备
CN114491131A (zh) * 2022-01-24 2022-05-13 北京至简墨奇科技有限公司 对候选图像进行重新排序的方法、装置和电子设备

Also Published As

Publication number Publication date
CN106650630B (zh) 2019-08-23
CN106650630A (zh) 2017-05-10

Similar Documents

Publication Publication Date Title
WO2018086607A1 (fr) Procédé de suivi de cible, dispositif électronique et support d'informations
Wang et al. Detect globally, refine locally: A novel approach to saliency detection
Liu et al. Joint face alignment and 3d face reconstruction
US10776936B2 (en) Point cloud matching method
WO2022134337A1 (fr) Procédé et système de détection d'occlusion de visage, dispositif et support d'enregistrement
CN110909651B (zh) 视频主体人物的识别方法、装置、设备及可读存储介质
US11189020B2 (en) Systems and methods for keypoint detection
CN106203242B (zh) 一种相似图像识别方法及设备
CN109960742B (zh) 局部信息的搜索方法及装置
TWI470563B (zh) 偵測影像中臉部屬性之方法、執行影像分析處理之處理系統及電腦可讀取儲存媒體
EP4099221A1 (fr) Procédé et appareil de reconnaissance faciale
US20160275339A1 (en) System and Method for Detecting and Tracking Facial Features In Images
Ishikura et al. Saliency detection based on multiscale extrema of local perceptual color differences
CN109271930B (zh) 微表情识别方法、装置与存储介质
CN107316029B (zh) 一种活体验证方法及设备
US10489636B2 (en) Lip movement capturing method and device, and storage medium
WO2021137946A1 (fr) Détection de falsification d'image faciale
JP2005327076A (ja) パラメタ推定方法、パラメタ推定装置および照合方法
CN114155365B (zh) 模型训练方法、图像处理方法及相关装置
US20150131873A1 (en) Exemplar-based feature weighting
JP6756406B2 (ja) 画像処理装置、画像処理方法および画像処理プログラム
JP2011508323A (ja) 不変の視覚場面及び物体の認識
CN111091075A (zh) 人脸识别方法、装置、电子设备及存储介质
JP2021503139A (ja) 画像処理装置、画像処理方法および画像処理プログラム
CN107862680A (zh) 一种基于相关滤波器的目标跟踪优化方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17870266

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17870266

Country of ref document: EP

Kind code of ref document: A1