CN110378264A - Target tracking method and apparatus - Google Patents

Target tracking method and apparatus

Info

Publication number
CN110378264A
CN110378264A (application CN201910611097.2A)
Authority
CN
China
Prior art keywords
detected
point data
image
tracking target
key point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910611097.2A
Other languages
Chinese (zh)
Other versions
CN110378264B (en)
Inventor
卓世杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd filed Critical Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN201910611097.2A priority Critical patent/CN110378264B/en
Publication of CN110378264A publication Critical patent/CN110378264A/en
Application granted granted Critical
Publication of CN110378264B publication Critical patent/CN110378264B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; scene-specific elements
    • G06V20/40 - Scenes; scene-specific elements in video content
    • G06V20/46 - Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G06V2201/00 - Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 - Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to the technical field of image processing, and in particular to a target tracking method, a target tracking apparatus, a computer-readable medium and an electronic device. The method includes: obtaining a video to be tracked, and performing target detection on the video to obtain a key frame image containing the tracking target; performing image recognition on the key frame image to obtain an object region containing the tracking target, and performing key point extraction on the object region to obtain key point data of the tracking target; extracting the to-be-detected image of the frame following the key frame, and performing feature extraction on the to-be-detected image to obtain a second feature map of the to-be-detected image; inputting the second feature map and the key point data into a prediction model as input parameters, so as to obtain, for each key point, the corresponding predicted key point data in the to-be-detected image; and determining the tracking target in the to-be-detected image according to the predicted key point data.

Description

Target tracking method and apparatus
Technical field
The present disclosure relates to the technical field of image processing, and in particular to a target tracking method, a target tracking apparatus, a computer-readable medium and an electronic device.
Background technique
Target tracking is one of the research hot spots in the field of computer vision and has been widely applied in many fields. In general, target tracking means establishing the positional relationship of the tracked object across a continuous video sequence, so as to obtain the complete motion trajectory of the object.
In the prior art, optical flow methods and deep learning techniques are widely used in target tracking. In target tracking schemes based on deep learning, a network model usually needs to be trained in advance; when tracking is performed, the features learned by the network model are applied directly inside a correlation-filtering tracking framework, so as to obtain better tracking results. However, this also increases the amount of computation, which in turn makes online real-time tracking difficult. In addition, tracking with an optical flow method relies on three strong assumptions: constant brightness, temporal continuity (i.e. small motion displacement), and spatial consistency. In practical applications, a large number of scenes cannot satisfy these requirements.
It should be noted that the information disclosed in the above background section is only intended to enhance the understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The purpose of the present disclosure is to provide a target tracking method, a target tracking apparatus, a computer-readable medium and an electronic device, so as to provide a target tracking scheme with a small amount of computation and low requirements on the environment of the tracking target, thereby overcoming, at least to a certain extent, the limitations and defects of the related art.
Other characteristics and advantages of the present disclosure will become apparent from the following detailed description, or will be learned in part through practice of the present disclosure.
According to a first aspect of the present disclosure, a target tracking method is provided, including:
obtaining a video to be tracked, and performing target detection on the video to obtain a key frame image containing a tracking target;
performing image recognition on the key frame image to obtain an object region containing the tracking target, and performing key point extraction on the object region to obtain key point data of the tracking target;
extracting the to-be-detected image of the frame following the key frame, and performing feature extraction on the to-be-detected image to obtain a second feature map of the to-be-detected image;
inputting the second feature map and the key point data into a prediction model as input parameters, so as to obtain, for each key point, the corresponding predicted key point data in the to-be-detected image; and
determining the tracking target in the to-be-detected image according to the predicted key point data.
According to a second aspect of the present disclosure, a target tracking apparatus is provided, including:
a key frame identification module, configured to obtain a video to be tracked and perform target detection on the video to obtain a key frame image containing a tracking target;
a key point data calculation module, configured to perform image recognition on the key frame image to obtain an object region containing the tracking target, and to perform key point extraction on the object region to obtain key point data of the tracking target;
a second feature information calculation module, configured to extract the to-be-detected image of the frame following the key frame, and to perform feature extraction on the to-be-detected image to obtain second feature information of the to-be-detected image;
a predicted key point calculation module, configured to input the second feature information and the key point data into a trained prediction model as input parameters, so as to obtain predicted key point data of the key points in the to-be-detected image; and
a tracking target acquisition module, configured to determine the tracking target in the to-be-detected image according to the predicted key point data.
According to a third aspect of the present disclosure, a computer-readable medium is provided, on which a computer program is stored, the computer program implementing the above target tracking method when executed by a processor.
According to a fourth aspect of the present disclosure, an electronic device is provided, including:
one or more processors; and
a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the above target tracking method.
In the target tracking method provided by an embodiment of the present disclosure, a key frame image containing the tracking target is first determined in the video to be tracked; the key frame image is then recognized, the key point data of the tracking target is extracted, and the second feature map of the adjacent next to-be-detected frame is extracted. The positions of the key points in the to-be-detected image are then predicted from the key point data and the second feature map, and the exact position of the tracking target in the to-be-detected image is described by the predicted key point data, thereby achieving continuous tracking of the target. By using the key point data of the key frame image together with the feature map of the to-be-detected image to predict the movement trend of the tracking target, tracking is maintained without using the global features of the target, which effectively reduces the amount of computation; moreover, the key point data of the tracking target can be learned adaptively, effectively improving tracking efficiency.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Detailed description of the invention
The drawings herein are incorporated into and form part of this specification, show embodiments consistent with the present disclosure, and together with the specification serve to explain the principles of the present disclosure. Evidently, the drawings described below are only some embodiments of the present disclosure; a person of ordinary skill in the art may obtain other drawings from them without creative effort.
Fig. 1 schematically shows a flow diagram of a target tracking method in an exemplary embodiment of the present disclosure;
Fig. 2 schematically shows a flow diagram of a method for obtaining predicted key point data in an exemplary embodiment of the present disclosure;
Fig. 3 schematically shows a stacked hourglass network structure in an exemplary embodiment of the present disclosure;
Fig. 4 schematically shows the composition of a target tracking apparatus in an exemplary embodiment of the present disclosure;
Fig. 5 schematically shows the structure of the computer system of an electronic device in an exemplary embodiment of the present disclosure.
Specific embodiment
Example embodiments are described more fully below with reference to the drawings. However, example embodiments can be implemented in a variety of forms and should not be understood as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be more thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures or characteristics may be combined in any suitable manner in one or more embodiments.
In addition, the drawings are only schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, so their repeated description will be omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different network and/or processor devices and/or microcontroller devices.
In existing tracking schemes based on deep learning, the early approach was to apply the features learned by the network directly inside a correlation-filtering tracking framework in order to obtain better tracking results. Convolution essentially yields a better feature representation, which is one of the advantages of deep learning, but it also increases the amount of computation. Many existing tracking frameworks and methods compare two kinds of features at the same time to verify the improvement of the tracker or framework: traditional hand-crafted features, and features learned by a deep network. But no matter how the method or framework is improved, the tracking task is ultimately still realized on the basis of target detection capability. When the target tracking task is completed on the basis of target detection, the amount of computation is large and online real-time tracking is difficult to achieve. Traditional optical flow methods, in turn, require three strong assumptions: constant brightness, temporal continuity with small motion displacement, and spatial consistency; in reality, a large number of scenes cannot satisfy these three assumptions.
In view of the above shortcomings and deficiencies of the prior art, this example embodiment provides a target tracking method that can be applied to online real-time tracking of moving objects in complex scenes. Referring to Fig. 1, the target tracking method may include the following steps:
S11: obtaining a video to be tracked, and performing target detection on the video to obtain a key frame image containing the tracking target;
S12: performing image recognition on the key frame image to obtain an object region containing the tracking target, and performing key point extraction on the object region to obtain key point data of the tracking target;
S13: extracting the to-be-detected image of the frame following the key frame, and performing feature extraction on the to-be-detected image to obtain a second feature map of the to-be-detected image;
S14: inputting the second feature map and the key point data into a prediction model as input parameters, so as to obtain, for each key point, the corresponding predicted key point data in the to-be-detected image;
S15: determining the tracking target in the to-be-detected image according to the predicted key point data.
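The five steps above can be sketched as a minimal loop. Every callable here (`detect_key_frame`, `extract_key_points`, `extract_features`, `predict`) is a hypothetical stand-in for the detection, key point extraction and prediction models of the disclosure, not code from it:

```python
def track(frames, detect_key_frame, extract_key_points, extract_features, predict):
    """Sketch of steps S11-S15: find a key frame, then track key points frame by frame."""
    # S11: the first frame containing the tracking target becomes the key frame
    key_idx = next(i for i, f in enumerate(frames) if detect_key_frame(f))
    # S12: extract key point data from the object region of the key frame
    key_points = extract_key_points(frames[key_idx])
    results = []
    for frame in frames[key_idx + 1:]:
        feature_map = extract_features(frame)          # S13: second feature map
        key_points = predict(feature_map, key_points)  # S14: predicted key point data
        results.append(key_points)                     # S15: target located via key points
    return results
```

Note that the predicted key points of each frame become the input for the next frame, which is what makes the tracking continuous.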
In the target tracking method provided by this example embodiment, on the one hand, the movement trend of the tracking target in the to-be-detected image is predicted from the key point data of the key frame image and the feature map of the to-be-detected image, so tracking is maintained without using the global features of the target, which effectively reduces the amount of computation; on the other hand, the key point data of the tracking target can be learned adaptively, effectively improving tracking efficiency.
Each step of the target tracking method in this example embodiment is described in more detail below with reference to the drawings and embodiments.
Step S11: a video to be tracked is obtained, and target detection is performed on the video to obtain a key frame image containing the tracking target.
In the present exemplary embodiment, the video to be tracked may be captured directly by a camera device such as a surveillance camera or video camera, or may be video data received over a wired or wireless network. After the video is obtained, it can be cut into a sequence of consecutive frame images. When tracking for the first time, target detection can be performed on each frame to determine the key frame image containing the tracking target. For example, the key frame image may be determined by manual selection. Alternatively, a target recognition algorithm may be used to extract the key frame image, for example a deep-learning-based recognition algorithm, or a target detection algorithm based on the SSD (Single Shot MultiBox Detector) framework, applied to each frame of image data so as to determine the key frame image. The tracking target may be, for example, a car, an unmanned aerial vehicle, a person or an animal.
Step S12: image recognition is performed on the key frame image to obtain an object region containing the tracking target, and key point extraction is performed on the object region to obtain key point data of the tracking target.
In the present exemplary embodiment, after the key frame image is determined, it can be processed further. Specifically, image recognition can be performed on the key frame image, and the object region where the tracking target is located can be selected in the key frame image with a rectangular box. After the object region is extracted, key point data can be extracted from it. Furthermore, the key point data can be converted into a corresponding key point heat map, which is saved for predicting the tracking target in subsequent to-be-detected images. For example, a convolutional neural network model based on a Stacked Hourglass Network can be used to extract the key point data from the object region, so that the model outputs several pieces of key point information usable for target tracking.
Specifically, the stacked hourglass network structure can be composed of multiple trained hourglass networks connected in series, the output of one being the input of the next. Inside each hourglass network, in order to capture information at every scale, the size of the input is reduced by down-sampling; when the minimum resolution is reached, the network begins up-sampling and fusing features across scales, adding the features saved before down-sampling to the features at the same scale via residual connections, thereby capturing multi-scale information.
For example, the stacked hourglass network structure shown in Fig. 3 includes a first-stage hourglass network 301 and a second-stage hourglass network 302. The object region serves as the input parameter N1 of the first-stage hourglass network 301; the output parameter N2 of the first-stage network may include the feature information extracted for the object region, and may also include the heat map corresponding to the object region. After a further convolution, the output parameter N2 of the first-stage hourglass network serves as the input parameter of the second-stage hourglass network 302, which finally outputs the key point feature information and heat map. For example, when the tracking target is an unmanned aerial vehicle, the key point information may be the key points describing the outline and main features of the vehicle, and may be expressed in the form of a heat map.
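The encode/decode pattern of a single hourglass (down-sample to the coarsest scale, recurse, up-sample, add the residual skip at each scale) can be illustrated on a 1-D signal. This is a toy sketch of the structure only, not the actual convolutional network of the disclosure:

```python
def hourglass(signal, depth):
    """Toy 1-D hourglass: halve the resolution `depth` times, then restore it,
    adding back the pre-downsampling signal at each scale (residual skip)."""
    if depth == 0 or len(signal) < 2:
        return list(signal)
    skip = list(signal)                          # feature kept before down-sampling
    down = [(signal[i] + signal[i + 1]) / 2      # down-sample by averaging pairs
            for i in range(0, len(signal) - 1, 2)]
    inner = hourglass(down, depth - 1)           # recurse at the coarser scale
    up = [v for v in inner for _ in (0, 1)]      # nearest-neighbour up-sampling
    up += [up[-1]] * (len(skip) - len(up))       # pad if the length was odd
    return [a + b for a, b in zip(up, skip)]     # residual addition at this scale

def stacked_hourglass(signal, stages=2, depth=2):
    """Stack hourglasses in series: each stage's output is the next stage's input."""
    for _ in range(stages):
        signal = hourglass(signal, depth)
    return signal
```

The two-stage stacking mirrors networks 301 and 302 in Fig. 3: the signal passes through one complete hourglass before entering the next.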
In other exemplary embodiments of the present disclosure, after the object region containing the tracking target is obtained, image segmentation may also be performed on the object region to separate, inside the rectangular box, the foreground image of the tracking target from the background image. The object region then contains a clean image of only the tracking target, so that more accurate key point data of the tracking target can be obtained during key point extraction.
Step S13: the to-be-detected image of the frame following the key frame is extracted, and feature extraction is performed on the to-be-detected image to obtain a second feature map of the to-be-detected image.
In the present exemplary embodiment, while key point data is being extracted from the key frame image, the to-be-detected image of the adjacent next frame can also be processed. Specifically, feature extraction can be performed on the to-be-detected image to generate the corresponding second feature map.
For example, a convolutional neural network based on the MobileNetV3 structure can be used to perform feature extraction on the to-be-detected image. Specifically, after the to-be-detected image is fed into the MobileNetV3-based convolutional neural network model, the convolutional layer, BN (Batch Normalization) layer and h-swish activation layer of the initial part of the model successively perform convolution, batch normalization and activation on the input image to obtain a first intermediate result. The first intermediate result is then fed into the middle part of the model, whose convolutional and expansion layers perform convolution and expansion on it to obtain a second intermediate result. The second intermediate result is finally fed into the last part of the model, where a further convolutional layer produces the second feature map of the to-be-detected image. In the initial, middle and last parts described above, each convolutional layer may have a different convolution kernel and a specified stride.
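The h-swish activation mentioned above has a simple closed form, x * ReLU6(x + 3) / 6. A plain-Python sketch of the scalar function (inside MobileNetV3 it is applied element-wise):

```python
def relu6(x):
    """ReLU capped at 6: min(max(x, 0), 6)."""
    return min(max(x, 0.0), 6.0)

def h_swish(x):
    """Hard swish used by MobileNetV3: x * ReLU6(x + 3) / 6.
    A piecewise approximation of x * sigmoid(x) that is cheap on mobile hardware."""
    return x * relu6(x + 3.0) / 6.0
```

For x at or below -3 the output is 0, for x at or above 3 the output equals x, and in between the function ramps smoothly, which is why it can replace the more expensive swish activation on mobile devices.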
Step S14: the second feature map and the key point data are input into the prediction model as input parameters, so as to obtain, for each key point, the corresponding predicted key point data in the to-be-detected image.
In the present exemplary embodiment, after the key point heat map corresponding to the key point data of the key frame image and the second feature map corresponding to the to-be-detected image are obtained, the above step S14 may specifically include:
Step S1411: merging the key point heat map and the second feature map along the channel dimension to obtain a merged feature image;
Step S1412: inputting the merged feature image into a trained prediction model based on the stacked hourglass network structure, so as to obtain the predicted key point data of the to-be-detected image.
For example, the prediction model based on the stacked hourglass network structure may be a convolutional neural network model based on that structure. Specifically, the stacked hourglass network structure may be composed of multiple trained hourglass networks, in which the output of the previous hourglass network serves as the input of the next. For example, the network structure shown in Fig. 3 and described in the above embodiment may be used. Furthermore, a transfer source layer may be arranged between the first-stage hourglass network 301 and the second-stage hourglass network 302. Using a geometric transformation kernel, this layer applies a relative position transformation to the positions in the key point heat map of the key frame image output by the first-stage hourglass network 301, so as to obtain the key point positions in the second feature map.
By merging the key point heat map and the second feature map along the channel dimension and then applying the Stacked-Hourglass-based convolutional neural network, the network can find the changed positions of the key points, predict the motion range of each key point in the next frame, and output the predicted key point data in the form of a heat map.
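Channel-wise merging simply stacks the heat-map channels together with the feature-map channels, leaving height and width untouched. A minimal sketch on nested lists laid out channels x H x W; the shapes K, C, H, W are made up for illustration:

```python
def merge_channels(*tensors):
    """Concatenate CHW tensors along the channel axis.
    All inputs must share the same spatial size (H, W)."""
    h, w = len(tensors[0][0]), len(tensors[0][0][0])
    for t in tensors:
        assert len(t[0]) == h and len(t[0][0]) == w, "spatial sizes must match"
    merged = []
    for t in tensors:
        merged.extend(t)      # channels are simply appended in order
    return merged

# Hypothetical shapes: a K-channel key point heat map and a C-channel feature map
K, C, H, W = 4, 16, 8, 8
heatmap  = [[[0.0] * W for _ in range(H)] for _ in range(K)]
features = [[[0.0] * W for _ in range(H)] for _ in range(C)]
merged = merge_channels(heatmap, features)   # (K + C) channels, each H x W
```

In a deep-learning framework this is a single concatenation call along the channel dimension; the sketch only makes explicit that no pixel values are mixed, only the channel stacks are joined.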
It,, can be with benefit after obtaining key frame images in other exemplary embodiments of the disclosure based on above content Feature extraction is carried out to key frame images with the convolutional neural networks model based on Mobile-netV3 (mobile network V3) structure, To obtain the corresponding fisrt feature figure of key frame images, and join the fisrt feature figure as the input of above-mentioned prediction model Number.Specifically, refering to what is shown in Fig. 2, may comprise steps of:
Step S1421 merges the key point thermal map, fisrt feature figure and second feature figure based on pixel access Merge characteristic image to obtain;
Step S1422, the prediction mould based on stacking hourglass network structure that the merging characteristic image input has been trained Type, to obtain the prediction key point data of image to be detected.
By can effectively examine fisrt feature figure, second feature figure and key point thermal map while input prediction model The motion profile for considering other characteristic points in key frame images can refer to when carrying out crucial point prediction in image to be detected The motion profile and the direction of motion of other characteristic points, and then can more accurately predict other in key point and key frame images Motion range of the characteristic point in image to be detected, the further accuracy for improving crucial point prediction.
Certainly, in other exemplary embodiments of the disclosure, when carrying out feature identification to image, it also can use it His model or algorithm obtains the corresponding fisrt feature figure of key frame images and the corresponding second feature of image to be detected Figure, the disclosure do not do particular determination to this.
Step S15: the tracking target in the to-be-detected image is determined according to the predicted key point data.
In the present exemplary embodiment, after the heat map corresponding to the predicted key point data is obtained, a bounding box can be calculated from it, and the bounding box result taken as the tracking target, thereby obtaining the position of the tracking target in the to-be-detected image. When target tracking is performed on the to-be-detected image of the frame after this one, the key point data corresponding to the above key frame image can still be used for prediction and calculation. For example, the minimum rectangular region containing all key points can be used as the bounding box.
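Taking the minimum rectangle containing all predicted key points as the bounding box is a direct computation over the key point coordinates. A minimal sketch, with key points represented as (x, y) pairs:

```python
def bounding_box(key_points):
    """Axis-aligned minimum rectangle (x_min, y_min, x_max, y_max)
    enclosing all predicted key points."""
    xs = [x for x, _ in key_points]
    ys = [y for _, y in key_points]
    return (min(xs), min(ys), max(xs), max(ys))
```

In practice the coordinates would first be recovered from the predicted heat map, typically by taking each channel's peak location.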
Based on the above, the method may further include:
Step S21: when the number of to-be-detected frames that have been continuously tracked exceeds a preset threshold, obtaining the current key point data of the tracking target in the current to-be-detected image;
Step S22: matching the current key point data with the key point data, and, when the proportion of changed data in the matching result exceeds a preset threshold, updating the key point data of the tracking target according to the current key point data.
In the present exemplary embodiment, a detection cycle for key frame images can also be set during tracking, for example 10, 20 or 50 frames per cycle. For example, when the detection cycle is 20 frames, target tracking starts after the key frame image is determined; this key frame image is the 1st frame, and tracking succeeds in the subsequent consecutive images. If the current to-be-detected image is the 21st frame, it can be recognized according to the method of step S12 above, so as to obtain the current key point data of the tracking target in the current to-be-detected image, and a corresponding heat map can be generated from the current key point data. The current key point data is then compared with the above key point data, for example by comparing heat maps. If the key point data has changed considerably, for instance if the change in the number of key points exceeds a preset threshold, or the displacement of the key point positions exceeds a preset threshold, the current key point data can be taken as the new key point data, i.e. the current to-be-detected image becomes the new key frame image. The key point data is thus updated, guaranteeing its validity throughout tracking.
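The update rule of steps S21 and S22 (re-extract key points every detection cycle and adopt them as the new reference when they differ too much) can be sketched as follows. The matching here is a simple per-point displacement comparison, and both thresholds are illustrative values, not figures taken from the disclosure:

```python
def should_update(reference, current, move_thresh=5.0, change_ratio_thresh=0.5):
    """Return True when the current key points differ enough from the reference
    that the current frame should become the new key frame."""
    if len(reference) != len(current):
        return True                       # the number of key points changed
    changed = sum(
        1 for (rx, ry), (cx, cy) in zip(reference, current)
        if ((rx - cx) ** 2 + (ry - cy) ** 2) ** 0.5 > move_thresh
    )
    return changed / len(reference) > change_ratio_thresh
```

When `should_update` returns True, the caller replaces the stored key point data (and heat map) with the current one, exactly as the paragraph above describes.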
Based on the above, in other exemplary embodiments of the present disclosure, if the tracking target is not detected in the current to-be-detected image during tracking, the key point data of the preceding key frame image is still used to identify the tracking target in the to-be-detected image of the next frame. If the tracking target is not detected in n consecutive to-be-detected images, i.e. the track is lost, key frame detection is performed again at the (n+1)-th frame, so as to guarantee the continuity of the tracking process, where n is a positive integer.
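This re-detection rule, keep using the previous key frame's data for up to n missed frames and then fall back to full key frame detection, amounts to a small state machine. A sketch with hypothetical callables for the two detection paths:

```python
class Tracker:
    """Falls back to key frame re-detection after n consecutive missed frames."""
    def __init__(self, n, track_with_keypoints, detect_key_frame):
        self.n = n
        self.missed = 0
        self.track = track_with_keypoints   # returns key points, or None on a miss
        self.redetect = detect_key_frame    # full key frame detection
        self.key_points = None

    def step(self, frame):
        if self.missed >= self.n:           # track lost: re-run key frame detection
            self.key_points = self.redetect(frame)
            if self.key_points:
                self.missed = 0
            return self.key_points
        result = self.track(frame, self.key_points)
        if result is None:
            self.missed += 1                # miss: keep the old key point data
        else:
            self.missed = 0
            self.key_points = result
        return result
```

The miss counter resets whenever tracking or re-detection succeeds, so brief occlusions do not trigger a full re-detection.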
The method provided by the embodiments of the present disclosure can run on the terminal side where the user is located: for example, the tracking video is obtained through an external camera device or the Internet, and the above method is executed on the terminal, realizing real-time tracking of the target. Alternatively, it can run on the server side: after receiving the tracking video, the server executes the above method on it to obtain the tracking target, and then sends the tracking result to the user terminal.
In the method provided by the embodiments of the present disclosure, the key frame image containing the tracking target is first determined, and the key frame image is then processed to obtain the key point data corresponding to the tracking target and the feature map corresponding to the key frame image. The motion range, direction and position of the key points in the to-be-detected image are then predicted from the key point data and the feature map, and output in the form of a heat map, thereby successfully tracking the target. By predicting the movement trend of the key points, the search range in the next to-be-detected frame is narrowed, which effectively reduces the amount of computation, increases speed, and thus improves the efficiency of the tracking task.
It should be noted that the above drawings are only schematic illustrations of the processing included in the method according to the exemplary embodiments of the present invention, and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the temporal order of these processes. It is also easy to understand that these processes may be executed, for example, synchronously or asynchronously in multiple modules.
Further, as shown in Fig. 4, this exemplary embodiment also provides a target tracking apparatus 40, including: a key frame identification module 401, a key point data computation module 402, a second feature information computation module 403, a prediction key point computation module 404, and a tracking target acquisition module 405. Specifically:
The key frame identification module 401 may be configured to obtain a video to be tracked and to perform target detection on the video to be tracked to obtain a key frame image containing the tracking target.
The key point data computation module 402 may be configured to perform image recognition on the key frame image to obtain an object region containing the tracking target, and to perform feature extraction on the object region to obtain the key point data of the tracking target.
The second feature information computation module 403 may be configured to extract the next image to be detected adjacent to the key frame and to perform feature extraction on the image to be detected to obtain the second feature information of the image to be detected.
The prediction key point computation module 404 may be configured to input the second feature information and the key point information as input parameters into a trained prediction model to obtain the predicted key point data of the key points in the image to be detected.
The tracking target acquisition module 405 may be configured to determine the tracking target in the image to be detected according to the predicted key point data.
In an example of the present disclosure, the above apparatus may further include a heat map conversion module (not shown in the figure).
The heat map conversion module may be configured to generate a corresponding key point heat map according to the key point data, and to use the key point heat map as an input parameter.
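The conversion from key point data to a key point heat map is commonly rendered as one Gaussian channel per point. The patent does not fix the exact formula, so the Gaussian form and `sigma` below are assumptions for illustration.

```python
import numpy as np

def keypoint_heatmap(points, size, sigma=2.0):
    """Render one Gaussian heat map channel per key point (a common choice;
    the exact rendering is not specified in the patent).
    points: iterable of (x, y); size: (height, width)."""
    h, w = size
    ys, xs = np.mgrid[0:h, 0:w]
    maps = np.zeros((len(points), h, w), dtype=np.float32)
    for i, (x, y) in enumerate(points):
        maps[i] = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2 * sigma ** 2))
    return maps
```

The peak of each channel sits at its key point, which makes the heat map directly usable as a spatial input parameter for the prediction model.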
In an example of the present disclosure, when the input parameters are the second feature map and the key point heat map, the above prediction key point computation module 404 may include a first merging module and a first computation module (not shown in the figure). Specifically:
The first merging module may be configured to merge the key point heat map and the second feature map based on pixel channels to obtain a merged feature image.
The first computation module may be configured to input the merged feature image into a trained prediction model based on a stacked hourglass network structure to obtain the predicted key point data of the image to be detected.
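The pixel-channel merge performed by the merging module amounts to a concatenation along the channel axis before the tensor enters the stacked-hourglass prediction model. The channel-first layout and shapes below are illustrative assumptions.

```python
import numpy as np

def merge_channels(keypoint_heatmap, feature_map):
    """Pixel-channel merge: stack the key point heat map (K, H, W) and the
    second feature map (C, H, W) along the channel axis, yielding a
    (K + C, H, W) merged feature image."""
    assert keypoint_heatmap.shape[1:] == feature_map.shape[1:], \
        "heat map and feature map must share spatial dimensions"
    return np.concatenate([keypoint_heatmap, feature_map], axis=0)
```

The same operation extends to the three-input variant (key point heat map, first feature map, and second feature map) by concatenating all three arrays.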
In an example of the present disclosure, the above apparatus may further include a first feature map computation module (not shown in the figure).
The first feature map computation module may be configured to perform feature extraction on the key frame image to obtain the first feature map of the key frame image, and to use the first feature map as an input parameter of the prediction model.
In an example of the present disclosure, when the input parameters include the key point heat map, the first feature map, and the second feature map, the above prediction key point computation module 404 may include a second merging module and a second computation module (not shown in the figure). Specifically:
The second merging module may be configured to merge the key point heat map, the first feature map, and the second feature map based on pixel channels to obtain a merged feature image.
The second computation module may be configured to input the merged feature image into a trained prediction model based on a stacked hourglass network structure to obtain the predicted key point data of the image to be detected.
In an example of the present disclosure, the tracking target acquisition module 405 includes a bounding box computation unit (not shown in the figure).
The bounding box computation unit may be configured to perform a bounding box calculation according to the predicted key point data, and to take the bounding box calculation result as the tracking target.
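The bounding box calculation is not specified beyond being derived from the predicted key points; an axis-aligned min/max enclosure with an optional padding margin, as sketched below, is one natural reading (the `pad` parameter is an assumption).

```python
import numpy as np

def bounding_box(pred_points, pad=0):
    """Axis-aligned box (x0, y0, x1, y1) enclosing the predicted key points,
    optionally expanded by `pad` pixels on every side."""
    pts = np.asarray(pred_points, dtype=float)
    x0, y0 = pts.min(axis=0) - pad
    x1, y1 = pts.max(axis=0) + pad
    return float(x0), float(y0), float(x1), float(y1)
```

The resulting box can serve directly as the tracking target region for the current frame.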
In an example of the present disclosure, the apparatus may further include an image judgment module and a key point update module (not shown in the figure). Specifically:
The image judgment module may be configured to obtain the current key point data of the tracking target in the current image to be detected when the number of continuously tracked images to be detected is greater than a preset threshold.
The key point update module may be configured to match the current key point data with the key point data, and to update the key point data of the tracking target according to the current key point data when the proportion of changed data in the matching result is greater than a preset threshold.
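The update rule above can be sketched as follows. The distance-based definition of a "changed" point and both thresholds are illustrative assumptions; the patent only requires that the stored key point data be replaced when the proportion of changed data exceeds a preset threshold.

```python
def maybe_update_keypoints(stored, current, dist_thresh=5.0, ratio_thresh=0.5):
    """Compare current key points with the stored set; replace the stored set
    when the fraction of points displaced beyond `dist_thresh` exceeds
    `ratio_thresh`. Point lists are matched by index."""
    changed = sum(
        1 for (x0, y0), (x1, y1) in zip(stored, current)
        if (x1 - x0) ** 2 + (y1 - y0) ** 2 > dist_thresh ** 2
    )
    if changed / len(stored) > ratio_thresh:
        return list(current)   # change ratio exceeded: update stored data
    return list(stored)        # keep the existing key point data
```

Keeping the stored set stable under small displacements avoids drift while still refreshing the template when the target's appearance changes substantially.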
The details of each module in the above target tracking apparatus have been described in detail in the corresponding target tracking method, and are therefore not repeated here.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory. In fact, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by multiple modules or units.
Fig. 5 shows a schematic structural diagram of a computer system suitable for implementing the electronic device of an embodiment of the present invention.
It should be noted that the computer system 500 of the electronic device shown in Fig. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in Fig. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random access memory (RAM) 503. Various programs and data required for system operation are also stored in the RAM 503. The CPU 501, the ROM 502, and the RAM 503 are connected to one another by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN (local area network) card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read therefrom can be installed into the storage section 508 as needed.
In particular, according to an embodiment of the present invention, the process described below with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present invention includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the various functions defined in the system of the present application are executed.
It should be noted that the computer-readable medium shown in the embodiments of the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any appropriate combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by, or in combination with, an instruction execution system, apparatus, or device. In the present invention, a computer-readable signal medium may include a data signal propagated in a baseband or as a part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any appropriate combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wired, and the like, or any appropriate combination of the above.
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a part of code, and the module, program segment, or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in a block diagram or flowchart, and combinations of boxes in a block diagram or flowchart, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present invention may be implemented by software or by hardware, and the described units may also be provided in a processor. The names of these units do not, in some cases, constitute a limitation on the units themselves.
As another aspect, the present invention also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist independently without being assembled into the electronic device. The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the method described in the following embodiments. For example, the electronic device can implement the steps shown in Fig. 1.
In addition, the above drawings are only schematic illustrations of the processing included in the method according to exemplary embodiments of the present invention and are not intended to be limiting. It is readily understood that the processing shown in the drawings does not indicate or limit the chronological order of these processes, and that these processes may be executed, for example, synchronously or asynchronously in multiple modules.
Those skilled in the art will readily conceive of other embodiments of the present disclosure after considering the specification and practicing the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or conventional techniques in the art not disclosed herein. The specification and examples are to be regarded as illustrative only, and the true scope and spirit of the present disclosure are indicated by the claims.
It should be understood that the present disclosure is not limited to the precise structures that have been described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A target tracking method, comprising:
obtaining a video to be tracked, and performing target detection on the video to be tracked to obtain a key frame image containing a tracking target;
performing image recognition on the key frame image to obtain an object region containing the tracking target, and performing key point extraction on the object region to obtain key point data of the tracking target; and
extracting the next image to be detected adjacent to the key frame, and performing feature extraction on the image to be detected to obtain a second feature map of the image to be detected;
inputting the second feature map and the key point data as input parameters into a prediction model to obtain corresponding predicted key point data of each key point in the image to be detected;
determining the tracking target in the image to be detected according to the predicted key point data.
2. The target tracking method according to claim 1, wherein, when the key point data is obtained, the method further comprises:
generating a corresponding key point heat map according to the key point data, and using the key point heat map as an input parameter.
3. The target tracking method according to claim 1, wherein, when the input parameters are the second feature map and the key point heat map, obtaining the predicted key point data of each key point in the image to be detected comprises:
merging the key point heat map and the second feature map based on pixel channels to obtain a merged feature image;
inputting the merged feature image into a trained prediction model based on a stacked hourglass network structure to obtain the predicted key point data of the image to be detected.
4. The target tracking method according to claim 2, wherein, when the key point information of the tracking target is obtained, the method further comprises:
performing feature extraction on the key frame image to obtain a first feature map of the key frame image; and using the first feature map as an input parameter of the prediction model.
5. The target tracking method according to claim 4, wherein, when the input parameters include the key point heat map, the first feature map, and the second feature map, obtaining the predicted key point data of the key points in the image to be detected comprises:
merging the key point heat map, the first feature map, and the second feature map based on pixel channels to obtain a merged feature image;
inputting the merged feature image into a trained prediction model based on a stacked hourglass network structure to obtain the predicted key point data of the image to be detected.
6. The target tracking method according to claim 1, wherein determining the tracking target in the image to be detected according to the predicted key point data comprises:
performing a bounding box calculation according to the predicted key point data, and taking the bounding box calculation result as the tracking target.
7. The target tracking method according to claim 1, further comprising:
when the number of continuously tracked images to be detected is greater than a preset threshold, obtaining current key point data of the tracking target in the current image to be detected;
matching the current key point data with the key point data; and, when the proportion of changed data in the matching result is greater than a preset threshold, updating the key point data of the tracking target according to the current key point data.
8. A target tracking apparatus, comprising:
a key frame identification module, configured to obtain a video to be tracked and to perform target detection on the video to be tracked to obtain a key frame image containing a tracking target;
a key point data computation module, configured to perform image recognition on the key frame image to obtain an object region containing the tracking target, and to perform feature extraction on the object region to obtain key point data of the tracking target;
a second feature information computation module, configured to extract the next image to be detected adjacent to the key frame and to perform feature extraction on the image to be detected to obtain second feature information of the image to be detected;
a prediction key point computation module, configured to input the second feature information and the key point information as input parameters into a trained prediction model to obtain predicted key point data of the key points in the image to be detected;
a tracking target acquisition module, configured to determine the tracking target in the image to be detected according to the predicted key point data.
9. A computer-readable medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the target tracking method according to any one of claims 1 to 7.
10. An electronic device, comprising:
one or more processors; and
a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the target tracking method according to any one of claims 1 to 7.
CN201910611097.2A 2019-07-08 2019-07-08 Target tracking method and device Active CN110378264B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910611097.2A CN110378264B (en) 2019-07-08 2019-07-08 Target tracking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910611097.2A CN110378264B (en) 2019-07-08 2019-07-08 Target tracking method and device

Publications (2)

Publication Number Publication Date
CN110378264A true CN110378264A (en) 2019-10-25
CN110378264B CN110378264B (en) 2023-04-18

Family

ID=68252329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910611097.2A Active CN110378264B (en) 2019-07-08 2019-07-08 Target tracking method and device

Country Status (1)

Country Link
CN (1) CN110378264B (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016034008A1 (en) * 2014-09-04 2016-03-10 华为技术有限公司 Target tracking method and device
CN108960090A (en) * 2018-06-20 2018-12-07 腾讯科技(深圳)有限公司 Method of video image processing and device, computer-readable medium and electronic equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
毕福昆等: "机载复杂遥感场景下特定建筑区检测跟踪算法", 《电子学报》 *

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110796412B (en) * 2019-10-29 2022-09-06 浙江大华技术股份有限公司 Parcel tracking method and related device
CN110796412A (en) * 2019-10-29 2020-02-14 浙江大华技术股份有限公司 Parcel tracking method and related device
CN110909630A (en) * 2019-11-06 2020-03-24 腾讯科技(深圳)有限公司 Abnormal game video detection method and device
CN110909630B (en) * 2019-11-06 2023-04-18 腾讯科技(深圳)有限公司 Abnormal game video detection method and device
CN112926356A (en) * 2019-12-05 2021-06-08 北京沃东天骏信息技术有限公司 Target tracking method and device
CN111161316A (en) * 2019-12-18 2020-05-15 深圳云天励飞技术有限公司 Target object tracking method and device and terminal equipment
CN111127516A (en) * 2019-12-19 2020-05-08 苏州智加科技有限公司 Target detection and tracking method and system without search box
CN111598410A (en) * 2020-04-24 2020-08-28 Oppo(重庆)智能科技有限公司 Product spot check method and device, computer readable medium and terminal equipment
CN111598410B (en) * 2020-04-24 2023-09-29 Oppo(重庆)智能科技有限公司 Product spot inspection method and device, computer readable medium and terminal equipment
CN111709428A (en) * 2020-05-29 2020-09-25 北京百度网讯科技有限公司 Method and device for identifying key point positions in image, electronic equipment and medium
US11636666B2 (en) 2020-05-29 2023-04-25 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for identifying key point locations in image, and medium
CN111709428B (en) * 2020-05-29 2023-09-15 北京百度网讯科技有限公司 Method and device for identifying positions of key points in image, electronic equipment and medium
CN111696134A (en) * 2020-06-03 2020-09-22 北京百度网讯科技有限公司 Target detection method and device and electronic equipment
CN111696134B (en) * 2020-06-03 2023-05-23 阿波罗智联(北京)科技有限公司 Target detection method and device and electronic equipment
CN111491180A (en) * 2020-06-24 2020-08-04 腾讯科技(深圳)有限公司 Method and device for determining key frame
CN111898471A (en) * 2020-07-09 2020-11-06 北京捷通华声科技股份有限公司 Pedestrian tracking method and device
CN111914690B (en) * 2020-07-15 2023-11-10 西安米克斯智能技术有限公司 Target object medium-long-term tracking method in video identification
CN111914690A (en) * 2020-07-15 2020-11-10 西安米克斯智能技术有限公司 Method for tracking target object in video recognition for medium and long periods
CN111890365A (en) * 2020-07-31 2020-11-06 平安科技(深圳)有限公司 Target tracking method and device, computer equipment and storage medium
CN111950419A (en) * 2020-08-03 2020-11-17 中国民用航空华东地区空中交通管理局 Image information prediction method, image information prediction device, computer equipment and storage medium
CN112232142A (en) * 2020-09-27 2021-01-15 浙江大华技术股份有限公司 Safety belt identification method and device and computer readable storage medium
CN112381858A (en) * 2020-11-13 2021-02-19 成都商汤科技有限公司 Target detection method, device, storage medium and equipment
CN112465868A (en) * 2020-11-30 2021-03-09 浙江大华汽车技术有限公司 Target detection tracking method and device, storage medium and electronic device
CN112465868B (en) * 2020-11-30 2024-01-12 浙江华锐捷技术有限公司 Target detection tracking method and device, storage medium and electronic device
CN112541418A (en) * 2020-12-04 2021-03-23 北京百度网讯科技有限公司 Method, apparatus, device, medium, and program product for image processing
CN112541418B (en) * 2020-12-04 2024-05-28 北京百度网讯科技有限公司 Method, apparatus, device, medium and program product for image processing
CN112800279B (en) * 2020-12-30 2023-04-18 中国电子科技集团公司信息科学研究院 Video-based emergency target information acquisition method, device, equipment and medium
CN112800279A (en) * 2020-12-30 2021-05-14 中国电子科技集团公司信息科学研究院 Video-based emergency target information acquisition method, device, equipment and medium
CN112837340A (en) * 2021-02-05 2021-05-25 Oppo广东移动通信有限公司 Attribute tracking method and device, electronic equipment and storage medium
CN112837340B (en) * 2021-02-05 2023-09-29 Oppo广东移动通信有限公司 Attribute tracking method, attribute tracking device, electronic equipment and storage medium
CN113034580A (en) * 2021-03-05 2021-06-25 北京字跳网络技术有限公司 Image information detection method and device and electronic equipment
CN113034580B (en) * 2021-03-05 2023-01-17 北京字跳网络技术有限公司 Image information detection method and device and electronic equipment
WO2023273102A1 (en) * 2021-06-30 2023-01-05 北京市商汤科技开发有限公司 Image processing method and apparatus, computer device, and storage medium
CN113469041A (en) * 2021-06-30 2021-10-01 北京市商汤科技开发有限公司 Image processing method and device, computer equipment and storage medium
WO2023184197A1 (en) * 2022-03-30 2023-10-05 京东方科技集团股份有限公司 Target tracking method and apparatus, system, and storage medium
CN114973391A (en) * 2022-06-30 2022-08-30 北京万里红科技有限公司 Eyeball tracking method, device and equipment applied to metacarpal space
CN114973391B (en) * 2022-06-30 2023-03-21 北京万里红科技有限公司 Eyeball tracking method, device and equipment applied to metacarpal space
CN117553756A (en) * 2024-01-10 2024-02-13 中国人民解放军32806部队 Off-target amount calculating method, device, equipment and storage medium based on target tracking
CN117553756B (en) * 2024-01-10 2024-03-22 中国人民解放军32806部队 Off-target amount calculating method, device, equipment and storage medium based on target tracking

Also Published As

Publication number Publication date
CN110378264B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN110378264A (en) Method for tracking target and device
CN108985259B (en) Human body action recognition method and device
Chen et al. An edge traffic flow detection scheme based on deep learning in an intelligent transportation system
US11423695B2 (en) Face location tracking method, apparatus, and electronic device
CN108898086B (en) Video image processing method and device, computer readable medium and electronic equipment
CN108205655B (en) Key point prediction method and device, electronic equipment and storage medium
CN108256479B (en) Face tracking method and device
US20190311202A1 (en) Video object segmentation by reference-guided mask propagation
CN108960090A (en) Method of video image processing and device, computer-readable medium and electronic equipment
CN109584276A (en) Critical point detection method, apparatus, equipment and readable medium
CN109035304A (en) Method for tracking target, calculates equipment and device at medium
CN108960114A (en) Human body recognition method and device, computer readable storage medium and electronic equipment
WO2021232985A1 (en) Facial recognition method and apparatus, computer device, and storage medium
CN111054080B (en) Method, device and equipment for intelligently detecting perspective plug-in and storage medium thereof
JP7263216B2 (en) Object Shape Regression Using Wasserstein Distance
CN108491816A (en) The method and apparatus for carrying out target following in video
CN109753928A (en) The recognition methods of architecture against regulations object and device
CN108198044A (en) Methods of exhibiting, device, medium and the electronic equipment of merchandise news
CN112132847A (en) Model training method, image segmentation method, device, electronic device and medium
CN110033423B (en) Method and apparatus for processing image
CN109977832B (en) Image processing method, device and storage medium
Liu et al. ACDnet: An action detection network for real-time edge computing based on flow-guided feature approximation and memory aggregation
JPWO2014073204A1 (en) Feature amount extraction device and location estimation device
CN114972958A (en) Key point detection method, neural network training method, device and equipment
CN115439927A (en) Gait monitoring method, device, equipment and storage medium based on robot

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant