CN106919918A - A face tracking method and device - Google Patents

A face tracking method and device

- Publication number: CN106919918A (application CN201710108748.7A)
- Authority: CN (China)
- Prior art keywords: face; network model; key point; current frame; confidence level
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/48—Matching video sequences
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
Abstract
The embodiment of the invention discloses a face tracking method and device. When face tracking needs to be performed on a video stream, the corresponding deep-learning network model can be obtained and memory resources can be allocated for it so that all layers of the network model share the same memory space; the video stream is then processed based on the allocated memory resources and the network model to track faces in real time. Because all layers of the network model share the same memory space in this scheme, there is no need to allocate an independent memory space for every layer. This not only reduces memory occupancy and improves computational efficiency, but also reduces memory fragmentation and improves application performance.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a face tracking method and device.
Background technology
In recent years, face tracking technology has made significant progress. Many fields, such as surveillance, video conferencing and distance learning, require specific faces to be tracked and analyzed.
In the prior art, there are various face tracking technologies, and deep-learning forward prediction is one of them. In deep-learning forward prediction, different network models must be built for different application fields, and the depth of the network model varies with the complexity of the problem to be solved; for example, more complex problems generally require deeper network models. On a personal computer (PC), each layer of the network model occupies its own dedicated region of memory, which can be configured through a configuration file. For example, when memory resources are allocated, the configuration file is read to compute the memory size of the current layer, and memory space is then allocated for that layer, and so on. The memory region of each layer is allocated independently, and no memory is shared between the regions of different layers.
During research and practice on the prior art, the inventors of the present invention found that because each layer of the network model occupies its own region of memory in the existing scheme, the total memory required is large. On platforms with limited memory, this degrades computational performance or even prevents the algorithm from running at all. In addition, because allocation operations must be performed many times, memory fragmentation easily forms, which degrades application performance.
Summary of the invention
The embodiment of the present invention provides a face tracking method and device that not only reduce memory occupancy and improve computational efficiency, but also reduce memory fragmentation and improve application performance.
An embodiment of the present invention provides a face tracking method, including:
acquiring a video stream on which face tracking is to be performed and a deep-learning network model;
allocating memory resources for the network model so that all layers of the network model share the same memory space; and
tracking faces in the video stream based on the allocated memory resources and the network model.
Correspondingly, an embodiment of the present invention also provides a face tracking device, including:
an acquiring unit, configured to acquire a video stream on which face tracking is to be performed and a deep-learning network model;
an allocation unit, configured to allocate memory resources for the network model so that all layers of the network model share the same memory space; and
a tracking unit, configured to track faces in the video stream based on the allocated memory resources and the network model.
When deep learning needs to be performed on a video stream for face tracking, the embodiment of the present invention can obtain the corresponding deep-learning network model and allocate memory resources for it so that all layers of the network model share the same memory space; the video stream is then processed based on the allocated memory resources and the network model to track faces in real time. Because all layers of the network model share the same memory space in this scheme, there is no need to allocate an independent memory space for every layer, which greatly reduces memory occupancy and improves computational efficiency. Moreover, because memory needs to be allocated only once, the number of allocation operations is greatly reduced, which reduces memory fragmentation and helps improve application performance.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Apparently, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
Fig. 1a is a schematic diagram of a scenario of the face tracking method provided by an embodiment of the present invention;
Fig. 1b is a flowchart of the face tracking method provided by an embodiment of the present invention;
Fig. 1c is a schematic diagram of memory allocation in the face tracking method provided by an embodiment of the present invention;
Fig. 1d is a schematic diagram of memory space usage in the face tracking method provided by an embodiment of the present invention;
Fig. 2a is another flowchart of the face tracking method provided by an embodiment of the present invention;
Fig. 2b is a schematic diagram of the layers of the network model in the face tracking method provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the face tracking device provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a terminal provided by an embodiment of the present invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The embodiment of the present invention provides a face tracking method and device.
The face tracking device may be integrated into a device such as a mobile terminal. For example, referring to Fig. 1a, when face tracking needs to be performed on a video stream, the mobile terminal can obtain the corresponding deep-learning network model and allocate memory resources for it in a single pass so that all layers of the network model share the same memory space. For instance, it can compute the memory space required by each layer of the network model, select the maximum as the size of the pre-allocated memory space, and allocate memory resources for the network model accordingly. After the memory resources are allocated, faces in the video stream can be tracked based on the allocated memory resources and the network model, thereby saving memory, reducing memory fragmentation and improving computational efficiency.
Detailed descriptions are given below. Note that the numbering of the following embodiments does not imply any preferred order among the embodiments.
Embodiment one

This embodiment is described from the perspective of the face tracking device, which may be integrated into a device such as a mobile terminal; the mobile terminal may be a mobile phone, a tablet computer, a smart wearable device, or the like.
A face tracking method includes: acquiring a video stream on which face tracking is to be performed and a deep-learning network model; allocating memory resources for the network model so that all layers of the network model share the same memory space; and tracking faces in the video stream based on the allocated memory resources and the network model.
As shown in Fig. 1b, the specific flow of the face tracking method can be as follows:
101. Acquire a video stream on which face tracking is to be performed and a deep-learning network model.
For example, the video stream and the deep-learning network model may be obtained locally or from another storage device.
The network model can be configured according to the needs of the actual application, which is not described in detail here.
102. Allocate memory resources for the network model so that all layers of the network model share the same memory space. For example, this can be done as follows:
(1) Compute the memory space required by each layer of the network model.
For example, the configuration file of the network model can be obtained, and the memory space required by each layer can be computed from it. Specifically:
First, read the configuration file of the network model. Next, compute the number of parameters of each network layer from the configuration file to obtain the sizes of each layer's input (i.e. Bottom) Blob, output (i.e. Top) Blob, and the temporary Blob the layer needs to allocate. Finally, the memory space required by the layer can be computed from the Bottom Blob, Top Blob and temporary Blob, i.e. the size of regions A+B+C shown in Fig. 1c, where region A serves not only as the input area of this layer but also as the output area of the previous or next layer, region B is the temporary area of this layer, and region C serves not only as the output area of this layer but also as the input area of the previous or next layer.
Note that, for convenience, in the embodiments of the present invention the Bottom Blob is called the input area, the Top Blob is called the output area, and the temporary Blob is called the temporary area. A Blob is the storage unit of a deep network model; it is a four-dimensional matrix whose descriptor contains the size of each dimension.
(2) Take the maximum of the memory spaces required by the layers as the size of the pre-allocated memory space.
For example, in a six-layer network model, if the fifth layer requires the most memory, the memory space required by the fifth layer is taken as the size of the pre-allocated memory space, and so on.
(3) Allocate memory resources for the network model according to the size of the pre-allocated memory space.
Only a single allocation of this pre-allocated size is needed for the network model; no other space needs to be allocated during forward computation.
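The three steps above can be sketched in code. This is a minimal illustration, not the patent's implementation: the layer shapes and the `layer_config` structure are invented for the example, and sizes assume 4-byte float32 elements.

```python
# Sketch of steps (1)-(3): compute each layer's A+B+C requirement from a
# (hypothetical) parsed configuration, then pre-allocate one shared buffer
# sized by the maximum over all layers.

ELEM = 4  # assumed bytes per element (float32)

def blob_size(shape):
    """Memory in bytes for a 4-D Blob (num, channels, height, width)."""
    n, c, h, w = shape
    return n * c * h * w * ELEM

# Each entry: (Bottom Blob shape, temporary Blob shape, Top Blob shape).
# Shapes are illustrative only.
layer_config = [
    ((1, 3, 64, 64), (1, 16, 64, 64), (1, 16, 32, 32)),
    ((1, 16, 32, 32), (1, 32, 32, 32), (1, 32, 16, 16)),
    ((1, 32, 16, 16), (1, 64, 16, 16), (1, 64, 8, 8)),
]

def layer_requirement(cfg):
    bottom, temp, top = cfg
    return blob_size(bottom) + blob_size(temp) + blob_size(top)  # A + B + C

per_layer = [layer_requirement(cfg) for cfg in layer_config]   # step (1)
pool_bytes = max(per_layer)                                    # step (2)
shared_pool = bytearray(pool_bytes)  # step (3): one allocation shared by all layers

print(per_layer, pool_bytes)
```

However deep the network, the single shared buffer is sized by the most demanding layer only.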
The memory allocation process of forward computation can be as shown in Fig. 1d. Suppose region A currently stores the input area (Bottom Blob) data of the n-th layer, region B stores the temporary data needed by the current layer, and region C stores the computed output area (Top Blob) data. After the n-th layer has computed its output, the output area (Top Blob) pointer is assigned to the input area (Bottom Blob) of the (n+1)-th layer, and the input area pointer of region A is assigned to the output area (Top Blob) of the (n+1)-th layer to store the output of the (n+1)-th layer; region B is likewise reused to hold the temporary data of the (n+1)-th layer. Repeating this completes the computation of the whole feed-forward network. The process involves no other data copies or transfers, and even the pointer assignment can be completed in the pre-processing stage.
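The pointer-swapping scheme described above can be sketched as follows. The layer functions are toy stand-ins, and region B (the scratch area) is omitted since each layer would simply reuse one shared buffer for its temporaries.

```python
# Minimal sketch of the pointer-swapping forward pass: two references, A and C,
# alternate roles as input and output, so no data is ever copied between layers.

def forward(layers, x):
    region_a, region_c = x, None      # A holds the current layer's input
    for layer in layers:
        region_c = layer(region_a)               # compute into the output area C
        region_a, region_c = region_c, region_a  # pointer swap: C becomes next input
    return region_a  # after the final swap, A references the network output

# Toy layers: simple numeric transforms standing in for conv/pooling layers.
layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
print(forward(layers, 5))  # ((5 + 1) * 2) - 3 = 9
```

Because each step is only a reference swap, the cost per layer boundary is constant regardless of how large the Blobs are.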
103. Track faces in the video stream based on the allocated memory resources and the network model. For example, this can be done as follows:
(1) Determine the image that currently needs to be processed from the video stream, to obtain the current frame.
(2) Obtain the face key point coordinates and the confidence level of the frame preceding the current frame.
Face key points are the information that reflects facial features, such as the eyes, eyebrows, nose, mouth and facial contour. Face key point coordinates are the coordinates of these key points; the key point coordinates of a face can be represented by an array, for example (x1, y1, x2, y2, ..., xn, yn), where (xi, yi) is the coordinate of the i-th point.
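As a small illustration of this flat-array representation (the five points and their coordinates are invented for the example):

```python
# Key point coordinates stored as a flat array (x1, y1, ..., xn, yn),
# with a helper that recovers the i-th point.

keypoints = [30.0, 40.0, 70.0, 40.0, 50.0, 60.0, 35.0, 80.0, 65.0, 80.0]

def point(arr, i):
    """Return (x_i, y_i) for the 1-indexed i-th key point."""
    return arr[2 * (i - 1)], arr[2 * (i - 1) + 1]

print(point(keypoints, 3))  # the third key point: (50.0, 60.0)
```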
(3) Predict the face key point coordinates and the confidence level of the current frame based on the allocated memory resources, the network model, and the face key point coordinates and confidence level of the previous frame, and return to the step of determining the image that currently needs to be processed from the video stream, until all images in the video stream have been processed.
The face key point coordinates and confidence level of the current frame can be predicted in various ways from the allocated memory resources, the network model and the previous frame's key point coordinates and confidence level. For example:
When the confidence level of the previous frame is determined to be greater than a preset threshold, use the allocated memory resources to run the network model on the previous frame's face key point coordinates to obtain a computation result, predict the current frame's face key point coordinates from that result, and compute the confidence level of the current frame.
The preset threshold can be configured according to the needs of the actual application, which is not described in detail here.
For example, suppose the network model includes a shared (public) network part, a key point prediction branch and a confidence prediction branch. Then the step of "running the network model on the previous frame's face key point coordinates to obtain a computation result, predicting the current frame's face key point coordinates from the result, and computing the confidence level of the current frame" can be performed as follows:
Run the shared network part on the previous frame's face key point coordinates to obtain an intermediate result; process that result with the key point prediction branch to obtain the current frame's face key point coordinates; and process the same result with the confidence prediction branch to obtain the current frame's confidence level.
Note that if the confidence level (i.e. the confidence of the previous frame's face key point coordinates) is not greater than (i.e. less than or equal to) the preset threshold, the previous frame's key point coordinates have little reference value, so the face key points in the current frame can instead be obtained by detection. Similarly, if the face key point coordinates and confidence level of the previous frame cannot be obtained, for example because the current frame is the first frame of the video stream, detection can also be used to obtain the face key points of the current frame. That is, optionally, after the step of "allocating memory resources for the network model", the face tracking method may also include:
When the face key point coordinates and confidence level of the frame preceding the current frame cannot be obtained, or when the confidence level of the previous frame is determined to be less than or equal to the preset threshold, detecting the faces in the current frame with a face detection algorithm, based on the allocated memory resources, to determine the face key point coordinates and confidence level of the current frame.
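The decision logic above, i.e. track when the previous frame is trustworthy and fall back to detection otherwise, can be sketched as follows; `detect_face`, `track_face` and the threshold value are hypothetical stand-ins, not the patent's modules:

```python
# Track-or-detect decision: re-detect when previous key points are missing or
# their confidence is at or below the threshold; otherwise track from them.

THRESHOLD = 0.5  # assumed value; the patent leaves the threshold configurable

def process_frame(frame, prev_keypoints, prev_confidence):
    if prev_keypoints is None or prev_confidence <= THRESHOLD:
        return detect_face(frame)                 # tracking reset via detection
    return track_face(frame, prev_keypoints)      # normal tracking path

def detect_face(frame):
    return "detected"   # stand-in for the face detection algorithm

def track_face(frame, prev_keypoints):
    return "tracked"    # stand-in for prediction via the network model

print(process_frame("frame1", None, 0.0))     # first frame -> detection
print(process_frame("frame2", [1, 2], 0.9))   # confident -> tracking
print(process_frame("frame3", [1, 2], 0.3))   # low confidence -> detection
```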
Detection can be performed in various ways, for example as follows:
A. Determine the face region of the current frame with a face detection algorithm, based on the allocated memory resources. For example:
Based on the allocated memory resources, obtain the face features in the current frame by computing the integral image, build a strong classifier that distinguishes faces from non-faces according to these features, and then process the current frame with the strong classifier to obtain the face region of the current frame.
To improve the accuracy of face detection, the Adaboost algorithm can be used to build the strong classifiers that distinguish faces from non-faces, and the strong classifiers can be cascaded into a single system.
Adaboost is an iterative algorithm whose core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers into a stronger final classifier (a strong classifier).
B. Predict the positions of the facial features within the face region with the network model, to obtain the face key point coordinates and confidence level of the current frame.
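The Adaboost idea can be illustrated with a toy weighted vote of decision stumps; the thresholds and weights below are invented for the example and do not come from the patent:

```python
# Weak classifiers (decision stumps) combined into a strong classifier by a
# weighted vote, the core Adaboost construction described above.

def make_stump(threshold):
    """A weak classifier: +1 (face) if the feature exceeds the threshold."""
    return lambda feature: 1 if feature > threshold else -1

weak_classifiers = [
    (0.6, make_stump(0.2)),   # (alpha weight, weak classifier)
    (0.4, make_stump(0.5)),
    (0.3, make_stump(0.8)),
]

def strong_classify(feature):
    vote = sum(alpha * clf(feature) for alpha, clf in weak_classifiers)
    return 1 if vote > 0 else -1  # sign of the weighted vote

print(strong_classify(0.9))   # all stumps agree: face (+1)
print(strong_classify(0.1))   # all stumps agree: non-face (-1)
print(strong_classify(0.6))   # stumps disagree; the weighted vote decides
```

In a real detector the alpha weights come from each stump's training error, and several such strong classifiers are cascaded so that easy non-face windows are rejected early.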
As can be seen from the above, when face tracking needs to be performed on a video stream, this embodiment can obtain the corresponding deep-learning network model and allocate memory resources for it so that all layers of the network model share the same memory space; the video stream is then processed based on the allocated memory resources and the network model to track faces in real time. Because all layers of the network model share the same memory space in this scheme, there is no need to allocate an independent memory space for every layer, which greatly reduces memory occupancy and improves computational efficiency. Moreover, because memory needs to be allocated only once, the number of allocation operations is greatly reduced, which reduces memory fragmentation and helps improve application performance.
Embodiment two

The method described in Embodiment one is described in further detail below by way of example.
In this embodiment, the face tracking device is integrated into a mobile terminal as an example.
As shown in Fig. 2a, the specific flow of the face tracking method can be as follows:
201. The mobile terminal acquires a video stream.
For example, the mobile terminal may receive a video stream sent by another device, or obtain a video stream from its local storage space.
202. The mobile terminal acquires a deep-learning network model.
The network model can be configured according to the needs of the actual application. For example, the network model may include three parts: the first part is a shared (public) network part, and the other two are branches generated after the shared part, namely a key point prediction branch and a confidence prediction branch. The layers of each part can be chosen as needed; for example, referring to Fig. 2b, the layers of each part can be as follows:
The shared network part can include six convolution layers, namely convolutional layers 1 to 6. Each convolutional layer is immediately followed by a rectified linear unit (ReLU) activation function, abbreviated below as a nonlinear activation function, and some of the nonlinear activation functions may be immediately followed by a layer used for aggregation, a pooling layer; see Fig. 2b for details.
The key point prediction branch can include one convolutional layer and three inner product (fully connected) layers; for example, referring to Fig. 2b, it can include convolutional layer 7 and inner product layers 1 to 3, with a nonlinear activation function immediately following each convolutional layer and inner product layer.
The confidence prediction branch can include one convolutional layer (convolutional layer 8), five inner product layers (inner product layers 4 to 8) and one softmax layer. The softmax layer outputs two values, the face probability and the non-face probability, which sum to 1.0. In addition, a nonlinear activation function can follow each convolutional layer and inner product layer.
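The three-part layout can be written out as a plain layer specification. Layer names follow Fig. 2b as described above, while the choice of which activations are followed by pooling is an assumption, since the patent text defers those details to the figure:

```python
# Layer specification of the three-part model: shared trunk plus two branches.
# Channel counts and kernel sizes are not given in the text, so none are listed.

shared_part = []
for i in range(1, 7):                       # convolutional layers 1-6, each with ReLU
    shared_part.append(("conv%d" % i, "convolution"))
    shared_part.append(("relu%d" % i, "relu"))
    if i in (1, 2, 4):                      # pooling after some activations (assumed)
        shared_part.append(("pool%d" % i, "pooling"))

keypoint_branch = [("conv7", "convolution"), ("relu7", "relu")]
for i in range(1, 4):                       # inner product layers 1-3
    keypoint_branch += [("ip%d" % i, "inner_product"), ("ip%d_relu" % i, "relu")]

confidence_branch = [("conv8", "convolution"), ("relu8", "relu")]
for i in range(4, 9):                       # inner product layers 4-8
    confidence_branch += [("ip%d" % i, "inner_product"), ("ip%d_relu" % i, "relu")]
confidence_branch.append(("prob", "softmax"))  # face vs. non-face probability

model = {"shared": shared_part,
         "keypoints": keypoint_branch,
         "confidence": confidence_branch}

print(len(shared_part), len(keypoint_branch), len(confidence_branch))
```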
203. The mobile terminal computes the memory space required by each layer of the network model.
For example, the configuration file of the network model can be read, and the number of parameters of each network layer can be computed from it to obtain the sizes of each layer's input area, output area and temporary area. The memory space required by the layer, i.e. the size of regions A+B+C shown in Fig. 1c, can then be computed from the input area, output area and temporary area; see Embodiment one for details, which are not repeated here.
204. The mobile terminal takes the maximum of the memory spaces required by the layers as the size of the pre-allocated memory space, and allocates memory resources for the network model according to that size.
The memory allocation process of forward computation can be as shown in Fig. 1d. Suppose region A currently stores the input area data of the n-th layer, region B stores the temporary data needed by the current layer, and region C stores the computed output area data. After the n-th layer has computed its output, the output area pointer is assigned to the input area of the (n+1)-th layer, and the input area pointer of region A is assigned to the output area of the (n+1)-th layer to store the output of the (n+1)-th layer; region B is likewise reused to hold the temporary data of the (n+1)-th layer. After the (n+1)-th layer has been processed, n is updated to n+1 and the process repeats; in this way, the computation of the whole feed-forward network can be completed. For example, taking an initial value of n equal to 1:
After layer 1 has computed its output, the output area pointer of layer 1 (region C of layer 1) is assigned to the input area of layer 2, and the input area pointer of region A is assigned to the output area of layer 2 to store layer 2's output, while region B is reused for layer 2's temporary data. Similarly, after layer 2 has computed its output, the output area pointer of layer 2 (region C of layer 2, which is also region A of layer 1) is assigned to the input area of layer 3, and the input area pointer of layer 2 (region A of layer 2, i.e. region C of layer 1) is assigned to the output area of layer 3 to store layer 3's output, with region B again reused for layer 3's temporary data; and so on.
This process involves no other data copies or transfers, and even the pointer assignment can be completed in the pre-processing stage.
It can be seen that this computation exploits a property of deep-learning forward inference: computing the (n+1)-th layer needs only the input area of the (n+1)-th layer (i.e. the output area of the n-th layer) and the output area of the (n+1)-th layer, and no longer needs the input area of the n-th layer, so the memory occupied by the n-th layer's input area can be recycled. In other words, the computation of all layers takes place within the pre-allocated "A+B+C" memory region. Therefore, no matter how deep the network is, the required memory space depends only on the memory space of a single layer, which saves memory resources and makes it possible to run complex deep networks on mobile-terminal platforms. Moreover, since the process consists only of pointer assignments within memory, it is very fast and efficient.
205. The mobile terminal determines the image that currently needs to be processed from the video stream, to obtain the current frame.
206. The mobile terminal obtains the face key point coordinates and confidence level of the frame preceding the current frame, then performs step 207.
Face key points are the information that reflects facial features, such as the eyes, eyebrows, nose, mouth and facial contour. Face key point coordinates are the coordinates of these key points.
Note that if the face key point coordinates and confidence level of the previous frame cannot be obtained, for example because the current frame is the first frame of the video stream, the face key point coordinates and confidence level of the current frame can be obtained by detection, i.e. step 208 is performed.
207. The mobile terminal determines whether the confidence level of the previous frame's face key point coordinates is above the preset threshold. If so, the face key points have been tracked successfully, and step 209 is performed; otherwise, the tracking of the face key points has failed, and step 208 is performed.
The preset threshold can be configured according to the needs of the actual application, which is not described in detail here.
208. Based on the allocated memory resources, the mobile terminal detects faces in the current frame with a face detection algorithm to determine the face key point coordinates and confidence level of the current frame, and then performs step 210.
Detection can be performed in various ways, for example as follows:
(1) Based on the allocated memory resources, the mobile terminal determines the face region of the current frame with a face detection algorithm. For example:
Based on the allocated memory resources, the mobile terminal obtains the face features in the current frame by computing the integral image, builds a strong classifier that distinguishes faces from non-faces according to these features, and then processes the current frame with the strong classifier to obtain the face region of the current frame.
To improve the accuracy of face detection, the Adaboost algorithm can be used to build the strong classifiers that distinguish faces from non-faces, and the strong classifiers can be cascaded into a single system.
(2) The mobile terminal predicts the positions of the facial features within the face region with the network model, to obtain the face key point coordinates and confidence level of the current frame.
To reduce computation time and save computing resources, the face key point coordinates and the confidence level can be computed in parallel.
209. Using the allocated memory resources, the mobile terminal runs the network model on the previous frame's face key point coordinates to obtain a computation result, predicts the current frame's face key point coordinates from that result, computes the confidence level of the current frame, and then performs step 210.
For example, the mobile terminal can run the shared network part of the network model on the previous frame's face key point coordinates to obtain an intermediate result, then process that result with the key point prediction branch to obtain the current frame's face key point coordinates, and process the same result with the confidence prediction branch to obtain the current frame's confidence level.
For instance, the shared network part can compute the bounding box of the previous frame's face key point coordinates. Then, on the one hand, the key point prediction branch computes the positions of the face key points in the current frame from this bounding box, yielding the current frame's face key point coordinates; on the other hand, the confidence prediction branch analyzes the accuracy of the face recognition, for example whether the image inside the bounding box is a face, and the current frame's confidence level is computed from this analysis.
To reduce computation time and save computing resources, the face key point coordinates and the confidence level can be computed in parallel, i.e. the processing of the key point prediction branch and the confidence prediction branch can run in parallel.
210. The mobile terminal determines whether all images in the video stream have been recognized; if so, the flow ends; otherwise, the flow returns to step 205.
That is, the face key point coordinates and confidence level of the current frame serve as a reference for face tracking in the next frame image, and so on in a loop, until all images in the video stream have been recognized.
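The loop across steps 205 to 210 can be sketched as follows; `detect`, `predict`, and the threshold value are hypothetical stand-ins for the face detection algorithm, the network model, and the preset threshold.

```python
THRESHOLD = 0.5  # assumed preset threshold

def detect(frame):
    """Full face detection: returns key points and a confidence level (stub)."""
    return [(10, 20)], 0.9

def predict(frame, prev_kpts):
    """Tracking: predict key points from the previous frame's result (stub)."""
    return [(11, 21)], 0.8

def track_stream(frames):
    prev_kpts, prev_conf = None, 0.0
    results = []
    for frame in frames:                            # step 205: take the current frame
        if prev_kpts is None or prev_conf <= THRESHOLD:
            kpts, conf = detect(frame)              # reset the tracking by re-detection
        else:
            kpts, conf = predict(frame, prev_kpts)  # step 209: track from the last frame
        results.append((kpts, conf))
        prev_kpts, prev_conf = kpts, conf           # reference for the next frame
    return results                                  # step 210: stop when the stream ends

out = track_stream(range(3))
print(len(out))  # 3
```

The key design point is that the previous frame's result feeds the next iteration, and a low confidence level automatically falls back to detection rather than propagating a bad track.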
As can be seen from the above, when face tracking needs to be performed on a video stream, this embodiment obtains the corresponding deep learning network model and allocates memory resources for it so that all layers of the network model share the same memory space; the video stream is then processed based on the allocated memory resources and the network model, so that real-time face tracking is completed on the mobile terminal. On the one hand, because all layers of the network model share the same memory space in this scheme, there is no need to allocate a separate memory space for each layer of the network model; this not only greatly reduces memory usage and improves computational efficiency but, since allocation is performed only once, also greatly reduces the number of allocation operations and lessens memory fragmentation, which helps improve application performance. On the other hand, when face tracking becomes abnormal, for example when the confidence level is less than or equal to the threshold, or when the face key point coordinates and confidence level of the previous frame cannot be obtained, this scheme automatically resets the tracking (that is, it obtains the face key point coordinates and confidence level again by detection), which enhances the continuity of face tracking.
Furthermore, because this scheme demands little memory and is computationally efficient, its requirements on device performance are relatively low, making it suitable for devices such as mobile terminals; compared with schemes that place the deep learning forward algorithm on the server side, it can track faces more efficiently and flexibly, which helps improve the user experience.
Embodiment Three
To better implement the above method, an embodiment of the present invention further provides a face tracking apparatus. As shown in Fig. 3, the face tracking apparatus includes an acquiring unit 301, an allocation unit 302, and a tracking unit 303, as follows:
(1) Acquiring unit 301;
The acquiring unit 301 is used to obtain the video stream on which face tracking needs to be performed and the deep learning network model.
For example, the video stream and the deep learning network model may be specifically obtained locally or from other storage devices, and so on.
The network model can be configured according to the demands of the practical application; for example, the network model may include a public network part, a key point prediction branch, a confidence prediction branch, and so on. For details, refer to the method embodiments above; this is not repeated here.
(2) Allocation unit 302;
The allocation unit 302 is used to allocate memory resources for the network model so that all layers of the network model share the same memory space.
For example, the allocation unit 302 may include a calculation subunit and an allocation subunit, as follows:
The calculation subunit may be used to calculate the memory space required by each layer of the network in the network model.
For example, the calculation subunit may be specifically used to obtain the configuration file of the network model and calculate the memory space required by each layer of the network in the network model according to the configuration file; for instance, as follows:
The calculation subunit reads the configuration file of the network model, calculates the number of parameters of each network layer according to the configuration file, and obtains the sizes of the input area, output area, and temporary area of each layer of the network; then, the memory space required by that layer can be calculated from the sum of the input area, output area, and temporary area, i.e., the size of the regions A+B+C shown in Fig. 1c. For details, refer to Embodiment One; this is not repeated here.
The allocation subunit may be used to take the maximum of the memory spaces required by the layers as the size of the pre-allocated memory space, and to allocate memory resources for the network model according to the size of the pre-allocated memory space.
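A minimal sketch of these two subunits, assuming the per-layer sizes have already been read from a configuration file (the numbers below are made up): compute each layer's requirement as the sum of its input, output, and temporary areas, take the maximum, and allocate one shared buffer of that size.

```python
import numpy as np

# Hypothetical per-layer sizes read from a network configuration file:
# input area, output area, and temporary area, counted in floats.
layers = [
    {"in": 150528, "out": 301056, "tmp": 4096},
    {"in": 301056, "out": 75264,  "tmp": 8192},
    {"in": 75264,  "out": 1024,   "tmp": 0},
]

# Memory needed by one layer is the sum of its three areas (A + B + C).
per_layer = [l["in"] + l["out"] + l["tmp"] for l in layers]

# Pre-allocate once, sized to the maximum requirement; every layer then
# runs inside this single shared buffer instead of owning its own space.
shared = np.empty(max(per_layer), dtype=np.float32)

print(len(per_layer), shared.size == max(per_layer))  # 3 True
```

Because the forward pass visits the layers one at a time, a buffer sized to the most demanding layer suffices for all of them, which is why a single allocation can replace one allocation per layer.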
(3) Tracking unit 303;
The tracking unit 303 is used to track the face in the video stream based on the allocated memory resources and the network model.
For example, the tracking unit 303 may include a determination subunit, a parameter acquiring subunit, and a prediction subunit, as follows:
The determination subunit may be used to determine, from the video stream, the image that currently needs to be processed, obtaining the current frame;
The parameter acquiring subunit may be used to obtain the face key point coordinates and confidence level of the previous frame image of the current frame.
A face key point refers to information that can reflect facial features, such as the eyes, eyebrows, nose, mouth, and facial outline. Face key point coordinates refer to the coordinates of these face key points.
The prediction subunit may be used to predict the face key point coordinates and confidence level of the current frame based on the allocated memory resources, the network model, and the face key point coordinates and confidence level of the previous frame image, and to trigger the determination subunit to perform the operation of determining the image that currently needs to be processed from the video stream, until all images in the video stream have been processed.
There are various ways to predict the face key point coordinates and confidence level of the current frame based on the allocated memory resources, the network model, and the face key point coordinates and confidence level of the previous frame image; for example, specifically as follows:
The prediction subunit may be specifically used to, when the confidence level of the previous frame image is determined to be greater than the preset threshold, use the allocated memory resources to calculate on the face key point coordinates of the previous frame image through the network model, obtain a calculation result, predict the face key point coordinates of the current frame according to the calculation result, and calculate the confidence level of the current frame.
For example, taking as an example a network model that includes a public network part, a key point prediction branch, and a confidence prediction branch, the prediction subunit may be specifically used to:
calculate on the face key point coordinates of the previous frame image through the public network part to obtain a calculation result; process the calculation result through the key point prediction branch to obtain the face key point coordinates of the current frame; and process the calculation result through the confidence prediction branch to obtain the confidence level of the current frame, and so on.
The preset threshold can be configured according to the demands of the practical application; this is not repeated here.
It should be noted that if the confidence level of the previous frame of the current frame is not higher than the preset threshold, the face key point coordinates of the previous frame have little reference value, so the face key point coordinates in the current frame can instead be obtained by detection. Similarly, if the face key point coordinates and confidence level of the previous frame image cannot be obtained, for example when the current frame is the first frame of the video stream, the face key point coordinates in the current frame can likewise be obtained by detection. That is, optionally, the tracking unit 303 may also include a detection subunit, as follows:
The detection subunit may be used to, when the face key point coordinates and confidence level of the previous frame image cannot be obtained, or when the confidence level of the previous frame image is determined to be less than or equal to the preset threshold, detect the face in the current frame through a face detection algorithm based on the allocated memory resources, so as to determine the face key point coordinates and confidence level of the current frame.
There are various ways to perform detection. For example, the detection subunit may be specifically used to determine, based on the allocated memory resources, the face region of the current frame through a face detection algorithm, and to predict the positions of the facial features within that face region through the network model, obtaining the face key point coordinates and confidence level of the current frame.
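As a toy illustration of the kind of face detection algorithm this could be (claim 9 describes an integral-image and cascaded strong-classifier approach), the sketch below builds an integral image and passes a candidate window through a two-stage cascade. The stage tests and the window are fabricated for demonstration and are not the classifiers of this application.

```python
import numpy as np

def integral_image(img):
    """Summed-area table: any rectangle sum can then be read in O(1)."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] recovered from the integral image ii."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

# Fabricated "strong classifiers": a window must pass every stage in turn.
stages = [
    lambda ii: rect_sum(ii, 0, 0, 4, 8) > rect_sum(ii, 4, 0, 8, 8),  # top brighter than bottom
    lambda ii: rect_sum(ii, 0, 0, 8, 4) < rect_sum(ii, 0, 4, 8, 8),  # right brighter than left
]

def is_face(window):
    ii = integral_image(window.astype(np.float64))
    return all(stage(ii) for stage in stages)  # cascade: reject early on failure

win = np.zeros((8, 8)); win[:4, :] = 1.0; win[:, 4:] += 1.0
print(is_face(win))  # True
```

The cascade structure is what makes detection cheap enough to serve as a fallback: most non-face windows are rejected by the first stage, and only promising regions are handed to the network model for key point prediction.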
In specific implementation, each of the above units may be implemented as an independent entity, or may be combined in any way and implemented as one or several entities. For the specific implementation of each of the above units, refer to the method embodiments above; this is not repeated here.
The face tracking apparatus may be specifically integrated in a device such as a mobile terminal, and the mobile terminal may include a mobile phone, a tablet computer, a smart wearable device, or the like.
As can be seen from the above, when face tracking needs to be performed on a video stream, the present embodiment can obtain the corresponding deep learning network model through the acquiring unit 301, and allocate memory resources for the network model through the allocation unit 302 so that all layers of the network model share the same memory space; the tracking unit 303 then processes the video stream based on the allocated memory resources and the network model, so as to achieve real-time face tracking. Because all layers of the network model share the same memory space in this scheme, there is no need to allocate a separate memory space for each layer of the network model; this not only greatly reduces memory usage and improves computational efficiency but, since allocation is performed only once, also greatly reduces the number of allocation operations and lessens memory fragmentation, which helps improve application performance.
Embodiment Four
Correspondingly, an embodiment of the present invention further provides a mobile terminal. As shown in Fig. 4, the mobile terminal may include a radio frequency (RF) circuit 401, a memory 402 including one or more computer-readable storage media, an input unit 403, a display unit 404, a sensor 405, an audio circuit 406, a Wireless Fidelity (WiFi) module 407, a processor 408 including one or more processing cores, a power supply 409, and other components. Those skilled in the art will understand that the mobile terminal structure shown in Fig. 4 does not constitute a limitation on the mobile terminal, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently. Wherein:
The RF circuit 401 may be used for receiving and sending signals in the course of sending and receiving messages or during a call; in particular, after receiving downlink information from a base station, it hands the information to one or more processors 408 for processing, and it also sends uplink data to the base station. Generally, the RF circuit 401 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like. In addition, the RF circuit 401 may also communicate with networks and other devices by wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to the Global System for Mobile Communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), e-mail, Short Message Service (SMS), and the like.
The memory 402 may be used to store software programs and modules; the processor 408 executes various functional applications and data processing by running the software programs and modules stored in the memory 402. The memory 402 may mainly include a program storage area and a data storage area, where the program storage area may store the operating system, applications required by at least one function (such as a sound playing function or an image playing function), and so on, and the data storage area may store data created according to the use of the mobile terminal (such as audio data, a phone book, etc.). In addition, the memory 402 may include high-speed random access memory, and may also include non-volatile memory, for example at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage component. Correspondingly, the memory 402 may also include a memory controller to provide the processor 408 and the input unit 403 with access to the memory 402.
The input unit 403 may be used to receive input digit or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. Specifically, in one embodiment, the input unit 403 may include a touch-sensitive surface and other input devices. The touch-sensitive surface, also called a touch display screen or a touch pad, can collect touch operations of the user on or near it (such as operations performed by the user on or near the touch-sensitive surface with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection apparatus according to a preset program. Optionally, the touch-sensitive surface may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the touch orientation of the user, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, and sends them to the processor 408, and it can also receive and execute commands sent by the processor 408. Moreover, the touch-sensitive surface may be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface, the input unit 403 may also include other input devices. Specifically, the other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys and a power switch key), a trackball, a mouse, a joystick, and the like.
The display unit 404 may be used to display information input by the user or information provided to the user, as well as the various graphical user interfaces of the mobile terminal; these graphical user interfaces may be composed of graphics, text, icons, video, and any combination thereof. The display unit 404 may include a display panel, which may optionally be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED) display, or the like. Further, the touch-sensitive surface may cover the display panel; when the touch-sensitive surface detects a touch operation on or near it, it transmits the operation to the processor 408 to determine the type of the touch event, and the processor 408 then provides a corresponding visual output on the display panel according to the type of the touch event. Although in Fig. 4 the touch-sensitive surface and the display panel implement the input and output functions as two independent components, in some embodiments the touch-sensitive surface and the display panel may be integrated to implement the input and output functions.
The mobile terminal may also include at least one sensor 405, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor can adjust the brightness of the display panel according to the brightness of the ambient light, and the proximity sensor can turn off the display panel and/or the backlight when the mobile terminal is moved to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally three axes), can detect the magnitude and direction of gravity when at rest, and can be used for applications that recognize the attitude of the mobile phone (such as landscape/portrait switching, related games, and magnetometer attitude calibration), vibration-recognition related functions (such as a pedometer and tapping), and so on. As for the other sensors that the mobile terminal may also be configured with, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, they are not described in detail here.
The audio circuit 406, a loudspeaker, and a microphone can provide an audio interface between the user and the mobile terminal. The audio circuit 406 can transmit the electrical signal converted from received audio data to the loudspeaker, which converts it into a sound signal for output; on the other hand, the microphone converts a collected sound signal into an electrical signal, which the audio circuit 406 receives and converts into audio data; after the audio data is processed by the processor 408, it is sent through the RF circuit 401 to, for example, another mobile terminal, or output to the memory 402 for further processing. The audio circuit 406 may also include an earphone jack to provide communication between a peripheral earphone and the mobile terminal.
WiFi is a short-range wireless transmission technology. Through the WiFi module 407, the mobile terminal can help the user send and receive e-mail, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although Fig. 4 shows the WiFi module 407, it can be understood that it is not an essential component of the mobile terminal and may be omitted as needed without changing the essence of the invention.
The processor 408 is the control center of the mobile terminal; it connects all parts of the whole mobile phone through various interfaces and lines, and performs the various functions of the mobile terminal and processes data by running or executing the software programs and/or modules stored in the memory 402 and calling the data stored in the memory 402, thereby monitoring the mobile phone as a whole. Optionally, the processor 408 may include one or more processing cores; preferably, the processor 408 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, applications, and so on, and the modem processor mainly handles wireless communication. It can be understood that the above modem processor may also not be integrated into the processor 408.
The mobile terminal also includes a power supply 409 (such as a battery) that supplies power to the components. Preferably, the power supply may be logically connected to the processor 408 through a power management system, so that functions such as managing charging, discharging, and power consumption are implemented through the power management system. The power supply 409 may also include any component such as one or more direct-current or alternating-current power sources, a recharging system, a power failure detection circuit, a power converter or inverter, and a power status indicator.
Although not shown, the mobile terminal may also include a camera, a Bluetooth module, and so on, which are not described in detail here. Specifically, in this embodiment, the processor 408 in the mobile terminal loads the executable files corresponding to the processes of one or more applications into the memory 402 according to the following instructions, and the processor 408 runs the applications stored in the memory 402, thereby implementing various functions:
obtaining a video stream on which face tracking needs to be performed and a deep learning network model; allocating memory resources for the network model so that all layers of the network model share the same memory space; and tracking the face in the video stream based on the allocated memory resources and the network model.
For example, the memory space required by each layer of the network in the network model may be specifically calculated; for instance, the configuration file of the network model is obtained, the memory space required by each layer of the network in the network model is calculated according to the configuration file, the maximum of the memory spaces required by the layers is then taken as the size of the pre-allocated memory space, and memory resources are allocated for the network model according to the size of the pre-allocated memory space, and so on.
The structure of the network model can be configured according to the demands of the practical application; for example, the network model may include a public network part, a key point prediction branch, a confidence prediction branch, and so on. In addition, the numbers of levels of the public network part, the key point prediction branch, and the confidence prediction branch may also depend on the demands of the practical application. For details, refer to the method embodiments above; this is not repeated here.
There are various ways to track the face in the video stream based on the allocated memory resources and the network model. For example, the face key point coordinates and confidence level of the previous frame image of the current frame may be obtained, and then the face key point coordinates and confidence level of the current frame are predicted based on the allocated memory resources, the network model, and the face key point coordinates and confidence level of the previous frame image, and so on; that is, the applications stored in the memory 402 may also implement the following functions:
determining, from the video stream, the image that currently needs to be processed, obtaining the current frame; obtaining the face key point coordinates and confidence level of the previous frame image of the current frame; predicting the face key point coordinates and confidence level of the current frame based on the allocated memory resources, the network model, and the face key point coordinates and confidence level of the previous frame image, and returning to the step of determining the image that currently needs to be processed from the video stream, until all images in the video stream have been processed.
For example, when the confidence level of the previous frame image is determined to be greater than the preset threshold, the allocated memory resources may be used to calculate on the face key point coordinates of the previous frame image through the network model to obtain a calculation result; then the face key point coordinates of the current frame are predicted according to the calculation result, and the confidence level of the current frame is calculated.
The preset threshold can be configured according to the demands of the practical application; this is not repeated here.
It should be noted that if the confidence level of the previous frame is not higher than the preset threshold, the face key point coordinates of the previous frame have little reference value, so the face key point coordinates in the current frame can instead be obtained by detection. Similarly, if the face key point coordinates and confidence level of the previous frame image cannot be obtained, for example when the current frame is the first frame of the video stream, the face key point coordinates in the current frame can likewise be obtained by detection; that is, the applications stored in the memory 402 may also implement the following functions:
when the face key point coordinates and confidence level of the previous frame image cannot be obtained, or when the confidence level of the previous frame image is determined to be less than or equal to the preset threshold, detecting the face in the current frame through a face detection algorithm based on the allocated memory resources, so as to determine the face key point coordinates and confidence level of the current frame.
For the specific implementation of each of the above operations, refer to the embodiments above; this is not repeated here.
As can be seen from the above, when deep learning needs to be performed on a video stream in order to carry out face tracking, the mobile terminal of this embodiment can obtain the corresponding deep learning network model and allocate memory resources for the network model so that all layers of the network model share the same memory space; the video stream is then processed based on the allocated memory resources and the network model, so as to achieve real-time face tracking. Because all layers of the network model share the same memory space in this scheme, there is no need to allocate a separate memory space for each layer of the network model; this not only greatly reduces memory usage and improves computational efficiency but, since allocation is performed only once, also greatly reduces the number of allocation operations and lessens memory fragmentation, which helps improve application performance.
Furthermore, because this scheme demands little memory and is computationally efficient, its requirements on device performance are relatively low, making it suitable for devices such as mobile terminals; compared with schemes that place the deep learning forward algorithm on the server side, it can track faces more efficiently and flexibly, which helps improve the user experience.
Those of ordinary skill in the art can understand that all or part of the steps of the various methods of the above embodiments can be completed by instructing the related hardware through a program; the program can be stored in a computer-readable storage medium, and the storage medium may include: read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or the like.
The face tracking method and apparatus provided by the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core ideas. Meanwhile, for those skilled in the art, there will be changes in the specific implementation and scope of application according to the ideas of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (16)
1. A face tracking method, characterized by comprising:
obtaining a video stream on which face tracking needs to be performed and a deep learning network model;
allocating memory resources for the network model so that all layers of the network model share the same memory space;
tracking a face in the video stream based on the allocated memory resources and the network model.
2. The method according to claim 1, characterized in that allocating memory resources for the network model so that all layers of the network model share the same memory space comprises:
calculating the memory space required by each layer of the network in the network model;
taking the maximum of the memory spaces required by the layers as the size of a pre-allocated memory space;
allocating memory resources for the network model according to the size of the pre-allocated memory space.
3. The method according to claim 2, characterized in that calculating the memory space required by each layer of the network in the network model comprises:
obtaining a configuration file of the network model;
calculating the memory space required by each layer of the network in the network model according to the configuration file.
4. The method according to any one of claims 1 to 3, characterized in that tracking the face in the video stream based on the allocated memory resources and the network model comprises:
determining, from the video stream, an image that currently needs to be processed, obtaining a current frame;
obtaining face key point coordinates and a confidence level of a previous frame image of the current frame;
predicting face key point coordinates and a confidence level of the current frame based on the allocated memory resources, the network model, and the face key point coordinates and confidence level of the previous frame image, and returning to the step of determining the image that currently needs to be processed from the video stream, until all images in the video stream have been processed.
5. The method according to claim 4, characterized in that predicting the face key point coordinates and confidence level of the current frame based on the allocated memory resources, the network model, and the face key point coordinates and confidence level of the previous frame image comprises:
when the confidence level of the previous frame image is determined to be greater than a preset threshold, calculating on the face key point coordinates of the previous frame image through the network model using the allocated memory resources, obtaining a calculation result;
predicting the face key point coordinates of the current frame according to the calculation result, and calculating the confidence level of the current frame.
6. The method according to claim 5, characterized in that the network model comprises a public network part, a key point prediction branch, and a confidence prediction branch, and calculating on the face key point coordinates of the previous frame image through the network model to obtain a calculation result comprises:
calculating on the face key point coordinates of the previous frame image through the public network part, obtaining a calculation result;
and predicting the face key point coordinates of the current frame according to the calculation result and calculating the confidence level of the current frame comprises: processing the calculation result through the key point prediction branch to obtain the face key point coordinates of the current frame, and processing the calculation result through the confidence prediction branch to obtain the confidence level of the current frame.
7. The method according to claim 4, characterized by further comprising:
when the face key point coordinates and confidence level of the previous frame image cannot be obtained, or when the confidence level of the previous frame image is determined to be less than or equal to the preset threshold, detecting the face in the current frame through a face detection algorithm based on the allocated memory resources, so as to determine the face key point coordinates and confidence level of the current frame.
8. The method according to claim 7, wherein said detecting the face in the current frame by a face detection algorithm based on the allocated memory resource, to determine the face key point coordinates and confidence level of the current frame, comprises:
determining the face region of the current frame by the face detection algorithm based on the allocated memory resource;
predicting the positions of the facial features in the face region by the network model, to obtain the face key point coordinates and confidence level of the current frame.
9. The method according to claim 8, wherein said determining the face region of the current frame by the face detection algorithm based on the allocated memory resource comprises:
obtaining face features of the current frame by calculating an image integral map, based on the allocated memory resource;
constructing strong classifiers for faces and non-faces according to the face features, and cascading the strong classifiers in a same system;
processing the current frame with the strong classifiers to obtain the face region of the current frame.
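Claim 9 outlines a Viola–Jones-style detector: an integral image (summed-area table) makes the sum over any rectangle an O(1) four-lookup operation, and cascaded strong classifiers then accept or reject candidate windows. A sketch of the integral-image part follows, with the classifier stage reduced to a placeholder predicate; the patent does not specify these function names or the feature set.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: ii[y, x] = img[:y, :x].sum()."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of the h x w window at (y, x), from four table lookups."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])

def cascade_accepts(window_features, classifiers):
    """A window is a face candidate only if every cascaded classifier accepts it."""
    return all(clf(window_features) for clf in classifiers)

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
```

Because most windows are rejected by the first (cheapest) stages of the cascade, the full classifier chain runs on only a small fraction of the image.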
10. A face tracking device, comprising:
an acquiring unit, configured to obtain a video stream on which face tracking needs to be performed and a deep learning network model;
an allocation unit, configured to allocate a memory resource for the network model, so that all layers of the network model share a same memory space;
a tracking unit, configured to track the face in the video stream based on the allocated memory resource and the network model.
11. The device according to claim 10, wherein the allocation unit comprises a computation subunit and an allocation subunit;
the computation subunit is configured to calculate the memory space required by each layer of the network model;
the allocation subunit is configured to take the maximum of the memory spaces required by the layers as the size of a pre-allocated memory space, and to allocate the memory resource for the network model according to the size of the pre-allocated memory space.
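The allocation strategy of claim 11 is: compute each layer's requirement, pre-allocate one buffer sized to the largest requirement, and let every layer write into that same buffer, which works at inference time because only one layer's output needs to be live at once. A sketch under that assumption, with illustrative layer sizes:

```python
import numpy as np

# Elements required per layer -- illustrative numbers, not from the patent.
layer_sizes = [64 * 64 * 3, 32 * 32 * 16, 16 * 16 * 32, 128]

# Pre-allocate a single buffer once, sized to the most demanding layer.
buffer = np.empty(max(layer_sizes), dtype=np.float32)

def layer_output_view(layer_index):
    """Every layer's output is a view into the same shared buffer."""
    return buffer[: layer_sizes[layer_index]]

views = [layer_output_view(i) for i in range(len(layer_sizes))]
```

Compared with giving each layer its own allocation, this caps peak memory at the largest layer instead of the sum of all layers, and avoids the fragmentation of repeated per-layer allocations.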
12. The device according to claim 11, wherein
the computation subunit is specifically configured to obtain a configuration file of the network model, and to calculate the memory space required by each layer of the network model according to the configuration file.
13. The device according to any one of claims 10 to 12, wherein the tracking unit comprises a determination subunit, a parameter acquiring subunit and a prediction subunit;
the determination subunit is configured to determine, according to the video stream, the image currently to be processed, to obtain a current frame;
the parameter acquiring subunit is configured to obtain the face key point coordinates and confidence level of the previous frame image of the current frame;
the prediction subunit is configured to predict the face key point coordinates and confidence level of the current frame based on the allocated memory resource, the network model, and the face key point coordinates and confidence level of the previous frame image, and to trigger the determination subunit to perform the operation of determining, according to the video stream, the image currently to be processed, until the images in the video stream have been processed.
14. The device according to claim 13, wherein the prediction subunit is specifically configured to:
when it is determined that the confidence level of the previous frame image is greater than a preset threshold, calculate the face key point coordinates of the previous frame image by the network model using the allocated memory resource, to obtain a calculation result;
predict the face key point coordinates of the current frame according to the calculation result, and calculate the confidence level of the current frame.
15. The device according to claim 13, wherein the tracking unit further comprises a detection subunit;
the detection subunit is configured to, when the face key point coordinates and confidence level of the previous frame image of the current frame cannot be obtained, or when it is determined that the confidence level of the previous frame image is less than or equal to the preset threshold, detect the face in the current frame by a face detection algorithm based on the allocated memory resource, to determine the face key point coordinates and confidence level of the current frame.
16. The device according to claim 15, wherein
the detection subunit is specifically configured to determine the face region of the current frame by the face detection algorithm based on the allocated memory resource, and to predict the positions of the facial features in the face region by the network model, to obtain the face key point coordinates and confidence level of the current frame.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710108748.7A CN106919918B (en) | 2017-02-27 | 2017-02-27 | Face tracking method and device |
PCT/CN2018/076238 WO2018153294A1 (en) | 2017-02-27 | 2018-02-11 | Face tracking method, storage medium, and terminal device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710108748.7A CN106919918B (en) | 2017-02-27 | 2017-02-27 | Face tracking method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106919918A (en) | 2017-07-04 |
CN106919918B CN106919918B (en) | 2022-11-29 |
Family
ID=59453864
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710108748.7A Active CN106919918B (en) | 2017-02-27 | 2017-02-27 | Face tracking method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106919918B (en) |
WO (1) | WO2018153294A1 (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108388879A (en) * | 2018-03-15 | 2018-08-10 | Banma Network Technology Co., Ltd. | Target detection method, device and storage medium |
WO2018153294A1 (en) * | 2017-02-27 | 2018-08-30 | Tencent Technology (Shenzhen) Co., Ltd. | Face tracking method, storage medium, and terminal device |
CN109285119A (en) * | 2018-10-23 | 2019-01-29 | Baidu Online Network Technology (Beijing) Co., Ltd. | Super resolution image generation method and device |
CN109447253A (en) * | 2018-10-26 | 2019-03-08 | Hangzhou Bizhi Technology Co., Ltd. | Video memory allocation method and apparatus, computing device and computer storage medium |
CN109460077A (en) * | 2018-11-19 | 2019-03-12 | Shenzhen Bowei Education Technology Co., Ltd. | Automatic tracking method, automatic tracking device and automatic tracking system |
CN109508575A (en) * | 2017-09-14 | 2019-03-22 | Shenzhen SuperD Technology Co., Ltd. | Face tracking method and device, electronic equipment and computer readable storage medium |
CN111666150A (en) * | 2020-05-09 | 2020-09-15 | Shenzhen Intellifusion Technologies Co., Ltd. | Storage space allocation method and device, terminal and computer readable storage medium |
CN112286694A (en) * | 2020-12-24 | 2021-01-29 | Hanbo Semiconductor (Shanghai) Co., Ltd. | Hardware accelerator memory allocation method and system based on deep learning computing network |
CN113221630A (en) * | 2021-03-22 | 2021-08-06 | Liu Hong | Method for estimating human-eye gaze at a camera lens and its application in intelligent wake-up |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111914598A (en) * | 2019-05-09 | 2020-11-10 | 北京四维图新科技股份有限公司 | Method, device and equipment for detecting key points of continuous frame human face and storage medium |
CN110516620B (en) * | 2019-08-29 | 2023-07-28 | 腾讯科技(深圳)有限公司 | Target tracking method and device, storage medium and electronic equipment |
CN113409354A (en) * | 2020-03-16 | 2021-09-17 | 深圳云天励飞技术有限公司 | Face tracking method and device and terminal equipment |
CN111881838B (en) * | 2020-07-29 | 2023-09-26 | 清华大学 | Dyskinesia assessment video analysis method and equipment with privacy protection function |
CN112101106A (en) * | 2020-08-07 | 2020-12-18 | 深圳数联天下智能科技有限公司 | Face key point determination method and device and storage medium |
CN112417985A (en) * | 2020-10-30 | 2021-02-26 | 杭州魔点科技有限公司 | Face feature point tracking method, system, electronic equipment and storage medium |
CN113723214B (en) * | 2021-08-06 | 2023-10-13 | 武汉光庭信息技术股份有限公司 | Face key point labeling method, system, electronic equipment and storage medium |
CN113792633B (en) * | 2021-09-06 | 2023-12-22 | 北京工商大学 | Face tracking system and method based on neural network and optical flow method |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1822024A (en) * | 2006-04-13 | 2006-08-23 | Beijing Vimicro Electronics Co., Ltd. | Positioning method for human face feature points |
CN103699905A (en) * | 2013-12-27 | 2014-04-02 | Shenzhen Jieshun Science and Technology Industry Co., Ltd. | Method and device for locating a license plate |
CN104036240A (en) * | 2014-05-29 | 2014-09-10 | Xiaomi Technology Co., Ltd. | Face feature point positioning method and device |
US20160180214A1 (en) * | 2014-12-19 | 2016-06-23 | Google Inc. | Sharp discrepancy learning |
CN105787448A (en) * | 2016-02-28 | 2016-07-20 | Nanjing University of Information Science and Technology | Facial shape tracking method based on space-time cascade shape regression |
CN106203333A (en) * | 2016-07-08 | 2016-12-07 | LeTV Holdings (Beijing) Co., Ltd. | Face recognition method and system |
CN106295567A (en) * | 2016-08-10 | 2017-01-04 | Tencent Technology (Shenzhen) Co., Ltd. | Key point localization method and terminal |
US20170039468A1 (en) * | 2015-08-06 | 2017-02-09 | Clarifai, Inc. | Systems and methods for learning new trained concepts used to retrieve content relevant to the concepts learned |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7184602B2 (en) * | 2003-05-02 | 2007-02-27 | Microsoft Corp. | System and method for low bandwidth video streaming for face-to-face teleconferencing |
CN106056529B (en) * | 2015-04-03 | 2020-06-02 | Alibaba Group Holding Limited | Method and equipment for training convolutional neural network for picture recognition |
CN106295707B (en) * | 2016-08-17 | 2019-07-02 | Beijing Xiaomi Mobile Software Co., Ltd. | Image recognition method and device |
CN106919918B (en) * | 2017-02-27 | 2022-11-29 | Tencent Technology (Shanghai) Co., Ltd. | Face tracking method and device |
- 2017-02-27 CN CN201710108748.7A patent/CN106919918B/en active Active
- 2018-02-11 WO PCT/CN2018/076238 patent/WO2018153294A1/en active Application Filing
Non-Patent Citations (2)
Title |
---|
TIANQI CHEN et al.: "Training Deep Nets with Sublinear Memory Cost", Machine Learning * |
LUAN, Zhong: "Research on an Improved AdaBoost Face Detection Algorithm and Its FPGA Implementation", China Masters' Theses Full-text Database * |
Also Published As
Publication number | Publication date |
---|---|
CN106919918B (en) | 2022-11-29 |
WO2018153294A1 (en) | 2018-08-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106919918A (en) | Face tracking method and device | |
CN106778585B (en) | Face key point tracking method and device | |
CN104618440B (en) | Smart device control method and device | |
CN108304758A (en) | Facial feature tracking method and device | |
CN104363988B (en) | Multi-core processor management method and device | |
CN107357656A (en) | Memory allocation method, mobile terminal and computer-readable storage medium | |
CN108984064A (en) | Multi-screen display method and device, storage medium and electronic device | |
CN106817540A (en) | Camera control method and device | |
CN107241784A (en) | Internet of Things access method and device | |
CN104699501B (en) | Method and device for running an application program | |
CN107105093A (en) | Camera control method, device and terminal based on hand trajectory | |
CN107040610A (en) | Data synchronization method, device, storage medium, terminal and server | |
CN109062468A (en) | Multi-screen display method and device, storage medium and electronic device | |
CN106200970A (en) | Split-screen display method and terminal | |
CN109067981A (en) | Split-screen application switching method, device, storage medium and electronic device | |
CN108241752A (en) | Photo display method, mobile terminal and computer-readable storage medium | |
CN108958629A (en) | Split-screen exit method, device, storage medium and electronic device | |
CN109460170A (en) | Screen extension and interaction method, terminal and computer-readable storage medium | |
CN109922539A (en) | Network connection method and related product | |
CN104901992B (en) | Resource transfer method and device | |
CN107256334A (en) | Recipe matching method and related product | |
CN109062469A (en) | Multi-screen display method and device, storage medium and electronic device | |
CN106708500B (en) | Display method and device for an uninstall interface | |
CN108234979A (en) | Image capture method, mobile terminal and computer-readable storage medium | |
CN107708071A (en) | Transmission power control method and related product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||