CN110516642A - Lightweight 3D facial keypoint detection method and system - Google Patents
- Publication number
- CN110516642A (application number CN201910818443.4A)
- Authority
- CN
- China
- Prior art keywords
- network
- sub
- heatmap
- joint encoding
- coordinate vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Human Computer Interaction (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a lightweight 3D facial keypoint detection method and system. The method comprises: projecting the N 3D reference coordinate vectors of the facial keypoints in a database onto three two-dimensional planes for dimensionality reduction; constructing a joint encoding sub-network based on a k-order improved hourglass network, and using it to jointly encode the N 2D reference coordinate vectors under each 2D view into a 2D joint heatmap; stacking the 2D joint heatmaps of the three 2D views into a 3D joint heatmap using the concat method; and constructing a decoding sub-network based on a 2D fully convolutional network, and using it to decode the 3D joint heatmap into N 3D detection coordinate vectors. The invention designs corresponding lightweight neural networks (the joint encoding sub-network and the decoding sub-network) to generate the joint heatmap and regress the 3D coordinates. By combining the advantages of existing 2D and 3D facial keypoint detection methods, it reduces the number of model parameters and increases the running speed of the model while maintaining high detection accuracy.
Description
Technical field
The present invention relates to the technical field of image processing and computer vision, and in particular to a lightweight 3D facial keypoint detection method and system.
Background art
With the rapid development of deep learning in computer vision, face image processing tasks of all kinds have found wide application in daily life. Among them, facial keypoint detection plays an important role in face recognition, expression recognition, and related applications.
Facial keypoint detection has made great progress over the past decade, especially in the 2D case. The ASM (Active Shape Model) algorithm proposed by Cootes et al., which is based on a point distribution model, is a classic facial keypoint detection algorithm: the training set is first annotated manually, a shape model is obtained by training, and a specific object is then matched by matching its keypoints. The CPR (Cascaded Pose Regression) algorithm proposed by Dollar refines a given initial prediction step by step through a series of regressors; each regressor performs simple image operations that depend on the output of the previous regressor, and the whole system can be learned automatically from training samples. In addition, Zhang et al. proposed MTCNN (Multi-task Cascaded Convolutional Networks), a multi-task cascaded convolutional neural network that handles face detection and facial keypoint localization at the same time. However, in complex scenes such as large pose angles and facial occlusion, 2D-based facial keypoint detection methods are difficult to apply and have clear limitations. To overcome these limitations, more and more researchers have turned to 3D facial keypoint detection, since 3D facial keypoints carry more information than 2D ones and provide more information about occluded regions.
3D facial keypoint detection methods fall roughly into model-based methods and model-free methods. (1) Model-based methods: the 3D morphable model (3DMM) proposed by Blanz et al. is a common way to perform 3D facial keypoint detection. (2) Model-free methods: Tulyakov et al. proposed locating 3D facial keypoints with cascaded regression over computed 3D shape features, which extends cascaded regression to 3D facial keypoint detection. Model-based methods also include methods that perform facial keypoint detection with deep learning models, mainly two-stage regression and volumetric representation. A typical two-stage regression method separates the (x, y) coordinates from the z axis, regressing (x, y) first and then z. Volumetric representation extends the traditional 2D heatmap to a 3D volumetric form and is also widely used in human body keypoint detection.
However, because of the added 3D spatial dimension, both the processing speed and the accuracy of such algorithms face great challenges. Existing 3D facial keypoint detection algorithms all have shortcomings, to varying degrees, in processing speed, model size, complexity, and other respects.
Summary of the invention
At least one object of the present invention is to provide a lightweight 3D facial keypoint detection method and system that overcomes the above problems of the prior art.
To achieve the above object, the technical solution adopted by the present invention includes the following aspects.
A lightweight 3D facial keypoint detection method, comprising:
Step 101: projecting the N 3D reference coordinate vectors of the facial keypoints in a database onto three two-dimensional planes for dimensionality reduction, where the three planes are the xy, xz, and yz planes, and x, y, and z are all positive or all negative; each plane contains N 2D reference coordinate vectors corresponding to the N 3D reference coordinate vectors;
Step 102: constructing a joint encoding sub-network based on a k-order improved hourglass network, and training the joint encoding sub-network until its performance stabilizes; using the trained joint encoding sub-network to jointly encode the N 2D reference coordinate vectors under each 2D view into a 2D joint heatmap, where the residual unit of the k-order improved hourglass network uses a Residual+Inception structure;
Step 103: stacking the 2D joint heatmaps of the three 2D views into a 3D joint heatmap using the concat method;
Step 104: constructing a decoding sub-network based on a 2D fully convolutional network, and training the decoding sub-network until its performance stabilizes; using the decoding sub-network to decode the 3D joint heatmap into N 3D detection coordinate vectors.
Preferably, the joint encoding sub-network is a 2-order improved hourglass network.
Preferably, the joint encoding sub-network and the decoding sub-network are trained using a multi-loss-function fusion training method.
Preferably, the multi-loss-function fusion training method trains the network for three rounds of iterations using three different loss functions, taking the optimal weights obtained in the previous round as the initial weights of the next round, and stops when the three rounds are complete.
Preferably, the three loss functions are the mean squared error loss function, the least absolute error (L1) loss function, and the smoothed least absolute error (Smooth L1) loss function.
Preferably, the decoding sub-network comprises four 2D convolutional layers, with batch normalization and a LeakyReLU activation function between consecutive convolutional layers.
A lightweight 3D facial keypoint detection system, comprising at least one processor and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the above method.
In summary, by adopting the above technical solution, the present invention has at least the following advantages:
1. By combining the advantages of 2D and 3D facial keypoint detection, a joint heatmap and coordinate regression method is proposed, and corresponding lightweight neural networks (the joint encoding sub-network and the decoding sub-network) are designed to generate the joint heatmap and regress the 3D coordinates. The joint heatmap representation reduces the amount of computation and the model complexity; while maintaining high detection accuracy, it reduces the number of model parameters and increases the running speed of the model. In the joint encoding process, the residual unit of the original neural network is improved, which further strengthens the feature extraction ability and the detection accuracy of the network.
2. The joint encoding sub-network uses a 2-order improved hourglass structure, which reduces the depth of the network, speeds up convergence, and reduces the number of network parameters.
3. A multi-loss-function fusion training method is proposed, in which the network is trained for three rounds of iterations with three different loss functions, improving the detection accuracy of the network.
Brief description of the drawings
Fig. 1 is a flowchart of the lightweight 3D facial keypoint detection method according to an exemplary embodiment of the present invention.
Fig. 2 is a schematic diagram of the residual unit of the original hourglass network.
Fig. 3 is a schematic diagram of the residual unit of the improved hourglass network according to an exemplary embodiment of the present invention.
Fig. 4 is a schematic diagram of the improved second-order hourglass network (joint encoding sub-network) according to an exemplary embodiment of the present invention.
Fig. 5 is an example heatmap generated by the joint encoding sub-network according to an exemplary embodiment of the present invention.
Fig. 6 is a schematic diagram of the 3D keypoints generated by the decoding sub-network according to an exemplary embodiment of the present invention.
Fig. 7 is a schematic diagram of the 3D keypoints generated by the decoding sub-network, projected onto the image, according to an exemplary embodiment of the present invention.
Fig. 8 is a schematic diagram of the complete network formed by the joint encoding sub-network and the decoding sub-network according to an exemplary embodiment of the present invention.
Fig. 9 is a schematic diagram of the structure of the lightweight 3D facial keypoint detection system according to an exemplary embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings and embodiments, so that its objects, technical solutions, and advantages become clearer. It should be understood that the specific embodiments described here only explain the present invention and are not intended to limit it.
Fig. 1 shows a lightweight 3D facial keypoint detection method according to an exemplary embodiment of the present invention. The method of this embodiment mainly comprises:
Step 101: projecting the N 3D reference coordinate vectors of the facial keypoints in a database onto three two-dimensional planes for dimensionality reduction, where the three planes are the xy, xz, and yz planes, and x, y, and z are all positive or all negative; each plane contains N 2D reference coordinate vectors corresponding to the N 3D reference coordinate vectors.
Specifically, the 3D reference coordinate vectors of the N facial keypoints are extracted from the ground truth (GT) data set. A generic face has 68 keypoints in total, so N = 68 is preferred in this embodiment. The N extracted 3D keypoint reference coordinate vectors (x, y, z) are decomposed onto three two-dimensional planes; concretely, each is projected into three 2D reference coordinate vectors (x, y), (y, z), and (x, z). Let $V_{x,y,z} = (x, y, z)$ denote the 3D reference coordinate vector of a keypoint; the three separated 2D reference coordinate vectors are then:
$$v_{x,y} = (x, y), \quad v_{y,z} = (y, z), \quad v_{x,z} = (x, z)$$
For example, the 3D point (1, -2, 3) decomposes into (1, -2), (-2, 3), and (1, 3). However, so that the joint 2D heatmaps can be formed later, the projection onto the xy, yz, and xz coordinate planes is performed with x, y, and z having the same sign (all positive or all negative). This guarantees that each 3D coordinate yields three 2D reference coordinates with the same sign after dimensionality reduction. Preferably, the projection is done onto the three faces of the first octant of the coordinate system (x, y, and z all positive).
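The decomposition in Step 101 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function names `shift_to_first_octant` and `project_to_planes` and the `margin` parameter are hypothetical, and translating the keypoints so that every coordinate becomes positive is only one assumed way to satisfy the same-sign requirement.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def shift_to_first_octant(points: List[Point3D], margin: float = 1.0) -> List[Point3D]:
    """Translate all points so every coordinate is >= margin (assumed preprocessing)."""
    mins = [min(p[axis] for p in points) for axis in range(3)]
    return [tuple(p[axis] - mins[axis] + margin for axis in range(3)) for p in points]


def project_to_planes(points: List[Point3D]):
    """Split each (x, y, z) into its (x, y), (y, z) and (x, z) projections."""
    xy: List[Point2D] = [(x, y) for x, y, z in points]
    yz: List[Point2D] = [(y, z) for x, y, z in points]
    xz: List[Point2D] = [(x, z) for x, y, z in points]
    return xy, yz, xz


# Two sample keypoints; the first one is the example point from the text.
keypoints = [(1.0, -2.0, 3.0), (0.0, 4.0, -1.0)]
shifted = shift_to_first_octant(keypoints)
xy, yz, xz = project_to_planes(shifted)
```

After the shift, every coordinate of the sample points is positive, so each of the three projections contains only positive 2D coordinates.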
Step 102: constructing a joint encoding sub-network based on a k-order improved hourglass network, and training the joint encoding sub-network until its performance stabilizes; using the trained joint encoding sub-network to jointly encode the N 2D reference coordinate vectors under each 2D view into a 2D joint heatmap, where the residual unit of the k-order improved hourglass network uses a Residual+Inception structure.
Specifically, as shown in Fig. 3, the residual sub-unit inside the hourglass network (original structure in Fig. 2) is improved into a Residual+Inception structure that widens the network. The convolution kernels have size n × n and the pooling kernels have size n × n (n = 2k + 1, k a positive integer), and the resulting receptive fields of different sizes are then fused along the channel dimension. The fused feature map carries different receptive fields and different semantic information about the input image. For input images of different scales, the improved hourglass network has a stronger feature extraction ability, which improves detection accuracy. After the residual unit is changed, the hourglass network becomes wider, so the representational capacity of the network can be increased through width. If the traditional 4-order hourglass structure were still used, the network would overfit because of its excessive number of parameters. To prevent overfitting, only a 2-order hourglass structure is retained. As shown in Fig. 4, the joint encoding sub-network of the present invention uses a 2-order improved hourglass network to extract features from the input image efficiently, avoiding the overfitting caused by the increased width and excessive parameters that the Inception structure brings to the encoding sub-network. Using a 2-order improved hourglass network greatly reduces the depth of the network, lets it converge faster, and reduces its number of parameters. The green rectangular modules in the figure consist of the improved Residual+Inception sub-units; the first row of numbers inside a green rectangle is the number of input channels, and the second row is the number of output channels. In a first-order hourglass module, the upper branch operates at the original scale, while the lower branch first downsamples and then upsamples; downsampling uses max pooling, upsampling uses nearest-neighbor interpolation, and the outputs of the two branches are finally added to give the final output. Hourglass modules of different orders lead to networks of different complexity and parameter counts.
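The branch structure of the hourglass module described above can be sketched as follows. The sketch assumes a single-channel feature map stored as a nested list whose side lengths are divisible by 2 raised to the order, and it replaces the Residual+Inception sub-unit with a caller-supplied placeholder `unit`; the function names are hypothetical and nothing here reproduces the actual layers of Fig. 4.

```python
from typing import Callable, List

Grid = List[List[float]]


def max_pool_2x2(x: Grid) -> Grid:
    """Downsample by taking the maximum of each non-overlapping 2x2 block."""
    return [
        [max(x[r][c], x[r][c + 1], x[r + 1][c], x[r + 1][c + 1])
         for c in range(0, len(x[0]), 2)]
        for r in range(0, len(x), 2)
    ]


def upsample_nearest_2x(x: Grid) -> Grid:
    """Upsample by repeating every value twice along both axes."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def hourglass(x: Grid, order: int, unit: Callable[[Grid], Grid]) -> Grid:
    """Upper branch at full scale plus a pooled, processed, upsampled lower branch."""
    upper = unit(x)
    lower = unit(max_pool_2x2(x))
    if order > 1:
        # Higher orders nest another hourglass inside the lower branch.
        lower = hourglass(lower, order - 1, unit)
    lower = upsample_nearest_2x(unit(lower))
    return [[a + b for a, b in zip(ru, rl)] for ru, rl in zip(upper, lower)]


# Placeholder unit: identity, standing in for a Residual+Inception sub-unit.
identity = lambda grid: grid
first_order = hourglass([[1, 2], [3, 4]], 1, identity)
```

With the identity placeholder, the lower branch reduces the 2 × 2 input to its maximum and spreads it back over the full map before the two branches are added.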
The k-order hourglass joint encoding network is trained with facial images that carry coordinate values. The joint heatmap has size w × h × 3; for a facial image of size 256 × 256, the encoding resolution is usually set to 128 × 128 × 3, so that the encoding sub-network E forms a mapping E(I) → H from the facial image I with coordinates to the joint heatmap H. The network input is a facial image of size 128 × 128, and the output is a joint heatmap of size w × h (the size of the output heatmap can be configured as needed). Preferably, the generated joint heatmap is set to 64 × 64, so that the relative positions of the facial keypoints go from sparse to more compact, which reduces the spatial redundancy of the model and the number of network parameters.
Further, the detailed process by which the joint encoding sub-network jointly encodes the N 2D reference coordinate vectors into a 2D joint heatmap is as follows:
For the 2D reference coordinate vector (x, y) of a given keypoint, the vector is first encoded into a series of continuous values; these are then screened by taking the maximum, that is, the maximum of the encoded values is selected as the encoded value of the heatmap. Let $p_m(i_m, j_m)$ denote the value at position $(i_m, j_m)$ in the m-th heatmap, m ∈ {1, 2, 3}. For the n-th keypoint on the facial image, with positions $v_{x,y}$, $v_{y,z}$, $v_{x,z}$, the (x, y) coordinate vector is encoded in 2D Gaussian form (the other two coordinate vectors are handled in the same way), as shown in formula (1), where σ is the variance:
For a facial image with N keypoints, each keypoint is encoded into a series of continuous values; by taking the maximum, the encoded values of the N keypoints are merged onto one map to form the 2D joint heatmap, as shown in formula (2):
The above joint encoding process yields a 2D joint heatmap for each of the three views. Each 2D joint heatmap has size w × h and encodes all N keypoints. Fig. 5 shows an example heatmap generated by the joint encoding sub-network of the present invention.
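Formulas (1) and (2) describe a Gaussian encoding followed by a per-pixel maximum. A minimal sketch of that idea, assuming an unnormalized Gaussian and treating `sigma` as the spread parameter in the exponent, might look like this; the function names and the exact Gaussian form are assumptions, not the formulas of the invention.

```python
import math
from typing import List, Sequence, Tuple


def keypoint_heatmap(center: Tuple[float, float], width: int, height: int,
                     sigma: float) -> List[List[float]]:
    """Unnormalized 2D Gaussian centered on one keypoint (assumed form)."""
    cx, cy = center
    return [
        [math.exp(-((i - cx) ** 2 + (j - cy) ** 2) / (2.0 * sigma ** 2))
         for i in range(width)]
        for j in range(height)
    ]


def joint_heatmap(centers: Sequence[Tuple[float, float]], width: int, height: int,
                  sigma: float = 1.0) -> List[List[float]]:
    """Merge all keypoints into one map by taking the per-pixel maximum."""
    maps = [keypoint_heatmap(c, width, height, sigma) for c in centers]
    return [
        [max(m[j][i] for m in maps) for i in range(width)]
        for j in range(height)
    ]


# Two keypoints of one view, encoded on a small 5 x 4 map; rows index y, columns index x.
heat = joint_heatmap([(1, 1), (3, 2)], width=5, height=4, sigma=1.0)
```

Each keypoint produces a peak of 1.0 at its own location, and the maximum keeps the peaks of nearby keypoints from adding up into a single blurred blob.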
Step 103: stacking the 2D joint heatmaps of the three 2D views into a 3D joint heatmap using the concat method.
Specifically, the 2D joint heatmaps of the three two-dimensional planes are stacked using the concat method to obtain the 3D heatmap. The concat method is a concatenation operation used to join two or more arrays. With it, the three 2D joint heatmaps are stacked together to obtain a 3D heatmap of size w × h × 3 (where 3 is the number of channels), as shown in formula (3):
$$H = \mathrm{concat}(p_1, p_2, p_3) \quad (3)$$
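As an illustration of formula (3), the following sketch stacks three equally sized 2D maps along a new last axis. The helper name `concat_heatmaps` is hypothetical, and plain nested lists stand in for whatever tensor type an actual implementation would use.

```python
from typing import List

Map2D = List[List[float]]


def concat_heatmaps(p1: Map2D, p2: Map2D, p3: Map2D) -> List[List[List[float]]]:
    """Stack three maps of identical shape into one map with 3 channels per pixel."""
    if not (len(p1) == len(p2) == len(p3)):
        raise ValueError("heatmaps must have the same number of rows")
    for r1, r2, r3 in zip(p1, p2, p3):
        if not (len(r1) == len(r2) == len(r3)):
            raise ValueError("heatmaps must have the same number of columns")
    return [
        [[a, b, c] for a, b, c in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(p1, p2, p3)
    ]


# Three tiny 2 x 2 maps standing in for the xy, yz and xz joint heatmaps.
H = concat_heatmaps([[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]])
```

Every pixel of `H` now holds one value from each view, which is what allows a 2D convolutional decoder to read all three projections at once.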
Step 104: constructing a decoding sub-network based on a 2D fully convolutional network, and training the decoding sub-network until its performance stabilizes; using the decoding sub-network to decode the 3D joint heatmap into N 3D detection coordinate vectors.
Specifically, after pre-training, the decoding sub-network forms a mapping D(H) → c from the joint heatmap H to the corresponding 3D coordinate vector c. Since the joint heatmap H has size w × h × 3, the decoding sub-network is built as a 2D fully convolutional network that decodes the heatmap, as shown in Fig. 4 (see Annex 1). The decoding sub-network contains five 2D convolutional layers in total, with 128, 128, 256, 256, and 512 convolution kernels respectively; each kernel has size 4 × 4 and the stride is 2. The last convolutional layer has N × 3 channels, batch normalization and a LeakyReLU activation function are placed between consecutive convolutional layers, and the final layer is a global average pooling layer. Passing the 3D joint heatmap obtained by the concat method through the decoding sub-network yields the N 3D keypoint coordinate vectors. As shown in Fig. 6, the decoding sub-network thus completes the extraction of the 3D detection coordinate vectors of the N facial keypoints. Further, for ease of visualization, the 3D keypoint coordinates are projected onto the 2D image, as shown in Fig. 7.
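To make the layer arithmetic concrete, the sketch below tracks the feature-map size through five stride-2 convolutions with 4 × 4 kernels and then regroups globally pooled channels into keypoints. The padding of 1, the 64 × 64 input, and the interleaved (x, y, z) channel order are assumptions that the text does not state; all function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def conv_output_size(size: int, kernel: int = 4, stride: int = 2, padding: int = 1) -> int:
    """Standard convolution size formula: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def decoder_feature_sizes(input_size: int, num_layers: int) -> List[int]:
    """Spatial size after each stride-2 convolution, starting with the input size."""
    sizes = [input_size]
    for _ in range(num_layers):
        sizes.append(conv_output_size(sizes[-1]))
    return sizes


def global_average_pool(feature_maps: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Average each channel's h x w map down to a single number."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def to_keypoints(channel_means: Sequence[float], num_keypoints: int) -> List[Tuple[float, ...]]:
    """Regroup N*3 pooled values into N (x, y, z) triples (assumed channel order)."""
    if len(channel_means) != num_keypoints * 3:
        raise ValueError("expected num_keypoints * 3 channels")
    return [tuple(channel_means[3 * n: 3 * n + 3]) for n in range(num_keypoints)]


sizes = decoder_feature_sizes(64, 5)
pooled = global_average_pool([[[1, 3], [5, 7]], [[2, 2], [2, 2]], [[0, 0], [0, 4]]])
coords = to_keypoints(pooled, num_keypoints=1)
```

Under these assumptions each convolution halves the spatial size, and global average pooling removes the remaining spatial extent so that only the N × 3 coordinate values are left.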
Further, considering that during network training different loss functions have different convergence speeds and lead to different extreme points, the present invention trains the joint encoding sub-network and the decoding sub-network with a multi-loss-function fusion training method. This method trains the network for three rounds of iterations using three different loss functions, taking the optimal weights obtained in the previous round as the initial weights of the next round, and stops once the three rounds are complete. Each loss function is sensitive to errors of a different magnitude: the mean squared error (MSE) loss is sensitive to large errors, so it converges quickly when errors are large; the least absolute error (L1) loss and the smoothed least absolute error (Smooth L1) loss are sensitive to small errors, so they converge faster when errors are small. Three loss functions are therefore used in three successive training rounds: the first round uses the MSE loss function, the second round uses the L1 loss function, and the third round uses the Smooth L1 loss function, with the optimal weights after each round serving as the initial weights of the next. Training in this way improves the detection accuracy of the network. The corresponding loss functions of the joint encoding sub-network are formulas (4) to (6):
$$L_{hm1} = \sum |E(I) - H|^2 \quad (4)$$
$$L_{hm2} = \sum |E(I) - H| \quad (5)$$
The corresponding loss functions of the decoding sub-network are formulas (7) to (9):
$$L_{coord1} = \sum |D(H) - c|^2 \quad (7)$$
$$L_{coord2} = \sum |D(H) - c| \quad (8)$$
Here D denotes the decoding sub-network, c denotes the 3D detection coordinate vector, H denotes the joint heatmap, E denotes the joint encoding sub-network, and I denotes the facial image with coordinate vectors.
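The three loss types can be sketched as plain Python functions over flattened predictions and targets. The sum-of-squares and sum-of-absolute-values forms follow formulas (4), (5), (7), and (8); the Smooth L1 form shown is the commonly used piecewise definition with threshold `beta`, which is an assumption here because the text does not spell out formulas (6) and (9).

```python
from typing import Sequence


def mse_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Sum of squared errors, as in formulas (4) and (7)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Sum of absolute errors, as in formulas (5) and (8)."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def smooth_l1_loss(pred: Sequence[float], target: Sequence[float], beta: float = 1.0) -> float:
    """Assumed Smooth L1: quadratic for errors below beta, linear above it."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


pred = [0.0, 2.0, 0.5]
target = [0.0, 0.0, 0.0]
losses = (mse_loss(pred, target), l1_loss(pred, target), smooth_l1_loss(pred, target))
```

On this example the error of 2.0 dominates the MSE value, while the L1 and Smooth L1 values grow only linearly with it, which matches the sensitivity argument made above.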
In a further embodiment of the present invention, the joint encoding sub-network and the decoding sub-network are first pre-trained separately, and the two networks are then connected and fine-tuned as a whole (Fig. 8 is a schematic diagram of this complete network; in the implementation, the concat operation is inserted between the two network models to stack the 2D joint heatmaps into the 3D heatmap). This is carried out in two steps:
Step 1: in the pre-training stage, the joint encoding sub-network is trained with facial images that carry coordinate vectors, so that it forms a nonlinear mapping whose input layer is the N facial images with coordinate vectors and whose output layer is the joint heatmap. At the same time, the decoding sub-network is trained with the 3D joint heatmaps, so that it forms a nonlinear mapping whose input layer is the 3D joint heatmap and whose output layer is the 3D detection coordinate vector.
Step 2: in the fine-tuning stage, the pre-trained decoding sub-network is connected after the pre-trained joint encoding sub-network, with the concat operation (implemented in code) inserted between the two networks, forming a complete joint-heatmap 3D facial keypoint detection network model. This complete model is fine-tuned: the input is the N original facial images with coordinate vectors, and the outputs are, in order, the corresponding 2D joint heatmaps and the corresponding 3D keypoint coordinate vectors. The whole network is finally trained end to end using the multi-loss-function fusion training method: the first round uses the mean squared error loss function, with loss value $L_{hm1} + L_{coord1}$; the second round uses the least absolute error loss function, with loss value $L_{hm2} + L_{coord2}$; the third round uses the smoothed least absolute error loss function, with loss value $L_{hm3} + L_{coord3}$. The optimal weights obtained in each round serve as the initial weights of the next round, and training stops when the three rounds end and the final training result is obtained.
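The three-round schedule can be illustrated on a toy problem. The sketch below fits a single weight `w` in the model `w * x` by gradient descent, switching from MSE to L1 to Smooth L1 and carrying the best weight of each round into the next. The toy model, learning rate, step count, and all function names are assumptions chosen for illustration; the method described above applies the same schedule to the combined encoder and decoder losses.

```python
def sign(d):
    """Return -1, 0 or 1 according to the sign of d."""
    return (d > 0) - (d < 0)


def per_sample_loss(name, d):
    """Loss of one residual d under the named loss function."""
    if name == "mse":
        return d * d
    if name == "l1":
        return abs(d)
    return 0.5 * d * d if abs(d) < 1.0 else abs(d) - 0.5  # smooth_l1


def per_sample_grad(name, d):
    """Derivative of the per-sample loss with respect to the residual d."""
    if name == "mse":
        return 2.0 * d
    if name == "l1":
        return float(sign(d))
    return d if abs(d) < 1.0 else float(sign(d))  # smooth_l1


def total_loss(name, w, xs, ys):
    return sum(per_sample_loss(name, w * x - y) for x, y in zip(xs, ys))


def train_round(name, w_init, xs, ys, lr=0.05, steps=200):
    """Run gradient descent with one loss and return the best weight seen."""
    w = best_w = w_init
    best = total_loss(name, w, xs, ys)
    for _ in range(steps):
        grad = sum(per_sample_grad(name, w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
        current = total_loss(name, w, xs, ys)
        if current < best:
            best_w, best = w, current
    return best_w


def multi_loss_training(xs, ys, w0=0.0):
    """Three rounds: MSE, then L1, then Smooth L1, each starting from the previous best."""
    w = w0
    for name in ("mse", "l1", "smooth_l1"):
        w = train_round(name, w, xs, ys)
    return w


# Toy data generated by y = 2 * x, so the ideal weight is 2.
w_final = multi_loss_training([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Keeping the best weight of each round, rather than the last one, means a round whose loss function makes no further progress simply hands its starting point on to the next round.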
In a further embodiment of the present invention, the extracted detection coordinate vectors are compared with the reference coordinate vectors, and the algorithm is validated with concrete experimental data. The accuracy of this algorithm is compared with that of prior-art algorithms; the results are shown in Table 1 and Table 2:
Table 1: GTE performance of 3D-FAN, JVCR, and this algorithm on the AFLW2000-3D data set
Table 2: Network parameter size (MB) and time to process one image (ms) for 3DDFA, 3D-FAN, JVCR, and this algorithm
Fig. 9 shows the joint-heatmap-based 3D facial keypoint detection system according to an exemplary embodiment of the present invention, namely an electronic device 310 (for example, a computer server capable of executing programs). It comprises at least one processor 311, a power supply 314, and a memory 312 and an input/output interface 313 communicatively connected to the at least one processor 311. The memory 312 stores instructions executable by the at least one processor 311, and the instructions, when executed by the at least one processor 311, enable the at least one processor 311 to perform the method disclosed in any of the foregoing embodiments. The input/output interface 313 may include a display, a keyboard, a mouse, and a USB interface for inputting and outputting data. The power supply 314 supplies electrical power to the electronic device 310.
Those skilled in the art will understand that all or part of the steps of the above method embodiments can be carried out by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a removable storage device, a read-only memory (ROM), a magnetic disk, or an optical disc.
When the integrated unit of the present invention is implemented as a software functional unit and sold or used as an independent product, it can also be stored in a computer-readable storage medium. On this basis, the technical solution of the embodiments of the present invention, or the part of it that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a removable storage device, a ROM, a magnetic disk, or an optical disc.
The above is only a detailed description of specific embodiments of the present invention and does not limit the present invention. Any replacements, modifications, and improvements made by those skilled in the relevant art without departing from the principle and scope of the present invention shall fall within the protection scope of the present invention.
Claims (7)
1. a kind of lightweight face 3D critical point detection method characterized by comprising
Step 101, N number of 3D reference coordinate vector of face key point in database is subjected to dimensionality reduction throwing in three two-dimensional surfaces
Shadow;Wherein, three two-dimensional surfaces are respectively xy, xz, yz plane, and x, y, z is positive or is negative simultaneously simultaneously;Each two dimension
It include N number of 2D reference coordinate vector corresponding with the N number of 3D reference coordinate vector in plane;
Step 102, it is based on k rank modified hourglass network struction combined coding sub-network, the training combined coding sub-network makes
Its performance tends towards stability;N number of 2D reference coordinate vector under each visual angle 2D is joined using trained combined coding sub-network
Conjunction is encoded to 2D joint thermodynamic chart;Wherein, the k rank modified hourglass network residual unit uses Residual+Inception
Structure;
Step 103, the 2D joint thermodynamic chart under three visual angles 2D is superposed to by 3D joint thermodynamic chart using concat method;
Step 104, the decoding sub-network is constructed based on the full convolutional network of 2D, the training decoding sub-network tends to its performance
Stablize;3D joint thermodynamic chart is decoded as N number of 3D using the decoding sub-network and detects coordinate vector.
2. the method according to claim 1, wherein the combined coding sub-network is 2 rank modified hourglass nets
Network.
3. The method according to claim 1, characterized in that the combined coding sub-network and the decoding sub-network are trained using a multi-loss-function fusion training method.
4. The method according to claim 3, characterized in that the multi-loss-function fusion training method trains the network iteratively for three rounds using three different loss functions, the optimal weights obtained in each round serving as the initial weights of the next round, and training stops once the three rounds are complete.
5. The method according to claim 4, characterized in that the three loss functions are: a mean squared error loss function, a least absolute error loss function, and a smoothed least absolute error loss function.
6. The method according to claim 1, characterized in that the decoding sub-network comprises four 2D convolutional layers, with batch normalization and a LeakyReLU activation function placed between the convolutional layers.
7. A lightweight face 3D key point detection system, characterized by comprising at least one processor and a memory communicatively connected to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method according to any one of claims 1 to 6.
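As an illustration of step 101 of claim 1, the projection of the N 3D reference coordinate vectors onto the xy, xz, and yz planes can be sketched as dropping one axis per plane. This is a minimal sketch, not the applicant's implementation: the function name `project_to_planes` and the plain tuple representation of the coordinates are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_to_planes(points: List[Point3D]) -> Dict[str, List[Point2D]]:
    """Project N 3D points onto the xy, xz and yz planes.

    Each projection simply drops the axis that is orthogonal to the plane,
    so every plane receives N 2D points, one per input 3D point.
    (Hypothetical helper; names are not taken from the patent.)
    """
    return {
        "xy": [(x, y) for x, y, _ in points],
        "xz": [(x, z) for x, _, z in points],
        "yz": [(y, z) for _, y, z in points],
    }


# Two made-up key points, used only to show the shape of the result.
reference_points = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
views = project_to_planes(reference_points)
```

Each of the three lists in `views` has the same length N as the input, which matches the statement in claim 1 that each plane contains N 2D reference coordinate vectors.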
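The three-round training schedule of claims 4 and 5 can be sketched on a toy problem. The sketch below rests on several assumptions that the claims do not state: the least absolute error loss is taken to be the usual L1 loss, the smoothed variant is taken to be the common smooth L1 form with a threshold `beta`, the rounds run in the order the losses are listed, and a one-parameter model y = w * x stands in for the real networks. All function names are hypothetical.

```python
from typing import Callable, Sequence

LossFn = Callable[[Sequence[float], Sequence[float]], float]


def mse_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error (assumed reading of 'least absolute error')."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def smooth_l1_loss(pred: Sequence[float], target: Sequence[float],
                   beta: float = 1.0) -> float:
    """Smooth L1: quadratic below beta, linear above (assumed definition)."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def train_round(loss_fn: LossFn, w: float, xs: Sequence[float],
                ys: Sequence[float], lr: float = 0.01,
                steps: int = 200, eps: float = 1e-6) -> float:
    """Run gradient descent on y = w * x and return the best weight seen."""
    def loss_at(weight: float) -> float:
        return loss_fn([weight * x for x in xs], ys)

    best_w, best_loss = w, loss_at(w)
    for _ in range(steps):
        # Central-difference estimate of d(loss)/dw.
        grad = (loss_at(w + eps) - loss_at(w - eps)) / (2 * eps)
        w -= lr * grad
        current = loss_at(w)
        if current < best_loss:
            best_w, best_loss = w, current
    return best_w


def fused_training(xs: Sequence[float], ys: Sequence[float],
                   w0: float = 0.0) -> float:
    """Three rounds, one loss per round; each round starts from the
    best weight of the previous round."""
    w = w0
    for loss_fn in (mse_loss, l1_loss, smooth_l1_loss):
        w = train_round(loss_fn, w, xs, ys)
    return w
```

Returning the best weight of a round, rather than the last one, mirrors the wording of claim 4, where the optimal weights of the previous round initialize the next round.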
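The decoding sub-network of claim 6 can be outlined as a layer list whose parameter count is easy to check by hand. This is only a structural sketch under stated assumptions: the channel widths, the 3x3 kernel size, and the LeakyReLU negative slope of 0.01 are illustrative values not given in the claims, and placing batch normalization plus LeakyReLU after each of the first three convolutions is one reading of "between the convolutional layers". The helper names are hypothetical.

```python
from typing import List, Sequence


def leaky_relu(x: float, negative_slope: float = 0.01) -> float:
    """LeakyReLU: identity for x >= 0, a small slope for x < 0."""
    return x if x >= 0 else negative_slope * x


def build_decoder_spec(channels: Sequence[int],
                       kernel_size: int = 3) -> List[tuple]:
    """Describe four 2D conv layers with BN + LeakyReLU between them.

    `channels` holds 5 widths: the input width followed by the output
    width of each of the four convolutions (assumed values).
    """
    if len(channels) != 5:
        raise ValueError("expected 5 channel widths for 4 conv layers")
    layers: List[tuple] = []
    for i in range(4):
        layers.append(("conv2d", channels[i], channels[i + 1], kernel_size))
        if i < 3:  # no normalization or activation after the last conv
            layers.append(("batch_norm", channels[i + 1]))
            layers.append(("leaky_relu",))
    return layers


def count_parameters(layers: List[tuple]) -> int:
    """Count learnable parameters: conv weights and biases, BN scale and shift."""
    total = 0
    for layer in layers:
        if layer[0] == "conv2d":
            _, c_in, c_out, k = layer
            total += k * k * c_in * c_out + c_out
        elif layer[0] == "batch_norm":
            total += 2 * layer[1]
    return total


# Illustrative widths only: a 3-channel input decoded to a single channel.
decoder_spec = build_decoder_spec([3, 8, 8, 8, 1])
decoder_params = count_parameters(decoder_spec)
```

Counting parameters this way gives a quick check on how small a four-layer decoder stays for a given choice of channel widths.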
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910818443.4A CN110516642A (en) | 2019-08-30 | 2019-08-30 | A kind of lightweight face 3D critical point detection method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910818443.4A CN110516642A (en) | 2019-08-30 | 2019-08-30 | A kind of lightweight face 3D critical point detection method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110516642A (en) | 2019-11-29 |
Family
ID=68628740
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910818443.4A Pending CN110516642A (en) | 2019-08-30 | 2019-08-30 | A kind of lightweight face 3D critical point detection method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110516642A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113128436A (en) * | 2021-04-27 | 2021-07-16 | 北京百度网讯科技有限公司 | Method and device for detecting key points |
WO2021190664A1 (en) * | 2020-11-12 | 2021-09-30 | 平安科技(深圳)有限公司 | Multi-face detection method and system based on key point positioning, and storage medium |
WO2022089360A1 (en) * | 2020-10-28 | 2022-05-05 | 广州虎牙科技有限公司 | Face detection neural network and training method, face detection method, and storage medium |
CN114757822A (en) * | 2022-06-14 | 2022-07-15 | 之江实验室 | Binocular-based human body three-dimensional key point detection method and system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030161505A1 (en) * | 2002-02-12 | 2003-08-28 | Lawrence Schrank | System and method for biometric data capture and comparison |
CN108564029A (en) * | 2018-04-12 | 2018-09-21 | 厦门大学 | Face character recognition methods based on cascade multi-task learning deep neural network |
CN109063666A (en) * | 2018-08-14 | 2018-12-21 | 电子科技大学 | The lightweight face identification method and system of convolution are separated based on depth |
CN109241910A (en) * | 2018-09-07 | 2019-01-18 | 高新兴科技集团股份有限公司 | A kind of face key independent positioning method returned based on the cascade of depth multiple features fusion |
CN109685023A (en) * | 2018-12-27 | 2019-04-26 | 深圳开立生物医疗科技股份有限公司 | A kind of facial critical point detection method and relevant apparatus of ultrasound image |
CN109919097A (en) * | 2019-03-08 | 2019-06-21 | 中国科学院自动化研究所 | Face and key point combined detection system, method based on multi-task learning |
CN110084221A (en) * | 2019-05-08 | 2019-08-02 | 南京云智控产业技术研究院有限公司 | A kind of serializing face critical point detection method of the tape relay supervision based on deep learning |
Non-Patent Citations (2)
Title |
---|
ZHENGNING WANG: "A LIGHT-WEIGHTED NETWORK FOR FACIAL LANDMARK DETECTION VIA COMBINED HEATMAP AND COORDINATE REGRESSION", 《2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME)》 * |
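张伟等 (Zhang Wei et al.): "引入全局约束的精简人脸关键点检测网络" (A compact face key point detection network with global constraints), 《信号处理》 (Journal of Signal Processing) * |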
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022089360A1 (en) * | 2020-10-28 | 2022-05-05 | 广州虎牙科技有限公司 | Face detection neural network and training method, face detection method, and storage medium |
WO2021190664A1 (en) * | 2020-11-12 | 2021-09-30 | 平安科技(深圳)有限公司 | Multi-face detection method and system based on key point positioning, and storage medium |
CN113128436A (en) * | 2021-04-27 | 2021-07-16 | 北京百度网讯科技有限公司 | Method and device for detecting key points |
CN114757822A (en) * | 2022-06-14 | 2022-07-15 | 之江实验室 | Binocular-based human body three-dimensional key point detection method and system |
CN114757822B (en) * | 2022-06-14 | 2022-11-04 | 之江实验室 | Binocular-based human body three-dimensional key point detection method and system |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110516642A (en) | A kind of lightweight face 3D critical point detection method and system | |
CN110516643A (en) | A kind of face 3D critical point detection method and system based on joint thermodynamic chart | |
CN107291871B (en) | Matching degree evaluation method, device and medium for multi-domain information based on artificial intelligence | |
Jiang et al. | Dual attention mobdensenet (damdnet) for robust 3d face alignment | |
CN112215050A (en) | Nonlinear 3DMM face reconstruction and posture normalization method, device, medium and equipment | |
CN111368662A (en) | Method, device, storage medium and equipment for editing attribute of face image | |
CN111191526A (en) | Pedestrian attribute recognition network training method, system, medium and terminal | |
CN110084253A (en) | A method of generating object detection model | |
CN110598601A (en) | Face 3D key point detection method and system based on distributed thermodynamic diagram | |
He et al. | Sketch recognition with deep visual-sequential fusion model | |
US20220335685A1 (en) | Method and apparatus for point cloud completion, network training method and apparatus, device, and storage medium | |
WO2024001311A1 (en) | Method, apparatus and system for training feature extraction network of three-dimensional mesh model | |
Wang et al. | High pe utilization CNN accelerator with channel fusion supporting pattern-compressed sparse neural networks | |
Li et al. | Graph jigsaw learning for cartoon face recognition | |
JP2023059231A (en) | Key point detection and model training method, apparatus, device, and storage medium | |
CN115116559A (en) | Method, device, equipment and medium for determining and training atomic coordinates in amino acid | |
Zhang et al. | Differentiable spatial regression: A novel method for 3D hand pose estimation | |
CN113420289B (en) | Hidden poisoning attack defense method and device for deep learning model | |
CN113298931A (en) | Reconstruction method and device of object model, terminal equipment and storage medium | |
Li et al. | Msvit: training multiscale vision transformers for image retrieval | |
CN111597367B (en) | Three-dimensional model retrieval method based on view and hash algorithm | |
WO2022096944A1 (en) | Method and apparatus for point cloud completion, network training method and apparatus, device, and storage medium | |
Zhao et al. | Fm-3dfr: Facial manipulation-based 3-d face reconstruction | |
Gao et al. | Robust facial image super-resolution by kernel locality-constrained coupled-layer regression | |
US20220335566A1 (en) | Method and apparatus for processing point cloud data, device, and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191129 |