CN110287912A - Method, apparatus and medium for determining the affective state of a target object based on deep learning - Google Patents

Method, apparatus and medium for determining the affective state of a target object based on deep learning

Info

Publication number
CN110287912A
CN110287912A (application CN201910576185.3A)
Authority
CN
China
Prior art keywords
affective state
target object
model
face
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910576185.3A
Other languages
Chinese (zh)
Inventor
赵志舜
黄国恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201910576185.3A
Publication of CN110287912A
Legal status: Pending (Current)


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174 - Facial expression recognition
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

This application discloses a method for determining the affective state of a target object based on deep learning, comprising: acquiring image frames that contain the target object from a video clip; inputting the image frames separately into a face detection model and a behavior detection model, and determining a corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features; and determining the target affective state of the target object according to the expression affective state and the behavior affective state. Compared with the prior art, the method for determining the affective state of a target object based on deep learning provided by this embodiment additionally takes into account the influence of the target object's behavioral features on the determined affective state, and can therefore obtain the affective state of the target object more accurately. Also disclosed are an apparatus for determining the affective state of a target object based on deep learning and a computer-readable storage medium, both of which have the above beneficial effects.

Description

Method, apparatus and medium for determining the affective state of a target object based on deep learning
Technical field
The present invention relates to the field of image recognition, and in particular to a method, an apparatus and a computer-readable storage medium for determining the affective state of a target object based on deep learning.
Background art
With the development of information technology, video technology has become increasingly common in daily life. For example, people use Internet chat rooms for video chat, multinational companies hold video conferences over the network, and public places such as subways, squares and supermarkets carry out video surveillance through cameras. At present, in order to enable people to understand the affective state of a target object in a video clip more directly and accurately while watching the video, to improve the efficiency of communication between people, or to give timely warning of a target object in an abnormal affective state and avoid danger, a method for determining the affective state of a target object based on deep learning has been proposed. In this method, image recognition is performed on image frames extracted from the video clip, and the affective state of the target object is determined according to the recognized expression of the target object. However, since the expression of the target object and its actual affective state do not correspond exactly, determining the affective state of the target object according to the prior-art method is subject to error.
Therefore, how to improve the accuracy of determining the affective state of a target object is a technical problem that currently needs to be solved by those skilled in the art.
Summary of the invention
In view of this, an object of the present invention is to provide a method for determining the affective state of a target object based on deep learning, which can improve the accuracy of determining the affective state of the target object; a further object of the present invention is to provide an apparatus for determining the affective state of a target object based on deep learning and a computer-readable storage medium, both of which have the above beneficial effects.
In order to solve the above technical problems, the present invention provides a method for determining the affective state of a target object based on deep learning, comprising:
acquiring image frames that contain the target object from a video clip;
inputting the image frames separately into a face detection model and a behavior detection model, and determining a corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features, wherein the face detection model and the behavior detection model are models configured through deep learning; and
determining the target affective state of the target object according to the expression affective state and the behavior affective state.
Preferably, the process of inputting the image frames separately into the face detection model and the behavior detection model and determining the corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features specifically comprises:
inputting the image frames into the face detection model, and performing face detection and facial key point extraction to obtain face coordinates of the target object;
inputting the face images corresponding to the face coordinates into a pre-trained facial expression classification model to obtain the corresponding expression affective state;
inputting the image frames into the behavior detection model, and performing human body detection and human body key point extraction to obtain limb coordinates of the target object; and
inputting the image frames annotated with the limb coordinates into a pre-trained limb expression classification model to obtain the corresponding behavior affective state.
Preferably, after acquiring the image frames that contain the target object from the video clip, the method further comprises:
calculating the similarity between any two image frames and a sampled distance threshold using the feature vector of each image frame;
performing initial clustering on the image frames using the sampled distance threshold and the similarities, and optimizing the result to obtain target cluster centres; and
determining key frames using the target cluster centres.
Correspondingly, the process of inputting the image frames separately into the face detection model and the behavior detection model and determining the corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features is specifically:
inputting the key frames separately into the face detection model and the behavior detection model, and determining the corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features.
Preferably, the face detection model and the behavior detection model are specifically the face detection model and the behavior detection model in a Two-pathway CNN network.
Preferably, the method further comprises:
recording the affective state of the target object and the corresponding time at which the affective state was determined.
In order to solve the above technical problems, the present invention also provides an apparatus for determining the affective state of a target object based on deep learning, comprising:
an acquisition module, configured to acquire image frames that contain the target object from a video clip;
a recognition module, configured to input the image frames separately into a face detection model and a behavior detection model and to determine a corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features, wherein the face detection model and the behavior detection model are models configured through deep learning; and
a determination module, configured to determine the target affective state of the target object according to the expression affective state and the behavior affective state.
Preferably, the apparatus further comprises:
a calculation module, configured to calculate the similarity between any two image frames and a sampled distance threshold using the feature vector of each image frame;
a clustering module, configured to perform initial clustering on the image frames using the sampled distance threshold and the similarities and to optimize the result to obtain target cluster centres; and
a determination module, configured to determine key frames using the target cluster centres.
In order to solve the above technical problems, the present invention also provides another apparatus for determining the affective state of a target object based on deep learning, comprising:
a memory, configured to store a computer program; and
a processor, configured to implement the steps of any one of the above methods for determining the affective state of a target object based on deep learning when executing the computer program.
In order to solve the above technical problems, the present invention also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any one of the above methods for determining the affective state of a target object based on deep learning.
The method for determining the affective state of a target object based on deep learning provided by the present invention additionally takes into account the influence of the target object's behavioral features on the determined affective state. After the image frames that contain the target object are acquired from the video clip, the image frames are input separately into the face detection model and the behavior detection model, the corresponding expression affective state and behavior affective state are determined from the recognized facial features and behavioral features, and the target affective state of the target object is then determined according to the expression affective state and the behavior affective state. The affective state of the target object can therefore be obtained more accurately.
The present invention also provides an apparatus for determining the affective state of a target object based on deep learning and a computer-readable storage medium, both of which have the above beneficial effects.
Brief description of the drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained from the provided drawings by those of ordinary skill in the art without creative effort.
Fig. 1 is a flowchart of a method for determining the affective state of a target object based on deep learning according to an embodiment of the present invention;
Fig. 2 is a structural diagram of an apparatus for determining the affective state of a target object based on deep learning according to an embodiment of the present invention;
Fig. 3 is a structural diagram of another apparatus for determining the affective state of a target object based on deep learning according to an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The core of the embodiments of the present invention is to provide a method for determining the affective state of a target object based on deep learning, which can improve the accuracy of determining the affective state of the target object; another core of the present invention is to provide an apparatus for determining the affective state of a target object based on deep learning and a computer-readable storage medium, both of which have the above beneficial effects.
In order to enable those skilled in the art to better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments.
Fig. 1 is a flowchart of a method for determining the affective state of a target object based on deep learning according to an embodiment of the present invention. As shown in Fig. 1, the method for determining the affective state of a target object based on deep learning comprises:
S10: acquiring image frames that contain the target object from a video clip.
Specifically, in this embodiment, after the video clip is obtained, the image frames used to determine the affective state of the target object are first extracted from the video clip. Since the image frames need to be analysed and recognized, they must contain the target object.
S20: inputting the image frames separately into a face detection model and a behavior detection model, and determining a corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features, wherein the face detection model and the behavior detection model are models configured through deep learning.
Specifically, after the image frames are obtained, they are input separately into the face detection model and the behavior detection model for face recognition and behavior recognition. The facial features of the target object are obtained through face recognition, the behavioral features of the target object are obtained through behavior recognition, and the corresponding expression affective state and behavior affective state are then obtained from the facial features and the behavioral features respectively.
S30: determining the target affective state of the target object according to the expression affective state and the behavior affective state.
It should be understood that the same expression conveys different emotions under different behavioral states. For example, for the affective state of crying, without considering the accompanying behavior we cannot judge whether a person is weeping out of sorrow or weeping for joy. Specifically, in this embodiment the expression affective state and the behavior affective state are analysed jointly; for example, the target affective state of the target object can be determined by weighting the expression affective state and the behavior affective state with preset weights.
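For illustration only, the following sketch shows one way the weighted combination described above could be implemented, fusing per-class probability vectors from the two pathways with preset weights. The emotion label set, the weight values and the function name fuse_affective_states are assumptions made for this example and are not prescribed by the embodiment.

```python
import numpy as np

# Illustrative emotion classes; the embodiment does not fix a label set.
EMOTIONS = ["happy", "sad", "angry", "neutral"]

def fuse_affective_states(expr_probs, behav_probs, w_expr=0.6, w_behav=0.4):
    """Combine the expression and behavior affective states (assumed to be
    per-class probability vectors) with preset weights and return the
    resulting target affective state."""
    expr_probs = np.asarray(expr_probs, dtype=float)
    behav_probs = np.asarray(behav_probs, dtype=float)
    fused = w_expr * expr_probs + w_behav * behav_probs  # preset weights
    return EMOTIONS[int(np.argmax(fused))], fused

# Example: the expression pathway alone favours "sad", but the behavior
# pathway shifts the fused decision to "happy" (e.g. weeping for joy).
state, scores = fuse_affective_states([0.1, 0.5, 0.1, 0.3],
                                      [0.8, 0.05, 0.05, 0.1])
print(state, scores)
```

Any other fusion rule, such as learned weights, could be substituted here; the embodiment only requires that both affective states contribute to the final decision.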
It should be noted that, in practical applications, corresponding operations can be performed according to the determined target affective state of the target object. For example, assuming the video clip is obtained from a camera installed in a public place, target objects in an abnormal affective state, such as anger or rage, can be identified from the target affective states recognized in the video clip, so that early warning can be prepared and abnormal situations can be handled in time.
Compared with the prior art, the method for determining the affective state of a target object based on deep learning provided by this embodiment additionally takes into account the influence of the behavioral features of the target object on the determined affective state. After the image frames that contain the target object are acquired from the video clip, the image frames are input separately into the face detection model and the behavior detection model, the corresponding expression affective state and behavior affective state are determined from the recognized facial features and behavioral features, and the target affective state of the target object is then determined according to the expression affective state and the behavior affective state. The affective state of the target object can therefore be obtained more accurately.
On the basis of the above embodiment, this embodiment further explains and optimizes the technical solution. Specifically, the process of inputting the image frames separately into the face detection model and the behavior detection model and determining the corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features specifically comprises:
inputting the image frames into the face detection model, and performing face detection and facial key point extraction to obtain the face coordinates of the target object;
inputting the face images corresponding to the face coordinates into a pre-trained facial expression classification model to obtain the corresponding expression affective state;
inputting the image frames into the behavior detection model, and performing human body detection and human body key point extraction to obtain the limb coordinates of the target object; and
inputting the image frames annotated with the limb coordinates into a pre-trained limb expression classification model to obtain the corresponding behavior affective state.
Specifically, the image frames are first input separately into the face detection model and the behavior detection model; face detection and facial key point extraction, as well as human body detection and human body key point extraction, are performed to obtain the face coordinates and the limb coordinates of the target object.
Then, the face images corresponding to the face coordinates are input into the pre-trained facial expression classification model to obtain the corresponding expression affective state, and the image frames annotated with the limb coordinates are input into the pre-trained limb expression classification model to obtain the corresponding behavior affective state.
In this embodiment, the face detection model and the behavior detection model are specifically the face detection model and the behavior detection model in a Two-pathway CNN network.
Specifically, in this embodiment face detection and human body detection are performed through the Two-pathway CNN network, which improves detection efficiency and saves detection time. Moreover, in the Two-pathway CNN network, the face detection model may specifically be an MTCNN convolutional neural network and the behavior detection model may specifically be a Faster R-CNN convolutional neural network, which is not limited in this embodiment. In addition, the facial expression classification model may specifically be a local CNN neural network and the limb expression classification model may specifically be a global CNN neural network, which is also not specifically limited in this embodiment.
It can be seen that this embodiment uses pre-trained models to obtain the expression affective state and the behavior affective state from the image frames, so a more accurate recognition result can be obtained.
It should be understood that a video clip, as unstructured dynamic data composed of consecutive, associated image frames, often contains redundant information. For example, within the same shot, the image frame at time t and the image frame at time t+1 often differ very little in visual content and features. Therefore, when recognizing the emotion of the target object in a video clip, using every image frame in the clip to analyse the affective state of the target object would make the analysis complicated, computationally expensive and redundant. Therefore, on the basis of the above embodiment, this embodiment further explains and optimizes the technical solution. Specifically, after acquiring the image frames that contain the target object from the video clip, the method further comprises:
calculating the similarity between any two image frames and a sampled distance threshold using the feature vector of each image frame;
performing initial clustering on the image frames using the sampled distance threshold and the similarities, and optimizing the result to obtain target cluster centres; and
determining key frames using the target cluster centres.
Specifically, in this embodiment the video clip containing the target object is obtained first, each image frame in the video clip is then extracted, the feature vector of each image frame is obtained, the Euclidean distance between any two image frames, i.e. the similarity between any two image frames, is calculated, and the calculated Euclidean distances are then used to calculate the sampled distance threshold.
More specifically, the Euclidean distance between the feature vectors of any two image frames is calculated as

$$\mathrm{dis}(F_i, F_j) = \sqrt{\sum_{r=1}^{n} \left(F_i^{r} - F_j^{r}\right)^2}$$

where dis(F_i, F_j) denotes the Euclidean distance between the feature vectors of any two frames i and j in the video; F_i and F_j denote the feature vectors of those two frames; F_i^r and F_j^r denote their r-th components; and n is the dimension of the feature vector of an image frame.
Specifically, the sampled distance threshold is calculated as

$$d = \frac{c}{N(N-1)/2} \sum_{i<j} \mathrm{dis}(F_i, F_j)$$

where c is a constant, N is the number of image frames, and N(N-1)/2 is the number of Euclidean distances calculated.
With this method, a different sampled distance threshold can be chosen for each video clip. Moreover, a relatively small calculated sampled distance threshold produces more initial clusters, which is conducive to the subsequent merging of cluster centres and the secondary clustering.
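A minimal sketch of the distance and threshold computation above, assuming the frame feature vectors are held in an (N, n) NumPy array; the function name and the value of the constant c are assumptions made for illustration.

```python
import numpy as np

def sampled_distance_threshold(features, c=0.5):
    """Compute the pairwise Euclidean distances between frame feature vectors
    and the sampled distance threshold d (a constant c times the mean of all
    pairwise distances). `features` is an (N, n) array."""
    features = np.asarray(features, dtype=float)
    n_frames = len(features)
    dists = {}
    for i in range(n_frames):
        for j in range(i + 1, n_frames):
            dists[(i, j)] = float(np.linalg.norm(features[i] - features[j]))
    # N*(N-1)/2 is the number of Euclidean distances calculated.
    d = c * sum(dists.values()) / (n_frames * (n_frames - 1) / 2)
    return dists, d
```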
After the sampled distance threshold and the similarities are calculated, initial clustering is performed on the image frames in the video clip using the sampled distance threshold and the similarities to obtain an initial cluster centre set C and a cluster number K.
Specifically, the feature vector of a randomly selected image frame is taken as a cluster centre, and the distance between each remaining feature vector and the existing cluster centres is calculated. It is then judged whether the minimum of these distances is not less than twice the sampled distance threshold, i.e. whether Min(dis) ≥ 2*d. If so, the cluster count is incremented by 1 and the image frame is taken as a new cluster centre, that is, its feature vector is added to the set of cluster centres; if not, the feature vector corresponding to the image frame is discarded. The next image frame is then selected and the judgement is repeated, until all image frames of the video clip have been processed, giving the initial cluster centre set C and the cluster number K.
Then, the obtained initial cluster centre set C is clustered again using the K-means algorithm to obtain an updated cluster centre set G. The distance between every two cluster centres in the cluster centre set G is calculated in turn; if Min(dis) ≤ 2*d, the two cluster centres are merged by sequential clustering and the corresponding updated cluster centre is calculated; otherwise, another pair of cluster centres is selected and the judgement is repeated, until every cluster centre in the cluster centre set G has been traversed, thereby obtaining the optimized target cluster centres.
After the target cluster centres are obtained, the image frame closest to each target cluster centre is selected from all the image frames as a key frame of the video clip. Correspondingly, the key frames are input separately into the face detection model and the behavior detection model, and the corresponding expression affective state and behavior affective state are determined from the recognized facial features and behavioral features.
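Putting the clustering steps together, the following sketch illustrates one possible implementation under stated assumptions: the first frame (rather than a random one) seeds the initial clustering, the secondary clustering uses scikit-learn's KMeans, and close centres are merged by simple averaging. The function name extract_key_frames and these implementation choices are illustrative, not prescribed by the embodiment.

```python
import numpy as np
from sklearn.cluster import KMeans  # used here for the secondary clustering

def extract_key_frames(features, d):
    """Select key-frame indices from an (N, n) array of frame feature vectors,
    given the sampled distance threshold d."""
    features = np.asarray(features, dtype=float)

    # 1) Initial clustering: a frame becomes a new cluster centre when its
    #    minimum distance to all existing centres is at least 2*d.
    centres = [features[0]]  # seed with the first frame (patent: a random frame)
    for f in features[1:]:
        min_dis = min(np.linalg.norm(f - c) for c in centres)
        if min_dis >= 2 * d:
            centres.append(f)  # new cluster centre
        # otherwise the frame is not kept as a centre candidate
    K = len(centres)

    # 2) Secondary clustering: refine the initial centres with K-means.
    km = KMeans(n_clusters=K, init=np.array(centres), n_init=1).fit(features)
    refined = list(km.cluster_centers_)

    # 3) Merge refined centres that lie within 2*d of each other.
    merged = []
    for c in refined:
        close = [i for i, m in enumerate(merged) if np.linalg.norm(c - m) <= 2 * d]
        if close:
            i = close[0]
            merged[i] = (merged[i] + c) / 2.0  # merge by averaging (assumption)
        else:
            merged.append(c)

    # 4) Key frames: for each target cluster centre, pick the nearest frame.
    key_idx = sorted({int(np.argmin(np.linalg.norm(features - m, axis=1)))
                      for m in merged})
    return key_idx
```

The returned indices would then be used to fetch the key frames that are fed to the face detection model and the behavior detection model.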
In this embodiment, by further selecting the key frames of the video, a large amount of redundant information between image frames can be reduced and the information contained in the video clip can be expressed more compactly. Using the selected key frames to determine the affective state of the target object further reduces the amount of computation and improves the real-time performance of determining the affective state of the target object.
On the basis of the above embodiments, this embodiment further explains and optimizes the technical solution. Specifically, this embodiment further comprises:
recording the affective state of the target object and the corresponding time at which the affective state was determined.
Specifically, after the target affective state of the target object is determined according to the expression affective state and the behavior affective state, the affective state of the target object and the corresponding time at which it was determined are recorded. It should be noted that the record may be kept as text or in the form of a table, which is not limited in this embodiment. More specifically, the record may be stored on a memory stick, a hard disk, a TF (Trans-flash) card, an SD (Secure Digital Memory) card or the like, selected according to actual needs, which is not limited in this embodiment.
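As a trivial example of this recording step, the sketch below appends each determined affective state and the time of determination to a CSV file; the file name and column layout are assumptions made for illustration.

```python
import csv
from datetime import datetime

def record_affective_state(state, path="affective_states.csv"):
    """Append the determined affective state and the determination time."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([datetime.now().isoformat(timespec="seconds"), state])

record_affective_state("happy")
```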
In this embodiment, by recording the affective state of the target object and the corresponding time at which the affective state was determined, subsequent analysis of changes in the target object's emotions is facilitated, which further improves the user experience.
The embodiments of the method for determining the affective state of a target object based on deep learning provided by the present invention have been described in detail above. The present invention also provides an apparatus for determining the affective state of a target object based on deep learning and a computer-readable storage medium corresponding to this method. Since the embodiments of the apparatus and the computer-readable storage medium correspond to the embodiments of the method, their description can refer to the description of the method embodiments and is not repeated here.
Fig. 2 is a structural diagram of an apparatus for determining the affective state of a target object based on deep learning according to an embodiment of the present invention. As shown in Fig. 2, the apparatus for determining the affective state of a target object based on deep learning comprises:
an acquisition module 21, configured to acquire image frames that contain the target object from a video clip;
a recognition module 22, configured to input the image frames separately into a face detection model and a behavior detection model and to determine a corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features, wherein the face detection model and the behavior detection model are models configured through deep learning; and
a determination module 23, configured to determine the target affective state of the target object according to the expression affective state and the behavior affective state.
The apparatus for determining the affective state of a target object based on deep learning provided by the embodiment of the present invention has the beneficial effects of the above method for determining the affective state of a target object based on deep learning.
As a preferred embodiment, the apparatus further comprises:
a calculation module, configured to calculate the similarity between any two image frames and a sampled distance threshold using the feature vector of each image frame;
a clustering module, configured to perform initial clustering on the image frames using the sampled distance threshold and the similarities and to optimize the result to obtain target cluster centres; and
a determination module, configured to determine key frames using the target cluster centres.
Fig. 3 is a structural diagram of another apparatus for determining the affective state of a target object based on deep learning according to an embodiment of the present invention. As shown in Fig. 3, the apparatus for determining the affective state of a target object based on deep learning comprises:
a memory 31, configured to store a computer program; and
a processor 32, configured to implement the steps of the above method for determining the affective state of a target object based on deep learning when executing the computer program.
The apparatus for determining the affective state of a target object based on deep learning provided by the embodiment of the present invention has the beneficial effects of the above method for determining the affective state of a target object based on deep learning.
In order to solve the above technical problems, the present invention also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the above method for determining the affective state of a target object based on deep learning.
The computer-readable storage medium provided by the embodiment of the present invention has the beneficial effects of the above method for determining the affective state of a target object based on deep learning.
The method, apparatus and computer-readable storage medium for determining the affective state of a target object based on deep learning provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and embodiments of the present invention, and the above description of the embodiments is only intended to help understand the method of the present invention and its core idea. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications to the present invention without departing from the principle of the present invention, and these improvements and modifications also fall within the protection scope of the claims of the present invention.
Each embodiment in this specification is described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can refer to each other. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively simple, and relevant details can be found in the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. In order to clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above generally in terms of their functions. Whether these functions are implemented in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present invention.

Claims (9)

1. A method for determining the affective state of a target object based on deep learning, characterized by comprising:
acquiring image frames that contain the target object from a video clip;
inputting the image frames separately into a face detection model and a behavior detection model, and determining a corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features, wherein the face detection model and the behavior detection model are models configured through deep learning; and
determining the target affective state of the target object according to the expression affective state and the behavior affective state.
2. The method according to claim 1, characterized in that the process of inputting the image frames separately into the face detection model and the behavior detection model and determining the corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features specifically comprises:
inputting the image frames into the face detection model, and performing face detection and facial key point extraction to obtain face coordinates of the target object;
inputting the face images corresponding to the face coordinates into a pre-trained facial expression classification model to obtain the corresponding expression affective state;
inputting the image frames into the behavior detection model, and performing human body detection and human body key point extraction to obtain limb coordinates of the target object; and
inputting the image frames annotated with the limb coordinates into a pre-trained limb expression classification model to obtain the corresponding behavior affective state.
3. The method according to claim 1, characterized in that, after acquiring the image frames that contain the target object from the video clip, the method further comprises:
calculating the similarity between any two image frames and a sampled distance threshold using the feature vector of each image frame;
performing initial clustering on the image frames using the sampled distance threshold and the similarities, and optimizing the result to obtain target cluster centres; and
determining key frames using the target cluster centres;
correspondingly, the process of inputting the image frames separately into the face detection model and the behavior detection model and determining the corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features is specifically:
inputting the key frames separately into the face detection model and the behavior detection model, and determining the corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features.
4. The method according to claim 1, characterized in that the face detection model and the behavior detection model are specifically the face detection model and the behavior detection model in a Two-pathway CNN network.
5. The method according to any one of claims 1 to 4, characterized by further comprising:
recording the affective state of the target object and the corresponding time at which the affective state was determined.
6. An apparatus for determining the affective state of a target object based on deep learning, characterized by comprising:
an acquisition module, configured to acquire image frames that contain the target object from a video clip;
a recognition module, configured to input the image frames separately into a face detection model and a behavior detection model and to determine a corresponding expression affective state and behavior affective state from the recognized facial features and behavioral features, wherein the face detection model and the behavior detection model are models configured through deep learning; and
a determination module, configured to determine the target affective state of the target object according to the expression affective state and the behavior affective state.
7. The apparatus according to claim 6, characterized by further comprising:
a calculation module, configured to calculate the similarity between any two image frames and a sampled distance threshold using the feature vector of each image frame;
a clustering module, configured to perform initial clustering on the image frames using the sampled distance threshold and the similarities and to optimize the result to obtain target cluster centres; and
a determination module, configured to determine key frames using the target cluster centres.
8. An apparatus for determining the affective state of a target object based on deep learning, characterized by comprising:
a memory, configured to store a computer program; and
a processor, configured to implement the steps of the method for determining the affective state of a target object based on deep learning according to any one of claims 1 to 5 when executing the computer program.
9. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the method for determining the affective state of a target object based on deep learning according to any one of claims 1 to 5.
CN201910576185.3A 2019-06-28 2019-06-28 Method, apparatus and medium for determining the affective state of a target object based on deep learning Pending CN110287912A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910576185.3A CN110287912A (en) 2019-06-28 2019-06-28 Method, apparatus and medium for determining the affective state of a target object based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910576185.3A CN110287912A (en) 2019-06-28 2019-06-28 Method, apparatus and medium for determining the affective state of a target object based on deep learning

Publications (1)

Publication Number Publication Date
CN110287912A true CN110287912A (en) 2019-09-27

Family

ID=68019647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910576185.3A Pending CN110287912A (en) 2019-06-28 2019-06-28 Method, apparatus and medium for determining the affective state of a target object based on deep learning

Country Status (1)

Country Link
CN (1) CN110287912A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111882625A (en) * 2020-07-07 2020-11-03 北京达佳互联信息技术有限公司 Method and device for generating dynamic graph, electronic equipment and storage medium
CN113723165A (en) * 2021-03-25 2021-11-30 山东大学 Method and system for detecting dangerous expressions of people to be detected based on deep learning
CN114064969A (en) * 2021-11-19 2022-02-18 浙江大学 Dynamic picture linkage display device based on emotional curve

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295568A (en) * 2016-08-11 2017-01-04 上海电力学院 The mankind's naturalness emotion identification method combined based on expression and behavior bimodal
CN106503646A (en) * 2016-10-19 2017-03-15 竹间智能科技(上海)有限公司 Multi-modal emotion identification system and method
CN106851437A (en) * 2017-01-17 2017-06-13 南通同洲电子有限责任公司 A kind of method for extracting video frequency abstract
CN107220585A (en) * 2017-03-31 2017-09-29 南京邮电大学 A kind of video key frame extracting method based on multiple features fusion clustering shots
CN107808146A (en) * 2017-11-17 2018-03-16 北京师范大学 A kind of multi-modal emotion recognition sorting technique
CN108520250A (en) * 2018-04-19 2018-09-11 北京工业大学 A kind of human motion sequence extraction method of key frame
CN109145754A (en) * 2018-07-23 2019-01-04 上海电力学院 Merge the Emotion identification method of facial expression and limb action three-dimensional feature
CN109766759A (en) * 2018-12-12 2019-05-17 成都云天励飞技术有限公司 Emotion identification method and Related product

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295568A (en) * 2016-08-11 2017-01-04 上海电力学院 The mankind's naturalness emotion identification method combined based on expression and behavior bimodal
CN106503646A (en) * 2016-10-19 2017-03-15 竹间智能科技(上海)有限公司 Multi-modal emotion identification system and method
CN106851437A (en) * 2017-01-17 2017-06-13 南通同洲电子有限责任公司 A kind of method for extracting video frequency abstract
CN107220585A (en) * 2017-03-31 2017-09-29 南京邮电大学 A kind of video key frame extracting method based on multiple features fusion clustering shots
CN107808146A (en) * 2017-11-17 2018-03-16 北京师范大学 A kind of multi-modal emotion recognition sorting technique
CN108520250A (en) * 2018-04-19 2018-09-11 北京工业大学 A kind of human motion sequence extraction method of key frame
CN109145754A (en) * 2018-07-23 2019-01-04 上海电力学院 Merge the Emotion identification method of facial expression and limb action three-dimensional feature
CN109766759A (en) * 2018-12-12 2019-05-17 成都云天励飞技术有限公司 Emotion identification method and Related product

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111882625A (en) * 2020-07-07 2020-11-03 北京达佳互联信息技术有限公司 Method and device for generating dynamic graph, electronic equipment and storage medium
CN111882625B (en) * 2020-07-07 2024-04-05 北京达佳互联信息技术有限公司 Method, device, electronic equipment and storage medium for generating dynamic diagram
CN113723165A (en) * 2021-03-25 2021-11-30 山东大学 Method and system for detecting dangerous expressions of people to be detected based on deep learning
CN113723165B (en) * 2021-03-25 2022-06-07 山东大学 Method and system for detecting dangerous expressions of people to be detected based on deep learning
CN114064969A (en) * 2021-11-19 2022-02-18 浙江大学 Dynamic picture linkage display device based on emotional curve

Similar Documents

Publication Publication Date Title
Fan et al. Lasot: A high-quality benchmark for large-scale single object tracking
CN110675475B (en) Face model generation method, device, equipment and storage medium
US20230049135A1 (en) Deep learning-based video editing method, related device, and storage medium
CN107423398A (en) Exchange method, device, storage medium and computer equipment
CN108229268A (en) Expression Recognition and convolutional neural networks model training method, device and electronic equipment
CN110287912A (en) Method, apparatus and medium for determining the affective state of a target object based on deep learning
CN111881776B (en) Dynamic expression acquisition method and device, storage medium and electronic equipment
CN113822254B (en) Model training method and related device
CN110503076A (en) Video classification methods, device, equipment and medium based on artificial intelligence
CN111027419B (en) Method, device, equipment and medium for detecting video irrelevant content
CN110531849A (en) Intelligent teaching system based on 5G communication and capable of enhancing reality
CN113963304B (en) Cross-modal video time sequence action positioning method and system based on time sequence-space diagram
CN113393544B (en) Image processing method, device, equipment and medium
CN113723530A (en) Intelligent psychological assessment system based on video analysis and electronic psychological sand table
CN116309992A (en) Intelligent meta-universe live person generation method, equipment and storage medium
CN111405314B (en) Information processing method, device, equipment and storage medium
Cui et al. Deep learning based advanced spatio-temporal extraction model in medical sports rehabilitation for motion analysis and data processing
Liu et al. Trampoline motion decomposition method based on deep learning image recognition
He Athlete human behavior recognition based on continuous image deep learning and sensors
CN115546491B (en) Fall alarm method, system, electronic equipment and storage medium
CN111783587A (en) Interaction method, device and storage medium
CN108596068A (en) A kind of method and apparatus of action recognition
Zhao Research on athlete behavior recognition technology in sports teaching video based on deep neural network
CN111760276B (en) Game behavior control method, device, terminal, server and storage medium
CN113824989A (en) Video processing method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20190927