CN109271929A - Detection method and device - Google Patents

Detection method and device

Info

Publication number
CN109271929A
Authority
CN
China
Prior art keywords
mouth
face
open
face object
threshold
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811075036.0A
Other languages
Chinese (zh)
Other versions
CN109271929B (en)
Inventor
邓启力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Douyin Vision Co Ltd
Douyin Vision Beijing Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd
Priority to CN201811075036.0A (CN109271929B)
Priority to PCT/CN2018/115973 (WO2020052062A1)
Publication of CN109271929A
Application granted
Publication of CN109271929B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174: Facial expression recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172: Classification, e.g. identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the present application disclose a detection method and device. One specific embodiment of the method includes: obtaining a facial keypoint detection result produced by performing facial keypoint detection on a face object in the current frame of a target video; determining, based on the facial keypoint detection result, a mouth-opening distance of the face object in the current frame; determining a target threshold based on a previously determined mouth open/closed state of the face object in the frame preceding the current frame; and determining the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold. This embodiment improves the accuracy of detecting the mouth open/closed state of a face object in a video.

Description

Detection method and device
Technical field
Embodiments of the present application relate to the field of computer technology, and in particular to a detection method and device.
Background art
With the development of computer technology, it is often necessary to perform facial keypoint detection on the frames of a video, and then to determine, based on the facial keypoint detection results, the expression and other attributes of the face objects in the video.
When detecting the mouth open/closed state, the related approach is usually to determine that the mouth is open when the mouth-opening distance exceeds a certain threshold, and to determine that the mouth is closed when the mouth-opening distance is below that threshold.
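To make the weakness of this approach concrete, the following minimal sketch (not taken from the application; the threshold value, function name, and example distance sequence are all hypothetical) classifies each frame independently against a single fixed threshold:

```python
# Minimal sketch of the single-threshold approach described above.
# The threshold value and the example distance sequence are made up for illustration.

SINGLE_THRESHOLD = 0.25

def detect_mouth_state_single(mouth_open_distance: float) -> str:
    """Classify one frame independently with a single fixed threshold."""
    return "open" if mouth_open_distance > SINGLE_THRESHOLD else "closed"

if __name__ == "__main__":
    # Distances that hover near the threshold, e.g. a mouth held half-open.
    distances = [0.24, 0.26, 0.25, 0.27, 0.24, 0.26]
    states = [detect_mouth_state_single(d) for d in distances]
    print(states)  # ['closed', 'open', 'closed', 'open', 'closed', 'open'] -- the result flickers
```

When the mouth-opening distance hovers near the threshold, the per-frame result alternates between open and closed even though the mouth has barely moved.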
Summary of the invention
Embodiments of the present application propose a detection method and device.
In a first aspect, an embodiment of the present application provides a detection method. The method comprises: obtaining a facial keypoint detection result produced by performing facial keypoint detection on a face object in the current frame of a target video; determining, based on the facial keypoint detection result, a mouth-opening distance of the face object in the current frame; determining a target threshold based on a previously determined mouth open/closed state of the face object in the frame preceding the current frame; and determining the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
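The sketch below illustrates this first-aspect flow, starting from an already computed mouth-opening distance (step 202 below describes how that distance can be obtained). It is an illustration under stated assumptions rather than the application's implementation: the string state representation, the concrete threshold values, and the function names are all hypothetical.

```python
# Sketch of the per-frame detection flow: the threshold used for the comparison
# depends on the mouth state detected in the preceding frame.
from typing import Optional

FIRST_THRESHOLD = 0.2     # used when the preceding frame's mouth state is "open"
SECOND_THRESHOLD = 0.3    # used when the preceding frame's mouth state is "closed"
INITIAL_THRESHOLD = 0.25  # used when there is no preceding frame (first frame)

def select_target_threshold(previous_state: Optional[str]) -> float:
    """Pick the target threshold from the previously determined mouth state."""
    if previous_state is None:   # the current frame is the first frame of the video
        return INITIAL_THRESHOLD
    if previous_state == "open":
        return FIRST_THRESHOLD   # first threshold < second threshold
    return SECOND_THRESHOLD

def detect_mouth_state(mouth_open_distance: float,
                       previous_state: Optional[str]) -> str:
    """Compare the mouth-opening distance with the state-dependent target threshold."""
    target_threshold = select_target_threshold(previous_state)
    return "open" if mouth_open_distance > target_threshold else "closed"

def detect_video(mouth_open_distances) -> list:
    """Run the detection frame by frame over a sequence of per-frame distances."""
    previous_state = None
    states = []
    for distance in mouth_open_distances:
        previous_state = detect_mouth_state(distance, previous_state)
        states.append(previous_state)
    return states
```

Run over the same hovering distance sequence as in the single-threshold sketch, this flow no longer flips the state every frame, because leaving the current state requires crossing a different, state-dependent threshold.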
In some embodiments, determining the target threshold based on the previously determined mouth open/closed state of the face object in the frame preceding the current frame comprises: in response to determining that the mouth open/closed state of the face object in the preceding frame is open, determining a preset first threshold as the target threshold; and in response to determining that the mouth open/closed state of the face object in the preceding frame is closed, determining a preset second threshold as the target threshold, where the first threshold is less than the second threshold.
In some embodiments, determining the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold comprises: in response to determining that the mouth-opening distance is greater than the target threshold, determining that the mouth open/closed state of the face object in the current frame is open; and in response to determining that the mouth-opening distance is not greater than the target threshold, determining that the mouth open/closed state of the face object in the current frame is closed.
In some embodiments, the method further includes: in response to determining that no frame precedes the current frame, using a preset initial threshold as the target threshold, and determining the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
In some embodiments, the method further includes: in response to determining that the mouth open/closed state of the face object in the current frame is open, obtaining a target special effect and displaying the target special effect at the mouth position of the face object in the current frame.
In a second aspect, an embodiment of the present application provides a detection device. The device comprises: an acquiring unit, configured to obtain a facial keypoint detection result produced by performing facial keypoint detection on a face object in the current frame of a target video; a first determination unit, configured to determine, based on the facial keypoint detection result, a mouth-opening distance of the face object in the current frame; a second determination unit, configured to determine a target threshold based on a previously determined mouth open/closed state of the face object in the frame preceding the current frame; and a third determination unit, configured to determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
In some embodiments, the second determination unit comprises: a first determining module, configured to determine a preset first threshold as the target threshold in response to determining that the mouth open/closed state of the face object in the preceding frame is open; and a second determining module, configured to determine a preset second threshold as the target threshold in response to determining that the mouth open/closed state of the face object in the preceding frame is closed, where the first threshold is less than the second threshold.
In some embodiments, the third determination unit comprises: a third determining module, configured to determine that the mouth open/closed state of the face object in the current frame is open in response to determining that the mouth-opening distance is greater than the target threshold; and a fourth determining module, configured to determine that the mouth open/closed state of the face object in the current frame is closed in response to determining that the mouth-opening distance is not greater than the target threshold.
In some embodiments, the device further includes: a fourth determination unit, configured to, in response to determining that no frame precedes the current frame, take a preset initial threshold as the target threshold and determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
In some embodiments, the device further includes: a display unit, configured to, in response to determining that the mouth open/closed state of the face object in the current frame is open, obtain a target special effect and display the target special effect at the mouth position of the face object in the current frame.
In a third aspect, an embodiment of the present application provides an electronic device, comprising: one or more processors; and a storage device on which one or more programs are stored, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any embodiment of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method of any embodiment of the first aspect.
According to the detection method and device provided by the embodiments of the present application, a facial keypoint detection result is obtained by performing facial keypoint detection on the face object in the current frame of a target video, so that the mouth-opening distance of the face object in the current frame can be determined from that result. A target threshold is then determined based on the mouth open/closed state of the face object in the preceding frame, so that the mouth open/closed state of the face object in the current frame can be determined by comparing the determined mouth-opening distance with the target threshold. In this way, the target threshold against which the mouth-opening distance is compared is determined from the mouth open/closed state of the face object in the preceding frame; that is, the influence of the mouth open/closed state of the face object in the preceding frame on the mouth open/closed state of the face object in the current frame is taken into account. This improves the accuracy of detecting the mouth open/closed state of the face object in the video.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the following detailed description of non-restrictive embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which an embodiment of the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the detection method according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the detection method according to the present application;
Fig. 4 is a flowchart of another embodiment of the detection method according to the present application;
Fig. 5 is a structural schematic diagram of one embodiment of the detection device according to the present application;
Fig. 6 is a structural schematic diagram of a computer system suitable for implementing an electronic device of an embodiment of the present application.
Detailed description of embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention and do not limit that invention. It should also be noted that, for convenience of description, only the parts relevant to the related invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features of those embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which the detection method or detection device of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102 and 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages. Various communication client applications may be installed on the terminal devices 101, 102, 103, such as voice-interaction applications, shopping applications, search applications, instant-messaging tools, mailbox clients, and social-platform software.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices with a display screen and support for web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, and desktop computers. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above, and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
When the terminal devices 101, 102, 103 are hardware, an image acquisition device may also be installed on them. The image acquisition device may be any device capable of acquiring images, such as a camera or a sensor. The user may use the image acquisition device on the terminal devices 101, 102, 103 to capture video.
The terminal devices 101, 102, 103 may perform processing such as face detection and facial keypoint detection on frames of the videos they play or of the videos the user records; they may also analyze and compute the facial keypoint detection results to determine the mouth-opening distance of the face object in each frame of the video; and they may further select a target threshold based on the mouth open/closed state of a given frame, and use that target threshold together with the mouth-opening distance in the next frame to detect the mouth open/closed state of the face object in the next frame and obtain a detection result.
The server 105 may be a server providing various services, for example a video-processing server that stores, manages, or analyzes the videos uploaded by the terminal devices 101, 102, 103. The video-processing server may store a large number of videos and may send videos to the terminal devices 101, 102, 103.
It should be noted that the server 105 may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module. No specific limitation is made here.
It should be noted that the detection method provided by the embodiments of the present application is generally performed by the terminal devices 101, 102, 103; correspondingly, the detection device is generally disposed in the terminal devices 101, 102, 103.
It should be pointed out that, where the terminal devices 101, 102, 103 can implement the relevant functions of the server 105, the server 105 may be omitted from the system architecture 100.
It should also be noted that the server 105 may also perform processing such as face detection, facial keypoint detection, and mouth open/closed state detection on the videos it stores or the videos uploaded by the terminal devices 101, 102, 103, and return the processing results to the terminal devices 101, 102, 103. In this case, the detection method provided by the embodiments of the present application may also be performed by the server 105; correspondingly, the detection device may also be disposed in the server 105.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs.
With continued reference to Fig. 2, a flow 200 of one embodiment of the detection method according to the present application is shown. The detection method comprises the following steps:
Step 201: obtain a facial keypoint detection result produced by performing facial keypoint detection on a face object in the current frame of a target video.
In this embodiment, the executing body of the detection method (for example, the terminal devices 101, 102, 103 shown in Fig. 1) may record or play video. The video it plays may be a video stored locally in advance, or a video obtained from a server (for example, the server 105 shown in Fig. 1) through a wired or wireless connection. Here, when a video is being recorded, an image acquisition device (for example, a camera) may be installed on or connected to the executing body. It should be noted that the wireless connection may include, but is not limited to, 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other wireless connections now known or developed in the future.
In this embodiment, the executing body may obtain the facial keypoint detection result produced by performing facial keypoint detection on the face object in the current frame of the target video. The target video may be a video currently being played, or a video the user is recording; no limitation is made here. The facial keypoint detection result may include the position of each facial keypoint (which may be expressed as coordinates). In practice, facial keypoints may be key points in the face (for example, points with semantic information, or points that affect the facial contour or the shape of the facial features). The facial keypoint detection result may include the coordinates of the center of the upper lip, the coordinates of the center of the lower lip, and so on.
Here, the current frame of the target video may be the frame of the target video on which mouth open/closed state detection is to be performed for the face object it contains. As an example, the executing body may perform mouth open/closed state detection on the face object in each frame of the target video in turn, in the order of the frames' timestamps. The frame currently awaiting mouth open/closed state detection may be referred to as the current frame of the target video. Take the following two scenarios as examples:
In one scenario, the target video may be a video being played by the executing body. During playback of the target video, the executing body may perform facial keypoint detection on each frame to be played, one by one, to obtain the facial keypoint detection result of the face object in that frame, so as to perform mouth open/closed state detection on the face object in the frame and then play the frame. The frame to be played at the current moment may be the current frame.
In another scenario, the target video may be a video being recorded by the executing body. During recording of the target video, the executing body may perform facial keypoint detection on each captured frame, one by one, to obtain the facial keypoint detection result of the face object in that frame, so as to perform mouth open/closed state detection on the face object in the frame and then display the frame. The latest frame captured at the current moment may be the current frame.
It should be noted that facial keypoint detection may be performed in various ways. For example, a facial keypoint detection model for performing facial keypoint detection on images may be stored in the executing body in advance. For each frame of the target video, the frame may be input into the facial keypoint detection model to obtain the facial keypoint detection result. Here, the facial keypoint detection model may be obtained by supervised training of an existing convolutional neural network on a sample set using machine learning methods, and the convolutional neural network may use various existing structures, such as DenseBox, VGGNet, ResNet, or SegNet. The machine learning and supervised training methods mentioned above are well-known techniques that are widely researched and applied at present and are not described in detail here.
In some optional implementations of this embodiment, a face detection model for performing face detection on images may also be stored in the executing body in advance. In that case, when performing mouth open/closed state detection on a frame, the executing body may first input the frame into the face detection model to obtain a face detection result (for example, the position of the region where the face object is located, i.e., the position of a face detection box). The region where the face object is located may then be cropped to obtain a face image, and the face image may be input into the facial keypoint detection model to obtain the facial keypoint detection result.
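A sketch of this optional two-stage pipeline is given below. The `face_detector` and `keypoint_model` callables stand in for whichever pre-stored models the executing body uses; their call signatures, return formats, and the keypoint names are assumptions made only for illustration.

```python
import numpy as np

def detect_face_keypoints(frame: np.ndarray, face_detector, keypoint_model) -> dict:
    """Face detection first, then keypoint detection on the cropped face image.

    `face_detector(frame)` is assumed to return one bounding box (x, y, w, h), and
    `keypoint_model(face_image)` a dict of named keypoint coordinates expressed in
    face-image pixels; both interfaces are hypothetical.
    """
    x, y, w, h = face_detector(frame)        # position of the face detection box
    face_image = frame[y:y + h, x:x + w]     # crop the region where the face object is located
    keypoints = keypoint_model(face_image)   # e.g. {"upper_lip_center": (u, v), ...}
    # Shift the keypoints back into full-frame coordinates for downstream use.
    return {name: (px + x, py + y) for name, (px, py) in keypoints.items()}
```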
Step 202: determine, based on the facial keypoint detection result, the mouth-opening distance of the face object in the current frame.
In this embodiment, the executing body may first determine a scaling factor for the face object based on the facial keypoint detection result. For example, the distance from the forehead coordinates to the chin coordinates in the facial keypoint detection result may be calculated, and the ratio of that distance to a preset distance may be determined and taken as the scaling factor. Then, since the facial keypoint detection result may include the coordinates of the center of the upper lip and the coordinates of the center of the lower lip of the face object, the executing body may calculate the distance between these two coordinates and divide it by the scaling factor to obtain the mouth-opening distance. It should be noted that other distances may also be used to determine the scaling factor; no limitation is made here. For example, the ratio of the distance between the left and right mouth corners to another preset distance may be used as the scaling factor.
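A possible implementation of this scale-normalized distance is sketched below. The keypoint names and the reference value are assumptions; the application only requires some preset distance against which the face scale is normalized.

```python
import math

REFERENCE_FACE_HEIGHT = 1.0  # the preset distance used to normalize out face scale (assumed value)

def euclidean(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def mouth_open_distance(keypoints: dict) -> float:
    """Lip-center distance divided by a scale factor derived from the forehead-chin distance.

    Keypoint names are hypothetical; another stable landmark pair (e.g. the left and
    right mouth corners) could be used to estimate the scaling factor instead.
    """
    scale = euclidean(keypoints["forehead"], keypoints["chin"]) / REFERENCE_FACE_HEIGHT
    lip_gap = euclidean(keypoints["upper_lip_center"], keypoints["lower_lip_center"])
    return lip_gap / scale
```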
It should be noted that the executing body may also perform face detection in advance, scale the face object region after it has been determined, and then perform facial keypoint detection. In that case, the distance between the coordinates of the center of the upper lip and the coordinates of the center of the lower lip in the facial keypoint detection result is itself the mouth-opening distance.
Step 203: determine a target threshold based on the previously determined mouth open/closed state of the face object in the frame preceding the current frame.
In this embodiment, since the executing body may perform mouth open/closed state detection on the face objects in the frames of the target video in turn, by the time the current frame is processed the executing body has already determined the mouth open/closed state of the face object in the frame preceding the current frame. The executing body may therefore determine the target threshold based on that previously determined mouth open/closed state. Here, the target threshold may be a threshold selected by the executing body, based on the mouth open/closed state of the face object in the preceding frame, from multiple thresholds preset by the user. The target threshold differs depending on the mouth open/closed state of the face object in the preceding frame.
It should be noted that the face object in the preceding frame and in the current frame may be the face of the same person. For example, while a user is recording a selfie video, the face object in both the current frame and the preceding frame is the face of that user.
In some optional implementations of this embodiment, in response to determining that the mouth open/closed state of the face object in the preceding frame is open, a preset first threshold may be determined as the target threshold; in response to determining that the mouth open/closed state of the face object in the preceding frame is closed, a preset second threshold may be determined as the target threshold, where the first threshold may be less than the second threshold. It should be noted that, in this implementation, the two thresholds (the first threshold and the second threshold) may be set in advance by a technician based on extensive data statistics and testing.
In some optional implementations of this embodiment, a technician may set multiple thresholds of different sizes in advance based on extensive data statistics and testing. In response to determining that the mouth open/closed state of the face object in the preceding frame is open, the executing body may take any one of the multiple thresholds that is greater than a preset intermediate value as the target threshold. In response to determining that the mouth open/closed state of the face object in the preceding frame is closed, the executing body may take any one of the multiple thresholds that is less than the preset intermediate value as the target threshold. Here, the preset intermediate value may be the average of the multiple thresholds, or any value greater than the minimum of the multiple thresholds and less than their maximum.
In the previous approach, a single threshold is usually set: when the mouth-opening distance is greater than this preset threshold, the mouth is considered open; when the mouth-opening distance is less than the threshold, the mouth is considered closed. With that approach, when the mouth-opening distance is near the threshold, the detection result jumps back and forth, so both the stability and the accuracy of the detection result are poor. By determining the target threshold based on the mouth open/closed state of the face object in the preceding frame, as in this embodiment, different thresholds are selected for different states, which prevents the detection result from jumping frequently and improves the stability and accuracy of the detection result.
In some optional implementations of this embodiment, if there is no frame preceding the current frame, i.e., the current frame is the first frame of the target video, the executing body may take a preset initial threshold as the target threshold and determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold. For example, if the mouth-opening distance is greater than the target threshold, the mouth may be determined to be open; if the mouth-opening distance is not greater than the target threshold, the mouth may be determined to be closed. Here, the initial threshold may be set according to actual needs.
In some optional implementations of this embodiment, the initial threshold may be greater than the first threshold and less than the second threshold.
Step 204: determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
In this embodiment, the executing body may determine the mouth open/closed state of the face object in the current frame based on a comparison of the mouth-opening distance determined in step 202 with the target threshold determined in step 203.
In some optional implementations of this embodiment, in response to determining that the mouth-opening distance is greater than the target threshold, the executing body may determine that the mouth open/closed state of the face object in the current frame is open; in response to determining that the mouth-opening distance is not greater than the target threshold, the executing body may determine that the mouth open/closed state of the face object in the current frame is closed.
In some optional implementations of this embodiment, in response to determining that the mouth-opening distance is greater than the target threshold, the executing body may determine that the mouth open/closed state of the face object in the current frame is open; in response to determining that the mouth-opening distance is less than the target threshold, the executing body may determine that the mouth open/closed state of the face object in the current frame is closed; and in response to determining that the mouth-opening distance is equal to the target threshold, the executing body may take the mouth open/closed state of the preceding frame as the mouth open/closed state of the current frame.
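This variant can be written as a small three-branch rule (a sketch only; states are represented here as plain strings):

```python
def decide_state(mouth_open_distance: float, target_threshold: float,
                 previous_state: str) -> str:
    """Variant rule: equality with the target threshold keeps the previous frame's state."""
    if mouth_open_distance > target_threshold:
        return "open"
    if mouth_open_distance < target_threshold:
        return "closed"
    return previous_state  # distance equals the threshold: carry the previous state over
```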
In some optional implementations of this embodiment, in response to determining that the mouth open/closed state of the face object in the current frame is open, the executing body obtains a target special effect (for example, a mouth sticker) and displays the target special effect at the mouth position of the face object in the current frame.
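One possible way to render such an effect is sketched below. The application does not prescribe any rendering API; the use of OpenCV and NumPy, the rectangular paste, and the omission of alpha blending and border clipping are simplifying assumptions.

```python
import cv2
import numpy as np

def overlay_mouth_sticker(frame: np.ndarray, sticker: np.ndarray,
                          mouth_center: tuple, mouth_width: int) -> np.ndarray:
    """Paste a pre-loaded sticker image centered on the detected mouth position.

    A plain rectangular paste for illustration; a production effect would normally
    blend an alpha channel and clip the sticker against the frame borders.
    """
    h, w = sticker.shape[:2]
    scale = mouth_width / float(w)                       # match the sticker width to the mouth width
    sticker = cv2.resize(sticker, (int(w * scale), int(h * scale)))
    sh, sw = sticker.shape[:2]
    cx, cy = int(mouth_center[0]), int(mouth_center[1])
    x0, y0 = cx - sw // 2, cy - sh // 2                  # top-left corner of the paste region
    frame[y0:y0 + sh, x0:x0 + sw] = sticker[:, :, :3]
    return frame
```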
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the detection method according to this embodiment. In the application scenario of Fig. 3, a user records a target video using the selfie mode of a terminal device 301. After capturing the current frame, the terminal device performs facial keypoint detection on the current frame using the facial keypoint detection model it stores and obtains a facial keypoint detection result 302. The terminal device 301 then determines, based on the facial keypoint detection result 302, a mouth-opening distance 303 of the face object in the current frame. Next, the terminal device 301 obtains the mouth open/closed state 304 of the face object in the preceding frame and can thereby determine a target threshold 305. Finally, based on a comparison of the target threshold 305 with the mouth-opening distance 303, the terminal device 301 can determine the mouth open/closed state 306 of the face object in the current frame.
According to the method provided by the above embodiment of the present application, a facial keypoint detection result is obtained by performing facial keypoint detection on the face object in the current frame of a target video, so that the mouth-opening distance of the face object in the current frame can be determined from that result. A target threshold is then determined based on the mouth open/closed state of the face object in the preceding frame, so that the mouth open/closed state of the face object in the current frame can be determined by comparing the determined mouth-opening distance with the target threshold. In this way, the target threshold against which the mouth-opening distance is compared is determined from the mouth open/closed state of the face object in the preceding frame; that is, the influence of the mouth open/closed state of the face object in the preceding frame on the mouth open/closed state of the face object in the current frame is taken into account. By selecting different target thresholds in this way, frequent jumps in the detection result can be avoided, improving the stability and accuracy of the detection result for the mouth open/closed state of the face object in the video.
With further reference to Fig. 4, a flow 400 of another embodiment of the detection method is shown. The flow 400 of the detection method comprises the following steps:
Step 401: obtain a facial keypoint detection result produced by performing facial keypoint detection on a face object in the current frame of a target video.
In this embodiment, the executing body of the detection method (for example, the terminal devices 101, 102, 103 shown in Fig. 1) obtains the facial keypoint detection result produced by performing facial keypoint detection on the face object in the current frame of the target video.
In this embodiment, before facial keypoint detection is performed on the face object in the current frame, face detection may also be performed on the current frame in advance to determine the region where the face object is located. In addition, after the region where the face object is located has been determined, that region may also be scaled so that its size (for example, its length) equals a preset size (for example, a preset length).
Step 402: determine, based on the facial keypoint detection result, the mouth-opening distance of the face object in the current frame.
In this embodiment, since the facial keypoint detection result may include the coordinates of the center of the upper lip and the coordinates of the center of the lower lip of the face object, the executing body may calculate the distance between these two coordinates and determine that distance as the mouth-opening distance.
Step 403: in response to determining that the mouth open/closed state of the face object in the preceding frame is open, determine a preset first threshold as the target threshold.
In this embodiment, in response to determining that the mouth open/closed state of the face object in the preceding frame is open, the executing body may determine a preset first threshold (for example, 0.2) as the target threshold.
Step 404: in response to determining that the mouth open/closed state of the face object in the preceding frame is closed, determine a preset second threshold as the target threshold.
In this embodiment, in response to determining that the mouth open/closed state of the face object in the preceding frame is closed, the executing body may determine a preset second threshold as the target threshold, where the first threshold may be less than the second threshold.
Step 405: determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
In this embodiment, the electronic device may determine the mouth open/closed state of the face object in the current frame based on a comparison of the mouth-opening distance with the target threshold. Specifically, in response to determining that the mouth-opening distance is greater than the target threshold, it may determine that the mouth open/closed state of the face object in the current frame is open; in response to determining that the mouth-opening distance is not greater than the target threshold, it may determine that the mouth open/closed state of the face object in the current frame is closed.
Step 406: in response to determining that the mouth open/closed state of the face object in the current frame is open, obtain a target special effect and display the target special effect at the mouth position of the face object in the current frame.
In this embodiment, in response to determining that the mouth open/closed state of the face object in the current frame is open, the executing body may obtain a target special effect (for example, a mouth sticker) and display the target special effect at the mouth position of the face object in the current frame.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the detection method in this embodiment involves the step of detecting the mouth open/closed state by setting dual thresholds. The solution described in this embodiment therefore determines the target threshold based on the mouth open/closed state of the face object in the preceding frame; by selecting different thresholds, frequent jumps in the detection result can be avoided, improving the stability and accuracy of the detection result. In addition, this embodiment further involves the step of displaying the target special effect after the mouth open/closed state is determined to be open, which can enrich the presentation forms of the video.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of a detection device. This device embodiment corresponds to the method embodiment shown in Fig. 2, and the device may be applied to various electronic devices.
As shown in Fig. 5, the detection device 500 of this embodiment includes: an acquiring unit 501, configured to obtain a facial keypoint detection result produced by performing facial keypoint detection on a face object in the current frame of a target video; a first determination unit 502, configured to determine, based on the facial keypoint detection result, a mouth-opening distance of the face object in the current frame; a second determination unit 503, configured to determine a target threshold based on a previously determined mouth open/closed state of the face object in the frame preceding the current frame; and a third determination unit 504, configured to determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
In some optional implementations of this embodiment, the second determination unit 503 may include a first determining module and a second determining module (not shown in the figure). The first determining module may be configured to determine a preset first threshold as the target threshold in response to determining that the mouth open/closed state of the face object in the preceding frame is open. The second determining module may be configured to determine a preset second threshold as the target threshold in response to determining that the mouth open/closed state of the face object in the preceding frame is closed, where the first threshold is less than the second threshold.
In some optional implementations of this embodiment, the third determination unit 504 may include a third determining module and a fourth determining module (not shown in the figure). The third determining module may be configured to determine that the mouth open/closed state of the face object in the current frame is open in response to determining that the mouth-opening distance is greater than the target threshold. The fourth determining module may be configured to determine that the mouth open/closed state of the face object in the current frame is closed in response to determining that the mouth-opening distance is not greater than the target threshold.
In some optional implementations of this embodiment, the device may further include a fourth determination unit (not shown in the figure). The fourth determination unit may be configured to, in response to determining that no frame precedes the current frame, take a preset initial threshold as the target threshold and determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
In some optional implementations of this embodiment, the device may further include a display unit (not shown in the figure). The display unit may be configured to, in response to determining that the mouth open/closed state of the face object in the current frame is open, obtain a target special effect and display the target special effect at the mouth position of the face object in the current frame.
According to the device provided by the above embodiment of the present application, the acquiring unit 501 obtains a facial keypoint detection result produced by performing facial keypoint detection on the face object in the current frame of a target video, so that the first determination unit 502 can determine, based on the facial keypoint detection result, the mouth-opening distance of the face object in the current frame. The second determination unit 503 then determines a target threshold based on the mouth open/closed state of the face object in the preceding frame, so that the third determination unit 504 can determine the mouth open/closed state of the face object in the current frame by comparing the target threshold with the determined mouth-opening distance. In this way, the target threshold against which the mouth-opening distance is compared is determined from the mouth open/closed state of the face object in the preceding frame; that is, the influence of the mouth open/closed state in the preceding frame on the mouth open/closed state in the current frame is taken into account. By selecting different target thresholds, frequent jumps in the detection result can be avoided, improving the stability and accuracy of the detection result for the mouth open/closed state of the face object in the video.
Referring now to Fig. 6, a structural schematic diagram of a computer system 600 suitable for implementing an electronic device of an embodiment of the present application is shown. The electronic device shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. Various programs and data required for the operation of the system 600 are also stored in the RAM 603. The CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two successive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquiring unit, a first determination unit, a second determination unit, and a third determination unit. The names of these units do not, in some cases, limit the units themselves; for example, the acquiring unit may also be described as "a unit for obtaining a facial keypoint detection result produced by performing facial keypoint detection on a face object in the current frame of a target video".
As another aspect, the present application also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: obtain a facial keypoint detection result produced by performing facial keypoint detection on a face object in the current frame of a target video; determine, based on the facial keypoint detection result, a mouth-opening distance of the face object in the current frame; determine a target threshold based on a previously determined mouth open/closed state of the face object in the frame preceding the current frame; and determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
The above description is only a preferred embodiment of the present application and an explanation of the technical principles employed. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, but also covers other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (12)

1. A detection method, comprising:
obtaining a facial keypoint detection result produced by performing facial keypoint detection on a face object in a current frame of a target video;
determining, based on the facial keypoint detection result, a mouth-opening distance of the face object in the current frame;
determining a target threshold based on a previously determined mouth open/closed state of the face object in a frame preceding the current frame; and
determining the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
2. The detection method according to claim 1, wherein the determining a target threshold based on a previously determined mouth open/closed state of the face object in the frame preceding the current frame comprises:
in response to determining that the mouth open/closed state of the face object in the preceding frame is open, determining a preset first threshold as the target threshold; and
in response to determining that the mouth open/closed state of the face object in the preceding frame is closed, determining a preset second threshold as the target threshold, wherein the first threshold is less than the second threshold.
3. The detection method according to claim 1, wherein the determining the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold comprises:
in response to determining that the mouth-opening distance is greater than the target threshold, determining that the mouth open/closed state of the face object in the current frame is open; and
in response to determining that the mouth-opening distance is not greater than the target threshold, determining that the mouth open/closed state of the face object in the current frame is closed.
4. The detection method according to claim 1, wherein the method further comprises:
in response to determining that no frame precedes the current frame, taking a preset initial threshold as the target threshold, and determining the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
5. The detection method according to claim 1, wherein the method further comprises:
in response to determining that the mouth open/closed state of the face object in the current frame is open, obtaining a target special effect, and displaying the target special effect at a mouth position of the face object in the current frame.
6. A detection device, comprising:
an acquiring unit, configured to obtain a facial keypoint detection result produced by performing facial keypoint detection on a face object in a current frame of a target video;
a first determination unit, configured to determine, based on the facial keypoint detection result, a mouth-opening distance of the face object in the current frame;
a second determination unit, configured to determine a target threshold based on a previously determined mouth open/closed state of the face object in a frame preceding the current frame; and
a third determination unit, configured to determine the mouth open/closed state of the face object in the current frame by comparing the mouth-opening distance with the target threshold.
7. The detection device according to claim 6, wherein the second determination unit comprises:
a first determining module, configured to determine a preset first threshold as the target threshold in response to determining that the mouth open/closed state of the face object in the previous frame is an open state; and
a second determining module, configured to determine a preset second threshold as the target threshold in response to determining that the mouth open/closed state of the face object in the previous frame is a closed state, wherein the first threshold is less than the second threshold.
8. The detection device according to claim 6, wherein the third determination unit comprises:
a third determining module, configured to determine that the mouth open/closed state of the face object in the current frame is an open state in response to determining that the mouth opening distance is greater than the target threshold; and
a fourth determining module, configured to determine that the mouth open/closed state of the face object in the current frame is a closed state in response to determining that the mouth opening distance is not greater than the target threshold.
9. The detection device according to claim 6, wherein the device further comprises:
a fourth determination unit, configured to, in response to determining that the current frame has no previous frame, use a preset initial threshold as the target threshold and determine the mouth open/closed state of the face object in the current frame based on the comparison between the mouth opening distance and the target threshold.
10. The detection device according to claim 6, wherein the device further comprises:
a display unit, configured to, in response to determining that the mouth open/closed state of the face object in the current frame is an open state, obtain a target special effect and display the target special effect at the mouth position of the face object in the current frame.
11. An electronic device, comprising:
one or more processors; and
a storage device having one or more programs stored thereon,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 5.
12. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 5.
CN201811075036.0A 2018-09-14 2018-09-14 Detection method and device Active CN109271929B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201811075036.0A CN109271929B (en) 2018-09-14 2018-09-14 Detection method and device
PCT/CN2018/115973 WO2020052062A1 (en) 2018-09-14 2018-11-16 Detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811075036.0A CN109271929B (en) 2018-09-14 2018-09-14 Detection method and device

Publications (2)

Publication Number Publication Date
CN109271929A true CN109271929A (en) 2019-01-25
CN109271929B CN109271929B (en) 2020-08-04

Family

ID=65189111

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811075036.0A Active CN109271929B (en) 2018-09-14 2018-09-14 Detection method and device

Country Status (2)

Country Link
CN (1) CN109271929B (en)
WO (1) WO2020052062A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111898529B (en) * 2020-07-29 2022-07-19 北京字节跳动网络技术有限公司 Face detection method and device, electronic equipment and computer readable medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104951730B (en) * 2014-03-26 2018-08-31 联想(北京)有限公司 A kind of lip moves detection method, device and electronic equipment
TWI603270B (en) * 2014-12-11 2017-10-21 由田新技股份有限公司 Method and apparatus for detecting person to use handheld device
CN106897658B (en) * 2015-12-18 2021-12-14 腾讯科技(深圳)有限公司 Method and device for identifying human face living body
KR101797870B1 (en) * 2016-08-12 2017-11-14 라인 가부시키가이샤 Method and system for measuring quality of video call
CN106650624A (en) * 2016-11-15 2017-05-10 东软集团股份有限公司 Face tracking method and device
JP6940742B2 (en) * 2016-11-24 2021-09-29 キヤノンマーケティングジャパン株式会社 Information processing equipment, information processing methods, programs

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794464A (en) * 2015-05-13 2015-07-22 上海依图网络科技有限公司 In vivo detection method based on relative attributes
CN105518582A (en) * 2015-06-30 2016-04-20 北京旷视科技有限公司 Vivo detection method and device, computer program product
CN106709400A (en) * 2015-11-12 2017-05-24 阿里巴巴集团控股有限公司 Sense organ opening and closing state recognition method, sense organ opening and closing state recognition device and client
CN107358153A (en) * 2017-06-02 2017-11-17 广州视源电子科技股份有限公司 A kind of mouth method for testing motion and device and vivo identification method and system
CN107368777A (en) * 2017-06-02 2017-11-21 广州视源电子科技股份有限公司 A kind of smile motion detection method and device and vivo identification method and system

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008922A (en) * 2019-04-12 2019-07-12 腾讯科技(深圳)有限公司 Image processing method, unit, medium for terminal device
CN110188712A (en) * 2019-06-03 2019-08-30 北京字节跳动网络技术有限公司 Method and apparatus for handling image
CN110188712B (en) * 2019-06-03 2021-10-12 北京字节跳动网络技术有限公司 Method and apparatus for processing image
CN114359673A (en) * 2022-01-10 2022-04-15 北京林业大学 Small sample smoke detection method, device and equipment based on metric learning
CN114359673B (en) * 2022-01-10 2024-04-09 北京林业大学 Small sample smoke detection method, device and equipment based on metric learning

Also Published As

Publication number Publication date
WO2020052062A1 (en) 2020-03-19
CN109271929B (en) 2020-08-04

Similar Documents

Publication Publication Date Title
US20240193948A1 (en) Systems and methods for generating media content
CN109214343A (en) Method and apparatus for generating face critical point detection model
CN109376267A (en) Method and apparatus for generating model
CN109308469B (en) Method and apparatus for generating information
CN109271929A (en) Detection method and device
CN109308490A (en) Method and apparatus for generating information
CN109492128A (en) Method and apparatus for generating model
CN109344908A (en) Method and apparatus for generating model
CN109446990A (en) Method and apparatus for generating information
CN108197618A (en) For generating the method and apparatus of Face datection model
CN109447156A (en) Method and apparatus for generating model
CN109086719A (en) Method and apparatus for output data
CN108171204B (en) Detection method and device
CN109241921A (en) Method and apparatus for detecting face key point
CN108171211A (en) Biopsy method and device
CN109447246A (en) Method and apparatus for generating model
CN109145828A (en) Method and apparatus for generating video classification detection model
CN109829432A (en) Method and apparatus for generating information
CN109389072A (en) Data processing method and device
CN108510454A (en) Method and apparatus for generating depth image
CN110263748A (en) Method and apparatus for sending information
CN109784304A (en) Method and apparatus for marking dental imaging
CN109389096A (en) Detection method and device
CN110472558A (en) Image processing method and device
CN109325996A (en) Method and apparatus for generating information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Patentee before: Tiktok vision (Beijing) Co.,Ltd.