CN113723306A - Push-up detection method, device and computer readable medium - Google Patents

Push-up detection method, device and computer readable medium

Info

Publication number
CN113723306A
Authority
CN
China
Prior art keywords
user
push
video frame
video
face image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111014524.2A
Other languages
Chinese (zh)
Other versions
CN113723306B (en)
Inventor
陈大年
梁文昭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhangmen Science and Technology Co Ltd
Original Assignee
Shanghai Zhangmen Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhangmen Science and Technology Co Ltd filed Critical Shanghai Zhangmen Science and Technology Co Ltd
Priority to CN202111014524.2A priority Critical patent/CN113723306B/en
Publication of CN113723306A publication Critical patent/CN113723306A/en
Application granted granted Critical
Publication of CN113723306B publication Critical patent/CN113723306B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/70 - Determining position or orientation of objects or cameras
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/10 - Image acquisition modality
    • G06T2207/10016 - Video; Image sequence
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/30 - Subject of image; Context of image processing
    • G06T2207/30196 - Human being; Person
    • G06T2207/30201 - Face

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

According to the push-up detection method, device, and computer readable medium provided by the embodiments of the application, the scheme acquires a push-up video that includes an image of the user's face, obtains a video frame sequence from that video, identifies the position information of the user face image in each video frame, determines the size information of the user face image in each frame from that position information, and identifies the user's push-up actions from the change in the size information across the video frame sequence. The user can therefore capture the required push-up video with any camera-equipped user device, such as a mobile phone or tablet computer, and each individual push-up action can be accurately detected by the subsequent processing, meeting the user's social needs around push-ups.

Description

Push-up detection method, device and computer readable medium
Technical Field
The present application relates to the field of information technology, and in particular, to a push-up detection method, device, and computer readable medium.
Background
Push-ups are a common, everyday form of exercise, and in some social scenarios people want to build social interaction around them, for example by matching push-up counts online or by daily push-up check-ins that encourage continued exercise. However, there is currently no means of accurately judging, through a mobile device such as a phone, whether push-ups were genuinely performed, so whether a user's push-ups are real and effective cannot be measured objectively, and both the fun of such social features and their day-to-day supervisory value are lost. How to provide a scheme that accurately detects a user's push-up behavior is therefore an urgent problem to be solved.
Disclosure of Invention
An object of the present application is to provide a push-up detection method, apparatus, and computer readable medium that address the current lack of a scheme capable of accurately detecting push-up behavior.
To achieve the above object, some embodiments of the present application provide a push-up detection method, including:
acquiring a push-up video comprising a face image of a user;
acquiring a video frame sequence from the push-up video, wherein the video frame sequence at least comprises a plurality of video frames arranged according to a time sequence;
identifying the position information of the face image of the user in the corresponding video frame;
determining the size information of the user face image in each video frame according to the position information of the user face image in each video frame;
and identifying the push-up action of the user according to the change condition of the size information of the face image of the user in the video frame sequence.
Furthermore, an embodiment of the present application also provides a push-up detection apparatus, which includes a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform the push-up detection method.
Embodiments of the present application also provide a computer readable medium, on which computer program instructions are stored, the computer program instructions being executable by a processor to implement the push-up detection method.
Compared with the prior art, the push-up detection scheme provided by the embodiments of the application acquires a push-up video including an image of the user's face, obtains a video frame sequence from that video, identifies the position information of the user face image in each video frame, determines the size information of the user face image in each frame from that position information, and identifies the user's push-up actions from the change in the size information across the video frame sequence. The user can therefore capture the required push-up video with any camera-equipped user device, such as a mobile phone or tablet computer, and each individual push-up action can be accurately detected by the subsequent processing, meeting the user's social needs around push-ups.
Drawings
Fig. 1 is a processing flow chart of a push-up detection method provided in an embodiment of the present application;
fig. 2 is a schematic diagram of the positional relationship when a user places the user equipment that collects the push-up video, in an embodiment of the present application;
FIG. 3 is a schematic diagram of a video frame obtained in an embodiment of the present application;
FIG. 4 is a schematic diagram of another video frame acquired in an embodiment of the present application;
FIG. 5 is a flow chart of a process for identifying a push-up action of a user in an embodiment of the present application;
fig. 6 is a processing flow chart of the counterfeit-identification part of the push-up detection method provided by an embodiment of the present application;
FIG. 7 is a flow chart of a process for implementing push-up detection by using the solution provided in the embodiments of the present application;
fig. 8 is a schematic structural diagram of a device for implementing push-up detection provided by an embodiment of the present application;
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In a typical configuration of the present application, the terminal and the devices serving the network each include one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, which include both permanent and non-permanent, removable and non-removable media, may store information by any method or technology. The information may be computer program instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
With the push-up detection method provided by the embodiments of the application, the required push-up video can be captured by the camera of any user device with a shooting function, such as a mobile phone or tablet computer, and every push-up action of the user can be accurately detected through subsequent processing, meeting the user's social needs around push-ups.
In a practical scenario, the execution subject of the method may be a user device, or a combination of a user device and a network device connected through a network. The data-processing parts of the scheme can be implemented locally on the user device, or implemented on the network device with the processing results provided to the user device over the network. For example, for the image processing of video frames, the processing of position and size information, and the computation of reliability, the user device can send the required information to the network device, which performs the computation with its own resources and returns the result to the user device. The interactive parts, such as the capture of the push-up video, are implemented by the user device, which can complete the capture with its own camera.
The user device may include, but is not limited to, terminal devices such as a computer, mobile phone, tablet computer, or smart watch, and the network device may include, but is not limited to, a network host, a single network server, a set of network servers, or a computer set based on cloud computing. Here, the cloud is made up of a large number of hosts or web servers based on cloud computing, a type of distributed computing in which one virtual computer consists of a collection of loosely coupled machines.
Fig. 1 shows a processing flow of a push-up detection method provided in an embodiment of the present application, which at least includes the following processing steps:
step S101, a push-up video including a face image of a user is acquired. In an actual scene, obtaining the push-up video may be achieved by various acquisition devices, where the acquisition devices may be cameras connected to the user equipment or cameras built in the user equipment, for example, front cameras of a mobile phone or a tablet computer may be used to capture actions of the user when the user performs push-up, and a lens includes images of the face of the user.
In the scheme provided by the embodiments of the application, the detection of push-up actions is based on the change in size of the user face image during subsequent processing. When the push-up video is captured, the shooting direction therefore needs to bear a certain relationship to the direction of the push-up motion, so that the user face image changes across video frames as the push-up action changes, improving detection accuracy.
In the embodiments of the application, the included angle between the shooting direction of the push-up video and the direction of the user's push-up motion can be set smaller than a preset value; limiting the angle to within, say, 5, 10, or 20 degrees gives a good detection result, and the push-up video including the user face image is then acquired with that shooting direction. For example, in an actual scene with a mobile device such as a phone or tablet, the device may be placed under the user's face and its front camera used to shoot, so that the lens points perpendicular to the horizontal plane supporting the user, producing a push-up video of the user's face that satisfies the angle constraint, as shown in fig. 2.
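To make the angle constraint concrete, the following is a minimal sketch (in Python; not part of the patent) of checking that the angle between the lens axis and the push-up motion direction stays below a preset value. The vector representation, the 20-degree default, and the use of the absolute dot product, so that a lens pointing up and a motion pointing down still count as aligned, are all illustrative assumptions.

    import math

    def shooting_angle_ok(lens_axis, motion_dir, max_degrees=20.0):
        # Angle between the two direction lines, ignoring sign: the lens axis
        # and the push-up motion lie on the same line when the phone is flat.
        dot = abs(sum(a * b for a, b in zip(lens_axis, motion_dir)))
        norm = math.dist(lens_axis, (0, 0, 0)) * math.dist(motion_dir, (0, 0, 0))
        angle = math.degrees(math.acos(min(1.0, dot / norm)))
        return angle < max_degrees

    # Phone flat under the user's face: lens points up (+z), motion is vertical (-z).
    print(shooting_angle_ok((0, 0, 1), (0, 0, -1)))  # True (angle between the lines is 0)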
In some embodiments of the present application, related video content may also be played to the user when a preset condition is satisfied, where the preset condition includes at least any one of: a push-up video including the user face image is currently being acquired; the currently identified push-up action of the user does not conform to the standard push-up action; or no new push-up action has been recognized within a preset time, for example when the user stops abruptly mid-exercise without cancelling the device's push-up detection function.
In an actual scene, the related video content played may differ according to which preset condition applies. It may be the push-up video currently being captured, so the user can watch the push-ups they are performing in real time; a push-up teaching video, so the user can learn the correct movement against a standard demonstration; or an interactive video of another user currently interacting with this user, so the user can do push-ups while interacting with others, improving the social experience.
In this way, on a mobile device with a front camera, the device can use its display to play content that improves the social experience while the push-up video is being collected, for example the captured push-up video itself, a push-up teaching video, or video of other users currently interacting with the user. The scheme thus suits mobile social scenarios: the user can conveniently watch the screen without interrupting the push-ups, improving the experience.
In some embodiments of the present application, detection trigger information may be acquired before the push-up video including the user face image is acquired. This enables triggered detection: by inputting trigger information when needed, the user actively instructs the device to shoot the push-up video and run the subsequent detection processing. In an actual scene, the trigger information may come from a trigger operation the user performs on the device; for example, when the user enters a push-up function or application and taps a start button, the device acquires the trigger information from that operation, starts shooting the push-up video, and executes the subsequent steps to complete push-up detection.
It will be understood by those skilled in the art that the specific input form of the trigger information is merely exemplary; other forms based on similar principles, whether existing now or arising later, fall within the protection scope of the present application if applicable to it, and are incorporated herein by reference.
Step S102, a video frame sequence is obtained from the push-up video, where the video frame sequence comprises at least a plurality of video frames arranged in time order.
When the video frame sequence is obtained, its granularity can be chosen to suit the actual application scenario. For the most accurate detection, every video frame of the push-up video can be used as the video frame sequence; the detection result is then most accurate because the sequence carries the complete information of the video, but the computational load is highest because the most frames must be processed.
To improve detection efficiency, the push-up video may instead be sampled when the video frame sequence is acquired, extracting frames at a certain ratio. For example, if the push-up video is shot at 30 frames per second, the sequence may be extracted at 6 frames per second, i.e. 1 frame out of every 5. In addition, to avoid missing key frames and losing key information during extraction, the sampling ratio can be adjusted dynamically according to the speed of the user's push-up actions: the faster the push-ups, the higher the sampling ratio, such as 10 frames per second, and the slower the push-ups, the lower the sampling ratio, such as 4 frames per second. This improves detection efficiency while avoiding the loss of key information, improving detection accuracy.
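As a minimal sketch of this sampling (assuming OpenCV is available; the file name and step size are illustrative), extracting 1 frame out of every 5 from a 30-frames-per-second push-up video yields the 6-frames-per-second video frame sequence described above:

    import cv2

    def sample_frames(video_path, step=5):
        # Return every `step`-th frame of the push-up video, in time order.
        cap = cv2.VideoCapture(video_path)
        frames, index = [], 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % step == 0:
                frames.append(frame)
            index += 1
        cap.release()
        return frames

    sequence = sample_frames("pushup.mp4")  # 30 fps input -> 6 fps frame sequence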
Step S103, identifying the position information of the face image of the user in the corresponding video frame.
The specific content of the user face image may be set according to actual application requirements: it may be an image of the user's five sense organs (overall facial features), eyes, mouth, nose, or facial skin, or a combination of these. The position information of the user face image in each video frame can be identified by performing image recognition on each video frame separately.
In this embodiment, the position information may be represented as coordinates: taking a vertex of the video frame's image area as the origin, a plane coordinate system is established with one pixel as the coordinate unit, so that the position of the user face image in the frame can be expressed as coordinates. The position may be anchored by alignment points preset according to the specific form of the user face image; for example, when an image of the user's mouth serves as the user face image, the rightmost pixel of the mouth image may be set as the alignment point used to determine the position information in each frame. More than one alignment point may be used where the scenario requires, for example 3 or 5.
In some embodiments of the present application, the recognition of the position information may be implemented by using a neural network algorithm, that is, each video frame is input into a neural network trained in advance, and the position information of the user face image in the corresponding video frame is output.
Step S104, the size information of the user face image in each video frame is determined from the position information of the user face image in that frame. The size can be expressed as pixel coordinate values, namely the pixel coordinates of the alignment point in the user face image. Taking the two video frames of fig. 3 and fig. 4 as an example, with the alignment point set as the rightmost point of the user's facial features and the origin at the lower-left vertex of the frame, the size information of the user face image in fig. 3 is the pixel coordinate value (x1, y1) of alignment point p1, and in fig. 4 it is the pixel coordinate value (x2, y2) of alignment point p2. In actual use, the coordinate values of only one axis may be used, for example only the x-axis values x1 and x2.
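The sketch below illustrates step S104 under stated assumptions: detect_face_points() is a hypothetical stand-in for whatever recognizer step S103 uses (the patent specifies a pre-trained neural network). Given the pixel coordinates it returns for the user face image, the rightmost point is taken as the alignment point and its x coordinate serves as the size information, as in the fig. 3 / fig. 4 example.

    def size_info(face_points):
        # face_points: list of (x, y) pixel coordinates of the user face image
        # in one video frame (the position information from step S103).
        # The rightmost point is the preset alignment point; its x coordinate
        # is used as the size information.
        alignment_point = max(face_points, key=lambda p: p[0])
        return alignment_point[0]

    # sizes = [size_info(detect_face_points(frame)) for frame in sequence]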
Step S105, the user's push-up actions are identified from the change in the size information of the user face image across the video frame sequence. Once the size information of each video frame in the sequence is determined, its change over time can be examined. Based on the motion law of a push-up, the continuous change of the size information should be periodic: each push-up action corresponds to one change cycle of the size information, "small-large-small" or "large-small-large", until the user stops doing push-ups.
In this way, after the continuous change of the size information across the video frames of the sequence is obtained, the user's push-up actions are identified from that change. The identified push-up actions can be recorded and stored locally or uploaded to the cloud; the recorded content can include the video corresponding to the push-up actions, their frequency, their number, and so on.
Fig. 5 shows a flow of a processing manner of identifying a push-up action of a user in an embodiment of the present application, which includes at least the following processing steps:
step S501, determining a change sequence of the size information of the face image of the user according to the time sequence of video frame arrangement in the video frame sequence. For example, if the size information in 10 consecutive video frames uses the pixel coordinate value of the x-axis, and the pixel coordinate values are arranged in time sequence, the following change sequence [840,800,850,900,950,910,845,795,850,899] can be formed.
Step S502, the maximum and minimum values of the size information in the change sequence are determined. To solve for maxima and minima, a function may be fitted to the discrete values of the change sequence and its derivative computed to locate them. In the change sequence of this embodiment, the maximum is 950, at the 5th video frame, and the minima are 800 and 795, at the 2nd and 8th video frames.
Step S503, a periodic change process based on the maximum and minimum values is recognized as one push-up action of the user. The periodic change may run from a maximum to a minimum and back to a maximum, or from a minimum to a maximum and back to a minimum. In this embodiment the period spans the 2nd through 8th video frames, so the user is recognized as having completed one push-up over that span. On this principle, all push-up actions made by the user in a stretch of video can be identified and accurately counted, meeting the user's social needs around push-ups.
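A minimal sketch of steps S501-S503 follows; it is an illustration, not the patent's exact procedure (which may instead fit a function and differentiate it). It locates the alternating local maxima and minima of the change sequence and counts each full minimum-maximum-minimum or maximum-minimum-maximum period as one push-up; a production version would also smooth the sequence and require a minimum amplitude.

    def count_pushups(sizes):
        extrema = []  # alternating "min"/"max" labels, in time order
        for i in range(1, len(sizes) - 1):
            if sizes[i] >= sizes[i - 1] and sizes[i] > sizes[i + 1]:
                kind = "max"
            elif sizes[i] <= sizes[i - 1] and sizes[i] < sizes[i + 1]:
                kind = "min"
            else:
                continue
            if not extrema or extrema[-1] != kind:
                extrema.append(kind)
        # one full period spans three consecutive alternating extrema
        return max(0, (len(extrema) - 1) // 2)

    # The example sequence from step S501 contains one period (frames 2 to 8):
    print(count_pushups([840, 800, 850, 900, 950, 910, 845, 795, 850, 899]))  # 1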
It should be understood by those skilled in the art that the specific forms of the location information and the size information are only examples, and other forms based on similar principles, which are present or later come into existence, should be included in the scope of the present application if applicable, and are included herein by reference. For example, the position information may be coordinates of pixel points forming the outline of the user face image, and the size information may be the number of pixel points within a range covered by the outline of the user face image, so that the push-up action of the user can be identified through the change condition of the size information in the form.
In other embodiments of the present application, to further improve the reliability of push-up identification and prevent the user from cheating by simulating the push-up motion with other movements, a counterfeit-identification module can be added on top of the push-up recognition. The push-up detection method provided by the embodiments of the present application may then further include the processing steps shown in fig. 6:
step S601, identifying heartbeat representation information of the user in the corresponding video frame. In some embodiments of the present application, a neural network algorithm may be used to identify the heartbeat characterization information, that is, each video frame is input into a neural network trained in advance, and the heartbeat characterization information of the user in the corresponding video frame is output.
Step S602, determining a heartbeat frequency of the user according to a change condition of the heartbeat characterization information in the video frame sequence.
The heartbeat characterization information comprises characterization information corresponding to systole and characterization information corresponding to diastole; one periodic change of the heartbeat characterization information represents one heartbeat. For example, in this embodiment, the characterization information corresponding to systole may be recorded as "+1" and that corresponding to diastole as "-1", and the change of the heartbeat characterization information from +1 to -1 and back to +1 may be regarded as one heartbeat. A correspondence between heartbeats and the video frame sequence can thus be determined, such as N heartbeats per M video frames. Combining this with the frame rate of the sequence gives the user's heart rate over that period; for example, at a frame rate of 10 frames per second, the heart rate is 10N/M beats per second.
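A hedged sketch of this computation (the per-frame tokens are assumed to come from the pre-trained recognizer of step S601): each -1 to +1 transition closes one +1 -> -1 -> +1 cycle, and the count is scaled by the frame rate, matching the N-beats-per-M-frames formula above.

    def heart_rate(tokens, fps):
        # tokens: one +1 (systole) or -1 (diastole) per video frame.
        # N completed beats over M frames at fps frames/second gives
        # fps * N / M beats per second.
        beats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == -1 and b == +1)
        return fps * beats / len(tokens) if tokens else 0.0

    # 10 frames per second, one full cycle every 8 frames:
    print(heart_rate([+1, +1, -1, -1, -1, +1, +1, +1] * 4, fps=10))  # 1.25 beats/s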
Because the video frames include the user's face, the different appearance of the user's forehead during diastole and systole can be recognized: during systole the pumping of arterial blood should make the blood flowing through the forehead appear redder, while during diastole the backflow of venous blood should make it appear darker, and optical recognition of these different forehead colors can determine the different heartbeat characterization information. It will be understood by those skilled in the art that this specific form of the heartbeat characterization information is merely exemplary, and other forms based on similar principles, now known or later developed, are included within the scope of the present application if applicable, and are incorporated herein by reference.
When the user's heart rate is determined from the change in the heartbeat characterization information, the maximum and minimum values in the corresponding change sequence can be used to judge the period and thus determine the heart rate. The specific processing can be: first determine the change sequence of the heartbeat characterization information of the user face image according to the time order of the video frames in the sequence, then determine the maximum and minimum values of the heartbeat characterization information in that change sequence, and finally take the duration of one periodic change based on those extrema as the duration of one heartbeat, from which the user's heart rate is determined.
Step S603, the reliability of the currently identified push-up actions is determined from the heart rate and the frequency and/or number of identified push-up actions. In an actual scene, the user's heart rate bears a certain relationship to the intensity of exercise: the more and the faster the push-ups, the more the heart rate tends to rise. For example, the heart rate is usually low when the user has just started doing push-ups and reaches a higher value after a stretch of them.
On this principle, the reliability of the currently recognized push-up actions is determined from the heart rate and the frequency and/or number of recognized actions. If the user simulates, without expending effort, the video picture of someone doing push-ups by other movements, the heart rate and the frequency and/or number of recognized push-up actions will not match the expected relationship, and a low reliability results. Conversely, for a user genuinely doing push-ups, the acquired heart rate and the recognized frequency and/or number of actions satisfy a certain relationship, and a high reliability results. It can thus be judged whether the user's push-ups are valid, cheating is avoided, and the fun and supervisory power of push-up-based social features improve. For example, in an actual scene a reliability threshold may be set: when the determined reliability of a push-up action falls below the threshold, the action is judged invalid and a message is sent to remind the user of that judgment. The push-up statistics may also be cancelled after the reminder, for example by clearing the counted number outright and restarting the count once the reliability rises above the threshold again, which better suits social scenarios where several users compete at push-ups, improving the social experience.
In some embodiments of the present application, a neural network algorithm may be used to compute the reliability: the currently determined heart rate and the frequency and/or number of identified push-up actions are input into a pre-trained neural network, which outputs the reliability. In an actual scene, the reliability may be a value in the range (0-1), and a reliability threshold may be set; the currently determined reliability is compared with the threshold, and the currently identified push-up action is judged credible and valid if the reliability exceeds the threshold, or not credible and invalid if it falls below.
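The patent computes this reliability with a pre-trained neural network; purely for illustration, the sketch below substitutes a hand-written plausibility heuristic in its place, with every constant an assumption: the heart rate should climb as push-ups accumulate, so a heart rate far from what the accumulated count predicts scores low.

    def reliability(heart_rate_bps, pushups_done):
        # Illustrative expectation: resting rate ~1.2 beats/s, rising by
        # ~0.05 beats/s per completed push-up, capped at 3.0 beats/s.
        expected = min(1.2 + 0.05 * pushups_done, 3.0)
        deviation = abs(heart_rate_bps - expected) / expected
        return max(0.0, 1.0 - deviation)  # a value in (0-1), as in the text

    def pushup_is_valid(score, threshold=0.6):  # threshold is illustrative
        return score > threshold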
In other embodiments of the present application, whether the current detection process is valid can also be judged by detecting whether the acquisition device has moved, as measured by its displacement amplitude during detection; for example, a displacement beyond a certain preset distance indicates that the current detection is invalid. The push-up detection method provided by the embodiments of the application can therefore acquire the displacement amplitude, during detection, of the acquisition device used to capture the push-up video including the user face image; if the displacement amplitude exceeds a preset value, the acquisition device has moved appreciably, the current detection is judged invalid, and push-up detection can be paused or ended. With the pause approach, push-up detection can continue after the displacement ends; with the end approach, a new push-up detection can be restarted after the displacement ends.
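A minimal sketch of the displacement check, assuming the device exposes some way to read its position (read_position() below is hypothetical) and taking an illustrative 5 cm preset value:

    MAX_DISPLACEMENT_M = 0.05  # preset value, illustrative

    def displacement_exceeded(reference_pos, current_pos):
        # Euclidean displacement of the acquisition device since detection began.
        dist = sum((c - r) ** 2 for c, r in zip(current_pos, reference_pos)) ** 0.5
        return dist > MAX_DISPLACEMENT_M

    # if displacement_exceeded(start_pos, read_position()): pause or end detection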
Fig. 7 shows a processing flow when the push-up detection is implemented by using the scheme provided by the embodiment of the present application, which includes the following processing steps:
and step S701, shooting through a camera to obtain a push-up video and extracting a video frame.
Step S702, detecting the size information of the face image of the user through the video frame.
And step S703, identifying the push-up action according to the change situation of the size information, and determining the number and frequency of the push-up action.
Step S704, detecting heartbeat representation information through a video frame;
step S705, calculating a heartbeat frequency according to a change condition of the heartbeat characterization information.
Step S706, calculating the reliability of the push-up action according to the frequency of the push-up action and the heartbeat frequency.
And step S707, judging whether the push-up action is effective or not based on the reliability, if so, outputting a result, such as push-up quantity, frequency and the like, and if not, returning to the re-detection.
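Tying the sketches above together in the order of steps S701-S707 (detect_face_points() and detect_heartbeat_token() remain hypothetical stand-ins for the pre-trained recognizers the patent assumes):

    def detect(video_path, fps=6):
        frames = sample_frames(video_path)                            # S701
        sizes = [size_info(detect_face_points(f)) for f in frames]    # S702
        pushups = count_pushups(sizes)                                # S703
        tokens = [detect_heartbeat_token(f) for f in frames]          # S704
        rate = heart_rate(tokens, fps)                                # S705
        score = reliability(rate, pushups)                            # S706
        return pushups if pushup_is_valid(score) else None            # S707: None = re-detect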
In this way, the user can capture the required push-up video with the camera of any user device with a shooting function, such as a mobile phone or tablet computer, and every push-up action can be accurately detected through subsequent processing, meeting the user's social needs around push-ups.
Based on the same inventive concept, an embodiment of the application also provides a push-up detection device; the method corresponding to this device is the push-up detection method of the preceding embodiments, and its problem-solving principle is similar. The device comprises a memory for storing computer program instructions and a processor for executing them; when executed by the processor, the computer program instructions trigger the device to perform the aforementioned push-up detection method.
Fig. 8 shows a structure of a device suitable for implementing the methods and/or technical solutions of the embodiments of the present application. The device 800 includes a Central Processing Unit (CPU) 801, which can execute various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage portion 808 into a Random Access Memory (RAM) 803. The RAM 803 also stores the various programs and data necessary for system operation. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An Input/Output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, a touch screen, a microphone, an infrared sensor, and the like; an output section 807 including a Display device such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), an LED Display, an OLED Display, and the like, and a speaker; a storage portion 808 comprising one or more computer-readable media such as a hard disk, optical disk, magnetic disk, semiconductor memory, or the like; and a communication section 809 including a Network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet.
In particular, the methods and/or embodiments in the embodiments of the present application may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. The computer program performs the above-described functions defined in the method of the present application when executed by the Central Processing Unit (CPU) 801.
It should be noted that the computer readable medium described herein can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In this application, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages such as the "C" programming language or similar languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the remote-computer case, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart or block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments; or may be separate and not incorporated into the device. The computer-readable medium carries one or more computer program instructions that are executable by a processor to implement the methods and/or aspects of the embodiments of the present application as described above.
It should be noted that the present application may be implemented in software and/or a combination of software and hardware, for example, implemented using Application Specific Integrated Circuits (ASICs), general purpose computers or any other similar hardware devices. In some embodiments, the software programs of the present application may be executed by a processor to implement the above steps or functions. Likewise, the software programs (including associated data structures) of the present application may be stored in a computer readable recording medium, such as RAM memory, magnetic or optical drive or diskette and the like. Additionally, some of the steps or functions of the present application may be implemented in hardware, for example, as circuitry that cooperates with the processor to perform various steps or functions.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.

Claims (15)

1. A push-up detection method, wherein the method comprises:
acquiring a push-up video comprising a face image of a user;
acquiring a video frame sequence from the push-up video, wherein the video frame sequence at least comprises a plurality of video frames arranged according to a time sequence;
identifying the position information of the face image of the user in the corresponding video frame;
determining the size information of the user face image in each video frame according to the position information of the user face image in each video frame;
and identifying the push-up action of the user according to the change condition of the size information of the face image of the user in the video frame sequence.
2. The method of claim 1, wherein identifying location information of the user face image in the corresponding video frame comprises:
and inputting each video frame into a pre-trained neural network, and outputting the position information of the face image of the user in the corresponding video frame.
3. The method of claim 1, wherein identifying the push-up action of the user according to the variation of the size information of the face image of the user in the video frame sequence comprises:
determining a change sequence of the size information of the face image of the user according to the time sequence of video frame arrangement in the video frame sequence;
determining a maximum value and a minimum value of size information in the change sequence;
and identifying the periodic variation process based on the maximum value and the minimum value as one push-up action of the user.
4. The method of claim 3, wherein the size information includes pixel coordinate values of alignment points in the user face image.
5. The method of claim 1, wherein the user facial image comprises at least any one of:
an image of the user's five sense organs;
an image of a user's eye;
an image of a user's mouth;
an image of a user's nose;
an image of the skin of the user's face.
6. The method of claim 1, wherein acquiring a push-up video including an image of a user's face comprises:
setting an included angle between a shooting direction of the push-up video and an action direction of the push-up of the user to be smaller than a preset value, and acquiring the push-up video comprising the face image of the user based on the shooting direction.
7. The method of claim 1, wherein prior to acquiring the push-up video including the image of the user's face, further comprising:
and acquiring detection trigger information.
8. The method of claim 1, wherein the method further comprises:
when a preset condition is met, playing the related video content to a user, wherein the preset condition comprises at least any one of the following items:
currently acquiring a push-up video including a face image of a user;
the currently identified push-up action of the user does not accord with the push-up standard action;
no new push-up action is identified within a preset time.
9. The method of any of claims 1 to 8, wherein the method further comprises:
identifying heartbeat characterization information of a user in a corresponding video frame, wherein the heartbeat characterization information comprises characterization information corresponding to systole and characterization information corresponding to diastole;
determining the heartbeat frequency of the user according to the change condition of the heartbeat characterization information in the video frame sequence;
and determining the reliability of the currently identified push-up action according to the heartbeat frequency and the frequency and/or the number of the identified push-up actions.
10. The method of claim 9, wherein determining the frequency of the heartbeat of the user based on the variation of the heartbeat characterization information in the sequence of video frames comprises:
determining a change sequence of heartbeat characterization information of a user face image according to a time sequence of video frame arrangement in a video frame sequence;
determining a maximum value and a minimum value of the heartbeat characterization information in the change sequence;
and determining the periodic variation time based on the maximum value and the minimum value as the one-time heartbeat time of the user, and determining the heartbeat frequency of the user based on the heartbeat time.
11. The method of claim 9, wherein identifying heartbeat characterization information for a user in a corresponding video frame comprises:
and inputting each video frame into a pre-trained neural network, and outputting heartbeat characterization information of the user in the corresponding video frame.
12. The method of claim 9, wherein determining a confidence level of the currently identified push-up action based on the heartbeat frequency and the frequency of push-up actions comprises:
and inputting the heartbeat frequency and the frequency of the push-up action into a pre-trained neural network, and outputting the currently identified reliability of the push-up action.
13. The method of any of claims 1-8, wherein the method further comprises:
acquiring the displacement amplitude of a collecting device for acquiring a push-up video including a face image of a user in the detection process;
and if the displacement amplitude exceeds a preset value, pausing or finishing the push-up detection.
14. A push-up detection apparatus comprising a memory for storing computer program instructions and a processor for executing the computer program instructions, wherein the computer program instructions, when executed by the processor, trigger the apparatus to perform the method of any of claims 1 to 13.
15. A computer readable medium having stored thereon computer program instructions executable by a processor to implement the method of any one of claims 1 to 13.
CN202111014524.2A 2021-08-31 2021-08-31 Push-up detection method, push-up detection device and computer readable medium Active CN113723306B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111014524.2A CN113723306B (en) 2021-08-31 2021-08-31 Push-up detection method, push-up detection device and computer readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111014524.2A CN113723306B (en) 2021-08-31 2021-08-31 Push-up detection method, push-up detection device and computer readable medium

Publications (2)

Publication Number Publication Date
CN113723306A true CN113723306A (en) 2021-11-30
CN113723306B CN113723306B (en) 2024-06-28

Family

ID=78679935

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111014524.2A Active CN113723306B (en) 2021-08-31 2021-08-31 Push-up detection method, push-up detection device and computer readable medium

Country Status (1)

Country Link
CN (1) CN113723306B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116721768A (en) * 2023-08-07 2023-09-08 华中科技大学协和深圳医院 Method for acquiring interaction data containing credibility factors

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2206273C1 (en) * 2002-07-29 2003-06-20 Южно-Уральский государственный университет Method for evaluating human physical condition
CN103426025A (en) * 2013-08-23 2013-12-04 华南理工大学 Non-contact push-up count method based on smart phone platform
US20150185967A1 (en) * 2013-12-31 2015-07-02 Skimble, Inc. Device, method, and graphical user interface for providing health coaching and fitness training services
CN109684993A (en) * 2018-12-21 2019-04-26 普联技术有限公司 A kind of face identification method based on nostril information, system and equipment
CN111275023A (en) * 2020-03-19 2020-06-12 中国人民解放军国防科技大学 Push-up test system based on face recognition and human body posture estimation
CN111898407A (en) * 2020-06-06 2020-11-06 东南大学 Human-computer interaction operating system based on human face action recognition
CN112381011A (en) * 2020-11-18 2021-02-19 中国科学院自动化研究所 Non-contact heart rate measurement method, system and device based on face image

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2206273C1 (en) * 2002-07-29 2003-06-20 Южно-Уральский государственный университет Method for evaluating human physical condition
CN103426025A (en) * 2013-08-23 2013-12-04 华南理工大学 Non-contact push-up count method based on smart phone platform
US20150185967A1 (en) * 2013-12-31 2015-07-02 Skimble, Inc. Device, method, and graphical user interface for providing health coaching and fitness training services
CN109684993A (en) * 2018-12-21 2019-04-26 普联技术有限公司 A kind of face identification method based on nostril information, system and equipment
CN111275023A (en) * 2020-03-19 2020-06-12 中国人民解放军国防科技大学 Push-up test system based on face recognition and human body posture estimation
CN111898407A (en) * 2020-06-06 2020-11-06 东南大学 Human-computer interaction operating system based on human face action recognition
CN112381011A (en) * 2020-11-18 2021-02-19 中国科学院自动化研究所 Non-contact heart rate measurement method, system and device based on face image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
赵瑛 (Zhao Ying) et al.: "Human behavior recognition under simulated prosthetic vision" (仿真假体视觉下的人体行为识别), Application Research of Computers (计算机应用研究), vol. 37, no. 1, 31 December 2020 (2020-12-31), pages 331-333 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116721768A (en) * 2023-08-07 2023-09-08 华中科技大学协和深圳医院 Method for acquiring interaction data containing credibility factors
CN116721768B (en) * 2023-08-07 2024-01-16 华中科技大学协和深圳医院 Method for acquiring interaction data containing credibility factors

Also Published As

Publication number Publication date
CN113723306B (en) 2024-06-28

Similar Documents

Publication Publication Date Title
US11074436B1 (en) Method and apparatus for face recognition
CN107066983B (en) Identity verification method and device
WO2020244032A1 (en) Face image detection method and apparatus
WO2021098616A1 (en) Motion posture recognition method, motion posture recognition apparatus, terminal device and medium
CN112149615B (en) Face living body detection method, device, medium and electronic equipment
TWI707243B (en) Method, apparatus, and system for detecting living body based on eyeball tracking
CN113221771B (en) Living body face recognition method, device, apparatus, storage medium and program product
CN109872407B (en) Face recognition method, device and equipment, and card punching method, device and system
EP4145844A1 (en) Method and apparatus for detecting jitter in video, electronic device, and storage medium
CN110059624B (en) Method and apparatus for detecting living body
WO2016165614A1 (en) Method for expression recognition in instant video and electronic equipment
WO2020007191A1 (en) Method and apparatus for living body recognition and detection, and medium and electronic device
CN114972958B (en) Key point detection method, neural network training method, device and equipment
CN109271929B (en) Detection method and device
CN114513694B (en) Score determination method, device, electronic equipment and storage medium
CN112580472A (en) Rapid and lightweight face recognition method and device, machine readable medium and equipment
CN113743237A (en) Follow-up action accuracy determination method and device, electronic device and storage medium
CN113723306B (en) Push-up detection method, push-up detection device and computer readable medium
CN111784660B (en) Method and system for analyzing frontal face degree of face image
CN117037277A (en) Assessment method, device and system for AED emergency training students and storage medium
CN113723307B (en) Social sharing method, equipment and computer readable medium based on push-up detection
CN116469156A (en) Method, apparatus, computer device and computer readable storage medium for identifying body state
CN114461078A (en) Man-machine interaction method based on artificial intelligence
CN114067394A (en) Face living body detection method and device, electronic equipment and storage medium
JP2023512359A (en) Associated object detection method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant