CN114694169A - Human skeleton detection method and device - Google Patents


Info

Publication number
CN114694169A
Authority
CN
China
Prior art keywords
frame
ratio
skeleton
intra
film frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110280064.1A
Other languages
Chinese (zh)
Inventor
孙伟程
叶尚楷
高志忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial Technology Research Institute ITRI
Original Assignee
Industrial Technology Research Institute ITRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US17/137,212 external-priority patent/US11625938B2/en
Application filed by Industrial Technology Research Institute ITRI filed Critical Industrial Technology Research Institute ITRI
Publication of CN114694169A publication Critical patent/CN114694169A/en
Pending legal-status Critical Current

Abstract

The present disclosure provides a method and a device for detecting a human skeleton. The method comprises the following steps: receiving a film frame, wherein the film frame comprises a human body; determining whether the film frame comprises prediction information; when the film frame comprises the prediction information, determining whether a first intra-frame compressed block (Intra Coded MB, IMB) ratio of a target area of the human body in the film frame is greater than a first threshold; and estimating skeleton information of the human body by using a Motion Vector (MV) when the first intra-frame compressed block ratio of the target area is not greater than the first threshold.

Description

Method and device for detecting human skeleton
Technical Field
The present disclosure relates to a method and an apparatus for detecting a human skeleton, and more particularly, to a method and an apparatus for detecting a human skeleton using motion vector assistance.
Background
With the increasing popularity of commercial edge computing devices and the introduction of 5G networks, intelligent image analysis using deep learning techniques has entered daily life. Human behavior recognition is a basic technology for applications such as intelligent entertainment, intelligent monitoring, and human-machine interaction. Behavior recognition is a challenging task because it is affected by many factors, such as varying lighting conditions, viewpoint diversity, complex backgrounds, and large intra-class variation.
The study of behavior recognition dates back to 1973, when Johansson found through experimental observation that the motion of the body can be described by the movement of a few major joint points. Combining and tracking only 10 to 12 key nodes is enough to describe many behaviors, such as dancing, walking, and running. Behaviors can therefore be identified through the movement of key nodes of the human body.
Compared with recognition based on RGB images, skeleton-based action recognition has the advantage that skeleton information has clear, simple features and is not easily affected by appearance factors. Skeleton-based action recognition first requires skeleton detection (pose estimation). Known open source software for extracting a skeleton from a picture or an image sequence includes OpenPose and AlphaPose. However, such software requires a large amount of computation and has low computational efficiency.
Therefore, a method and an apparatus for detecting a human skeleton are needed to address the above problems.
Disclosure of Invention
The following disclosure is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features, other aspects, embodiments, and features will be apparent from consideration of the drawings and from the detailed description below. That is, the following disclosure is provided to introduce concepts, points, benefits and novel and non-obvious technical advantages described herein. Selected, but not all, embodiments are described in further detail below. Thus, the following disclosure is not intended to identify essential features of the claimed subject matter, nor is it intended to be used in determining the scope of the claimed subject matter.
Therefore, it is a primary objective of the present disclosure to provide a method and apparatus for detecting human skeleton to overcome the above-mentioned disadvantages.
The disclosure provides a method for detecting a human skeleton, comprising: receiving a film frame, wherein the film frame comprises a human body; determining whether the film frame comprises prediction information; when the film frame comprises the prediction information, determining whether a first intra-frame compressed block (Intra Coded MB, IMB) ratio of a target area of the human body in the film frame is greater than a first threshold; and estimating skeleton information of the human body by using a Motion Vector (MV) when the first intra-frame compressed block ratio of the target area is not greater than the first threshold.
In one embodiment, when the film frame is a predicted frame, it is determined that the film frame includes the prediction information.
In an embodiment, the method further comprises: when the first intra-frame compressed block ratio of the target area is greater than or equal to the first threshold, obtaining the skeleton information by using a skeleton detection algorithm.
In an embodiment, when the film frame includes the prediction information, the method further includes: determining whether a second intra-frame compressed block ratio of the film frame is greater than a second threshold; and when the second intra-frame compressed block ratio of the film frame is not greater than the second threshold, determining whether the first intra-frame compressed block ratio of the target area is greater than the first threshold.
In an embodiment, the second intra-frame compressed block ratio IMBPFrame is expressed by the following formula:
IMBPFrame = 100% - PMBPFrame
wherein PMBPFrame is a second predicted frame compressed block (PMB) ratio, which is expressed by the following formula:
PMBPFrame = PMB_area / pixelNumByFrame * 100%
wherein PMB_area is the number of pixels of the second predicted frame compressed blocks, and pixelNumByFrame is the number of pixels included in the film frame.
In an embodiment, the first intra-frame compressed block ratio IMBPBBox is expressed by the following formula:
IMBPBBox = 100% - PMBPBBox
wherein PMBPBBox is a first predicted frame compressed block (PMB) ratio, which is expressed by the following formula:
PMBPBBox = PMB_area / pixelNumByBBox * 100%
wherein PMB_area is the number of pixels of the first predicted frame compressed blocks in the target area, and pixelNumByBBox is the number of pixels included in the target area.
In an embodiment, the method further comprises: when the film frame does not include the prediction information, obtaining the skeleton information by using a skeleton detection algorithm.
In one embodiment, the skeleton detection algorithm is OpenPose or AlphaPose.
In one embodiment, the motion vector is generated by a motion estimation process.
The present disclosure provides a human skeleton detecting device, which includes: one or more processors; and one or more computer storage media storing computer-readable instructions, wherein the processor uses the computer storage media to perform: receiving a film frame, wherein the film frame comprises a human body; determining whether the film frame comprises prediction information; when the film frame comprises the prediction information, determining whether a first intra-frame compressed block (Intra Coded MB, IMB) ratio of a target area of the human body in the film frame is greater than a first threshold; and estimating skeleton information of the human body by using a Motion Vector (MV) when the first intra-frame compressed block ratio of the target area is not greater than the first threshold.
Drawings
FIG. 1 is an environmental diagram illustrating a system using human skeleton detection according to an embodiment of the invention;
FIG. 2 is a flowchart illustrating a method of human skeleton detection according to an embodiment of the present disclosure;
FIG. 3 is a flowchart illustrating a method of human skeleton detection according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating a target area of a human body in a film frame according to an embodiment of the present disclosure;
FIG. 5 is a diagram illustrating a film frame including second predicted frame compressed blocks and second intra compressed blocks according to an embodiment of the present disclosure;
FIG. 6 is a table of experimental data showing the second threshold and the film processing efficiency according to an embodiment of the present disclosure;
FIG. 7 is a table of experimental data showing the first threshold and the film processing efficiency according to an embodiment of the present disclosure;
FIG. 8 is a diagram illustrating an exemplary operating environment for implementing embodiments of the present invention.
[Description of reference numerals]
100 system
110 electronic device
120 camera
130 user
200 method
S205, S210, S215, S220, S225 steps
300 method
S305 step
400 film frame
410 human body
420 target area
510 film frame
520 film frame
600 table
700 table
800 computing device
810 bus
812 memory
814 processor
816 display element
818 I/O port
820 I/O element
822 power supply
Detailed Description
Aspects of the present disclosure are described more fully hereinafter with reference to the accompanying drawings. This disclosure may, however, be embodied in many different forms and should not be construed as limited to any specific structure or function presented throughout this disclosure. Rather, these aspects are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art. Based on the teachings herein one skilled in the art should appreciate that the scope of the present disclosure is intended to cover any aspect disclosed herein, whether alone or in combination with any other aspect of the present disclosure to achieve any aspect disclosed herein. For example, it may be implemented using any number of the apparatus or performing methods set forth herein. In addition, the scope of the present disclosure is intended to cover apparatuses or methods implemented using other structure, functionality, or structure and functionality in addition to or other than the various aspects of the present disclosure set forth herein. It should be understood that any aspect disclosed herein may be embodied by one or more elements of a claim.
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any aspect of the present disclosure or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects of the present disclosure or design. Moreover, like numerals refer to like elements throughout the several views, and the articles "a" and "an" include plural references unless otherwise specified in the description.
It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. In contrast, when an element is referred to as being "directly connected" or "directly coupled" to another element, there are no intervening elements present. Other words used to describe the relationship between elements should be interpreted in a similar manner (e.g., "between …" versus "directly between …," "adjacent" versus "directly adjacent," etc.).
FIG. 1 is a diagram illustrating an environment of a system 100 for detecting a human skeleton according to an embodiment of the invention. The system 100 comprises an electronic device 110 and a camera 120, wherein the electronic device 110 is configured to capture images of a user 130 through one or more physically installed cameras 120.
The electronic device 110 may receive movie frames from various sources. For example, the electronic device 110 may receive a movie frame transmitted by the camera 120, or download the movie frame from the cloud.
The types of electronic devices 110 range from small handheld devices (e.g., mobile phones/portable computers) to large mainframe systems (e.g., mainframe computers). Examples of portable computers include Personal Digital Assistants (PDAs), notebook computers, and like devices. The electronic device 110 may be connected to the camera 120 using a network. The Network may include, but is not limited to, one or more Local Area Networks (LANs) and/or Wide Area Networks (WANs).
It should be understood that the electronic device 110 shown in FIG. 1 is one example of a skeleton detection system 100. Each of the elements shown in FIG. 1 may be implemented via any type of computing device, such as the computing device 800 described with reference to FIG. 8.
Fig. 2 is a flowchart illustrating a method 200 of human skeleton detection according to an embodiment of the disclosure. The method can be implemented in the electronic device 110 of the human skeleton detection system shown in FIG. 1.
In step S205, the electronic device receives a film frame, wherein the film frame includes a human body. Next, in step S210, the electronic device determines whether the film frame includes prediction information. In one embodiment, when the film frame is a Predictive Picture (P-Picture), the electronic device determines that the film frame includes the prediction information.
When the film frame includes the prediction information (yes in step S210), in step S215, the electronic device determines whether a first intra-frame compressed block (Intra Coded MB, IMB) ratio of a target area including the human body in the film frame is greater than a first threshold.
When the first intra-frame compressed block ratio of the target area including the human body in the film frame is not greater than the first threshold (no in step S215), in step S220, the electronic device estimates skeleton information of the human body using a motion vector generated in a motion estimation process.
Returning to step S210, when the film frame does not include the prediction information (no in step S210), in step S225, the electronic device obtains the skeleton information by using a skeleton detection algorithm, such as OpenPose or AlphaPose. In one embodiment, when the film frame is an Intra-frame (I-Picture), the electronic device determines that the film frame does not include prediction information. In other words, a film frame that is an intra frame has no Motion Vector (MV) information.
When the first intra-frame compressed block ratio of the target area including the human body in the film frame is greater than the first threshold (yes in step S215), in step S225, the electronic device obtains the skeleton information by using a skeleton detection algorithm, such as OpenPose or AlphaPose.
Fig. 3 is a flowchart illustrating a method 300 for detecting human skeleton according to an embodiment of the disclosure. The method can be implemented in the electronic device 110 of the human skeleton detection system shown in FIG. 1.
Unlike FIG. 2, after the electronic device determines that the film frame includes the prediction information (yes in step S210), the electronic device may further determine, in step S305, whether a second intra-frame compressed block ratio of the film frame is greater than a second threshold.
When the second intra-frame compressed block ratio of the film frame is not greater than the second threshold (no in step S305), in step S215, the electronic device determines whether the first intra-frame compressed block ratio of the target area including the human body in the film frame is greater than the first threshold.
When the second intra-frame compressed block ratio of the film frame is greater than the second threshold (yes in step S305), in step S225, the electronic device obtains the skeleton information by using a skeleton detection algorithm, such as OpenPose or AlphaPose.
It should be noted that the steps with the same reference labels as those in FIG. 2 are as described above and are not described again here.
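The decision flow of FIG. 2 and FIG. 3 can be sketched in Python as follows. This is only an illustration of the branching described above, not the implementation of the disclosure: the function name `use_motion_vectors`, its parameters, and the choice to express ratios and thresholds as percentages are assumptions. Leaving the frame-level arguments unset skips step S305, which gives the FIG. 2 flow.

```python
from typing import Optional


def use_motion_vectors(has_prediction_info: bool,
                       bbox_imb_ratio: float,
                       first_threshold: float,
                       frame_imb_ratio: Optional[float] = None,
                       second_threshold: Optional[float] = None) -> bool:
    """Return True for step S220 (motion vectors), False for step S225 (detector).

    All ratios and thresholds are percentages (hypothetical convention).
    """
    # S210: an intra frame carries no prediction information.
    if not has_prediction_info:
        return False
    # S305 (FIG. 3 only): skipped when no frame-level values are supplied.
    if frame_imb_ratio is not None and second_threshold is not None:
        if frame_imb_ratio > second_threshold:
            return False
    # S215: "not greater than" the first threshold selects the motion vector path.
    return bbox_imb_ratio <= first_threshold
```

For example, `use_motion_vectors(True, 7.14, 20.0, 10.0, 15.0)` selects the motion vector path, while raising the frame-level ratio to 50.0 falls back to the skeleton detection algorithm. The threshold values here are illustrative only.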
The following describes how the electronic device determines, in step S215 in FIG. 2, whether the first intra-frame compressed block ratio of the target area including the human body in the film frame is greater than the first threshold, and how it determines, in step S305 in FIG. 3, whether the second intra-frame compressed block ratio of the film frame is greater than the second threshold.
FIG. 4 is a schematic diagram illustrating a target area of a human body in a film frame according to an embodiment of the disclosure. The electronic device may obtain a target area 420 including a human body 410 in the film frame 400, and calculate the number of pixels pixelNumByBBox included in the target area 420 and the number of pixels PMB_area of the first predicted frame compressed blocks (PMBs) in the target area 420. The first predicted frame compressed block ratio PMBPBBox may be expressed by the following formula:
PMBPBBox = PMB_area / pixelNumByBBox * 100%
and the first intra-frame compressed block (Intra Coded MB, IMB) ratio IMBPBBox may be expressed by the following formula:
IMBPBBox = 100% - PMBPBBox
As shown in FIG. 4, the gray area consists of the first predicted frame compressed blocks (PMBs), and the remaining blocks outside the gray area are the first intra-frame compressed blocks (IMBs). Assume that the target area 420 is composed of 84 (7 × 12) squares, each 16 pixels long and 16 pixels wide. The number of pixels PMB_area included in the gray area is 78 × 16 × 16, and the number of pixels pixelNumByBBox included in the target area 420 is 84 × 16 × 16. Therefore, PMBPBBox = (78 × 16 × 16)/(84 × 16 × 16) × 100% ≈ 92.86%, and the first intra-frame compressed block ratio IMBPBBox is about 7.14%.
FIG. 5 is a diagram illustrating a film frame including second predicted frame compressed blocks and second intra-frame compressed blocks according to an embodiment of the present disclosure. Assume that a film frame has four macroblocks, each having a size of 16 × 16 pixels.
The electronic device can calculate the number of pixels PMB_area of the second predicted frame compressed blocks in the film frame and the number of pixels pixelNumByFrame included in the film frame. The second predicted frame compressed block ratio PMBPFrame may be expressed by the following formula:
PMBPFrame = PMB_area / pixelNumByFrame * 100%
and the second intra-frame compressed block ratio IMBPFrame may be expressed by the following formula:
IMBPFrame = 100% - PMBPFrame
As shown in FIG. 5, in the film frame 510, the number of pixels PMB_area of the second predicted frame compressed blocks is 512 (16 × 16 × 2), and the number of pixels pixelNumByFrame included in the film frame 510 is 1024 (16 × 16 × 4). Therefore, PMBPFrame = 512/1024 × 100% = 50%, and the second intra-frame compressed block ratio IMBPFrame = 100% - 50% = 50%.
As another example, in the film frame 520, the number of pixels PMB_area of the second predicted frame compressed block is 256 (16 × 16 × 1), and the number of pixels pixelNumByFrame included in the film frame 520 is 1024 (16 × 16 × 4). Therefore, PMBPFrame = 256/1024 × 100% = 25%, and the second intra-frame compressed block ratio IMBPFrame = 100% - 25% = 75%.
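The two ratios above are the same computation applied to different regions (the target area or the whole film frame), which the following Python sketch makes explicit. It assumes every macroblock is 16 × 16 pixels and is flagged as either predicted (PMB) or intra-coded (IMB); the function names `pmb_ratio` and `imb_ratio` and the Boolean-flag representation are illustrative and do not come from the disclosure.

```python
from typing import Sequence

MB_SIZE = 16  # macroblock edge length in pixels, as in the examples above


def pmb_ratio(is_predicted: Sequence[bool], mb_size: int = MB_SIZE) -> float:
    """Percentage of a region's pixels that lie in predicted macroblocks."""
    if not is_predicted:
        raise ValueError("the region must contain at least one macroblock")
    pixels_per_mb = mb_size * mb_size
    pmb_area = sum(is_predicted) * pixels_per_mb   # PMB_area
    pixel_num = len(is_predicted) * pixels_per_mb  # pixelNumByBBox or pixelNumByFrame
    return pmb_area / pixel_num * 100.0


def imb_ratio(is_predicted: Sequence[bool], mb_size: int = MB_SIZE) -> float:
    """Intra-coded ratio, defined as 100% minus the predicted ratio."""
    return 100.0 - pmb_ratio(is_predicted, mb_size)


# Target area 420: 84 macroblocks, 78 of them predicted.
bbox_flags = [True] * 78 + [False] * 6
print(round(pmb_ratio(bbox_flags), 2))  # 92.86
print(round(imb_ratio(bbox_flags), 2))  # 7.14

# Film frames 510 and 520: four macroblocks each.
print(imb_ratio([True, True, False, False]))   # 50.0
print(imb_ratio([True, False, False, False]))  # 75.0
```

The printed values match the worked examples for the target area 420 and the film frames 510 and 520.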
FIG. 6 is a table 600 showing experimental data for the second threshold and the film processing efficiency according to an embodiment of the present disclosure. The table 600 shows the error distance and the processing speed of the electronic device when processing the same film using different second thresholds. As shown in the table 600, a second threshold β of -1 indicates that the electronic device uses only the skeleton detection algorithm (OpenPose) to obtain the skeleton information in a film. Table 600 shows that when the second threshold β is 15, the electronic device can increase the processing speed by about 8 times (118.28/14.38 ≈ 8.2) within an acceptable detection error. In other words, by using motion vectors to assist in detecting the skeleton information when the second threshold β is 15, the electronic device can effectively reduce the computation needed to obtain the skeleton information.
FIG. 7 is a table 700 showing experimental data for the first threshold and the film processing efficiency according to an embodiment of the present disclosure. The table 700 shows the error distance and the processing speed of the electronic device when processing the same film using different first thresholds. As shown in the table 700, a first threshold α of -1 indicates that the electronic device uses only the skeleton detection algorithm (OpenPose) to obtain the skeleton information in a film. Table 700 shows that when the first threshold α is 20, the electronic device can increase the processing speed by about 6 times (137.36/23.57 ≈ 5.8) within an acceptable detection error. In other words, by using motion vectors to assist in detecting the skeleton information when the first threshold α is 20, the electronic device can effectively reduce the computation needed to obtain the skeleton information.
It should be noted that the optimal values of the second threshold β and the first threshold α vary from film to film. The disclosure is not limited to particular values of the second threshold β and the first threshold α, and a person skilled in the art can replace or adjust them as appropriate according to the present embodiment.
With reference to the tables 600 and 700, a second intra-frame compressed block ratio of the film frame that is greater than the second threshold β indicates that the film frame contains relatively few motion vectors, which means that the picture has changed considerably between film frames, for example because of changes in light and shadow in the film or zooming of the picture during recording. In this case, the electronic device uses the skeleton detection algorithm to correct the coordinate positions of the skeleton nodes of the human body in a timely manner. When the second intra-frame compressed block ratio of the film frame is not greater than the second threshold β, the electronic device determines whether to use motion vectors to estimate the skeleton information of the human body according to the first intra-frame compressed block ratio of the target area including the human body in the film frame.
A first intra-frame compressed block ratio that is greater than the first threshold α indicates that the target area of the human body contains relatively few motion vectors (i.e., the body movement of the human body has changed considerably). The electronic device therefore obtains the skeleton information by using a skeleton detection algorithm, which avoids the problem that the skeleton nodes cannot be updated correctly because there are too few motion vectors near them.
When the second intra-frame compressed block ratio of the film frame is not greater than the second threshold β and the first intra-frame compressed block ratio of the target area including the human body in the film frame is not greater than the first threshold α, the electronic device can estimate the positions of the skeleton nodes from the group of motion vectors around the skeleton nodes of the human body. This reduces the number of times the skeleton detection algorithm is used, improves the operation efficiency, and reduces the operation cost.
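The disclosure does not spell out how the group of motion vectors around a skeleton node is turned into a new position. One plausible reading, shown here purely as a hypothetical Python sketch, is to shift each node by the average motion vector of the macroblocks near it. The names `update_joints` and `radius`, the square neighborhood, the averaging rule, and the sign convention of the vectors are all assumptions made for illustration.

```python
from statistics import mean
from typing import Dict, List, Tuple

Point = Tuple[float, float]
# One entry per predicted macroblock: (center_x, center_y, dx, dy).
# Assumption: (dx, dy) is the displacement from the previous frame to the
# current one; codecs often store the opposite direction, so a real decoder
# output may need its sign flipped.
MotionVector = Tuple[float, float, float, float]


def update_joints(joints: Dict[str, Point],
                  motion_vectors: List[MotionVector],
                  radius: float = 24.0) -> Dict[str, Point]:
    """Shift each skeleton node by the mean motion vector found near it."""
    updated: Dict[str, Point] = {}
    for name, (x, y) in joints.items():
        # Collect vectors whose macroblock center lies in a square window
        # of half-width `radius` around the node (hypothetical choice).
        nearby = [(dx, dy) for cx, cy, dx, dy in motion_vectors
                  if abs(cx - x) <= radius and abs(cy - y) <= radius]
        if nearby:
            x += mean(dx for dx, _ in nearby)
            y += mean(dy for _, dy in nearby)
        # With no nearby vectors the node keeps its previous position.
        updated[name] = (x, y)
    return updated


joints = {"wrist": (100.0, 100.0), "ankle": (400.0, 400.0)}
vectors = [(96.0, 96.0, 2.0, -4.0), (112.0, 104.0, 4.0, 0.0)]
print(update_joints(joints, vectors))
# {'wrist': (103.0, 98.0), 'ankle': (400.0, 400.0)}
```

A node with no motion vectors nearby is left unchanged in this sketch, which is the situation the first threshold α is meant to catch before the motion vector path is taken.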
As mentioned above, the method and apparatus for detecting a human skeleton of the present disclosure use motion vectors to assist in detecting the human skeleton. The method and the apparatus reduce the number of times the skeleton detection algorithm is used, improve the operation efficiency, increase the number of video streams that can be processed, and reduce the operation cost.
With respect to the described embodiments of the present invention, an exemplary operating environment in which embodiments of the present invention may be implemented is described below. With specific reference to FIG. 8, FIG. 8 illustrates an exemplary operating environment for implementing embodiments of the present invention that may be generally considered a computing device 800. Computing device 800 is but one example of a suitable computing environment and is not intended to suggest any limitation as to the scope of use or functionality of the invention. Neither should the computing device 800 be interpreted as having any dependency or requirement relating to any one or combination of components illustrated.
The present invention may be implemented in computer program code or machine-useable instructions, such as computer-executable instructions of program modules, executed by a computer or other machine, such as a personal digital assistant or other portable device. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The present invention may be implemented in a variety of system configurations, including portable devices, consumer electronics, general purpose computers, more specialized computing devices, and the like. The present invention may also be implemented in a distributed computing environment in which tasks are performed by processing devices linked by a communications network.
Refer to FIG. 8. Computing device 800 includes a bus 810 that directly or indirectly couples various devices including a memory 812, one or more processors 814, one or more display elements 816, input/output (I/O) ports 818, input/output (I/O) elements 820, and a power supply 822. Bus 810 represents what may be one or more busses (e.g., an address bus, data bus, or combination thereof). Although the blocks of FIG. 8 are illustrated with lines for simplicity, in practice the boundaries of the various elements are not so distinct; for example, a presentation element such as a display device may be considered an I/O element, and the processor may have its own memory.
Computing device 800 typically includes a variety of computer-readable media. Computer-readable media can be any available media that can be accessed by computing device 800 and includes both volatile and nonvolatile media, removable and non-removable media. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media includes both volatile and non-volatile media, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 800. Computer storage media does not itself include signals.
Communication media typically embodies computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term "modulated data signal" refers to a signal that has one or more of its characteristics set or changed in such a way as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of the above are included within the scope of computer-readable media.
The memory 812 includes computer storage media in the form of volatile and non-volatile memory. The memory may be removable, non-removable, or a combination of the two. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. The computing device 800 includes one or more processors that read data from entities such as the memory 812 or the I/O elements 820. Display element 816 presents data indications to a user or other device. Exemplary display elements include display devices, speakers, printing elements, vibrating elements, and the like.
The I/O ports 818 allow the computing device 800 to be logically connected to other devices including I/O elements 820, some of which are built-in devices. Exemplary elements include a microphone, joystick, game pad, satellite dish signal receiver, scanner, printer, wireless device, and the like. The I/O elements 820 may provide a natural user interface for processing gestures, sounds, or other physiological inputs generated by the user. In some examples, these inputs may be transmitted to a suitable network element for further processing. The computing device 800 may be equipped with a depth camera, such as a stereo camera system, an infrared camera system, an RGB camera system, or a combination of these systems, to detect and identify objects. In addition, the computing device 800 may be equipped with sensors (e.g., radar, lidar) to periodically sense its surroundings within a sensing range and generate sensor information indicative of its relationship to the surroundings. Further, the computing device 800 may be equipped with an accelerometer or gyroscope to detect motion. The output of the accelerometer or gyroscope may be provided to the computing device 800 for display.
Further, the processor 814 in the computing device 800 can execute the programs and instructions in the memory 812 to perform the actions and steps described in the above embodiments or elsewhere in the specification.
Any particular order or hierarchy of steps for processes disclosed herein is purely exemplary. Based upon design preferences, it should be understood that any specific order or hierarchy of steps in the processes may be rearranged within the scope of the disclosure. The accompanying method claims present elements of the various steps in a sample order, and are therefore not to be limited to the specific order or hierarchy presented.
The use of ordinal terms such as "first," "second," "third," etc., in the claims to modify a component does not by itself connote any priority, precedence, or order of individual components or steps performed by the method; such terms are used merely as labels to distinguish between different components having the same name (but for use of the ordinal term).
Although the present disclosure has been described with reference to exemplary embodiments, it should be understood that various changes and modifications can be made by those skilled in the art without departing from the spirit and scope of the disclosure. The scope of the disclosure should therefore be determined by the appended claims.

Claims (18)

1. A method for detecting a human skeleton, comprising:
receiving a film frame, wherein the film frame comprises a human body;
judging whether the film frame comprises prediction information;
when the film frame comprises the prediction information, judging whether a first intra-frame compressed block (Intra Coded MB, IMB) ratio of a target area of the human body in the film frame is greater than a first threshold; and
estimating skeleton information of the human body using a Motion Vector (MV) when the first intra-frame compressed block ratio of the target area is not greater than the first threshold.
2. The method according to claim 1, wherein when the film frame is a predicted frame, it is determined that the film frame comprises the prediction information.
3. The method of claim 1, further comprising:
when the first intra-frame compressed block ratio of the target area is greater than or equal to the first threshold, obtaining the skeleton information by using a skeleton detection algorithm.
4. The method of claim 1, wherein when the film frame comprises the prediction information, the method further comprises:
judging whether a second intra-frame compressed block ratio of the film frame is greater than a second threshold; and
when the second intra-frame compressed block ratio of the film frame is not greater than the second threshold, judging whether the first intra-frame compressed block ratio of the target area is greater than the first threshold.
5. The method of claim 4, wherein the second intra-frame compressed block ratio $\mathrm{IMBP}_{\mathrm{Frame}}$ is expressed by the following formula:
$\mathrm{IMBP}_{\mathrm{Frame}} = 100\% - \mathrm{PMBP}_{\mathrm{Frame}}$
wherein $\mathrm{PMBP}_{\mathrm{Frame}}$ is a second predicted-frame compressed block (PMB) ratio, and the second predicted-frame compressed block ratio is expressed by the following formula:
$\mathrm{PMBP}_{\mathrm{Frame}} = \dfrac{\mathrm{PMB\_area}}{\mathrm{pixelNumByFrame}} \times 100\%$
wherein PMB_area is the number of pixels of the second predicted-frame compressed blocks, and pixelNumByFrame is the number of pixels included in the film frame.
6. The method of claim 1, wherein the first intra-frame compressed block ratio $\mathrm{IMBP}_{\mathrm{BBox}}$ is expressed by the following formula:
$\mathrm{IMBP}_{\mathrm{BBox}} = 100\% - \mathrm{PMBP}_{\mathrm{BBox}}$
wherein $\mathrm{PMBP}_{\mathrm{BBox}}$ is a first predicted-frame compressed block (PMB) ratio, and the first predicted-frame compressed block ratio is expressed by the following formula:
$\mathrm{PMBP}_{\mathrm{BBox}} = \dfrac{\mathrm{PMB\_area}}{\mathrm{pixelNumByBBox}} \times 100\%$
wherein PMB_area is the number of pixels of the first predicted-frame compressed blocks in the target area, and pixelNumByBBox is the number of pixels included in the target area.
7. The method according to claim 1, further comprising:
when the film frame does not include the prediction information, a skeleton detection algorithm is used to obtain the skeleton information.
8. The method of claim 7, wherein the skeleton detection algorithm is OpenPose or AlphaPose.
9. The method of claim 1, wherein the motion vector is generated by a motion estimation process.
10. A human skeleton detecting device, comprising:
one or more processors; and
one or more computer storage media storing computer-readable instructions, wherein the processor uses the computer storage media to perform:
receiving a film frame, wherein the film frame comprises a human body;
judging whether the film frame comprises prediction information;
when the film frame comprises the prediction information, judging whether a first intra-frame compressed block (Intra Coded MB, IMB) ratio of a target area of the human body in the film frame is greater than a first threshold; and
estimating skeleton information of the human body using a Motion Vector (MV) when the first intra-frame compressed block ratio of the target area is not greater than the first threshold.
11. The apparatus according to claim 10, wherein when the film frame is a predicted frame, the processor determines that the film frame comprises the prediction information.
12. The apparatus according to claim 10, wherein the processor further performs:
when the first intra-frame compressed block ratio of the target area is greater than or equal to the first threshold, obtaining the skeleton information by using a skeleton detection algorithm.
13. The apparatus according to claim 10, wherein when the film frame includes the prediction information, the processor further performs:
judging whether a second intra-frame compressed block ratio of the film frame is greater than a second threshold; and
when the second intra-frame compressed block ratio of the film frame is not greater than the second threshold, judging whether the first intra-frame compressed block ratio of the target area is greater than the first threshold.
14. The apparatus according to claim 13, wherein the second intra-frame compressed block ratio $\mathrm{IMBP}_{\mathrm{Frame}}$ is expressed by the following formula:
$\mathrm{IMBP}_{\mathrm{Frame}} = 100\% - \mathrm{PMBP}_{\mathrm{Frame}}$
wherein $\mathrm{PMBP}_{\mathrm{Frame}}$ is a second predicted-frame compressed block (PMB) ratio, and the second predicted-frame compressed block ratio is expressed by the following formula:
$\mathrm{PMBP}_{\mathrm{Frame}} = \dfrac{\mathrm{PMB\_area}}{\mathrm{pixelNumByFrame}} \times 100\%$
wherein PMB_area is the number of pixels of the second predicted-frame compressed blocks, and pixelNumByFrame is the number of pixels included in the film frame.
15. The apparatus according to claim 10, wherein the first intra-frame compressed block ratio $\mathrm{IMBP}_{\mathrm{BBox}}$ is expressed by the following formula:
$\mathrm{IMBP}_{\mathrm{BBox}} = 100\% - \mathrm{PMBP}_{\mathrm{BBox}}$
wherein $\mathrm{PMBP}_{\mathrm{BBox}}$ is a first predicted-frame compressed block (PMB) ratio, and the first predicted-frame compressed block ratio is expressed by the following formula:
$\mathrm{PMBP}_{\mathrm{BBox}} = \dfrac{\mathrm{PMB\_area}}{\mathrm{pixelNumByBBox}} \times 100\%$
wherein PMB_area is the number of pixels of the first predicted-frame compressed blocks in the target area, and pixelNumByBBox is the number of pixels included in the target area.
16. The apparatus according to claim 10, wherein the processor further performs:
when the film frame does not include the prediction information, a skeleton detection algorithm is used to obtain the skeleton information.
17. The apparatus according to claim 16, wherein the skeleton detection algorithm is OpenPose or AlphaPose.
18. The apparatus according to claim 10, wherein the motion vector is generated by a motion estimation process.
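
The decision flow recited in claims 1 to 7 can be illustrated with a short Python sketch. It is not the applicant's implementation. The names `imb_ratio`, `choose_method`, and `Method`, together with all threshold values in the usage note, are hypothetical and introduced only for illustration. Two points are assumptions rather than claim language: the claims do not state what happens when the whole-frame ratio exceeds the second threshold, so the sketch falls back to the skeleton detection algorithm in that case; and where the ratio equals the first threshold, the sketch follows claim 1 and uses motion vectors.

```python
from enum import Enum


class Method(Enum):
    """Which path is used to obtain skeleton information (hypothetical names)."""
    MOTION_VECTOR = "estimate skeleton from motion vectors"
    DETECTOR = "run a skeleton detection algorithm"


def imb_ratio(pmb_area: int, pixel_count: int) -> float:
    """Intra-coded block ratio in percent.

    Follows the claimed formulas:
        PMBP = PMB_area / pixel_count * 100%
        IMBP = 100% - PMBP
    pixel_count is pixelNumByFrame for the whole frame, or
    pixelNumByBBox for the target area.
    """
    if pixel_count <= 0:
        raise ValueError("pixel_count must be positive")
    if not 0 <= pmb_area <= pixel_count:
        raise ValueError("pmb_area must lie between 0 and pixel_count")
    pmbp = pmb_area / pixel_count * 100.0
    return 100.0 - pmbp


def choose_method(
    has_prediction_info: bool,
    imbp_frame: float,
    imbp_bbox: float,
    frame_threshold: float,
    bbox_threshold: float,
) -> Method:
    """Select the skeleton estimation path for one film frame."""
    # Claim 7: no prediction information, so use the detection algorithm.
    if not has_prediction_info:
        return Method.DETECTOR
    # Claim 4: check the whole-frame ratio against the second threshold.
    # ASSUMPTION: exceeding it falls back to the detector; the claims
    # only describe the case where it is not exceeded.
    if imbp_frame > frame_threshold:
        return Method.DETECTOR
    # Claim 3: target-area ratio above the first threshold.
    if imbp_bbox > bbox_threshold:
        return Method.DETECTOR
    # Claim 1: ratio not greater than the first threshold.
    return Method.MOTION_VECTOR
```

For example, with a target area of 1,000 pixels of which 750 belong to predicted blocks, `imb_ratio(750, 1000)` gives 25.0, and with illustrative thresholds of 50 for the frame and 30 for the target area, `choose_method(True, 10.0, 25.0, 50.0, 30.0)` selects the motion-vector path.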
CN202110280064.1A 2020-12-29 2021-03-16 Human skeleton detection method and device Pending CN114694169A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US17/137,212 US11625938B2 (en) 2020-12-29 2020-12-29 Method and device for detecting human skeletons
US17/137,212 2020-12-29
TW110100578A TWI790523B (en) 2020-12-29 2021-01-07 Method and device for detecting human skeleton
TW110100578 2021-01-07

Publications (1)

Publication Number Publication Date
CN114694169A true CN114694169A (en) 2022-07-01

Family

ID=82135569

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110280064.1A Pending CN114694169A (en) 2020-12-29 2021-03-16 Human skeleton detection method and device

Country Status (1)

Country Link
CN (1) CN114694169A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination