CN111080589A - Target object matching method, system, device and machine readable medium - Google Patents

Info

Publication number
CN111080589A
Authority
CN
China
Prior art keywords
point
human body
target
skeleton
human
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911230513.0A
Other languages
Chinese (zh)
Inventor
姚志强
周曦
李继伟
钟南昌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jize Technology Co Ltd
Yuncong Technology Group Co Ltd
Original Assignee
Guangzhou Jize Technology Co Ltd
Yuncong Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jize Technology Co Ltd and Yuncong Technology Group Co Ltd
Priority: CN201911230513.0A
Publication: CN111080589A
Legal status: Pending

Classifications

    • G06T 7/0002 — Image analysis; inspection of images, e.g. flaw detection
    • G06F 18/22 — Pattern recognition; analysing; matching criteria, e.g. proximity measures
    • G06T 3/02 — Geometric image transformations in the plane of the image; affine transformations
    • G06T 7/11 — Image analysis; segmentation, edge detection; region-based segmentation
    • G06V 40/23 — Recognition of human movements or behaviour; recognition of whole body movements, e.g. for sport training
    • G06T 2207/10016 — Indexing scheme for image analysis; image acquisition modality; video, image sequence


Abstract

The invention provides a target object matching method, system, device, and machine-readable medium. One or more matching regions of one or more target objects in different postures are acquired; the one or more matching regions are affine-transformed to corresponding reference regions of a reference object; and the matching degree between the target object and the reference object is determined. The method is based on a bottom-up Deep Pose approach and can directly detect the key points of all human bodies in a whole image. One or more human bodies are selected as target bodies, the key points of the target bodies' postures in consecutive frame images are affine-transformed onto another human body, and the matching degree between the target body and the reference body is computed from the transformed result. The method is fast, and it can compute the matching degree between target and reference bodies even when multiple target bodies are present.

Description

Target object matching method, system, device and machine readable medium
Technical Field
The present invention relates to recognition technologies, and in particular, to a target object matching method, system, device, and machine-readable medium.
Background
Traditional two-dimensional target object matching methods rely on hardware processing capacity: extra sensor equipment is needed to acquire the pose information of a target object before the pose can be compared with a reference target. Even methods that require no additional equipment, such as the Deformable Parts Model (DPM, an object-detection algorithm), must detect objects one by one and locate pose key points per object, which is time-consuming and falls short of real-time performance. The approach commonly adopted at present is skeleton key point detection based on deep learning, but most deep methods are top-down: to detect the human key points in a picture, a pedestrian detector first localizes each human body, and a key point detector then detects the key points of each body. For example, when the target object is a human body, a common method is to detect the human body first and then locate its skeleton points; the combined time of these two steps is often large. In some scenarios, multiple target objects may exist at the same time, so a more efficient matching method is needed to detect and match the postures of the target objects in real time.
Disclosure of Invention
In view of the above-mentioned shortcomings of the prior art, it is an object of the present invention to provide a target object matching method, system, device and machine-readable medium for solving the problems in the prior art.
To achieve the above and other related objects, the present invention provides a target object matching method, including:
acquiring one or more matching areas of one or more target objects in different postures;
affine transforming the one or more matching regions to corresponding reference regions in a reference object;
and determining the matching degree of the target object and the reference object.
Optionally, the reference object corresponds to the target object, the target object comprising at least one of: target human body, target animal body.
Optionally, one or more continuous frame images are acquired, and one or more matching regions of one or more target human bodies in different postures are determined according to the continuous frame images.
Optionally, the matching region comprises at least one of: target human skeleton points, a target human skeleton point combination and a target human body part.
Optionally, the target human skeleton point comprises at least one of: the center point of the head, the left hip point, the right hip point, the left shoulder point and the right shoulder point.
Optionally, the target human skeleton point combination includes at least one of: a human body skeleton point combination consisting of the head center point, the left hip point and the right hip point; a human body skeleton point combination consisting of the head center point, the left shoulder point and the right shoulder point; a human body skeleton point combination consisting of the left shoulder point, the left hip point and the right hip point; and a human body skeleton point combination consisting of the right shoulder point, the left hip point and the right hip point.
Optionally, the target human body part comprises at least one of: human head, human shoulder, human arm, human buttock, human shank.
Optionally, if the matching region is a target human skeleton point, the reference region includes a reference human skeleton point; the reference human skeleton point includes at least one of: a head center reference point corresponding to a head center point, a left hip reference point corresponding to a left hip point, a right hip reference point corresponding to a right hip point, a left shoulder reference point corresponding to a left shoulder point, a right shoulder reference point corresponding to a right shoulder point.
Optionally, if the matching region is a target human skeleton point combination, the reference region includes a reference human skeleton point combination; the reference human skeleton point combination comprises at least one of the following: a reference human body skeleton point combination consisting of the head center reference point, the left hip reference point and the right hip reference point; a reference human body skeleton point combination consisting of the head center reference point, the left shoulder reference point and the right shoulder reference point; a reference human body skeleton point combination consisting of the left shoulder reference point, the left hip reference point and the right hip reference point; and a reference human body skeleton point combination consisting of the right shoulder reference point, the left hip reference point and the right hip reference point.
Optionally, the affine transformation comprises at least one of: rotation, translation and shearing.
Optionally, determining the matching degree between the target object and the reference object specifically includes:
acquiring a target human body skeleton point sequence and a reference human body skeleton point sequence, and determining a skeleton point coincidence sequence and a skeleton point coincidence ratio of the target human body skeleton point sequence and the reference human body skeleton point sequence;
calculating cosine value sequences of the internal angles of the limb triangles of the target human body skeleton point and the reference human body skeleton point according to the skeleton point coincidence sequence;
and determining the matching degree of the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the skeleton point coincidence ratio.
Optionally, determining the matching degree between the target object and the reference object specifically includes:
acquiring a target human body skeleton point combination sequence and a reference human body skeleton point combination sequence, and determining a skeleton point combination coincidence sequence and a skeleton point combination coincidence ratio of the target human body skeleton point combination sequence and the reference human body skeleton point combination sequence;
calculating cosine value sequences of the internal angles of the limb triangles of the target human body skeleton point and the reference human body skeleton point according to the skeleton point combination coincidence sequence;
and determining the matching degree of the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the combined coincidence ratio of the skeleton points.
The invention also provides a target object matching system, which comprises:
the acquisition module is used for acquiring one or more matching areas when one or more target objects are in different postures;
an affine transformation module for affine transforming the one or more matching regions to corresponding reference regions in a reference object;
and the matching module is used for determining the matching degree of the target object and the reference object.
Optionally, the reference object corresponds to the target object, the target object comprising at least one of: target human body, target animal body.
Optionally, the acquiring module includes an image acquiring unit and a matching region unit;
the image acquisition unit is used for acquiring one or more continuous frame images;
the matching region unit determines one or more matching regions when one or more target human bodies are in different postures according to the continuous frame images.
Optionally, the matching region comprises at least one of: target human skeleton points, a target human skeleton point combination and a target human body part.
Optionally, the target human skeleton point comprises at least one of: the center point of the head, the left hip point, the right hip point, the left shoulder point and the right shoulder point.
Optionally, the target human skeleton point combination includes at least one of: a human body skeleton point combination consisting of the head center point, the left hip point and the right hip point; a human body skeleton point combination consisting of the head center point, the left shoulder point and the right shoulder point; a human body skeleton point combination consisting of the left shoulder point, the left hip point and the right hip point; and a human body skeleton point combination consisting of the right shoulder point, the left hip point and the right hip point.
Optionally, the target human body part comprises at least one of: human head, human shoulder, human arm, human buttock, human shank.
Optionally, if the matching region is a target human skeleton point, the reference region includes a reference human skeleton point; the reference human skeleton point includes at least one of: a head center reference point corresponding to a head center point, a left hip reference point corresponding to a left hip point, a right hip reference point corresponding to a right hip point, a left shoulder reference point corresponding to a left shoulder point, a right shoulder reference point corresponding to a right shoulder point.
Optionally, if the matching region is a target human skeleton point combination, the reference region includes a reference human skeleton point combination; the reference human skeleton point combination comprises at least one of the following: a reference human body skeleton point combination consisting of the head center reference point, the left hip reference point and the right hip reference point; a reference human body skeleton point combination consisting of the head center reference point, the left shoulder reference point and the right shoulder reference point; a reference human body skeleton point combination consisting of the left shoulder reference point, the left hip reference point and the right hip reference point; and a reference human body skeleton point combination consisting of the right shoulder reference point, the left hip reference point and the right hip reference point.
Optionally, the affine transformation comprises at least one of: rotation, translation and shearing.
Optionally, the matching module includes a first processing unit, a first calculating unit and a first matching unit;
the first processing unit is used for acquiring a target human body skeleton point sequence and a reference human body skeleton point sequence, and determining a skeleton point coincidence sequence and a skeleton point coincidence ratio of the target human body skeleton point sequence and the reference human body skeleton point sequence;
the first calculation unit is used for calculating cosine value sequences of the internal angles of the limb triangles of the target human body skeleton point and the reference human body skeleton point according to the skeleton point coincidence sequence;
the first matching unit is used for determining the matching degree of the target human body and the reference human body according to the cosine value sequence of the limb triangle inner angle of the target human body skeleton point, the cosine value sequence of the limb triangle inner angle of the reference human body skeleton point and the skeleton point coincidence ratio.
Optionally, the matching module includes a second processing unit, a second calculating unit, and a second matching unit;
the second processing unit is used for acquiring a target human body skeleton point combination sequence and a reference human body skeleton point combination sequence, and determining a skeleton point combination coincidence sequence and a skeleton point combination coincidence ratio of the target human body skeleton point combination sequence and the reference human body skeleton point combination sequence;
the second calculation unit is used for calculating cosine value sequences of the internal angles of the limb triangles of the target human body skeleton point and the reference human body skeleton point according to the skeleton point combination coincidence sequence;
the second matching unit is used for determining the matching degree of the target human body and the reference human body according to the cosine value sequence of the limb triangle inner angle of the target human body skeleton point, the cosine value sequence of the limb triangle inner angle of the reference human body skeleton point and the combined coincidence ratio of the skeleton points.
The invention also provides a target object matching device, comprising:
acquiring one or more matching areas of one or more target objects in different postures;
affine transforming the one or more matching regions to corresponding reference regions in a reference object;
and determining the matching degree of the target object and the reference object.
The present invention also provides an apparatus comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform a method as described in one or more of the above.
The present invention also provides one or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the methods as described in one or more of the above.
As described above, the target object matching method, system, device and machine-readable medium provided by the present invention have the following beneficial effects: one or more matching regions of one or more target objects in different postures are acquired; the matching regions are affine-transformed to corresponding reference regions of a reference object; and the matching degree between the target object and the reference object is determined. The method is based on a bottom-up Deep Pose approach and can directly detect the key points of all human bodies in whole consecutive frame images, affine-transform the key points of the target body's postures in the consecutive frames onto another human body, and compute the matching degree between the target body and the reference body from the transformed result. The method is fast, and it can compute the matching degree even when multiple target bodies are present at the same time.
Drawings
Fig. 1 is a schematic flowchart of a target object matching method according to an embodiment.
Fig. 2 is a schematic diagram of a target human skeleton point according to an embodiment.
Fig. 3 is a schematic diagram of reference human skeleton points according to an embodiment.
Fig. 4 is a schematic connection diagram of a target object matching system according to an embodiment.
Fig. 5 is a schematic diagram of a hardware structure of an acquisition module according to an embodiment.
Fig. 6 is a schematic diagram of a hardware structure of a matching module according to an embodiment.
Fig. 7 is a schematic diagram of a hardware structure of a matching module according to another embodiment.
Fig. 8 is a schematic hardware structure diagram of a terminal device according to an embodiment.
Fig. 9 is a schematic diagram of a hardware structure of a terminal device according to another embodiment.
Description of the element reference numerals
M10 acquisition module
M20 affine transformation module
M30 matching module
D10 image acquisition unit
D20 matching region unit
D30 first processing unit
D40 first computing unit
D50 first matching unit
D60 second processing unit
D70 second calculation unit
D80 second matching unit
1100 input device
1101 first processor
1102 output device
1103 first memory
1104 communication bus
1200 processing component
1201 second processor
1202 second memory
1203 communication component
1204 power supply component
1205 multimedia component
1206 audio component
1207 input/output interface
1208 sensor component
Detailed Description
The embodiments of the present invention are described below by way of specific examples, and other advantages and effects of the present invention will be readily apparent to those skilled in the art from the disclosure of this specification. The invention may also be implemented or applied through other, different embodiments, and the details in this specification may be modified or changed in various respects without departing from the spirit and scope of the present invention. It is to be noted that, in the absence of conflict, the features in the following embodiments and examples may be combined with each other.
It should be noted that the drawings provided in the following embodiments only illustrate the basic idea of the present invention in a schematic way: they show only the components related to the present invention rather than the number, shape and size of components in an actual implementation. The type, quantity and proportion of components in an actual implementation may vary freely, and the component layout may be more complicated.
Affine transformation: in geometry, a linear transformation of one vector space followed by a translation, mapping it into another vector space.
Deep Pose: deep-learning-based pose detection.
Key point: the optimal or most representative point.
Skeleton key point: the optimal or most representative point among the skeleton points of the human body.
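The affine transformation defined above (a linear map followed by a translation) can be sketched numerically. A minimal illustration, assuming 2-D points stored as rows of a NumPy array; the function name is ours, not the patent's:

```python
import numpy as np

def affine_transform(points, A, b):
    """Apply y = A @ x + b to each row of an (N, 2) array of 2-D points."""
    return points @ A.T + b

# Example: rotate 90 degrees counter-clockwise, then translate by (1, 0).
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
b = np.array([1.0, 0.0])

pts = np.array([[1.0, 0.0],
                [0.0, 1.0]])
out = affine_transform(pts, A, b)
# (1, 0) -> (1, 1); (0, 1) -> (0, 0)
```

Rotation, translation and shearing, the operations listed later in this description, are all special cases of the matrix A and offset b.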
Referring to fig. 1 to 3, the present invention provides a target object matching method, including:
s100, one or more matching areas of one or more target objects in different postures are obtained. In the embodiment of the present application, the target object may include, for example, at least one of: target human body, target animal body. If the target object is a human body, the target human body in different postures comprises at least one of the following: the target human body is a worker, and the posture of the worker during operation is determined; the target human body is a dance teacher, and the dance teacher takes a posture during dance teaching; the target human body is a student, and the student takes the posture during the inter-class exercise; the target human body is the middle-aged woman, the posture of the middle-aged woman when dancing the square, and the like.
In an exemplary embodiment, one or more consecutive frame images may be acquired, for example with a conventional computer and an ordinary camera, and one or more matching regions of one or more target human bodies in different postures are determined from the consecutive frame images. Compared with the prior art, no dedicated human-body-detection sensor equipment needs to be deployed, which greatly reduces cost. The consecutive frame images include video, continuously shot photographs, and the like.
In some exemplary embodiments, the matching region comprises at least one of: target human skeleton points, a target human skeleton point combination and a target human body part.
Wherein the target human skeleton points comprise at least one of: the center point of the head, the left hip point, the right hip point, the left shoulder point and the right shoulder point. The target human skeleton point combination comprises at least one of the following components: a human body skeleton point combination consisting of a human head central point, a left hip point and a right hip point; a human body skeleton point combination consisting of a human head central point, a left shoulder point and a right shoulder point; a human body skeleton point combination consisting of the left shoulder point, the left hip point and the right hip point; the right shoulder point, the left hip point and the right hip point form a human body skeleton point combination. The target human body part includes at least one of: human head, human shoulder, human arm, human buttock, human shank.
For example, take the human body posture as the target object and a video as the consecutive frame images. When determining the matching region of the target object, one or more videos are obtained with a conventional computer and an ordinary camera. One or more human bodies are obtained from a video frame; with human skeleton points as matching regions, skeleton key point localization is performed on the bodies in the frame using a bottom-up Deep Pose method, which can directly detect the skeleton key points of all human bodies in the whole frame. One or more bodies are then selected as target bodies (bodies to be matched), which means the skeleton key points of all target bodies in the whole frame are available directly. Compared with the top-down methods of the prior art, the bottom-up Deep Pose method detects all bodies in one pass, without first detecting each body and then locating its key points in sequence. Moreover, this embodiment is efficient and accurate: the time consumed is hardly affected by the number of people in the frame, and the frame rate in a multi-person video scene can exceed 20 FPS (frames per second), whereas the prior art reaches only about 2 FPS in such scenes, so the method is clearly superior in both time consumption and multi-person detection. Multi-person video scenes may include, for example, worker operations, dance teaching, motion-sensing games, students' between-class exercises, square dancing, and the like.
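The one-pass behaviour described above can be illustrated with a short sketch. The detector interface below is an assumption made for illustration (a bottom-up detector is treated as returning one (18, 2) key point array per person for the whole frame); it is not the patent's API:

```python
import numpy as np

def select_targets(all_keypoints, target_indices):
    """Pick the target (to-be-matched) bodies out of a single
    whole-frame, bottom-up detection result.

    all_keypoints: list of (18, 2) key point arrays, one per detected person.
    target_indices: indices of the people chosen as target bodies.
    """
    return [all_keypoints[i] for i in target_indices]

# Simulated whole-frame detector output for a frame with three people:
# every body's skeleton key points come from one detection pass.
frame_detections = [np.random.rand(18, 2) for _ in range(3)]
targets = select_targets(frame_detections, [0, 2])
```

Because the per-frame cost is a single detection pass, adding people to the frame adds only this trivial selection step, consistent with the near-constant time consumption claimed above.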
S200, affine transforming the one or more matching areas to corresponding reference areas in the reference object. In the embodiment of the application, the reference object corresponds to the target object, and specifically, if the target object is a human body, the reference object is also the human body; if the target object is an animal, the reference object is also an animal.
In some exemplary embodiments, the reference region comprises at least one of: reference human skeleton points, reference human skeleton point combinations and reference human body parts.
If the matching region is a target human body skeleton point, the reference region comprises a reference human body skeleton point; the reference human skeleton point includes at least one of: a head center reference point corresponding to a head center point, a left hip reference point corresponding to a left hip point, a right hip reference point corresponding to a right hip point, a left shoulder reference point corresponding to a left shoulder point, a right shoulder reference point corresponding to a right shoulder point.
If the matching area is the target human body skeleton point combination, the reference area comprises a reference human body skeleton point combination; the reference human skeleton point combination comprises at least one of the following components: a reference human body skeleton point combination consisting of a human head center reference point, a left hip reference point and a right hip reference point; a reference human body skeleton point combination consisting of a head center reference point, a left shoulder reference point and a right shoulder reference point; a reference human body skeleton point combination consisting of a left shoulder reference point, a left hip reference point and a right hip reference point; and the reference human body skeleton point combination is formed by the right shoulder reference point, the left hip reference point and the right hip reference point.
If the matching region is the target human body part, the reference region comprises a reference human body part, and the reference human body part comprises at least one of the following parts: the human body head reference, the human body shoulder reference, the human body arm reference, the human body buttock reference and the human body leg reference.
For example, in the embodiment of the present application, if the target object is a human body, two postures of human bodies in a certain state are selected from a video frame; one body is chosen as the target body (the body to be matched) and the other as the reference (standard) body, and the skeleton points of the target body are affine-transformed onto the reference body so that the skeleton points of the two bodies are unified to approximately the same scale and angle. The affine transformation operation includes at least one of: rotation, translation and shearing. Since an affine transformation can be determined from three pairs of corresponding human skeleton key points, and each body has at most 18 skeleton key points, there are C(18, 3) = 816 possible choices when the affine relation is determined from exactly three skeleton points. A sequence of the 18 human skeleton key points corresponds to a human skeleton point sequence; selecting at least three of the 18 key points yields a human skeleton key point combination, and a sequence of such combinations corresponds to a human skeleton point combination sequence. The present application may select the optimal three skeleton key points as affine transformation reference points, or select a combination of more than three skeleton key points as affine transformation reference points.
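The three-point case above can be made concrete: three non-collinear point correspondences give six equations for the six affine parameters. A sketch under that assumption (the solver below is illustrative, not the patent's procedure):

```python
import numpy as np
from math import comb

def affine_from_3_points(src, dst):
    """Solve y = A @ x + b from three matched 2-D skeleton key points.

    src, dst: (3, 2) arrays of corresponding points (non-collinear).
    """
    # Homogeneous source coordinates: one row [x, y, 1] per point.
    M = np.hstack([src, np.ones((3, 1))])      # (3, 3)
    params = np.linalg.solve(M, dst)           # (3, 2): A^T stacked over b
    return params[:2].T, params[2]             # A (2, 2), b (2,)

# With 18 skeleton key points per body there are C(18, 3) = 816
# possible three-point choices, as stated in the description.
assert comb(18, 3) == 816

src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = src + np.array([1.0, 1.0])   # reference points: a pure translation
A, b = affine_from_3_points(src, dst)
# A is the identity matrix, b is (1, 1)
```

The same linear system extends to more than three points via least squares, matching the option of using combinations of more than three key points as affine reference points.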
For example, in the embodiment of the present application, relatively representative three-point combinations from the commonly visible key point sequence are selected as candidate point combinations: affine transformations are performed in turn using (head center point, left hip point, right hip point), (head center point, left shoulder point, right shoulder point), (left shoulder point, left hip point, right hip point), (right shoulder point, left hip point, right hip point), and so on. The candidate point combination that yields the highest computed matching degree is the optimal (key) point combination.
And S300, determining the matching degree of the target object and the reference object.
In an exemplary embodiment, the determining the matching degree of the target object and the reference object by using the human skeleton point as the matching region specifically includes:
and acquiring a target human body skeleton point sequence and a reference human body skeleton point sequence, and determining a skeleton point coincidence sequence and a skeleton point coincidence ratio of the target human body skeleton point sequence and the reference human body skeleton point sequence. Specifically, a target human body skeleton point sequence consisting of 18 human body skeleton key points of a target human body and a reference human body skeleton point sequence consisting of 18 human body skeleton key points of a reference human body are obtained; and then determining a skeleton point coincidence sequence and a skeleton point coincidence ratio of the target human body skeleton point sequence and the reference human body skeleton point sequence.
Calculating cosine value sequences of the internal angles of the limb triangles of the target human body skeleton point and the reference human body skeleton point according to the skeleton point coincidence sequence;
and determining the matching degree of the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the skeleton point coincidence ratio.
In an exemplary embodiment, the determining the matching degree of the target object and the reference object by using the human skeleton point combination as the matching region specifically includes:
And acquiring a target human body skeleton point combination sequence and a reference human body skeleton point combination sequence, and determining a skeleton point combination coincidence sequence and a skeleton point combination coincidence ratio of the two sequences. Specifically, three or more human skeleton key points are arbitrarily selected from the 18 human skeleton key points of the target human body as a target human skeleton point combination, thereby forming a target human skeleton point combination sequence. The corresponding human skeleton key points are selected from the reference human body to form a reference human skeleton point combination corresponding to the target human skeleton point combination, thereby forming a reference human skeleton point combination sequence; and the skeleton point combination coincidence sequence and skeleton point combination coincidence ratio of the target and reference human body skeleton point combination sequences are determined.
Calculating cosine value sequences of the internal angles of the limb triangles of the target human body skeleton point and the reference human body skeleton point according to the skeleton point combination coincidence sequence;
and determining the matching degree of the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the combined coincidence ratio of the skeleton points.
In the embodiment of the present application, as shown in fig. 2 and 3, determining the matching degree between the target human body and the reference human body by using the human skeleton point as the matching region includes:
Respectively obtain a target human body skeleton point sequence and a reference human body skeleton point sequence, that is, respectively obtain the human skeleton key point sequences of the target human body and the reference human body, affine transform the human skeleton key points of the target human body onto the reference human body, and then determine the human skeleton key point coincidence sequence and the human skeleton key point coincidence ratio r. After the human skeleton key points of the target human body are affine transformed onto the reference human body, if commonly visible human skeleton key points exist in the target human body and the reference human body, all the commonly visible human skeleton key points of the two bodies are counted, and the result is recorded as the human skeleton key point coincidence sequence. The counted number of commonly visible human skeleton key points is then divided by the total number of human skeleton key points, and the result is recorded as the human skeleton key point coincidence ratio r.
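The coincidence sequence and ratio described above can be sketched as follows; keypoints are represented by their indices, and the divisor of 18 total keypoints is taken from the text (the function name is illustrative).

```python
def coincidence(target_visible, reference_visible, total_keypoints=18):
    """Return the indices visible in both skeletons (the coincidence
    sequence) and the coincidence ratio r = common / total."""
    common = sorted(set(target_visible) & set(reference_visible))
    r = len(common) / total_keypoints
    return common, r
```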
The limb angles at the commonly visible skeleton points of the target human skeleton and the reference human skeleton are respectively calculated and expressed as cosine values, with range (-1.0, 1.0). That is, the cosine value sequences of the limb triangle interior angles of the target human body skeleton points and the reference human body skeleton points are calculated according to the human skeleton key point coincidence sequence: the cosine value sequence of the target human body skeleton points, a0, a1, ..., ai, is calculated according to the commonly visible skeleton key point sequence; and the cosine value sequence of the reference human body skeleton points, b0, b1, ..., bi, is calculated according to the same sequence.
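Each such cosine can be computed directly from the dot product of the two limb vectors meeting at a joint; a minimal sketch (the function name is illustrative, not from the patent):

```python
import numpy as np

def limb_angle_cosine(vertex, p1, p2):
    """Cosine of the interior angle at `vertex` of the limb triangle
    formed with keypoints p1 and p2 (e.g. the angle at an elbow
    between the upper arm and the forearm)."""
    v1 = np.asarray(p1, dtype=float) - np.asarray(vertex, dtype=float)
    v2 = np.asarray(p2, dtype=float) - np.asarray(vertex, dtype=float)
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))
```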
Calculating the similarity S according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the skeleton point coincidence ratio r as follows:
[Formula for the similarity S, rendered as an image in the original publication; S is computed from the cosine value sequences a0, ..., ai and b0, ..., bi and the coincidence ratio r.]
And determining the matching degree of the target human body and the reference human body according to the similarity S. The matching degree corresponds to the similarity value; for example, if the similarity S between the target human body and the reference human body is 95%, the matching degree between the target human body and the reference human body is 95%.
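The published formula for S appears only as an image, so the exact expression is not recoverable here. The sketch below is one plausible form consistent with the surrounding description: per-angle agreement between paired cosines (each difference |a_i - b_i| is at most 2), averaged and then scaled by the coincidence ratio r. It is an assumption for illustration, not the patent's actual formula.

```python
def similarity(a, b, r):
    """Hypothetical similarity score in [0, 1]: average per-angle
    agreement between the two cosine value sequences, scaled by the
    skeleton point coincidence ratio r. NOT the patent's formula,
    which is published only as an image."""
    assert len(a) == len(b) and len(a) > 0
    per_angle = [1.0 - abs(x - y) / 2.0 for x, y in zip(a, b)]
    return r * sum(per_angle) / len(per_angle)
```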
The method comprises the steps of obtaining one or more matching areas when one or more target objects are in different postures; affine transforming the one or more matching regions to corresponding reference regions in a reference object; and determining the matching degree of the target object and the reference object. The method is based on a bottom-up Deep Pose method, can directly detect the human skeleton points of all human bodies in the whole video, selects one or more human bodies as target human bodies, and is equivalent to directly detecting the human skeleton points of all target human bodies in the whole video. And selecting human skeleton key points in the human skeleton points, carrying out affine transformation on the skeleton key points of the target human body to the reference human body, and finding out a skeleton key point sequence which is visible by the target human body and the reference human body together after the affine transformation. And calculating limb angles of the target human body and the reference human body according to the commonly visible skeleton key point sequence, namely calculating cosine value sequences of the internal angles of the respective limb triangles of the target human body skeleton point and the reference human body skeleton point according to the commonly visible skeleton key point sequence. And finally, calculating similarity according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the coincidence ratio of the skeleton points, and correspondingly determining the matching degree of the target human body and the reference human body according to the similarity. 
The method is efficient and accurate, the time consumption is hardly influenced by the number of people in the picture, and the frame rate in a multi-person video scene can reach more than 20FPS (Frames Per Second, the number of Frames transmitted Per Second). Therefore, the application effectively overcomes various defects in the prior art and has high industrial utilization value.
As shown in fig. 4, the present application further provides a target object matching system, which includes:
An obtaining module M10, configured to obtain one or more matching regions when one or more target objects are in different poses. In the embodiment of the present application, the target object may include, for example, at least one of: a target human body and a target animal body. If the target object is a human body, the target human body in different postures includes at least one of the following: the target human body is a worker, in the posture taken during work; the target human body is a dance teacher, in the posture taken during dance teaching; the target human body is a student, in the posture taken during between-class exercises; the target human body is a middle-aged woman, in the posture taken while performing a square dance; and the like.
In an exemplary embodiment, as shown in fig. 5, the acquiring module M10 includes an image acquiring unit D10 and a matching region unit D20;
The image acquisition unit D10 is used for acquiring one or more continuous frame images; the image acquisition unit D10 may be constituted by, for example, a conventional computer and an ordinary camera, through which the one or more continuous frame images are acquired.
The matching region unit D20 is connected to the image acquisition unit D10 and is configured to determine one or more matching regions when one or more target human bodies are in different postures according to the continuous frame images. Compared with the prior art, no dedicated human body detection sensor equipment needs to be arranged, which greatly reduces cost. The continuous frame images include videos, continuously shot photographs, and the like.
In some exemplary embodiments, the matching region comprises at least one of: target human skeleton points, a target human skeleton point combination and a target human body part.
Wherein the target human skeleton points comprise at least one of: the center point of the head, the left hip point, the right hip point, the left shoulder point and the right shoulder point. The target human skeleton point combination comprises at least one of the following components: a human body skeleton point combination consisting of a human head central point, a left hip point and a right hip point; a human body skeleton point combination consisting of a human head central point, a left shoulder point and a right shoulder point; a human body skeleton point combination consisting of the left shoulder point, the left hip point and the right hip point; the right shoulder point, the left hip point and the right hip point form a human body skeleton point combination. The target human body part includes at least one of: human head, human shoulder, human arm, human buttock, human shank.
For example, taking the human body posture as the target object and a video as the continuous frame images, when determining the matching region of the target object, one or more videos are obtained through a conventional computer and an ordinary camera. One or more human bodies are obtained from the video frames; according to the obtained human bodies, the human skeleton points are used as the matching regions, and skeleton key point positioning is performed on the human bodies in the video frames based on the bottom-up Deep Pose method, which can directly detect the skeleton key points of all human bodies in the entire video frame. One or more human bodies are selected as target human bodies or human bodies to be matched, which is equivalent to directly detecting the skeleton key points of all target human bodies or all human bodies to be matched in the entire video frame. Compared with the top-down method in the prior art, the bottom-up Deep Pose method can detect the human body directly as a whole, without first detecting the human head and then detecting downward in sequence. Moreover, the embodiment of the application is efficient and accurate, the time consumed is hardly affected by the number of people in the frame, and the frame rate in a multi-person video scene can reach more than 20 FPS (Frames Per Second); compared with a frame rate of only 2 FPS in a multi-person video scene in the prior art, the application is clearly superior in time consumption and multi-person scene detection. Multi-person video scenes may include, for example, worker operations, dance teaching, somatosensory games, students' between-class exercises, square dances, and the like.
An affine transformation module M20 for affine transforming the one or more matching regions to corresponding reference regions in the reference object. In the embodiment of the application, the reference object corresponds to the target object, and if the target object is a human body, the reference object is also the human body; if the target object is an animal, the reference object is also an animal.
In some exemplary embodiments, the reference region comprises at least one of: reference human skeleton points, reference human skeleton point combinations and reference human body parts.
If the matching region is a target human body skeleton point, the reference region comprises a reference human body skeleton point; the reference human skeleton point includes at least one of: a head center reference point corresponding to a head center point, a left hip reference point corresponding to a left hip point, a right hip reference point corresponding to a right hip point, a left shoulder reference point corresponding to a left shoulder point, a right shoulder reference point corresponding to a right shoulder point.
If the matching area is the target human body skeleton point combination, the reference area comprises a reference human body skeleton point combination; the reference human skeleton point combination comprises at least one of the following components: a reference human body skeleton point combination consisting of a human head center reference point, a left hip reference point and a right hip reference point; a reference human body skeleton point combination consisting of a head center reference point, a left shoulder reference point and a right shoulder reference point; a reference human body skeleton point combination consisting of a left shoulder reference point, a left hip reference point and a right hip reference point; and the reference human body skeleton point combination is formed by the right shoulder reference point, the left hip reference point and the right hip reference point.
If the matching region is the target human body part, the reference region comprises a reference human body part, and the reference human body part comprises at least one of the following parts: the human body head reference, the human body shoulder reference, the human body arm reference, the human body buttock reference and the human body leg reference.
For example, in the embodiment of the present application, if the target object is a human body, two human bodies in certain postures are selected from a video frame: one is taken as the target human body (the human body to be matched), and the other as the reference human body (the standard human body). The skeleton points of the target human body are affine transformed onto the reference human body, so that the skeleton points of the two human bodies are unified to approximately the same scale and angle. The affine transformation operation comprises at least one of: rotation, translation and shearing. Since an affine transformation can be determined from three human skeleton key points, and each human body has at most 18 human skeleton key points, there are C(18, 3) = 816 possible choices when the affine transformation is determined from exactly three skeleton points. Thus, the sequence consisting of the 18 human skeleton key points corresponds to the human skeleton point sequence; when at least three human skeleton key points are arbitrarily selected from the 18, the selected key points serve as a human skeleton key point combination, and the sequence formed by such combinations corresponds to the human skeleton point combination sequence. The present application may select the best three human skeleton key points as affine transformation reference points, or select a combination of more than three human skeleton key points as affine transformation reference points.
For example, in the embodiment of the present application, relatively representative three-point combinations in the commonly visible key point sequence are selected as candidate point combinations: that is, affine transformations are performed in turn using (head center point, left hip point, right hip point), (head center point, left shoulder point, right shoulder point), (left shoulder point, left hip point, right hip point), (right shoulder point, left hip point, right hip point), and so on. The candidate point combination that yields the highest calculated matching degree is taken as the best point combination, i.e., the key point combination.
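As a sanity check and a sketch of the selection rule, the snippet below verifies the count of 816 via C(18, 3) and picks the best of the four representative triples under a caller-supplied scoring function. The keypoint names and the `matching_degree` callback are illustrative, not from the patent.

```python
from math import comb

# With at most 18 key points per body, C(18, 3) = 816 triples exist.
assert comb(18, 3) == 816

# Illustrative names for the four representative triples in the text.
CANDIDATE_TRIPLES = [
    ("head_center", "left_hip", "right_hip"),
    ("head_center", "left_shoulder", "right_shoulder"),
    ("left_shoulder", "left_hip", "right_hip"),
    ("right_shoulder", "left_hip", "right_hip"),
]

def best_triple(matching_degree):
    """Return the candidate triple whose affine alignment yields the
    highest matching degree, per the selection rule in the text."""
    return max(CANDIDATE_TRIPLES, key=matching_degree)
```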
And the matching module M30 is used for determining the matching degree of the target object and the reference object.
In an exemplary embodiment, as shown in fig. 6, the matching module M30 includes a first processing unit D30, a first computing unit D40, and a first matching unit D50;
the first processing unit D30 is configured to obtain a target human skeleton point sequence and a reference human skeleton point sequence, and determine a skeleton point coincidence sequence and a skeleton point coincidence ratio of the target human skeleton point sequence and the reference human skeleton point sequence. Specifically, a target human body skeleton point sequence consisting of 18 human body skeleton key points of a target human body and a reference human body skeleton point sequence consisting of 18 human body skeleton key points of a reference human body are obtained; and then determining a skeleton point coincidence sequence and a skeleton point coincidence ratio of the target human body skeleton point sequence and the reference human body skeleton point sequence.
The first calculating unit D40 is connected with the first processing unit D30 and is used for calculating cosine value sequences of the internal angles of the limb triangles of the target human body skeleton point and the reference human body skeleton point according to the skeleton point coincidence sequence;
the first matching unit D50 is connected to the first calculating unit D40, and is configured to determine a matching degree between the target human body and the reference human body according to the cosine value sequence of the internal angle of the limb triangle of the target human body skeleton point, the cosine value sequence of the internal angle of the limb triangle of the reference human body skeleton point, and the skeleton point coincidence ratio.
In an exemplary embodiment, as shown in fig. 7, the matching module M30 includes a second processing unit D60, a second computing unit D70 and a second matching unit D80;
The second processing unit D60 is configured to obtain a target human body skeleton point combination sequence and a reference human body skeleton point combination sequence, and determine a skeleton point combination coincidence sequence and a skeleton point combination coincidence ratio of the two sequences. Specifically, three or more human skeleton key points are arbitrarily selected from the 18 human skeleton key points of the target human body as a target human skeleton point combination, thereby forming a target human skeleton point combination sequence. The corresponding human skeleton key points are selected from the reference human body to form a reference human skeleton point combination corresponding to the target human skeleton point combination, thereby forming a reference human skeleton point combination sequence; and the skeleton point combination coincidence sequence and skeleton point combination coincidence ratio of the target and reference human body skeleton point combination sequences are determined.
The second calculating unit D70 is connected with the second processing unit D60 and is used for calculating cosine value sequences of the internal angles of the respective limb triangles of the target human body skeleton point and the reference human body skeleton point according to the skeleton point combination coincidence sequence;
the second matching unit D80 is connected with the second calculating unit D70 and is used for determining the matching degree of the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the combination coincidence ratio of the skeleton points.
In the embodiment of the present application, as shown in fig. 2 and 3, determining the matching degree between the target human body and the reference human body by using the human skeleton point as the matching region includes:
Respectively obtain a target human body skeleton point sequence and a reference human body skeleton point sequence, that is, respectively obtain the human skeleton key point sequences of the target human body and the reference human body, affine transform the human skeleton key points of the target human body onto the reference human body, and then determine the human skeleton key point coincidence sequence and the human skeleton key point coincidence ratio r. After the human skeleton key points of the target human body are affine transformed onto the reference human body, if commonly visible human skeleton key points exist in the target human body and the reference human body, all the commonly visible human skeleton key points of the two bodies are counted, and the result is recorded as the human skeleton key point coincidence sequence. The counted number of commonly visible human skeleton key points is then divided by the total number of human skeleton key points, and the result is recorded as the human skeleton key point coincidence ratio r.
The limb angles at the commonly visible skeleton points of the target human skeleton and the reference human skeleton are respectively calculated and expressed as cosine values, with range (-1.0, 1.0). That is, the cosine value sequences of the limb triangle interior angles of the target human body skeleton points and the reference human body skeleton points are calculated according to the human skeleton key point coincidence sequence: the cosine value sequence of the target human body skeleton points, a0, a1, ..., ai, is calculated according to the commonly visible skeleton key point sequence; and the cosine value sequence of the reference human body skeleton points, b0, b1, ..., bi, is calculated according to the same sequence.
Calculating the similarity S according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the skeleton point coincidence ratio r as follows:
[Formula for the similarity S, rendered as an image in the original publication; S is computed from the cosine value sequences a0, ..., ai and b0, ..., bi and the coincidence ratio r.]
And determining the matching degree of the target human body and the reference human body according to the similarity S. The matching degree corresponds to the similarity value; for example, if the similarity S between the target human body and the reference human body is 95%, the matching degree between the target human body and the reference human body is 95%.
The method comprises the steps of obtaining one or more matching areas when one or more target objects are in different postures; affine transforming the one or more matching regions to corresponding reference regions in a reference object; and determining the matching degree of the target object and the reference object. The method is based on a bottom-up Deep Pose method, can directly detect the human skeleton points of all human bodies in the whole video, selects one or more human bodies as target human bodies, and is equivalent to directly detecting the human skeleton points of all target human bodies in the whole video. And selecting human skeleton key points in the human skeleton points, carrying out affine transformation on the skeleton key points of the target human body to the reference human body, and finding out a skeleton key point sequence which is visible by the target human body and the reference human body together after the affine transformation. And calculating limb angles of the target human body and the reference human body according to the commonly visible skeleton key point sequence, namely calculating cosine value sequences of the internal angles of the respective limb triangles of the target human body skeleton point and the reference human body skeleton point according to the commonly visible skeleton key point sequence. And finally, calculating similarity according to the cosine value sequence of the limb triangle internal angle of the target human body skeleton point, the cosine value sequence of the limb triangle internal angle of the reference human body skeleton point and the coincidence ratio of the skeleton points, and correspondingly determining the matching degree of the target human body and the reference human body according to the similarity. 
The method is efficient and accurate, the time consumption is hardly influenced by the number of people in the picture, and the frame rate in a multi-person video scene can reach more than 20FPS (Frames Per Second, the number of Frames transmitted Per Second). Therefore, the application effectively overcomes various defects in the prior art and has high industrial utilization value.
An embodiment of the present application further provides a target object matching device, including:
acquiring one or more matching areas of one or more target objects in different postures;
affine transforming the one or more matching regions to corresponding reference regions in a reference object;
and determining the matching degree of the target object and the reference object.
In this embodiment, the target object matching device executes the system or method described above; for specific functions and technical effects, reference may be made to the above embodiments, which are not repeated here.
An embodiment of the present application further provides an apparatus, which may include: one or more processors; and one or more machine readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method of fig. 1. In practical applications, the device may be used as a terminal device, and may also be used as a server, where examples of the terminal device may include: the mobile terminal includes a smart phone, a tablet computer, an electronic book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop, a vehicle-mounted computer, a desktop computer, a set-top box, an intelligent television, a wearable device, and the like.
Embodiments of the present application also provide a non-transitory readable storage medium, where one or more modules (programs) are stored in the storage medium, and when the one or more modules are applied to a device, the device may execute instructions (instructions) of steps included in the method in fig. 1 according to the embodiments of the present application.
Fig. 8 is a schematic diagram of a hardware structure of a terminal device according to an embodiment of the present application. As shown, the terminal device may include: an input device 1100, a first processor 1101, an output device 1102, a first memory 1103, and at least one communication bus 1104. The communication bus 1104 is used to implement communication connections between the elements. The first memory 1103 may include a high-speed RAM memory, and may also include a non-volatile storage NVM, such as at least one disk memory, and the first memory 1103 may store various programs for performing various processing functions and implementing the method steps of the present embodiment.
Alternatively, the first processor 1101 may be, for example, a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and the first processor 1101 is coupled to the input device 1100 and the output device 1102 through a wired or wireless connection.
Optionally, the input device 1100 may include a variety of input devices, such as at least one of a user-oriented user interface, a device-oriented device interface, a software programmable interface, a camera, and a sensor. Optionally, the device interface facing the device may be a wired interface for data transmission between devices, or may be a hardware plug-in interface (e.g., a USB interface, a serial port, etc.) for data transmission between devices; optionally, the user-facing user interface may be, for example, a user-facing control key, a voice input device for receiving voice input, and a touch sensing device (e.g., a touch screen with a touch sensing function, a touch pad, etc.) for receiving user touch input; optionally, the programmable interface of the software may be, for example, an entry for a user to edit or modify a program, such as an input pin interface or an input interface of a chip; the output devices 1102 may include output devices such as a display, audio, and the like.
In this embodiment, the processor of the terminal device includes functions for executing each module of the target object matching apparatus in each of the above devices; for specific functions and technical effects, reference may be made to the above embodiments, which are not repeated here.
Fig. 9 is a schematic hardware structure diagram of a terminal device according to an embodiment of the present application. FIG. 9 is a specific embodiment of the implementation of FIG. 8. As shown in fig. 9, the terminal device of the present embodiment may include a second processor 1201 and a second memory 1202.
The second processor 1201 executes the computer program code stored in the second memory 1202 to implement the method described in Fig. 1 in the above embodiment.
The second memory 1202 is configured to store various types of data to support operations at the terminal device. Examples of such data include instructions for any application or method operating on the terminal device, such as messages, pictures, and videos. The second memory 1202 may include a Random Access Memory (RAM), and may also include a non-volatile memory, such as at least one magnetic disk memory.
Optionally, the second processor 1201 is provided in the processing component 1200. The terminal device may further include: a communication component 1203, a power component 1204, a multimedia component 1205, a voice component 1206, an input/output interface 1207, and/or a sensor component 1208. The specific components included in the terminal device are set according to actual requirements, which is not limited in this embodiment.
The processing component 1200 generally controls the overall operation of the terminal device. The processing component 1200 may include one or more second processors 1201 to execute instructions to perform all or part of the steps of the target object matching method described above. Further, the processing component 1200 can include one or more modules that facilitate interaction between the processing component 1200 and other components. For example, the processing component 1200 can include a multimedia module to facilitate interaction between the multimedia component 1205 and the processing component 1200.
The power supply component 1204 provides power to the various components of the terminal device. The power components 1204 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal device.
The multimedia component 1205 includes a display screen that provides an output interface between the terminal device and the user. In some embodiments, the display screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the display screen includes a touch panel, the display screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The voice component 1206 is configured to output and/or input voice signals. For example, the voice component 1206 includes a Microphone (MIC) configured to receive external voice signals when the terminal device is in an operational mode, such as a voice recognition mode. The received speech signal may further be stored in the second memory 1202 or transmitted via the communication component 1203. In some embodiments, the speech component 1206 further comprises a speaker for outputting speech signals.
The input/output interface 1207 provides an interface between the processing component 1200 and peripheral interface modules, which may be click wheels, buttons, etc. These buttons may include, but are not limited to: a volume button, a start button, and a lock button.
The sensor component 1208 includes one or more sensors for providing various aspects of status assessment for the terminal device. For example, the sensor component 1208 may detect the open/closed state of the terminal device, the relative positioning of components, and the presence or absence of user contact with the terminal device. The sensor component 1208 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, including detecting the distance between the user and the terminal device. In some embodiments, the sensor component 1208 may also include a camera or the like.
The communication component 1203 is configured to facilitate wired or wireless communication between the terminal device and other devices. The terminal device may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one embodiment, the terminal device may include a SIM card slot for inserting a SIM card, so that the terminal device can log onto a GPRS network and establish communication with the server via the internet.
As can be seen from the above, the communication component 1203, the voice component 1206, the input/output interface 1207 and the sensor component 1208 involved in the embodiment of fig. 9 can be implemented as the input device in the embodiment of fig. 8.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and technical concepts disclosed herein shall be covered by the claims of the present invention.

Claims (27)

1. A target object matching method, characterized by comprising the following steps:
acquiring one or more matching regions of one or more target objects in different postures;
affine transforming the one or more matching regions to corresponding reference regions in a reference object;
and determining the matching degree of the target object and the reference object.
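For illustration only, the three claimed steps can be sketched as follows, taking corresponding skeleton points as the matching and reference regions. The point coordinates, the least-squares fit, and the function names are assumptions made for this example, not part of the disclosure:

```python
import numpy as np

def fit_affine(src_pts, ref_pts):
    """Solve for the 2x3 affine matrix mapping src_pts onto ref_pts.

    src_pts, ref_pts: (N, 2) point correspondences with N >= 3, e.g.
    skeleton points of a matching region and of the reference region.
    """
    src = np.asarray(src_pts, dtype=float)
    ref = np.asarray(ref_pts, dtype=float)
    # Homogeneous coordinates: each row is (x, y, 1).
    ones = np.ones((len(src), 1))
    A_h = np.hstack([src, ones])                  # (N, 3)
    # Least-squares solution of A_h @ X = ref; X stacks the affine params.
    X, *_ = np.linalg.lstsq(A_h, ref, rcond=None)
    return X.T                                    # 2x3 matrix [A | b]

def apply_affine(M, pts):
    """Apply a 2x3 affine matrix (rotation/translation/shear) to points."""
    pts = np.asarray(pts, dtype=float)
    return pts @ M[:, :2].T + M[:, 2]

# Illustrative skeleton points: head center, left hip, right hip.
target = [(120.0, 40.0), (100.0, 160.0), (140.0, 160.0)]
reference = [(60.0, 20.0), (50.0, 80.0), (70.0, 80.0)]
M = fit_affine(target, reference)
print(np.allclose(apply_affine(M, target), reference))  # True
```

With exactly three non-collinear correspondences the fit is exact; with more points it becomes a least-squares approximation, which is one plausible way to realize the claimed affine transformation of a matching region onto a reference region.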
2. The target object matching method according to claim 1, wherein the reference object corresponds to the target object, and the target object includes at least one of: target human body, target animal body.
3. The target object matching method according to claim 2, wherein one or more continuous frame images are acquired, and one or more matching regions of one or more target human bodies in different postures are determined according to the continuous frame images.
4. The target object matching method of claim 3, wherein the matching region comprises at least one of: target human skeleton points, a target human skeleton point combination and a target human body part.
5. The target object matching method of claim 4, wherein the target human skeleton points comprise at least one of: the center point of the head, the left hip point, the right hip point, the left shoulder point and the right shoulder point.
6. The target object matching method of claim 4, wherein the target human skeleton point combination comprises at least one of: a human body skeleton point combination consisting of the head center point, the left hip point, and the right hip point; a human body skeleton point combination consisting of the head center point, the left shoulder point, and the right shoulder point; a human body skeleton point combination consisting of the left shoulder point, the left hip point, and the right hip point; and a human body skeleton point combination consisting of the right shoulder point, the left hip point, and the right hip point.
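For illustration, each of the four enumerated combinations defines one limb triangle over named skeleton points; the coordinates below are hypothetical placeholders used only to make the example runnable:

```python
# Skeleton points named in the claims; coordinates are hypothetical placeholders.
points = {
    "head_center": (2.0, 0.0),
    "left_shoulder": (1.0, 1.0),
    "right_shoulder": (3.0, 1.0),
    "left_hip": (1.5, 3.0),
    "right_hip": (2.5, 3.0),
}

# The four combinations enumerated in the claim, each defining one limb triangle.
combinations = [
    ("head_center", "left_hip", "right_hip"),
    ("head_center", "left_shoulder", "right_shoulder"),
    ("left_shoulder", "left_hip", "right_hip"),
    ("right_shoulder", "left_hip", "right_hip"),
]

triangles = [tuple(points[name] for name in combo) for combo in combinations]
print(len(triangles))  # 4
```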
7. The target object matching method according to claim 4, wherein the target human body part comprises at least one of: a human head, a human shoulder, a human arm, a human hip, and a human lower leg.
8. The target object matching method of claim 5, wherein if the matching region is a target human skeleton point, the reference region comprises a reference human skeleton point; the reference human skeleton point includes at least one of: a head center reference point corresponding to a head center point, a left hip reference point corresponding to a left hip point, a right hip reference point corresponding to a right hip point, a left shoulder reference point corresponding to a left shoulder point, a right shoulder reference point corresponding to a right shoulder point.
9. The target object matching method of claim 6, wherein if the matching region is a target human skeleton point combination, the reference region comprises a reference human skeleton point combination; the reference human skeleton point combination comprises at least one of: a reference human body skeleton point combination consisting of the head center reference point, the left hip reference point, and the right hip reference point; a reference human body skeleton point combination consisting of the head center reference point, the left shoulder reference point, and the right shoulder reference point; a reference human body skeleton point combination consisting of the left shoulder reference point, the left hip reference point, and the right hip reference point; and a reference human body skeleton point combination consisting of the right shoulder reference point, the left hip reference point, and the right hip reference point.
10. The target object matching method of claim 1, wherein said affine transformation comprises at least one of: rotation, translation and shearing.
11. The target object matching method according to claim 8, wherein determining the matching degree between the target object and the reference object specifically comprises:
acquiring a target human body skeleton point sequence and a reference human body skeleton point sequence, and determining a skeleton point coincidence sequence and a skeleton point coincidence ratio of the target human body skeleton point sequence and the reference human body skeleton point sequence;
calculating cosine value sequences of the limb triangle internal angles of the target human body skeleton points and of the reference human body skeleton points according to the skeleton point coincidence sequence;
and determining the matching degree between the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angles of the target human body skeleton points, the cosine value sequence of the limb triangle internal angles of the reference human body skeleton points, and the skeleton point coincidence ratio.
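A minimal runnable sketch of this computation; the closeness formula, the weighting by the coincidence ratio, and the example coordinates are illustrative assumptions, since the claim does not fix a concrete formula:

```python
import math

def triangle_cosines(p1, p2, p3):
    """Cosine values of the three interior angles of a limb triangle."""
    def cos_at(a, b, c):
        # Interior angle at vertex a, between edges a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        return dot / (math.hypot(*v1) * math.hypot(*v2))
    return [cos_at(p1, p2, p3), cos_at(p2, p1, p3), cos_at(p3, p1, p2)]

def matching_degree(target_triangles, reference_triangles, coincidence_ratio):
    """Compare the cosine sequences of coinciding limb triangles and weight
    the result by the skeleton-point coincidence ratio (illustrative formula)."""
    diffs = []
    for tgt, ref in zip(target_triangles, reference_triangles):
        t_cos = triangle_cosines(*tgt)
        r_cos = triangle_cosines(*ref)
        diffs.append(sum(abs(t - r) for t, r in zip(t_cos, r_cos)) / 3)
    # Each cosine difference lies in [0, 2], so normalize closeness to [0, 1].
    closeness = 1.0 - sum(diffs) / len(diffs) / 2.0
    return coincidence_ratio * closeness

# Identical limb triangles with a full coincidence ratio give a perfect match.
tri = ((0.0, 0.0), (4.0, 0.0), (2.0, 3.0))
print(matching_degree([tri], [tri], 1.0))  # 1.0
```

Because interior-angle cosines are invariant under rotation, translation, and uniform scaling, comparing cosine sequences tolerates pose alignment errors that a raw coordinate comparison would not.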
12. The target object matching method according to claim 9, wherein determining the matching degree between the target object and the reference object specifically comprises:
acquiring a target human body skeleton point combination sequence and a reference human body skeleton point combination sequence, and determining a skeleton point combination coincidence sequence and a skeleton point combination coincidence ratio of the target human body skeleton point combination sequence and the reference human body skeleton point combination sequence;
calculating cosine value sequences of the limb triangle internal angles of the target human body skeleton points and of the reference human body skeleton points according to the skeleton point combination coincidence sequence;
and determining the matching degree between the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angles of the target human body skeleton points, the cosine value sequence of the limb triangle internal angles of the reference human body skeleton points, and the skeleton point combination coincidence ratio.
13. A target object matching system, said system comprising:
the acquisition module is used for acquiring one or more matching regions when one or more target objects are in different postures;
an affine transformation module for affine transforming the one or more matching regions to corresponding reference regions in a reference object;
and the matching module is used for determining the matching degree of the target object and the reference object.
14. The target object matching system of claim 13, wherein the reference object corresponds to the target object, the target object comprising at least one of: target human body, target animal body.
15. The target object matching system of claim 14, wherein the acquisition module comprises an image acquisition unit and a matching area unit;
the image acquisition unit is used for acquiring one or more continuous frame images;
the matching region unit determines one or more matching regions when one or more target human bodies are in different postures according to the continuous frame images.
16. The target object matching system of claim 15, wherein the matching region comprises at least one of: target human skeleton points, a target human skeleton point combination and a target human body part.
17. The target object matching system of claim 16, wherein the target human skeletal points comprise at least one of: the center point of the head, the left hip point, the right hip point, the left shoulder point and the right shoulder point.
18. The target object matching system of claim 16, wherein the target human skeletal point combination comprises at least one of: a human body skeleton point combination consisting of the head center point, the left hip point, and the right hip point; a human body skeleton point combination consisting of the head center point, the left shoulder point, and the right shoulder point; a human body skeleton point combination consisting of the left shoulder point, the left hip point, and the right hip point; and a human body skeleton point combination consisting of the right shoulder point, the left hip point, and the right hip point.
19. The target object matching system of claim 16, wherein the target human body part comprises at least one of: human head, human shoulder, human arm, human buttock, human shank.
20. The target object matching system of claim 17, wherein the reference region comprises a reference human skeleton point if the matching region is a target human skeleton point; the reference human skeleton point includes at least one of: a head center reference point corresponding to a head center point, a left hip reference point corresponding to a left hip point, a right hip reference point corresponding to a right hip point, a left shoulder reference point corresponding to a left shoulder point, a right shoulder reference point corresponding to a right shoulder point.
21. The target object matching system of claim 18, wherein the reference region comprises a reference human skeleton point combination if the matching region is a target human skeleton point combination; the reference human skeleton point combination comprises at least one of: a reference human body skeleton point combination consisting of the head center reference point, the left hip reference point, and the right hip reference point; a reference human body skeleton point combination consisting of the head center reference point, the left shoulder reference point, and the right shoulder reference point; a reference human body skeleton point combination consisting of the left shoulder reference point, the left hip reference point, and the right hip reference point; and a reference human body skeleton point combination consisting of the right shoulder reference point, the left hip reference point, and the right hip reference point.
22. The target object matching system of claim 13, wherein the affine transformation comprises at least one of: rotation, translation and shearing.
23. The target object matching system of claim 20, wherein the matching module comprises a first processing unit, a first computing unit, and a first matching unit;
the first processing unit is used for acquiring a target human body skeleton point sequence and a reference human body skeleton point sequence, and determining a skeleton point coincidence sequence and a skeleton point coincidence ratio of the target human body skeleton point sequence and the reference human body skeleton point sequence;
the first calculation unit is used for calculating cosine value sequences of the limb triangle internal angles of the target human body skeleton points and of the reference human body skeleton points according to the skeleton point coincidence sequence;
the first matching unit is used for determining the matching degree between the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angles of the target human body skeleton points, the cosine value sequence of the limb triangle internal angles of the reference human body skeleton points, and the skeleton point coincidence ratio.
24. The target object matching system of claim 21, wherein the matching module comprises a second processing unit, a second computing unit, and a second matching unit;
the second processing unit is used for acquiring a target human body skeleton point combination sequence and a reference human body skeleton point combination sequence, and determining a skeleton point combination coincidence sequence and a skeleton point combination coincidence ratio of the target human body skeleton point combination sequence and the reference human body skeleton point combination sequence;
the second calculation unit is used for calculating cosine value sequences of the limb triangle internal angles of the target human body skeleton points and of the reference human body skeleton points according to the skeleton point combination coincidence sequence;
the second matching unit is used for determining the matching degree between the target human body and the reference human body according to the cosine value sequence of the limb triangle internal angles of the target human body skeleton points, the cosine value sequence of the limb triangle internal angles of the reference human body skeleton points, and the skeleton point combination coincidence ratio.
25. A target object matching apparatus, comprising:
acquiring one or more matching regions of one or more target objects in different postures;
affine transforming the one or more matching regions to corresponding reference regions in a reference object;
and determining the matching degree of the target object and the reference object.
26. An apparatus, comprising:
one or more processors; and
one or more machine-readable media having instructions stored thereon that, when executed by the one or more processors, cause the apparatus to perform the method recited by one or more of claims 1-12.
27. One or more machine-readable media having instructions stored thereon, which when executed by one or more processors, cause an apparatus to perform the method recited by one or more of claims 1-12.
CN201911230513.0A 2019-12-05 2019-12-05 Target object matching method, system, device and machine readable medium Pending CN111080589A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911230513.0A CN111080589A (en) 2019-12-05 2019-12-05 Target object matching method, system, device and machine readable medium

Publications (1)

Publication Number Publication Date
CN111080589A true CN111080589A (en) 2020-04-28

Family

ID=70312879

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814885A (en) * 2020-07-10 2020-10-23 云从科技集团股份有限公司 Method, system, device and medium for managing image frames
CN114827730A (en) * 2022-04-19 2022-07-29 咪咕文化科技有限公司 Video cover selecting method, device, equipment and storage medium
CN114827730B (en) * 2022-04-19 2024-05-31 咪咕文化科技有限公司 Video cover selection method, device, equipment and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101271520A (en) * 2008-04-01 2008-09-24 北京中星微电子有限公司 Method and device for confirming characteristic point position in image
EP2128818A1 (en) * 2007-01-25 2009-12-02 Shanghai Yaowei Industry Co, Ltd. Method of moving target tracking and number accounting
CN102026013A (en) * 2010-12-18 2011-04-20 浙江大学 Stereo video matching method based on affine transformation
CN103706106A (en) * 2013-12-30 2014-04-09 南京大学 Self-adaption continuous motion training method based on Kinect
CN104917934A (en) * 2015-06-01 2015-09-16 天津航天中为数据系统科技有限公司 Method for eliminating mismatching points, video compensation method and device
CN106295616A (en) * 2016-08-24 2017-01-04 张斌 Exercise data analyses and comparison method and device
CN106406518A (en) * 2016-08-26 2017-02-15 清华大学 Gesture control device and gesture recognition method
CN107563320A (en) * 2017-08-24 2018-01-09 中南大学 Human body sitting posture bearing method of testing and its system based on spatial positional information
US10222178B1 (en) * 2011-04-13 2019-03-05 Litel Instruments Precision geographic location system and method utilizing an image product
CN110414453A (en) * 2019-07-31 2019-11-05 电子科技大学成都学院 Human body action state monitoring method under a kind of multiple perspective based on machine vision

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAN Chan: "Human Body Action Recognition Method Based on Angle Sequence Features", Sci-tech Innovation and Productivity *

Similar Documents

Publication Publication Date Title
CN110163048B (en) Hand key point recognition model training method, hand key point recognition method and hand key point recognition equipment
US10043308B2 (en) Image processing method and apparatus for three-dimensional reconstruction
CN108492363B (en) Augmented reality-based combination method and device, storage medium and electronic equipment
CN110059661A (en) Action identification method, man-machine interaction method, device and storage medium
CN105491365A (en) Image processing method, device and system based on mobile terminal
CN108830892B (en) Face image processing method and device, electronic equipment and computer readable storage medium
WO2019218880A1 (en) Interaction recognition method and apparatus, storage medium, and terminal device
WO2020029554A1 (en) Augmented reality multi-plane model animation interaction method and device, apparatus, and storage medium
CN110072046B (en) Image synthesis method and device
CN108646920A (en) Identify exchange method, device, storage medium and terminal device
CN111062276A (en) Human body posture recommendation method and device based on human-computer interaction, machine readable medium and equipment
CN111062981A (en) Image processing method, device and storage medium
CN111340848A (en) Object tracking method, system, device and medium for target area
CN111310725A (en) Object identification method, system, machine readable medium and device
CN111199169A (en) Image processing method and device
CN108683845A (en) Image processing method, device, storage medium and mobile terminal
CN112101252B (en) Image processing method, system, device and medium based on deep learning
WO2020001016A1 (en) Moving image generation method and apparatus, and electronic device and computer-readable storage medium
CN109785444A (en) Recognition methods, device and the mobile terminal of real plane in image
CN111080589A (en) Target object matching method, system, device and machine readable medium
CN116452745A (en) Hand modeling, hand model processing method, device and medium
CN114299615A (en) Key point-based multi-feature fusion action identification method, device, medium and equipment
CN112613490B (en) Behavior recognition method and device, machine readable medium and equipment
CN113610864B (en) Image processing method, device, electronic equipment and computer readable storage medium
CN111258413A (en) Control method and device of virtual object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 511458 room 1009, No.26, Jinlong Road, Nansha District, Guangzhou City, Guangdong Province (only for office use)

Applicant after: Guangzhou Yuncong Dingwang Technology Co., Ltd

Applicant after: Yuncong Technology Group Co.,Ltd.

Address before: 511458 room 1009, No.26, Jinlong Road, Nansha District, Guangzhou City, Guangdong Province (only for office use)

Applicant before: Guangzhou Jize Technology Co.,Ltd.

Applicant before: Yuncong Technology Group Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20200428