Detailed Description
Hereinafter, exemplary embodiments of the present application will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and that the present application is not limited by the example embodiments described herein.
As described above, the existing age identification scheme has the following problems:
1. it depends entirely on face recognition technology, and complete face image data are difficult to acquire in many application scenes;
2. face recognition algorithms are unlikely to significantly improve the accuracy of age recognition in the short term;
3. collecting a large number of videos covering various scenes and manually judging the ages of the people entering and leaving involves a low degree of automation and a heavy workload, and the disadvantages associated with manual processing cannot be overcome.
In view of the above-mentioned shortcomings in the prior art, the basic idea of the present application is to use a more novel algorithm to achieve intelligent age identification. For example, when a face recognition algorithm cannot be used, the age of a human body can be recognized by a motion characteristic of the human body, such as a swing parameter of the head of the human body. The movement behavior of the human body of different ages has its typical characteristics, and the head is the part of the human body that can most conveniently capture the movement characteristics of the human body. In addition, in some embodiments of the present invention, some auxiliary means are also used to implement age identification, such as human body movement speed, crowd information, and the like, so as to improve the success rate of age identification. In some embodiments, identification result stability judgment, weighting average based on application scenes and other means are also adopted to improve the accuracy of age identification. By adopting the scheme of the invention, compared with the traditional age identification mode which singly depends on face identification, the success rate and the accuracy of age identification are greatly improved.
It should be noted that the basic concept of the present application can be applied not only to commercial environments such as shopping malls and shopping centers, but also to other scenes, such as people stream age identification in scenes such as communities, parks and intersections.
Having described the general principles of the present application, various non-limiting embodiments of the present application will now be described with reference to the accompanying drawings.
Fig. 1 illustrates a flowchart of an age identification method according to an exemplary embodiment of the present application. As shown in fig. 1, the age identification method 100 may begin with step S110 of identifying a head of a human body in a current frame image. The current frame image may be any frame image in a people-flow video captured by, for example, a network camera (IPC), and the human head in the current frame image may be identified by various existing or future-developed algorithms, such as, but not limited to, a neural network-based image recognition algorithm. In step S110, the head of each human body in the image, whether seen from the front, the side, or the back, may be recognized regardless of whether a complete face is visible. That is, in the method 100, age recognition may be performed using only the human head, without relying on face recognition, as described below.
With continued reference to fig. 1, in step S112, a swing parameter of the human head is determined based on the current frame image and a predetermined number of previous frame images preceding the current frame image. Here, the swing parameter may include at least one of a swing amplitude and a swing frequency; that is, it may include only the swing amplitude, only the swing frequency, or both. The swing parameter may be determined by tracking the motion of the head across a plurality of frame images. Fig. 2 shows a schematic diagram of a process of determining the swing parameter. As shown in fig. 2, in the current frame image and several previous frame images prior to the current frame image (collectively referred to as images 10), a human head is identified, as shown in block 11, and a center point position 12 of the human head may be determined. By tracking the movement of the human head, the movement trajectory of the center point 12 of the human head can be determined, and the swing amplitude and swing frequency of the human head can be determined based on the movement trajectory.
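The determination of the swing parameter from the tracked center points can be illustrated with a minimal Python sketch. The function name `swing_parameters`, the choice of measuring sway along the image x axis, the linear detrending, and the zero-crossing frequency estimate are all assumptions made for illustration; the present application does not prescribe a particular computation.

```python
from typing import List, Tuple


def swing_parameters(centers: List[Tuple[float, float]], fps: float) -> Tuple[float, float]:
    """Estimate (amplitude, frequency) of head sway from per-frame center points.

    centers: head-center (x, y) positions in pixels, one per frame, oldest first.
    fps:     frame rate of the video in frames per second.
    Assumption: sway is measured along the image x axis only.
    """
    if len(centers) < 3:
        raise ValueError("need at least three frames")
    xs = [x for x, _ in centers]
    n = len(xs)

    # Fit and remove a straight line (the overall walking motion) by least
    # squares, so that only the oscillation around it remains.
    mean_t = (n - 1) / 2
    mean_x = sum(xs) / n
    denom = sum((t - mean_t) ** 2 for t in range(n))
    slope = sum((t - mean_t) * (x - mean_x) for t, x in enumerate(xs)) / denom
    residual = [x - (mean_x + slope * (t - mean_t)) for t, x in enumerate(xs)]

    # Amplitude: half of the peak-to-peak excursion, in pixels.
    amplitude = (max(residual) - min(residual)) / 2

    # Frequency: each full oscillation crosses zero twice.
    signs = [r > 0 for r in residual if r != 0]
    crossings = sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    duration = (n - 1) / fps
    frequency = crossings / (2 * duration)
    return amplitude, frequency
```

The zero-crossing estimate is coarse for short windows, so in practice the predetermined number of frames would need to span several oscillations.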
Referring back to fig. 1, in step S114, the age of the human body may be determined based on the swing parameter. For convenience of description, the age of the human body determined based on the swing parameter will be referred to as a first age hereinafter. Statistics over a large number of samples show that the swing parameters of a human body during movement, typically those of the head, are correlated with the age of the human body. Generally, the older a person is, the lower the head swing frequency; the younger the person, the higher the head swing frequency. On the other hand, middle-aged people move with a more stable posture and a lower head swing amplitude, and the further the age departs from middle age, whether older or younger, the larger the head swing amplitude becomes. Based on these characteristics, the age of the person may be determined using swing parameters such as one or more of swing amplitude and swing frequency.
Step S114 may be implemented in a variety of ways. For example, in one example, a lookup table associating the swing parameter with the age of the human body may be established in advance based on empirical data, and the age of the human body may be determined by directly referring to the lookup table. The implementation mode is simple and easy to implement, and the related calculation amount is small, so that the hardware resource is saved. In another example, a neural network model may be trained in advance, and the neural network model directly takes the movement track of the human head as input and takes the age as output. Thus, the trajectory 13 shown in fig. 2 can be directly input to the neural network model to determine the age of the human body. Here, the head swing parameters are embodied in the trajectory 13, and therefore the neural network model also substantially determines the age of the human body based on the swing parameters. This implementation requires a large amount of calculation, but can further improve the accuracy of age prediction.
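The lookup-table implementation of step S114 may be sketched as follows. The bin boundaries and age-group labels are placeholder values chosen only to reflect the stated trend (a higher swing frequency indicating a younger person); they are not empirical data, and the names `FREQ_BOUNDS_HZ`, `AGE_GROUPS`, and `first_age_from_frequency` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical lookup table (placeholder values, not measured data):
# upper bounds of the swing-frequency bins in Hz, and one age group per bin.
FREQ_BOUNDS_HZ = [0.8, 1.2, 1.6]
AGE_GROUPS = ["elderly", "middle-aged", "young adult", "child"]


def first_age_from_frequency(freq_hz: float) -> str:
    """Return the age group whose frequency bin contains freq_hz."""
    # bisect_right finds the index of the first bound greater than freq_hz,
    # which is also the index of the matching bin.
    return AGE_GROUPS[bisect_right(FREQ_BOUNDS_HZ, freq_hz)]
```

A table keyed on both swing amplitude and swing frequency could be built the same way with a two-dimensional grid of bins.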
In the above step, the first age of the human body is determined based on a swing parameter of the head of the human body, such as a swing amplitude and/or a swing frequency. The human head is a human feature that is easier to capture by a camera than, for example, a human face, and particularly when, for example, the human traffic is large, the human face is likely to be blocked by other people or cannot be captured by the camera due to the human traveling direction, and the human head feature is easy to be accurately captured, for example, by installing the camera at a higher position, the head feature of most of human bodies in a dense stream of people can be captured, and then head swing parameters, such as a swing amplitude and/or a swing frequency, can be extracted based on the movement of the head feature. As described above, the swing amplitude and the swing frequency are two parameters closely related to the age of the human body because the human body has different motion characteristics at different ages, which are reflected on the swing parameters of the head of the human body, such as amplitude and frequency. Therefore, by determining the first age of the human body based on the swing amplitude and/or the swing frequency, the success rate of age identification can be improved.
As described above, in the case where a complete face can be captured, a human face can also be used for human age recognition and has high accuracy. Therefore, in some embodiments, a face in the current frame image may also be identified, and the age of the human body may be determined based on the identified face. For convenience of description, the age of the human body determined based on the face recognition will be referred to as a second age hereinafter. It will be appreciated that the face-based age identification process may be performed before, after, or at least partially overlapping steps S110-S114 of fig. 1, and the invention is not limited to the order in which they are performed. Also, the present invention does not limit the order of execution of many of the other steps described in detail below, which steps can be performed sequentially in a different order or can be performed in parallel unless the relative order of execution is determined by context.
In some embodiments of the present invention, the age of the human body may also be identified through a variety of auxiliary methods, which will be described in detail below with reference to fig. 3. As shown in fig. 3, in step S116, the number of persons in the current frame image may be determined, which may be implemented by counting the number of recognized heads, for example, using the human head recognition result of step S110. Next, in step S118, it is determined whether the number of people exceeds a first threshold. If the first threshold is not exceeded, indicating that the flow of people is small, the assisted age identification may be performed using, for example, speed information and group information described below; on the other hand, if the first threshold value is exceeded, it is interpreted that the flow rate of people is large, and at this time, the movement of each person is greatly restricted by the surrounding crowd, and it is difficult to perform the auxiliary age recognition based on the moving speed of the individual or the group information, and it is necessary to consider other age recognition means. By subdividing the scenes based on the flow of people and adopting different age identification schemes aiming at different flow of people, the accuracy of age identification can be further improved, and the universality of the scheme of the invention aiming at different scenes is improved.
With continued reference to fig. 3, in response to the number of people being less than or equal to the first threshold, the moving speed of the human body may be determined based on the current frame image and several previous frame images in step S120, and then the age of the human body is determined based on the moving speed of the human body in step S122. For convenience of description, the age of the human body determined based on the moving speed of the human body will be referred to as a third age hereinafter. Generally, the older the person, the slower the moving speed; the younger the person, the faster the moving speed. As with the swing parameter, a large number of samples may be collected in advance to establish a correlation between moving speed and age, and other factors such as the density of the people flow may be considered at the same time. Thus, in step S122, the age of the human body can be determined based on the moving speed of the human body alone, or in combination with other auxiliary information such as the density of the people flow. In some embodiments, when determining the age of the human body based on the moving speed of the human body in step S122, the relative speed may also be considered. For example, the moving speed of the current human body may be compared with the moving speeds of the surrounding human bodies to determine a speed difference, and the age of the human body may then be determined in consideration of both the moving speed of the human body and the speed difference with respect to the surrounding human bodies. Generally, the younger a person, the higher the relative speed. Speed and acceleration are important characterizing parameters of human body movement and are closely related to age. By determining the age of the human body based on the speed and/or acceleration, the success rate and accuracy of age identification can be further improved.
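The moving speed of step S120 and the relative speed mentioned above can be computed from tracked positions, as in the following minimal sketch. The function names `moving_speed` and `relative_speed`, the use of pixel units, and the use of the neighbours' mean speed as the reference are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple


def moving_speed(track: List[Tuple[float, float]], fps: float) -> float:
    """Average speed in pixels per second along a per-frame position track."""
    if len(track) < 2:
        raise ValueError("need at least two positions")
    # Sum the frame-to-frame displacements, then divide by the elapsed time.
    distance = sum(math.dist(a, b) for a, b in zip(track, track[1:]))
    return distance * fps / (len(track) - 1)


def relative_speed(own_speed: float, neighbour_speeds: Sequence[float]) -> float:
    """Speed difference with respect to the mean speed of surrounding people."""
    if not neighbour_speeds:
        return 0.0  # nobody nearby: no relative information
    return own_speed - sum(neighbour_speeds) / len(neighbour_speeds)
```

Mapping these values to a third age would again rely on a correlation established from samples, for example a lookup table or a trained model.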
In some embodiments, in response to determining that the number of people is less than or equal to the first threshold at step S118, optionally, the group in which the human body is located may also be determined based on the current frame image and the previous frame images at step S124. There are various ways to determine that multiple persons belong to the same group of companions. For example, persons who are close to one another and have approximately the same movement trajectory (direction) may be identified as a group, several persons talking to one another may be identified as a group, and so on. After the group is identified, in step S126, the age of the human body may be determined based on member information within the group. For convenience of description, the age of the human body determined based on the group member information will be referred to as a fourth age hereinafter. For example, if a plurality of members within a group are close in age, it may be determined that the current person is also close in age to the other members. For another example, if there are two middle-aged people and one child in the group, it may be determined that the current person is an elderly person who belongs to the same family. Unlike the age recognition described above, which is based on information about the human body itself, group-based recognition uses information about the other members of the group, and can therefore provide additional information that the human body itself cannot provide, further improving the accuracy of age recognition.
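One possible way to identify groups in step S124, namely linking persons who are close together and moving in approximately the same direction, is sketched below. The function name `find_groups`, the thresholds `max_dist` and `max_angle_deg`, and the union-find merging are assumptions for illustration only.

```python
import math
from typing import Dict, Hashable, List, Set, Tuple

Vec = Tuple[float, float]


def find_groups(people: Dict[Hashable, Tuple[Vec, Vec]],
                max_dist: float, max_angle_deg: float) -> List[Set[Hashable]]:
    """Group people who are near each other and heading the same way.

    people: maps a person id to (position, velocity), both 2-D vectors.
    """
    ids = list(people)
    parent = {i: i for i in ids}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    min_cos = math.cos(math.radians(max_angle_deg))
    for k, a in enumerate(ids):
        for b in ids[k + 1:]:
            (pa, va), (pb, vb) = people[a], people[b]
            na, nb = math.hypot(*va), math.hypot(*vb)
            if na == 0 or nb == 0:
                continue  # a stationary person has no direction to compare
            cos = (va[0] * vb[0] + va[1] * vb[1]) / (na * nb)
            if math.dist(pa, pb) <= max_dist and cos >= min_cos:
                parent[find(a)] = find(b)  # merge the two groups

    groups: Dict[Hashable, Set[Hashable]] = {}
    for i in ids:
        groups.setdefault(find(i), set()).add(i)
    return list(groups.values())
```

Because links are merged transitively, a chain of pairwise-close persons ends up in one group even if its two ends are farther apart than `max_dist`.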
The above describes identifying the age of the human body by various means, which determine the first, second, third and fourth ages, respectively. It is understood that, when the age recognition is implemented by a video image, the recognition result may fluctuate due to the influence of factors such as human body movement and ambient light. For example, for a video image including a plurality of frame images, a plurality of age values of the first, second, third, or fourth age may be determined using the aforementioned method, and the age values may be different from each other. At this time, the stability of the recognition result is a consideration for determining whether the recognition is accurate. Fig. 4 shows a flow chart of a method for determining the accuracy of a recognition result based on stability. As shown in fig. 4, for a plurality of frame images, a plurality of age values of the respective first, second, third or fourth age may be determined using any one of the aforementioned recognition algorithms in step S142, and then in step S144, it may be determined whether the stability of the plurality of age values reaches a second threshold, i.e., whether the determined age values are stable. For example, the ratio of the most frequently occurring of these age values may be determined. For example, if the ratio thereof reaches a second threshold value, e.g., 66.7%, 70%, 75%, 80%, etc., the success of the recognition may be confirmed in step S146, and the stable value is taken as the recognized age of the human body. On the other hand, if the recognition result stability does not reach the preset threshold, for example, the determined age values are evenly distributed in a larger range, the recognition failure may be confirmed in step S148. In this way, the recognition result with lower accuracy can be eliminated, and only the recognition result with higher accuracy is included, thereby ensuring the reliability of the recognition result.
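The stability judgment of steps S144 to S148 can be expressed compactly: the share of the most frequent age value is compared with the second threshold. In the sketch below, the function name `stable_age` and the use of `None` to signal recognition failure are assumptions; the default of 0.7 is one of the example threshold values given above.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def stable_age(age_values: Sequence[Hashable], second_threshold: float = 0.7) -> Optional[Hashable]:
    """Return the dominant age value if it is stable enough, else None.

    age_values: the ages recognized for the same person over several frames
                (exact values or age-group labels).
    """
    if not age_values:
        return None  # nothing was recognized at all
    value, count = Counter(age_values).most_common(1)[0]
    ratio = count / len(age_values)
    # Recognition succeeds only if the most frequent value reaches the threshold.
    return value if ratio >= second_threshold else None
```

For example, four results of "20-30" out of five frames give a ratio of 0.8, which passes a threshold of 0.7, whereas four different results over four frames give a ratio of 0.25 and are treated as a failure.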
Referring back to fig. 3, after step S126, or when it is determined in step S118 that the number of persons exceeds the first threshold, the method may proceed to step S128 to determine whether the age identification is successful. When proceeding from step S118 to S128, it is only necessary to determine whether the first age and the second age were successfully identified; when proceeding from step S126 to S128, it is determined whether the first, second, third and fourth ages were successfully identified. The criterion for successful recognition is that a stable result is obtained, as described with reference to fig. 4. Age identification is considered to have failed if the identification result is unstable, or if no result is identified at all.
In step S128, as long as it is determined that at least one of the first to fourth ages is successfully identified, the age of the human body is considered to be successfully identified, and the method may proceed to step S130 to compute a weighted average of the successfully identified ages according to the application scenario to determine the final age of the human body. By weighted averaging of the successfully identified ages, the contributions of various factors to the age identification result can be comprehensively considered, so that the error of age identification is reduced and the overall accuracy of the identification model is improved. In the weighted average in step S130, not only one or more of the successfully identified first to fourth ages but also a base age determined based on the application scenario may be considered. For example, for a small accessories shop whose customers are mostly students, the base age is low, for example, 12-16 years; for a shopping mall whose customers are mostly young adults, the base age may be, for example, 22-30 years. The final age can be determined according to the following formula:
final age = w1 × base age + w2 × first age + w3 × second age + w4 × third age + w5 × fourth age.
Here, w1 to w5 are the weight parameters corresponding to the respective ages. When a certain age is not successfully identified, the parameters w1 to w5 may be adjusted accordingly to avoid the effect of the unsuccessfully identified age on the final age. Then, the process may proceed to step S140 to end the processing for the current frame and continue processing for the next frame.
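The weighted average of step S130 may be sketched as follows. The application states only that the weights are adjusted when an age is not successfully identified; dropping the failed terms and renormalizing the remaining weights so that they sum to one is an assumption of this sketch, as is the function name `final_age`.

```python
from typing import Optional, Sequence


def final_age(base_age: float,
              ages: Sequence[Optional[float]],
              weights: Sequence[float]) -> float:
    """Weighted average of the base age and the successfully identified ages.

    ages:    the first to fourth ages, with None for each failed identification.
    weights: w1..w5, where w1 belongs to the base age.
    """
    if len(ages) != 4 or len(weights) != 5:
        raise ValueError("expected four ages and five weights")
    terms = [(weights[0], base_age)]
    # Keep only the ages that were successfully identified.
    terms += [(w, a) for w, a in zip(weights[1:], ages) if a is not None]
    total = sum(w for w, _ in terms)
    if total <= 0:
        raise ValueError("remaining weights must sum to a positive value")
    # Renormalize so the remaining weights sum to one (assumed adjustment).
    return sum(w * a for w, a in terms) / total
```

With a base age of 25, a first age of 30, a third age of 20, and weights (0.2, 0.4, 0.2, 0.2, 0.0), the second and fourth ages are skipped and the result is (0.2 × 25 + 0.4 × 30 + 0.2 × 20) / 0.8 = 26.25.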
If it is confirmed in step S128 that none of the first to fourth ages is successfully identified, the age identification for the individual is considered to have failed. In some embodiments, the individual may be placed in a recognition failure group, and the number of people in the recognition failure group may be counted, as shown in step S132. In step S134, it may be determined whether the number of people in the recognition failure group reaches a third threshold. If the third threshold has not been reached, the process proceeds to step S140. If the third threshold is reached, i.e., when a certain number of persons whose recognition failed have accumulated, the age of each person in the recognition failure group may be determined using a predetermined age distribution in step S136. For example, the predetermined age distribution may be a reference age distribution of the scene where age identification is performed, such as an age distribution of registered members of a mall, an age distribution of community residents, or an age distribution previously identified in a similar scene. After the age of the human body is determined using the predetermined age distribution, the process may proceed to step S140 to continue with the next frame.
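Steps S134 and S136 may be illustrated by the sketch below. The application does not specify how the predetermined age distribution is applied to individuals; drawing each person's age group at random in proportion to the distribution is an assumption made here, and the names `assign_from_distribution` and `third_threshold` are hypothetical.

```python
import random
from typing import Dict, Hashable, Optional, Sequence


def assign_from_distribution(failed_ids: Sequence[Hashable],
                             age_distribution: Dict[str, float],
                             third_threshold: int,
                             rng: Optional[random.Random] = None) -> Dict[Hashable, str]:
    """Assign an age group to each person in the recognition failure group.

    age_distribution: maps an age group to its share, e.g. taken from the
                      registered members of a mall (relative weights suffice).
    Returns an empty dict while fewer than third_threshold people have accumulated.
    """
    if len(failed_ids) < third_threshold:
        return {}  # keep accumulating (step S134, threshold not reached)
    rng = rng or random.Random()
    groups = list(age_distribution)
    shares = [age_distribution[g] for g in groups]
    # Assumption: sample one age group per person in proportion to the shares.
    picks = rng.choices(groups, weights=shares, k=len(failed_ids))
    return dict(zip(failed_ids, picks))
```

Waiting until enough failures have accumulated means the assigned ages, taken together, can approximate the reference distribution more closely than assigning one person at a time.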
Fig. 5 illustrates a functional block diagram of an apparatus 200 for recognizing an age of a human body according to an exemplary embodiment of the present application. As shown in fig. 5, the age identifying apparatus 200 may include a human head identifying unit 202, a swing parameter determining unit 204, and a first age identifying unit 206. The human head recognition unit 202 may be configured to recognize a head of a human body in the current frame image, and the swing parameter determination unit 204 may be configured to determine swing parameters, such as a swing amplitude and a swing frequency, of the recognized head of the human body based on the current frame image and a preset number of previous frame images before the current frame image. The first age identifying unit 206 may then be configured to identify a first age of the person based on the determined swing parameter.
In some examples, optionally, the apparatus 200 for identifying the age of a human body may further include a face recognition unit 208 for recognizing the face of the human body in the current frame image and a second age identifying unit 210 for determining the second age of the human body based on the recognized face.
In some examples, optionally, the apparatus 200 for identifying the age of a human body may further include a counting unit 212 for determining the number of people in the current frame image; a speed determining unit 214 for determining a moving speed of the human body from the current frame image and the previous frame images in response to the number of people being less than or equal to a first threshold; and a third age identifying unit 216 for determining a third age of the human body based on the moving speed of the human body. In some embodiments, the third age identifying unit 216 may also determine the third age of the human body based on both the moving speed of the human body and its relative moving speed with respect to the surrounding human bodies. In some examples, optionally, the apparatus 200 for identifying the age of a human body may further include a group identification unit 218 for determining a group to which the human body belongs based on the current frame image and the previous frame images in response to the number of people being less than or equal to the first threshold; and a fourth age identifying unit 220 for determining a fourth age of the human body based on member information within the group.
It is to be understood that the first to fourth age identifying units 206, 210, 216 and 220 may also determine the stability of their identification results, as described above with reference to fig. 1 to 4, thereby determining whether successful identification is achieved.
With continued reference to fig. 5, in some examples, the apparatus 200 for identifying an age of a human body may optionally further include a weighted average unit 222 that may weight-average the successfully identified first, second, third and fourth ages to obtain a final age of the human body. In the weighted averaging, the human basal age determined by the application scenario, i.e. the average age, may also be taken into account.
In some examples, optionally, the apparatus 200 for identifying the age of a human body may further include a recognition failure statistics unit 224, which may count the people in the recognition failure group, i.e., those whose ages were not successfully identified; and a predetermined age assigning unit 226 for determining a fifth age of each human body in the recognition failure group using the predetermined age distribution in response to the number of people in the recognition failure group reaching a third threshold.
The detailed functions and operations of the respective units and modules in the age identifying apparatus 200 described above have been described in detail in the age identifying method described above with reference to fig. 1 to 4, and thus are only briefly described herein, and repeated detailed descriptions thereof are omitted. The age identifying apparatus 200 according to the embodiment of the present application may be implemented in an image processing electronic device, for example, may be integrated into the image processing electronic device as a software module and/or a hardware module.
While the method and apparatus for identifying the age of a human body according to some embodiments of the present application have been described above, it should be understood that many variations in form and detail are possible with respect to the above-described embodiments. For example, in the above method and apparatus, the determined age may be an accurate age value, an estimated age group, or a probability distribution of ages. Furthermore, in the above method and apparatus, age identification may also be performed in combination with other auxiliary information, such as height, gender, etc., all of which are easily contemplated and implemented by those skilled in the art under the teachings of the present application and, therefore, should be considered to fall within the scope of the present invention as defined by the appended claims.
Fig. 6 shows a block diagram of an exemplary electronic device 300 in which the age identifying apparatus 200 can be implemented. As shown in fig. 6, the electronic device 300 may include a processor 310, a camera 320, and a memory 330, which are connected to each other by a bus 360.
The processor 310 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device to perform desired functions.
The camera 320 may acquire an image of an identified scene, which includes one or more human bodies to be identified.
The memory 330 may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory, and the like. The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, and the like. The memory 330 may have stored therein computer program instructions that the processor 310 may execute to implement the age identification methods of the embodiments of the present application described above and/or other desired functions.
In some examples, the electronic device 300 may further include an input unit 340 and an output unit 350. The input unit 340 and the output unit 350 may perform various input and output functions, for example, the input unit 340 may accept registered member profile data of a commercial site, cell resident profile, or previously recognized result in a similar scene, etc., and the output unit 350 may output an age recognition result, etc. The processor 310 is connected to the various modules or units through a bus 360 to control their operation.
In addition to the above-described methods and apparatus, embodiments of the present application may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps in the age identification method according to an embodiment of the present application described above.
The program code of the computer program product for performing the operations of embodiments of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions, which, when executed by a processor, cause the processor to perform the steps in the age identification method according to an embodiment of the present application described above.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present application in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present application are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the foregoing disclosure is not intended to be exhaustive or to limit the disclosure to the precise details disclosed.
The block diagrams of devices, apparatuses, and systems referred to in this application are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the word "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.