CN110929628A

CN110929628A - Human body identification method and device

Info

Publication number: CN110929628A
Application number: CN201911127867.2A
Authority: CN
Inventors: 王博
Original assignee: Beijing Sankuai Online Technology Co Ltd
Current assignee: Beijing Sankuai Online Technology Co Ltd
Priority date: 2019-11-18
Filing date: 2019-11-18
Publication date: 2020-03-27

Abstract

The disclosure discloses a human body identification method and device, and belongs to the field of computers. The method comprises the following steps: acquiring a target human body image; inputting the target human body image into a component segmentation module in a human body recognition model to obtain a characteristic image of each human body component included in the target human body image; determining a segmented characteristic image based on the characteristic image of the human body part; and carrying out human body identification processing based on the segmented characteristic image and an identification module in the human body identification model. By adopting the human body identification method provided by the disclosure, the accuracy of character identification can be improved.

Description

Human body identification method and device

Technical Field

The present disclosure relates to the field of computers, and in particular, to a method and an apparatus for human body recognition.

Background

In a surveillance video, because the resolution of the surveillance device is low and the shooting angle is high, a face picture with very high quality cannot be obtained generally, and at this time, a human body of a pedestrian needs to be used for identification, and the identification can be to find the pedestrian in a video frame according to an image of the known pedestrian, or to find the pedestrian with the same characteristics in the video.

In the prior art, a target human body image is segmented, the segmented target human body image is input into a feature extraction model respectively, feature images corresponding to the segmented target human body image are obtained respectively, and a specific pedestrian is identified according to the feature images corresponding to the segmented target human body image respectively.

In implementing the present disclosure, the inventors found that the related art has at least the following problems:

when part of the body of the target pedestrian is exposed on the target human body image, for example, the lower body of the pedestrian is blocked by a building, and since such a target human body image has a large amount of background information that is not useful for identifying the target pedestrian, a large amount of extracted feature information cannot be used for identifying the pedestrian, and the accuracy of person identification is low.

Disclosure of Invention

In order to solve the technical problems in the related art, the present embodiment provides a method and an apparatus for human body recognition. The technical scheme of the human body identification method and the device is as follows:

in a first aspect, a method for human body recognition is provided, the method including:

acquiring a target human body image;

inputting the human body image into a part segmentation module in a human body recognition model to obtain a characteristic image of each human body part included in the target human body image;

determining a segmented characteristic image based on the characteristic image of the human body part;

and carrying out human body identification processing based on the segmented characteristic image and an identification module in the human body identification model.

Optionally, the step of inputting the target human body image into a component segmentation module in a human body recognition model to obtain a feature image of each human body component included in the target human body image includes:

inputting the target human body image into a feature extraction submodule in the component segmentation module to obtain a human body feature image;

inputting the human body feature image into a component segmentation submodule in the component segmentation module to obtain an initial feature image of each human body component included in the target human body image;

and superposing the initial characteristic images of the human body parts with the area images at the corresponding positions in the human body characteristic images respectively to obtain the characteristic images of the human body parts.

Optionally, the determining a segmented feature image based on the feature image of the human body component includes:

amplifying the characteristic image of the human body part based on a pixel filling mode to obtain at least one amplified characteristic image;

and carrying out segmentation processing on the amplified characteristic image to obtain a feature image after segmentation processing.

Optionally, the segmenting the enlarged feature image to obtain a segmented feature image includes:

and inputting the amplified characteristic image into a self-attention screening module to obtain a processed characteristic image, and performing segmentation processing on the processed characteristic image to obtain a segmented characteristic image.

Optionally, the amplifying the characteristic image of the human body component based on the pixel filling manner to obtain at least one amplified characteristic image includes:

and respectively inputting the characteristic images of the human body parts into a plurality of cavity convolution modules with different cavity rates to obtain a plurality of amplified characteristic images.

Optionally, the processing of human body recognition based on the segmented feature image and the recognition module in the human body recognition model includes:

and carrying out human body identification processing based on the segmented characteristic image, the characteristic image of the human body component and an identification module in the human body identification model.

Optionally, the human body recognition processing based on the segmented feature image, the feature image of the human body component, and the recognition module in the human body recognition model includes:

inputting the segmented characteristic image and the characteristic image of the human body component into a characteristic extraction module in the human body recognition model respectively to obtain corresponding characteristic vectors;

merging all the feature vectors to obtain merged feature vectors;

and carrying out human body identification processing based on the combined feature vector and an identification module in the human body identification model.

Optionally, the performing human body recognition processing based on the merged feature vector and the recognition module in the human body recognition model includes:

and inputting the merged characteristic vector and the reference merged characteristic vector into an identification module in the human body identification model to obtain a human body identification result.

In a second aspect, there is provided an apparatus for human body recognition, the apparatus comprising:

an acquisition module configured to acquire a target human body image;

the part segmentation module is configured to input the human body image into a part segmentation module in a human body recognition model, and characteristic images of human body parts included in the target human body image are obtained;

a segmentation module configured to determine a segmented feature image based on the feature image of the human body part;

and the recognition module is configured to perform human body recognition processing based on the segmented characteristic image and the recognition module in the human body recognition model.

Optionally, the dividing module is configured to:

Optionally, the feature image of the human body component is amplified based on a pixel filling manner to obtain at least one amplified feature image, and the at least one amplified feature image is configured to:

Optionally, the recognition module in the human body recognition model and based on the segmented feature image performs human body recognition processing, and is configured to:

Optionally, the human body recognition processing is performed based on the segmented feature image, the feature image of the human body component, and a recognition module in the human body recognition model, and is configured to:

merging all the feature vectors to obtain merged feature vectors;

Optionally, the identification module is configured to:

In a third aspect, a terminal is provided, where the terminal includes a processor and a memory, and the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the method for human body recognition according to the first aspect.

In a fourth aspect, a computer-readable storage medium is provided, in which at least one instruction is stored, the at least one instruction being loaded and executed by a processor to implement the method for human body recognition according to the first aspect.

The beneficial effects brought by the technical scheme provided by the embodiment of the disclosure at least comprise:

according to the method provided by the embodiment of the disclosure, the target human body image is obtained, the human body image is input into the component segmentation module in the human body recognition model, the characteristic image of each human body component included in the target human body image is obtained, the characteristic image of the human body component is segmented, the segmented characteristic image is input into the recognition module in the human body recognition model, and human body recognition processing is carried out. The method provided by the embodiment of the disclosure provides a human body identification method, which is used for carrying out component segmentation and further segmentation on a human body image, extracting the characteristics of targets with various scales in a target human body image and further carrying out human body identification processing. Therefore, the human body identification method can improve the accuracy of human identification.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.

Fig. 1 is a structural diagram of a human body recognition model in human body recognition according to an embodiment of the present disclosure;

fig. 2 is a flowchart of a method for human body recognition provided by an embodiment of the present disclosure;

FIG. 3 is a flow chart of part segmentation in human body recognition according to an embodiment of the present disclosure;

fig. 4 is a schematic structural diagram of a human body recognition device provided by an embodiment of the present disclosure;

FIG. 5 is a schematic structural diagram of another human body recognition apparatus provided in the embodiments of the present disclosure;

fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure;

fig. 7 is a schematic structural diagram of a server according to an embodiment of the present disclosure.

Detailed Description

To make the objects, technical solutions and advantages of the present disclosure more apparent, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings.

The disclosed embodiments provide a method for human body recognition, which may be implemented by a computer device. The computer device may be a terminal used by a technician, and the terminal may be a mobile terminal such as a mobile phone, a tablet computer, a notebook computer, or the like, or a fixed terminal such as a desktop computer. In the embodiments of the present disclosure, the detailed description of the scheme is performed by taking the execution main body as an example, and other cases are similar to the above, which are not described in detail in the embodiments of the present disclosure.

Fig. 1 is an architecture diagram of a human recognition model in a method of human recognition, according to an example embodiment. Referring to fig. 1, the human body recognition model in the human body recognition method provided by the present disclosure is finally established by performing learning training on a large number of sample images to extract feature information in a target human body image, comparing the feature information in the target human body image with reference feature information of the target human body image to obtain a loss value, performing back propagation on the loss value by using a gradient descent method, adjusting parameters in the human body recognition model, and repeating the above learning training. The aim of identifying the human body is achieved by inputting the target human body image into the established human body identification model.

The scheme can be applied to various scenes relevant to human body identification, for example, in the process that a public security organization searches for criminal suspects through the bayonet device, the human body in the image captured by the bayonet device can be accurately extracted by using the scheme, and the comparison and the investigation of relevant personnel are facilitated. In the embodiment of the present disclosure, a human body is taken as an example to perform a detailed description of the scheme, and other situations are similar and will not be described again.

As shown in fig. 2, the processing flow of the method may include the following steps:

step 201, acquiring a target human body image.

The target human body image may be obtained in various ways, for example, in the training process, the target human body image may be obtained in a preset human body library. In practical implementation, the target human body image may be obtained from any video frame including a pedestrian, and is not limited herein.

In implementation, a target human body image is selected by using a detection frame on the acquired pedestrian image, and the target human body image is acquired, wherein the shape of the detection frame is generally rectangular.

In the training process, any one human body image in a preset human body library and an ID corresponding to the human body image are obtained, wherein the preset human body library comprises the human body image of any pedestrian, the ID corresponding to the human body image and a characteristic image corresponding to the human body image. And selecting the human body image of the pedestrian by using the detection frame to obtain a target human body image.

In the implementation, the feature image of the pedestrian in a certain video frame in the video is compared with the feature image of the pedestrian appearing before the video frame to obtain the identification result, and the identification result is whether the pedestrian in the certain video frame in the video is the pedestrian appearing before the video frame. In the training process, the characteristic images output by the human body recognition model and each characteristic image in the human body library are scored, and the recognition result is the scores of the characteristic images in the human body library and the characteristic images output by the human body recognition model.

Step 202, inputting the target human body image into a component segmentation module in the human body recognition model to obtain the characteristic images of the human body components included in the target human body image.

The part segmentation module is a machine training model and can segment the target human body image into feature images of human body parts in the target human body image so as to be used for subsequent operations. The body parts may include arms, legs, and head, etc., and are not limited in this regard.

In implementation, the acquired target human body image is input into a part segmentation module in the human body recognition model, and part segmentation is performed to obtain characteristic images of each human body part after the part segmentation, such as characteristic images corresponding to a human body, a head, an upper body, a lower body and a shoe.

Optionally, feature extraction is performed on a feature extraction submodule in the target human body image input component segmentation module to obtain a human body feature image. Inputting a part segmentation submodule in a human body feature image input part segmentation module to perform part segmentation to obtain initial feature images of all human body parts included in the target human body image; and superposing the initial characteristic images of the human body parts with the area images at the corresponding positions in the human body characteristic images respectively to obtain the characteristic images of the human body parts.

Further, as shown in fig. 3, the target human body image is input into the feature extraction module, and feature extraction is performed to obtain a human body feature image corresponding to the target human body image, where the size of the human body feature image is 7 × 2048, where the first feature image is a feature image with a length of 7 pixels, a width of 7 pixels, and a number of channels of 2048, where the number of channels is related to the number of convolutions, e.g., 1 convolution, and the number of channels is 1. And inputting the human body feature image into a component segmentation submodule in a component segmentation module to perform component segmentation, obtaining 5 initial feature images through loss function iteration, and superposing the 5 initial feature images on the human body feature image to obtain the feature image of each human body component, wherein the size of the feature image is 7 × 10240. Wherein, the characteristic images of each human body part respectively correspond to the human body, the head, the upper half body, the lower half body and the shoes.

It should be noted that, by determining all pixels of any human body part in the target human body image, the segmentation of each human body part in the target human body image is realized. For example, a feature image of the head is acquired by acquiring all pixels of the head position.

In this embodiment, the fuzzy segmentation of the component is performed through a component segmentation submodule in the component segmentation module, and then the initial feature image of each human body component is obtained, and the initial feature image of the human body component can be accurately segmented through the iteration of the loss function, so that the accuracy of the soft segmentation of the component is improved. The method for aligning the initial characteristic images of the human body parts with the area images at the corresponding positions in the human body characteristic images is adopted to divide the human body part areas, and the characteristic refinement comparison after the parts are aligned is fully considered, so that the divided characteristic images of the human body parts comprise fewer background characteristic images, and the division of the human body part areas is more accurate.

In the training process, a large number of human body images containing pedestrians can be used as samples, human body parts in the human body images of the pedestrians are divided based on human understanding, the human body parts of the human body images of the pedestrians after division are subjected to feature extraction, and then reference feature images of the human body parts of the human body images of the pedestrians are obtained.

Further, a human body image containing pedestrians is input into the component segmentation module, the human body image containing the characteristic image of each human body component is output, the characteristic image containing each human body component of the human body image is compared with the reference characteristic image of each human body component of the corresponding human body image, the difference information of the human body image containing the characteristic image of each human body component and the reference characteristic image of each human body component of the corresponding human body image is calculated by using a loss function, the adjustment value of the parameter in the component segmentation module is determined according to the difference information and a preset training algorithm, and then the parameter of the component segmentation module is subjected to numerical adjustment, so that one-time training is completed. Then, the human body images of other pedestrians in the sample training set are obtained, and the process is repeated. Thus, through training of a large number of samples, a final component segmentation module is obtained.

When part of the body of the target pedestrian is exposed on the target human body image, for example, the lower half of the pedestrian is shielded by a building, and the target human body image is subjected to component segmentation by the component segmentation module, so that a large amount of background information which is useless for identifying the target pedestrian in the target human body image can be removed, and the accuracy of identifying the pedestrian is improved.

And step 203, determining the segmented characteristic image based on the characteristic image of the human body part.

The segmentation may be equal-area segmentation of the feature image, or unequal-area segmentation of the feature image. By segmenting the characteristic image of the human body part, the purpose of improving the weight occupied by the small-scale characteristics is achieved, and therefore more characteristics are extracted for human body recognition. For example, when analyzing the whole body feature image, since the weight occupied by the extracted color feature of the upper body is higher and the weight occupied by the extracted logo on the clothes worn by the user is lower before segmentation, the color feature of the upper body image is generally compared in the feature comparison process, and the feature of the logo on the clothes worn by the user is ignored. After segmentation, in the obtained upper body feature image, the weight of the feature of the logo on the clothes worn by the user is higher than that of the feature of the logo on the clothes worn by the user before segmentation, so that the function of emphasizing the feature of the logo on the clothes worn by the user is achieved, and therefore small-scale features are extracted for human body recognition.

In implementation, the feature images corresponding to the obtained human body, head, upper body, lower body and shoes are segmented, so as to obtain the segmented feature images.

Furthermore, the obtained characteristic images corresponding to the human body, the head, the upper body, the lower body and the shoes are respectively divided into one part, two parts and three parts, so that the characteristic images corresponding to the human body, the head, the upper body, the lower body and the shoes are respectively determined.

Optionally, before segmenting the characteristic image of the human body part, the characteristic image of the human body part may be amplified, and the corresponding steps are as follows: the feature image of the human body part is amplified based on a pixel filling mode to obtain at least one amplified feature image, and the amplified feature image is segmented to obtain the segmented feature image.

The characteristic image of the human body part may be enlarged by filling a certain number of pixels in the characteristic image of the human body part in a certain manner. The pixel filling method may input the feature image into the hole convolution module to fill the feature image of the human body part with pixels, or may fill a certain number of pixels at a certain interval in the feature image, for example, 1 pixel at an interval of 2 pixels in each line of pixels in the feature image with a size of 5 × 5 (unit: pixel), or may fill a certain number of pixels at a fixed position in the feature image.

The pixels are obtained by upsampling pooling, which is to sample pixels in the feature image, determine filling pixels according to values of the pixels, and fill the feature image. For example: the pixels of the area where the fill pixel is located may be sampled, the pixel that is the maximum value among all the pixels may be taken as the fill pixel, and the fill pixel may be filled into the feature image. Here, the way of upsampling pooling is not limited.

In implementation, the feature images of the human body parts included in the target human body image are obtained, the feature images of the human body parts are amplified based on a pixel filling mode to obtain at least one amplified feature image, and the amplified feature images are segmented to obtain the segmented feature images.

Optionally, the feature images of the human body part are respectively input into a plurality of void convolution modules with different void ratios, so as to obtain a plurality of amplified feature images.

The hole convolution is a pixel filling mode, and the characteristic images of the human body parts with different scales are obtained by respectively inputting the characteristic images of the human body parts into the hole convolution with different hole rates. For example, one feature image with a size of 7 × 7 is input to the hole convolution with the hole rates of 1, 2, and 3, and feature images with sizes of 9 × 9, 11 × 11, and 13 × 13 are acquired. The larger the void ratio, the larger the feature image after enlargement.

In the implementation, as shown in fig. 4, the feature images of the human body components having the sizes of 7 × 10240 are input to the hole convolutions having the hole rates of 1, 2, and 3, respectively, and the feature images of the human body components having the sizes of 9 × 10240, 11 × 10240, and 13 × 10240, respectively, are acquired. At this time, the feature images of the human body parts of the same size may be regarded as the same branch, and the feature images of the human body parts of 3 branches may be formed together. Respectively carrying out equal division, halving and trisection on the characteristic images of the human body parts of the same branch, respectively obtaining 1 complete characteristic image of the human body part, 2 characteristic images of the halved human body parts and 3 characteristic images of the trisected human body parts by each branch, and then processing the characteristic images after being split by each branch.

Optionally, before segmenting the feature image of the human body component and after amplifying the feature image of the human body component, the amplified feature image may be input into the self-attention screening module to obtain a processed feature image, and the segmentation processing may be performed on the processed feature image to obtain a segmented feature image.

The self-attention feature screening module is used for finely repairing edge information of the amplified human body and the amplified background, inhibiting noise and highlighting a pedestrian result, so that a more accurate feature image of the human body part is obtained.

It should be noted that step 203 may be regarded as a first-level feature pyramid, and the first-level feature pyramid may extract coarse-grained features of feature images of different scales by amplifying the feature images of each human body part at different scales and then by enhancing the significance of the feature images of different scales through a self-attention feature screening module.

Step 204: and carrying out human body identification processing based on the segmented characteristic image and an identification module in the human body identification model.

The identification module achieves the purpose of identifying a specific pedestrian by calculating the similarity between the characteristics of the pedestrian and the characteristics of the specific pedestrian. The similarity may be determined by calculating a cosine distance between the pedestrian feature and the feature of the specific certain pedestrian, or may be determined by calculating a euclidean distance between the pedestrian feature and the feature of the specific certain pedestrian, where the manner of determining the similarity between the two features is not limited herein. In practical implementation, the identification result is whether the pedestrian is a specific pedestrian. In the testing process, the human body recognition result is a score, and the score is used for representing the similarity degree of the characteristic information output by the human body recognition model and the characteristic information corresponding to each human body image in the human body library.

The recognition module can perform secondary feature extraction on the segmented feature image, then reduce the feature image after the secondary feature extraction into a one-dimensional feature image, splice the feature vectors, and compare the spliced feature vectors with the pre-stored feature vectors to obtain a recognition result.

In implementation, the segmented feature image is input to a recognition module in a human body recognition model for recognition processing.

Optionally, the human body recognition processing is performed based on the segmented feature image, the feature image of the human body component, and the recognition module in the human body recognition model, and includes: inputting the segmented characteristic image and the characteristic image of the human body component into a characteristic extraction module in the human body recognition model respectively to obtain corresponding characteristic vectors; merging all the feature vectors to obtain merged feature vectors; and carrying out human body identification processing based on the combined feature vector and an identification module in the human body identification model.

In implementation, the segmented feature images and the feature images of the human body parts are respectively input into a feature extraction module in a human body recognition model for secondary feature extraction to obtain corresponding feature vectors. And merging all the feature vectors to obtain merged feature vectors. And carrying out human body identification processing based on the combined feature vector and an identification module in the human body identification model.

The secondary feature extraction aims to extract small-scale features in the segmented feature images and improve the accuracy of human body recognition.

Specifically, the feature images of 1 complete human body part, 2 halved human body parts and 3 trisected human body parts are respectively obtained from each branch, and are respectively input into the feature extraction module to perform secondary feature extraction, feature vectors after feature extraction are obtained, the vectors are subjected to dimension reduction processing, one-dimensional vectors corresponding to the feature images of each human body part are obtained, the feature vectors are combined, the combined feature vectors are input into the identification module, and a human body identification result is obtained.

Further, after the combined feature vector is input to the recognition module, a recognition result is obtained. In the actual implementation process, after the combined feature vector is input into the identification module, the feature vector is compared with the existing feature vector to obtain an identification result. In the training process, after the combined feature vectors are input into the recognition module, scoring is carried out according to the similarity between the feature vectors and all the feature vectors in the human body library, and then recognition results are obtained.

In the process of training the human body recognition model, any human body image in the human body library and a first ID corresponding to the human body image are obtained, the human body image is input into the human body recognition model, a feature image output by a target human body image is scored with a feature image of each human body library, a feature image with the highest score of the feature image output by the target human body image is obtained, a second ID of the feature image is obtained, the first ID and the second ID are compared, and a comparison result is obtained. When the first ID is the same as the second ID, the parameters in the human body recognition model are not adjusted in value. And when the first ID is different from the second ID, calculating difference information of the characteristic image of the human body image and the characteristic image output by the human body recognition model by using a loss function, determining an adjustment value of a parameter in the human body recognition model according to the difference information and a preset training algorithm, and further performing numerical adjustment on the parameter in the human body recognition model, thereby completing one-time training. Then, the human body images of other pedestrians in the sample training set are obtained, and the process is repeated. Thus, through training of a large number of samples, the final human body recognition model is obtained.

It should be noted that step 204 may be regarded as a secondary feature pyramid, and the secondary feature pyramid may perform segmentation on feature images of different scales, and then perform secondary feature extraction, so as to extract fine particle features of the feature images of different scales, thereby improving the accuracy of identifying pedestrians.

In this embodiment, the accuracy of identifying the target feature image is further improved by the coarse-grained features extracted by the first-level feature pyramid and the fine-grained features extracted by the second-level feature pyramid.

Based on the same technical concept, an embodiment of the present application further provides an apparatus for extracting core lyrics of a song, where the apparatus may be a terminal in the foregoing embodiment, as shown in fig. 5, and the apparatus includes:

an acquisition module 501 configured to acquire a target human body image;

a part segmentation module 502 configured to input the human body image into a part segmentation module in a human body recognition model, and obtain a feature image of each human body part included in the target human body image;

a segmentation module 503 configured to determine a segmented feature image based on the feature image of the human body part;

a recognition module 504 configured to perform human body recognition processing based on the segmented feature image and a recognition module in the human body recognition model.

Optionally, the determining a segmented feature image based on the feature image of the human body component is configured to:

merging all the feature vectors to obtain merged feature vectors;

Optionally, the human body recognition processing is performed based on the merged feature vector and a recognition module in the human body recognition model, and is configured to:

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

It should be noted that: in the human body recognition apparatus provided in the above embodiment, only the division of the functional modules is illustrated, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the human body identification device provided by the above embodiment and the human body identification method embodiment belong to the same concept, and the specific implementation process thereof is detailed in the method embodiment and is not described herein again.

Fig. 6 is a schematic structural diagram of a terminal according to an embodiment of the present disclosure. The terminal 600 may be: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III, motion video Experts compression standard Audio Layer 3), an MP4 player (Moving Picture Experts Group Audio Layer IV, motion video Experts compression standard Audio Layer 4), a notebook computer, or a desktop computer. The terminal 600 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, etc.

In general, the terminal 600 includes: one or more processors 601 and one or more memories 602.

The processor 601 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 601 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 601 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 601 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, processor 601 may also include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.

The memory 602 may include one or more computer-readable storage media, which may be non-transitory. The memory 602 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 602 is used to store at least one instruction for execution by the processor 601 to implement the method of human body recognition provided by the method embodiments in the present disclosure.

In some embodiments, the terminal 600 may further optionally include: a peripheral interface 603 and at least one peripheral. The processor 601, memory 602, and peripheral interface 603 may be connected by buses or signal lines. Various peripheral devices may be connected to the peripheral interface 603 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 604, a display 605, a camera 606, an audio circuit 607, a positioning component 608, and a power supply 609.

The peripheral interface 603 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 601 and the memory 602. In some embodiments, the processor 601, memory 602, and peripheral interface 603 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 601, the memory 602, and the peripheral interface 603 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.

The Radio Frequency circuit 604 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 604 communicates with communication networks and other communication devices via electromagnetic signals. The rf circuit 604 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 604 comprises: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuitry 604 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 604 may also include NFC (Near Field Communication) related circuits, which are not limited by this disclosure.

The display 605 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 605 is a touch display screen, the display screen 605 also has the ability to capture touch signals on or over the surface of the display screen 605. The touch signal may be input to the processor 601 as a control signal for processing. At this point, the display 605 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the display 605 may be one, providing the front panel of the terminal 600; in other embodiments, the display 605 may be at least two, respectively disposed on different surfaces of the terminal 600 or in a folded design; in still other embodiments, the display 605 may be a flexible display disposed on a curved surface or on a folded surface of the terminal 600. Even more, the display 605 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The Display 605 may be made of LCD (liquid crystal Display), OLED (Organic Light-Emitting Diode), and the like.

The camera assembly 606 is used to capture images or video. Optionally, camera assembly 606 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 606 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.

Audio circuitry 607 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 601 for processing or inputting the electric signals to the radio frequency circuit 604 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 600. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 601 or the radio frequency circuit 604 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, audio circuitry 607 may also include a headphone jack.

The positioning component 608 is used to locate the current geographic location of the terminal 600 to implement navigation or LBS (location based Service). The positioning component 608 can be a positioning component based on the GPS (global positioning System) in the united states, the beidou System in china, the graves System in russia, or the galileo System in the european union.

Power supply 609 is used to provide power to the various components in terminal 600. The power supply 609 may be ac, dc, disposable or rechargeable. When the power supply 609 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.

In some embodiments, the terminal 600 also includes one or more sensors 610. The one or more sensors 610 include, but are not limited to: acceleration sensor 611, gyro sensor 612, pressure sensor 613, fingerprint sensor 614, optical sensor 615, and proximity sensor 616.

The acceleration sensor 611 may detect the magnitude of acceleration in three coordinate axes of the coordinate system established with the terminal 600. For example, the acceleration sensor 611 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 601 may control the display screen 605 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 611. The acceleration sensor 611 may also be used for acquisition of motion data of a game or a user.

The gyro sensor 612 may detect a body direction and a rotation angle of the terminal 600, and the gyro sensor 612 and the acceleration sensor 611 may cooperate to acquire a 3D motion of the user on the terminal 600. The processor 601 may implement the following functions according to the data collected by the gyro sensor 612: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.

Pressure sensors 613 may be disposed on the side bezel of terminal 600 and/or underneath display screen 605. When the pressure sensor 613 is disposed on the side frame of the terminal 600, a user's holding signal of the terminal 600 can be detected, and the processor 601 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 613. When the pressure sensor 613 is disposed at the lower layer of the display screen 605, the processor 601 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 605. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.

The fingerprint sensor 614 is used for collecting a fingerprint of a user, and the processor 601 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 614, or the fingerprint sensor 614 identifies the identity of the user according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, the processor 601 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying, and changing settings, etc. The fingerprint sensor 614 may be disposed on the front, back, or side of the terminal 600. When a physical button or vendor Logo is provided on the terminal 600, the fingerprint sensor 614 may be integrated with the physical button or vendor Logo.

The optical sensor 615 is used to collect the ambient light intensity. In one embodiment, processor 601 may control the display brightness of display screen 605 based on the ambient light intensity collected by optical sensor 615. Specifically, when the ambient light intensity is high, the display brightness of the display screen 605 is increased; when the ambient light intensity is low, the display brightness of the display screen 605 is adjusted down. In another embodiment, the processor 601 may also dynamically adjust the shooting parameters of the camera assembly 606 according to the ambient light intensity collected by the optical sensor 615.

A proximity sensor 616, also known as a distance sensor, is typically disposed on the front panel of the terminal 600. The proximity sensor 616 is used to collect the distance between the user and the front surface of the terminal 600. In one embodiment, when proximity sensor 616 detects that the distance between the user and the front face of terminal 600 gradually decreases, processor 601 controls display 605 to switch from the bright screen state to the dark screen state; when the proximity sensor 616 detects that the distance between the user and the front face of the terminal 600 is gradually increased, the processor 601 controls the display 605 to switch from the breath-screen state to the bright-screen state.

Those skilled in the art will appreciate that the configuration shown in fig. 6 is not intended to be limiting of terminal 600 and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components may be used.

Fig. 7 is a schematic structural diagram of a server 700 according to an embodiment of the present disclosure, where the server 700 may generate a relatively large difference due to different configurations or performances, and may include one or more processors (CPUs) 701 and one or more memories 702, where at least one program code is stored in the one or more memories 702, and is loaded and executed by the one or more processors 701 to implement the methods provided by the foregoing method embodiments. Of course, the server 700 may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface, so as to perform input and output, and the server 700 may also include other components for implementing the functions of the device, which are not described herein again.

In an exemplary embodiment, a computer-readable storage medium, such as a memory, including instructions executable by a processor to perform the method of human body recognition in the above-described embodiments is also provided. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, and the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only exemplary of the present disclosure and is not intended to limit the present disclosure, so that any modification, equivalent replacement, or improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims

1. A method of performing human body recognition, the method comprising:

acquiring a target human body image;

inputting the target human body image into a component segmentation module in a human body recognition model to obtain a characteristic image of each human body component included in the target human body image;

2. The method of claim 1, wherein the inputting the target human body image into a component segmentation module in a human body recognition model to obtain a feature image of each human body component included in the target human body image comprises:

3. The method of claim 1, wherein determining the segmented feature image based on the feature image of the human body part comprises:

and carrying out segmentation processing on the amplified characteristic image to obtain a segmented characteristic image.

4. The method of claim 3, wherein the segmenting the enlarged feature image to obtain a segmented feature image comprises:

5. The method according to claim 3, wherein the enlarging the characteristic image of the human body part based on the pixel filling manner to obtain at least one enlarged characteristic image comprises:

6. The method according to claim 3, wherein the performing human body recognition processing based on the segmented feature image and the recognition module in the human body recognition model comprises:

7. The method according to claim 6, wherein the performing a human body recognition process based on the segmented feature image, the feature image of the human body part and a recognition module in the human body recognition model comprises:

merging all the feature vectors to obtain merged feature vectors;

8. The method of claim 7, wherein the performing human recognition processing based on the merged feature vector and a recognition module in the human recognition model comprises:

9. An apparatus for human recognition, configured to:

an acquisition module configured to acquire a target human body image;

10. A terminal, characterized in that the terminal comprises a processor and a memory, wherein the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the human body recognition method according to any one of claims 1 to 8.

11. A computer-readable storage medium having stored therein at least one instruction, which is loaded and executed by a processor, to implement the method of human body recognition according to any one of claims 1 to 8.