WO2020207038A1

WO2020207038A1 - People counting method, apparatus, and device based on facial recognition, and storage medium

Info

Publication number: WO2020207038A1
Application number: PCT/CN2019/122079
Authority: WO
Inventors: 王金燕
Original assignee: 深圳壹账通智能科技有限公司
Priority date: 2019-04-12
Filing date: 2019-11-29
Publication date: 2020-10-15
Also published as: CN110163092A

Abstract

A people counting method, apparatus, and device based on facial recognition, and a storage medium. The method comprises: when a people counting instruction is received, obtaining a video image from a video stream (S101); performing image segmentation on the video image to obtain multiple segmented images, and extracting image LBP features of the multiple segmented images (S102); inputting the image LBP features into a pre-trained human image recognition model, performing recognition on the image LBP features by the human image recognition model, and outputting recognition results (S103); and counting the number of segmented images with the recognition result being a human image to obtain a first people counting result (S104). According to the method, people in a video are counted by means of image processing technology on the basis of artificial intelligence, thereby greatly improving the efficiency and accuracy of people counting.

Description

People counting method, apparatus, device, and storage medium based on face recognition

This application claims priority to Chinese patent application No. 201910297454.2, filed with the Chinese Patent Office on April 12, 2019 and entitled "People counting methods, devices, equipment and computer-readable storage media based on face recognition", the entire contents of which are incorporated herein by reference.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to a method, apparatus, device, and storage medium for counting people based on face recognition.

Background

At present, when it is necessary to count people in conference rooms, stations, shopping malls and other areas, it is generally necessary to count the people manually or obtain the result of the people counting indirectly through other methods, which leads to low efficiency of people counting and inaccurate statistics.

Summary of the invention

This application provides a method, device, equipment and storage medium for counting people based on face recognition, aiming to improve the efficiency and accuracy of counting people.

In order to achieve the above objective, this application provides a method for counting people based on face recognition, and the method includes:

Obtaining a video image from a video stream when a people counting instruction is received;

Performing picture segmentation on the video image to obtain multiple segmented images, and extracting image LBP features of the multiple segmented images;

Inputting the image LBP features into a pre-trained portrait recognition model, recognizing the image LBP features with the portrait recognition model, and outputting a recognition result;

Counting the number of segmented images whose recognition result is a portrait to obtain a first people counting result;

The step of extracting image LBP features of the multiple segmented images includes:

Dividing the segmented image into multiple regions;

Comparing the gray value of each pixel in each region, taken as the central gray value, with the gray values of the 8 pixels adjacent to that pixel to obtain the LBP feature of the pixel;

Obtaining a histogram of each region based on the LBP features of the pixels;

Normalizing the histogram of each region to obtain a statistical histogram, and obtaining the image LBP feature of the segmented image based on the statistical histogram.

Preferably, before the step of inputting the image LBP features into the pre-trained portrait recognition model, recognizing the image LBP features with the portrait recognition model, and outputting the recognition result, the method further includes:

Collecting a preset number of sample images, and setting the label of each sample image to portrait or non-portrait;

Compressing the sample image to 128×128 pixels, and then performing gray-scale processing and random incompleteness processing to obtain a processed sample image;

Extracting LBP features from the processed sample image to obtain sample LBP features;

Inputting the sample LBP features into a neural network created based on TensorFlow for training to obtain the portrait recognition model, where the recognition result output by the portrait recognition model is a portrait or a non-portrait.

Preferably, the segmented images include first segmented images and second segmented images, and the step of performing picture segmentation on the video image to obtain multiple segmented images includes:

Compressing the video image into a compressed video image of 512×512 pixels;

Segmenting the compressed video image into blocks of 64×64 pixels to obtain multiple first segmented images;

Performing a second picture segmentation on the overlapping areas of adjacent first segmented images, with 64 pixels as the starting point, to obtain second segmented images.

Preferably, after the step of counting the number of segmented images whose recognition result is a portrait to obtain the first people counting result, the method further includes:

Reporting the first people counting result to the server through a preset result reporting interface.

Preferably, before the step of reporting the first people counting result to the server through the preset result reporting interface, the method further includes:

Judging, according to the peak counting method, whether the first people counting result includes an abnormal portrait;

If the first people counting result does not include an abnormal portrait, performing the step of reporting the first people counting result to the server through the preset result reporting interface;

If the first people counting result includes an abnormal portrait, removing the number of abnormal portraits from the first people counting result to obtain a second people counting result, and reporting the second people counting result to the server.

Preferably, the portrait recognition model records the portrait coordinates of the portrait in each segmented image whose recognition result is a portrait, and the step of judging, according to the peak counting method, whether the first people counting result includes an abnormal portrait includes:

Obtaining the number of times the portrait coordinates appear within a preset time;

If the number of times the portrait coordinates appear within the preset time is greater than or equal to a count threshold, determining that the portrait corresponding to the portrait coordinates is not an abnormal portrait;

If the number of times the portrait coordinates appear within the preset time is less than the count threshold, determining that the portrait corresponding to the portrait coordinates is an abnormal portrait, and marking it as an abnormal portrait.

To achieve the foregoing objective, an embodiment of the present application further provides a device for counting people based on face recognition, and the device for counting people based on face recognition includes:

The obtaining module is used to obtain video images from the video stream when receiving the people counting instruction;

An extraction module, configured to perform picture segmentation on the video image to obtain multiple segmented images, and extract image LBP features of the multiple segmented images;

A recognition module, configured to input LBP features of the image into a pre-trained portrait recognition model, the portrait recognition model recognizes the LBP features of the image, and outputs a recognition result;

A statistics module, configured to count the number of the segmented images whose recognition result is a portrait, to obtain a first population statistics result;

The extraction module is also used for:

Dividing the segmented image into multiple regions;

Obtain a histogram of each area based on the LBP feature of the pixel;

In order to achieve the foregoing objective, an embodiment of the present application further provides a device for counting people based on face recognition. The device includes a processor, a memory, and computer-readable instructions for counting people based on face recognition that are stored in the memory. When the computer-readable instructions are executed by the processor, the steps of the method for counting people based on face recognition described above are implemented.

In order to achieve the above objective, an embodiment of the present application further provides a computer storage medium that stores computer-readable instructions for counting people based on face recognition. When the computer-readable instructions are run by a processor, the steps of the method for counting people based on face recognition described above are implemented.

Compared with the prior art, the present application proposes a method, apparatus, device, and storage medium for counting people based on face recognition. The method includes: obtaining a video image from a video stream when a people counting instruction is received; performing picture segmentation on the video image to obtain multiple segmented images, and extracting image LBP features of the multiple segmented images; inputting the image LBP features into a pre-trained portrait recognition model, which recognizes the image LBP features and outputs a recognition result; and counting the number of segmented images whose recognition result is a portrait to obtain a first people counting result. This application is based on artificial intelligence and uses image processing technology to count the number of people in the video, thereby greatly improving the efficiency and accuracy of people counting.

Description of the drawings

FIG. 1 is a schematic diagram of the hardware structure of a people counting device based on face recognition according to various embodiments of the present application;

FIG. 2 is a schematic flowchart of a first embodiment of a method for counting people based on face recognition in this application;

FIG. 3 is a schematic flowchart of a second embodiment of a method for counting people based on face recognition in this application;

FIG. 4 is a schematic diagram of functional modules of a first embodiment of a device for counting people based on face recognition according to the present application.

The realization of the objectives, the functional characteristics, and the advantages of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed description

It should be understood that the specific embodiments described here are only used to explain the application, and are not used to limit the application.

The people counting device based on face recognition mainly involved in the embodiments of the present application is a device capable of network connection. It may be a server, a cloud platform, or the like.

Referring to FIG. 1, FIG. 1 is a schematic diagram of the hardware structure of a people counting device based on face recognition according to various embodiments of the present application. In the embodiment of the present application, the device for counting people based on face recognition may include a processor 1001 (for example, a Central Processing Unit, CPU), a communication bus 1002, an input port 1003, an output port 1004, and a memory 1005. The communication bus 1002 is used to realize connection and communication between these components; the input port 1003 is used for data input; the output port 1004 is used for data output. The memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory. The memory 1005 may optionally be a storage device independent of the aforementioned processor 1001. Those skilled in the art can understand that the hardware structure shown in FIG. 1 does not constitute a limitation to the present application; the device may include more or fewer components than those shown in the figure, combine certain components, or use a different arrangement of components.

Continuing to refer to FIG. 1, the memory 1005, as a readable storage medium, may include an operating system, a network communication module, an application program module, and computer-readable instructions for counting people based on face recognition. In FIG. 1, the network communication module is mainly used to connect to the server and perform data communication with the server; the processor 1001 can call the computer-readable instructions for counting people based on face recognition stored in the memory 1005 and execute the method for counting people based on face recognition provided in the embodiments of the present application.

The embodiment of the present application provides a method for counting people based on face recognition.

Referring to FIG. 2, FIG. 2 is a schematic flowchart of a first embodiment of a method for counting people based on face recognition in this application.

In this embodiment, the method for counting people based on face recognition is applied to a device for counting people based on face recognition, and the method includes:

Step S101: when a people counting instruction is received, obtain a video image from a video stream;

In this embodiment, a camera is installed in advance in the area where people need to be counted, and the camera captures that area so that the video stream is obtained and saved in real time. For example, a camera installed at a certain location in a conference room captures the scene and the people in the room, and the current video stream of the conference room is saved.

When a people counting instruction issued by the user through voice or a touch operation is received, the video image is obtained from the video stream. Understandably, the people counting instruction includes a time point, which may be the current time, a historical time, or a scheduled future time. Generally, the video stream carries timestamps, and the video image corresponding to the time point is obtained according to the timestamps.
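As an illustration of looking up a video image by timestamp, the following is a minimal sketch. It assumes the stream is already available as two aligned lists (sorted timestamps in seconds and the corresponding frames); the function name `frame_at` and the nearest-frame rule are assumptions made for this example, not details given in the application.

```python
from bisect import bisect_left

def frame_at(timestamps, frames, time_point):
    """Return the frame whose timestamp is closest to time_point.

    timestamps must be sorted in ascending order and aligned with frames.
    """
    if not timestamps:
        raise ValueError("empty video stream")
    i = bisect_left(timestamps, time_point)
    if i == 0:
        return frames[0]
    if i == len(timestamps):
        return frames[-1]
    # Pick the nearer of the two neighbouring frames.
    before, after = timestamps[i - 1], timestamps[i]
    return frames[i] if after - time_point < time_point - before else frames[i - 1]

# Hypothetical stream: one frame every 100 ms.
timestamps = [0.0, 0.1, 0.2, 0.3]
frames = ["frame0", "frame1", "frame2", "frame3"]
selected = frame_at(timestamps, frames, 0.22)
```

A request for a time point outside the recorded range falls back to the first or last frame in this sketch; a real system might instead wait for a scheduled future time to arrive.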

Step S102: Perform picture segmentation on the video image to obtain multiple segmented images, and extract image LBP features of the multiple segmented images;

In this embodiment, the video image needs to be segmented twice to obtain the first segmented image and the second segmented image. Specifically, the step of performing picture segmentation on the video image to obtain multiple segmented images includes:

Step S102-1a, compressing the video image into a compressed video image of 512×512 pixels;

The video image is compressed to obtain the compressed video image of 512×512 pixels. Understandably, in other embodiments, the video image may be compressed according to other pixels.

Step S102-1b: segment the compressed video image into blocks of 64×64 pixels to obtain multiple first segmented images;

A first segmentation is performed on the compressed video image: the compressed video image is divided into blocks of 64×64 pixels to obtain multiple first segmented images.

Step S102-1c: perform a second picture segmentation on the overlapping areas of adjacent first segmented images, with 64 pixels as the starting point, to obtain second segmented images.

Inevitably, there will be overlapping areas after image segmentation. Therefore, a second segmentation is performed on the overlapping areas of adjacent first segmented images, with 64 pixels as the starting point, to obtain the second segmented images.

In this embodiment, the segmented image includes the first segmented image and the second segmented image.
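The two segmentation passes can be sketched as follows. The first pass is the 8×8 grid of 64×64 blocks described above. The text does not spell out the geometry of the second pass, so this sketch assumes one possible reading: a second grid of 64×64 windows shifted by half a block (32 pixels) so that each window straddles the boundaries between adjacent first-pass blocks. The helper name `tile_boxes` and the `offset` parameter are hypothetical.

```python
def tile_boxes(size=512, tile=64, offset=0):
    """Return (left, top, right, bottom) boxes of tile x tile windows."""
    starts = range(offset, size - tile + 1, tile)
    return [(x, y, x + tile, y + tile) for y in starts for x in starts]

first_boxes = tile_boxes()            # first segmentation: 8 x 8 grid
second_boxes = tile_boxes(offset=32)  # assumed second pass, shifted half a block
all_boxes = first_boxes + second_boxes
```

Under this assumption the first pass yields 64 boxes and the shifted pass yields 49, and each box can be cropped from the 512×512 image and passed on to feature extraction.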

In this embodiment, LBP (Local Binary Patterns) is an operator used to describe the local texture of an image, and it has the property of gray-scale invariance.

Further, the step of extracting image LBP features of a plurality of the segmented images includes:

Step S102-2a: Divide the segmented image into multiple regions;

The segmented image is divided into a plurality of regions of a preset size, for example, into a plurality of regions of 16×16.

Step S102-2b: compare the gray value of each pixel in each region, taken as the central gray value, with the gray values of the 8 pixels adjacent to that pixel to obtain the LBP feature of the pixel;

Specifically, if the gray value of an adjacent pixel is greater than the central gray value, the position of that adjacent pixel is marked as 1; if it is less than or equal to the central gray value, the position is marked as 0. Comparing the 8 points in the 3×3 neighborhood in this way produces an 8-bit binary number, usually converted to a decimal number, which is the LBP feature of the pixel; the LBP value is an integer between 0 and 255.
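The per-pixel comparison can be sketched in a few lines. The text does not fix the order in which the 8 neighbors are read, so this example assumes a clockwise order starting at the top-left neighbor, with the first neighbor as the most significant bit; `lbp_code` and `NEIGHBOURS` are names chosen for illustration.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel (assumed order).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(gray, row, col):
    """LBP code of the interior pixel gray[row][col] (a list of lists of ints)."""
    center = gray[row][col]
    code = 0
    for dr, dc in NEIGHBOURS:
        # 1 if the neighbour is strictly brighter than the centre, else 0.
        bit = 1 if gray[row + dr][col + dc] > center else 0
        code = (code << 1) | bit
    return code

patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
code = lbp_code(patch, 1, 1)  # bits 00001111, i.e. 15
```

Here the four neighbors to the right of and above the center are not brighter than 6 and contribute 0, while the four below and to the left are brighter and contribute 1.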

Step S102-2c: Obtain a histogram of each area based on the LBP feature of the pixel;

After the LBP feature of each pixel in the area is obtained, the LBP features are tallied to obtain the histogram of each area.

Step S102-2d: Perform normalization processing on the histogram of each area to obtain a statistical histogram, and obtain the image LBP feature of the segmented image based on the statistical histogram.

Generally, when LBP is used to express image texture, only the uniform patterns are considered individually, and all other patterns are grouped into a single category. As a result, the normalized result better reflects the texture of each typical region while de-emphasizing the features of smooth regions. In this embodiment, the histogram of each region is normalized to obtain a statistical histogram, and the image LBP feature of the segmented image is obtained based on the statistical histogram.
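A minimal sketch of the histogram step follows. It uses the common definition of a uniform pattern (at most two 0/1 transitions when the 8 bits are read as a circle), which gives 58 uniform codes plus one shared bin for everything else. Normalizing each region's histogram by its pixel count and concatenating the regions are assumptions of this example; the application does not specify the normalization formula.

```python
def transitions(code):
    """Number of 0/1 changes when the 8 bits are read as a circle."""
    bits = [(code >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))

# Uniform patterns have at most 2 transitions; each gets its own bin,
# and every other pattern shares one extra bin.
UNIFORM = [c for c in range(256) if transitions(c) <= 2]
BIN_OF = {c: i for i, c in enumerate(UNIFORM)}
OTHER_BIN = len(UNIFORM)

def region_histogram(codes):
    """Normalized histogram of the LBP codes of one region."""
    hist = [0.0] * (OTHER_BIN + 1)
    for c in codes:
        hist[BIN_OF.get(c, OTHER_BIN)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def image_feature(regions):
    """Concatenate the normalized histograms of all regions."""
    feature = []
    for codes in regions:
        feature.extend(region_histogram(codes))
    return feature

# Two tiny made-up regions of LBP codes.
regions = [[0, 255, 15, 85], [1, 1, 3, 170]]
feature = image_feature(regions)
```

With 59 bins per region, a 64×64 segmented image split into sixteen 16×16 regions would produce a feature vector of 944 values under these assumptions.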

Further, to make the LBP feature rotation-invariant, the binary pattern is rotated. For example, if the initial LBP feature is 10010000, rotating it clockwise can convert it to the minimum-value form 00001001, whose decimal value is the smallest among all rotations; this value is taken as the LBP. No matter how the segmented image is rotated, the same minimum LBP is obtained, which ensures that the LBP is rotation-invariant.
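The minimum-rotation rule can be checked with a short sketch: try all 8 circular rotations of the 8-bit code and keep the smallest. The function names are illustrative only.

```python
def rotate_right(code, k):
    """Circularly rotate an 8-bit code to the right by k positions."""
    k %= 8
    return ((code >> k) | (code << (8 - k))) & 0xFF

def rotation_invariant(code):
    """Smallest value among the 8 circular rotations of the code."""
    return min(rotate_right(code, k) for k in range(8))

minimum = rotation_invariant(0b10010000)  # 0b00001001, i.e. 9
```

Any rotated version of the same pattern, such as 00100100, maps to the same minimum, which is what makes the resulting code independent of image rotation.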

Step S103: Input the LBP feature of the image into a pre-trained portrait recognition model, and the LBP feature of the image is recognized by the portrait recognition model, and the recognition result is output;

In this embodiment, before step S103 of inputting the image LBP features into the pre-trained portrait recognition model, recognizing them with the portrait recognition model, and outputting the recognition result, the method further includes:

Step S103a: Collect a preset number of sample images, and set the label of the sample images as portrait or non-portrait;

The sample image includes a portrait sample image and a non-portrait sample image, and the portrait sample image includes a human face sample image and a human upper body sample image.

In this embodiment, 100,000 face sample images and 50,000 upper body sample images are collected, and the labels of these 100,000 face sample images and 50,000 upper body sample images are set to portrait. In addition, 10,000 non-portrait images are collected, and their labels are set to non-portrait.

Understandably, because upper body images are used as training samples, people can still be counted from upper body features when a face in the video image is occluded, which prevents inaccurate results and missed people. Moreover, using non-portrait images as training samples enables the trained portrait recognition model to recognize non-portrait images, which makes the statistical results more accurate.

Step S103b, compress the sample image into 128×128 pixels, and then perform grayscale processing and random incomplete processing to obtain a processed sample image;

In this embodiment, the sample image is first compressed into 128×128 pixels to obtain a compressed sample image. Then, the compressed sample image is gray-scale processed by one of image inversion and logarithmic transformation to obtain a gray-scale sample image. Then, the grayscale sample image is subjected to random incomplete processing using an image repair method to obtain the processed sample image.
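The intensity transforms named above can be sketched on a plain 2D list of gray values. Image inversion and the logarithmic transform follow their standard definitions (s = 255 − r and s = c·log(1 + r) with c = 255 / log 256). The text does not detail how the random incompleteness processing works, so `random_erase` below is only an assumed stand-in that blanks one random rectangle; resizing to 128×128 is left out because it needs an image library.

```python
import math
import random

def invert(gray):
    """Image inversion: s = 255 - r."""
    return [[255 - v for v in row] for row in gray]

def log_transform(gray):
    """Logarithmic transform: s = c * log(1 + r), scaled to 0..255."""
    c = 255 / math.log(256)
    return [[round(c * math.log(1 + v)) for v in row] for row in gray]

def random_erase(gray, max_size, rng):
    """Blank out one random rectangle to imitate a partially hidden subject."""
    h, w = len(gray), len(gray[0])
    eh, ew = rng.randint(1, max_size), rng.randint(1, max_size)
    top, left = rng.randint(0, h - eh), rng.randint(0, w - ew)
    out = [row[:] for row in gray]
    for r in range(top, top + eh):
        for c in range(left, left + ew):
            out[r][c] = 0
    return out

# Made-up uniform 128 x 128 sample; a seeded generator keeps the run repeatable.
sample = [[200] * 128 for _ in range(128)]
processed = random_erase(log_transform(sample), 32, random.Random(0))
```

Training on partially blanked samples is one way to make a classifier tolerate occlusion, which matches the stated goal of counting people whose faces are partly hidden.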

Step S103c: extract LBP features from the processed sample image to obtain the sample LBP features;

The processed sample image is divided into a plurality of sample regions, and the gray value of each sample pixel in each sample region, taken as the sample central gray value, is compared with the gray values of the 8 sample pixels adjacent to that sample pixel to obtain the sample LBP feature of the sample pixel. A sample histogram of each sample region is obtained based on the LBP features of the sample pixels. The sample histogram of each sample region is normalized to obtain a sample statistical histogram, and the sample LBP feature of the sample image is obtained based on the sample statistical histogram.

Step S103d, input the sample LBP features into a neural network created based on TensorFlow for training to obtain the portrait recognition model, and the recognition result output by the portrait recognition model is a portrait or a non-portrait.

TensorFlow is an open-source machine learning framework that is widely used to implement various machine learning algorithms. With TensorFlow, developers can build models with minimal code and build the products they need on top of those models.

In this embodiment, the sample LBP features are input into a neural network created based on TensorFlow for training. After millions of training iterations, the sample LBP features can be accurately classified according to the labels of the corresponding sample images, and the portrait recognition model is thus obtained. The recognition result output by the portrait recognition model is a portrait or a non-portrait: a sample image labeled as a portrait yields the recognition result "portrait", and a sample image labeled as a non-portrait yields the recognition result "non-portrait".
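To illustrate the train-then-classify flow without depending on TensorFlow, the sketch below swaps in a much simpler technique: logistic regression trained by full-batch gradient descent in plain Python. It is a stand-in for the TensorFlow neural network, not a reproduction of it; the network architecture is not described in the text, and the 2-value feature vectors, learning rate, and epoch count are made up for this example.

```python
import math

def train(samples, labels, epochs=200, lr=0.5):
    """Full-batch gradient descent for logistic regression (label 1 = portrait)."""
    n = len(samples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * n, 0.0
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y
            gw = [g + err * xi for g, xi in zip(gw, x)]
            gb += err
        w = [wi - lr * g / len(samples) for wi, g in zip(w, gw)]
        b -= lr * gb / len(samples)
    return w, b

def recognize(model, x):
    """Return the recognition result for one feature vector."""
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return "portrait" if score > 0 else "non-portrait"

# Made-up 2-bin feature vectors standing in for real sample LBP histograms.
samples = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
labels = [1, 1, 0, 0]
model = train(samples, labels)
```

The interface is the part that carries over: labeled sample features go in during training, and at recognition time each feature vector comes back as either "portrait" or "non-portrait".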

Step S104: Count the number of the segmented images whose recognition result is a portrait, and obtain a first population count result.

According to the recognition result output by the portrait recognition model, the number of the segmented images whose recognition result is the portrait is counted, and the number of the segmented images is taken as the first person counting result.
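The counting step itself reduces to tallying recognition results, as in this minimal sketch; the string labels and the function name `count_people` are assumptions of the example.

```python
def count_people(results):
    """Number of segmented images recognized as a portrait."""
    return sum(1 for r in results if r == "portrait")

# Made-up recognition results for five segmented images.
results = ["portrait", "non-portrait", "portrait", "non-portrait", "portrait"]
first_count = count_people(results)
```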

With the above solution, this embodiment obtains a video image from a video stream when a people counting instruction is received; performs picture segmentation on the video image to obtain multiple segmented images, and extracts image LBP features of the multiple segmented images; inputs the image LBP features into a pre-trained portrait recognition model, which recognizes the image LBP features and outputs a recognition result; and counts the number of segmented images whose recognition result is a portrait to obtain the first people counting result. Therefore, based on artificial intelligence, image processing technology is used to count the number of people in the video, which greatly improves the efficiency and accuracy of people counting.

As shown in FIG. 3, the second embodiment of the present application proposes a method for counting people based on face recognition. Based on the first embodiment shown in FIG. 2, after the step of counting the number of segmented images whose recognition result is a portrait to obtain the first people counting result, the method further includes:

Step S106: report the first people counting result to the server through a preset result reporting interface.

Specifically, a reporting interface is preset, and the reporting interface is used for network communication with the server. Understandably, the reporting interface may also report camera information, area information, time information, etc., corresponding to the first people counting result to the server.
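The reporting interface is not specified in the text, so the following only sketches what a report payload might look like, assuming a JSON body. All field names and values here are hypothetical, and the network call itself is omitted.

```python
import json

def build_report(count, camera_id, area, timestamp):
    """Serialize a people counting result together with its context."""
    return json.dumps(
        {
            "people_count": count,
            "camera_id": camera_id,
            "area": area,
            "timestamp": timestamp,
        },
        sort_keys=True,
    )

# Hypothetical values for illustration only.
payload = build_report(3, "cam-01", "conference room A", "2019-04-12T10:00:00")
```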

Before step S106 of reporting the first people counting result to the server through the preset result reporting interface, the method further includes:

Understandably, since the video images in the video stream change in real time, the video image obtained from the video stream may not be stable enough, and the first people counting result may be inaccurate because of people walking, posture changes, and the like. Therefore, whether the first people counting result includes an abnormal portrait is judged according to the peak counting method.

Specifically, acquiring the number of times the portrait coordinates appear within a preset time;

Video images are extracted from the video stream at a preset interval, for example 100 ms or 200 ms. Each video image is segmented to obtain multiple segmented images, and the image LBP features of the segmented images are extracted; the image LBP features are input into the pre-trained portrait recognition model, which recognizes them and outputs the portrait coordinates of each portrait in the segmented images.

If the number of times the portrait coordinates appear within the preset time is greater than or equal to the count threshold, it is determined that the portrait corresponding to the portrait coordinates is not an abnormal portrait. In this embodiment, the preset time may be 1 minute, and the count threshold may be 4, 10, and so on. For example, with a threshold of 10, if the portrait coordinates appear 10 times within 1 minute, the corresponding portrait is not an abnormal portrait.

If the number of times the portrait coordinates appear within the preset time is less than the count threshold, it is determined that the portrait corresponding to the portrait coordinates is an abnormal portrait, and the portrait is marked as an abnormal portrait.

If the first people counting result does not include an abnormal portrait, the step of reporting the first people counting result to the server through the preset result reporting interface is performed. If the first people counting result includes an abnormal portrait, the number of abnormal portraits is removed from the first people counting result to obtain a second people counting result, and the second people counting result is reported to the server. For example, if there are two abnormal portraits, the second people counting result is obtained by subtracting 2 from the first people counting result.
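The abnormal-portrait check can be sketched as an occurrence count over the frames sampled within the preset time. This example assumes that a portrait is identified by exact equality of its coordinates across frames, which the text does not state; a real system would likely need some tolerance. The function name, data layout, and sample coordinates are all made up for illustration.

```python
from collections import Counter

def filter_abnormal(current_coords, window_frames, threshold):
    """Split current detections into a corrected count and abnormal portraits.

    window_frames holds, for each frame sampled within the preset time,
    the list of portrait coordinates detected in that frame.
    """
    # Count in how many sampled frames each coordinate appears.
    seen = Counter(c for frame in window_frames for c in set(frame))
    abnormal = [c for c in current_coords if seen[c] < threshold]
    return len(current_coords) - len(abnormal), abnormal

# Made-up detections from four sampled frames.
window = [
    [(64, 128), (192, 64)],
    [(64, 128), (192, 64), (320, 320)],
    [(64, 128), (192, 64)],
    [(64, 128), (192, 64)],
]
current = [(64, 128), (192, 64), (320, 320)]
second_count, abnormal = filter_abnormal(current, window, threshold=4)
```

In this run the coordinate (320, 320) appears in only one of the four frames, so it is marked abnormal and the first count of 3 becomes a second count of 2.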

With the above solution, this embodiment obtains a video image from a video stream when a people counting instruction is received; performs picture segmentation on the video image to obtain multiple segmented images, and extracts image LBP features of the multiple segmented images; inputs the image LBP features into a pre-trained portrait recognition model, which recognizes the image LBP features and outputs a recognition result; counts the number of segmented images whose recognition result is a portrait to obtain the first people counting result; and reports the first people counting result to the server through the preset result reporting interface. Therefore, based on artificial intelligence, image processing technology is used to count the number of people in the video, which greatly improves the efficiency and accuracy of people counting.

In addition, this embodiment also provides an apparatus for counting people based on face recognition. Referring to FIG. 4, FIG. 4 is a schematic diagram of the functional modules of the first embodiment of the apparatus for counting people based on face recognition in this application.

The apparatus for counting people based on face recognition provided in the present application is a virtual apparatus stored in the memory 1005 of the people counting device shown in FIG. 1. It implements all functions of the computer-readable instructions for counting people based on face recognition: obtaining a video image from a video stream when a people counting instruction is received; performing picture segmentation on the video image to obtain multiple segmented images, and extracting image LBP features of the multiple segmented images; inputting the image LBP features into a pre-trained portrait recognition model, which recognizes the image LBP features and outputs a recognition result; and counting the number of segmented images whose recognition result is a portrait to obtain the first people counting result.

Specifically, the device for counting people based on face recognition in this embodiment includes:

The obtaining module 10 is configured to obtain a video image from a video stream when a people counting instruction is received;

The extraction module 20 is configured to perform picture segmentation on the video image, obtain multiple segmented images, and extract image LBP features of the multiple segmented images;

The recognition module 30 is configured to input the LBP features of the image into a pre-trained portrait recognition model, and the portrait recognition model recognizes the LBP features of the image, and outputs a recognition result;

The statistics module 40 is configured to count the number of the segmented images whose recognition result is a portrait, and obtain a first population statistics result.

Further, the recognition module is also used for:

Further, the extraction module is also used for:

Compressing the video image into a compressed video image of 512×512 pixels;

Further, the extraction module is also used for:

Dividing the segmented image into multiple regions;

Obtain a histogram of each area based on the LBP feature of the pixel;

Furthermore, the statistics module is also used for:

In addition, the present application also provides a computer storage medium that stores computer-readable instructions for counting people based on face recognition. When the computer-readable instructions are run by a processor, the steps of the method for counting people based on face recognition described above are implemented, which will not be repeated here. The computer-readable storage medium may be a non-volatile readable storage medium.

Compared with the prior art, the present application proposes a method, apparatus, device, and storage medium for counting people based on face recognition. The method includes: obtaining a video image from a video stream when a people counting instruction is received; performing picture segmentation on the video image to obtain multiple segmented images, and extracting image LBP features of the segmented images; inputting the image LBP features into a pre-trained portrait recognition model, which recognizes the image LBP features and outputs a recognition result; and counting the number of segmented images whose recognition result is a portrait to obtain a first people counting result. This application is based on artificial intelligence and uses image processing technology to count the number of people in the video, thereby greatly improving the efficiency and accuracy of people counting.

It should be noted that, in this document, the terms "include", "comprise", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to the process, method, article, or system. Without further restrictions, an element defined by the phrase "including a..." does not exclude the existence of other identical elements in the process, method, article, or system that includes the element.

The serial numbers of the foregoing embodiments of the present application are for description only and do not represent the relative merits of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware alone, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disk) and includes a number of instructions that cause a terminal device to execute the methods described in the embodiments of this application.

The above are only preferred embodiments of this application and do not limit its scope. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of this application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of this application.

Claims

1. A method for counting people based on face recognition, wherein the method includes:

When receiving the people counting instruction, obtain the video image from the video stream;

Performing picture segmentation on the video image to obtain multiple segmented images, and extracting image local binary pattern (LBP) features of the multiple segmented images;

Input the LBP features of the image into a pre-trained portrait recognition model, and the LBP features of the image are recognized by the portrait recognition model, and the recognition result is output;

Count the number of the segmented images whose recognition result is a portrait, and obtain a first people counting result;

The step of extracting image LBP features of a plurality of the segmented images includes:

Dividing the segmented image into multiple regions;

Comparing the gray value of each pixel in each region, taken as the center value, with the gray values of the 8 pixels adjacent to that pixel to obtain the LBP feature of the pixel;

Obtain a histogram of each area based on the LBP feature of the pixel;

Perform normalization processing on the histogram of each region to obtain a statistical histogram, and obtain the image LBP feature of the segmented image based on the statistical histogram.
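The LBP extraction steps recited above (region division, 8-neighbor comparison, per-region histogram, normalization) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the applicant's implementation: the 4×4 region grid, the clockwise bit order, the "neighbor >= center" comparison, and the treatment of out-of-image neighbors as 0 are all choices made here for the example.

```python
from typing import List

Gray = List[List[int]]  # rows of gray values in 0..255


def lbp_code(img: Gray, y: int, x: int) -> int:
    """8-neighbor LBP code of pixel (y, x); out-of-image neighbors count as 0."""
    center = img[y][x]
    # Clockwise from the top-left neighbor; the bit order is an assumption.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        ny, nx = y + dy, x + dx
        inside = 0 <= ny < len(img) and 0 <= nx < len(img[0])
        neighbor = img[ny][nx] if inside else 0
        if neighbor >= center:
            code |= 1 << bit
    return code


def lbp_feature(img: Gray, grid: int = 4) -> List[float]:
    """Concatenate the normalized 256-bin LBP histograms of grid x grid regions."""
    h, w = len(img), len(img[0])
    rh, rw = h // grid, w // grid  # assumes h and w are multiples of grid
    feature: List[float] = []
    for gy in range(grid):
        for gx in range(grid):
            hist = [0] * 256
            for y in range(gy * rh, (gy + 1) * rh):
                for x in range(gx * rw, (gx + 1) * rw):
                    hist[lbp_code(img, y, x)] += 1
            total = sum(hist)
            # Normalize so that each region's histogram sums to 1.
            feature.extend(count / total for count in hist)
    return feature
```

For a 64×64 segmented image and the default grid, `lbp_feature` returns 16 × 256 = 4096 values. Normalizing each region separately keeps regions of different size comparable.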
2. The method according to claim 1, wherein the step of inputting the LBP features of the image into a pre-trained portrait recognition model, recognizing the LBP features of the image by the portrait recognition model, and outputting the recognition result further comprises:

Collecting a preset number of sample images, and setting the label of each sample image as portrait or non-portrait;

After compressing the sample images to 128×128 pixels, performing gray-scale processing and random incomplete processing to obtain processed sample images;

Extracting LBP features from the processed sample images to obtain sample LBP features;

Inputting the sample LBP features into a neural network created based on TensorFlow for training to obtain the portrait recognition model, wherein the recognition result output by the portrait recognition model is a portrait or a non-portrait.
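The sample preparation recited above can be sketched as follows. The claim does not define "random incomplete processing"; the sketch assumes it means blanking a random rectangular patch so that the model also sees partially visible people. The nearest-neighbor resize, the BT.601 gray weights, the patch size limit `max_frac`, and all function names are likewise assumptions, and the TensorFlow training step itself is not shown.

```python
import random
from typing import List, Tuple

RGB = Tuple[int, int, int]


def resize_nearest(img: List[List[RGB]], size: int = 128) -> List[List[RGB]]:
    """Nearest-neighbor resize of an RGB image to size x size pixels."""
    h, w = len(img), len(img[0])
    return [[img[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def to_gray(img: List[List[RGB]]) -> List[List[int]]:
    """Gray-scale conversion with the common ITU-R BT.601 luma weights."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in img]


def erase_random_patch(gray: List[List[int]], rng: random.Random,
                       max_frac: float = 0.5) -> List[List[int]]:
    """Return a copy with one random rectangle set to 0 (assumed meaning of
    'random incomplete processing'); the input image is left unchanged."""
    h, w = len(gray), len(gray[0])
    ph = rng.randint(1, max(1, int(h * max_frac)))
    pw = rng.randint(1, max(1, int(w * max_frac)))
    top, left = rng.randint(0, h - ph), rng.randint(0, w - pw)
    out = [row[:] for row in gray]
    for y in range(top, top + ph):
        for x in range(left, left + pw):
            out[y][x] = 0
    return out
```

Passing a seeded `random.Random` instance makes the augmentation reproducible, which helps when comparing training runs.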
3. The method according to claim 1, wherein the segmented images include a first segmented image and a second segmented image, and the step of performing picture segmentation on the video image to obtain multiple segmented images comprises:

Compressing the video image into a compressed video image of 512×512 pixels;

Segmenting the compressed video image into blocks of 64×64 pixels to obtain multiple first segmented images;

Performing secondary picture segmentation on the overlapping areas of adjacent first segmented images, with 64 pixels as the starting point, to obtain a second segmented image.
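The two segmentation passes can be sketched as box generators. The first pass follows the claim directly: a 512×512 image cut into 64×64 blocks gives 8 × 8 = 64 first segmented images. The claim wording for the second pass is ambiguous, so the sketch adopts one possible reading: windows of the same size shifted by half a block, so that each window straddles the seams between adjacent first-pass blocks. The 32-pixel `shift` and the function names are assumptions.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def first_pass(size: int = 512, tile: int = 64) -> List[Box]:
    """Non-overlapping tile x tile boxes covering a size x size image."""
    return [(x, y, tile, tile)
            for y in range(0, size, tile)
            for x in range(0, size, tile)]


def second_pass(size: int = 512, tile: int = 64, shift: int = 32) -> List[Box]:
    """Boxes offset by `shift` pixels in both directions, so each one
    overlaps the neighboring first-pass boxes (assumed interpretation)."""
    starts = range(shift, size - tile + 1, tile)
    return [(x, y, tile, tile) for y in starts for x in starts]
```

With the default values the second pass yields 7 × 7 = 49 windows, which catches a person who would otherwise be split across a block boundary in the first pass.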
4. The method according to claim 1, wherein after the step of counting the number of the segmented images whose recognition result is a portrait to obtain the first people counting result, the method further comprises:

Reporting the first people counting result to a server according to a preset result reporting interface.

5. The method according to claim 4, wherein before the step of reporting the first people counting result to the server according to the preset result reporting interface, the method further comprises:

Judging, according to a peak counting method, whether the first people counting result includes an abnormal portrait;

If the first people counting result does not include an abnormal portrait, performing the step of reporting the first people counting result to the server according to the preset result reporting interface;

If the first people counting result includes an abnormal portrait, removing the number of abnormal portraits from the first people counting result to obtain a second people counting result, and reporting the second people counting result to the server.

6. The method according to claim 5, wherein the portrait recognition model records the portrait coordinates of the portrait in each segmented image whose recognition result is a portrait, and the step of judging, according to the peak counting method, whether the first people counting result includes an abnormal portrait comprises:

Obtaining the number of times the portrait coordinates appear within a preset time;

If the number of times the portrait coordinates appear within the preset time is greater than or equal to a number threshold, determining that the portrait corresponding to the portrait coordinates is not an abnormal portrait;

If the number of times the portrait coordinates appear within the preset time is less than the number threshold, determining that the portrait corresponding to the portrait coordinates is an abnormal portrait, and marking that portrait as an abnormal portrait.
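The abnormal-portrait check and the resulting second people counting result can be sketched as follows. The sketch assumes that detections are stored as (timestamp, coordinates) pairs and that coordinates must match exactly to count as the same portrait; a practical system would probably tolerate small movements. The names `abnormal_coords` and `second_count` are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple

Coord = Tuple[int, int]          # portrait coordinates recorded by the model
Detection = Tuple[float, Coord]  # (timestamp in seconds, coordinates)


def abnormal_coords(detections: Iterable[Detection], now: float,
                    window: float, threshold: int) -> Set[Coord]:
    """Coordinates seen fewer than `threshold` times in the last `window` seconds."""
    recent = [coord for t, coord in detections if now - window <= t <= now]
    counts = Counter(recent)
    return {coord for coord, n in counts.items() if n < threshold}


def second_count(first_coords: List[Coord], abnormal: Set[Coord]) -> int:
    """First people counting result minus the portraits marked abnormal."""
    return sum(1 for coord in first_coords if coord not in abnormal)
```

For example, with a threshold of 3, a portrait seen three times in the window is kept, while one that appeared only once is treated as abnormal and excluded from the reported count.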
7. A device for counting people based on face recognition, wherein the device includes:

An obtaining module, configured to obtain a video image from a video stream when a people counting instruction is received;

An extraction module, configured to perform picture segmentation on the video image to obtain multiple segmented images, and extract image LBP features of the multiple segmented images;

A recognition module, configured to input LBP features of the image into a pre-trained portrait recognition model, the portrait recognition model recognizes the LBP features of the image, and outputs a recognition result;

A statistics module, configured to count the number of the segmented images whose recognition result is a portrait, to obtain a first people counting result;

The extraction module is also used for:

Dividing the segmented image into multiple regions;

Comparing the gray value of each pixel in each region, taken as the center value, with the gray values of the 8 pixels adjacent to that pixel to obtain the LBP feature of the pixel;

Obtain a histogram of each area based on the LBP feature of the pixel;

Perform normalization processing on the histogram of each region to obtain a statistical histogram, and obtain the image LBP feature of the segmented image based on the statistical histogram.

8. The device according to claim 7, wherein the recognition module is further used for:

Collecting a preset number of sample images, and setting the label of each sample image as portrait or non-portrait;

After compressing the sample images to 128×128 pixels, performing gray-scale processing and random incomplete processing to obtain processed sample images;

Extracting LBP features from the processed sample images to obtain sample LBP features;

Inputting the sample LBP features into a neural network created based on TensorFlow for training to obtain the portrait recognition model, wherein the recognition result output by the portrait recognition model is a portrait or a non-portrait.

9. The device according to claim 7, wherein the extraction module is further used for:

Compressing the video image into a compressed video image of 512×512 pixels;

Segmenting the compressed video image according to 64×64 pixels to obtain multiple first segmented images;

Performing secondary picture segmentation on the overlapping areas of adjacent first segmented images, with 64 pixels as the starting point, to obtain a second segmented image.

10. The device according to claim 7, wherein the statistics module is further used for:

Reporting the first people counting result to a server according to a preset result reporting interface.

11. The device according to claim 10, wherein the statistics module is further used for:

Judging, according to a peak counting method, whether the first people counting result includes an abnormal portrait;

If the first people counting result does not include an abnormal portrait, performing the step of reporting the first people counting result to the server according to the preset result reporting interface;

If the first people counting result includes an abnormal portrait, removing the number of abnormal portraits from the first people counting result to obtain a second people counting result, and reporting the second people counting result to the server.

12. The device according to claim 11, wherein the statistics module is further used for:

Obtaining the number of times the portrait coordinates appear within a preset time;

If the number of times the portrait coordinates appear within the preset time is greater than or equal to a number threshold, determining that the portrait corresponding to the portrait coordinates is not an abnormal portrait;

If the number of times the portrait coordinates appear within the preset time is less than the number threshold, determining that the portrait corresponding to the portrait coordinates is an abnormal portrait, and marking that portrait as an abnormal portrait.

13. A device for counting people based on face recognition, wherein the device includes a processor, a memory, and computer-readable instructions for counting people based on face recognition stored in the memory, and when the computer-readable instructions are executed by the processor, the following steps are implemented:

When receiving the people counting instruction, obtain the video image from the video stream;

Performing picture segmentation on the video image to obtain multiple segmented images, and extracting image local binary pattern (LBP) features of the multiple segmented images;

Input the LBP features of the image into a pre-trained portrait recognition model, and the LBP features of the image are recognized by the portrait recognition model, and the recognition result is output;

Count the number of the segmented images whose recognition result is a portrait, and obtain a first people counting result;

The step of extracting image LBP features of a plurality of the segmented images includes:

Dividing the segmented image into multiple regions;

Comparing the gray value of each pixel in each region, taken as the center value, with the gray values of the 8 pixels adjacent to that pixel to obtain the LBP feature of the pixel;

Obtain a histogram of each area based on the LBP feature of the pixel;

Perform normalization processing on the histogram of each region to obtain a statistical histogram, and obtain the image LBP feature of the segmented image based on the statistical histogram.

14. The device according to claim 13, wherein the step of inputting the LBP features of the image into a pre-trained portrait recognition model, recognizing the LBP features of the image by the portrait recognition model, and outputting the recognition result further comprises:

Collecting a preset number of sample images, and setting the label of each sample image as portrait or non-portrait;

After compressing the sample images to 128×128 pixels, performing gray-scale processing and random incomplete processing to obtain processed sample images;

Extracting LBP features from the processed sample images to obtain sample LBP features;

Inputting the sample LBP features into a neural network created based on TensorFlow for training to obtain the portrait recognition model, wherein the recognition result output by the portrait recognition model is a portrait or a non-portrait.

15. The device according to claim 13, wherein the segmented images include a first segmented image and a second segmented image, and the step of performing picture segmentation on the video image to obtain multiple segmented images comprises:

Compressing the video image into a compressed video image of 512×512 pixels;

Segmenting the compressed video image according to 64×64 pixels to obtain multiple first segmented images;

Performing secondary picture segmentation on the overlapping areas of adjacent first segmented images, with 64 pixels as the starting point, to obtain a second segmented image.

16. The device according to claim 13, wherein after the step of counting the number of the segmented images whose recognition result is a portrait to obtain the first people counting result, the method further comprises:

Reporting the first people counting result to a server according to a preset result reporting interface.

17. The device according to claim 16, wherein before the step of reporting the first people counting result to the server according to the preset result reporting interface, the method further comprises:

Judging, according to a peak counting method, whether the first people counting result includes an abnormal portrait;

If the first people counting result does not include an abnormal portrait, performing the step of reporting the first people counting result to the server according to the preset result reporting interface;

If the first people counting result includes an abnormal portrait, removing the number of abnormal portraits from the first people counting result to obtain a second people counting result, and reporting the second people counting result to the server.

18. The device according to claim 17, wherein the portrait recognition model records the portrait coordinates of the portrait in each segmented image whose recognition result is a portrait, and the step of judging, according to the peak counting method, whether the first people counting result includes an abnormal portrait comprises:

Obtaining the number of times the portrait coordinates appear within a preset time;

If the number of times the portrait coordinates appear within the preset time is greater than or equal to a number threshold, determining that the portrait corresponding to the portrait coordinates is not an abnormal portrait;

If the number of times the portrait coordinates appear within the preset time is less than the number threshold, determining that the portrait corresponding to the portrait coordinates is an abnormal portrait, and marking that portrait as an abnormal portrait.

19. A computer storage medium, wherein the computer storage medium stores computer-readable instructions for counting people based on face recognition, and the following steps are implemented when the computer-readable instructions are executed by a processor:

When receiving the people counting instruction, obtain the video image from the video stream;

Performing picture segmentation on the video image to obtain multiple segmented images, and extracting image local binary pattern (LBP) features of the multiple segmented images;

Input the LBP features of the image into a pre-trained portrait recognition model, and the LBP features of the image are recognized by the portrait recognition model, and the recognition result is output;

Count the number of the segmented images whose recognition result is a portrait, and obtain a first people counting result;

The step of extracting image LBP features of a plurality of the segmented images includes:

Dividing the segmented image into multiple regions;

Comparing the gray value of each pixel in each region, taken as the center value, with the gray values of the 8 pixels adjacent to that pixel to obtain the LBP feature of the pixel;

Obtain a histogram of each area based on the LBP feature of the pixel;

Perform normalization processing on the histogram of each region to obtain a statistical histogram, and obtain the image LBP feature of the segmented image based on the statistical histogram.

20. The computer storage medium according to claim 19, wherein the step of inputting the LBP features of the image into a pre-trained portrait recognition model, recognizing the LBP features of the image by the portrait recognition model, and outputting the recognition result further comprises:

Collecting a preset number of sample images, and setting the label of each sample image as portrait or non-portrait;

After compressing the sample images to 128×128 pixels, performing gray-scale processing and random incomplete processing to obtain processed sample images;

Extracting LBP features from the processed sample images to obtain sample LBP features;

Inputting the sample LBP features into a neural network created based on TensorFlow for training to obtain the portrait recognition model, wherein the recognition result output by the portrait recognition model is a portrait or a non-portrait.