CN116521093B - Smart community face data storage method and system

Info

Publication number: CN116521093B
Application number: CN202310797180.XA
Authority: CN
Inventors: 张锦标; 郑飞龙; 黄茂三; 吴艳红; 林晓莲; 林建设
Original assignee: Zhangzhou Keheng Information Technology Co ltd
Current assignee: Zhangzhou Keheng Information Technology Co ltd
Priority date: 2023-07-03
Filing date: 2023-07-03
Publication date: 2023-09-15
Anticipated expiration: 2043-07-03
Also published as: CN116521093A

Abstract

The invention relates to the technical field of coding compression, and in particular to a method and a system for storing face data of an intelligent community. The method comprises: constructing an initial compression dictionary according to the frequency of each data in the face data sequence; in the process of compressing the face data sequence, predicting the local repetition probability of the sequence formed by the data to be matched and the additional item; and updating the compression dictionary according to the local repetition probability, so that data with a high local repetition probability are located relatively near the front of the compression dictionary. The values in the resulting coding sequence are therefore smaller and repeat more often, further compression of the coding sequence is more effective, and the storage space for the face data is saved.

Description

Smart community face data storage method and system

Technical Field

The invention relates to the technical field of coding compression, in particular to a method and a system for storing face data of an intelligent community.

Background

The intelligent community is a community management mode based on information technology. By establishing an intelligent community management platform together with information interaction and connection within the community, it makes the life of community residents intelligent, convenient and humanized, thereby improving the level of community governance and resident satisfaction.

Intelligent access control is one intelligent technology used in an intelligent community: user identities are rapidly distinguished through collected face data, and security authorization and authentication are carried out according to the permissions of different users.

Because a community has many residents, the volume of face data is huge, and in order to save storage space the face data needs to be compressed before storage. Existing compression methods, such as LZW coding, compress the face data while constructing a compression dictionary. However, when the encoding result of LZW coding is converted into binary for storage, the values in the encoding result are large, so the binary representation needs more bits, more storage space is occupied, and storage of the face data is hindered.

Disclosure of Invention

The invention provides a method and a system for storing face data of an intelligent community, which are used for solving the existing problems.

The intelligent community face data storage method adopts the following technical scheme:

the embodiment of the invention provides a method for storing face data of an intelligent community, which comprises the following steps:

acquiring a face image, and converting the face image into a face data sequence;

taking the same numerical value in the face data sequence as data, acquiring the frequency of each data in the face data sequence, and constructing an initial compression dictionary according to the frequency of each data; constructing a null sequence, and recording the null sequence as a coded sequence;

compressing the face data sequence:

S1: taking the first data in the face data sequence as data P to be matched;

S2: taking the next data in the face data sequence as an additional item C;

S3: searching the compression dictionary for the sequence P+C formed by the data P to be matched and the additional item C:

S301: if the sequence P+C exists in the compression dictionary, taking the sequence P+C as new data P to be matched;

S302: if the sequence P+C does not exist in the compression dictionary, outputting a coding result; adding the data P to be matched to the coded sequence; obtaining the local repetition probability of the sequence P+C according to the frequency, in the face data sequence and in the coded sequence, of each data in the data P to be matched; updating the compression dictionary according to the local repetition probability; and taking the additional item C as new data P to be matched;

S4: repeating S2-S3, all the output coding results forming a coding sequence;

S5: compressing the coding sequence to obtain compressed data;

the compressed data is stored.

Preferably, the constructing an initial compression dictionary according to the frequency of each data includes the following specific steps:

constructing an empty dictionary comprising a sequence number column and a data column; sorting all kinds of data in descending order of their frequency in the face data sequence; adding all kinds of data to the data column of the dictionary in the sorted order, the values of the sequence number column increasing sequentially from 1; and taking the dictionary as the initial compression dictionary.

Preferably, the outputting the encoding result includes the following specific steps:

outputting the serial number corresponding to the data P to be matched in the dictionary as the coding result.

Preferably, obtaining the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C includes the following specific steps:

[Formula for the local repetition probability: image not reproduced in this text. The symbol names below are assigned here to the surviving definitions.]

wherein $R$ is the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C; $A$ is the set formed by the frequencies, in the face data sequence, of each data contained in the data P to be matched; $\min(\cdot)$ is the minimum function; $b_i$ is the frequency with which the $i$-th data in the data P to be matched appears in the coded sequence; $n$ is the number of data contained in the data P to be matched; $B_j$ is the set of the frequencies, in the face data sequence, of each data in the $j$-th element of the coded sequence that contains the data P to be matched; $m$ is the number of elements in the coded sequence that contain the data P to be matched; $\exp(\cdot)$ is the exponential function with the natural constant as its base.

Preferably, the updating the compression dictionary according to the local repetition probability comprises the following specific steps:

when the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C is greater than or equal to a preset repetition threshold, inserting the sequence P+C at the position of P in the data column of the compression dictionary, and moving P and all the data after P in the data column back by one position;

and when the local repetition probability of the sequence P+C is smaller than the preset repetition threshold, adding the sequence P+C to the end of the data column of the compression dictionary.
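The update rule can be illustrated with a minimal sketch. It is not the patented implementation: the data column is modelled as a Python list whose index plus one plays the role of the sequence number, the name `update_dictionary` and its parameters are hypothetical, and the default threshold of 0.5 is the example value used in the detailed description.

```python
def update_dictionary(data_column, p, pc, probability, threshold=0.5):
    """Insert the sequence P+C into the data column of the dictionary.

    data_column: list of tuples; position i holds the data with serial number i + 1.
    p:  the data P to be matched (a tuple of values already in the dictionary).
    pc: the sequence P+C (a tuple) that is not yet in the dictionary.
    """
    if probability >= threshold:
        # P+C takes the position of P; P and everything after it move back by one.
        data_column.insert(data_column.index(p), pc)
    else:
        # Low local repetition probability: P+C goes to the end of the data column.
        data_column.append(pc)
    return data_column


column = [(5,), (3,), (7,)]
print(update_dictionary(list(column), (3,), (3, 7), probability=0.8))
print(update_dictionary(list(column), (3,), (3, 7), probability=0.2))
```

Because only the data column is shifted, the sequence P+C inherits the small serial number that P had, while P itself becomes one larger.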

The embodiment of the invention provides a smart community face data storage system, which comprises:

the face data acquisition module acquires face images and converts the face images into a face data sequence;

the compression dictionary construction module is used for taking the same numerical value in the face data sequence as data, obtaining the frequency of each data in the face data sequence and constructing an initial compression dictionary according to the frequency of each data;

the face data compression module constructs an empty sequence, and marks the empty sequence as a coded sequence;

compressing the face data sequence:

S1: taking the first data in the face data sequence as data P to be matched;

S2: taking the next data in the face data sequence as an additional item C;

S5: compressing the coding sequence to obtain compressed data;

the face data storage module is used for storing the compressed data;

and the face data decompression module decompresses the compressed data.

The technical scheme of the invention has the following beneficial effects. Most of the values in the encoding result of LZW coding are large. When the encoding result is stored it must be converted into binary, and the length of the binary data depends on the largest value in the encoding result, so large values lead to more bits and a larger storage space. In the present method, the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C is predicted from the data P to be matched, and the sequence P+C is added to the compression dictionary according to that probability, so that data with a high local repetition probability are located relatively near the front of the compression dictionary and can be encoded with a smaller value when they appear again in the face data sequence. By dynamically adjusting the positions of the data in the compression dictionary, all data with a high local repetition rate can be encoded with smaller values, so that the values in the coding sequence are smaller and repeat more often, further compression of the coding sequence is more effective, and the storage space for the face data is saved.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart showing steps of a method for storing face data of an intelligent community according to the present invention;

FIG. 2 is a block diagram of a smart community face data storage system according to the present invention.

Detailed Description

In order to further describe the technical means adopted by the invention to achieve its intended purpose and the effects thereof, the specific implementation, structure, features and effects of the intelligent community face data storage method according to the invention are described in detail below with reference to the accompanying drawings and the preferred embodiment. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following specifically describes a specific scheme of the intelligent community face data storage method provided by the invention with reference to the accompanying drawings.

Referring to fig. 1, a flowchart of steps of a method for storing face data of an intelligent community according to an embodiment of the invention is shown, the method includes the following steps:

S001, collecting face data.

After user authorization, face images of community personnel are acquired through the intelligent community APP, and the size of each face image is recorded as M×N. To facilitate the subsequent compression and storage of the face image, the gray values of all pixels in the face image are scanned in a zigzag scanning mode and unfolded into a one-dimensional sequence, which is recorded as the face data sequence. In other embodiments, the practitioner may use other scanning methods.

Thus, the acquisition of the face data is realized, and the face data sequence is obtained.
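The unfolding of the gray values can be sketched as below. The text only says "zigzag scanning mode", so this sketch assumes the anti-diagonal zigzag order familiar from JPEG; a row-by-row serpentine scan would be an equally plausible reading. The function name `zigzag_scan` is hypothetical, and the image is assumed to be a non-empty list of equal-length rows of gray values.

```python
def zigzag_scan(image):
    """Unfold an M x N gray image (list of rows) into a one-dimensional sequence.

    Assumption: anti-diagonal zigzag order, alternating direction on each diagonal.
    """
    rows, cols = len(image), len(image[0])
    sequence = []
    for s in range(rows + cols - 1):
        # Cells on anti-diagonal s satisfy row + col == s.
        row_range = range(max(0, s - cols + 1), min(s, rows - 1) + 1)
        if s % 2 == 0:
            # Even diagonals are walked from bottom-left to top-right.
            row_range = reversed(row_range)
        sequence.extend(image[r][s - r] for r in row_range)
    return sequence


print(zigzag_scan([[1, 2, 3],
                   [4, 5, 6]]))  # [1, 2, 4, 5, 3, 6]
```

Neighbouring pixels tend to stay adjacent in the resulting sequence, which is consistent with the local similarity the method relies on.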

S002, constructing an initial compression dictionary.

It should be noted that LZW coding is a data compression method that compresses data while constructing a compression dictionary. The embodiment of the invention is an improvement on LZW coding, so an initial compression dictionary needs to be constructed first. To ensure that data occurring more frequently in the face data sequence receive small coding results, the initial compression dictionary is constructed according to the frequency of the data in the face data sequence, and it must cover all the different values in the face data sequence.

In the embodiment of the invention, an empty dictionary is constructed, comprising a sequence number column and a data column. The sequence number column stores the sequence number of the data, and the data column stores the coding object. The values of the sequence number column increase sequentially from 1.

Identical values in the face data sequence are regarded as one kind of data. The frequency of each kind of data in the face data sequence is counted, all kinds of data are sorted in descending order of frequency and added to the data column of the dictionary in that order, and the dictionary is taken as the initial compression dictionary.

Thus, an initial compression dictionary is obtained.
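The construction of the initial compression dictionary can be sketched as follows. The function name `build_initial_dictionary` is hypothetical, and the tie-breaking rule (equal frequencies ordered by ascending value) is an assumption, since the text does not state one; each entry is stored as a one-element tuple so that longer sequences can be added later in the same form.

```python
from collections import Counter


def build_initial_dictionary(face_data):
    """Map sequence numbers (starting at 1) to single-value data, most frequent first."""
    freq = Counter(face_data)
    # Descending frequency; ties broken by ascending value (an assumption).
    ordered = sorted(freq, key=lambda value: (-freq[value], value))
    return {serial: (value,) for serial, value in enumerate(ordered, start=1)}


print(build_initial_dictionary([7, 7, 3, 7, 9, 3]))
# {1: (7,), 2: (3,), 3: (9,)}
```

Whatever tie-breaking rule is chosen must be reproducible from the stored frequencies alone, because the decompression side rebuilds the same dictionary from them.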

S003, compressing the face data sequence to obtain compressed data.

It should be noted that, during encoding, LZW coding continuously adds character strings of length greater than 1 to the compression dictionary, so that when the same string appears again in the data it can be encoded as a single value according to the compression dictionary, thereby compressing the data. However, the values in the LZW encoding result repeat very rarely, and because each new character string appearing in the data is added to the end of the compression dictionary, the dictionary becomes very large and most values in the encoding result are large. All data in a computer are stored in binary form. When the LZW encoding result is stored, it must be converted into binary; the length of the binary data depends on the largest value in the encoding result, so the larger the values, the more bits are needed and the more storage space is occupied. Therefore, the embodiment of the invention takes into account the probability that a character string will appear again and dynamically adds the string at a relatively forward position in the compression dictionary, so that the values in the encoding result are smaller and repeat more often. This improves the efficiency of LZW coding and allows the encoding result to be compressed further, improving the overall compression efficiency.

In the embodiment of the invention, an empty sequence is constructed and recorded as the coded sequence; the coded sequence stores the data of the face data sequence that have already been encoded.
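The dependence of storage size on the largest code value can be made concrete with a small calculation. This sketch assumes the fixed-width storage the paragraph describes, where every code is written with as many bits as the largest code needs; the function names are hypothetical.

```python
def bits_per_code(codes):
    """Bits needed for one code when every code uses the width of the largest value."""
    return max(codes).bit_length()


def stored_bits(codes):
    """Total size of a coding sequence stored with fixed-width binary codes."""
    return len(codes) * bits_per_code(codes)


print(stored_bits([1, 3, 2, 300]))  # 4 codes x 9 bits = 36
print(stored_bits([1, 3, 2, 6]))    # 4 codes x 3 bits = 12
```

A single large serial number widens every code in the sequence, which is why keeping frequently reused entries near the front of the dictionary reduces the stored size.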

Compressing the face data sequence:

1. taking the first data in the face data sequence as data P to be matched;

2. taking the next data in the face data sequence as an additional item C;

3. searching a sequence P+C formed by the data P to be matched and the additional item C in the compression dictionary:

(1) If the sequence P+C formed by the data P to be matched and the additional item C exists in the compression dictionary, taking the sequence P+C formed by the data P to be matched and the additional item C as new data P to be matched;

(2) If the sequence P+C formed by the data P to be matched and the additional item C does not exist in the compression dictionary, outputting the corresponding sequence number of the data P to be matched in the dictionary as a coding result, and adding the data P to be matched into the coded sequence.

Calculating the local repetition probability of a sequence P+C formed by the data P to be matched and the additional item C:

[Formula for the local repetition probability: image not reproduced in this text. The symbol names below are assigned here to the surviving definitions.]

wherein $R$ is the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C; $A$ is the set formed by the frequencies, in the face data sequence, of each data contained in the data P to be matched; $\min(\cdot)$ is the minimum function. The frequency of the data P to be matched in the face data sequence is bounded by the frequencies of all the data constituting it, so the maximum possible frequency of the data P to be matched is $\min(A)$.

$b_i$ is the frequency with which the $i$-th data in the data P to be matched appears in the coded sequence; $n$ is the number of data contained in the data P to be matched, i.e. the length of the data P to be matched; $\sum_{i=1}^{n} b_i$ is the sum of the frequencies with which all the data in the data P to be matched appear in the coded sequence. The ratio term built on this sum represents the encoded proportion of all the data in the data P to be matched; the smaller this proportion, the more likely the data P to be matched is to appear again later.

$B_j$ is the set of the frequencies, in the face data sequence, of each data in the $j$-th element of the coded sequence that contains the data P to be matched; $\min(B_j)$ is the maximum possible frequency of that $j$-th element; $m$ is the number of elements in the coded sequence that contain the data P to be matched; $\exp(\cdot)$ is the exponential function with the natural constant as its base; $\sum_{j=1}^{m} \min(B_j)$ is the sum of the maximum possible frequencies of all elements of the coded sequence that contain the data P to be matched. Since all of these elements differ from the sequence P+C, the ratio term built on this sum expresses the proportion in which the data P to be matched has formed sequences with data other than C. The larger this ratio, the smaller the probability that the sequence P+C will be repeated later; the smaller this ratio, the larger that probability, because of the local similarity of the face data sequence.

A repetition threshold T is preset. This embodiment is described taking T=0.5 as an example; the value is not specifically limited, and T may be determined according to the specific implementation. When the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C is greater than or equal to the repetition threshold T, the compression dictionary is updated: the sequence P+C is inserted at the position of P in the data column of the compression dictionary, and P and all the data after it in the data column are moved back by one position. When the compression dictionary is updated, only the positions of the data in the data column are adjusted; the positions of the sequence numbers in the sequence number column are not adjusted.

When the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C is smaller than the repetition threshold T, the sequence P+C formed by the data P to be matched and the additional item C is added to the tail end of the compression dictionary data column.

The additional item C is taken as new data P to be matched.

4. Repeating steps 2-3 until all data in the face data sequence have been encoded, then stopping the iteration. All the output coding results form a one-dimensional sequence, which is recorded as the coding sequence.

5. Compressing the coding sequence using Huffman coding to obtain compressed data. It should be noted that in other embodiments the practitioner may compress the coding sequence by other compression methods, including but not limited to Shannon coding and arithmetic coding.

Thus, the face data sequence is compressed, and compressed data is obtained.
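Steps 1 to 4 can be sketched as a single loop. This is an illustration under stated assumptions rather than the patented implementation: the formula for the local repetition probability is not legible in this text, so it is passed in as a hypothetical callable `prob_fn(p, coded)` that depends only on P and the coded sequence; the dictionary is a plain list searched linearly, and ties in the initial ordering are broken by ascending value.

```python
from collections import Counter


def encode(face_data, prob_fn, threshold=0.5):
    """Return (coding sequence, final data column) for a face data sequence."""
    if not face_data:
        return [], []
    freq = Counter(face_data)
    # Initial dictionary: single values, most frequent first (serial = index + 1).
    dictionary = [(v,) for v in sorted(freq, key=lambda v: (-freq[v], v))]
    coded = []   # coded sequence: data that have already been encoded
    codes = []   # coding sequence: serial numbers that were output
    p = (face_data[0],)
    for c in face_data[1:]:
        pc = p + (c,)
        if pc in dictionary:
            p = pc                       # step 3(1): extend the match
            continue
        index = dictionary.index(p)      # step 3(2): output the serial number of P
        codes.append(index + 1)
        coded.append(p)
        if prob_fn(p, coded) >= threshold:
            dictionary.insert(index, pc)  # P and all later data shift back by one
        else:
            dictionary.append(pc)
        p = (c,)
    codes.append(dictionary.index(p) + 1)  # flush the last pending match
    coded.append(p)
    return codes, dictionary


data = [7, 7, 7, 3, 7, 7]
print(encode(data, lambda p, coded: 0.0)[0])  # always append: [1, 3, 2, 3]
print(encode(data, lambda p, coded: 1.0)[0])  # always insert at P: [1, 1, 4, 2]
```

With the constant stand-in `0.0` the loop behaves like LZW over a frequency-sorted initial dictionary; with `1.0` every new sequence is placed at the position of P. A real `prob_fn` would sit between these two extremes.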

It should be noted that, in the embodiment of the present invention, the local repetition probability of the sequence p+c formed by the data P to be matched and the additional item C is predicted according to the data P to be matched, and the sequence p+c is added to the compression dictionary according to the local repetition probability, so that the data with high local repetition probability is located at a position relatively forward in the compression dictionary, and when the data appears again in the face data sequence, the data can be encoded with a smaller value. By dynamically adjusting the positions of the data in the compression dictionary, all data with large local repetition rate can be encoded by using smaller values, so that the values in the encoded sequence are smaller and the repetition rate is larger. The compression effect of further compressing the coding sequence by using Huffman coding is better.
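Why a coding sequence with small, frequently repeated values suits Huffman coding can be seen from the code lengths it assigns. The following is a generic Huffman sketch built on the standard-library `heapq` module, not code from the patent; the function name `huffman_code_lengths` is hypothetical, and it returns only the bit length per symbol rather than the bit strings.

```python
import heapq
from collections import Counter


def huffman_code_lengths(codes):
    """Return {symbol: Huffman code length in bits} for a coding sequence."""
    freq = Counter(codes)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap items: (weight, unique tie-breaker, symbols in this subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1            # merged symbols move one level deeper
        heapq.heappush(heap, (w1 + w2, tie, syms1 + syms2))
        tie += 1
    return lengths


codes = [1, 1, 1, 1, 2, 2, 3]
lengths = huffman_code_lengths(codes)
print(lengths)                               # {1: 1, 2: 2, 3: 2}
print(sum(lengths[c] for c in codes))        # 10 bits in total
```

The more often a serial number repeats in the coding sequence, the shorter its Huffman code, so a skewed distribution of codes compresses better than a sequence of mostly distinct large values.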

S004, storing the compressed data.

The compressed data, the frequency of each data in the face data sequence obtained in step S002, and the size M×N of the face image obtained in step S001 are stored on the face data storage server.

S005, decompressing the compressed data.

When the face data needs to be read, the compressed data needs to be decompressed, specifically:

An initial compression dictionary is constructed by the method in step S002 according to the frequency of each data in the face data sequence. The compressed data are decompressed by Huffman decoding to obtain the coding sequence.

An empty sequence is constructed and recorded as the coded sequence.

Decoding the encoded sequence:

1. Taking the first element in the coding sequence as the element to be decoded.

2. Acquiring the data in the compression dictionary whose serial number equals the element to be decoded, and recording it as the decoded data P.

3. Adding the decoded data P to the coded sequence. Since the additional item C is the data following the decoded data P in the face data sequence, the additional item C is unknown until the following data have been decoded. The additional item C is therefore treated as an unknown, and the local repetition probability of the sequence P+C is obtained from the decoded data P. When the local repetition probability is greater than or equal to the repetition threshold T, P and all the data after it in the compression dictionary are moved back by one position, and the original position of P in the compression dictionary is left as a blank position; otherwise, consistent with the compression step, the blank position is at the end of the data column.

Since the calculation of the local repetition probability of the sequence P+C depends only on P, the local repetition probability of the sequence P+C can be obtained from P alone while the additional item C is unknown.

4. Taking the next element in the coding sequence as the new element to be decoded, acquiring the first data of the dictionary entry whose serial number equals the element to be decoded as the additional item C, and adding the sequence P+C to the blank position of the compression dictionary. The data in the compression dictionary whose serial number equals the element to be decoded are then acquired as the new decoded data P.

5. Repeating steps 3-4 until all elements in the coding sequence have been traversed, then splitting the elements of length greater than 1 in the coded sequence into elements of length 1 to form a new sequence, which is recorded as the face data sequence.
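The decoding steps can be sketched as follows. As with any illustration here, this is a sketch under assumptions: `prob_fn(p, coded)` is a hypothetical stand-in for the local repetition probability (which must depend only on P and the coded sequence, as the text requires), `freq` is the stored frequency table, and ties in the initial ordering are broken by ascending value. The branch for a code that points at the blank position is an addition modelled on the corresponding special case of standard LZW decoding; the text does not spell it out.

```python
def decode(codes, freq, prob_fn, threshold=0.5):
    """Rebuild the face data sequence from a coding sequence and the frequency table."""
    # Same initial dictionary as on the compression side (serial = index + 1).
    dictionary = [(v,) for v in sorted(freq, key=lambda v: (-freq[v], v))]
    coded = []      # coded sequence: decoded data P, in order
    blank = None    # index of the blank position waiting for P+C
    prev = None     # previously decoded data P
    for code in codes:
        index = code - 1
        if blank is not None:
            # The additional item C is the first value of the entry being decoded.
            # If the code points at the blank itself, that entry is P+C, which
            # starts with P, so C equals the first value of P (assumed special case).
            c = prev[0] if index == blank else dictionary[index][0]
            dictionary[blank] = prev + (c,)
        p = dictionary[index]
        coded.append(p)
        if prob_fn(p, coded) >= threshold:
            dictionary.insert(index, None)   # P and later data shift back by one
            blank = index
        else:
            dictionary.append(None)          # blank position at the end
            blank = len(dictionary) - 1
        prev = p
    # Split elements longer than 1 into single values.
    return [value for element in coded for value in element]


freq = {7: 5, 3: 1}
print(decode([1, 1, 4, 2], freq, lambda p, coded: 1.0))  # [7, 7, 7, 3, 7, 7]
print(decode([1, 3, 2, 3], freq, lambda p, coded: 0.0))  # [7, 7, 7, 3, 7, 7]
```

Both coding sequences in the usage lines were produced from the sequence 7, 7, 7, 3, 7, 7 by applying the compression steps with the same constant stand-in probabilities.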

The face data sequence is then filled into an empty matrix of size M×N, following the same scanning order used in step S001, to obtain the face image.
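Filling the matrix is the inverse of the scan used when the sequence was produced. The sketch below assumes the anti-diagonal zigzag order familiar from JPEG, which is only one reading of "zigzag scanning mode"; the function names `zigzag_order` and `fill_matrix` are hypothetical.

```python
def zigzag_order(rows, cols):
    """List the (row, col) positions of an M x N matrix in anti-diagonal zigzag order."""
    order = []
    for s in range(rows + cols - 1):
        row_range = range(max(0, s - cols + 1), min(s, rows - 1) + 1)
        if s % 2 == 0:
            row_range = reversed(row_range)  # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in row_range)
    return order


def fill_matrix(face_data, rows, cols):
    """Place a one-dimensional face data sequence back into an M x N image."""
    if len(face_data) != rows * cols:
        raise ValueError("sequence length does not match M x N")
    image = [[0] * cols for _ in range(rows)]
    for value, (r, c) in zip(face_data, zigzag_order(rows, cols)):
        image[r][c] = value
    return image


print(fill_matrix([1, 2, 4, 5, 3, 6], 2, 3))  # [[1, 2, 3], [4, 5, 6]]
```

This is why the size M×N has to be stored alongside the compressed data: the sequence alone does not determine the shape of the image.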

Through the steps, the compression storage and decompression of the face data are completed.

Referring to fig. 2, a block diagram of a smart community face data storage system according to an embodiment of the present invention is shown, the system includes:

The face data acquisition module acquires face images and converts the face images into a face data sequence.

The compression dictionary construction module is used for taking the same numerical value in the face data sequence as data, obtaining the frequency of each data in the face data sequence and constructing an initial compression dictionary according to the frequency of each data.

The face data compression module constructs an empty sequence and records the empty sequence as the coded sequence.

Compressing the face data sequence:

1: taking the first data in the face data sequence as data P to be matched.

2: taking the next data in the face data sequence as an additional item C.

3: searching the compression dictionary for the sequence P+C formed by the data P to be matched and the additional item C:

(1): if the sequence P+C exists in the compression dictionary, taking the sequence P+C as new data P to be matched.

(2): if the sequence P+C formed by the data P to be matched and the additional item C does not exist in the compression dictionary, outputting a coding result; adding data P to be matched into the coded sequence; according to the frequency of each data in the data P to be matched in the face data sequence and the coded sequence, the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C is obtained; updating the compression dictionary according to the local repetition probability; the additional item C is taken as new data P to be matched.

4: repeating steps 2-3, all the output coding results forming the coding sequence.

5: compressing the coding sequence to obtain compressed data.

And the face data storage module is used for storing the compressed data.

And the face data decompression module decompresses the compressed data.

According to the embodiment of the invention, the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C is predicted by the data P to be matched, and the sequence P+C is added into the compression dictionary according to the local repetition probability, so that the data with high local repetition probability is positioned at a position relatively forward in the compression dictionary, and when the data in the face data sequence reappears, the data can be encoded by a smaller numerical value. By dynamically adjusting the positions of the data in the compression dictionary, all data with large local repetition rate can be encoded by using smaller values, so that the values in the encoding sequence are smaller and the repetition rate is larger, the compression effect of further compressing the encoding sequence is better, and the storage space of the face data is saved.

The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims

1. A method for storing face data of an intelligent community is characterized by comprising the following steps:

compressing the face data sequence:

S1: taking the first data in the face data sequence as data P to be matched;

S2: taking the next data in the face data sequence as an additional item C;

S5: compressing the coding sequence to obtain compressed data;

storing the compressed data;

the method for obtaining the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C comprises the following specific steps:

[Formula for the local repetition probability: image not reproduced in this text. The symbol names below are assigned here to the surviving definitions.]

wherein $R$ is the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C; $A$ is the set formed by the frequencies, in the face data sequence, of each data contained in the data P to be matched; $\min(\cdot)$ is the minimum function; $b_i$ is the frequency with which the $i$-th data in the data P to be matched appears in the coded sequence; $n$ is the number of data contained in the data P to be matched; $B_j$ is the set of the frequencies, in the face data sequence, of each data in the $j$-th element of the coded sequence that contains the data P to be matched; $m$ is the number of elements in the coded sequence that contain the data P to be matched; $\exp(\cdot)$ is the exponential function with the natural constant as its base;

the updating of the compression dictionary according to the local repetition probability comprises the following specific steps:

2. The method for storing face data of a smart community according to claim 1, wherein the constructing an initial compression dictionary according to the frequency of each data comprises the following specific steps:

3. The method for storing face data of smart community according to claim 1, wherein the outputting the encoded result comprises the following specific steps:

4. A smart community face data storage system, the system comprising:

compressing the face data sequence:

S1: taking the first data in the face data sequence as data P to be matched;

S2: taking the next data in the face data sequence as an additional item C;

S5: compressing the coding sequence to obtain compressed data;

the face data storage module is used for storing the compressed data;

the face data decompression module decompresses the compressed data;

[Formula for the local repetition probability: image not reproduced in this text. The symbol names below are assigned here to the surviving definitions.]

wherein $R$ is the local repetition probability of the sequence P+C formed by the data P to be matched and the additional item C; $A$ is the set formed by the frequencies, in the face data sequence, of each data contained in the data P to be matched; $\min(\cdot)$ is the minimum function; $b_i$ is the frequency with which the $i$-th data in the data P to be matched appears in the coded sequence; $n$ is the number of data contained in the data P to be matched; $B_j$ is the set of the frequencies, in the face data sequence, of each data in the $j$-th element of the coded sequence that contains the data P to be matched; $m$ is the number of elements in the coded sequence that contain the data P to be matched; $\exp(\cdot)$ is the exponential function with the natural constant as its base;