CN112214637A - Data storage method applied to smart community

Info

Publication number: CN112214637A
Application number: CN202011089278.2A
Authority: CN
Inventors: 麦雪楹
Original assignee: Guangzhou Feike Technology Co ltd
Current assignee: Guangzhou Feike Technology Co ltd
Priority date: 2020-10-13
Filing date: 2020-10-13
Publication date: 2021-01-12

Abstract

The application discloses a data storage method applied to a smart community, comprising the following steps: edge nodes collect multi-source heterogeneous data uploaded by a plurality of terminals in the smart community; the edge nodes upload the multi-source heterogeneous data to a cloud server; the cloud server normalizes the multi-source heterogeneous data and labels it; the cloud server monitors the storage capacity of each edge node and calculates the remaining storage capacity of each edge node; when the remaining storage capacity of a first edge node is lower than a first preset threshold, the cloud server establishes a multi-source heterogeneous data index table for the first edge node, wherein the index table is used for indexing the multi-source heterogeneous data to be compressed; the cloud server sets a compression strategy for the first edge node and sends the compression strategy to the first edge node, wherein the compression strategy sets a corresponding compression level for each label; and the first edge node compresses the stored multi-source heterogeneous data based on the compression strategy.

Description

Data storage method applied to smart community

Technical Field

The application relates to the field of information technology, and in particular to a data storage method applied to a smart community.

Background

A smart community relies heavily on the Internet and the Internet of Things and covers fields such as smart buildings, smart homes, road network monitoring, smart hospitals, urban lifeline management, food and drug management, ticket management, home care, personal health and digital life. It seizes the opportunities of the new round of technological innovation and the wave of the information industry, and makes full use of a developed information and communication technology (ICT) industry, advanced RFID-related technologies, excellent telecommunication services and information infrastructure. By building ICT infrastructure, platforms for authentication and security, and demonstration projects, it accelerates breakthroughs in key industrial technologies and the construction of an intelligent environment for community development, forms new modes of living, industrial development and social management based on massive information and intelligent filtering, and is oriented toward building a brand-new form of community.

In current smart communities, multi-source heterogeneous data is collected and stored by edge nodes set up inside each community and by a cloud server set up between communities. However, the volume of multi-source heterogeneous data keeps growing over time, so how to reduce the data volume is a difficult problem that a smart community needs to solve.

Disclosure of Invention

The embodiments of the application provide a data storage method applied to a smart community, which is used to solve the prior-art problem of excessive storage consumption caused by redundant data in smart communities.

The embodiment of the invention provides a data storage method applied to an intelligent community, which comprises the following steps:

edge nodes collect multi-source heterogeneous data uploaded by a plurality of terminals in a smart community, wherein the multi-source heterogeneous data comprises video monitoring data, face data, community environmental noise data, community temperature and humidity data and community parking condition data;

the edge node uploads the multi-source heterogeneous data to a cloud server;

the cloud server carries out normalization processing on the multi-source heterogeneous data and carries out data labeling, wherein the data labeling comprises storage time, data activity, importance level, data size and compression environment;

the cloud server monitors the storage capacity of each edge node and calculates the residual storage capacity of each edge node;

when the residual storage capacity of a first edge node is lower than a first preset threshold value, the cloud server establishes a multi-source heterogeneous data index table of the first edge node, wherein the index table is used for indexing the multi-source heterogeneous data to be compressed;

the cloud server sets a compression strategy of the first edge node, and sends the compression strategy to the first edge node, wherein the compression strategy is used for setting corresponding compression levels for different labels;

and the first edge node compresses the stored multi-source heterogeneous data based on the compression strategy.

Optionally, the cloud server sets a compression policy of the first edge node, including:

dividing the storage time into recent data (within 3 months), medium-term data (more than 3 months but less than 6 months) and early data (more than 6 months), and setting a corresponding score for the data of each period;

dividing the activity into high activity (call frequency higher than 5 times per month), medium activity (call frequency lower than 5 times but more than 2 times per month) and low activity (call frequency lower than 2 times per month), and setting a corresponding score for each activity level;

dividing the importance level into important, more important and general, and setting a corresponding score for each importance level;

dividing the data size into high data size (greater than or equal to 10 GB), medium data size (less than 10 GB and greater than or equal to 5 GB) and low data size (less than 5 GB), and setting a corresponding score for each data size;

and scoring the multi-source heterogeneous data stored in the first edge node on each item in turn according to the scoring rules, accumulating the item scores to obtain a final score, and performing no compression if the final score exceeds a second threshold, moderate compression if the final score is lower than the second threshold and exceeds a third threshold, and deep compression if the final score is lower than the third threshold, wherein the second threshold is higher than the third threshold.
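The scoring rule above can be sketched as follows. This is only a minimal sketch under stated assumptions: the point values, the two threshold values, the choice that larger data earns fewer points, and the names `DataTags` and `compression_level` are all hypothetical, since the application does not fix concrete scores. Boundary cases that the text leaves open (exactly 3 months, exactly 2 or 5 calls per month, a score equal to a threshold) are assigned arbitrarily here.

```python
from dataclasses import dataclass

# Hypothetical point values; the application only states that each tier has a score.
HIGH, MEDIUM, LOW = 3, 2, 1
SECOND_THRESHOLD = 9  # assumed value
THIRD_THRESHOLD = 6   # assumed value, must be lower than SECOND_THRESHOLD


@dataclass
class DataTags:
    age_months: float       # storage time
    calls_per_month: float  # data activity
    importance: str         # "important", "more important" or "general"
    size_gb: float          # data size


def storage_time_points(age_months: float) -> int:
    if age_months <= 3:
        return HIGH    # recent data
    if age_months < 6:
        return MEDIUM  # medium-term data
    return LOW         # early data


def activity_points(calls_per_month: float) -> int:
    if calls_per_month > 5:
        return HIGH
    if calls_per_month >= 2:
        return MEDIUM
    return LOW


def importance_points(importance: str) -> int:
    return {"important": HIGH, "more important": MEDIUM, "general": LOW}[importance]


def size_points(size_gb: float) -> int:
    # Assumption: larger data earns fewer points, so it is compressed sooner.
    if size_gb >= 10:
        return LOW
    if size_gb >= 5:
        return MEDIUM
    return HIGH


def compression_level(tags: DataTags) -> str:
    """Accumulate the per-item scores and map the total to a compression level."""
    score = (
        storage_time_points(tags.age_months)
        + activity_points(tags.calls_per_month)
        + importance_points(tags.importance)
        + size_points(tags.size_gb)
    )
    if score > SECOND_THRESHOLD:
        return "none"
    if score >= THIRD_THRESHOLD:
        return "moderate"
    return "deep"
```

With these assumed values, a recent, frequently called, important and small item scores 12 and is left uncompressed, while an old, unused, general and large item scores 4 and is deeply compressed.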

Optionally, the data tag further includes a data association for indicating a degree of association between different types of multi-source heterogeneous data, and the method further includes:

dividing the data association into strong association and weak association, and combining a plurality of strongly associated data items into a data combination;

and acquiring the compression levels of the data in the strongly associated data combination, and, if the compression levels within the combination are inconsistent, acquiring the lowest compression level in the combination and adjusting the compression levels of all data in the combination to that lowest compression level.

Optionally, the data tag further includes a key event tag, and the method further includes:

if the key event label of the multi-source heterogeneous data is true, adjusting the compression level of the multi-source heterogeneous data and of the data strongly associated with it to no compression;

and if the key event label of the multi-source heterogeneous data is false, leaving the compression level of the multi-source heterogeneous data and of the data strongly associated with it unchanged.

Optionally, the method further comprises:

and the first edge node performs redundancy backup on the multi-source heterogeneous data according to the importance level and migrates the data of the redundancy backup to a second edge node, wherein the residual storage capacity of the second edge node is higher than the first preset threshold value.

Optionally, the compressing, by the first edge node, the stored multi-source heterogeneous data based on the compression policy includes:

evaluating the anomaly probability of each type of multi-source heterogeneous data;

setting different adaptive data sampling frequencies, wherein each data sampling frequency corresponds to the anomaly probability of the respective type of multi-source heterogeneous data and to its compression level;

sampling the stored multi-source heterogeneous data at the data sampling frequency, retaining only the sampled multi-source heterogeneous data, and deleting the multi-source heterogeneous data that was stored before sampling;

converting the sampled multi-source heterogeneous data into binary;

and representing any run of more than 8 consecutive '0's or '1's in the binary data with an eight-bit code, wherein the eight-bit code comprises a start bit, a value bit indicating '0' or '1', 4 bits giving the number of consecutive digits, and an end bit.
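The first three steps above (anomaly probability, adaptive sampling frequency, sampling) might look like the sketch below. It rests on several assumptions that are not in the application: the anomaly probability is taken as the plain historical anomaly frequency, the sampling frequency is expressed as a stride ("keep every n-th group"), and the stride table and the 0.1 cut-off are invented for illustration.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

# Hypothetical base strides per compression level (keep every n-th group).
BASE_STRIDE = {"none": 1, "moderate": 2, "deep": 4}


def anomaly_probability(anomalous: int, total: int) -> float:
    """Historical anomaly frequency, used here as the probability estimate."""
    if total <= 0:
        return 0.0
    return anomalous / total


def sampling_stride(probability: float, level: str) -> int:
    """Choose a stride from the compression level and the anomaly probability."""
    stride = BASE_STRIDE[level]
    # Assumed rule: data types that are often anomalous are sampled twice as densely.
    if probability >= 0.1 and stride > 1:
        stride //= 2
    return stride


def downsample(records: Sequence[T], stride: int) -> List[T]:
    """Keep every stride-th group; the caller discards the original records."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return list(records[::stride])
```

For instance, six stored groups sampled with a stride of 2 leave three groups, and the other three are discarded.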

Optionally, before the first edge node compresses the stored multi-source heterogeneous data based on the compression policy, the method further includes:

and setting a dynamic random number to encrypt the multi-source heterogeneous data.

Optionally, the method further comprises:

monitoring the operation logs of a plurality of terminals in the intelligent community, identifying abnormal events in the operation logs, and recording the time corresponding to the abnormal events;

and acquiring all multi-source heterogeneous data collected within the time period corresponding to the abnormal event, and setting it as key event multi-source heterogeneous data, so that the key event multi-source heterogeneous data is backed up.

According to the method provided by the embodiments of the invention, the cloud server monitors the storage capacity of the different edge nodes. When the storage space of the first edge node is insufficient, the cloud server sets a compression strategy for the first edge node, and the multi-source heterogeneous data of the first edge node is compressed at different compression levels. This ensures that data is compressed in time and saves storage space, while important data is left uncompressed or only lightly compressed, so that when important data is called the edge node responds quickly and the data is complete.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below.

FIG. 1 is an architecture diagram of a data storage system for a smart community in one embodiment;

FIG. 2 is a flow diagram illustrating data storage for a smart community in one embodiment.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.

As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".

FIG. 1 is an architecture diagram of a data storage system of a smart community in an embodiment of the present invention. As shown in FIG. 1, the embodiment includes terminals 10, a plurality of edge nodes 20 and a cloud server 30, where the terminals may include an intelligent access control 11, an intelligent terminal 12, an intelligent access point (AP) 13, an intelligent gate 14, an intelligent monitoring device 15, a noise collection device 16, a temperature and humidity sensor 17 and an intelligent spike 18. The intelligent access control 11 has a video intercom system and can capture faces through a built-in camera and capture the user's voice through a voice input device, thereby completing the intercom. The intelligent terminal 12 includes instant communication terminals such as smartphones and interphones, which may operate in full-duplex or half-duplex mode. The intelligent access point AP 13 includes network equipment such as smart routers and small base stations, is used to build a reliable network, and can be deployed in the outdoor environment of the community. The intelligent gate 14 is located at the entrances and exits of the community and can be opened or closed according to an internal or external instruction and based on a face recognition system. The intelligent monitoring device 15 is located inside the community and is equipped with a high-definition camera, so that people or objects in different places can be recorded in real time and intelligently recognized, and potential dangers can be eliminated. The noise collection device 16 can collect the noise level inside the community and issue an alarm when the noise is too loud (for example, the decibel level of square-dance noise). The temperature and humidity sensor 17 can monitor the temperature and humidity of the community's buildings and outdoor areas. The intelligent spike 18 is located inside the parking lot and, in cooperation with the monitoring equipment, can monitor vehicles entering and leaving the lot and whether they are parked in their assigned spaces, and displays the parking status through red and green LEDs.

The edge nodes are distributed within the community, and one community can contain a plurality of edge nodes. An edge node can take the form of a smart router that stores and forwards data and has a certain data processing capability, so part of the cloud server's data processing services can be moved down to the smart community, with the advantages of quick response and local processing.

The edge nodes serve as data transfer stations and as edge-side storage, and have a certain data processing capability. The cloud server scales well, so it can store massive amounts of data and has substantial computing capacity. Different data can therefore be collected and aggregated by the edge nodes and then transmitted to the cloud server, which cleans and analyzes the data and finally produces analysis results for decision making.

FIG. 2 is a flowchart of a data storage method for an intelligent community according to an embodiment of the present invention. As shown in fig. 2, the method includes:

s101, edge nodes collect multi-source heterogeneous data uploaded by a plurality of terminals in a smart community, wherein the multi-source heterogeneous data comprises video monitoring data, face data, community environment noise data, community temperature and humidity data and community parking condition data;

in the smart community, multi-source heterogeneous data is data that is acquired from different data sources and has different data formats, such as pictures, videos, signaling and the data defined by various sensors; because pictures, videos, signaling and sensors are governed by different protocols, the data differs in format, size and type. In the field of smart communities, commonly used multi-source heterogeneous data includes video monitoring data collected by monitoring equipment, face data collected by the intelligent gate, community environmental noise data collected by the noise sensor, temperature and humidity data collected by the temperature and humidity sensor, and community parking condition data collected by the intelligent spike together with the monitoring equipment. The video monitoring data may be in a common video format such as MP4 or MPEG. The face data may consist of a plurality of face images in a format such as JPG or PNG. The temperature and humidity data and the noise data are usually small in volume, and their formats vary by manufacturer. The parking condition data includes the remaining parking spaces, parking route guidance, parking accidents and the like, and is generally uploaded as a group of heartbeat packets in the form of character strings. All of this data is collected by the edge nodes and passed through transparently.

S102, the edge node uploads the multi-source heterogeneous data to a cloud server;

it should be noted that different edge nodes collect the multi-source heterogeneous data of different communities and send the data to the cloud server.

S103, the cloud server performs normalization processing on the multi-source heterogeneous data and performs data labeling, wherein the data labeling comprises storage time, data activity, importance level, data size and compression environment;

the cloud server has strong data processing capacity, so when multi-source heterogeneous data of different cells, different data sources and different data formats are received, the cloud server needs to perform data cleaning on the different data and perform normalization operation.
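One way to picture the cleaning and normalization step is mapping each raw upload onto a single common record shape. The sketch below is an assumption throughout: the application does not define a schema, so the field names (`source`, `community`, `timestamp`, `payload`), the `NormalizedRecord` type and the choice of UTC are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass
class NormalizedRecord:
    source: str            # e.g. "video", "face", "noise"
    community: str         # which community the data came from
    collected_at: datetime
    payload: bytes


def normalize(raw: Dict[str, Any]) -> NormalizedRecord:
    """Map one raw upload (hypothetical dict layout) onto the common record."""
    ts = raw["timestamp"]
    if isinstance(ts, (int, float)):
        # Numeric timestamps are treated as seconds since the Unix epoch.
        collected_at = datetime.fromtimestamp(ts, tz=timezone.utc)
    else:
        collected_at = datetime.fromisoformat(ts)
        if collected_at.tzinfo is None:
            collected_at = collected_at.replace(tzinfo=timezone.utc)
    payload = raw["payload"]
    if isinstance(payload, str):
        payload = payload.encode("utf-8")
    return NormalizedRecord(
        source=str(raw["source"]).strip().lower(),
        community=str(raw["community"]).strip(),
        collected_at=collected_at,
        payload=bytes(payload),
    )
```

After this step, string payloads from sensors and binary payloads from cameras share one shape, which is what makes uniform labeling possible in the next step.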

In the embodiment of the invention, after the cloud server performs the normalization operation of the multi-source heterogeneous data, data labels (hashtags) are marked on different types of data, the data labels are label files for performing qualitative and quantitative analysis on the data, and the attributes of different data are qualitatively and quantitatively analyzed, so that the subsequent data analysis is facilitated.

The data tags in the embodiment of the present invention mainly include: storage time, data activity (how frequently the data is called), importance level (important, more important and general), data size, and compression environment (whether the data can be compressed and how). The data activity indicates the calling frequency of the data. When a group of data is called repeatedly in different application scenarios, the data is considered to be in an active period; for example, if the same surveillance video is viewed repeatedly within a period of time, the video is considered to have high activity and to be in an active period. When a group of data is not called by any application scenario within a period of time, its activity is low. The importance level can be determined by manual labeling or by automatic machine labeling. For machine labeling, the values collected by a given sensor, such as a temperature and humidity sensor, generally fluctuate within a certain range; if a value falls outside that range, the data is considered abnormal and calls for an alert, so its importance level is considered high and it deserves attention. As another example, when the faces recognized in a captured face image include strangers as well as owners/tenants, the data deserves close attention and can be automatically marked as important.
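The tag types listed above could be held in a small record, with the machine-labeling rule for sensor readings expressed as a range check. This is a sketch only: `DataLabel`, `Importance`, `machine_importance` and the example temperature range are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Importance(Enum):
    IMPORTANT = "important"
    MORE_IMPORTANT = "more important"
    GENERAL = "general"


@dataclass
class DataLabel:
    storage_months: float     # storage time
    calls_per_month: float    # data activity (call frequency)
    importance: Importance    # importance level
    size_gb: float            # data size
    compressible: bool        # compression environment: can it be compressed?
    compression_method: str   # compression environment: how?


def machine_importance(reading: float, low: float, high: float) -> Importance:
    """Label a sensor reading as important when it leaves its normal range."""
    if reading < low or reading > high:
        return Importance.IMPORTANT
    return Importance.GENERAL
```

A temperature reading of 45 against an assumed normal range of 10 to 35 would be labeled important, while a reading of 20 would stay general.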

S104, the cloud server monitors the storage capacity of each edge node and calculates the residual storage capacity of each edge node;

after labeling the multi-source heterogeneous data, the cloud server uses a monitoring function to monitor the storage capacity of each edge node; each edge node reports its storage capacity through a heartbeat packet mechanism, and the cloud server calculates the remaining storage capacity of each edge node.

S105, when the residual storage capacity of the first edge node is lower than a first preset threshold value, the cloud server establishes a multi-source heterogeneous data index table of the first edge node, wherein the index table is used for indexing the multi-source heterogeneous data to be compressed;

in the embodiment of the present invention, the first preset threshold is set as a storage red line, for example 80% of the storage space. If the red line is exceeded, the stored volume is large and sufficient storage space cannot be provided for subsequent data, so data compression or data migration is required. Data migration is a temporary solution in which a migration policy moves part or all of the data to other edge nodes or to the cloud server; however, migration may cause network congestion and occupy resources elsewhere, so it is not the optimal solution.
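The monitoring and threshold check described above can be sketched as follows. The heartbeat fields and function names are hypothetical, and the default threshold of 0.2 assumes that the 80% red line refers to used space, so that the corresponding remaining capacity is 20%; the application does not spell this out.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Heartbeat:
    node_id: str
    total_bytes: int
    used_bytes: int


def remaining_fraction(hb: Heartbeat) -> float:
    """Remaining storage capacity as a fraction of the node's total capacity."""
    if hb.total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (hb.total_bytes - hb.used_bytes) / hb.total_bytes


def nodes_needing_compression(
    heartbeats: Iterable[Heartbeat], first_threshold: float = 0.2
) -> List[str]:
    """Return the nodes whose remaining capacity is below the first threshold."""
    return [hb.node_id for hb in heartbeats if remaining_fraction(hb) < first_threshold]
```

A node reporting 85 of 100 units used has 15% remaining and would be selected, while a node at 40 of 100 would not.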

In addition, the cloud server establishes a multi-source heterogeneous data index table for the first edge node. The index table indexes the data to be compressed and establishes index pointers, so that after the subsequent compression process finishes, the compressed multi-source heterogeneous data can be located quickly through the pointers, which facilitates subsequent decompression and data calls.
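A minimal in-memory form of such an index table is sketched below. The application does not describe the table's layout, so the entry fields, the `CompressionIndex` class and the example paths are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IndexEntry:
    location: str            # pointer to where the data currently lives
    compression_level: str   # level assigned by the compression strategy
    compressed: bool = False


class CompressionIndex:
    """Index of the data to be compressed on one edge node."""

    def __init__(self) -> None:
        self._entries: Dict[str, IndexEntry] = {}

    def register(self, data_id: str, location: str, compression_level: str) -> None:
        self._entries[data_id] = IndexEntry(location, compression_level)

    def mark_compressed(self, data_id: str, new_location: str) -> None:
        # Repoint the entry once the edge node reports that compression finished.
        entry = self._entries[data_id]
        entry.location = new_location
        entry.compressed = True

    def lookup(self, data_id: str) -> Optional[IndexEntry]:
        return self._entries.get(data_id)
```

Because the entry is registered before compression and repointed afterwards, a later call can find the compressed copy without scanning the node's storage.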

S106, the cloud server sets a compression strategy of the first edge node and sends the compression strategy to the first edge node, wherein the compression strategy is used for setting corresponding compression levels for different labels;

when the remaining storage capacity of the first edge node is insufficient, the data on the first edge node needs to be compressed, and the cloud server sets a compression strategy for the first edge node to compress that data.

The compression strategy is set based on the data stored in the first edge node, that is, on the corresponding data tags. Specifically, scores can be assigned and ranked according to the storage time, data activity, importance, data size and the like, different score threshold tiers are set, and the final score determines whether deep compression, moderate compression or no compression is needed.

Specifically, the compression policy may be:

dividing the storage time into recent data (within 3 months), medium-term data (more than 3 months but less than 6 months) and early data (more than 6 months), and setting a corresponding score for the data of each period;

dividing the activity into high activity (call frequency higher than 5 times per month), medium activity (call frequency lower than 5 times but more than 2 times per month) and low activity (call frequency lower than 2 times per month), and setting a corresponding score for each activity level;

and scoring the multi-source heterogeneous data stored in the first edge node on each item in turn according to the scoring rules, accumulating the item scores to obtain a final score, and performing no compression if the final score exceeds a second threshold, moderate compression if the final score is lower than the second threshold and exceeds a third threshold, and deep compression if the final score is lower than the third threshold, wherein the second threshold is higher than the third threshold.

Data A, B and C in Table 1 are three typical types of data, and Table 1 shows the correspondence between their labels and compression levels.

TABLE 1

In addition, as an extended embodiment, the data tag further includes a data association, which indicates the degree of association between different types of multi-source heterogeneous data. For example, in a parking scenario, the data collected by the monitoring device and by the intelligent spike is strongly associated, and the face data collected by the monitoring device and by the gate is also strongly associated, because both sets of data are needed in a specific application scenario; the data collected by the intelligent spike and by the gate is weakly associated. The embodiment of the present invention further includes:

dividing the data association into strong association and weak association, and combining a plurality of strongly associated data items into a data combination;

and acquiring the compression levels of the data in the strongly associated data combination, and, if the compression levels within the combination are inconsistent, acquiring the lowest compression level in the combination and adjusting the compression levels of all data in the combination to that lowest compression level. For example, the data collected by the monitoring device and by the intelligent spike is strongly associated, so their compression levels should be consistent. Otherwise, in a specific application scenario, the monitoring data could be called directly while calling the intelligent spike data would have to wait for it to be decompressed, which may greatly reduce calling efficiency.
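The alignment of compression levels within a strongly associated combination can be sketched as below. The level names and their ordering are assumptions, and "lowest compression level" is read here as the least aggressive level, which matches the stated goal of avoiding a wait for decompression.

```python
from typing import Dict, Iterable

# Assumed ordering: a smaller number means less compression.
LEVEL_ORDER = {"none": 0, "moderate": 1, "deep": 2}


def align_group_levels(
    levels: Dict[str, str], groups: Iterable[Iterable[str]]
) -> Dict[str, str]:
    """Give every member of a strongly associated group the group's lowest level."""
    aligned = dict(levels)
    for group in groups:
        members = [m for m in group if m in aligned]
        if not members:
            continue
        lowest = min((aligned[m] for m in members), key=LEVEL_ORDER.__getitem__)
        for member in members:
            aligned[member] = lowest
    return aligned
```

If the monitoring video is rated "none" and the intelligent spike data "deep", both end up at "none", while data outside the combination keeps its own level.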

Optionally, in an embodiment of the present invention, the data tag further includes a key event tag, where a key event is a special triggering event defined by rule. For example, if a household is burgled within a certain period of time, the monitoring information, face data and the like from that period may become key objects of investigation, and the data from that period may need to be stored locally for a long time so that community maintenance personnel, public security personnel and others can view it at any time. The key event tag is therefore added to the video data collected by the monitoring equipment, the face data collected by the gate, and the like within that period. Optionally, the key event tag may be of Boolean type, that is, true or false (1 or 0); it is false by default and is set to true when a special condition occurs (for example, the burglary mentioned above). The method of the embodiment of the present invention then further includes:

if the key event label of the multi-source heterogeneous data is true, adjusting the compression level of the multi-source heterogeneous data and of the data strongly associated with it to no compression;

if the key event label of the multi-source heterogeneous data is false, leaving the compression level of the multi-source heterogeneous data and of the data strongly associated with it unchanged.
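The key event rule above amounts to an override applied after the compression levels have been assigned. In the sketch below, the function name, the dictionary layout and the level name "none" are hypothetical.

```python
from typing import Dict, Iterable, Set


def apply_key_event(
    levels: Dict[str, str],
    key_event: Dict[str, bool],
    groups: Iterable[Set[str]],
) -> Dict[str, str]:
    """Set key-event data and its strongly associated data to no compression."""
    group_list = [set(g) for g in groups]
    adjusted = dict(levels)
    for data_id, flagged in key_event.items():
        if not flagged:
            continue  # false tag: levels stay as they are
        if data_id in adjusted:
            adjusted[data_id] = "none"
        for group in group_list:
            if data_id in group:
                for member in group:
                    if member in adjusted:
                        adjusted[member] = "none"
    return adjusted
```

With a true tag on the video data, the video and the strongly associated face data both become uncompressed, and unrelated noise data keeps its level.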

And S107, the first edge node compresses the stored multi-source heterogeneous data based on the compression strategy.

The compression method includes, but is not limited to, conventional video, audio and image compression methods; in the embodiment of the present invention, data compression may also be performed by other compression methods. Specifically,

the edge node evaluates the anomaly probability of each type of multi-source heterogeneous data; specifically, the edge node calculates the historical anomaly probability from the anomaly frequency of the historical multi-source heterogeneous data, draws an anomaly probability curve, and predicts and evaluates the anomaly probability of each type of multi-source heterogeneous data based on the curve;

setting different adaptive data sampling frequencies, wherein each data sampling frequency corresponds to the anomaly probability of the respective type of multi-source heterogeneous data and to its compression level;

sampling the stored multi-source heterogeneous data at the data sampling frequency, retaining only the sampled multi-source heterogeneous data, and deleting the multi-source heterogeneous data that was stored before sampling; specifically, sampling selectively collects data at a certain frequency within a certain period. For example, 6 groups of data in total are collected during 1-10 s and the sampling frequency is set to 0.5 times/s, that is, one collection every two seconds, so that 3 groups of data are retained after sampling during 1-10 s and the other 3 groups are discarded.

converting the sampled multi-source heterogeneous data into binary;

and representing any run of more than 8 consecutive '0's or '1's in the binary data with an eight-bit code, wherein the eight-bit code comprises a start bit, a value bit indicating '0' or '1', 4 bits giving the number of consecutive digits, and an end bit. For example, if the binary data contains the run "0000000000" (ten 0s), the eight-bit code can be 11010100, where, from left to right, 11 is the start bit, 0 indicates that the compressed digit is 0, 1010 indicates a run length of 10, and the final 0 is the end bit.
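The eight-bit run code can be sketched as follows, using the layout from the example: start bits 11, one value bit, a 4-bit run length and an end bit 0. Two points are assumptions rather than part of the application: runs longer than 15 (the largest value that fits in 4 bits) are split into several codes, and only encoding is shown, since the text does not say how a decoder distinguishes a code from literal bits.

```python
def encode_run(bit: str, length: int) -> str:
    """Encode a run of 9 to 15 identical bits as start(11) + bit + length(4) + end(0)."""
    if bit not in ("0", "1"):
        raise ValueError("bit must be '0' or '1'")
    if not 9 <= length <= 15:
        raise ValueError("length must be between 9 and 15")
    return "11" + bit + format(length, "04b") + "0"


def compress_bits(bits: str) -> str:
    """Replace every run of more than 8 identical bits with eight-bit codes."""
    out = []
    i = 0
    while i < len(bits):
        # Find the end of the current run of identical bits.
        j = i
        while j < len(bits) and bits[j] == bits[i]:
            j += 1
        run = j - i
        # Assumption: runs longer than 15 are split into chunks of at most 15.
        while run > 8:
            chunk = min(run, 15)
            out.append(encode_run(bits[i], chunk))
            run -= chunk
        out.append(bits[i] * run)  # short runs (and remainders) stay literal
        i = j
    return "".join(out)
```

Calling `compress_bits("0000000000")` yields `11010100`, matching the example above, and runs of 8 bits or fewer pass through unchanged.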

Optionally, in this embodiment of the present invention, the first edge node performs redundant backup on the multi-source heterogeneous data according to the importance level, and migrates data of the redundant backup to a second edge node, where a remaining storage capacity of the second edge node is higher than the first preset threshold.
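Choosing the second edge node and the data to back up might be sketched as below. The application only requires that the target's remaining capacity be above the first preset threshold; picking the node with the most remaining capacity and backing up only data labeled "important" are assumptions made for illustration, as are the function names.

```python
from typing import Dict, List, Optional, Sequence


def pick_backup_target(
    remaining: Dict[str, float], source: str, first_threshold: float = 0.2
) -> Optional[str]:
    """Pick a second edge node whose remaining capacity exceeds the threshold."""
    candidates = {
        node: free
        for node, free in remaining.items()
        if node != source and free > first_threshold
    }
    if not candidates:
        return None
    # Assumed tie-breaker: prefer the node with the most remaining capacity.
    return max(candidates, key=candidates.get)


def items_to_back_up(
    importance: Dict[str, str], backed_up_levels: Sequence[str] = ("important",)
) -> List[str]:
    """Select the data items whose importance level qualifies for backup."""
    return [data_id for data_id, level in importance.items() if level in backed_up_levels]
```

If no other node is above the threshold, the function returns `None`, leaving the fallback (for example, migration to the cloud server) to the caller.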

Optionally, before the first edge node compresses the stored multi-source heterogeneous data based on the compression policy, the embodiment of the present invention further includes:

setting a dynamic random number to encrypt the multi-source heterogeneous data.

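Optionally, the method further comprises:

monitoring the operation logs of a plurality of terminals in the smart community, identifying abnormal events in the operation logs, and recording the time corresponding to each abnormal event;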

and acquiring all the collected multi-source heterogeneous data within the time limit corresponding to the abnormal event, and setting the multi-source heterogeneous data as key event multi-source heterogeneous data so as to perform data backup on the key event multi-source heterogeneous data.
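Selecting the data that falls within the time limit of an abnormal event can be sketched as a timestamp filter. The application does not define the time limit, so the symmetric window around each event time, the function name and the dictionary layout are assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, Set


def key_event_data(
    collected_at: Dict[str, datetime],
    event_times: Iterable[datetime],
    window: timedelta,
) -> Set[str]:
    """Return the ids of data collected within `window` of any abnormal event."""
    events = list(event_times)
    return {
        data_id
        for data_id, ts in collected_at.items()
        if any(abs(ts - event) <= window for event in events)
    }
```

The returned ids can then be given a true key event tag and backed up.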

The above is only a specific embodiment of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think of various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A data storage method applied to a smart community is characterized by comprising the following steps:

the edge node uploads the multi-source heterogeneous data to a cloud server;

2. The method of claim 1, wherein the cloud server sets a compression policy for the first edge node, comprising:

3. The method of claim 2, wherein the data tag further comprises a data correlation for indicating a degree of correlation between different types of multi-source heterogeneous data, and the method further comprises:

4. The method of claim 3, wherein the data tags further comprise key event tags, the method further comprising:

5. The method of claim 1, further comprising:

6. The method of claim 1, wherein the first edge node compresses the stored multi-source heterogeneous data based on the compression policy, comprising:

converting the sampled multi-source heterogeneous data into a binary system;

7. The method of claim 6, wherein prior to the first edge node compressing the stored multi-source heterogeneous data based on the compression policy, the method further comprises:

8. The method of claim 1, further comprising: