CN110222043B - Data monitoring method, device and equipment of cloud storage server - Google Patents

Data monitoring method, device and equipment of cloud storage server

Info

Publication number
CN110222043B
Authority
CN
China
Prior art keywords
data
uploaded
similarity
attribute information
user attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910507052.0A
Other languages
Chinese (zh)
Other versions
CN110222043A (en
Inventor
咸鹤群
高原
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Center Information Technology Ltd By Share Ltd
Original Assignee
Qingdao University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao University filed Critical Qingdao University
Priority to CN201910507052.0A priority Critical patent/CN110222043B/en
Publication of CN110222043A publication Critical patent/CN110222043A/en
Application granted granted Critical
Publication of CN110222043B publication Critical patent/CN110222043B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23: Updating
    • G06F16/2358: Change logging, detection, and notification
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462: Approximate or statistical queries

Abstract

The application discloses a data monitoring method, apparatus, and device for a cloud storage server, and a computer-readable storage medium. The method acquires user attribute information of data to be uploaded; determines the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded, to obtain a similarity set; determines the total number of similarities in the similarity set that exceed a similarity threshold; and finally, when the total number exceeds a number threshold, updates the popularity value of the data to be uploaded according to a growth curve function model and stores the data to be uploaded. Thus, when a user uploads data, the similarity between that user's attributes and the attributes of the users who have already uploaded the same data is calculated, and the way the popularity value is updated is chosen according to that similarity. Adaptively adjusting the popularity value based on user attributes avoids the data leakage caused by relying on a single popularity adjustment mode and improves data security.

Description

Data monitoring method, device and equipment of cloud storage server
Technical Field
The present disclosure relates to the field of data deduplication, and in particular, to a data monitoring method, apparatus, device, and computer readable storage medium for a cloud storage server.
Background
The data deduplication technology is a reduction technology capable of identifying and eliminating redundant data and only storing single copy data, and is widely applied to the field of cloud storage.
The cloud storage server stores data uploaded by a large number of users. When it performs data deduplication, data encryption keys or parameters need to be shared among different users. To improve storage efficiency while protecting user data, the cloud storage server usually adopts a popularity-based deduplication method: a popularity threshold is defined, and when the number of users uploading a given piece of data reaches or exceeds the threshold, the data is regarded as popular data; otherwise it is non-popular data. The server performs deduplication on popular data only, thereby improving the security of non-popular data.
At present, the de-duplication method based on popularity mainly comprises the following steps: the cloud storage server allocates a uniform set threshold value for the data, and when one user uploads certain data, the server adds 1 to the popularity value of the data. When the number of copies of the data reaches or exceeds a given threshold, the data is considered popular data, otherwise the data is unpopular data.
The drawbacks of this method are: in a practical application scenario, some data uploaders are from the same group, for example, a company, and if the number of employees of the company is large, the popularity value of the data on the cloud storage server quickly reaches or even exceeds a predetermined threshold value. In fact, the data is not truly "popular," i.e., the data is not owned by individual users on the network, but rather by the company. Since data deduplication requires sharing of data encryption keys or parameters between different users, deduplication operations may result in internal data leakage or external leakage of encryption keys and parameters. In this case, if the traditional popularity value updating method is adopted to perform deduplication processing on the data, problems such as internal data leakage and the like may be caused.
Therefore, the traditional popularity-based data deduplication method calculates popularity in only one way, which leaves open the possibility of data leakage and results in low security.
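As a minimal illustrative sketch (not part of the original disclosure), the traditional counting scheme described above can be written as follows. The names `make_counter` and `record_upload` and the string tag are hypothetical; only the add-one rule and the reach-or-exceed threshold test come from the text.

```python
from collections import defaultdict


def make_counter(threshold):
    """Return a function that applies the traditional add-one popularity rule."""
    counts = defaultdict(int)  # popularity value per data tag

    def record_upload(tag):
        counts[tag] += 1                  # every upload adds 1
        return counts[tag] >= threshold   # True once the data counts as popular

    return record_upload


record_upload = make_counter(threshold=3)
# Three uploads of the same file, e.g. by three employees of one company.
results = [record_upload("report.pdf") for _ in range(3)]
print(results)
```

With a threshold of 3, the third upload already marks the data as popular, even if all three uploaders belong to the same group. This is the situation the background section identifies as a leakage risk.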
Disclosure of Invention
The application aims to provide a data monitoring method, apparatus, and device for a cloud storage server, and a computer-readable storage medium, to solve the problem that the traditional popularity-based data deduplication method updates the popularity value in only one way, which results in low data security. The specific scheme is as follows:
in a first aspect, the present application provides a data monitoring method for a cloud storage server, including:
acquiring user attribute information of data to be uploaded;
determining the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded to obtain a similarity set;
determining a total number of similarities in the set of similarities that exceed a similarity threshold;
and when the total number exceeds a number threshold, updating the popularity value of the data to be uploaded according to a growth curve function model, and storing the data to be uploaded.
Optionally, when the total number exceeds a number threshold, updating the popularity value of the data to be uploaded according to a growth curve function model, including:
when the total number exceeds a number threshold value, determining that the uploading users of the data to be uploaded are group users, and updating the popularity value of the data to be uploaded according to a growth curve function model;
and when the total number does not exceed the number threshold, determining that the uploading user of the data to be uploaded is an individual user, and adding one to the popularity value of the data to be uploaded.
Optionally, the determining the similarity between the user attribute information of the data to be uploaded and the user attribute information of the copy data of the data to be uploaded includes:
setting corresponding similarity calculation modes for the user attribute information of different information types respectively;
in the process of calculating the similarity, a target similarity calculation mode is determined according to the information type of the current user attribute information, and the similarity between the current user attribute information of the data to be uploaded and the current user attribute information of the copy data of the data to be uploaded is determined according to the target similarity calculation mode.
Optionally, the information type includes any one or more of the following: deterministic numerical type, deterministic symbolic type, deterministic interval type, fuzzy numerical type, and fuzzy semantic type.
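The user-type decision in the first aspect can be sketched as below. This is an illustrative sketch only: the function name `is_group_user` and its parameter names are hypothetical, and the strict "exceeds" comparisons follow the wording of the method steps.

```python
def is_group_user(similarities, similarity_threshold, count_threshold):
    """Classify the current uploader from the similarity set.

    similarities: one similarity value per existing copy's uploader.
    Returns True for a group user, False for an individual user.
    """
    # Total number of similarities that exceed the similarity threshold.
    total = sum(1 for s in similarities if s > similarity_threshold)
    # The uploader is a group user only when that total exceeds the number threshold.
    return total > count_threshold


print(is_group_user([0.9, 0.95, 0.2], similarity_threshold=0.8, count_threshold=1))
print(is_group_user([0.9, 0.1], similarity_threshold=0.8, count_threshold=1))
```

In the first call two similarities exceed 0.8, which is more than the number threshold of 1, so the uploader is treated as a group user; in the second call only one does, so the uploader is treated as an individual user.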
In a second aspect, the present application further provides a data monitoring apparatus for a cloud storage server, including:
an attribute information acquisition module: configured to acquire user attribute information of data to be uploaded;
a similarity determination module: configured to determine the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded, to obtain a similarity set;
a quantity determination module: configured to determine the total number of similarities in the similarity set that exceed a similarity threshold;
an update module: configured to, when the total number exceeds a number threshold, update the popularity value of the data to be uploaded according to a growth curve function model, and store the data to be uploaded.
Optionally, the update module includes:
a group user updating unit: configured to, when the total number exceeds the number threshold, determine that the uploading user of the data to be uploaded is a group user, and update the popularity value of the data to be uploaded according to the growth curve function model;
an individual user updating unit: configured to, when the total number does not exceed the number threshold, determine that the uploading user of the data to be uploaded is an individual user, and add one to the popularity value of the data to be uploaded.
Optionally, the similarity determining module includes:
a type setting unit: configured to set a corresponding similarity calculation mode for user attribute information of each information type;
a similarity calculation unit: configured to, during similarity calculation, determine a target similarity calculation mode according to the information type of the current user attribute information, and determine, according to the target similarity calculation mode, the similarity between the current user attribute information of the data to be uploaded and the current user attribute information of the copy data of the data to be uploaded.
In a third aspect, the present application further provides a data monitoring device of a cloud storage server, including:
a memory: for storing a computer program;
a processor: configured to execute the computer program to implement the steps of the data monitoring method of the cloud storage server described above.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program for implementing the steps of the data monitoring method of the cloud storage server as described above when the computer program is executed by a processor.
The data monitoring method, apparatus, and device for a cloud storage server and the computer-readable storage medium provided by the application acquire user attribute information of data to be uploaded; determine the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded, to obtain a similarity set; then determine the total number of similarities in the similarity set that exceed a similarity threshold; and finally, when the total number exceeds a number threshold, update the popularity value of the data to be uploaded according to the growth curve function model and store the data to be uploaded. Thus, when a user uploads data, the scheme calculates the similarity between that user's attributes and the attributes of the users who have already uploaded the same data, and chooses the way the popularity value is updated according to that similarity. Adaptively adjusting the popularity value based on user attributes avoids the data leakage caused by relying on a single popularity adjustment mode and significantly improves data security.
Drawings
To explain the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart illustrating a first implementation of a data monitoring method for a cloud storage server according to an embodiment of the present disclosure;
fig. 2 is a flowchart illustrating an implementation of a second embodiment of a data monitoring method for a cloud storage server provided in the present application;
fig. 3 is a functional block diagram of an embodiment of a data monitoring apparatus of a cloud storage server provided in the present application;
fig. 4 is a schematic structural diagram of an embodiment of a data monitoring device of a cloud storage server provided in the present application.
Detailed Description
In order that those skilled in the art will better understand the disclosure, the following detailed description will be given with reference to the accompanying drawings. It is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
At present, popularity-based data deduplication schemes adjust the popularity value as follows: whenever a user uploads data, the popularity value of the data is increased by 1. This single adjustment mode easily leads to data leakage and reduces data security. To solve this problem, the application provides a data monitoring method, apparatus, and device for a cloud storage server, and a computer-readable storage medium, so that the popularity value of data is adjusted adaptively based on user attributes and data security is significantly improved.
Referring to fig. 1, a first embodiment of a data monitoring method for a cloud storage server provided by the present application is described below, where the first embodiment includes:
step S101: acquiring user attribute information of data to be uploaded;
in this embodiment, the user attribute information is reference information for measuring the degree of association between users. It may include various kinds of attribute information, such as local area network address, age, gender, and personal preferences; this embodiment does not limit which information is selected. This embodiment decides how to update the popularity value of the data according to the degree of association between the current user and the uploading users of the copy data in the cloud storage server. As a specific implementation, the current user is classified by degree of association as either an individual user or a group user. When the current user is determined to be an individual user, the popularity value is adjusted in the traditional way, that is, increased by 1; when the current user is determined to be a group user, the popularity value is adjusted in the way provided by this embodiment.
Specifically, before obtaining the user attribute information of the data to be uploaded, it may be determined whether the data to be uploaded is the first uploaded data, and after determining that the data to be uploaded is the first uploaded data, the popularity value of the data may be directly determined according to a conventional popularity value adjustment manner, that is, the popularity value is 1; after determining that the data to be uploaded is not the first uploaded data, the operation of this embodiment is executed.
Step S102: determining the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded to obtain a similarity set;
when multiple copies of data to be uploaded are stored in the cloud storage server, the similarity between the user attribute information of the data to be uploaded and the attribute information of each copy data needs to be determined respectively in this embodiment, so as to obtain a similarity set, where the similarity set includes similarities between a current user and an uploading user of each copy data.
As described above, in this embodiment, the user attribute information may include multiple kinds of attribute information, and in an actual application scenario, if the user attribute information includes multiple kinds of attribute information, when determining the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data, the distances between the attribute information may be respectively obtained, and then the overall similarity may be determined according to the distances between the attribute information. As a preferred embodiment, weighted values may be set for the respective user attributes, and when determining the overall similarity, the distance of each attribute information and the corresponding weighted value are integrated to determine the overall similarity.
Step S103: determining a total number of similarities in the set of similarities that exceed a similarity threshold;
step S104: and when the total number exceeds a number threshold, updating the popularity value of the data to be uploaded according to a growth curve function model, and storing the data to be uploaded.
As described above, this embodiment uses two preset thresholds to determine the user type: a similarity threshold and a quantity threshold. Specifically, it first determines the total number of similarities in the similarity set that exceed the similarity threshold, and then determines whether that total exceeds the quantity threshold. If so, the current user is determined to be a group user, and the popularity value of the data is adjusted in the way provided by this embodiment. As a specific implementation, when the current user is determined to be a group user, the popularity value of the data to be uploaded is adjusted according to a growth curve function model, which here mainly refers to the Pearl model of growth curve functions.
It is worth mentioning that, in addition to the two thresholds, the embodiment also utilizes a popularity threshold set in advance, and the popularity threshold is used for judging whether to execute the data deduplication operation. The model parameters of the growth curve function model comprise the popularity threshold, so that according to the rule of the growth curve function model, the popularity value does not exceed the preset popularity threshold after the popularity value is adjusted according to the growth curve function model, and data deduplication operation is not triggered.
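The bounding behaviour described above can be illustrated with the textbook Pearl (logistic) curve, y = L / (1 + a·e^(-b·t)). This is a hedged sketch, not the formula of the application: the choice of the upper limit L equal to the popularity threshold T, the use of the number of group uploads as the variable t, the function name `pearl_popularity`, and the sample constants are all assumptions made for illustration.

```python
import math


def pearl_popularity(n_group_uploads, T, a, b):
    """Assumed Pearl curve with upper limit T: T / (1 + a * exp(-b * n)).

    n_group_uploads, a and b are illustrative; the source only states that
    the model parameters include the popularity threshold.
    """
    return T / (1.0 + a * math.exp(-b * n_group_uploads))


# Popularity values after 0..19 group uploads, with assumed constants.
values = [pearl_popularity(n, T=10, a=9.0, b=0.5) for n in range(20)]
print(round(values[0], 3), round(values[-1], 3))
```

Under these assumptions the value keeps growing with each group upload but stays below T, which matches the stated property that updates through the growth curve model do not trigger deduplication.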
The data monitoring method for the cloud storage server provided by this embodiment acquires user attribute information of data to be uploaded; determines the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded, to obtain a similarity set; then determines the total number of similarities in the similarity set that exceed a similarity threshold; and finally, when the total number exceeds a number threshold, updates the popularity value of the data to be uploaded according to the growth curve function model and stores the data to be uploaded. Thus, when a user uploads data, the scheme calculates the similarity between that user's attributes and the attributes of the users who have already uploaded the same data, and chooses the way the popularity value is updated according to that similarity. Adaptively adjusting the popularity value based on user attributes avoids the data leakage caused by relying on a single popularity adjustment mode and significantly improves data security.
The second embodiment of the data monitoring method for the cloud storage server provided by the present application is described in detail below, and is implemented based on the first embodiment, and is expanded to a certain extent on the basis of the first embodiment. Specifically, the embodiment divides the types of the attribute information, and provides a specific attribute distance calculation method for each type attribute.
Referring to fig. 2, the second embodiment specifically includes:
step S201: initializing system parameters;
this embodiment is implemented on a cloud storage server, which mainly interacts with the client of the current user. In an actual application scenario, the current user $A_0$ generates a ciphertext F and a query tag H, where the ciphertext F is the data to be uploaded, and sends an upload request containing the ciphertext F and the query tag H to the cloud storage server. The cloud storage server uses the query tag to judge whether the data is being uploaded for the first time, and executes the subsequent steps of this embodiment when it is not.
The initialization process includes:
(1) setting a popularity threshold T;
(2) defining data popularity: using count (F) to represent the comprehensive count of the data F to be uploaded, and when the count (F) is less than T, defining F as non-popular data; otherwise, defining F as popular data;
(3) defining a bilinear mapping tag generation function $e(Y, \mathrm{Hash}(F))^{X}$; the function outputs the tag H corresponding to F as the unique identifier of the data F to be uploaded, where Y is an encryption public key, X is an auxiliary key, and Hash(F) is the hash value of the data;
(4) setting the size of a similarity threshold f and a quantity threshold z, wherein the value range of f is (0,1), and z is a fixed integer;
(5) defining the content of the user attribute information, which includes multiple kinds of attribute information, and defining the type of each kind as one of the following: [deterministic numerical type, deterministic symbolic type, deterministic interval type, fuzzy interval or fuzzy number type, fuzzy semantic type].
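The initialization items (1), (2), (4) and (5) can be collected in a small parameter object, sketched below. The class and field names (`SystemParams`, `AttrType`, `is_popular`) are hypothetical; the tag generation function of item (3) is deliberately left out because it depends on a pairing library that the source does not name.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AttrType(Enum):
    """The five attribute information types listed in item (5)."""
    DETERMINISTIC_NUMERIC = auto()
    DETERMINISTIC_SYMBOLIC = auto()
    DETERMINISTIC_INTERVAL = auto()
    FUZZY_NUMBER_OR_INTERVAL = auto()
    FUZZY_SEMANTIC = auto()


@dataclass(frozen=True)
class SystemParams:
    popularity_threshold: int    # T, item (1)
    similarity_threshold: float  # f, item (4), must lie in (0, 1)
    count_threshold: int         # z, item (4), a fixed integer

    def __post_init__(self):
        if not 0.0 < self.similarity_threshold < 1.0:
            raise ValueError("similarity threshold f must be in (0, 1)")


def is_popular(count, params):
    """Item (2): data is non-popular while count(F) < T, popular otherwise."""
    return count >= params.popularity_threshold


params = SystemParams(popularity_threshold=5, similarity_threshold=0.8, count_threshold=2)
print(is_popular(4, params), is_popular(5, params))
```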
Step S202: acquiring user attribute information of data to be uploaded;
the user attribute information specifically includes multiple kinds of attribute information, the information type of the attribute information may be any one of the foregoing information types, and the similarity between the user attribute information in this embodiment is a result of weighted summation of the similarities of the attribute information.
Step S203: determining a target distance calculation mode according to the information type of the current attribute information, and determining the distance between the current attribute information of the data to be uploaded and the current attribute information of the copy data of the data to be uploaded according to the target distance calculation mode;
as described above, the present embodiment classifies the types of the attribute information in advance, and on this basis, the present embodiment sets corresponding distance calculation manners for the attribute information of different information types, respectively. The following introduces the distance calculation method of attribute information of different information types:
for attribute information represented by deterministic numerical values, e.g. two deterministic attribute values $x_1$ and $x_2$, the distance can be calculated directly as the absolute difference $d_{DN}(x_1, x_2) = |x_1 - x_2|$;
for attribute information represented by deterministic symbols, if the attribute information of the current user is exactly the same as that of the uploading user of a copy, the distance is 0; otherwise the distance is infinite;
for attribute information represented by deterministic intervals, assume two interval attribute values $X_1$ and $X_2$. This embodiment calculates their distance with a width-based EW-type distance formula:
[Formula images not reproduced: the EW-type distance between $X_1$ and $X_2$, the expected value of an interval $X$, and the width of an interval $X$]
where $x^{(L)}$ is the start point of the interval and $x^{(H)}$ is its end point;
for a fuzzy number or fuzzy interval, i.e. an attribute that cannot be quantized into a specific value after being digitized, preprocessing is performed first. In this embodiment, the fuzzy number or fuzzy interval is denoted by F, and the α-level interval obtained by preprocessing is denoted by X(α). The attribute distance is then calculated by the following formula:
[Formula image not reproduced: distance between fuzzy numbers or fuzzy intervals based on their α-level intervals]
where the function appearing in the formula (shown only as an image) is defined on [0, 1] and is a positive continuous function;
for the attribute information of the fuzzy semantic representation, the distance between the attribute information can be calculated through a membership function.
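The type-dependent distance calculation can be sketched as a simple dispatch. Only the two rules stated in full in the text are implemented (absolute difference for deterministic numerical values, 0 or infinity for deterministic symbols); the interval, fuzzy, and fuzzy semantic cases are left unimplemented rather than guessed. The function name and the string type labels are hypothetical.

```python
import math


def attribute_distance(attr_type, v1, v2):
    """Distance between two values of one attribute, chosen by information type."""
    if attr_type == "deterministic_numeric":
        # d_DN(x1, x2) = |x1 - x2|
        return abs(v1 - v2)
    if attr_type == "deterministic_symbolic":
        # Identical symbols are at distance 0, anything else is infinitely far.
        return 0.0 if v1 == v2 else math.inf
    # Interval, fuzzy and fuzzy semantic distances are not sketched here.
    raise NotImplementedError(f"no distance sketched for type {attr_type!r}")


print(attribute_distance("deterministic_numeric", 3, 7.5))
print(attribute_distance("deterministic_symbolic", "10.0.0.0/24", "10.0.0.0/24"))
```

An infinite symbolic distance would need special handling before any column normalization; how the source treats that case is not stated in this text.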
Step S204: determining the overall similarity between the user attribute information according to the distance between the attribute information;
the previous step yields the distance for each piece of attribute information. Let $d_{ij}$ ($1 \le j \le m$) denote the distance on the $j$-th attribute between the current user $A_0$ and $A_i$, the uploading user of the $i$-th cloud copy ($0 < i \le M$). The attribute distances between $A_0$ and the users known to hold the same data on the cloud storage server form a distance matrix $C$; normalizing each column of $C$ gives a standard matrix $R$, as follows:
[Formula image not reproduced: column-wise normalization of $C$ into $R$]
where $d'_{ij}$ is the value obtained after normalization.
In this embodiment, a weight is assigned to each of the m pieces of attribute information, and the assignment is recorded as $W = \{w_1, w_2, \ldots, w_m\}$, where $0 \le w_j \le 1$ and
[Formula image not reproduced: a further condition on the weights]
The formula $RW^{T} = (D_1, D_2, \ldots, D_M)$ gives the weighted overall distance $D_i$ between the current user $A_0$ and the uploading user of each copy, where $W^{T}$ is the transpose of the weight vector $W$. The similarity between user attribute information is then determined: the similarity between $A_0$ and user $A_i$ is $S_i = 1 - D_i$.
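A sketch of step S204 follows. Two points are assumptions, because the corresponding formulas are not available in this text: each column of the distance matrix is normalized by dividing by its maximum, and the weights are taken to sum to 1 so that every similarity falls in [0, 1]. The function name `similarities_from_distances` is hypothetical, and the distances are assumed to be finite.

```python
def similarities_from_distances(C, weights):
    """Turn an M x m distance matrix C into M similarities S_i = 1 - D_i.

    C[i][j] is the distance on attribute j between the current user and
    the uploader of copy i. Column-max normalization is an assumption.
    """
    m = len(weights)
    col_max = [max(row[j] for row in C) for j in range(m)]
    # Standard matrix R: every column scaled into [0, 1].
    R = [[row[j] / col_max[j] if col_max[j] > 0 else 0.0 for j in range(m)]
         for row in C]
    # Weighted overall distance D_i is row i of R times the weight vector.
    D = [sum(w * r for w, r in zip(weights, row)) for row in R]
    return [1.0 - d for d in D]


C = [[2.0, 0.0],
     [4.0, 1.0]]
S = similarities_from_distances(C, weights=[0.5, 0.5])
print(S)
```

In this toy input the first uploader is close to the current user on both attributes and receives a similarity of 0.75, while the second is the farthest on both and receives 0.0.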
Step S205: obtaining a similarity set according to the similarity of the user attribute information of the current user and the uploading user of each copy data;
step S206: determining the total number of the similarity exceeding a similarity threshold in the similarity set, judging whether the total number exceeds a number threshold, if so, entering a step S207, otherwise, entering a step S208;
step S207: determining that the current user is a group user, updating the popularity value of the data to be uploaded according to a growth curve function model, and storing the data to be uploaded;
step S208: determining that the current user is a personal user, and adding one to the popularity value of the data to be uploaded;
as described above, this embodiment chooses how to adjust the popularity value according to the type of the current user, which is either an individual user or a group user. Only uploads by individual users can change the popularity state of the data, which better protects data inside a group. Specifically, when $A_0$ is an individual user, the popularity value is updated as follows:
count(F) = count(F) + 1
When $A_0$ is a group user, the count is adjusted dynamically, as follows:
[Formula image not reproduced: growth curve (Pearl model) update of count(F)]
where a and b are constants.
Step S209: and when the updated popularity value is larger than the popularity threshold value, executing data deduplication operation on the data, otherwise, storing the data to be uploaded.
In this embodiment, after the current user requests to upload data, if it is determined that the popularity value of the data exceeds the popularity threshold, data deduplication operation is performed, it is ensured that only one copy of data is stored in the cloud storage server, and the cloud storage server creates an access link for the user who owns the data, so that the purposes of saving network bandwidth and storage space are achieved.
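Steps S206 to S209 can be tied together in one hedged sketch. The function name `handle_upload`, its parameters, and the returned action strings are hypothetical. The group update rule is passed in as a callable because the growth curve formula is not available in this text; the identity function used in the example is only a stand-in, not the rule of the application. The strict "larger than" comparison follows the wording of step S209.

```python
def handle_upload(count, similarities, f, z, T, group_update):
    """Update the popularity value for one upload and choose the action.

    count: current popularity value of the data.
    f, z, T: similarity, number and popularity thresholds.
    group_update: callable giving the new count for a group user (assumed).
    """
    total = sum(1 for s in similarities if s > f)        # S206
    if total > z:
        new_count = group_update(count)                  # S207: group user
    else:
        new_count = count + 1                            # S208: individual user
    action = "deduplicate" if new_count > T else "store"  # S209
    return new_count, action


stand_in = lambda c: c  # placeholder for the growth curve update
print(handle_upload(4, [0.1, 0.2], f=0.8, z=1, T=4, group_update=stand_in))
print(handle_upload(4, [0.9, 0.95], f=0.8, z=1, T=4, group_update=stand_in))
```

The first upload comes from an individual user, so the count rises from 4 to 5 and passes the threshold; the second comes from a group user, so the count is governed by the group rule and the data is simply stored.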
In summary, this embodiment provides a data monitoring method for a cloud storage server, implemented on the cloud storage server. The main process includes: receiving an upload request sent by the current user, where the request includes the encrypted data to be uploaded; acquiring user attribute information of the current user; calculating attribute similarity between the current user and all users known to have uploaded the data to the cloud storage server; determining the user type of the current user (individual user or group user) from the similarity results; updating the popularity value of the data in the way that corresponds to the user type; and storing the data or performing a data deduplication operation.
Therefore, the method determines the similarity between the current user and the uploading user of each copy through the user attribute information, determines the user type from the similarity, and adaptively adjusts how the popularity value of the data to be uploaded is counted according to the user type. The method identifies likely group users from the similarity of user attributes and ensures that uploads by group users never change the popularity state of the data (that is, never turn the data from non-popular into popular). This protects the data and related encryption parameters in the data deduplication system, slows the approach of a group's internal data to the popularity threshold, addresses the internal data leakage that could otherwise result, and better protects user data.
In the following, a data monitoring apparatus of a cloud storage server provided in an embodiment of the present application is introduced, and a data monitoring apparatus of a cloud storage server described below and a data monitoring method of a cloud storage server described above may be referred to correspondingly.
As shown in fig. 3, the apparatus includes:
the attribute information acquisition module 301: configured to acquire user attribute information of data to be uploaded;
the similarity determination module 302: configured to determine the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded, to obtain a similarity set;
the quantity determination module 303: configured to determine the total number of similarities in the similarity set that exceed a similarity threshold;
the update module 304: configured to, when the total number exceeds a number threshold, update the popularity value of the data to be uploaded according to a growth curve function model, and store the data to be uploaded.
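The work of the quantity determination module 303 and the threshold test that follows it can be sketched as below. The function names and the string labels "group" and "individual" are illustrative assumptions; the strict "exceeds" comparisons follow the wording of the text.

```python
from typing import Iterable


def count_similar(similarities: Iterable[float],
                  similarity_threshold: float) -> int:
    """Total number of similarities that exceed the similarity threshold."""
    return sum(1 for s in similarities if s > similarity_threshold)


def classify_uploader(similarities: Iterable[float],
                      similarity_threshold: float,
                      number_threshold: int) -> str:
    """Return "group" when enough earlier uploaders resemble the current one."""
    total = count_similar(similarities, similarity_threshold)
    return "group" if total > number_threshold else "individual"
```

For example, with a similarity set of [0.9, 0.8, 0.2], a similarity threshold of 0.7, and a number threshold of 1, two similarities exceed the threshold, so the uploader is treated as a group user.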
As a specific implementation, the update module 304 includes:
a group user updating unit: configured to, when the total number exceeds the number threshold, determine that the uploading user of the data to be uploaded is a group user, and update the popularity value of the data to be uploaded according to the growth curve function model;
an individual user updating unit: configured to, when the total number does not exceed the number threshold, determine that the uploading user of the data to be uploaded is an individual user, and add one to the popularity value of the data to be uploaded.
As a specific implementation manner, the similarity determination module 302 includes:
a type setting unit: configured to set a corresponding similarity calculation mode for each information type of user attribute information;
a similarity calculation unit: configured to, during similarity calculation, determine a target similarity calculation mode according to the information type of the current user attribute information, and determine, according to the target similarity calculation mode, the similarity between the current user attribute information of the data to be uploaded and the current user attribute information of the copy data of the data to be uploaded.
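One way to picture the two units above is a lookup table from information type to similarity function. The sketch below is an assumption throughout: the type keys, the two similarity functions, and the fixed value range of 100 are illustrative and are not taken from the embodiment, which does not spell out the per-type calculation here.

```python
def numeric_similarity(x: float, y: float, value_range: float) -> float:
    """Hypothetical rule: 1 minus the absolute difference scaled by the range."""
    if value_range <= 0:
        raise ValueError("value_range must be positive")
    return max(0.0, 1.0 - abs(x - y) / value_range)


def symbolic_similarity(x: object, y: object) -> float:
    """Hypothetical rule: exact match scores 1, anything else scores 0."""
    return 1.0 if x == y else 0.0


# "Type setting": one similarity calculation mode per information type.
SIMILARITY_BY_TYPE = {
    "numerical": lambda x, y: numeric_similarity(x, y, value_range=100.0),
    "symbolic": symbolic_similarity,
}


def attribute_similarity(info_type: str, x, y) -> float:
    """"Similarity calculation": pick the mode for this type, then apply it."""
    try:
        mode = SIMILARITY_BY_TYPE[info_type]
    except KeyError:
        raise ValueError(f"no similarity mode set for type {info_type!r}") from None
    return mode(x, y)
```

Keeping the modes in a table means a new information type can be supported by registering one more function, without touching the calculation unit.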
The data monitoring apparatus of the cloud storage server in this embodiment is used to implement the foregoing data monitoring method of the cloud storage server, and therefore a specific implementation manner of the apparatus may be seen in the foregoing embodiment parts of the data monitoring method of the cloud storage server, for example, the attribute information obtaining module 301, the similarity determining module 302, the number determining module 303, and the updating module 304 are respectively used to implement steps S101, S102, S103, and S104 in the foregoing data monitoring method of the cloud storage server. Therefore, specific embodiments thereof may be referred to in the description of the corresponding respective partial embodiments, and will not be described herein.
In addition, since the data monitoring apparatus of the cloud storage server of this embodiment is used to implement the data monitoring method of the cloud storage server, the role of the data monitoring apparatus corresponds to that of the method described above, and details are not described here.
In addition, the present application further provides a data monitoring device of a cloud storage server, as shown in fig. 4, including:
the memory 401: for storing a computer program;
the processor 402: configured to execute the computer program to implement the steps of the data monitoring method of the cloud storage server described above.
Finally, the present application provides a computer-readable storage medium having stored thereon a computer program for implementing the steps of a data monitoring method of a cloud storage server as described above when the computer program is executed by a processor.
The data monitoring device and the computer-readable storage medium of the cloud storage server in this embodiment are used to implement the data monitoring method of the cloud storage server, so the specific implementation manners of the device and the computer-readable storage medium can be seen in the foregoing part of the embodiment of the data monitoring method of the cloud storage server, and the functions of the device and the computer-readable storage medium correspond to those of the embodiment of the method, and are not described herein again.
The embodiments are described in a progressive manner: each embodiment focuses on its differences from the others, and the same or similar parts may be cross-referenced among the embodiments. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant details can be found in the description of the method.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The solutions provided in the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of these examples is intended only to help in understanding the method and its core ideas. A person skilled in the art may, following the ideas of the present application, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (7)

1. A data monitoring method of a cloud storage server is characterized by comprising the following steps:
acquiring user attribute information of data to be uploaded;
determining the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded to obtain a similarity set;
determining a total number of similarities in the set of similarities that exceed a similarity threshold;
when the total number exceeds a number threshold, updating the popularity value of the data to be uploaded according to a growth curve function model, and storing the data to be uploaded;
wherein updating the popularity value of the data to be uploaded according to a growth curve function model when the total number exceeds a number threshold comprises:
when the total number exceeds a number threshold value, determining that the uploading users of the data to be uploaded are group users, and updating the popularity value of the data to be uploaded according to a growth curve function model;
and when the total number does not exceed the number threshold, determining that the uploading user of the data to be uploaded is an individual user, and adding one to the popularity value of the data to be uploaded.
2. The method of claim 1, wherein the determining the similarity between the user attribute information of the data to be uploaded and the user attribute information of the copy data of the data to be uploaded comprises:
setting corresponding similarity calculation modes for the user attribute information of different information types respectively;
in the process of calculating the similarity, a target similarity calculation mode is determined according to the information type of the current user attribute information, and the similarity between the current user attribute information of the data to be uploaded and the current user attribute information of the copy data of the data to be uploaded is determined according to the target similarity calculation mode.
3. The method of claim 2, wherein the information type comprises any one or more of: a deterministic numerical type, a deterministic symbolic type, a deterministic regional type, a fuzzy numerical type, and a fuzzy semantic type.
4. A data monitoring device of a cloud storage server is characterized by comprising:
an attribute information acquisition module: configured to acquire user attribute information of data to be uploaded;
a similarity determination module: configured to determine the similarity between the user attribute information of the data to be uploaded and the user attribute information of the duplicate data of the data to be uploaded, to obtain a similarity set;
a quantity determination module: configured to determine the total number of similarities in the similarity set that exceed a similarity threshold;
an update module: configured to, when the total number exceeds a number threshold, update the popularity value of the data to be uploaded according to a growth curve function model, and store the data to be uploaded;
wherein the update module includes:
a group user updating unit: configured to, when the total number exceeds the number threshold, determine that the uploading user of the data to be uploaded is a group user, and update the popularity value of the data to be uploaded according to the growth curve function model;
an individual user updating unit: configured to, when the total number does not exceed the number threshold, determine that the uploading user of the data to be uploaded is an individual user, and add one to the popularity value of the data to be uploaded.
5. The apparatus of claim 4, wherein the similarity determination module comprises:
a type setting unit: configured to set a corresponding similarity calculation mode for each information type of user attribute information;
a similarity calculation unit: configured to, during similarity calculation, determine a target similarity calculation mode according to the information type of the current user attribute information, and determine, according to the target similarity calculation mode, the similarity between the current user attribute information of the data to be uploaded and the current user attribute information of the copy data of the data to be uploaded.
6. A data monitoring device of a cloud storage server is characterized by comprising:
a memory: for storing a computer program;
a processor: configured to execute the computer program to implement the steps of the data monitoring method of the cloud storage server according to any one of claims 1 to 3.
7. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, is configured to implement the steps of the data monitoring method of a cloud storage server according to any one of claims 1 to 3.
CN201910507052.0A 2019-06-12 2019-06-12 Data monitoring method, device and equipment of cloud storage server Active CN110222043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910507052.0A CN110222043B (en) 2019-06-12 2019-06-12 Data monitoring method, device and equipment of cloud storage server

Publications (2)

Publication Number Publication Date
CN110222043A CN110222043A (en) 2019-09-10
CN110222043B true CN110222043B (en) 2021-08-24

Family

ID=67816678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910507052.0A Active CN110222043B (en) 2019-06-12 2019-06-12 Data monitoring method, device and equipment of cloud storage server

Country Status (1)

Country Link
CN (1) CN110222043B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10990402B1 (en) 2019-12-18 2021-04-27 Red Hat, Inc. Adaptive consumer buffer

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103997512A (en) * 2014-04-14 2014-08-20 南京邮电大学 Data duplicate quantity determination method for cloud storage system
CN106612320A (en) * 2016-06-14 2017-05-03 四川用联信息技术有限公司 Encrypted data dereplication method for cloud storage
CN108984574A (en) * 2017-06-05 2018-12-11 北京嘀嘀无限科技发展有限公司 Data processing method and device
CN109818757A (en) * 2019-03-18 2019-05-28 广东工业大学 Cloud storage data access control method, Attribute certificate awarding method and system
CN109863734A (en) * 2016-11-21 2019-06-07 英特尔公司 The data management in network centered on information

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150199379A1 (en) * 2012-10-30 2015-07-16 Google Inc. Sorting and searching of related content based on underlying file metadata
US10643135B2 (en) * 2016-08-22 2020-05-05 International Business Machines Corporation Linkage prediction through similarity analysis
CN107295358B (en) * 2016-08-31 2019-11-05 北京师范大学珠海分校 A kind of 3D Streaming Media storage method under cloud environment
CN106682126B (en) * 2016-12-14 2020-09-25 河海大学 Topic data set filtering and sorting method and system based on overall data quality

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
基于用户定义安全条件的可验证重复数据删除方法;刘红燕等;《计算机研究与发展》;20181015;第2134-2148页 *

Also Published As

Publication number Publication date
CN110222043A (en) 2019-09-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220915

Address after: 250014 No. 19, ASTRI Road, Lixia District, Shandong, Ji'nan

Patentee after: Shandong center information technology Limited by Share Ltd.

Address before: 266100 Hongkong East Road, Laoshan District, Qingdao, Shandong Province, No. 7

Patentee before: QINGDAO University
