CN113254502A

CN113254502A - Method, system, device and medium for filtering and valuing multi-party homogeneous data

Info

Publication number: CN113254502A
Application number: CN202110630333.2A
Authority: CN
Inventors: 李京昆; 刘文思; 洪薇; 洪健
Original assignee: Hubei Yangzhong Jushi Information Technology Co ltd
Current assignee: Hubei Yangzhong Jushi Information Technology Co ltd
Priority date: 2021-06-07
Filing date: 2021-06-07
Publication date: 2021-08-13
Anticipated expiration: 2041-06-07
Also published as: CN113254502B

Abstract

The invention discloses a method, a system, a device and a medium for filtering and valuing multi-party homogeneous data, and relates to the field of data security. The method comprises: acquiring multi-party homogeneous data, wherein the quantity of the homogeneous data is E; judging the types of the multi-party homogeneous data, and classifying the homogeneous data into static data or dynamic data; acquiring a first filtering variable according to the static data and the dynamic data; filtering the homogeneous data according to the first filtering variable to obtain a plurality of first data, wherein the plurality of first data correspond one-to-one to the multi-party homogeneous data; calculating the first data to obtain a plurality of first data hash values; and obtaining a consistent calculation result by comparing the plurality of first data hash values. The invention can improve the efficiency of verifying the authenticity of homogeneous data in various environments and ensure the consistency of the data.

Description

Method, system, device and medium for filtering and valuing multi-party homogeneous data

Technical Field

The invention relates to the field of data security, in particular to a method, a system, a device and a medium for filtering and valuing multi-party homogeneous data.

Background

A Data Source Name (DSN) is a data structure that contains the information about a particular database that an Open Database Connectivity (ODBC) driver needs in order to connect to that database. The DSN is stored in the registry or as a separate text file, and the information it contains includes the database name, directory and driver, and, depending on the type of DSN, the user ID and password. Developers create a separate DSN for each database. To connect to a certain database, a developer specifies the DSN in the program. In contrast, a connection without a DSN requires all necessary information to be specified in the program.

There are three types of DSNs: user DSNs (sometimes also called machine DSNs), system DSNs and file DSNs. Both user and system DSNs are specific to a particular computer, with the DSN information stored in the registry. A user DSN allows a single user to access a database on a single computer, and a system DSN allows multiple users on a computer to access a database. A file DSN stores the relevant information in a text file with the .dsn extension and can be shared by multiple users on different computers that have the same driver installed.

Data authentication technology is an important means of ensuring data authenticity in an open network environment and has attracted wide attention from industry and academia. Many research results have been obtained in this field, including digital signature technology, Message Authentication Codes (MAC) and Authenticated Data Structures (ADS). These techniques have been widely applied to data authenticity verification and play an important role. However, as network application environments continue to expand, no single data authenticity authentication technology is applicable to all of them. When multi-party homogeneous data are calculated, the homogeneous data acquired by an initiator from different parties may differ because of variable data such as the calculation environment and timestamps, so when a consistency check is performed on the calculated hash values, a consistent result cannot be obtained mathematically. Therefore, corresponding data authenticity authentication methods need to be constructed for different network application scenarios.

Therefore, in order to further improve the verification of the authenticity of homogeneous data in various environments, a method for filtering and valuing multi-party homogeneous data is urgently needed.

Disclosure of Invention

The invention aims to improve the verification efficiency of the authenticity of the same-type data in various environments and ensure the consistency of the data.

In order to achieve the above object, the present invention provides a method for filtering and valuing multi-party homogeneous data, comprising:

acquiring multi-party homogeneous data, wherein the quantity of the homogeneous data is E;

judging the types of the multi-party homogeneous data, and classifying the homogeneous data into static data or dynamic data;

acquiring a first filtering variable according to the static data and the dynamic data;

filtering the homogeneous data according to the first filtering variable to obtain a plurality of first data, wherein the plurality of first data correspond one-to-one to the multi-party homogeneous data;

calculating the first data to obtain a plurality of first data hash values;

and obtaining a consistent calculation result by comparing the plurality of first data hash values.

Furthermore, when multi-party homogeneous data are acquired, if hash values are compared directly on the data as acquired, variable data such as differing computing environments and timestamps affect the comparison: the same hash value cannot be obtained, so a consistent result cannot be reached, and the efficiency of verifying the authenticity of the homogeneous data suffers.

In order to improve the efficiency of verifying the authenticity of the homogeneous data, the homogeneous data are classified into static data and dynamic data before being calculated. Static data do not change over a long period of time and generally do not change during operation. Static data do not include input data, output data, or data that are changed in online operation; they are used primarily as control or reference information during operation. Dynamic data change over time in the system application. Dynamic data include input data, output data, and data that are changed in online operation, and they directly reflect the transaction process. Then, a first filtering variable for eliminating noise is obtained according to the static data and the dynamic data; the first data obtained by filtering are hashed to obtain first data hash values; and the plurality of first data hash values are compared to obtain a consistent calculation result, so that the influence of variable data on the consistent calculation result is avoided.
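The effect described above can be illustrated with a minimal Python sketch. The record layout, the field names (`fetched_at`, `host`) and the use of SHA-256 over canonical JSON are assumptions made for illustration only; the source does not specify a hash function or a data format.

```python
import hashlib
import json


def digest(record: dict) -> str:
    """Hash a record via a canonical JSON encoding (sorted keys)."""
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def strip_variable_fields(record: dict, variable_fields: set) -> dict:
    """Drop fields that vary with the computing environment or with time."""
    return {k: v for k, v in record.items() if k not in variable_fields}


# Two parties hold the same business data but different variable data
# (hypothetical field names).
party_a = {"employee_id": "E001", "department": "R&D",
           "fetched_at": "2021-06-07T10:00:00", "host": "node-a"}
party_b = {"employee_id": "E001", "department": "R&D",
           "fetched_at": "2021-06-07T10:00:03", "host": "node-b"}

# Hashing the data as acquired: the variable fields make the hashes differ.
raw_match = digest(party_a) == digest(party_b)

# Hashing after the variable fields are filtered out: the hashes agree.
noise = {"fetched_at", "host"}
clean_a = strip_variable_fields(party_a, noise)
clean_b = strip_variable_fields(party_b, noise)
filtered_match = digest(clean_a) == digest(clean_b)
```

Here `raw_match` is false and `filtered_match` is true, which is the behaviour the filtering step is meant to achieve.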

Specifically, acquiring the first filtering variable according to the static data and the dynamic data includes:

extracting dynamic information in the dynamic data and extracting static information in the static data;

and identifying a timestamp corresponding to the static information in the dynamic information according to the static information, wherein the timestamp is a first filtering variable.

Filtering the homogeneous data according to the first filtering variable specifically comprises the following steps:

screening out data information corresponding to the dynamic data and the static data under the timestamp according to the timestamp;

and taking the data information as first data corresponding to the same kind of data.
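One possible reading of these two steps is sketched below in Python. It assumes that the dynamic information is a list of timestamped entries and that a timestamp "corresponds to" the static information when an entry's fields equal the static values; the function names, field names and sample values are hypothetical and are not taken from the source.

```python
from typing import Optional


def find_first_filter_variable(static_info: dict,
                               dynamic_info: list) -> Optional[str]:
    """Return the timestamp of the first dynamic entry that agrees
    with every field of the static information, or None if none does."""
    for entry in dynamic_info:
        if all(entry.get(k) == v for k, v in static_info.items()):
            return entry["timestamp"]
    return None


def select_first_data(dynamic_info: list, timestamp: str) -> list:
    """Keep only the data information recorded under the given timestamp."""
    return [e for e in dynamic_info if e["timestamp"] == timestamp]


static_info = {"employee_id": "E001", "department": "R&D"}

party_a = [
    {"timestamp": "2021-06-07T10:00:00", "employee_id": "E001",
     "department": "Sales"},
    {"timestamp": "2021-06-07T10:05:00", "employee_id": "E001",
     "department": "R&D"},
]
party_b = [
    {"timestamp": "2021-06-07T10:05:00", "employee_id": "E001",
     "department": "R&D"},
    {"timestamp": "2021-06-07T10:09:00", "employee_id": "E001",
     "department": "R&D"},
]

# The timestamp is the first filtering variable.
ts = find_first_filter_variable(static_info, party_a)

# First data: one filtered result per party.
first_data = {
    "A": select_first_data(party_a, ts),
    "B": select_first_data(party_b, ts),
}
```

After filtering, both parties contribute only the entry recorded under the shared timestamp, so their first data can be hashed and compared.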

Obtaining a consistent calculation result by comparing the plurality of first data hash values specifically comprises:

comparing the plurality of first data hash values, wherein if the plurality of first data hash values are consistent, the first data hash values are consistent calculation results;

and comparing the plurality of first data hash values, if the first data hash values are inconsistent, screening the inconsistent first data hash values to serve as first difference values, acquiring the quantity Y of the first difference values, and judging a consistent calculation result according to the quantity Y of the first difference values.

Judging the consistent calculation result according to the number Y of the first difference values specifically includes:

if the number Y of the first difference values is smaller than the threshold value X, the first data hash value is a consistent calculation result;

if the number Y of the first difference values is larger than or equal to the threshold value X, obtaining a second filtering variable according to the static data and the dynamic data, and screening the first data according to the second filtering variable to obtain a plurality of second data, wherein the plurality of second data correspond one-to-one to the multi-party homogeneous data; calculating the second data to obtain a plurality of second data hash values, comparing the plurality of second data hash values, and if the plurality of second data hash values are consistent, determining the second data hash values as the consistent calculation result;

if the second data hash values are inconsistent, screening the inconsistent second data hash values as second difference values, and acquiring the quantity T of the second difference values, wherein if the quantity T of the second difference values is smaller than the threshold value X, the second data hash values are the consistent calculation result;

if the number T of the second difference values is larger than or equal to the threshold value X, acquiring a second filtering variable again and screening second data until the number T of the second difference values is smaller than the threshold value X; the threshold value X is smaller than the number E of the homogeneous data, and the threshold value X is set according to the number E of the homogeneous data.
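The comparison and threshold test can be sketched as follows. The source does not say which hash value serves as the reference when the values disagree; this Python sketch assumes the most common value is the reference and counts every other value as a difference value. The function names and the placeholder hash strings are hypothetical.

```python
from collections import Counter
from typing import Optional, Tuple


def count_differences(hashes: dict) -> Tuple[str, int]:
    """Return (reference_hash, Y), where Y is the number of hash values
    that differ from the most common hash value (an assumed reference)."""
    reference, occurrences = Counter(hashes.values()).most_common(1)[0]
    return reference, len(hashes) - occurrences


def judge(hashes: dict, threshold_x: int) -> Optional[str]:
    """Return the consistent hash value, or None when the number of
    difference values reaches the threshold and further filtering is needed."""
    reference, y = count_differences(hashes)
    if y == 0 or y < threshold_x:
        return reference
    return None


# Four parties; the hash strings are placeholders, not real digests.
hashes = {"p1": "aa", "p2": "aa", "p3": "bb", "p4": "aa"}

reference, y = count_differences(hashes)
accepted = judge(hashes, threshold_x=2)   # Y = 1 is below X = 2
rejected = judge(hashes, threshold_x=1)   # Y = 1 is not below X = 1
```

A return value of `None` signals that a second filtering variable should be acquired and the comparison repeated on the newly screened data.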

Acquiring a second filtering variable according to the static data and the dynamic data, which specifically comprises the following steps:

acquiring all attribute data of static data and dynamic data;

hashing the attribute data and storing the resulting attribute data hash values in an attribute hash table;

acquiring an attribute data hash value from an attribute hash table, and deleting the attribute data hash value from the attribute hash table;

and screening out second data from the first data according to the attribute data hash value.

If a consistent calculation result cannot be reached after filtering with the first filtering variable, the homogeneous data filtered by the first filtering variable are filtered further using the second filtering variable. That is, the first data are screened according to the second filtering variable to obtain second data, the second data are hashed to obtain second data hash values, and the consistent calculation result is obtained by comparing the second data hash values. Noise information is thereby further removed, the accuracy of the consistent calculation result is better ensured, and the efficiency of verifying the authenticity of the homogeneous data is improved.
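A minimal Python sketch of the attribute hash table follows. It assumes that the table maps each attribute name to the hash of a reference value, that entries are consumed in insertion order, and that "screening" keeps the records whose attribute hashes to the selected value. These choices, together with the function names and sample records, are illustrative assumptions rather than details given in the source.

```python
import hashlib
from typing import Tuple


def h(value: str) -> str:
    """SHA-256 of a string value (the hash function is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_attribute_hash_table(attributes: dict) -> dict:
    """Hash every attribute value and store it under the attribute name."""
    return {name: h(value) for name, value in attributes.items()}


def next_filter_variable(table: dict) -> Tuple[str, str]:
    """Take one attribute data hash value from the table and delete it,
    so that the same attribute is not used twice."""
    name = next(iter(table))
    return name, table.pop(name)


def screen(records: list, name: str, attr_hash: str) -> list:
    """Keep the records whose attribute `name` hashes to `attr_hash`."""
    return [r for r in records if name in r and h(r[name]) == attr_hash]


table = build_attribute_hash_table({"employee_id": "E001", "name": "Alice"})

first_data = [
    {"employee_id": "E001", "name": "Alice"},
    {"employee_id": "E002", "name": "Bob"},
    {"employee_id": "E001", "name": "Alicia"},
]

# First pass: screen by the employee ID hash value.
attr, value_hash = next_filter_variable(table)
second_data = screen(first_data, attr, value_hash)

# Second pass, if still needed: screen by the name hash value.
attr2, value_hash2 = next_filter_variable(table)
third_data = screen(second_data, attr2, value_hash2)
```

Deleting each hash value once it has been taken means that every repetition of the step narrows the data with an attribute that has not been used before.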

Corresponding to the method in the invention, the invention also provides a system for filtering and valuing multi-party homogeneous data, which comprises an acquisition module, a classification module, a first filtering module and a judgment module;

the acquisition module is used for acquiring multi-party homogeneous data;

the classification module is used for classifying the same-class data into static data or dynamic data;

the first filtering module is used for filtering the homogeneous data according to a first filtering variable to obtain a plurality of first data, which correspond one-to-one to the multi-party homogeneous data;

the judging module is used for acquiring a plurality of first data hash values according to the plurality of first data, and obtaining a consistent calculation result by comparing the plurality of first data hash values.

Furthermore, the system also comprises a second filtering module, wherein the second filtering module is used for screening the first data according to a second filtering variable to obtain second data and sending the second data to the judging module;

the judging module is further used for obtaining a plurality of second data hash values according to the plurality of second data, and obtaining a consistent calculation result by comparing the plurality of second data hash values.

Corresponding to the method in the invention, the invention also provides an electronic device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the steps of the method for filtering and valuing the multi-party same-kind data when executing the computer program.

Corresponding to the method in the invention, the invention also provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the steps of the method for filtering and valuing multi-party homogeneous data.

One or more technical schemes provided by the invention at least have the following technical effects or advantages:

The invention filters the homogeneous data through the first filtering variable to preliminarily eliminate noise information, and then calculates the multi-party homogeneous data, so that multi-party verification can be performed on the filtered data to obtain a consistent calculation result. When a consistent calculation result cannot be obtained, the first data are filtered further through the second filtering variable, which eliminates more noise data and facilitates consistency verification of the multi-party calculation results, so that a consistent calculation result is obtained and the verification of the authenticity of homogeneous data in various environments is improved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:

FIG. 1 is a flow chart of a method for filtering and valuing multi-party homogeneous data;

FIG. 2 is a schematic composition diagram of a system for filtering and valuing multi-party homogeneous data.

Detailed Description

In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflicting with each other.

In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described and thus the scope of the present invention is not limited by the specific embodiments disclosed below.

It will be understood by those skilled in the art that in the present disclosure, the terms "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like are used in an orientation or positional relationship indicated in the drawings for ease of description and simplicity of description, and do not indicate or imply that the referenced devices or components must be constructed and operated in a particular orientation and thus are not to be considered limiting.

It is understood that the terms "a" and "an" should be interpreted as meaning "at least one" or "one or more"; that is, the number of an element may be one in one embodiment and more than one in another embodiment, and the terms "a" and "an" should not be interpreted as limiting the number.

Example one

Referring to FIG. 1, which is a schematic flow chart of a method for filtering and valuing multi-party homogeneous data, the present invention provides a method for filtering and valuing multi-party homogeneous data, the method including:

acquiring first data to obtain a plurality of first data hash values;

Acquiring a filtering variable according to the static data and the dynamic data, which specifically comprises the following steps:

The method for judging the consistent calculation result through the number Y of the first difference values specifically comprises the following steps:

acquiring all attribute data of static data and dynamic data;

calculating to an attribute hash table according to the attribute data hash;

Example two

The present embodiment introduces the method for filtering and valuing multi-party homogeneous data in the present invention with reference to specific examples:

step one: acquiring multi-party homogeneous data, namely homogeneous data A, homogeneous data B, homogeneous data C, homogeneous data D and homogeneous data F; in this embodiment, the number E of homogeneous data is 5;

step two: judging the types of the multi-party homogeneous data, and classifying the homogeneous data into static data or dynamic data;

2.1 comparing the homogeneous data against the historical hash values of the same data from the same source: if the data is consistent over time, it is static data, and if it is inconsistent, it is dynamic data;

2.2 comparing the static data with the dynamic data to obtain the timestamp information at which the static data and the dynamic data are equal, this timestamp being the first filtering variable;

2.3 filtering the static data and the dynamic data according to the timestamp to obtain first data, wherein the timestamp information in the first data is the same as the acquired timestamp information;

step three: acquiring first data to obtain a plurality of first data hash values;

3.1 calculating first data to obtain a hash value A, a hash value B, a hash value C, a hash value D and a hash value F;

step four: obtaining a consistent calculation result by comparing the plurality of first data hash values;

4.1 comparing the hash value A, the hash value B, the hash value C, the hash value D and the hash value F, wherein if they are all consistent, the consistent hash value is the consistent calculation result;

4.2 if the hash value A, the hash value B, the hash value C, the hash value D and the hash value F are inconsistent, screening out the inconsistent first data hash values as first difference values, and acquiring the number Y of the first difference values. In this embodiment, the hash value A, the hash value B and the hash value C are consistent, while the hash value D and the hash value F are inconsistent with the hash value C, so the number Y of the first difference values is 2. In this embodiment, the threshold X is 20% of the number E of the homogeneous data, that is, the threshold X is 1, and the number Y of the first difference values is greater than the threshold X;

4.3, acquiring a second filtering variable according to the static data and the dynamic data;

4.31 acquiring all attribute data of the static data and the dynamic data;

4.32 calculating to an attribute hash table according to the attribute data hash;

4.33 obtaining an attribute data hash value in the attribute hash table, and deleting the attribute data hash value from the attribute hash table;

4.34 screening out second data from the first data according to the attribute data hash value;

for example: the homogeneous data is the information of one employee, and all attribute data of the static data and the dynamic data belong to that employee and comprise employee ID information, name information, gender information, department information and identity card information;

performing hash calculation according to the information to obtain employee ID hash values, name hash values, gender hash values, department hash values and identity card hash values, and storing all the employee ID hash values, the name hash values, the gender hash values, the department hash values and the identity card hash values in an attribute hash table;

when a second filtering variable needs to be acquired, acquiring an employee ID hash value in the attribute hash table, and deleting the employee ID hash value in the attribute hash table after the employee ID hash value is acquired;

screening out second data from the first data according to the attribute data hash value, wherein the second data are data with employee ID hash value identification;

step five: calculating the second data to obtain 5 second data hash values, and comparing the second data hash values;

5.1 comparing the 5 second data hash values, wherein if the second data hash values are consistent, the second data hash values are the consistent calculation result;

5.2 if the second data hash values are inconsistent, screening the inconsistent second data hash values as second difference values, and acquiring the quantity T of the second difference values, wherein if the quantity T of the second difference values is smaller than the threshold value X, the second data hash values are the consistent calculation result;

in this embodiment, the number T of the second difference values is 2, which is greater than the threshold value X, so step 4.3 is performed again to obtain a second filtering variable anew: the name hash value is obtained from the attribute hash table, and the name hash value is deleted from the attribute hash table after it is obtained;

screening third data from the second data according to the name hash value, wherein the third data are data with name hash value identification and employee ID hash value identification;

obtaining the hash value of the third data, comparing, and if the hash values are consistent, the hash value of the third data is a consistent calculation result;

if the third data hash values are inconsistent, screening inconsistent third data hash values as third difference values, and acquiring the number M of the third difference values, wherein if the number M of the third difference values is smaller than a threshold value X, the third data hash values are consistent calculation results;

in this embodiment, the number M of the third difference values in the third data hash value is 1, and the number M of the third difference values is smaller than the threshold X, so that the third data hash value is a consistent calculation result.
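Steps 2.1 and 4.2 of this embodiment can be sketched in Python as follows. The sketch assumes that each field carries a list of historical hash values from the same source and that the threshold X is the integer part of 20% of E. The field names and hash strings are placeholders, not values from the source.

```python
def classify(history: dict) -> dict:
    """Step 2.1: a field whose historical hash values all agree is
    static; a field whose historical hash values vary is dynamic."""
    return {
        field: "static" if len(set(hashes)) == 1 else "dynamic"
        for field, hashes in history.items()
    }


def threshold(e: int, percent: int = 20) -> int:
    """Threshold X as a percentage of the number E of homogeneous data,
    rounded down (the rounding rule is an assumption)."""
    return e * percent // 100


# Historical hash values per field (placeholder strings).
history = {
    "employee_id": ["h_id", "h_id", "h_id"],
    "department": ["h_dep", "h_dep", "h_dep"],
    "last_login": ["h_t1", "h_t2", "h_t3"],
}
labels = classify(history)

# Step 4.2 with the numbers of this embodiment: E = 5 and X = 20% of E.
x = threshold(5)

# Hash values A, B and C agree; D and F differ from C.
hashes = {"A": "h1", "B": "h1", "C": "h1", "D": "h2", "F": "h3"}
y = sum(1 for value in hashes.values() if value != hashes["C"])

# Y >= X, so a second filtering variable has to be acquired (step 4.3).
needs_second_filter = y >= x
```

With these inputs the sketch yields X = 1 and Y = 2, matching the values stated in step 4.2.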

Example three

Referring to FIG. 2, which is a schematic composition diagram of a system for filtering and valuing multi-party homogeneous data, the third embodiment of the present invention provides a system for filtering and valuing multi-party homogeneous data, where the system includes an acquisition module, a classification module, a first filtering module and a judgment module;

The acquisition module is used for acquiring multi-party homogeneous data;

The system also comprises a second filtering module, wherein the second filtering module is used for screening the first data according to a second filtering variable to obtain second data and sending the second data to the judging module;

In the third embodiment of the present invention, the first filtering module is specifically configured as follows: the first filtering module extracts the dynamic information in the dynamic data and the static information in the static data, identifies, according to the static information, the timestamp in the dynamic information that corresponds to the static information, the timestamp being the first filtering variable, and screens out, according to the timestamp, the data information corresponding to the dynamic data and the static data under the timestamp; the data information is taken as the first data corresponding to the homogeneous data.

In the third embodiment of the present invention, the second filtering module is specifically configured as follows: acquiring all attribute data of the static data and the dynamic data; hashing the attribute data and storing the resulting attribute data hash values in an attribute hash table; acquiring an attribute data hash value from the attribute hash table, and deleting the attribute data hash value from the attribute hash table; and screening out second data from the first data according to the attribute data hash value.

In the third embodiment of the present invention, the judging module is specifically configured as follows: calculating the first data to obtain first data hash values, comparing the plurality of first data hash values, and if the first data hash values are consistent, determining the first data hash values as the consistent calculation result; if the first data hash values are inconsistent, screening out the inconsistent first data hash values as first difference values and acquiring the quantity Y of the first difference values; if the quantity Y of the first difference values is larger than or equal to a threshold value X, acquiring a second filtering variable according to the static data and the dynamic data, and screening the first data according to the second filtering variable to obtain a plurality of second data, wherein the plurality of second data correspond one-to-one to the multi-party homogeneous data; calculating the second data to obtain a plurality of second data hash values, comparing the plurality of second data hash values, and if the plurality of second data hash values are consistent, determining the second data hash values as the consistent calculation result; if the second data hash values are inconsistent, screening the inconsistent second data hash values as second difference values and acquiring the quantity T of the second difference values, wherein if the quantity T of the second difference values is smaller than the threshold value X, the second data hash values are the consistent calculation result; if the quantity T of the second difference values is larger than or equal to the threshold value X, acquiring a second filtering variable again and screening second data until the quantity T of the second difference values is smaller than the threshold value X. The threshold value X is smaller than the number E of the homogeneous data, and the threshold value X is set according to the number E of the homogeneous data.
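The control flow of the judging module, including the repeated acquisition of a second filtering variable, might look like the following Python sketch. Several points are assumptions not stated in the source: SHA-256 over canonical JSON as the hash, the most common hash value as the reference for counting difference values, screening by attribute hash, and returning `None` when the attribute hash table is exhausted. All names and sample records are hypothetical.

```python
import hashlib
import json
from collections import Counter


def digest(obj) -> str:
    """Hash any JSON-serialisable object via a canonical encoding."""
    return hashlib.sha256(
        json.dumps(obj, sort_keys=True).encode("utf-8")
    ).hexdigest()


def judge(data_by_party: dict, threshold_x: int):
    """Hash each party's data and return the reference hash value if
    the number of differing hash values is zero or below X, else None."""
    hashes = {party: digest(records) for party, records in data_by_party.items()}
    reference, occurrences = Counter(hashes.values()).most_common(1)[0]
    differing = len(hashes) - occurrences
    if differing == 0 or differing < threshold_x:
        return reference
    return None


def run_judgment(first_data: dict, attribute_table: dict, threshold_x: int):
    """Compare hash values; while too many differ, take the next attribute
    hash value from the table (deleting it) and screen the data again."""
    data = first_data
    result = judge(data, threshold_x)
    while result is None and attribute_table:
        name = next(iter(attribute_table))
        attr_hash = attribute_table.pop(name)
        data = {
            party: [r for r in records if digest(r.get(name)) == attr_hash]
            for party, records in data.items()
        }
        result = judge(data, threshold_x)
    return result


rec = {"employee_id": "E001", "name": "Alice"}
first_data = {
    "A": [rec], "B": [rec], "C": [rec],
    "D": [rec, {"employee_id": "E002", "name": "Bob"}],
    "F": [rec, {"employee_id": "E003", "name": "Carol"}],
}
table = {"employee_id": digest("E001"), "name": digest("Alice")}

result = run_judgment(first_data, table, threshold_x=1)
```

In this run the first comparison finds two differing hash values (parties D and F), which is not below X = 1, so the employee ID hash value is taken from the table and used for screening; the second comparison is then consistent and the loop stops without consuming the name hash value.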

Example four

The fourth embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program that is stored in the memory and can be run on the processor, where the processor implements the steps of the method for filtering and valuing multi-party homogeneous data when executing the computer program.

The processor may be a central processing unit, or may be other general-purpose processor, a digital signal processor, an application specific integrated circuit, an off-the-shelf programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

The memory can be used for storing the computer program and/or the module, and the processor realizes various functions of the multi-party homogeneous data filtering and valuing device in the invention by operating or executing the data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart memory card, a secure digital card, a flash memory card, at least one magnetic disk storage device, a flash memory device, or other volatile solid state storage device.

Example five

An embodiment five of the present invention provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps of the method for filtering and valuing multiple pieces of homogeneous data are implemented.

Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. The method for filtering and valuing the multi-party homogeneous data is characterized by comprising the following steps:

acquiring first data to obtain a plurality of first data hash values;

2. The method for filtering values of multiple homogeneous data according to claim 1, wherein obtaining the filter variable according to the static data and the dynamic data specifically comprises:

3. The method for filtering values of multiple homogeneous data according to claim 2, wherein filtering homogeneous data according to the filter variable specifically comprises:

screening out data information corresponding to the same type of data under the timestamp according to the timestamp;

4. The method for filtering the value of the multiple pieces of similar data according to claim 1, wherein the obtaining of the consistent calculation result by comparing the hash values of the first data specifically comprises:

and comparing the plurality of first data hash values, if the first data hash values are inconsistent, screening the inconsistent first data hash values as first difference values, acquiring the quantity Y of the first difference values, and acquiring a consistent calculation result through the quantity Y of the first difference values.

5. The method for filtering the value of the multi-party homogeneous data according to claim 4, wherein the obtaining of the consistent calculation result through the number Y of the first difference values specifically comprises:

if the number Y of the first difference values is larger than or equal to the threshold value X, obtaining a second filtering variable according to the static data and the dynamic data, and screening the first data according to the second filtering variable to obtain a plurality of second data, wherein the plurality of second data correspond one-to-one to the multi-party homogeneous data;

acquiring second data to obtain a plurality of second data hash values, comparing the plurality of second data hash values, and if the plurality of second data hash values are consistent, determining the second data hash values as consistent calculation results;

6. The method for filtering values of multiple similar data according to claim 5, wherein obtaining the second filter variable according to the static data and the dynamic data specifically comprises:

acquiring all attribute data of static data and dynamic data;

hashing the attribute data and storing the resulting attribute data hash values in an attribute hash table;

7. The system for filtering and valuing the multi-party homogeneous data is characterized by comprising an acquisition module, a classification module, a first filtering module and a judgment module;

the acquisition module is used for acquiring multi-party homogeneous data;

the first filtering module is used for screening homogeneous data according to a first filtering variable to obtain a plurality of first data which correspond to the multiple homogeneous data one by one;

8. The system for filtering values of multiple similar data according to claim 7, further comprising a second filtering module, wherein the second filtering module is configured to filter the first data according to a second filtering variable to obtain second data, and send the second data to the determining module;

9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, performs the steps of the method for filtering and valuing multi-party homogeneous data according to any one of claims 1 to 6.

10. A storage medium, in which a computer program is stored, which, when executed by a processor, carries out the steps of the method for filtering and valuing multi-party homogeneous data according to any one of claims 1 to 6.