CN112650454A - Internet of things multi-source data storage method and device based on deduplication rule - Google Patents

Internet of things multi-source data storage method and device based on deduplication rule

Info

Publication number
CN112650454A
Authority
CN
China
Prior art keywords
data
internet
things
rule
deduplication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011642078.5A
Other languages
Chinese (zh)
Inventor
温文坤
马凤鸣
王鑫
李玮棠
林英喜
陈杰文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jixiang Technology Co Ltd
Original Assignee
Guangzhou Jixiang Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jixiang Technology Co Ltd
Priority to CN202011642078.5A
Publication of CN112650454A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G06F3/0641 De-duplication techniques

Abstract

The embodiment of the invention discloses an Internet of Things multi-source data storage method and device based on a deduplication rule. The method comprises the following steps: a first node device receives Internet of Things data sets sent by a plurality of second node devices and determines the association degree of the data content in the Internet of Things data sets; a deduplication processing rule is determined according to the association degree of the data content and the data type; and the Internet of Things data of the second node devices corresponding to the same deduplication processing rule is sent to a processing server in the corresponding processing cluster, and the data is stored after the server performs deduplication processing. The scheme improves data storage efficiency, facilitates data transmission and maintenance, and ensures data stability and security.

Description

Internet of things multi-source data storage method and device based on deduplication rule
Technical Field
The embodiment of the application relates to the technical field of Internet of things, in particular to a method and a device for storing multi-source data of the Internet of things based on a deduplication rule.
Background
With the popularization of Internet of Things technology and the increasingly powerful functions of intelligent devices, Internet of Things devices play an increasingly important role in people's daily lives. Various intelligent Internet of Things terminal devices are applied in various fields. Generally, the Internet of Things connects objects to the Internet through various information sensing devices, so that information can be exchanged among all ordinary physical objects that can be independently addressed, ultimately achieving intelligent identification, positioning, tracking, monitoring and management.
Internet of Things data comes from different sensing devices, such as recognizers, video devices, temperature sensors and humidity sensors, and the data formats and semantic structures generally differ to some extent. Meanwhile, with the popularization and large-scale application of Internet of Things devices, the volume of Internet of Things data has increased sharply, and how to store Internet of Things data of various types and with various requirements efficiently and reasonably is a problem that urgently needs to be solved.
Disclosure of Invention
The embodiment of the invention provides an internet of things multi-source data storage method and device based on a deduplication rule, which improve data storage efficiency, facilitate data transmission and maintenance, and simultaneously ensure data stability and safety.
In a first aspect, an embodiment of the present invention provides an internet-of-things multi-source data storage method based on a deduplication rule, where the method includes:
a first node device receives Internet of Things data sets sent by a plurality of second node devices, and determines the association degree of the data content in the Internet of Things data sets;
determining a deduplication processing rule according to the association degree of the data content and the data type;
and sending the Internet of Things data of the second node devices corresponding to the same deduplication processing rule to a processing server in the corresponding processing cluster, so that the data is stored after the server performs deduplication processing.
Optionally, the determining the association degree of the data content in the Internet of Things data sets includes:
determining the association degree of the data content according to the field types of the data content in the Internet of Things data sets.
Optionally, the determining a deduplication processing rule according to the association degree of the data content and the data type includes:
if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determining that the corresponding deduplication processing rule is a merge processing rule;
and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determining that the corresponding deduplication processing rule is a deletion processing rule.
Optionally, the server performing deduplication processing includes:
if the server is the server corresponding to the merge processing rule, determining the public data and the private data of each node device, merging the public data, and then storing the private data separately.
In a second aspect, an embodiment of the present invention further provides an internet-of-things multi-source data storage device based on a deduplication rule, including:
a data receiving and processing module, configured to receive Internet of Things data sets sent by a plurality of second node devices and determine the association degree of the data content in the Internet of Things data sets;
a rule determining module, configured to determine a deduplication processing rule according to the association degree of the data content and the data type;
and a data sending module, configured to send the Internet of Things data of the second node devices corresponding to the same deduplication processing rule to a processing server in the corresponding processing cluster, so that the data is stored after the server performs deduplication processing.
Optionally, the data receiving and processing module is specifically configured to:
determine the association degree of the data content according to the field types of the data content in the Internet of Things data sets.
Optionally, the rule determining module is specifically configured to:
if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determine that the corresponding deduplication processing rule is a merge processing rule;
and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determine that the corresponding deduplication processing rule is a deletion processing rule.
Optionally, the data sending module is specifically configured to:
if the server is the server corresponding to the merge processing rule, have the public data and the private data of each node device determined, the public data merged, and the private data then stored separately.
In a third aspect, an embodiment of the present invention further provides an internet-of-things multi-source data storage device based on a deduplication rule, where the device includes:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the internet of things multi-source data storage method based on the deduplication rule according to the embodiment of the invention.
In a fourth aspect, the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform the deduplication rule based internet-of-things multi-source data storage method according to the present invention.
In the embodiment of the invention, a first node device receives Internet of Things data sets sent by a plurality of second node devices and determines the association degree of the data content in the Internet of Things data sets; a deduplication processing rule is determined according to the association degree of the data content and the data type; and the Internet of Things data of the second node devices corresponding to the same deduplication processing rule is sent to a processing server in the corresponding processing cluster, and the data is stored after the server performs deduplication processing. The scheme improves data storage efficiency, facilitates data transmission and maintenance, and ensures data stability and security.
Drawings
Fig. 1 is a flowchart of an internet-of-things multi-source data storage method based on a deduplication rule according to an embodiment of the present invention;
fig. 2 is a flowchart of another internet-of-things multi-source data storage method based on a deduplication rule according to an embodiment of the present invention;
fig. 3 is a flowchart of another internet-of-things multi-source data storage method based on a deduplication rule according to an embodiment of the present invention;
fig. 4 is a flowchart of another internet-of-things multi-source data storage method based on a deduplication rule according to an embodiment of the present invention;
fig. 5 is a block diagram of a structure of an internet-of-things multi-source data storage device based on a deduplication rule according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be described in further detail with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad invention. It should be further noted that, for convenience of description, only some structures, not all structures, relating to the embodiments of the present invention are shown in the drawings.
Fig. 1 is a flowchart of an internet-of-things multi-source data storage method based on a deduplication rule according to an embodiment of the present invention, which is suitable for storing internet-of-things device data. The scheme of one embodiment of the application specifically comprises the following steps:
Step S101, a first node device receives Internet of Things data sets sent by a plurality of second node devices, and determines the association degree of the data content in the Internet of Things data sets.
The first node device may be an internet of things gateway device, and the second node device may be an internet of things terminal device covered by the internet of things gateway device. And the plurality of different second node devices upload data to the first node device, and the data are sent to the server through the first node device for storage. Optionally, the first node device may also be a transit device selected from a plurality of second node devices.
In an embodiment, the first node device processes the received Internet of Things data sets of the plurality of second node devices; specifically, it determines the association degree of the data content in the Internet of Things data sets of the second node devices. Exemplarily, taking character data as an example, data a uploaded by second node device 1 includes 8 fields, and data b uploaded by second node device 2 also includes 8 fields, where fields 1 to 6 of both are general fields and fields 7 and 8 are fields characterizing the current data; fields 1 to 6 are then the associated fields of the two pieces of data. Taking image data as an example, second node device 3 uploads image a at a first moment and second node device 4 uploads image b at a second moment, and the association degree between image a and image b can be determined: if the background contents are consistent and no special target object exists, the association degree between image a and image b is 100%; if the target objects differ, the association degree is determined by the proportion of the whole image occupied by the target object, and if that area occupies 10% of the whole image, the association degree is determined to be 90%. Taking text content as an example, second node device 5 uploads text a and second node device 6 uploads text b; through similarity comparison, if the contents at the beginning or the end of the two texts are determined to be consistent, the association degree of that part is 100%, and if the middle content shows an obvious text difference, the similarity of that part is 0.
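The image example above can be illustrated with a minimal Python sketch. It assumes that the area covered by differing target objects has already been measured in pixels; the function name image_association and its parameters are hypothetical and are not part of the described embodiment.

```python
def image_association(total_pixels: int, target_pixels: int) -> float:
    """Return the association degree, in percent, of two images with the same background.

    Assumption: the area covered by differing target objects counts as
    non-associated, and the rest of the frame counts as associated.
    """
    if total_pixels <= 0:
        raise ValueError("total_pixels must be positive")
    if not 0 <= target_pixels <= total_pixels:
        raise ValueError("target_pixels must lie between 0 and total_pixels")
    return 100.0 * (total_pixels - target_pixels) / total_pixels


# No differing target object: the two images are fully associated.
print(image_association(1000, 0))    # 100.0
# Target objects occupy 10% of the frame: the association degree is 90%.
print(image_association(1000, 100))  # 90.0
```

How the target area is detected is left open here, as the description does not specify it.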
Step S102, determining a deduplication processing rule according to the association degree of the data content and the data type.
In one embodiment, after the first node device determines the association degree between the data uploaded by the second node devices, the deduplication processing rule is determined according to the association degree of the specific data content and the data type. Illustratively, the data uploaded by second node m is m1 and the data uploaded by second node n is n1, and the first node determines the specific deduplication processing rule according to the association degree between m1 and n1 and the data types of m1 and n1.
In one embodiment, taking data content of a character type, an audio type and a video image type as an example, the deduplication processing mechanism applied after the association degree is determined differs for each type of data.
Specifically, a relevance threshold value can be set for each data type, and when the relevance of the data in the determined internet of things data set is greater than the relevance threshold value, the corresponding deduplication processing rule is matched according to the corresponding data type. Examples are shown in the following table:
Data type           Association degree threshold    Deduplication rule
Character type      50%                             Deduplication rule 1
Audio type          60%                             Deduplication rule 2
Video image type    80%                             Deduplication rule 3
Deduplication rule 1 may exemplarily be: extracting the characters with identical content as public data and taking the remaining characters as private data, where the public data and the private data are each stored separately; deduplication rule 2 may exemplarily be: deleting the audio portions that do not contain the sound of a specific object (e.g., a person); deduplication rule 3 may exemplarily be: deleting the background image in the video image, or extracting the image of the target object (such as a vehicle) for separate storage.
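The threshold table can be read as a simple lookup. The following Python sketch uses the example values from the table; the dictionary RULE_TABLE, the type keys and the function match_rule are illustrative names only, and returning None when no rule applies is an assumption.

```python
from typing import Optional

# Data type -> (association degree threshold, deduplication rule),
# using the example values from the table above.
RULE_TABLE = {
    "character": (0.50, "deduplication rule 1"),
    "audio": (0.60, "deduplication rule 2"),
    "video_image": (0.80, "deduplication rule 3"),
}


def match_rule(data_type: str, association: float) -> Optional[str]:
    """Return the rule for this data type if the association degree exceeds
    the type's threshold; otherwise return None (assumed: no deduplication)."""
    entry = RULE_TABLE.get(data_type)
    if entry is None:
        return None  # unknown data type: no rule configured
    threshold, rule = entry
    return rule if association > threshold else None


print(match_rule("audio", 0.70))        # deduplication rule 2
print(match_rule("video_image", 0.70))  # None
```

The comparison is strict because the text requires the association degree to be greater than the threshold.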
Step S103, sending the Internet of Things data of the second node devices corresponding to the same deduplication processing rule to a processing server in the corresponding processing cluster, and storing the data after the server performs deduplication processing.
In one embodiment, different deduplication rules correspond to different processing servers. The processing server in the processing cluster corresponding to the deduplication processing rule is determined, and the Internet of Things data sets sent by the corresponding second node devices are sent to that processing server, which stores them after deduplication processing. Providing a dedicated server for each deduplication rule allows batch processing and improves the efficiency of deduplicated storage.
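A minimal Python sketch of this batching step follows. It groups device data by its deduplication rule and picks a server from the cluster assigned to that rule. The cluster layout, the server names and the round-robin choice in pick_server are assumptions made for illustration; the description does not say how a server is selected within a cluster.

```python
from collections import defaultdict

# Hypothetical mapping from deduplication rule to the servers of its processing cluster.
CLUSTERS = {
    "merge": ["merge-server-1", "merge-server-2"],
    "delete": ["delete-server-1"],
}


def group_by_rule(datasets):
    """Group (device_id, rule, payload) tuples into one batch per rule."""
    batches = defaultdict(list)
    for device_id, rule, payload in datasets:
        batches[rule].append((device_id, payload))
    return dict(batches)


def pick_server(rule, batch_index=0):
    """Choose a server from the rule's cluster (round-robin is an assumption)."""
    servers = CLUSTERS[rule]
    return servers[batch_index % len(servers)]


datasets = [
    ("cam-1", "merge", b"frame-a"),
    ("mic-1", "delete", b"clip-a"),
    ("cam-2", "merge", b"frame-b"),
]
for rule, batch in group_by_rule(datasets).items():
    # Each batch would be sent to its server for deduplication and storage.
    print(rule, pick_server(rule), [device_id for device_id, _ in batch])
```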
Therefore, the first node device receives the Internet of Things data sets sent by the plurality of second node devices and determines the association degree of the data content in the Internet of Things data sets; a deduplication processing rule is determined according to the association degree of the data content and the data type; and the Internet of Things data of the second node devices corresponding to the same deduplication processing rule is sent to a processing server in the corresponding processing cluster, and the data is stored after the server performs deduplication processing. The scheme improves data storage efficiency, facilitates data transmission and maintenance, and ensures data stability and security.
Fig. 2 is a flowchart of another internet-of-things multi-source data storage method based on a deduplication rule according to an embodiment of the present invention. On the basis of the technical scheme, the determining the relevance of the data content in the data set of the internet of things comprises:
determining the association degree of the data content according to the field types of the data content in the Internet of Things data sets. The method specifically comprises the following steps:
Step S201, the first node device receives Internet of Things data sets sent by a plurality of second node devices, and determines the association degree of the data content according to the field types of the data content in the Internet of Things data sets.
In one embodiment, a specific way of determining the association degree of the Internet of Things data sets of different second node devices is provided: the Internet of Things data sets sent by a plurality of second node devices are received, and the association degree of the data content is determined according to the field types of the data content in the Internet of Things data sets. For example, if the data contains 10 fields in total and 8 of the field types are the same, the association degree is determined to be 80%.
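The field-type calculation can be sketched in Python as follows. The sketch assumes each data item is described by a mapping from field name to field type, that "total fields" means the union of field names across both items, and that a field counts as "the same" when it is present in both with an identical type. These interpretations and the name field_type_association are assumptions, since the description only gives the 8-out-of-10 example.

```python
def field_type_association(fields_a: dict, fields_b: dict) -> float:
    """Return the share of fields whose types match between two data items.

    fields_a / fields_b map field name -> field type name (assumed layout).
    """
    all_fields = set(fields_a) | set(fields_b)
    if not all_fields:
        return 0.0
    same = sum(
        1
        for name in all_fields
        if name in fields_a and name in fields_b and fields_a[name] == fields_b[name]
    )
    return same / len(all_fields)


# 10 fields in total, 8 of which have the same type in both data items.
data_a = {f"field{i}": "int" for i in range(1, 11)}
data_b = dict(data_a, field9="str", field10="float")
print(field_type_association(data_a, data_b))  # 0.8
```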
Step S202, determining a deduplication processing rule according to the relevance of the data content and the data type.
Step S203, sending the Internet of things data of the second node equipment corresponding to the same duplicate removal processing rule to a processing server in the corresponding processing cluster, and storing the data after the server performs duplicate removal processing.
As described above, the first node device receives the Internet of Things data sets sent by the plurality of second node devices and determines the association degree of the data content according to the field types of the data content in the Internet of Things data sets, which improves data storage efficiency, facilitates data transmission and maintenance, and ensures data stability and security.
Fig. 3 is a flowchart of another internet-of-things multi-source data storage method based on a deduplication rule according to an embodiment of the present invention. On the basis of the above technical solution, the determining a deduplication processing rule according to the association degree of the data content and the data type includes:
if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determining that the corresponding deduplication processing rule is a merge processing rule;
and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determining that the corresponding deduplication processing rule is a deletion processing rule. The method specifically comprises the following steps:
Step S301, the first node device receives Internet of Things data sets sent by a plurality of second node devices, and determines the association degree of the data content according to the field types of the data content in the Internet of Things data sets.
Step S302, if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determining that the corresponding deduplication processing rule is a merge processing rule, and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determining that the corresponding deduplication processing rule is a deletion processing rule.
In one embodiment, the first preset value may be 90%, the first preset type may be a video image type, and the second preset type may be an audio type. The merge processing rule may be to keep a single copy of the identical (i.e., associated) data content and to keep the non-associated parts separately; the deletion processing rule may be to delete the audio data segments that do not contain a vocal part.
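The decision in step S302 can be sketched in Python using the example values from this embodiment (90%, video image type, audio type). The enum Rule, the constant names and the function determine_rule are hypothetical, and returning None when neither condition holds is an assumption, since the description does not cover that case.

```python
from enum import Enum
from typing import Optional


class Rule(Enum):
    MERGE = "merge"
    DELETE = "delete"


# Example values from the embodiment; real deployments would configure these.
FIRST_PRESET_VALUE = 0.90
FIRST_PRESET_TYPE = "video_image"
SECOND_PRESET_TYPE = "audio"


def determine_rule(association: float, data_type: str) -> Optional[Rule]:
    """Return the deduplication rule, or None if no rule applies (assumed)."""
    if association <= FIRST_PRESET_VALUE:
        return None  # association degree must be strictly greater than the preset value
    if data_type == FIRST_PRESET_TYPE:
        return Rule.MERGE
    if data_type == SECOND_PRESET_TYPE:
        return Rule.DELETE
    return None


print(determine_rule(0.95, "video_image"))  # Rule.MERGE
print(determine_rule(0.95, "audio"))        # Rule.DELETE
print(determine_rule(0.80, "audio"))        # None
```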
Step S303, sending the Internet of things data of the second node equipment corresponding to the same duplicate removal processing rule to a processing server in the corresponding processing cluster, and storing the data after the server performs duplicate removal processing.
As can be seen from the above, if the association degree of the data content is greater than the first preset value and the data type is the first preset type, the corresponding deduplication processing rule is determined to be the merge processing rule, and if the association degree of the data content is greater than the first preset value and the data type is the second preset type, the corresponding deduplication processing rule is determined to be the deletion processing rule, which improves data storage efficiency, facilitates data transmission and maintenance, and ensures data stability and security.
Fig. 4 is a flowchart of another internet-of-things multi-source data storage method based on a deduplication rule according to an embodiment of the present invention. On the basis of the technical scheme, the server performs deduplication processing, and the deduplication processing comprises the following steps:
and if the server is the server corresponding to the merging processing rule, determining the public data and the private data of each node device, merging the public data, and then independently storing the private data. The method specifically comprises the following steps:
Step S401, the first node device receives Internet of Things data sets sent by a plurality of second node devices, and determines the association degree of the data content according to the field types of the data content in the Internet of Things data sets.
Step S402, if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determining that the corresponding deduplication processing rule is a merge processing rule, and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determining that the corresponding deduplication processing rule is a deletion processing rule.
Step S403, sending the Internet of Things data of the second node devices corresponding to the same deduplication processing rule to the processing server in the corresponding processing cluster; if the processing server is the server corresponding to the merge processing rule, the public data and the private data of each node device are determined, the public data is merged, and the private data is then stored separately.
In one embodiment, there are server devices of two different data processing types. Illustratively, the server corresponding to the merge processing rule determines the public data and the private data of each node device, merges the public data, and then stores the private data separately. Taking video image data as an example, the public data is the identical image background data, and the private data may be the monitored content that differs between pictures, such as different faces and vehicles.
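The merge processing can be illustrated with a short Python sketch. It assumes each device's data has already been decomposed into named parts (for example a background identifier and a target identifier); how a video frame is decomposed is outside the sketch. The function name split_public_private and the record layout are hypothetical.

```python
def split_public_private(records: dict) -> tuple:
    """Split per-device records into one shared public part and per-device private parts.

    records maps device_id -> {part name: value}. A part is treated as public
    only if every device has it with the same value (assumed criterion).
    """
    if not records:
        return {}, {}
    device_ids = list(records)
    first = records[device_ids[0]]
    public = {
        key: value
        for key, value in first.items()
        if all(key in records[d] and records[d][key] == value for d in device_ids[1:])
    }
    private = {
        device_id: {k: v for k, v in record.items() if k not in public}
        for device_id, record in records.items()
    }
    return public, private


records = {
    "cam-1": {"background": "bg-001", "target": "face-17"},
    "cam-2": {"background": "bg-001", "target": "vehicle-42"},
}
public, private = split_public_private(records)
print(public)   # {'background': 'bg-001'}
print(private)  # {'cam-1': {'target': 'face-17'}, 'cam-2': {'target': 'vehicle-42'}}
```

In this sketch the shared background is kept once, while each device's differing target data is kept under its own device identifier.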
As described above, the Internet of Things data of the second node devices corresponding to the same deduplication processing rule is sent to the processing server in the corresponding processing cluster; if the processing server is the server corresponding to the merge processing rule, the public data and the private data of each node device are determined, the public data is merged, and the private data is then stored separately, which improves data storage efficiency, facilitates data transmission and maintenance, and ensures data stability and security.
Fig. 5 is a structural block diagram of an internet of things multi-source data storage device based on a deduplication rule according to an embodiment of the present invention, where the device is configured to execute the internet of things multi-source data storage method based on the deduplication rule according to the data receiving end embodiment, and has functional modules and beneficial effects corresponding to the execution method. As shown in fig. 5, the apparatus specifically includes: a data reception processing module 101, a rule determination module 102, and a data transmission module 103, wherein,
the data receiving and processing module 101 is configured to receive an internet of things data set sent by a plurality of second node devices, and determine a relevance of data content in the internet of things data set;
a rule determining module 102, configured to determine a deduplication processing rule according to the association degree of the data content and the data type;
and the data sending module 103 is configured to send the Internet of Things data of the second node devices corresponding to the same deduplication processing rule to the processing server in the corresponding processing cluster, so that the data is stored after the server performs deduplication processing.
According to the scheme, the first node device receives the Internet of Things data sets sent by the plurality of second node devices and determines the association degree of the data content in the Internet of Things data sets; a deduplication processing rule is determined according to the association degree of the data content and the data type; and the Internet of Things data of the second node devices corresponding to the same deduplication processing rule is sent to a processing server in the corresponding processing cluster, and the data is stored after the server performs deduplication processing. The scheme improves data storage efficiency, facilitates data transmission and maintenance, and ensures data stability and security.
In one possible embodiment, the association degree of the data content is determined according to the field types of the data content in the Internet of Things data sets.
In a possible embodiment, the rule determining module is specifically configured to:
if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determine that the corresponding deduplication processing rule is a merge processing rule;
and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determine that the corresponding deduplication processing rule is a deletion processing rule.
In a possible embodiment, the data sending module is specifically configured to:
and if the server is the server corresponding to the merging processing rule, determining the public data and the private data of each node device, merging the public data, and then independently storing the private data.
Fig. 6 is a schematic structural diagram of an Internet of Things multi-source data storage device based on a deduplication rule according to an embodiment of the present invention. As shown in Fig. 6, the device includes a processor 201, a memory 202, an input device 203, and an output device 204. The number of processors 201 in the device may be one or more, and one processor 201 is taken as an example in Fig. 6. The processor 201, the memory 202, the input device 203 and the output device 204 in the device may be connected by a bus or by other means; connection by a bus is taken as an example in Fig. 6. The memory 202 is a computer-readable storage medium and can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the Internet of Things multi-source data storage method based on the deduplication rule in the embodiment of the present invention. The processor 201 executes the various functional applications and data processing of the device by running the software programs, instructions and modules stored in the memory 202, that is, it implements the Internet of Things multi-source data storage method based on the deduplication rule. The input device 203 may be used to receive input numeric or character information and to generate key signal inputs related to user settings and function control of the device. The output device 204 may include a display device such as a display screen.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, perform an internet-of-things multi-source data storage method based on a deduplication rule, the method including:
a first node device receives an internet of things data set sent by a plurality of second node devices, and determines the association degree of the data content in the internet of things data set;
determining a deduplication processing rule according to the association degree of the data content and the data type;
and sending the internet of things data of the second node devices corresponding to the same deduplication processing rule to a processing server in the corresponding processing cluster, the data being stored after the server performs deduplication processing.
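The routing step above can be illustrated with a minimal sketch. It assumes each second node device's data is a list of field/value records and that the rule decision is supplied as a callable; the names `route_by_rule`, `demo_rule`, the rule labels "merge" and "delete", and the sample fields are hypothetical and do not come from the patent text.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Record = Dict[str, object]          # one IoT reading as field -> value (assumed shape)
NodeData = Dict[str, List[Record]]  # second-node id -> records sent by that node


def route_by_rule(
    node_data: NodeData,
    choose_rule: Callable[[List[Record]], str],
) -> Dict[str, NodeData]:
    """Group the data of each second node under the rule chosen for it.

    Each returned batch is what the first node would forward to the
    processing cluster that handles that rule.
    """
    batches: Dict[str, NodeData] = defaultdict(dict)
    for node_id, records in node_data.items():
        batches[choose_rule(records)][node_id] = records
    return dict(batches)


def demo_rule(records: List[Record]) -> str:
    # Hypothetical stand-in for the real rule decision.
    return "merge" if any("temp" in r for r in records) else "delete"


batches = route_by_rule(
    {
        "node-a": [{"temp": 21}],
        "node-b": [{"temp": 21}],
        "node-c": [{"log": "x"}],
    },
    demo_rule,
)
```

Here `node-a` and `node-b` land in the same batch because the same rule was chosen for both, which is the grouping the method relies on before deduplication is performed on the server side.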
In one possible embodiment, determining the association degree of the data content in the internet of things data set includes:
determining the association degree of the data content according to the field type of the data content in the internet of things data set.
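The text does not give a formula for deriving the association degree from field types. As one possible reading, the sketch below scores two nodes' data by the Jaccard overlap of their (field name, value type) pairs. The Jaccard measure, the function names and the sample records are assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple

Record = Dict[str, object]


def field_signature(records: List[Record]) -> Set[Tuple[str, str]]:
    """Collect the (field name, value type name) pairs seen in the records."""
    return {
        (name, type(value).__name__)
        for record in records
        for name, value in record.items()
    }


def association_degree(a: List[Record], b: List[Record]) -> float:
    """Jaccard overlap of two field signatures, in the range [0.0, 1.0]."""
    sig_a, sig_b = field_signature(a), field_signature(b)
    union = sig_a | sig_b
    # Two empty data sets share nothing, so treat them as unrelated.
    return len(sig_a & sig_b) / len(union) if union else 0.0


node_a = [{"temp": 21.5, "unit": "C"}]
node_b = [{"temp": 22.0, "site": "plant-1"}]
degree = association_degree(node_a, node_b)
```

In this example the two nodes share one typed field out of three distinct ones, so the score is one third. Any other similarity measure over field types would fit the description equally well.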
In a possible embodiment, determining the deduplication processing rule according to the association degree of the data content and the data type includes:
if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determining the corresponding deduplication processing rule as a merging processing rule;
and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determining that the corresponding deduplication processing rule is a deletion processing rule.
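The two conditions above amount to a small decision function, sketched below. The threshold value 0.8 and the type labels "type_1" and "type_2" are placeholders, since the text does not specify the first preset value or the preset types. The text is also silent on what happens when neither condition holds, so the sketch returns `None` in that case.

```python
from enum import Enum
from typing import Optional


class Rule(Enum):
    MERGE = "merge"
    DELETE = "delete"


def choose_rule(
    degree: float,
    data_type: str,
    first_preset_value: float = 0.8,   # placeholder threshold, not from the source
    first_preset_type: str = "type_1",  # placeholder label, not from the source
    second_preset_type: str = "type_2",  # placeholder label, not from the source
) -> Optional[Rule]:
    """Map an association degree and a data type to a deduplication rule."""
    if degree <= first_preset_value:
        return None  # case not covered by the described embodiment
    if data_type == first_preset_type:
        return Rule.MERGE
    if data_type == second_preset_type:
        return Rule.DELETE
    return None  # data type matches neither preset type
```

Note that the comparison is strict: a degree exactly equal to the first preset value does not trigger either rule, which follows the "greater than" wording.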
In one possible embodiment, the deduplication processing performed by the server includes:
if the server is the server corresponding to the merging processing rule, determining the public data and the private data of each node device, merging the public data, and storing the private data separately.
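The text does not define how public and private data are told apart. The sketch below assumes that a field/value pair is public when every node device reports the same value for that field, and that everything else is private to the node that sent it. The function name and sample data are hypothetical.

```python
from typing import Dict, Tuple

Record = Dict[str, object]


def split_public_private(
    per_node: Dict[str, Record],
) -> Tuple[Record, Dict[str, Record]]:
    """Return (public, private) for one record per node device.

    public:  field/value pairs identical across all nodes, stored once.
    private: the remaining fields, kept separately per node.
    """
    if not per_node:
        return {}, {}
    records = list(per_node.values())
    # A pair is public only if every other node has the same field and value.
    public = {
        field: value
        for field, value in records[0].items()
        if all(field in r and r[field] == value for r in records[1:])
    }
    private = {
        node_id: {f: v for f, v in record.items() if f not in public}
        for node_id, record in per_node.items()
    }
    return public, private


public, private = split_public_private(
    {
        "node-a": {"site": "plant-1", "temp": 21},
        "node-b": {"site": "plant-1", "temp": 23},
    }
)
```

With this input the shared `site` field is stored once as public data, while each node's `temp` reading stays in its own private record.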
From the above description of the embodiments, it will be clear to those skilled in the art that the embodiments of the present invention can be implemented by software together with the necessary general-purpose hardware, and can of course also be implemented by hardware alone, although the former is the better implementation in many cases. Based on this understanding, the technical solutions of the embodiments of the present invention may be embodied as a software product stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a flash memory (FLASH), a hard disk or an optical disk of a computer, and including several instructions that cause a computer device (which may be a personal computer, a server, or a network device) to perform the methods described in the embodiments of the present invention.
It should be noted that, in the embodiment of the internet-of-things multi-source data storage device based on the deduplication rule, each included unit and module is only divided according to functional logic, but is not limited to the above division, as long as corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiment of the invention.
It should be noted that the foregoing is only a preferred embodiment of the present invention and the technical principles applied. Those skilled in the art will appreciate that the embodiments of the present invention are not limited to the specific embodiments described herein, and that various obvious changes, adaptations, and substitutions are possible, without departing from the scope of the embodiments of the present invention. Therefore, although the embodiments of the present invention have been described in more detail through the above embodiments, the embodiments of the present invention are not limited to the above embodiments, and many other equivalent embodiments may be included without departing from the concept of the embodiments of the present invention, and the scope of the embodiments of the present invention is determined by the scope of the appended claims.

Claims (10)

1. An Internet of things multi-source data storage method based on a deduplication rule, characterized by comprising the following steps:
a first node device receives an internet of things data set sent by a plurality of second node devices, and determines the association degree of the data content in the internet of things data set;
determining a deduplication processing rule according to the association degree of the data content and the data type;
and sending the Internet of things data of the second node devices corresponding to the same deduplication processing rule to a processing server in the corresponding processing cluster, the data being stored after the server performs deduplication processing.
2. The Internet of things multi-source data storage method based on the deduplication rule of claim 1, wherein the determining the association degree of the data content in the internet of things data set comprises:
determining the association degree of the data content according to the field type of the data content in the internet of things data set.
3. The Internet of things multi-source data storage method based on the deduplication rule as claimed in claim 1, wherein the determining the deduplication processing rule according to the association degree of the data content and the data type comprises:
if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determining the corresponding deduplication processing rule as a merging processing rule;
and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determining that the corresponding deduplication processing rule is a deletion processing rule.
4. The Internet of things multi-source data storage method based on the deduplication rule according to any one of claims 1 to 3, wherein the server performing deduplication processing comprises:
if the server is the server corresponding to the merging processing rule, determining the public data and the private data of each node device, merging the public data, and storing the private data separately.
5. An Internet of things multi-source data storage device based on a deduplication rule, characterized by comprising:
a data receiving and processing module, used for receiving an internet of things data set sent by a plurality of second node devices and determining the association degree of the data content in the internet of things data set;
a rule determining module, used for determining a deduplication processing rule according to the association degree of the data content and the data type;
and a data sending module, used for sending the Internet of things data of the second node devices corresponding to the same deduplication processing rule to the processing server in the corresponding processing cluster, the data being stored after the server performs deduplication processing.
6. The Internet of things multi-source data storage device based on the deduplication rule of claim 5, wherein the data receiving and processing module is specifically configured to:
determine the association degree of the data content according to the field type of the data content in the internet of things data set.
7. The Internet of things multi-source data storage device based on the deduplication rule of claim 5, wherein the rule determining module is specifically configured to:
if the association degree of the data content is greater than a first preset value and the data type is a first preset type, determine the corresponding deduplication processing rule as a merging processing rule;
and if the association degree of the data content is greater than the first preset value and the data type is a second preset type, determine that the corresponding deduplication processing rule is a deletion processing rule.
8. The Internet of things multi-source data storage device based on the deduplication rule according to any one of claims 5-7, wherein the data sending module is specifically configured to:
if the server is the server corresponding to the merging processing rule, determine the public data and the private data of each node device, merge the public data, and store the private data separately.
9. An internet of things multi-source data storage device based on deduplication rules, the device comprising: one or more processors; a storage device to store one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the deduplication rule based internet of things multi-source data storage method of any of claims 1-4.
10. A storage medium containing computer-executable instructions for performing the deduplication rule-based internet-of-things multi-source data storage method of any one of claims 1-4 when executed by a computer processor.
CN202011642078.5A 2020-12-31 2020-12-31 Internet of things multi-source data storage method and device based on deduplication rule Pending CN112650454A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011642078.5A CN112650454A (en) 2020-12-31 2020-12-31 Internet of things multi-source data storage method and device based on deduplication rule

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011642078.5A CN112650454A (en) 2020-12-31 2020-12-31 Internet of things multi-source data storage method and device based on deduplication rule

Publications (1)

Publication Number Publication Date
CN112650454A true CN112650454A (en) 2021-04-13

Family

ID=75367021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011642078.5A Pending CN112650454A (en) 2020-12-31 2020-12-31 Internet of things multi-source data storage method and device based on deduplication rule

Country Status (1)

Country Link
CN (1) CN112650454A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107294804A (en) * 2017-06-21 2017-10-24 深圳市盛路物联通讯技术有限公司 A kind of method and apparatus based on transmission duration control Internet of Things data filtering
CN109558400A (en) * 2018-11-28 2019-04-02 北京锐安科技有限公司 Data processing method, device, equipment and storage medium
CN110941598A (en) * 2019-12-02 2020-03-31 北京锐安科技有限公司 Data deduplication method, device, terminal and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Huang Qinlong et al.: "Cloud Computing Data Security" (《云计算数据安全》), 31 January 2018, Beijing University of Posts and Telecommunications Press *

Similar Documents

Publication Publication Date Title
CN109033387B (en) Internet of things searching system and method fusing multi-source data and storage medium
US10356024B2 (en) Moderating content in an online forum
US9830386B2 (en) Determining trending topics in social media
CN111818112B (en) Kafka system-based message sending method and device
CN109408639B (en) Bullet screen classification method, bullet screen classification device, bullet screen classification equipment and storage medium
CN106844341B (en) Artificial intelligence-based news abstract extraction method and device
WO2014206151A1 (en) System and method for tagging and searching documents
CN111835760A (en) Alarm information processing method and device, computer storage medium and electronic equipment
CN110389859B (en) Method, apparatus and computer program product for copying data blocks
CN114625918A (en) Video recommendation method, device, equipment, storage medium and program product
CN109710502B (en) Log transmission method, device and storage medium
TW202123026A (en) Data archiving method, device, computer device and storage medium
CN110162769B (en) Text theme output method and device, storage medium and electronic device
CN111930963B (en) Knowledge graph generation method and device, electronic equipment and storage medium
WO2021103594A1 (en) Tacitness degree detection method and device, server and readable storage medium
CN107122464B (en) Decision-making assisting system and method
CN111126034B (en) Medical variable relation processing method and device, computer medium and electronic equipment
CN110309328B (en) Data storage method and device, electronic equipment and storage medium
CN116821133A (en) Data processing method and device
CN112650454A (en) Internet of things multi-source data storage method and device based on deduplication rule
CN111259057A (en) Data processing method and device for civil appeal analysis
CN112765371A (en) Internet of things single data storage method and device based on deduplication rule
CN115935958A (en) Resume processing method and device, storage medium and electronic equipment
CN116089658A (en) Object commonality extraction method and device, storage medium and electronic equipment
CN115423030A (en) Equipment identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210413