CN116991329A - Data redundancy prevention method and system for self-service terminal equipment - Google Patents

Info

Publication number
CN116991329A
Authority
CN
China
Prior art keywords
data
stored
storage
redundancy
segments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311238831.8A
Other languages
Chinese (zh)
Other versions
CN116991329B (en)
Inventor
冯海青
李文才
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mingtech Co ltd
Original Assignee
Mingtech Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mingtech Co ltd
Priority to CN202311238831.8A
Publication of CN116991329A
Application granted
Publication of CN116991329B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application relates to the field of data storage and discloses a data redundancy prevention method and system for self-service terminal equipment. The system comprises a data storage module, a verification and retrieval module, a matching segmentation module, and a storage link module. Feature data segments are extracted from the data to be stored so that duplicate data already in the database can be matched and identified conveniently. The data to be stored is then stored in segments, and duplicate segments are handled by having multiple data reads point to shared storage content. Compared with prior-art storage, this can greatly reduce the storage occupied on the terminal equipment. Storing duplicate data at a common source also helps data security management and lets the system judge more directly which data is frequently accessed. Because matching is performed on feature data segments, the computing power consumed by the redundancy prevention calculation is also reduced.

Description

Data redundancy prevention method and system for self-service terminal equipment
Technical Field
The application relates to the field of data storage, in particular to a data redundancy prevention method and a data redundancy prevention system for self-service terminal equipment.
Background
With the rapid development of computer and cloud technologies, the volume of data held by individuals and servers is increasing rapidly. Storage pressure on both the cloud and individuals keeps growing, a large amount of data needs to be stored, and storage devices are increasingly unable to meet the demand.
Besides improving storage devices at the physical layer, the problem can also be approached through its causes. One cause is that storage has become a consumer-level need: the lower threshold has expanded the user base, and across different user groups a large amount of data content is duplicated or partially identical, which wastes a large amount of storage space.
Disclosure of Invention
The application aims to provide a data redundancy prevention method and system for self-service terminal equipment, so as to solve the problems described in the Background.
To achieve the above purpose, the present application provides the following technical solutions:
a data redundancy prevention system for a self-service terminal device, comprising:
a data storage module, configured to acquire data to be stored, match the data to be stored against a preset header database to obtain the corresponding identification header data segment, and extract features from the data to be stored at a preset data length interval and with a preset feature data segment length to obtain a plurality of feature data segments, wherein an identification header data segment and its corresponding feature data segments form a redundancy prevention data set;
a verification and retrieval module, configured to search a matching database based on the redundancy prevention data set and, if the identification header data segment and a plurality of consecutive feature data segments match a redundancy prevention data set in the matching database, obtain the stored data corresponding to the matched redundancy prevention data set, compare the stored data with the corresponding data segment of the data to be stored, and generate a judgment result;
a matching segmentation module, configured to, when the judgment result indicates that the two are the same, place segmentation marks on the data to be stored based on the first and last data nodes of the matched redundancy prevention data set, and segment the data to be stored at the segmentation marks to obtain a plurality of sub-data segments to be stored;
and a storage link module, configured to create, for each sub-data segment to be stored whose judgment result indicates a match, a guiding pointing link to the storage address of the corresponding stored data, allocate storage space for the remaining sub-data segments to be stored and generate storage address links, and generate a storage reading link for the data to be stored based on the guiding pointing links and the storage address links.
As a further aspect of the application: each piece of data to be stored contains an indefinite number of identification header data segments; identification header data segments correspond one-to-one with redundancy prevention data sets; the data segment intervals of the data to be stored that correspond to different redundancy prevention data sets may overlap; each sub-data segment to be stored has a maximum data amount; and the number of feature data segments in each redundancy prevention data set has a maximum.
As a still further aspect of the application: the data storage module includes a header setting unit;
the header setting unit is configured to, when the data to be stored is matched against the header database and no matching identification header data segment is found within a preset data segment length, extract data from that data segment of the data to be stored and establish a new identification header data segment, wherein the preset data segment length is smaller than the data length interval, and the data length of the identification header data segment is smaller than the data length interval.
As a still further aspect of the application: the matching segmentation module comprises a segmentation reservation unit;
the segmentation reservation unit is configured to define segmentation reservation nodes on both sides of each segmentation mark based on a preset reserved data segment size; when the data to be stored is segmented, the reserved data segments on both sides are copied and assigned to the preceding and following sub-data segments to be stored, respectively; the reserved data segments are used to match and position the data content when adjacent sub-data segments to be stored are recombined.
As a still further aspect of the application: the system further comprises an auxiliary reading guide module, which comprises:
a reading request unit, configured to acquire a data reading request, obtain the corresponding storage reading link based on the data reading request, and obtain the guiding pointing links and storage address links of the plurality of stored sub-data segments;
a reading judgment unit, configured to obtain the reading frequency per unit time corresponding to the guiding pointing links and the storage address links and, if the reading frequency per unit time is greater than a preset threshold, generate an auxiliary reading link that directs the request to a communication link with a user who has already read the stored data;
and a reading guide unit, configured to let the user obtain the stored data through the auxiliary reading link.
An embodiment of the application further provides a data redundancy prevention method for self-service terminal equipment, comprising the following steps:
acquiring data to be stored, matching the data to be stored against a preset header database to obtain the corresponding identification header data segment, and extracting features from the data to be stored at a preset data length interval and with a preset feature data segment length to obtain a plurality of feature data segments, wherein an identification header data segment and its corresponding feature data segments form a redundancy prevention data set;
searching a matching database based on the redundancy prevention data set and, if the identification header data segment and a plurality of consecutive feature data segments match a redundancy prevention data set in the matching database, obtaining the stored data corresponding to the matched redundancy prevention data set, comparing the stored data with the corresponding data segment of the data to be stored, and generating a judgment result;
when the judgment result indicates that the two are the same, placing segmentation marks on the data to be stored based on the first and last data nodes of the matched redundancy prevention data set, and segmenting the data to be stored at the segmentation marks to obtain a plurality of sub-data segments to be stored;
and creating, for each sub-data segment to be stored whose judgment result indicates a match, a guiding pointing link to the storage address of the corresponding stored data, allocating storage space for the remaining sub-data segments to be stored and generating storage address links, and generating a storage reading link for the data to be stored based on the guiding pointing links and the storage address links.
As a further aspect of the application: each piece of data to be stored contains an indefinite number of identification header data segments; identification header data segments correspond one-to-one with redundancy prevention data sets; the data segment intervals of the data to be stored that correspond to different redundancy prevention data sets may overlap; each sub-data segment to be stored has a maximum data amount; and the number of feature data segments in each redundancy prevention data set has a maximum.
As a still further aspect of the application, the method further comprises the step of:
when the data to be stored is matched against the header database and no matching identification header data segment is found within a preset data segment length, extracting data from that data segment of the data to be stored and establishing a new identification header data segment, wherein the preset data segment length is smaller than the data length interval, and the data length of the identification header data segment is smaller than the data length interval.
As a still further aspect of the application, the method further comprises the step of:
defining segmentation reservation nodes on both sides of each segmentation mark based on a preset reserved data segment size; when the data to be stored is segmented, copying the reserved data segments on both sides and assigning them to the preceding and following sub-data segments to be stored, respectively; the reserved data segments are used to match and position the data content when adjacent sub-data segments to be stored are recombined.
As a still further aspect of the application, the method further comprises the steps of:
acquiring a data reading request, obtaining the corresponding storage reading link based on the data reading request, and obtaining the guiding pointing links and storage address links of the plurality of stored sub-data segments;
obtaining the reading frequency per unit time corresponding to the guiding pointing links and the storage address links and, if the reading frequency per unit time is greater than a preset threshold, generating an auxiliary reading link that directs the request to a communication link with a user who has already read the stored data;
and letting the user obtain the stored data through the auxiliary reading link.
Compared with the prior art, the application has the following beneficial effects: feature data segments are extracted from the data to be stored so that duplicate data in the database can be matched and identified conveniently; the data to be stored is then stored in segments, and duplicate segments are handled by having multiple data reads point to shared storage content. Compared with prior-art storage, this can greatly reduce the storage occupied on the terminal equipment. Storing duplicate data at a common source also helps data security management and lets the system judge more directly which data is frequently accessed, and matching on feature data segments reduces the computing power consumed by the redundancy prevention calculation.
Drawings
Fig. 1 is a block diagram of a data redundancy prevention system of a self-service terminal device.
Fig. 2 is a block diagram of an auxiliary reading guide module in a data redundancy prevention system of a self-service terminal device.
Fig. 3 is a flow chart of a method for preventing redundancy of data of a self-service terminal device.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
Specific implementations of the application are described in detail below in connection with specific embodiments.
As shown in fig. 1, a data redundancy prevention system of a self-service terminal device according to an embodiment of the present application includes the following modules:
The data storage module 100 is configured to acquire data to be stored, match the data to be stored against a preset header database to obtain the corresponding identification header data segment, and extract features from the data to be stored at a preset data length interval and with a preset feature data segment length to obtain a plurality of feature data segments, where an identification header data segment and its corresponding feature data segments form a redundancy prevention data set.
The verification and retrieval module 300 is configured to search a matching database based on the redundancy prevention data set and, if the identification header data segment and a plurality of consecutive feature data segments match a redundancy prevention data set in the matching database, obtain the stored data corresponding to the matched redundancy prevention data set, compare the stored data with the corresponding data segment of the data to be stored, and generate a judgment result.
The matching segmentation module 500 is configured to, when the judgment result indicates that the two are the same, place segmentation marks on the data to be stored based on the first and last data nodes of the matched redundancy prevention data set, and segment the data to be stored at the segmentation marks to obtain a plurality of sub-data segments to be stored.
The storage link module 700 is configured to create, for each sub-data segment to be stored whose judgment result indicates a match, a guiding pointing link to the storage address of the corresponding stored data, allocate storage space for the remaining sub-data segments to be stored and generate storage address links, and generate a storage reading link for the data to be stored based on the guiding pointing links and the storage address links.
In this embodiment, a data redundancy prevention system of a self-service terminal device is provided. Feature data segments are extracted from the data to be stored so that duplicate data in the database can be matched and identified conveniently; the data to be stored is then stored in segments, and duplicate segments are handled by having multiple data reads point to shared storage content. Compared with prior-art storage, this can greatly reduce the storage occupied on the terminal device. Storing duplicate data at a common source also helps data security management and gives the system a more direct view of which data is frequently accessed, and matching on feature data segments reduces the computing power consumed by the redundancy prevention calculation.
In use, every data segment stored in the database has a corresponding redundancy prevention data set that makes it easy to match and search. A redundancy prevention data set comprises an identification header data segment used for identification and a plurality of feature data segments. Picture the data as a long axis: its front part contains a header data segment used for identification, followed by feature data segments located at equal intervals. When a user stores data, the data is matched against the database of header data segments. When content consistent with a stored header data segment is found, feature data segments are extracted at the same interval starting from that header data segment, so that they can be compared with the features of the data stored in the database to judge whether the data is the same. If part of the data is the same as a part already stored in the database, that part is not stored again; instead, a storage address pointing to the stored data is set, which reduces redundancy in the system.
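The header-plus-sampled-features idea described above can be sketched as follows. This is a minimal illustration only: the byte lengths, the cap on the number of feature segments, and the names `RedundancySet`, `build_redundancy_set`, and `matching_prefix` are assumptions made for the example and do not come from the application, which only says these values are preset.

```python
from dataclasses import dataclass
from typing import Tuple

# Illustrative values only; the application says these are "preset".
HEADER_LEN = 4     # length of an identification header data segment
INTERVAL = 16      # data length interval between sampled feature segments
FEATURE_LEN = 4    # length of each feature data segment
MAX_FEATURES = 8   # maximum number of feature segments per set


@dataclass(frozen=True)
class RedundancySet:
    """Hypothetical container: one header plus its sampled feature segments."""
    header: bytes
    features: Tuple[bytes, ...]


def build_redundancy_set(data: bytes, header_pos: int) -> RedundancySet:
    """Take the header at header_pos, then sample features at equal intervals."""
    header = data[header_pos:header_pos + HEADER_LEN]
    features = []
    pos = header_pos + INTERVAL
    while pos + FEATURE_LEN <= len(data) and len(features) < MAX_FEATURES:
        features.append(data[pos:pos + FEATURE_LEN])
        pos += INTERVAL
    return RedundancySet(header, tuple(features))


def matching_prefix(a: RedundancySet, b: RedundancySet) -> int:
    """Count consecutive feature segments that agree; -1 if the headers differ."""
    if a.header != b.header:
        return -1
    count = 0
    for x, y in zip(a.features, b.features):
        if x != y:
            break
        count += 1
    return count
```

Comparing a few short sampled segments is much cheaper than comparing whole payloads, so a full byte-by-byte comparison is only needed for candidates whose header and consecutive features already agree.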
As another preferred embodiment of the present application, each piece of data to be stored contains an indefinite number of identification header data segments; each identification header data segment corresponds one-to-one with a redundancy prevention data set; the data segment intervals of the data to be stored that correspond to different redundancy prevention data sets may overlap; each sub-data segment to be stored has a maximum data amount; and the number of feature data segments in each redundancy prevention data set has a maximum.
Further, the data storage module 100 includes a header setting unit;
the header setting unit is configured to, when the data to be stored is matched against the header database and no matching identification header data segment is found within a preset data segment length, extract data from that data segment of the data to be stored and establish a new identification header data segment, wherein the preset data segment length is smaller than the data length interval, and the data length of the identification header data segment is smaller than the data length interval.
In this embodiment, the same data to be stored may have multiple redundancy prevention data sets. A matched identification header data segment may cover only a small part of the data and does not by itself show that the data is consistent with the stored data, so matching against multiple redundancy prevention data sets is needed for a more reliable and reasonable judgment. When identification header data segments are acquired, if no header data is matched within a certain data length, a new header is established from that data, which indicates that this segment of data has not yet been stored in the database.
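The header lookup with fallback creation described above might look like the following sketch. The window and header sizes are assumed values chosen to satisfy the stated constraint that both are smaller than the data length interval, and the choice to build the new header from the first bytes of the window is an assumption; the application does not say which bytes are extracted. `find_or_create_header` is a hypothetical name.

```python
from typing import Set, Tuple

INTERVAL = 16       # assumed data length interval
HEADER_LEN = 4      # assumed header length, smaller than INTERVAL
SEARCH_WINDOW = 12  # assumed "preset data segment length", smaller than INTERVAL


def find_or_create_header(
    data: bytes, start: int, header_db: Set[bytes]
) -> Tuple[int, bytes, bool]:
    """Return (position, header, created) for the window beginning at start."""
    # Scan the window for any header that is already registered.
    end = min(start + SEARCH_WINDOW, len(data) - HEADER_LEN + 1)
    for pos in range(start, end):
        candidate = data[pos:pos + HEADER_LEN]
        if candidate in header_db:
            return pos, candidate, False
    # Nothing known in this window: register its first bytes as a new header
    # (assumption: the application does not specify which bytes are taken).
    new_header = data[start:start + HEADER_LEN]
    header_db.add(new_header)
    return start, new_header, True
```

A `created` flag of `True` signals that this stretch of data has not been seen before, so it can go straight to fresh allocation without a database comparison.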
As another preferred embodiment of the present application, the matching segmentation module 500 includes a segmentation reservation unit;
the segmentation reservation unit is configured to define segmentation reservation nodes on both sides of each segmentation mark based on a preset reserved data segment size; when the data to be stored is segmented, the reserved data segments on both sides are copied and assigned to the preceding and following sub-data segments to be stored, respectively; the reserved data segments are used to match and position the data content when adjacent sub-data segments to be stored are recombined.
In this embodiment, the purpose of the segmentation reservation is that, when the segmented data on the two sides of a mark is read again, the two pieces can be positioned and combined through their identical overlapping data, which avoids the situation in which the original data cannot be restored after recombination because of misalignment or data loss.
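One way to read the reservation scheme is that each piece keeps a copy of the reserved bytes from both sides of the mark, so adjacent pieces overlap by twice the reserved size. The sketch below follows that reading; the reserved size and the function names `split_with_reserve` and `recombine` are assumptions for illustration.

```python
from typing import Tuple

RESERVE = 4  # assumed size of one reserved data segment


def split_with_reserve(
    data: bytes, mark: int, reserve: int = RESERVE
) -> Tuple[bytes, bytes]:
    """Split at mark; both pieces keep the reserved bytes on each side of it."""
    if reserve <= 0 or not (reserve <= mark <= len(data) - reserve):
        raise ValueError("mark must leave room for the reserved segments")
    front = data[:mark + reserve]   # front piece plus a copy of the bytes after the mark
    back = data[mark - reserve:]    # copy of the bytes before the mark plus the back piece
    return front, back


def recombine(front: bytes, back: bytes, reserve: int = RESERVE) -> bytes:
    """Join two pieces, using the shared overlap to check their alignment."""
    if reserve <= 0:
        raise ValueError("reserve must be positive")
    overlap = 2 * reserve
    if front[-overlap:] != back[:overlap]:
        raise ValueError("reserved segments do not line up")
    return front + back[overlap:]
```

The overlap costs a few duplicated bytes per cut, but it lets the reader detect a misplaced or corrupted piece before it silently produces wrong data.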
As shown in fig. 2, as another preferred embodiment of the present application, the system further includes an auxiliary reading guide module 900, which includes:
The reading request unit 901 is configured to acquire a data reading request, obtain the corresponding storage reading link based on the data reading request, and obtain the guiding pointing links and storage address links of the plurality of stored sub-data segments.
The reading judgment unit 902 is configured to obtain the reading frequency per unit time corresponding to the guiding pointing links and the storage address links and, if the reading frequency per unit time is greater than a preset threshold, generate an auxiliary reading link that directs the request to a communication link with a user who has already read the stored data.
The reading guide unit 903 is configured to let the user obtain the stored data through the auxiliary reading link.
In this embodiment, the auxiliary reading guide module 900 addresses a side effect of redundancy prevention during continuous reading of the database: some shared data segments may be read very frequently, which can easily cause storage lifetime problems and occupy read bandwidth. Directing read requests for frequently read data segments to a device that has already read them, so that the data content is shared from that device, effectively reduces the read/write pressure on the database.
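A rough sketch of the frequency check follows. It counts reads per link within a sliding time window and, once the count exceeds the threshold, answers with a peer that has already read the segment instead of the database. The class name `ReadGuide`, the string return values, and the policy of choosing the most recent reader are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class ReadGuide:
    """Hypothetical router: hot links are redirected to a previous reader."""

    def __init__(self, threshold: int, window_s: float = 1.0) -> None:
        self.threshold = threshold          # preset reads-per-window threshold
        self.window_s = window_s            # length of the "unit time" window
        self._reads = defaultdict(deque)    # link -> timestamps of recent reads
        self._holders = defaultdict(list)   # link -> users that already read it

    def route(self, link: str, user: str, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        recent = self._reads[link]
        # Drop reads that fell out of the unit-time window.
        while recent and now - recent[0] > self.window_s:
            recent.popleft()
        peers = [u for u in self._holders[link] if u != user]
        hot = len(recent) > self.threshold
        # Record this read for later requests.
        recent.append(now)
        if user not in self._holders[link]:
            self._holders[link].append(user)
        if hot and peers:
            return "peer:" + peers[-1]  # auxiliary reading link to a prior reader
        return "database"
```

A real deployment would also need to confirm that the chosen peer is still reachable and still holds the data; this sketch leaves that out.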
As shown in fig. 3, the present application further provides a data redundancy prevention method of a self-service terminal device, which includes the steps of:
S200, acquiring data to be stored, matching the data to be stored against a preset header database to obtain the corresponding identification header data segment, and extracting features from the data to be stored at a preset data length interval and with a preset feature data segment length to obtain a plurality of feature data segments, wherein an identification header data segment and its corresponding feature data segments form a redundancy prevention data set.
S400, searching a matching database based on the redundancy prevention data set and, if the identification header data segment and a plurality of consecutive feature data segments match a redundancy prevention data set in the matching database, obtaining the stored data corresponding to the matched redundancy prevention data set, comparing the stored data with the corresponding data segment of the data to be stored, and generating a judgment result.
S600, when the judgment result indicates that the two are the same, placing segmentation marks on the data to be stored based on the first and last data nodes of the matched redundancy prevention data set, and segmenting the data to be stored at the segmentation marks to obtain a plurality of sub-data segments to be stored.
S800, creating, for each sub-data segment to be stored whose judgment result indicates a match, a guiding pointing link to the storage address of the corresponding stored data, allocating storage space for the remaining sub-data segments to be stored and generating storage address links, and generating a storage reading link for the data to be stored based on the guiding pointing links and the storage address links.
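Steps S200 to S800 can be put together in a heavily simplified sketch. It assumes that whole stored blocks act as the match units, that storage addresses are plain integers, and that the storage reading link is an ordered list of addresses; maximum segment sizes, overlapping sets, and reserved segments are left out. `DedupStore`, `signature`, and the size constants are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

HEADER_LEN, INTERVAL, FEATURE_LEN = 4, 8, 2  # assumed preset sizes


def signature(block: bytes) -> Tuple[bytes, ...]:
    """Feature segments sampled at equal intervals after the header."""
    stop = len(block) - FEATURE_LEN + 1
    return tuple(block[p:p + FEATURE_LEN] for p in range(INTERVAL, stop, INTERVAL))


class DedupStore:
    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}       # storage address -> stored data
        self.index: Dict[bytes, List[int]] = {}  # header -> addresses of blocks

    def _allocate(self, block: bytes) -> int:
        addr = len(self.blocks)
        self.blocks[addr] = block
        self.index.setdefault(block[:HEADER_LEN], []).append(addr)
        return addr

    def _match_at(self, data: bytes, pos: int) -> Optional[int]:
        """S200/S400: header lookup, feature match, then full comparison."""
        for addr in self.index.get(data[pos:pos + HEADER_LEN], []):
            stored = self.blocks[addr]
            window = data[pos:pos + len(stored)]
            if (len(window) == len(stored)
                    and signature(window) == signature(stored)
                    and window == stored):
                return addr
        return None

    def store(self, data: bytes) -> List[int]:
        """Return the storage reading link as an ordered list of addresses."""
        link: List[int] = []
        pending_start = 0
        pos = 0
        while pos < len(data):
            addr = self._match_at(data, pos)
            if addr is None:
                pos += 1
                continue
            if pending_start < pos:
                # S600: cut out the unmatched stretch; S800: allocate new space.
                link.append(self._allocate(data[pending_start:pos]))
            link.append(addr)  # S800: point at the data that is already stored
            pos += len(self.blocks[addr])
            pending_start = pos
        if pending_start < len(data):
            link.append(self._allocate(data[pending_start:]))
        return link

    def read(self, link: List[int]) -> bytes:
        """Follow the reading link and reassemble the original data."""
        return b"".join(self.blocks[a] for a in link)
```

Storing a payload and then a second payload that embeds the first yields a reading link that reuses the first payload's address, with only the new stretches before and after it taking fresh space.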
As another preferred embodiment of the present application, each piece of data to be stored contains an indefinite number of identification header data segments; each identification header data segment corresponds one-to-one with a redundancy prevention data set; the data segment intervals of the data to be stored that correspond to different redundancy prevention data sets may overlap; each sub-data segment to be stored has a maximum data amount; and the number of feature data segments in each redundancy prevention data set has a maximum.
As another preferred embodiment of the present application, the method further comprises the steps of:
Acquiring a data reading request, obtaining the corresponding storage reading link based on the data reading request, and obtaining the guiding pointing links and storage address links of the plurality of stored sub-data segments.
Obtaining the reading frequency per unit time corresponding to the guiding pointing links and the storage address links and, if the reading frequency per unit time is greater than a preset threshold, generating an auxiliary reading link that directs the request to a communication link with a user who has already read the stored data.
Letting the user obtain the stored data through the auxiliary reading link.
In addition, when the data to be stored is matched against the header database and no matching identification header data segment is found within a preset data segment length, data is extracted from that data segment of the data to be stored and a new identification header data segment is established, wherein the preset data segment length is smaller than the data length interval, and the data length of the identification header data segment is smaller than the data length interval.
As another preferred embodiment of the present application, the method further comprises the step of:
Defining segmentation reservation nodes on both sides of each segmentation mark based on a preset reserved data segment size; when the data to be stored is segmented, the reserved data segments on both sides are copied and assigned to the preceding and following sub-data segments to be stored, respectively; the reserved data segments are used to match and position the data content when adjacent sub-data segments to be stored are recombined.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program for instructing relevant hardware, where the program may be stored in a non-volatile computer readable storage medium, and where the program, when executed, may include processes in the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
Other embodiments of the present disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A data redundancy prevention system for a self-service terminal device, comprising:
a data storage module, configured to acquire data to be stored, match the data to be stored against a preset header database to obtain a corresponding identification header data segment, and perform feature extraction on the data to be stored according to a preset data length interval and a preset feature data segment length to obtain a plurality of feature data segments, wherein the identification header data segment and its corresponding feature data segments form a redundancy prevention data set;
a verification and retrieval module, configured to search a matching database based on the redundancy prevention data set and, if the identification header data segment and a plurality of consecutive feature data segments match a redundancy prevention data set in the matching database, acquire the stored data corresponding to the matched redundancy prevention data set, compare the stored data with the corresponding data segment of the data to be stored, and generate a judgment result;
a matching segmentation module, configured to, when the judgment result indicates that the compared data are identical, place segmentation marks on the data to be stored at the head and tail data nodes of the matched redundancy prevention data set, and segment the data to be stored at the segmentation marks to obtain a plurality of sub-data segments to be stored;
and a storage link module, configured to create guiding pointing links to the storage addresses of the corresponding stored data for the sub-data segments to be stored whose judgment result indicates identity, allocate storage space for the remaining sub-data segments to be stored and generate storage address links, and generate a storage reading link for the data to be stored based on the guiding pointing links and the storage address links.
2. The data redundancy prevention system for a self-service terminal device according to claim 1, wherein each piece of data to be stored includes a variable number of identification header data segments, the identification header data segments are in one-to-one correspondence with the redundancy prevention data sets, the data segments of the data to be stored that correspond to different redundancy prevention data sets may overlap, each sub-data segment to be stored has a maximum data amount, and the feature data segments in each redundancy prevention data set have a maximum number.
3. The data redundancy prevention system for a self-service terminal device according to claim 2, wherein the data storage module comprises a header setting unit;
the header setting unit is configured to, when the data to be stored is matched against the header database and no matching identification header data segment is retrieved within a preset data segment length, extract the data of the data to be stored within that data segment and establish an identification header data segment from it, wherein the preset data segment length is smaller than the data length interval, and the data length of the identification header data segment is smaller than the data length interval.
4. The data redundancy prevention system for a self-service terminal device according to claim 3, wherein the matching segmentation module comprises a segmentation reservation unit;
the segmentation reservation unit is configured to define segmentation reservation nodes on both sides of each segmentation mark based on a preset reserved data segment size, wherein, when the data to be stored is segmented, the reserved data segments on both sides are copied and assigned to the preceding and following sub-data segments to be stored, respectively, and the reserved data segments are used to match and position data content when adjacent sub-data segments to be stored are recombined.
5. The data redundancy prevention system of a self-service terminal device according to claim 1, further comprising an auxiliary reading guide module, specifically comprising:
a reading request unit, configured to acquire a data reading request, acquire the corresponding storage reading link based on the data reading request, and acquire the guiding pointing links and storage address links of the plurality of stored sub-data segments;
a reading judgment unit, configured to acquire the per-unit-time reading frequency corresponding to the guiding pointing links and the storage address links and, if the per-unit-time reading frequency is greater than a preset threshold, generate an auxiliary reading link that guides the user's communication link to the stored data being read;
and a reading guide unit, configured to allow the user to acquire the stored data based on the auxiliary reading link.
6. A data redundancy prevention method for a self-service terminal device, comprising the following steps:
acquiring data to be stored, matching the data to be stored against a preset header database to obtain a corresponding identification header data segment, and performing feature extraction on the data to be stored according to a preset data length interval and a preset feature data segment length to obtain a plurality of feature data segments, wherein the identification header data segment and its corresponding feature data segments form a redundancy prevention data set;
searching a matching database based on the redundancy prevention data set and, if the identification header data segment and a plurality of consecutive feature data segments match a redundancy prevention data set in the matching database, acquiring the stored data corresponding to the matched redundancy prevention data set, comparing the stored data with the corresponding data segment of the data to be stored, and generating a judgment result;
when the judgment result indicates that the compared data are identical, placing segmentation marks on the data to be stored at the head and tail data nodes of the matched redundancy prevention data set, and segmenting the data to be stored at the segmentation marks to obtain a plurality of sub-data segments to be stored;
and creating guiding pointing links to the storage addresses of the corresponding stored data for the sub-data segments to be stored whose judgment result indicates identity, allocating storage space for the remaining sub-data segments to be stored and generating storage address links, and generating a storage reading link for the data to be stored based on the guiding pointing links and the storage address links.
7. The method for preventing data redundancy of a self-service terminal device according to claim 6, wherein each piece of data to be stored includes a variable number of identification header data segments, the identification header data segments are in one-to-one correspondence with the redundancy prevention data sets, the data segments of the data to be stored that correspond to different redundancy prevention data sets may overlap, each sub-data segment to be stored has a maximum data amount, and the feature data segments in each redundancy prevention data set have a maximum number.
8. The method for preventing data redundancy of a self-service terminal device according to claim 7, further comprising the following step:
when the data to be stored is matched against the header database, if no matching identification header data segment is retrieved within a preset data segment length, extracting the data of the data to be stored within that data segment and establishing an identification header data segment from it, wherein the preset data segment length is smaller than the data length interval, and the data length of the identification header data segment is smaller than the data length interval.
9. The method for preventing data redundancy of a self-service terminal device according to claim 8, further comprising the following step:
defining segmentation reservation nodes on both sides of each segmentation mark based on a preset reserved data segment size, wherein, when the data to be stored is segmented, the reserved data segments on both sides are copied and assigned to the preceding and following sub-data segments to be stored, respectively, and the reserved data segments are used to match and position data content when adjacent sub-data segments to be stored are recombined.
10. The method for preventing data redundancy of a self-service terminal device according to claim 6, further comprising the steps of:
acquiring a data reading request, acquiring the corresponding storage reading link based on the data reading request, and acquiring the guiding pointing links and storage address links of the plurality of stored sub-data segments;
acquiring the per-unit-time reading frequency corresponding to the guiding pointing links and the storage address links and, if the per-unit-time reading frequency is greater than a preset threshold, generating an auxiliary reading link that guides the user's communication link to the stored data being read;
and allowing the user to acquire the stored data based on the auxiliary reading link.
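
The storage flow of claim 6 can be illustrated with a heavily simplified Python sketch. It is not the claimed implementation: it assumes fixed-size stored blocks, treats the first bytes of a block as its identification header, samples feature segments at a fixed interval, and uses an in-memory list as storage. The class `Store`, the constants, and the `("ref", addr)` / `("new", addr)` link format are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field

HEADER_LEN = 4    # assumed length of an identification header segment
FEATURE_LEN = 2   # assumed length of each feature segment
INTERVAL = 8      # assumed data length interval between feature samples
BLOCK_LEN = 32    # assumed maximum size of a stored sub-segment


def fingerprint(block: bytes):
    """Return (header, features): the redundancy prevention data set of a block."""
    header = block[:HEADER_LEN]
    features = tuple(block[i:i + FEATURE_LEN]
                     for i in range(INTERVAL, len(block), INTERVAL))
    return header, features


@dataclass
class Store:
    blocks: list = field(default_factory=list)  # address -> stored bytes
    index: dict = field(default_factory=dict)   # header -> {features: address}

    def _allocate(self, chunk: bytes) -> int:
        """Allocate storage for a new sub-segment; index it if it is a full block."""
        self.blocks.append(chunk)
        addr = len(self.blocks) - 1
        if len(chunk) == BLOCK_LEN:
            header, features = fingerprint(chunk)
            # A later block with the same sampled fingerprint replaces the earlier entry.
            self.index.setdefault(header, {})[features] = addr
        return addr

    def _lookup(self, candidate: bytes):
        """Match header, then features, then confirm with a full byte comparison."""
        header, features = fingerprint(candidate)
        addr = self.index.get(header, {}).get(features)
        if addr is not None and self.blocks[addr] == candidate:
            return addr
        return None

    def _flush(self, span: bytes, links: list) -> None:
        """Store unmatched data as new sub-segments of at most BLOCK_LEN bytes."""
        for i in range(0, len(span), BLOCK_LEN):
            links.append(("new", self._allocate(span[i:i + BLOCK_LEN])))

    def put(self, data: bytes) -> list:
        """Store `data` and return its read link: an ordered list of (kind, address)."""
        links, pending_start, pos = [], 0, 0
        while pos + BLOCK_LEN <= len(data):
            addr = self._lookup(data[pos:pos + BLOCK_LEN])
            if addr is None:
                pos += 1                      # no match here, slide forward one byte
                continue
            self._flush(data[pending_start:pos], links)  # data before the match is new
            links.append(("ref", addr))       # pointer link to already stored data
            pos += BLOCK_LEN
            pending_start = pos
        self._flush(data[pending_start:], links)
        return links

    def get(self, links: list) -> bytes:
        """Follow a read link and reassemble the original data."""
        return b"".join(self.blocks[addr] for _, addr in links)
```

In this sketch the sampled fingerprint only narrows the search; the full comparison in `_lookup` plays the role of the judgment step, so a fingerprint collision results in the data being stored again rather than in a wrong reference.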
CN202311238831.8A 2023-09-25 2023-09-25 Data redundancy prevention method and system for self-service terminal equipment Active CN116991329B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311238831.8A CN116991329B (en) 2023-09-25 2023-09-25 Data redundancy prevention method and system for self-service terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311238831.8A CN116991329B (en) 2023-09-25 2023-09-25 Data redundancy prevention method and system for self-service terminal equipment

Publications (2)

Publication Number Publication Date
CN116991329A true CN116991329A (en) 2023-11-03
CN116991329B CN116991329B (en) 2023-12-08

Family

ID=88530489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311238831.8A Active CN116991329B (en) 2023-09-25 2023-09-25 Data redundancy prevention method and system for self-service terminal equipment

Country Status (1)

Country Link
CN (1) CN116991329B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101479944A (en) * 2006-04-28 2009-07-08 网络装置公司 System and method for sampling based elimination of duplicate data
CN103150260A (en) * 2011-11-25 2013-06-12 华为数字技术(成都)有限公司 Method and device for deleting repeating data
US20140188822A1 (en) * 2012-12-28 2014-07-03 Futurewei Technologies, Inc. Efficient De-Duping Using Deep Packet Inspection
CN105912268A (en) * 2016-04-12 2016-08-31 韶关学院 Distributed data deduplication method and apparatus based on self-matching characteristics
US20180107401A1 (en) * 2016-10-18 2018-04-19 International Business Machines Corporation Using volume header records to identify matching tape volumes
CN108228083A (en) * 2016-12-21 2018-06-29 伊姆西Ip控股有限责任公司 For the method and apparatus of data deduplication
CN114721594A (en) * 2022-03-31 2022-07-08 新华三信息技术有限公司 Distributed storage method, device, equipment and machine readable storage medium
CN115993939A (en) * 2023-03-22 2023-04-21 陕西中安数联信息技术有限公司 Method and device for deleting repeated data of storage system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zheng Yaguang; Pan Jiuhui: "A duplicate data detection algorithm based on sliding blocks", Computer Engineering, no. 02 *

Also Published As

Publication number Publication date
CN116991329B (en) 2023-12-08

Similar Documents

Publication Publication Date Title
CN108829781B (en) Client information query method, device, computer equipment and storage medium
US11567940B1 (en) Cache-aware system and method for identifying matching portions of two sets of data in a multiprocessor system
CN110795919A (en) Method, device, equipment and medium for extracting table in PDF document
CN109597571B (en) Data storage method, data reading method, data storage device, data reading device and computer equipment
CN111723056A (en) Small file processing method, device, equipment and storage medium
CN111858678A (en) Redis-based key value deletion method, computer device, apparatus and storage medium
WO2019075968A1 (en) Cross-page recognition method for form information, electronic device, and computer-readable storage medium
CN106980680B (en) Data storage method and storage device
CN112685333A (en) Heap memory management method and device
CN116991329B (en) Data redundancy prevention method and system for self-service terminal equipment
CN105183383A (en) Recombination method for irrelevant mirror images of file system
CN111208952A (en) Storage system capacity expansion method, readable storage medium and computing device
CN110956031A (en) Text similarity matching method, device and system
CN110008140B (en) Memory management method and device, computer equipment and storage medium
CN109165305B (en) Characteristic value storage and retrieval method and device
CN108959486B (en) Audit field information acquisition method and device, computer equipment and storage medium
CN113255742A (en) Policy matching degree calculation method and system, computer equipment and storage medium
CN109977121B (en) Big data rapid storage system
CN112800123A (en) Data processing method, data processing device, computer equipment and storage medium
CN111090542A (en) Abnormal block identification method and device based on abnormal power failure and computer equipment
CN114205161B (en) Network attacker discovery and tracking method
CN116992443A (en) Data security identification method and system based on network monitoring
CN115951825A (en) Information management method and system applied to data center
CN112905191B (en) Data processing method, device, computer readable storage medium and computer equipment
CN117556087B (en) Customer service reply data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant