CN112306998B - Method, device and server for de-duplication of traffic and delegation data - Google Patents
Method, device and server for de-duplication of traffic and delegation data
- Publication number
- CN112306998B (application CN202011087893.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- duplication
- field
- traffic
- delegation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2272—Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application provides a method, a device and a server for de-duplication of traffic commission data. Received traffic commission data is de-duplicated according to the index data segment of its de-duplication index field, the time data segment of its de-duplication time field, and the filtering duration set for the de-duplication time field. A configuration interface for de-duplication is generated dynamically from an initial selection interface, and the de-duplication index field and the de-duplication time field are selected dynamically through that configuration interface, which improves the flexibility and accuracy of de-duplication. Using a Redis database also increases the speed of de-duplication and makes the system more extensible and more general.
Description
Technical Field
The present application relates to the field of traffic, and in particular to a method, a device and a server for de-duplication of traffic commission data.
Background
With the rapid development of the internet, big data technology is widely applied in many fields. In many big data scenarios, a large amount of data must be ingested continuously from external sources within a short time, and this data contains many duplicates. For example, in crowded places such as shopping malls and stations, a person who stays near a wifi electronic fence for a while is captured many times, producing near-identical records. Likewise, in a traffic operation scenario, when a vehicle moves slowly or stands still at a checkpoint, the data collector repeatedly uploads passing records for that vehicle within a short time, and the big data platform stores the resulting duplicates. Storing duplicate data without processing it wastes system throughput and storage capacity, and it also seriously affects subsequent data use and data analysis.
At present, data is generally de-duplicated either by an ETL (Extract-Transform-Load) tool during extraction or by SQL statements. An ETL tool, however, treats two records as duplicates only when they are completely identical; it cannot decide that records are duplicates because some key fields match, or because some key fields match within a specified period of time. SQL statements can be used only in relational databases, so they lack generality and extensibility, and the de-duplication process has a considerable impact on database performance.
Therefore, current data de-duplication methods suffer from insufficient flexibility and accuracy and from poor extensibility and generality.
Disclosure of Invention
The embodiments of the present application provide a method and a device for de-duplication of traffic commission data, which address the insufficient flexibility and accuracy and the poor extensibility and generality of current data de-duplication methods.
In a first aspect, an embodiment of the present application provides a method for de-duplication of traffic commission data, the method including:
displaying a selection interface of target data, and receiving a data type selection operation through the selection interface;
determining the target data according to the data type selection operation, and acquiring metadata corresponding to the target data;
updating the selection interface according to the metadata, and generating and displaying a configuration interface;
receiving a de-duplication parameter configuration operation through the configuration interface, determining a de-duplication index field and a de-duplication time field from the metadata according to the de-duplication parameter configuration operation, and setting a filtering duration corresponding to the de-duplication time field;
receiving traffic commission data, wherein the traffic commission data comprises an index data segment of the de-duplication index field and a time data segment of the de-duplication time field;
based on the data cached in a Redis database, performing de-duplication processing on the traffic commission data according to the index data segment, the time data segment and the filtering duration, and then caching the traffic commission data to the Redis database.
In some embodiments, determining the target data according to the data type selection operation, and acquiring metadata corresponding to the target data specifically includes:
selecting the data type according to the data type selection operation, and determining target data corresponding to the data type;
and acquiring the metadata corresponding to the target data according to the data type.
In some embodiments, updating the selection interface according to the metadata, and generating and displaying a configuration interface, specifically includes:
displaying the metadata, and setting a field index box corresponding to the metadata, a field selection box corresponding to each metadata field, and a field confirmation box;
updating the selection interface, and generating and displaying the configuration interface.
In some embodiments, determining the de-duplication index field and the de-duplication time field from the metadata, and setting a filtering duration corresponding to the de-duplication time field, specifically includes:
searching for the de-duplication index field and the de-duplication time field through the field index box;
selecting the de-duplication index field and the de-duplication time field through the field selection box;
confirming the de-duplication index field and the de-duplication time field through the field confirmation box;
and displaying the de-duplication index field and the de-duplication time field, and setting a filtering duration corresponding to the de-duplication time field.
In some embodiments, the de-duplication index field includes at least one field of the metadata, and the de-duplication time field includes only one of the fields of the metadata.
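The constraint above (one or more index fields, exactly one time field, both drawn from the metadata) can be sketched as a small validated configuration object. This is only an illustration: the class name `DedupConfig`, the field names, the use of seconds as the unit and the requirement that the filtering duration be positive are assumptions, not part of the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DedupConfig:
    """Hypothetical container for the de-duplication parameters."""
    metadata_fields: tuple   # every field offered by the metadata
    index_fields: tuple      # one or more de-duplication index fields
    time_field: str          # exactly one de-duplication time field
    filter_seconds: int      # filtering duration; seconds is an assumed unit

    def __post_init__(self):
        if not self.index_fields:
            raise ValueError("at least one de-duplication index field is required")
        unknown = [f for f in (*self.index_fields, self.time_field)
                   if f not in self.metadata_fields]
        if unknown:
            raise ValueError(f"fields not present in the metadata: {unknown}")
        if self.filter_seconds <= 0:  # assumed sanity check
            raise ValueError("the filtering duration must be positive")


# Hypothetical motor vehicle configuration with two index fields.
config = DedupConfig(
    metadata_fields=("device_code", "plate_number", "plate_color", "passing_time"),
    index_fields=("plate_number", "plate_color"),
    time_field="passing_time",
    filter_seconds=30,
)
```

Typing `time_field` as a single string rather than a collection makes the "only one time field" rule hold by construction.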
In some embodiments, performing de-duplication processing on the traffic commission data according to the index data segment, the time data segment and the filtering duration based on the data cached in the Redis database, and then caching it to the Redis database, specifically includes:
judging whether the data cached in the Redis database includes the index data segment; if it does not, determining that the traffic commission data is not duplicate data and writing the traffic commission data into the Redis database;
if the data cached in the Redis database includes the index data segment, judging whether the difference between the time data segment and the time data segment of the last piece of traffic commission data cached in the Redis database is within the filtering duration; if it is not within the filtering duration, determining that the traffic commission data is not duplicate data and writing the traffic commission data into the Redis database; and if it is within the filtering duration, determining that the traffic commission data is duplicate data, discarding it, and updating the time data segment of the last piece of traffic commission data cached in the Redis database to the time data segment.
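A minimal sketch of this decision flow follows, with a plain Python dict standing in for the Redis cache so that it runs without a server. The function name `process`, the use of integer seconds for time data segments and the `refresh_on_duplicate` switch are assumptions for illustration. The switch exists because the step above updates the cached time when a duplicate is discarded, whereas the worked example in the detailed description compares against the time of the first kept record.

```python
def process(store, index_segment, time_segment, filter_seconds,
            refresh_on_duplicate=True):
    """Return True if the record is kept as new data, False if discarded.

    store maps an index data segment to the cached time data segment
    (integer seconds here); it stands in for the Redis cache.
    """
    cached = store.get(index_segment)
    if cached is None:
        # Index data segment not cached yet: not a duplicate.
        store[index_segment] = time_segment
        return True
    if abs(time_segment - cached) > filter_seconds:
        # Outside the filtering duration: written as new data.
        store[index_segment] = time_segment
        return True
    # Within the filtering duration: duplicate, discarded.
    if refresh_on_duplicate:
        store[index_segment] = time_segment
    return False


sliding = {}
fixed = {}
times = (0, 20, 40)
kept_sliding = [process(sliding, "AXXXX", t, 30) for t in times]
kept_fixed = [process(fixed, "AXXXX", t, 30, refresh_on_duplicate=False)
              for t in times]
print(kept_sliding)  # [True, False, False]
print(kept_fixed)    # [True, False, True]
```

With refreshing enabled, a steady stream of records closer together than the filtering duration is suppressed indefinitely; with it disabled, one record is kept per filtering window.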
In some embodiments, judging whether the data cached in the Redis database includes the index data segment, and, if it does not, determining that the traffic commission data is not duplicate data and writing the traffic commission data into the Redis database, specifically includes:
generating a Redis key name corresponding to the traffic commission data from the de-duplication index field and the index data segment;
generating a Redis key value corresponding to the traffic commission data from the time data segment;
forming, from the Redis key name and the Redis key value, a duplicate-checking Redis key-value pair corresponding to the traffic commission data;
inserting the Redis key-value pair into the Redis database, and judging whether the data cached in the Redis database includes the Redis key name;
and if the Redis key name is not included, determining that the traffic commission data is not duplicate data, and writing the Redis key-value pair into the Redis database.
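The key construction above might look like the following sketch. The `field=value` layout, the `|` separator and the helper names are assumptions, since the application only states that the key name is derived from the de-duplication index field and the index data segment and that the key value is derived from the time data segment. A dict again stands in for Redis; against a real server, a set-if-absent write (for example the `SET` command with its `NX` option) would be one way to combine the insert and the existence check.

```python
def make_key_name(record, index_fields):
    """Build a key name from index field names and their values (assumed layout)."""
    return "|".join(f"{field}={record[field]}" for field in index_fields)


def make_key_value(record, time_field):
    """Use the time data segment as the key value, stored as a string."""
    return str(record[time_field])


def insert_if_absent(store, key_name, key_value):
    """Return True if the key name was absent and the pair was written."""
    if key_name in store:
        return False
    store[key_name] = key_value
    return True


# Hypothetical record; passing_time is given as epoch seconds.
record = {"plate_number": "AXXXX", "plate_color": "blue",
          "passing_time": 1600000000}
store = {}
name = make_key_name(record, ("plate_number", "plate_color"))
value = make_key_value(record, "passing_time")
first = insert_if_absent(store, name, value)
second = insert_if_absent(store, name, value)
print(name)           # plate_number=AXXXX|plate_color=blue
print(first, second)  # True False
```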
In some embodiments, judging whether the difference between the time data segment and the time data segment of the last piece of traffic commission data cached in the Redis database is within the filtering duration, writing the traffic commission data into the Redis database as non-duplicate data if it is not, and otherwise discarding the traffic commission data as duplicate data and updating the time data segment of the last piece of traffic commission data cached in the Redis database to the time data segment, specifically includes:
judging whether the difference between the Redis key value and the Redis key value of the last piece of traffic commission data cached in the Redis database is within the filtering duration;
if it is not within the filtering duration, determining that the traffic commission data is not duplicate data, and writing the Redis key-value pair into the Redis database;
if it is within the filtering duration, determining that the traffic commission data is duplicate data, discarding it, and updating the Redis key value corresponding to the last piece of traffic commission data cached in the Redis database to the Redis key value.
In a second aspect, an embodiment of the present application further provides a device for de-duplication of traffic commission data, the device including:
the data type selection module is used for displaying a selection interface of the target data and receiving a data type selection operation through the selection interface;
the metadata acquisition module is used for determining the target data according to the data type selection operation and acquiring metadata corresponding to the target data;
the interface generation module is used for updating the selection interface according to the metadata and generating and displaying a configuration interface;
the configuration module is used for receiving a de-duplication parameter configuration operation through the configuration interface, determining a de-duplication index field and a de-duplication time field from the metadata according to the de-duplication parameter configuration operation, and setting a filtering duration corresponding to the de-duplication time field;
the receiving module is used for receiving the traffic commission data, wherein the traffic commission data comprises an index data segment of the de-duplication index field and a time data segment of the de-duplication time field;
and the de-duplication module is used for performing de-duplication processing on the traffic commission data according to the index data segment, the time data segment and the filtering duration based on the data cached in the Redis database, and then caching the traffic commission data to the Redis database.
In a third aspect, an embodiment of the present application further provides a server, including: a processor, a memory, a bus, and a computer program stored on the memory and executable on the processor, wherein the processor and the memory communicate with each other through the bus; the processor, when executing the computer program, implements the traffic commission data de-duplication method described above.
The beneficial effects are as follows. The embodiments of the application provide a method, a device and a server for de-duplication of traffic commission data. Target data of a given data type and the metadata of that target data are determined through a selection interface; the selection interface is updated to generate a configuration interface; a de-duplication index field and a de-duplication time field are selected from the metadata through the configuration interface, and a filtering duration corresponding to the de-duplication time field is set. Traffic commission data of the same type as the target data is then de-duplicated according to the index data segment of its de-duplication index field, the time data segment of its de-duplication time field, and the filtering duration. Because the configuration interface for de-duplication is generated dynamically from the initial selection interface, and the de-duplication index field and de-duplication time field are selected dynamically through it, the flexibility and accuracy of de-duplication are improved. In addition, de-duplication uses a Redis database, a key-value store that supports highly concurrent access, so duplicates in the received data are filtered effectively, de-duplication is faster, and the system is highly extensible and general.
Drawings
The technical solution and other advantageous effects of the present application will be made apparent by the following detailed description of the specific embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of a scenario of the traffic commission data de-duplication method according to an embodiment of the present application.
Fig. 2 is a flow chart of the traffic commission data de-duplication method according to an embodiment of the present application.
Fig. 3 is a schematic diagram of a selection interface of the traffic commission data de-duplication method according to an embodiment of the present application.
Fig. 4 is a schematic diagram of a configuration interface of the traffic commission data de-duplication method according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of the traffic commission data de-duplication device according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to fall within the scope of the application.
In the embodiments of the application, "for example," "example," and "such as" are used to mean "serving as an example, instance, or illustration." Any embodiment described herein as "for example," "example," or "such as" is not necessarily to be construed as preferred or advantageous over other embodiments. The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for purposes of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known structures and processes have not been described in detail so as not to obscure the description of the application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed herein.
The embodiments of the application provide a method, a device and a server for de-duplication of traffic commission data, each of which is described in detail below.
Referring to fig. 1, fig. 1 is a schematic diagram of a scenario of the traffic commission data de-duplication method according to an embodiment of the present application. The network architecture of the method comprises a data collector 11, a server 12 and a Redis database 13 connected in sequence. The data collector 11 collects real-time data and uploads the collected data to the server 12, the server 12 performs de-duplication processing on the data, and the Redis database 13 caches the intermediate data produced while the server 12 performs de-duplication. There may be more than one data collector 11, server 12 and Redis database 13. It should be noted that, in the embodiments of the present application, each Redis database 13 temporarily caches target data of one data type for a period of time.
The execution subject of the embodiments of the application is the server 12, which performs de-duplication processing on the data collected by the data collector 11 and caches the data in the Redis database 13. The Redis database 13 is an in-memory key-value database. During data de-duplication, the database can be operated on atomically through keys that are accessed with high concurrency, which keeps resource usage as low as possible in terms of both time and space.
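To illustrate why an atomic check-and-write matters under concurrent access, the sketch below emulates a set-if-absent operation with a lock-protected dict. The class `AtomicStore` and its method are hypothetical stand-ins; Redis itself executes each individual command atomically, and its `SET` command with the `NX` option provides comparable set-if-absent behavior without an application-side lock.

```python
import threading


class AtomicStore:
    """Hypothetical in-memory stand-in for an atomic key-value cache."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set_if_absent(self, key, value):
        """Write the pair only if the key is absent; check and write are one step."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True


store = AtomicStore()
results = []


def worker(timestamp):
    # Several uploads of the same vehicle race to register the first record.
    results.append(store.set_if_absent("plate_number=AXXXX", timestamp))


threads = [threading.Thread(target=worker, args=(t,)) for t in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(results.count(True))  # 1
```

Without the lock, two threads could both observe the key as absent and both treat their record as new data, which is the failure an atomic operation prevents.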
The embodiments of the application can be applied to big data scenarios in which the data is continuous in time, such as traffic operation scenarios, where motor vehicle data is continuous in time. The data collector at a checkpoint uploads the passing records of motor vehicles passing through the checkpoint to the server. When a motor vehicle moves slowly or stands still at the checkpoint, the data collector uploads several passing records for that vehicle within a short time, and these records differ only in their passing time. If only the first passing record of the vehicle within a given filtering duration is kept and the records uploaded afterwards are discarded as duplicate data, and the same is done in the next filtering duration, duplicate data is filtered out effectively and redundant data is not stored on the server.
It should be noted that the scenario of the traffic commission data de-duplication method shown in fig. 1 is only an example, given to describe the technical solution of the embodiments of the present application more clearly, and does not limit the technical solution provided by the embodiments of the present application.
It should be noted that the following description order of the embodiments is not a limitation of the preferred order of the embodiments.
Referring to fig. 2, fig. 2 is a flow chart of the traffic commission data de-duplication method according to an embodiment of the present application, where the method includes:
S1, displaying a selection interface of target data, and receiving a data type selection operation through the selection interface.
The target data is the data-source data that needs de-duplication processing. In a traffic operation scenario, for example, the target data includes traffic commission data such as motor vehicle data, non-motor vehicle data, portrait data, personnel data, wifi data and electronic fence data. These different types of target data are classified in advance, so that data-source data of a given data type can be selected as the target data through the classification. It will be appreciated that the classification may define template names, vendor names, class names and so on, to classify each data type in detail.
The selection interface is the initial background interface, used to determine a data type according to a data type selection operation and then to acquire the target data of that data type.
The data type selection operation is an operation that selects a data type.
Specifically, fig. 3 is a schematic diagram of a selection interface of the traffic commission data de-duplication method according to an embodiment of the present application. As shown in fig. 3, the selection interface is displayed, and target data that has been classified in advance can be selected through it. A data type selection operation is thus received through the selection interface, and the target data of the data type chosen by that operation is determined.
It should be noted that this step obtains target data that has been classified in advance. Data whose data type cannot be selected through the selection interface is not target data requiring de-duplication in the embodiments of the present application.
S2, determining the target data according to the data type selection operation, and acquiring metadata corresponding to the target data.
Metadata is data that describes data, that is, information describing the attributes of the data. For motor vehicle data, for example, the metadata consists of attributes such as device name, device code, track number, license plate type, license plate color, passing time and vehicle vanishing time. In other words, the metadata is the attribute information defined for the data type of the target data.
The metadata of the data type is defined according to the data type of the target data.
Specifically, as shown in fig. 3, a data type is selected by a data type selection operation, and metadata corresponding to the target data of the data type is displayed by a selection interface while the target data of the data type is acquired.
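Steps S1 and S2 amount to a lookup from a pre-classified data type to its metadata fields, which might be sketched as follows. The registry name, the identifier spellings and the wifi field list are hypothetical; only the motor vehicle attributes are taken from the description above.

```python
# Hypothetical registry of pre-classified data types and their metadata fields.
METADATA_BY_TYPE = {
    "motor_vehicle": [
        "device_name", "device_code", "track_number", "plate_type",
        "plate_color", "passing_time", "vanishing_time",
    ],
    # The wifi field list is invented purely for illustration.
    "wifi": ["device_code", "mac_address", "capture_time"],
}


def metadata_for(data_type):
    """Return the metadata fields for a selectable data type.

    Data types that were not classified in advance cannot be selected
    and are therefore not subject to de-duplication.
    """
    try:
        return list(METADATA_BY_TYPE[data_type])
    except KeyError:
        raise ValueError(f"data type {data_type!r} is not selectable") from None


fields = metadata_for("motor_vehicle")
print(fields[:2])  # ['device_name', 'device_code']
```

The returned field list is what a configuration interface would present for choosing the de-duplication index field and the de-duplication time field.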
S3, updating the selection interface according to the metadata, and generating and displaying a configuration interface.
Specifically, fig. 4 is a schematic diagram of a configuration interface of the traffic commission data de-duplication method according to an embodiment of the present application. As shown in fig. 3 and fig. 4, after a data type is determined through the selection interface and the metadata corresponding to the target data of that data type is obtained, the selection interface is updated, and the configuration interface is generated and displayed. That is, the configuration interface is the background operation interface obtained by updating the selection interface.
S4, receiving a de-duplication parameter configuration operation through the configuration interface, determining a de-duplication index field and a de-duplication time field from the metadata according to the de-duplication parameter configuration operation, and setting a filtering duration corresponding to the de-duplication time field.
It will be appreciated that, in order to de-duplicate target data, a de-duplication decision rule must be set for the target data, and whether the target data is duplicate data is decided according to that rule. The de-duplication index field and the de-duplication time field in the embodiments of the present application are the key fields of the de-duplication decision rule, and they may be stored in the Redis database for a longer specified time.
The de-duplication index field is a feature field that identifies attributes of the target data, the de-duplication time field is a feature field that identifies the time attribute of the target data, and the filtering duration is the fixed time range within which target data is judged to be duplicate data according to the de-duplication time field. For motor vehicle data, for example, the device name, device code, license plate number, license plate type, license plate color and the like can serve as the de-duplication index field, and the passing time, vehicle vanishing time and the like can serve as the de-duplication time field.
Specifically, as shown in fig. 4, in the configuration interface, the de-duplication index field and the de-duplication time field that make up the de-duplication decision rule are selected from the displayed metadata through a de-duplication parameter configuration operation, and a filtering duration corresponding to the de-duplication time field is set.
S5, receiving the traffic commission data, wherein the traffic commission data comprises an index data segment of the de-duplication index field and a time data segment of the de-duplication time field.
The index data segment is the field value of the de-duplication index field, and the time data segment is the field value of the de-duplication time field. For motor vehicle data, for example, if the license plate number is used as the de-duplication index field and the passing time as the de-duplication time field, then the field value of the license plate number (the actual plate number, for example "AXXXX") is the index data segment and the field value of the passing time (the actual passing time, for example 2020-09-13 10:00:00) is the time data segment. Note that the time data segment is generally a point in time, and all time data segments use a uniform timestamp format.
The traffic commission data is target data that includes an index data segment and a time data segment; for example, it includes "AXXXX" and "2020-09-13 10:00:00".
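Since the time data segments are expected to use a uniform timestamp, incoming values may need normalizing before they are compared. The sketch below converts a few input shapes to integer epoch seconds. The accepted string formats, the choice of epoch seconds and the treatment of naive times as UTC are all assumptions; the application does not specify them.

```python
from datetime import datetime, timezone

# Assumed input formats; collectors may well use others.
_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y%m%d%H%M%S")


def normalize_time(value):
    """Convert a time data segment to integer epoch seconds (UTC assumed)."""
    if isinstance(value, (int, float)):
        return int(value)  # already a numeric timestamp
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        return int(parsed.replace(tzinfo=timezone.utc).timestamp())
    raise ValueError(f"unrecognized time data segment: {value!r}")


a = normalize_time("2020-09-13 10:00:00")
b = normalize_time("20200913100031")
print(b - a)  # 31
```

Once every time data segment is an integer, the difference between two records can be compared directly with the filtering duration.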
Specifically, the traffic commission data is received and then de-duplicated according to its index data segment and time data segment.
It can be understood that the data type of the traffic commission data should be the same as the data type of the target data. That is, after the de-duplication decision rule is configured for target data of a given data type, traffic commission data of that data type is de-duplicated.
S6, based on the data cached in the Redis database, performing de-duplication processing on the traffic commission data according to the index data segment, the time data segment and the filtering duration, and then caching the traffic commission data to the Redis database.
It should be noted that, in the embodiments of the present application, the de-duplication decision rule is as follows: when the index data segments of two pieces of traffic commission data are the same and the difference between their time data segments is within the filtering duration, the two pieces of traffic commission data are judged to be duplicate data.
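Written as a formula, the rule reads as follows. The symbols are introduced here for illustration only: $k_a, k_b$ denote the index data segments of two pieces of traffic commission data, $t_a, t_b$ their time data segments, and $T$ the filtering duration. "Within" is read as inclusive, in line with the example in which a 30 s difference is discarded and a 31 s difference is kept.

```latex
\mathrm{duplicate}(a, b) \iff \left( k_a = k_b \right) \wedge \left( \lvert t_a - t_b \rvert \le T \right)
```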
For example, for motor vehicle data, the license plate number is used as the deduplication index field and the passing time is used as the deduplication time field. If the motor vehicle with the license plate number jaw AXXXXX passes through the checkpoint between 2020-09-13 10:00:00 and 2020-09-13 10:01:00 and the filtering duration is set to 30s, then, according to the deduplication decision rule, jaw AXXXXX is the index data segment, and a given time between 2020-09-13 10:00:00 and 2020-09-13 10:01:00 is the time data segment.
Assume that traffic delegation data for passing the checkpoint at 2020-09-13 10:00:00 exists, and that between 2020-09-13 10:00:00 and 2020-09-13 10:00:30 several pieces of traffic delegation data with the index data segment jaw AXXXXX are accessed, with time data segments ranging from 2020-09-13 10:00:00 to 2020-09-13 10:00:30. The traffic delegation data with the time data segment 2020-09-13 10:00:00 is written into the redis database, and the traffic delegation data with time data segments from 2020-09-13 10:00:01 to 2020-09-13 10:00:30 is discarded as repeated data. Next, the traffic delegation data of 2020-09-13 10:00:31 differs by more than 30s from the time data segment of the traffic delegation data of 2020-09-13 10:00:00 cached in the redis database, so it is not repeated data and is written into the redis database as new data; the traffic delegation data of 2020-09-13 10:00:32 to 2020-09-13 10:01:00 is then discarded as repeated data, and so on.
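The worked example can be replayed as a short simulation. This is an illustrative sketch under stated assumptions rather than the implementation of the embodiment: a Python dict stands in for the redis database, the function name `deduplicate` is hypothetical, one record per second is assumed, and the cached time is replaced only when a record is written as new data, which is the behavior the example describes.

```python
from datetime import datetime, timedelta


def deduplicate(records, filter_seconds):
    """Keep the first record per index data segment and drop later records
    whose time data segment lies within filter_seconds of the last kept one."""
    last_kept = {}  # stand-in for redis: index data segment -> last written time
    kept = []
    for index_segment, time_segment in records:
        previous = last_kept.get(index_segment)
        if previous is not None:
            gap = abs((time_segment - previous).total_seconds())
            if gap <= filter_seconds:
                continue  # repeated data: discard
        last_kept[index_segment] = time_segment  # new data: write to the cache
        kept.append((index_segment, time_segment))
    return kept


# One record per second from 10:00:00 to 10:01:00 for the same license plate.
start = datetime(2020, 9, 13, 10, 0, 0)
stream = [("jaw AXXXXX", start + timedelta(seconds=s)) for s in range(61)]
kept_times = [t.strftime("%H:%M:%S") for _, t in deduplicate(stream, 30)]
print(kept_times)  # ['10:00:00', '10:00:31']
```

Of the 61 records, only the ones at 10:00:00 and 10:00:31 are written, as in the example.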
Specifically, all deduplicated traffic delegation data is cached in the redis database. Deduplicating the traffic delegation data according to the index data segment and the time data segment means comparing the index data segment and the time data segment of the traffic delegation data with the index data segments and time data segments cached in the redis database to judge whether the traffic delegation data is repeated data, and then either discarding it or writing it into the redis database as new data.
It should be noted that, in the traffic operation scenario, traffic delegation data is accessed piece by piece. Therefore, when judging whether the time data segment of the currently accessed traffic delegation data exceeds the filtering duration, that time data segment is compared with the time data segment of the last piece of traffic delegation data with the same index data segment cached in the redis database; that is, it is judged whether the difference between the two exceeds the filtering duration.
According to the method for de-duplication of traffic delegation data provided by the embodiment of the present application, target data of a certain data type and the metadata of the target data are determined through the selection interface; the selection interface is then updated to generate the configuration interface; the deduplication index field and the deduplication time field used for deduplication processing are selected from the metadata through the configuration interface, and the filtering duration corresponding to the deduplication time field is set. Finally, based on the data cached in the redis database, received traffic delegation data of the same type as the target data is deduplicated according to the index data segment of its deduplication index field, the time data segment of its deduplication time field, and the filtering duration, so that repeated data with close receiving times can be effectively filtered out.
In one embodiment, step S2 specifically includes the steps of:
S21, selecting the data type according to the data type selection operation, and determining target data corresponding to the data type.
S22, acquiring the metadata corresponding to the target data according to the data type.
The selection interface is provided with data type selection controls. The data type selection operation comprises selecting a template name, a manufacturer and a classification name, so a selection control is provided for each of the template name, the manufacturer and the classification name; each selection control is a drop-down box control.
Specifically, as shown in fig. 3, in the selection interface, after a certain data type is selected through the data type selection operation, it is determined that the target data of the data type is subjected to deduplication processing, and according to the data type, the selection interface displays all metadata corresponding to the target data of the data type. The metadata includes names, fields, and types, for example, "device code DeviceID string", "passing time Passtime string", and the like.
In one embodiment, step S3 specifically includes the steps of:
S31, displaying the metadata, setting a field index frame corresponding to the metadata, a field selection frame corresponding to each metadata, and a field confirmation frame.
S32, updating the selection interface, and generating and displaying the configuration interface.
The field index box is a text box control, the field selection box is a selection box control, and the field confirmation box is a button control. As shown in fig. 3, the field acknowledgement boxes specifically include a deduplication index field acknowledgement box and a deduplication time field acknowledgement box.
Specifically, as shown in fig. 3 and 4, a field index frame for searching metadata, a field selection frame for selecting each metadata, and a field confirmation frame for confirming that the selected metadata is used as a deduplication index field and a deduplication time field are set in a selection interface, and then the selection interface is updated to generate a display control of the deduplication index field, the deduplication time field, and a filtering duration, so that a configuration interface for deduplication processing is dynamically generated. The display control is a textbox control.
In addition, as shown in fig. 4, when the display controls for the deduplication index field, the deduplication time field and the filtering duration of the configuration interface are generated, deletion controls for the deduplication index field and the deduplication time field are also generated. That is, a selected deduplication index field or deduplication time field can be deleted through its deletion control and then reselected through the field index frame, the field selection frame and the field confirmation frame of the metadata. The deletion control is a button control.
In one embodiment, the determining the de-duplication index field and the de-duplication time field according to the metadata in step S4, and setting a filtering duration corresponding to the de-duplication time field specifically includes the following steps:
S41, searching for the de-duplication index field and the de-duplication time field through the field index box control.
S42, selecting the duplicate removal index field and the duplicate removal time field through the field selection box control.
S43, confirming the duplicate removal index field and the duplicate removal time field through the field confirming box control.
S44, displaying the de-duplication index field and the de-duplication time field, and setting a filter time length corresponding to the de-duplication time field.
Specifically, as shown in fig. 4, the de-duplication index field and the de-duplication time field are searched for through the field index frame and selected through the field selection frame. After they are confirmed through the field confirmation frame, the de-duplication index field is displayed in the display control of the de-duplication index field, and the de-duplication time field is displayed in the display control of the de-duplication time field. At the same time, the filtering duration corresponding to the de-duplication time field is set in the display control of the filtering duration. In other words, the configuration interface displays the de-duplication index field and the de-duplication time field that serve as the de-duplication judgment rule and requires the corresponding filtering duration to be set, thereby confirming the de-duplication index field, the de-duplication time field and the filtering duration used by the de-duplication judgment rule.
It will be appreciated that if the deduplication index field and the deduplication time field are selected directly by the field selection box, then the deduplication index field and the deduplication time field need not be searched by the field index box.
It should be noted that the deduplication index field includes at least one metadata field, and the deduplication time field includes only one metadata field. As shown in fig. 4, for the de-duplication index field, a field of the first metadata is selected as a de-duplication index field 1, a field of the second metadata is selected as a de-duplication index field 2, and so on.
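The configuration produced by steps S41 to S44 amounts to three parameters with the constraints just stated. The sketch below shows one possible in-memory representation; it is an assumption for illustration, not a structure defined by the application. The class name `DedupConfig`, the field name `PlateNo`, and the requirement that the filtering duration be positive are hypothetical, while `DeviceID` and `Passtime` are the metadata fields mentioned earlier.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DedupConfig:
    """Deduplication parameters chosen on the configuration interface."""
    index_fields: Tuple[str, ...]  # at least one metadata field
    time_field: str                # exactly one metadata field
    filter_seconds: float          # filtering duration for the time field

    def __post_init__(self):
        if len(self.index_fields) == 0:
            raise ValueError("at least one deduplication index field is required")
        if not self.time_field:
            raise ValueError("a deduplication time field is required")
        if self.filter_seconds <= 0:  # assumption: duration must be positive
            raise ValueError("the filtering duration must be positive")

    def index_segment(self, record):
        """Collect the field values of all index fields, in configured order."""
        return tuple(record[name] for name in self.index_fields)


config = DedupConfig(("PlateNo", "DeviceID"), "Passtime", 30)
record = {"PlateNo": "jaw AXXXXX", "DeviceID": "BXXXXX",
          "Passtime": "2020-09-13 10:00:00"}
print(config.index_segment(record))  # ('jaw AXXXXX', 'BXXXXX')
```

Holding the index fields in an ordered tuple keeps the combined index data segment stable from record to record, which matters when it is later used to build a key name.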
In one embodiment, step S6 specifically includes the steps of:
S61, judging whether the data cached in the redis database comprises the index data segment, if the redis database does not comprise the index data segment, determining that the traffic delegation data is not repeated data and writing the traffic delegation data into the redis database.
The embodiment of the present application uses the redis database to cache intermediate data during deduplication processing, and can exploit the highly concurrent key-value access of the redis database to increase the speed of deduplication processing.
Specifically, when the latest traffic delegation data is received, its index data segment and time data segment can be determined. It is first judged whether the data cached in the redis database includes the index data segment; if the index data segment is not cached in the redis database, the traffic delegation data is not repeated data and is written into the redis database as new data.
For example, suppose a motor vehicle with the license plate number jaw AXXXXX passes through the checkpoint between 2020-09-13 10:00:00 and 2020-09-13 10:01:00, and the filtering duration is set to 30s.
When traffic delegation data is received for the motor vehicle with the license plate number jaw AXXXXX passing through the checkpoint at 2020-09-13 10:00:05, it is first judged whether traffic delegation data including jaw AXXXXX exists in the redis database; if not, the traffic delegation data is judged not to be repeated data and is written into the redis database as new data.
S62, if the data cached in the redis database comprises the index data segment, judging whether the difference value between the time data segment and the time data segment of the last piece of traffic delegation data cached in the redis database is within the filtering duration, if not, determining that the traffic delegation data is not repeated data, and writing the traffic delegation data into the redis database; and if the time is within the filtering time, determining that the traffic commission data is repeated data and discarding, and updating the time data segment of the last traffic commission data cached in the redis database into the time data segment.
Specifically, if the index data segment of the traffic delegation data is found in step S61 to be cached in the redis database, it is further judged whether the time data segment of the traffic delegation data exceeds the filtering duration. That is, the traffic delegation data corresponding to the index data segment, namely the last piece of such traffic delegation data cached in the redis database, is looked up, and the time data segment of the newly received traffic delegation data is compared with the time data segment of that last cached piece. If the difference between the two does not exceed the filtering duration, the traffic delegation data is determined to be repeated data. If the difference exceeds the filtering duration, the traffic delegation data is determined not to be repeated data and is written into the redis database as new data, where it serves as the last cached piece of traffic delegation data for the next piece of traffic delegation data to be received.
For example, continuing the example of step S61, suppose that traffic delegation data for the motor vehicle with the license plate number jaw AXXXXX passing through the checkpoint at 2020-09-13 10:00:00 is already cached. When the traffic delegation data for 2020-09-13 10:00:05 is received, traffic delegation data including "jaw AXXXXX" exists in the redis database, so 2020-09-13 10:00:05 is further compared with 2020-09-13 10:00:00. The time difference is 5s, which does not exceed the filtering duration of 30s, so the traffic delegation data for the motor vehicle jaw AXXXXX passing through the checkpoint at 2020-09-13 10:00:05 is removed as repeated data. Next, when traffic delegation data is received for the motor vehicle with the license plate number jaw AXXXXX passing through the checkpoint at 2020-09-13 10:00:31, its time data segment is compared with that of the cached traffic delegation data for 2020-09-13 10:00:00 to judge whether it is repeated data. Because 2020-09-13 10:00:31 differs from 2020-09-13 10:00:00 by more than 30s, the traffic delegation data is not repeated data and is written into the redis database as new data.
In one embodiment, step S62 specifically includes the steps of:
S621, generating a redis key name corresponding to the traffic delegation data by using the deduplication index field and the index data segment.
S622, generating a redis key value corresponding to the traffic delegation data by the time data segment.
For example, for a piece of traffic delegation data whose index data segment is jaw AXXXXX and whose time data segment is 2020-09-13 10:00:05, "prefix_license plate number field name_jaw AXXXXX" may be used as the redis key name of the piece of traffic delegation data, and "2020-09-13 10:00:05" may be used as its redis key value.
It should be noted that, if fields of a plurality of metadata are selected as deduplication index fields, the field names and field values of the plurality of metadata are arranged in a preset form to generate the final index data segment. The preset form may be: "prefix_deduplication index field 1 field name|deduplication index field 1 field value,deduplication index field 2 field name|deduplication index field 2 field value". For example, for a piece of traffic delegation data including the index data segment "jaw AXXXXX" and the index data segment "device name BXXXXX", "prefix_lanno|jaw AXXXXX,deviceID|BXXXXX" is used as the redis key name corresponding to the piece of traffic delegation data. The preset form is merely an example, and embodiments of the present application are not limited thereto.
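The preset form for several index fields can be sketched as a small helper. This is only an illustration: the function name `build_key_name` is hypothetical, and the exact separators (an underscore after the prefix, `|` between field name and field value, a comma between fields, no surrounding spaces) are an assumed reading of the example form, which the text itself says is not limiting.

```python
def build_key_name(prefix, index_fields):
    """Build a redis key name from (field name, field value) pairs.

    index_fields must be given in a fixed order so that the same record
    always produces the same key name.
    """
    parts = [f"{name}|{value}" for name, value in index_fields]
    return f"{prefix}_" + ",".join(parts)


key_name = build_key_name(
    "prefix", [("lanno", "jaw AXXXXX"), ("deviceID", "BXXXXX")]
)
print(key_name)  # prefix_lanno|jaw AXXXXX,deviceID|BXXXXX
```

Including the field names as well as the field values in the key name keeps records from different index-field configurations from colliding on the same key.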
S623, forming the redis key name and the redis key value into a redis key-value pair corresponding to the traffic delegation data.
For example, the redis key name "prefix_license plate number field name_jaw AXXXXX" and the redis key value "2020-09-13 10:00:05" are combined into the key-value pair (prefix_license plate number field name_jaw AXXXXX, 2020-09-13 10:00:05), which serves as the redis key-value pair used for the duplicate judgment of the piece of traffic delegation data.
S624, inserting the redis key-value pair into the redis database, and judging whether the data cached in the redis database comprises the redis key name.
S625, if the redis key name is not included, determining that the traffic commission data is not repeated data, and writing the redis key value pair into the redis database.
For example, the key-value pair (prefix_license plate number field name_jaw AXXXXX, 2020-09-13 10:00:05) is inserted into the redis database, and it is judged whether the key name "prefix_license plate number field name_jaw AXXXXX" exists in the redis database. If not, the traffic delegation data is determined not to be repeated data, and the key-value pair (prefix_license plate number field name_jaw AXXXXX, 2020-09-13 10:00:05) is written into the redis database.
In one embodiment, step S63 specifically includes the steps of:
S631, judging whether the difference value between the redis key value and the redis key value of the last piece of the traffic delegation data cached in the redis database is within the filtering duration.
S632, if not in the filtering duration, determining that the traffic commission data is not repeated data, and writing the redis key value pair into the redis database.
S633, if within the filtering duration, determining that the traffic delegation data is repeated data and discarding it, and updating the redis key value corresponding to the last piece of traffic delegation data cached in the redis database to the redis key value.
For example, continuing the example of step S625, if the key name "prefix_license plate number field name_jaw AXXXXX" exists in the redis database, the redis key value "2020-09-13 10:00:05" is further compared with the redis key value "2020-09-13 10:00:00" of the last piece of traffic delegation data cached in the redis database. The difference between the two is 5s, which does not exceed the filtering duration of 30s, so the traffic delegation data with the index data segment jaw AXXXXX and the time data segment 2020-09-13 10:00:05 is removed as repeated data.
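The key-value flow of steps S621 to S633 can be sketched end to end. The code below is a hedged illustration, not the implementation of the embodiment: a Python dict stands in for the redis database, and the function name `process` and the timestamp format string are assumptions. The description updates the cached key value when a record is discarded (step S633), whereas the worked examples and claim 1 update it when a record is written; the hypothetical `refresh_on_duplicate` flag exposes both behaviors and defaults to the latter.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed uniform timestamp format


def process(cache, key_name, key_value, filter_seconds,
            refresh_on_duplicate=False):
    """Return True if the record is written as new data, False if discarded.

    cache is a dict standing in for redis: key name -> last cached key value.
    """
    cached_value = cache.get(key_name)
    if cached_value is None:  # key name not cached yet: new data
        cache[key_name] = key_value
        return True
    gap = abs((datetime.strptime(key_value, TIME_FORMAT)
               - datetime.strptime(cached_value, TIME_FORMAT)).total_seconds())
    if gap > filter_seconds:  # outside the filtering duration: new data
        cache[key_name] = key_value
        return True
    if refresh_on_duplicate:  # variant described in step S633
        cache[key_name] = key_value
    return False  # within the filtering duration: repeated data


cache = {}
key = "prefix_license plate number field name_jaw AXXXXX"
results = [process(cache, key, stamp, 30) for stamp in
           ("2020-09-13 10:00:00", "2020-09-13 10:00:05", "2020-09-13 10:00:31")]
print(results)  # [True, False, True]
```

With `refresh_on_duplicate=True` every discarded record extends the window, so a steady stream of records spaced closer than the filtering duration would never produce a second written record; with the default, one record is written roughly once per filtering duration.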
Based on the above embodiments, a complete embodiment is provided below to describe in detail the implementation process of the method for de-duplication of traffic delegation data:
First, a selection interface of target data is displayed. A template name, a manufacturer name and a classification name are selected on the selection interface through a data type selection operation, thereby selecting target data of a certain data type and determining that the target data of that data type is to be deduplicated. The selection interface displays the metadata corresponding to the target data according to the data type, where the metadata includes the field, name and type of each metadata item. The selection interface is then updated according to the metadata to generate and display a configuration interface.
Then, on the configuration interface, a deduplication index field and a deduplication time field are determined from the metadata, and a filtering duration corresponding to the deduplication time field is set. The deduplication index field, the deduplication time field and the filtering duration serve as the deduplication parameters of a deduplication judgment rule, where the deduplication judgment rule is: when the index data segments of two pieces of traffic delegation data are the same and the difference between their time data segments is within the filtering duration, the two pieces of traffic delegation data are judged to be repeated data.
Finally, traffic delegation data is received. The data type of the traffic delegation data is the same as the data type of the target data, and the traffic delegation data is real-time data. The traffic delegation data includes an index data segment of the deduplication index field (namely the field value of the deduplication index field) and a time data segment of the deduplication time field (namely the field value of the deduplication time field). The deduplication index field and the index data segment are used to generate the redis key name corresponding to the traffic delegation data, the time data segment is used to generate the redis key value corresponding to the traffic delegation data, and the redis key name and the redis key value together form the redis key-value pair corresponding to the traffic delegation data. The redis key-value pair of the traffic delegation data is then compared with the data cached in the redis database. Specifically, it is judged whether the redis key name of the traffic delegation data exists in the redis database; if not, the traffic delegation data is determined not to be repeated data and is written into the redis database as new data. If so, it is further judged whether the difference between the redis key value of the traffic delegation data and the redis key value of the last piece of traffic delegation data cached in the redis database is within the filtering duration; if so, the traffic delegation data is determined to be repeated data and is discarded, and if not, the traffic delegation data is determined not to be repeated data and is written into the redis database as new data.
This procedure is repeated for each period of the filtering duration, so that the accessed traffic delegation data is deduplicated and then cached in the redis database.
When the method for de-duplication of traffic delegation data provided by the embodiment of the present application was verified in an actual project, before the method was used, the WiFi electronic fence equipment in a certain city generated about 3 million records per hour (the text also gives a figure of 40 million for the same equipment), which occupied a large amount of the storage capacity of the original storage system. After the method was used, more than 50% of the repeated data could be filtered out by reasonably configuring the deduplication index field, the deduplication time field and the filtering duration. Because the deduplication process uses the highly concurrent key-value access of the redis database, deduplication is fast and the efficiency of data access is essentially unaffected.
On the basis of the method described in the foregoing embodiment, the embodiment of the present application is further described from the perspective of a device for de-duplication of traffic delegation data. Referring to fig. 5, fig. 5 is a schematic structural diagram of the device for de-duplication of traffic delegation data provided in the embodiment of the present application, where the device includes:
the data type selection module 51 is configured to display a selection interface of the target data, and receive a data type selection operation through the selection interface;
a metadata obtaining module 52, configured to determine the target data according to the data type selection operation, and obtain metadata corresponding to the target data;
An interface generating module 53, configured to update the selection interface according to the metadata, and generate and display a configuration interface;
a configuration module 54, configured to receive a deduplication parameter configuration operation through the configuration interface, and determine a deduplication index field and a deduplication time field according to the metadata according to the deduplication parameter configuration operation;
a receiving module 55, configured to receive traffic delegation data, where the traffic delegation data includes an index data segment of the deduplication index field, and a time data segment of the deduplication time field;
the deduplication module 56 is configured to perform deduplication processing on the traffic commission data according to the index data segment and the time data segment based on the data cached in the redis database, and then cache the traffic commission data in the redis database.
According to the device for de-duplication of traffic delegation data provided by the embodiment of the present application, target data of a certain data type and the metadata of the target data are determined through the selection interface; the selection interface is then updated to generate the configuration interface; the deduplication index field and the deduplication time field used for deduplication processing are selected from the metadata through the configuration interface, and the filtering duration corresponding to the deduplication time field is set. Finally, based on the data cached in the redis database, received traffic delegation data of the same type as the target data is deduplicated according to the index data segment of its deduplication index field, the time data segment of its deduplication time field, and the filtering duration. The configuration interface for deduplication can be generated dynamically from the initial selection interface, and the deduplication index field and the deduplication time field can be selected dynamically through the configuration interface, which improves the flexibility and accuracy of deduplication processing. Meanwhile, because the redis database with highly concurrent key-value access is used in the deduplication processing, repeated data with close receiving times can be effectively filtered out, the speed of deduplication processing is improved, and the system is highly extensible.
In one embodiment, the metadata obtaining module 52 is further configured to select the data type according to the data type selection operation, and determine target data corresponding to the data type; and acquiring the metadata corresponding to the target data according to the data type.
In one embodiment, the interface generating module 53 is further configured to display the metadata, set a field index frame corresponding to the metadata, a field selection frame corresponding to each metadata, and a field confirmation frame; updating the selection interface, and generating and displaying the configuration interface.
In one embodiment, configuration module 54 is further configured to search for the deduplication index field and the deduplication time field through the field index box; selecting the de-duplication index field and the de-duplication time field through the field selection frame; confirming the duplicate-removal-index field and the duplicate-removal-time field through the field confirmation frame; and displaying the de-duplication index field and the de-duplication time field, and setting a filtering duration corresponding to the de-duplication time field.
In one embodiment, the deduplication module 56 is further configured to determine whether the data cached in the redis database includes the index data segment, and if the redis database does not include the index data segment, determine that the traffic delegation data is not repeated data and write the traffic delegation data into the redis database; if the data cached in the redis database includes the index data segment, judge whether the difference between the time data segment and the time data segment of the last piece of traffic delegation data cached in the redis database is within the filtering duration, and if not, determine that the traffic delegation data is not repeated data and write the traffic delegation data into the redis database; and if within the filtering duration, determine that the traffic delegation data is repeated data and discard it, and update the time data segment of the last piece of traffic delegation data cached in the redis database to the time data segment.
In one embodiment, referring to fig. 6, fig. 6 is a schematic structural diagram of a server according to an embodiment of the present application, where the server includes a processor 601, a memory 602, and a bus 603, where the processor 601 and the memory 602 complete communication with each other through the bus 603; the memory 602 stores program instructions executable by the processor 601 to perform the methods provided by the method embodiments described above, for example, including: displaying a selection interface of target data, and receiving data type selection operation through the selection interface; determining the target data according to the data type selection operation, and acquiring metadata corresponding to the target data; updating the selection interface according to the metadata, and generating and displaying a configuration interface; receiving a de-duplication parameter configuration operation through the configuration interface, determining a de-duplication index field and a de-duplication time field according to the metadata according to the de-duplication parameter configuration operation, and setting a filtering duration corresponding to the de-duplication time field; receiving traffic delegation data, wherein the traffic delegation data comprises an index data segment of the deduplication index field and a time data segment of the deduplication time field; based on the data cached in the redis database, according to the index data segment, the time data segment and the filtering duration, the traffic commission data is cached to the redis database after being subjected to de-duplication processing.
In one embodiment, an embodiment of the present application provides a non-transitory computer-readable storage medium storing computer instructions that cause a computer to perform the methods provided by the method embodiments described above, for example, including: displaying a selection interface of target data, and receiving data type selection operation through the selection interface; determining the target data according to the data type selection operation, and acquiring metadata corresponding to the target data; updating the selection interface according to the metadata, and generating and displaying a configuration interface; receiving a de-duplication parameter configuration operation through the configuration interface, determining a de-duplication index field and a de-duplication time field according to the metadata according to the de-duplication parameter configuration operation, and setting a filtering duration corresponding to the de-duplication time field; receiving traffic delegation data, wherein the traffic delegation data comprises an index data segment of the deduplication index field and a time data segment of the deduplication time field; based on the data cached in the redis database, according to the index data segment, the time data segment and the filtering duration, the traffic commission data is cached to the redis database after being subjected to de-duplication processing.
The computer storage medium may include: a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
Because the instructions stored in the computer storage medium can execute the steps of any of the methods provided in the embodiments of the present application, they can achieve the beneficial effects of any of those methods; these are described in detail in the previous embodiments and are not repeated here.
The method, the device and the server for de-duplication of the traffic and delegation data provided by the embodiment of the application are described in detail, and specific examples are applied to the description of the principle and the implementation mode of the application, and the description of the above embodiment is only used for helping to understand the technical scheme and the core idea of the application; those of ordinary skill in the art will appreciate that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.
Claims (9)
1. A method for de-duplication of traffic delegation data, comprising:
displaying a selection interface for target data, and receiving a data type selection operation through the selection interface;
determining the target data according to the data type selection operation, and acquiring metadata corresponding to the target data;
updating the selection interface according to the metadata, and generating and displaying a configuration interface;
receiving a de-duplication parameter configuration operation through the configuration interface, determining a de-duplication index field and a de-duplication time field from the metadata according to the de-duplication parameter configuration operation, and setting a filtering duration corresponding to the de-duplication time field;
receiving traffic delegation data, wherein the traffic delegation data comprises an index data segment of the deduplication index field and a time data segment of the deduplication time field;
based on the data cached in the redis database, performing de-duplication processing on the traffic delegation data according to the index data segment, the time data segment and the filtering duration, and then caching the traffic delegation data to the redis database, specifically:
judging whether the data cached in the redis database includes the index data segment; if the data cached in the redis database does not include the index data segment, determining that the traffic delegation data is not duplicate data and writing the traffic delegation data into the redis database;
if the data cached in the redis database includes the index data segment, judging whether the difference between the time data segment and the time data segment of the last piece of traffic delegation data cached in the redis database is within the filtering duration; if the difference is not within the filtering duration, determining that the traffic delegation data is not duplicate data, writing the traffic delegation data into the redis database, and updating the time data segment of the last piece of traffic delegation data cached in the redis database to the time data segment; and if the difference is within the filtering duration, determining that the traffic delegation data is duplicate data and discarding it.
2. The method for de-duplication of traffic delegation data according to claim 1, wherein determining the target data according to the data type selection operation and acquiring metadata corresponding to the target data specifically comprises:
selecting the data type according to the data type selection operation, and determining target data corresponding to the data type;
and acquiring the metadata corresponding to the target data according to the data type.
3. The method for de-duplication of traffic delegation data according to claim 1, wherein updating the selection interface according to the metadata, and generating and displaying a configuration interface, specifically comprises:
displaying the metadata, and setting a field index frame corresponding to the metadata, a field selection frame corresponding to each item of metadata, and a field confirmation frame;
updating the selection interface, and generating and displaying the configuration interface.
4. The method for de-duplication of traffic delegation data according to claim 3, wherein determining the de-duplication index field and the de-duplication time field from the metadata, and setting a filtering duration corresponding to the de-duplication time field, specifically comprises:
searching the de-duplication index field and the de-duplication time field through the field index frame;
selecting the de-duplication index field and the de-duplication time field through the field selection frame;
confirming the de-duplication index field and the de-duplication time field through the field confirmation frame;
and displaying the de-duplication index field and the de-duplication time field, and setting a filtering duration corresponding to the de-duplication time field.
5. The method for de-duplication of traffic delegation data according to claim 4, wherein the de-duplication index field comprises at least one field of the metadata, and the de-duplication time field comprises only one field of the metadata.
6. The method for de-duplication of traffic delegation data according to claim 1, wherein judging whether the data cached in the redis database includes the index data segment, and, if the data cached in the redis database does not include the index data segment, determining that the traffic delegation data is not duplicate data and writing the traffic delegation data into the redis database, specifically comprises:
generating a redis key name corresponding to the traffic delegation data from the de-duplication index field and the index data segment;
generating a redis key value corresponding to the traffic delegation data from the time data segment;
forming, from the redis key name and the redis key value, a redis key-value pair for duplicate judgment corresponding to the traffic delegation data;
inserting the redis key-value pair into the redis database, and judging whether the data cached in the redis database includes the redis key name;
and if the redis key name is not included, determining that the traffic delegation data is not duplicate data, and writing the redis key-value pair into the redis database.
7. The method for de-duplication of traffic delegation data according to claim 6, wherein judging whether the difference between the time data segment and the time data segment of the last piece of traffic delegation data cached in the redis database is within the filtering duration; if the difference is not within the filtering duration, determining that the traffic delegation data is not duplicate data, writing the traffic delegation data into the redis database, and updating the time data segment of the last piece of traffic delegation data cached in the redis database to the time data segment; and if the difference is within the filtering duration, determining that the traffic delegation data is duplicate data and discarding it, specifically comprises:
judging whether the difference between the redis key value and the redis key value of the last piece of traffic delegation data cached in the redis database is within the filtering duration;
if the difference is not within the filtering duration, determining that the traffic delegation data is not duplicate data, writing the redis key-value pair into the redis database, and updating the redis key value corresponding to the last piece of traffic delegation data cached in the redis database to the redis key value;
and if the difference is within the filtering duration, determining that the traffic delegation data is duplicate data and discarding it.
8. A traffic delegation data deduplication apparatus, comprising:
the data type selection module is used for displaying a selection interface of the target data and receiving data type selection operation through the selection interface;
the metadata acquisition module is used for determining the target data according to the data type selection operation and acquiring metadata corresponding to the target data;
the interface generation module is used for updating the selection interface according to the metadata and generating and displaying a configuration interface;
the configuration module is used for receiving a de-duplication parameter configuration operation through the configuration interface, determining a de-duplication index field and a de-duplication time field from the metadata according to the de-duplication parameter configuration operation, and setting a filtering duration corresponding to the de-duplication time field;
the receiving module is used for receiving traffic delegation data, wherein the traffic delegation data comprises an index data segment of the de-duplication index field and a time data segment of the de-duplication time field;
the de-duplication module is used for performing de-duplication processing on the traffic delegation data, based on the data cached in the redis database and according to the index data segment, the time data segment and the filtering duration, and then caching the traffic delegation data to the redis database, and is specifically used for: judging whether the data cached in the redis database includes the index data segment; if the data cached in the redis database does not include the index data segment, determining that the traffic delegation data is not duplicate data and writing the traffic delegation data into the redis database; if the data cached in the redis database includes the index data segment, judging whether the difference between the time data segment and the time data segment of the last piece of traffic delegation data cached in the redis database is within the filtering duration; if the difference is not within the filtering duration, determining that the traffic delegation data is not duplicate data, writing the traffic delegation data into the redis database, and updating the time data segment of the last piece of traffic delegation data cached in the redis database to the time data segment; and if the difference is within the filtering duration, determining that the traffic delegation data is duplicate data and discarding it.
9. A server, comprising: a processor, a memory, a bus, and a computer program stored on the memory and executable on the processor, wherein the processor and the memory communicate with each other through the bus; and the processor, when executing the computer program, implements the method for de-duplication of traffic delegation data according to any one of claims 1-7.
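The key construction recited in claims 5 and 6 can be illustrated with a short sketch. The class `DedupConfig`, the function `build_key_value`, the `dedup:` prefix, the `name=value` layout and the field names in the usage note are all hypothetical; the claims only state that the key name is generated from the de-duplication index field and the index data segment, and the key value from the time data segment.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class DedupConfig:
    """De-duplication parameters chosen on the configuration interface."""

    index_fields: Tuple[str, ...]  # at least one metadata field
    time_field: str                # exactly one metadata field
    filter_seconds: float          # filtering duration (unit is an assumption)

    def __post_init__(self) -> None:
        if not self.index_fields:
            raise ValueError("at least one de-duplication index field is required")
        if self.filter_seconds < 0:
            raise ValueError("filtering duration must not be negative")


def build_key_value(record: Dict[str, Any], config: DedupConfig) -> Tuple[str, float]:
    """Build the (key name, key value) pair used for duplicate judgment."""
    # Key name: each index field name joined with its index data segment.
    parts = [f"{name}={record[name]}" for name in config.index_fields]
    key_name = "dedup:" + "|".join(parts)  # prefix and separators are hypothetical
    # Key value: the time data segment of the configured time field.
    key_value = float(record[config.time_field])
    return key_name, key_value
```

For example, with index fields `("plate", "device_id")` and time field `pass_time`, a record `{"plate": "A123", "device_id": "cam7", "pass_time": 1600000000}` yields the key name `dedup:plate=A123|device_id=cam7` and the key value `1600000000.0`. Fields that are not configured as index fields do not affect the key name.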
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011087893.XA CN112306998B (en) | 2020-10-13 | 2020-10-13 | Method, device and server for de-duplication of traffic and delegation data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112306998A (en) | 2021-02-02 |
CN112306998B (en) | 2023-11-24 |
Family
ID=74489774
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011087893.XA Active CN112306998B (en) | 2020-10-13 | 2020-10-13 | Method, device and server for de-duplication of traffic and delegation data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112306998B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114398405A (en) * | 2022-01-18 | 2022-04-26 | 政采云有限公司 | Method, device and medium for processing repeated data |
CN114372064B (en) * | 2022-03-22 | 2022-07-12 | 飞狐信息技术(天津)有限公司 | Data processing apparatus, method, computer readable medium and processor |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2003249961A (en) * | 2002-02-26 | 2003-09-05 | Nippon Telegr & Teleph Corp <Ntt> | Method, device and program for storing traffic information, recoding medium recorded with the program, and method, device and program for retrieving traffic information, and recording medium recorded with the program |
CN102622443A (en) * | 2012-03-13 | 2012-08-01 | 北京邮电大学 | Customized screening system and method for microblog |
CN103646071A (en) * | 2013-12-09 | 2014-03-19 | 中国航天系统工程有限公司 | Dynamic data search method |
CN106354775A (en) * | 2016-08-23 | 2017-01-25 | 成都科来软件有限公司 | Data dynamic configuration system and method based on custom fields |
WO2018106580A1 (en) * | 2016-12-05 | 2018-06-14 | Rise Interactive Media & Analytics, LLC | Interactive data-driven graphical user interfaces for search engine optimization |
CN108170826A (en) * | 2018-01-08 | 2018-06-15 | 北京国信宏数科技有限责任公司 | A kind of macro economic analysis method and system based on internet big data |
CN108717790A (en) * | 2018-07-06 | 2018-10-30 | 广州市交通运输研究所 | A kind of vehicle driving analysis method based on bayonet license plate identification data |
CN109461312A (en) * | 2018-12-27 | 2019-03-12 | 泰华智慧产业集团股份有限公司 | Car data De-weight method is crossed based on redis |
CN109522305A (en) * | 2018-12-06 | 2019-03-26 | 北京千方科技股份有限公司 | A kind of big data De-weight method and device |
CN110134725A (en) * | 2019-05-22 | 2019-08-16 | 山东浪潮通软信息科技有限公司 | A kind of method and system of dynamic queries extended field |
CN110322687A (en) * | 2018-03-30 | 2019-10-11 | 杭州海康威视系统技术有限公司 | The method and apparatus for determining target intersection running state information |
CN111382146A (en) * | 2018-12-29 | 2020-07-07 | 浙江宇视科技有限公司 | Data storage control method, device, equipment and storage medium |
CN111477015A (en) * | 2019-01-22 | 2020-07-31 | 上海宝康电子控制工程有限公司 | Method, device and storage medium for realizing intersection vehicle space-time distribution display control |
CN111724593A (en) * | 2020-06-12 | 2020-09-29 | 重庆大学 | Bus travel time prediction method considering traffic flow transfer rate based on RFID |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9298717B2 (en) * | 2012-06-14 | 2016-03-29 | Empire Technology Development Llc | Data deduplication management |
US20140222820A1 (en) * | 2013-02-05 | 2014-08-07 | Gary Alan Mott | Apparatus, system, and method for vehicular characteristic monitoring and cataloging |
GB2514555A (en) * | 2013-05-28 | 2014-12-03 | Ibm | Deduplication for a storage system |
CN106874321B (en) * | 2016-06-24 | 2020-04-21 | 阿里巴巴集团控股有限公司 | Pluggable data retrieval method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||