CN113157716A - Data processing method, device, equipment and medium - Google Patents

Info

Publication number
CN113157716A
Authority
CN
China
Prior art keywords
time
target
identification information
data
data information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110520749.9A
Other languages
Chinese (zh)
Other versions
CN113157716B (en)
Inventor
张帅帅
蔡辉
李元洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Netease Cloud Music Technology Co Ltd
Original Assignee
Hangzhou Netease Cloud Music Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Netease Cloud Music Technology Co Ltd
Priority to CN202110520749.9A
Publication of CN113157716A
Application granted
Publication of CN113157716B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/23 Updating
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275 Synchronous replication

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to a data processing method, apparatus, device, and medium. The target data information corresponds to a first version time, which identifies the time when the target data information was updated, and the index document of the target identification information corresponds to a second version time, which identifies the time when that index document was updated. When the index document of the target identification information is subsequently updated according to the target data information, the result of comparing the second version time with the first version time can be taken into account. This avoids the problem that arises when the index document is updated directly from target data information that was updated earlier than the index document itself, namely that later-updated data in the index document is replaced by the older target data information. The accuracy and stability of data synchronization are thereby improved.

Description

Data processing method, device, equipment and medium
Technical Field
The present disclosure relates to the field of big data technologies, and in particular, to a data processing method, apparatus, device, and medium.
Background
To make it easier for users inside and outside an enterprise to find the content they need, many enterprises build their own search services on a search platform based on a search engine, such as a full text search engine (Lucene), a distributed full text search engine (ElasticSearch), or an enterprise-level search application server (Solr). For a user to be able to query information on the search platform, the search engine needs to acquire and store the queryable content held in the database, that is, the queryable content in the database needs to be synchronized to the search engine. When the data stored in the database changes, the search engine also needs to acquire the changed data promptly and accurately and update its stored data accordingly, so that the information subsequently queried by users is accurate. How to synchronize changed data to be synchronized into a search engine accurately and quickly has therefore attracted increasing attention in recent years.
Disclosure of Invention
The present disclosure provides a data processing method, apparatus, device, and medium, which are used to solve the problem that the existing method cannot accurately synchronize changed data to be synchronized to a search engine.
The present disclosure provides a data processing method, the method comprising:
acquiring first version time of changed target data information and target identification information contained in the target data information; wherein the first version time is used for identifying the time when the target data information is updated;
determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information; the second version time is used for identifying the time when the index document of the target identification information is updated.
In a possible implementation manner, the obtaining of the changed target data information includes:
determining the changed target data information according to the data information carried in a received change notification message.
In a possible implementation manner, the obtaining of the changed target data information includes:
if it is determined that a first time meets a preset determination condition, determining the data information to be synchronized that is changed within a target time period; determining any piece of changed data information to be synchronized as the target data information; the target time period is the time interval between the first time and a second time; the second time is the most recent earlier time at which the determination condition was determined to be satisfied.
In a possible embodiment, the determining that the first time meets the preset determination condition includes:
if it is determined that the first time is the time at which a full synchronization instruction is received, determining that the first time meets the preset determination condition; the full synchronization instruction is an instruction for synchronizing each piece of data to be synchronized at the first time into the corresponding index document; or
if it is determined that the time interval between the first time and the second time reaches a preset duration, determining that the first time meets the preset determination condition.
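The two alternative conditions above can be sketched as a single check. This is a minimal illustration, not the patented implementation: the function name, the parameter names, and the handling of a missing second time (no earlier run recorded) are assumptions introduced here.

```python
from datetime import datetime, timedelta
from typing import Optional


def meets_determination_condition(
    first_time: datetime,
    second_time: Optional[datetime],
    preset_duration: timedelta,
    full_sync_requested: bool = False,
) -> bool:
    """Return True when changed data should be determined at first_time."""
    # Condition 1: a full synchronization instruction was received at first_time.
    if full_sync_requested:
        return True
    # Assumption (not stated in the source): if the condition has never been
    # satisfied before, treat it as satisfied now.
    if second_time is None:
        return True
    # Condition 2: the interval since the last satisfied time has reached
    # the preset duration.
    return first_time - second_time >= preset_duration
```

For example, with a preset duration of one hour, a check made 30 minutes after the second time returns False unless a full synchronization instruction has been received.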
In a possible implementation manner, the determining the data information to be synchronized, which is changed within the target time period, includes:
acquiring each data to be synchronized at the first time in an off-line manner, and summarizing each data to be synchronized according to identification information corresponding to each data to be synchronized;
for each piece of identification information, determining first data information according to the identification information and each piece of data to be synchronized corresponding to the identification information;
acquiring each piece of second data information corresponding to the second time;
and determining the data information to be synchronized, which is changed in the target time period, according to each piece of first data information and each piece of second data information, and updating each piece of second data information according to each piece of first data information.
In a possible implementation manner, the determining, according to each piece of first data information and each piece of second data information, the data information to be synchronized that is changed within the target time period includes:
for the identification information contained in each piece of first data information, if it is determined that no second data information containing the identification information exists, determining the first data information containing the identification information as changed data information to be synchronized; and if it is determined that second data information containing the identification information exists and is inconsistent with the first data information containing the identification information, determining the first data information containing the identification information as changed data information to be synchronized.
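The comparison of first and second data information amounts to a diff between two snapshots keyed by identification information. The sketch below assumes each snapshot is a dictionary mapping identification information to a dictionary of field values; that representation and the function name are illustrative choices, not part of the disclosure.

```python
from typing import Any, Dict

Snapshot = Dict[str, Dict[str, Any]]


def changed_data_information(first: Snapshot, second: Snapshot) -> Snapshot:
    """Return the first data information that is new or differs from second."""
    changed: Snapshot = {}
    for ident, info in first.items():
        if ident not in second:
            # No second data information contains this identification information.
            changed[ident] = info
        elif second[ident] != info:
            # Second data information exists but is inconsistent with the first.
            changed[ident] = info
    return changed


current = {
    "song:1": {"title": "A"},
    "song:2": {"title": "B (remastered)"},
    "song:3": {"title": "C"},
}
previous = {
    "song:1": {"title": "A"},
    "song:2": {"title": "B"},
}
to_sync = changed_data_information(current, previous)
```

Here `to_sync` contains `song:2` (content differs) and `song:3` (not present before), while the unchanged `song:1` is skipped. Identifiers that appear only in the second data information are not covered by the comparison described above, so the sketch ignores them as well.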
In one possible embodiment, the method further comprises:
acquiring a target change type of the target data information;
judging whether the target change type is a deletion type; the deletion type represents that each data to be synchronized corresponding to the target identification information contained in the target data information is deleted;
if the target change type is a deletion type, directly deleting each data to be synchronized corresponding to the target identification information contained in the index document of the target identification information, and updating the second version time of the index document of the target identification information according to the third time corresponding to the target data information;
and if the target change type is not the deletion type, executing a subsequent step of determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information.
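The deletion branch can be sketched as follows. The `IndexDocument` class, the `"delete"` string, and the function name are hypothetical. Two further assumptions are made: the emptied document is kept as a record so that its second version time survives the deletion, and "updating according to the third time" is read as setting the second version time equal to the third time.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class IndexDocument:
    """Hypothetical in-memory stand-in for an index document."""
    data: Dict[str, Any] = field(default_factory=dict)
    second_version_time: float = 0.0


def handle_deletion(
    index: Dict[str, IndexDocument],
    target_id: str,
    change_type: str,
    third_time: float,
) -> bool:
    """Apply a deletion directly; return False for any other change type."""
    if change_type != "delete":
        # The caller continues with the version-time comparison step.
        return False
    # Assumption: keep an emptied record even if no document existed before.
    doc = index.setdefault(target_id, IndexDocument())
    doc.data.clear()
    doc.second_version_time = third_time
    return True
```

Keeping the version time on the emptied document means that an older update arriving after the deletion can still be recognized as stale by the subsequent comparison.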
In a possible implementation manner, the determining whether to update the index document of the target identification information according to the comparison result between the second version time and the first version time of the index document of the target identification information includes:
if it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, updating the index document of the target identification information according to the target data information, and updating the second version time of the updated index document according to the first version time;
and if it is determined that a second version time exists for the index document of the target identification information and the second version time is later than the first version time, determining not to update the index document of the target identification information.
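The comparison rule above can be sketched with an in-memory dictionary standing in for the search engine. The tuple layout, the function name, and the choice to replace the whole document content on update are assumptions made for illustration; version times are plain numbers here.

```python
from typing import Any, Dict, Tuple

# Maps identification information to (second version time, document content).
Index = Dict[str, Tuple[float, Dict[str, Any]]]


def update_if_not_stale(
    index: Index,
    target_id: str,
    target_data: Dict[str, Any],
    first_version_time: float,
) -> bool:
    """Update the index document unless it already holds a later version."""
    entry = index.get(target_id)
    if entry is not None and entry[0] > first_version_time:
        # The second version time is later than the first: do not update.
        return False
    # No second version time exists, or it is not later than the first:
    # update the document and record the first version time as its version.
    index[target_id] = (first_version_time, dict(target_data))
    return True


index: Index = {}
update_if_not_stale(index, "song:1", {"title": "new"}, 20.0)
applied = update_if_not_stale(index, "song:1", {"title": "old"}, 10.0)
```

After these two calls `applied` is False and the document still holds the title "new", because the second change carries an earlier version time. An update whose version time equals the stored one is applied, matching the "not later than" wording.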
In one possible embodiment, the first version time is obtained by:
if the target data information is determined by the data information to be synchronized which is changed in the target time period, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; determining the first version time according to the updated time and the first time;
if the target data information is determined through the received notification message, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; and determining the first version time according to the updated time and a fourth time corresponding to the target data information.
The present disclosure provides a data processing apparatus, the apparatus comprising:
the processing unit is used for acquiring first version time of changed target data information and target identification information contained in the target data information; wherein the first version time is used for identifying the time when the target data information is updated;
the updating unit is used for determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information; the second version time is used for identifying the time when the index document of the target identification information is updated.
In a possible implementation manner, the processing unit is specifically configured to determine the target data information that is changed according to the data information carried in the received change notification message.
In a possible implementation manner, the processing unit is specifically configured to determine, if it is determined that the first time meets a preset determination condition, data information to be synchronized, which is changed within a target time period; determining any changed data information to be synchronized as the target data information; the target time period is a time interval between the first time and a second time; the second time is the last determined time when the determination condition is satisfied.
In a possible implementation manner, the processing unit is specifically configured to determine that the first time meets the preset determination condition if it is determined that the first time is the time at which a full synchronization instruction is received; the full synchronization instruction is an instruction for synchronizing each piece of data to be synchronized at the first time into the corresponding index document; or, if it is determined that the time interval between the first time and the second time reaches a preset duration, determine that the first time meets the preset determination condition.
In a possible implementation manner, the processing unit is specifically configured to obtain each to-be-synchronized data at the first time offline, and summarize each to-be-synchronized data according to identification information corresponding to each to-be-synchronized data; for each piece of identification information, determining first data information according to the identification information and each piece of data to be synchronized corresponding to the identification information; acquiring each piece of second data information corresponding to the second time; and determining the data information to be synchronized, which is changed in the target time period, according to each piece of first data information and each piece of second data information, and updating each piece of second data information according to each piece of first data information.
In a possible implementation manner, the processing unit is specifically configured to, for each piece of identification information included in the first data information, determine, if it is determined that there is no second data information including the identification information, that the first data information including the identification information is to-be-synchronized data information that is changed; and if the second data information containing the identification information is determined to exist and the second data information is inconsistent with the first data information containing the identification information, determining the first data information containing the identification information as the changed data information to be synchronized.
In a possible implementation manner, the obtaining unit is further configured to obtain a target change type of the target data information;
the updating unit is further configured to determine whether the target change type is a deletion type; the deletion type represents that each data to be synchronized corresponding to the target identification information contained in the target data information is deleted; if the target change type is a deletion type, directly deleting each data to be synchronized corresponding to the target identification information contained in the index document of the target identification information, and updating the second version time of the index document of the target identification information according to the third time corresponding to the target data information; and if the target change type is not the deletion type, executing a subsequent step of determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information.
In a possible implementation manner, the updating unit is specifically configured to: if it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, update the index document of the target identification information according to the target data information, and update the second version time of the updated index document according to the first version time; and if it is determined that a second version time exists for the index document of the target identification information and the second version time is later than the first version time, determine not to update the index document of the target identification information.
In a possible implementation manner, the updating unit is specifically configured to obtain the first version time by:
if the target data information is determined by the changed data information to be synchronized, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; determining the first version time according to the updated time and the first time;
if the target data information is determined through the received notification message, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; and determining the first version time according to the updated time and a fourth time corresponding to the target data information.
The present disclosure provides an electronic device comprising at least a processor and a memory, the processor being adapted to carry out the steps of the data processing method as described in any one of the above when executing a computer program stored in the memory.
The present disclosure provides a computer-readable storage medium, in which a computer program is stored, which computer program, when being executed by a processor, realizes the steps of the data processing method as set forth in any one of the above.
In the present disclosure, the changed target data information corresponds to a first version time, which identifies the time when the target data information was updated, and the index document of the target identification information corresponds to a second version time, which identifies the time when that index document was updated. When the index document of the target identification information is updated according to the target data information, the result of comparing the second version time with the first version time can therefore be taken into account. This avoids the problem that arises when the index document is updated directly from target data information that was updated earlier than the index document itself, namely that later-updated data in the index document is replaced by the older target data information. The accuracy and stability of data synchronization are thereby improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is apparent that the drawings in the following description are only some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without inventive efforts.
Fig. 1 is a schematic diagram of a real-time task flow provided by an embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of a full task provided by an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of a data processing process provided by an embodiment of the present disclosure;
Fig. 4 is a schematic flowchart of a specific data processing procedure provided by the present disclosure;
Fig. 5 is a schematic flowchart of a specific data processing procedure provided by the present disclosure;
Fig. 6 is a scene schematic diagram of a specific data processing flow provided by an embodiment of the present disclosure;
Fig. 7 is a scene schematic diagram of a specific data processing flow provided by an embodiment of the present disclosure;
Fig. 8 is a schematic diagram of a data processing procedure in a specific music database provided by the present disclosure;
Fig. 9 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure;
Fig. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
The present disclosure will be described in further detail below with reference to the attached drawings, and it should be understood that the described embodiments are only a part of the embodiments of the present disclosure, and not all embodiments. All other embodiments, which can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort, shall fall within the scope of protection of the present disclosure.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
In this document, it is to be understood that any number of elements in the figures are provided by way of illustration and not limitation, and any nomenclature is used for differentiation only and not in any limiting sense.
For convenience of understanding, some concepts involved in the embodiments of the present disclosure are explained below:
Document: the content carrier in a search engine. It has a defined data structure and corresponds to a row record in a table of a relational database.
Index: a collection of identically structured documents in a search engine.
Source table: the table structure, as distinct from the index, that holds the data from which a document's content originates.
Real-time task: a task that monitors source table changes in real time and synchronizes the corresponding documents.
Full task: a task that traverses all source table data to construct documents and synchronize them to the search engine.
With the development of internet technology, more and more users use search platforms to find the content they want. Querying content on a search platform generally relies on two key technologies: synchronizing the documents in a database into a search engine, and relevance matching between the search keywords entered by a user and the content of the documents saved in the search engine. Synchronizing the documents in the database into the search engine is the precondition for a user to be able to query data on the search platform at all.
When the data stored in the database is synchronized to the search engine for the first time, the data in the database is only required to be summarized and cleaned to obtain a data structure (e.g., a document) suitable for being stored in the search engine, and then the obtained document is synchronized to the search engine. However, since the data stored in the database may change at any time, it is necessary to monitor whether the data stored in the database changes, for example, data addition/deletion, data change, etc., and synchronize the changed data to be synchronized into the search engine in time, so that the user can query the updated data, and the accuracy of the content that can be queried in the search engine is ensured. Therefore, how to accurately and quickly synchronize changed data to be synchronized into a search engine is a problem which is of increasing interest in recent years.
In the related art, real-time tasks and full tasks are generally used together to synchronize changed data to be synchronized into a search engine accurately and quickly. Real-time tasks ensure that changed data is synchronized into the search engine promptly, and full tasks ensure that changed data to be synchronized does get synchronized into the search engine. The two kinds of task are described in detail below:
I. Real-time tasks. The real-time task is described below with reference to the accompanying drawing.
Fig. 1 is a schematic diagram of a real-time task flow provided by an embodiment of the present disclosure. As shown in Fig. 1, when data in a source table stored in the database changes, the database sends an index synchronization request for the changed target data information to the electronic device that performs index synchronization. After receiving the index synchronization request, the electronic device acquires the data related to the target data information from the source tables in the database. It then generates a document from the acquired data and sends the document to the search engine, so that the search engine updates its stored document accordingly.
II. Full tasks. The full task is described below with reference to the accompanying drawing.
Fig. 2 is a schematic flowchart of a full task provided by an embodiment of the present disclosure. As shown in Fig. 2, the electronic device that performs index synchronization acquires the data in the source tables stored in the database according to a preset period, that is, it periodically traverses the data in the source tables. The acquired data is summarized and cleaned to obtain the documents. Each generated document is then sent to the search engine, so that the search engine updates each saved document according to the corresponding received document.
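The summarizing and cleaning step of the full task can be pictured as grouping source-table rows by identification information and merging them into one document per identifier. The row layout, the `id` field name, and the cleaning rule (dropping empty values) are assumptions chosen for this sketch; the disclosure does not specify them.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def build_documents(rows: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Group rows from several source tables into one document per id."""
    documents: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for row in rows:
        ident = row["id"]
        for name, value in row.items():
            # Assumed cleaning rule: skip the key column and empty values.
            if name == "id" or value is None:
                continue
            documents[ident][name] = value
    return dict(documents)


rows = [
    {"id": "song:1", "title": "A", "album": None},
    {"id": "song:1", "artist": "X"},
    {"id": "song:2", "title": "B"},
]
documents = build_documents(rows)
```

With these rows, `documents["song:1"]` combines the title from the first row with the artist from the second, and the empty album value is dropped.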
With both real-time tasks and full tasks, the data stored in the database and the data in the search engine may still be inconsistent after the search engine has been updated.
For example, for a real-time task, the order in which the generated documents are sent to the search engine may differ from the order of the index synchronization requests. Even if the electronic device acquires the relevant data for every change, the document corresponding to an earlier change may reach the search engine after the document corresponding to a later change. The stored index document is then updated according to the document received later, that is, the index document that already reflects the later change is overwritten by the document that reflects the earlier change.
For another example, for a full task, if the data in the database changes again after the electronic device has acquired the data from the database but before the search engine has updated the stored index documents based on the received documents, the data synchronized by the full task replaces the data synchronized by the real-time task, so that the data stored in the database and the data in the index documents in the search engine are again inconsistent.
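The reordering problem can be reproduced with a toy model in which the search engine simply keeps whichever document arrives last. The function and the sample identifiers are hypothetical and only illustrate the failure mode described above.

```python
from typing import Dict, Iterable, Tuple


def apply_in_arrival_order(arrivals: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Toy search engine: every arriving document overwrites the stored one."""
    index: Dict[str, str] = {}
    for target_id, content in arrivals:
        index[target_id] = content
    return index


# The database changed song:1 to "title v1" and then to "title v2",
# but the two generated documents reach the search engine in reverse order.
arrivals = [("song:1", "title v2"), ("song:1", "title v1")]
stale_index = apply_in_arrival_order(arrivals)
```

After both arrivals `stale_index["song:1"]` is "title v1", although the database holds "title v2". The same overwrite occurs when a slow full task delivers a document built before a real-time update was applied.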
Therefore, to solve the problem that existing methods cannot update the documents in a search engine promptly and accurately, the present disclosure provides a data processing method, apparatus, device, and medium. In the present disclosure, the changed target data information corresponds to a first version time, which identifies the time when the target data information was updated, and the index document of the target identification information corresponds to a second version time, which identifies the time when that index document was updated. When the index document of the target identification information is updated according to the target data information, the result of comparing the second version time with the first version time can therefore be taken into account. This avoids the problem that arises when the index document is updated directly from target data information that was updated earlier than the index document itself, namely that later-updated data in the index document is replaced by the older target data information. The accuracy and stability of data synchronization are thereby improved.
Fig. 3 is a schematic diagram of a data processing process provided in an embodiment of the present disclosure, where the process includes:
S301: acquiring first version time of changed target data information and target identification information contained in the target data information; wherein the first version time is used for identifying the time when the target data information is updated.
The data processing method provided by the present disclosure can be applied to an electronic device, which may be a server, an intelligent device, or the like. In a specific implementation, this can be set flexibly according to actual requirements and is not specifically limited herein.
To determine the index document that needs to be updated conveniently and accurately, in the present disclosure, when data to be synchronized is changed, the electronic device acquires the changed target data information, which includes target identification information. The electronic device can then quickly determine the index document to be updated according to the target identification information. The target identification information indicates the index document that needs to be updated, and the index document may be a document stored in a search engine, a document stored in a backup database, or the like. In a specific implementation, this can be set flexibly according to actual requirements and is not specifically limited herein.
The data to be synchronized may be stored in a database or in a storage area. The device that stores the data to be synchronized may be the same as the electronic device or may be different.
In one example, the target data information may include only the target identification information and the changed data to be synchronized.
In another example, the target data information may include target identification information and each piece of data to be synchronized corresponding to the target identification information. Each data to be synchronized comprises the data to be synchronized which is changed, and can also comprise the data to be synchronized which is not changed.
It should be noted that the target identification information may be represented as a number, a character string, or the like, or may be represented in other forms, and any representation form that can uniquely identify a document may be used as the identification information of the index document in the present disclosure.
To update the document conveniently and accurately, in the present disclosure, when the data to be synchronized is changed, the electronic device may further obtain a version time (denoted as the first version time) of the changed target data information. The first version time identifies the time when the target data information was updated, for example the time when the target data information was most recently updated. Subsequent processing based on the first version time determines whether to update the index document of the target identification information.
S302: determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information; the second version time is used for identifying the time when the index document of the target identification information is updated.
If the target data information was updated earlier than the index document of the target identification information, updating the index document directly according to the target data information would replace later-updated data in the index document. To avoid this, in the present disclosure the index document of the target identification information also corresponds to a version time (denoted as the second version time). After the target identification information and the first version time of the target data information are acquired based on the above embodiment, the second version time of the index document of the target identification information is acquired and compared with the first version time. Whether to update the index document of the target identification information is determined according to the comparison result. The second version time identifies the time when the index document of the target identification information was updated. For example, the second version time identifies the time when the index document of the target identification information was most recently updated.
In one example, the index document of the target identification information is generally updated in chronological order of the changes. Therefore, when the first version time is later than the second version time, the change of the target data information occurred after the index document of the target identification information was last updated, and the index document can be updated according to the target data information.
When the first version time is earlier than the second version time, which indicates that the change of the target data information occurs before the time when the index document of the target identification information is updated, the index document of the target identification information may not be updated.
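The effect of the comparison can be illustrated with a minimal sketch in which two change messages for the same identification information arrive out of order. The names `Change`, `apply_versioned`, the integer version times and the in-memory `index` dictionary are all hypothetical and are not taken from the disclosure; updating when no document exists yet, or when the two version times are equal, is an assumption made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class Change:
    doc_id: str        # target identification information (hypothetical field name)
    version_time: int  # first version time, e.g. epoch milliseconds (assumed unit)
    data: dict         # changed data to be synchronized


def apply_versioned(index: dict, change: Change) -> bool:
    """Apply a change only if it is not older than the indexed document."""
    current = index.get(change.doc_id)
    # Skip the change when the indexed document carries a later version time.
    if current is not None and change.version_time < current["version_time"]:
        return False
    index[change.doc_id] = {
        "version_time": change.version_time,  # becomes the second version time
        "data": change.data,
    }
    return True


index = {}
newer = Change("song-1", 200, {"title": "new"})
older = Change("song-1", 100, {"title": "old"})

# The newer change happens to arrive first; the stale one arrives afterwards.
applied_newer = apply_versioned(index, newer)
applied_older = apply_versioned(index, older)
```

In this sketch the stale change is rejected, so the index keeps the later title even though the messages arrived in the wrong order.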
In a possible implementation manner, if the target data information includes only the target identification information and the changed data to be synchronized, in order to accurately update the index document of the target identification information, when it is determined that the index document of the target identification information is updated according to the target data information, each data to be synchronized corresponding to the target identification information may be acquired from the device that stores the data to be synchronized, that is, each data to be synchronized corresponding to the target identification information stored in the device that stores the data to be synchronized is retrieved, and the target document is constructed according to the target identification information and each data to be synchronized corresponding to the target identification information. And updating the index document of the target identification information according to the target document.
In another possible implementation manner, if the target data information includes the target identification information and each piece of data to be synchronized corresponding to the target identification information, the target document may be constructed according to the target data information. And updating the index document of the target identification information according to the target document.
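The two ways of constructing the target document described above can be sketched as follows. This is only an illustration under stated assumptions: the function `build_target_document`, the `is_complete` flag, the `fetch_all` callback and the dictionary layout are hypothetical names introduced here, and the source store is simulated with an in-memory dictionary.

```python
from typing import Callable, Optional


def build_target_document(
    target_id: str,
    fields: dict,
    is_complete: bool,
    fetch_all: Optional[Callable[[str], dict]] = None,
) -> dict:
    """Build the target document for one piece of target identification information."""
    if is_complete:
        # The target data information already carries every field for this id.
        all_fields = dict(fields)
    else:
        # Only the changed fields were delivered, so read the full record back
        # from the device that stores the data to be synchronized.
        if fetch_all is None:
            raise ValueError("fetch_all is required for partial target data information")
        all_fields = fetch_all(target_id)
    return {"id": target_id, **all_fields}


# Hypothetical source store standing in for the device that saves the data.
source_store = {"song-1": {"title": "Blue", "artist": "A", "plays": 10}}

partial_doc = build_target_document("song-1", {"plays": 10}, False, source_store.__getitem__)
full_doc = build_target_document(
    "song-1", {"title": "Blue", "artist": "A", "plays": 10}, True
)
```

Both calls produce the same target document; the first pays for an extra read of the source store, the second relies on the change message being complete.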
In the disclosure, the changed target data information corresponds to a first version time, which identifies the time when the target data information was updated, and the index document of the target identification information corresponds to a second version time, which identifies the time when that index document was updated. The comparison result between the second version time and the first version time can therefore be taken into account when the index document of the target identification information is updated according to the target data information. This avoids later-updated data in the index document being replaced by target data information that was updated earlier than the index document, and improves the accuracy and stability of data synchronization.
On the basis of the foregoing embodiment, in order to update a document in time, in this disclosure, the acquiring target data information that changes includes:
and determining the changed target data information according to the data information carried in the received change notification message.
In a possible implementation manner, since there may be a case where data to be synchronized is changed at any time, in order to process the changed data to be synchronized in time, a data change situation in the device storing the data to be synchronized may be monitored in real time, for example, by monitoring a change event of a database binlog of the device storing the data to be synchronized. When it is determined that the data to be synchronized changes, the electronic device analyzes the change notification message after receiving the change notification message, and acquires target data information carried in the change notification message.
In another possible implementation, the obtaining of the changed target data information includes:
if the first time meets the preset determination condition, determining the data information to be synchronized which is changed in the target time period; determining any changed data information to be synchronized as the target data information; the target time period is a time interval between the first time and a second time; the second time is the last determined time when the determination condition is satisfied.
Whether the data to be synchronized is changed periodically or the changed data is synchronized in real time, some data may fail to be synchronized to the electronic device. To ensure that the electronic device can acquire and synchronize all changed data to be synchronized, a determination condition is preset. The determination condition may be that the first time is a preset time, that the time interval between the first time and the second time reaches a preset duration, or that the first time is the time when a full synchronization instruction is received. The electronic device determines whether to acquire the changed target data information according to whether the first time meets the preset determination condition. The first time may be the current time, a time that precedes the current time by a preset duration, or a preset time. The second time is earlier than the first time and is the most recent earlier time at which the determination condition was satisfied; in other words, the second time is the previous first time.
In an example, if the preset determination condition is that the first time is the time when the full synchronization instruction is received, when the user wishes to synchronize each piece of to-be-synchronized data that is changed in all currently-stored to-be-synchronized data into the electronic device, the full synchronization instruction may be input through the smart device. The user can input the full-scale synchronization command through the intelligent device in many ways, and the full-scale synchronization command can be input through operating control equipment, such as a mouse, a helmet, a remote controller and the like, can be input through a voice mode, can be input through operating a display of the intelligent device, and can be input through operating hardware buttons on the intelligent device. In the specific implementation process, the flexible setting can be performed according to the actual requirement, and is not specifically limited herein. After the intelligent device receives the full-scale synchronization instruction input by the user, the full-scale synchronization instruction can be sent to the electronic device. The electronic equipment determines the time when the full-scale synchronization instruction is received as a first time, and determines that the first time meets a preset determination condition. The full synchronization instruction is an instruction used for instructing the electronic equipment to synchronize each piece of data to be synchronized at the first time into the corresponding index document.
It should be noted that the electronic device and the smart device may be the same or different. In the specific implementation process, the flexible setting can be performed according to the actual requirement, and is not specifically limited herein.
In another example, if the preset determination condition is that the time interval between the first time and the second time reaches the preset duration, then when the electronic device determines that the duration recorded on the timer has reached the preset duration, it may determine that moment as the first time satisfying the preset determination condition. The duration recorded on the timer is the time interval between the first time and the second time, and the timer is reset to zero each time the recorded duration reaches the preset duration.
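The three determination conditions described above can be combined into a single check, sketched below. The function name, parameter names and the use of plain numbers of seconds for times are assumptions made for illustration only.

```python
def meets_determination_condition(
    first_time: float,
    second_time: float,
    preset_duration: float,
    preset_times: tuple = (),
    full_sync_received: bool = False,
) -> bool:
    """Return True if first_time satisfies any of the three example conditions."""
    # Condition: a full synchronization instruction was received at first_time.
    if full_sync_received:
        return True
    # Condition: first_time is one of the preset times.
    if first_time in preset_times:
        return True
    # Condition: the interval since the previous qualifying time has elapsed.
    return first_time - second_time >= preset_duration


hourly_due = meets_determination_condition(3600, 0, 3600)
too_early = meets_determination_condition(1800, 0, 3600)
forced = meets_determination_condition(1800, 0, 3600, full_sync_received=True)
scheduled = meets_determination_condition(1800, 0, 3600, preset_times=(1800,))
```

When the check succeeds, the caller would record `first_time` as the new second time, which corresponds to resetting the timer.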
In the related art, when the data size of the data to be synchronized is large, for example, hundreds of millions of data to be synchronized, it is time-consuming to collect and clean all the data to be synchronized, generate each document and synchronize each document to the search engine, and it is also not beneficial for a user to query accurate contents in the process of performing a full-scale task on the electronic device. Therefore, in order to improve the efficiency of data synchronization and reduce the time taken to synchronize the data to be synchronized, which is changed, only the data information that has changed in the time interval (target time period) between the first time and the second time may be determined as the target data information and subsequent synchronization may be performed. And if the first time meets the at least one preset determination condition, acquiring the data information to be synchronized which is changed in the target time period. And then, aiming at each changed data information to be synchronized, determining the changed data information to be synchronized as target data information.
In a possible implementation manner, the determining the data information to be synchronized, which is changed within the target time period, includes:
acquiring each data to be synchronized at the first time in an off-line manner, and summarizing each data to be synchronized according to identification information corresponding to each data to be synchronized;
for each piece of identification information, determining first data information according to the identification information and each piece of data to be synchronized corresponding to the identification information;
acquiring each piece of second data information corresponding to the second time;
and determining the data information to be synchronized, which is changed in the target time period, according to each piece of first data information and each piece of second data information, and updating each piece of second data information according to each piece of first data information.
In the related art, in the process of generating a document according to each acquired data to be synchronized, frequent reading of the device storing the data to be synchronized for many times may increase the reading pressure of the device storing the data to be synchronized, possibly affect other services, or cause a downtime of the device storing the data to be synchronized. Based on this, in order to avoid the above problem, each to-be-synchronized data saved at the first time may be obtained offline, that is, each to-be-synchronized data saved in the device that saves the to-be-synchronized data may be obtained offline sequentially or randomly from the first time. In the offline acquisition process, each piece of data to be synchronized stored in the device for the data to be synchronized may still be updated.
After each to-be-synchronized data stored at the first time is acquired offline, each to-be-synchronized data is summarized according to the identification information corresponding to each to-be-synchronized data, that is, a data set corresponding to each identification information is acquired. And then, for each identification information, determining first data information according to the identification information and each data to be synchronized corresponding to the identification information.
For example, the identification information and each data to be synchronized corresponding to the identification information are spliced.
For another example, each piece of data to be synchronized corresponding to the identification information is subjected to deduplication processing, and then the identification information and the data to be synchronized after deduplication processing are spliced. Wherein, the performing the deduplication processing on each to-be-synchronized data corresponding to the identification information comprises: if n pieces of data to be synchronized which are completely the same exist in the M pieces of data to be synchronized corresponding to the identification information, n-1 pieces of data to be synchronized in the n pieces of data to be synchronized are deleted. Wherein n is an integer of 2 or more and M or less, and M is an integer of 2 or more.
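The summarizing, deduplication and splicing steps can be sketched as below. The representation is an assumption: each record is taken to be an `(identification, datum)` pair, and "splicing" is modelled as pairing the identification information with a tuple of its distinct data; the function name `build_first_data_info` is hypothetical.

```python
from collections import defaultdict


def build_first_data_info(records):
    """Group records by identification information, drop exact duplicates,
    and splice each identifier with its remaining data."""
    grouped = defaultdict(list)
    for ident, datum in records:
        # Keep only the first of any completely identical data for this id.
        if datum not in grouped[ident]:
            grouped[ident].append(datum)
    # One piece of first data information per identification information.
    return {ident: (ident, tuple(data)) for ident, data in grouped.items()}


records = [("a", "x"), ("b", "y"), ("a", "x"), ("a", "z")]
first_data_info = build_first_data_info(records)
```

Here the duplicate `("a", "x")` record is kept once, which matches deleting n-1 of n identical pieces of data.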
After each piece of first data information is acquired based on the above embodiment, each piece of second data information corresponding to the second time is acquired. The data information to be synchronized that changed in the target time period is then determined according to each piece of first data information and each piece of second data information.
In a possible implementation manner, the determining, according to the each piece of first data information and the each piece of second data information, data information to be synchronized, which is changed within the target time period, includes:
for the identification information contained in each piece of first data information, if it is determined that second data information containing the identification information does not exist, determining the first data information containing the identification information as changed data information to be synchronized; and if the second data information containing the identification information is determined to exist and the second data information is inconsistent with the first data information containing the identification information, determining the first data information containing the identification information as the changed data information to be synchronized.
Within the target time period, the data to be synchronized that was stored at the second time may have been modified, added to, or deleted. Accordingly, when determining which pieces of first data information are changed data information to be synchronized, it may be determined, for the identification information contained in each piece of first data information, whether that identification information matches the identification information contained in any piece of second data information, that is, whether second data information containing the identification information exists.
If it is determined that the second data information containing the identification information does not exist, which indicates that the first data information is newly added data information in the target time period, the first data information may be determined as changed data information to be synchronized.
If the second data information containing the identification information is determined to exist, which indicates that the first data information is not the newly added data information in the target time period but may be data information of other change types, whether the second data information containing the identification information is inconsistent with the first data information can be continuously judged, so that whether the first data information is the changed data information to be synchronized can be accurately determined.
If the second data information containing the identification information is determined to be inconsistent with the first data information, the first data information is indicated to be data information of other change types, and the first data information is determined to be changed data information to be synchronized; and if the second data information containing the identification information is determined to be consistent with the first data information, which indicates that the first data information may not be changed in the target time period, determining that the first data information is not the changed data information to be synchronized.
In order to determine the changed data information to be synchronized at the next first time when the preset determination condition is met, based on the above embodiment, after the changed data information to be synchronized is determined from the first data information, the stored second data information is updated according to each first data information.
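The comparison of first and second data information, followed by the update of the stored second data information, can be sketched as follows. Both collections are assumed to be dictionaries keyed by identification information; the function names and the in-memory snapshot are hypothetical. The sketch follows only the two cases described above (identifier absent from the second data information, or present but inconsistent).

```python
def find_changed(first_infos: dict, second_infos: dict) -> dict:
    """Return the first data information that is new or differs from the
    second data information with the same identification information."""
    changed = {}
    for ident, info in first_infos.items():
        if ident not in second_infos:
            changed[ident] = info          # newly added in the target time period
        elif second_infos[ident] != info:
            changed[ident] = info          # present before, but inconsistent now
    return changed


def run_offline_round(first_infos: dict, snapshot: dict) -> dict:
    """Determine the changed data information, then replace the stored second
    data information so that the next round compares against this one."""
    changed = find_changed(first_infos, snapshot)
    snapshot.clear()
    snapshot.update(first_infos)
    return changed


snapshot = {"a": ("a", ("x",)), "b": ("b", ("y",))}
first_infos = {"a": ("a", ("x",)), "b": ("b", ("y2",)), "c": ("c", ("z",))}
changed = run_offline_round(first_infos, snapshot)
```

Only the changed entries are passed on as target data information, so unchanged documents are not rewritten.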
The target data information is determined in the off-line mode, so that the electronic equipment can conveniently and accurately acquire the target data information, and the reading pressure of the equipment for storing the data to be synchronized can be reduced. And subsequently, only the determined target data information can be synchronized, so that the workload required by the electronic equipment in the data synchronization process is reduced, the time consumed by synchronously storing each piece of data to be synchronized is reduced, and the efficiency of synchronously storing each piece of data to be synchronized is improved.
The following describes a data processing method provided by the present disclosure by using a specific embodiment, and fig. 4 is a schematic diagram of a specific data processing flow provided by the present disclosure, as shown in fig. 4, the flow includes:
S401: if the change notification message is received, the target data information which is changed is determined according to the data information carried in the received change notification message, and S408 is executed.
S402: and if the first time meets the preset determination condition, acquiring each data to be synchronized at the first time in an off-line manner.
Here, the present disclosure does not limit the execution order of S401 and S402, that is, S401 may be executed before S402, S401 may also be executed after S402, and S401 and S402 may be executed simultaneously.
In a possible implementation manner, if the first time is determined to be the time when the full-synchronization instruction is received, it is determined that the first time meets a preset determination condition. The full-scale synchronization instruction is an instruction for synchronizing each piece of data to be synchronized at the first time into the corresponding index document.
In another possible implementation manner, if it is determined that the time interval between the first time and the second time reaches the preset time duration, it is determined that the first time satisfies the preset determination condition.
And the second time is the last determined time meeting the determination condition.
S403: and summarizing each piece of data to be synchronized according to the identification information corresponding to each piece of data to be synchronized acquired in the step S402.
S404: and for each identification information in the S403, determining first data information according to the identification information and each to-be-synchronized data corresponding to the identification information.
S405: and acquiring each piece of second data information corresponding to the second time.
S406: and determining the data information to be synchronized which is changed in the target time period according to each piece of first data information and each piece of second data information.
Specifically, determining the to-be-synchronized data information that is changed in the target time period includes:
for the identification information contained in each piece of first data information, if it is determined that second data information containing the identification information does not exist, determining the first data information containing the identification information as changed data information to be synchronized; and if the second data information containing the identification information is determined to exist and the second data information is inconsistent with the first data information containing the identification information, determining the first data information containing the identification information as the changed data information to be synchronized.
S407: and updating each piece of second data information according to each piece of first data information.
S408: and acquiring the first version time of the changed target data information and the target identification information contained in the target data information.
S409: and determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information.
In order to improve the accuracy in the data synchronization process, on the basis of the foregoing embodiments, in this disclosure, the method further includes:
acquiring a target change type of the target data information;
judging whether the target change type is a deletion type; the deletion type represents that each data to be synchronized corresponding to the target identification information contained in the target data information is deleted;
if the target change type is a deletion type, directly deleting each data to be synchronized corresponding to the target identification information contained in the index document of the target identification information, and updating the second version time of the index document of the target identification information according to the third time corresponding to the target data information;
and if the target change type is not the deletion type, executing a subsequent step of determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information.
In order to conveniently and accurately update the document, in the present disclosure, when the data to be synchronized is changed, the electronic device may further obtain a target change type of the changed target data information, so as to facilitate a subsequent determination of a specific update method for the index document of the target identification information according to the target change type. The target change type can be deletion, addition, modification and the like.
Typically, the deleted index document is subsequently not allowed to be written further. Therefore, when the target change type of the target data information is deletion, each data corresponding to the target identification information in the index document of the target identification information can be directly deleted, and it is not necessary to subsequently obtain each data to be synchronized corresponding to the target identification information from the device for storing the data to be synchronized, or to compare the first version time with the second version time. Based on this, in the present disclosure, if the target change type is the deletion type, which indicates that each data corresponding to the target identification information included in the index document of the target identification information is to be deleted, each data corresponding to the target identification information included in the index document of the target identification information may be deleted, and the second version time of the index document of the target identification information may be updated according to the third time corresponding to the target data information.
The third time corresponding to the target data information may be determined according to the time at which each data to be synchronized corresponding to the target identification information is deleted from the index document of the target identification information, may be a preset time, or may be determined according to the current time. The specific implementation can be set flexibly according to actual requirements and is not described in detail here.
In another possible implementation, the target change type of the target data information may not be the deletion type. For a change that is not of the deletion type, whether to update the index document of the target identification information is determined according to the comparison result of the second version time and the first version time of the index document of the target identification information.
If the target change type is not the deletion type, the data to be synchronized corresponding to the target identification information contained in the index document of the target identification information does not need to be deleted, and whether to update the index document of the target identification information can be determined according to the comparison result between the second version time of the index document and the first version time of the target data information.
In one example, the determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information includes:
if it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, updating the index document of the target identification information according to the target data information, and updating the second version time of the updated index document of the target identification information according to the first version time;
and if it is determined that a second version time exists for the index document of the target identification information and the second version time is later than the first version time, determining not to update the index document of the target identification information.
The index document of the target identification information may or may not have a corresponding second version time. Therefore, when determining whether to update the index document of the target identification information according to the comparison result between the second version time and the first version time of the target data information, it may first be determined whether a second version time exists for the index document of the target identification information. When the second version time exists, the comparison result between the second version time and the first version time shows whether the index document was updated earlier than the target data information, and thus whether the index document should be updated according to the target data information.
In an example, if it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, the index document of the target identification information was not updated later than the target data information, and the index document may be updated directly according to the target data information.
So that the next update of the index document of the target identification information can be compared correctly, the second version time of the updated index document is then updated according to the first version time of the target data information.
In another example, if it is determined that a second version time exists for the index document of the target identification information and the second version time is later than the first version time, the index document of the target identification information was updated later than the target data information, and it is determined that the index document of the target identification information is not updated.
In a possible implementation manner, updating the index document of the target identification information according to the target change type and the target data information may be specifically implemented by the following calculation logic:
Figure BDA0003063854960000211
where deleted indicates that the change type of the target data information is the deletion type, doc is the index document of the target identification information, params is the target data information, doc dataversion is the second version time, and params dataversion is the first version time.
By checking whether the second version time is stored and, when it is stored, comparing the second version time with the first version time, it can be determined whether the index document of the target identification information was updated earlier than the target data information, and hence whether to update the index document of the target identification information according to the target data information. This avoids the problem that the updated index document of the target identification information becomes inconsistent with the data to be synchronized when the order in which the target data information is acquired differs from the order in which the underlying data changes occurred, and improves the accuracy and stability of data synchronization.
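As an illustration only, the comparison described above can be sketched in Python as follows. The function name should_update and the use of integer timestamps are assumptions made for this sketch; it does not reproduce the calculation logic shown in the figure.

```python
from typing import Optional


def should_update(second_version_time: Optional[int], first_version_time: int) -> bool:
    """Decide whether the index document may be overwritten.

    second_version_time: version time stored with the index document,
        or None when no version time has been stored yet.
    first_version_time: version time of the changed target data information.
    """
    if second_version_time is None:
        # No second version time exists: update directly.
        return True
    # Update only when the stored version is not later than the incoming one.
    return second_version_time <= first_version_time
```

In this sketch an equal version time still leads to an update, which matches the "not later than" condition in the text.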
The following describes a data processing method provided by the present disclosure by using a specific embodiment, and fig. 5 is a schematic diagram of a specific data processing flow provided by the present disclosure, as shown in fig. 5, the flow includes:
S501: Acquire the changed target data information.
For a specific process of acquiring the changed target data information, reference may be made to S401 to S407 shown in fig. 4 in the foregoing embodiment.
S502: Acquire the first version time of the changed target data information and the target identification information contained in the target data information.
In a possible embodiment, if the target data information was acquired through S402 to S407 shown in fig. 4, that is, the target data information was determined from the data information to be synchronized that changed within the target time period, the time at which the data to be synchronized corresponding to the target identification information contained in the target data information was updated is determined, and the first version time is determined according to the updated time and the first time.
In another possible implementation, if the target data information was acquired through S401 shown in fig. 4, that is, the target data information was determined from the received notification message, the time at which the data to be synchronized corresponding to the target identification information contained in the target data information was updated is determined, and the first version time is determined according to the updated time and the fourth time corresponding to the target data information.
S503: Acquire the target change type of the target data information.
S504: Judge whether the target change type is the deletion type; if so, execute S505, otherwise execute S506.
The deletion type indicates that each piece of data to be synchronized corresponding to the target identification information contained in the target data information has been deleted.
S505: Directly delete each piece of data to be synchronized corresponding to the target identification information contained in the index document of the target identification information, and update the second version time of the index document of the target identification information according to the third time corresponding to the target data information.
S506: Determine whether to update the index document of the target identification information according to the comparison result between the second version time of the index document of the target identification information and the first version time.
Specifically, the process of determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information includes:
if it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, updating the index document of the target identification information according to the target data information, and updating the second version time of the updated index document of the target identification information according to the first version time;
and if it is determined that a second version time exists for the index document of the target identification information and the second version time is later than the first version time, determining not to update the index document of the target identification information.
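The flow of S503 to S506 can be sketched as follows. This is a minimal in-memory illustration, not the disclosed implementation: the classes IndexDocument and Change, the change type labels "delete" and "upsert", and the dictionary used as the index are all hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class IndexDocument:
    fields: Dict[str, Any] = field(default_factory=dict)
    data_version: Optional[int] = None  # second version time, None if absent


@dataclass
class Change:
    target_id: str                # target identification information
    change_type: str              # "delete" or "upsert" (hypothetical labels)
    data: Dict[str, Any]          # target data information
    first_version_time: int
    third_time: int = 0           # only used for the deletion type


def apply_change(index: Dict[str, IndexDocument], change: Change) -> bool:
    """Apply one change; return True if the index document was modified."""
    doc = index.setdefault(change.target_id, IndexDocument())
    if change.change_type == "delete":
        # S505: delete directly and record the third time as the version.
        doc.fields.clear()
        doc.data_version = change.third_time
        return True
    # S506: update only if no version is stored or it is not later.
    if doc.data_version is None or doc.data_version <= change.first_version_time:
        doc.fields = dict(change.data)
        doc.data_version = change.first_version_time
        return True
    return False
```

For example, applying a change with first version time 20 and then a stale change with first version time 10 to the same identification information leaves the document at the state written by the first call.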
In order to improve the accuracy and stability of data synchronization, on the basis of the foregoing embodiments, in the present disclosure, the first version time may be obtained as follows:
if the target data information is determined by the data information to be synchronized which is changed in the target time period, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; determining the first version time according to the updated time and the first time;
if the target data information is determined through the received notification message, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; and determining the first version time according to the updated time and a fourth time corresponding to the target data information.
As a possible implementation manner, the latest of the updated times of the data to be synchronized corresponding to the target identification information contained in the target data information may be determined as the first version time.
As another possible implementation, all the data to be synchronized at the first time are acquired offline, so the offline acquisition of each piece of data to be synchronized starts from the first time. While the offline acquisition of all the data to be synchronized is in progress, the data to be synchronized may still be changed, so that the updated time of some of the data obtained offline may be after the first time, may be before the first time, or may not be recorded at all. Therefore, when the first version time of the target data information is determined, if the target data information is determined from the data information to be synchronized that changed within the target time period, the latest updated time of the data to be synchronized corresponding to the target identification information contained in the target data information may be determined first. The latest updated time is then compared with the time at which the offline acquisition started (i.e., the first time), and the first version time of the target data information is determined according to the comparison result. If the updated time of the data to be synchronized corresponding to the target identification information contained in the target data information is not recorded, the updated time can be taken as any time not greater than the first time. For example, the updated time may be taken as 0.
In one example, the maximum of the latest time updated and the first time may be determined as the first version time. Specifically, it can be determined by the following calculation logic:
dataVersion=max(max(related entities updateTime),dumpTaskStartTime)
where dataVersion is the first version time, max(related entities updateTime) is the latest updated time, and dumpTaskStartTime is the first time.
For example, if the latest time of update is greater than the first time, which means that the latest time of update is later than the first time, the latest time of update is determined as the first version time.
For another example, if the latest time of update is not greater than the first time, which indicates that the latest time of update is not later than the first time, the first time is determined as the first version time.
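The calculation logic above can be written out as a short Python sketch. The function name first_version_time is an assumption, and so is the convention that an unrecorded updated time is passed as None and treated as 0, which presumes non-negative timestamps so that 0 is never greater than the first time.

```python
from typing import Iterable, Optional


def first_version_time(update_times: Iterable[Optional[int]], dump_task_start_time: int) -> int:
    """dataVersion = max(max(related entities updateTime), dumpTaskStartTime)."""
    # An unrecorded updated time (None) is replaced by 0, a value that is
    # not greater than the first time under the stated assumption.
    latest_update = max((t if t is not None else 0 for t in update_times), default=0)
    return max(latest_update, dump_task_start_time)
```

With a first time of 200, updated times of 100 and 250 give a first version time of 250, while updated times of 100 and 150 give 200.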
As another possible implementation manner, network conditions and other factors may cause the order in which the target data information is received to differ from the order in which it was sent, which would affect the updating of the index document. To avoid this, when the target data information is determined through the received notification message, the first version time may be determined according to the time at which the data to be synchronized corresponding to the target identification information contained in the target data information was updated and the time at which the notification message was generated (denoted as the fourth time).
When the first version time of the target data information is determined, if the target data information is determined through the received notification message, the latest updated time of the data to be synchronized corresponding to the target identification information contained in the target data information may be determined first. Then, the latest updated time is compared with the fourth time, and the first version time of the target data information is determined according to the comparison result. If the updated time of the data to be synchronized corresponding to the target identification information contained in the target data information is not recorded, the updated time can be taken as any time not greater than the fourth time. For example, the updated time may be taken as 0.
In one example, the maximum of the latest updated time and the fourth time may be determined as the first version time. Specifically, it can be determined by the following calculation logic:
dataVersion=max(max(related entities updateTime),currTime)
where dataVersion is the first version time, max(related entities updateTime) is the latest updated time, and currTime is the fourth time.
For example, if the latest time of update is greater than the fourth time, which means that the latest time of update is later than the fourth time, the latest time of update is determined as the first version time.
For another example, if the latest time of update is not greater than the fourth time, which means that the latest time of update is not later than the fourth time, the fourth time is determined as the first version time.
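To illustrate why the fourth time helps when notification messages arrive out of order, the following sketch replays the same two notifications in both orders and checks that the resulting index state is identical. The tuple layout of a notification, the function names, and the in-memory dictionary are assumptions made for this example only.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical notification shape:
# (target_id, payload, latest_update_time or None, generated_at i.e. the fourth time)
Notification = Tuple[str, str, Optional[int], int]


def notification_version(latest_update_time: Optional[int], generated_at: int) -> int:
    """dataVersion = max(max(related entities updateTime), currTime)."""
    recorded = latest_update_time if latest_update_time is not None else 0
    return max(recorded, generated_at)


def replay(notifications: List[Notification]) -> Dict[str, Tuple[str, int]]:
    """Apply notifications in the given order with the version guard."""
    index: Dict[str, Tuple[str, int]] = {}
    for target_id, payload, updated, generated_at in notifications:
        version = notification_version(updated, generated_at)
        current = index.get(target_id)
        # Overwrite only when no version is stored or it is not later.
        if current is None or current[1] <= version:
            index[target_id] = (payload, version)
    return index


in_order = [("song-1", "v1", 95, 100), ("song-1", "v2", 195, 200)]
out_of_order = list(reversed(in_order))
```

In both orders the document for "song-1" ends up holding the payload of the notification generated later, because the stale notification carries a smaller first version time and is skipped.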
Determining the first version time by the method of this embodiment allows the first version time to accurately represent the latest time at which the target data information was updated. This makes it possible to determine, by comparing the second version time with the first version time, whether the index document of the target identification information was updated earlier than the target data information, and hence whether to update the index document of the target identification information according to the target data information. The problem that the updated index document of the target identification information is inconsistent with the data to be synchronized when the order in which the target data information is acquired differs from the order in which the data changes occurred is thereby avoided, and the accuracy and stability of data synchronization are improved.
The data processing method provided by the present disclosure is described below by a specific embodiment, and fig. 6 is a scene schematic diagram of a specific data processing flow provided by the embodiment of the present disclosure. As shown in fig. 6, when it is determined that the first time satisfies the preset determination condition, the data to be synchronized saved in the device for saving the data to be synchronized is acquired. The data to be synchronized is stored in the form of a source table in a database in the device.
For example, if it is determined that the time interval between the first time and the second time reaches the preset duration, each piece of data to be synchronized in the source table stored in the device for storing the data to be synchronized is dumped. The second time is the time at which the determination condition was last determined to be satisfied.
Each piece of data to be synchronized is summarized according to its corresponding identification information. For example, the data to be synchronized contained in the dumped source tables is joined.
For each piece of identification information, first data information is determined according to the identification information and each piece of data to be synchronized corresponding to the identification information. The data format of the first data information may be the same as the data format of the index document.
For example, the first data information may be determined by the spark big data platform according to the identification information and each data to be synchronized corresponding to the identification information.
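The summarizing step can be sketched as a grouping of dumped rows by identification information. The sketch below uses plain Python rather than a big data platform, and the row layout, the key name "id", and the shape of the resulting first data information are hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def build_first_data_info(
    rows: Iterable[Dict[str, Any]], id_key: str = "id"
) -> Dict[str, Dict[str, Any]]:
    """Group dumped rows by identification information.

    Returns one piece of first data information per identification value,
    holding every row that carries that value.
    """
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for row in rows:
        grouped[row[id_key]].append(row)
    return {
        ident: {"id": ident, "records": records}
        for ident, records in grouped.items()
    }
```

Rows from different source tables that share the same identification value end up in the same piece of first data information, which can then be converted into the data format of the index document.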
Each piece of second data information corresponding to the second time is acquired.
The data information to be synchronized that changed within the target time period is determined according to each piece of first data information and each piece of second data information, and each piece of second data information is updated according to each piece of first data information. Any changed data information to be synchronized is determined as the target data information, where the target time period is the time interval between the first time and the second time.
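The comparison between first data information and second data information can be sketched as a snapshot diff keyed by identification information. The function name and the dictionary representation are assumptions; the sketch covers only the two cases of newly appearing and modified data information, and does not address how deletions are detected.

```python
from typing import Any, Dict, List


def changed_data_info(
    first: Dict[str, Dict[str, Any]], second: Dict[str, Dict[str, Any]]
) -> List[str]:
    """Return the identification values whose data information changed.

    first:  data information built at the first time (current snapshot)
    second: data information kept from the second time (previous snapshot)
    """
    changed: List[str] = []
    for ident, info in first.items():
        # Changed if there is no previous entry or the entries differ.
        if ident not in second or second[ident] != info:
            changed.append(ident)
    return changed
```

After the changed entries are collected, the previous snapshot would be replaced by the current one so that the next round compares against up-to-date second data information.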
For any determined target data information, the following steps are executed:
The first version time of the changed target data information, the target change type of the target data information, and the target identification information contained in the target data information are acquired.
The time at which the data to be synchronized corresponding to the target identification information contained in the target data information was updated can be determined, and the first version time is determined according to the updated time and the first time.
Whether the target change type is the deletion type is judged. The deletion type indicates that each piece of data to be synchronized corresponding to the target identification information contained in the target data information has been deleted.
If the target change type is the deletion type, each piece of data to be synchronized corresponding to the target identification information contained in the index document of the target identification information is directly deleted, and the second version time of the index document of the target identification information is updated according to the third time corresponding to the target data information.
If the target change type is not the deletion type, whether a second version time exists for the index document of the target identification information and whether the second version time is not later than the first version time are judged.
If it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, the index document of the target identification information is updated according to the target data information, and the second version time of the updated index document of the target identification information is updated according to the first version time;
and if it is determined that a second version time exists for the index document of the target identification information and the second version time is later than the first version time, it is determined that the index document of the target identification information is not updated.
The data processing method provided by the present disclosure is described below through a specific embodiment. Fig. 7 is a scene schematic diagram of a specific data processing flow provided by the embodiment of the present disclosure. As shown in fig. 7, changes to the data to be synchronized stored in the device for storing the data to be synchronized can be monitored in real time. For example, the data to be synchronized is stored in the form of a source table in a database in the device, and the electronic device monitors the binlog of the database for changes.
When the change notification message is received, analyzing the change notification message, and acquiring the first version time of the changed target data information carried in the change notification message, the target change type of the target data information, and the target identification information contained in the target data information.
The time at which the data to be synchronized corresponding to the target identification information contained in the target data information was updated can be determined, and the first version time is determined according to the updated time and the fourth time corresponding to the target data information.
Whether the target change type is the deletion type is judged. The deletion type indicates that each piece of data to be synchronized corresponding to the target identification information contained in the target data information has been deleted.
If the target change type is the deletion type, each piece of data to be synchronized corresponding to the target identification information contained in the index document of the target identification information is directly deleted, and the second version time of the index document of the target identification information is updated according to the third time corresponding to the target data information.
If the target change type is not the deletion type, whether a second version time exists for the index document of the target identification information and whether the second version time is not later than the first version time are judged.
If it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, the index document of the target identification information is updated according to the target data information, and the second version time of the updated index document of the target identification information is updated according to the first version time;
and if it is determined that a second version time exists for the index document of the target identification information and the second version time is later than the first version time, it is determined that the index document of the target identification information is not updated.
In the present disclosure, the changed target data information corresponds to a first version time that identifies when the target data information was updated, and the index document of the target identification information corresponds to a second version time that identifies when that index document was updated. When the index document of the target identification information is updated according to the target data information, the comparison result between the second version time of the index document and the first version time can therefore be taken into account. This effectively avoids the problem that, when the target data information was updated earlier than the index document of the target identification information, directly updating the index document according to the target data information would replace later-updated data in the index document, and thus effectively improves the accuracy and stability of data synchronization.
In the following description with reference to a specific application scenario, fig. 8 is a schematic diagram of a data processing process in a specific music database provided by the present disclosure, where the process includes:
a device for storing data to be synchronized stores a list line wide table, a song wide table and an authorization book contract wide table, and each table stores data to be synchronized.
If the time interval between the first time and the second time reaches the preset duration, the source tables stored in the device for storing the data to be synchronized are dumped.
Each piece of data to be synchronized stored in the list line wide table and the authorization book contract wide table is summarized according to its corresponding identification information to obtain a list-authorization book aggregation table.
Each piece of data to be synchronized stored in the list-authorization book aggregation table and the song wide table is summarized according to its corresponding identification information to obtain the song list wide table.
For each piece of identification information, first data information is determined according to the identification information and each piece of data to be synchronized corresponding to the identification information, and the data format of the first data information is converted into doc.
Each piece of second data information corresponding to the second time is acquired.
The data information to be synchronized that changed within the target time period is determined according to each piece of first data information and each piece of second data information. Any changed data information to be synchronized is determined as the target data information, where the target time period is the time interval between the first time and the second time.
For any determined target data information, the following data synchronization steps are executed:
The first version time of the changed target data information, the target change type of the target data information, and the target identification information contained in the target data information are acquired.
The time at which the data to be synchronized corresponding to the target identification information contained in the target data information was updated can be determined, and the first version time is determined according to the updated time and the first time.
Whether the target change type is the deletion type is judged. The deletion type indicates that each piece of data to be synchronized corresponding to the target identification information contained in the target data information has been deleted.
If the target change type is the deletion type, each piece of data to be synchronized corresponding to the target identification information contained in the index document of the target identification information is directly deleted, and the second version time of the index document of the target identification information is updated according to the third time corresponding to the target data information.
If the target change type is not the deletion type, whether a second version time exists for the index document of the target identification information and whether the second version time is not later than the first version time are judged.
If it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, the index document of the target identification information is updated according to the target data information, and the second version time of the updated index document of the target identification information is updated according to the first version time;
and if it is determined that a second version time exists for the index document of the target identification information and the second version time is later than the first version time, it is determined that the index document of the target identification information is not updated.
After determining that each piece of target data information is synchronized into the corresponding index document, the synchronization result can be notified to relevant staff through a preset communication mode.
The present disclosure also provides a data processing apparatus, and fig. 9 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present disclosure, where the apparatus includes:
a processing unit 91, configured to acquire a first version time of changed target data information and target identification information included in the target data information; wherein the first version time is used for identifying the time when the target data information is updated;
an updating unit 92, configured to determine whether to update the index document of the target identification information according to a comparison result between the second version time and the first version time of the index document of the target identification information; the second version time is used for identifying the time when the index document of the target identification information is updated.
Since the principle of the data processing apparatus for solving the problem is similar to that of the data processing method, the implementation of the data processing apparatus may refer to the implementation of the method, and repeated details are not repeated.
In a possible implementation manner, the processing unit 91 is specifically configured to determine the target data information that is changed according to the data information carried in the received change notification message.
In a possible implementation manner, the processing unit 91 is specifically configured to determine, if it is determined that the first time meets a preset determination condition, data information to be synchronized, which is changed within a target time period; determining any changed data information to be synchronized as the target data information; the target time period is a time interval between the first time and a second time; the second time is the last determined time when the determination condition is satisfied.
In a possible implementation manner, the processing unit 91 is specifically configured to determine that the first time meets a preset determination condition if it is determined that the first time is the time when the full-amount synchronization instruction is received; the full synchronization instruction is an instruction for synchronizing each data to be synchronized at the first time into the corresponding index document; or if the time interval between the first time and the second time is determined to reach the preset duration, determining that the first time meets the preset determination condition.
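The two alternatives of the determination condition can be expressed as a small predicate. The function name, the parameter names, and the use of numeric timestamps are assumptions for this sketch; how the full synchronization instruction is received is outside its scope.

```python
def meets_determination_condition(
    first_time: int,
    second_time: int,
    preset_duration: int,
    full_sync_requested: bool = False,
) -> bool:
    """Return True when the first time satisfies the determination condition.

    Either a full synchronization instruction was received at the first
    time, or the interval since the second time reaches the preset duration.
    """
    if full_sync_requested:
        return True
    return first_time - second_time >= preset_duration
```

For instance, with a preset duration of 3600 and a second time of 1000, a first time of 4600 satisfies the condition, while a first time of 4599 does so only if a full synchronization instruction was received.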
In a possible implementation manner, the processing unit 91 is specifically configured to obtain each to-be-synchronized data at the first time offline, and summarize each to-be-synchronized data according to identification information corresponding to each to-be-synchronized data; for each piece of identification information, determining first data information according to the identification information and each piece of data to be synchronized corresponding to the identification information; acquiring each piece of second data information corresponding to the second time; and determining the data information to be synchronized, which is changed in the target time period, according to each piece of first data information and each piece of second data information, and updating each piece of second data information according to each piece of first data information.
In a possible implementation manner, the processing unit 91 is specifically configured to, for the identification information included in each piece of first data information, determine, if it is determined that there is no second data information including the identification information, that the first data information including the identification information is to-be-synchronized data information that is changed; and if the second data information containing the identification information is determined to exist and the second data information is inconsistent with the first data information containing the identification information, determining the first data information containing the identification information as the changed data information to be synchronized.
In a possible implementation manner, the processing unit 91 is further configured to obtain a target change type of the target data information;
the updating unit 92 is further configured to determine whether the target change type is a deletion type; the deletion type represents that each data to be synchronized corresponding to the target identification information contained in the target data information is deleted; if the target change type is a deletion type, directly deleting each data to be synchronized corresponding to the target identification information contained in the index document of the target identification information, and updating the second version time of the index document of the target identification information according to the third time corresponding to the target data information; and if the target change type is not the deletion type, executing a subsequent step of determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information.
In a possible implementation manner, the updating unit 92 is specifically configured to update the index document of the target identification information according to the target data information if it is determined that a second version time of the index document of the target identification information does not exist, or the second version time is not later than the first version time; updating the second version time of the index document of the updated target identification information according to the first version time; and if the second version time of the index document with the target identification information is determined to exist and the second version time is later than the first version time, determining not to update the index document with the target identification information.
In a possible implementation manner, the updating unit 92 is specifically configured to obtain the first version time by:
if the target data information is determined by the changed data information to be synchronized, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; determining the first version time according to the updated time and the first time;
if the target data information is determined through the received notification message, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; and determining the first version time according to the updated time and a fourth time corresponding to the target data information.
In the disclosure, the changed target data information has a first version time identifying when the target data information was updated, and the index document of the target identification information has a second version time identifying when that index document was updated. When the index document is to be updated according to the target data information, the result of comparing the second version time with the first version time can therefore be taken into account. This avoids the problem that arises when the target data information was updated earlier than the index document: updating the index document directly according to the target data information would replace later-updated data in the index document with older data. The accuracy and stability of data synchronization are thereby effectively improved.
Fig. 10 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure. On the basis of the foregoing embodiments, an embodiment of the present disclosure further provides an electronic device which, as shown in Fig. 10, includes: a processor 1001, a communication interface 1002, a memory 1003 and a communication bus 1004, wherein the processor 1001, the communication interface 1002 and the memory 1003 communicate with one another through the communication bus 1004;
the memory 1003 has stored therein a computer program which, when executed by the processor 1001, causes the processor 1001 to perform the steps of:
acquiring first version time of changed target data information and target identification information contained in the target data information; wherein the first version time is used for identifying the time when the target data information is updated;
determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information; the second version time is used for identifying the time when the index document of the target identification information is updated.
Because the electronic device solves the problem on a principle similar to that of the data processing method, its implementation may refer to the implementation of the method, and the details are not repeated here.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface 1002 is used for communication between the electronic apparatus and other apparatuses.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.
On the basis of the foregoing embodiments, the embodiments of the present disclosure further provide a computer-readable storage medium, in which a computer program executable by a processor is stored, and when the program runs on the processor, the processor is caused to execute the following steps:
acquiring first version time of changed target data information and target identification information contained in the target data information; wherein the first version time is used for identifying the time when the target data information is updated;
determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information; the second version time is used for identifying the time when the index document of the target identification information is updated.
Since the computer-readable storage medium solves the problem on a principle similar to that of the data processing method, its specific implementation may refer to the implementation of the data processing method, and the details are not repeated here.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the present disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications can be made in the present disclosure without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is intended to include such modifications and variations as well.

Claims (10)

1. A method of data processing, the method comprising:
acquiring first version time of changed target data information and target identification information contained in the target data information; wherein the first version time is used for identifying the time when the target data information is updated;
determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information; the second version time is used for identifying the time when the index document of the target identification information is updated.
2. The method of claim 1, wherein the obtaining of the changed target data information comprises:
and determining the changed target data information according to the data information carried in the received change notification message.
3. The method of claim 1, wherein the obtaining of the changed target data information comprises:
if the first time meets the preset determination condition, determining the data information to be synchronized which is changed in the target time period; determining any changed data information to be synchronized as the target data information; the target time period is a time interval between the first time and a second time; the second time is the last determined time when the determination condition is satisfied.
4. The method of claim 3, wherein the determining that the first time satisfies a predetermined determination condition comprises:
if the first time is determined to be the time for receiving the full-scale synchronization instruction, determining that the first time meets a preset determination condition; the full synchronization instruction is an instruction for synchronizing each data to be synchronized at the first time into the corresponding index document; or
if the time interval between the first time and the second time is determined to reach a preset duration, determining that the first time meets a preset determination condition.
5. The method according to any one of claims 1-4, further comprising:
acquiring a target change type of the target data information;
judging whether the target change type is a deletion type; the deletion type represents that each data to be synchronized corresponding to the target identification information contained in the target data information is deleted;
if the target change type is a deletion type, directly deleting each data to be synchronized corresponding to the target identification information contained in the index document of the target identification information, and updating the second version time of the index document of the target identification information according to the third time corresponding to the target data information;
and if the target change type is not the deletion type, executing a subsequent step of determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information.
6. The method of claim 5, wherein the determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information comprises:
if it is determined that no second version time exists for the index document of the target identification information, or that the second version time is not later than the first version time, updating the index document of the target identification information according to the target data information; updating the second version time of the updated index document of the target identification information according to the first version time;
and if it is determined that a second version time exists for the index document of the target identification information and that the second version time is later than the first version time, determining not to update the index document of the target identification information.
7. The method of claim 5, wherein the first version time is obtained by:
if the target data information is determined by the data information to be synchronized which is changed in the target time period, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; determining the first version time according to the updated time and the first time;
if the target data information is determined through the received notification message, determining the time when the data to be synchronized corresponding to the target identification information contained in the target data information is updated; and determining the first version time according to the updated time and a fourth time corresponding to the target data information.
8. A data processing apparatus, characterized in that the apparatus comprises:
the processing unit is used for acquiring first version time of changed target data information and target identification information contained in the target data information; wherein the first version time is used for identifying the time when the target data information is updated;
the updating unit is used for determining whether to update the index document of the target identification information according to the comparison result of the second version time and the first version time of the index document of the target identification information; the second version time is used for identifying the time when the index document of the target identification information is updated.
9. An electronic device, characterized in that the electronic device comprises at least a processor and a memory, the processor being adapted to carry out the steps of the data processing method according to any of claims 1-7 when executing a computer program stored in the memory.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the data processing method according to any one of claims 1 to 7.
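The determination condition of claims 3 and 4 (a full synchronization instruction is received, or the interval since the last satisfied time reaches a preset duration) and the selection of data changed in the target time period can be illustrated with a minimal Python sketch. The function names, the numeric timestamps, the mapping from identification information to change time, and the choice of an interval that excludes the second time and includes the first time are all assumptions made for illustration, not details taken from the claims.

```python
from typing import Dict, List


def satisfies_condition(first_time: float, second_time: float,
                        preset_duration: float,
                        full_sync_requested: bool) -> bool:
    """Return True if the first time meets the preset determination condition."""
    if full_sync_requested:
        # The first time is the time at which a full synchronization
        # instruction was received.
        return True
    # Otherwise the interval since the last satisfied time must reach
    # the preset duration.
    return first_time - second_time >= preset_duration


def changed_in_target_period(change_times: Dict[str, float],
                             second_time: float,
                             first_time: float) -> List[str]:
    """Return the IDs whose data changed between the second and first time.

    Assumption: the period excludes second_time and includes first_time,
    so a change is not picked up by two consecutive scans.
    """
    return sorted(
        target_id
        for target_id, changed_at in change_times.items()
        if second_time < changed_at <= first_time
    )
```

Each ID returned by `changed_in_target_period` would then be treated as target data information and passed through the version-time comparison of claim 1.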
CN202110520749.9A 2021-05-13 2021-05-13 Data processing method, device, equipment and medium Active CN113157716B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110520749.9A CN113157716B (en) 2021-05-13 2021-05-13 Data processing method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110520749.9A CN113157716B (en) 2021-05-13 2021-05-13 Data processing method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN113157716A (en) 2021-07-23
CN113157716B (en) 2023-05-26

Family

ID=76874761

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110520749.9A Active CN113157716B (en) 2021-05-13 2021-05-13 Data processing method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN113157716B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090063422A1 (en) * 2007-09-05 2009-03-05 Shoji Kodama Search engine system using snapshot function of storage system
CN106254094A (en) * 2016-07-19 2016-12-21 中国银联股份有限公司 A kind of method of data synchronization and system
CN107315825A (en) * 2017-07-05 2017-11-03 北京奇艺世纪科技有限公司 A kind of index upgrade system, method and device
CN111324660A (en) * 2018-12-13 2020-06-23 杭州海康威视系统技术有限公司 Data synchronization method and device, electronic equipment and machine-readable storage medium
CN112256715A (en) * 2020-11-12 2021-01-22 微医云(杭州)控股有限公司 Index updating method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114372064A (en) * 2022-03-22 2022-04-19 飞狐信息技术(天津)有限公司 Data processing apparatus, method, computer readable medium and processor
CN114372064B (en) * 2022-03-22 2022-07-12 飞狐信息技术(天津)有限公司 Data processing apparatus, method, computer readable medium and processor

Also Published As

Publication number Publication date
CN113157716B (en) 2023-05-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant