WO2017008677A1 - Data detection method and device - Google Patents

Info

Publication number
WO2017008677A1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter information
data
specified file
list
parameter
Application number
PCT/CN2016/089079
Other languages
French (fr)
Chinese (zh)
Inventor
梁永锋
Original Assignee
阿里巴巴集团控股有限公司
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2017008677A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/30: Monitoring
    • G06F11/34: Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment

Definitions

  • the present application relates to the field of data processing technologies, and in particular, to a data detection method and a data detection device.
  • in web applications, log files are constantly generated, such as tracking ("buried point") logs on pages that record user behavior. These logs are distributed across many servers, as many as 100,000 (10W), with multiple log files on each machine. The business needs these log files synchronized to a data warehouse system for analysis, so the synchronization process must ensure that no data is lost in order to provide a real guarantee for the business.
  • one is a sample check, which periodically writes marker logs that follow a known pattern into the log and then checks whether these marker logs exist at the target (the data warehouse system). If they do not exist, the data is considered lost.
  • the other is a final consistency check. During data collection, the final record count of the file is sent to the target, and the target then counts the records it actually received. If the two counts differ, the data is considered lost.
  • the technical problem to be solved by the embodiments of the present application is to provide a data detection method, which can improve the detection accuracy of lost data.
  • the embodiment of the present application further provides a data detecting apparatus for ensuring implementation and application of the foregoing method.
  • a data detection method including:
  • the determining whether the obtained parameter information corresponding to the specified file has a gap includes:
  • the parameter information of the acquired synchronization data is inserted into the parameter information linked list of the specified file, where the parameter information linked list includes parameter information corresponding to the specified file acquired earlier;
  • searching, according to a preset condition, for vacant parameter information in the merged parameter information linked list.
  • the searching, according to the preset condition, for vacant parameter information in the merged parameter information linked list includes:
  • repeatedly searching, according to a preset count threshold, for vacant parameter information in the merged parameter information linked list.
  • when the search result is that vacant parameter information exists and the number of searches reaches the count threshold, it is determined that there is a gap in the acquired parameter information corresponding to the specified file.
  • the method further includes:
  • the parameter information of the synchronization data includes an offset and a data length
  • the offset of the next synchronization data is the sum of the offset of the synchronization data and the data length
  • the embodiment of the present application further discloses a data detecting apparatus, including:
  • a parameter obtaining unit configured to acquire parameter information of the synchronization data when the data of the specified file is synchronized, the parameter information being associated with the synchronization data
  • a parameter confirmation unit configured to determine whether there is a gap in each parameter information corresponding to the specified file that has been obtained
  • the result confirming unit is configured to determine that the specified file has data loss when the parameter confirming unit determines that there is a gap.
  • parameter confirmation unit includes:
  • an insertion subunit configured to insert the acquired parameter information of the synchronization data into a parameter information linked list of the specified file, where the parameter information linked list includes parameter information corresponding to the specified file acquired earlier;
  • the confirmation subunit is configured to search, according to a preset condition, for vacant parameter information in the merged parameter information linked list.
  • the confirmation subunit is configured to repeatedly search, according to a preset count threshold, for vacant parameter information in the merged parameter information linked list, and, when the search result is that vacant parameter information exists and the number of searches reaches the count threshold, to determine that there is a gap in the acquired parameter information corresponding to the specified file.
  • the device further includes:
  • the lost data determining unit is configured to determine the data corresponding to the vacant parameter information in the merged parameter information linked list as the data lost from the specified file.
  • the parameter information of the synchronization data includes an offset and a data length
  • the offset of the next synchronization data is the sum of the offset of the synchronization data and the data length
  • the embodiments of the present application include the following advantages:
  • the method disclosed in the embodiment of the present application not only improves the detection accuracy of the lost data, but also has a simple and easy detection process and high detection efficiency.
  • FIG. 1 is a flow chart showing the steps of an embodiment of a data detecting method of the present application
  • FIG. 2 is a flow chart showing the steps of a method for determining whether there is a gap in each parameter information corresponding to the specified file that has been obtained in the present application;
  • FIG. 3 is a schematic diagram of a parameter information linked list in the present application.
  • FIG. 4 is a structural block diagram of an embodiment of a data detecting apparatus of the present application.
  • FIG. 5 is a block diagram showing the structure of a parameter confirming unit in the present application.
  • FIG. 6 is a structural block diagram of another embodiment of a data detecting apparatus of the present application.
  • referring to FIG. 1, a flow chart of the steps of an embodiment of a data detection method of the present application is shown, which may specifically include the following steps:
  • Step 101: When the data of the specified file is synchronized, acquire parameter information of the synchronization data, where the parameter information is associated with the synchronization data.
  • the device for detecting whether data is lost may be the target server of the data synchronization itself, a module provided in the target server, or a device independent of the target server.
  • the source server transmitting the data may send the data of the specified file together with the parameter information of the current data to the target server, and the device then requests the parameter information of the synchronization data from the target server; alternatively, the source server may send the data of the specified file and the parameter information of the current data separately, to the target server and to the device respectively.
  • the parameter information may be newly added information, in which case the source server and the device agree that the source server carries the parameter information of the current data each time data synchronization is performed on the specified file.
  • the parameter information is associated with the synchronization data, the parameter information of different data is different, and the parameter information may follow a certain rule; for example, the parameter information of consecutively transmitted synchronization data is continuous.
  • the parameter information may also be information that already exists for the file, such as an offset and a data length. Each read of the file yields two parameters, the offset (offset) and the data length (length). When the source server reads the updated data of the file for data synchronization, it transmits the synchronization data together with the offset and data length of that read.
  • the parameter information is associated with the synchronization data, and the offset of the next synchronization data is the sum of the offset of the current synchronization data and its data length, namely: offset of the next synchronization data = offset of the current synchronization data + length of the current synchronization data.
  • the length of each synchronization data may be the same or different.
  • the parameter information may also be information of other content, as long as the parameter information is associated with the synchronization data of the specified file, and may be used to distinguish the synchronization data of the specified file at each synchronization.
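The offset and length rule described above can be illustrated with a minimal Python sketch of the source-server side. The names `SyncRecord` and `read_sync_records`, and the idea of cutting the unsynchronized tail of a file into chunks of at most `max_chunk` bytes, are assumptions made for illustration and are not taken from the application.

```python
from typing import Iterator, NamedTuple


class SyncRecord(NamedTuple):
    """One unit of synchronization data plus its parameter information (hypothetical type)."""
    offset: int    # position of the first byte of this chunk in the file
    length: int    # number of bytes in this chunk
    payload: bytes


def read_sync_records(content: bytes, start: int, max_chunk: int) -> Iterator[SyncRecord]:
    """Split the not-yet-synchronized tail of a file into chunks tagged with offset and length."""
    offset = start
    while offset < len(content):
        payload = content[offset:offset + max_chunk]
        yield SyncRecord(offset, len(payload), payload)
        # Rule from the text: next offset = current offset + current length.
        offset += len(payload)


records = list(read_sync_records(b"abcdefghij", start=0, max_chunk=4))
# The chunks start at offsets 0, 4 and 8 and have lengths 4, 4 and 2.
```

Because the chunk lengths may differ, the receiver cannot predict the next offset from the offset alone; it needs both parameters of the previous record.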
  • Step 102: Determine whether there is a gap in the acquired parameter information corresponding to the specified file.
  • this step may determine whether there is a gap according to all the acquired parameter information corresponding to the specified file. Specifically, the rule that the parameter information follows can be used to determine whether any parameter information is vacant or missing.
  • for example, if the parameter information of consecutively sent synchronization data in the previous step is continuous, this step can determine whether the acquired parameter information of the specified file is continuous: if it is continuous, there is no gap; if it is not continuous, parameter information is missing.
  • for parameter information of other content, since the parameter information is associated with the synchronization data of the specified file, the rule between the different parameter information may likewise be used to determine whether there is a gap in the acquired parameter information of the specified file.
  • if it is determined that there is a gap, step 103 is performed; if there is no gap, step 104 is performed.
  • Step 103: Determine that there is data loss in the specified file.
  • since the source server performs data synchronization according to the data updates of the specified file, a gap in the parameter information acquired by the device means that the data corresponding to the vacant parameter information has been lost and the data synchronization has a problem.
  • Step 104: Determine that there is no data loss in the specified file.
  • the method disclosed in the embodiment of the present application not only improves the detection accuracy of the lost data, but also has a simple and easy detection process and high detection efficiency.
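Steps 101 to 104 can be sketched in Python for the offset and length case. This is a simplified illustration rather than the implementation of the application: the function name `has_gap` and the assumption that the file's first offset is known (`start`, defaulting to 0) are introduced here.

```python
from typing import Iterable, Tuple


def has_gap(params: Iterable[Tuple[int, int]], start: int = 0) -> bool:
    """Return True if the (offset, length) pairs do not cover the file contiguously.

    Duplicate or overlapping pairs are tolerated, so retransmitted data
    does not produce a false alarm.
    """
    expected = start  # offset that the next record should begin at (or before)
    for offset, length in sorted(params):
        if offset > expected:
            return True  # step 103: a gap means data loss
        expected = max(expected, offset + length)
    return False  # step 104: no gap, no data loss


complete = [(0, 4), (4, 4), (8, 2)]
with_loss = [(0, 4), (8, 2)]          # the record at offset 4 never arrived
with_duplicate = [(0, 4), (0, 4), (4, 4)]
```

Note that a check of this kind only sees gaps between received records; a missing final record leaves no later record to expose it.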
  • the process of determining whether the parameter information corresponding to the specified file is vacant may further include:
  • Step 201: Insert the acquired parameter information of the synchronization data into the parameter information linked list of the specified file, where the parameter information linked list includes the parameter information corresponding to the specified file acquired earlier.
  • a parameter information linked list of the specified file may be established, where the parameter information linked list includes all the acquired parameter information corresponding to the specified file, that is, the parameter information of the synchronization data obtained during each data synchronization of the file.
  • the parameter information linked list can arrange the parameter information in order, for example by the size of the parameter information, to facilitate searching.
  • after the device obtains the parameter information of new synchronization data of the specified file in step 101, the parameter information is inserted into the corresponding parameter information linked list according to the arrangement rule.
  • Step 202: Merge adjacent parameter information in the parameter information linked list.
  • adjacent parameter information is the parameter information of the synchronization data from two adjacent data synchronizations.
  • Step 203: Search, according to a preset condition, for vacant parameter information in the merged parameter information linked list.
  • after merging, vacant parameter information may exist between entries that could not be merged.
  • a condition may also be set in advance, and vacant parameter information is determined to exist only if the search result satisfies the preset condition.
  • for example, a threshold on the number of searches is set in advance, and the merged parameter information linked list is searched repeatedly for vacant parameter information according to this count threshold.
  • when the search result is that vacant parameter information exists and the number of searches reaches the count threshold, it is determined that there is a gap in the acquired parameter information corresponding to the specified file.
  • the data corresponding to the vacant parameter information in the merged parameter information linked list may then be determined as the data lost during data synchronization of the specified file.
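Steps 201 to 203, including the count threshold, can be sketched as follows. Several choices here are assumptions for illustration only: a sorted Python list stands in for the linked list, ranges are stored as half-open `[start, end)` pairs, and the threshold is interpreted as "the same gap was observed in this many consecutive scans". The class names `ParamList` and `GapDetector` are hypothetical.

```python
import bisect


class ParamList:
    """Sorted list of [start, end) ranges built from (offset, length) pairs."""

    def __init__(self):
        self.ranges = []

    def insert(self, offset, length):
        # Step 201: insert the new parameter information in sorted position.
        bisect.insort(self.ranges, [offset, offset + length])

    def merge(self):
        # Step 202: merge ranges that touch or overlap.
        merged = []
        for start, end in self.ranges:
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        self.ranges = merged

    def gaps(self):
        # After merging, every pair of neighbouring ranges is separated by a gap.
        return [(a[1], b[0]) for a, b in zip(self.ranges, self.ranges[1:])]


class GapDetector:
    """Step 203: report a gap only after it has survived `threshold` consecutive scans."""

    def __init__(self, threshold):
        self.params = ParamList()
        self.threshold = threshold
        self.seen = {}  # gap -> number of consecutive scans in which it appeared

    def scan(self):
        self.params.merge()
        self.seen = {g: self.seen.get(g, 0) + 1 for g in self.params.gaps()}
        return [g for g, n in self.seen.items() if n >= self.threshold]


detector = GapDetector(threshold=3)
detector.params.insert(0, 5)
detector.params.insert(8, 2)
reports = [detector.scan() for _ in range(3)]  # [[], [], [(5, 8)]]
```

Requiring several scans gives late or out-of-order records a chance to arrive before a gap is declared lost. In this sketch a gap that is partly filled is treated as a new gap and its counter restarts, which is one possible design choice among several.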
  • the following takes the parameter information as offset and length as an example for description.
  • these two parameters are carried and saved in the target server.
  • the parameter information and the actual synchronization data are saved together, and their result states must be consistent: both succeed or both fail.
  • the device for detecting whether data is lost may scan the parameters of the specified file saved by the target server to determine whether there is data loss.
  • the device establishes a parameter information linked list (TailHead) corresponding to the specified file, continuously reads parameter information of the new synchronous data from the target server, and inserts the parameter information into the parameter information linked list.
  • TailHead parameter information linked list
  • the established parameter information linked list is shown in FIG. 3. Taking the d1 object of TailHead as an example, its tail of 1 indicates an offset of 1 and a length of 1, and its head of 5 indicates an offset of 5 and a length of 1.
  • d1 indicates that the synchronization data from 1 to 5 has all been detected.
  • the data in the figure indicates that the synchronization data in 1 to 5, 8 to 10, 12, and m to n already exists and has not been lost.
  • when the device performs the foregoing step 101 and reads parameter information of new synchronization data of the specified file from the target server, for example an offset of 11 and a length of 1, it performs the foregoing step 201 and inserts the read parameter information into the linked list TailHead; according to the offset of 11 and the length of 1, the tail value of d2 is updated to 11.
  • the foregoing step 202 is then performed to merge adjacent parameter information in the linked list. For example, the tail of d2 is 11, that is, the offset is 11 and the length is 1, and the head of d3 is 12, that is, the offset is 12 and the length is 1, so d2 and d3 are adjacent and can be merged.
  • step 203 is performed to detect whether there is a gap in the parameter information. Taking d1 and d2 as an example, the offsets between them are 6 and 7, which indicates that the parameter information for these two offsets does not exist. If these two pieces of parameter information are still found to be absent after multiple checks (the count threshold is adjustable), the data corresponding to offsets 6 and 7 can be considered lost.
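The FIG. 3 walk-through can be reproduced with a short Python sketch. To avoid committing to a particular reading of tail and head, the sketch stores each entry as an inclusive `(first, last)` pair of offsets; that representation and the function names are assumptions made here. The symbolic m to n entry of the figure is omitted.

```python
def insert_and_merge(ranges, offset, length):
    """Insert one (offset, length) record and merge touching entries.

    `ranges` is a list of (first, last) inclusive offset pairs.
    """
    ranges = sorted(ranges + [(offset, offset + length - 1)])
    merged = [ranges[0]]
    for first, last in ranges[1:]:
        prev_first, prev_last = merged[-1]
        if first <= prev_last + 1:  # touching or overlapping the previous entry
            merged[-1] = (prev_first, max(prev_last, last))
        else:
            merged.append((first, last))
    return merged


def missing_offsets(ranges):
    """List every offset that lies between two neighbouring entries."""
    missing = []
    for (_, prev_last), (next_first, _) in zip(ranges, ranges[1:]):
        missing.extend(range(prev_last + 1, next_first))
    return missing


figure3 = [(1, 5), (8, 10), (12, 12)]             # d1, d2, d3
after = insert_and_merge(figure3, offset=11, length=1)
# after == [(1, 5), (8, 12)]: the new record joins d2 and d3 into one entry
lost_candidates = missing_offsets(after)
# lost_candidates == [6, 7]: the offsets between d1 and the merged entry
```

Offsets 6 and 7 are only candidates at this point; as the text notes, they are treated as lost once repeated checks up to the count threshold still find them absent.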
  • referring to FIG. 4, a structural block diagram of an embodiment of a data detecting apparatus of the present application is shown, which may specifically include the following units:
  • the parameter obtaining unit 401 is configured to acquire parameter information of the synchronization data when the data of the specified file is synchronized, and the parameter information is associated with the synchronization data.
  • the parameter confirmation unit 402 is configured to determine whether each parameter information corresponding to the specified file that has been acquired has a gap.
  • the result confirming unit 403 is configured to determine that the specified file has data loss when the parameter confirming unit determines that there is a gap.
  • by using the parameter information of the specified file and the association between the parameter information and the synchronization data of the specified file, the device only needs to determine whether the acquired parameter information has a gap in order to determine whether the specified file suffers data loss during data synchronization.
  • the device not only improves the detection accuracy of the lost data, but also has a simple and easy detection process and high detection efficiency.
  • if no gap is found, the result confirmation unit 403 may determine that there is no data loss in the specified file.
  • the parameter confirmation unit 402 may further include:
  • the insertion sub-unit 501 is configured to insert the acquired parameter information of the synchronization data into the parameter information linked list of the specified file, where the parameter information linked list includes parameter information corresponding to the specified file acquired earlier.
  • the merging subunit 502 is configured to merge adjacent parameter information in the parameter information linked list.
  • the confirmation subunit 503 is configured to search for the parameter information of the vacancy in the merged parameter information list according to the preset condition.
  • the confirmation subunit 503 may be specifically configured to repeatedly search, according to a preset count threshold, for vacant parameter information in the merged parameter information linked list, and, when the search result is that vacant parameter information exists and the number of searches reaches the count threshold, to determine that there is a gap in the acquired parameter information of the specified file.
  • the device may further include:
  • the lost data determining unit 601 is configured to determine the data corresponding to the vacant parameter information in the merged parameter information linked list as the data lost from the specified file.
  • the parameter information of the above synchronization data may include an offset and a data length, and the offset of the next synchronization data is the sum of the offset of the synchronization data and the data length.
  • the embodiment of the present application also discloses an electronic device, including a memory and a processor.
  • the processor and the memory are connected to each other through a bus; the bus may be an ISA bus, a PCI bus, or an EISA bus.
  • the bus can be divided into an address bus, a data bus, a control bus, and the like.
  • the memory is used to store a program, and specifically, the program may include program code, and the program code includes computer operation instructions.
  • the memory may include a high speed RAM memory and may also include a non-volatile memory such as at least one disk memory.
  • the processor is used to read the program code in the memory and perform the following steps:
  • the description is relatively brief; for the relevant parts, refer to the description of the method embodiment.
  • embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, embodiments of the present application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • the memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory.
  • Memory is an example of a computer readable medium.
  • Computer readable media includes both permanent and non-persistent, removable and non-removable media.
  • Information storage can be implemented by any method or technology. The information can be computer readable instructions, data structures, modules of programs, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, magnetic tape cartridges, magnetic tape storage or other magnetic storage devices, or any other non-transmission media that can be used to store information that can be accessed by a computing device.
  • computer readable media does not include transitory computer readable media, such as modulated data signals and carrier waves.
  • Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams can be implemented by computer program instructions.
  • These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • the computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising an instruction device, where the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing terminal device such that a series of operational steps are performed on the computer or other programmable terminal device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science
  • General Engineering & Computer Science
  • Theoretical Computer Science
  • Computer Hardware Design
  • Quality & Reliability
  • Physics & Mathematics
  • General Physics & Mathematics
  • Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

Disclosed are a data detection method and device. The data detection method comprises: when synchronizing data of a specified file, acquiring parameter information about the synchronized data, wherein the parameter information is associated with the synchronized data (101); determining whether each piece of parameter information corresponding to the acquired specified file has a vacancy (102); and if so, determining that data loss occurs in the specified file (103). The detection method not only improves the detection accuracy for lost data, but also has a simple and feasible detection process and relatively high detection efficiency.

Description

Data detection method and device
This application claims priority to Chinese Patent Application No. 201510419821.3, filed on July 16, 2015 and entitled "Data detection method and device", the entire contents of which are incorporated herein by reference.
Technical field
The present application relates to the field of data processing technologies, and in particular to a data detection method and a data detection device.
Background
In various web applications, many log files are constantly generated, such as tracking ("buried point") logs on pages that record user behavior. These logs are distributed across many servers, as many as 100,000 (10W), with multiple log files on each machine. The business needs these log files synchronized to a data warehouse system for analysis, so the synchronization process must ensure that no data is lost in order to provide a real guarantee for the business.
At present, in the synchronization of massive data, there are two schemes for detecting whether data is lost. One is a sample check, which periodically writes marker logs that follow a known pattern into the log and then checks whether these marker logs exist at the target (the data warehouse system). If they do not exist, the data is considered lost. The other is a final consistency check: during data collection, the final record count of the file is sent to the target, and the target then counts the records it actually received. If the two counts differ, the data is considered lost.
However, the sample check covers only a fraction of the data, so data can easily be lost without being detected. For the final consistency check, data is often duplicated during transmission; without deduplication, the final record count computed at the target is inaccurate in many cases. The above detection methods therefore have low accuracy in detecting lost data.
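The weakness of both existing schemes can be seen in a small Python illustration. The record names and the choice of marker records are invented for the example.

```python
sent = ["r1", "r2", "r3", "r4", "r5"]
received = ["r1", "r2", "r2", "r4", "r5"]  # r3 was lost and r2 was delivered twice

# Final consistency check: compares record counts only.
counts_match = len(received) == len(sent)

# Sample check: only the marker records (here r1 and r5) are looked for.
markers = ["r1", "r5"]
sample_check_passes = all(m in received for m in markers)

# Ground truth: which records never arrived.
actually_missing = [r for r in sent if r not in received]
```

Both checks pass even though `r3` is missing: the duplicate hides the loss from the count, and the lost record does not happen to be one of the sampled markers.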
Therefore, a technical problem that urgently needs to be solved by those skilled in the art is how to improve the detection accuracy for lost data.
Summary of the invention
The technical problem to be solved by the embodiments of the present application is to provide a data detection method that can improve the detection accuracy for lost data.
Correspondingly, the embodiments of the present application further provide a data detection device for ensuring the implementation and application of the foregoing method.
In order to solve the above problems, the present application discloses a data detection method, including:
when the data of a specified file is synchronized, acquiring parameter information of the synchronization data, the parameter information being associated with the synchronization data;
determining whether there is a gap in the acquired parameter information corresponding to the specified file;
if so, determining that there is data loss in the specified file.
Further, the determining whether there is a gap in the acquired parameter information corresponding to the specified file includes:
inserting the acquired parameter information of the synchronization data into a parameter information linked list of the specified file, where the parameter information linked list contains previously acquired parameter information corresponding to the specified file;
merging adjacent parameter information in the parameter information linked list;
searching, according to a preset condition, for vacant parameter information in the merged parameter information linked list.
Further, the searching, according to the preset condition, for vacant parameter information in the merged parameter information linked list includes:
repeatedly searching, according to a preset count threshold, for vacant parameter information in the merged parameter information linked list, and, when the search result is that vacant parameter information exists and the number of searches reaches the count threshold, determining that there is a gap in the acquired parameter information corresponding to the specified file.
Further, the method further includes:
determining the data corresponding to the vacant parameter information in the merged parameter information linked list as the data lost from the specified file.
Further, the parameter information of the synchronization data includes an offset and a data length, and the offset of the next synchronization data is the sum of the offset and the data length of the current synchronization data.
An embodiment of the present application further discloses a data detection device, including:
a parameter acquisition unit, configured to acquire parameter information of synchronization data when the data of a specified file is synchronized, the parameter information being associated with the synchronization data;
a parameter confirmation unit, configured to determine whether there is a gap among the acquired pieces of parameter information corresponding to the specified file;
a result confirmation unit, configured to determine that the specified file has data loss when the parameter confirmation unit determines that a gap exists.
Further, the parameter confirmation unit includes:
an insertion subunit, configured to insert the parameter information of the acquired synchronization data into a parameter information linked list of the specified file, the linked list containing previously acquired parameter information corresponding to the specified file;
a merging subunit, configured to merge adjacent parameter information in the parameter information linked list;
a confirmation subunit, configured to search, according to a preset condition, the merged parameter information linked list for missing parameter information.
Further, the confirmation subunit is configured to repeatedly search the merged parameter information linked list for missing parameter information according to a preset count threshold and, when the search result is that missing parameter information exists and the number of searches has reached the count threshold, to determine that there is a gap in the acquired parameter information corresponding to the specified file.
Further, the device also includes:
a lost data determination unit, configured to determine the data corresponding to the missing parameter information in the merged parameter information linked list to be the data lost from the specified file.
Further, the parameter information of the synchronization data includes an offset and a data length, and the offset of the next synchronization data is the sum of the offset and the data length of the current synchronization data.
Compared with the prior art, the embodiments of the present application have the following advantages:
By using the parameter information of the specified file and the association between that parameter information and the synchronization data of the specified file, the embodiments of the present application can determine whether data was lost during synchronization of the specified file simply by determining whether there is a gap in the acquired parameter information. Because every synchronization of the specified file's data carries parameter information, the method has no sampling-rate problem of the kind found in sampling checks, and so avoids the situation in which data loss goes undetected because the lost data was never sampled. Moreover, because the parameter information is associated with the synchronization data of the specified file and can identify the data of each individual synchronization, the method, unlike a final consistency check, is not made inaccurate by an incorrect record count at the destination when data is transmitted repeatedly. The method disclosed in the embodiments of the present application not only improves the accuracy of lost-data detection but is also simple to carry out and efficient.
Description of the drawings
FIG. 1 is a flowchart of the steps of an embodiment of a data detection method of the present application;
FIG. 2 is a flowchart of the steps of an embodiment of a method, in the present application, for determining whether there is a gap among the acquired pieces of parameter information corresponding to the specified file;
FIG. 3 is a schematic diagram of a parameter information linked list in the present application;
FIG. 4 is a structural block diagram of an embodiment of a data detection device of the present application;
FIG. 5 is a structural block diagram of a parameter confirmation unit in the present application;
FIG. 6 is a structural block diagram of another embodiment of a data detection device of the present application.
Detailed description
To make the above objects, features, and advantages of the present application clearer and easier to understand, the application is described in further detail below with reference to the drawings and specific embodiments.
Referring to FIG. 1, a flowchart of the steps of an embodiment of a data detection method of the present application is shown. The method may specifically include the following steps:
Step 101: when the data of a specified file is synchronized, acquire parameter information of the synchronization data, the parameter information being associated with the synchronization data.
In the embodiments of the present application, the device for detecting whether data is lost (hereinafter, the device) may be the target server of the data synchronization itself, a module provided within the target server, or a device that is independent of the target server and can interact with it.
When the data of the specified file is synchronized, for example when the updated data of a log is synchronized, the source server that sends the data may send the data of the specified file to the target server together with the parameter information of that data, and the device then requests the parameter information of the synchronization data from the target server. Alternatively, the source server may send the data of the specified file to the target server and the parameter information of that data to the device separately.
The parameter information may be newly added information agreed upon between the source server and the device: each time the source server synchronizes data of the specified file, it carries the parameter information of that data. The parameter information is associated with the synchronization data, and different data has different parameter information. The parameter information may follow a certain rule, for example that the parameter information of consecutively sent synchronization data is consecutive.
The parameter information may also be information that the file already has, for example an offset and a data length. Every read of a file has two pieces of parameter information, the offset and the data length (length). When the source server reads the updated data of the file for synchronization, it transmits the offset and data length of that read together with the synchronization data. This parameter information is associated with the synchronization data: the offset of the next synchronization data is the sum of the offset and the data length of the current synchronization data, that is:
offset_next = offset_now + length_now
where the length of each synchronization may or may not be the same.
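As an illustration of the relation above, the following minimal Python sketch generates the (offset, length) pairs that a source might attach to successive reads of a file. The function name `sync_parameters` and the starting offset of 0 are assumptions made for the example and do not come from the application.

```python
from typing import Iterator, List, Tuple


def sync_parameters(chunk_lengths: List[int], start: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield one (offset, length) pair per synchronization.

    Each offset is the previous offset plus the previous length, so the
    pairs describe back-to-back reads of the same file.
    """
    offset = start
    for length in chunk_lengths:
        yield offset, length
        offset += length  # offset_next = offset_now + length_now


# Three synchronizations of different lengths.
print(list(sync_parameters([3, 5, 2])))  # [(0, 3), (3, 5), (8, 2)]
```

Because the lengths may differ from one synchronization to the next, the receiver needs both values of a pair to know where the following pair should begin.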
The parameter information may also be other information, as long as it is associated with the synchronization data of the specified file and can be used to distinguish the synchronization data of each synchronization of the specified file.
Step 102: determine whether there is a gap among the acquired pieces of parameter information corresponding to the specified file.
After the parameter information is obtained in the previous step, this step determines, from all the acquired parameter information corresponding to the specified file, whether any parameter information is missing. Specifically, missing parameter information can be identified from the rule that the parameter information follows.
For example, if it was agreed in the previous step that the parameter information of consecutively sent synchronization data is consecutive, this step can check whether the acquired parameter information of the specified file is consecutive. If it is, there is no gap; if it is not, some parameter information is missing.
As another example, if the previous step uses the offset and data length as the parameter information, an adjacent offset can be computed from an acquired offset and data length using the relation between the offsets of two consecutive synchronizations, offset_next = offset_now + length_now. The acquired parameter information is then searched for that adjacent offset, and so on, which determines whether there is a gap in the acquired parameter information corresponding to the specified file.
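A minimal Python sketch of this check is given below. It assumes the acquired parameter information is available as a list of (offset, length) tuples; the function name `find_gaps` and the choice to report each gap as an (offset, length) pair are illustrative and not taken from the application.

```python
from typing import List, Tuple


def find_gaps(records: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (offset, length) for every stretch not covered by the records.

    Records are visited in offset order. `expected` is the offset at which
    the next record should begin, that is, the end of everything seen so far.
    """
    gaps = []
    expected = None
    for offset, length in sorted(records):
        if expected is not None and offset > expected:
            gaps.append((expected, offset - expected))
        end = offset + length
        expected = end if expected is None else max(expected, end)
    return gaps


print(find_gaps([(0, 3), (3, 5), (10, 2)]))  # [(8, 2)]
print(find_gaps([(0, 3), (0, 3), (3, 5)]))   # []
```

The second call shows that a record received twice does not produce a false gap, since only offsets beyond the expected position are reported.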
If other parameter information is used, then because the parameter information is associated with the synchronization data of the specified file, the rule relating different pieces of parameter information can be used to determine whether there is a gap in the acquired parameter information of the specified file.
If a gap is determined to exist, step 103 is performed; if no gap exists, step 104 is performed.
Step 103: determine that the specified file has data loss.
Because the source server synchronizes data in order, following the data updates of the specified file, a gap in the parameter information obtained by the device means that the data corresponding to the missing parameter information has been lost and that the data synchronization has a problem.
Step 104: determine that the specified file has no data loss.
By using the parameter information of the specified file and the association between that parameter information and the synchronization data of the specified file, the embodiments of the present application can determine whether data was lost during synchronization of the specified file simply by determining whether there is a gap in the acquired parameter information. Because every synchronization of the specified file's data carries parameter information, the method has no sampling-rate problem of the kind found in sampling checks, and so avoids the situation in which data loss goes undetected because the lost data was never sampled. Moreover, because the parameter information is associated with the synchronization data of the specified file and can identify the data of each individual synchronization, the method, unlike a final consistency check, is not made inaccurate by an incorrect record count at the destination when data is transmitted repeatedly. The method disclosed in the embodiments of the present application not only improves the accuracy of lost-data detection but is also simple to carry out and efficient.
In another embodiment of the present application, the process of determining whether there is a gap among the acquired pieces of parameter information corresponding to the specified file may, as shown in FIG. 2, further include:
Step 201: insert the parameter information of the acquired synchronization data into a parameter information linked list of the specified file, the linked list containing previously acquired parameter information corresponding to the specified file.
To make it easier to determine whether parameter information is missing, this embodiment may build a parameter information linked list for the specified file. The linked list contains all acquired parameter information corresponding to the specified file, that is, the parameter information of the synchronization data obtained in each data synchronization of the specified file. The linked list may order the parameter information, for example by magnitude, to make searching easier.
When the device obtains the parameter information of new synchronization data of the specified file in step 101, it inserts that parameter information into the corresponding parameter information linked list according to the ordering rule.
Step 202: merge adjacent parameter information in the parameter information linked list.
After the parameter information is inserted into the linked list, adjacent parameter information is merged. Adjacent parameter information is the parameter information of the synchronization data of two adjacent data synchronizations. Whether two pieces of parameter information are adjacent can be inferred from the rule by which the parameter information is set. For example, if the parameter information of consecutive synchronization data is consecutive, then consecutive parameter information is adjacent and can be merged; if the parameter information is set according to offset_next = offset_now + length_now above, adjacency is inferred from the data length, which then determines whether the two can be merged.
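The merge step can be sketched in Python as follows, under the assumption that each node of the linked list stores one (offset, length) pair and that the list is already ordered by offset. The `Node` class and the helper names are hypothetical; the application does not specify the node layout.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    offset: int
    length: int
    next: Optional["Node"] = None


def merge_adjacent(head: Optional[Node]) -> Optional[Node]:
    """Merge each node with its successor when they are adjacent.

    Two nodes are adjacent when the first ends exactly where the second
    begins: node.offset + node.length == node.next.offset.
    """
    node = head
    while node is not None and node.next is not None:
        nxt = node.next
        if node.offset + node.length == nxt.offset:
            node.length += nxt.length  # absorb the successor
            node.next = nxt.next
        else:
            node = nxt  # a gap remains here; move on
    return head


def to_list(head: Optional[Node]) -> List[Tuple[int, int]]:
    """Collect the (offset, length) pairs of a list for inspection."""
    out = []
    while head is not None:
        out.append((head.offset, head.length))
        head = head.next
    return out


head = Node(0, 3, Node(3, 5, Node(10, 2)))
print(to_list(merge_adjacent(head)))  # [(0, 8), (10, 2)]
```

After merging, every boundary between two remaining nodes marks a stretch for which no parameter information has been received yet.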
Step 203: search, according to a preset condition, the merged parameter information linked list for missing parameter information.
In this step, parameter information may be missing between pieces of parameter information that could not be merged. A condition may also be set for deciding whether parameter information is missing, so that missing parameter information is confirmed only when the search result satisfies the preset condition.
For example, a count threshold for searches is set in advance, and the merged linked list is searched repeatedly for missing parameter information according to that threshold. When the search result is that missing parameter information exists and the number of searches has reached the threshold, it is determined that there is a gap in the acquired parameter information corresponding to the specified file.
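One possible reading of the count threshold is sketched below in Python: a missing offset is confirmed only after it has been observed as missing in `threshold` consecutive searches, and its counter is discarded as soon as the offset shows up. The consecutive-search rule, the function name `confirm_missing`, and the representation of each search result as a set of missing offsets are all assumptions; the application states only that the search is repeated up to a preset threshold.

```python
from typing import Dict, Iterable, Set


def confirm_missing(searches: Iterable[Set[int]], threshold: int) -> Set[int]:
    """Return the offsets that were missing in `threshold` consecutive searches.

    `searches` holds, for each search in order, the set of offsets found
    to be missing from the merged list at that time.
    """
    counts: Dict[int, int] = {}
    for missing in searches:
        # An offset that has arrived in the meantime is no longer tracked.
        for offset in list(counts):
            if offset not in missing:
                del counts[offset]
        for offset in missing:
            counts[offset] = counts.get(offset, 0) + 1
    return {offset for offset, n in counts.items() if n >= threshold}


# Offset 11 arrives late and is filled in; 6 and 7 never arrive.
searches = [{6, 7, 11}, {6, 7}, {6, 7}]
print(sorted(confirm_missing(searches, threshold=3)))  # [6, 7]
```

Waiting for several searches gives late or out-of-order synchronizations time to arrive before a gap is reported as lost data.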
In addition, once the missing parameter information has been determined, the data corresponding to the missing parameter information in the merged linked list can be determined to be the data of the specified file that was lost during data synchronization.
The following description takes offset and length as the parameter information. Each data synchronization of the specified file carries these two parameters, which are saved on the target server. On the target server, the parameter information and the actual synchronization data must be saved with a consistent outcome: both succeed or both fail. The device for detecting whether data is lost can scan the parameters of the specified file saved on the target server and thereby determine whether data has been lost.
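The requirement that the parameters and the data succeed or fail together can be met with a single transaction. The Python sketch below uses the standard `sqlite3` module purely for illustration; the application does not say how the target server stores anything, and the table names `sync_data` and `sync_params`, the column names, and the function names are all hypothetical.

```python
import sqlite3


def create_tables(conn: sqlite3.Connection) -> None:
    """Create one table for the payloads and one for their parameters."""
    with conn:
        conn.execute("CREATE TABLE sync_data (file_id TEXT, offset INTEGER, payload BLOB)")
        conn.execute(
            "CREATE TABLE sync_params (file_id TEXT, offset INTEGER, length INTEGER, "
            "PRIMARY KEY (file_id, offset))"
        )


def save_sync(conn: sqlite3.Connection, file_id: str, offset: int, payload: bytes) -> None:
    """Store the payload and its (offset, length) in one transaction.

    Used as a context manager, the connection commits if both inserts
    succeed and rolls back if either raises, so the two tables cannot
    disagree about which synchronizations were saved.
    """
    with conn:
        conn.execute(
            "INSERT INTO sync_data (file_id, offset, payload) VALUES (?, ?, ?)",
            (file_id, offset, payload),
        )
        conn.execute(
            "INSERT INTO sync_params (file_id, offset, length) VALUES (?, ?, ?)",
            (file_id, offset, len(payload)),
        )
```

With this layout, the detection device only needs to read `sync_params`: a row there implies that the matching payload was saved as well.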
The device builds a parameter information linked list (TailHead) for the specified file, continually reads the parameter information of new synchronization data from the target server, and inserts it into the linked list. For ease of understanding, assume that the length in the parameter information obtained in every data synchronization is 1. The established linked list is shown in FIG. 3. Taking the TailHead object d1 as an example, a tail of 1 denotes offset 1 with length 1, and a head of 5 denotes offset 5 with length 1, so d1 indicates that all synchronization data from 1 to 5 has been detected. As shown in FIG. 3, the data in the figure indicates that the synchronization data for 1 to 5, 8 to 10, 12, and m to n already exists and has not been lost.
When the device performs step 101 and reads the parameter information of new synchronization data of the specified file from the target server, for example an offset of 11 and a length of 1, it performs step 201 and inserts the parameter information into the TailHead linked list: given offset 11 and length 1, it suffices to update the tail value of d2 to 11. It then performs step 202 to merge adjacent parameter information in the linked list. For example, the tail of d2 is 11, that is, offset 11 with length 1, and the head of d3 is 12, that is, offset 12 with length 1; since both lengths are 1, d2 and d3 are adjacent and are merged, giving a head of 8 and a tail of 13. It then performs step 203 to check whether parameter information is missing. Taking d1 and d2 as an example, the offsets in the interval between them are 6 and 7, which means that the data corresponding to these two pieces of parameter information does not exist. If these two parameters are still absent after several checks (the count threshold is adjustable), the data corresponding to offsets 6 and 7 can be considered lost.
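The walk-through above can be reproduced with a short Python sketch. It stores each run of detected offsets as an inclusive (first, last) pair in a plain Python list, which is an assumption made for readability and not necessarily the head/tail encoding of the TailHead objects in FIG. 3; the function names are likewise hypothetical. The run m to n is left out because its values are not given.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # inclusive (first, last) offsets; every length is 1


def insert_offset(runs: List[Run], offset: int) -> List[Run]:
    """Insert one offset, then merge runs that touch or overlap."""
    merged: List[Run] = []
    for first, last in sorted(runs + [(offset, offset)]):
        if merged and first <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged


def missing_offsets(runs: List[Run]) -> List[int]:
    """List the offsets that fall between consecutive runs."""
    missing: List[int] = []
    for (_, prev_last), (first, _) in zip(runs, runs[1:]):
        missing.extend(range(prev_last + 1, first))
    return missing


runs = [(1, 5), (8, 10), (12, 12)]  # d1, d2, d3 before offset 11 arrives
runs = insert_offset(runs, 11)      # steps 201 and 202
print(runs)                         # [(1, 5), (8, 12)]
print(missing_offsets(runs))        # [6, 7]
```

Offsets 6 and 7 are what a single search in step 203 would report; whether they are declared lost still depends on the count threshold.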
It should be noted that, for simplicity of description, the method embodiments are presented as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present application are not limited by the described order of actions, because according to the embodiments some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments and that the actions involved are not necessarily required by the embodiments of the present application.
Referring to FIG. 4, a structural block diagram of an embodiment of a data detection device of the present application is shown. The device may specifically include the following units:
A parameter acquisition unit 401, configured to acquire parameter information of synchronization data when the data of a specified file is synchronized, the parameter information being associated with the synchronization data.
A parameter confirmation unit 402, configured to determine whether there is a gap among the acquired pieces of parameter information corresponding to the specified file.
A result confirmation unit 403, configured to determine that the specified file has data loss when the parameter confirmation unit determines that a gap exists.
In the embodiments of the present application, by using the parameter information of the specified file and the association between that parameter information and the synchronization data of the specified file, the device can determine whether data was lost during synchronization of the specified file simply by determining whether there is a gap in the acquired parameter information. The device not only improves the accuracy of lost-data detection but is also simple to operate and efficient.
In another example, when the parameter confirmation unit determines that no gap exists, the result confirmation unit 403 may determine that the specified file has no data loss.
In another embodiment, as shown in FIG. 5, the parameter confirmation unit 402 may further include:
An insertion subunit 501, configured to insert the parameter information of the acquired synchronization data into a parameter information linked list of the specified file, the linked list containing previously acquired parameter information corresponding to the specified file.
A merging subunit 502, configured to merge adjacent parameter information in the parameter information linked list.
A confirmation subunit 503, configured to search, according to a preset condition, the merged parameter information linked list for missing parameter information.
The confirmation subunit 503 may specifically be configured to repeatedly search the merged parameter information linked list for missing parameter information according to a preset count threshold and, when the search result is that missing parameter information exists and the number of searches has reached the count threshold, to determine that there is a gap in the acquired parameter information of the specified file.
In another embodiment, as shown in FIG. 6, the device may further include:
A lost data determination unit 601, configured to determine the data corresponding to the missing parameter information in the merged parameter information linked list to be the data lost from the specified file.
The parameter information of the synchronization data may include an offset and a data length, and the offset of the next synchronization data is the sum of the offset and the data length of the current synchronization data.
An embodiment of the present application further discloses an electronic device, including a memory and a processor.
The processor and the memory are connected to each other through a bus, which may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on.
The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
The processor is used to read the program code in the memory and perform the following steps:
when the data of a specified file is synchronized, acquiring parameter information of the synchronization data, the parameter information being associated with the synchronization data;
determining whether there is a gap among the acquired pieces of parameter information corresponding to the specified file;
if so, determining that the specified file has data loss.
Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly; for related details, refer to the corresponding description of the method embodiments.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar, the embodiments may be referred to one another.
本领域内的技术人员应明白,本申请实施例的实施例可提供为方法、装置、或计算 机程序产品。因此,本申请实施例可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本申请实施例可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。Those skilled in the art should appreciate that embodiments of the embodiments of the present application may be provided as a method, apparatus, or calculation. Machine program products. Therefore, the embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, embodiments of the present application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
在一个典型的配置中,所述计算机设备包括一个或多个处理器(CPU)、输入/输出接口、网络接口和内存。内存可能包括计算机可读介质中的非永久性存储器,随机存取存储器(RAM)和/或非易失性内存等形式,如只读存储器(ROM)或闪存(flash RAM)。内存是计算机可读介质的示例。计算机可读介质包括永久性和非永久性、可移动和非可移动媒体可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括,但不限于相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带,磁带磁磁盘存储或其他磁性存储设备或任何其他非传输介质,可用于存储可以被计算设备访问的信息。按照本文中的界定,计算机可读介质不包括非持续性的电脑可读媒体(transitory media),如调制的数据信号和载波。In a typical configuration, the computer device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory. The memory may include non-persistent memory, random access memory (RAM), and/or non-volatile memory in a computer readable medium, such as read only memory (ROM) or flash memory. Memory is an example of a computer readable medium. Computer readable media includes both permanent and non-persistent, removable and non-removable media. Information storage can be implemented by any method or technology. The information can be computer readable instructions, data structures, modules of programs, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory. (ROM), electrically erasable programmable read only memory (EEPROM), flash memory or other memory technology, compact disk read only memory (CD-ROM), digital versatile disk (DVD) or other optical storage, Magnetic tape cartridges, magnetic tape storage or other magnetic storage devices or any other non-transportable media can be used to store information that can be accessed by a computing device. As defined herein, computer readable media does not include non-persistent computer readable media, such as modulated data signals and carrier waves.
本申请实施例是参照根据本申请实施例的方法、终端设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理终端设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理终端设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block of the flowchart illustrations and/or FIG. These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing terminal device to produce a machine such that instructions are executed by a processor of a computer or other programmable data processing terminal device Means are provided for implementing the functions specified in one or more of the flow or in one or more blocks of the flow chart.
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理终端设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。The computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising the instruction device. The instruction device implements the functions specified in one or more blocks of the flowchart or in a flow or block of the flowchart.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, such that a series of operational steps is performed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements that are inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes the element.
The data detection method and data detection device provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, a person of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

  1. A data detection method, comprising:
    when data of a specified file is synchronized, acquiring parameter information of the synchronized data, the parameter information being associated with the synchronized data;
    determining whether there is a gap in the acquired pieces of parameter information corresponding to the specified file; and
    if so, determining that data of the specified file has been lost.
  2. The method according to claim 1, wherein determining whether there is a gap in the acquired pieces of parameter information corresponding to the specified file comprises:
    inserting the acquired parameter information of the synchronized data into a parameter information linked list of the specified file, the parameter information linked list containing previously acquired parameter information corresponding to the specified file;
    merging adjacent pieces of parameter information in the parameter information linked list; and
    searching, according to a preset condition, for missing parameter information in the merged parameter information linked list.
  3. The method according to claim 2, wherein searching, according to a preset condition, for missing parameter information in the merged parameter information linked list comprises:
    repeatedly searching, up to a preset count threshold, for missing parameter information in the merged parameter information linked list, and, when the search result is that missing parameter information exists and the number of searches reaches the count threshold, determining that there is a gap in the acquired parameter information corresponding to the specified file.
  4. The method according to claim 2, further comprising:
    determining the data corresponding to the missing parameter information in the merged parameter information linked list to be the data lost from the specified file.
  5. The method according to any one of claims 1 to 4, wherein the parameter information of the synchronized data comprises an offset and a data length, and the offset of the next piece of synchronized data is the sum of the offset and the data length of the current piece of synchronized data.
  6. A data detection device, comprising:
    a parameter acquisition unit configured to acquire, when data of a specified file is synchronized, parameter information of the synchronized data, the parameter information being associated with the synchronized data;
    a parameter confirmation unit configured to determine whether there is a gap in the acquired pieces of parameter information corresponding to the specified file; and
    a result confirmation unit configured to determine, when the parameter confirmation unit determines that there is a gap, that data of the specified file has been lost.
  7. The device according to claim 6, wherein the parameter confirmation unit comprises:
    an insertion subunit configured to insert the acquired parameter information of the synchronized data into a parameter information linked list of the specified file, the parameter information linked list containing previously acquired parameter information corresponding to the specified file;
    a merging subunit configured to merge adjacent pieces of parameter information in the parameter information linked list; and
    a confirmation subunit configured to search, according to a preset condition, for missing parameter information in the merged parameter information linked list.
  8. The device according to claim 7, wherein
    the confirmation subunit is configured to repeatedly search, up to a preset count threshold, for missing parameter information in the merged parameter information linked list, and, when the search result is that missing parameter information exists and the number of searches reaches the count threshold, to determine that there is a gap in the acquired parameter information corresponding to the specified file.
  9. The device according to claim 7, further comprising:
    a lost data determination unit configured to determine the data corresponding to the missing parameter information in the merged parameter information linked list to be the data lost from the specified file.
  10. The device according to any one of claims 6 to 9, wherein the parameter information of the synchronized data comprises an offset and a data length, and the offset of the next piece of synchronized data is the sum of the offset and the data length of the current piece of synchronized data.
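
The detection flow recited in the claims (record an offset and a data length for each synchronized chunk, merge adjacent records, then look for gaps and confirm them over repeated checks) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the applicant's implementation: the class name `FileSyncTracker`, its method names, and the default threshold value are hypothetical; a sorted Python list stands in for the claimed linked list; the file is assumed to start at offset 0; and overlapping records (for example, retransmissions) are assumed to be mergeable in the same way as adjacent ones.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Gap = Tuple[int, int]  # (offset, length) of a missing byte range


@dataclass
class FileSyncTracker:
    """Tracks the (offset, length) records received for one specified file."""

    retry_threshold: int = 3  # hypothetical value for the preset count threshold
    segments: List[List[int]] = field(default_factory=list)  # merged [start, end) ranges
    gap_checks: int = 0  # consecutive checks that found a gap

    def record(self, offset: int, length: int) -> None:
        """Insert one record in offset order, then merge adjacent ranges."""
        self.segments.append([offset, offset + length])
        self.segments.sort()
        merged: List[List[int]] = []
        for start, end in self.segments:
            # Adjacent means the next offset equals the previous offset + length.
            if merged and start <= merged[-1][1]:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        self.segments = merged

    def gaps(self) -> List[Gap]:
        """Return every missing range, assuming the file starts at offset 0."""
        missing: List[Gap] = []
        expected = 0
        for start, end in self.segments:
            if start > expected:
                missing.append((expected, start - expected))
            expected = end
        return missing

    def check(self) -> bool:
        """Run one check; report loss only after enough consecutive gap findings."""
        if self.gaps():
            self.gap_checks += 1
        else:
            self.gap_checks = 0
        return self.gap_checks >= self.retry_threshold


tracker = FileSyncTracker(retry_threshold=2)
tracker.record(0, 100)
tracker.record(250, 50)   # arrives early; bytes 100..249 are not yet accounted for
tracker.record(100, 50)   # adjacent to the first record, so the two are merged
print(tracker.gaps())     # [(150, 100)]
print(tracker.check())    # False: first check that sees the gap
print(tracker.check())    # True: gap persisted until the threshold was reached
```

Repeating the check before declaring loss gives records that are merely delayed or out of order a chance to arrive and close the gap; the ranges returned by `gaps()` correspond to the data that would be treated as lost once the threshold is reached.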
PCT/CN2016/089079 2015-07-16 2016-07-07 Data detection method and device WO2017008677A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510419821.3A CN106649056B (en) 2015-07-16 2015-07-16 A kind of data detection method and device
CN201510419821.3 2015-07-16

Publications (1)

Publication Number Publication Date
WO2017008677A1 true WO2017008677A1 (en) 2017-01-19

Family

ID=57756858

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/089079 WO2017008677A1 (en) 2015-07-16 2016-07-07 Data detection method and device

Country Status (2)

Country Link
CN (1) CN106649056B (en)
WO (1) WO2017008677A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108874825B (en) * 2017-05-12 2021-11-02 北京京东尚科信息技术有限公司 Abnormal data verification method and device
CN108183957B (en) * 2017-12-29 2020-12-18 北京奇虎科技有限公司 Master-slave synchronization method and device
CN117992441B (en) * 2024-02-07 2024-08-06 广州翌拓软件开发有限公司 Data processing method and system for synchronous auditing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6604205B1 (en) * 2000-02-07 2003-08-05 Hewlett-Packard Development Co., L.P. System and method for state synchronization
CN101686184A (en) * 2008-09-23 2010-03-31 中兴通讯股份有限公司 Processing method of service data packet loss of synchronous network of multimedia broadcast multicast service
CN101751394A (en) * 2008-12-16 2010-06-23 青岛海信传媒网络技术有限公司 Method and system for synchronizing data
CN103338131A (en) * 2013-06-20 2013-10-02 百度在线网络技术(北京)有限公司 Method and equipment for testing log transmitting loss rate
CN103514223A (en) * 2012-06-28 2014-01-15 阿里巴巴集团控股有限公司 Data synchronism method and system of database
CN103988189A (en) * 2011-12-08 2014-08-13 国际商业机器公司 Method for detecting data loss of data transfer between information devices

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101009516B (en) * 2006-01-26 2011-05-04 华为技术有限公司 A method, system and device for data synchronization
CN201717889U (en) * 2010-07-05 2011-01-19 深圳华强游戏软件有限公司 Progressive data synchronization system
CN103457905B (en) * 2012-05-28 2015-09-09 腾讯科技(深圳)有限公司 Method of data synchronization, system and equipment

Also Published As

Publication number Publication date
CN106649056B (en) 2019-07-02
CN106649056A (en) 2017-05-10

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 16823824; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 16823824; Country of ref document: EP; Kind code of ref document: A1)