CN112818656A - Data difference comparison method, device, equipment, medium and computer program product - Google Patents

Publication number
CN112818656A
Authority
CN
China
Prior art keywords
data
module
version
difference comparison
type
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110106085.1A
Other languages
Chinese (zh)
Other versions
CN112818656B (en)
Inventor
王延猛
谢涛
董淑照
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110106085.1A
Publication of CN112818656A
Application granted
Publication of CN112818656B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Document Processing Apparatus (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a data difference comparison method, apparatus, device, medium and computer program product, and relates to the field of computer technology, in particular to text data processing. The specific implementation scheme is as follows: splitting historical version data and current version data of predetermined data according to data type to obtain, for each data type, a historical version data module and a current version data module of the same type; performing a rich text difference comparison between the historical version data module of each data type and the corresponding current version data module of the same type to obtain a difference comparison result for the data module of each data type; and generating a difference comparison data stream according to the difference comparison results, wherein the difference comparison data stream contains the merged historical version data and current version data and indicates the data change process from the historical version data to the current version data, thereby giving the version difference comparison result of the predetermined data. The method can improve the processing efficiency of data difference comparison.

Description

Data difference comparison method, device, equipment, medium and computer program product
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, a medium, and a computer program product for data difference comparison.
Background
In text auditing, the new and old versions of the data are usually compared, and the change information revealed by the comparison is checked to determine whether the new version is compliant.
Rich text can be understood as multi-dimensional data information, such as text carrying attribute information (for example, font and color), links, embedded images and diagrams. When new and old versions of data that contain rich text are compared, a difference comparison of the rich text data in the two versions is needed.
Disclosure of Invention
The present disclosure provides a data difference comparison method, apparatus, device, medium, and computer program product.
According to a first aspect of the present disclosure, there is provided a data difference comparison method, including: splitting historical version data and current version data of preset data according to data types to obtain a historical version data module corresponding to each data type and a data module corresponding to the same type of the current version; performing difference comparison of rich text data on the historical version data module corresponding to each data type and the corresponding data module of the same type of the current version to obtain a difference comparison result corresponding to the data module of each data type; and generating a difference comparison data stream according to the difference comparison result, wherein the difference comparison data stream comprises combined historical version data and current version data and is used for indicating a data change process from the historical version data to the current version data to obtain the version difference comparison result of the preset data.
According to a second aspect of the present disclosure, there is provided a data difference comparison apparatus including: the version data splitting module is used for splitting the historical version data and the current version data according to the data types to obtain a historical version data module corresponding to each data type and a data module corresponding to the same type of the current version; the data difference comparison module is used for carrying out difference comparison on rich text data on the historical version data module corresponding to each data type and the corresponding data module of the same type of the current version to obtain a difference comparison result corresponding to the data module of each data type; and the comparison result determining module is used for generating a difference comparison data stream according to the difference comparison result, wherein the difference comparison data stream comprises the combined historical version data and the current version data and is used for indicating the data change process from the historical version data to the current version data so as to determine the difference comparison result of the historical version data and the current version data.
According to a third aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform any of the data difference comparison methods described above.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform any of the data difference comparison methods described above.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements any of the data difference comparison methods described above.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram of a scenario according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram of a data difference comparison method according to an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart diagram illustrating a data difference comparison method according to another embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a difference comparison data stream generated in an exemplary embodiment of the present disclosure;
FIG. 5 is a schematic diagram of an adaptation process performed on a difference comparison data stream in an exemplary embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a data difference comparison apparatus according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an electronic device used to implement methods of embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the present disclosure and features of embodiments may be combined with each other without conflict. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
In the embodiments of the present disclosure, plain text may be understood as data information containing only words and basic punctuation; rich text may be understood as multi-dimensional data information, that is, text carrying attribute information (e.g., font, color, bold, italics, subscript, underline) together with data presented in other dimensions, such as hyperlinks, pictures, maps, tables, lists and modules.
When new and old versions of plain text are compared, only the addition, deletion and modification of characters can be calculated and the user's editing path analyzed. For example, the current version and the historical version of the plain text are compared with the diff command of the version management tool Subversion (SVN), that is, svn diff, or with the distributed version control system git, to obtain the plain text modification information.
However, such plain text comparison cannot meet the requirement of comparing rich text data submitted in internet usage scenarios; that is, it cannot fully reflect a user's modifications between the new and old versions of rich text data. When new and old versions of rich text are compared, modified content in multiple dimensions needs to be compared, and the user's modification points, or the change process between the new and old version data, need to be reflected.
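The plain text comparison described above can be illustrated with a minimal sketch. It uses Python's standard difflib module as a stand-in for svn diff or git diff; the function name plain_text_diff and the sample strings are assumptions made for illustration only.

```python
import difflib

def plain_text_diff(old_text: str, new_text: str) -> list[str]:
    """Return unified-diff lines describing how old_text became new_text."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="historical",
            tofile="current",
            lineterm="",
        )
    )

# Hypothetical sample data: only characters differ, no style information exists.
old = "First line.\nSecond line."
new = "First line.\nSecond line, edited."
diff_lines = plain_text_diff(old, new)
for line in diff_lines:
    print(line)
```

A diff of this kind reports added and removed characters or lines only. A change that leaves the characters untouched, such as making a word bold, is invisible to it, which is the limitation the paragraph above points out.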
Fig. 1 is a schematic diagram of a scenario according to an embodiment of the disclosure. The scenario shown in fig. 1 includes: historical version data 11, current version data 12, a data difference comparison server 13 and an auditing end device 14.
Wherein the historical version data 11 and the current version data 12 may be old version rich text data and new version rich text data, respectively.
The data difference comparison server 13 may perform a data difference comparison method according to the received historical version data 11 and the current version data 12 to obtain a difference comparison result.
The data difference comparison server 13 may also process the difference comparison result, and send the comparison result data obtained by the processing to the auditing end device 14.
The auditing end device 14 may perform data auditing according to the received comparison result data to determine whether the change information from the historical version data 11 to the current version data 12 is compliant, that is, to determine whether the current version data 12 is compliant.
In fig. 1, the data difference comparison server 13 may be a single service device or a server cluster including a plurality of service devices.
The auditing end device 14 may include, but is not limited to: personal computers, smart phones, tablets, personal digital assistants, servers, and the like.
The data difference comparison server 13 and the auditing end device 14 can establish connection through a network. In particular, the network may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
It should be understood that the number of devices in fig. 1 is merely illustrative. It can be adjusted flexibly according to actual application needs, and the present disclosure is not limited in this respect.
In some difference comparison schemes for new and old versions of rich text data, the submitted predetermined data must be segmented into paragraphs as a whole, the paragraph similarity between the new and old version data is determined by a global similarity measure (simhash) so that the content of similar paragraphs can be compared, and the similarity of the new and old version content is determined by a text compression algorithm and a distance algorithm.
Moreover, segmenting the data into paragraphs as a whole is constrained by the various data structures in the predetermined data, so the correspondence between multiple paragraphs is calculated with low accuracy, which reduces the overall accuracy.
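For reference, the simhash-based paragraph matching used by such schemes can be sketched as follows. This is a generic illustration of the simhash idea, not the implementation of any particular scheme; whitespace tokenization, the MD5-based token hash and the 64-bit fingerprint width are all assumptions.

```python
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    """Compute a simhash fingerprint from whitespace-separated tokens."""
    weights = [0] * bits
    for token in text.split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            # Each token votes +1 or -1 on every bit position.
            weights[i] += 1 if (token_hash >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint

def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; a smaller distance means more similar text."""
    return bin(a ^ b).count("1")
```

Paragraphs of the old and new versions would be paired when the Hamming distance between their fingerprints falls below some threshold. When tables, pictures and other structures are mixed into the paragraphs, such pairing becomes unreliable, which is the accuracy problem noted above.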
In other difference comparison schemes for new and old versions of rich text data, the difference comparison result is tightly coupled to a specific auditing service scenario. For example, the result must be processed into a format specified by a particular service, and data in that format often reflects the modification information only on the old version data or only on the new version data. As a result, the difference comparison result adapts poorly across service scenarios and is difficult to maintain, an existing result is difficult to reuse in a new service scenario, and the development cost is high.
Fig. 2 is a schematic flow chart of a data difference comparison method according to an embodiment of the disclosure.
In a first aspect, referring to fig. 2, an embodiment of the present disclosure provides a data difference comparison method, which may include the following steps.
S210, splitting the historical version data and the current version data of the preset data according to the data types to obtain a historical version data module corresponding to each data type and a data module corresponding to the same type of the current version.
S220, carrying out difference comparison of rich text data on the historical version data module corresponding to each data type and the corresponding data module of the same type of the current version to obtain a difference comparison result corresponding to the data module of each data type.
S230, generating a difference comparison data stream according to the difference comparison result, wherein the difference comparison data stream comprises the combined historical version data and the current version data and is used for indicating the data change process from the historical version data to the current version data to obtain the version difference comparison result of the preset data.
According to the data difference comparison method disclosed by the embodiment of the disclosure, data splitting can be performed according to data types, so that new and old version data of predetermined data are respectively split into data modules of different data types, difference comparison of rich text data in various types of modules of the new and old version data is performed, and the new and old version data are combined into a difference comparison data stream according to a difference comparison result.
In the processing process of the data difference comparison method of the embodiment of the disclosure, the data of the new and old versions can be divided into the modules of different types through different data types, so that the data structuring is realized, the difference comparison is performed according to the modules of different types, and the difference comparison result corresponding to each module is obtained.
Moreover, the difference comparison data stream obtained by the data difference comparison method according to the embodiment of the disclosure is a combination of the historical version data and the current version data, and can be used for indicating the data change process from the historical version data to the current version data, so that the difference comparison data stream can be applied to more service scenes for data auditing, decoupling of the processing logic of the data difference comparison method from a specific service scene is realized, the adaptability is strong, and the data maintenance cost can be reduced.
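Steps S210 to S230 can be pictured with the following minimal sketch. The data layout (a list of dictionaries with "type" and "content" keys, at most one module per type), the function names and the pluggable compare callback are assumptions made for illustration; the disclosure does not prescribe a concrete data format.

```python
from typing import Any, Callable

def split_by_type(version_data: list[dict[str, Any]]) -> dict[str, Any]:
    """S210: group one version's content by data type (one module per type assumed)."""
    return {module["type"]: module["content"] for module in version_data}

def compare_versions(
    historical: list[dict[str, Any]],
    current: list[dict[str, Any]],
    compare: Callable[[Any, Any], Any],
) -> list[dict[str, Any]]:
    """S220 + S230: compare same-type modules and merge both versions into one stream."""
    old_modules = split_by_type(historical)
    new_modules = split_by_type(current)
    # Keep the historical order, then append types present only in the current version.
    data_types = list(old_modules) + [t for t in new_modules if t not in old_modules]
    stream = []
    for data_type in data_types:
        old = old_modules.get(data_type)
        new = new_modules.get(data_type)
        stream.append(
            {"type": data_type, "historical": old, "current": new, "result": compare(old, new)}
        )
    return stream

# Hypothetical sample data; the compare callback here only reports "changed or not".
historical = [
    {"type": "summary", "content": "A short summary."},
    {"type": "picture", "content": "cat.png"},
]
current = [
    {"type": "summary", "content": "A longer summary."},
    {"type": "picture", "content": "cat.png"},
    {"type": "map", "content": "map-1"},
]
stream = compare_versions(historical, current, lambda old, new: old != new)
```

Each stream entry carries the historical content, the current content and the comparison result together, so a consumer can reconstruct either version or inspect the change without a service-specific format.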
In the embodiments of the present disclosure, the data modules corresponding to the different data types in the predetermined data may include at least one of the following: a paragraph text module, a table data module, a preset general module, a picture module and a map module.
The rich text data in a paragraph text module comprises a plurality of paragraph texts; examples are a body module, a summary module and a basic information module. The text in a paragraph text may include text in different languages, such as Chinese and English.
A preset general module is a module containing preset field information and field content, and may include, for example, a movie participation module, a TV drama participation module and a code module in encyclopedia information.
In some embodiments, the step S220 may specifically include the following steps.
S11, taking one data type as a first type, and splitting the historical version data module corresponding to the first type and the corresponding current version data module of the same type into a plurality of historical version data units and a plurality of current version data units, respectively.
S12, performing a rich text difference comparison on the same data unit in each version corresponding to the first type to obtain a module difference comparison result corresponding to the first type.
S13, repeating the above steps, each time taking a different data type as the first type and obtaining the corresponding module difference comparison result, until the difference comparison result corresponding to the data module of each data type is obtained.
Through steps S11-S13, when the data modules of different types in the new and old version data are compared, each data module can be further divided into data units according to its type, and the rich text difference comparison is performed per data unit to obtain the difference comparison result for each type of data module. Because the comparison works on the split data units within modules of the same type and leaves the original rich text structure unchanged, both the efficiency and the accuracy of the data difference comparison are improved.
In one embodiment, the data module corresponding to the first type is any one of the following: a paragraph text module, a table data module and a preset general module.
In this embodiment, step S12 may specifically include the following steps.
S21, converting the rich text data in the data units of each version corresponding to the first type into plain text to obtain a plain text conversion result for each historical version data unit and for each current version data unit.
S22, performing a plain text difference comparison on the plain text conversion results of the same data unit in each version and, when the plain texts differ, obtaining a plain text difference comparison result for that data unit.
S23, when the plain texts are identical, comparing the modification data of the same data unit in each version to obtain a modification difference comparison result for that data unit, wherein the modification data indicates the text style.
S24, combining the plain text difference comparison results and the modification difference comparison results of the data units to obtain the module difference comparison result corresponding to the first type.
In steps S21-S24, for the rich text data in the paragraph text module, the table data module and the preset general module, the difference between the plain text conversion results is calculated first. If the plain text conversion results differ, the plain text difference is recorded and the modification data of that data unit need not be compared; only if they are identical does the data unit enter the next step, the difference comparison of the modification data. This improves data processing efficiency and saves computing resources.
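The two-stage comparison of steps S21 to S24 can be sketched as follows. The representation of a data unit as a list of (text, style) runs, the result dictionaries and all function names are hypothetical and chosen only to make the control flow concrete.

```python
# A data unit is assumed to be a list of (text, style) runs, e.g. ("world", {"bold": True}).

def to_plain_text(unit):
    """S21: drop the style information and keep only the characters."""
    return "".join(text for text, _style in unit)

def styles_per_char(unit):
    """Expand the runs so that every character has its own style entry."""
    return [style for text, style in unit for _ in text]

def compare_unit(old_unit, new_unit):
    old_text, new_text = to_plain_text(old_unit), to_plain_text(new_unit)
    if old_text != new_text:
        # S22: the plain texts differ, so the style comparison is skipped.
        return {"kind": "text", "old": old_text, "new": new_text}
    # S23: the plain texts are identical, so only the styles are compared.
    pairs = zip(styles_per_char(old_unit), styles_per_char(new_unit))
    positions = [i for i, (old_style, new_style) in enumerate(pairs) if old_style != new_style]
    if positions:
        return {"kind": "style", "positions": positions}
    return {"kind": "same"}

def compare_module(old_units, new_units):
    """S24: collect the per-unit results into one module-level result."""
    return [compare_unit(old, new) for old, new in zip(old_units, new_units)]

old = [("Hello ", {}), ("world", {})]
bolded = [("Hello ", {}), ("world", {"bold": True})]
reworded = [("Hello there", {})]
print(compare_unit(old, bolded))    # style change at character positions 6 to 10
print(compare_unit(old, reworded))  # plain text change
```

The cheap string comparison acts as a filter: the per-character style comparison runs only for units whose characters are unchanged.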
In one embodiment, when the data module corresponding to the first type is a paragraph text module, the data module of each version is split by character, and the same data unit in each version is: each character in the historical version paragraph text module and the corresponding character in the same paragraph text module of the current version.
In one embodiment, when the data module corresponding to the first type is a table data module, the data module of each version is split by row and column, and the same data unit in each version is: each cell in the historical version table data module and the same cell in the corresponding current version table data module.
In one embodiment, when the data module corresponding to the first type is a preset general module, the data module of each version is split by row, and the same data unit in each version is: the data content of each field in each row of the historical version preset general module and the data content of each field in the same row of the corresponding current version preset general module.
In the embodiments of the present disclosure, each type of module may have its own processing logic, so the difference comparison logic differs slightly between module types; for example, different module types use different data unit splitting methods. For the paragraph text module, the table data module and the preset general module, the module difference comparison result can show not only whether a difference exists but also the specific content of the difference, which enables rich text difference comparison across diverse data types and improves data processing efficiency.
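The three splitting rules above can be sketched as one small function per module type. The input shapes (a string for paragraph text, a list of rows for a table, a list of field dictionaries for a preset general module) and the type names used as dictionary keys are assumptions for illustration.

```python
def split_paragraph(text: str) -> list[str]:
    """Paragraph text module: one data unit per character."""
    return list(text)

def split_table(rows: list[list[str]]) -> dict[tuple[int, int], str]:
    """Table data module: one data unit per cell, keyed by (row, column)."""
    return {(r, c): cell for r, row in enumerate(rows) for c, cell in enumerate(row)}

def split_general(rows: list[dict[str, str]]) -> dict[tuple[int, str], str]:
    """Preset general module: one data unit per field, keyed by (row, field name)."""
    return {(r, field): value for r, row in enumerate(rows) for field, value in row.items()}

# Dispatch table so that the comparison loop can pick the splitter by module type.
SPLITTERS = {
    "paragraph": split_paragraph,
    "table": split_table,
    "general": split_general,
}

table_units = SPLITTERS["table"]([["Year", "Title"], ["2020", "Example"]])
general_units = SPLITTERS["general"]([{"role": "lead", "title": "Example"}])
```

Keying the units by position or field name gives the comparison step a direct way to pair "the same data unit" in the historical and current versions.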
In some embodiments, the data module corresponding to the first type is any one of the following: a picture module and a map module.
In this embodiment, step S12 may specifically include the following step.
S31, performing a difference comparison on the rich text data contained in the same data unit in each version corresponding to the first type, and taking whether the contained rich text data differs as the module difference comparison result corresponding to the first type.
In this embodiment, because the new and old version data of picture modules and map modules are usually replaced as a whole, whether the contained rich text data differs can serve as the module difference comparison result. The difference comparison is thus tailored to the module type, which enables rich text difference comparison across diverse data types and improves data processing efficiency.
In one embodiment, when the data module corresponding to the first type is a picture module, the same data unit in each version is: a picture in the historical version picture module and the picture at the same display position in the corresponding current version picture module.
In one embodiment, when the data module corresponding to the first type is a map module, the same data unit in each version is: each map in the historical version map module and the map at the same display position in the corresponding current version map module.
In the embodiments of the present disclosure, each type of module may have its own processing logic, so the difference comparison logic differs slightly between module types; for example, different module types use different data unit splitting methods.
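For picture and map modules, step S31 reduces to a changed-or-not check per display position, which can be sketched as follows. Identifying each picture or map by a string (for example a file name or content hash) is an assumption; the function name is hypothetical.

```python
from itertools import zip_longest

def compare_by_position(old_items: list[str], new_items: list[str]) -> list[bool]:
    """Return True for every display position whose picture or map differs.

    A position that exists in only one version is paired with None and
    therefore also counts as changed.
    """
    return [old != new for old, new in zip_longest(old_items, new_items)]

changed = compare_by_position(["a.png", "b.png"], ["a.png", "c.png", "d.png"])
```

No content-level diff is attempted here, which matches the observation that pictures and maps are usually replaced as a whole.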
In one embodiment, the data module corresponding to each data type is any one of the following: a paragraph text module, a table data module and a preset general module; and the data difference comparison method further comprises: S41, generating corresponding first marking information according to the difference comparison result, wherein the first marking information is used for indicating the data change process from the historical version data to the current version data.
In one embodiment, the data module corresponding to each data type is any one of the following: a picture module and a map module; and the data difference comparison method further comprises: S42, generating corresponding second marking information according to the difference comparison result, wherein the second marking information is used for indicating whether the data content has changed between the historical version data and the current version data.
In this embodiment, for the paragraph text module, the table data module and the preset general module, the marks reflect the change process; for the picture module and the map module, the marks reflect whether the picture or map has changed. Marking the difference comparison results of the rich text data of the diverse data types in the predetermined data improves the usability of the difference comparison result.
Through the data difference comparison process described in the above embodiments, the difference comparison result of the rich text data in the new and old version data can be marked; for example, marking information for content operations such as deletion, addition and replacement is added to the merged data of the new and old versions, so that a single difference data stream reflects how the content changed from the old version to the new one. Because the difference comparison data stream contains the merged new and old version data, it can serve different auditing business parties. This decouples the difference comparison processing from specific business scenarios, allows the difference comparison result to be reused, and greatly improves its scene adaptability and universality.
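A merged stream with marking information can be sketched as follows. The sketch uses difflib.SequenceMatcher from the Python standard library as the comparison engine, which is a substitution made for illustration and not necessarily the algorithm of the disclosure; the tag names "equal", "delete" and "insert" and the function names are likewise assumptions. A replacement is recorded here as a deletion followed by an insertion.

```python
import difflib

def merge_with_marks(old_text: str, new_text: str) -> list[tuple[str, str]]:
    """Merge both versions into one stream of (mark, text) segments."""
    matcher = difflib.SequenceMatcher(None, old_text, new_text, autojunk=False)
    stream = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            stream.append(("equal", old_text[i1:i2]))
        if tag in ("delete", "replace"):
            stream.append(("delete", old_text[i1:i2]))
        if tag in ("insert", "replace"):
            stream.append(("insert", new_text[j1:j2]))
    return stream

def restore(stream: list[tuple[str, str]], drop: str) -> str:
    """Rebuild one version by dropping the segments that belong only to the other."""
    return "".join(text for mark, text in stream if mark != drop)

old_text = "rich text data"
new_text = "rich plain text"
marked = merge_with_marks(old_text, new_text)
```

Dropping the "insert" segments yields the historical version and dropping the "delete" segments yields the current version, so the single stream contains both versions and the change process between them.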
In one embodiment, the data difference comparison method further includes: S51, converting the difference comparison data stream into a data stream adapted to a preset auditing end service scenario; and S52, sending the adapted data stream to a designated auditing end device, so that the auditing end device performs data auditing on the basis of the difference comparison data stream.
In this embodiment, for a specific auditing end service scenario, the difference comparison data stream obtained by the data difference comparison method of the present disclosure may be format-converted to obtain a data stream adapted to that scenario.
In one embodiment, the data change process may include a data addition process and a data deletion process; the adapted data stream comprises at least one of the following: a revision mode data stream, a preset back-end data stream, and a machine audit data stream.
The revision mode data stream is used for displaying the current version data together with prompt information for newly added data, the newly added data being determined according to the data addition process.
The preset back-end data stream comprises two parts: one part is used for displaying the historical version data together with prompt information for deleted data, the deleted data being determined according to the data deletion process; the other part is used for displaying the current version data together with prompt information for newly added data.
The machine audit data stream is used for displaying the module change state, that is, the change state information from the historical version data module corresponding to each data type to the current version data module of the same type.
In this embodiment, when the service parties that audit the new and old version data call the difference comparison data stream, a specific adapter may be developed for each service usage scenario to convert the difference comparison data stream into an adapted data stream. Each service scenario then processes the difference comparison data stream according to its own data requirements, which simplifies the processing flow and reduces the development and integration cost.
Fig. 3 is a schematic flow chart of a data difference comparison method according to another embodiment of the disclosure.
As shown in fig. 3, the data difference comparison method may include the following steps.
S301, data is input.
In this step, the input data includes historical version data and current version data of the predetermined data.
S302, splitting the data module.
In this step, the historical version data and the current version data of the predetermined data may be split according to the data type, so as to obtain historical version data modules of different data types and current version data modules of different data types.
Through module splitting by data type, the predetermined data is structured and divided into a plurality of different modules, including a text module, a summary module, a basic information module, a table module, a map module, a picture (album) module, and general modules such as a movie participation module and a code module.
Fig. 3 shows exemplary different types of data modules, such as a text module, a basic information module, a general module, a picture module, etc.
In this disclosure, the historical version data modules of different data types and the current version data modules of different data types may be obtained by combining the processing procedure described in the embodiment of S302, which is not described herein again in this disclosure.
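The module splitting of S302 can be illustrated with a minimal sketch. The representation of a version as a list of `(data_type, content)` pairs and the function names `split_by_type` and `pair_modules` are assumptions made for illustration only; the disclosure does not prescribe a data layout.

```python
from collections import defaultdict


def split_by_type(version_data):
    """Group the blocks of one version by data type (hypothetical layout)."""
    modules = defaultdict(list)
    for data_type, content in version_data:
        modules[data_type].append(content)
    return dict(modules)


def pair_modules(history, current):
    """Pair each historical module with the current module of the same type."""
    old, new = split_by_type(history), split_by_type(current)
    # Keep the first-seen order of data types across both versions.
    types = list(dict.fromkeys(list(old) + list(new)))
    return {t: (old.get(t, []), new.get(t, [])) for t in types}


history = [("summary", "An actor."), ("text", "Born in 1980."), ("picture", "a.png")]
current = [("summary", "An actor and singer."), ("text", "Born in 1980."),
           ("table", [["Year", "Film"]])]
pairs = pair_modules(history, current)
```

A module type that exists in only one version is paired with an empty list, so a later comparison step can treat it as wholly added or wholly deleted.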
S303, calling the data difference comparison method of each data module.
In this step, as an example, for paragraph text in the new and old version data, each paragraph may be split into individual characters; adjacent characters that have not changed may be displayed joined together, and where there is a change, a mark of the change process is displayed.
In this step, as an example, for the table type data module, the table is split by row and by column, and difference comparison is performed on the rich text content of the cells, taking each cell obtained by the splitting as a unit.
As an example, a predetermined generic module, such as a participating-movie module or a participating-TV-series module, is split by row, and the data content of each field in each row is compared.
As an example, for the picture class module, when a change in the picture content is detected, a corresponding mark is added to indicate that the picture content has changed.
In this embodiment of the disclosure, the data difference comparison result of each module may be obtained by combining the processing procedure described in S303, and details of this embodiment of the disclosure are not described herein again.
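The cell-by-cell treatment of the table type data module described for S303 can be sketched as follows. This is an illustrative sketch only: tables are assumed to be lists of rows of plain strings, and `diff_table_cells` is a hypothetical helper that merely locates the cells whose content differs; the rich text comparison inside each cell would then be applied to those cells.

```python
from itertools import zip_longest


def diff_table_cells(old_table, new_table):
    """Return {(row, col): (old_cell, new_cell)} for every cell that differs."""
    changes = {}
    rows = zip_longest(old_table, new_table, fillvalue=[])
    for r, (old_row, new_row) in enumerate(rows):
        cells = zip_longest(old_row, new_row, fillvalue=None)
        for c, (old_cell, new_cell) in enumerate(cells):
            if old_cell != new_cell:
                # None marks a cell that exists in only one version.
                changes[(r, c)] = (old_cell, new_cell)
    return changes


old = [["Year", "Film"], ["2019", "Film A"]]
new = [["Year", "Film"], ["2020", "Film A"], ["2021", "Film B"]]
changes = diff_table_cells(old, new)
```

The same row-wise walk would apply to a generic module, with the fields of each row taking the place of the cells.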
S304, judging whether rich text data exists in the data unit obtained by splitting the data module of each data type, if so, executing the step S305, and if not, executing the step S309.
S305, plain text is converted.
In the step, for the data of the new version and the old version, the rich text data in the data module of each data type is subjected to plain text conversion to obtain plain text conversion results of the same data unit in the respective versions.
In the embodiment of the present disclosure, a plain text conversion result may be obtained by combining the processing procedures described in step S21 and step S305, and details of the embodiment of the present disclosure are not repeated herein.
S306, calculating the plain text difference.
In this step, taking the plain text conversion results as input, the plain text difference comparison result in each module is obtained by comparing the plain text conversion results of the same data unit in the new and old version data.
In the embodiment of the present disclosure, the plain text of the same data unit in the new and old version data is split (i.e., scattered) into individual characters, and the plain text conversion results of the same data unit are then compared character by character.
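A character-level plain text comparison of the kind described for S306 can be sketched with Python's standard `difflib.SequenceMatcher`. The disclosure does not name a diff algorithm, so the use of `SequenceMatcher` here is a substitute chosen for illustration, and the `(operation, text)` output format is an assumption.

```python
from difflib import SequenceMatcher


def char_diff(old_text, new_text):
    """Return a list of (op, text) pairs, op in {'equal', 'delete', 'insert'}."""
    ops = []
    matcher = SequenceMatcher(None, old_text, new_text, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("equal", old_text[i1:i2]))
            continue
        # A 'replace' opcode is emitted as a deletion followed by an insertion.
        if i2 > i1:
            ops.append(("delete", old_text[i1:i2]))
        if j2 > j1:
            ops.append(("insert", new_text[j1:j2]))
    return ops


ops = char_diff("abcde", "acdf")
```

Adjacent unchanged characters come back as a single `equal` run, which matches the description that unchanged neighbouring characters are displayed joined together.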
S307, the modification data difference is calculated.
In this step, on the basis of the difference calculation result in S306, modification data comparison is performed on the rich text data in the same data units that have the same plain text conversion result.
In the embodiment of the present disclosure, the modification data difference logic traverses the rich text data in each same data unit; where the plain text conversion results are the same, modification data difference comparison is performed on the rich text data in that data unit, yielding the modification data difference comparison result in each module.
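The modification data comparison of S307 can be sketched as below. The sketch assumes, purely for illustration, that a rich text data unit is a list of `(character, styles)` pairs with `styles` a `frozenset` of style names; `plain_text` and `style_diff` are hypothetical helpers.

```python
def plain_text(unit):
    """Plain text conversion: drop the style information."""
    return "".join(ch for ch, _ in unit)


def style_diff(old_unit, new_unit):
    """Compare styles only when the plain text of both units is identical.

    Returns a list of (index, char, old_styles, new_styles), or None when the
    plain text differs and the plain text comparison applies instead.
    """
    if plain_text(old_unit) != plain_text(new_unit):
        return None
    return [
        (i, ch, old_styles, new_styles)
        for i, ((ch, old_styles), (_, new_styles)) in enumerate(zip(old_unit, new_unit))
        if old_styles != new_styles
    ]


old_unit = [("E", frozenset()), ("x", frozenset())]
new_unit = [("E", frozenset({"bold"})), ("x", frozenset())]
result = style_diff(old_unit, new_unit)
```

Here the text "Ex" is unchanged, so only the style change on the first character is reported.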
S308, data combination (merge).
In this step, the plain text difference comparison result in each module and the modified data difference comparison result in each module may be subjected to data combination to obtain the rich text data difference comparison result in each module.
S309, customizing the comparison logic.
In this step, if there is no rich text data in the data unit obtained by splitting the data module of each data type, processing is performed according to a user-defined comparison logic. For example, if only plain text is contained in the data unit obtained by splitting the data module of each data type, the processing is performed according to a plain text data difference comparison method.
S310, formatting module data.
In this step, the data difference comparison results corresponding to the modules can be unified into a fixed format, which facilitates subsequent data calling.
S311, merging the data.
In this step, the data difference comparison results corresponding to each module in the new and old version data may be merged according to a predetermined order and rules to obtain the merged data difference comparison result of all modules. As an example, the predetermined order and rules may be customized by the user.
In the embodiment of the disclosure, through data merging logic processing, plain text and modified data can be combined together to form a data difference comparison result of rich text of new and old version data.
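Steps S310 and S311 can be sketched together: each module result is wrapped in one fixed format and the results are then merged in a predetermined order. The field names (`module`, `changed`, `ops`) and the order in `MODULE_ORDER` are hypothetical; the disclosure leaves the order and rules to user customization.

```python
# Hypothetical predetermined order; modules not listed are placed last.
MODULE_ORDER = ["summary", "basic_info", "text", "table", "picture"]


def format_result(module_type, diff_ops):
    """S310: wrap a module's (op, text) diff in one fixed format."""
    changed = any(op != "equal" for op, _ in diff_ops)
    return {"module": module_type, "changed": changed, "ops": diff_ops}


def merge_results(results, order=MODULE_ORDER):
    """S311: merge module results according to the predetermined order."""
    rank = {name: i for i, name in enumerate(order)}
    return sorted(results, key=lambda r: rank.get(r["module"], len(order)))


results = [
    format_result("table", [("equal", "x")]),
    format_result("summary", [("delete", "a"), ("insert", "b")]),
]
merged = merge_results(results)
```

Because `sorted` is stable, modules of the same type keep their original relative order.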
S312, generating a difference data stream.
In this step, a difference comparison data stream is generated according to the data difference comparison result merged by each module.
In the embodiment of the present disclosure, by generating a difference data stream logic process, a difference comparison data stream may be obtained, where the difference comparison data stream includes combined historical version data and current version data, and is used to indicate a data change process from the historical version data to the current version data.
S313, distributing the service.
In the step, according to the preset service scene, the difference comparison data stream is converted into the data stream matched with the service scene of the auditing terminal through the adapters (converters) of the respective service scenes, and the matched data stream is sent to the appointed auditing terminal equipment through service distribution, so that data calling and data auditing of each service party are facilitated.
In the embodiment of the present disclosure, the adaptation data of the difference comparison data streams of different service scenarios is obtained through data adaptation logic processing, so that the adaptation data of the difference comparison data streams is sent to the corresponding service client.
Specifically, the data distribution according to different service scenarios may include the following steps.
S3131, processing the difference contrast data stream with the revision mode data adaptor to obtain a revision mode data stream.
S3132, processing the differential contrast data stream with a back-end data adapter to obtain a predefined back-end data stream.
S3133, processing the difference comparison data stream by using the machine audit data adapter to obtain a machine audit data stream.
S3134, processing the difference comparison data stream with other adapters to obtain a corresponding audit data stream. The other adapters are data adapters for the data formats required by service scenarios other than those described above.
Through the above steps S3131-S3133, on the basis of the data of the differential contrast data stream obtained through the data differential contrast processing, a specific adapter is developed for a specific service usage scenario, and the differential contrast data stream is adapted to obtain corresponding adaptation data.
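The adapters of S3131 to S3133 can be sketched as plain functions over one shared difference comparison data stream. The stream is assumed here to be a list of `(op, text)` pairs with `op` in `equal`, `delete`, `insert`; the adapter names and output fields are illustrative assumptions, not formats defined by the disclosure.

```python
def revision_mode_adapter(stream):
    """Current version data, with newly added text flagged."""
    return [{"text": t, "added": op == "insert"}
            for op, t in stream if op != "delete"]


def backend_adapter(stream):
    """Two parts: historical data with deletions flagged, and current data
    with additions flagged."""
    history = [{"text": t, "deleted": op == "delete"}
               for op, t in stream if op != "insert"]
    return {"history": history, "current": revision_mode_adapter(stream)}


def machine_audit_adapter(module_streams):
    """Change state per module, given {module_name: stream}."""
    return {
        name: "changed" if any(op != "equal" for op, _ in s) else "unchanged"
        for name, s in module_streams.items()
    }


stream = [("equal", "a"), ("delete", "b"), ("equal", "cd"), ("insert", "f")]
revision = revision_mode_adapter(stream)
backend = backend_adapter(stream)
audit = machine_audit_adapter({"text": stream, "summary": [("equal", "x")]})
```

A new service scenario would only need one more function of this shape; the comparison logic that produced `stream` stays untouched.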
S314, outputting the result.
In the step, the data obtained by the adaptation processing is audited, and the auditing result of the difference content of the rich text data in the new version data and the old version data is obtained.
FIG. 4 illustrates a schematic diagram of generating a difference contrast data stream in an exemplary embodiment of the present disclosure.
As shown in fig. 4, a and B represent different versions of the same rich text paragraph P, where a is the content of the history version and B is the content of the current version, and according to the data difference comparison method described in the above embodiment, the change process from a to B is: delete character b, delete character E and add-new character E.
In fig. 4, a strikethrough indicates that the corresponding character is deleted, and an underline indicates that the corresponding character is newly added. That is, the difference comparison data stream contains A and B merged together and can be used to indicate the data change process from the historical version data to the current version data.
Fig. 5 is a schematic diagram of an adaptation process for a difference contrast data stream in an exemplary embodiment of the disclosure. Fig. 5 uses the same rich text passage P as in fig. 4, and a and B have the same meaning.
As shown in fig. 5, for different versions of the same rich text paragraph P, a difference comparison data stream is obtained according to the data difference comparison method described in the above embodiment. The stream contains A and B merged together and can be used to indicate the data change process from the historical version data to the current version data. That is, the difference comparison data stream obtained by merging the new and old version data can represent, in a single piece of data, the states of both the new version data and the old version data.
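The property that one merged stream carries both versions can be checked with a short sketch. The `(op, text)` stream layout, the helper names, and the use of `<s>` and `<u>` tags for strikethrough and underline are assumptions for illustration; the example text is invented and is not the paragraph P of the figures.

```python
def versions_from_stream(stream):
    """Recover (historical_text, current_text) from one merged stream."""
    old = "".join(t for op, t in stream if op != "insert")
    new = "".join(t for op, t in stream if op != "delete")
    return old, new


def render_marked(stream):
    """Render deletions with strikethrough and additions with underline."""
    marks = {"equal": "{}", "delete": "<s>{}</s>", "insert": "<u>{}</u>"}
    return "".join(marks[op].format(t) for op, t in stream)


stream = [("equal", "a"), ("delete", "b"), ("equal", "cd"), ("insert", "f")]
old_text, new_text = versions_from_stream(stream)
marked = render_marked(stream)
```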
With continued reference to FIG. 5, the difference comparison data stream is processed using the revision mode data adapter to obtain a revision mode data stream. As shown in fig. 5, the revision mode data stream is used to display the current version data and prompt information for the newly added data.
As an example, the difference comparison data stream is processed using the back-end data adapter to obtain a predefined back-end data stream. As shown in fig. 5, the predefined back-end data stream includes two parts of data: one part is used to display the historical version data and prompt information for the deleted data, where the deleted data is determined according to the data deletion process; the other part is used to display the current version data and prompt information for the newly added data.
As an example, the difference comparison data stream is processed using the machine audit data adapter to obtain a machine audit data stream. As shown in fig. 5, the machine audit data stream may show the module change state, which is the change state information from the historical version data module corresponding to each data type to the data module of the same type in the current version.
It should be understood that the module may be at least one of a text paragraph module, a table type data module, a preset general module, a picture type data module, and a map type data module.
As can be seen from fig. 5, the adapters can generate multiple data structures from the difference comparison data stream and provide them to different services, so that the same data can serve multiple service scenarios.
Fig. 6 is a schematic structural diagram of a data difference comparison apparatus according to an embodiment of the disclosure.
In a second aspect, referring to fig. 6, an embodiment of the present disclosure provides a data difference comparison apparatus 600, which may include the following modules.
The version data splitting module 610 is configured to split the historical version data and the current version data according to the data types to obtain a historical version data module corresponding to each data type and a data module corresponding to the current version of the same type.
The data difference comparison module 620 is configured to perform difference comparison of rich text data on the historical version data module corresponding to each data type and the corresponding data module of the same type of the current version, so as to obtain a difference comparison result corresponding to the data module of each data type.
A comparison result determining module 630, configured to generate a difference comparison data stream according to the difference comparison result, where the difference comparison data stream includes the combined historical version data and current version data, and is configured to indicate a data change process from the historical version data to the current version data, so as to determine the difference comparison result between the historical version data and the current version data.
In one embodiment, the data difference comparison module 620 may include the following elements.
A data module splitting unit, configured to take one data type as a first type, and to split the historical version data module corresponding to the first type and the corresponding data module of the same type of the current version into a plurality of historical version data units and a plurality of current version data units, respectively.
A data module comparison unit, configured to perform difference comparison of rich text data on the same data unit in the respective versions corresponding to the first type, so as to obtain a module difference comparison result corresponding to the first type.
A module comparison result determining unit, configured to iterate multiple times, taking a different data type as the first type each time in the process of obtaining the module difference comparison result, so as to obtain a plurality of corresponding module difference comparison results, until the difference comparison result corresponding to the data module of each data type is obtained.
In some embodiments, the data module corresponding to the first type is any one of the following: a paragraph text module, a table type data module, and a preset general module.
In this embodiment, the data module comparison unit may be specifically configured to: plain text conversion is carried out on rich text data in a plurality of respective version data units corresponding to the first type, and plain text conversion results of a plurality of historical version data units and plain text conversion results of a plurality of current version data units are obtained; comparing plain text differences according to plain text conversion results of the same data unit in each version, and obtaining plain text difference comparison results of the same data unit in each version under the condition that plain texts have differences; under the condition that plain texts are the same, comparing according to the modification data of the same data unit in each version to obtain modification difference comparison results of the same data unit in each version, wherein the modification data is used for indicating a text style; and combining the pure text difference comparison result of the same data unit in each version with the modification difference comparison result of the same data unit in each version to obtain the module difference comparison result corresponding to the first type.
In some embodiments, in the case that the data module corresponding to the first type is a paragraph text module, the same data unit in each version is obtained by splitting the data module of each version by character, namely each character in the historical version paragraph text module and each character in the same paragraph text module of the corresponding current version; in the case that the data module corresponding to the first type is a table type data module, the same data unit in each version is obtained by splitting the data module of each version by row and column, namely each cell in the historical version table type data module and the same cell in the corresponding current version table type data module; in the case that the data module corresponding to the first type is a preset general module, the same data unit in each version is obtained by splitting the data module of each version by row, namely the data content of each field in each row of the historical version preset general module and the data content of each field in the same row of the corresponding current version preset general module.
In some embodiments, the data module corresponding to the first type is any one of the following: a picture class module and a map class module.
In this embodiment, the data module comparison unit may be specifically configured to: and comparing differences of rich text data contained in the same data unit in a plurality of respective versions corresponding to the first type, and taking whether the contained rich text data has differences as a module difference comparison result corresponding to the first type.
In some embodiments, in the case that the data module corresponding to the first type is a picture module, the same data unit in each version is a picture in the historical version picture module and a picture in the same picture display position in the corresponding current version picture module; and under the condition that the data module corresponding to the first type is a map module, the same data unit in each version is a map of the same map display position in each map in the map module of the historical version and the map module of the corresponding current version.
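For picture and map modules, the comparison reduces to a changed or unchanged flag per display position. A minimal sketch follows, assuming each picture is represented by a comparable identifier such as a URL or content hash (an assumption; the disclosure does not state how picture content is compared), with `picture_changes` a hypothetical helper.

```python
from itertools import zip_longest


def picture_changes(old_pictures, new_pictures):
    """Return one flag per display position: True when the picture changed.

    A position present in only one version is compared against None and is
    therefore reported as changed.
    """
    return [old != new for old, new in zip_longest(old_pictures, new_pictures)]


flags = picture_changes(["a.png", "b.png"], ["a.png", "c.png", "d.png"])
```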
In some embodiments, the data module corresponding to each data type is any one of the following: a paragraph text module, a table type data module, and a preset general module.
The data difference comparison apparatus further includes: a first marking module, configured to generate corresponding first marking information according to the difference comparison result, where the first marking information is used for indicating the data change process from the historical version data to the current version data.
In some embodiments, the data module corresponding to each data type is any one of the following: a picture class module and a map class module.
The data difference comparison apparatus further includes: a second marking module, configured to generate corresponding second marking information according to the difference comparison result, where the second marking information is used for indicating that the data content has changed between the historical version data and the current version data.
In some embodiments, the data difference comparison apparatus further comprises: an adaptation processing module, configured to convert the difference comparison data stream into a data stream adapted to the service scenario of the auditing end according to a preset auditing-end service scenario; and a data sending module, configured to send the adapted data stream to the designated auditing end device, so that data auditing is performed at the auditing end device according to the difference comparison data stream.
In some embodiments, the data change process includes a data addition process and a data deletion process; the adapted data stream comprises at least any one of the following data streams: a revision mode data stream, a predefined backend data stream, and a machine audit data stream.
The revision mode data stream is used for displaying the current version data and prompt information of newly added data, and the newly added data is determined according to a data adding process.
The predefined back-end data stream comprises two parts of data: one part is used for displaying the historical version data and prompt information for the deleted data, where the deleted data is determined according to the data deletion process; the other part is used for displaying the current version data and prompt information for the newly added data.
The machine audit data stream is used for displaying the module change state, where the module change state is the change state information from the historical version data module corresponding to each data type to the data module of the same type in the current version.
According to the data difference comparison apparatus of the embodiment of the present disclosure, the new and old version data can be divided into modules of different types according to data type, so that the data is structured and difference comparison is performed per module type to obtain the difference comparison result corresponding to each module. The whole process avoids segmenting the data as a whole: with a non-invasive design, the new and old version data are segmented without changing the original rich text structure, which simplifies the operation steps, improves processing efficiency, and leaves more room for performing difference comparison on the different types of data display styles in the new and old version data.
Moreover, the difference comparison data stream obtained by the data difference comparison method according to the embodiment of the disclosure is a combination of the historical version data and the current version data and can be used to indicate the data change process from the historical version data to the current version data. The difference comparison data stream can therefore be applied to more service scenarios for data auditing, and the processing logic of the data difference comparison method is decoupled from any specific service scenario. The method is highly adaptable and can reduce data maintenance; in a new service scenario, it is only necessary to process the difference comparison data stream into the corresponding format, which reduces development cost.
It is to be understood that this disclosure is not limited to the particular configurations and processes described in the above embodiments and shown in the drawings. For convenience and brevity of description, detailed description of a known method is omitted here, and for the specific working processes of the system, the module and the unit described above, reference may be made to corresponding processes in the foregoing method embodiments, which are not described herein again.
The present disclosure also provides an electronic device and a readable storage medium according to an embodiment of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing devices, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. The RAM 703 can also store various programs and data required for the operation of the device 700. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704. Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Computing unit 701 may be a variety of general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the respective methods and processes described above, such as the data difference comparison method. For example, in some embodiments, the data difference comparison method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 708. In some embodiments, part or all of a computer program may be loaded onto and/or installed onto device 700 via ROM 702 and/or communication unit 709. When the computer program is loaded into RAM 703 and executed by the computing unit 701, one or more steps of the data difference comparison method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the data difference comparison method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
According to an embodiment of the present disclosure, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements any one of the above-mentioned data difference comparison methods.
Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (e.g., learning, reasoning, planning, etc.), and it involves technologies at both the hardware level and the software level. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technologies, and the like.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (21)

1. A data difference comparison method, comprising:
splitting historical version data and current version data of preset data according to data types, to obtain, for each data type, a historical version data module and a current version data module of the same type;
performing a rich text difference comparison between the historical version data module corresponding to each data type and the current version data module of the same type, to obtain a difference comparison result corresponding to the data module of each data type;
and generating a difference comparison data stream according to the difference comparison result, wherein the difference comparison data stream comprises the merged historical version data and current version data and is used for indicating a data change process from the historical version data to the current version data, so as to obtain a version difference comparison result of the preset data.
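By way of non-limiting illustration, the three steps of claim 1 can be sketched in Python as follows. The module layout (dictionaries with hypothetical `type` and `content` keys), the function names, and the use of the standard-library `difflib.SequenceMatcher` as the comparison routine are assumptions made for this example and are not taken from the disclosure.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def split_by_type(version):
    """Group one version's modules by data type (hypothetical 'type' key)."""
    modules = defaultdict(list)
    for module in version:
        modules[module["type"]].append(module["content"])
    return dict(modules)


def compare_modules(old_units, new_units):
    """Compare two same-type module lists; return (op, old_slice, new_slice) tuples."""
    matcher = SequenceMatcher(a=old_units, b=new_units, autojunk=False)
    return [
        (tag, old_units[i1:i2], new_units[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


def diff_stream(history, current):
    """Build a merged stream describing the change from history to current."""
    old_by_type = split_by_type(history)
    new_by_type = split_by_type(current)
    stream = []
    # Visit every data type present in either version.
    for data_type in sorted(set(old_by_type) | set(new_by_type)):
        ops = compare_modules(
            old_by_type.get(data_type, []), new_by_type.get(data_type, [])
        )
        stream.append({"type": data_type, "ops": ops})
    return stream
```

Each entry of the resulting stream carries both the historical and the current slices, so a consumer can replay the change from one version to the other without consulting the original documents.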
2. The method according to claim 1, wherein the performing the difference comparison of rich text data on the historical version data module corresponding to each data type and the corresponding data module of the same type as the current version to obtain the difference comparison result corresponding to the data module of each data type includes:
acquiring a data type as a first type, and splitting a historical version data module corresponding to the first type and a corresponding data module of the same type of a current version into a plurality of corresponding historical version data units and a plurality of corresponding current version data units;
performing difference comparison of rich text data on the same data unit in a plurality of respective versions corresponding to the first type to obtain a module difference comparison result corresponding to the first type;
and, in the process of obtaining the module difference comparison result, iterating multiple times, each time taking a different data type as the first type and obtaining the corresponding module difference comparison result, until the difference comparison result corresponding to the data module of each data type is obtained.
3. The method of claim 2, wherein the data module corresponding to the first type is any one of: a paragraph text module, a table data module, and a preset general module;
the performing difference comparison of rich text data on the same data unit in the multiple respective versions corresponding to the first type to obtain a module difference comparison result corresponding to the first type includes:
performing plain text conversion on rich text data in a plurality of respective version data units corresponding to the first type to obtain plain text conversion results of a plurality of historical version data units and plain text conversion results of a plurality of current version data units;
comparing plain text differences according to plain text conversion results of the same data unit in each version, and obtaining plain text difference comparison results of the same data unit in each version under the condition that plain texts have differences;
under the condition that plain texts are the same, comparing according to the modification data of the same data unit in each version to obtain modification difference comparison results of the same data unit in each version, wherein the modification data is used for indicating a text style;
and combining the plain text difference comparison result of the same data unit in each version with the modification difference comparison result of the same data unit in each version to obtain the module difference comparison result corresponding to the first type.
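By way of non-limiting illustration, the two-stage comparison of claim 3 (plain text first, text style only when the plain text is identical) can be sketched as follows. The representation of rich text as a list of `(text, style)` runs and all function names are assumptions made for this example; the disclosure does not prescribe a data layout.

```python
def to_plain_text(runs):
    """Drop style information and keep only the characters."""
    return "".join(text for text, _style in runs)


def char_styles(runs):
    """Expand runs to one style entry per character, so that the same
    styling split into different runs still compares as equal."""
    return [style for text, style in runs for _ in text]


def compare_unit(old_runs, new_runs):
    """Compare one data unit across versions: plain text first, then style."""
    old_text, new_text = to_plain_text(old_runs), to_plain_text(new_runs)
    if old_text != new_text:
        return {"kind": "text", "old": old_text, "new": new_text}
    if char_styles(old_runs) != char_styles(new_runs):
        return {"kind": "style", "text": old_text}
    return {"kind": "same"}


def compare_module(old_units, new_units):
    """Combine the per-unit results into a module-level result.
    Units are paired by position, which is a simplification."""
    return [compare_unit(old, new) for old, new in zip(old_units, new_units)]
```

Checking the plain text first keeps the common case cheap, since style data only needs to be inspected for units whose characters are unchanged.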
4. The method of claim 3, wherein,
in a case where the data module corresponding to the first type is a paragraph text module, the same data unit in each version is obtained by splitting the data module of each version by character, namely: each character in the historical version paragraph text module and each character in the same paragraph text module of the corresponding current version;
in a case where the data module corresponding to the first type is a table data module, the same data unit in each version is obtained by splitting the data module of each version by rows and columns, namely: each cell in the historical version table data module and the same cell in the corresponding current version table data module;
in a case where the data module corresponding to the first type is a preset general module, the same data unit in each version is obtained by splitting the data module of each version by row, namely: the data content of each field in each row of the historical version preset general module and the data content of each field in the same row of the corresponding current version preset general module.
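By way of non-limiting illustration, the three splitting rules of claim 4 can be sketched as follows. The type names `paragraph`, `table`, and `general`, the content shapes (a string, a list of rows, a list of field dictionaries), and the purely positional pairing of units are assumptions made for this example; an actual implementation would likely align characters with a diff algorithm rather than by index.

```python
def split_units(module_type, content):
    """Split a module into addressable data units, keyed by position."""
    if module_type == "paragraph":
        # One unit per character, keyed by character index.
        return {i: ch for i, ch in enumerate(content)}
    if module_type == "table":
        # One unit per cell, keyed by (row, column).
        return {
            (r, c): cell
            for r, row in enumerate(content)
            for c, cell in enumerate(row)
        }
    if module_type == "general":
        # One unit per field of each row, keyed by (row, field name).
        return {
            (r, field): value
            for r, row in enumerate(content)
            for field, value in row.items()
        }
    raise ValueError(f"unsupported module type: {module_type}")


def changed_units(module_type, old_content, new_content):
    """Return the keys of units whose content differs between versions."""
    old_units = split_units(module_type, old_content)
    new_units = split_units(module_type, new_content)
    keys = sorted(set(old_units) | set(new_units))
    return [key for key in keys if old_units.get(key) != new_units.get(key)]
```

Keying every unit by its position is what makes "the same data unit in each version" well defined: the same key in both dictionaries identifies the pair to compare.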
5. The method of claim 2, wherein the data module corresponding to the first type is any one of: a picture class module and a map class module;
the performing difference comparison of rich text data on the same data unit in the multiple respective versions corresponding to the first type to obtain a module difference comparison result corresponding to the first type includes:
and comparing differences of rich text data contained in the same data unit in a plurality of respective versions corresponding to the first type, and taking whether the contained rich text data has differences as a module difference comparison result corresponding to the first type.
6. The method of claim 5, wherein,
in a case where the data module corresponding to the first type is a picture class module, the same data unit in each version is a picture in the historical version picture class module and the picture at the same picture display position in the corresponding current version picture class module;
and in a case where the data module corresponding to the first type is a map class module, the same data unit in each version is a map in the historical version map class module and the map at the same map display position in the corresponding current version map class module.
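By way of non-limiting illustration, the position-based comparison of claims 5 and 6 can be sketched as follows: pictures or maps are paired by display position, and the result for each pair is only whether the content differs. Representing each item as raw bytes and comparing SHA-256 digests are assumptions made for this example.

```python
import hashlib


def fingerprint(data):
    """Return a digest of the item's bytes, or None if the position is empty."""
    return None if data is None else hashlib.sha256(data).hexdigest()


def compare_media(old_by_position, new_by_position):
    """Map each display position to True if its content changed."""
    positions = sorted(set(old_by_position) | set(new_by_position))
    return {
        pos: fingerprint(old_by_position.get(pos))
        != fingerprint(new_by_position.get(pos))
        for pos in positions
    }


def module_changed(old_by_position, new_by_position):
    """Module-level result: whether any position differs."""
    return any(compare_media(old_by_position, new_by_position).values())
```

Unlike the text modules, no fine-grained change process is produced here; the yes/no result per position is the whole comparison result.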
7. The method of any one of claims 1-6,
the data module corresponding to each data type is any one of the following modules: a paragraph text module, a table data module, and a preset general module, and the method further comprises:
generating corresponding first marking information according to the difference comparison result, wherein the first marking information is used for indicating a data change process from the historical version data to the current version data;
or the data module corresponding to each data type is any one of the following modules: a picture class module and a map class module, and the method further comprises:
generating corresponding second marking information according to the difference comparison result, wherein the second marking information is used for indicating that the data content has changed between the historical version data and the current version data.
8. The method of any of claims 1-6, further comprising:
converting the difference comparison data stream into a data stream adapted to a service scenario of an auditing end according to the preset service scenario of the auditing end;
and sending the adapted data stream to a designated auditing end device, so that data auditing is performed at the auditing end device according to the difference comparison data stream.
9. The method of claim 8, wherein,
the data change process comprises a data adding process and a data deleting process;
the adapted data stream comprises at least any one of the following data streams: a revision mode data stream, a predetermined back-end data stream, and a machine audit data stream; wherein,
the revision mode data stream is used for displaying the current version data and prompt information for newly added data, the newly added data being determined according to the data adding process;
the predetermined back-end data stream comprises two parts of data, wherein one part is used for displaying the historical version data and prompt information for deleted data, the deleted data being determined according to the data deleting process, and the other part is used for displaying the current version data and prompt information for newly added data;
and the machine audit data stream is used for displaying a module change state, wherein the module change state is change state information from the historical version data module corresponding to each data type to the current version data module of the same type.
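By way of non-limiting illustration, the three adapted data streams of claim 9 can be derived from a single difference comparison stream as sketched below. The input shape (a list of `(tag, text)` pairs with tags `equal`, `insert`, and `delete`) and all function and key names are assumptions made for this example.

```python
def revision_stream(ops):
    """Current version data, with newly added segments flagged."""
    return [
        {"text": text, "added": tag == "insert"}
        for tag, text in ops
        if tag != "delete"
    ]


def backend_stream(ops):
    """Two parts: history with deletions flagged, current with additions flagged."""
    history = [
        {"text": text, "deleted": tag == "delete"}
        for tag, text in ops
        if tag != "insert"
    ]
    return {"history": history, "current": revision_stream(ops)}


def machine_audit_stream(ops_by_type):
    """Per data type, whether the module changed between versions."""
    return {
        data_type: any(tag != "equal" for tag, _text in ops)
        for data_type, ops in ops_by_type.items()
    }
```

Because all three views are computed from the same stream, a new auditing scenario only needs a new adapter function; the comparison itself is not repeated.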
10. A data difference comparison apparatus comprising:
a version data splitting module, configured to split historical version data and current version data according to data types, to obtain, for each data type, a historical version data module and a current version data module of the same type;
a data difference comparison module, configured to perform a rich text difference comparison between the historical version data module corresponding to each data type and the current version data module of the same type, to obtain a difference comparison result corresponding to the data module of each data type;
and a comparison result determining module, configured to generate a difference comparison data stream according to the difference comparison result, wherein the difference comparison data stream comprises the merged historical version data and current version data and is used for indicating a data change process from the historical version data to the current version data, so as to determine the difference comparison result of the historical version data and the current version data.
11. The apparatus of claim 10, wherein the data difference comparison module comprises:
the data module splitting unit is used for acquiring a data type as a first type, and splitting a historical version data module corresponding to the first type and a corresponding data module of the same type as a current version into a plurality of corresponding historical version data units and a plurality of corresponding current version data units;
the data module comparison unit is used for carrying out difference comparison on rich text data on the same data unit in a plurality of respective versions corresponding to the first type to obtain a module difference comparison result corresponding to the first type;
and the module comparison result determining unit is used for iterating for multiple times to obtain different data types as the first type in the process of obtaining the module difference comparison result, and obtaining a plurality of corresponding module difference comparison results until obtaining the difference comparison result corresponding to the data module of each data type.
12. The apparatus of claim 11, wherein the data module corresponding to the first type is any one of: a paragraph text module, a table data module, and a preset general module; the data module comparison unit is specifically configured to:
performing plain text conversion on rich text data in a plurality of respective version data units corresponding to the first type to obtain plain text conversion results of a plurality of historical version data units and plain text conversion results of a plurality of current version data units;
comparing plain text differences according to plain text conversion results of the same data unit in each version, and obtaining plain text difference comparison results of the same data unit in each version under the condition that plain texts have differences;
under the condition that plain texts are the same, comparing according to the modification data of the same data unit in each version to obtain modification difference comparison results of the same data unit in each version, wherein the modification data is used for indicating a text style;
and combining the plain text difference comparison result of the same data unit in each version with the modification difference comparison result of the same data unit in each version to obtain the module difference comparison result corresponding to the first type.
13. The apparatus of claim 12, wherein,
in a case where the data module corresponding to the first type is a paragraph text module, the same data unit in each version is obtained by splitting the data module of each version by character, namely: each character in the historical version paragraph text module and each character in the same paragraph text module of the corresponding current version;
in a case where the data module corresponding to the first type is a table data module, the same data unit in each version is obtained by splitting the data module of each version by rows and columns, namely: each cell in the historical version table data module and the same cell in the corresponding current version table data module;
in a case where the data module corresponding to the first type is a preset general module, the same data unit in each version is obtained by splitting the data module of each version by row, namely: the data content of each field in each row of the historical version preset general module and the data content of each field in the same row of the corresponding current version preset general module.
14. The apparatus of claim 11, wherein the data module corresponding to the first type is any one of: a picture class module and a map class module;
the data module comparison unit is specifically configured to:
and comparing differences of rich text data contained in the same data unit in a plurality of respective versions corresponding to the first type, and taking whether the contained rich text data has differences as a module difference comparison result corresponding to the first type.
15. The apparatus of claim 14, wherein,
in a case where the data module corresponding to the first type is a picture class module, the same data unit in each version is a picture in the historical version picture class module and the picture at the same picture display position in the corresponding current version picture class module;
and in a case where the data module corresponding to the first type is a map class module, the same data unit in each version is a map in the historical version map class module and the map at the same map display position in the corresponding current version map class module.
16. The apparatus of any one of claims 10-15,
the data module corresponding to each data type is any one of the following modules: a paragraph text module, a table data module, and a preset general module; and the apparatus further comprises:
a first marking module, configured to generate corresponding first marking information according to the difference comparison result, wherein the first marking information is used for indicating a data change process from the historical version data to the current version data;
or the data module corresponding to each data type is any one of the following modules: a picture class module and a map class module; and the apparatus further comprises:
a second marking module, configured to generate corresponding second marking information according to the difference comparison result, wherein the second marking information is used for indicating that the data content has changed between the historical version data and the current version data.
17. The apparatus of any one of claims 10-15, further comprising:
an adaptation processing module, configured to convert the difference comparison data stream into a data stream adapted to a service scenario of an auditing end according to the preset service scenario of the auditing end;
and a data sending module, configured to send the adapted data stream to a designated auditing end device, so that data auditing is performed at the auditing end device according to the difference comparison data stream.
18. The apparatus of claim 17, wherein,
the data change process comprises a data adding process and a data deleting process;
the adapted data stream comprises at least any one of the following data streams: a revision mode data stream, a predetermined back-end data stream, and a machine audit data stream; wherein,
the revision mode data stream is used for displaying the current version data and prompt information for newly added data, the newly added data being determined according to the data adding process;
the predetermined back-end data stream comprises two parts of data, wherein one part is used for displaying the historical version data and prompt information for deleted data, the deleted data being determined according to the data deleting process, and the other part is used for displaying the current version data and prompt information for newly added data;
and the machine audit data stream is used for displaying a module change state, wherein the module change state is change state information from the historical version data module corresponding to each data type to the current version data module of the same type.
19. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.
20. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-9.
21. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-9.
CN202110106085.1A 2021-01-26 2021-01-26 Data difference comparison method, device, equipment, medium and computer program product Active CN112818656B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110106085.1A CN112818656B (en) 2021-01-26 2021-01-26 Data difference comparison method, device, equipment, medium and computer program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110106085.1A CN112818656B (en) 2021-01-26 2021-01-26 Data difference comparison method, device, equipment, medium and computer program product

Publications (2)

Publication Number Publication Date
CN112818656A true CN112818656A (en) 2021-05-18
CN112818656B CN112818656B (en) 2023-10-27

Family

ID=75859443

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110106085.1A Active CN112818656B (en) 2021-01-26 2021-01-26 Data difference comparison method, device, equipment, medium and computer program product

Country Status (1)

Country Link
CN (1) CN112818656B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688616A (en) * 2021-10-27 2021-11-23 深圳市明源云科技有限公司 Method, device and equipment for detecting chart report difference and storage medium
CN113923472A (en) * 2021-09-01 2022-01-11 北京奇艺世纪科技有限公司 Video content analysis method and device, electronic equipment and storage medium
CN113934644A (en) * 2021-12-16 2022-01-14 深圳市明源云链互联网科技有限公司 Version difference comparison method and device, intelligent terminal and readable storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050182773A1 (en) * 2004-02-18 2005-08-18 Feinsmith Jason B. Machine-implemented activity management system using asynchronously shared activity data objects and journal data items
US7281018B1 (en) * 2004-05-26 2007-10-09 Microsoft Corporation Form template data source change
US20090030755A1 (en) * 2007-07-25 2009-01-29 Utbk, Inc. Systems and Methods to Dynamically Generate Listings to Selectively Track User Responses
CN105335360A (en) * 2014-05-26 2016-02-17 国际商业机器公司 Method and apparatus for generating document structure
CN105404521A (en) * 2014-05-30 2016-03-16 广州市动景计算机科技有限公司 Incremental upgrading method and relevant device
CN106716402A (en) * 2014-05-12 2017-05-24 迪飞奥公司 Entity-centric knowledge discovery
KR20180073128A (en) * 2016-12-22 2018-07-02 항저우 순왕 테크놀로지 컴퍼니 리미티드 A data updating method based on data block comparison
CN109408102A (en) * 2018-09-04 2019-03-01 珠海格力电器股份有限公司 Version comparison method and device, household electrical appliance and network equipment
US10366053B1 (en) * 2015-11-24 2019-07-30 Amazon Technologies, Inc. Consistent randomized record-level splitting of machine learning data
CN110427215A (en) * 2019-07-30 2019-11-08 阿里巴巴集团控股有限公司 Program version difference display method and device applied to front-end development
CN111558218A (en) * 2020-07-13 2020-08-21 腾讯科技(深圳)有限公司 Method for controlling entry in game client and related device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
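刘芸良; 肖纯; 史晓雯; 刘艳: "Implementation and effect evaluation of a simulation method based on real SNPs data", 中国医院统计 (Chinese Journal of Hospital Statistics), no. 01 *
郎为民; 张汉; 赵毅丰; 姚晋芳: "A blockchain-based scheme for Internet of Things behavior monitoring and activity management", 信息网络安全 (Netinfo Security), no. 02 *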

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113923472A (en) * 2021-09-01 2022-01-11 北京奇艺世纪科技有限公司 Video content analysis method and device, electronic equipment and storage medium
CN113923472B (en) * 2021-09-01 2023-09-01 北京奇艺世纪科技有限公司 Video content analysis method, device, electronic equipment and storage medium
CN113688616A (en) * 2021-10-27 2021-11-23 深圳市明源云科技有限公司 Method, device and equipment for detecting chart report difference and storage medium
CN113688616B (en) * 2021-10-27 2022-02-25 深圳市明源云科技有限公司 Method, device and equipment for detecting chart report difference and storage medium
CN113934644A (en) * 2021-12-16 2022-01-14 深圳市明源云链互联网科技有限公司 Version difference comparison method and device, intelligent terminal and readable storage medium

Also Published As

Publication number Publication date
CN112818656B (en) 2023-10-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant