CN111177409A - Method and device for realizing data processing, computer storage medium and terminal - Google Patents
- Publication number
- CN111177409A (application number CN201911377740.6A)
- Authority
- CN
- China
- Prior art keywords
- attribute
- attribute value
- identification information
- storing
- attribute values
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A method, a device, a computer storage medium and a terminal for implementing data processing are provided. The method comprises: for graph data to be integrated, determining whether each attribute contains two or more attribute values; and, when an attribute contains two or more attribute values, storing the attribute values according to whether the attribute contains identical attribute values. By storing and processing the graph data to be integrated in this way, the embodiments of the invention enable management of the historical data of a knowledge graph.
Description
Technical Field
This document relates to, but is not limited to, knowledge graph technology, and more particularly, to a method, an apparatus, a computer storage medium, and a terminal for implementing data processing.
Background
At present, knowledge graph technology is widely used to describe things and the associations between them. The property graph is a basic and common way to represent a graph. In a property graph, vertices (i.e., the points of the knowledge graph) represent real-world entities (e.g., people, cars, hotels), and edges represent real-world relationships (e.g., a parent-child relationship, a person's ownership of a car). Each entity is uniquely identified by a primary key field, a label identifies its category (e.g., person, car), and each entity may have a number of other attributes. Relationships are similar: a label identifies the type of relationship, each relationship may also have a number of other attributes, and a relationship is uniquely identified by the primary keys of the subject and the object that make it up.
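As a minimal illustration of the property-graph model described above, the following Python sketch represents entities and relationships as plain data classes. The class and field names (`Vertex`, `Edge`, `primary_key`, `label`, `properties`) are assumptions chosen for illustration and do not come from this application.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class Vertex:
    """An entity: uniquely identified by its primary key, categorized by a label."""
    primary_key: str
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A relationship: identified by the primary keys of its subject and object."""
    subject_key: str
    object_key: str
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)

    @property
    def identity(self) -> Tuple[str, str]:
        # Identity is the (subject primary key, object primary key) pair.
        return (self.subject_key, self.object_key)


# Hypothetical sample data.
person = Vertex("p1", "person", {"name": "Alice"})
car = Vertex("c1", "car", {"color": "red"})
owns = Edge(person.primary_key, car.primary_key, "owns", {"since": 2018})
```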
When a knowledge graph is constructed from existing data, entities and relationships may be collated from structured and unstructured raw data. The raw data may change continuously over time. During construction of the knowledge graph, data is generally processed by overwriting older values according to the data timestamp, so the state of entities and relationships at historical points in time cannot be queried in the knowledge graph. This limits how well users and technicians can understand the content of the knowledge graph. How to manage the data related to a knowledge graph effectively, and thereby improve the graph's performance, has become a technical problem to be solved.
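The difference between overwriting by timestamp and keeping history can be sketched as follows. The record layout `(primary key, attribute, value, timestamp)` and the sample values are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical raw records: (primary key, attribute, value, timestamp).
records = [
    ("p1", "city", "Beijing", 1),
    ("p1", "city", "Shanghai", 2),
]

# Overwrite-update: only the value with the newest timestamp survives.
latest: Dict[Tuple[str, str], Tuple[str, int]] = {}
for key, attr, value, ts in records:
    current = latest.get((key, attr))
    if current is None or ts >= current[1]:
        latest[(key, attr)] = (value, ts)

# History-preserving alternative: keep every (value, timestamp) pair.
history: Dict[Tuple[str, str], List[Tuple[str, int]]] = {}
for key, attr, value, ts in records:
    history.setdefault((key, attr), []).append((value, ts))
```

With the overwrite approach the earlier value of `city` is no longer retrievable, which is the limitation this application sets out to address.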
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiment of the invention provides a method and a device for realizing data processing, a computer storage medium and a terminal, which can realize effective management of data related to a knowledge graph.
The embodiment of the invention provides a method for implementing data processing, which comprises:
for graph data to be integrated, determining whether each attribute contains two or more attribute values;
and when an attribute contains two or more attribute values, storing the attribute values according to whether the attribute contains identical attribute values.
In an exemplary embodiment, storing the attribute values according to whether the attribute contains identical attribute values includes:
when the attribute values contained in the attribute are all different, storing each attribute value according to whether the attribute values contain identification information;
when the attribute contains identical attribute values, performing deduplication on the identical attribute values, and storing each attribute value according to whether the deduplicated attribute values contain the identification information;
wherein the identification information is used to distinguish the write time of each attribute value.
In an exemplary embodiment, performing deduplication on the identical attribute values includes:
determining whether the identical attribute values contained in the attribute contain the identification information;
when the identification information is contained, retaining the attribute value of a preset time; when the identification information is not contained, retaining any one of the identical attribute values;
wherein the preset time includes: the earliest time or the latest time.
In an exemplary embodiment, storing the attribute values according to whether the identification information is contained includes:
when the identification information is contained, storing each attribute value in order of write time;
and when the identification information is not contained, storing each attribute value in the order in which it is read from the graph data to be integrated.
In an exemplary embodiment, the identification information includes any one of:
version information, timestamp information.
In an exemplary embodiment, the storing of the attribute values includes:
and storing the attribute value of the attribute into a preset storage area.
In another aspect, an embodiment of the present invention further provides a device for implementing data processing, including a determination unit and a storage unit, wherein:
the determination unit is configured to: for graph data to be integrated, determine whether each attribute contains two or more attribute values;
the storage unit is configured to: when an attribute contains two or more attribute values, store the attribute values according to whether the attribute contains identical attribute values.
In an exemplary embodiment, the storage unit is specifically configured to:
when the attribute values contained in the attribute are all different, store each attribute value according to whether the attribute values contain identification information;
when the attribute contains identical attribute values, perform deduplication on the identical attribute values, and store each attribute value according to whether the deduplicated attribute values contain the identification information;
wherein the identification information is used to distinguish the write time of each attribute value.
In still another aspect, an embodiment of the present invention further provides a computer storage medium, where a computer program is stored, and when the computer program is executed by a processor, the method for implementing data processing is implemented.
In another aspect, an embodiment of the present invention further provides a terminal, including a memory and a processor, the memory having a computer program stored therein, wherein:
the processor is configured to execute the computer program in the memory;
the computer program, when executed by the processor, implements a method of implementing data processing as described above.
Compared with the related art, the technical scheme of the application comprises the following steps: for graph data to be integrated, determining whether each attribute contains two or more attribute values; and when an attribute contains two or more attribute values, storing the attribute values according to whether the attribute contains identical attribute values. By storing and processing the graph data to be integrated in this way, the embodiment of the invention enables management of the historical data of the knowledge graph.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the example serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a flowchart of a method for implementing data processing according to an embodiment of the present invention;
FIG. 2 is a block diagram of an apparatus for implementing data processing according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
The steps illustrated in the flow charts of the figures may be performed in a computer system such as a set of computer-executable instructions. Also, while a logical order is shown in the flow diagrams, in some cases, the steps shown or described may be performed in an order different than here.
Fig. 1 is a flowchart of a method for implementing data processing according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
Step 101, for graph data to be integrated, determining whether each attribute contains two or more attribute values.
The embodiment of the invention can determine whether an attribute of the graph data contains a plurality of attribute values by referring to relevant database principles.
Step 102, when an attribute contains two or more attribute values, storing the attribute values according to whether the attribute contains identical attribute values.
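The check for multi-valued attributes, and the test for identical values that the storing step branches on, can be sketched as follows. This is only an illustrative sketch: the record shape `(primary key, attribute name, attribute value)` and the function names are assumptions, and attribute values are assumed to be hashable.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

# Hypothetical record shape: (primary key, attribute name, attribute value).
Record = Tuple[str, str, Any]
Grouped = Dict[Tuple[str, str], List[Any]]


def group_attribute_values(records: Iterable[Record]) -> Grouped:
    """Collect every value seen for each (primary key, attribute) pair, in read order."""
    grouped: Grouped = defaultdict(list)
    for key, attr, value in records:
        grouped[(key, attr)].append(value)
    return dict(grouped)


def multi_valued(grouped: Grouped) -> Grouped:
    """Keep only the attributes that carry two or more values."""
    return {k: v for k, v in grouped.items() if len(v) >= 2}


def has_identical_values(values: List[Any]) -> bool:
    """True if at least one value occurs more than once (values must be hashable)."""
    return len(set(values)) < len(values)


# Hypothetical sample data.
records = [
    ("p1", "city", "Beijing"),
    ("p1", "city", "Shanghai"),
    ("p1", "city", "Beijing"),
    ("p1", "name", "Alice"),
]
multi = multi_valued(group_attribute_values(records))
```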
In an exemplary embodiment, the storing the attribute values according to whether the attributes contain the same attribute value includes:
when the attribute values contained in the attributes are different, storing each attribute value according to whether the attribute value contains identification information;
when the attributes contain the same attribute value, the same attribute value is subjected to duplicate removal processing; storing each attribute value according to whether the attribute value after the duplicate removal processing contains the identification information;
the identification information is used for distinguishing the writing time of each attribute value.
In an exemplary embodiment, the performing the deduplication processing on the same attribute value includes:
determining whether the same attribute value contained in the attribute contains the identification information;
when the identification information is contained, reserving an attribute value of preset time; when the identification information is not contained, any attribute value in the same attribute values is reserved;
wherein the preset time includes: the earliest time or the latest time.
It should be noted that the earliest time is the time farthest from the current time, and the latest time is the time closest to the current time.
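A minimal sketch of the deduplication rule follows. It assumes each value is paired with an integer write time, or `None` when no identification information is present, and it takes "earliest" to mean the smallest write time and "latest" the largest. The function name, the `keep` parameter, and the handling of a mix of stamped and unstamped copies are assumptions not specified in this application.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical shape: (attribute value, write time); the write time is None
# when the value carries no identification information.
Stamped = Tuple[Any, Optional[int]]


def deduplicate(values: List[Stamped], keep: str = "latest") -> List[Stamped]:
    """Collapse identical attribute values into one entry each.

    With identification information, keep the copy at the preset time
    ("earliest" or "latest"); without it, keep an arbitrary copy
    (here: the first one read).
    """
    if keep not in ("earliest", "latest"):
        raise ValueError("keep must be 'earliest' or 'latest'")
    kept: Dict[Any, Stamped] = {}
    for value, ts in values:
        if value not in kept:
            kept[value] = (value, ts)
            continue
        old_ts = kept[value][1]
        if ts is None:
            continue  # nothing to compare; keep the copy already retained
        if old_ts is None:
            # Assumption: prefer a copy that has a write time over one without.
            kept[value] = (value, ts)
        elif (keep == "latest" and ts > old_ts) or (keep == "earliest" and ts < old_ts):
            kept[value] = (value, ts)
    return list(kept.values())
```

For example, `deduplicate([("A", 1), ("B", 2), ("A", 3)], keep="earliest")` retains the copy of `"A"` written at time 1, while `keep="latest"` retains the copy written at time 3.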
In an exemplary embodiment, storing the attribute values according to whether the identification information is included includes:
when the identification information is contained, storing each attribute value according to the sequence of writing time;
and when the identification information is not contained, storing each attribute value in the order in which it is read from the graph data to be integrated.
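The ordering rule can be sketched in a few lines. As an assumption for illustration, each value is paired with an integer write time or `None`, and the write-time ordering is applied only when every value carries a write time.

```python
from typing import Any, List, Optional, Tuple

# Hypothetical shape: (attribute value, write time or None).
Stamped = Tuple[Any, Optional[int]]


def order_for_storage(values: List[Stamped]) -> List[Stamped]:
    """Order attribute values before storing them.

    If every value carries a write time, sort by it; otherwise keep the
    order in which the values were read.
    """
    if all(ts is not None for _, ts in values):
        # sorted() is stable, so values with equal write times keep read order.
        return sorted(values, key=lambda item: item[1])
    return list(values)
```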
In the embodiment of the invention, the multiple attribute values of the same attribute in the graph data to be integrated are stored, which provides data support for retrieving historical data and for application analysis of the knowledge graph.
In an exemplary embodiment, the identification information includes any one of:
version information, timestamp information.
It should be noted that the timestamp information and the version information in the embodiment of the present invention may be set based on principles of the related art; for example, the version number may be set at a preset time granularity.
TABLE 1
The following is an example. Table 1 is an example of a first data source of graph data according to an embodiment of the present invention; as shown in Table 1, the write time of each attribute value is recorded as a timestamp. According to the time granularity, the embodiment of the invention assigns the attribute values with timestamps ts1 and ts3 to version 1 and the attribute values with timestamps ts2 and ts4 to version 2; after the versions are assigned, Table 2 is obtained, in which the version identifies the storage order of the attribute values. Table 3 is an example of a second data source according to the embodiment of the present invention; as shown in Table 3, the attribute values in the second data source are stored with version information. Table 4 is an example of the data after storage processing in the embodiment of the present invention; as shown in Table 4, identical attribute values in Tables 2 and 3 are deduplicated according to version information and then stored as the graph data shown in Table 4.
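One way to derive version numbers from timestamps at a preset time granularity is to bucket the timestamps into fixed-width windows. The sketch below is an assumption about how this could be done, not the method of this application; the numeric values standing in for ts1 to ts4 are hypothetical, since the tables are not reproduced here.

```python
from typing import Dict, List


def assign_versions(timestamps: List[int], granularity: int) -> Dict[int, int]:
    """Map each timestamp to a 1-based version number.

    Timestamps that fall into the same window of `granularity` time units
    share a version; versions are numbered in time order of the occupied windows.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    buckets = sorted({ts // granularity for ts in timestamps})
    version_of_bucket = {bucket: i + 1 for i, bucket in enumerate(buckets)}
    return {ts: version_of_bucket[ts // granularity] for ts in timestamps}


# Hypothetical stand-ins for ts1..ts4 (in seconds); the real values are not given.
ts1, ts2, ts3, ts4 = 100, 3700, 200, 3900
versions = assign_versions([ts1, ts2, ts3, ts4], granularity=3600)
```

With these sample values, ts1 and ts3 land in the first one-hour window and ts2 and ts4 in the second, mirroring the version 1 and version 2 grouping in the example.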
TABLE 2
TABLE 3
In an exemplary embodiment, the storing of the attribute values includes:
and storing the attribute value of the attribute into a preset storage area.
Through the above processing, the embodiment of the invention stores historical graph data, and each state of an entity or relationship can be uniquely determined by the primary key together with the identification information.
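Keying stored states by the primary key plus the identification information can be sketched as below, using an integer version as the identification information. `HistoryStore` and its methods are hypothetical names, and the `as_of` lookup is an added illustration of how a historical state could be queried rather than something this application describes.

```python
from typing import Any, Dict, Optional, Tuple


class HistoryStore:
    """Hypothetical store: each state is keyed by (primary key, version)."""

    def __init__(self) -> None:
        self._states: Dict[Tuple[str, int], Dict[str, Any]] = {}

    def put(self, primary_key: str, version: int, properties: Dict[str, Any]) -> None:
        self._states[(primary_key, version)] = dict(properties)

    def get(self, primary_key: str, version: int) -> Optional[Dict[str, Any]]:
        """Return the state stored for exactly this key and version, if any."""
        return self._states.get((primary_key, version))

    def as_of(self, primary_key: str, version: int) -> Optional[Dict[str, Any]]:
        """Return the newest state whose version does not exceed `version`."""
        candidates = [v for (k, v) in self._states if k == primary_key and v <= version]
        return self._states[(primary_key, max(candidates))] if candidates else None


# Hypothetical sample data.
store = HistoryStore()
store.put("p1", 1, {"city": "Beijing"})
store.put("p1", 3, {"city": "Shanghai"})
```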
TABLE 4
Compared with the related art, the technical scheme of the application comprises the following steps: for graph data to be integrated, determining whether each attribute contains two or more attribute values; and when an attribute contains two or more attribute values, storing the attribute values according to whether the attribute contains identical attribute values. By storing and processing the graph data to be integrated in this way, the embodiment of the invention enables management of the historical data of the knowledge graph.
Fig. 2 is a block diagram of an apparatus for implementing data processing according to an embodiment of the present invention. As shown in Fig. 2, the apparatus includes a determination unit and a storage unit, wherein:
the determination unit is configured to: for graph data to be integrated, determine whether each attribute contains two or more attribute values;
the storage unit is configured to: when an attribute contains two or more attribute values, store the attribute values according to whether the attribute contains identical attribute values.
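The division of work between the two units might be sketched as follows. The class names, the in-memory dictionary standing in for a storage area, and the simple order-preserving deduplication are assumptions for illustration; the sketch also ignores identification information and leaves single-valued attributes untouched, which this application does not address.

```python
from typing import Any, Dict, List


class DeterminationUnit:
    def is_multi_valued(self, values: List[Any]) -> bool:
        """True when the attribute carries two or more values."""
        return len(values) >= 2


class StorageUnit:
    def __init__(self) -> None:
        # Stands in for a storage area; a plain dict is an assumption.
        self.area: Dict[str, List[Any]] = {}

    def store(self, attribute: str, values: List[Any]) -> None:
        # Drop identical values while preserving read order (values must be hashable).
        self.area[attribute] = list(dict.fromkeys(values))


class DataProcessingDevice:
    def __init__(self) -> None:
        self.determination_unit = DeterminationUnit()
        self.storage_unit = StorageUnit()

    def process(self, attributes: Dict[str, List[Any]]) -> None:
        for name, values in attributes.items():
            if self.determination_unit.is_multi_valued(values):
                self.storage_unit.store(name, values)


# Hypothetical sample data.
device = DataProcessingDevice()
device.process({"city": ["Beijing", "Shanghai", "Beijing"], "name": ["Alice"]})
```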
In an exemplary embodiment, the storage unit is specifically configured to:
when the attribute values contained in the attributes are different, storing each attribute value according to whether the attribute value contains identification information;
when the attributes contain the same attribute value, the same attribute value is subjected to duplicate removal processing; storing each attribute value according to whether the attribute value after the duplicate removal processing contains the identification information;
the identification information is used for distinguishing the writing time of each attribute value.
In an exemplary embodiment, the storage unit is configured to perform deduplication processing on the same attribute value, and includes:
determining whether the same attribute value contained in the attribute contains the identification information;
when the identification information is contained, reserving an attribute value of preset time; when the identification information is not contained, any attribute value in the same attribute values is reserved;
wherein the preset time includes: the earliest time or the latest time.
It should be noted that the earliest time is the time farthest from the current time, and the latest time is the time closest to the current time.
In an exemplary embodiment, the storage unit is configured to store the attribute values according to whether the identification information is included, and includes:
when the identification information is contained, storing each attribute value according to the sequence of writing time;
and when the identification information is not contained, storing each attribute value in the order in which it is read from the graph data to be integrated.
In the embodiment of the invention, the multiple attribute values of the same attribute in the graph data to be integrated are stored, which provides data support for retrieving historical data and for application analysis of the knowledge graph.
In an exemplary embodiment, the identification information includes any one of:
version information, timestamp information.
It should be noted that the timestamp information and the version information in the embodiment of the present invention may be set based on principles of the related art.
In an exemplary embodiment, the storage unit being configured to store the attribute values includes:
and storing the attribute value of the attribute into a preset storage area.
Through the above processing, the embodiment of the invention stores historical graph data, and each state of an entity or relationship can be uniquely determined by the primary key together with the identification information.
Compared with the related art, the technical scheme of the application comprises the following steps: for graph data to be integrated, determining whether each attribute contains two or more attribute values; and when an attribute contains two or more attribute values, storing the attribute values according to whether the attribute contains identical attribute values. By storing and processing the graph data to be integrated in this way, the embodiment of the invention enables management of the historical data of the knowledge graph.
The embodiment of the invention also provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and when being executed by a processor, the computer program realizes the method for realizing data processing.
An embodiment of the present invention further provides a terminal, including a memory and a processor, the memory having a computer program stored therein, wherein:
the processor is configured to execute the computer program in the memory;
the computer program, when executed by the processor, implements a method of implementing data processing as described above.
One of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, and functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed by several physical components in cooperation. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as is well known to those of ordinary skill in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. In addition, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media, as is known to those skilled in the art.
Claims (10)
1. A method of implementing data processing, comprising:
for graph data to be integrated, determining whether each attribute contains two or more attribute values;
and when an attribute contains two or more attribute values, storing the attribute values according to whether the attribute contains identical attribute values.
2. The method according to claim 1, wherein the storing the attribute values according to whether the attributes contain the same attribute value comprises:
when the attribute values contained in the attributes are different, storing each attribute value according to whether the attribute value contains identification information;
when the attributes contain the same attribute value, the same attribute value is subjected to duplicate removal processing; storing each attribute value according to whether the attribute value after the duplicate removal processing contains the identification information;
the identification information is used for distinguishing the writing time of each attribute value.
3. The method of claim 2, wherein the performing deduplication processing on the same attribute value comprises:
determining whether the same attribute value contained in the attribute contains the identification information;
when the identification information is contained, reserving an attribute value of preset time; when the identification information is not contained, any attribute value in the same attribute values is reserved;
wherein the preset time includes: the earliest time or the latest time.
4. The method according to claim 2, wherein the storing of the attribute values according to whether the identification information is included comprises:
when the identification information is contained, storing each attribute value according to the sequence of writing time;
and when the identification information is not contained, storing each attribute value in the order in which it is read from the graph data to be integrated.
5. The method according to any one of claims 2 to 4, wherein the identification information comprises any one of:
version information, timestamp information.
6. The method according to any one of claims 2 to 4, wherein the storing of the attribute values comprises:
and storing the attribute value of the attribute into a preset storage area.
7. An apparatus for implementing data processing, comprising a determination unit and a storage unit, wherein:
the determination unit is configured to: for graph data to be integrated, determine whether each attribute contains two or more attribute values;
the storage unit is configured to: when an attribute contains two or more attribute values, store the attribute values according to whether the attribute contains identical attribute values.
8. The apparatus according to claim 7, wherein the storage unit is specifically configured to:
when the attribute values contained in the attributes are different, storing each attribute value according to whether the attribute value contains identification information;
when the attributes contain the same attribute value, the same attribute value is subjected to duplicate removal processing; storing each attribute value according to whether the attribute value after the duplicate removal processing contains the identification information;
the identification information is used for distinguishing the writing time of each attribute value.
9. A computer storage medium having stored thereon a computer program which, when executed by a processor, implements a method of implementing data processing as claimed in any one of claims 1 to 6.
10. A terminal, comprising a memory and a processor, the memory having a computer program stored therein, wherein:
the processor is configured to execute the computer program in the memory;
the computer program, when executed by the processor, implements a method of implementing data processing as recited in any of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911377740.6A CN111177409A (en) | 2019-12-27 | 2019-12-27 | Method and device for realizing data processing, computer storage medium and terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111177409A true CN111177409A (en) | 2020-05-19 |
Family
ID=70654079
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911377740.6A Pending CN111177409A (en) | 2019-12-27 | 2019-12-27 | Method and device for realizing data processing, computer storage medium and terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111177409A (en) |
-
2019
- 2019-12-27 CN CN201911377740.6A patent/CN111177409A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20200519 |