CN112445806A - Method and device for continuously updating data - Google Patents

Info

Publication number
CN112445806A
Authority
CN
China
Legal status
Pending
Application number
CN201910810984.2A
Other languages
Chinese (zh)
Inventor
程德生
王博
蒋洵
江峰
张鹤
王梨
孙延春
苏翃宇
钱刚
李俊呈
周丹
Current Assignee
China Soft Hangzhou Anren Network Communication Co ltd
Original Assignee
China Soft Hangzhou Anren Network Communication Co ltd
Priority date
2019-08-29
Filing date
2019-08-29
Publication date
2021-03-05

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for continuously updating data. The method comprises: for data to be warehoused, if the data is update data and is mapped data, performing format conversion on the data; performing data processing on the format-converted data; performing a data quality check on the data that has undergone the data processing; performing data conversion on the data that passes the data quality check; and warehousing the data that has undergone the data conversion. The method and the device can guarantee the accuracy, availability, timeliness and integrity of the data warehoused in a big data platform, so as to support the various services of a social co-governance big data platform.

Description

Method and device for continuously updating data
Technical Field
The invention relates to the technical field of big data, in particular to a method and a device for continuously updating data.
Background
A social co-governance big data platform provides public and paid data services to government bodies, enterprises, the general public and third-party platforms. As the platform develops, data from different sources continuously enriches its basic data. How to guarantee the accuracy, availability, timeliness and integrity of that data is a problem that urgently needs to be solved.
Disclosure of Invention
The method and the device for continuously updating data provided by the invention can guarantee the accuracy, availability, timeliness and integrity of the data warehoused in a big data platform, so as to support the various services of a social co-governance big data platform.
In a first aspect, the present invention provides a method for continuously updating data, the method comprising:
for data to be warehoused, if the data is update data and is mapped data, performing format conversion on the data to be warehoused;
performing data processing on the format-converted data;
performing a data quality check on the data that has undergone the data processing;
performing data conversion on the data that passes the data quality check;
and warehousing the data that has undergone the data conversion.
Optionally, if the data to be warehoused is new data and is unmapped data, before the format conversion is performed on the data to be warehoused, the method further comprises:
performing data mapping on the data to be warehoused;
performing data preprocessing on the mapped data;
and performing a data logic check on the preprocessed data, and performing the format conversion on the data to be warehoused when the data logic check is passed.
Optionally, the method further comprises:
performing the data processing again on data that fails the data quality check.
Optionally, the method further comprises:
performing the data mapping again on data that fails the data logic check.
In a second aspect, the present invention provides an apparatus for continuously updating data, the apparatus comprising:
a format conversion unit, configured to perform format conversion on data to be warehoused if the data is update data and is mapped data;
a processing unit, configured to perform data processing on the format-converted data;
a quality inspection unit, configured to perform a data quality check on the data that has undergone the data processing;
a data conversion unit, configured to perform data conversion on the data that passes the data quality check;
and a warehousing unit, configured to warehouse the data that has undergone the data conversion.
Optionally, the apparatus further comprises:
a mapping unit, configured to, when the data to be warehoused is new data and is unmapped data, perform data mapping on the data to be warehoused before the format conversion unit performs the format conversion on it;
a preprocessing unit, configured to perform data preprocessing on the mapped data;
a logic inspection unit, configured to perform a data logic check on the preprocessed data;
wherein the format conversion unit is configured to perform the format conversion on the data to be warehoused when the data logic check is passed.
Optionally, the processing unit is further configured to perform the data processing again on data that fails the data quality check.
Optionally, the mapping unit is further configured to perform the data mapping again on data that fails the data logic check.
According to the method and the device for continuously updating data provided by the embodiments of the invention, for data to be warehoused that is update data and is mapped data, format conversion is performed first, data processing and a data quality check are then performed on the format-converted data, and the data that passes the data quality check is warehoused after data conversion. In this way the accuracy, availability, timeliness and integrity of the data can be guaranteed, data quality is ensured, and the various services of the social co-governance big data platform are supported.
Drawings
FIG. 1 is a flow chart of a method for continuously updating data according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for continuously updating data according to another embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an apparatus for continuously updating data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for continuously updating data according to another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Aggregated data is an important component of the basic data of a social co-governance big data platform. It is the "lifeline" on which analysis depends and is indispensable basic information for governing the market environment. As the platform develops, data from different sources continuously enriches its basic data. Establishing a long-term mechanism for dynamically updating and maintaining the basic data, and thereby maintaining the integrity, currency and accuracy of the database, is a precondition for information analysis on the social co-governance big data platform. A continuous data updating mechanism is therefore the basis for dynamically updating the big data platform and a prerequisite for guaranteeing data quality, that is, the accuracy, availability, timeliness and integrity of the data.
An embodiment of the present invention provides a method for continuously updating data, as shown in fig. 1, the method includes:
S11, for data to be warehoused, if the data is update data and is mapped data, performing format conversion on the data to be warehoused.
S12, performing data processing on the format-converted data.
S13, performing a data quality check on the data that has undergone the data processing.
S14, performing data conversion on the data that passes the data quality check.
S15, warehousing the data that has undergone the data conversion.
According to the method for continuously updating data provided by this embodiment, for data to be warehoused that is update data and is mapped data, format conversion is performed first, data processing and a data quality check are then performed on the format-converted data, and the data that passes the data quality check is warehoused after data conversion. In this way the accuracy, availability, timeliness and integrity of the data can be guaranteed, data quality is ensured, and the various services of the social co-governance big data platform are supported.
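The source does not say what format conversion, data processing, the data quality check or data conversion actually do, so the following Python sketch is only one possible reading of steps S11 to S15. Every function body, the dictionary record type and the `amount` field are illustrative assumptions and are not part of the disclosed method.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]  # assumed record shape; the source does not define one


def format_convert(record: Record) -> Record:
    """S11: format conversion (assumed here: strip and lowercase the keys)."""
    return {key.strip().lower(): value for key, value in record.items()}


def process(record: Record) -> Record:
    """S12: data processing (assumed here: trim whitespace in string values)."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def quality_check(record: Record) -> bool:
    """S13: data quality check (assumed here: no value is None or empty)."""
    return all(v not in (None, "") for v in record.values())


def data_convert(record: Record) -> Record:
    """S14: data conversion (assumed here: cast a hypothetical 'amount' to float)."""
    converted = dict(record)
    if "amount" in converted:
        converted["amount"] = float(converted["amount"])
    return converted


def update_mapped_record(record: Record, warehouse: List[Record]) -> bool:
    """Run S11 to S15 for a record that is update data and already mapped."""
    candidate = process(format_convert(record))
    if not quality_check(candidate):
        return False  # in the flow of FIG. 1 only checked data is warehoused
    warehouse.append(data_convert(candidate))  # S15: warehousing
    return True
```

A list stands in for the warehouse here; in practice the last step would write to whatever store the platform uses.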
The method for continuously updating data according to the embodiment of the present invention is described in detail below.
As shown in fig. 2, the method for continuously updating data includes:
S21, for data to be warehoused, determining whether the data is update data and whether it is mapped data; if the data is update data and is mapped data, executing steps S22 to S26; if the data is new data and is unmapped data, executing step S27.
S22, performing format conversion on the data to be warehoused.
S23, performing data processing on the format-converted data.
S24, performing a data quality check on the processed data; if the data quality check is passed, executing step S25; otherwise, returning to step S23.
S25, performing data conversion on the data that passes the data quality check.
S26, warehousing the data that has undergone the data conversion.
S27, performing data mapping on the data to be warehoused.
S28, performing data preprocessing on the mapped data.
S29, performing a data logic check on the preprocessed data; if the data logic check is passed, executing step S22; otherwise, returning to step S27.
According to the method for continuously updating data provided by this embodiment, for data to be warehoused that is new data and is unmapped data, data mapping and data preprocessing are performed first, followed by a data logic check. Once the data logic check is passed, format conversion is performed, data processing and a data quality check are then performed on the format-converted data, and the data that passes the data quality check is warehoused after data conversion. In this way the accuracy, availability, timeliness and integrity of the data can be guaranteed, data quality is ensured, and the various services of the social co-governance big data platform are supported.
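The branching and the two return loops of steps S21 to S29 can be sketched as follows. This is a hedged illustration rather than the disclosed implementation: the `Steps` container, the `ingest` function and the `max_retries` bound are assumptions (the source returns to S23 or S27 without stating a limit), and the source does not cover data that is update data but unmapped, or new data that is already mapped, so the sketch rejects those cases.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Step = Callable[[Record], Record]
Check = Callable[[Record], bool]


@dataclass
class Steps:
    """Pluggable step implementations; all are placeholders in this sketch."""
    map_data: Step        # S27
    preprocess: Step      # S28
    logic_check: Check    # S29
    format_convert: Step  # S22
    process: Step         # S23
    quality_check: Check  # S24
    data_convert: Step    # S25


def ingest(record: Record, is_update: bool, is_mapped: bool, steps: Steps,
           warehouse: List[Record], max_retries: int = 3) -> bool:
    """Follow the flow of FIG. 2; return True if the record was warehoused."""
    current = record
    # S21: decide which branch applies.
    if not is_update and not is_mapped:
        # S27 to S29: map, preprocess, logic-check; on failure return to S27.
        for _ in range(max_retries):
            current = steps.preprocess(steps.map_data(current))
            if steps.logic_check(current):
                break
        else:
            return False  # retry budget exhausted (bound is an assumption)
    elif not (is_update and is_mapped):
        raise ValueError("only update+mapped and new+unmapped data are covered")

    current = steps.format_convert(current)                 # S22
    for _ in range(max_retries):
        current = steps.process(current)                    # S23
        if steps.quality_check(current):                    # S24
            warehouse.append(steps.data_convert(current))   # S25, S26
            return True
    return False  # quality check never passed within the retry budget
```

Bounding the loops keeps a record that can never pass a check from cycling forever; a production system might instead route such records to manual review.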
An embodiment of the present invention further provides a device for continuously updating data, as shown in fig. 3, the device includes:
a format conversion unit 11, configured to perform format conversion on data to be warehoused if the data is update data and is mapped data;
a processing unit 12, configured to perform data processing on the format-converted data;
a quality inspection unit 13, configured to perform a data quality check on the data that has undergone the data processing;
a data conversion unit 14, configured to perform data conversion on the data that passes the data quality check;
and a warehousing unit 15, configured to warehouse the data that has undergone the data conversion.
According to the device for continuously updating data provided by this embodiment, for data to be warehoused that is update data and is mapped data, format conversion is performed first, data processing and a data quality check are then performed on the format-converted data, and the data that passes the data quality check is warehoused after data conversion. In this way the accuracy, availability, timeliness and integrity of the data can be guaranteed, data quality is ensured, and the various services of the social co-governance big data platform are supported.
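One way to picture the structure of FIG. 3 in software is a device object that holds units 11 to 15 and passes a record through them in order. The `UpdateDevice` class, its `handle` method and the modelling of each unit as a plain callable are assumptions made for this sketch; the source describes the units only functionally.

```python
from typing import Any, Callable, Dict

Record = Dict[str, Any]


class UpdateDevice:
    """Units 11 to 15 of FIG. 3, each modeled as a callable (an assumption)."""

    def __init__(
        self,
        format_conversion_unit: Callable[[Record], Record],  # unit 11
        processing_unit: Callable[[Record], Record],         # unit 12
        quality_inspection_unit: Callable[[Record], bool],   # unit 13
        data_conversion_unit: Callable[[Record], Record],    # unit 14
        warehousing_unit: Callable[[Record], None],          # unit 15
    ) -> None:
        self.format_conversion_unit = format_conversion_unit
        self.processing_unit = processing_unit
        self.quality_inspection_unit = quality_inspection_unit
        self.data_conversion_unit = data_conversion_unit
        self.warehousing_unit = warehousing_unit

    def handle(self, record: Record) -> bool:
        """Pass mapped update data through units 11 to 15 in order."""
        formatted = self.format_conversion_unit(record)
        processed = self.processing_unit(formatted)
        if not self.quality_inspection_unit(processed):
            return False  # only data that passes inspection is converted
        self.warehousing_unit(self.data_conversion_unit(processed))
        return True
```

Injecting the units through the constructor lets each one be replaced or tested on its own, which matches the unit-by-unit description of the device.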
Further, as shown in fig. 4, the apparatus further includes:
a mapping unit 16, configured to, when the data to be warehoused is new data and is unmapped data, perform data mapping on the data to be warehoused before the format conversion unit 11 performs the format conversion on it;
a preprocessing unit 17, configured to perform data preprocessing on the mapped data;
a logic inspection unit 18, configured to perform a data logic check on the preprocessed data;
wherein the format conversion unit 11 is configured to perform the format conversion on the data to be warehoused when the data logic check is passed.
Optionally, the processing unit 12 is further configured to perform the data processing again on data that fails the data quality check.
Optionally, the mapping unit 16 is further configured to perform the data mapping again on data that fails the data logic check.
According to the device for continuously updating data provided by this embodiment, for data to be warehoused that is new data and is unmapped data, data mapping and data preprocessing are performed first, followed by a data logic check. Once the data logic check is passed, format conversion is performed, data processing and a data quality check are then performed on the format-converted data, and the data that passes the data quality check is warehoused after data conversion. In this way the accuracy, availability, timeliness and integrity of the data can be guaranteed, data quality is ensured, and the various services of the social co-governance big data platform are supported.
It will be understood by those skilled in the art that all or part of the processes of the embodiments of the methods described above may be implemented by a computer program, which may be stored in a computer-readable storage medium, and when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above description is only for the specific embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (8)

1. A method for continuously updating data, the method comprising:
for data to be warehoused, if the data is update data and is mapped data, performing format conversion on the data to be warehoused;
performing data processing on the format-converted data;
performing a data quality check on the data that has undergone the data processing;
performing data conversion on the data that passes the data quality check;
and warehousing the data that has undergone the data conversion.
2. The method according to claim 1, wherein if the data to be warehoused is new data and is unmapped data, before the format conversion is performed on the data to be warehoused, the method further comprises:
performing data mapping on the data to be warehoused;
performing data preprocessing on the mapped data;
and performing a data logic check on the preprocessed data, and performing the format conversion on the data to be warehoused when the data logic check is passed.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
performing the data processing again on data that fails the data quality check.
4. The method of claim 2, further comprising:
performing the data mapping again on data that fails the data logic check.
5. An apparatus for continuously updating data, the apparatus comprising:
a format conversion unit, configured to perform format conversion on data to be warehoused if the data is update data and is mapped data;
a processing unit, configured to perform data processing on the format-converted data;
a quality inspection unit, configured to perform a data quality check on the data that has undergone the data processing;
a data conversion unit, configured to perform data conversion on the data that passes the data quality check;
and a warehousing unit, configured to warehouse the data that has undergone the data conversion.
6. The apparatus of claim 5, further comprising:
a mapping unit, configured to, when the data to be warehoused is new data and is unmapped data, perform data mapping on the data to be warehoused before the format conversion unit performs the format conversion on it;
a preprocessing unit, configured to perform data preprocessing on the mapped data;
a logic inspection unit, configured to perform a data logic check on the preprocessed data;
wherein the format conversion unit is configured to perform the format conversion on the data to be warehoused when the data logic check is passed.
7. The apparatus according to claim 5 or 6, wherein the processing unit is further configured to perform the data processing again on data that fails the data quality check.
8. The apparatus of claim 6, wherein the mapping unit is further configured to perform the data mapping again on data that fails the data logic check.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910810984.2A CN112445806A (en) 2019-08-29 2019-08-29 Method and device for continuously updating data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910810984.2A CN112445806A (en) 2019-08-29 2019-08-29 Method and device for continuously updating data

Publications (1)

Publication Number Publication Date
CN112445806A true CN112445806A (en) 2021-03-05

Family

ID=74742215

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910810984.2A Pending CN112445806A (en) 2019-08-29 2019-08-29 Method and device for continuously updating data

Country Status (1)

Country Link
CN (1) CN112445806A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477547A (en) * 2009-01-20 2009-07-08 中国测绘科学研究院 Regulation based spatial data integration method
CN109977162A (en) * 2019-04-10 2019-07-05 广东省城乡规划设计研究院 A kind of urban and rural planning data transfer device, system and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210305