CN114443631A - Method for regularly processing non-public big data in retail industry - Google Patents

Info

Publication number
CN114443631A
Authority
CN
China
Prior art keywords
data
applicant
processing
retail
retail industry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111667480.3A
Other languages
Chinese (zh)
Inventor
夏倩
王骏
蒋健
肖卫强
徐建
卢昕博
蔡宁
黄纯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Tobacco Zhejiang Industrial Co Ltd
Original Assignee
China Tobacco Zhejiang Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Tobacco Zhejiang Industrial Co Ltd
Priority to CN202111667480.3A
Publication of CN114443631A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/256 Integrating or interfacing systems involving database management systems in federated or virtual databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a method for the scheduled processing of non-public big data in the retail industry. The method comprises the following steps: the data applicant submits a data acquisition application for retail customer data, obtains the corresponding key information and the data dictionary of the retail customer data, and uses the key to register the machine that will acquire the data; the data download interface returned by the data provider is called to monitor for data releases; once a release is detected, the data is downloaded automatically and transferred to a designated folder inside the enterprise; the downloaded data file is decompressed by an automatic decompression program and imported into the corresponding database according to the type of the decompressed file; and the date-marked data file is verified by an ETL tool against the data dictionary of the retail customer data and then integrated into the enterprise's internal data platform. The beneficial effect of the invention is that compressed data packages are monitored, downloaded, decompressed and imported automatically on a schedule, which markedly improves the effectiveness of the ETL tool in this scenario.

Description

Method for regularly processing non-public big data in retail industry
Technical Field
The invention relates to the technical field of big data processing, and in particular to a method for the scheduled processing of non-public big data in the retail industry.
Background
As data becomes an enterprise asset, some industry entities have begun, or have largely completed, building internal data platforms. Enterprises also increasingly need to integrate data from outside the enterprise, particularly data from within their own industry, with their internal data. Some of this external data is non-public, however, and can be provided only to a specific range of units within the industry. For example, in an industry organized around a group's regional agencies, the daily sales data of retail customers nationwide is first collected each day, then aggregated and processed, and only then opened to each unit in the industry; each unit, after obtaining the industry data, integrates it with its own enterprise data for processing and use.
Because the relevant industry data is sensitive, and for other security reasons, the data provider does not allow the other party to obtain the data by connecting a traditional ETL tool directly to the source; the data applicant gains access to the data interface only after its application has been approved. In addition, because retail customers nationwide are numerous and the daily volume of retail data is large, the interface can deliver the data only as compressed packages. A traditional ETL tool can acquire data only from a database or a specified file and cannot effectively process compressed packages, so it performs poorly in this scenario. At present there is no method for the scheduled processing of non-public big data in the relevant industries.
Disclosure of Invention
The invention provides a method for the scheduled processing of non-public big data in the retail industry, which solves the problems described in the Background.
To solve the above technical problems, the invention adopts the following technical solution: a method for the scheduled processing of non-public big data in the retail industry, comprising the following steps:
step one, the data applicant submits a data acquisition application to the data provider, obtains the corresponding key information and data dictionary once the application is approved, and uses the key information to register the machine that will acquire the data, which the data provider then authenticates;
step two, the data provider sends a data download interface to the data applicant, and the data applicant uses the authenticated machine to call the data download interface on a schedule, monitoring each day whether data has been released;
step three, once the day's data release is detected, the data is automatically downloaded and transferred to the data applicant;
step four, the downloaded data file is decompressed by an automatic decompression program; a conversion program reads the decompressed data file and converts it into an SQL script compatible with DB2, and the SQL script is imported into a pre-built DB2 database and executed to obtain the latest DB2 data;
step five, an ETL tool reads the latest data imported into DB2 and cleans out dirty data and null data by checking against the data dictionary; the field information of the cleaned data is converted, and the converted data is then aggregated into the data applicant's pre-built data platform.
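The decompress-and-convert part of step four can be illustrated with a minimal Python sketch. It rests on assumptions the source does not state: the package is a ZIP archive, each member is a UTF-8 CSV file with a header row, and the target table is named after the file. The names `zip_to_sql` and `sql_literal` are hypothetical, and executing the resulting script against DB2 is left out.

```python
import csv
import io
import zipfile


def sql_literal(value: str) -> str:
    """Render a CSV cell as a SQL string literal; empty cells become NULL."""
    if value == "":
        return "NULL"
    return "'" + value.replace("'", "''") + "'"


def zip_to_sql(archive_bytes: bytes) -> str:
    """Decompress a ZIP of CSV files and build one INSERT per data row.

    Assumptions (not from the source): ZIP container, UTF-8 CSV members
    with a header row, table name taken from the member's file name.
    """
    statements = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".csv"):
                continue  # skip anything that is not a CSV member
            table = name.rsplit("/", 1)[-1][:-4].upper()
            text = archive.read(name).decode("utf-8")
            reader = csv.reader(io.StringIO(text))
            header = next(reader, None)
            if header is None:
                continue  # empty file, nothing to import
            columns = ", ".join(column.upper() for column in header)
            for row in reader:
                values = ", ".join(sql_literal(cell) for cell in row)
                statements.append(
                    f"INSERT INTO {table} ({columns}) VALUES ({values});"
                )
    return "\n".join(statements)
```

Because table and column names are taken from the file itself, a production importer would validate them first and would more likely use parameterized statements than generated literals; the sketch only shows the shape of the conversion.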
Further, in step one, the data applicant encrypts the key with an asymmetric encryption algorithm and stores it on disk.
Further, in step two, the data download interface is called by timer-driven polling.
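Timer-driven polling of this kind might look like the following Python sketch. The provider's real interface is not described in the source, so it is represented here by a hypothetical callable `list_published` that returns the names of the packages currently offered; the polling interval and the poll limit are likewise assumptions.

```python
import time
from typing import Callable, Iterable, List, Set


def poll_once(list_published: Callable[[], Iterable[str]],
              downloaded: Set[str]) -> List[str]:
    """Return published package names that have not been downloaded yet."""
    return [name for name in list_published() if name not in downloaded]


def poll_until_published(list_published: Callable[[], Iterable[str]],
                         downloaded: Set[str],
                         interval_seconds: float,
                         max_polls: int,
                         sleep: Callable[[float], None] = time.sleep
                         ) -> List[str]:
    """Poll on a fixed interval; stop at the first poll that finds new data.

    Returns an empty list if nothing new appears within max_polls polls,
    which the caller can treat as "no data released today".
    """
    for attempt in range(max_polls):
        pending = poll_once(list_published, downloaded)
        if pending:
            return pending
        if attempt < max_polls - 1:
            sleep(interval_seconds)  # wait before the next poll
    return []
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; an empty result is the natural trigger for the "no data all day" alarm.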
Further, in step three, the data is transferred to the data applicant through a password-protected SFTP service.
Further, in step two, the machine authentication of the data applicant is valid only for a specified period, and the data provider sends an alarm message to the corresponding data applicant before it expires.
Further, in step two, when no data release has been detected for the whole day, an alarm message is sent to the data applicant's personnel.
Further, in step three, when a data release is detected but the download reports an error, an alarm message is sent to the data applicant and the download process is retried; in step four, when the decompression process reports an error, an alarm message is sent to the data applicant and the decompression process is retried; in step five, when the import process reports an error, an alarm message is sent to the data applicant and the import process is retried.
Further, in step five, when verification of the downloaded and decompressed data file finds that some field data is abnormal or empty, an alarm message is sent to the data applicant.
Correspondingly, a computer-readable storage medium stores one or more programs, the one or more programs including instructions which, when executed by a computing device, cause the computing device to perform any of the methods described above.
Accordingly, a computing device, comprising:
one or more processors, one or more memories, and one or more programs stored in the one or more memories and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods of claims 1-8.
The invention achieves the following beneficial effects: the corresponding compressed data packages are automatically monitored, downloaded, decompressed and imported on a schedule according to a set period, so that compressed data packages are processed effectively and the effectiveness of the ETL tool in this scenario is markedly improved.
Detailed Description
The invention is further described below with reference to specific embodiments. The following examples serve only to illustrate the technical solutions of the invention more clearly and do not limit its scope of protection.
The invention discloses a method for the scheduled processing of non-public retail big data, comprising the following steps:
(1) The data applicant (each industrial or commercial enterprise) submits a retail customer data acquisition application to the data provider (a monopoly), obtains the corresponding key information and the data dictionary of the retail customer data once the application is approved, and uses the key to register the machine that will acquire the data. The data applicant re-applies for the key periodically (generally once a month), encrypts the key with an asymmetric encryption algorithm, stores it on disk, and backs it up on several machines.
(2) The authenticated machine calls the data download interface returned by the data provider and monitors on a schedule each day whether data has been released. Specifically, the interface is called by timer-driven polling, and each call determines whether the interface returns content and whether that content has already been downloaded.
(3) Once the day's data release is detected, the data is downloaded automatically and then transferred to a designated folder inside the enterprise. Specifically, the data is first downloaded to a file directory on the machine that acquires the data and then uploaded to the enterprise's internal folder through a password-protected SFTP service.
(4) The downloaded data files are decompressed by an automatic decompression program and imported into the corresponding DB2 database according to the type of the decompressed file. Specifically, the decompression routine within the program decompresses the file, a conversion program reads the file content and converts it into an SQL script compatible with DB2, and the script is imported into DB2 and executed.
(5) The date-marked data file is verified by the ETL tool against the data dictionary of the retail customer data and then integrated into the enterprise's internal data platform. Specifically, the ETL tool reads the latest data imported into DB2 and cleans out dirty data and null data according to the retail customer data dictionary; the field information is converted, and the data is then aggregated into the platform.
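The cleaning in step (5) can be sketched in Python as a check of each row against the data dictionary. The source does not describe the dictionary's layout or the ETL tool used, so the structure below (field name mapped to a converter and a required flag) and the function name `clean_rows` are assumptions made purely for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical dictionary layout: field name -> (converter, required flag).
DataDictionary = Dict[str, Tuple[Callable[[str], Any], bool]]


def clean_rows(rows: List[Dict[str, str]],
               dictionary: DataDictionary
               ) -> Tuple[List[Dict[str, Any]], List[Dict[str, str]]]:
    """Split rows into (clean, rejected) by checking them against the dictionary.

    A row is rejected when a required field is missing or empty (null data)
    or when a value cannot be converted to the declared type (dirty data).
    """
    clean, rejected = [], []
    for row in rows:
        converted: Dict[str, Any] = {}
        ok = True
        for field, (convert, required) in dictionary.items():
            raw = (row.get(field) or "").strip()
            if raw == "":
                if required:
                    ok = False  # null data in a required field
                    break
                converted[field] = None
                continue
            try:
                converted[field] = convert(raw)  # field conversion
            except ValueError:
                ok = False  # dirty data: value does not fit the type
                break
        if ok:
            clean.append(converted)
        else:
            rejected.append(row)
    return clean, rejected
```

Returning the rejected rows alongside the clean ones gives the caller what it needs to raise the alarm for abnormal or empty fields.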
Wherein:
Two days before the authentication of the data applicant's automated data-processing machine expires, an alarm message is sent to the corresponding data applicant personnel, who then re-apply for the key.
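The expiry check reduces to a date comparison. In this small Python sketch the two-day lead time comes from the text, while the function name `needs_renewal_alert` and the way the expiry date is obtained are assumptions.

```python
from datetime import date, timedelta


def needs_renewal_alert(expires_on: date, today: date,
                        lead_days: int = 2) -> bool:
    """True once today is within lead_days of the expiry date (or past it)."""
    return today >= expires_on - timedelta(days=lead_days)
```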
When no data release has been detected for the whole day, an alarm message is sent to the corresponding data applicant personnel, who investigate why no retail data was released that day.
When a data release is detected but an error is reported during download, decompression or import, an alarm message is sent to the corresponding data applicant personnel, and the download, decompression and import process is retried.
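The alarm-and-retry behaviour can be sketched in Python as a wrapper around any one step. The source says only that an alarm is sent and the flow is retried; the attempt limit, the message format and the name `run_with_retry` are assumptions.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(step: Callable[[], T],
                   alert: Callable[[str], None],
                   step_name: str,
                   max_attempts: int = 3) -> T:
    """Run a step; on each failure send an alert, then retry.

    Re-raises the last error once max_attempts is exhausted. The attempt
    limit is an assumption; the source only says the flow is retried.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception as error:
            alert(f"{step_name} failed on attempt {attempt}: {error}")
            if attempt == max_attempts:
                raise
    raise AssertionError("unreachable")
```

The same wrapper can guard the download, decompression and import steps in turn, for example `run_with_retry(download_step, send_alarm, "download")`, where `download_step` and `send_alarm` stand for whatever callables the deployment provides.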
When verification of the downloaded and decompressed data file finds that some field data is abnormal or empty, an alarm message is sent to the corresponding data applicant personnel, who identify and resolve the cause of the faulty data content.
If, for reasons on the data provider's side, the day's data is not released to the data applicant and a supplementary release covering more than one day is issued later, the data applicant automatically downloads and decompresses the compressed data file containing multiple days, parses the retail data for each day, and checks whether the dates for which no data had previously been released are included. If data for some day is still missing, an alarm message is sent to the corresponding data applicant personnel; the date-marked data files are integrated into the enterprise's internal data platform after verification by the ETL tool.
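The check for days that are still missing after a supplementary release can be sketched in Python as a comparison between the expected date range and the dates actually found in the package. How dates are extracted from the files is not specified in the source, so the sketch takes them as already parsed; `missing_dates` is a hypothetical name.

```python
from datetime import date, timedelta
from typing import Iterable, List, Set


def missing_dates(expected_start: date, expected_end: date,
                  received: Iterable[date]) -> List[date]:
    """List every day in [expected_start, expected_end] with no data."""
    have: Set[date] = set(received)
    days = (expected_end - expected_start).days
    return [expected_start + timedelta(days=offset)
            for offset in range(days + 1)
            if expected_start + timedelta(days=offset) not in have]
```

A non-empty result corresponds to the case in which an alarm message is sent to the data applicant personnel.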
With this method, the data applicant obtains access to the data interface once its application is approved, and the key mechanism authorizes the machine that receives the data. The authorized machine automatically monitors, downloads, decompresses and imports the corresponding compressed data packages on a schedule according to a set period, so that a traditional ETL tool can obtain the data directly from the authorized machine's database or specified files; in other words, the compressed data packages are processed effectively.
The above is only a preferred embodiment of the present invention. It should be noted that those skilled in the art can make modifications and variations without departing from the technical principle of the present invention, and such modifications and variations should also be regarded as falling within the protection scope of the present invention.
A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a computing device, cause the computing device to perform the method for the scheduled processing of non-public retail big data.
A computing device comprising one or more processors, one or more memories, and one or more programs stored in the one or more memories and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method for the scheduled processing of non-public retail big data.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The present invention is not limited to the above embodiments; any modifications, equivalent replacements, improvements and the like made within the spirit and principle of the present invention fall within the scope of the claims of the present application as filed.

Claims (10)

1. A method for the scheduled processing of non-public big data in the retail industry, characterized by comprising the following steps:
step one, the data applicant submits a data acquisition application to the data provider, obtains the corresponding key information and data dictionary once the application is approved, and uses the key information to register the machine that will acquire the data, which the data provider then authenticates;
step two, the data provider sends a data download interface to the data applicant, and the data applicant uses the authenticated machine to call the data download interface on a schedule, monitoring each day whether data has been released;
step three, once the day's data release is detected, the data is automatically downloaded and transferred to the data applicant;
step four, the downloaded data file is decompressed by an automatic decompression program; a conversion program reads the decompressed data file and converts it into an SQL script compatible with DB2, and the SQL script is imported into a pre-built DB2 database and executed to obtain the latest DB2 data;
step five, an ETL tool reads the latest data imported into DB2 and cleans out dirty data and null data by checking against the data dictionary; the field information of the cleaned data is converted, and the converted data is then aggregated into the data applicant's pre-built data platform.
2. The method for the scheduled processing of non-public big data in the retail industry according to claim 1, wherein in step one, the data applicant encrypts the key with an asymmetric encryption algorithm and stores it on disk.
3. The method for the scheduled processing of non-public big data in the retail industry according to claim 1, wherein in step two, the data download interface is called by timer-driven polling.
4. The method for the scheduled processing of non-public big data in the retail industry according to claim 1, wherein in step three, the data is transferred to the data applicant through a password-protected SFTP service.
5. The method for the scheduled processing of non-public big data in the retail industry according to claim 1, wherein in step two, the machine authentication of the data applicant is valid only for a predetermined period, and the data provider sends an alarm message to the corresponding data applicant before it expires.
6. The method for the scheduled processing of non-public big data in the retail industry according to claim 1, wherein in step two, when no data release has been detected for the whole day, an alarm message is sent to the data applicant's personnel.
7. The method for the scheduled processing of non-public big data in the retail industry according to claim 1, wherein in step three, when a data release is detected but the download reports an error, an alarm message is sent to the data applicant and the download process is retried; in step four, when the decompression process reports an error, an alarm message is sent to the data applicant and the decompression process is retried; and in step five, when the import process reports an error, an alarm message is sent to the data applicant and the import process is retried.
8. The method for the scheduled processing of non-public big data in the retail industry according to claim 1, wherein in step five, when verification of the downloaded and decompressed data file finds that some field data is abnormal or empty, an alarm message is sent to the data applicant.
9. A computer readable storage medium storing one or more programs, characterized in that: the one or more programs include instructions that, when executed by a computing device, cause the computing device to perform any of the methods of claims 1-8.
10. A computing device, comprising:
one or more processors, one or more memories, and one or more programs stored in the one or more memories and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods of claims 1-8.
CN202111667480.3A 2021-12-31 2021-12-31 Method for regularly processing non-public big data in retail industry Pending CN114443631A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111667480.3A CN114443631A (en) 2021-12-31 2021-12-31 Method for regularly processing non-public big data in retail industry

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111667480.3A CN114443631A (en) 2021-12-31 2021-12-31 Method for regularly processing non-public big data in retail industry

Publications (1)

Publication Number Publication Date
CN114443631A true CN114443631A (en) 2022-05-06

Family

ID=81365735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111667480.3A Pending CN114443631A (en) 2021-12-31 2021-12-31 Method for regularly processing non-public big data in retail industry

Country Status (1)

Country Link
CN (1) CN114443631A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110218872A1 (en) * 2010-03-02 2011-09-08 Shopkeep Llc System and Method for Remote Management of Sale Transaction Data
CN111221889A (en) * 2018-11-26 2020-06-02 上海阿米特数据系统有限公司 CASS retail data integration service platform
US20200334686A1 (en) * 2019-04-22 2020-10-22 Target Brands, Inc. System for third party sellers in online retail environment
US20210004875A1 (en) * 2018-06-04 2021-01-07 ThumbStopper Brand Online Content Dissemination to Retailer Social Media Outlets
CN113836210A (en) * 2021-09-15 2021-12-24 浙江中烟工业有限责任公司 Method for regularly processing non-public retail big data in tobacco industry

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
薄璐;: "基于大数据的零售业数据集市系统设计研究", 技术与市场, no. 06, 15 June 2018 (2018-06-15) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination