CN114443631A - Method for regularly processing non-public big data in retail industry - Google Patents
- Publication number
- CN114443631A
- Authority
- CN
- China
- Prior art keywords
- data
- applicant
- processing
- retail
- retail industry
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/256—Integrating or interfacing systems involving database management systems in federated or virtual databases
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Storage Device Security (AREA)
Abstract
The invention discloses a method for regularly processing non-public big data in the retail industry. The method comprises the following steps: a data applicant submits a data acquisition application for retail customer data, obtains the corresponding key information and the data dictionary of the retail customer data, and uses the key to register the machine that needs to acquire the data; the data download interface returned by the data provider is called to monitor for data releases; once a release is detected, the data is automatically downloaded and transferred to a designated folder inside the enterprise; the downloaded data file is decompressed by an automatic decompression program and imported into the corresponding database according to the type of the decompressed file; and the date-marked data file is verified by an ETL tool against the data dictionary of the retail customer data and then integrated into the enterprise's internal data platform. The beneficial effect of the invention is that compressed data packets are automatically monitored, downloaded, decompressed and imported on a schedule, which markedly improves the effectiveness of the ETL tool in this scenario.
Description
Technical Field
The invention relates to the technical field of big data processing, and in particular to a method for regularly processing non-public big data in the retail industry.
Background
As data becomes an enterprise asset, some entities in the industry have begun, or have largely completed, building internal data platforms. Enterprises increasingly need to integrate data from outside the enterprise, particularly data from within their own industry, with their internal data. However, some of this external data is non-public and can only be provided to units within a specific scope of the industry. For example, in an industry organised around the regional agencies of a group, the daily sales data of retail customers nationwide can only be collected each day, aggregated and processed, and then released to the units in the industry; each unit, after obtaining the industry data, integrates it with its own enterprise data for processing and use.
Because the relevant industry data is sensitive, and for other security reasons, the data provider does not allow the other party to obtain data by connecting a traditional ETL tool directly to its systems; the data applicant is granted access to the data interface only after its application has been approved. In addition, because retail customers nationwide are numerous and the daily volume of retail data is large, the interface service can only deliver data externally as compressed packages. A traditional ETL tool can only acquire data from a database or a specified file and cannot effectively process compressed packages, so it performs poorly in this scenario. At present there is no method for regularly processing non-public big data in the relevant industries.
Disclosure of Invention
The invention provides a method for regularly processing non-public big data in the retail industry, which solves the problems described in the Background.
To solve the above technical problems, the invention adopts the following technical solution: a method for regularly processing non-public big data in the retail industry, comprising the following steps:
step one, a data applicant submits a data acquisition application to a data provider; after the application is approved, the data applicant obtains the corresponding key information and a data dictionary and uses the key information to register the machine that needs to acquire the data, and the data provider authenticates the machine;
step two, the data provider sends a data download interface to the data applicant, and the data applicant uses the authenticated machine to call the data download interface and check on a schedule, every day, whether data has been released;
step three, once the data release for the current day is detected, the data is automatically downloaded and transferred to the data applicant;
step four, the downloaded data file is decompressed by an automatic decompression program; a conversion program reads the decompressed data file and converts it into an SQL script compatible with DB2, and the SQL script is imported into a pre-built DB2 database and executed to obtain the latest DB2 data;
step five, an ETL tool reads the latest data imported into DB2 and cleans dirty data and null data by comparison with the data dictionary; the field information of the cleaned data is converted, and the converted data is then aggregated into the pre-built data platform of the data applicant.
Further, in step one, the data applicant encrypts the key with an asymmetric encryption algorithm and stores it on disk.
Further, in step two, the data download interface is polled on a timer.
Further, in step three, the data is transferred to the data applicant through an SFTP service configured with a password.
Further, in step two, the machine authentication of the data applicant is valid only for a specified period, and the data provider sends an alarm message to the corresponding data applicant before it expires.
Further, in step two, when no data release has been detected for the whole day, an alarm message is sent to the data applicant's personnel.
Further, in step three, when a data release is detected but the download reports an error, an alarm message is sent to the data applicant and the download process is retried; when the decompression process in step four reports an error, an alarm message is sent to the data applicant and the decompression process is retried; and when the import process in step five reports an error, an alarm message is sent to the data applicant and the import process is retried.
Further, in step five, when some field data is found to be abnormal or empty during verification of the downloaded and decompressed data file, an alarm message is sent to the data applicant.
Correspondingly, a computer-readable storage medium stores one or more programs, the one or more programs including instructions which, when executed by a computing device, cause the computing device to perform any of the methods described above.
Correspondingly, a computing device comprises:
one or more processors, one or more memories, and one or more programs stored in the one or more memories and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods of claims 1-8.
The invention achieves the following beneficial effects: the corresponding compressed data packets are automatically monitored, downloaded, decompressed and imported on a set schedule, so that compressed data packets are processed effectively and the effectiveness of the ETL tool in this scenario is markedly improved.
Detailed Description
The invention is further described below. The following embodiments serve only to illustrate the technical solutions of the invention more clearly and do not limit its scope of protection.
The invention discloses a method for regularly processing non-public retail big data, comprising the following steps:
(1) A data applicant (each industrial or commercial enterprise) submits a retail customer data acquisition application to a data provider (a monopoly); after the application is approved, it obtains the corresponding key information and the data dictionary of the retail customer data, and uses the key to register the machine that needs to acquire the data. The data applicant re-applies for the key at a regular interval (generally one month), encrypts the key with an asymmetric encryption algorithm, stores it on disk, and backs it up on several machines.
(2) The authenticated machine calls the data download interface returned by the data provider and checks on a schedule, every day, whether data has been released. Specifically, the interface is polled on a timer, and each poll determines whether the interface contains content and whether that content has already been downloaded.
(3) Once the data release for the current day is detected, the data is downloaded automatically and then transferred to a designated folder inside the enterprise. Specifically, the data is first downloaded to a file directory on the machine that acquires the data and then uploaded to the enterprise's internal folder through an SFTP service configured with a password.
(4) The downloaded data file is decompressed by an automatic decompression program and imported into the corresponding DB2 database according to the type of the decompressed data file. Specifically, the file is decompressed by the decompression routine in the program; a conversion program reads the file content and converts it into an SQL script compatible with DB2, and the script is imported into DB2 and executed.
(5) The date-marked data file is verified by the ETL tool against the data dictionary of the retail customer data and then integrated into the enterprise's internal data platform. Specifically, the ETL tool reads the latest data imported into DB2 and cleans dirty data and null data according to the retail customer data dictionary; the field information is converted, and the data is then aggregated into the platform.
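Step (4) can be illustrated with a minimal sketch. The source does not specify the archive format, the file layout or the table schema, so the sketch assumes a ZIP archive containing UTF-8 CSV files with a header row; the names `archive_to_sql`, `sql_literal`, `make_demo_archive` and the table `RETAIL_SALES` are hypothetical. It only generates the SQL text and does not connect to DB2.

```python
import csv
import io
import zipfile
from typing import List, Optional


def sql_literal(value: Optional[str]) -> str:
    """Render a CSV cell as an SQL string literal; empty or missing cells become NULL."""
    if value is None or value == "":
        return "NULL"
    return "'" + value.replace("'", "''") + "'"


def archive_to_sql(archive_bytes: bytes, table: str) -> List[str]:
    """Decompress every CSV member of a ZIP archive and emit one INSERT per row."""
    statements = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # assumption: only CSV members are converted
            text = archive.read(name).decode("utf-8")
            for row in csv.DictReader(io.StringIO(text)):
                columns = ", ".join(row.keys())
                values = ", ".join(sql_literal(v) for v in row.values())
                statements.append(
                    f"INSERT INTO {table} ({columns}) VALUES ({values});"
                )
    return statements


def make_demo_archive() -> bytes:
    """Build a small in-memory archive so the sketch can run without real data."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(
            "sales_2021-12-31.csv",
            "customer_id,sale_date,amount\n"
            "C001,2021-12-31,12.50\n"
            "C002,2021-12-31,\n",
        )
    return buffer.getvalue()


if __name__ == "__main__":
    for statement in archive_to_sql(make_demo_archive(), "RETAIL_SALES"):
        print(statement)
```

Table and column names are interpolated into the script unescaped here, which is acceptable only because the sketch controls them; a real conversion program would more likely validate identifiers against the data dictionary first.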
Wherein:
Two days before the machine authentication used by the data applicant for automatic data processing expires, an alarm message is sent to the corresponding personnel of the data applicant, who then re-apply for the key.
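The expiry check can be sketched as a simple date comparison. The function name `needs_renewal_alarm` is hypothetical, and the sketch assumes the alarm should keep firing from two days before expiry onward (including after expiry); the source only states that the alarm is sent two days before.

```python
from datetime import date, timedelta

# Two-day warning window, taken from the description above.
WARNING_WINDOW = timedelta(days=2)


def needs_renewal_alarm(today: date, expires_on: date) -> bool:
    """Return True once the authentication is within the warning window."""
    return today >= expires_on - WARNING_WINDOW


if __name__ == "__main__":
    expiry = date(2022, 1, 31)
    print(needs_renewal_alarm(date(2022, 1, 28), expiry))
    print(needs_renewal_alarm(date(2022, 1, 29), expiry))
```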
When no data release has been detected for the whole day, an alarm message is sent to the corresponding personnel of the data applicant, who verify why no retail data was released that day.
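The timer-based polling of step (2), together with the all-day alarm, might look like the following sketch. The provider's interface is not described in the source, so it is represented by injected callables; `poll_until_published`, `check_release`, `already_downloaded`, `wait` and `alert` are all hypothetical names, and `max_polls` stands in for "the whole day".

```python
from typing import Callable, Optional


def poll_until_published(
    check_release: Callable[[], Optional[str]],
    already_downloaded: Callable[[str], bool],
    max_polls: int,
    wait: Callable[[], None],
    alert: Callable[[str], None],
) -> Optional[str]:
    """Poll the download interface; return a new release id, or None after alerting."""
    for _ in range(max_polls):
        release_id = check_release()
        # A release counts only if the interface has content that is not yet downloaded.
        if release_id is not None and not already_downloaded(release_id):
            return release_id
        wait()  # e.g. a fixed sleep between polls
    alert("no data release detected today")
    return None


if __name__ == "__main__":
    answers = iter([None, "2021-12-31"])
    found = poll_until_published(
        check_release=lambda: next(answers),
        already_downloaded=lambda release_id: False,
        max_polls=5,
        wait=lambda: None,  # a real deployment would sleep here
        alert=print,
    )
    print(found)
```

Passing the wait and alert behaviour in as callables keeps the loop testable without real timers or a messaging service.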
When a data release is detected but an error is reported during download, decompression or import, an alarm message is sent to the corresponding personnel of the data applicant, and the download, decompression and import process is retried.
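The alarm-and-retry behaviour can be sketched as a wrapper applied to each stage. The source does not state how many retries are made or how alarms are delivered, so the attempt limit of three and the names `run_with_retry`, `stage`, `action` and `alert` are assumptions made for illustration.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retry(
    stage: str,
    action: Callable[[], T],
    alert: Callable[[str], None],
    max_attempts: int = 3,  # assumption: the source gives no retry limit
) -> T:
    """Run one stage, sending an alarm on every failure and retrying up to a limit."""
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as error:
            last_error = error
            alert(f"{stage} failed on attempt {attempt}: {error}")
    raise RuntimeError(f"{stage} failed after {max_attempts} attempts") from last_error


if __name__ == "__main__":
    outcomes = iter([IOError("timeout"), "archive.zip"])

    def flaky_download() -> str:
        outcome = next(outcomes)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    print(run_with_retry("download", flaky_download, print))
```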
When some field data is found to be abnormal or empty during verification of the downloaded and decompressed data file, an alarm message is sent to the corresponding personnel of the data applicant, who investigate why part of the data content is faulty.
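Verification against the data dictionary can be sketched as follows. The actual dictionary format is not given in the source, so it is modelled here as a mapping from field name to a parser; the fields `customer_id`, `sale_date` and `amount`, and the function `validate_rows`, are hypothetical. Rows with empty or unparseable fields are reported rather than loaded.

```python
from datetime import date
from typing import Callable, Dict, List, Tuple

# Hypothetical data dictionary: field name -> parser that raises ValueError on bad input.
DATA_DICTIONARY: Dict[str, Callable[[str], object]] = {
    "customer_id": str,
    "sale_date": date.fromisoformat,
    "amount": float,
}


def validate_rows(
    rows: List[Dict[str, str]],
    dictionary: Dict[str, Callable[[str], object]],
) -> Tuple[List[Dict[str, object]], List[str]]:
    """Split rows into converted clean rows and human-readable problem reports."""
    clean: List[Dict[str, object]] = []
    problems: List[str] = []
    for index, row in enumerate(rows):
        parsed: Dict[str, object] = {}
        errors: List[str] = []
        for field, parse in dictionary.items():
            raw = (row.get(field) or "").strip()
            if raw == "":
                errors.append(f"{field} is empty")
                continue
            try:
                parsed[field] = parse(raw)
            except ValueError:
                errors.append(f"{field} has abnormal value {raw!r}")
        if errors:
            problems.append(f"row {index}: " + "; ".join(errors))
        else:
            clean.append(parsed)
    return clean, problems


if __name__ == "__main__":
    sample = [
        {"customer_id": "C001", "sale_date": "2021-12-31", "amount": "12.5"},
        {"customer_id": "C002", "sale_date": "2021-12-31", "amount": ""},
    ]
    good, bad = validate_rows(sample, DATA_DICTIONARY)
    print(good)
    print(bad)
```

The problem reports are the natural payload for the alarm message, since they tell the personnel which rows and fields to investigate.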
If, for reasons on the data provider's side, data is not released to the data applicant on a given day and a supplementary release covering more than one day is issued later, the data applicant automatically downloads and decompresses the compressed data file containing multiple days, parses the retail data for each day, and checks whether the dates on which no data was previously released are included. If data for a certain day is still missing, an alarm message is sent to the corresponding personnel of the data applicant; the date-marked data files are integrated into the enterprise's internal data platform after verification by the ETL tool.
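The backfill check can be sketched with set arithmetic over dates. The sketch assumes each record carries an ISO-formatted `sale_date` field, which the source does not specify; `split_by_day` and `reconcile` are hypothetical names.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Set, Tuple


def split_by_day(records: List[dict]) -> Dict[date, List[dict]]:
    """Group the records of a multi-day release by their sale date."""
    by_day: Dict[date, List[dict]] = defaultdict(list)
    for record in records:
        by_day[date.fromisoformat(record["sale_date"])].append(record)
    return dict(by_day)


def reconcile(missing: Set[date], delivered: Set[date]) -> Tuple[Set[date], Set[date]]:
    """Return (dates now recovered, dates still missing and needing an alarm)."""
    return missing & delivered, missing - delivered


if __name__ == "__main__":
    release = [
        {"sale_date": "2021-12-29", "amount": "1"},
        {"sale_date": "2021-12-31", "amount": "2"},
    ]
    recovered, still_missing = reconcile(
        {date(2021, 12, 29), date(2021, 12, 30)}, set(split_by_day(release))
    )
    print(sorted(recovered), sorted(still_missing))
```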
Through the design of this method, the data applicant obtains access to the data interface only after its application is approved; machine authorization for receiving data is implemented through the key design; the authorized machine automatically monitors, downloads, decompresses and imports the corresponding compressed data packets on a set schedule; and the traditional ETL tool can then obtain data directly from the database or from specified files on the authorized machine, so that compressed data packets are processed effectively.
The above is only a preferred embodiment of the invention. It should be noted that those skilled in the art may make various modifications and variations without departing from the technical principle of the invention, and such modifications and variations shall also fall within the scope of protection of the invention.
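The overall ordering of the stages can be summarised in a short sketch. Each stage is passed in as a callable because the source does not describe concrete interfaces; `run_daily_cycle` and its parameter names are hypothetical. The point illustrated is only that downstream stages run in a fixed order and only after a release has been detected.

```python
from typing import Callable, List, Optional


def run_daily_cycle(
    detect: Callable[[], Optional[str]],
    download: Callable[[str], bytes],
    decompress: Callable[[bytes], List[str]],
    import_to_db: Callable[[List[str]], None],
    clean_and_load: Callable[[], None],
) -> bool:
    """Run one processing cycle; return False when no release was detected."""
    release = detect()
    if release is None:
        return False  # nothing published yet, so later stages are skipped
    archive = download(release)
    files = decompress(archive)
    import_to_db(files)   # load converted data into the staging database
    clean_and_load()      # ETL verification and integration into the platform
    return True


if __name__ == "__main__":
    ran = run_daily_cycle(
        detect=lambda: "2021-12-31",
        download=lambda release: b"",
        decompress=lambda archive: ["sales.csv"],
        import_to_db=lambda files: None,
        clean_and_load=lambda: None,
    )
    print(ran)
```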
A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a computing device, cause the computing device to perform the method for regularly processing non-public retail big data.
A computing device comprising one or more processors, one or more memories, and one or more programs stored in the one or more memories and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing the method for regularly processing non-public retail big data.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The present invention is not limited to the above embodiments; any modifications, equivalent replacements and improvements made within the spirit and principle of the present invention fall within the scope of the claims of the pending application.
Claims (10)
1. A method for regularly processing non-public big data in the retail industry, characterized by comprising the following steps:
step one, a data applicant submits a data acquisition application to a data provider; after the application is approved, the data applicant obtains the corresponding key information and a data dictionary and uses the key information to register the machine that needs to acquire the data, and the data provider authenticates the machine;
step two, the data provider sends a data download interface to the data applicant, and the data applicant uses the authenticated machine to call the data download interface and check on a schedule, every day, whether data has been released;
step three, once the data release for the current day is detected, the data is automatically downloaded and transferred to the data applicant;
step four, the downloaded data file is decompressed by an automatic decompression program; a conversion program reads the decompressed data file and converts it into an SQL script compatible with DB2, and the SQL script is imported into a pre-built DB2 database and executed to obtain the latest DB2 data;
step five, an ETL tool reads the latest data imported into DB2 and cleans dirty data and null data by comparison with the data dictionary; the field information of the cleaned data is converted, and the converted data is then aggregated into the pre-built data platform of the data applicant.
2. The method for regularly processing non-public big data in the retail industry according to claim 1, wherein in step one, the data applicant encrypts the key with an asymmetric encryption algorithm and stores it on disk.
3. The method for regularly processing non-public big data in the retail industry according to claim 1, wherein in step two, the data download interface is polled on a timer.
4. The method for regularly processing non-public big data in the retail industry according to claim 1, wherein in step three, the data is transferred to the data applicant through an SFTP service configured with a password.
5. The method for regularly processing non-public big data in the retail industry according to claim 1, wherein in step two, the machine authentication of the data applicant is valid only for a specified period, and the data provider sends an alarm message to the corresponding data applicant before it expires.
6. The method for regularly processing non-public big data in the retail industry according to claim 1, wherein in step two, when no data release has been detected for the whole day, an alarm message is sent to the data applicant's personnel.
7. The method for regularly processing non-public big data in the retail industry according to claim 1, wherein in step three, when a data release is detected but the download reports an error, an alarm message is sent to the data applicant and the download process is retried; when the decompression process in step four reports an error, an alarm message is sent to the data applicant and the decompression process is retried; and when the import process in step five reports an error, an alarm message is sent to the data applicant and the import process is retried.
8. The method for regularly processing non-public big data in the retail industry according to claim 1, wherein in step five, when some field data is found to be abnormal or empty during verification of the downloaded and decompressed data file, an alarm message is sent to the data applicant.
9. A computer readable storage medium storing one or more programs, characterized in that: the one or more programs include instructions that, when executed by a computing device, cause the computing device to perform any of the methods of claims 1-8.
10. A computing device, comprising:
one or more processors, one or more memories, and one or more programs stored in the one or more memories and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111667480.3A CN114443631A (en) | 2021-12-31 | 2021-12-31 | Method for regularly processing non-public big data in retail industry |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111667480.3A CN114443631A (en) | 2021-12-31 | 2021-12-31 | Method for regularly processing non-public big data in retail industry |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114443631A true CN114443631A (en) | 2022-05-06 |
Family
ID=81365735
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111667480.3A Pending CN114443631A (en) | 2021-12-31 | 2021-12-31 | Method for regularly processing non-public big data in retail industry |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114443631A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110218872A1 (en) * | 2010-03-02 | 2011-09-08 | Shopkeep Llc | System and Method for Remote Management of Sale Transaction Data |
CN111221889A (en) * | 2018-11-26 | 2020-06-02 | 上海阿米特数据系统有限公司 | CASS retail data integration service platform |
US20200334686A1 (en) * | 2019-04-22 | 2020-10-22 | Target Brands, Inc. | System for third party sellers in online retail environment |
US20210004875A1 (en) * | 2018-06-04 | 2021-01-07 | ThumbStopper | Brand Online Content Dissemination to Retailer Social Media Outlets |
CN113836210A (en) * | 2021-09-15 | 2021-12-24 | 浙江中烟工业有限责任公司 | Method for regularly processing non-public retail big data in tobacco industry |
Non-Patent Citations (1)
Title |
---|
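Bo Lu (薄璐): "Research on the Design of a Big-Data-Based Retail Data Mart System" (基于大数据的零售业数据集市系统设计研究), Technology and Market (技术与市场), no. 06, 15 June 2018 (2018-06-15) * |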
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110543464B (en) | Big data platform applied to intelligent park and operation method | |
WO2020259629A1 (en) | Block chain-based data inspection method and apparatus | |
CN111934879B (en) | Encryption method, device, equipment and medium for data transmission of internal and external network system | |
KR102179152B1 (en) | Client authentication using social relationship data | |
CN109213790B (en) | Block chain-based data circulation analysis method and system | |
WO2018089843A1 (en) | Secured auditing system based on verified hash algorithm | |
CN109657492B (en) | Database management method, medium, and electronic device | |
CN110932859B (en) | User information processing method, device and equipment and readable storage medium | |
CN105491058A (en) | API access distributed authorization method and system | |
CN112464212A (en) | Data authority control reconstruction method based on mature complex service system | |
CN111164630A (en) | System and method for valuing digital assets | |
CN107918564B (en) | Data transmission exception handling method and device, electronic equipment and storage medium | |
CN113158233A (en) | Data preprocessing method and device and computer storage medium | |
CN111897877B (en) | High-performance high-reliability data sharing system and method based on distributed ideas | |
CN107423583B (en) | A kind of software protecting device remapping method and device | |
CN109254893B (en) | Service data auditing method, device, server and storage medium | |
CN114095228A (en) | Safe access method, system and device for data of Internet of things based on block chain and edge calculation and storage medium | |
CN112818016A (en) | API-based real-time and off-line data query method and system | |
CN114443631A (en) | Method for regularly processing non-public big data in retail industry | |
CN113836210A (en) | Method for regularly processing non-public retail big data in tobacco industry | |
CN112702354A (en) | Data resource sharing tracing method and device based on block chain technology | |
CN116800535A (en) | Method and device for avoiding secret between multiple servers | |
US11294926B1 (en) | Master extract, transform, and load (ETL) application for accommodating multiple data source types having disparate data formats | |
CN115795509A (en) | Weak password event processing method and device, processor and electronic equipment | |
CN115564429A (en) | System for prepaid transaction monitoring and related methods and blockchains |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||