WO2022048201A1 - Data processing method and apparatus, electronic device, and storage medium - Google Patents

Data processing method and apparatus, electronic device, and storage medium

Info

Publication number
WO2022048201A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
event
task
level category
fragmentation
Prior art date
Application number
PCT/CN2021/096363
Other languages
English (en)
French (fr)
Inventor
解高纯
Original Assignee
北京沃东天骏信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京沃东天骏信息技术有限公司, 北京京东世纪贸易有限公司
Priority to EP21863271.9A (EP4209933A4)
Priority to US18/043,912 (US20230342369A1)
Publication of WO2022048201A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions

Definitions

  • the present disclosure relates to the field of computer technology, and in particular, to a data processing method, a data processing apparatus, an electronic device, and a computer-readable storage medium.
  • the data structure is relatively complex and the data volume is large, so the information system easily takes a long time to compute statistics; the data processing cycle is long, resulting in low data processing efficiency of the information system.
  • the purpose of the embodiments of the present disclosure is to provide a data processing method, a data processing apparatus, an electronic device, and a computer-readable storage medium, so as to overcome, at least to a certain extent, the problem of low data processing efficiency of the information system corresponding to the intelligent offline item transaction system in the related scheme.
  • a data processing method, comprising: acquiring data to be counted, and performing fragmentation processing on the data to be counted to generate task fragment data;
  • distributing the task fragment data to an event listener through a preset event broadcaster; and, based on the event listener, querying and assembling data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
  • a data processing device, comprising: a task fragmentation module for acquiring data to be counted, and performing fragmentation processing on the data to be counted to generate task fragment data; a data distribution module for distributing the task fragment data to the event listener through the preset event broadcaster; and a data assembly module for querying and assembling data according to the task fragment data, based on the event listener, to generate the multi-level category data corresponding to the data to be counted.
  • an electronic device, comprising: a processor; and a memory on which computer-readable instructions are stored, wherein the computer-readable instructions, when executed by the processor, cause the electronic device to perform the following steps: acquiring data to be counted, and performing fragmentation processing on the data to be counted to generate task fragment data; distributing the task fragment data to event listeners through a preset event broadcaster; and, based on the event listener, querying and assembling data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
  • a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, causes the processor to perform the following steps: acquiring data to be counted, and performing fragmentation processing on the data to be counted to generate task fragment data; distributing the task fragment data to an event listener through a preset event broadcaster; and, based on the event listener, querying and assembling data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
  • FIG. 1 schematically shows a schematic structural diagram of tree-structured data according to some embodiments of the present disclosure
  • FIG. 2 schematically shows a schematic flowchart of a data processing method according to some embodiments of the present disclosure
  • FIG. 3 schematically shows a schematic structural diagram of a data distribution mechanism according to some embodiments of the present disclosure
  • FIG. 4 schematically shows a schematic flowchart of data processing for statistical data to be processed according to some embodiments of the present disclosure
  • FIG. 5 schematically shows a schematic diagram of a data processing apparatus according to some embodiments of the present disclosure
  • FIG. 6 schematically shows a schematic structural diagram of a computer system of an electronic device according to some embodiments of the present disclosure
  • FIG. 7 schematically illustrates a schematic diagram of a computer-readable storage medium according to some embodiments of the present disclosure.
  • Example embodiments will now be described more fully with reference to the accompanying drawings.
  • Example embodiments can be embodied in various forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art.
  • the automatic order placing data is tree-structured data.
  • the tree-structured data includes first-level category data 101 to first-level category data 102 .
  • the first-level category data 101 includes second-level category data 103 to second-level category data 104.
  • Of course, the first-level category data 102 also includes second-level category data of the same kind, which is not shown in the figure.
  • The second-level category data 103 includes third-level category data 105 to third-level category data 106.
  • Of course, the second-level category data 104 also includes third-level category data of the same kind, which is not shown in the figure. The third-level category data 105 includes fourth-level category data 107 to fourth-level category data 108; of course, the third-level category data 106 also includes fourth-level category data of the same kind, which is not shown in the figure.
  • The fourth-level category data in FIG. 1 may also include fifth-level category data, which may be further subdivided into multiple pieces of sub-category data according to the actual data; this is not specifically shown in FIG. 1. It can be clearly seen from FIG. 1 that the tree-structured automatic ordering data not only has a complex data structure, but also a large amount of data.
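  • To make the scale of such a tree concrete, the following is a minimal Python sketch, not taken from the disclosure. The class `CategoryNode`, the helper names and the fan-out of two children per node are all illustrative assumptions; the sketch only counts how many nodes of a four-level tree would each need their own statistics.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CategoryNode:
    """One node of the category tree; level 1 is a first-level category."""
    name: str
    level: int
    children: list[CategoryNode] = field(default_factory=list)


def build_tree(levels: int, fanout: int, level: int = 1, prefix: str = "c") -> list[CategoryNode]:
    """Build `fanout` children under every parent, down to `levels` levels (illustrative only)."""
    nodes = []
    for i in range(1, fanout + 1):
        name = f"{prefix}.{i}"
        children = build_tree(levels, fanout, level + 1, name) if level < levels else []
        nodes.append(CategoryNode(name, level, children))
    return nodes


def count_nodes(nodes: list[CategoryNode]) -> int:
    """Count every node, since each one needs its own statistics."""
    return sum(1 + count_nodes(node.children) for node in nodes)


roots = build_tree(levels=4, fanout=2)
total = count_nodes(roots)  # 2 + 4 + 8 + 16 nodes
```

  • Even with only two children per node the tree already has 30 nodes; real category trees are far wider, which is why per-node statistics become expensive.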
  • When the user clicks on the page, the backend initiates a MySQL database query operation for each level, and then performs a final summary of the statistics based on the data of each level.
  • The system queries the database for statistics. If a query for the fourth-level category data is triggered, the query is relatively simple: all the data from the fourth-level category data 107 to the fourth-level category data 108 under the corresponding third-level category data 105 is queried.
  • For a higher-level category, the corresponding second-level category data needs to be aggregated and calculated, each second-level category needs to aggregate its corresponding third-level category data,
  • and each piece of third-level category data needs to summarize all of its corresponding fourth-level category data, calculated level by level. In this case a large amount of data has to be counted, which makes the statistical calculation very time-consuming.
  • With scheduled statistics over the current tree data, computed in a single thread and layer by layer, every node requires statistics. Assuming that there are four layers, the scheduled task has to calculate layer by layer, and the final statistics are then saved to the MySQL database.
  • a data processing method is first provided.
  • the data processing method can be applied to terminal devices, such as mobile phones, computers and other electronic devices, and can also be applied to servers.
  • FIG. 2 schematically shows a schematic diagram of the flow of a data processing method according to some embodiments of the present disclosure.
  • the data processing method may include the following steps:
  • Step S210: obtaining data to be counted, and performing fragmentation processing on the data to be counted to generate task fragmentation data;
  • Step S220: distributing the task fragmentation data to the event listener through a preset event broadcaster;
  • Step S230: based on the event listener, querying and assembling data according to the task fragmentation data to generate multi-level category data corresponding to the data to be counted.
  • On the one hand, the data to be counted is first fragmented, which reduces the complexity of the data and improves the data processing efficiency of the information system; on the other hand, the task fragmentation data is distributed to the event listener through the event broadcaster, and the event listener queries and assembles the data, which realizes asynchronous processing of the data to be counted, effectively shortens the data calculation cycle, and further improves the data processing efficiency of the information system.
  • In step S210, the data to be counted is acquired and subjected to fragmentation processing to generate task fragmentation data.
  • The data to be counted may refer to data that needs to be counted and has a complex structure or complex paths.
  • For example, the data to be counted may be tree-structured data corresponding to the types of goods in the automatic order data, or data in a mesh structure, or other data to be counted with a complex structure or complex paths, which is not specifically limited in this exemplary embodiment.
  • Fragmentation processing may refer to the process of flattening and splitting the data to be counted according to specific data.
  • For example, fragmentation processing may be the process of decomposing the tree structure of all the data to be counted corresponding to the intelligent offline item transaction system according to the different item transaction system numbers, or the process of decomposing the mesh structure of the data to be counted according to path numbers, which is not particularly limited in this exemplary embodiment.
  • the task fragmentation data may be metadata generated after fragmentation of the statistical data to be processed.
  • identification data corresponding to the data to be counted may be obtained, and then the data to be counted is segmented according to the identification data to generate task segment data corresponding to the data to be counted.
  • the identification data may refer to the basis for performing fragmentation processing on the statistical data to be processed.
  • The identification data may be the item transaction system number (StoreID) corresponding to each piece of automatic order data, or the path number of the mesh structure. Of course, the identification data may also be other data that can serve as a basis for fragmenting the data to be counted, which is not specifically limited in this exemplary embodiment.
  • For example, the identification data may be the item transaction system number. The first-level category data, second-level category data, third-level category data and fourth-level category data corresponding to the first item transaction system number are used as the first task fragmentation data; the first-level category data, second-level category data, third-level category data and fourth-level category data corresponding to the second item transaction system number are used as the second task fragmentation data, and so on. In this way, the data to be counted, which is structurally complex and large in volume, is fragmented to generate multiple pieces of task fragmentation data.
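  • The fragmentation by identification data described above can be sketched in Python as follows. This is an illustration under assumptions rather than the disclosed implementation: the record layout, the field names `store_id` and `path`, and the function `fragment_by_identifier` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical flat records: one row per fourth-level category of a store.
records = [
    {"store_id": "S1", "path": ("L1-A", "L2-A", "L3-A", "L4-A")},
    {"store_id": "S1", "path": ("L1-A", "L2-A", "L3-A", "L4-B")},
    {"store_id": "S2", "path": ("L1-A", "L2-B", "L3-C", "L4-D")},
]


def fragment_by_identifier(rows, key="store_id"):
    """Group rows by the identification field; each group is one task fragment."""
    shards = defaultdict(list)
    for row in rows:
        shards[row[key]].append(row)
    # Each fragment carries its identification value and the rows that belong to it.
    return [{"identifier": ident, "rows": group} for ident, group in shards.items()]


task_fragments = fragment_by_identifier(records)
```

  • Each resulting fragment is independent of the others, so fragments can be handed to different consumers without coordination.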
  • In step S220, the task fragmentation data is distributed to event listeners through a preset event broadcaster.
  • The preset event broadcaster may refer to a program in the Spring event monitoring mechanism that distributes received events to the data-consuming nodes of the event listeners, and an event listener may refer to a program of an event-consuming node that receives the events distributed by the event broadcaster and processes them.
  • Before the task fragmentation data is distributed through the preset event broadcaster, the target identification data corresponding to the task fragmentation data may be determined according to the task fragmentation data; a data query is then performed according to the target identification data to determine the first-level category data corresponding to the target identification data.
  • the target identification data may refer to the identification data corresponding to the current task fragmentation data.
  • For example, suppose the first item transaction system number corresponds to the first task fragmentation data and the second item transaction system number corresponds to the second task fragmentation data. If the current task fragmentation data is the second task fragmentation data, the target identification data may be the second item transaction system number.
  • The current task fragmentation data contains the target identification data and the scattered data corresponding to the target identification data. The target identification data is used in a data query to determine the corresponding first-level category data, which facilitates the consumption and sorting of data during the subsequent data query and assembly and improves the efficiency of data statistics.
  • A data capture event can be constructed according to the target identification data and the first-level category data corresponding to the task fragmentation data; the data capture events are then processed asynchronously.
  • the data capture event can refer to the event instruction constructed according to the target identification data and the first-level category data corresponding to the task fragmentation data.
  • the Spring event monitoring mechanism mainly consists of three parts: events, event broadcasters, and event listeners. By constructing a data capture event with target identification data and first-level category data corresponding to the task fragmentation data, it is convenient for subsequent preset event broadcasters and event listeners to consume task fragmentation data.
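  • The event, broadcaster and listener roles can be illustrated with a small Python analogue. This is not the Spring API, only a sketch of the same pattern under assumptions: `DataCaptureEvent`, `EventBroadcaster` and `capture_listener` are hypothetical names, and a thread pool stands in for asynchronous listener execution.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCaptureEvent:
    """Event built from the target identification data and a first-level category."""
    store_id: str
    first_level_category: str


class EventBroadcaster:
    """Hands each published event to every registered listener on a worker thread."""

    def __init__(self, max_workers=4):
        self._listeners = []
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def add_listener(self, listener):
        self._listeners.append(listener)

    def publish(self, event):
        # submit() returns immediately, so the publisher is not blocked by the listeners.
        return [self._pool.submit(listener, event) for listener in self._listeners]

    def shutdown(self):
        self._pool.shutdown(wait=True)


def capture_listener(event):
    # Placeholder for the query-and-assemble work done by a real listener.
    return f"{event.store_id}/{event.first_level_category}"


broadcaster = EventBroadcaster()
broadcaster.add_listener(capture_listener)
futures = []
for store_id, category in [("S1", "L1-A"), ("S2", "L1-A")]:
    futures += broadcaster.publish(DataCaptureEvent(store_id, category))
handled = sorted(f.result() for f in futures)
broadcaster.shutdown()
```

  • Because each event carries everything the listener needs (identifier plus first-level category), listeners do not have to share state while they run.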
  • In step S230, based on the event listener, data is queried and assembled according to the task fragmentation data to generate multi-level category data corresponding to the data to be counted.
  • The preset event broadcaster distributes and broadcasts the data capture events built from the multiple pieces of task fragmentation data to multiple event listeners. The event listeners receive their corresponding data capture events and execute them asynchronously, so that the data capture events built from the multiple pieces of task fragmentation data are processed asynchronously.
  • The event listener queries data according to the target identification data and the first-level category data corresponding to the task fragmentation data, and assembles the queried data along the first-level category data to obtain the multi-level category data corresponding to the current task fragmentation data. Finally, the multi-level category data corresponding to the different pieces of task fragmentation data is assembled to obtain the tree-structured multi-level category data corresponding to the data to be counted.
  • Based on the event listener, sub-category data can be queried and captured in the target database according to the target identification data and the first-level category data corresponding to the task fragmentation data in the data capture event; the target identification data, the first-level category data and the sub-category data are then assembled to generate the multi-level category data corresponding to the data to be counted.
  • The sub-category data may refer to the multiple pieces of lower-level category data corresponding to the first-level category data under the current target identification data, as queried and captured by the event listener. For example, under the first item transaction system number, the sub-category data may be the second-level category data, third-level category data and fourth-level category data of the first-level category data, queried and captured through the event listener. This is only a schematic illustration and is not specifically limited in this exemplary embodiment.
  • the target database may refer to a database that stores data to be counted.
  • According to the target identification data and the first-level category data corresponding to the task fragmentation data in the data capture event, the order data and commodity data of the bottom-level category data corresponding to the first-level category data under the target identification data can be queried and obtained in the target database.
  • The sub-category data can then be generated through level-by-level statistical processing of the bottom-level category data, wherein the sub-category data includes category data of multiple levels, and identification processing is performed on the category data of the different levels.
  • The bottom-level category data may refer to the category data at the lowest level of the sub-category data corresponding to the first-level category data under the target identification data. For example, the bottom-level category data may be the leaf nodes of tree-structured data, or the edge nodes of mesh-structured data, which is not particularly limited in this exemplary embodiment.
  • For example, the bottom-level category data may be the fourth-level category data. The event listener queries and captures the order data and commodity data corresponding to the fourth-level category data under the first-level category data; the third-level category data is then statistically assembled from the order data and commodity data corresponding to the fourth-level category data, the second-level category data is obtained from the statistics of the third-level category data, and finally the sub-category data corresponding to the first-level category data is obtained from the second-level category data, the third-level category data and the fourth-level category data.
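  • The level-by-level statistics from the bottom-level category upward can be sketched as follows. The path-keyed dictionary, the order counts and the function `roll_up` are illustrative assumptions; the sketch only shows that each level is computed from the level directly below it rather than from the raw leaves again.

```python
from collections import defaultdict

# Hypothetical leaf rows: (level-1, level-2, level-3, level-4) path -> order count.
leaf_orders = {
    ("A", "A1", "A1a", "A1a-i"): 5,
    ("A", "A1", "A1a", "A1a-ii"): 3,
    ("A", "A1", "A1b", "A1b-i"): 2,
    ("A", "A2", "A2a", "A2a-i"): 4,
}


def roll_up(leaves):
    """Return {path_prefix: total} for every level, aggregating one level at a time."""
    depth = len(next(iter(leaves)))
    totals = dict(leaves)   # bottom-level totals are the leaf values themselves
    current = dict(leaves)
    for level in range(depth - 1, 0, -1):
        parents = defaultdict(int)
        for path, value in current.items():
            parents[path[:level]] += value  # add each child into its parent
        totals.update(parents)
        current = dict(parents)
    return totals


totals = roll_up(leaf_orders)
```

  • The length of each key doubles as its level marker: a key of length 1 is a first-level category, a key of length 4 is a fourth-level category.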
  • Identification processing can be performed on the different metadata in the second-level category data, the third-level category data and the fourth-level category data. Identifying the different metadata improves the efficiency of data storage and of subsequent data queries. For example, when a user queries the first-level category data, it does not need to be obtained recursively from the fourth-level category data, but can be obtained directly according to the data identifier, which effectively improves data query efficiency.
  • The flattened multi-level category data may be stored in the target database.
  • The metadata in the multi-level category data contains different level identifiers. Storing the multi-level category data together with the level identifiers in the target database can effectively improve data query efficiency.
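  • As an illustration of why stored level identifiers help, the sketch below keeps flattened rows in a table with a `level` column and reads one level directly, without recursing from the leaves. Python's built-in `sqlite3` stands in for the MySQL target database here, and the table name, column names and sample rows are all assumptions.

```python
import sqlite3

# Hypothetical flattened rows: (store_id, level, category_path, order_count).
rows = [
    ("S1", 1, "A", 14),
    ("S1", 2, "A/A1", 10),
    ("S1", 2, "A/A2", 4),
    ("S1", 3, "A/A1/A1a", 8),
]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the target database
conn.execute(
    "CREATE TABLE category_stats ("
    " store_id TEXT, level INTEGER, category_path TEXT, order_count INTEGER,"
    " PRIMARY KEY (store_id, category_path))"
)
conn.executemany("INSERT INTO category_stats VALUES (?, ?, ?, ?)", rows)


def query_level(store_id, level):
    """Fetch the precomputed statistics of one level directly by its level identifier."""
    cur = conn.execute(
        "SELECT category_path, order_count FROM category_stats"
        " WHERE store_id = ? AND level = ? ORDER BY category_path",
        (store_id, level),
    )
    return cur.fetchall()


first_level = query_level("S1", 1)
second_level = query_level("S1", 2)
```

  • A first-level query touches a single precomputed row instead of aggregating every descendant at read time.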
  • This exemplary embodiment mainly fragments the data by first-level category, performs the calculation for each fragment along each path, and runs on a schedule. After the calculation is completed, the results are saved to the database for user queries.
  • The scheduled task runs once a day and processes the previous day's data.
  • When the scheduled task starts, it first obtains the data corresponding to all the item transaction systems, and then distributes the data of each item transaction system as one fragment for the consuming nodes to consume and calculate.
  • When a computing node consumes a fragment, it starts processing.
  • The node first obtains the task fragmentation data and the item transaction system data, obtains the first-level category data corresponding to the item transaction system through a data query, and then distributes the data. At this point, events are created and distributed through the Spring event monitoring mechanism, and all events are processed asynchronously.
  • The event data includes the first-level category data, the sub-category data and the item transaction system data.
  • The current event processor first queries, captures and assembles the data, starting with the fourth-level category.
  • It captures the item data, including the number of orders placed by automatic replenishment, then captures the manual order data, and saves the data to the database.
  • The fourth-level category data is then aggregated into the third-level category data, and so on upward layer by layer, finally generating node data for all levels (each record must be marked with its level). Each time a calculation is completed, the results are sent out through JMQ (message middleware).
  • The system consumes the message, obtains the calculated data and stores it in the MySQL database.
  • FIG. 3 schematically shows a schematic structural diagram of a data distribution mechanism according to some embodiments of the present disclosure.
  • the Spring event monitoring mechanism is mainly composed of three parts: an event source 301 , an event broadcaster 302 , and an event listener registry 303 .
  • the event source 301 includes multiple events 304
  • the event listener registry 303 manages multiple event listeners 305 .
  • A data capture event is constructed according to the target identification data and the first-level category data corresponding to the task fragmentation data.
  • The data capture event is stored in the event source 301.
  • Through the event broadcaster 302, the multiple data capture events in the event source 301 are distributed and broadcast to the multiple event listeners 305 in the event listener registry 303, and each event listener 305 is responsible for one data capture event.
  • The event listener 305 asynchronously processes and consumes the data capture event to obtain the multi-level category data of the data to be counted.
  • FIG. 4 schematically shows a flow chart of data processing for statistical data to be processed according to some embodiments of the present disclosure.
  • Step S401: obtaining the data to be counted, and performing fragmentation processing on the data to be counted to obtain task fragmentation data;
  • Step S402: obtaining the target identification data and first-level category data corresponding to the task fragmentation data according to the task fragmentation data, and constructing a data capture event from the target identification data and the first-level category data;
  • Step S403: consuming the data capture event through the computing node, that is, distributing and broadcasting the data capture event to the event listener through the event broadcaster in the Spring event monitoring mechanism;
  • Step S404: performing data statistics and capture processing through the event listener according to the data capture event;
  • Step S405: publishing the data capture event so that order data and commodity data are captured from the database, and assembling the captured data to generate multi-level category data corresponding to the different target identification data;
  • Step S406: calculating the matching rate according to the multi-level category data. According to business requirements, statistics related to the item transaction system need to be computed along the category dimension; for example, the proportion of automatic replenishment performed with the intelligent replenishment system is counted for a certain item transaction system number on a certain date or over a certain date range, that is, the automatic order matching rate. The following content is mainly analyzed in terms of this automatic order matching rate;
  • Step S407: sending the matching result through the message middleware; the multi-level category data and the calculated matching rate can be sent to the message queue through the message middleware JMQ;
  • Step S408: consuming the message sent by the message middleware, that is, consuming the matching result sent to the message queue by the message middleware JMQ;
  • Step S409: storing the calculation result; the calculation result obtained by consumption is stored in the MySQL database.
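  • Steps S406 to S409 can be sketched end to end as below. Several points are assumptions rather than statements from the disclosure: the matching rate is taken to be automatic orders divided by automatic plus manual orders, an in-process `queue.Queue` stands in for the JMQ message middleware, and a dictionary stands in for the MySQL table. All function and field names are hypothetical.

```python
import queue


def matching_rate(auto_orders, manual_orders):
    """Assumed definition: share of automatic orders among all orders (0.0 if none)."""
    total = auto_orders + manual_orders
    return auto_orders / total if total else 0.0


message_queue = queue.Queue()  # stand-in for the JMQ message queue
stored_results = {}            # stand-in for the MySQL table


def send_result(store_id, category_path, auto_orders, manual_orders):
    """S406 and S407: calculate the rate and send it as a message."""
    message_queue.put({
        "store_id": store_id,
        "category_path": category_path,
        "rate": matching_rate(auto_orders, manual_orders),
    })


def consume_all():
    """S408 and S409: consume every pending message and store its result."""
    while not message_queue.empty():
        msg = message_queue.get()
        stored_results[(msg["store_id"], msg["category_path"])] = msg["rate"]


send_result("S1", "A/A1", auto_orders=3, manual_orders=1)
send_result("S1", "A/A2", auto_orders=0, manual_orders=0)
consume_all()
```

  • Decoupling calculation from storage through a queue means a slow database write does not hold up the listeners that are still calculating.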
  • the data processing apparatus 500 includes: a task fragmentation module 510, a data distribution module 520 and a data assembly module 530, wherein:
  • the task fragmentation module 510 is configured to obtain data to be counted, and perform fragmentation processing on the data to be counted to generate task fragmentation data;
  • the data distribution module 520 is configured to distribute the task fragmentation data to the event listener through a preset event broadcaster;
  • the data assembly module 530 is configured to query and assemble data according to the task fragmentation data based on the event listener to generate multi-level category data corresponding to the data to be counted.
  • the task fragmentation module 510 is further configured to:
  • perform fragmentation processing on the data to be counted according to the identification data, and generate task fragmentation data corresponding to the data to be counted.
  • the data processing apparatus 500 further includes a first-level category data determination unit, and the first-level category data determination unit is configured to:
  • perform a data query according to the target identification data to determine the first-level category data corresponding to the target identification data.
  • the data distribution module 520 is further configured to:
  • the data assembly module 530 further includes:
  • a sub-category data capture unit, configured to query and capture sub-category data in the target database, based on the event listener, according to the target identification data and the first-level category data corresponding to the task fragmentation data in the data capture event;
  • a multi-level category data generating unit configured to assemble the target identification data, the first-level category data and the sub-category data to generate multi-level category data corresponding to the data to be counted.
  • the subcategory data capture unit is further configured to:
  • according to the target identification data and the first-level category data corresponding to the task fragmentation data in the data capture event, query and obtain, in the target database, the order data and commodity data of the bottom-level category data corresponding to the first-level category data under the target identification data;
  • generate sub-category data by performing statistical processing on the bottom-level category data, wherein the sub-category data includes category data of multiple levels, and identification processing is performed on the category data of the different levels.
  • the data processing apparatus 500 further includes a storage unit, and the storage unit is configured to:
  • the flattened multi-level category data is stored in the target database.
  • Although several modules or units of the data processing apparatus are mentioned in the above detailed description, this division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.
  • In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above data processing method is also provided.
  • Those skilled in the art will understand that aspects of the present disclosure may be implemented as a system, method, or program product. Therefore, various aspects of the present disclosure can be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "system".
  • An electronic device 600 according to such an embodiment of the present disclosure is described below with reference to FIG. 6.
  • The electronic device 600 shown in FIG. 6 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • As shown in FIG. 6, the electronic device 600 takes the form of a general-purpose computing device.
  • Components of the electronic device 600 may include, but are not limited to: the above-mentioned at least one processing unit 610, the above-mentioned at least one storage unit 620, a bus 630 connecting different system components (including the storage unit 620 and the processing unit 610), and a display unit 640.
  • The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification.
  • For example, the processing unit 610 may perform the steps shown in FIG. 2: step S210, obtain the data to be counted and fragment it to generate task fragment data; step S220, distribute the task fragment data to the event listener through the preset event broadcaster; step S230, based on the event listener, query and assemble data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
  • The storage unit 620 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 621 and/or a cache storage unit 622, and may further include a read-only storage unit (ROM) 623.
  • The storage unit 620 may also include a program/utility 624 having a set of (at least one) program modules 625. Such program modules 625 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • The bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
  • The electronic device 600 may also communicate with one or more external devices 670 (e.g., keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. The electronic device 600 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. As shown, the network adapter 660 communicates with other modules of the electronic device 600 via the bus 630. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
  • Through the description of the above embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and which includes several instructions to cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to execute the method according to an embodiment of the present disclosure.
  • In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which a program product capable of implementing the above-described method of this specification is stored.
  • In some possible embodiments, various aspects of the present disclosure may also be implemented in the form of a program product including program code which, when the program product runs on a terminal device, causes the terminal device to perform the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification.
  • Referring to FIG. 7, a program product 700 for implementing the above data processing method according to an embodiment of the present disclosure is described. It may take the form of a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer.
  • However, the program product of the present disclosure is not limited thereto. In this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • The program product may employ any combination of one or more readable media.
  • The readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied thereon. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A readable signal medium can also be any readable medium, other than a readable storage medium, that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
  • Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
  • Through the description of the above embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and which includes several instructions to cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to execute the method according to the embodiment of the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Accounting & Taxation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • Software Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Fuzzy Systems (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

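Provided are a data processing method and apparatus, an electronic device, and a storage medium, relating to the field of computer technology. The data processing method includes: obtaining data to be counted, and fragmenting the data to be counted to generate task fragment data (S210); distributing the task fragment data to an event listener through a preset event broadcaster (S220); and, based on the event listener, querying and assembling data according to the task fragment data to generate multi-level category data corresponding to the data to be counted (S230). Flattening and decomposing the tree-structured multi-level category data effectively improves the data processing efficiency of an information system.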

Description

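Data processing method and apparatus, electronic device, and storage medium
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202010924220.9, filed on September 4, 2020 and entitled "Data processing method and apparatus, electronic device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
The present disclosure relates to the field of computer technology, and in particular to a data processing method, a data processing apparatus, an electronic device, and a computer-readable storage medium.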
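Background
With the development of science and technology, unattended intelligent offline item trading systems (such as intelligent physical stores and intelligent vending machines) are becoming increasingly popular. Statistics on the complex intelligent-replenishment data on the procurement side of the supply chain to which such systems belong are therefore becoming increasingly important.
At present, in related schemes, when statistics are compiled on the complex procurement-side intelligent-replenishment data corresponding to an intelligent offline item trading system, the data structure is complex and the data volume is large. As a result, the information system takes a long time to compile the statistics, the data processing cycle is long, and the data processing efficiency of the information system is low.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the present disclosure, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.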
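Summary
The purpose of the embodiments of the present disclosure is to provide a data processing method, a data processing apparatus, an electronic device, and a computer-readable storage medium, so as to overcome, at least to some extent, the problem in related schemes that the information system corresponding to an intelligent offline item trading system has low data processing efficiency.
According to a first aspect of the present disclosure, a data processing method is provided, including: obtaining data to be counted, and fragmenting the data to be counted to generate task fragment data; distributing the task fragment data to an event listener through a preset event broadcaster; and, based on the event listener, querying and assembling data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
According to a second aspect of the present disclosure, a data processing apparatus is provided, including: a task fragmentation module configured to obtain data to be counted and fragment the data to be counted to generate task fragment data; a data distribution module configured to distribute the task fragment data to an event listener through a preset event broadcaster; and a data assembly module configured to, based on the event listener, query and assemble data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
According to a third aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory storing computer-readable instructions which, when executed by the processor, cause the electronic device to perform the following steps: obtaining data to be counted, and fragmenting the data to be counted to generate task fragment data; distributing the task fragment data to an event listener through a preset event broadcaster; and, based on the event listener, querying and assembling data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
According to a fourth aspect of the present disclosure, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the following steps: obtaining data to be counted, and fragmenting the data to be counted to generate task fragment data; distributing the task fragment data to an event listener through a preset event broadcaster; and, based on the event listener, querying and assembling data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the present disclosure.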
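Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure. Obviously, the drawings described below are only some embodiments of the present disclosure, and a person of ordinary skill in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 schematically shows the structure of tree-structured data according to some embodiments of the present disclosure;
FIG. 2 schematically shows a flowchart of a data processing method according to some embodiments of the present disclosure;
FIG. 3 schematically shows the structure of a data distribution mechanism according to some embodiments of the present disclosure;
FIG. 4 schematically shows a flowchart of data processing performed on the data to be counted according to some embodiments of the present disclosure;
FIG. 5 schematically shows a data processing apparatus according to some embodiments of the present disclosure;
FIG. 6 schematically shows the structure of a computer system of an electronic device according to some embodiments of the present disclosure;
FIG. 7 schematically shows a computer-readable storage medium according to some embodiments of the present disclosure.
In the drawings, the same or corresponding reference numerals denote the same or corresponding parts.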
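Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments can, however, be implemented in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of the embodiments of the present disclosure. Those skilled in the art will appreciate, however, that the technical solutions of the present disclosure may be practiced without one or more of the specific details, or with other methods, components, apparatuses, steps, and so on. In other instances, well-known methods, apparatuses, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the present disclosure.
In addition, the drawings are merely schematic illustrations and are not necessarily drawn to scale. The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically independent entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.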
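The applicant has found through research that automatic order data is tree-structured data. Referring to FIG. 1, the tree-structured data includes first-level category data 101 through first-level category data 102. The first-level category data 101 includes second-level category data 103 through second-level category data 104; the first-level category data 102 likewise includes second-level category data, which is not shown in the figure. The second-level category data 103 includes third-level category data 105 through third-level category data 106; the second-level category data 104 likewise includes third-level category data, which is not shown in the figure. The third-level category data 105 includes fourth-level category data 107 through fourth-level category data 108; the third-level category data 106 likewise includes fourth-level category data, which is not shown in the figure. It is easy to understand that the fourth-level category data in FIG. 1 may further include fifth-level category data and, depending on the actual data, may be subdivided into further sub-category data, which is not specifically shown in FIG. 1. FIG. 1 makes it clear that tree-structured automatic order data not only has a complex structure but is also large in volume.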
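To make the size of such a tree concrete, the following is a minimal Python sketch. The `CategoryNode` class, the `build_tree` helper, and the fan-out of 10 children per node are illustrative assumptions, not values taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CategoryNode:
    """One category in the tree; `level` is 1 for a first-level category."""
    name: str
    level: int
    children: List["CategoryNode"] = field(default_factory=list)


def build_tree(level: int, max_level: int, fanout: int, name: str = "c") -> CategoryNode:
    """Build a complete tree with `fanout` children per node (an illustrative shape)."""
    node = CategoryNode(name=name, level=level)
    if level < max_level:
        node.children = [
            build_tree(level + 1, max_level, fanout, f"{name}.{i}")
            for i in range(1, fanout + 1)
        ]
    return node


def count_nodes(node: CategoryNode) -> int:
    """Count every category node in the subtree rooted at `node`."""
    return 1 + sum(count_nodes(child) for child in node.children)


root = build_tree(level=1, max_level=4, fanout=10)
print(count_nodes(root))  # 1 + 10 + 100 + 1000 = 1111
```

Under these assumed numbers, a single first-level category already owns over a thousand nodes, and every store repeats the whole tree.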
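In one related scheme, when a user clicks on a page, the back end issues MySQL database queries against each level and then compiles and aggregates statistics from the data of each level. When the user triggers a query, the system queries the database and compiles the statistics. If the query targets fourth-level category data, it is relatively simple to retrieve all the data of fourth-level category data 107 through fourth-level category data 108 under the corresponding third-level category data 105. However, if the user queries first-level category data 101, its second-level category data must be aggregated, each second-level category must aggregate its third-level category data, and each third-level category must aggregate all of its fourth-level category data. The statistics are computed level by level, the amount of data involved becomes very large, and the computation becomes very time-consuming.
In another related scheme, statistics are compiled periodically by a scheduled task. For the current tree-structured data, for example, a single thread must compute statistics for every node, layer by layer; assuming four layers, the scheduled task computes layer by layer and saves the final statistics to a MySQL database. The data is thus computed in advance on a schedule. However, this scheme faces the same problem: the computation takes a long time, and with single-threaded computation and large data volumes, the computation cycle is long and the server load is heavy.
In yet another related scheme, a big-data approach is used, which makes compiling statistics on this kind of data very convenient. However, for companies or users without a big-data computing platform, this approach is costly: the expense includes building the platform (for example, Spark, HBase, Hadoop, and Kafka clusters), labor hours, and machine resources.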
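In view of one or more of the above problems, this exemplary embodiment first provides a data processing method. The method may be applied to a terminal device, such as a mobile phone or a computer, or to a server; this exemplary embodiment places no special limitation on this. The following description takes execution by a server as an example. FIG. 2 schematically shows a flowchart of a data processing method according to some embodiments of the present disclosure. Referring to FIG. 2, the data processing method may include the following steps:
Step S210: obtain data to be counted, and fragment the data to be counted to generate task fragment data;
Step S220: distribute the task fragment data to an event listener through a preset event broadcaster;
Step S230: based on the event listener, query and assemble data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
According to the data processing method in this exemplary embodiment, on the one hand, the data to be counted is fragmented before it is queried and assembled, which reduces the complexity of the data and improves the data processing efficiency of the information system. On the other hand, the task fragment data is distributed to event listeners through the event broadcaster, and the event listeners query and assemble the data, so the data to be counted is processed asynchronously, which effectively shortens the computation cycle and further improves the data processing efficiency of the information system.
The data processing method in this exemplary embodiment is described further below.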
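In step S210, data to be counted is obtained and fragmented to generate task fragment data.
In an exemplary embodiment of the present disclosure, the data to be counted may refer to data that requires statistics and has a complex structure or complex paths. For example, it may be tree-structured data corresponding to the goods types of automatic order data, network-structured data, or other data to be counted that has a complex structure or complex paths; this exemplary embodiment places no special limitation on this.
Fragmentation may refer to the process of flattening and decomposing the data to be counted according to specific data. For example, fragmentation may decompose the tree structure of the data to be counted for all intelligent offline item trading systems by item trading system number, or decompose the network structure of the data to be counted by path number; this exemplary embodiment places no special limitation on this. The task fragment data may be the metadata generated by fragmenting the data to be counted.
Specifically, identification data corresponding to the data to be counted may be obtained, and the data to be counted may then be fragmented according to the identification data to generate the corresponding task fragment data.
The identification data may refer to the basis on which the data to be counted can be fragmented. For example, when the data to be counted is tree-structured data corresponding to the goods types of automatic order data, the identification data may be the item trading system number StoreID corresponding to each piece of automatic order data; it may also be the path number of a network structure. The identification data may of course be any other data that can serve as a basis for fragmenting the data to be counted; this exemplary embodiment places no special limitation on this.
For example, when the data to be counted is tree-structured data corresponding to the goods types of automatic order data, the identification data may be the item trading system number. The first-level, second-level, third-level, and fourth-level category data corresponding to the first item trading system number may be taken as the first task fragment data; the first-level, second-level, third-level, and fourth-level category data corresponding to the second item trading system number may be taken as the second task fragment data; and so on. In this way, data to be counted that has a complex structure and contains a large amount of data is fragmented into multiple pieces of task fragment data.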
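Fragmenting by store number, as described above, amounts to grouping records by their identification data. The Python sketch below illustrates this under stated assumptions: the record layout, the `store_id` field name, and the sample values are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical flat records: each row names the store and its category path.
records = [
    {"store_id": "S1", "path": ("food", "snacks", "chips", "potato"), "qty": 5},
    {"store_id": "S2", "path": ("food", "drinks", "tea", "green"), "qty": 2},
    {"store_id": "S1", "path": ("food", "drinks", "tea", "black"), "qty": 3},
]


def shard_by_identifier(rows: List[dict], key: str = "store_id") -> Dict[str, List[dict]]:
    """Group rows by the identification data so each group is one task fragment."""
    shards: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        shards[row[key]].append(row)
    return dict(shards)


shards = shard_by_identifier(records)
print(sorted(shards))     # ['S1', 'S2']
print(len(shards["S1"]))  # 2
```

Each resulting group can be handed to a separate worker, since no fragment depends on another store's data.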
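In step S220, the task fragment data is distributed to an event listener through a preset event broadcaster.
In an exemplary embodiment of the present disclosure, the preset event broadcaster may refer to a program, in the Spring event listening mechanism, that distributes received events to the data-consuming nodes of event listeners; an event listener may refer to a program on an event-consuming node that receives the events distributed by the event broadcaster and processes them.
In an exemplary embodiment of the present disclosure, before the task fragment data is distributed to the event listener through the preset event broadcaster, the target identification data corresponding to the task fragment data may be determined, based on the preset event broadcaster, according to the task fragment data; a data query is then performed according to the target identification data to determine the corresponding first-level category data.
The target identification data may refer to the identification data corresponding to the current task fragment data. For example, if the first item trading system number corresponds to the first task fragment data and the second item trading system number corresponds to the second task fragment data, then when the current task fragment data is the second task fragment data, the target identification data may be the second item trading system number. This is only an illustrative example; this exemplary embodiment places no special limitation on this.
When the preset event broadcaster obtains the fragmented task fragment data, the current task fragment data contains the target identification data and the scattered data corresponding to it. Before event distribution, a data query is performed according to the target identification data to determine the corresponding first-level category data, so that the subsequent data query and assembly only need to consume and organize data, which improves the efficiency of the statistics.
Further, a data capture event may be constructed, based on the preset event broadcaster, according to the task fragment data and its corresponding target identification data and first-level category data; the data capture event is then distributed to the event listener so that it is processed asynchronously.
The data capture event may refer to an event instruction constructed from the target identification data and first-level category data corresponding to the task fragment data. The Spring event listening mechanism consists mainly of three parts: events, an event broadcaster, and event listeners. Constructing a data capture event from the target identification data and first-level category data corresponding to the task fragment data makes it easier for the preset event broadcaster and the event listeners to consume the task fragment data subsequently.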
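As a rough illustration of how a data capture event could carry the target identification data and first-level category data, here is a Python sketch. The `DataCaptureEvent` fields, the `FIRST_LEVEL_BY_STORE` lookup, and the choice of one event per first-level category are assumptions made for the example; the disclosure does not fix the event's exact shape.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical lookup standing in for the data query of first-level categories.
FIRST_LEVEL_BY_STORE: Dict[str, List[str]] = {
    "S1": ["food", "household"],
    "S2": ["food"],
}


@dataclass(frozen=True)
class DataCaptureEvent:
    """Event payload: which store and which first-level category to process."""
    store_id: str              # target identification data
    first_level_category: str  # first-level category data
    stat_date: str             # date covered by the statistics (assumed field)


def build_events(store_id: str, stat_date: str) -> List[DataCaptureEvent]:
    """Create one event per first-level category of the given store."""
    categories = FIRST_LEVEL_BY_STORE.get(store_id, [])
    return [DataCaptureEvent(store_id, c, stat_date) for c in categories]


events = build_events("S1", "2020-09-03")
print([e.first_level_category for e in events])  # ['food', 'household']
```

Making the event immutable (`frozen=True`) is a design choice for the sketch: events handed to asynchronous listeners are safer when they cannot be modified after publication.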
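In step S230, based on the event listener, data is queried and assembled according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
In an exemplary embodiment of the present disclosure, the data capture events built from multiple pieces of task fragment data are distributed and broadcast to multiple event listeners through the preset event broadcaster. On receiving its corresponding data capture event, each event listener executes the event asynchronously, so that the data capture events built from the multiple pieces of task fragment data are processed asynchronously. The event listener queries data according to the target identification data and first-level category data corresponding to the task fragment data, and assembles the queried data along the first-level category data to obtain the multi-level category data corresponding to the current task fragment data. Finally, the multi-level category data corresponding to the different pieces of task fragment data is assembled to obtain the tree-structured multi-level category data corresponding to the data to be counted.
Specifically, based on the event listener, sub-category data may be queried and captured in the target database according to the target identification data and first-level category data corresponding to the task fragment data in the data capture event; the target identification data, the first-level category data, and the sub-category data are then assembled to generate the multi-level category data corresponding to the data to be counted.
The sub-category data may refer to the multiple lower-level category data, corresponding to the first-level category data under the current target identification data, that the event listener queries and captures. For example, the sub-category data may be the second-level, third-level, and fourth-level category data, under the first item trading system number, of the first-level category data corresponding to that number, as queried and captured by the event listener. This is only an illustrative example; this exemplary embodiment places no special limitation on this. The target database may refer to the database that stores the data to be counted.
Further, according to the target identification data and first-level category data corresponding to the task fragment data in the data capture event, the order data and commodity data of the lowest-level category data corresponding to the first-level category data under the target identification data may be queried and obtained from the target database; sub-category data may then be generated by level-by-level statistical processing starting from the lowest-level category data. The sub-category data includes category data of multiple levels, and category data of different levels is marked with an identifier.
The lowest-level category data may refer to the category data at the lowest level among the sub-category data corresponding to the first-level category data under the target identification data. For example, it may be a leaf node of tree-structured data or an edge node of network-structured data; this exemplary embodiment places no special limitation on this.
For example, if the tree-structured data has four levels, the lowest-level category data may be the fourth-level category data. The event listener queries and captures the order data and commodity data corresponding to the fourth-level category data under the first-level category data; the third-level category data is then obtained by statistics and assembly from the order data and commodity data corresponding to the fourth-level category data, and the second-level category data is obtained by statistics from the third-level category data. Finally, the sub-category data corresponding to the first-level category data is obtained from the second-level, third-level, and fourth-level category data.
Meanwhile, when the second-level, third-level, and fourth-level category data is obtained, the different metadata in it may be marked with identifiers. These identifiers improve the efficiency of data storage and of subsequent data queries. For example, when a user queries first-level category data, the result does not need to be derived level by level from the fourth-level category data; it can be obtained directly according to the identifier of the data, which effectively improves query efficiency.
Optionally, after data is queried and assembled according to the task fragment data to generate the multi-level category data corresponding to the data to be counted, the flattened multi-level category data may be stored in the target database. When the multi-level category data is generated, every piece of metadata in it carries a level identifier; storing both the multi-level category data and the level identifiers in the target database effectively improves query efficiency when the data is used.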
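The bottom-up aggregation and level tagging can be sketched in Python as follows. This is an illustrative simplification: instead of rolling level 4 into level 3 and then level 3 into level 2, it adds each leaf total to every ancestor prefix directly, which yields the same sums. The `leaf_stats` data and the row layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical leaf (fourth-level) statistics: category path -> order quantity.
leaf_stats: Dict[Tuple[str, ...], int] = {
    ("food", "snacks", "chips", "potato"): 5,
    ("food", "snacks", "chips", "corn"): 1,
    ("food", "drinks", "tea", "green"): 2,
}


def roll_up(leaves: Dict[Tuple[str, ...], int]) -> List[dict]:
    """Aggregate leaf totals into every ancestor and return flat, level-tagged rows."""
    totals: Dict[Tuple[str, ...], int] = defaultdict(int)
    for path, qty in leaves.items():
        # Every prefix of the path is an ancestor (or the leaf itself).
        for depth in range(1, len(path) + 1):
            totals[path[:depth]] += qty
    return [
        {"level": len(path), "path": "/".join(path), "qty": qty}
        for path, qty in sorted(totals.items())
    ]


rows = roll_up(leaf_stats)
by_path = {r["path"]: r for r in rows}
print(by_path["food"])         # {'level': 1, 'path': 'food', 'qty': 8}
print(by_path["food/snacks"])  # {'level': 2, 'path': 'food/snacks', 'qty': 6}
```

Because each flat row carries its own `level`, a later query for a first-level total is a single lookup rather than a recursive walk over the tree.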
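This exemplary embodiment mainly fragments the first-level category data, performs the computation fragment by fragment along each path, runs on a schedule, saves the results to the database after the computation completes, and then makes them available for user queries.
The whole computation can be completed through the following steps:
A scheduled task is defined that runs once a day and processes the data of the previous day. When the scheduled task starts, it first obtains the data corresponding to all item trading systems and then distributes the data of each item trading system as one fragment, for the consumer nodes to begin consuming and computing.
When a computing node consumes a fragment, it begins processing. The node first obtains the task fragment data and the item trading system data, queries the first-level category data corresponding to this item trading system, and then distributes the data again: using the Spring event listening mechanism, it creates events and distributes them, and all events are then processed asynchronously.
After task distribution is complete, the data contains the first-level category data, the sub-category data, and the item trading system data. After the event listener detects an event, the current event handler first captures, queries, and assembles the data: it first computes the fourth-level category data, including the automatic replenishment order quantity, and then captures the manual order data. The data is saved to the database. At the same time, the fourth-level category data is aggregated into the third-level category data. The aggregation proceeds upward layer by layer until node data for every level has been generated (each record must be marked with its level). Each time a computation completes, the result is sent out through JMQ (message middleware).
Finally, after a JMQ message is detected, the system consumes the message, obtains the computed data, and stores it as MySQL data records.
By capturing data, processing it asynchronously, and flattening the tree-structured data corresponding to the data to be counted on the basis of the Spring event listening mechanism, and by flattening the data in advance and storing it in the MySQL database, the data processing efficiency of the information system and the efficiency of user queries can both be effectively improved.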
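The end-to-end flow (fan out one task per store, publish results as messages, consume them, and store them) can be sketched in Python. The stand-ins are assumptions made so the example is self-contained: `queue.Queue` replaces the JMQ middleware, an in-memory SQLite database replaces MySQL, and `PRECOMPUTED` replaces the real per-store computation.

```python
import queue
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Stand-ins (assumptions): queue.Queue for the JMQ middleware, SQLite for MySQL.
message_queue = queue.Queue()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE category_stats (store_id TEXT, level INTEGER, path TEXT, qty INTEGER)")

# Hypothetical per-store results that a real worker would compute from orders.
PRECOMPUTED = {
    "S1": [(1, "food", 8), (2, "food/snacks", 6)],
    "S2": [(1, "food", 2)],
}


def process_shard(store_id: str) -> None:
    """Worker: compute the rows for one store and publish them as messages."""
    for level, path, qty in PRECOMPUTED[store_id]:
        message_queue.put({"store_id": store_id, "level": level, "path": path, "qty": qty})


def run_daily_job(store_ids) -> int:
    """Fan out one task per store, then consume all messages and store them."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        list(pool.map(process_shard, store_ids))  # wait for every worker
    stored = 0
    while not message_queue.empty():
        msg = message_queue.get()
        db.execute(
            "INSERT INTO category_stats VALUES (:store_id, :level, :path, :qty)", msg
        )
        stored += 1
    db.commit()
    return stored


print(run_daily_job(["S1", "S2"]))  # 3
```

In this sketch the consumer runs only after all workers finish; a real message consumer would typically run concurrently with the producers.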
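FIG. 3 schematically shows the structure of a data distribution mechanism according to some embodiments of the present disclosure.
Referring to FIG. 3, the Spring event listening mechanism consists mainly of three parts: an event source 301, an event broadcaster 302, and an event listener registry 303. The event source 301 includes multiple events 304, and the event listener registry 303 manages multiple event listeners 305. After the data to be counted is fragmented into multiple pieces of task fragment data, data capture events are constructed according to the target identification data and first-level category data corresponding to the task fragment data and stored in the event source 301. The event broadcaster 302 then distributes and broadcasts the data capture events in the event source 301 to the event listeners 305 in the event listener registry 303, with each event listener 305 responsible for one data capture event. The event listeners 305 process and consume the data capture events asynchronously to obtain the multi-level category data of the data to be counted.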
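The broadcaster and listener registry of FIG. 3 can be approximated in plain Python. This is a sketch of the pattern only, not of Spring's API: the `EventBroadcaster` class and `capture_listener` function are hypothetical names, and the sketch sends each event to every registered listener on a thread pool, whereas the text above assigns one listener per event.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List

Listener = Callable[[dict], dict]


class EventBroadcaster:
    """Minimal broadcaster: keeps a listener registry and dispatches asynchronously."""

    def __init__(self, max_workers: int = 4) -> None:
        self._listeners: List[Listener] = []  # listener registry
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def register(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def publish(self, event: dict) -> List[Future]:
        """Submit the event to every registered listener without blocking."""
        return [self._pool.submit(listener, event) for listener in self._listeners]

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)


def capture_listener(event: dict) -> dict:
    """Hypothetical listener: would query and assemble data for the event."""
    return {"store_id": event["store_id"], "handled": True}


broadcaster = EventBroadcaster()
broadcaster.register(capture_listener)
futures = [f for sid in ("S1", "S2") for f in broadcaster.publish({"store_id": sid})]
results = sorted(f.result()["store_id"] for f in futures)
broadcaster.shutdown()
print(results)  # ['S1', 'S2']
```

`publish` returns futures rather than results so the caller can continue distributing events while earlier ones are still being processed.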
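FIG. 4 schematically shows a flowchart of data processing performed on the data to be counted according to some embodiments of the present disclosure.
Referring to FIG. 4, step S401: obtain the data to be counted and fragment it to obtain task fragment data;
Step S402: obtain, according to the task fragment data, the corresponding target identification data and first-level category data, and construct a data capture event from the target identification data and first-level category data;
Step S403: consume the data capture event through a computing node, that is, distribute and broadcast the data capture event to event listeners through the event broadcaster in the Spring event listening mechanism;
Step S404: perform data statistics and capture processing through the event listener according to the data capture event;
Step S405: publish the data capture event so that order data and commodity data are captured from the database; assemble the captured data to generate the multi-level category data corresponding to the different target identification data;
Step S406: calculate the matching rate according to the multi-level category data (depending on business requirements, the correlation of the item trading system at the category dimension needs to be counted, for example the proportion of automatic replenishment performed with the intelligent replenishment system at that category dimension for a given item trading system number on a given date or over a given date range, that is, the automatic order matching rate; the following content is analyzed mainly on the basis of this rate);
Step S407: send the matching result through the message middleware; the computed multi-level category data and the matching rate may be sent to a message queue through the message middleware JMQ;
Step S408: consume the message sent by the message middleware, that is, consume the matching result that the message middleware JMQ sent to the message queue;
Step S409: store the calculation result; the consumed calculation result is stored in the MySQL database.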
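The matching rate in step S406 can be illustrated with a short Python sketch. The disclosure describes the rate as the proportion of automatic replenishment but does not state the denominator; the sketch assumes it is the sum of automatic and manual orders. The function name and the sample rows are hypothetical.

```python
def auto_order_match_rate(auto_orders: int, manual_orders: int) -> float:
    """Share of orders placed by automatic replenishment; 0.0 when there are no orders."""
    total = auto_orders + manual_orders
    return auto_orders / total if total else 0.0


# Hypothetical level-tagged rows for one store and one date.
rows = [
    {"level": 4, "path": "food/snacks/chips/potato", "auto": 3, "manual": 1},
    {"level": 3, "path": "food/snacks/chips", "auto": 3, "manual": 1},
    {"level": 1, "path": "food", "auto": 6, "manual": 2},
]
for row in rows:
    row["match_rate"] = auto_order_match_rate(row["auto"], row["manual"])

print(rows[0]["match_rate"])  # 0.75
```

Computing the rate per level-tagged row means the rate for any category level is available directly once the rows are stored.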
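It should be noted that although the steps of the method of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that particular order, or that all of the illustrated steps must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.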
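In addition, this exemplary embodiment also provides a data processing apparatus. Referring to FIG. 5, the data processing apparatus 500 includes a task fragmentation module 510, a data distribution module 520, and a data assembly module 530, where:
the task fragmentation module 510 is configured to obtain data to be counted and fragment the data to be counted to generate task fragment data;
the data distribution module 520 is configured to distribute the task fragment data to an event listener through a preset event broadcaster;
the data assembly module 530 is configured to, based on the event listener, query and assemble data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
In an exemplary embodiment of the present disclosure, based on the foregoing scheme, the task fragmentation module 510 is further configured to:
obtain identification data corresponding to the data to be counted;
fragment the data to be counted according to the identification data to generate the task fragment data corresponding to the data to be counted.
In an exemplary embodiment of the present disclosure, based on the foregoing scheme, the data processing apparatus 500 further includes a first-level category data determination unit, which is configured to:
determine, based on the preset event broadcaster and according to the task fragment data, the target identification data corresponding to the task fragment data;
perform a data query according to the target identification data to determine the first-level category data corresponding to the target identification data.
In an exemplary embodiment of the present disclosure, based on the foregoing scheme, the data distribution module 520 is further configured to:
construct, based on the preset event broadcaster, a data capture event according to the task fragment data and the target identification data and first-level category data corresponding to the task fragment data;
distribute the data capture event to the event listener so that the data capture event is processed asynchronously.
In an exemplary embodiment of the present disclosure, based on the foregoing scheme, the data assembly module 530 further includes:
a sub-category data capture unit, configured to query and capture, based on the event listener, sub-category data in the target database according to the target identification data and the first-level category data corresponding to the task fragment data in the data capture event;
a multi-level category data generation unit, configured to assemble the target identification data, the first-level category data, and the sub-category data to generate the multi-level category data corresponding to the data to be counted.
In an exemplary embodiment of the present disclosure, based on the foregoing scheme, the sub-category data capture unit is further configured to:
query the target database, according to the target identification data and the first-level category data corresponding to the task fragment data in the data capture event, to obtain the order data and commodity data of the lowest-level category data corresponding to the first-level category data under the target identification data;
generate sub-category data by performing level-by-level statistical processing on the lowest-level category data, where the sub-category data includes category data of multiple levels and category data of different levels is marked with an identifier.
In an exemplary embodiment of the present disclosure, based on the foregoing scheme, the data processing apparatus 500 further includes a storage unit, which is configured to:
store the flattened multi-level category data in the target database.
The specific details of each module of the above data processing apparatus have already been described in detail in the corresponding data processing method and are therefore not repeated here.
It should be noted that although several modules or units of the data processing apparatus are mentioned in the above detailed description, this division is not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided among multiple modules or units.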
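In addition, in an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above data processing method is also provided.
Those skilled in the art will understand that aspects of the present disclosure may be implemented as a system, method, or program product. Therefore, various aspects of the present disclosure can be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit", "module", or "system".
An electronic device 600 according to such an embodiment of the present disclosure is described below with reference to FIG. 6. The electronic device 600 shown in FIG. 6 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the electronic device 600 takes the form of a general-purpose computing device. Components of the electronic device 600 may include, but are not limited to: the above-mentioned at least one processing unit 610, the above-mentioned at least one storage unit 620, a bus 630 connecting different system components (including the storage unit 620 and the processing unit 610), and a display unit 640.
The storage unit stores program code that can be executed by the processing unit 610, so that the processing unit 610 performs the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification. For example, the processing unit 610 may perform the steps shown in FIG. 2: step S210, obtain the data to be counted and fragment it to generate task fragment data; step S220, distribute the task fragment data to the event listener through the preset event broadcaster; step S230, based on the event listener, query and assemble data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
The storage unit 620 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 621 and/or a cache storage unit 622, and may further include a read-only storage unit (ROM) 623.
The storage unit 620 may also include a program/utility 624 having a set of (at least one) program modules 625. Such program modules 625 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 600 may also communicate with one or more external devices 670 (e.g., keyboards, pointing devices, Bluetooth devices, etc.), with one or more devices that enable a user to interact with the electronic device 600, and/or with any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. The electronic device 600 may also communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 660. As shown, the network adapter 660 communicates with other modules of the electronic device 600 via the bus 630. It should be understood that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 600, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
Through the description of the above embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and which includes several instructions to cause a computing device (such as a personal computer, a server, a terminal device, or a network device) to execute the method according to an embodiment of the present disclosure.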
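In an exemplary embodiment of the present disclosure, a computer-readable storage medium is also provided, on which a program product capable of implementing the above-described method of this specification is stored. In some possible embodiments, various aspects of the present disclosure may also be implemented in the form of a program product including program code which, when the program product runs on a terminal device, causes the terminal device to perform the steps according to various exemplary embodiments of the present disclosure described in the "Exemplary Methods" section of this specification.
Referring to FIG. 7, a program product 700 for implementing the above data processing method according to an embodiment of the present disclosure is described. It may take the form of a portable compact disc read-only memory (CD-ROM), include program code, and run on a terminal device such as a personal computer. However, the program product of the present disclosure is not limited thereto. In this document, a readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied thereon. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A readable signal medium can also be any readable medium, other than a readable storage medium, that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
In addition, the above drawings are merely schematic illustrations of the processing included in the method according to the exemplary embodiments of the present disclosure and are not intended to be limiting. It is easy to understand that the processing shown in the drawings does not indicate or limit the chronological order of these processes. It is also easy to understand that these processes may be performed, for example, synchronously or asynchronously in multiple modules.
Through the description of the above embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) or on a network, and which includes several instructions to cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to execute the method according to the embodiment of the present disclosure.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the claims.
It should be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.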

Claims (10)

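  1. A data processing method, comprising:
    obtaining data to be counted, and fragmenting the data to be counted to generate task fragment data;
    distributing the task fragment data to an event listener through a preset event broadcaster;
    based on the event listener, querying and assembling data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
  2. The data processing method according to claim 1, wherein fragmenting the data to be counted to generate task fragment data comprises:
    obtaining identification data corresponding to the data to be counted;
    fragmenting the data to be counted according to the identification data to generate the task fragment data corresponding to the data to be counted.
  3. The data processing method according to claim 2, wherein before the task fragment data is distributed to the event listener through the preset event broadcaster, the method further comprises:
    determining, based on the preset event broadcaster and according to the task fragment data, target identification data corresponding to the task fragment data;
    performing a data query according to the target identification data to determine first-level category data corresponding to the target identification data.
  4. The data processing method according to claim 3, wherein distributing the task fragment data to the event listener through the preset event broadcaster comprises:
    constructing, based on the preset event broadcaster, a data capture event according to the task fragment data and the target identification data and the first-level category data corresponding to the task fragment data;
    distributing the data capture event to the event listener so that the data capture event is processed asynchronously.
  5. The data processing method according to claim 4, wherein querying and assembling data according to the task fragment data, based on the event listener, to generate the multi-level category data corresponding to the data to be counted comprises:
    based on the event listener, querying and capturing sub-category data in a target database according to the target identification data and the first-level category data corresponding to the task fragment data in the data capture event;
    assembling the target identification data, the first-level category data, and the sub-category data to generate the multi-level category data corresponding to the data to be counted.
  6. The data processing method according to claim 5, wherein querying and capturing sub-category data in the target database according to the target identification data and the first-level category data corresponding to the task fragment data in the data capture event comprises:
    querying the target database, according to the target identification data and the first-level category data corresponding to the task fragment data in the data capture event, to obtain order data and commodity data of the lowest-level category data corresponding to the first-level category data under the target identification data;
    generating sub-category data by performing level-by-level statistical processing on the lowest-level category data, wherein the sub-category data includes category data of multiple levels, and category data of different levels is marked with an identifier.
  7. The data processing method according to claim 1, wherein after data is queried and assembled according to the task fragment data to generate the multi-level category data corresponding to the data to be counted, the method further comprises:
    storing the flattened multi-level category data in a target database.
  8. A data processing apparatus, comprising:
    a task fragmentation module, configured to obtain data to be counted and fragment the data to be counted to generate task fragment data;
    a data distribution module, configured to distribute the task fragment data to an event listener through a preset event broadcaster;
    a data assembly module, configured to, based on the event listener, query and assemble data according to the task fragment data to generate multi-level category data corresponding to the data to be counted.
  9. An electronic device, comprising:
    a processor; and
    a memory storing computer-readable instructions which, when executed by the processor, implement the data processing method according to any one of claims 1 to 7.
  10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the data processing method according to any one of claims 1 to 7.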
PCT/CN2021/096363 2020-09-04 2021-05-27 Data processing method and apparatus, and electronic device and storage medium WO2022048201A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21863271.9A EP4209933A4 (en) 2020-09-04 2021-05-27 DATA PROCESSING METHOD AND APPARATUS, AND ELECTRONIC DEVICE AND STORAGE MEDIUM
US18/043,912 US20230342369A1 (en) 2020-09-04 2021-05-27 Data processing method and apparatus, and electronic device and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010924220.9 2020-09-04
CN202010924220.9A CN113778976A (zh) 2020-09-04 2020-09-04 Data processing method and apparatus, and electronic device and storage medium

Publications (1)

Publication Number Publication Date
WO2022048201A1 (zh) 2022-03-10

Family

ID=78834982

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/096363 WO2022048201A1 (zh) 2020-09-04 2021-05-27 Data processing method and apparatus, and electronic device and storage medium

Country Status (4)

Country Link
US (1) US20230342369A1 (zh)
EP (1) EP4209933A4 (zh)
CN (1) CN113778976A (zh)
WO (1) WO2022048201A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117112574B (zh) * 2023-10-20 2024-02-23 美云智数科技有限公司 Tree-structured business data construction method and apparatus, computer device, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8825720B1 (en) * 2011-04-12 2014-09-02 Emc Corporation Scaling asynchronous reclamation of free space in de-duplicated multi-controller storage systems
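CN104881492A (zh) * 2015-06-12 2015-09-02 北京京东尚科信息技术有限公司 Data filtering method and apparatus based on cache sharding technology
CN108351900A (zh) * 2015-10-07 2018-07-31 甲骨文国际公司 Relational database organization for sharding
CN108460094A (zh) * 2018-01-30 2018-08-28 上海天旦网络科技发展有限公司 Method and system for storing statistical data
CN110046183A (zh) * 2019-04-16 2019-07-23 北京易沃特科技有限公司 Time-series data aggregation and retrieval method, device, and medium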

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0311564D0 (en) * 2003-05-20 2003-06-25 Ibm Monitoring operational data in data processing systems
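CN103412962B (zh) * 2013-09-04 2016-09-07 国家测绘地理信息局卫星测绘应用中心 Storage method and reading method for massive tile data
US10049134B2 (en) * 2014-06-12 2018-08-14 International Business Machines Corporation Method and system for processing queries over datasets stored using hierarchical data structures
CN107329983B (zh) * 2017-06-01 2020-12-01 昆仑智汇数据科技(北京)有限公司 Distributed storage and reading method and system for machine data
CN110245046B (zh) * 2019-05-29 2023-05-02 吉旗(成都)科技有限公司 Data statistics method and apparatus for Android apps without tracking-point instrumentation
CN111427906B (zh) * 2020-03-30 2023-06-09 南方电网数字平台科技(广东)有限公司 Drag-and-drop data visualization system for multi-component hybrid applications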
CN111427906B (zh) * 2020-03-30 2023-06-09 南方电网数字平台科技(广东)有限公司 拖拽式的多组件混合应用的数据可视化系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4209933A4 *

Also Published As

Publication number Publication date
CN113778976A (zh) 2021-12-10
EP4209933A1 (en) 2023-07-12
EP4209933A4 (en) 2024-04-17
US20230342369A1 (en) 2023-10-26

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21863271

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2021863271

Country of ref document: EP

Effective date: 20230404