CN110796408A - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents

Info

Publication number
CN110796408A
Authority
CN
China
Prior art keywords
task
historical
data
tasks
batch
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911008425.6A
Other languages
Chinese (zh)
Inventor
李青
李根剑
张伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rajax Network Technology Co Ltd
Original Assignee
Rajax Network Technology Co Ltd
Application filed by Rajax Network Technology Co Ltd filed Critical Rajax Network Technology Co Ltd
Priority to CN201911008425.6A priority Critical patent/CN110796408A/en
Publication of CN110796408A publication Critical patent/CN110796408A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083 Shipping
    • G06Q10/0838 Historical data

Abstract

Embodiments of the invention disclose a data processing method, a data processing apparatus, an electronic device and a computer-readable storage medium. Historical tasks are packed into groups based on acquired historical task data, and the quality of the historical task data in each group is evaluated to obtain data quality information for the historical task data. When a plurality of consecutive historical tasks in a task batch meet a first condition, the historical tasks completed between two consecutive first task processing actions in the task batch are packed into a group. When a plurality of consecutive historical tasks in the task batch meet a second condition, the historical tasks for which the second task processing action is completed consecutively in the task batch are packed into a group. The accuracy of data quality evaluation can thereby be improved.

Description

Data processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a data processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
At present, big data produces massive volumes of data, and users can obtain relevant information through data analysis. However, data analysis is meaningful only when it is built on high-quality, valid data. How to evaluate data quality accurately is therefore a problem that urgently needs to be solved.
Disclosure of Invention
In view of the above, the present invention provides a data processing method, an apparatus, an electronic device and a computer-readable storage medium, so as to improve the accuracy of data quality evaluation.
In a first aspect, an embodiment of the present invention provides a data processing method, where the method includes:
receiving a data instruction from a server;
analyzing the data instruction through at least one processor to obtain historical task data, wherein the historical task data comprises task processing resources corresponding to the historical task, a starting address, an arrival address and processing time information of the historical task;
for a task batch of a task processing resource, in response to a plurality of continuous historical tasks in the task batch meeting a preset first condition, packaging and grouping the historical tasks completed between two continuous first task processing actions in the task batch, and in response to a plurality of continuous historical tasks in the task batch meeting a preset second condition, packaging and grouping the historical tasks continuously completed with second task processing actions in the task batch;
performing, by at least one processor, a quality evaluation of the historical task data in each group to obtain data quality information.
Optionally, the plurality of consecutive historical tasks in the task batch meeting the preset first condition specifically means:
the distance between the arrival addresses of any two of the plurality of consecutive historical tasks is not within a preset distance.
Optionally, the plurality of consecutive historical tasks in the task batch meeting the preset second condition specifically means:
the distances between the arrival addresses of the plurality of consecutive historical tasks are within a preset distance.
Optionally, the performing, by at least one processor, quality evaluation on the historical task data in each group to obtain data quality information includes:
simulating the processing time of the historical tasks in the group according to the historical task data in the group to acquire the simulation time information of the historical tasks in the group;
and performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to acquire the data quality information.
Optionally, the performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to obtain the data quality information includes:
and judging the historical task data in the group as valid data in response to the error of the processing time information and the simulation time information of the historical tasks in the group being within a preset time range.
Optionally, the method further includes:
and performing data cleaning on the historical task data according to the data quality information to obtain effective data.
Optionally, the method further includes:
training a corresponding time prediction model according to the effective data;
and predicting the completion time of the currently processed task according to the trained time prediction model.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, where the apparatus includes:
a receiving unit configured to receive a data instruction from a server;
the data acquisition unit is configured to analyze the data instruction through at least one processor and acquire historical task data, wherein the historical task data comprises task processing resources corresponding to the historical tasks, starting addresses, arrival addresses and processing time information of the historical tasks;
the packaging and grouping unit is configured to pack into a group the historical tasks completed between two consecutive first task processing actions in a task batch in response to a plurality of consecutive historical tasks in the task batch meeting a preset first condition, and to pack into a group the historical tasks for which second task processing actions are completed consecutively in the task batch in response to a plurality of consecutive historical tasks in the task batch meeting a preset second condition;
a quality evaluation unit configured to perform, by at least one processor, quality evaluation on the historical task data in each group to obtain data quality information.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, where the memory is used to store one or more computer program instructions, where the one or more computer program instructions are executed by the processor to implement the following steps:
receiving a data instruction from a server;
analyzing the data instruction through at least one processor to obtain historical task data, wherein the historical task data comprises task processing resources corresponding to the historical task, a starting address, an arrival address and processing time information of the historical task;
for a task batch of a task processing resource, in response to a plurality of continuous historical tasks in the task batch meeting a preset first condition, packaging and grouping the historical tasks completed between two continuous first task processing actions in the task batch, and in response to a plurality of continuous historical tasks in the task batch meeting a preset second condition, packaging and grouping the historical tasks continuously completed with second task processing actions in the task batch;
performing, by at least one processor, a quality evaluation of the historical task data in each group to obtain data quality information.
Optionally, the plurality of consecutive historical tasks in the task batch meeting the preset first condition specifically means:
the distance between the arrival addresses of any two of the plurality of consecutive historical tasks is not within a preset distance.
Optionally, the plurality of consecutive historical tasks in the task batch meeting the preset second condition specifically means:
the distances between the arrival addresses of the plurality of consecutive historical tasks are within a preset distance.
Optionally, the performing, by at least one processor, quality evaluation on the historical task data in each group to obtain data quality information includes:
simulating the processing time of the historical tasks in the group according to the historical task data in the group to acquire the simulation time information of the historical tasks in the group;
and performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to acquire the data quality information.
Optionally, the performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to obtain the data quality information includes:
and judging the historical task data in the group as valid data in response to the error of the processing time information and the simulation time information of the historical tasks in the group being within a preset time range.
Optionally, the steps further include:
and performing data cleaning on the historical task data according to the data quality information to obtain effective data.
Optionally, the steps further include:
training a corresponding time prediction model according to the effective data;
and predicting the completion time of the currently processed task according to the trained time prediction model.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement a method as described above.
According to the embodiments of the invention, historical tasks are packed into groups based on the acquired historical task data, and the quality of the historical task data in each group is evaluated to obtain data quality information for the historical task data. When a plurality of consecutive historical tasks in a task batch meet a first condition, the historical tasks completed between two consecutive first task processing actions in the task batch are packed into a group. When a plurality of consecutive historical tasks in a task batch meet a second condition, the historical tasks for which the second task processing action is completed consecutively in the task batch are packed into a group. The accuracy of data quality evaluation can thereby be improved.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following description of the embodiments of the present invention with reference to the accompanying drawings, in which:
FIG. 1 is a diagram of a task processing process and its corresponding task nodes in the prior art;
FIG. 2 is a schematic diagram of a prior art task delivery path;
FIG. 3 is a flow chart of a data processing method of an embodiment of the present invention;
FIG. 4 is a schematic diagram of a historical task trajectory and a simulation trajectory according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of another historical task trajectory and simulation trajectory according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an electronic device of an embodiment of the invention.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.
Further, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.
Unless the context clearly requires otherwise, throughout the description, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
In the field of logistics distribution, in order to accurately predict distribution time, a time prediction model corresponding to different time intervals is generally established according to behavior of distribution resources, so as to predict time between different task nodes in a distribution process, and then a plurality of predicted times are added to obtain overall predicted distribution time. The time prediction model corresponding to each time interval is usually trained and acquired based on the completion time information of each task node of the historical task. Taking a takeaway task as an example, the completion time point of each task node is predicted by each prediction model to predict the overall delivery time.
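The idea of adding several per-interval predictions to obtain the overall predicted delivery time can be sketched as follows. The three interval models, their coefficients and the feature names are hypothetical placeholders. The source only states that each interval has its own model.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]
Predictor = Callable[[Features], float]


def predict_total_time(predictors: List[Predictor], features: Features) -> float:
    """Sum the per-interval predictions to get the overall delivery time."""
    return sum(p(features) for p in predictors)


# Toy stand-in models (assumed), each returning minutes for one interval.
def to_merchant(f: Features) -> float:
    return 2.0 * f["km_to_merchant"]


def wait_for_meal(f: Features) -> float:
    return 5.0


def to_customer(f: Features) -> float:
    return 3.0 * f["km_to_customer"]


total = predict_total_time(
    [to_merchant, wait_for_meal, to_customer],
    {"km_to_merchant": 1.5, "km_to_customer": 2.0},
)
```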
Fig. 1 is a schematic diagram of a task processing process and its corresponding task nodes in the prior art. As shown in fig. 1, the task nodes of a takeaway task may include heading to the merchant after accepting the order, arriving within range of the merchant, arriving at the merchant, picking up the meal, leaving the range of the merchant, arriving within range of the delivery address, arriving at the delivery address, delivering the meal, leaving the range of the delivery address, and so on. The time intervals formed by these task nodes have different characteristics. For example, for takeaway tasks in the same cell (that is, the same residential community), the time from arriving at the cell to arriving at the delivery address can differ greatly between tasks. Therefore, for the different time intervals formed by the task nodes, an independent time prediction model based on different features is usually used for each interval, so as to improve prediction accuracy. For example, the time from the previous meal pickup to arriving within range of the merchant (such as the building where the merchant is located) is mainly related to traffic conditions and distance, and can be predicted with an independent model. To verify whether each prediction model is accurate, verification and adjustment are generally performed using historical task data.
At present, takeaway platforms generally feed back to the system the completion time points of 4 task nodes: order acceptance, arrival at the merchant, meal pickup and meal delivery. That is, the completion time information of each historical task includes the recorded times of these 4 task nodes. Therefore, to ensure the accuracy of a time prediction model trained on the historical task data currently fed back, cleaning the historical task data to improve the quality of the training data is an essential step.
FIG. 2 is a schematic diagram of a prior art task delivery path. Taking takeaway tasks as an example, as shown in fig. 2, in one task batch the delivery rider first picks up takeaway task objects B1, B2 and B3 from the merchant at position A1 and delivers takeaway task objects B1 and B2 to their arrival addresses via path 1. While delivering B1 and B2, the delivery rider accepts the delivery task for takeaway task object B4, travels via path 2 to the merchant at position A2, picks up B4, and then delivers takeaway task objects B3 and B4 to their arrival addresses via path 3.
In the conventional data quality evaluation method, the historical task data of takeaway task objects B1-B4 are obtained separately. The historical task data includes the starting address, the arrival address, the order acceptance time, the merchant arrival time, the meal pickup time and the delivery time. The merchant arrival time, meal pickup time and delivery time of each of B1-B4 are then estimated from the corresponding starting address, arrival address, order acceptance time, road condition information and the like, or are looked up in a pre-acquired table that maps delivery distance to time. The estimated merchant arrival time, meal pickup time and delivery time are compared with the actual ones, and the data quality information of the historical task data of B1-B4 is determined from the comparison result.
For example, in a delivery distance versus time table derived by simulation, a delivery distance between L1 and L2 corresponds to a threshold time range of x1-x2. Assuming that the length of path 1 is between L1 and L2, if the time the delivery rider spends on path 1 is within the threshold time range x1-x2, the time period information from the meal pickup time to the arrival time of takeaway task objects B1 and B2 is judged to be valid historical data. In the prior art, data quality evaluation is performed only on the independent information of a single historical task, and the dependency relationships between historical tasks of the same batch are not considered. For example, the direct delivery path of takeaway task object B3 is path 4, but in the actual delivery process the delivery rider first delivers B1 and B2, then travels to address A2 to pick up B4, and only then delivers B3. With the existing data quality evaluation method, the historical data of B3 is therefore likely to be classified as invalid data even though it is actually valid.
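The prior-art lookup check described above, and the way it misjudges task B3, can be sketched as follows. The table rows, distances and durations are made-up illustrative values, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# (min_distance_m, max_distance_m, min_minutes, max_minutes): illustrative only.
DISTANCE_TIME_TABLE: List[Tuple[float, float, float, float]] = [
    (0, 1000, 2, 10),
    (1000, 3000, 5, 20),
]


def threshold_for(distance_m: float) -> Optional[Tuple[float, float]]:
    """Look up the threshold time range for a delivery distance."""
    for lo, hi, t_min, t_max in DISTANCE_TIME_TABLE:
        if lo <= distance_m < hi:
            return (t_min, t_max)
    return None


def is_valid_isolated(distance_m: float, elapsed_min: float) -> bool:
    """Prior-art check: judge one task on its own distance and elapsed time."""
    rng = threshold_for(distance_m)
    return rng is not None and rng[0] <= elapsed_min <= rng[1]


# Path 1 (B1, B2): elapsed time fits the range for its distance.
b1_ok = is_valid_isolated(800, 6)
# B3: the direct path 4 is short, but the rider detoured to deliver B1 and B2
# and to pick up B4, so the elapsed time is long and the record is rejected.
b3_ok = is_valid_isolated(800, 25)
```

Here `b3_ok` is `False` even though the record reflects a legitimate delivery, which is the misclassification the embodiment aims to avoid.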
Therefore, the embodiment of the invention provides a data processing method, which fully considers the dependency relationship among tasks in the same batch in the logistics distribution field, such as the influence of the co-fetching behavior, the co-sending behavior or the co-fetching and co-sending behavior on the completion time information of each task node of the historical tasks, so as to improve the accuracy of data quality evaluation.
Fig. 3 is a flowchart of a data processing method of an embodiment of the present invention. As shown in fig. 3, the data processing method according to the embodiment of the present invention includes the following steps:
Step S100, receiving a data instruction from the server. In this embodiment, the server triggers the data instruction used for data evaluation.
Step S200, analyzing the data instruction through at least one processor to acquire historical task data. The historical task data includes the task processing resource corresponding to each historical task, and the starting address, arrival address and processing time information of the historical task. The processing time information of a historical task includes the completion time information of each task node of the historical task. Taking takeaway tasks as an example, the task processing resources are delivery resources, including delivery riders, delivery terminals, vehicles and the like. The processing time information of a historical task includes the order acceptance time, the merchant arrival time, the meal pickup time, the delivery time and the like.
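One possible in-memory shape for a historical task record is sketched below. The class name, field names and coordinate representation are assumptions made for illustration. The source only lists which pieces of information the data contains.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple

Coordinate = Tuple[float, float]  # assumed (x, y) position in metres


@dataclass(frozen=True)
class HistoricalTask:
    task_id: str
    resource_id: str             # task processing resource, e.g. a delivery rider
    start_address: Coordinate    # merchant / pickup location
    arrival_address: Coordinate  # delivery location
    accepted_at: datetime        # order acceptance time
    arrived_at_merchant: datetime
    picked_up_at: datetime
    delivered_at: datetime

    def nodes_in_order(self) -> bool:
        """Basic sanity check: the four node times must not go backwards."""
        return (self.accepted_at <= self.arrived_at_merchant
                <= self.picked_up_at <= self.delivered_at)


task = HistoricalTask(
    task_id="B1", resource_id="rider-1",
    start_address=(0.0, 0.0), arrival_address=(800.0, 0.0),
    accepted_at=datetime(2019, 10, 22, 12, 0),
    arrived_at_merchant=datetime(2019, 10, 22, 12, 8),
    picked_up_at=datetime(2019, 10, 22, 12, 12),
    delivered_at=datetime(2019, 10, 22, 12, 25),
)
```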
Step S300, for a task batch of a task processing resource, in response to that a plurality of consecutive historical tasks in the task batch meet a preset first condition, packaging and grouping the historical tasks completed between two consecutive first task processing actions in the task batch, and in response to that a plurality of consecutive historical tasks in the task batch meet a preset second condition, packaging and grouping the historical tasks continuously completed with second task processing actions in the task batch.
In an optional implementation manner, the task batch of the task processing resource may be divided according to a time period, or the stage from the completion of all the tasks bound to the task processing resource to the next completion of all the tasks bound to the task processing resource may be a task batch, which is not limited in this embodiment.
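The second way of delimiting a task batch (from one point where all bound tasks are complete to the next such point) can be sketched as follows. The event representation, with `"accept"` and `"deliver"` actions in time order, is an assumption made for this example.

```python
from typing import List, Set, Tuple

Event = Tuple[str, str]  # (action, task_id), actions: "accept" or "deliver"


def split_batches(events: List[Event]) -> List[List[str]]:
    """Close a batch each time the resource has no unfinished bound tasks."""
    batches: List[List[str]] = []
    current: List[str] = []
    open_tasks: Set[str] = set()
    for action, task_id in events:
        if action == "accept":
            open_tasks.add(task_id)
            current.append(task_id)
        elif action == "deliver":
            open_tasks.discard(task_id)
            if not open_tasks and current:
                batches.append(current)
                current = []
    return batches  # tasks still in progress at the end are not emitted


events = [
    ("accept", "B1"), ("accept", "B2"), ("accept", "B3"),
    ("deliver", "B1"), ("accept", "B4"), ("deliver", "B2"),
    ("deliver", "B4"), ("deliver", "B3"),
    ("accept", "C1"), ("deliver", "C1"),
]
batches = split_batches(events)
```

Because B4 is accepted before B2 and B3 are delivered, it joins the same batch as B1-B3, matching the scenario in fig. 2.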
In an optional implementation manner, a plurality of consecutive historical tasks in a task batch meeting the preset first condition specifically means: the distance between the arrival addresses of any two historical tasks among the plurality of consecutive historical tasks in the task batch is not within the preset distance.
Taking the takeaway task as an example, the first task processing action is a meal pickup action. Assume that a certain task batch of a delivery resource includes takeaway task objects B1-B4, where takeaway task objects B1-B3 are picked up at address A1, the delivery rider then arrives at cell 1 to deliver takeaway task objects B1 and B2, then arrives at address A2 to pick up takeaway task object B4, and finally arrives at cell 2 to deliver takeaway task objects B4 and B3. The arrival addresses of takeaway task objects B1 and B3 are not in the same cell, that is, not within the preset distance, so the historical tasks completed between two consecutive first task processing actions in the task batch can be packed into groups: takeaway task objects B1 and B2 are packed as a first group, and takeaway task objects B3 and B4 are packed as a second group.
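The grouping under the first condition can be sketched as follows. The sketch reads the condition as "at least one pair of arrival addresses is farther apart than the preset distance", which is one interpretation of the translated wording. The coordinates, the Euclidean distance and the event format are assumptions.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]


def first_condition(addresses: List[Coordinate], preset_m: float) -> bool:
    """True if some pair of arrival addresses is not within the preset distance."""
    return any(math.dist(a, b) > preset_m for a, b in combinations(addresses, 2))


def group_between_pickups(events: List[Tuple[str, str]]) -> List[List[str]]:
    """Group the tasks delivered between two consecutive pickup actions."""
    groups: List[List[str]] = []
    current: List[str] = []
    for action, task_id in events:
        if action == "pickup":
            if current:  # a pickup after some deliveries closes the group
                groups.append(current)
                current = []
        else:  # "deliver"
            current.append(task_id)
    if current:  # the end of the batch closes the last group
        groups.append(current)
    return groups


arrival: Dict[str, Coordinate] = {
    "B1": (0.0, 0.0), "B2": (0.0, 50.0),        # cell 1
    "B3": (2000.0, 0.0), "B4": (2000.0, 30.0),  # cell 2
}
events = [
    ("pickup", "B1"), ("pickup", "B2"), ("pickup", "B3"),  # at A1
    ("deliver", "B1"), ("deliver", "B2"),
    ("pickup", "B4"),                                      # at A2
    ("deliver", "B4"), ("deliver", "B3"),
]
groups = []
if first_condition(list(arrival.values()), preset_m=100.0):
    groups = group_between_pickups(events)
```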
In an optional implementation manner, a plurality of consecutive historical tasks in a task batch meeting the preset second condition specifically means: the distances between the arrival addresses of the plurality of consecutive historical tasks in the task batch are within the preset distance.
Taking the takeaway task as an example, the second task processing action is a meal delivery action. Assume that a certain task batch of a delivery resource includes takeaway task objects C1-C3, where takeaway task objects C1-C3 are picked up at address A3 and the delivery rider then arrives at cell 3 to deliver takeaway task objects C1-C3 consecutively. The arrival addresses of takeaway task objects C1-C3 are in the same cell, that is, within the preset distance, so the historical tasks for which the second task processing action is completed consecutively in the task batch can be packed into a group: takeaway task objects C1-C3 are packed as a third group.
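The grouping under the second condition can be sketched as follows. It assumes that "within the preset distance" means every pair of arrival addresses in a run of consecutive deliveries is within that distance, and it uses made-up coordinates. The task `D1` is a hypothetical extra delivery added only to show where a run ends.

```python
import math
from typing import List, Tuple

Coordinate = Tuple[float, float]
Delivery = Tuple[str, Coordinate]  # (task_id, arrival address), in delivery order


def group_consecutive_deliveries(deliveries: List[Delivery],
                                 preset_m: float) -> List[List[str]]:
    """Split consecutive deliveries into runs whose addresses are all close."""
    groups: List[List[str]] = []
    run: List[Delivery] = []
    for task_id, addr in deliveries:
        fits = all(math.dist(addr, other) <= preset_m for _, other in run)
        if run and not fits:
            groups.append([t for t, _ in run])
            run = []
        run.append((task_id, addr))
    if run:
        groups.append([t for t, _ in run])
    return groups


deliveries = [
    ("C1", (0.0, 0.0)), ("C2", (30.0, 0.0)), ("C3", (60.0, 40.0)),  # cell 3
    ("D1", (5000.0, 0.0)),  # hypothetical delivery far from cell 3
]
groups = group_consecutive_deliveries(deliveries, preset_m=100.0)
```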
It should be understood that, in this embodiment, the consecutive plurality of historical tasks in one task batch may include all the historical tasks in the task batch, or may include only a part of the historical tasks in the task batch, and this embodiment is not limited thereto. Alternatively, the historical task objects may be divided according to the arrival addresses thereof, for example, if the distance between the arrival addresses of more than a predetermined number of consecutive historical tasks is within a preset distance, the historical tasks are packed and grouped by adopting a second condition.
Step S400, performing quality evaluation according to the information of the historical tasks in each group through at least one processor to obtain data quality information of the historical task data.
In an alternative implementation, step S400 includes:
Step S410, simulating the processing time of the historical tasks in each group according to the historical task data in each group to obtain the simulation time information of the historical tasks in each group. Taking a takeaway task as an example, the completion time of each task node of each historical task in a group is simulated according to the order acceptance time, the starting address and the arrival address of each historical task in the group, so as to obtain the simulation time information of each historical task in the group.
Step S420, performing quality evaluation on the historical task data in each group according to the processing time information of the historical tasks in each group and the corresponding simulation time information to acquire corresponding data quality information. In an alternative implementation manner, in response to the error between the processing time information and the simulation time information of the historical tasks in a group being within a preset time range, the historical task data in the group is judged to be valid data.
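The validity judgment in step S420 can be sketched as a per-group comparison of actual and simulated node times. The node names, the use of seconds since the start of the batch, and the rule that every node must be within the tolerance are assumptions. The source only says the error must be within a preset time range.

```python
from typing import Dict, List

NodeTimes = Dict[str, float]  # node name -> seconds since the batch started


def group_is_valid(actual: List[NodeTimes], simulated: List[NodeTimes],
                   max_error_s: float) -> bool:
    """Valid if every node time of every task is within the preset error."""
    for act, sim in zip(actual, simulated):
        for node, t in act.items():
            if abs(t - sim[node]) > max_error_s:
                return False
    return True


# Group 1: actual and simulated times agree closely.
actual_g1 = [{"pickup": 300, "delivery": 900}, {"pickup": 300, "delivery": 1020}]
sim_g1 = [{"pickup": 280, "delivery": 930}, {"pickup": 280, "delivery": 1000}]
# Group 2: the recorded delivery time is far from the simulated one.
actual_g2 = [{"pickup": 1500, "delivery": 2400}]
sim_g2 = [{"pickup": 1480, "delivery": 1800}]

g1_valid = group_is_valid(actual_g1, sim_g1, max_error_s=120)
g2_valid = group_is_valid(actual_g2, sim_g2, max_error_s=120)
```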
FIG. 4 is a diagram of a historical task trajectory and a simulation trajectory according to an embodiment of the present invention. In this embodiment, for a historical task in a task batch, when the distance between the arrival addresses of any two historical tasks in a plurality of consecutive historical tasks is not within the preset distance, the historical tasks completed between the processing actions of two consecutive first tasks in the task batch are packed and grouped. Taking the takeaway task as an example, the first task processing action is a meal taking action, and the preset distance may be a distance range of the same cell, or may be a specific distance range (e.g., 100 m).
As shown in fig. 4, one task batch of one delivery rider includes takeaway task objects B1-B4 that are delivered consecutively, where takeaway task objects B1 and B2 form a first group and takeaway task objects B3 and B4 form a second group. In this embodiment, trajectory segmentation is performed on the historical tasks in each group of the task batch to obtain a takeaway task trajectory 41, and simulation is performed on the historical tasks in each group of the task batch to obtain a takeaway task simulation trajectory 42.
As shown in FIG. 4, in the takeaway task trajectory 41, takeaway task objects B1-B3 are picked up at meal pickup address A1. After riding for time Eta1, the delivery rider arrives at cell 1 and delivers takeaway task objects B1 and B2 in sequence, which takes time Ets1. After delivering B1 and B2, the delivery rider rides for time Eta2 to meal pickup address A2 and picks up takeaway task object B4, then rides for a further time Eta3 to reach cell 2 and delivers takeaway task objects B4 and B3 in sequence, which takes time Ets2.
Since the takeaway tasks of the task batch are packed into groups and the task trajectory is simulated based on the grouping, the takeaway task simulation trajectory 42 corresponds to the actual takeaway task trajectory 41. As shown in FIG. 4, in the takeaway task simulation trajectory 42, takeaway task objects B1-B3 are picked up at meal pickup address A1. After riding for time Eta'1, the delivery rider arrives at cell 1 and delivers takeaway task objects B1 and B2 in sequence, which takes time Ets'1. After delivering B1 and B2, the delivery rider rides for time Eta'2 to meal pickup address A2 and picks up takeaway task object B4, then rides for a further time Eta'3 to reach cell 2 and delivers takeaway task objects B4 and B3 in sequence, which takes time Ets'2.
As shown in fig. 4, between the takeaway task trajectory 41 and the takeaway task simulation trajectory 42, the difference in the time of reaching cell 1 is Err1. The data quality of the historical task data of each group in the task batch can be evaluated by determining whether the time difference Err1 is within a preset time range. If so, the historical task data in the group is valid data; otherwise, the historical task data in the group is invalid data.
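The comparison of the two trajectories in FIG. 4 can be sketched by accumulating segment durations into a timeline and taking the difference at the arrival at cell 1. The segment durations and the tolerance below are made-up values. Only the segment names Eta1-Eta3, Ets1-Ets2 and Err1 come from the description.

```python
from itertools import accumulate
from typing import List

# Segment order: Eta1 (ride to cell 1), Ets1 (deliver B1, B2),
# Eta2 (ride to A2), Eta3 (ride to cell 2), Ets2 (deliver B4, B3).
actual_segments_s: List[int] = [600, 300, 420, 480, 360]     # Eta1..Ets2
simulated_segments_s: List[int] = [540, 280, 400, 500, 330]  # Eta'1..Ets'2

# Cumulative time since pickup at A1 at the end of each segment.
actual_timeline = list(accumulate(actual_segments_s))
simulated_timeline = list(accumulate(simulated_segments_s))

# Err1: difference in the time of reaching cell 1 (end of the first segment).
err1 = abs(actual_timeline[0] - simulated_timeline[0])

PRESET_RANGE_S = 120  # assumed tolerance
first_group_valid = err1 <= PRESET_RANGE_S
```

The same timelines also give the difference at the arrival at cell 2 (index 3), which could be checked for the second group in the same way.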
FIG. 5 is a schematic diagram of another historical task trajectory and simulation trajectory according to an embodiment of the invention. In this embodiment, for the historical tasks in one task batch, the historical tasks that continuously complete the second task processing action in the task batch are packed and grouped, wherein the distance between the arrival addresses of the plurality of continuous historical tasks in the task batch is within the preset distance. Taking the take-out task as an example, the second task processing action is a meal delivery action, and the preset distance may be a distance range of the same cell, or may be a specific distance range (e.g. 100 m).
As shown in FIG. 5, one task batch of a delivery rider contains consecutively delivered takeaway task objects C1-C3, which form a third group. In this embodiment, trajectory segmentation is performed on the historical tasks in the third group of the task batch to obtain a takeaway task trajectory 51, and the historical tasks in the third group are simulated to obtain a takeaway task simulation trajectory 52.
As shown in FIG. 5, in the takeaway task trajectory 51, takeaway task objects C1-C3 are picked up at meal-taking address A3. After a riding time Eta4, the delivery rider reaches cell 3 and delivers takeaway task objects C1-C3 in sequence, which takes time Ets3.
Because the takeaway tasks of the task batch are packed into groups and the task trajectory is simulated per group, the takeaway task simulation trajectory 52 corresponds to the actual takeaway task trajectory 51. As shown in FIG. 5, in the takeaway task simulation trajectory 52, takeaway task objects C1-C3 are picked up at meal-taking address A3. After a riding time Eta'4, the delivery rider reaches cell 3 and delivers takeaway task objects C1-C3 in sequence, which takes time Ets'3.
As shown in FIG. 5, the difference between the times at which cell 3 is reached in the takeaway task trajectory 51 and in the takeaway task simulation trajectory 52 is Err2. The data quality of the historical task data of each group in the task batch can be evaluated by determining whether the time difference Err2 is within a preset time range. If so, the historical task data in the third group is valid data; otherwise, the historical task data in the third group is invalid data.
In this embodiment, historical tasks with dependency relationships (such as tasks picked up together, delivered together, or both picked up and delivered together) are grouped, and the task trajectory is simulated for each group, so that the simulated trajectory substantially corresponds to the actual trajectory of the historical tasks. The simulation time information of each task node of the historical tasks can therefore be simulated more accurately. Data quality evaluation is then performed on the historical task data according to the actual processing time information and the simulation time information of the historical tasks, which improves the accuracy of the data quality evaluation.
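A per-group trajectory simulation of this kind can be sketched as follows. The source does not specify how riding and delivery times are simulated, so the constant riding speed, the fixed handover time per task object, and the names `Leg` and `simulate_group` are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Leg:
    ride_distance_m: float  # distance ridden to reach this stop
    deliveries: int         # task objects handed over at this stop (0 for a pickup stop)


def simulate_group(legs: List[Leg],
                   speed_m_per_s: float = 5.0,
                   handover_s: float = 120.0) -> List[float]:
    """Return the simulated arrival time (seconds from the start) at each stop."""
    clock = 0.0
    arrivals: List[float] = []
    for leg in legs:
        clock += leg.ride_distance_m / speed_m_per_s  # simulated riding time (Eta')
        arrivals.append(clock)
        clock += leg.deliveries * handover_s          # simulated delivery time (Ets')
    return arrivals
```

For a route shaped like the one in FIG. 4 (ride to cell 1 and deliver two objects, ride to a second pickup, ride to cell 2 and deliver two objects), the returned arrival times are the values that would be compared with the actual arrival times.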
According to the embodiment of the invention, the historical tasks are packed into groups based on the acquired historical task data, and the quality of the historical task data in each group is evaluated to obtain data quality information of the historical task data. When a plurality of consecutive historical tasks in a task batch meet a first condition, the historical tasks completed between two consecutive first task processing actions in the task batch are packed into a group. When a plurality of consecutive historical tasks in a task batch meet a second condition, the historical tasks for which the second task processing action is completed consecutively in the task batch are packed into a group. In this way, the accuracy of the data quality evaluation can be improved.
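Grouping the tasks completed between two consecutive first task processing actions can be sketched as follows. This reflects one literal reading of the wording above, taking the first task processing action to be a pickup; the event representation and the name `group_between_pickups` are hypothetical.

```python
from typing import List, Tuple

Event = Tuple[str, str]  # (action, identifier); action is "pickup" or "deliver" (assumption)


def group_between_pickups(events: List[Event]) -> List[List[str]]:
    """Split a batch's ordered events at each pickup and collect the deliveries in between."""
    groups: List[List[str]] = []
    current: List[str] = []
    for action, ident in events:
        if action == "pickup":
            if current:
                groups.append(current)  # close the group that ended at this pickup
            current = []
        elif action == "deliver":
            current.append(ident)
    if current:
        groups.append(current)
    return groups
```

The events are expected in the order in which the task processing resource performed them within the batch.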
In an optional implementation manner, the data processing method of this embodiment further includes: performing data cleaning on the historical task data according to the data quality information to obtain valid data.
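The cleaning step can be sketched as a filter over the groups. The dictionary layout and the name `clean_history` are assumptions for illustration; the data quality information is modelled here as a per-group boolean.

```python
from typing import Dict, List


def clean_history(groups: Dict[str, List[dict]],
                  quality: Dict[str, bool]) -> List[dict]:
    """Keep only the records that belong to groups judged valid."""
    valid: List[dict] = []
    for group_id, records in groups.items():
        # Groups without a quality verdict are treated as invalid (assumption).
        if quality.get(group_id, False):
            valid.extend(records)
    return valid
```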
In an optional implementation manner, the data processing method of this embodiment further includes:
training a corresponding time prediction model according to the valid data, and predicting the completion time of a currently processed task according to the trained time prediction model.
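Training and using a time prediction model on the cleaned data can be sketched as follows. The source does not name a model type, so the one-variable least-squares fit of duration against distance, and the names `fit_time_model` and `predict_duration`, are assumptions used purely as a stand-in.

```python
from typing import List, Tuple


def fit_time_model(distances_m: List[float],
                   durations_s: List[float]) -> Tuple[float, float]:
    """Fit duration = slope * distance + intercept by ordinary least squares."""
    n = len(distances_m)
    mean_x = sum(distances_m) / n
    mean_y = sum(durations_s) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_m)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(distances_m, durations_s))
    slope = sxy / sxx  # requires at least two distinct distances
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_duration(model: Tuple[float, float], distance_m: float) -> float:
    """Predict the completion time of a current task from its distance."""
    slope, intercept = model
    return slope * distance_m + intercept
```

Because invalid groups are removed before fitting, outliers caused by incorrect timestamps would not distort the fitted parameters.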
In this embodiment, historical tasks having a dependency relationship are grouped and task trajectory simulation is performed on each group, so that the quality of the historical task data is evaluated according to the simulation time information and the actual processing time information of the historical tasks, and the data that the data quality information identifies as invalid is cleaned out to obtain valid data.
Fig. 6 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention. As shown in fig. 6, the data processing apparatus 6 of the present embodiment includes a receiving unit 61, a data acquisition unit 62, a packing and grouping unit 63, and a quality evaluation unit 64.
The receiving unit 61 is configured to receive a data instruction from the server. The data acquisition unit 62 is configured to parse the data instruction through at least one processor and acquire historical task data, where the historical task data includes the task processing resource corresponding to each historical task and the start address, arrival address, and processing time information of the historical task. The packing and grouping unit 63 is configured to, for a task batch of a task processing resource, pack into a group the historical tasks completed between two consecutive first task processing actions in the task batch in response to a plurality of consecutive historical tasks in the task batch satisfying a preset first condition, and pack into a group the historical tasks for which second task processing actions are completed consecutively in the task batch in response to a plurality of consecutive historical tasks in the task batch satisfying a preset second condition. The quality evaluation unit 64 is configured to perform, by at least one processor, quality evaluation on the historical task data in each group to obtain data quality information.
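The fields of the historical task data and the parsing step of the data acquisition unit can be sketched as follows. The source does not describe the format of the data instruction, so the JSON payload with a "tasks" array, the field names, and the function `parse_data_instruction` are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class HistoricalTask:
    resource_id: str        # task processing resource, e.g. a delivery rider
    start_address: str      # e.g. a meal-taking address
    arrival_address: str    # e.g. a cell
    pickup_time_s: float    # processing time information (assumed representation)
    delivery_time_s: float


def parse_data_instruction(payload: str) -> List[HistoricalTask]:
    """Parse a hypothetical JSON data instruction into historical task records."""
    records = json.loads(payload)["tasks"]
    return [HistoricalTask(**record) for record in records]
```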
In this embodiment, the historical tasks are packed and grouped through the acquired historical task data, and the quality of the historical task data in each group is evaluated to acquire data quality information of the historical task data, wherein when a plurality of consecutive historical tasks in a task batch meet a first condition, the historical tasks completed between two consecutive first task processing actions in the task batch are packed and grouped, and when a plurality of consecutive historical tasks in a task batch meet a second condition, the historical tasks continuously completing a second task processing action in the task batch are packed and grouped, so that the accuracy of data quality evaluation can be improved.
Fig. 7 is a schematic diagram of an electronic device of an embodiment of the invention. In the present embodiment, the electronic device 7 may be a server, a terminal, or the like. As shown in fig. 7, the electronic device 7 includes: at least one processor 71; a memory 72 communicatively coupled to the at least one processor 71; and a communication component 73 communicatively coupled to the scanning device, the communication component 73 receiving and transmitting data under control of the processor 71; wherein the memory 72 stores instructions executable by the at least one processor 71, the instructions being executed by the at least one processor 71 to perform the steps of:
receiving a data instruction from a server;
analyzing the data instruction through at least one processor to obtain historical task data, wherein the historical task data comprises task processing resources corresponding to the historical task, a starting address, an arrival address and processing time information of the historical task;
for a task batch of a task processing resource, in response to a plurality of continuous historical tasks in the task batch meeting a preset first condition, packaging and grouping the historical tasks completed between two continuous first task processing actions in the task batch, and in response to a plurality of continuous historical tasks in the task batch meeting a preset second condition, packaging and grouping the historical tasks continuously completed with second task processing actions in the task batch;
performing, by at least one processor, a quality assessment of historical task data in the group to obtain data quality information.
Optionally, the plurality of consecutive historical tasks in the task batch meeting the preset first condition specifically means that:
the distance between the arrival addresses of any two of the consecutive plurality of historical tasks is not within a preset distance.
Optionally, the plurality of consecutive historical tasks in the task batch meeting the preset second condition specifically means that:
the distance between the arrival addresses of the continuous plurality of historical tasks is within a preset distance.
Optionally, the performing, by at least one processor, quality evaluation on the historical task data in the group to obtain data quality information includes:
simulating the processing time of the historical tasks in the group according to the historical task data in the group to acquire the simulation time information of the historical tasks in the group;
and performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to acquire the data quality information.
Optionally, the performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to obtain the data quality information includes:
and judging the historical task data in the group as valid data in response to the error of the processing time information and the simulation time information of the historical tasks in the group being within a preset time range.
Optionally, the instructions are further executable by the at least one processor 71 to perform the steps of:
performing data cleaning on the historical task data according to the data quality information to obtain valid data.
Optionally, the instructions are further executable by the at least one processor 71 to perform the steps of:
training a corresponding time prediction model according to the valid data;
and predicting the completion time of the currently processed task according to the trained time prediction model.
Specifically, the electronic device includes: one or more processors 71 and a memory 72, one processor 71 being exemplified in fig. 7. The processor 71 and the memory 72 may be connected by a bus or other means, and fig. 7 illustrates the connection by a bus as an example. Memory 72, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The processor 71 executes various functional applications of the device and data processing, i.e., implements the above-described data processing method, by executing nonvolatile software programs, instructions, and modules stored in the memory 72.
The memory 72 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store a list of options, etc. Further, the memory 72 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 72 may optionally include memory located remotely from the processor 71, which may be connected to an external device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 72 and, when executed by the one or more processors 71, perform the data processing method in any of the method embodiments described above.
The product can execute the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.
In this embodiment, the historical tasks are packed and grouped through the acquired historical task data, and the quality of the historical task data in each group is evaluated to acquire data quality information of the historical task data, wherein when a plurality of consecutive historical tasks in a task batch meet a first condition, the historical tasks completed between two consecutive first task processing actions in the task batch are packed and grouped, and when a plurality of consecutive historical tasks in a task batch meet a second condition, the historical tasks continuously completing a second task processing action in the task batch are packed and grouped, so that the accuracy of data quality evaluation can be improved.
Another embodiment of the invention is directed to a non-transitory storage medium storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.
That is, as those skilled in the art will understand, all or part of the steps of the methods in the embodiments described above may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions that enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
The embodiment of the invention discloses A1, a data processing method, wherein the method comprises the following steps:
receiving a data instruction from a server;
analyzing the data instruction through at least one processor to obtain historical task data, wherein the historical task data comprises task processing resources corresponding to the historical task, a starting address, an arrival address and processing time information of the historical task;
for a task batch of a task processing resource, in response to a plurality of continuous historical tasks in the task batch meeting a preset first condition, packaging and grouping the historical tasks completed between two continuous first task processing actions in the task batch, and in response to a plurality of continuous historical tasks in the task batch meeting a preset second condition, packaging and grouping the historical tasks continuously completed with second task processing actions in the task batch;
performing, by at least one processor, a quality assessment of historical task data in the group to obtain data quality information.
A2, the method according to A1, wherein the meeting of the plurality of consecutive historical tasks in the task batch with the preset first condition is specifically:
the distance between the arrival addresses of any two of the consecutive plurality of historical tasks is not within a preset distance.
A3, the method according to A1, wherein the continuous historical tasks in the task batch meet a second predetermined condition, specifically:
the distance between the arrival addresses of the continuous plurality of historical tasks is within a preset distance.
A4, the method of A1, wherein the quality assessment, by the at least one processor, of historical task data in the grouping to obtain data quality information comprises:
simulating the processing time of the historical tasks in the group according to the historical task data in the group to acquire the simulation time information of the historical tasks in the group;
and performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to acquire the data quality information.
A5, the method according to A4, wherein the quality evaluation of the historical task data in the group according to the processing time information and the corresponding simulation time information of the historical tasks in the group to obtain the data quality information comprises:
and judging the historical task data in the group as valid data in response to the error of the processing time information and the simulation time information of the historical tasks in the group being within a preset time range.
A6, the method according to A1, wherein the method further comprises:
and performing data cleaning on the historical task data according to the data quality information to obtain effective data.
A7, the method according to A6, wherein the method further comprises:
training a corresponding time prediction model according to the effective data;
and predicting the completion time of the currently processed task according to the trained time prediction model.
The embodiment of the invention also discloses B1, a data processing device, wherein the device comprises:
a receiving unit configured to receive a data instruction from a server;
the data acquisition unit is configured to analyze the data instruction through at least one processor and acquire historical task data, wherein the historical task data comprises task processing resources corresponding to the historical tasks, starting addresses, arrival addresses and processing time information of the historical tasks;
the packaging and grouping unit is configured to package and group the history tasks completed between two continuous first task processing actions in a task batch in response to a plurality of continuous history tasks in the task batch meeting a preset first condition, and package and group the history tasks completed with second task processing actions in the task batch in response to a plurality of continuous history tasks in the task batch meeting a preset second condition;
a quality evaluation unit configured to perform quality evaluation on the historical task data in the group by at least one processor to obtain data quality information.
The embodiment of the present invention also discloses C1, an electronic device, including a memory and a processor, where the memory is used to store one or more computer program instructions, where the one or more computer program instructions are executed by the processor to implement the following steps:
receiving a data instruction from a server;
analyzing the data instruction through at least one processor to obtain historical task data, wherein the historical task data comprises task processing resources corresponding to the historical task, a starting address, an arrival address and processing time information of the historical task;
for a task batch of a task processing resource, in response to a plurality of continuous historical tasks in the task batch meeting a preset first condition, packaging and grouping the historical tasks completed between two continuous first task processing actions in the task batch, and in response to a plurality of continuous historical tasks in the task batch meeting a preset second condition, packaging and grouping the historical tasks continuously completed with second task processing actions in the task batch;
performing, by at least one processor, a quality assessment of historical task data in the group to obtain data quality information.
C2, the electronic device according to C1, wherein the meeting of the plurality of consecutive historical tasks in the task batch with the preset first condition is specifically:
the distance between the arrival addresses of any two of the consecutive plurality of historical tasks is not within a preset distance.
C3, the electronic device according to C1, wherein the fact that the plurality of consecutive historical tasks in the task batch meet the predetermined second condition is specifically that:
the distance between the arrival addresses of the continuous plurality of historical tasks is within a preset distance.
C4, the electronic device of C1, wherein the quality assessment, by the at least one processor, of the historical task data in the group to obtain data quality information comprises:
simulating the processing time of the historical tasks in the group according to the historical task data in the group to acquire the simulation time information of the historical tasks in the group;
and performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to acquire the data quality information.
C5, the electronic device according to C4, wherein the quality evaluation of the historical task data in the group according to the processing time information and the corresponding simulation time information of the historical tasks in the group to obtain the data quality information comprises:
and judging the historical task data in the group as valid data in response to the error of the processing time information and the simulation time information of the historical tasks in the group being within a preset time range.
C6, the electronic device of C1, wherein the steps further comprise:
and performing data cleaning on the historical task data according to the data quality information to obtain effective data.
C7, the electronic device of C6, wherein the steps further comprise:
training a corresponding time prediction model according to the effective data;
and predicting the completion time of the currently processed task according to the trained time prediction model.
The embodiment of the invention also discloses D1, a computer readable storage medium, and computer program instructions stored thereon, wherein the computer program instructions realize the method according to any one of A1-A7 when being executed by a processor.

Claims (10)

1. A method of data processing, the method comprising:
receiving a data instruction from a server;
analyzing the data instruction through at least one processor to obtain historical task data, wherein the historical task data comprises task processing resources corresponding to the historical task, a starting address, an arrival address and processing time information of the historical task;
for a task batch of a task processing resource, in response to a plurality of continuous historical tasks in the task batch meeting a preset first condition, packaging and grouping the historical tasks completed between two continuous first task processing actions in the task batch, and in response to a plurality of continuous historical tasks in the task batch meeting a preset second condition, packaging and grouping the historical tasks continuously completed with second task processing actions in the task batch;
performing, by at least one processor, a quality assessment of historical task data in the group to obtain data quality information.
2. The method according to claim 1, wherein the meeting of the preset first condition by the plurality of consecutive historical tasks in the task batch is specifically:
the distance between the arrival addresses of any two of the consecutive plurality of historical tasks is not within a preset distance.
3. The method according to claim 1, wherein the meeting of the predetermined second condition by the consecutive plurality of historical tasks in the task batch is specifically:
the distance between the arrival addresses of the continuous plurality of historical tasks is within a preset distance.
4. The method of claim 1, wherein the quality assessment of the historical task data in the group by the at least one processor to obtain data quality information comprises:
simulating the processing time of the historical tasks in the group according to the historical task data in the group to acquire the simulation time information of the historical tasks in the group;
and performing quality evaluation on the historical task data in the group according to the processing time information of the historical tasks in the group and the corresponding simulation time information to acquire the data quality information.
5. The method of claim 4, wherein the performing quality assessment on the historical task data in the group according to the processing time information and the corresponding simulation time information of the historical tasks in the group to obtain the data quality information comprises:
and judging the historical task data in the group as valid data in response to the error of the processing time information and the simulation time information of the historical tasks in the group being within a preset time range.
6. The method of claim 1, further comprising:
and performing data cleaning on the historical task data according to the data quality information to obtain effective data.
7. The method of claim 6, further comprising:
training a corresponding time prediction model according to the effective data;
and predicting the completion time of the currently processed task according to the trained time prediction model.
8. A data processing apparatus, characterized in that the apparatus comprises:
a receiving unit configured to receive a data instruction from a server;
the data acquisition unit is configured to analyze the data instruction through at least one processor and acquire historical task data, wherein the historical task data comprises task processing resources corresponding to the historical tasks, starting addresses, arrival addresses and processing time information of the historical tasks;
the packaging and grouping unit is configured to package and group the history tasks completed between two continuous first task processing actions in a task batch in response to a plurality of continuous history tasks in the task batch meeting a preset first condition, and package and group the history tasks completed with second task processing actions in the task batch in response to a plurality of continuous history tasks in the task batch meeting a preset second condition;
a quality evaluation unit configured to perform quality evaluation on the historical task data in the group by at least one processor to obtain data quality information.
9. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to perform the steps of:
receiving a data instruction from a server;
analyzing the data instruction through at least one processor to obtain historical task data, wherein the historical task data comprises task processing resources corresponding to the historical task, a starting address, an arrival address and processing time information of the historical task;
for a task batch of a task processing resource, in response to a plurality of continuous historical tasks in the task batch meeting a preset first condition, packaging and grouping the historical tasks completed between two continuous first task processing actions in the task batch, and in response to a plurality of continuous historical tasks in the task batch meeting a preset second condition, packaging and grouping the historical tasks continuously completed with second task processing actions in the task batch;
performing, by at least one processor, a quality assessment of historical task data in the group to obtain data quality information.
10. A computer-readable storage medium on which computer program instructions are stored, which, when executed by a processor, implement the method of any one of claims 1-7.
CN201911008425.6A 2019-10-22 2019-10-22 Data processing method and device, electronic equipment and computer readable storage medium Pending CN110796408A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911008425.6A CN110796408A (en) 2019-10-22 2019-10-22 Data processing method and device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN110796408A true CN110796408A (en) 2020-02-14

Family

ID=69440966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911008425.6A Pending CN110796408A (en) 2019-10-22 2019-10-22 Data processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110796408A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111460283A (en) * 2020-03-06 2020-07-28 拉扎斯网络科技(上海)有限公司 Information processing method, information processing device, electronic equipment and computer readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101593219A (en) * 2008-05-30 2009-12-02 国际商业机器公司 Dynamically switch the emulation mode and the emulator of simulation model
US8078485B1 (en) * 2008-05-29 2011-12-13 Accenture Global Services Limited Postal, freight, and logistics industry high performance capability assessment
CN107094165A (en) * 2016-08-31 2017-08-25 阿里巴巴集团控股有限公司 Distribution capacity is determined, dispatching task obtains, dispenses resource regulating method and equipment
CN108364146A (en) * 2017-01-26 2018-08-03 北京小度信息科技有限公司 Logistics distribution emulation mode and device
CN110119847A (en) * 2019-05-14 2019-08-13 拉扎斯网络科技(上海)有限公司 A kind of prediction technique, device, storage medium and electronic equipment dispensing duration

Similar Documents

Publication Publication Date Title
CN109034660B (en) Method and related device for determining risk control strategy based on prediction model
CN106294614A (en) Method and apparatus for access service
CN110765615B (en) Logistics simulation method, device and equipment
CN109829667B (en) Logistics node parcel prediction method and device
CN105160027B (en) Advertisement data processing method and device
CN105719162B (en) Method and device for monitoring validity of promotion link
CN104813143A (en) New road detection logic
CN107832329A (en) Page resource acquisition method and terminal device
CN107689968A (en) Task processing system, method and device
CN106155806A (en) Multi-task scheduling method and server
CN108696399A (en) Test method and device for business service
CN110796408A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN110378529B (en) Data generation method and device, readable storage medium and electronic equipment
CN106656675B (en) Detection method and device for transmission node cluster
US7839790B2 (en) Network congestion analysis
CN114239977A (en) Method, device, equipment and storage medium for determining estimated delivery time length
CN111582407B (en) Task processing method and device, readable storage medium and electronic equipment
CN112383607B (en) Logistics service verification method and device based on block chain and electronic equipment
CN113128933A (en) Anomaly locating method and device
JP6437368B2 (en) Operation information distribution device
CN112884481A (en) Block chain payment processing method based on cloud computing and big data service center
CN112036702A (en) Data processing method and device, readable storage medium and electronic equipment
CN109309717A (en) Data transmission method, device, electronic equipment and storage medium
CN113029131B (en) Building entrance direction determining method and device
CN106487857B (en) Route display method, server and client in electronic commerce application

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200214
