CN114399344B

CN114399344B - Data processing method and data processing device

Info

Publication number: CN114399344B
Application number: CN202210295787.3A
Authority: CN
Inventors: 曹绍升
Original assignee: Beijing Qisheng Technology Co Ltd
Current assignee: Xiamen Qiwen Technology Co ltd; Beijing Qisheng Technology Co Ltd
Priority date: 2022-03-24
Filing date: 2022-03-24
Publication date: 2022-07-08
Anticipated expiration: 2042-03-24
Also published as: CN114399344A

Abstract

The embodiments of the invention disclose a data processing method and a data processing device. A task transfer amount sequence, formed from the task transfer amounts of the sub-areas of a predetermined geographic area (the target area) in a first time period, is acquired, and the target transfer amount prediction model corresponding to the target area is acquired from a plurality of transfer amount prediction models. The task transfer amount sequence is then used as the input of the target transfer amount prediction model to obtain the task transfer amounts of the predetermined geographic area in a second time period, which serve as data on the number of tasks demanded by the user group. By using transfer amount prediction models that correspond to different geographic areas, the embodiments of the invention can accurately predict how tasks transfer among the sub-areas of different geographic areas, and thereby estimate user demand for goods.

Description

Data processing method and data processing device

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a data processing method and a data processing apparatus.

Background

With the continuous development of Internet technology, users in daily life can purchase desired goods online, book an online ride-hailing car, unlock a shared bicycle, and so on. Existing processes such as online shopping, online ride-hailing, and shared bicycle use are all carried out in the form of tasks, and executing a task usually involves moving goods, vehicles, or other articles from one position to another, so the position transfer of a task is also the position transfer of the corresponding article. Articles are generally transferred among different areas at a high frequency, which makes it difficult for the prior art to accurately estimate the transfer pattern of tasks.

Disclosure of Invention

In view of the above, an object of the embodiments of the present invention is to provide a data processing method and a data processing apparatus for predicting task transfer situations between different areas to estimate user demand situations for goods.

According to a first aspect of embodiments of the present invention, there is provided a data processing method, the method including:

acquiring a task transfer quantity sequence of a target area in a first time period, wherein the task transfer quantity sequence comprises task transfer quantity data generated by each sub-area in the target area in a plurality of sub-time periods of the first time period, the task transfer quantity data represents the number of tasks of which the starting positions are located in target sub-areas and the ending positions are located in non-target sub-areas, and the target area is a preset geographical area;

acquiring a target transfer amount prediction model corresponding to the target area from a plurality of transfer amount prediction models;

and acquiring a task transfer quantity matrix of the target region in a second time period according to the task transfer quantity sequence based on the target transfer quantity prediction model, wherein the task transfer quantity matrix comprises task transfer quantity data of each sub-region in the second time period.

Preferably, the method further comprises:

and determining object transfer data of each sub-area in the target area in the second time period according to the task transfer quantity matrix, wherein the object transfer data represent the number of objects corresponding to the tasks with the starting positions located in the target sub-areas and the ending positions located in the non-target sub-areas.

Preferably, the target transfer amount prediction model is obtained by training according to a first sample set and a second sample set. Each first sample in the first sample set includes a first historical task transfer amount sequence of the corresponding geographic area in a first historical time period and a first historical task transfer amount matrix in a second historical time period; the first historical task transfer amount sequence includes first historical task transfer amount data generated by each sub-area in the corresponding geographic area in a plurality of sub-historical time periods of the first historical time period; the first historical task transfer amount matrix includes first historical task transfer amount data of each sub-area in the corresponding geographic area in the second historical time period; and the first historical task transfer amount data represents the number of historical tasks with starting positions located in predetermined sub-areas and ending positions located in non-predetermined sub-areas. Each second sample in the second sample set includes a second historical task transfer amount sequence of the target area in a first historical time period and a second historical task transfer amount matrix in a second historical time period; the second historical task transfer amount sequence includes second historical task transfer amount data generated by each sub-area in the target area in a plurality of sub-historical time periods of the first historical time period; the second historical task transfer amount matrix includes second historical task transfer amount data of each sub-area in the target area in the second historical time period; and the second historical task transfer amount data represents the number of historical tasks with starting positions located in the target sub-area and ending positions located in the non-target sub-area.

Preferably, the first sample and the second sample are determined by:

acquiring the first historical task transfer amount data and the second historical task transfer amount data;

detecting abnormal data in the first historical task transfer amount data and the second historical task transfer amount data according to a preset anomaly detection method;

determining a corresponding first historical task transfer amount sequence and a first historical task transfer amount matrix according to the first historical task transfer amount data after abnormal data are removed;

and determining the corresponding second historical task transfer amount sequence and the corresponding second historical task transfer amount matrix according to the second historical task transfer amount data after abnormal data are removed.

Preferably, the target transfer amount prediction model is obtained by training as follows:

taking each first historical task transfer quantity sequence as input, and training an initial model by taking the corresponding first historical task transfer quantity matrix as a target to obtain a first model;

and taking each second historical task transfer quantity sequence as input, and training the first model by taking the corresponding second historical task transfer quantity matrix as a target to obtain the transfer quantity prediction model.

Preferably, the transfer amount prediction model is a recurrent neural network or a convolutional neural network.

According to a second aspect of embodiments of the present invention, there is provided a data processing apparatus, the apparatus comprising:

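a unit configured to acquire a task transfer amount sequence of a target area in a first time period, wherein the task transfer amount sequence comprises task transfer amount data generated by each sub-area in the target area in a plurality of sub-time periods of the first time period, the task transfer amount data represents the number of tasks with starting positions located in target sub-areas and ending positions located in non-target sub-areas, and the target area is a preset geographic area;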

a model acquisition unit configured to acquire a target transfer amount prediction model corresponding to the target region from a plurality of transfer amount prediction models;

and the first quantity prediction unit is used for acquiring a task transfer quantity matrix of the target region in a second time period according to the task transfer quantity sequence based on the target transfer quantity prediction model, and the task transfer quantity matrix comprises task transfer quantity data of each sub-region in the second time period.

According to a third aspect of embodiments of the present invention, there is provided a computer readable storage medium having stored thereon computer program instructions, wherein the computer program instructions, when executed by a processor, implement the method of any of the first aspects.

According to a fourth aspect of embodiments of the present invention, there is provided an electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method according to any one of the first aspect.

According to a fifth aspect of embodiments of the present invention, there is provided a computer program product comprising computer programs/instructions, characterized in that the computer programs/instructions are executed by a processor to implement the method according to any one of the first aspect.

According to the embodiments of the invention, a task transfer amount sequence, formed from the task transfer amounts of the sub-areas of a predetermined geographic area (the target area) in a first time period, is acquired, and the target transfer amount prediction model corresponding to the target area is acquired from a plurality of transfer amount prediction models. The task transfer amount sequence is then used as the input of the target transfer amount prediction model to obtain the task transfer amounts of the predetermined geographic area in a second time period, which serve as data on the number of tasks demanded by the user group. By using transfer amount prediction models that correspond to different geographic areas, the embodiments of the invention can accurately predict how tasks transfer among the sub-areas of different geographic areas, and thereby estimate user demand for goods.

Drawings

The above and other objects, features and advantages of the present invention will become more apparent from the following description of the embodiments of the present invention with reference to the accompanying drawings, in which:

FIG. 1 is a schematic diagram of a hardware system architecture of an embodiment of the present invention;

FIG. 2 is a flow chart of a data processing method of the first embodiment of the present invention;

FIG. 3 is a schematic illustration of a subsequence of the target area at sub-time period T1 of an embodiment of the present invention;

FIG. 4 is a schematic diagram of a data processing apparatus according to a second embodiment of the present invention;

FIG. 5 is a schematic view of an electronic device according to a third embodiment of the present invention.

Detailed Description

The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.

Further, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.

Unless the context clearly requires otherwise, throughout the description, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".

In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.

In the embodiments of the present invention, the description takes as an example a task that is a rental order for a shared bicycle. Those skilled in the art will readily appreciate that the method of the present embodiment is equally applicable to other tasks, such as an online ride-hailing order, a vehicle pick-up order, a shipping task, etc.

In daily life, a user can purchase desired goods online, book an online ride-hailing car, unlock a shared bicycle, and so on. Taking shared bicycles as an example, a shared bicycle platform is an online rental platform that provides bicycle sharing services to users in public service areas such as campuses, subway stations, bus stations, and residential areas. As the number of users of shared bicycles continues to grow, shared bicycles are used more and more frequently. A user can place a rental order on the shared bicycle platform by unlocking a shared bicycle. After a ride, the user often does not return to the original starting location (for example, the user rides from home to a subway station), so the position of the shared bicycle when the rental order is settled usually differs from its position when the rental order started. The position of the shared bicycle is also the position of the rental order, and the rising frequency of use makes shared bicycles move among different areas more often, so it is difficult for the prior art to accurately estimate the transfer pattern of rental orders.

FIG. 1 is a diagram of a hardware system architecture of an embodiment of the present invention. In an application scenario of shared bicycle rental, the hardware system architecture shown in FIG. 1 includes at least one user terminal 11, at least one shared bicycle 12, and at least one platform-side server (hereinafter also referred to as the server) 13; FIG. 1 takes one user terminal 11, one shared bicycle 12, and one server 13 as an example. The user terminal 11, the shared bicycle 12, and the server 13 may be communicatively connected through a network. The user may scan the two-dimensional code on the shared bicycle 12 through a shared bicycle rental client (e.g., an app, an applet, etc.) installed on the user terminal 11 to unlock and use the shared bicycle 12. While scanning the two-dimensional code of the shared bicycle 12, the user terminal 11 may send a rental order generation request for the shared bicycle 12 to the shared bicycle platform, so that the server 13 can obtain the request and generate a rental order for the shared bicycle 12. The shared bicycle 12 is equipped with a positioning device, so position information can be acquired throughout the user's ride and reported to the server 13, allowing the server 13 to record at least the position of the shared bicycle 12 when it is unlocked and when it is locked.

In the embodiment of the present invention, the server 13 may obtain a task transfer amount sequence of a predetermined geographic area (i.e., a target area) in a first time period, and obtain a target transfer amount prediction model corresponding to the target area. The server 13 then obtains the task transfer amount matrix of the target area in the second time period according to the task transfer amount sequence, based on the target transfer amount prediction model.

The task transfer amount sequence comprises task transfer amount data generated by each sub-region in the target region in a plurality of sub-time periods of the first time period, and the task transfer amount data represents the number of tasks whose starting positions are located in target sub-regions of the target region and whose ending positions are located in non-target sub-regions. The task transfer amount matrix comprises task transfer amount data of each sub-region in the target region in the second time period.

In an optional implementation of the embodiment of the present invention, the server 13 may further determine, according to the task transfer amount matrix, object transfer data of each sub-area in the target area in the second time period. The object transfer data represents the number of objects corresponding to the tasks whose starting positions are located in the target sub-regions and whose ending positions are located in the non-target sub-regions.

The data processing method according to the embodiment of the present invention is described below with reference to method embodiments. Fig. 2 is a flowchart of a data processing method according to a first embodiment of the present invention. As shown in fig. 2, the method of the present embodiment includes the following steps:

Step S100, acquiring a task transfer amount sequence of the target area in a first time period.

In the present embodiment, the target area may be an administrative area, such as a province, city, district, or county. In this step, the server may determine a predetermined geographic area as the target area, divide the target area at equal intervals on the map plane, and number the resulting cells to determine the plurality of sub-areas corresponding to the target area. After the sub-areas are determined, the server may obtain the task transfer amount data generated by each sub-area in each sub-time period of the first time period, and convert the task transfer amount data generated by the sub-areas in the same sub-time period into one subsequence, so as to determine the task transfer amount sequence corresponding to the target area from the subsequences of the first time period. Specifically, in the present embodiment, all sub-time periods in the first time period have the same length, which may be, for example, 1 hour, 2 hours, 1 day, or 1 week.

It is easy to understand that, in this embodiment, the sub-time periods may be consecutive or non-consecutive. For example, the first time period may be 0:00-24:00 on February 20, 2022, in which case it may include 24 sub-time periods such as 0:00-1:00, 1:00-2:00, and 2:00-3:00; or the first time period may be 15:00-16:00 on each day from February 21, 2022 to February 27, 2022, in which case it may include 7 sub-time periods such as 15:00-16:00 on February 21, 2022 and 15:00-16:00 on February 22, 2022.

In this embodiment, the target region may be divided into M × N sub-regions of the same size that do not overlap one another (where M and N are predetermined integers greater than 1), so the target region is generally a square region. Alternatively, the target region may have another shape, for example a polygon, the sub-regions may differ in size, and the sub-regions may also have other shapes. The division interval may be set according to actual requirements, and may be determined according to the length and width of the target region, a predetermined division accuracy (for example, dividing the target region into sub-regions of 3 km × 3 km), and the like.
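
The equal-interval division and numbering described above can be illustrated with a minimal sketch. It assumes planar coordinates in kilometres, an origin at one corner of the target region, square cells, and row-by-row numbering starting at 1; the function name `subregion_index` and all parameter names are hypothetical and are not taken from this description.

```python
from typing import Tuple


def subregion_index(x: float, y: float,
                    origin: Tuple[float, float],
                    cell_size: float,
                    m_rows: int, n_cols: int) -> int:
    """Return the 1-based number of the grid cell that contains (x, y).

    Assumption: cells are numbered row by row, starting at the origin corner.
    Raises ValueError when the position lies outside the M x N grid.
    """
    col = int((x - origin[0]) // cell_size)
    row = int((y - origin[1]) // cell_size)
    if not (0 <= row < m_rows and 0 <= col < n_cols):
        raise ValueError("position lies outside the target area")
    return row * n_cols + col + 1


# A 2 x 3 grid of 3 km x 3 km cells: the point (7.0, 4.0) falls in cell 6.
cell = subregion_index(7.0, 4.0, origin=(0.0, 0.0), cell_size=3.0,
                       m_rows=2, n_cols=3)
```

A real deployment would first project latitude and longitude onto the map plane; that step is omitted here.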

The task transfer amount data characterizes the number of tasks whose starting positions are located in target sub-areas of the target area and whose ending positions are located in non-target sub-areas. Each sub-region within the target region may be determined to be a target sub-region, and may also be determined to be a non-target sub-region. It is easy to understand that the target sub-region and the non-target sub-region of the present embodiment may be the same sub-region at the same time.

For example, the target region includes a region a1, a region a2, a region a3, a region a4, and a region a5. When the target sub-region is the region a1 within the target region, the task transfer amount data corresponding to the region a1 includes the numbers of tasks whose start positions are in the region a1 and whose end positions are in the region a1, the region a2, the region a3, the region a4, and the region a5, respectively.

FIG. 3 is a schematic diagram of a subsequence of the target area at sub-time period T1 of an embodiment of the invention. FIG. 3 takes as an example a target area divided into 6 sub-areas, namely, an area a1, an area a2, an area a3, an area a4, an area a5, and an area a6. As shown in FIG. 3, m(i, j) is the number of tasks in the sub-time period T1 with the start position in the area ai and the end position in the area aj, where i and j are each integers greater than or equal to 1 and less than or equal to 6.
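
As an illustration of how the entries m(i, j) could be tallied for one sub-time period, the following sketch counts tasks by their start and end sub-areas. It assumes each task has already been reduced to a pair of 1-based sub-area numbers; the function name `transfer_matrix` and the sample task list are hypothetical.

```python
from typing import Iterable, List, Tuple


def transfer_matrix(tasks: Iterable[Tuple[int, int]],
                    n_sub: int) -> List[List[int]]:
    """Count tasks per (start sub-area, end sub-area) pair.

    tasks: (start, end) pairs using 1-based sub-area numbers.
    Returns an n_sub x n_sub matrix where entry [i-1][j-1] is m(i, j).
    """
    m = [[0] * n_sub for _ in range(n_sub)]
    for start, end in tasks:
        m[start - 1][end - 1] += 1
    return m


# Five hypothetical tasks observed in sub-time period T1 over 6 sub-areas.
example_tasks = [(1, 2), (1, 2), (1, 1), (3, 6), (6, 3)]
m_t1 = transfer_matrix(example_tasks, n_sub=6)
```

The diagonal entry m(1, 1) is kept because the target sub-area and the non-target sub-area may be the same sub-area.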

For example, the target area is area A1, the first time period is 0:00-24:00 on February 20, 2022, and every hour in the first time period is a sub-time period, that is, the first time period may include 24 sub-time periods such as 0:00-1:00, 1:00-2:00, and 2:00-3:00. The server may divide area A1 at equal intervals on the map plane to obtain four sub-areas corresponding to area A1, namely sub-area a1, sub-area a2, sub-area a3, and sub-area a4. Then, taking the sub-time period 9:00-10:00 as an example, the server may obtain the task transfer amount data generated by sub-area a1, sub-area a2, sub-area a3, and sub-area a4 within 9:00-10:00, and determine the task transfer amount data generated by these sub-areas within the same sub-time period as one subsequence. The server then determines the task transfer amount sequence of area A1 for 0:00-24:00 on February 20, 2022 by arranging the 24 subsequences in the order of their sub-time periods, that is, from 0:00-1:00 through 23:00-24:00.
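
The example above can be sketched as a function that buckets timestamped tasks into hourly sub-time periods and returns the matrices in chronological order. This is only an illustration under stated assumptions: a task is assigned to the sub-time period containing its start time (the description does not say which timestamp is used), and the name `transfer_sequence` and the tuple layout are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# (start time, start sub-area, end sub-area), sub-areas numbered from 1.
Task = Tuple[datetime, int, int]


def transfer_sequence(tasks: List[Task], period_start: datetime,
                      sub_len: timedelta, n_periods: int,
                      n_sub: int) -> List[List[List[int]]]:
    """Return n_periods matrices (one per sub-time period), in time order."""
    seq = [[[0] * n_sub for _ in range(n_sub)] for _ in range(n_periods)]
    for started, start, end in tasks:
        # timedelta // timedelta yields the integer index of the sub-period.
        k = (started - period_start) // sub_len
        if 0 <= k < n_periods:  # ignore tasks outside the first time period
            seq[k][start - 1][end - 1] += 1
    return seq


day = datetime(2022, 2, 20)
tasks = [
    (datetime(2022, 2, 20, 9, 30), 1, 2),
    (datetime(2022, 2, 20, 9, 59), 1, 2),
    (datetime(2022, 2, 20, 10, 0), 3, 4),
    (datetime(2022, 2, 21, 0, 0), 2, 2),   # outside the first time period
]
sequence = transfer_sequence(tasks, day, timedelta(hours=1),
                             n_periods=24, n_sub=4)
```

Here `sequence[9]` is the subsequence for 9:00-10:00 and `sequence[10]` the one for 10:00-11:00.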

Step S200, acquiring a target transfer amount prediction model corresponding to the target area from a plurality of transfer amount prediction models.

In this step, the server may obtain, according to the area identifier of the target area, the target transfer amount prediction model corresponding to the target area from a plurality of transfer amount prediction models stored in a database or locally. In this embodiment, in order to improve the accuracy of task transfer amount prediction, different areas may correspond to different transfer amount prediction models, and the transfer amount prediction models are pre-trained models.
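
A lookup by area identifier might look like the sketch below. The in-memory dictionary stands in for the database or local storage, the identifier "A1" is made up, and the placeholder model that simply repeats the last matrix is only there to make the example runnable; none of these names come from this description.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]
# A model maps a task transfer amount sequence to a predicted matrix.
Model = Callable[[List[Matrix]], Matrix]


def get_target_model(models: Dict[str, Model], area_id: str) -> Model:
    """Return the pre-trained model registered for the given area identifier."""
    try:
        return models[area_id]
    except KeyError:
        raise KeyError(
            f"no transfer amount prediction model for area {area_id!r}"
        ) from None


def repeat_last(seq: List[Matrix]) -> Matrix:
    """Placeholder model (assumption): predict the most recent matrix."""
    return seq[-1]


registry: Dict[str, Model] = {"A1": repeat_last}
model = get_target_model(registry, "A1")
prediction = model([[[1.0, 0.0], [0.0, 1.0]], [[2.0, 1.0], [1.0, 2.0]]])
```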

Buildings are not evenly distributed across areas, and neither is the flow of people, so in some areas most sub-areas have large task transfer amounts, while in other areas most sub-areas have small task transfer amounts or none at all. To balance these differences between areas and improve the accuracy of predicting task transfer amounts between the sub-areas of different areas, this embodiment trains the transfer amount prediction model by means of transfer learning.

Each geographic area can be determined as a target area, so that the transfer amount prediction models corresponding to each geographic area are trained in the same manner, and the target transfer amount prediction model corresponding to the target area is taken as an example in this embodiment for explanation. The target transfer amount prediction model can be obtained by training according to the first sample set and the second sample set. Each first sample in the first sample set can comprise a first historical task transfer amount sequence of the corresponding geographic area in a first historical time period and a first historical task transfer amount matrix in a second historical time period. The first historical task transfer volume sequence is similar to the task transfer volume sequence and comprises first historical task transfer volume data generated by each sub-region in the corresponding geographic region in a plurality of sub-historical time periods of the first historical time period. The first historical task transfer volume matrix is similar to the task transfer volume matrix and includes first historical task transfer volume data generated by sub-regions in the corresponding geographic region during a second historical time period. And the first historical task transfer amount data is similar to the task transfer amount data and represents the number of historical tasks of which the starting positions are located in the preset sub-area in the corresponding geographic area and the ending positions are located in the non-preset sub-area.

Each second sample in the second sample set may include a second historical sequence of task transfer volumes for the target area over the first historical period of time and a second historical matrix of task transfer volumes over the second historical period of time. The second historical task transfer amount sequence is also similar to the task transfer amount sequence and comprises second historical task transfer amount data generated by each sub-region in the target region in a plurality of sub-historical time periods of the first historical time period. The second historical task transfer amount matrix is also similar to the task transfer amount matrix and includes second historical task transfer amount data generated by each sub-region in the target region during a second historical time period. And the second historical task transfer amount data is similar to the task transfer amount data and represents the number of historical tasks with the starting position located in the target sub-area and the ending position located in the non-target sub-area in the target area.

Optionally, the server may obtain the first historical task transfer amount data corresponding to each geographic area and the second historical task transfer amount data corresponding to the target area, and then detect abnormal data in the first historical task transfer amount data and the second historical task transfer amount data according to a predetermined anomaly detection method. The server then determines each first sample by determining the corresponding first historical task transfer amount sequence and first historical task transfer amount matrix from the first historical task transfer amount data with the abnormal data removed, and determines each second sample by determining the corresponding second historical task transfer amount sequence and second historical task transfer amount matrix from the second historical task transfer amount data with the abnormal data removed.

Optionally, the predetermined anomaly detection method may be Isolation Forest, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), Random Cut Forest (RCF), or the like.
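
To show where the removal of abnormal data fits in sample preparation, the sketch below uses a simple median-absolute-deviation rule. This rule is a deliberately swapped-in stand-in chosen to keep the example dependency-free; it is not one of the methods named above, and the threshold of 3.5, the function names, and the sample counts are all assumptions.

```python
from statistics import median
from typing import List


def flag_anomalies(counts: List[float], threshold: float = 3.5) -> List[bool]:
    """Flag values whose modified z-score exceeds the threshold.

    Stand-in detector (assumption): modified z-score based on the median
    absolute deviation (MAD), not Isolation Forest, DBSCAN, or RCF.
    """
    med = median(counts)
    mad = median([abs(c - med) for c in counts])
    if mad == 0:
        return [False] * len(counts)  # no spread, so nothing is flagged
    return [abs(0.6745 * (c - med) / mad) > threshold for c in counts]


def remove_anomalies(counts: List[float]) -> List[float]:
    """Drop the flagged values before sequences and matrices are built."""
    flags = flag_anomalies(counts)
    return [c for c, bad in zip(counts, flags) if not bad]


# Hypothetical hourly counts for one (start, end) pair; 250 is the outlier.
hourly_counts = [10, 12, 11, 9, 10, 250, 11]
cleaned = remove_anomalies(hourly_counts)
```

In practice one of the named methods could be substituted for `flag_anomalies` without changing the surrounding steps.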

It is readily understood that the first set of samples may comprise at least part of the second samples of the second set of samples. The first historical time periods corresponding to the training samples including the first samples and the second samples may or may not be the same historical time period, and the geographic areas corresponding to the training samples may or may not be the same geographic area, but the first historical time periods may have partial sub-historical time period time coincidences. And when the geographic areas corresponding to different training samples are the same geographic area, the corresponding first historical time periods are not the same historical time period, and when the first historical time periods corresponding to different training samples are the same historical time period, the corresponding geographic areas are not the same geographic area. Similarly, the second historical time periods corresponding to the training samples may or may not be the same historical time period, and the geographic areas corresponding to the training samples may or may not be the same geographic area, but the second historical time periods may have time coincidence of partial sub-historical time periods. And when the geographic areas corresponding to different training samples are the same geographic area, the corresponding second historical time periods are not the same historical time period, and when the second historical time periods corresponding to different training samples are the same historical time period, the corresponding geographic areas are not the same geographic area. Further, the number of sub-regions in different geographical regions may be the same or different.

The task transfer amount sequence is a time sequence formed by a plurality of matrices, and a matrix can be used to represent image data, so the task transfer amount sequence can be regarded as an image sequence. Therefore, in this embodiment, the transfer amount prediction model may be a Recurrent Neural Network (RNN) or a Convolutional Neural Network (CNN). Taking the RNN as an example, an RNN is a class of neural network that takes sequence data as input, recurses along the direction in which the sequence evolves, and connects all of its nodes in a chain. RNNs have memory, share parameters, and are Turing complete (in computability theory, a set of data-manipulation rules, such as an instruction set, a programming language, or a cellular automaton, is Turing complete if it can be used to simulate a single-tape Turing machine), and thus have certain advantages in learning the nonlinear characteristics of a sequence. Further, the target transfer amount prediction model of this embodiment may be an existing recurrent neural network, such as a Long Short-Term Memory network (LSTM) or a Bidirectional Recurrent Neural Network (Bi-RNN).

When the target transfer amount prediction model is trained, the server may use each first historical task transfer amount sequence as an input of an initial model which is not subjected to parameter adjustment, and train the initial model with a corresponding first historical task transfer amount matrix as a target. After the loss function of the initial model converges, the server may determine the trained initial model as the first model. After obtaining the first model obtained through data training of a plurality of geographic areas, the server may use each second historical task transfer amount sequence corresponding to the target area as an input of the first model, and train the first model with the corresponding second historical task transfer amount matrix as a target. After the loss function of the first model converges, the server may determine that the trained model is the target transfer amount prediction model corresponding to the target region.
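
The two training stages can be illustrated with a deliberately tiny stand-in model. Instead of an RNN or CNN, the sketch below assumes a linear model with one shared weight per time step plus a bias, trained by gradient descent on mean squared error for a fixed number of epochs rather than until convergence; the data, names, learning rate, and epoch count are all hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Sample = Tuple[List[Matrix], Matrix]   # (sequence of T matrices, target matrix)
Params = Tuple[List[float], float]     # (one weight per time step, bias)


def predict(params: Params, seq: List[Matrix]) -> Matrix:
    """Each output cell is a weighted sum of that cell over time, plus bias."""
    w, b = params
    return [[b + sum(w[t] * seq[t][i][j] for t in range(len(w)))
             for j in range(len(seq[0][0]))]
            for i in range(len(seq[0]))]


def mse(params: Params, samples: List[Sample]) -> float:
    total, count = 0.0, 0
    for seq, target in samples:
        for p_row, y_row in zip(predict(params, seq), target):
            for p, y in zip(p_row, y_row):
                total += (p - y) ** 2
                count += 1
    return total / count


def train(params: Params, samples: List[Sample],
          lr: float = 0.01, epochs: int = 200) -> Params:
    """Gradient descent on the MSE, one update per sample."""
    w, b = list(params[0]), params[1]
    for _ in range(epochs):
        for seq, target in samples:
            pred = predict((w, b), seq)
            n = len(target) * len(target[0])
            grad_w, grad_b = [0.0] * len(w), 0.0
            for i, row in enumerate(target):
                for j, y in enumerate(row):
                    err = pred[i][j] - y
                    grad_b += 2 * err / n
                    for t in range(len(w)):
                        grad_w[t] += 2 * err * seq[t][i][j] / n
            w = [wt - lr * g for wt, g in zip(w, grad_w)]
            b -= lr * grad_b
    return w, b


# Stage 1 data: hypothetical samples from several geographic areas.
first_samples: List[Sample] = [
    ([[[1, 0], [0, 1]], [[2, 1], [1, 2]]], [[2, 1], [1, 2]]),
    ([[[0, 1], [1, 0]], [[1, 2], [2, 1]]], [[1, 2], [2, 1]]),
]
# Stage 2 data: hypothetical samples from the target area only.
second_samples: List[Sample] = [
    ([[[2, 0], [0, 2]], [[1, 1], [1, 1]]], [[3, 1], [1, 3]]),
]

initial_model: Params = ([0.0, 0.0], 0.0)
first_model = train(initial_model, first_samples)     # pre-training
target_model = train(first_model, second_samples)     # fine-tuning
```

Because the second stage starts from the parameters of the first model rather than from scratch, the knowledge learned from the other areas is carried over to the target area.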

The number of layers of a neural network is usually large, and each layer has many neurons, so adjusting the weights and bias terms of the neurons in every layer is complicated. Therefore, optionally, in order to ensure the accuracy of the target transfer amount prediction model in predicting the task transfer amount, before training the first model, the server may fix the weights and bias terms of the first i layers of the first model according to a preset weight fixing parameter, and adjust only the weights and bias terms from the (i+1)-th layer onward using the second historical task transfer amount sequences and the second historical task transfer amount matrices.
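
Fixing the first i layers amounts to skipping them during the parameter update. The sketch below shows one such update step under simplifying assumptions: layers are plain dictionaries of weight and bias lists, the gradients are supplied by the caller, and the function name `sgd_step_with_frozen_prefix` is hypothetical.

```python
from typing import Dict, List

Layer = Dict[str, List[float]]  # {"weights": [...], "bias": [...]}


def sgd_step_with_frozen_prefix(layers: List[Layer], grads: List[Layer],
                                frozen: int, lr: float) -> None:
    """Update layers[frozen:] in place and leave layers[:frozen] untouched.

    frozen plays the role of the preset weight fixing parameter i.
    """
    for layer, grad in zip(layers[frozen:], grads[frozen:]):
        layer["weights"] = [w - lr * g
                            for w, g in zip(layer["weights"], grad["weights"])]
        layer["bias"] = [b - lr * g
                         for b, g in zip(layer["bias"], grad["bias"])]


# Three hypothetical layers; the first two are fixed (i = 2).
layers = [{"weights": [1.0, 2.0], "bias": [0.0]} for _ in range(3)]
grads = [{"weights": [0.5, 0.5], "bias": [1.0]} for _ in range(3)]
sgd_step_with_frozen_prefix(layers, grads, frozen=2, lr=0.1)
```

After the call, only the third layer differs from its initial values.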

Step S300, acquiring a task transfer amount matrix of the target area in a second time period according to the task transfer amount sequence based on the target transfer amount prediction model.

In this step, the server may obtain a task transfer amount matrix of the target area in the second time period by using the task transfer amount sequence as an input of the target transfer amount prediction model. In this embodiment, the time length of the second time period may be the same as or different from the time length of the sub-time period in the first time period, and this embodiment is not limited.

Similar to the task transfer amount sequence of the target region in the first time period, the task transfer amount matrix of the target region in the second time period includes the task transfer amount data generated by each sub-region in the target region in the second time period. The task transfer amount data represents the number of tasks whose start positions are located in the target sub-region and whose end positions are located in a non-target sub-region, that is, the number of such tasks that the user group is expected to require in the second time period.
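To make the definition of task transfer amount data concrete, the sketch below counts, for each sub-region, the tasks that start there and end elsewhere. It assumes that sub-regions are cells of a rectangular grid and that each task is given as a pair of (row, column) cells; both the representation and the name `transfer_amount_matrix` are illustrative.

```python
import numpy as np

def transfer_amount_matrix(tasks, shape):
    """Count, per sub-region, the tasks that start there and end in another one."""
    matrix = np.zeros(shape, dtype=int)
    for start, end in tasks:
        if start != end:  # a task that stays inside one sub-region is not a transfer
            matrix[start] += 1
    return matrix

# Each task is (start cell, end cell) on a hypothetical 2x2 grid.
tasks = [((0, 0), (0, 1)), ((0, 0), (1, 1)), ((0, 1), (0, 1)), ((1, 1), (0, 0))]
matrix = transfer_amount_matrix(tasks, (2, 2))
```

In this example, two tasks leave cell (0, 0), one leaves cell (1, 1), and the task that starts and ends in cell (0, 1) is not counted.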

Optionally, after determining the task transfer amount data of each sub-region in the target region, the method of this embodiment may further include the following steps:

Step S400, determining object transfer data of each sub-area in the target area in a second time period according to the task transfer amount matrix.

In this embodiment, the object transfer data represents the number of objects corresponding to the task whose start position belongs to the target sub-region and whose end position belongs to the non-target sub-region, that is, the number of the objects whose start positions are located in the target sub-region and whose end positions are located in the non-target sub-region in the second time period.

Depending on the field of application, the objects may be shared bicycles, online ride-hailing vehicles, commodities, and the like, and different tasks may correspond to different numbers of objects (i.e., items). For example, in a shared-vehicle rental scenario, one rental order typically corresponds to one shared vehicle, whereas in a cargo transportation scenario, one shipping task may correspond to one or more items; in a vehicle transportation scenario, for instance, one shipping task may correspond to 6 vehicles. Therefore, after determining the task transfer amount matrix of the target area in the second time period, the server may determine the object transfer data of each sub-area in the second time period according to the correspondence between the number of tasks and the number of objects.
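Under the simplifying assumption that every task in a scenario corresponds to the same fixed number of objects, the conversion from task transfer amounts to object transfer data reduces to a multiplication. The function name `object_transfer` and the sample values below are hypothetical.

```python
import numpy as np

def object_transfer(task_matrix, objects_per_task):
    """Convert predicted task counts into object counts for each sub-region."""
    # Model outputs may be fractional, so round to whole tasks first.
    whole_tasks = np.rint(np.asarray(task_matrix, dtype=float)).astype(int)
    return whole_tasks * objects_per_task

predicted_tasks = [[2.4, 0.0], [1.0, 3.6]]       # assumed model output
bikes = object_transfer(predicted_tasks, 1)      # one shared bicycle per rental order
vehicles = object_transfer(predicted_tasks, 6)   # six vehicles per shipping task
```

If the number of objects varies from task to task, a per-task count or an average ratio estimated from historical tasks would be needed instead of a single constant.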

After the object transfer data is determined, the server can obtain the actual number of the objects in each sub-area, and supplement or reduce the number of the objects in each sub-area according to the predicted required number of the objects of which the starting positions are located in the target sub-area and the ending positions are located in the non-target sub-area in the second time period by the user group so as to meet the user requirements.
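One simple way to turn the predicted demand into a supplement-or-reduce decision is to subtract the actual count from the predicted count for every sub-region. This is only a sketch: the name `rebalance_plan` and the sample numbers are assumptions, and a real dispatch policy would likely account for further constraints such as transport cost.

```python
import numpy as np

def rebalance_plan(predicted_demand, actual_count):
    """Positive entries: objects to add; negative entries: objects to remove."""
    return np.asarray(predicted_demand) - np.asarray(actual_count)

predicted = [[12, 0], [6, 24]]  # predicted required objects per sub-region
actual = [[10, 3], [6, 20]]     # objects currently present per sub-region
plan = rebalance_plan(predicted, actual)
```

With these numbers, the top-left sub-region needs 2 more objects, the top-right one has 3 too many, the bottom-left one is balanced, and the bottom-right one needs 4 more.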

In this embodiment, a task transfer amount sequence formed by the task transfer amounts of each sub-area of a predetermined geographic area in each sub-time period of a first time period is obtained, and the target transfer amount prediction model corresponding to the target area is obtained from a plurality of transfer amount prediction models. The task transfer amount sequence is then used as the input of the target transfer amount prediction model to obtain the task transfer amounts of the predetermined geographic area in a second time period, which serve as data on the number of tasks required by the user group. With transfer amount prediction models corresponding to different geographic areas, this embodiment can accurately predict how tasks transfer between the sub-areas of each geographic area and thereby estimate users' demand for items.

Fig. 4 is a schematic diagram of a data processing apparatus according to a second embodiment of the present invention. As shown in fig. 4, the apparatus of the present embodiment includes a sequence acquisition unit 401, a model acquisition unit 402, and a first quantity prediction unit 403.

The sequence acquiring unit 401 is configured to acquire a task transfer amount sequence of a target region in a first time period, where the task transfer amount sequence includes task transfer amount data generated by each sub-region in the target region in a plurality of sub-time periods of the first time period, the task transfer amount data represents the number of tasks whose starting positions are located in the target sub-region and whose ending positions are located in non-target sub-regions, and the target region is a predetermined geographic region. The model obtaining unit 402 is configured to obtain a target transfer amount prediction model corresponding to the target region from a plurality of transfer amount prediction models. The first quantity prediction unit 403 is configured to obtain, according to the task transfer amount sequence, a task transfer amount matrix of the target region in a second time period based on the target transfer amount prediction model, where the task transfer amount matrix includes task transfer amount data of each sub-region in the second time period.

Further, the apparatus also comprises a second quantity prediction unit 404.

The second quantity prediction unit 404 is configured to determine, according to the task transfer quantity matrix, object transfer data of each sub-region in the target region in the second time period, where the object transfer data represents a quantity of objects corresponding to tasks whose starting positions are located in the target sub-region and whose ending positions are located in the non-target sub-region.

Further, the target transfer amount prediction model is obtained by training according to a first sample set and a second sample set. Each first sample in the first sample set includes a first historical task transfer amount sequence of the corresponding geographic area in a first historical time period and a first historical task transfer amount matrix in a second historical time period. The first historical task transfer amount sequence includes first historical task transfer amount data generated by each sub-area in the corresponding geographic area in a plurality of sub-historical time periods of the first historical time period, the first historical task transfer amount matrix includes first historical task transfer amount data of each sub-area in the corresponding geographic area in the second historical time period, and the first historical task transfer amount data represents the number of historical tasks with starting positions located in predetermined sub-areas and ending positions located in non-predetermined sub-areas. Each second sample in the second sample set includes a second historical task transfer amount sequence of the target area in the first historical time period and a second historical task transfer amount matrix in the second historical time period. The second historical task transfer amount sequence includes second historical task transfer amount data generated by each sub-area in the target area in a plurality of sub-historical time periods of the first historical time period, the second historical task transfer amount matrix includes second historical task transfer amount data of each sub-area in the target area in the second historical time period, and the second historical task transfer amount data represents the number of historical tasks of which the starting positions are located in the target sub-area and the ending positions are located in the non-target sub-area.

Further, the first sample and the second sample are determined by the data obtaining unit 405, the anomaly detection unit 406, the first sample determining unit 407, and the second sample determining unit 408:

The data obtaining unit 405 is configured to obtain each of the first historical task transfer amount data and the second historical task transfer amount data. The anomaly detection unit 406 is configured to detect abnormal data in the first historical task transfer amount data and the second historical task transfer amount data according to a predetermined anomaly detection method. The first sample determining unit 407 is configured to determine the corresponding first historical task transfer amount sequence and the first historical task transfer amount matrix according to the first historical task transfer amount data after the abnormal data is removed, so as to determine the first sample. The second sample determining unit 408 is configured to determine the corresponding second historical task transfer amount sequence and the second historical task transfer amount matrix according to the second historical task transfer amount data after the abnormal data is removed, so as to determine the second sample.
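The predetermined anomaly detection method is not specified here. Purely as an assumed example, the sketch below removes points whose modified z-score, computed from the median and the median absolute deviation (MAD), exceeds a threshold of 3.5. The function name `remove_anomalies`, the threshold, and the sample history are illustrative choices.

```python
import numpy as np

def remove_anomalies(values, threshold=3.5):
    """Drop points whose modified z-score (median/MAD based) exceeds threshold."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        return values  # no spread around the median, so nothing is flagged
    score = 0.6745 * np.abs(values - median) / mad
    return values[score <= threshold]

# Hypothetical historical transfer amounts of one sub-region; 240 is a spike.
history = [12, 15, 11, 14, 13, 240, 12, 16]
clean = remove_anomalies(history)
```

A median-based score is used in this sketch because a single extreme value barely shifts the median, whereas it would inflate a mean and standard deviation and could mask itself.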

Further, the target transfer amount prediction model is obtained through training by the first training unit 409 and the second training unit 410:

The first training unit 409 is configured to train an initial model with each of the first historical task transfer amount sequences as an input and the corresponding first historical task transfer amount matrix as a target to obtain a first model. The second training unit 410 is configured to train the first model with each second historical task transfer amount sequence as an input and the corresponding second historical task transfer amount matrix as a target to obtain the target transfer amount prediction model.

Further, the transfer amount prediction model is a recurrent neural network or a convolutional neural network.

In this embodiment, a task transfer amount sequence formed by the task transfer amounts of each sub-region of a predetermined geographic region in each sub-time period of a first time period is obtained, and the target transfer amount prediction model corresponding to the target region is obtained from a plurality of transfer amount prediction models. The task transfer amount sequence is then used as the input of the target transfer amount prediction model to obtain the task transfer amounts of the predetermined geographic region in a second time period, which serve as data on the number of tasks required by the user group. With transfer amount prediction models corresponding to different geographic areas, this embodiment can accurately predict how tasks transfer between the sub-areas of each geographic area and thereby estimate users' demand for items.

Fig. 5 is a schematic view of an electronic device according to a third embodiment of the present invention. The electronic device shown in fig. 5 is a general-purpose data processing apparatus comprising a general-purpose computer hardware structure including at least a processor 501 and a memory 502. The processor 501 and the memory 502 are connected by a bus 503. The memory 502 is adapted to store instructions or programs executable by the processor 501. The processor 501 may be a stand-alone microprocessor or a collection of one or more microprocessors. Thus, the processor 501 processes data and controls other devices by executing the instructions stored in the memory 502, thereby carrying out the method flows of the embodiments of the present invention described above. The bus 503 connects the above-described components together, and also connects them to a display controller 504, a display device, and input/output (I/O) devices 505. The input/output (I/O) devices 505 may be a mouse, keyboard, modem, network interface, touch input device, motion sensing input device, printer, and other devices known in the art. Typically, the input/output (I/O) devices 505 are connected to the system through an input/output (I/O) controller 506.

The memory 502 may store, among other things, software components such as an operating system, communication modules, interaction modules, and application programs. Each of the modules and applications described above corresponds to a set of executable program instructions that perform one or more functions and methods described in embodiments of the invention.

The foregoing flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the present invention illustrate various aspects of the present invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

Also, as will be appreciated by one skilled in the art, aspects of embodiments of the present invention may be embodied as a system, method or computer program product. Accordingly, various aspects of embodiments of the invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a "circuit," "module," or "system." Further, aspects of the invention may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.

Any combination of one or more computer-readable media may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of embodiments of the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, Smalltalk, C++, PHP, Python, and the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A method of data processing, the method comprising:

based on the target transfer quantity prediction model, acquiring a task transfer quantity matrix of the target region in a second time period according to the task transfer quantity sequence, wherein the task transfer quantity matrix comprises task transfer quantity data of each sub-region in the second time period;

the target transfer quantity prediction model is obtained by training according to a first sample set and a second sample set, wherein each first sample in the first sample set comprises a first historical task transfer quantity sequence of a corresponding geographic area in a first historical time period and a first historical task transfer quantity matrix in a second historical time period, and each second sample in the second sample set comprises a second historical task transfer quantity sequence of the target area in the first historical time period and a second historical task transfer quantity matrix in the second historical time period.

2. The method of claim 1, further comprising:

3. The method of claim 1, wherein the first sequence of historical task transfer amounts comprises first historical task transfer amount data generated by each sub-region of the corresponding geographic region over a plurality of sub-historical time periods of the first historical time period, wherein the first matrix of historical task transfer amounts comprises first historical task transfer amount data generated by each sub-region of the corresponding geographic region over the second historical time period, wherein the first historical task transfer amount data characterizes a number of historical tasks having starting locations at predetermined sub-regions and ending locations at non-predetermined sub-regions, wherein the second sequence of historical task transfer amounts comprises second historical task transfer amount data generated by each sub-region of the target region over a plurality of sub-historical time periods of the first historical time period, wherein the second matrix of historical task transfer amounts comprises second historical task transfer amount data generated by each sub-region of the target region over the second historical time period, and wherein the second historical task transfer amount data characterizes a number of historical tasks having starting locations at the target sub-region and ending locations at non-target sub-regions.

4. The method of claim 3, wherein the first sample and the second sample are determined by:

detecting abnormal data in the first historical task transfer amount data and the second historical task transfer amount data according to a preset abnormal detection method;

determining the corresponding first historical task transfer amount sequence and the first historical task transfer amount matrix according to the first historical task transfer amount data after abnormal data are removed so as to determine the first sample;

and determining the corresponding second historical task transfer amount sequence and the second historical task transfer amount matrix according to the second historical task transfer amount data after abnormal data are removed so as to determine the second sample.

5. The method according to claim 3 or 4, wherein the target transfer amount prediction model is obtained by training as follows:

6. The method of claim 1, wherein the transfer quantity prediction model is a recurrent neural network or a convolutional neural network.

7. A data processing apparatus, characterized in that the apparatus comprises:

the first quantity prediction unit is used for acquiring a task transfer quantity matrix of the target area in a second time period according to the task transfer quantity sequence based on the target transfer quantity prediction model, and the task transfer quantity matrix comprises task transfer quantity data of each sub-area in the second time period;

8. A computer-readable storage medium on which computer program instructions are stored, which computer program instructions, when executed by a processor, implement the method of any one of claims 1-6.

9. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method of any of claims 1-6.