CN111460140A - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents


Info

Publication number
CN111460140A
CN111460140A
Authority
CN
China
Prior art keywords
probability
target task
data
data processing
processor
Prior art date
Legal status
Pending
Application number
CN202010148156.XA
Other languages
Chinese (zh)
Inventor
王乾
叶俊杰
姜梦晓
赵扬
Current Assignee
Rajax Network Technology Co Ltd
Original Assignee
Rajax Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Rajax Network Technology Co Ltd filed Critical Rajax Network Technology Co Ltd
Priority to CN202010148156.XA
Publication of CN111460140A

Classifications

    • G06F 16/35: Information retrieval of unstructured textual data; Clustering; Classification
    • G06F 16/953: Retrieval from the web; Querying, e.g. by the use of web search engines
    • G06F 18/214: Pattern recognition; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention disclose a data processing method, a data processing apparatus, an electronic device, and a computer-readable storage medium.

Description

Data processing method and device, electronic equipment and computer readable storage medium
Technical Field
The present invention relates to the field of internet technologies, and in particular, to a data processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the development of internet technology, internet-based O2O (Online-to-Offline) systems bring more and more convenience to daily life. Currently, in an O2O system, tasks may be terminated abnormally or complained about for various reasons (such as weather), so how to handle abnormal tasks is a problem to be solved.
Disclosure of Invention
In view of this, embodiments of the present invention provide a data processing method, an apparatus, an electronic device, and a computer-readable storage medium, so as to improve data processing efficiency.
In a first aspect, an embodiment of the present invention provides a data processing method, where the method includes:
receiving a data instruction from terminal equipment;
analyzing the data instruction through at least one processor to obtain target task information and a comment text of a corresponding first object, wherein the target task information comprises a task record of the target task, corresponding first object information and second object information;
obtaining a feature vector corresponding to the target task according to the target task information and the comment text through at least one processor;
inputting, by at least one processor, the feature vector into a data processing model for processing to obtain a first probability and a second probability, the first probability being used to represent a probability that the target task belongs to a first class, and the second probability being used to represent a probability that the target task belongs to a second class;
responsive to the first probability being greater than or equal to a first predetermined value, sending, by at least one processor, a first data processing result, the first data processing result including that the target task belongs to a first class;
in response to the second probability being greater than or equal to a second predetermined value, sending, by at least one processor, a second data processing result, the second data processing result including that the target task belongs to a second class.
Optionally, obtaining, by at least one processor, a feature vector corresponding to the target task according to the target task information and the comment text includes:
performing feature extraction on the target task information to acquire first data;
performing feature extraction on the comment text to acquire second data;
and combining the first data and the second data to obtain a feature vector corresponding to the target task.
Optionally, the performing feature extraction on the target task information to obtain first data includes:
performing one-hot encoding on discrete data in the target task information to obtain first sub-data;
performing normalization processing on continuous data in the target task information to obtain second sub-data;
and merging the first sub-data and the second sub-data to obtain the first data.
Optionally, the method further includes:
sending, by at least one processor, a third processing result in response to the first probability being less than the first predetermined value and the second probability being less than the second predetermined value.
Optionally, the method further includes:
sending, by at least one processor, a third processing result in response to the first probability being greater than or equal to the first predetermined value and the second probability being greater than or equal to the second predetermined value.
Optionally, the method further includes:
training the data processing model based on a data set;
the data set comprises information of a plurality of historical tasks, first object information and second object information corresponding to each historical task, and comment texts of the first objects corresponding to each historical task.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, where the apparatus includes:
a receiving unit configured to receive a data instruction from a terminal device;
the acquisition unit is configured to analyze the data instruction through at least one processor, and acquire target task information and corresponding comment texts of the first object, wherein the target task information comprises task records of the target task, corresponding first object information and second object information;
the feature extraction unit is configured to obtain, through at least one processor, a feature vector corresponding to the target task according to the target task information and the comment text;
a processing unit configured to input, by at least one processor, the feature vector to a data processing model for processing to obtain a first probability and a second probability, the first probability being used for representing a probability that the target task belongs to a first class, and the second probability being used for representing a probability that the target task belongs to a second class;
a first sending unit configured to send, by at least one processor, a first data processing result in response to the first probability being greater than or equal to a first predetermined value, the first data processing result including that the target task belongs to a first class;
a second sending unit configured to send, by the at least one processor, a second data processing result including that the target task belongs to a second class in response to the second probability being greater than or equal to a second predetermined value.
Optionally, the feature extraction unit includes:
a first feature extraction subunit configured to perform feature extraction on the target task information to acquire first data;
a second feature extraction subunit configured to perform feature extraction on the comment text to acquire second data;
a merging subunit configured to merge the first data and the second data to obtain a feature vector corresponding to the target task.
Optionally, the first feature extraction subunit includes:
the first processing module is configured to perform one-hot encoding on the discrete data in the target task information to obtain first sub-data;
the second processing module is configured to perform normalization processing on the continuous data in the target task information to obtain second sub-data;
a merging module configured to merge the first sub-data and the second sub-data to obtain the first data.
Optionally, the apparatus further comprises:
a third transmitting unit configured to transmit, by at least one processor, a third processing result in response to the first probability being less than the first predetermined value and the second probability being less than the second predetermined value.
Optionally, the apparatus further comprises:
a fourth transmitting unit configured to transmit, by at least one processor, a third processing result in response to the first probability being greater than or equal to the first predetermined value and the second probability being greater than or equal to the second predetermined value.
Optionally, the apparatus further comprises:
a training unit configured to train the data processing model based on a data set;
the data set comprises information of a plurality of historical tasks, first object information and second object information corresponding to each historical task, and comment texts of the first objects corresponding to each historical task.
In a third aspect, an embodiment of the present invention provides an electronic device, including a memory and a processor, where the memory is used to store one or more computer program instructions, where the one or more computer program instructions are executed by the processor to implement the method described above.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement a method as described above.
Embodiments of the invention extract features from the target task information and the comment text of the corresponding first object to obtain a feature vector for the target task. The feature vector is input into a data processing model to obtain a first probability and a second probability, the category of the target task is predicted by comparing the two probabilities against predetermined values, and the corresponding data processing result is output, thereby improving both the classification efficiency and the data processing efficiency for the target task.
Drawings
The above and other objects, features and advantages of the present invention will become more apparent from the following description of the embodiments of the present invention with reference to the accompanying drawings, in which:
FIG. 1 is a flow chart of a data processing method of an embodiment of the present invention;
FIG. 2 is a flow chart of another data processing method of an embodiment of the present invention;
FIG. 3 is a data flow diagram of a data processing method of an embodiment of the present invention;
FIG. 4 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention;
fig. 5 is a schematic diagram of an electronic device of an embodiment of the invention.
Detailed Description
The present invention will be described below based on examples, but the present invention is not limited to only these examples. In the following detailed description of the present invention, certain specific details are set forth. It will be apparent to one skilled in the art that the present invention may be practiced without these specific details. Well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.
Further, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.
Unless the context clearly requires otherwise, throughout the description, the words "comprise", "comprising", and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, what is meant is "including, but not limited to".
In the description of the present invention, it is to be understood that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present invention, "a plurality" means two or more unless otherwise specified.
Fig. 1 is a flowchart of a data processing method of an embodiment of the present invention. As shown in fig. 1, the data processing method according to the embodiment of the present invention includes the following steps:
step S110, receiving a data command from the terminal device. And when receiving the data processing task request, the terminal equipment sends a corresponding data instruction.
Step S120, analyzing the data command through at least one processor, and acquiring target task information and a comment text of the corresponding first object. The target task information includes a task record of the target task, corresponding first object information, and second object information.
In an optional implementation, the target task of this embodiment is an abnormal task, that is, a task that is suspended before completion or complained about after completion. The data processing task request may be an arbitration processing request. Taking an abnormal task as an example, the task record of the target task may include one or more of task content, start time, end time, environment information (such as weather or road conditions), and delivery information. The corresponding first object information may include one or more of the first object's identity information (e.g., whether the first object is a member), historical task information, credit information, historical comment texts, and historical abnormal task information. The second object information may include one or more of historical task information, historical abnormal task information, historical received-comment texts, and credit information. Abnormal tasks include cancelled tasks and complained tasks, and the comment texts include the cancellation reason of a cancelled task, the complaint reason of a complained task, the comment text of the task, and so on.
Taking an abnormal takeout task as an example, the task record of the target task may include meal information, order placing time, order receiving time, meal pickup time, delivery time (or takeout task cancellation time), delivery information, and environment information. The delivery information may include attribute information of the delivery resource (platform delivery or merchant self-delivery), credit information of the delivery resource, delivery amount information, historical evaluation information, and the like. The corresponding first object is the ordering user, and the first object information may include one or more of the ordering user's identity information (e.g., whether the user is a member), historical task information, credit information, historical comment texts, and historical abnormal task information. The second object is a merchant, and the second object information may include one or more of historical task information, historical abnormal task information, historical received-comment texts, and credit information. It should be understood that the takeout task above is only a detailed example; this embodiment does not limit the type of the target task, and any task formed by multiple parties, such as online shopping or ride-hailing services, can be handled by the data processing method of this embodiment.
Step S130, obtaining a feature vector corresponding to the target task through at least one processor according to the target task information and the comment text of the first object.
In an optional implementation, step S130 may include: performing feature extraction on the target task information to obtain first data, performing feature extraction on the comment text of the first object to obtain second data, and merging the first data and the second data to obtain the feature vector corresponding to the target task. Optionally, the first data and the second data are concatenated end to end to obtain the feature vector corresponding to the target task.
The target task information includes discrete data and continuous data. Discrete data can only take values counted in natural-number or integer units, and its values are generally obtained by counting. Continuous data can take any value within an interval, and any two adjacent values can be infinitely subdivided; its values are continuous and are obtained by measurement or metering. In this embodiment, the discrete data and the continuous data in the target task information are processed differently, which further improves data processing efficiency.
In an optional implementation, one-hot encoding is performed on the discrete data in the target task information to obtain first sub-data, normalization is performed on the continuous data in the target task information to obtain second sub-data, and the first sub-data and the second sub-data are merged to obtain the first data.
In this embodiment, the discrete data is processed using one-hot encoding. The basic idea is that each value of the discrete data is treated as a state. For example, a historical task has two states, a normal historical task and an abnormal historical task; one-hot encoding ensures that each value activates exactly one state, that is, exactly one state bit of the discrete data is 1 while all other state bits are 0. For example, one-hot encoding of historical task status could be: normal historical task to [1, 0], abnormal historical task to [0, 1]. In this way, one-hot encoding can be applied to the discrete data in the target task information to obtain the first sub-data. It should be understood that the discrete data in the target task information may also be encoded by other methods (e.g., dummy-variable encoding); this embodiment is not limited in this respect.
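The encoding, normalization, and merging described above can be sketched as follows. The field names, category set, and value range here are illustrative assumptions, not values taken from this embodiment:

```python
def one_hot(value, categories):
    """Encode one discrete value as a one-hot vector: exactly one bit is 1."""
    return [1.0 if value == c else 0.0 for c in categories]

def min_max_normalize(value, lo, hi):
    """Scale one continuous value into [0, 1] given its observed range."""
    return (value - lo) / (hi - lo) if hi > lo else 0.0

# Illustrative target-task fields (assumed for this sketch):
task_status = "abnormal"      # discrete field: normal / abnormal
delivery_minutes = 75.0       # continuous field: actual delivery time

first_sub_data = one_hot(task_status, ["normal", "abnormal"])
second_sub_data = [min_max_normalize(delivery_minutes, 0.0, 120.0)]

# Merging the two sub-vectors yields the "first data" of the method.
first_data = first_sub_data + second_sub_data
print(first_data)  # [0.0, 1.0, 0.625]
```

The same concatenation step would later join this first data with the comment-text vector (the second data) to form the full feature vector.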
In an optional implementation, word-vector techniques such as word2vec, a neural network language model, the C&M model (context & word), the CBOW (Continuous Bag of Words) model, or the Skip-gram model may be adopted to vectorize the comment text of the first object to obtain the second data.
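A lightweight sketch of such text vectorization averages per-word vectors into one fixed-length vector. The toy vocabulary and 3-dimensional vectors below are assumptions for illustration only, standing in for embeddings a real system would train with word2vec, CBOW, or Skip-gram:

```python
# Toy pretrained word vectors (assumed for illustration; a real system would
# load embeddings trained with word2vec, CBOW, or Skip-gram).
WORD_VECTORS = {
    "late": [0.9, 0.1, 0.0],
    "delivery": [0.2, 0.8, 0.1],
    "cold": [0.7, 0.0, 0.3],
}

def vectorize_comment(tokens, dim=3):
    """Average the known word vectors to get the 'second data' for a comment."""
    vecs = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]

second_data = vectorize_comment(["late", "delivery"])
print(second_data)  # approximately [0.55, 0.45, 0.05]
```

Averaging is only one pooling choice; sequence models over the per-word vectors are equally compatible with the feature-extraction step described here.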
Step S140: input the feature vector corresponding to the target task into the data processing model through at least one processor for processing, so as to obtain a first probability and a second probability. The first probability represents the probability that the target task belongs to the first class, and the second probability represents the probability that the target task belongs to the second class. Optionally, a first value (for example, 1) may characterize the target task belonging to the first class, and a second value (for example, 0) may characterize the target task belonging to the second class. Optionally, the target task is an abnormal task (i.e., a cancelled or complained task). The target task belonging to the first class may mean that the target task should be treated as a normal task, that is, the arbitration result is that arbitration passes and the complained-against party is not responsible. The target task belonging to the second class may mean that the target task is confirmed as an abnormal task, that is, the arbitration result is that arbitration fails and the complained-against party should take corresponding responsibility. Take an abnormal takeout task as an example: after the takeout task is completed, the delivery arrives half an hour later than expected, so the user files a complaint.
The delivery resource (or the merchant) applies for arbitration of the abnormal takeout task through the delivery resource terminal (or merchant terminal). A feature vector corresponding to the abnormal takeout task is then obtained from the acquired information of the abnormal takeout task and the comment text of the user, and input into the trained data processing model to obtain the probability that arbitration passes (i.e., the target task belongs to the first class) and the probability that arbitration fails (i.e., the target task belongs to the second class); the arbitration result (i.e., the class of the target task) is predicted from these probabilities. In this way, the arbitration result of an abnormal task can be predicted automatically by machine, improving processing efficiency.
Step S150, sending a first data processing result by the at least one processor in response to the first probability being greater than or equal to a first predetermined value, and sending a second data processing result by the at least one processor in response to the second probability being greater than or equal to a second predetermined value. The first data processing result comprises that the target task belongs to a first class, and the second data processing result comprises that the target task belongs to a second class.
In an alternative implementation, the first data processing result is sent by the at least one processor in response to the first probability being greater than or equal to a first predetermined value and the second probability being less than a second predetermined value, and the second data processing result is sent by the at least one processor in response to the second probability being greater than or equal to the second predetermined value and the first probability being less than the first predetermined value.
Taking arbitration of an abnormal task as an example, the target task belonging to the first class indicates that arbitration of the abnormal task passes, and the target task belonging to the second class indicates that arbitration fails. When the first probability output by the data processing model is greater than or equal to the first predetermined value and the second probability is less than the second predetermined value, the data processing result that the target task belongs to the first class (i.e., arbitration passes) is sent. When the second probability output by the data processing model is greater than or equal to the second predetermined value and the first probability is less than the first predetermined value, the data processing result that the target task belongs to the second class (i.e., arbitration fails) is sent. Optionally, the first predetermined value is 0.9 and the second predetermined value is 0.9.
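The three-way threshold logic above (using 0.9 for both predetermined values, as in the optional example) can be sketched as:

```python
FIRST_PREDETERMINED = 0.9   # threshold for the first class (arbitration passes)
SECOND_PREDETERMINED = 0.9  # threshold for the second class (arbitration fails)

def classify(p_first, p_second):
    """Map the model's two output probabilities to a data processing result."""
    if p_first >= FIRST_PREDETERMINED and p_second < SECOND_PREDETERMINED:
        return "first class: arbitration passes"
    if p_second >= SECOND_PREDETERMINED and p_first < FIRST_PREDETERMINED:
        return "second class: arbitration fails"
    # Both probabilities above threshold, or both below: the result cannot
    # be determined and the task is escalated (e.g., to manual arbitration).
    return "third class: escalate to manual arbitration"

print(classify(0.95, 0.05))  # first class: arbitration passes
print(classify(0.10, 0.92))  # second class: arbitration fails
print(classify(0.50, 0.50))  # third class: escalate to manual arbitration
```

Note that requiring both conditions before sending a first or second result is what makes the third (undecidable) branch reachable, matching the optional implementations described below.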
In an optional implementation, the data processing method of this embodiment further includes: sending, by at least one processor, a third processing result in response to the first probability being less than the first predetermined value and the second probability being less than the second predetermined value (or in response to the first probability being greater than the first predetermined value and the second probability being greater than the second predetermined value). Optionally, the third data processing result includes that the target task belongs to a third class. When the data processing task is an arbitration task, the target task belonging to the third class indicates that the arbitration result cannot be determined and the task needs to be arbitrated again by other means (such as manual arbitration). Taking arbitration of an abnormal task as an example, when the first probability output by the data processing model is less than the first predetermined value and the second probability is less than the second predetermined value, or the first probability is greater than the first predetermined value and the second probability is greater than the second predetermined value, a data processing result indicating that the arbitration result cannot be predicted is output. In this case, after the data processing result is received, arbitration of the abnormal task may be transferred to a manual processing step.
In an optional implementation, the data processing method of this embodiment further includes: training the data processing model based on a labeled data set. The data set includes information of a plurality of historical tasks, first object information and second object information corresponding to each historical task, and comment texts of the first object corresponding to each historical task. The historical tasks in the data set are historical tasks that have undergone arbitration. In the data set, the data processing result of each historical task is labeled; for example, historical tasks of the first class (i.e., historical tasks that passed arbitration) are labeled with a first value (e.g., 1), and historical tasks of the second class (i.e., historical tasks that failed arbitration) are labeled with a second value (e.g., 0). In this embodiment, the feature vector corresponding to each historical task is obtained and input into the data processing model for training, so as to adjust the parameters of the data processing model. For historical tasks labeled as the first class, the trained model outputs a first probability greater than or equal to the first predetermined value and a second probability less than the second predetermined value; for historical tasks labeled as the second class, the model outputs a second probability greater than or equal to the second predetermined value and a first probability less than the first predetermined value. The trained data processing model can therefore predict the class of the current target task more accurately, that is, predict more accurately whether arbitration of the current target task will pass, improving arbitration efficiency.
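As one hedged illustration of training on such a labeled data set: the embodiment does not fix a particular model family, so plain logistic regression trained by stochastic gradient descent on historical feature vectors is assumed here purely as an example (the feature vectors and labels are invented for the sketch):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, lr=0.5, epochs=500):
    """Fit logistic-regression weights on labeled historical tasks
    (label 1 = arbitration passed / first class, 0 = failed / second class)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log-loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Return the (first, second) class probabilities for a feature vector."""
    p_first = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return p_first, 1.0 - p_first

# Tiny labeled history; feature vectors and labels are illustrative only.
X = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]]
y = [1, 1, 0, 0]
w, b = train(X, y)
p_first, p_second = predict(w, b, [0.15, 0.85])
print(p_first > p_second)  # True: the query resembles the "passed" examples
```

A production system would of course use a richer model and a far larger arbitration history, but the thresholding of the two output probabilities would proceed exactly as described in steps S150 and S250 to S290.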
In this embodiment, the method for obtaining the feature vector corresponding to each historical task is similar to the above method for obtaining the feature vector corresponding to the target task, and is not described herein again.
The embodiment of the invention obtains the characteristic vector corresponding to the target task by extracting the characteristics of the target task information and the comment text of the corresponding first object, inputs the characteristic vector corresponding to the target task into the data processing model for processing to obtain the first probability and the second probability, predicts the category of the target task by judging the first probability and the second probability, and outputs the corresponding data processing result, thereby improving the data processing efficiency.
Fig. 2 is a flow chart of another data processing method according to an embodiment of the present invention. As shown in fig. 2, the data processing method of the present embodiment includes the following steps:
step S210, receiving a data command from the terminal device. And when receiving the data processing task, the terminal equipment sends a corresponding data instruction.
Step S220, analyzing the data command through at least one processor, and acquiring target task information and a comment text of the corresponding first object. The target task information includes a task record of the target task, corresponding first object information, and second object information.
Step S230, obtaining, by at least one processor, a feature vector corresponding to the target task according to the target task information and the comment text of the first object.
In an optional implementation, step S230 may include: performing feature extraction on the target task information to obtain first data, performing feature extraction on the comment text of the first object to obtain second data, and merging the first data and the second data to obtain the feature vector corresponding to the target task. Optionally, the first data and the second data are concatenated end to end to obtain the feature vector corresponding to the target task. Optionally, one-hot encoding is performed on the discrete data in the target task information to obtain first sub-data, normalization is performed on the continuous data in the target task information to obtain second sub-data, and the first sub-data and the second sub-data are merged to obtain the first data. In this way, the discrete data and the continuous data in the target task information are processed differently, further improving data processing efficiency. Optionally, word-vector techniques such as word2vec, a neural network language model, the C&M model (context & word), the CBOW (Continuous Bag of Words) model, or the Skip-gram model may be used to vectorize the comment text of the first object to obtain the second data.
Step S240, inputting the feature vector corresponding to the target task to the data processing model through at least one processor for processing, so as to obtain a first probability and a second probability. The first probability represents the probability that the target task belongs to a first class, and the second probability represents the probability that the target task belongs to a second class. Optionally, a first value (for example, 1) may be used to indicate that the target task belongs to the first class, and a second value (for example, 0) may be used to indicate that the target task belongs to the second class. Optionally, the target task is an abnormal task (i.e., a cancelled or complained-about task). The target task belonging to the first class may mean that the target task should be a normal task, that is, the arbitration result of the target task is that the arbitration passes and the complainer is not responsible. The target task belonging to the second class may mean that the target task is determined to be an abnormal task, that is, the arbitration result of the target task is that the arbitration fails and the complainer should bear the corresponding responsibility. In the present embodiment, steps S210 to S240 are similar to steps S110 to S140, and will not be described in detail.
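For illustration, a two-class model head commonly produces the pair of probabilities via a softmax over raw scores (logits); the patent does not specify the model internals, and the logit values below are made up for demonstration.

```python
import math

# Illustrative only: turn two raw model scores (logits) into the
# first and second probability via softmax. The logits are invented.
def softmax2(logit_first, logit_second):
    m = max(logit_first, logit_second)  # subtract the max for numerical stability
    e_first = math.exp(logit_first - m)
    e_second = math.exp(logit_second - m)
    total = e_first + e_second
    return e_first / total, e_second / total

first_probability, second_probability = softmax2(2.0, -1.0)
```

By construction the two probabilities sum to one, so downstream thresholding on each of them separately is still meaningful.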
Step S250, determining whether the first probability is smaller than a first predetermined value, if the first probability is not smaller than the first predetermined value, executing step S260, and if the first probability is smaller than the first predetermined value, executing step S270.
Step S260, determining whether the second probability is smaller than a second predetermined value; if the second probability is smaller than the second predetermined value, executing step S280; and if the second probability is not smaller than the second predetermined value, executing step S290.
Step S280, sending a first data processing result, where the first data processing result includes that the target task belongs to the first class. That is, when the first probability is not less than the first predetermined value and the second probability is less than the second predetermined value, the first data processing result is transmitted. Taking the arbitration processing of the abnormal task as an example, when the first probability output by the data processing model is greater than or equal to the first predetermined value and the second probability is less than the second predetermined value, the data processing result passing the arbitration (that is, the target task belongs to the first class) is sent.
Step S270, determining whether the second probability is smaller than the second predetermined value; if the second probability is smaller than the second predetermined value, executing step S290; and if the second probability is not smaller than the second predetermined value, executing step S2A0.
Step S290, sending a third data processing result. That is, the third data processing result is sent when the first probability is not less than the first predetermined value and the second probability is not less than the second predetermined value, or when the first probability is less than the first predetermined value and the second probability is less than the second predetermined value. Taking the arbitration processing of an abnormal task as an example, when the first probability output by the data processing model is not less than the first predetermined value and the second probability is not less than the second predetermined value, or when the first probability is less than the first predetermined value and the second probability is less than the second predetermined value, a data processing result indicating that the arbitration result cannot be determined is output. In this case, after the data processing result is received, the arbitration processing of the abnormal task may be transferred to a manual processing link. Therefore, not all arbitration tasks need to be processed manually, which improves processing efficiency and saves manpower.
Step S2A0, sending a second data processing result, where the second data processing result includes that the target task belongs to the second class. That is, the second data processing result is sent when the first probability is smaller than the first predetermined value and the second probability is not smaller than the second predetermined value. Taking the arbitration processing of an abnormal task as an example, when the first probability output by the data processing model is smaller than the first predetermined value and the second probability is not smaller than the second predetermined value, the data processing result that the arbitration fails (i.e., the target task belongs to the second class) is sent. Optionally, the first predetermined value is 0.9, and the second predetermined value is 0.9.
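The branching of steps S250 through S2A0 reduces to a small decision function. The sketch below assumes, as the embodiment suggests, that both predetermined values are 0.9; the result labels are illustrative stand-ins for the first, second, and third data processing results.

```python
# Minimal sketch of the decision logic in steps S250–S2A0,
# assuming both thresholds are 0.9 as in the embodiment.
FIRST_PREDETERMINED = 0.9
SECOND_PREDETERMINED = 0.9

def arbitrate(first_probability, second_probability):
    """Map the two model probabilities onto the three data processing results."""
    if first_probability >= FIRST_PREDETERMINED and second_probability < SECOND_PREDETERMINED:
        return "first"   # arbitration passes: target task belongs to the first class
    if first_probability < FIRST_PREDETERMINED and second_probability >= SECOND_PREDETERMINED:
        return "second"  # arbitration fails: target task belongs to the second class
    return "third"       # result cannot be determined: hand over to manual arbitration
```

Note that both "both probabilities high" and "both probabilities low" fall through to the third result, matching step S290.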
According to the embodiment of the invention, feature extraction is performed on the target task information and the comment text of the corresponding first object to obtain the feature vector corresponding to the target task; the feature vector is input to the data processing model for processing to obtain the first probability and the second probability; and the category of the target task is predicted by comparing the first probability and the second probability with the predetermined values, so that the corresponding data processing result is output.
Fig. 3 is a data flow diagram of a data processing method according to an embodiment of the present invention. As shown in fig. 3, in the present embodiment, after an arbitration request (i.e., the data instruction) is received, target task information 31 and comment text 32 of the first object corresponding to the target task are acquired. The target task information 31 includes a task record of the target task, first object information, and second object information. Taking a takeaway task as an example, the task record of the target task may include meal information, delivery information, time information during delivery, and the like. The first object is the ordering user, and the first object information may include identity information of the ordering user, historical task information, historical abnormal task information (cancelled or complained-about tasks), historical comment text, credit information, and the like. The second object is the merchant, and the second object information may include historical task information, historical abnormal task information (cancelled or complained-about tasks), historical comment text, credit information, and the like. The comment text 32 of the first object may include the reason for cancelling the task or making the complaint, such as a severe delivery timeout. The present embodiment is described by taking takeaway tasks as an example; it should be understood that the present embodiment may also be used in other fields, for example, on a shopping platform where a buyer returns goods or complains about a seller or a distributor, or in ride-hailing where a ride is cancelled or a driver is complained about, and the embodiments of the present invention are not limited thereto.
In this embodiment, discrete data in the target task information is encoded through one-hot encoding to obtain first sub-data, continuous data in the target task information is normalized to obtain second sub-data, and the first sub-data and the second sub-data are merged to obtain first data; the comment text of the first object is vectorized based on a CBOW model to obtain second data; and the first data and the second data are spliced to obtain the feature vector corresponding to the target task.
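As a hedged illustration of the comment-text vectorization: a real implementation would look up CBOW-trained word embeddings (e.g., from word2vec) and pool them into one fixed-length vector. Here a tiny hand-made embedding table stands in for the trained model, so only the pooling step is shown; the tokens and vectors are invented.

```python
# Toy embedding table standing in for CBOW-trained word vectors.
# Tokens and vector values are invented for demonstration.
TOY_EMBEDDINGS = {
    "delivery": [0.9, 0.1],
    "timeout":  [0.8, 0.3],
    "good":     [0.1, 0.9],
}

def vectorize_comment(tokens, embeddings, dim=2):
    """Average the embeddings of known tokens into the second data."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim  # no known token: fall back to a zero vector
    return [sum(col) / len(vectors) for col in zip(*vectors)]

second_data = vectorize_comment(["delivery", "timeout"], TOY_EMBEDDINGS)
```

Averaging is only one pooling choice; the patent leaves the exact vectorization open (word2vec, CBOW, Skip-gram, or a neural network language model).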
In this embodiment, the feature vector corresponding to the target task is input to the pre-trained data processing model 33 for processing, and the first probability and the second probability are output. The first probability represents the probability that the target task belongs to the first class, and the second probability represents the probability that the target task belongs to the second class. The first probability and the second probability are then input to the determining unit 34, and the determining unit 34 predicts the classification result, i.e., the arbitration result, of the target task according to the first probability and the second probability. The determining unit 34 determines that the arbitration passes when the first probability is not less than the first predetermined value and the second probability is less than the second predetermined value; determines that the arbitration fails when the first probability is less than the first predetermined value and the second probability is not less than the second predetermined value; and outputs a result indicating that the arbitration result cannot be determined when the first probability is not less than the first predetermined value and the second probability is not less than the second predetermined value, or when the first probability is less than the first predetermined value and the second probability is less than the second predetermined value.
In this embodiment, the data processing model is trained according to a pre-labelled data set. The data set comprises information of a plurality of historical tasks that have undergone arbitration processing, the first object information and the second object information corresponding to each historical task, and the comment text of the first object corresponding to each historical task. In the data set, the data processing result of each historical task is labelled; for example, a historical task whose arbitration passed is labelled with the first value (for example, 1), and a historical task whose arbitration failed is labelled with the second value (for example, 0). In this embodiment, the feature vectors corresponding to the historical tasks are obtained and input into the data processing model for training, so as to adjust the parameters of the data processing model such that, for a historical task labelled with the first value, the first probability output by the data processing model is greater than or equal to the first predetermined value and the second probability is smaller than the second predetermined value, and, for a historical task labelled with the second value, the second probability output by the data processing model is greater than or equal to the second predetermined value and the first probability is smaller than the first predetermined value. Therefore, the trained data processing model can more accurately predict the category of the current target task, that is, can more accurately predict whether the arbitration of the current target task should pass, thereby improving arbitration efficiency.
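A simplified training sketch, under stated assumptions: the data processing model is approximated here by logistic regression (the patent does not fix the model family), and the pre-labelled data set by a few toy feature vectors, with label 1 meaning arbitration passed and 0 meaning arbitration failed. The first probability is the model output and the second probability its complement.

```python
import math

def train(samples, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression stand-in for the data processing model by SGD."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
            grad = p - y  # gradient of the log-loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b

def predict_probabilities(w, b, x):
    """Return the (first, second) probability pair for a feature vector."""
    p = 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
    return p, 1.0 - p

# Toy labelled data set: label 1 = arbitration passed, 0 = failed.
samples = [[1.0, 0.1], [0.9, 0.0], [0.1, 0.9], [0.0, 1.0]]
labels = [1, 1, 0, 0]
w, b = train(samples, labels)
first_p, second_p = predict_probabilities(w, b, [0.95, 0.05])
```

After training, a vector resembling the "passed" examples yields a first probability near 1, matching the labelling scheme described above.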
According to the embodiment of the invention, feature extraction is performed on the target task information and the comment text of the corresponding first object to obtain the feature vector corresponding to the target task; the feature vector is input to the data processing model for processing to obtain the first probability and the second probability; and the category of the target task is predicted by comparing the first probability and the second probability with the predetermined values, so that the corresponding data processing result is output.
Fig. 4 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention. As shown in fig. 4, the data processing apparatus 4 of the present embodiment includes a receiving unit 41, an acquiring unit 42, a feature extracting unit 43, a processing unit 44, a first transmitting unit 45, and a second transmitting unit 46.
The receiving unit 41 is configured to receive a data instruction from the terminal device. The obtaining unit 42 is configured to parse the data instruction through at least one processor, and obtain target task information and corresponding comment text of the first object, where the target task information includes a task record of the target task, corresponding first object information, and second object information.
The feature extraction unit 43 is configured to obtain, by at least one processor, a feature vector corresponding to the target task according to the target task information and the comment text. In an alternative implementation, the feature extraction unit 43 includes a first feature extraction sub-unit 431, a second feature extraction sub-unit 432, and a merging sub-unit 433. The first feature extraction subunit 431 is configured to perform feature extraction on the target task information to acquire first data. The second feature extraction subunit 432 is configured to perform feature extraction on the comment text to obtain second data. The merging subunit 433 is configured to merge the first data and the second data to obtain a feature vector corresponding to the target task.
In an alternative implementation, the first feature extraction subunit 431 includes a first processing module 4311, a second processing module 4312, and a merging module 4313. The first processing module 4311 is configured to perform one-hot encoding on the discrete data in the target task information to obtain first sub-data. The second processing module 4312 is configured to perform normalization processing on the continuous data in the target task information to obtain second sub-data. The merging module 4313 is configured to merge the first sub-data and the second sub-data to obtain the first data.
The processing unit 44 is configured to input, by at least one processor, the feature vector to a data processing model for processing to obtain a first probability characterizing a probability that the target task belongs to a first class and a second probability characterizing a probability that the target task belongs to a second class.
The first sending unit 45 is configured to send, by the at least one processor, a first data processing result including that the target task belongs to the first class in response to the first probability being greater than or equal to a first predetermined value. Optionally, the first sending unit 45 is further configured to send, by the at least one processor, the first data processing result in response to the first probability being greater than or equal to a first predetermined value and the second probability being less than a second predetermined value.
The second sending unit 46 is configured to send, by the at least one processor, a second data processing result including that the target task belongs to the second class in response to the second probability being greater than or equal to a second predetermined value. Optionally, the second sending unit 46 is further configured to send, by the at least one processor, the second data processing result in response to the second probability being greater than or equal to the second predetermined value and the first probability being less than the first predetermined value.
In an alternative implementation, the data processing device 4 further comprises a third sending unit 47. The third sending unit 47 is configured to send, by the at least one processor, a third processing result in response to the first probability being less than the first predetermined value and the second probability being less than the second predetermined value. Optionally, the third data processing result includes that the target task belongs to a third class. When the data processing task is an arbitration task, the target task belonging to the third class indicates that the arbitration result cannot be determined and that the task needs to be arbitrated again in another manner (for example, manual arbitration).
In an alternative implementation, the data processing device 4 further comprises a fourth sending unit 48. The fourth transmitting unit 48 is configured to transmit, by the at least one processor, a third processing result in response to the first probability being greater than or equal to the first predetermined value and the second probability being greater than or equal to the second predetermined value.
In an alternative implementation, the data processing device 4 further comprises a training unit 49. The training unit 49 is configured to train the data processing model based on a data set. The data set comprises information of a plurality of historical tasks, first object information and second object information corresponding to each historical task, and comment texts of the first objects corresponding to each historical task.
According to the embodiment of the invention, feature extraction is performed on the target task information and the comment text of the corresponding first object to obtain the feature vector corresponding to the target task; the feature vector is input to the data processing model for processing to obtain the first probability and the second probability; and the category of the target task is predicted by comparing the first probability and the second probability with the predetermined values, so that the corresponding data processing result is output.
Fig. 5 is a schematic diagram of an electronic device of an embodiment of the invention. In the present embodiment, the electronic device 5 includes a server, a terminal, and the like. As shown in fig. 5, the electronic device 5 includes: at least one processor 51; a memory 52 communicatively coupled to the at least one processor 51; and a communication component 53 in communicative connection with the scanning device, the communication component 53 receiving and transmitting data under the control of the processor 51. The memory 52 stores instructions executable by the at least one processor 51, and the instructions are executed by the at least one processor 51 to implement the data processing method.
Specifically, the electronic device includes: one or more processors 51 and a memory 52, with one processor 51 being an example in fig. 5. The processor 51 and the memory 52 may be connected by a bus or other means, and fig. 5 illustrates the connection by the bus as an example. The memory 52, which is a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The processor 51 executes various functional applications of the device and data processing, i.e., implements the above-described data processing method, by executing nonvolatile software programs, instructions, and modules stored in the memory 52.
The memory 52 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store a list of options, etc. Further, the memory 52 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 52 may optionally include memory located remotely from the processor 51, which may be connected to an external device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
One or more modules are stored in the memory 52 and, when executed by the one or more processors 51, perform the data processing method in any of the method embodiments described above.
The above product can execute the method provided by the embodiments of the present application and has the corresponding functional modules and beneficial effects for executing the method; for technical details not described in detail in this embodiment, reference may be made to the method provided by the embodiments of the present application.
According to the embodiment of the invention, feature extraction is performed on the target task information and the comment text of the corresponding first object to obtain the feature vector corresponding to the target task; the feature vector is input to the data processing model for processing to obtain the first probability and the second probability; and the corresponding data processing result is output by comparing the first probability and the second probability with the predetermined values.
Another embodiment of the invention is directed to a non-transitory storage medium storing a computer-readable program for causing a computer to perform some or all of the above-described method embodiments.
That is, as can be understood by those skilled in the art, all or part of the steps in the methods of the embodiments described above may be implemented by a program instructing related hardware. The program is stored in a storage medium and includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The embodiment of the invention discloses A1 and a data processing method, wherein the method comprises the following steps:
receiving a data instruction from terminal equipment;
analyzing the data instruction through at least one processor to obtain target task information and a comment text of a corresponding first object, wherein the target task information comprises a task record of the target task, corresponding first object information and second object information;
obtaining a feature vector corresponding to the target task according to the target task information and the comment text through at least one processor;
inputting, by at least one processor, the feature vector into a data processing model for processing to obtain a first probability and a second probability, the first probability being used to represent a probability that the target task belongs to a first class, and the second probability being used to represent a probability that the target task belongs to a second class;
responsive to the first probability being greater than or equal to a first predetermined value, sending, by at least one processor, a first data processing result, the first data processing result including that the target task belongs to a first class;
in response to the second probability being greater than or equal to a second predetermined value, sending, by at least one processor, a second data processing result, the second data processing result including that the target task belongs to a second class.
A2, the method according to A1, wherein the obtaining, by at least one processor, the feature vector corresponding to the target task according to the target task information and the comment text includes:
performing feature extraction on the target task information to acquire first data;
performing feature extraction on the comment text to acquire second data;
and combining the first data and the second data to obtain a feature vector corresponding to the target task.
A3, the method of A2, wherein the feature extracting the target task information to obtain first data includes:
performing one-hot encoding on discrete data in the target task information to obtain first sub-data;
performing normalization processing on continuous data in the target task information to obtain second sub-data;
and merging the first sub-data and the second sub-data to obtain the first data.
A4, the method according to A1, wherein the method further comprises:
sending, by at least one processor, a third processing result in response to the first probability being less than the first predetermined value and the second probability being less than the second predetermined value.
A5, the method according to A1, wherein the method further comprises:
sending, by at least one processor, a third processing result in response to the first probability being greater than or equal to the first predetermined value and the second probability being greater than or equal to the second predetermined value.
A6, the method according to any one of A1-A5, wherein the method further comprises:
training the data processing model based on a data set;
the data set comprises information of a plurality of historical tasks, first object information and second object information corresponding to each historical task, and comment texts of the first objects corresponding to each historical task.
The embodiment of the invention discloses B1 and a data processing device, wherein the device comprises:
a receiving unit configured to receive a data instruction from a terminal device;
the acquisition unit is configured to analyze the data instruction through at least one processor, and acquire target task information and corresponding comment texts of the first object, wherein the target task information comprises task records of the target task, corresponding first object information and second object information;
the feature extraction unit is configured to obtain, through at least one processor, a feature vector corresponding to the target task according to the target task information and the comment text;
a processing unit configured to input, by at least one processor, the feature vector to a data processing model for processing to obtain a first probability and a second probability, the first probability being used for representing a probability that the target task belongs to a first class, and the second probability being used for representing a probability that the target task belongs to a second class;
a first sending unit configured to send, by at least one processor, a first data processing result in response to the first probability being greater than or equal to a first predetermined value, the first data processing result including that the target task belongs to a first class;
a second sending unit configured to send, by the at least one processor, a second data processing result including that the target task belongs to a second class in response to the second probability being greater than or equal to a second predetermined value.
B2, the apparatus according to B1, wherein the feature extraction unit includes:
a first feature extraction subunit configured to perform feature extraction on the target task information to acquire first data;
a second feature extraction subunit configured to perform feature extraction on the comment text to acquire second data;
a merging subunit configured to merge the first data and the second data to obtain a feature vector corresponding to the target task.
B3, the apparatus according to B2, wherein the first feature extraction subunit includes:
the first processing module is configured to perform one-hot encoding on the discrete data in the target task information to obtain first sub-data;
the second processing module is configured to perform normalization processing on continuous data in the target task information to obtain second sub-data;
a merging module configured to merge the first sub-data and the second sub-data to obtain the first data.
B4, the apparatus according to B1, wherein the apparatus further comprises:
a third transmitting unit configured to transmit, by at least one processor, a third processing result in response to the first probability being less than the first predetermined value and the second probability being less than the second predetermined value.
B5, the apparatus according to B1, wherein the apparatus further comprises:
a fourth transmitting unit configured to transmit, by at least one processor, a third processing result in response to the first probability being greater than or equal to the first predetermined value and the second probability being greater than or equal to the second predetermined value.
B6, the device according to any one of B1-B5, wherein the device further comprises:
a training unit configured to train the data processing model based on a data set;
the data set comprises information of a plurality of historical tasks, first object information and second object information corresponding to each historical task, and comment texts of the first objects corresponding to each historical task.
The embodiment of the invention also discloses C1, an electronic device, comprising a memory and a processor, wherein the memory is used for storing one or more computer program instructions, and the processor executes the one or more computer program instructions to realize the method according to any one of A1-A6.
The embodiment of the invention also discloses C2, a computer readable storage medium, on which computer program instructions are stored, wherein the computer program instructions, when executed by a processor, implement the method according to any one of A1-A6.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method of data processing, the method comprising:
receiving a data instruction from terminal equipment;
analyzing the data instruction through at least one processor to obtain target task information and a comment text of a corresponding first object, wherein the target task information comprises a task record of the target task, corresponding first object information and second object information;
obtaining a feature vector corresponding to the target task according to the target task information and the comment text through at least one processor;
inputting, by at least one processor, the feature vector into a data processing model for processing to obtain a first probability and a second probability, the first probability being used to represent a probability that the target task belongs to a first class, and the second probability being used to represent a probability that the target task belongs to a second class;
responsive to the first probability being greater than or equal to a first predetermined value, sending, by at least one processor, a first data processing result, the first data processing result including that the target task belongs to a first class;
in response to the second probability being greater than or equal to a second predetermined value, sending, by at least one processor, a second data processing result, the second data processing result including that the target task belongs to a second class.
2. The method of claim 1, wherein obtaining, by at least one processor, a feature vector corresponding to the target task according to the target task information and the comment text comprises:
performing feature extraction on the target task information to acquire first data;
performing feature extraction on the comment text to acquire second data;
and combining the first data and the second data to obtain a feature vector corresponding to the target task.
3. The method of claim 2, wherein feature extracting the target task information to obtain first data comprises:
performing one-hot encoding on discrete data in the target task information to obtain first sub-data;
performing normalization processing on continuous data in the target task information to obtain second sub-data;
and merging the first sub-data and the second sub-data to obtain the first data.
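A minimal sketch of the encoding in claim 3, assuming each discrete field's category list and each continuous field's min/max bounds are fixed in advance (all helper and parameter names are illustrative):

```python
import numpy as np

def one_hot(value, categories):
    """One-hot encode a discrete value against a fixed category list."""
    vec = np.zeros(len(categories))
    vec[categories.index(value)] = 1.0
    return vec

def normalize(value, lo, hi):
    """Min-max normalize a continuous value into [0, 1]."""
    return (value - lo) / (hi - lo)

def encode_task_info(task_info, categories, bounds):
    """Build the first data: one-hot first sub-data for discrete
    fields concatenated with normalized second sub-data for
    continuous fields, as in claim 3."""
    first_sub = np.concatenate(
        [one_hot(task_info[k], categories[k]) for k in categories])
    second_sub = np.array(
        [normalize(task_info[k], *bounds[k]) for k in bounds])
    return np.concatenate([first_sub, second_sub])
```

For instance, a task with a discrete field `city = "A"` over categories `["A", "B", "C"]` and a continuous field `duration = 30` bounded by `(0, 60)` encodes to `[1.0, 0.0, 0.0, 0.5]`.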
4. The method of claim 1, further comprising:
sending, by at least one processor, a third processing result in response to the first probability being less than the first predetermined value and the second probability being less than the second predetermined value.
5. The method of claim 1, further comprising:
sending, by at least one processor, a third processing result in response to the first probability being greater than or equal to the first predetermined value and the second probability being greater than or equal to the second predetermined value.
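Reading claims 1, 4 and 5 together, the threshold logic can be sketched as follows; the function and result names are illustrative, since the claims do not name them:

```python
def classify(first_probability, second_probability,
             first_predetermined_value, second_predetermined_value):
    """Map the two model probabilities to a data processing result.

    Claims 4 and 5 cover the ambiguous cases: when neither
    probability reaches its predetermined value, or when both do,
    a third processing result is sent instead of a class.
    """
    above_first = first_probability >= first_predetermined_value
    above_second = second_probability >= second_predetermined_value
    if above_first and above_second:
        return "third"   # claim 5: both thresholds met
    if above_first:
        return "first"   # claim 1: target task belongs to the first class
    if above_second:
        return "second"  # claim 1: target task belongs to the second class
    return "third"       # claim 4: neither threshold met
```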
6. The method according to any one of claims 1-5, further comprising:
training the data processing model based on a data set;
the data set comprises information of a plurality of historical tasks, first object information and second object information corresponding to each historical task, and comment texts of the first objects corresponding to each historical task.
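The data set of claim 6 can be assembled as in the sketch below. The record fields (`info`, `comment`, `label`) and the pluggable `encode` function are assumptions: the claim states only what the data set contains, not its layout:

```python
import numpy as np

def build_training_set(historical_tasks, encode):
    """Turn historical task records into (features, labels) arrays.

    Each record is assumed to carry the historical task information
    with its first and second object information under "info", the
    first object's comment text under "comment", and a class label
    under "label"; `encode` merges info and comment into one feature
    vector, mirroring claim 2.
    """
    features = np.stack([encode(t["info"], t["comment"])
                         for t in historical_tasks])
    labels = np.array([t["label"] for t in historical_tasks])
    return features, labels
```

The resulting arrays can then be fed to any two-class probabilistic classifier to produce the first and second probabilities of claim 1.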
7. A data processing apparatus, characterized in that the apparatus comprises:
a receiving unit configured to receive a data instruction from a terminal device;
an acquisition unit configured to parse, by at least one processor, the data instruction to obtain target task information and a comment text of a corresponding first object, wherein the target task information comprises a task record of the target task, corresponding first object information and second object information;
a feature extraction unit configured to obtain, by at least one processor, a feature vector corresponding to the target task according to the target task information and the comment text;
a processing unit configured to input, by at least one processor, the feature vector to a data processing model for processing to obtain a first probability and a second probability, the first probability being used for representing a probability that the target task belongs to a first class, and the second probability being used for representing a probability that the target task belongs to a second class;
a first sending unit configured to send, by at least one processor, a first data processing result in response to the first probability being greater than or equal to a first predetermined value, the first data processing result including that the target task belongs to a first class;
a second sending unit configured to send, by at least one processor, a second data processing result in response to the second probability being greater than or equal to a second predetermined value, the second data processing result including that the target task belongs to a second class.
8. The apparatus of claim 7, wherein the feature extraction unit comprises:
a first feature extraction subunit configured to perform feature extraction on the target task information to acquire first data;
a second feature extraction subunit configured to perform feature extraction on the comment text to acquire second data;
a merging subunit configured to merge the first data and the second data to obtain a feature vector corresponding to the target task.
9. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method of any of claims 1-6.
10. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method according to any one of claims 1-6.
CN202010148156.XA 2020-03-05 2020-03-05 Data processing method and device, electronic equipment and computer readable storage medium Pending CN111460140A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010148156.XA CN111460140A (en) 2020-03-05 2020-03-05 Data processing method and device, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN111460140A true CN111460140A (en) 2020-07-28

Family

ID=71683228

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010148156.XA Pending CN111460140A (en) 2020-03-05 2020-03-05 Data processing method and device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN111460140A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897113A (en) * 2017-02-23 2017-06-27 郑州云海信息技术有限公司 Method and device for predicting the running state of a virtualized host
CN107688564A (en) * 2017-08-31 2018-02-13 平安科技(深圳)有限公司 News subject company identification method, electronic device and computer-readable storage medium
CN108108348A (en) * 2017-11-17 2018-06-01 腾讯科技(成都)有限公司 Processing method, server, storage medium and the electronic device of information
CN109241418A (en) * 2018-08-22 2019-01-18 中国平安人寿保险股份有限公司 Random-forest-based abnormal user identification method, apparatus, device and medium
CN109903095A (en) * 2019-03-01 2019-06-18 上海拉扎斯信息科技有限公司 Data processing method, device, electronic equipment and computer readable storage medium
CN110378529A (en) * 2019-07-17 2019-10-25 拉扎斯网络科技(上海)有限公司 Data generation method, apparatus, readable storage medium and electronic device
CN110689254A (en) * 2019-09-23 2020-01-14 拉扎斯网络科技(上海)有限公司 Data processing method and device, electronic equipment and computer readable storage medium

Similar Documents

Publication Publication Date Title
CN110400219B (en) Service processing method and system, and transaction monitoring method and system
CN110097451B (en) Bank business monitoring method and device
CN109635292B (en) Work order quality inspection method and device based on machine learning algorithm
CN110009155B (en) Method and device for estimating distribution difficulty of service area and electronic equipment
CN108805332B (en) Feature evaluation method and device
CN112529321B (en) Risk prediction method and device based on user data and computer equipment
CN111539780A (en) Task processing method and device, storage medium and electronic equipment
CN108234441B (en) Method, apparatus, electronic device and storage medium for determining forged access request
CN112860676A (en) Data cleaning method applied to big data mining and business analysis and cloud server
CN111582407B (en) Task processing method and device, readable storage medium and electronic equipment
CN117422553A (en) Transaction processing method, device, equipment, medium and product of blockchain network
CN111460140A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN111459675A (en) Data processing method and device, readable storage medium and electronic equipment
CN111831630A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN112506063B (en) Data analysis method, system, electronic device and storage medium
CN113807858A (en) Data processing method based on decision tree model and related equipment
CN110213341B (en) Method and device for detecting downloading of application program
CN110348190B (en) User equipment attribution judging method and device based on user operation behaviors
CN113743435A (en) Business data classification model training method and device, and business data classification method and device
CN112541669A (en) Risk identification method, system and device
US10873550B2 (en) Methods for communication in a communication network for reduced data traffic
CN116993396B (en) Risk early warning method based on vehicle user tag and computer equipment
CN113132312A (en) Processing method and device for threat detection rule
CN111611473A (en) Information push processing method and device, storage medium and terminal
CN111488738A (en) Illegal information identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination