CN111105044B - Discrete feature processing method and device, computer equipment and storage medium - Google Patents

Info

Publication number
CN111105044B
Authority
CN
China
Prior art keywords
discrete
driver
source
attribute dimension
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911326146.4A
Other languages
Chinese (zh)
Other versions
CN111105044A (en
Inventor
宁永恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Manyun Software Technology Co Ltd
Original Assignee
Jiangsu Manyun Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Manyun Software Technology Co Ltd filed Critical Jiangsu Manyun Software Technology Co Ltd
Priority to CN201911326146.4A priority Critical patent/CN111105044B/en
Publication of CN111105044A publication Critical patent/CN111105044A/en
Application granted granted Critical
Publication of CN111105044B publication Critical patent/CN111105044B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the invention disclose a discrete feature processing method and device, computer equipment, and a storage medium. The method comprises: acquiring a cargo source discrete feature and a driver discrete feature of the same attribute dimension; judging whether the cargo source discrete feature and the driver discrete feature of a target attribute dimension match; determining the value of the identification bit of the target attribute dimension in a training vector according to the judgment result; and training a model using the training vector. The scheme reduces the discrete feature dimension and thereby speeds up business and engineering iteration, without losing feature information.

Description

Discrete feature processing method and device, computer equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of machine learning, in particular to a discrete feature processing method and device, computer equipment and a storage medium.
Background
With the continuous development of machine learning technology, reducing the feature dimension to speed up training, and thereby to accelerate business and engineering iteration, has been widely researched. The features involved in machine learning training may be continuous features or discrete features.
At present, feature dimensions are mainly reduced by a text processing neural network model (such as a Word2vec neural network model); the input of such a model is a sequence, and any sequence can be understood as a continuous feature.
The prior-art method reduces the feature dimension of continuous features well, but discrete features have no notion of sequence, so it cannot be used to reduce their dimension. A processing method suited to discrete features is therefore needed, one that reduces the discrete feature dimension and thereby speeds up business and engineering iteration.
Disclosure of Invention
The invention provides a discrete feature processing method and device, computer equipment, and a storage medium, which reduce the discrete feature dimension and speed up business and engineering iteration without losing feature information.
In a first aspect, an embodiment of the present invention provides a discrete feature processing method, where the method includes:
acquiring a cargo source discrete characteristic and a driver discrete characteristic with the same attribute dimension;
judging whether the discrete features of the goods source and the driver of the target attribute dimension are matched or not;
determining the value of the identification bit of the target attribute dimension in the training vector according to the judgment result;
training a model using the training vectors.
In a second aspect, an embodiment of the present invention further provides a discrete feature processing apparatus, where the apparatus includes:
the discrete characteristic acquisition module is used for acquiring the discrete characteristics of the goods source and the discrete characteristics of the driver with the same attribute dimension;
the discrete characteristic matching module is used for judging whether the discrete characteristics of the goods source and the discrete characteristics of the driver are matched or not;
the attribute dimension value module is used for determining the value of the identification bit of the attribute dimension in the training vector according to the judgment result;
and the training module is used for training the model by using the training vector.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the discrete feature processing method according to any one of the embodiments of the present invention.
In a fourth aspect, embodiments of the invention are directed to a storage medium containing computer-executable instructions for performing a discrete feature processing method as described in any one of the embodiments of the invention when executed by a computer processor.
According to the scheme of the embodiment of the invention, a cargo source discrete feature and a driver discrete feature of the same attribute dimension are acquired; whether the cargo source discrete feature and the driver discrete feature of the target attribute dimension match is judged; the value of the identification bit of the target attribute dimension in the training vector is determined according to the judgment result; and the model is trained using the training vector. The discrete feature dimension is thereby reduced and business and engineering iteration is sped up, without losing feature information.
Drawings
FIG. 1 is a flow chart of a discrete feature processing method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a discrete feature processing method according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a discrete feature processing apparatus according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It should be further noted that, for the convenience of description, only some of the structures associated with the present invention are shown in the drawings, not all of them.
Example one
Fig. 1 is a flowchart of a discrete feature processing method according to an embodiment of the present invention, where the embodiment is applicable to a case where discrete features are processed in machine learning, and the method may be executed by a discrete feature processing apparatus, which may be implemented by software and/or hardware and integrated in a computer device executing the method. Specifically, referring to fig. 1, the method mainly includes the following steps:
S110, acquiring the cargo source discrete feature and the driver discrete feature of the same attribute dimension.
Specifically, the cargo source may be any goods that a company or an individual wants delivered, for example: company A needs to ship daily necessities from Beijing to Tianjin, or an individual surnamed Li needs to ship steel products from Shanghai to Beijing; the embodiment of the present invention is not limited thereto. The driver may be a driver who delivers the cargo source; in general, information such as each driver's driving route and vehicle model is fixed. For example, the driver and the cargo source may both be registered on a certain logistics platform, where their detailed information can be viewed.
The discrete characteristics of the goods sources mainly comprise the characteristics of the goods source loading and unloading mode, the freight unit, the required vehicle length, the vehicle type and the like; the driver discrete characteristics mainly include: vehicle type, vehicle length, type of cargo being transported, and travel route.
Specifically, acquiring the cargo source discrete feature and the driver discrete feature of the same attribute dimension means acquiring the features that both sides have in common. For example, the required vehicle type and required vehicle length are taken from the cargo source discrete features, and the vehicle type and vehicle length are taken from the driver discrete features; the embodiment of the present invention is not limited thereto.
Optionally, obtaining the discrete source feature and the discrete driver feature of the same attribute dimension may include: acquiring a behavior record of a driver contacting a goods source through a telephone, wherein the behavior record comprises a driver identifier and a goods source identifier; adding a driver attribute in the behavior record according to the driver identifier, and adding a cargo source attribute in the behavior record according to the cargo source identifier to obtain a new behavior record; and according to the new behavior record, acquiring a cargo source discrete feature and a driver discrete feature of the same attribute dimension, wherein the cargo source discrete feature is a feature with a discrete characteristic in the cargo source attribute, and the driver discrete feature is a feature with a discrete characteristic in the driver feature.
Specifically, the behavior record of a driver contacting a cargo source by telephone can be acquired in order to derive the cargo source discrete feature and the driver discrete feature of the same attribute dimension, where the behavior record comprises a driver identifier and a cargo source identifier. For example, the driver identifier may be the identifier (ID) of the driver in the logistics platform, and the cargo source identifier may be the ID of the cargo source in the logistics platform; each ID uniquely identifies the corresponding driver or cargo source.
Further, the driver attribute is added to the behavior record according to the driver identifier, and the cargo source attribute is added according to the cargo source identifier, yielding a new behavior record. For example, driver attributes such as the length and type of the vehicle driven by driver A can be added under the driver identifier of driver A (for example, the ID of driver A), and cargo source attributes such as the length and type of the vehicle required to transport cargo source a can be added under the cargo source identifier of cargo source a (for example, the ID of cargo source a). The new behavior record then includes: the driver identifier of driver A, and the length and type of the vehicle driven by driver A; and the cargo source identifier of cargo source a, and the length and type of the vehicle required to transport cargo source a.
Further, once the new behavior record is obtained, the cargo source discrete feature and the driver discrete feature of the same attribute dimension can be acquired from it. It should be noted that the cargo source discrete feature is a feature with a discrete characteristic in the cargo source attribute, and the driver discrete feature is a feature with a discrete characteristic in the driver attribute.
Illustratively, continuing the above example, once the new behavior record for driver A and cargo source a has been obtained, the discrete features of the same dimension can be read from it: the driver discrete features are the length and type of the vehicle driven by driver A, and the cargo source discrete features are the length and type of the vehicle required to transport cargo source a.
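The enrichment step described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the lookup tables `DRIVER_ATTRS` and `CARGO_ATTRS`, the field names, and the attribute values are all hypothetical stand-ins for whatever the logistics platform actually stores.

```python
# Hypothetical attribute tables keyed by platform ID (illustrative values only).
DRIVER_ATTRS = {
    "driver_A": {"vehicle_length": "10m", "vehicle_type": "van"},
}
CARGO_ATTRS = {
    "cargo_a": {"vehicle_length": "10m", "vehicle_type": "truck"},
}


def enrich(record, driver_attrs, cargo_attrs):
    """Return a new behavior record with driver and cargo source attributes attached."""
    new_record = dict(record)  # leave the original behavior record untouched
    new_record["driver"] = dict(driver_attrs.get(record["driver_id"], {}))
    new_record["cargo"] = dict(cargo_attrs.get(record["cargo_id"], {}))
    return new_record


def shared_dimensions(new_record):
    """Attribute dimensions present on both sides, in a stable (sorted) order."""
    return sorted(set(new_record["driver"]) & set(new_record["cargo"]))


# A behavior record: driver A phoned about cargo source a.
record = {"driver_id": "driver_A", "cargo_id": "cargo_a"}
new_record = enrich(record, DRIVER_ATTRS, CARGO_ATTRS)
dims = shared_dimensions(new_record)
```

Here `dims` holds the attribute dimensions that both the driver and the cargo source carry, which are the only dimensions the later matching step can compare.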
Optionally, after adding the driver attribute to the behavior record according to the driver identifier and adding the cargo source attribute according to the cargo source identifier, the method may further include: judging whether the driver attribute or the cargo source attribute contains a missing value; and if a missing value exists, filling it with the mode of the attribute dimension in which it is located.
Specifically, after the driver attribute and the cargo source attribute have been added to the behavior record, it can be further judged whether either attribute contains a missing value, that is, whether data is absent from the driver attribute or the cargo source attribute. If so, the missing value may be filled with the mode (the most frequent value) of the attribute dimension in which it is located. For example, if the missing value lies in the vehicle length attribute dimension and more than ninety percent of the vehicle lengths in that dimension are 10 meters, the missing value may be filled with 10 meters.
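Filling a missing value with the mode of its attribute dimension can be sketched as follows. The function name `fill_missing_with_mode`, the use of `None` to represent a missing value, and the sample records are assumptions made for illustration.

```python
from collections import Counter


def fill_missing_with_mode(records, dimension):
    """Return copies of `records` where a missing `dimension` is set to the mode."""
    observed = [r[dimension] for r in records if r.get(dimension) is not None]
    if not observed:
        # Nothing to infer a mode from; return unchanged copies.
        return [dict(r) for r in records]
    mode_value = Counter(observed).most_common(1)[0][0]
    return [
        {**r, dimension: mode_value} if r.get(dimension) is None else dict(r)
        for r in records
    ]


# Hypothetical driver attributes; the last driver has no recorded vehicle length.
drivers = [
    {"id": "d1", "vehicle_length": "10m"},
    {"id": "d2", "vehicle_length": "10m"},
    {"id": "d3", "vehicle_length": "6m"},
    {"id": "d4", "vehicle_length": None},
]
filled = fill_missing_with_mode(drivers, "vehicle_length")
```

In this toy data the most frequent vehicle length is 10 meters, so the fourth driver's missing length is filled with that value while the observed values are left as they are.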
S120, judging whether the cargo source discrete feature and the driver discrete feature of the target attribute dimension match.
Specifically, after the cargo source discrete feature and the driver discrete feature of the same attribute dimension are acquired, it can be further judged whether the cargo source discrete feature corresponding to the target attribute dimension matches the driver discrete feature. The target attribute dimension may be, for example, the vehicle length or vehicle type mentioned above, or any other feature shared by the cargo source discrete features and the driver discrete features; the embodiment of the present invention is not limited thereto.
For example, if the target attribute dimension is the vehicle length, a match means that the vehicle length required to transport the cargo source matches the length of the vehicle driven by the driver; if the target attribute dimension is the vehicle type, a match means that the vehicle type required to transport the cargo source matches the type of the vehicle driven by the driver.
S130, determining the value of the identification bit of the target attribute dimension in the training vector according to the judgment result.
Specifically, after judging whether the cargo source discrete feature and the driver discrete feature of the target attribute dimension match, the value of the identification bit of the target attribute dimension in the training vector can be determined according to the judgment result. The training vector may be a vector generated by one-hot encoding. It should be noted that one-hot encoding, also called one-bit-effective encoding, uses an N-bit state register to encode N states: each state has its own register bit, and only one bit is active at any time, where N may be any positive integer.
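As a brief illustration of the one-hot definition above, the following sketch encodes one of N states as an N-element list with exactly one active bit. The list of vehicle types is a hypothetical vocabulary, not one given in the patent.

```python
def one_hot(value, states):
    """Encode `value` as a list with a single 1 at its position in `states`."""
    if value not in states:
        raise ValueError(f"unknown state: {value!r}")
    return [1 if s == value else 0 for s in states]


# Hypothetical vocabulary of N = 3 vehicle types.
VEHICLE_TYPES = ["flatbed", "truck", "van"]
encoded = one_hot("truck", VEHICLE_TYPES)
```

The encoded vector is as long as the vocabulary, which is why one-hot encoding every discrete attribute on both sides quickly inflates the feature dimension.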
Optionally, determining, according to the determination result, a value of the flag of the target attribute dimension in the training vector may include: if the data of the discrete features of the goods source in the target attribute dimension are matched with the data of the discrete features of the driver in the target attribute dimension, configuring the identification bits of the target attribute dimension in the training vector into a first characteristic value; and if the data of the discrete features of the goods source in the target attribute dimension and the data of the discrete features of the driver in the target attribute dimension do not match, configuring the identification bit of the target attribute dimension in the training vector as a second characteristic value.
Specifically, if the data of the cargo source discrete feature in the target attribute dimension matches the data of the driver discrete feature in the target attribute dimension, the identification bit of the target attribute dimension in the training vector is configured as a first feature value, where the first feature value may be 1 or another value; the embodiment of the present invention is not limited thereto. For example, suppose the target attribute dimension is the vehicle length. If the vehicle length required to deliver the cargo source is 10 meters and the length of the vehicle driven by the driver is also 10 meters, the two are considered to match, and the identification bit corresponding to the vehicle length in the training vector may be set to 1.
Further, if the data of the cargo source discrete feature in the target attribute dimension does not match the data of the driver discrete feature in the target attribute dimension, the identification bit of the target attribute dimension in the training vector is configured as a second feature value, where the second feature value may be 0 or another value; the embodiment of the present invention is not limited thereto. For example, suppose the target attribute dimension is the vehicle type. If the vehicle type required to deliver the cargo source is a truck while the vehicle driven by the driver is a van, the two are considered not to match, and the identification bit corresponding to the vehicle type in the training vector may be set to 0.
The advantage of this arrangement is that the dimension of the training vector can be reduced to a quarter of the original, which greatly speeds up training, and memory usage falls to about eight percent of the original, all without losing feature information.
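The way match bits shrink the training vector can be sketched as below. Two points are assumptions rather than statements from the patent: the match rule is taken to be exact equality, and the baseline is taken to be a vector that one-hot encodes the cargo source and the driver separately. The vocabularies are invented, so the ratio printed by this sketch is illustrative only and is not meant to reproduce the figures quoted above.

```python
MATCH, NO_MATCH = 1, 0  # the first and second feature values from the text


def match_vector(cargo, driver, dimensions):
    """One identification bit per attribute dimension (assumed rule: equality)."""
    return [MATCH if cargo[d] == driver[d] else NO_MATCH for d in dimensions]


def one_hot_width(vocab, dimensions):
    """Width of a baseline vector that one-hot encodes both sides separately."""
    return sum(2 * len(vocab[d]) for d in dimensions)


DIMENSIONS = ["vehicle_length", "vehicle_type"]
# Hypothetical vocabularies for each attribute dimension.
VOCAB = {
    "vehicle_length": ["4m", "6m", "10m", "13m"],
    "vehicle_type": ["flatbed", "truck", "van"],
}

cargo = {"vehicle_length": "10m", "vehicle_type": "truck"}
driver = {"vehicle_length": "10m", "vehicle_type": "van"}

vector = match_vector(cargo, driver, DIMENSIONS)
baseline_width = one_hot_width(VOCAB, DIMENSIONS)
print(f"match bits: {len(vector)} columns, one-hot baseline: {baseline_width} columns")
```

With one bit per attribute dimension, the vector width no longer grows with the number of distinct values in each dimension, only with the number of dimensions compared.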
S140, training the model using the training vector.
Specifically, according to the judgment result, the value of the identification bit of the target attribute dimension in the training vector is determined, that is, after the training vector is determined, the model can be trained by using the training vector, and it should be noted that the model involved in the embodiment of the present invention is a machine learning model. For example, the obtained training vector may be input into a machine learning model for training, and when the model training converges, a training result may be obtained, i.e., the target model may be obtained.
According to the scheme of this embodiment, a cargo source discrete feature and a driver discrete feature of the same attribute dimension are acquired; whether the cargo source discrete feature and the driver discrete feature of the target attribute dimension match is judged; the value of the identification bit of the target attribute dimension in the training vector is determined according to the judgment result; and the model is trained using the training vector. The discrete feature dimension is thereby reduced and business and engineering iteration is sped up, without losing feature information.
Example two
Fig. 2 is a flowchart of a discrete feature processing method in the second embodiment of the present invention. This embodiment refines the foregoing embodiment. Specifically, before training a model using a training vector, the method may further include: determining a sample label of the training vector according to the behavior record and the behavior data. Training the model using the training vector may include: feeding the training vector and the sample label into a logistic regression model, and solving it with a stochastic gradient descent algorithm to generate a target model. Referring to fig. 2, the method mainly includes the following steps:
S210, acquiring the cargo source discrete feature and the driver discrete feature of the same attribute dimension.
S220, judging whether the discrete features of the goods source and the driver of the target attribute dimension are matched.
S230, determining the value of the identification bit of the target attribute dimension in the training vector according to the judgment result.
S240, determining a sample label of the training vector according to the behavior record and the behavior data.
Specifically, after the value of the identification bit of the target attribute dimension in the training vector is determined according to the judgment result, the sample label of the training vector can be determined according to the behavior record and the behavior data.
Optionally, determining the sample label of the training vector according to the behavior record and the behavior data may include: acquiring the behavior record and the behavior data, where the behavior record represents a driver contacting a cargo source by telephone and the behavior data represents a driver clicking on a cargo source; if the driver clicks on the cargo source and contacts it by telephone, labeling the sample as a positive sample; and if the driver clicks on the cargo source but does not contact it by telephone, labeling the sample as a negative sample.
Specifically, the behavior record captures a driver contacting a cargo source by telephone. For example, if driver A telephones the company or individual that owns cargo source a to determine whether driver A can deliver it, that call is recorded as a behavior record. The behavior data captures a driver clicking on a cargo source. For example, if driver A clicks on cargo source a in the logistics platform to determine whether driver A can deliver it, that click is recorded as behavior data.
Specifically, in the embodiment of the invention, if the driver clicks on the cargo source and contacts it by telephone, the sample is labeled as a positive sample; if the driver clicks on the cargo source but does not contact it by telephone, the sample is labeled as a negative sample. For example, if driver A clicks on cargo source a in the logistics platform and then telephones the company or individual that owns cargo source a, the sample is a positive sample, that is, its label may be set to 1; if driver A only clicks on cargo source a but does not telephone the company or individual that owns it, the sample is a negative sample, that is, its label may be set to 0.
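The labeling rule above can be sketched as a lookup over two sets of (driver, cargo source) pairs. Representing the behavior data and the behavior record as Python sets, and the identifiers themselves, are assumptions for illustration.

```python
def label_samples(clicks, calls):
    """Label each clicked (driver_id, cargo_id) pair: 1 if also phoned, else 0."""
    return {pair: 1 if pair in calls else 0 for pair in clicks}


# Behavior data: driver A clicked on two cargo sources (hypothetical IDs).
clicks = {("driver_A", "cargo_a"), ("driver_A", "cargo_b")}
# Behavior record: driver A phoned about only one of them.
calls = {("driver_A", "cargo_a")}

labels = label_samples(clicks, calls)
```

Only clicked pairs receive a label, since both the positive and the negative case in the text start from a click.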
S250, training the model using the training vector.
Specifically, training the model using the training vector may include: feeding the training vector and the sample label into a logistic regression model, and solving it with a stochastic gradient descent algorithm to generate a target model. That is, the training vector obtained in S230 and the sample label obtained in S240 are input into the logistic regression model, the optimal solution of the logistic regression model is obtained using the stochastic gradient descent algorithm, and the target model is thereby generated.
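A minimal pure-Python sketch of logistic regression trained by stochastic gradient descent is given below. The patent names the model and the solver but gives no hyperparameters, so the learning rate, the number of epochs, the random seed, and the toy match-bit data are all assumptions chosen only to make the example run.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Predicted probability that the sample is positive."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic_sgd(vectors, labels, lr=0.5, epochs=200, seed=0):
    """Fit weights and bias by per-sample gradient steps on the log loss."""
    rng = random.Random(seed)
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    order = list(range(len(vectors)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in a fresh random order each epoch
        for i in order:
            x, y = vectors[i], labels[i]
            err = predict(weights, bias, x) - y  # gradient of log loss w.r.t. logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


# Toy training vectors of match bits [vehicle_length, vehicle_type] (assumed data).
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 1, 0]

weights, bias = train_logistic_sgd(X, y)
p_match = predict(weights, bias, [1, 1])
p_mismatch = predict(weights, bias, [0, 0])
```

In this toy data the label follows the vehicle length bit, so the fitted model assigns a higher contact probability to a fully matching pair than to a fully mismatching one.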
In the scheme of this embodiment, on the basis of the above embodiment, before the model is trained using the training vector, the sample label of the training vector is determined according to the behavior record and the behavior data; the training vector and the sample label are then fed into a logistic regression model, which is solved with a stochastic gradient descent algorithm to generate the target model. The discrete feature dimension is thereby reduced and business and engineering iteration is sped up, without losing feature information.
Example three
Fig. 3 is a schematic structural diagram of a discrete feature processing apparatus in a third embodiment of the present invention, which may execute the discrete feature processing method mentioned in any embodiment of the present invention, and the apparatus may be implemented by software and/or hardware. Specifically, referring to fig. 3, the apparatus mainly includes: a discrete feature acquisition module 310, a discrete feature matching module 320, an attribute dimension value module 330, and a training module 340.
The discrete characteristic acquisition module 310 is configured to acquire a cargo source discrete characteristic and a driver discrete characteristic of the same attribute dimension;
the discrete characteristic matching module 320 is used for judging whether the discrete characteristics of the goods source and the discrete characteristics of the driver are matched;
an attribute dimension value module 330, configured to determine, according to the judgment result, the value of the identification bit of the attribute dimension in the training vector;
a training module 340 for training the model using the training vectors.
According to the scheme of this embodiment, the discrete feature acquisition module acquires the cargo source discrete feature and the driver discrete feature of the same attribute dimension; the discrete feature matching module judges whether the cargo source discrete feature and the driver discrete feature match; the attribute dimension value module determines the value of the identification bit of the attribute dimension in the training vector according to the judgment result; and the training module trains the model using the training vector. The discrete feature dimension is thereby reduced and business and engineering iteration is sped up, without losing feature information.
Optionally, the attribute dimension value module 330 includes: a first characteristic value configuration unit and a second characteristic value configuration unit.
The first characteristic value configuration unit is used for configuring the identification bit of the target attribute dimension in the training vector into a first characteristic value if the data of the discrete feature of the source of goods in the target attribute dimension is matched with the data of the discrete feature of the driver in the target attribute dimension;
and the second characteristic value configuration unit is used for configuring the identification bit of the target attribute dimension in the training vector into a second characteristic value if the data of the discrete feature of the source goods in the target attribute dimension are not matched with the data of the discrete feature of the driver in the target attribute dimension.
Optionally, the discrete feature obtaining module 310 includes a new behavior record obtaining unit, configured to obtain a behavior record of a driver contacting the cargo source by telephone, where the behavior record includes a driver identifier and a cargo source identifier; adding a driver attribute in the behavior record according to the driver identifier, and adding a cargo source attribute in the behavior record according to the cargo source identifier to obtain a new behavior record; and according to the new behavior record, acquiring a cargo source discrete feature and a driver discrete feature of the same attribute dimension, wherein the cargo source discrete feature is a feature with a discrete characteristic in the cargo source attribute, and the driver discrete feature is a feature with a discrete characteristic in the driver feature.
Optionally, the new behavior record obtaining unit further includes a null value judging subunit, configured to judge whether a null value exists in the driver attribute or the cargo source attribute and, if a null value exists, to replace it with the mode of the attribute dimension in which the null value is located.
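Replacing a null value with the mode of its attribute dimension can be sketched as follows. Representing a null value as `None` (or a missing key) and the record layout are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Optional


def fill_missing_with_mode(
    records: List[Dict[str, Optional[str]]], dimension: str
) -> List[Dict[str, Optional[str]]]:
    """Return copies of the records with null values in `dimension` set to its mode."""
    observed = [r[dimension] for r in records if r.get(dimension) is not None]
    if not observed:
        # Nothing to compute a mode from; return unchanged copies.
        return [dict(r) for r in records]
    mode_value = Counter(observed).most_common(1)[0][0]
    return [
        {**r, dimension: mode_value} if r.get(dimension) is None else dict(r)
        for r in records
    ]


# Hypothetical records: the last one has a null truck type.
rows = [
    {"driver_truck_type": "flatbed"},
    {"driver_truck_type": "flatbed"},
    {"driver_truck_type": "van"},
    {"driver_truck_type": None},
]
filled = fill_missing_with_mode(rows, "driver_truck_type")
print(filled[3])  # {'driver_truck_type': 'flatbed'}
```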
Optionally, the discrete feature processing apparatus in this embodiment further includes a training vector sample label determining unit, configured to determine the sample label of a training vector according to the behavior record and the behavior data.
Specifically, the training vector sample label determining unit may be configured to acquire a behavior record and behavior data, where the behavior record indicates that a driver contacted a cargo source by telephone and the behavior data indicates that the driver clicked on the cargo source; if the driver clicked on the cargo source and contacted it by telephone, the sample is labeled as a positive sample; if the driver clicked on the cargo source but did not contact it by telephone, the sample is labeled as a negative sample.
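The labeling rule can be written as a small function. The numeric labels 1 and 0 are assumptions, and the text does not say what happens when the driver did not click the cargo source at all, so the sketch returns `None` for that case instead of guessing.

```python
from typing import Optional

POSITIVE = 1  # assumed encoding of a positive sample
NEGATIVE = 0  # assumed encoding of a negative sample


def label_sample(clicked: bool, phoned: bool) -> Optional[int]:
    """Label a (driver, cargo source) pair from click and phone-contact behavior."""
    if not clicked:
        # Case not covered by the described rule; no label is produced.
        return None
    return POSITIVE if phoned else NEGATIVE


print(label_sample(clicked=True, phoned=True))   # 1
print(label_sample(clicked=True, phoned=False))  # 0
```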
Optionally, the training module 340 is further configured to input the training vectors and the sample labels into a logistic regression model, and to solve the logistic regression model using a stochastic gradient descent algorithm to generate the target model.
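As a rough illustration of this training step, the sketch below fits a logistic regression model with plain stochastic gradient descent on identification-bit vectors. The learning rate, epoch count, zero initialization and toy data are all assumptions; the source does not specify hyperparameters or any regularization.

```python
import math
import random
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights: List[float], bias: float, x: List[int]) -> float:
    """Predicted probability that the sample is positive."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic_sgd(
    vectors: List[List[int]],
    labels: List[int],
    lr: float = 0.1,      # assumed learning rate
    epochs: int = 200,    # assumed number of passes over the data
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Fit weights and bias by updating on one sample at a time."""
    rng = random.Random(seed)
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    order = list(range(len(vectors)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit samples in random order each epoch
        for i in order:
            error = predict(weights, bias, vectors[i]) - labels[i]
            # Gradient of the log loss for a single sample.
            weights = [w - lr * error * xi for w, xi in zip(weights, vectors[i])]
            bias -= lr * error
    return weights, bias


# Toy data: the label follows the first identification bit only.
X = [[1, 1], [1, 0], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
weights, bias = train_logistic_sgd(X, y)
print(predict(weights, bias, [1, 1]) > 0.5, predict(weights, bias, [0, 0]) < 0.5)
```

In this toy setup the learned weight of the first bit ends up larger than that of the second, reflecting that only the first attribute dimension's match carries signal.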
Optionally, the training vector involved in the embodiment of the present invention is a vector generated by one-hot encoding.
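For reference, one-hot encoding of a single discrete value can be sketched as below. The vocabulary and value are hypothetical, and the text does not detail how the one-hot positions relate to the identification bits described earlier, so this only illustrates the encoding itself.

```python
from typing import List


def one_hot(value: str, vocabulary: List[str]) -> List[int]:
    """Return a vector with a single 1 at the position of `value` in `vocabulary`."""
    # A value outside the vocabulary yields an all-zero vector in this sketch.
    return [1 if value == candidate else 0 for candidate in vocabulary]


truck_types = ["flatbed", "van", "box"]  # hypothetical vocabulary
encoded = one_hot("van", truck_types)
print(encoded)  # [0, 1, 0]
```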
The discrete feature processing device provided by the embodiment of the invention can execute the discrete feature processing method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example four
Fig. 4 is a schematic structural diagram of a computer device according to a fourth embodiment of the present invention. As shown in Fig. 4, the computer device includes a processor 40, a memory 41, an input device 42, and an output device 43; the number of processors 40 in the computer device may be one or more, and one processor 40 is taken as an example in Fig. 4; the processor 40, the memory 41, the input device 42 and the output device 43 in the computer device may be connected by a bus or by other means, and connection by a bus is taken as an example in Fig. 4.
The memory 41, as a computer-readable storage medium, can be used to store software programs, computer-executable programs, and modules, such as the program instructions/modules corresponding to the discrete feature processing method in the embodiment of the present invention (for example, the discrete feature acquisition module 310, the discrete feature matching module 320, the attribute dimension value module 330, and the training module 340 in the discrete feature processing apparatus). The processor 40 executes the various functional applications and data processing of the computer device by running the software programs, instructions, and modules stored in the memory 41, that is, it implements the discrete feature processing method described above.
The memory 41 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to the use of the terminal, and the like. Further, the memory 41 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 41 may further include memory located remotely from processor 40, which may be connected to a computer device over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 42 is operable to receive input numeric or character information and to generate key signal inputs relating to user settings and function controls of the computer device. The output device 43 may include a display device such as a display screen.
Example five
Embodiment five of the present invention also provides a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a discrete feature processing method, the method comprising:
acquiring a cargo source discrete feature and a driver discrete feature of the same attribute dimension;
judging whether the cargo source discrete feature and the driver discrete feature of the target attribute dimension match;
determining the value of the identification bit of the target attribute dimension in the training vector according to the judgment result;
training a model using the training vectors.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the method operations described above, and may also execute the relevant operations in the discrete feature processing method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the discrete feature processing apparatus, the included units and modules are only divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be realized; in addition, the specific names of the functional units are only for convenience of distinguishing from each other and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (9)

1. A discrete feature processing method, comprising:
acquiring a cargo source discrete feature and a driver discrete feature with the same attribute dimension;
judging whether the discrete features of the goods source and the driver of the target attribute dimension are matched or not;
determining the value of the identification bit of the target attribute dimension in the training vector according to the judgment result;
training a model using the training vectors;
wherein, the determining the value of the identification bit of the target attribute dimension in the training vector according to the judgment result includes:
if the data of the discrete features of the goods source in the target attribute dimension are matched with the data of the discrete features of the driver in the target attribute dimension, configuring the identification bits of the target attribute dimension in the training vector into a first characteristic value;
and if the data of the discrete features of the goods source in the target attribute dimension and the data of the discrete features of the driver in the target attribute dimension do not match, configuring the identification bit of the target attribute dimension in the training vector as a second characteristic value.
2. The discrete characteristic processing method according to claim 1, wherein the acquiring of the cargo source discrete characteristic and the driver discrete characteristic of the same attribute dimension comprises:
acquiring a behavior record of a driver contacting a goods source through a telephone, wherein the behavior record comprises a driver identifier and a goods source identifier;
adding a driver attribute in the behavior record according to the driver identifier, and adding a cargo source attribute in the behavior record according to the cargo source identifier to obtain a new behavior record;
and acquiring a cargo source discrete feature and a driver discrete feature of the same attribute dimension according to the new behavior record, wherein the cargo source discrete feature is a feature with a discrete characteristic in the cargo source attribute, and the driver discrete feature is a feature with a discrete characteristic in the driver feature.
3. The discrete feature processing method as claimed in claim 2, further comprising, after adding a driver attribute to the behavior record based on the driver identifier and adding a source attribute to the behavior record based on the source identifier:
judging whether a null value exists in the driver attribute or the cargo source attribute;
and if a null value exists, replacing the null value with the mode of the attribute dimension in which the null value is located.
4. The discrete feature processing method of claim 1, further comprising, before the training a model using the training vector:
determining a sample mark of a training vector according to the behavior record and the behavior data;
the training a model using the training vector comprises:
and inputting the training vectors and the sample labels into a logistic regression model, and solving it using a stochastic gradient descent algorithm to generate a target model.
5. The discrete feature processing method of claim 4, wherein determining sample labels for training vectors from the behavior records and the behavior data comprises:
acquiring a behavior record and behavior data, wherein the behavior record represents that a driver contacts a goods source through a telephone; the behavior data represents a driver clicking on a goods source;
if the driver clicks the goods source and contacts the goods source through the telephone, the sample is determined to be marked as a positive sample;
if the driver clicks on the source and does not contact the source by phone, then it is determined that the sample is marked as a negative sample.
6. The discrete feature processing method of claim 1, wherein the training vector is a vector generated by one-hot encoding.
7. A discrete feature processing apparatus, comprising:
the discrete characteristic acquisition module is used for acquiring the discrete characteristics of the goods source and the discrete characteristics of the driver with the same attribute dimension;
the discrete characteristic matching module is used for judging whether the discrete characteristics of the goods source and the discrete characteristics of the driver are matched or not;
the attribute dimension value module is used for determining the value of the identification bit of the attribute dimension in the training vector according to the judgment result;
a training module for training a model using the training vectors;
wherein, the attribute dimension value module comprises: a first characteristic value configuration unit and a second characteristic value configuration unit;
the first characteristic value configuration unit is used for configuring the identification bit of the target attribute dimension in the training vector as a first characteristic value if the data of the cargo source discrete feature in the target attribute dimension matches the data of the driver discrete feature in the target attribute dimension;
and the second characteristic value configuration unit is used for configuring the identification bit of the target attribute dimension in the training vector as a second characteristic value if the data of the cargo source discrete feature in the target attribute dimension does not match the data of the driver discrete feature in the target attribute dimension.
8. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the discrete feature processing method according to any one of claims 1-6 when executing the program.
9. A storage medium containing computer-executable instructions for performing the discrete feature processing method of any one of claims 1-6 when executed by a computer processor.
CN201911326146.4A 2019-12-20 2019-12-20 Discrete feature processing method and device, computer equipment and storage medium Active CN111105044B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911326146.4A CN111105044B (en) 2019-12-20 2019-12-20 Discrete feature processing method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111105044A CN111105044A (en) 2020-05-05
CN111105044B true CN111105044B (en) 2022-09-23

Family

ID=70422124

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911326146.4A Active CN111105044B (en) 2019-12-20 2019-12-20 Discrete feature processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111105044B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106897801A (en) * 2017-02-28 2017-06-27 百度在线网络技术(北京)有限公司 Method, device, equipment and storage medium that driver classifies
CN109658033A (en) * 2018-12-26 2019-04-19 江苏满运软件科技有限公司 Source of goods route similarity calculating method, system, equipment and storage medium
CN109978465A (en) * 2019-03-29 2019-07-05 江苏满运软件科技有限公司 Source of goods recommended method, device, electronic equipment, storage medium

Also Published As

Publication number Publication date
CN111105044A (en) 2020-05-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant