CN112418356A - Goods name classification method and device, electronic equipment and storage medium - Google Patents

Goods name classification method and device, electronic equipment and storage medium

Info

Publication number
CN112418356A
Authority
CN
China
Prior art keywords
goods
name
goods name
classification
name classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011494419.9A
Other languages
Chinese (zh)
Inventor
仲惠琳
郁博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Manyun Logistics Information Co ltd
Original Assignee
Jiangsu Manyun Logistics Information Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Manyun Logistics Information Co ltd filed Critical Jiangsu Manyun Logistics Information Co ltd
Priority to CN202011494419.9A priority Critical patent/CN112418356A/en
Publication of CN112418356A publication Critical patent/CN112418356A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/245 Classification techniques relating to the decision surface
    • G06F18/2453 Classification techniques relating to the decision surface non-linear, e.g. polynomial classifier
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The invention provides a goods name classification method, a goods name classification device, electronic equipment, and a storage medium. The goods name classification method comprises the following steps: dividing a goods name into a byte fragment sequence; using the byte fragment sequence as the input of a trained goods name classification model, where the model comprises an input layer, a hidden layer, and an output layer connected in sequence, and the output layer adopts a hierarchical softmax structure; and taking the output of the trained model as the goods name classification of the goods name. By optimizing the goods name classification algorithm, model training speed and prediction efficiency are improved, system load is reduced, tolerance to wrongly written characters in goods names is improved, and classification accuracy is increased.

Description

Goods name classification method and device, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of computers, in particular to a goods name classification method, a goods name classification device, electronic equipment and a storage medium.
Background
With the development of the internet and information technology, online freight platforms connecting shippers and drivers have become increasingly popular. The goods owner publishes goods source information through the platform, and drivers browse and accept orders, completing the matching process that precedes cargo transportation.
In the freight scheduling of a freight platform, trucks of different load capacities and types must be matched to different cargoes. Goods names are highly varied, and establishing a category for every individual name is impractical, so the names published by goods owners need to be classified accurately and quickly. A goods name classification algorithm automatically assigns goods names to preset categories, and the resulting category information lets the goods owner or dispatcher quickly identify matching trucks.
Goods name classification is a text classification task. A common algorithm for it is text classification based on a Recurrent Neural Network (RNN): the characters of the goods name are fed into the RNN in sequence, the RNN's hidden state after the last character is passed to a feedforward neural network that generates a probability for each category, and the category with the largest probability is selected as the predicted goods category.
The drawback of RNN-based text classification algorithms is slow training and inference. The RNN's recurrent (self-loop) mechanism makes the algorithm computationally expensive and time-consuming, whether training the model or predicting on actual samples. In practice, if goods are updated in large volumes and the classification algorithm is invoked heavily, server load grows and running speed drops, affecting other functions and the user experience.
Secondly, the goods names to be classified are typed in by goods owner users, so many wrongly written characters and rare characters appear in practical applications. RNN-based text classification is sensitive to such characters: the algorithm easily produces different classification results for the correctly written and misspelled versions of the same goods name. Its error rate is therefore high when wrongly written or rare characters are present.
Therefore, how to optimize the goods name classification algorithm so as to improve model training speed and prediction efficiency, reduce system load, improve tolerance to wrongly written characters in goods names, and thereby improve classification accuracy is a technical problem to be urgently solved by those skilled in the art.
Disclosure of Invention
In order to overcome the defects of the related art, the invention provides a goods name classification method, a goods name classification device, electronic equipment, and a storage medium that optimize the goods name classification algorithm at least to a certain extent, improving model training speed and prediction efficiency, reducing system load, improving tolerance to wrongly written characters in goods names, and thereby improving classification accuracy.
According to an aspect of the present invention, there is provided a goods name classification method including:
dividing the goods name into byte fragment sequences;
using the byte fragment sequence as an input of a trained goods name classification model, wherein the goods name classification model comprises an input layer, a hidden layer and an output layer which are sequentially connected, and the output layer adopts a layered softmax structure; and
taking the output of the trained goods name classification model as the goods name classification of the goods name.
In some embodiments of the invention, the dividing the goods name into byte fragment sequences comprises:
and taking the ith character to the (i+N-1)th character of the goods name as the ith byte fragment of the byte fragment sequence, wherein i is greater than or equal to 1 and less than or equal to (the total number of characters of the goods name - N + 1), N is the preset byte fragment length, and N is an integer greater than 1 and less than or equal to the total number of characters of the goods name.
In some embodiments of the invention, the output layer includes a huffman binary tree generated according to frequency statistics of candidate goods name classifications, leaf nodes in the huffman binary tree are candidate goods name classifications, the larger the frequency statistics of the candidate goods name classifications, the closer the candidate goods name classifications are to a root node of the huffman binary tree, and the probability of each candidate goods name classification is calculated based on only nodes of the candidate goods name classification on the path of the huffman binary tree.
In some embodiments of the invention, the dividing the goods name into byte fragment sequences is preceded by:
receiving a goods source departure place and a goods source destination input by the goods owner end;
acquiring a hot route and/or a hot spot based on the departure place and the destination of the goods source, wherein the hot route is a transportation route of which the historical frequency between the departure place and the destination of the goods source is greater than a first preset frequency threshold value, and the hot spot is a path spot of which the historical frequency between the departure place and the destination of the goods source is greater than a second preset frequency threshold value;
acquiring a plurality of candidate goods names based on the acquired hot routes and/or hot places, wherein the candidate goods names are goods names with historical transportation frequency of the hot routes and/or the hot places larger than a third preset frequency threshold;
and determining the goods name used for goods name classification based on the selection of a candidate goods name by the goods owner end.
In some embodiments of the invention, the dividing the goods name into byte fragment sequences is preceded by:
and receiving the goods name input by the goods owner.
In some embodiments of the invention, after the output of the trained goods name classification model is taken as the goods name classification of the goods name, the method further comprises:
acquiring historical freight orders of the goods owner end providing the goods name;
querying the cargo packing mode and/or freight requirement label of the goods name classification based on the historical freight orders;
and displaying the queried cargo packing mode and/or freight requirement label to the goods owner end providing the goods name.
In some embodiments of the invention, after the output of the trained goods name classification model is taken as the goods name classification of the goods name, the method further comprises:
acquiring a release request of the goods source information of the goods owner end providing the goods name;
releasing the goods source information so that the goods name classification is displayed together with the goods name,
wherein the goods name classification is used by the driver end to retrieve goods sources.
According to still another aspect of the present invention, there is also provided a goods name classification apparatus including:
a dividing module configured to divide the goods name into a byte fragment sequence;
an input module configured to use the byte fragment sequence as the input of a trained goods name classification model, wherein the model comprises an input layer, a hidden layer, and an output layer connected in sequence, and the output layer adopts a hierarchical softmax structure; and
an output module configured to take the output of the trained goods name classification model as the goods name classification of the goods name.
According to still another aspect of the present invention, there is also provided an electronic apparatus, including: a processor; a storage medium having stored thereon a computer program which, when executed by the processor, performs the steps as described above.
According to yet another aspect of the present invention, there is also provided a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps as described above.
Compared with the prior art, the invention has the advantages that:
according to the invention, goods names are divided into byte segment sequences, and the byte segment sequences are used as the input of a trained goods name classification model, so that the input of the model is optimized, and meanwhile, the model keeps word sequence information during training, thereby reducing the influence of wrongly written characters and rare characters on a prediction result and improving the accuracy of the model; by adopting a layered softmax structure for an output layer of the goods name classification model, the model training and predicting time is greatly reduced, and meanwhile, the system load is reduced.
Drawings
The above and other features and advantages of the present invention will become more apparent by describing in detail exemplary embodiments thereof with reference to the attached drawings.
Fig. 1 shows a flowchart of a goods name classification method according to an embodiment of the present invention.
Fig. 2 shows a schematic diagram of dividing a cargo name into a sequence of byte segments according to an embodiment of the invention.
Fig. 3 shows a schematic diagram of a hierarchical softmax structure according to an embodiment of the invention.
FIG. 4 illustrates a flow diagram for providing candidate item names in accordance with an embodiment of the present invention.
FIG. 5 illustrates a flow chart for providing cargo packing patterns and/or shipping requirement labels based on the category of the cargo name according to an embodiment of the present invention.
Fig. 6 is a block diagram showing a cargo name sorting apparatus according to an embodiment of the present invention.
Fig. 7 schematically illustrates a computer-readable storage medium in an exemplary embodiment of the invention.
Fig. 8 schematically illustrates an electronic device in an exemplary embodiment of the invention.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the invention and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the steps. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
Fig. 1 shows a flowchart of a goods name classification method according to an embodiment of the present invention. The goods name classification method comprises the following steps:
step S110: the goods name is divided into a sequence of byte fragments.
Specifically, the goods name can be actively entered by the goods owner through an input box (typed directly or by voice input). Alternatively, the platform can provide a plurality of candidate goods names for the goods owner to select from. The candidate names may be provided in various ways, which are described below in conjunction with several embodiments and not detailed here.
Specifically, step S110 may be implemented as follows: take the ith character to the (i+N-1)th character of the goods name as the ith byte fragment of the byte fragment sequence, where i is greater than or equal to 1 and less than or equal to (the total number of characters of the goods name - N + 1), N is the preset byte fragment length, and N is an integer greater than 1 and less than or equal to the total number of characters of the goods name.
Referring to fig. 2, fig. 2 is a diagram illustrating a goods name divided into a byte fragment sequence according to an embodiment of the present invention. As shown in FIG. 2, when N is set to 3 and the goods name is ABCDEF, the 1st byte fragment X1 of the sequence is the 1st to 3rd characters, ABC; the 2nd byte fragment X2 is the 2nd to 4th characters, BCD; the 3rd byte fragment X3 is the 3rd to 5th characters, CDE; and the 4th byte fragment X4 is the 4th to 6th characters, DEF. The above is merely illustrative of the byte fragment sequences provided by the present invention, and the present invention is not limited thereto.
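The sliding-window division illustrated in fig. 2 can be sketched as follows; the function name and the validity check are illustrative, not taken from the patent:

```python
def byte_fragments(name: str, n: int) -> list:
    """Slide a window of length n over the characters of a goods name.

    The i-th fragment (1-indexed) is characters i .. i+n-1, so a name of
    L characters yields L - n + 1 fragments, matching the embodiment above.
    """
    if not 1 < n <= len(name):
        raise ValueError("n must be an integer with 1 < n <= len(name)")
    return [name[i:i + n] for i in range(len(name) - n + 1)]

# The figure's example: N = 3 over the goods name "ABCDEF"
print(byte_fragments("ABCDEF", 3))  # → ['ABC', 'BCD', 'CDE', 'DEF']
```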
Specifically, unlike the conventional approach of generating a separate word vector for each word and training it in a later process, the invention represents the entire goods name with character-level byte fragment vectors in order to reduce the effect of wrongly written and rare characters on the prediction result. The byte fragment sequence helps in three ways. First, it produces better vectors for rare words: even if a word itself occurs few times, its characters share fragments with other words, which improves the generated vector. Second, for out-of-vocabulary words, a vector can still be constructed from character-level byte fragments even though the word never appears in the training corpus. Third, the fragments let the model learn local word-order information: taking each word in isolation discards the context carried by word order, whereas overlapping fragments relate adjacent characters, so the model retains word-order information during training.
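To illustrate why overlapping fragments give tolerance to a single wrong character, a small sketch; the Jaccard-overlap measure and the example names are illustrative assumptions, not from the patent:

```python
def trigram_set(name: str) -> set:
    """Character-level 3-gram set used to represent a goods name."""
    return {name[i:i + 3] for i in range(len(name) - 2)}

def overlap(a: str, b: str) -> float:
    """Jaccard similarity of the two names' fragment sets."""
    sa, sb = trigram_set(a), trigram_set(b)
    return len(sa & sb) / len(sa | sb)

# One wrong character still leaves most fragments shared, so the averaged
# input representation of the model barely moves:
print(overlap("ABCDEFGH", "ABCDEFGX"))  # high: 5 of 7 distinct trigrams shared
print(overlap("ABCDEFGH", "QRSTUVWX"))  # unrelated names share no trigrams
```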
Step S120: and taking the byte fragment sequence as an input of a trained goods name classification model, wherein the goods name classification model comprises an input layer, a hidden layer and an output layer which are sequentially connected, and the output layer adopts a layered softmax structure.
Specifically, the output of the trained goods name classification model is one of the preset goods name classifications, namely the classification predicted by the model. The hidden layer is the superposition average of the byte fragment vectors. During training, the goods name is converted by the input layer into its byte fragment representation, averaged by the hidden layer, and multiplied by the output layer's weight matrix to obtain the output vector; this is the forward computation. A loss function and its gradient are then computed from the output vector and the pre-labelled vector, and the weight matrix is updated in reverse; this is the back-propagation process. In actual prediction the same forward computation is used, and the classification corresponding to the dimension with the largest probability value in the output vector is selected as the predicted goods name classification.
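The forward computation described above can be sketched roughly as follows. The vocabulary size, dimensions, and random weights are placeholder assumptions; a real model would learn E and W during training:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the patent): fragment vocabulary,
# embedding dimension, and number of goods name categories.
VOCAB, DIM, K = 1000, 16, 5
E = rng.normal(size=(VOCAB, DIM))   # input layer: one vector per byte fragment
W = rng.normal(size=(DIM, K))       # output-layer weight matrix

def forward(fragment_ids):
    """Hidden layer = average of fragment vectors; output = class probabilities."""
    h = E[fragment_ids].mean(axis=0)      # superposition average (hidden layer)
    scores = h @ W                        # output vector
    e = np.exp(scores - scores.max())     # numerically stable softmax
    return e / e.sum()

probs = forward([3, 17, 42, 99])          # fragment ids of one goods name
predicted_class = int(np.argmax(probs))   # classification with largest probability
```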
Because there are many goods name categories, the algorithm uses hierarchical softmax in the output layer, which greatly reduces the time required for training. Traditional softmax prediction of the goods name category treats every candidate category equally, generating a probability value for each candidate and selecting the category with the largest probability as the prediction; its function is:
P(y = j | x) = exp(y_j) / Σ_{k=1..K} exp(y_k)
where the left side of the equation denotes the probability of classification j given input x, y_j denotes the value corresponding to category j in the output vector, and K in the denominator is the number of all candidate categories.
It can be seen that when softmax computes the probability of a particular category from that category's dimension in the output vector, the output values corresponding to all other categories must also participate in the calculation.
The hierarchical softmax structure, shown in fig. 3, is a Huffman binary tree generated from the frequency statistics of the candidate goods name classifications. Leaf nodes of the tree are the candidate classifications, and the larger a classification's frequency statistic, the closer its leaf sits to the root. The probability of each candidate classification is computed only from the nodes on that classification's path through the tree, so calculating the probability of one classification does not require the outputs of all other classifications, only the outputs of the nodes on its path. For example, calculating the probability of goods name classification y2 only requires considering the three nodes n(y2, 1), n(y2, 2), and n(y2, 3). The algorithmic complexity is therefore greatly reduced, and both the training process and classification prediction are much faster.
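A rough sketch of the hierarchical softmax idea described above, assuming (as in word2vec-style implementations) that each internal tree node carries a score turned into a branch probability by a sigmoid; the class names and frequencies are illustrative:

```python
import heapq
import math

def build_huffman(freqs):
    """Return {class: [(node, bit), ...]}: the root-to-leaf path of each class.

    More frequent classes end up nearer the root, so their paths are shorter.
    """
    heap = [(f, i, c) for i, (c, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id, parent, bit = 0, {}, {}
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)        # two least-frequent subtrees
        f2, _, b = heapq.heappop(heap)
        node = ("n", next_id)
        next_id += 1
        parent[a], bit[a] = node, 0
        parent[b], bit[b] = node, 1
        heapq.heappush(heap, (f1 + f2, 1000 + next_id, node))
    root = heap[0][2]
    paths = {}
    for c in freqs:
        path, cur = [], c
        while cur != root:
            path.append((parent[cur], bit[cur]))
            cur = parent[cur]
        paths[c] = list(reversed(path))       # root-to-leaf order
    return paths

def class_probability(path, node_scores):
    """P(class) = product over its path of sigmoid(score) for bit 1, else 1 - sigmoid."""
    p = 1.0
    for node, b in path:
        s = 1.0 / (1.0 + math.exp(-node_scores.get(node, 0.0)))
        p *= s if b else 1.0 - s
    return p

paths = build_huffman({"steel": 50, "fruit": 30, "timber": 15, "cement": 5})
# The frequent class sits nearer the root, so its probability needs fewer nodes:
assert len(paths["steel"]) < len(paths["cement"])
```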
Step S130: and taking the output of the trained goods name classification model as the goods name classification of the goods name.
In the goods name classification method provided by the invention, goods names are divided into byte fragment sequences which serve as the input of a trained goods name classification model. This optimizes the model's input and lets the model retain word-order information during training, reducing the influence of wrongly written and rare characters on the prediction result and improving the model's accuracy. Adopting a hierarchical softmax structure in the output layer of the goods name classification model greatly reduces training and prediction time while lowering system load.
Specifically, the self-loop structure of the traditional RNN goods name classification algorithm is abandoned, greatly reducing computation and shortening the training and running time of the classification algorithm. An RNN-based algorithm performs one forward operation per input character, and during back-propagation the gradient computation grows with the number of characters, so longer goods name descriptions take longer. The model structure of the invention needs only one forward computation and one back-propagation pass per goods name, greatly reducing computation, while the hierarchical softmax of the output layer reduces that layer's computational complexity from K (the number of cargo categories) to log(K). Representing the text with character-level byte fragments also handles low-frequency and never-seen characters better and reduces the influence of wrongly written characters on the prediction result.
The invention can reduce the training time of the goods name classification algorithm by an order of magnitude. In a comparison experiment against the RNN-based algorithm, with the byte fragment length N set to 2, the training time on the same task dropped from the 2 hours 45 minutes required by the RNN algorithm to 10 seconds, while the accuracy on the test set was only 0.1 percent lower than that of the RNN algorithm. The two algorithms show a similar time ratio when actually performing the prediction task. The algorithm is also more tolerant of wrongly written characters: in an experiment in which a character of the input text was replaced by a wrongly written character of similar shape or pronunciation, the algorithm still output the same goods name classification prediction.
Referring now to fig. 4, fig. 4 illustrates a flow diagram for providing candidate item names in accordance with an embodiment of the present invention. In this embodiment, before the step S110 shown in fig. 1, dividing the goods name into byte fragment sequences, the following steps may be further included:
step S101: and receiving a goods source departure place and a goods source destination input by the goods master terminal.
Step S102: and acquiring a hot route and/or a hot spot based on the departure place and the destination of the goods source, wherein the hot route is a transportation route with the historical frequency between the departure place and the destination of the goods source larger than a first preset frequency threshold, and the hot spot is a path spot with the historical frequency between the departure place and the destination of the goods source larger than a second preset frequency threshold.
Specifically, the historical frequency is the frequency with which hot routes and/or hot locations appear in the freight platform's historical freight orders. In some variations, it may be the frequency of appearance in the historical freight orders of the last month, half year, or similar period. The first and second preset frequency thresholds may be determined from the total number of historical freight orders obtained; for example, M% of that total may be used as the first and second preset frequency thresholds, where M is a constant greater than 0 and less than 100. In other variations, the thresholds may be derived from the median or average frequency of occurrence of each route and/or location; for example, P% of that median or average may be used, where P is a constant greater than 100. Further threshold-setting schemes can be implemented and are not detailed here.
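A minimal sketch of the hot-route selection under the M% variant described above; the order data, route names, and the value of M are invented for illustration:

```python
from collections import Counter

# Hypothetical historical freight orders, one (origin, destination) per order.
orders = ([("Nanjing", "Shanghai")] * 60
          + [("Nanjing", "Suzhou")] * 30
          + [("Nanjing", "Hefei")] * 10)

route_counts = Counter(orders)
M = 20  # first preset frequency threshold as a percentage of total orders
first_threshold = len(orders) * M / 100

# Hot routes: transported more often than the threshold over the period.
hot_routes = [r for r, c in route_counts.items() if c > first_threshold]
```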
Step S103: and acquiring a plurality of candidate goods names based on the acquired hot routes and/or hot places, wherein the candidate goods names are goods names with historical transportation frequency of the hot routes and/or the hot places larger than a third preset frequency threshold value.
Specifically, the third preset frequency threshold may be set as needed, for example according to the number of candidate goods names to be obtained: sort the goods names by their historical transportation frequency on the hot route and/or at the hot location, and take the historical transportation frequency of the Q-th goods name as the third preset threshold. Further ways of determining the third preset frequency threshold can be implemented and are not detailed here.
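The Q-th-name variant can be sketched similarly; the goods names, counts, and the value of Q are illustrative assumptions:

```python
from collections import Counter

# Hypothetical shipment counts per goods name on one hot route.
name_counts = Counter({"steel coil": 120, "fresh fruit": 80, "cement": 45,
                       "furniture": 20, "glass": 5})

Q = 3  # keep names at least as frequent as the Q-th most transported one
third_threshold = sorted(name_counts.values(), reverse=True)[Q - 1]
candidates = [n for n, c in name_counts.items() if c >= third_threshold]
```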
Step S104: determining the goods name used for goods name classification based on the goods owner end's selection among the candidate goods names.
Therefore, in the above embodiment, considering that the goods commonly transported on the same route or through the same location tend to be the same (for example, if a certain fruit is mainly produced in a certain place, goods sources departing from that place are generally that fruit), the platform can offer a plurality of candidate goods names for the goods owner to choose from, based on the goods source departure place and destination the goods owner enters, improving the efficiency of goods name input. Offering candidate names for selection also reduces the number of distinct goods names the freight platform receives, further improving classification efficiency.
Referring now to fig. 5, fig. 5 illustrates a flowchart for providing a cargo packing mode and/or a freight requirement label based on the goods name classification according to an embodiment of the present invention. After step S130 in fig. 1 takes the output of the trained goods name classification model as the goods name classification of the goods name, the following steps may further be included:
step S141: and acquiring a historical freight order of the owner end providing the goods name.
Step S142: and inquiring the goods packing mode and/or the freight requirement label of the goods name classification based on the historical freight order.
Step S143: and displaying the inquired goods packaging mode and/or the goods requirement label to a goods owner end providing the goods name.
Therefore, the cargo packing mode and/or freight requirement label can be determined from the historical freight orders of the goods owner and the goods name classification obtained above, which makes it convenient for the goods owner to publish goods sources and arrange shipping. The freight requirement label also facilitates vehicle-cargo matching, allowing the driver end to retrieve the corresponding goods source through the label.
Further, in some embodiments of the invention, after step S130 in fig. 1 takes the output of the trained goods name classification model as the goods name classification, the method may further include the following steps: acquiring a release request for the goods source information from the goods owner end providing the goods name; and releasing the goods source information so that the goods name classification is displayed together with the goods name, where the classification is used by the driver end to retrieve goods sources. In this way, when goods source information is released, the goods name and its classification are displayed together, the classification can serve as a label for vehicle-cargo matching, and the driver end can conveniently search for goods sources through the label.
The foregoing is merely an exemplary description of various embodiments of the invention and is not intended to be limiting thereof. The above-described embodiments may be implemented individually or in combination, and such variations are within the scope of the invention.
According to still another aspect of the present invention, there is also provided a goods name classification device, and fig. 6 shows a block diagram of the goods name classification device according to an embodiment of the present invention. The goods name classification device 300 includes a dividing module 310, an input module 320, and an output module 330.
The dividing module 310 is configured to divide the goods name into a sequence of byte segments.
The input module 320 is configured to take the byte fragment sequence as the input of a trained goods name classification model, wherein the goods name classification model includes an input layer, a hidden layer and an output layer which are connected in sequence, and the output layer adopts a layered softmax structure.
The output module 330 is configured to take the output of the trained goods name classification model as the goods name classification of the goods name.
In the goods name classification device provided by the invention, the goods name is divided into a byte fragment sequence, and the byte fragment sequence is used as the input of a trained goods name classification model. This optimizes the model input while preserving word order information during training, thereby reducing the influence of mistyped characters and rare characters on the prediction result and improving the accuracy of the model. By adopting a layered softmax structure in the output layer of the goods name classification model, the training and prediction time of the model is greatly reduced, and the system load is reduced as well.
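The three modules can be illustrated with a minimal fastText-style sketch. Everything here is an assumption for illustration: the bigram fragment length, the embedding dimension, the random weights, and the flat linear output layer that stands in for the layered softmax of the disclosure (which would replace the final matrix product with a tree of binary decisions):

```python
import numpy as np

def char_ngrams(name, n=2):
    """Dividing module: sliding character n-grams of a goods name
    (the byte fragment sequence, sketched at the character level)."""
    return [name[i:i + n] for i in range(len(name) - n + 1)]

class GoodsNameClassifierSketch:
    """Input layer: n-gram embeddings. Hidden layer: their mean.
    Output layer: a flat linear scorer standing in for the layered softmax."""

    def __init__(self, vocab, n_classes, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.index = {g: i for i, g in enumerate(vocab)}
        self.emb = rng.normal(size=(len(vocab), dim))   # input layer weights
        self.out = rng.normal(size=(dim, n_classes))    # output layer weights

    def predict(self, name):
        ids = [self.index[g] for g in char_ngrams(name) if g in self.index]
        hidden = self.emb[ids].mean(axis=0)             # hidden layer
        return int(np.argmax(hidden @ self.out))        # predicted class id

# Invented example names; the vocabulary is built from their bigrams.
names = ["steel coil", "steel bar", "fresh apples"]
vocab = sorted({g for nm in names for g in char_ngrams(nm)})
model = GoodsNameClassifierSketch(vocab, n_classes=3)
pred = model.predict("steel coil")
```

Because the fragments overlap, a single mistyped character only corrupts a few of the n-grams while the rest of the sequence still carries the word order information.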
Fig. 6 is a schematic diagram of the goods name classification device 300 provided by the present invention; the splitting, combining and addition of modules are within the scope of the present invention without departing from its concept. The goods name classification device 300 provided by the present invention can be implemented by software, hardware, firmware, plug-ins and any combination thereof, which is not limited by the present invention.
In an exemplary embodiment of the present invention, there is further provided a computer-readable storage medium on which a computer program is stored; when executed by, for example, a processor, the program may implement the steps of the goods name classification method described in any one of the above embodiments. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code; when the program product is run on a terminal device, the program code causes the terminal device to perform the steps according to the various exemplary embodiments of the invention described in the goods name classification method section of this description.
Referring to fig. 7, a program product 700 for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination thereof. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ as well as conventional procedural programming languages such as the "C" language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the internet using an internet service provider).
In an exemplary embodiment of the invention, there is also provided an electronic device, which may include a processor and a memory for storing executable instructions of the processor, wherein the processor is configured to perform the steps of the goods name classification method in any of the above embodiments by executing the executable instructions.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," "module," or "system."
An electronic device 500 according to this embodiment of the invention is described below with reference to fig. 8. The electronic device 500 shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 8, the electronic device 500 is embodied in the form of a general purpose computing device. The components of the electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one memory unit 520, a bus 530 that couples various system components including the memory unit 520 and the processing unit 510, a display unit 540, and the like.
The storage unit stores program code executable by the processing unit 510, so that the processing unit 510 performs the steps according to the various exemplary embodiments of the present invention described in the goods name classification method section of this specification. For example, the processing unit 510 may perform the steps shown in figs. 1 and 4 to 5.
The memory unit 520 may include a readable medium in the form of a volatile memory unit, such as a random access memory unit (RAM) 5201 and/or a cache memory unit 5202, and may further include a read-only memory unit (ROM) 5203.
The memory unit 520 may also include a program/utility 5204 having a set (at least one) of program modules 5205, such program modules 5205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 530 may be one or more of any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 500 may also communicate with one or more external devices 600 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 500, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 500 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 550. Also, the electronic device 500 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network, such as the internet) via the network adapter 560. The network adapter 560 may communicate with other modules of the electronic device 500 via the bus 530. It should be appreciated that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present invention can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a network device, etc.) to execute the goods name classification method according to the embodiments of the present invention.
Compared with the prior art, the invention has the advantages that:
according to the invention, the goods name is divided into a byte fragment sequence, and the byte fragment sequence is used as the input of a trained goods name classification model. This optimizes the model input while preserving word order information during training, thereby reducing the influence of mistyped characters and rare characters on the prediction result and improving the accuracy of the model. By adopting a layered softmax structure in the output layer of the goods name classification model, the training and prediction time of the model is greatly reduced, and the system load is reduced as well.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

Claims (10)

1. A goods name classification method, characterized by comprising the following steps:
dividing a goods name into a byte fragment sequence;
taking the byte fragment sequence as the input of a trained goods name classification model, wherein the goods name classification model comprises an input layer, a hidden layer and an output layer which are connected in sequence, and the output layer adopts a layered softmax structure; and
taking the output of the trained goods name classification model as the goods name classification of the goods name.
2. The method of claim 1, wherein the dividing the goods name into a byte fragment sequence comprises:
taking the i-th character to the (i+N-1)-th character of the goods name as the i-th byte fragment of the byte fragment sequence, wherein i is greater than or equal to 1 and less than or equal to (the total number of characters of the goods name) - N + 1, N is a preset byte fragment length, and N is an integer greater than 1 and less than or equal to the total number of characters of the goods name.
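The index arithmetic of this claim (1-based i, yielding (total characters) - N + 1 fragments in all) can be sketched as follows; the function name and example name are illustrative only:

```python
def byte_fragments(name: str, n: int) -> list:
    """Per the claim: the i-th fragment is characters i .. i+n-1 (1-based),
    for i from 1 to len(name) - n + 1, with 1 < n <= len(name)."""
    assert 1 < n <= len(name), "n must exceed 1 and not exceed the name length"
    # Python slices are 0-based, hence the i-1 offsets.
    return [name[i - 1:i - 1 + n] for i in range(1, len(name) - n + 2)]

fragments = byte_fragments("apples", 3)
```

For a six-character name and n = 3 this yields 6 - 3 + 1 = 4 overlapping fragments.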
3. The method of claim 1, wherein the output layer comprises a huffman binary tree generated according to frequency statistics of candidate goods name classifications, wherein the leaf nodes of the huffman binary tree are the candidate goods name classifications, a candidate goods name classification with a larger frequency statistic is closer to the root node of the huffman binary tree, and the probability of each candidate goods name classification is calculated based only on the nodes along the path of that candidate goods name classification in the huffman binary tree.
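A sketch of the huffman construction this claim describes (the category names and frequencies are invented for illustration): building the tree from classification frequency statistics places frequent classifications near the root, so computing their probability touches fewer path nodes:

```python
import heapq
import itertools

def build_huffman(freqs):
    """Build a huffman binary tree over candidate classifications.
    Leaves are classification names; internal nodes are 2-tuples."""
    tiebreak = itertools.count()  # makes heap entries totally ordered
    heap = [(f, next(tiebreak), name) for name, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    return heap[0][2]

def code_length(tree, target, depth=0):
    """Number of internal-node decisions on the path to a leaf: the cost
    of computing that classification's probability under layered softmax."""
    if not isinstance(tree, tuple):
        return depth if tree == target else None
    for branch in tree:
        d = code_length(branch, target, depth + 1)
        if d is not None:
            return d
    return None

# Hypothetical frequency statistics of candidate classifications.
freqs = {"steel": 50, "produce": 30, "furniture": 15, "chemicals": 5}
tree = build_huffman(freqs)
```

In a full layered softmax, each internal node on the path would contribute one sigmoid factor, so the frequent "steel" class costs a single evaluation while the rare "chemicals" class costs three.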
4. The method of claim 1, wherein before the dividing the goods name into a byte fragment sequence, the method further comprises:
receiving a goods source departure place and a goods source destination input by a goods owner end;
acquiring a hot route and/or a hot place based on the goods source departure place and the goods source destination, wherein the hot route is a transportation route whose historical frequency between the goods source departure place and the goods source destination is greater than a first preset frequency threshold, and the hot place is a route place whose historical frequency between the goods source departure place and the goods source destination is greater than a second preset frequency threshold;
acquiring a plurality of candidate goods names based on the acquired hot route and/or hot place, wherein a candidate goods name is a goods name whose historical transportation frequency on the hot route and/or at the hot place is greater than a third preset frequency threshold; and
determining, based on the goods owner end's selection among the candidate goods names, the goods name for goods name classification.
5. The method of claim 1, wherein before the dividing the goods name into a byte fragment sequence, the method further comprises:
receiving the goods name input by a goods owner end.
6. The method of claim 1, wherein after the taking the output of the trained goods name classification model as the goods name classification of the goods name, the method further comprises:
acquiring a historical freight order of the goods owner end providing the goods name;
querying the goods packaging mode and/or the freight requirement label of the goods name classification based on the historical freight order; and
displaying the queried goods packaging mode and/or freight requirement label to the goods owner end providing the goods name.
7. The method of claim 1, wherein after the taking the output of the trained goods name classification model as the goods name classification of the goods name, the method further comprises:
acquiring a publication request for goods source information from the goods owner end providing the goods name; and
publishing the goods source information, wherein the goods name classification and the goods name are displayed together,
and the goods name classification is used by a driver end for goods source retrieval.
8. A goods name classification device, characterized by comprising:
a dividing module configured to divide a goods name into a byte fragment sequence;
an input module configured to take the byte fragment sequence as the input of a trained goods name classification model, wherein the goods name classification model comprises an input layer, a hidden layer and an output layer which are connected in sequence, and the output layer adopts a layered softmax structure; and
an output module configured to take the output of the trained goods name classification model as the goods name classification of the goods name.
9. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory on which a computer program is stored, wherein the computer program, when executed by the processor, performs the goods name classification method according to any one of claims 1 to 7.
10. A storage medium, characterized in that a computer program is stored thereon, wherein the computer program, when executed by a processor, performs the goods name classification method according to any one of claims 1 to 7.
CN202011494419.9A 2020-12-17 2020-12-17 Goods name classification method and device, electronic equipment and storage medium Withdrawn CN112418356A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011494419.9A CN112418356A (en) 2020-12-17 2020-12-17 Goods name classification method and device, electronic equipment and storage medium


Publications (1)

Publication Number Publication Date
CN112418356A true CN112418356A (en) 2021-02-26

Family

ID=74776173

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011494419.9A Withdrawn CN112418356A (en) 2020-12-17 2020-12-17 Goods name classification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112418356A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116562769A (en) * 2023-06-15 2023-08-08 深圳爱巧网络有限公司 Cargo data analysis method and system based on cargo attribute classification


Similar Documents

Publication Publication Date Title
CN107679234B (en) Customer service information providing method, customer service information providing device, electronic equipment and storage medium
Danloup et al. A comparison of two meta-heuristics for the pickup and delivery problem with transshipment
CN111950803A (en) Logistics object delivery time prediction method and device, electronic equipment and storage medium
CN109447334B (en) Data dimension reduction method and device for goods source information, electronic equipment and storage medium
KR20160084456A (en) Weight generation in machine learning
CN109636047B (en) User activity prediction model training method, system, device and storage medium
KR20160084453A (en) Generation of weights in machine learning
CN112417881A (en) Logistics information identification method and device, electronic equipment and storage medium
US11194968B2 (en) Automatized text analysis
CN113139060B (en) Text analysis model training and text analysis method, medium, device and equipment
CN113159355A (en) Data prediction method, data prediction device, logistics cargo quantity prediction method, medium and equipment
Xu et al. Computing the reliability of a stochastic distribution network subject to budget constraint
JPWO2014073206A1 (en) Information processing apparatus and information processing method
Voronov Machine learning models for predictive maintenance
CN112418356A (en) Goods name classification method and device, electronic equipment and storage medium
CN111967808A (en) Method and device for determining logistics object receiving mode, electronic equipment and storage medium
CN111241273A (en) Text data classification method and device, electronic equipment and computer readable medium
CN111368189B (en) Goods source sorting recommendation method and device, electronic equipment and storage medium
CN110598989B (en) Goods source quality evaluation method, device, equipment and storage medium
CN112308487A (en) Logistics track display method and device, electronic equipment and storage medium
CN112785111A (en) Production efficiency prediction method, device, storage medium and electronic equipment
CN113935802A (en) Information processing method, device, equipment and storage medium
Demgne et al. Modelling and numerical assessment of a maintenance strategy with stock through piecewise deterministic Markov processes and quasi Monte Carlo methods
CN113570204A (en) User behavior prediction method, system and computer equipment
CN113590484A (en) Algorithm model service testing method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210226