CN117195035A - Enterprise product word abnormality processing method and device and electronic equipment - Google Patents

Enterprise product word abnormality processing method and device and electronic equipment

Info

Publication number
CN117195035A
Authority
CN
China
Prior art keywords
samples
preset
enterprise
product
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310915101.0A
Other languages
Chinese (zh)
Inventor
蔡青山
Current Assignee
Qizhi Technology Co ltd
Original Assignee
Qizhi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Qizhi Technology Co ltd filed Critical Qizhi Technology Co ltd
Priority to CN202310915101.0A
Publication of CN117195035A
Legal status: Pending


Abstract

A method, a device, and electronic equipment for processing abnormal words in enterprise products, relating to the field of data processing. The method is applied to a server and comprises the following steps: obtaining a training sample library that contains positive samples and negative samples, where the ratio of positive to negative samples is a preset proportion and the number of positive samples is larger than the number of negative samples; inputting the training sample library into a preset enterprise information classification model to obtain, for each of the samples, a prediction probability for each sample type; calculating a loss function value of the preset enterprise information classification model using a multi-class focal loss function based on these prediction probabilities; and determining that training of the model is complete when the loss function value is less than or equal to a preset threshold. This addresses the reduced accuracy of the enterprise data classification model in recognizing and classifying abnormal words caused by severe imbalance between the positive and negative samples used to train it.

Description

Enterprise product word abnormality processing method and device and electronic equipment
Technical Field
The application relates to the technical field of data processing, in particular to a method and a device for processing abnormal words of enterprise products and electronic equipment.
Background
With the rapid development of artificial intelligence, its applications are becoming increasingly widespread. When a user queries an enterprise, product words in the enterprise's information are often extracted and classified by an AI-based enterprise data classification model, and the product words classified as belonging to the enterprise's main business are then presented to the user, helping the user quickly understand the enterprise's products.
At present, when an enterprise fills in its information, writing errors may occur, so abnormal words can appear among the enterprise's product words. Because such abnormal words are low-probability events, their number is small. Consequently, when the enterprise data classification model is trained, the negative samples composed of abnormal words are few, producing a severe imbalance between positive and negative samples in the training set and reducing the model's accuracy in recognizing and classifying abnormal words.
Therefore, a method, a device and an electronic device for processing abnormal words of enterprise products are needed.
Disclosure of Invention
To address the problem that severe imbalance between positive and negative samples in the training data reduces the accuracy of the enterprise data classification model in recognizing and classifying abnormal words, the present application provides a method, a device, and electronic equipment for processing abnormal words in enterprise products.
In a first aspect, the present application provides a method for processing abnormal enterprise product words, applied to a server. The method comprises: obtaining a training sample library containing a plurality of samples, including positive samples (a correct product word with its corresponding sample type) and negative samples (an abnormal product word with its corresponding sample type), where the ratio of positive to negative samples is a preset proportion and the number of positive samples is larger than the number of negative samples; inputting the training sample library into a preset enterprise information classification model to obtain the prediction probability of each sample type for each of the samples; calculating a loss function value of the model using a multi-class focal loss function based on these prediction probabilities; and determining that training of the preset enterprise information classification model is complete when the loss function value is less than or equal to a preset threshold.
With this technical scheme, the ratio of positive to negative samples in the training sample library is adjusted so that positives outnumber negatives, reproducing the sample-imbalance problem found in real data. The training sample library is then fed into the preset enterprise information classification model for training, and a multi-class focal loss function is introduced to reduce the attention paid to positive samples, i.e., decrease their training loss, and increase the attention paid to negative samples, i.e., increase their training loss. The model therefore devotes more training iterations to negative samples, improving its accuracy in recognizing abnormal product words.
In a second aspect, the present application provides a device for processing abnormal words of an enterprise product, where the device is a server, and the server includes an acquisition module and a processing module, where:
the acquisition module is used for obtaining a training sample library, which comprises a plurality of samples including positive samples (a correct product word with its corresponding sample type) and negative samples (an abnormal product word with its corresponding sample type), where the ratio of positive to negative samples is a preset proportion and the number of positive samples is larger than the number of negative samples;
the processing module is used for inputting the training sample library into a preset enterprise information classification model to obtain the prediction probability of the sample types corresponding to the samples; calculating a loss function value of a preset enterprise information classification model by adopting a multi-classification focal point loss function based on the prediction probabilities of a plurality of samples; and when the loss function value of the preset enterprise information classification model is smaller than or equal to a preset threshold value, determining that training of the preset enterprise information classification model is completed.
Optionally, the multi-class focal loss function is:

FL = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} α_c · y_ic · (1 − p_ic)^γ · log(p_ic)

where FL is the loss function value; N is the total number of samples in the training sample library; M is the total number of sample types; p_ic is the prediction probability that the i-th sample belongs to sample type c; y_ic is the true label, taking 1 if the i-th sample belongs to sample type c and 0 otherwise; α_c is the weight value for sample type c; and γ is the focusing factor.
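As a sketch of how the loss defined above could be evaluated (illustrative only; the patent does not disclose an implementation, and the function name and clipping constant are assumptions), the following NumPy function computes FL for a batch of prediction probabilities:

```python
import numpy as np

def multiclass_focal_loss(probs, labels, alpha, gamma=2.0):
    """Multi-class focal loss, a sketch of the formula above.

    probs:  (N, M) array of predicted probabilities p_ic
    labels: (N,)   integer class index of each sample
    alpha:  (M,)   per-class weight alpha_c
    gamma:  focusing factor
    """
    N, M = probs.shape
    # One-hot encode y_ic: 1 if sample i belongs to class c, else 0
    y = np.eye(M)[labels]
    # Clip to avoid log(0)
    p = np.clip(probs, 1e-12, 1.0)
    # FL = -(1/N) * sum_i sum_c alpha_c * y_ic * (1 - p_ic)^gamma * log(p_ic)
    return -(y * alpha * (1.0 - p) ** gamma * np.log(p)).sum() / N
```

With γ = 0 and uniform α the expression reduces to ordinary averaged cross-entropy; increasing γ shrinks the contribution of well-classified samples.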
With this technical scheme, the prediction probability of each sample is substituted into the multi-class focal loss function, which up-weights the hard-to-classify negative samples and down-weights the easy-to-classify positive samples, so that after repeated iterative training the preset enterprise information classification model focuses more on the negative samples that are difficult to learn.
Optionally, the obtaining module obtains the total number n of samples belonging to sample type c, where sample type c is any one of the plurality of sample types; the processing module then determines the weight value for sample type c based on the ratio of n to N.
With this technical scheme, the weight of each sample type is adjusted according to the ratio of its sample count n to the total number of samples N in the training sample library: the weight is increased for negative samples and decreased for positive samples, so the preset enterprise information classification model concentrates more on the hard-to-learn negative samples, increasing its accuracy on abnormal product words.
Optionally, the preset enterprise information classification model comprises a plurality of Transformer neural network layers, and before the training sample library is input into the model, the method further includes: the processing module calculates the total number of samples in the training sample library and matches this total against a preset model training database to obtain the number of Transformer layers corresponding to that sample count, where the preset model training database stores correspondences between sample counts and neural network models.
With this technical scheme, the total number of samples in the training sample library is calculated to determine the amount of data the enterprise information classification model needs to learn; the total is then matched against the preset model training database to find a neural network model trained on a similar amount of data, and the number of Transformer layers of that model is used as a reference for setting the number of Transformer layers of the enterprise information classification model, reducing the risk that the model fails to converge.
Optionally, before inputting the training sample library into the enterprise information classification model, the method further includes: the processing module connects feature marks to the sentence head and the sentence tail of the plurality of samples.
With this technical scheme, a feature mark attached at the head of a sentence marks the beginning of the sequence; during training it helps the model recognize the sentence's semantics, improving training efficiency and accuracy. A feature mark attached at the end of a sentence marks the end of the sequence, so the model can correctly process input consisting of multiple sentences.
Optionally, after training of the preset enterprise information classification model is complete, the method further includes: the acquisition module obtains an enterprise search word input by the user; the processing module retrieves the target enterprise data corresponding to the search word from a preset enterprise database, which contains the enterprise data of a plurality of enterprises; the target enterprise data is input into the preset enterprise information classification model to obtain the product words corresponding to that data, including both correct and abnormal product words; and these product words are displayed to the user.
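The query flow above (search word → retrieve enterprise data → classify product words → display) can be sketched as follows. All names and the dictionary-backed database are illustrative assumptions, not from the patent:

```python
def handle_enterprise_search(search_word, enterprise_db, classify_product_words):
    """Sketch of the post-training query flow described above.

    enterprise_db: dict mapping enterprise name -> raw enterprise data (text);
        stands in for the preset enterprise database
    classify_product_words: the trained model as a callable returning a list of
        (product_word, label) pairs, label in {"correct", "abnormal"}
    """
    target_data = enterprise_db.get(search_word)
    if target_data is None:
        return []  # no enterprise matched the search word
    # The trained classifier yields both correct and abnormal product words
    labeled_words = classify_product_words(target_data)
    # Both kinds of product words are presented to the user
    return [word for word, _label in labeled_words]
```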
With this technical scheme, the preset enterprise information classification model recognizes abnormal product words with higher accuracy; for an enterprise search word input by the user, the model can therefore accurately identify the corresponding product words, including both correct and abnormal ones. This solves the problem that, when a user queries an enterprise's main business product words, some product-word information would otherwise be missing from the results presented to the user.
Optionally, the obtaining module determines the main product category of the target enterprise data based on the product categories of the correct product words corresponding to that data, and retrieves the product words belonging to that main product category from a preset product word database, which stores correspondences between product categories and product words. The processing module matches each abnormal product word of the target enterprise data against the product words of the main product category one by one, determines the correct product word corresponding to each abnormal product word, and replaces each abnormal product word with its correct counterpart.
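The patent does not specify how an abnormal word is matched to its correct counterpart; one minimal way to realize the matching step, sketched here using the standard library's difflib similarity (an assumption), is:

```python
import difflib

def correct_abnormal_words(abnormal_words, category_words, cutoff=0.6):
    """Match each abnormal product word against the main category's product
    words and return its closest correct form. difflib's sequence similarity
    is an illustrative choice, not the patent's disclosed algorithm."""
    corrections = {}
    for word in abnormal_words:
        # Closest category word above the similarity cutoff, if any
        matches = difflib.get_close_matches(word, category_words, n=1, cutoff=cutoff)
        if matches:
            corrections[word] = matches[0]
    return corrections
```

Words with a missing or wrong character score highly against their correct form, so a moderate cutoff recovers the intended replacement while leaving unrelated words untouched.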
With this technical scheme, since abnormal product words often contain missing or wrong characters, the abnormal product words are corrected before the product words of the target enterprise data are displayed, and the corrected words are shown to the user, improving the user's reading experience.
In a third aspect, the present application provides an electronic device comprising a processor, a memory for storing instructions, a user interface, and a network interface for communicating with other devices; the processor is configured to execute the instructions stored in the memory, causing the electronic device to perform the method of any of the first aspects.
In a fourth aspect, the present application provides a computer readable storage medium storing instructions which, when executed, perform the method of any one of the first aspects.
In summary, one or more technical solutions provided in the embodiments of the present application at least have the following technical effects or advantages:
1. Adjusting the ratio of positive to negative samples in the training sample library so that positives outnumber negatives reproduces the sample-imbalance problem found in real data. The training sample library is then fed into the preset enterprise information classification model for training, and a multi-class focal loss function is introduced to reduce the attention paid to positive samples (decreasing their training loss) and increase the attention paid to negative samples (increasing their training loss), so the model devotes more training iterations to negative samples and its accuracy in recognizing abnormal product words improves.
2. Substituting the prediction probability of each sample into the multi-class focal loss function up-weights the hard-to-classify negative samples and down-weights the easy-to-classify positive samples, so that after repeated iterative training the preset enterprise information classification model focuses more on the negative samples that are difficult to learn.
Drawings
Fig. 1 is a schematic structural diagram of an enterprise information classification model according to an embodiment of the present application.
FIG. 2 is a flowchart illustrating a method for processing abnormal words of an enterprise product according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of an apparatus for processing abnormal words of an enterprise product according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Reference numerals illustrate: 1. an acquisition module; 2. a processing module; 400. an electronic device; 401. a processor; 402. a communication bus; 403. a user interface; 404. a network interface; 405. a memory.
Detailed Description
In order that those skilled in the art will better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments.
In describing embodiments of the present application, words such as "for example" or "such as" are used to present examples, illustrations, or descriptions. Any embodiment or design described with "for example" or "such as" should not be construed as preferred over or more advantageous than other embodiments or designs; rather, such words are intended to present related concepts in a concrete fashion.
In the description of embodiments of the application, the term "plurality" means two or more. For example, a plurality of systems means two or more systems, and a plurality of screen terminals means two or more screen terminals. Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating an indicated technical feature. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
Before describing the embodiments, the present application provides a schematic structural diagram of the enterprise information classification model. The model adopts the structure of a BERT model as its pre-training architecture and comprises a plurality of Transformer neural network layers, each of which includes a multi-head attention layer, a feed-forward network layer, and a normalization layer. During training, the training text is first input to a vector converter, which converts each word or character into a numeric vector. The numeric vectors are then input to the multi-head attention layer, which extracts a feature sequence by applying different linear transformations to the input sequence and computing attention weights, capturing relationships between different positions in the input. The feature sequence is then passed to the feed-forward network layer, which applies a nonlinear transformation via a multi-layer perceptron (MLP). Finally, the result is fed into a classifier formed by fully connected layers to produce the prediction probability of the training text's category, and these steps are repeated until the loss function of the enterprise information classification model converges.
A normalization layer follows both the multi-head attention layer and the feed-forward network layer; it normalizes the inputs within each layer, mitigating vanishing and exploding gradients and helping the network train more stably. Dropout regularization is applied between the multi-head attention layer and its normalization layer and between the feed-forward layer and its normalization layer: during training, some neurons are randomly discarded with a certain probability, so each neuron has a chance of being turned off, reducing co-adaptation among neurons and improving the model's robustness and generalization. In addition, a residual connection is used around each of the two sublayers: the layer's input is added directly to its output to preserve the input information, alleviating the vanishing-gradient problem and making it easier for the network to learn deeper features.
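The residual + dropout + normalization pattern described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: real Transformer layers add learned scale and shift parameters to the normalization, and dropout is disabled at inference time.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each sample's features to zero mean / unit variance
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def sublayer(x, fn, drop_prob=0.1, rng=None):
    """Wrap a sublayer fn (attention or feed-forward) with dropout, a residual
    connection, and layer normalization, mirroring the structure above."""
    rng = rng or np.random.default_rng(0)
    out = fn(x)
    # Dropout: randomly zero some activations to reduce co-adaptation
    mask = rng.random(out.shape) >= drop_prob
    out = out * mask / (1.0 - drop_prob)
    # Residual connection keeps the input information, easing gradient flow
    return layer_norm(x + out)
```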
Currently, to train a neural network model with good performance, the training set is often constructed with a 1:1 ratio of positive to negative samples. For the enterprise data classification model, however, the negative samples are drawn from abnormal words in enterprise information, and since abnormal words are low-probability events, their number is necessarily small. The ratio of positive to negative samples in the training set is therefore severely unbalanced, for example 20:1, which reduces the accuracy of the enterprise data classification model in recognizing and classifying abnormal words.
In order to solve the above problems, the present application provides a method for processing abnormal words of an enterprise product, which is applied to a server, as shown in fig. 2, and the method includes steps S101 to S104.
S101, obtaining a training sample library comprising a plurality of samples, including positive samples (a correct product word with its corresponding sample type) and negative samples (an abnormal product word with its corresponding sample type), where the ratio of positive to negative samples is a preset proportion and the number of positive samples is larger than the number of negative samples.
In this step, the server first obtains enterprise data for a plurality of enterprises from an enterprise database, which may be a national registry of enterprise information or the database of an enterprise-information query platform. Keywords are then extracted from the enterprise data; in the embodiment of the present application, the keywords are product words for the enterprise's main products. The keywords include correct keywords, i.e., words that express the product information accurately and completely, such as "notebook computer", and abnormal keywords, i.e., words that express the product information with missing or wrong characters, such as "notebok computer". Each correct keyword and its corresponding sample type are constructed as a positive sample, and each abnormal keyword and its corresponding sample type as a negative sample. The positive and negative samples are then assembled into a training sample library at a preset proportion; in the present application the number of positive samples can far exceed the number of negative samples, with a preferred ratio of 15:1.
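Assembling the library at the preset 15:1 proportion might look like the following sketch. The patent does not say how surplus positive samples are selected, so random subsampling (and the function names) are assumptions:

```python
import random

def build_training_library(positive_samples, negative_samples, ratio=15, seed=42):
    """Assemble a training sample library at the preset positive:negative
    ratio (15:1 in the embodiment above). Samples are
    (product_word, sample_type) pairs."""
    rng = random.Random(seed)
    # Keep at most `ratio` positives per negative; subsample the rest randomly
    n_pos = min(len(positive_samples), ratio * len(negative_samples))
    library = rng.sample(positive_samples, n_pos) + list(negative_samples)
    rng.shuffle(library)
    return library
```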
S102, inputting the training sample library into a preset enterprise information classification model to obtain the prediction probability of the sample category corresponding to each of the plurality of samples.
In this step, the training samples input to the preset enterprise information classification model require vector conversion and feature marking. Vector conversion uses a vector converter to turn each training sample from text into a sequence of numeric vectors. Feature marks are then attached to the vector sequence: a mark field is connected to the beginning and end of each sample's sentence. For example, [CLS] can be attached at the head of the sentence and [SEP] at its end, giving: [CLS] product word [SEP]. Here [CLS] marks the beginning of the sequence; during training, the BERT model encodes the entire sequence with respect to [CLS] and generates a corresponding feature representation that can be used to identify the overall semantics of the sentence. [SEP] marks the end of the sequence so the model can correctly handle input containing multiple sentences.
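The marking step is shown below on raw text for clarity (in practice a BERT tokenizer attaches the special tokens during tokenization; this standalone helper is an illustrative assumption):

```python
def add_feature_marks(samples):
    """Wrap each sample sentence with the BERT-style sequence markers
    described above: [CLS] marks the beginning, [SEP] marks the end."""
    return ["[CLS] " + s + " [SEP]" for s in samples]
```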
The enterprise information classification model comprises a plurality of Transformer neural network layers, whose self-attention mechanism addresses the difficulties of information propagation and low parallel-computation efficiency when processing long sequences. The number of Transformer layers determines the model's capacity and expressive power: more layers mean greater capacity, but also greater complexity and computation, making the model prone to vanishing or exploding gradients and hard to converge. The layer count therefore needs to be chosen according to the size of the training sample library. The present application calculates the total number of samples in the training sample library to determine the amount of data the model must learn, matches this total against the preset model training database to find a neural network model trained on a similar number of samples, and uses that model's Transformer layer count as the layer count of the enterprise information classification model. The preset model training database stores correspondences between sample counts and neural network models.
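The matching step above amounts to a nearest-neighbor lookup over previously trained models. A sketch, with wholly illustrative sample counts and layer numbers (the patent gives no concrete values):

```python
def match_layer_count(total_samples, model_training_db):
    """Pick the Transformer layer count whose reference sample count is
    closest to the training library size. model_training_db maps sample
    counts of previously trained models to their layer counts."""
    nearest = min(model_training_db, key=lambda n: abs(n - total_samples))
    return model_training_db[nearest]
```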
In this way, the number of Transformer layers of the enterprise information classification model is determined by reference to an already-trained neural network model, avoiding the situation where the model is difficult to converge.
After vector conversion and feature marking, the training samples are input into the Transformer neural network layers for model training, finally yielding the prediction probability of the sample category for each sample. The prediction probability is the probability that a sample belongs to a given sample category; for example, the prediction probability that "notebok computer" belongs to the positive class might be 10%.
S103, calculating a loss function value of a preset enterprise information classification model by adopting a multi-classification focus loss function based on the prediction probabilities of a plurality of samples.
In this step, the prediction probabilities of the samples may or may not be accurate, and the enterprise information classification model's assignment of sample types must be corrected accordingly. A multi-class focal loss function is therefore introduced, defined as follows:
FL = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} α_c · y_ic · (1 − p_ic)^γ · log(p_ic)

where FL is the loss function value; N is the total number of samples in the training sample library; M is the total number of sample types; p_ic is the prediction probability that the i-th sample belongs to sample type c; y_ic is the true label, taking 1 if the i-th sample belongs to sample type c and 0 otherwise; α_c is the weight value for sample type c; and γ is the focusing factor.
In the above formula, a weight value α_c is set for each sample type c; in the present application there are two sample types, positive and negative. The weight value of any sample type c is determined as follows: obtain the total number n of samples of type c and compute the ratio of n to the total number of samples N in the training sample library, from which the weight of type c is determined. If type c is the positive class, a larger ratio yields a smaller weight; if type c is the negative class, a larger ratio yields a larger weight. This balances the difference in sample counts between types, so the model pays more attention to the type with fewer samples and less attention to the type with many samples. The enterprise information classification model accordingly devotes more training iterations to negative samples and fewer to positive samples, improving its recognition accuracy on negative samples.
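One way to derive α_c from the type counts is inverse-frequency weighting normalized to sum to 1, so the scarce negative type receives the larger weight. The exact mapping is not given in the patent, so this concrete formula is an assumption:

```python
def class_weights(type_counts):
    """Derive alpha_c from each type's share of the training library:
    weight inversely proportional to frequency, normalized to sum to 1."""
    total = sum(type_counts.values())
    inv = {c: total / n for c, n in type_counts.items()}  # rarer -> larger
    norm = sum(inv.values())
    return {c: w / norm for c, w in inv.items()}
```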
In addition, the term (1 - p_ic)^γ is used to adjust the weight between easily classified samples and difficult-to-classify samples. When a sample is misclassified, p_ic is small and (1 - p_ic)^γ is large, so the focal factor γ amplifies the error, thereby increasing the attention of the enterprise information classification model to difficult-to-classify samples, i.e., increasing the attention to negative samples. When a sample is correctly classified, p_ic is large and (1 - p_ic)^γ is small, so the focal factor γ reduces the error, thereby reducing the weight the enterprise information classification model assigns to easily classified samples, i.e., reducing the attention to positive samples. The basic principle for setting γ is to set an initial value first and then adjust the focal factor according to the performance of the model during training, so as to obtain an optimal focal factor, where performance can be understood as the descent speed of the loss function, the classification accuracy, and the like. In this way, the number of training iterations the enterprise information classification model spends on difficult-to-classify samples is increased and the number spent on easily classified samples is reduced, thereby improving the recognition accuracy of the model on negative samples.
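The "set an initial value, then adjust according to performance" procedure for the focal factor can be sketched as a simple search over candidate values. The candidate list and the evaluation callback below are assumptions; the text does not fix either:

```python
def tune_focal_factor(evaluate, candidates=(0.5, 1.0, 2.0, 5.0)):
    """Pick the focal factor gamma yielding the best model performance.

    `evaluate(gamma)` is assumed to train (or partially train) the model
    with the given gamma and return a performance score, e.g. the
    classification accuracy or the negative of the final loss value.
    """
    best_gamma, best_score = None, float("-inf")
    for gamma in candidates:
        score = evaluate(gamma)
        if score > best_score:
            best_gamma, best_score = gamma, score
    return best_gamma
```

In practice the score could combine the descent speed of the loss function with the classification accuracy, as suggested above.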
Finally, by taking a weighted average over the samples of all sample categories, the importance of the different categories is balanced, which improves the classification performance of the model on the whole data set. In particular, when an imbalanced data set is processed, the model can better learn the features and patterns of the sample categories with fewer samples, improving the overall classification accuracy.
And S104, when the loss function value of the preset enterprise information classification model is smaller than or equal to a preset threshold value, determining that training of the preset enterprise information classification model is completed.
In the above step, after calculating the loss function value of the preset enterprise information classification model, the loss function value is compared with the preset threshold, and if the loss function value of the preset enterprise information classification model is less than or equal to the preset threshold, it can be determined that the training of the preset enterprise information classification model is completed.
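The training procedure up to step S104 can be summarized as a loop of the following shape. This is a schematic sketch only; `model`, `loss_fn`, and `optimizer` are placeholders for whatever framework is actually used:

```python
def train(model, sample_library, loss_fn, optimizer,
          threshold=0.05, max_epochs=100):
    """Iterate until the loss function value drops to the preset threshold."""
    for epoch in range(max_epochs):
        probs = model.predict(sample_library.inputs)   # prediction probabilities
        loss = loss_fn(probs, sample_library.labels)   # multi-class focal loss
        if loss <= threshold:
            return True, epoch    # training of the model is completed
        optimizer.step(model, loss)                    # update parameters
    return False, max_epochs
```

The threshold value 0.05 is an illustrative placeholder for the preset threshold; the comparison `loss <= threshold` corresponds directly to the completion condition of step S104.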
In one possible implementation, the method aims to solve the problem that, when a user queries the product words of an enterprise's main business, the main business presented to the user lacks some of the product words. The user logs in to the enterprise query website through the user equipment, inputs an enterprise search word in the enterprise information search field, and clicks search; at this point, the user equipment sends a search request to the server together with the enterprise search word. The user equipment may be a mobile phone, a smart phone, a notebook computer, a desktop computer, or the like, which is not limited herein. After receiving the search request, the server acquires the enterprise search word input by the user, where the enterprise search word may be a passage of text or several individual words. Then, based on the enterprise search word, the server invokes target enterprise data corresponding to the enterprise search word from a preset enterprise database, where the preset enterprise database includes enterprise data of a plurality of enterprises. Finally, the server inputs the target enterprise data into the preset enterprise information classification model; because the model acquires a high recognition capability for abnormal product words during training, the plurality of product words obtained for the target enterprise data include both correct product words and abnormal product words. Abnormal product words often contain missing or incorrect characters. Therefore, before the plurality of product words corresponding to the target enterprise data are displayed to the user, the abnormal product words are corrected, thereby improving the user's reading experience.
The process is specifically as follows: the dominant product category of the target enterprise data is determined based on the product categories of the plurality of correct product words corresponding to the target enterprise data. For example, if the plurality of correct product words includes notebook computer, smart phone, smart watch, and smart bracelet, the dominant product category of the target enterprise data may be determined to be electronic products. Then, based on the dominant product category of the target enterprise data, a plurality of product words corresponding to the dominant product category are called from a preset product word database, where the preset product word database includes the correspondence between product categories and product words;
the plurality of abnormal product words corresponding to the target enterprise data are then matched one by one against the plurality of product words corresponding to the dominant product category, and the correct product word corresponding to each abnormal product word is determined. For example, if an abnormal product word closely matches the product word "game machine" included in the plurality of product words corresponding to the dominant product category, the correct product word corresponding to that abnormal product word is determined to be "game machine". Finally, the plurality of abnormal product words corresponding to the target enterprise data are replaced with their corresponding correct product words, and the corrected product words are displayed to the user.
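The correction flow above amounts to looking up the dominant category's vocabulary and replacing each abnormal word with its closest entry. A minimal sketch follows; edit-distance similarity via `difflib` is an assumed matching criterion, since the matching rule is not fixed here:

```python
import difflib

def correct_abnormal_words(abnormal_words, category_vocab, cutoff=0.6):
    """Map each abnormal product word to the closest correct product word
    in the dominant product category's vocabulary; leave it unchanged if
    no sufficiently close match exists."""
    corrected = {}
    for word in abnormal_words:
        matches = difflib.get_close_matches(word, category_vocab,
                                            n=1, cutoff=cutoff)
        corrected[word] = matches[0] if matches else word
    return corrected
```

A production system would presumably tune the similarity cutoff and handle multi-character-set text, but the structure — match against the dominant category's product words, then substitute — is as described.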
The application also provides a device for processing the abnormal words of the enterprise products, which is a server, as shown in fig. 3, wherein the server comprises an acquisition module 1 and a processing module 2, and the device comprises:
the training system comprises an acquisition module 1, a training sample library and a processing module, wherein the acquisition module 1 is used for acquiring a training sample library, the training sample library comprises a plurality of samples, the plurality of samples comprise positive samples and negative samples, the positive samples are correct product words and corresponding product types, the negative samples are abnormal product words and corresponding sample types, the ratio of the positive samples to the negative samples is a preset ratio, and the number of the positive samples is larger than that of the negative samples;
the processing module 2 is used for inputting the training sample library into a preset enterprise information classification model to obtain the prediction probability of the sample types corresponding to the samples; calculating a loss function value of a preset enterprise information classification model by adopting a multi-classification focal point loss function based on the prediction probabilities of a plurality of samples; and when the loss function value of the preset enterprise information classification model is smaller than or equal to a preset threshold value, determining that training of the preset enterprise information classification model is completed.
In one possible implementation, the multi-class focal point loss function is:
wherein FL is the loss function value; N is the total number of samples in the training sample library; M is the total number of sample types; p_ic is the predicted probability that the i-th sample belongs to sample type c; y_ic is the true label indicating whether the i-th sample belongs to sample type c, taking the value 1 if it does and 0 otherwise; α_c is the weight value of sample type c; and γ is the focal factor.
In one possible implementation manner, the acquisition module 1 obtains the total number n of samples corresponding to a sample type c, wherein the sample type c is any one of the plurality of sample types; the processing module 2 then determines the weight value of sample type c based on the ratio of N to n.
In one possible implementation, the preset enterprise information classification model comprises a plurality of transformer neural network layers, and before the training sample library is input into the preset enterprise information classification model, the method further comprises: the processing module 2 calculates the total number of samples in the training sample library, and matches the total number of samples against a preset model training database to obtain the number of transformer neural network layers corresponding to that total, wherein the preset model training database comprises the correspondence between sample counts and neural network models.
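The layer-count selection can be modeled as a simple lookup against the preset model training database. The thresholds and layer counts below are illustrative assumptions only, not values from this application:

```python
# (sample-count upper bound, number of transformer layers) — illustrative values
LAYER_TABLE = [
    (10_000, 4),
    (100_000, 8),
    (1_000_000, 12),
]

def select_layer_count(total_samples, table=LAYER_TABLE):
    """Return the transformer layer count matched to the sample total."""
    for upper_bound, layers in table:
        if total_samples <= upper_bound:
            return layers
    return table[-1][1]   # largest configuration for very large libraries
```

The design intent is that larger training sample libraries justify deeper models, while small libraries are matched to shallower configurations to limit overfitting and training cost.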
In one possible implementation, before inputting the training sample library into the enterprise information classification model, the method further includes: the processing module 2 connects feature labels to the sentence head and the sentence tail of the plurality of samples.
In one possible implementation manner, after determining that training of the preset enterprise information classification model is completed, the method further includes: the method comprises the steps that an acquisition module 1 acquires enterprise search words input by a user; the processing module 2 calls target enterprise data corresponding to the enterprise search word from a preset enterprise database based on the enterprise search word, wherein the preset enterprise database comprises enterprise data of a plurality of enterprises; inputting the target enterprise data into a preset enterprise information classification model to obtain a plurality of product words corresponding to the target enterprise data, wherein the plurality of product words comprise correct product words and abnormal product words; and displaying the plurality of product words corresponding to the target enterprise data to the user.
In one possible implementation, the acquisition module 1 determines a dominant product category of the target enterprise data based on the product categories of the plurality of correct product words corresponding to the target enterprise data, and calls a plurality of product words corresponding to the dominant product category from a preset product word database based on the dominant product category, wherein the preset product word database comprises the correspondence between product categories and product words; the processing module 2 matches the plurality of abnormal product words corresponding to the target enterprise data one by one against the plurality of product words corresponding to the dominant product category, determines the correct product word corresponding to each abnormal product word, and replaces the plurality of abnormal product words corresponding to the target enterprise data with their corresponding correct product words.
It should be noted that: in the device provided in the above embodiment, when implementing the functions thereof, only the division of the above functional modules is used as an example, in practical application, the above functional allocation may be implemented by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to implement all or part of the functions described above. In addition, the embodiments of the apparatus and the method provided in the foregoing embodiments belong to the same concept, and specific implementation processes of the embodiments of the method are detailed in the method embodiments, which are not repeated herein.
The application further provides electronic equipment. Referring to fig. 4, fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device 400 may include: at least one processor 401, at least one network interface 404, a user interface 403, a memory 405, and at least one communication bus 402.
Wherein communication bus 402 is used to enable connected communications between these components.
The user interface 403 may include a Display screen (Display) and a Camera (Camera), and the optional user interface 403 may further include a standard wired interface and a standard wireless interface.
The network interface 404 may optionally include a standard wired interface, a wireless interface (e.g., WI-FI interface), among others.
Wherein the processor 401 may include one or more processing cores. The processor 401 connects the various parts within the entire server using various interfaces and lines, and performs the various functions of the server and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 405 and invoking data stored in the memory 405. Alternatively, the processor 401 may be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), or programmable logic array (Programmable Logic Array, PLA). The processor 401 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processor (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing the content to be displayed by the display screen; and the modem is used to handle wireless communications. It will be appreciated that the modem may instead not be integrated into the processor 401 and be implemented by a separate chip.
The Memory 405 may include a random access Memory (Random Access Memory, RAM) or a Read-Only Memory (Read-Only Memory). Optionally, the memory 405 includes a non-transitory computer readable medium (non-transitory computer-readable storage medium). Memory 405 may be used to store instructions, programs, code sets, or instruction sets. The memory 405 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the above-described various method embodiments, etc.; the storage data area may store data or the like involved in the above respective method embodiments. The memory 405 may also optionally be at least one storage device located remotely from the aforementioned processor 401. Referring to fig. 4, an operating system, a network communication module, a user interface module, and an application program of a method of handling an enterprise product anomaly may be included in a memory 405, which is a computer storage medium.
In the electronic device 400 shown in fig. 4, the user interface 403 is mainly used as an interface for providing input for a user, and obtains data input by the user; and processor 401 may be used to invoke an application in memory 405 that stores a method of handling an enterprise product anomaly, which when executed by one or more processors 401, causes electronic device 400 to perform the method as described in one or more of the embodiments above. It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all of the preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative, such as a division of units, merely a division of logic functions, and there may be additional divisions in actual implementation, such as multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some service interface, device or unit indirect coupling or communication connection, electrical or otherwise.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable memory. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in whole or in part in the form of a software product stored in a memory, comprising several instructions for causing a computer device (which may be a personal computer, a server or a network device, etc.) to perform all or part of the steps of the method of the various embodiments of the present application. And the aforementioned memory includes: various media capable of storing program codes, such as a U disk, a mobile hard disk, a magnetic disk or an optical disk.
The foregoing is merely exemplary embodiments of the present disclosure and is not intended to limit the scope of the present disclosure. That is, equivalent changes and modifications are contemplated by the teachings of this disclosure, which fall within the scope of the present disclosure. Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure.
This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a scope and spirit of the disclosure being indicated by the claims.

Claims (10)

1. A method for processing abnormal words of an enterprise product, which is applied to a server, the method comprising:
the method comprises the steps that a training sample library is obtained, the training sample library comprises a plurality of samples, the plurality of samples comprise positive samples and negative samples, the positive samples are sample types corresponding to correct product words, the negative samples are sample types corresponding to abnormal product words, the ratio of the positive samples to the negative samples is a preset ratio, and the number of the positive samples is larger than that of the negative samples;
inputting the training sample library into the preset enterprise information classification model to obtain the prediction probability of the sample types corresponding to the samples;
calculating a loss function value of the preset enterprise information classification model by adopting a multi-classification focal point loss function based on the prediction probabilities of a plurality of samples;
and when the loss function value of the preset enterprise information classification model is smaller than or equal to a preset threshold value, determining that training of the preset enterprise information classification model is completed.
2. The method of claim 1, wherein the multi-class focal point loss function is:

FL = -(1/N) Σ_{i=1}^{N} Σ_{c=1}^{M} α_c · y_ic · (1 - p_ic)^γ · log(p_ic)
wherein FL is the loss function value; N is the total number of samples in the training sample library; M is the total number of sample types; p_ic is the predicted probability that the i-th sample belongs to sample type c; y_ic is the true label indicating whether the i-th sample belongs to sample type c, taking the value 1 if it does and 0 otherwise; α_c is the weight value of sample type c; and γ is the focal factor.
3. The method according to claim 2, characterized in that the determination of the weight value of the sample type c is in particular:
obtaining the total number n of samples corresponding to the sample type c, wherein the sample type c is any one of a plurality of sample types; and determining the weight value of the sample type c based on the ratio of N to n.
4. The method of claim 1, wherein the pre-set business information classification model comprises a plurality of transformer neural network layers, and wherein before inputting the training sample library into the pre-set business information classification model, further comprising:
calculating the total number of samples in the training sample library;
and matching the total number of the samples with a preset model training database to obtain the number of transformer neural network layers corresponding to the total number of the samples, wherein the preset model training database comprises the correspondence between sample counts and neural network models.
5. The method of claim 1, wherein prior to inputting the training sample library into the enterprise information classification model, further comprising:
and connecting feature marks to the sentence heads and sentence ends of a plurality of samples.
6. The method of claim 1, wherein after the determining that the training of the preset enterprise information classification model is completed, further comprising:
acquiring enterprise search words input by a user;
invoking target enterprise data corresponding to the enterprise search word from a preset enterprise database based on the enterprise search word, wherein the preset enterprise database comprises enterprise data of a plurality of enterprises;
inputting the target enterprise data into the preset enterprise information classification model to obtain a plurality of product words corresponding to the target enterprise data, wherein the plurality of product words comprise correct product words and abnormal product words;
and displaying a plurality of product words corresponding to the target enterprise data to the user.
7. The method of claim 6, wherein the presenting the plurality of product words corresponding to the target enterprise data to the user further comprises:
determining a main product category of the target enterprise data based on product categories of a plurality of correct product words corresponding to the target enterprise data;
calling a plurality of product words corresponding to the main product category from a preset product word database based on the main product category, wherein the preset product word database comprises the corresponding relation between the product category and the product word;
matching a plurality of abnormal product words corresponding to the target enterprise data with a plurality of product words corresponding to the main product category one by one, and determining correct product words corresponding to the abnormal product words;
and replacing the plurality of abnormal product words corresponding to the target enterprise data with the correct product words corresponding to the plurality of abnormal product words.
8. The device for processing the abnormal words of the enterprise product is characterized by being a server, wherein the server comprises an acquisition module (1) and a processing module (2), and the device comprises the following components:
the acquisition module (1) is configured to acquire a training sample library, where the training sample library includes a plurality of samples, the plurality of samples include positive samples and negative samples, the positive samples are correct product words and their corresponding product types, the negative samples are abnormal product words and their corresponding sample types, the ratio of the positive samples to the negative samples is a preset ratio, and the number of the positive samples is greater than the number of the negative samples;
the processing module (2) is used for inputting the training sample library into the preset enterprise information classification model to obtain the prediction probability of the sample types corresponding to the samples; calculating a loss function value of the preset enterprise information classification model by adopting a multi-classification focal point loss function based on the prediction probabilities of a plurality of samples; and when the loss function value of the preset enterprise information classification model is smaller than or equal to a preset threshold value, determining that training of the preset enterprise information classification model is completed.
9. An electronic device comprising a processor (401), a memory (405), a user interface (403) and a network interface (404), the memory (405) being configured to store instructions, the user interface (403) and the network interface (404) being configured to communicate with other devices, the processor (401) being configured to execute the instructions stored in the memory (405) to cause the electronic device (400) to perform the method according to any of claims 1 to 7.
10. A computer readable storage medium storing instructions which, when executed, perform the method of any one of claims 1 to 7.
CN202310915101.0A 2023-07-25 2023-07-25 Enterprise product word abnormality processing method and device and electronic equipment Pending CN117195035A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310915101.0A CN117195035A (en) 2023-07-25 2023-07-25 Enterprise product word abnormality processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310915101.0A CN117195035A (en) 2023-07-25 2023-07-25 Enterprise product word abnormality processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN117195035A true CN117195035A (en) 2023-12-08

Family

ID=88985840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310915101.0A Pending CN117195035A (en) 2023-07-25 2023-07-25 Enterprise product word abnormality processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN117195035A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination