CN111681670B - Information identification method, device, electronic equipment and storage medium - Google Patents

Information identification method, device, electronic equipment and storage medium

Info

Publication number
CN111681670B
CN111681670B (application number CN201910141241.0A)
Authority
CN
China
Prior art keywords: word, identified, probability value, text information, ith
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910141241.0A
Other languages
Chinese (zh)
Other versions
CN111681670A (en)
Inventor
刘纯一
柳俊宏
薛艳云
王鹏
李奘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN201910141241.0A priority Critical patent/CN111681670B/en
Publication of CN111681670A publication Critical patent/CN111681670A/en
Application granted granted Critical
Publication of CN111681670B publication Critical patent/CN111681670B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G10L25/51 — Speech or voice analysis techniques specially adapted for comparison or discrimination
    • G10L15/02 — Feature extraction for speech recognition; selection of recognition unit
    • G10L15/063 — Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/08 — Speech classification or search
    • G10L15/142 — Speech classification or search using statistical models; Hidden Markov Models [HMMs]
    • G10L15/26 — Speech to text systems


Abstract

An embodiment of the application provides an information identification method, apparatus, electronic device, and storage medium, belonging to the field of information processing. In the method, the probability value that each piece of text information to be identified is text information of a target type is obtained; each probability value is then compared with a preset probability value, and the target pieces of text information belonging to the target type are determined among the plurality of pieces to be identified according to the resulting comparison.

Description

Information identification method, device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of information processing, and in particular, to an information identification method, an information identification device, an electronic device, and a storage medium.
Background
Taking an online ride-hailing scenario as an example: to further ensure passenger safety, the dialogue between driver and passenger during a trip can be recorded and analyzed to judge whether a conflict has arisen between them. However, the dialogue acquired during a trip is usually mixed with other noise. For example, if navigation is enabled, the navigation broadcast is captured; if the driver is playing songs or listening to the radio, those sounds are likewise mixed into the recording. To obtain only the driver-passenger dialogue for analysis, this noise must first be removed.
At present, such noise is identified by template matching. For example, to identify navigation speech, a large number of navigation-sound templates must be manually configured in advance; if the templates are insufficient, some navigation speech cannot be matched correctly and accuracy suffers, and matching the acquired data one by one against a large number of templates takes too long and is too inefficient.
Disclosure of Invention
In view of the foregoing, an object of an embodiment of the present application is to provide an information identification method, an apparatus, an electronic device, and a storage medium, so as to improve accuracy and efficiency of information identification.
In a first aspect, an embodiment of the present application provides an information identification method, the method including: acquiring a plurality of pieces of text information to be identified; acquiring, for each piece of text information to be identified, a probability value that it is text information of a target type, yielding a plurality of probability values, where text information of the target type is information other than the interaction information generated between a service provider and a service requester during service provision; comparing each probability value with a preset probability value to obtain a comparison result; and determining, according to the comparison result, the target pieces of text information belonging to the target type among the plurality of pieces to be identified.
In this implementation, the probability value that each piece of text information to be identified is of the target type is obtained, each probability value is compared with the preset probability value, and the target pieces belonging to the target type are determined according to the resulting comparison, so that text of the target type can be identified without manually configuring a large number of templates, improving both the accuracy and the efficiency of information identification.
Optionally, acquiring the probability value that each piece of text information to be identified is text information of the target type includes: obtaining the probability value through a preset language model, where the preset language model is obtained by previously inputting a plurality of pieces of training text information of the target type into a language model for training.
In this implementation, since the preset language model is trained on a plurality of pieces of training text information of the target type, it can effectively identify the text information of the target type among the plurality of pieces to be identified.
Optionally, obtaining through the preset language model the probability value that each piece of text information to be identified is text information of the target type includes: extracting the M words of each piece of text information to be identified through the preset language model, where M is an integer greater than or equal to 2; predicting the probability value that the (i+1)-th of the M words appears after the i-th word, and the probability value that the ending character appears after the M-th word, thereby obtaining M probability values for each piece, where i is an integer greater than or equal to 1 and less than M; and obtaining, based on the M probability values for each piece, the probability value that it is text information of the target type.
In this implementation, the M probability values for each piece of text information to be identified are obtained through the preset language model, and the probability value that the piece is of the target type is then obtained from those M values, so the preset language model can effectively identify the target-type text among the pieces to be identified.
Optionally, predicting the probability value that the (i+1)-th of the M words appears after the i-th word, and the probability value that the ending character appears after the M-th word, thereby obtaining M probability values for each piece of text information to be identified, includes: performing these predictions with a softmax classifier in the preset language model.
In this implementation, since the softmax classifier has a good classification-prediction effect, obtaining the probability value for each word through the softmax classifier in the preset language model yields relatively accurate predictions.
Optionally, predicting these probability values with the softmax classifier in the preset language model includes: converting the i-th word of each piece of text information to be identified into a numerical vector through an embedded representation module in the preset language model, obtaining the i-th numerical vector corresponding to the i-th word; obtaining, through an attention module in the language model, the similarity between the i-th word and the other M-1 words as the weight of the i-th word; weighting the i-th numerical vector by the weight of the i-th word to obtain the weighted numerical vector of the i-th word; and inputting the weighted numerical vector into the softmax classifier to predict the probability value that the (i+1)-th word appears after the i-th word, and the probability value that the ending character appears after the M-th word, thereby obtaining M probability values for each piece.
In this implementation, each word is converted into a numerical vector, the weight for each word is obtained, and the weighted vector is input into the softmax classifier for probability prediction, so the preset language model attends to different contexts, its expressive power is improved, and the prediction accuracy is further improved.
Optionally, converting the i-th word of each piece of text information to be identified into a numerical vector through the embedded representation module in the preset language model, to obtain the i-th numerical vector corresponding to the i-th word, includes: converting the i-th word into a first numerical vector through the embedded representation module, and converting the i-th pinyin corresponding to the i-th word into a second numerical vector; and splicing the first numerical vector and the second numerical vector to obtain the i-th numerical vector corresponding to the i-th word.
In this implementation, the numerical vector for the pinyin of each character is obtained and spliced with the numerical vector of the Chinese character itself to form the vector used in subsequent computation, providing a richer basis for recognition and allowing the subsequent steps to identify text information of the target type effectively.
Optionally, before acquiring the plurality of pieces of text information to be identified, the method further includes: acquiring a plurality of pieces of training text information belonging to the target type; and inputting them into a language model for training, obtaining the trained preset language model and the preset probability value output by the preset language model.
In this implementation, the preset language model is obtained by training the language model in advance, so that text information of the target type can be identified effectively.
In a second aspect, an embodiment of the present application provides an information identifying apparatus, including:
the text information acquisition module is used for acquiring a plurality of text information to be identified;
the probability value acquisition module is used for acquiring, for each piece of text information to be identified, a probability value that it is text information of a target type, yielding a plurality of probability values, where text information of the target type is information other than the interaction information generated between a service provider and a service requester during service provision;
the comparison module is used for comparing each probability value with a preset probability value to obtain a comparison result;
and the identification module is used for determining target text information to be identified, which belongs to the target type, in the plurality of text information to be identified according to the comparison result.
Optionally, the probability value obtaining module is specifically configured to obtain, through a preset language model, a probability value of each text message to be identified as a text message of a target type, where the preset language model is obtained by inputting, in advance, a plurality of training text messages of the target type into the language model for training.
Optionally, the probability value acquisition module is specifically configured to:
extracting M words in each text message to be identified through a preset language model, wherein M is an integer greater than or equal to 2;
predicting the probability value of the (i+1) th word in the M words after the (i) th word appears in the M words, and predicting the probability value of the ending character appearing after the M th word, so as to obtain M probability values corresponding to each piece of text information to be identified, wherein i is an integer greater than or equal to 1 and less than M;
and acquiring the probability value of the text information of which each text information to be identified is of the target type based on M probability values corresponding to each text information to be identified.
Optionally, the probability value obtaining module is further configured to predict, by using a softmax classifier in the preset language model, a probability value of an i+1th word in the M words after the i-th word appears in the M words, and predict a probability value of an ending character after the M-th word, so as to obtain M probability values corresponding to each piece of text information to be identified.
Optionally, the probability value acquisition module is further configured to:
converting an ith word in each text message to be identified into a numerical vector through an embedded representation module in the preset language model, and obtaining an ith numerical vector corresponding to the ith word;
obtaining similarity between an ith word and other M-1 words in the M words through an attention module in the language model as the weight of the ith word;
weighting the weight of the ith word and the ith numerical vector corresponding to the ith word to obtain the numerical vector weighted by the ith word;
inputting the numerical vector weighted by the ith word into the softmax classifier to predict the probability value of the (i+1) th word in the M words after the ith word is predicted, and predicting the probability value of the ending character after the Mth word is predicted, so as to obtain M probability values corresponding to each text information to be recognized.
Optionally, the probability value acquisition module is further configured to:
converting an ith word in each text message to be identified into a first numerical vector through an embedded representation module in the preset language model, and converting an ith pinyin corresponding to the ith word into a second numerical vector;
and splicing the first numerical vector and the second numerical vector to obtain an i-th numerical vector corresponding to the i-th word.
Optionally, the apparatus further comprises:
the model training module is used for acquiring a plurality of training text information belonging to the target type; and inputting the training text information of the target type into a language model for training, and obtaining a trained preset language model and a preset probability value output by the preset language model.
In a third aspect, embodiments of the present application provide an electronic device comprising a processor and a memory storing computer readable instructions that, when executed by the processor, perform the steps of the method as provided in the first aspect above.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method as provided in the first aspect above.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be apparent from the description, or may be learned by practice of the embodiments of the application. The objectives and other advantages of the application will be realized and attained by the structure particularly pointed out in the written description and claims thereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered limiting the scope, and that other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a schematic diagram of exemplary hardware and software components of an electronic device that may implement the concepts of the present application, according to some embodiments of the present application;
fig. 2 is a flowchart of an information identification method provided in an embodiment of the present application;
fig. 3 is a flowchart of sub-steps of step S120 in an information identifying method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a language model according to an embodiment of the present disclosure;
fig. 5 is a block diagram of an information identifying apparatus according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
In order to enable those skilled in the art to understand the present application, the following embodiments are described in connection with a specific application scenario, online ride-hailing. It will be apparent to those having ordinary skill in the art that the general principles defined herein may be applied to other embodiments and applications without departing from the spirit and scope of the present application. Although the present application is primarily described in terms of ride-hailing, it should be understood that this is but one exemplary embodiment. The present application may be applied to any other transportation type, and may also include any service system for providing services, e.g., a system for sending and/or receiving express deliveries, or a service system for transactions between buyers and sellers. Applications of the systems or methods of the present application may include web pages, browser plug-ins, client terminals, customization systems, internal analysis systems, or artificial intelligence robots, etc., or any combination thereof.
It should be noted that the term "comprising" will be used in the embodiments of the present application to indicate the presence of the features stated hereinafter, but not to exclude the addition of other features.
The terms "driver," "provider," "service provider," and "service provider" are used interchangeably herein to refer to a person, entity, or tool that can provide a service. The terms "passenger," "requestor," "attendant," "service requestor," and "customer" are used interchangeably herein to refer to a person, entity, or tool that may request or subscribe to a service.
Referring to fig. 1, fig. 1 shows a schematic diagram of exemplary hardware and software components of an electronic device 100 that may implement the concepts of the present application, according to some embodiments of the present application. For example, a processor may be used on electronic device 100 and to perform the functions herein.
The electronic device 100 may be a general purpose computer or a special purpose computer, both of which may be used to implement the data processing methods of the present application. Although only one computer is shown, the functionality described herein may be implemented in a distributed fashion across multiple similar platforms to balance processing loads.
For example, the electronic device 100 may include a network port 110 connected to a network, one or more processors 120 for executing program instructions, a communication bus 130, and various forms of storage media 140, such as magnetic disk, ROM, or RAM, or any combination thereof. By way of example, the computer platform may also include program instructions stored in ROM, RAM, or other types of non-transitory storage media, or any combination thereof. The methods of the present application may be implemented in accordance with these program instructions. The electronic device 100 also includes an Input/Output (I/O) interface 150 between a computer and other Input/Output devices (e.g., keyboard, display screen).
For ease of illustration, only one processor is depicted in the electronic device 100. It should be noted, however, that the electronic device 100 in the present application may also include a plurality of processors, and thus steps performed by one processor described in the present application may also be performed jointly by a plurality of processors or performed separately. For example, if the processor of the electronic device 100 performs step a and step B, it should be understood that step a and step B may also be performed by two different processors together or performed separately in one processor. For example, the first processor performs step a, the second processor performs step B, or the first processor and the second processor together perform steps a and B.
Referring to fig. 2, fig. 2 is a flowchart of an information identifying method according to an embodiment of the present application, where the method includes the following steps:
step S110: and acquiring a plurality of text information to be identified.
In order to judge whether a driver poses a danger to passengers and thereby ensure passenger safety, dialogue data between driver and passenger can be acquired and analyzed to judge whether conflicts or other situations have arisen between them, and further to assess the degree of danger posed by the driver.
Because the dialogue between driver and passenger is spoken, as is the broadcast navigation audio, a plurality of pieces of voice information to be recognized can first be obtained and then converted by speech recognition into the corresponding text, yielding the plurality of pieces of text information to be identified.
Since navigation or radio audio is typically produced while the driver provides the service, the voice information to be recognized contains navigation or broadcast speech in addition to the speech between driver and passenger. Because navigation and broadcast speech is generally fairly standardized, once it is identified effectively, the remaining information can be taken to be the speech between the driver and the passenger.
Methods for obtaining the text information to be identified by performing speech recognition on the voice information include: recognizing the voice information with a hidden Markov model, recognizing it with a language model, recognizing it with an acoustic model, and the like.
The plurality of pieces of voice information to be recognized can be obtained as follows: when the service starts, the ride-hailing platform installed on the driver's terminal automatically starts a recording device, which records all voice information during the trip, i.e., the plurality of pieces of voice information to be recognized; when the service ends, the recording device is turned off and the acquired voice information is sent to a server for subsequent processing.
It should be noted that, in addition to text converted from the voice information captured by the recording device during the service, the plurality of pieces of text information to be identified may include text communication between passenger and driver, such as messages exchanged by SMS or WeChat.
Step S120: and acquiring probability values of each text message to be identified as the text message of the target type, and acquiring a plurality of probability values altogether.
In this embodiment, text information of the target type refers to information other than the interaction information generated between the service provider and the service requester during service provision. For ease of description, the target type is taken to be navigation text; in other application scenarios it may be other types of text, such as radio broadcasts or song lyrics.
Taking the ride-hailing scenario as an example, the service provider is the driver and the service requester is the passenger. The passenger places an order on the ride-hailing platform via the passenger terminal; when a driver accepts the order, the service process begins, and it ends when the passenger gets off and completes the order. The plurality of pieces of text information to be identified may contain any interaction between driver and passenger during this entire service process; that is, in addition to text converted from the voice information captured by the recording device, it may also include text communication between driver and passenger.
Since text information of the target type is interspersed among the pieces of text information to be identified, all of the acquired text is first segmented into short sentences, each piece of text information to be identified being one short sentence; the probability value that each piece is text information of the target type can then be calculated.
As one embodiment, the similarity between each piece of text information to be identified and text information of the target type can be calculated as its probability value. For example, a number of pieces of target-type text are stored in a database in advance, and each piece to be identified is compared against them: the Hamming distance between two texts can be computed and used as the probability value, a smaller distance indicating greater similarity; or both texts can be converted into vectors and the cosine of the angle between them computed and used as the probability value, with a cosine closer to 1 indicating greater similarity.
Therefore, the probability value that each piece of text information to be identified is text information of the target type can be calculated in the above manner, yielding a plurality of probability values in total.
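A minimal sketch of the cosine-similarity variant described above, assuming bag-of-character count vectors and a toy template database; the sentences, vocabulary, and function names are hypothetical illustrations, not the patent's implementation:

    import numpy as np
    from collections import Counter

    def char_count_vector(text, vocab):
        # Bag-of-characters count vector over a fixed character vocabulary.
        counts = Counter(text)
        return np.array([counts.get(ch, 0) for ch in vocab], dtype=float)

    def cosine_similarity(a, b):
        # Cosine of the angle between two vectors; closer to 1 means more similar.
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        return float(a @ b / denom) if denom else 0.0

    templates = ["前方右转", "前方直行进入主路"]   # stored target-type texts
    candidate = "前方路口右转"                      # text to be identified
    vocab = sorted(set("".join(templates) + candidate))

    scores = [
        cosine_similarity(char_count_vector(candidate, vocab),
                          char_count_vector(t, vocab))
        for t in templates
    ]
    print(max(scores))  # best match taken as the probability value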
Step S130: and comparing each probability value with a preset probability value to obtain a comparison result.
Step S140: and determining target text information to be identified belonging to the target type in the plurality of text information to be identified according to the comparison result.
After the probability value that each piece of text information to be identified is of the target type is obtained, each probability value can be compared with a preset probability value. The preset value can be set flexibly according to actual needs: for example, 3 when the probability value is expressed as a Hamming distance, or 0.8 when it is expressed as a cosine value. Comparing each probability value with the preset value yields a comparison result.
For example, with cosine values: if the cosine between a piece of text information to be identified and target-type text is 0.9 and the preset value is 0.8, that piece can be taken as target text information to be identified, i.e., it belongs to the target type and is navigation text. Text belonging to the navigation type is thereby determined among the pieces to be identified, and the remaining pieces are taken as the interaction information between driver and passenger, which can then be used for subsequent semantic understanding, or input into a language model for training, in order to analyze whether a conflict has arisen between driver and passenger.
Therefore, in this embodiment, the probability value that each piece of text information to be identified is of the target type is obtained, each probability value is compared with the preset probability value, and the target pieces belonging to the target type are determined from the resulting comparison, without manually configuring a large number of templates, thereby improving the accuracy and efficiency of identification.
In one possible implementation, the probability value that each piece of text information to be identified is of the target type may instead be obtained through a preset language model. Because the preset language model is trained on a plurality of pieces of training text of the target type, it can effectively identify the target-type text among the pieces to be identified.
It will be understood that a language model generally refers to a statistical language model: a probability distribution over sequences of words which, for a sequence of given length m, assigns a probability representing how likely the sentence is to occur. A language model can thus determine whether a given text forms a plausible sentence.
In this embodiment, whether each piece of text information to be identified is of the target type is judged by a probability value. Suppose a piece of text to be identified is a sequence of words S = w1, w2, ..., wn. The probability value that S is text information of the target type is P(S) = P(w1, w2, ..., wn), which by the conditional probability formula expands to:
P(S) = P(w1, w2, ..., wn) = P(w1) · P(w2 | w1) · P(w3 | w1, w2) · ... · P(wn | w1, w2, ..., wn-1)
where P(w1) is the probability of the first word occurring, P(w2 | w1) is the probability of w2 occurring given that w1 has already occurred, and so on; the probability corresponding to each word can be calculated in turn.
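A worked instance of the chain-rule formula above; the per-step conditional probabilities are hypothetical numbers chosen only to show the arithmetic:

    import math

    # Hypothetical conditional probabilities P(w_i | w_1..w_{i-1}) for a 4-word sentence.
    cond_probs = [0.20, 0.35, 0.50, 0.45]

    # P(S) = P(w1) * P(w2|w1) * ... * P(wn|w1..wn-1); sum logs for numerical stability.
    log_p = sum(math.log(p) for p in cond_probs)
    print(math.exp(log_p))  # P(S) = 0.01575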
In order to identify text belonging to the target type through a language model, the language model must be trained in advance; that is, the preset language model is obtained by previously inputting a plurality of pieces of target-type training text into the language model for training. During training, a plurality of pieces of training text belonging to the target type are first acquired and input into the language model, yielding the trained preset language model and the preset probability value that it outputs.
To identify the text corresponding to navigation speech among the pieces to be identified, the training text can comprise, in the training stage, a large amount of navigation text recalled through keywords together with a number of standard navigation texts; inputting these into the language model for training yields the preset probability value and the preset language model.
The navigation text recalled by keywords is navigation text containing certain keywords, which mainly fall into three groups: 1. words from standard navigation text, such as "left turn", "right turn", "go straight", and "traffic light"; 2. word combinations common in navigation text, such as "...meters ahead" (as in "after a hundred meters, enter xx road"); 3. keywords summarized manually from mis-recognized text, since speech recognition introduces errors, such as "hello" (as in the mis-recognition "i love hello turn right into xx road"). A large amount of navigation text can be recalled through these keywords; reasonable keywords provide the language model with sufficient training text and ensure the reliability of training, so that an effective language model can accurately extract the regularities in navigation text and thereby judge and remove it stably and reliably.
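A minimal sketch of keyword recall over a sentence corpus; the keyword list and example sentences are assumptions for illustration, not the patent's actual keyword set:

    # Illustrative Chinese navigation keywords ("left turn", "right turn",
    # "go straight", "traffic light", "meters ahead").
    NAV_KEYWORDS = ["左转", "右转", "直行", "红绿灯", "米后"]

    def recall_training_sentences(sentences, keywords=NAV_KEYWORDS):
        # Keep every sentence containing at least one navigation keyword.
        return [s for s in sentences if any(k in s for k in keywords)]

    corpus = ["一百米后右转进入xx路", "师傅你到了吗", "前方红绿灯直行"]
    print(recall_training_sentences(corpus))  # first and third sentences recalled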
In addition, in the training stage, the preset probability value output by the language model can be characterized by a perplexity threshold output by the model. The perplexity threshold can be used to judge the model's prediction accuracy: the smaller the perplexity, the higher the accuracy. The language model outputs a perplexity value for each piece of text information to be identified; when that value is below the perplexity threshold, the probability value for the piece exceeds the preset probability value, and the corresponding piece is accordingly text information of the target type.
In another embodiment, if text of other target types must be identified through the language model, text of those types can be acquired for training. For example, training on a large amount of broadcast text allows the trained model to identify broadcast text among the pieces to be identified. Or, if dialogues indicating conflict between driver and passenger must be identified, a large amount of text recalled by conflict-related keywords, such as "get lost", "hit", or "cancel", can be used for training, so that sentences indicating conflict between driver and passenger can be identified among the pieces to be identified and used to analyze whether a conflict has arisen.
In addition, in this embodiment, as shown in fig. 3, obtaining through the preset language model the probability value that each piece of text information to be identified is of the target type may include the following steps:
step S121: and extracting M words in each text message to be identified through a preset language model.
Wherein M is an integer greater than or equal to 2.
Step S122: predicting the probability value of the (i+1) th word in the M words after the (i) th word in the M words appears, and predicting the probability value of the ending character after the M th word appears, so as to obtain M probability values corresponding to each text information to be identified.
Wherein i is an integer of 1 or more and less than M.
Step S123: and acquiring the probability value of the text information of which each text information to be identified is of the target type based on M probability values corresponding to each text information to be identified.
For example, if the text to be identified is "前方右转" ("turn right ahead"), the extracted M = 4 words are "前", "方", "右", "转". With i = 1, the probability value that the 2nd word "方" appears after the 1st word "前" is predicted; then the probability value that the 3rd word "右" appears after the 2nd word "方"; then the probability value that the 4th word "转" appears after the 3rd word "右"; and finally the probability value that the ending character appears after the 4th word "转".
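A sketch of the prediction loop that produces the M probability values; next_word_prob is an assumed stand-in for the trained model's conditional predictor, and "</s>" is a hypothetical symbol for the ending character:

    def sentence_step_probabilities(words, next_word_prob):
        # next_word_prob(prefix, word) -> P(word | prefix), assumed supplied
        # by the trained language model.
        probs = []
        for i in range(1, len(words)):
            probs.append(next_word_prob(words[:i], words[i]))  # P(w_{i+1} | w_1..w_i)
        probs.append(next_word_prob(words, "</s>"))            # end symbol after word M
        return probs  # M probability values for M words

    # Toy model: a uniform 1% chance for any continuation, just to show the shapes.
    toy_model = lambda prefix, word: 0.01
    print(sentence_step_probabilities(["前", "方", "右", "转"], toy_model))  # 4 values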
In this implementation, the M probability values for each piece of text information to be identified are obtained through the preset language model, and the probability value that the piece is of the target type is then obtained from those M values, so the preset language model can effectively identify the target-type text among the pieces to be identified.
Since the softmax classifier has a good classification-prediction effect, the probability value that the (i+1)-th of the M words appears after the i-th word, and the probability value that the ending character appears after the M-th word, can be predicted by a softmax classifier in the preset language model, yielding the M probability values for each piece of text information to be identified.
Specifically: the i-th word of each piece of text information to be identified is converted into a numerical vector through an embedded representation module in the preset language model, yielding the i-th numerical vector corresponding to the i-th word; the similarity between the i-th word and the other M-1 words is obtained through an attention module in the language model as the weight of the i-th word; the i-th numerical vector is weighted by this weight to give the weighted vector of the i-th word; and the weighted vector is input into the softmax classifier to predict the probability value that the (i+1)-th word appears after the i-th word, and that the ending character appears after the M-th word, yielding the M probability values for each piece.
The implementation process described above is illustrated with a specific embodiment below. As shown in fig. 4, the preset language model comprises an embedded representation module, a multi-layer multi-head attention module, a probability calculation module, and a perplexity calculation module. The embedded representation module converts each word of each piece of text information to be identified into a numerical vector; the multi-layer multi-head attention module adds the corresponding weight to each word's numerical vector; the probability calculation module calculates the probability value for each piece; and the perplexity calculation module calculates the perplexity from the per-word probability values.
First, each piece of information to be identified is acquired. Since the language model computes numerically, each piece is first converted into an ID sequence to be input to the embedded representation module. For example, for the text "前方右转", each word is mapped through a pre-built dictionary to its index, e.g., [10, 20, 33, 44], meaning "前" is at position 10 and "方" at position 20 in the dictionary. To preserve predictive ability for the first word, a special ID marking the beginning of the sentence, e.g., 2, is prepended to the coded sequence, and a special ID marking the end of the sentence, e.g., 3, is appended, so the final ID sequence becomes [2, 10, 20, 33, 44, 3].
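A sketch of the ID-sequence construction; the IDs 2, 3, 10, 20, 33, 44 follow the example in the text, and everything else (the dictionary object, function name) is an assumption of the sketch:

    vocab = {"<s>": 2, "</s>": 3, "前": 10, "方": 20, "右": 33, "转": 44}

    def to_id_sequence(text):
        # Prepend the start-of-sentence ID and append the end-of-sentence ID.
        return [vocab["<s>"]] + [vocab[ch] for ch in text] + [vocab["</s>"]]

    print(to_id_sequence("前方右转"))  # [2, 10, 20, 33, 44, 3]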
The ID sequence of the text to be identified is input into the embedded representation module of the preset language model, which converts the ID sequence into the corresponding numerical vectors, i.e., converts each word into its numerical vector. The embedded representation module holds a matrix X whose number of rows is the number of words in the vocabulary and whose number of columns is the dimension of the numerical vector; after the ID sequence is input, the module fetches the row of X corresponding to each ID, thereby obtaining the numerical vector for each word.
For example, the ID sequence is converted into the numerical vectors [x2, x10, x20, x33, x44, x3], where x2 is row 2 of the matrix X, and the numerical vector corresponding to "前" is x10, e.g., x10 = [0.7, -0.8, 1.2]. The embedded representation module thus converts each word into a corresponding numerical vector, making subsequent numerical computation in the language model convenient.
In addition, to provide a richer recognition basis so that navigation text can be identified effectively later, the i-th word of each piece of text information to be identified can be converted into a first numerical vector through the embedded representation module in the preset language model, the i-th pinyin corresponding to the i-th word converted into a second numerical vector, and the two spliced to obtain the i-th numerical vector corresponding to the i-th word.
It can be understood that the pinyin for each word of the text to be identified is also input into the embedded representation module, which holds a second matrix Y for pinyin. In the same way as above, the pinyin ID sequence for the text is obtained and the corresponding rows extracted from Y, giving the second numerical vectors [y2, y10, y20, y33, y44, y3] for the pinyin of each word. Finally the two vectors for each word are spliced, and the spliced vectors serve as the numerical vectors of the text to be identified: [[x2, y2], [x10, y10], [x20, y20], [x33, y33], [x44, y44], [x3, y3]]. For brevity, this sequence is written below as [z1, z2, z3, z4, z5, z6], where each word has its numerical vector, e.g., the vector for the 2nd entry is z2 = [x10, y10].
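A sketch of the embedding lookup and pinyin splicing, assuming random toy matrices and a pinyin ID sequence invented for the example; matrix sizes and dimensions are arbitrary:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))  # word-embedding matrix: vocab rows x vector dim
    Y = rng.normal(size=(100, 3))  # separate embedding matrix for pinyin IDs

    word_ids = [2, 10, 20, 33, 44, 3]    # ID sequence from the example above
    pinyin_ids = [2, 10, 20, 33, 44, 3]  # assumed pinyin ID sequence for the same text

    # Row lookup gives one vector per ID; concatenating word and pinyin vectors
    # yields the spliced sequence [z1, ..., z6], each 6-dimensional here.
    Z = np.concatenate([X[word_ids], Y[pinyin_ids]], axis=1)
    print(Z.shape)  # (6, 6)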
The attention module establishes the contextual relationships of the text and increases the representational power of the numerical vectors. Specifically, given the numerical vectors z1 and z2 of any two words, a score s12 = f(z1, z2) is calculated; given z1, the scores with all the other words can likewise be calculated, i.e., s13 = f(z1, z3), s14, ..., s16. A softmax function then turns the scores between each word and the others into probabilities, each representing how much of the other word's numerical vector is present as a component of this word's vector:
p1j = exp(s1j) / (exp(s12) + exp(s13) + ... + exp(s16))
Given the probabilities p12, p13, ..., p16 and the numerical vectors z2, z3, ..., z6, the representation after the multi-layer multi-head attention module is obtained by weighting:
z̃1 = p12·z2 + p13·z3 + ... + p16·z6
In this way z̃1 contains the context information; it is the weighted representation obtained by the multi-layer multi-head attention module for the first entry of the sequence (the start character).
To retain both the embedded information and the context information, the weight-derived context vector of the i-th word is combined with the i-th numerical vector itself, i.e., the output is taken as
zi + z̃i
which contains both the original information and the context information. The final output of the multi-layer multi-head attention module, the weighted numerical vector, is then obtained through one layer of neural-network transformation:
hi = W·(zi + z̃i) + b
where W and b are parameters to be learned, and all the words are transformed by the same W and b; this computation is performed for every zi, yielding the numerical vectors after the attention mechanism.
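A compact sketch of the attention step just described, for one head. The patent does not specify the similarity function f, so dot products are assumed here; the dimensions and random values are toy illustrations:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def attention_layer(Z, W, b):
        # Z: (n_words, d). Dot products stand in for the scores s_ij = f(z_i, z_j);
        # softmax turns them into weights p_ij, the weighted sum is the context z~_i.
        P = softmax(Z @ Z.T)
        context = P @ Z
        # Residual combination z_i + z~_i, then one layer of learned transformation.
        return (Z + context) @ W + b

    rng = np.random.default_rng(0)
    Z = rng.normal(size=(6, 4))   # six tokens, 4-dim numerical vectors
    W = rng.normal(size=(4, 4))
    b = np.zeros(4)
    H = attention_layer(Z, W, b)  # weighted numerical vectors h_1..h_6
    print(H.shape)                # (6, 4)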
In this embodiment, the multi-layer multi-head attention module is a variant of the attention module above, where the embedded representation zi was treated as a single numerical vector. With multiple heads, the numerical vector z1 is split front-to-back into several sub-segments z1,1, z1,2, ..., z1,m. For example, [1.1, 2.1, 3.2, 4.5] is a 4-dimensional vector; split into two sub-segments it becomes [1.1, 2.1] and [3.2, 4.5]. Each sub-segment then performs the attention calculation with the corresponding sub-segments of the other words' vectors, e.g., with [z1,1, z2,1, ...], giving z̃1,1, the post-attention embedded representation of the first sub-segment; every sub-segment undergoes this operation. The sub-segments are then spliced back into one vector, giving z̃1, after which the weighted numerical vector for each word can be calculated.
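A sketch of the multi-head splitting just described, reusing dot-product attention as the assumed per-head calculation; the toy vector comes from the example in the text:

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def multi_head_attention(Z, n_heads):
        # Split every vector front-to-back into n_heads sub-segments, run the
        # attention calculation inside each head, then splice the heads together.
        outputs = []
        for Zh in np.split(Z, n_heads, axis=1):
            P = softmax(Zh @ Zh.T)   # attention among corresponding sub-segments
            outputs.append(P @ Zh)
        return np.concatenate(outputs, axis=1)

    Z = np.array([[1.1, 2.1, 3.2, 4.5]] * 6)         # toy 4-dim vectors, six tokens
    print(multi_head_attention(Z, n_heads=2).shape)  # (6, 4)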
The multi-head attention mechanism lets the language model attend to different contexts, improving its expressive power. The multiple layers mean that the post-attention representations of the sub-segments (i.e., the several sub-segments z1,1, z1,2, ..., z1,m corresponding to each word) are fed back in as the input of another multi-layer multi-head attention computation, which models more layers of context information: the first layer can attend more to local information and the second more to global information, so that through the multi-layer multi-head attention module the language model understands the semantics better.
After the sequence of numerical vectors output by the multi-layer multi-head attention module, [h1, h2, ..., h6], is acquired, it is input into the softmax classifier for probability value prediction. For example, taking the vector of the first word "前", the probability of the next word can be calculated: the vector is projected onto the vocabulary and the softmax function normalises the scores into one probability per vocabulary word, from which the probability p("方") that "方" follows is read off:
p = softmax(Ws·h + bs)
where Ws and bs are the classifier's learned projection parameters.
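A sketch of the softmax output layer, assuming the standard projection-plus-normalisation form (the patent's figure for this equation did not survive extraction); vocabulary, dimensions, and parameter names are illustrative:

    import numpy as np

    def next_word_distribution(h, W_out, b_out):
        # Project a context vector onto the vocabulary and normalise with softmax;
        # the result holds one probability per vocabulary word.
        logits = h @ W_out + b_out
        e = np.exp(logits - logits.max())
        return e / e.sum()

    rng = np.random.default_rng(0)
    vocab = ["<s>", "</s>", "前", "方", "右", "转"]  # toy vocabulary
    h = rng.normal(size=4)                           # attention output for one word
    W_out = rng.normal(size=(4, len(vocab)))
    b_out = np.zeros(len(vocab))

    dist = next_word_distribution(h, W_out, b_out)
    print(dist[vocab.index("方")])  # predicted probability of the next word being 方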
after M probability values corresponding to each text information to be recognized can be obtained in the above manner, the confusion degree can be calculated according to the following formula:
Figure BDA0001977893410000186
that is, the probability value of each text information to be recognized as the text information of the target type is obtained based on the M probability values, that is, the probability value of each text information to be recognized as the text information of the target type is characterized with a degree of confusion. The training process of the language model is similar to the above process, except that the input sample data is different, a confusion threshold can be obtained in the training stage, the confusion threshold can be used for representing a preset probability value, namely, in the prediction stage, the confusion of the output of the language model is smaller than the confusion threshold, and the probability value is larger than the preset probability value. When the probability value of the text information to be identified, which is the target type, is larger than the preset probability value, the text information to be identified is determined to be the target text information to be identified, so that the target text information to be identified can be screened out from the plurality of text information to be identified, the rest text information is obtained and used as the interaction information generated between the service provider and the service requester in the service providing process, and further, the interaction information can be analyzed to analyze whether contradictions and the like are generated between a driver and passengers or not.
Referring to fig. 5, fig. 5 is a block diagram of an information identifying apparatus 200 according to an embodiment of the present application, where the apparatus includes:
a text information obtaining module 210, configured to obtain a plurality of text information to be identified;
the probability value obtaining module 220 is configured to obtain a probability value of each text message to be identified being a text message of the target type, obtaining a plurality of probability values in total, where a text message of the target type is information other than the interaction information generated between the service provider and the service requester in the service providing process;
a comparison module 230, configured to compare each probability value with a preset probability value to obtain a comparison result;
and the identifying module 240 is configured to determine target text information to be identified, which belongs to the target type, from the plurality of text information to be identified according to the comparison result.
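For illustration only, the four modules above could be wired together as follows; target_type_probability is a hypothetical interface standing in for the preset language model described below:

```python
class InformationIdentifyingApparatus:
    # Sketch wiring the four modules; target_type_probability is a
    # hypothetical method standing in for the preset language model.
    def __init__(self, language_model, preset_probability):
        self.language_model = language_model
        self.preset_probability = preset_probability

    def obtain_texts(self, source):        # text information obtaining module 210
        return list(source)

    def obtain_probability(self, text):    # probability value obtaining module 220
        return self.language_model.target_type_probability(text)

    def compare(self, probability):        # comparison module 230
        return probability > self.preset_probability

    def identify(self, source):            # identifying module 240
        texts = self.obtain_texts(source)
        return [t for t in texts
                if self.compare(self.obtain_probability(t))]
```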
Optionally, the probability value obtaining module 220 is specifically configured to obtain, through a preset language model, a probability value of each text message to be identified as a text message of a target type, where the preset language model is obtained by inputting, in advance, a plurality of training text messages of the target type into the language model for training.
Optionally, the probability value acquisition module 220 is specifically configured to:
Extracting M words in each text message to be identified through a preset language model, wherein M is an integer greater than or equal to 2;
predicting the probability value of the (i+1) th word in the M words after the (i) th word appears in the M words, and predicting the probability value of the ending character appearing after the M th word, so as to obtain M probability values corresponding to each piece of text information to be identified, wherein i is an integer greater than or equal to 1 and less than M;
and acquiring the probability value of the text information of which each text information to be identified is of the target type based on M probability values corresponding to each text information to be identified.
Optionally, the probability value obtaining module 220 is further configured to predict, by using a softmax classifier in the preset language model, a probability value of an i+1th word of the M words after the i-th word appears in the M words, and predict a probability value of an ending character after the M-th word, so as to obtain M probability values corresponding to each text information to be identified.
Optionally, the probability value acquisition module 220 is further configured to:
converting an ith word in each text message to be identified into a numerical vector through an embedded representation module in the preset language model, and obtaining an ith numerical vector corresponding to the ith word;
obtaining similarity between an ith word and other M-1 words in the M words through an attention module in the language model as the weight of the ith word;
Weighting the weight of the ith word and the ith numerical vector corresponding to the ith word to obtain the numerical vector weighted by the ith word;
inputting the numerical vector weighted by the ith word into the softmax classifier to predict the probability value of the (i+1) th word in the M words appearing after the ith word, and to predict the probability value of the ending character appearing after the Mth word, so as to obtain M probability values corresponding to each piece of text information to be recognized.
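One conventional reading of the weighting steps above is a softmax-normalized dot-product attention; the sketch below follows that reading, and the exact similarity measure and normalization are assumptions:

```python
import numpy as np

def attention_weighted(i, Z):
    # Z: (M, d) numerical vectors for the M words. The similarity of
    # word i to the other M-1 words gives the weights used to weight
    # the numerical vectors.
    M, d = Z.shape
    sims = Z @ Z[i] / np.sqrt(d)      # similarity of word i to every word
    sims[i] = -np.inf                 # keep only the other M-1 words
    w = np.exp(sims - sims[np.arange(M) != i].max())
    w /= w.sum()                      # softmax over the M-1 similarities
    return w @ Z                      # the weighted numerical vector

z1_weighted = attention_weighted(0, np.random.randn(5, 8))
# z1_weighted would then be fed into the softmax classifier.
```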
Optionally, the probability value acquisition module 220 is further configured to:
converting an ith word in each text message to be identified into a first numerical vector through an embedded representation module in the preset language model, and converting an ith pinyin corresponding to the ith word into a second numerical vector;
and splicing the first numerical value vector and the second numerical value vector to obtain an ith numerical value vector corresponding to the ith word.
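A sketch of the splicing step, with tiny hypothetical lookup tables in place of learned embedding matrices:

```python
import numpy as np

# Tiny hypothetical lookup tables; in practice both would be learned
# embedding matrices inside the embedded representation module.
word_embedding = {"广场": np.array([0.2, -0.1, 0.5])}
pinyin_embedding = {"guangchang": np.array([0.7, 0.3])}

def embed(word, pinyin):
    # Splice the word's numerical vector with its pinyin's numerical
    # vector, so that homophones share part of their representation.
    return np.concatenate([word_embedding[word], pinyin_embedding[pinyin]])

z_i = embed("广场", "guangchang")   # the spliced i-th numerical vector
```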
Optionally, the apparatus further comprises:
the model training module is used for acquiring a plurality of training text information belonging to the target type; and inputting the training text information of the target type into a language model for training, and obtaining a trained preset language model and a preset probability value output by the preset language model.
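The patent does not state how the perplexity threshold is derived during training; one plausible scheme, sketched below with a hypothetical word_probabilities interface, is to take a high quantile of the perplexities the trained model assigns to the target-type training texts:

```python
import numpy as np

def fit_threshold(model, training_texts, quantile=0.95):
    # model.word_probabilities is a hypothetical interface returning the
    # M next-word probabilities the trained language model assigns to a
    # target-type training text.
    ppls = []
    for text in training_texts:
        p = np.asarray(model.word_probabilities(text))
        ppls.append(float(np.exp(-np.log(p).mean())))  # sentence perplexity
    return float(np.quantile(ppls, quantile))
```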
Embodiments of the present application provide a readable storage medium storing a computer program which, when executed by a processor, performs the method process performed by the electronic device in the method embodiment shown in fig. 2.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding procedure in the foregoing method for the specific working procedure of the apparatus described above, and this will not be repeated here.
In summary, the embodiments of the present application provide an information identification method, an apparatus, an electronic device, and a storage medium. In the method, a probability value of each piece of text information to be identified being text information of the target type is acquired, each probability value is compared with a preset probability value, and target text information to be identified that belongs to the target type is determined among the plurality of text information to be identified according to the comparison result. Compared with the prior art, which matches against a large number of different templates with low accuracy and long time consumption, the method can effectively improve the accuracy and efficiency of information identification.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other manners as well. The apparatus embodiments described above are merely illustrative, for example, flow diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, the functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the same, but rather, various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Claims (14)

1. An information identification method, characterized in that the method comprises:
acquiring a plurality of text information to be identified;
acquiring probability values of each text message to be identified as text messages of a target type, and acquiring a plurality of probability values altogether, wherein the text messages of the target type are information except for interaction information generated between a service provider and a service requester in the service providing process;
comparing each probability value with a preset probability value to obtain a comparison result;
determining target text information to be identified belonging to the target type in the plurality of text information to be identified according to the comparison result;
the obtaining the probability value of each text message to be identified as the text message of the target type comprises the following steps:
m words corresponding to each text message to be identified are obtained, wherein M is an integer greater than or equal to 2;
predicting the probability value of the (i+1) th word in the M words after the (i) th word appears in the M words, and predicting the probability value of the ending character appearing after the M th word, so as to obtain M probability values corresponding to each piece of text information to be identified, wherein i is an integer greater than or equal to 1 and less than M;
and acquiring the probability value of the text information of which each text information to be identified is of the target type based on M probability values corresponding to each text information to be identified.
2. The method of claim 1, wherein obtaining a probability value for each text message to be identified as a text message of a target type comprises:
the method comprises the steps of obtaining probability values of text information with each text information to be identified as a target type through a preset language model, wherein the preset language model is obtained by inputting a plurality of training text information with the target type into the language model in advance for training.
3. The method of claim 2, wherein predicting the probability value of the i+1th word of the M words occurring after the i-th word of the M words, and predicting the probability value of the ending character occurring after the M-th word, together obtain M probability values corresponding to each text information to be recognized, comprises:
predicting the probability value of the (i+1) th word in the M words after the (i) th word in the M words by a softmax classifier in the preset language model, and predicting the probability value of the ending character after the M th word, so as to obtain M probability values corresponding to each text information to be recognized.
4. A method according to claim 3, wherein predicting, by a softmax classifier in the preset language model, a probability value of an i+1th word of the M words occurring after the i-th word of the M words, and predicting a probability value of an ending character occurring after the M-th word, obtaining M probability values corresponding to each text information to be recognized, includes:
Converting an ith word in each text message to be identified into a numerical vector through an embedded representation module in the preset language model, and obtaining an ith numerical vector corresponding to the ith word;
obtaining similarity between an ith word and other M-1 words in the M words through an attention module in the language model as the weight of the ith word;
weighting the weight of the ith word and the ith numerical vector corresponding to the ith word to obtain the numerical vector weighted by the ith word;
inputting the numerical vector weighted by the ith word into the softmax classifier to predict the probability value of the (i+1) th word in the M words appearing after the ith word, and to predict the probability value of the ending character appearing after the Mth word, so as to obtain M probability values corresponding to each text information to be recognized.
5. The method according to claim 4, wherein converting, by the embedded representation module in the preset language model, the ith word in each text message to be recognized into a numerical vector, to obtain the ith numerical vector corresponding to the ith word, includes:
converting an ith word in each text message to be identified into a first numerical vector through an embedded representation module in the preset language model, and converting an ith pinyin corresponding to the ith word into a second numerical vector;
And splicing the first numerical value vector and the second numerical value vector to obtain an ith numerical value vector corresponding to the ith word.
6. The method according to any one of claims 1-5, further comprising, prior to obtaining the plurality of text messages to be identified:
acquiring a plurality of training text information belonging to a target type;
and inputting the training text information of the target type into a language model for training, and obtaining a trained preset language model and a preset probability value output by the preset language model.
7. An information identifying apparatus, characterized in that the apparatus comprises:
the text information acquisition module is used for acquiring a plurality of text information to be identified;
the probability value acquisition module is used for acquiring probability values of each text message to be identified as a text message of a target type, and acquiring a plurality of probability values altogether, wherein the text message of the target type is information except for interaction information generated between a service provider and a service requester in the service providing process;
the comparison module is used for comparing each probability value with a preset probability value to obtain a comparison result;
the identification module is used for determining target text information to be identified, which belongs to the target type, in the plurality of text information to be identified according to the comparison result;
The probability value acquisition module is specifically configured to:
m words corresponding to each text message to be identified are obtained, wherein M is an integer greater than or equal to 2;
predicting the probability value of the (i+1) th word in the M words after the (i) th word appears in the M words, and predicting the probability value of the ending character appearing after the M th word, so as to obtain M probability values corresponding to each piece of text information to be identified, wherein i is an integer greater than or equal to 1 and less than M;
and acquiring the probability value of the text information of which each text information to be identified is of the target type based on M probability values corresponding to each text information to be identified.
8. The apparatus of claim 7, wherein the probability value obtaining module is specifically configured to obtain, through a preset language model, a probability value of each text message to be identified as a text message of a target type, where the preset language model is obtained by inputting a plurality of training text messages of the target type into the language model in advance for training.
9. The apparatus of claim 8, wherein the probability value obtaining module is further configured to predict, by a softmax classifier in the preset language model, a probability value of an i+1th word of the M words after the i-th word appears, and predict a probability value of an ending character after the M-th word, so as to obtain M probability values corresponding to each text information to be recognized.
10. The apparatus of claim 9, wherein the probability value acquisition module is further configured to:
converting an ith word in each text message to be identified into a numerical vector through an embedded representation module in the preset language model, and obtaining an ith numerical vector corresponding to the ith word;
obtaining similarity between an ith word and other M-1 words in the M words through an attention module in the language model as the weight of the ith word;
weighting the weight of the ith word and the ith numerical vector corresponding to the ith word to obtain the numerical vector weighted by the ith word;
inputting the numerical vector weighted by the ith word into the softmax classifier to predict the probability value of the (i+1) th word in the M words appearing after the ith word, and to predict the probability value of the ending character appearing after the Mth word, so as to obtain M probability values corresponding to each text information to be recognized.
11. The apparatus of claim 10, wherein the probability value acquisition module is further configured to:
converting an ith word in each text message to be identified into a first numerical vector through an embedded representation module in the preset language model, and converting an ith pinyin corresponding to the ith word into a second numerical vector;
And splicing the first numerical value vector and the second numerical value vector to obtain an ith numerical value vector corresponding to the ith word.
12. The apparatus according to any one of claims 7-11, wherein the apparatus further comprises:
the model training module is used for acquiring a plurality of training text information belonging to the target type; and inputting the training text information of the target type into a language model for training, and obtaining a trained preset language model and a preset probability value output by the preset language model.
13. An electronic device comprising a processor and a memory storing computer readable instructions which, when executed by the processor, perform the steps of the method of any of claims 1-6.
14. A readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, performs the steps of the method according to any of claims 1-6.
CN201910141241.0A 2019-02-25 2019-02-25 Information identification method, device, electronic equipment and storage medium Active CN111681670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910141241.0A CN111681670B (en) 2019-02-25 2019-02-25 Information identification method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910141241.0A CN111681670B (en) 2019-02-25 2019-02-25 Information identification method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111681670A CN111681670A (en) 2020-09-18
CN111681670B true CN111681670B (en) 2023-05-12

Family

ID=72451148

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910141241.0A Active CN111681670B (en) 2019-02-25 2019-02-25 Information identification method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111681670B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112507968B (en) * 2020-12-24 2024-03-05 成都网安科技发展有限公司 Document text recognition method and device based on feature association

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105589845A (en) * 2015-12-18 2016-05-18 北京奇虎科技有限公司 Junk text recognizing method, device and system
CN106782560A (en) * 2017-03-06 2017-05-31 海信集团有限公司 Determine the method and device of target identification text
CN108197087A (en) * 2018-01-18 2018-06-22 北京奇安信科技有限公司 Character code recognition methods and device
KR20180071029A (en) * 2016-12-19 2018-06-27 삼성전자주식회사 Method and apparatus for speech recognition
CN108628822A (en) * 2017-03-24 2018-10-09 阿里巴巴集团控股有限公司 Recognition methods without semantic text and device
CN109299458A (en) * 2018-09-12 2019-02-01 广州多益网络股份有限公司 Entity recognition method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111681670A (en) 2020-09-18

Similar Documents

Publication Publication Date Title
CN108346436B (en) Voice emotion detection method and device, computer equipment and storage medium
CN111883115B (en) Voice flow quality inspection method and device
US20210034817A1 (en) Request paraphrasing system, request paraphrasing model and request determining model training method, and dialogue system
CN104903954A (en) Speaker verification and identification using artificial neural network-based sub-phonetic unit discrimination
CN111651996A (en) Abstract generation method and device, electronic equipment and storage medium
CN113035231B (en) Keyword detection method and device
CN111368066B (en) Method, apparatus and computer readable storage medium for obtaining dialogue abstract
CN112699671B (en) Language labeling method, device, computer equipment and storage medium
CN111681670B (en) Information identification method, device, electronic equipment and storage medium
CN116090474A (en) Dialogue emotion analysis method, dialogue emotion analysis device and computer-readable storage medium
WO2021166207A1 (en) Recognition device, learning device, method for same, and program
CN117407507A (en) Event processing method, device, equipment and medium based on large language model
CN112116181B (en) Classroom quality model training method, classroom quality evaluation method and classroom quality evaluation device
CN115689603A (en) User feedback information collection method and device and user feedback system
CN116304014A (en) Method for training entity type recognition model, entity type recognition method and device
CN112328774B (en) Method for realizing task type man-machine conversation task based on multiple documents
CN114121018A (en) Voice document classification method, system, device and storage medium
CN113314108A (en) Voice data processing method, device, equipment, storage medium and program product
CN113012685A (en) Audio recognition method and device, electronic equipment and storage medium
CN118095269B (en) Dialogue information extraction method, device, equipment, medium and program product
CN112530456B (en) Language category identification method and device, electronic equipment and storage medium
CN111540363B (en) Keyword model and decoding network construction method, detection method and related equipment
CN113378543B (en) Data analysis method, method for training data analysis model and electronic equipment
CN118132687A (en) Sentence processing and category model training method, sentence processing and category model training device, sentence processing equipment and category model training medium
CN117854510A (en) Speaker recognition method and device, equipment and storage medium thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant