CN112632222B - Terminal equipment and method for determining data belonging field - Google Patents

Terminal equipment and method for determining data belonging field

Info

Publication number
CN112632222B
Authority
CN
China
Prior art keywords
text data, field, determining, probability, value
Prior art date
Legal status
Active
Application number
CN202011559098.6A
Other languages
Chinese (zh)
Other versions
CN112632222A (en)
Inventor
王聪
沈承恩
Current Assignee
Hisense Visual Technology Co Ltd
Original Assignee
Hisense Visual Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hisense Visual Technology Co Ltd
Priority to CN202011559098.6A
Publication of CN112632222A
Application granted
Publication of CN112632222B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/3332 Query translation
    • G06F 16/3335 Syntactic pre-processing, e.g. stopword elimination, stemming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3343 Query execution using phonetics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a terminal device and a method for determining the field to which data belongs, and relate to the technical field of computers. A first classification result corresponding to the text data to be classified is determined through a deep learning network model; feature extraction is performed on the text data to obtain a feature vector corresponding to the text data, and whether the text data contains set key information is determined according to the feature vector; if the text data contains the set key information, a second classification result corresponding to the text data is determined according to the feature vector through a machine learning network model. Finally, the field to which the text data belongs is determined according to the first classification result and the second classification result corresponding to the text data, so that the text data can be classified into the correct field.

Description

Terminal equipment and method for determining data belonging field
Technical Field
The present application relates to the field of computers, and in particular, to a terminal device and a method for determining a field to which data belongs.
Background
With the rapid development of artificial intelligence, accurately determining the field to which voice data input by a user belongs has become a focus and a difficulty of current research on intelligent living. After the field to which the voice data belongs is determined, the corresponding software application can be opened according to that field, which brings great convenience to the user's daily life.
At present, after a user's voice data is acquired and converted into text data, a deep learning model is usually used to determine the field to which the text data belongs. However, some text data contain non-keywords, and the classification result output by the deep learning model is easily confused with the field associated with those non-keywords, so the voice data is often classified into the wrong field. For example, a user inputs the speech "help me translate apples", which should be classified into the translation field; but because it contains the non-keyword "apple", it is easily confused with the food field and may therefore be classified into the food field.
Disclosure of Invention
In order to solve the existing technical problem, embodiments of the present application provide a terminal device and a method for determining a field to which data belongs, so that text data can be classified into an accurate field.
In order to achieve the above purpose, the technical solution of the embodiment of the present application is implemented as follows:
in a first aspect, an embodiment of the present application provides a terminal device, including:
a memory for storing program code and data information generated by the terminal device during operation;
a processor for executing the program code to implement the following processes: determining a first classification result corresponding to text data to be classified through a deep learning network model; extracting features of the text data to obtain feature vectors corresponding to the text data, and determining whether the text data contain set key information according to the feature vectors; if the text data contains set key information, determining a second classification result corresponding to the text data according to the feature vector through a machine learning network model; and determining the field to which the text data belongs according to the first classification result and the second classification result.
According to the terminal device provided by the embodiments of the present application, the first classification result corresponding to the text data to be classified is determined through the deep learning network model. After feature extraction is performed on the text data, the feature vector corresponding to the text data is obtained, and whether the text data contains the set key information is determined according to that feature vector. When the text data contains the set key information, the second classification result corresponding to the text data is determined according to the feature vector through the machine learning network model. Finally, the field to which the text data belongs is determined according to the first classification result and the second classification result, so that the text data can be classified into the correct field.
In a possible implementation manner, the terminal device further includes a voice acquisition component; the voice acquisition component is used for acquiring voice data;
the processor is further configured to convert the voice data collected by the voice collection component into text data.
The terminal equipment can collect voice data of a user and convert the collected voice data into text data, so that the text data can be classified and classified into corresponding fields.
In one possible implementation, the processor is specifically configured to:
performing word segmentation processing on the text data to be classified to obtain the segmented words corresponding to the text data;
and inputting the segmented words corresponding to the text data into a deep learning network model, and determining a first classification result corresponding to the text data according to the segmented words corresponding to the text data.
After obtaining the text data to be classified, the terminal device can perform word segmentation on the text data to obtain the segmented words corresponding to it, and then input the segmented words into the deep learning network model, which determines the first classification result corresponding to the text data according to those segmented words. Using the deep learning network model to classify the text data improves classification efficiency and yields a more accurate classification result.
In one possible implementation, the processor is further configured to:
comparing the text data with the set key information, and determining, according to the comparison result of the text data and the key information, a feature value of the text data corresponding to each feature containing the key information;
and determining a feature vector corresponding to the text data according to the feature values of the text data corresponding to the features.
The terminal device can compare the text data with the set key information, determine the feature value of the text data corresponding to each feature containing the key information according to the comparison result, and determine the feature vector corresponding to the text data according to those feature values. Because features built on the key information contained in the text data are extracted, the accuracy of classifying the text data can be improved.
In one possible implementation, the processor is further configured to:
and inputting the feature vector into a machine learning network model, and determining a second classification result corresponding to the text data according to the feature vector.
The terminal device inputs the extracted feature vector into the machine learning network model, which determines the second classification result corresponding to the text data according to the feature vector, so that a more accurate classification result can be obtained.
In one possible implementation, the first classification result includes a first probability that the text data corresponds to each set domain; the second classification result comprises a second probability that the text data corresponds to each set domain; the processor is further configured to:
if the highest probability value in the first probability is larger than a first threshold value or the highest probability value in the second probability is smaller than a second threshold value, determining the field to which the text data belongs as the field corresponding to the highest probability value in the first probability;
if the field corresponding to the highest probability value in the second probability belongs to a set confusable field, determining the field to which the text data belongs as the field corresponding to the highest probability value in the second probability;
if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is greater than a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the first probability;
and if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is less than or equal to the third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the second probability.
After the first classification result and the second classification result corresponding to the text data are determined, the terminal device can determine the highest probability value in the first probability of the text data corresponding to each set field according to the first classification result, and determine the highest probability value in the second probability of the text data corresponding to each set field according to the second classification result. When the highest probability value in the first probability is greater than the first threshold value or the highest probability value in the second probability is less than the second threshold value, it can be determined that the field to which the text data belongs is the field corresponding to the highest probability value in the first probability. When the field corresponding to the highest probability value in the second probability belongs to the set confusable field, the field to which the text data belongs can be determined to be the field corresponding to the highest probability value in the second probability. When the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is greater than the third threshold, it may be determined that the field to which the text data belongs is the field corresponding to the highest probability value in the first probability. When the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is less than or equal to the third threshold, it may be determined that the field to which the text data belongs is the field corresponding to the highest probability value in the second probability. Because the confusable field is set before the text data is classified, the text data can be classified into the correct field, and the classification precision is improved.
In one possible implementation, the processor is further configured to:
and starting the application corresponding to the field according to the field to which the text data belongs.
After the field to which the text data belongs is determined, the terminal equipment can start the application corresponding to the field according to the field to which the text data belongs, so that greater convenience can be provided for the life of a user, and the life quality of the user is improved.
In a second aspect, an embodiment of the present application provides a method for determining a domain to which data belongs, including:
determining a first classification result corresponding to text data to be classified through a deep learning network model;
extracting features of the text data to obtain feature vectors corresponding to the text data, and determining whether the text data contains set key information according to the feature vectors;
if the text data contains set key information, determining a second classification result corresponding to the text data according to the feature vector through a machine learning network model;
and determining the field to which the text data belongs according to the first classification result and the second classification result.
In one possible implementation manner, determining a first classification result corresponding to text data to be classified through a deep learning network model includes:
performing word segmentation processing on the text data to be classified to obtain the segmented words corresponding to the text data;
and inputting the segmented words corresponding to the text data into a deep learning network model, and determining a first classification result corresponding to the text data according to the segmented words corresponding to the text data.
In a possible implementation manner, performing feature extraction on the text data to obtain a feature vector corresponding to the text data includes:
comparing the text data with the set key information, and determining, according to the comparison result of the text data and the key information, a feature value of the text data corresponding to each feature containing the key information;
and determining a feature vector corresponding to the text data according to the feature values of the text data corresponding to the features.
In a possible implementation manner, the determining, by the machine learning network model, a second classification result corresponding to the text data according to the feature vector includes:
and inputting the feature vector into a machine learning network model, and determining a second classification result corresponding to the text data according to the feature vector.
In one possible implementation, the first classification result includes a first probability that the text data corresponds to each set domain; the second classification result comprises a second probability that the text data corresponds to each set domain; determining the field to which the text data belongs according to the first classification result and the second classification result, wherein the determining comprises the following steps:
if the highest probability value in the first probability is larger than a first threshold value or the highest probability value in the second probability is smaller than a second threshold value, determining the field to which the text data belongs as the field corresponding to the highest probability value in the first probability;
if the field corresponding to the highest probability value in the second probability belongs to a set confusable field, determining the field to which the text data belongs as the field corresponding to the highest probability value in the second probability;
if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is greater than a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the first probability;
and if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is less than or equal to a third threshold, determining the field to which the text data belongs as the field corresponding to the highest probability value in the second probability.
In one possible implementation, the method further includes:
if the text data does not contain the set key information, determining the field to which the text data belongs according to the first classification result;
wherein the field to which the text data belongs is the field corresponding to the highest probability value in the first probability.
In a possible implementation manner, after determining a domain to which the text data belongs, the method further includes:
and starting the application corresponding to the field according to the field to which the text data belongs.
In a third aspect, the present application further provides a computer-readable storage medium, in which a computer program is stored; when the computer program is executed by a processor, it implements the method for determining the field to which data belongs according to the second aspect.
In a fourth aspect, an embodiment of the present application further provides an apparatus for determining a domain to which data belongs, where the apparatus includes:
the first classification result determining unit is used for determining a first classification result corresponding to the text data to be classified through the deep learning network model;
the feature extraction unit is used for extracting features of the text data to obtain a feature vector corresponding to the text data, and determining whether the text data contains set key information according to the feature vector;
the second classification result determining unit is used for determining a second classification result corresponding to the text data according to the feature vector through a machine learning network model when the text data contains set key information;
and the domain determining unit is used for determining the domain to which the text data belongs according to the first classification result and the second classification result.
In a possible implementation manner, the first classification result determining unit is specifically configured to:
performing word segmentation processing on the text data to be classified to obtain the segmented words corresponding to the text data;
and inputting the segmented words corresponding to the text data into a deep learning network model, and determining a first classification result corresponding to the text data according to the segmented words corresponding to the text data.
In a possible implementation manner, the feature extraction unit is specifically configured to:
comparing the text data with the set key information, and determining, according to the comparison result of the text data and the key information, a feature value of the text data corresponding to each feature containing the key information;
and determining a feature vector corresponding to the text data according to the feature values of the text data corresponding to the features.
In a possible implementation manner, the second classification result determining unit is specifically configured to:
and inputting the feature vector into a machine learning network model, and determining a second classification result corresponding to the text data according to the feature vector.
In one possible implementation, the first classification result includes a first probability that the text data corresponds to each set domain; the second classification result comprises a second probability that the text data corresponds to each set field; the domain determining unit is specifically configured to:
if the highest probability value in the first probability is larger than a first threshold value or the highest probability value in the second probability is smaller than a second threshold value, determining the field to which the text data belongs as the field corresponding to the highest probability value in the first probability;
if the field corresponding to the highest probability value in the second probability belongs to a set confusable field, determining the field to which the text data belongs as the field corresponding to the highest probability value in the second probability;
if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is greater than a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the first probability;
and if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is less than or equal to a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the second probability.
In a possible implementation manner, the domain determining unit is further configured to:
if the text data does not contain the set key information, determining the field to which the text data belongs according to the first classification result;
wherein the field to which the text data belongs is the field corresponding to the highest probability value in the first probability.
In a possible implementation manner, the apparatus further includes an application starting unit, configured to:
and starting the application corresponding to the field according to the field to which the text data belongs.
For technical effects brought by any one implementation manner of the second aspect, the third aspect, or the fourth aspect, reference may be made to technical effects brought by the implementation manner of the first aspect, and details are not described here.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic structural diagram of a terminal device according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of another terminal device provided in an embodiment of the present application;
fig. 3 is a schematic flowchart of a method for determining a domain to which data belongs according to an embodiment of the present application;
fig. 4 is a schematic internal structural diagram of a deep learning network model according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an apparatus for determining a domain to which data belongs according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of another apparatus for determining a domain to which data belongs according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application clearer, the present application will be described in further detail with reference to the accompanying drawings, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Fig. 1 schematically illustrates a structural diagram of a terminal device provided in an embodiment of the present application. As shown in fig. 1, the terminal device provided in the embodiment of the present application includes a processor 103 and a memory 102.
The memory 102 is used for storing data information generated by the terminal device during operation and program code used by the processor 103 during operation, such as the program code of the method for determining the field to which data belongs provided by the embodiments of the present application; the program code can be executed by the processor 103.
The processor 103 may include one or more Central Processing Units (CPUs), or digital processing units, etc. A processor 103 for calling the program code stored in the memory 102 to implement the following processes: determining a first classification result corresponding to text data to be classified through a deep learning network model; performing feature extraction on the text data to obtain a feature vector corresponding to the text data, and determining whether the text data contains set key information according to the feature vector; if the text data contains the set key information, determining a second classification result corresponding to the text data according to the feature vector through a machine learning network model; and determining the field to which the text data belongs according to the first classification result and the second classification result.
In an optional embodiment, the terminal device further includes a voice collecting component 101, configured to collect voice data input by a user. The processor 103 is also configured to convert voice data collected by the voice collection component 101 into text data.
The specific connection medium among the voice capturing component 101, the memory 102 and the processor 103 is not limited in the embodiments of the present application. In fig. 1, the voice collecting component 101, the memory 102 and the processor 103 are connected by a bus 104, and the connection manner among other components is only for illustrative purpose and is not limited to this. The bus 104 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 1, but it is not intended that there be only one bus or one type of bus.
In some embodiments, the functionality of the voice capture component 101 may be implemented by an audio device, that is, voice data of a user may be captured by the audio device.
In one embodiment, the terminal device may be a smart device, such as a mobile phone, a tablet computer, a notebook computer, and the like. As shown in fig. 2, the terminal device includes: radio Frequency (RF) circuit 310, memory 320, input unit 330, display unit 340, sensor 350, audio circuit 360, wireless fidelity (WiFi) module 370, processor 380, and the like. Those skilled in the art will appreciate that the terminal device configuration shown in fig. 2 is not limiting of terminal devices and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components.
The following specifically describes each constituent component of the terminal device with reference to fig. 2:
The RF circuit 310 may be used for receiving and transmitting signals during a message transmission or a call; in particular, it receives downlink information from a base station and forwards it to the processor 380 for processing, and transmits uplink data to the base station.
The memory 320 may be used to store software programs and modules, such as the program instructions corresponding to the method for determining the field to which data belongs in the embodiments of the present application, and the processor 380 executes various functional applications of the terminal device and performs data processing, such as the method for determining the field to which data belongs provided by the embodiments of the present application, by running the software programs stored in the memory 320. The memory 320 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program of at least one application, and the like, and the data storage area may store data created according to the use of the terminal device, and the like. Further, the memory 320 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The input unit 330 may be used to receive numeric or character information input by a user and generate key signal inputs related to user settings and function control of the terminal.
Optionally, the input unit 330 may include a touch panel 331 and other input devices 332.
The touch panel 331, also referred to as a touch screen, can collect touch operations of a user on or near the touch panel 331 (for example, operations of the user on the touch panel 331 or near the touch panel 331 using any suitable object or accessory such as a finger, a stylus, etc.), and implement corresponding operations according to a preset program, for example, operations of the user clicking a shortcut identifier of a function module, etc. Alternatively, the touch panel 331 may include two parts, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 380, and can receive and execute commands sent by the processor 380. In addition, the touch panel 331 may be implemented in various types, such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave.
Optionally, other input devices 332 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 340 may be used to display information input by a user or interface information presented to the user, and various menus of the terminal device. The display unit 340 is a display system of the terminal device, and is configured to present an interface, such as a display desktop, an operation interface of an application, or an operation interface of a live application.
The display unit 340 may include a display panel 341. Alternatively, the Display panel 341 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
Further, the touch panel 331 can cover the display panel 341, and when the touch panel 331 detects a touch operation on or near the touch panel 331, the touch panel is transmitted to the processor 380 to determine the type of the touch event, and then the processor 380 provides a corresponding interface output on the display panel 341 according to the type of the touch event.
Although in fig. 2, the touch panel 331 and the display panel 341 are two separate components to implement the input and output functions of the terminal device, in some embodiments, the touch panel 331 and the display panel 341 may be integrated to implement the input and output functions of the terminal.
The terminal device may also include at least one sensor 350, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor, which adjusts the brightness of the display panel 341 according to the brightness of ambient light, and a proximity sensor, which turns off the backlight of the display panel 341 when the terminal device is moved to the ear. As one type of motion sensor, an accelerometer can detect the magnitude of acceleration in each direction (generally along three axes) and, when stationary, the magnitude and direction of gravity; it can be used for applications that recognize the attitude of the terminal device (such as switching between landscape and portrait modes, related games, and magnetometer attitude calibration), for vibration-recognition functions (such as a pedometer or tap detection), and the like. Other sensors that can be configured in the terminal device, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, are not described in detail here.
The audio circuit 360, the speaker 361 and the microphone 362 may provide an audio interface between the user and the terminal device, that is, they implement the function of the voice collection component. The audio circuit 360 may transmit the electrical signal converted from the received audio data to the speaker 361, which converts it into a sound signal for output; on the other hand, the microphone 362 converts a collected sound signal into an electrical signal, which is received by the audio circuit 360 and converted into audio data; the audio data is then processed by the processor 380 and either transmitted via the RF circuit 310 to, for example, another terminal device, or output to the memory 320 for further processing.
WiFi belongs to short-distance wireless transmission technology, and the terminal equipment can help users to send and receive e-mails, browse webpages, access streaming media and the like through the WiFi module 370, and provides wireless broadband internet access for the users. Although fig. 2 shows the WiFi module 370, it is understood that it does not belong to the essential constitution of the terminal device, and may be omitted entirely as needed within the scope not changing the essence of the invention.
The processor 380 is a control center of the terminal device, connects various parts of the whole terminal device by using various interfaces and lines, and performs various functions of the terminal device and processes data by running or executing software programs and/or modules stored in the memory 320 and calling data stored in the memory 320, thereby performing overall monitoring of the terminal device. Optionally, processor 380 may include one or more processing units; optionally, the processor 380 may integrate an application processor and a modem processor, wherein the application processor mainly processes software programs such as an operating system, applications, and functional modules inside the applications, such as a method for determining a domain to which data belongs provided in an embodiment of the present application. The modem processor handles primarily wireless communications. It will be appreciated that the modem processor described above may not be integrated into processor 380.
It will be appreciated that the configuration shown in fig. 2 is merely illustrative and that the terminal device may include more or fewer components than shown in fig. 2 or may have a different configuration than shown in fig. 2. The components shown in fig. 2 may be implemented in hardware, software, or a combination thereof.
In some embodiments, a flowchart of a method performed by the terminal device for determining a domain to which data belongs may be shown in fig. 3, and includes the following steps:
step S301, determining a first classification result corresponding to the text data to be classified through a deep learning network model.
After the user's voice data is collected, it can be converted into text data. Word segmentation can then be performed on the text data to obtain the segmented words corresponding to it, and the segmented words are input into the deep learning network model, which determines the first classification result corresponding to the text data according to them.
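As a minimal illustration of this step, the Python sketch below segments an utterance and hands the segmented words to the classifier; jieba is only one possible segmenter, and neither the segmenter nor the function names are specified by the patent, so they should be read as assumptions.

    import jieba  # a common Chinese word-segmentation library (an assumption; the patent does not name one)

    def segment(text):
        # word segmentation of the text data, e.g. "帮我翻译苹果" ("help me translate apples")
        return jieba.lcut(text)

    tokens = segment("帮我翻译苹果")
    # the exact segmentation depends on jieba's dictionary, e.g. something like ["帮", "我", "翻译", "苹果"]
    # first_result = deep_learning_model(tokens)   # hypothetical call producing the first classification result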
In one embodiment, the deep learning network model may include one or more Transformer model structures. The Transformer model structure may be as shown in fig. 4 and includes a masked multi-head self-attention layer (Masked Multi-Head Self-Attention) and a feed-forward network layer (Feed Forward). The masked multi-head self-attention layer helps the current node focus on the relevant parts of the current text data so as to obtain the semantic information of the context. A data normalization layer is connected behind both the masked multi-head self-attention layer and the feed-forward network layer; it normalizes the data and passes the normalized data to the next network. Normalizing the data output by each network layer speeds up the model's processing of the data.
Specifically, the basic attention output can be obtained by the following formula:
attention_output = Attention(Q, K, V)
where Q, K and V are the Query, Key and Value vectors of the attention mechanism, respectively.
Multi-head attention projects Q, K and V through h different linear transformations and finally splices the results of the different attention heads together, which can be expressed by the following formulas:
MultiHead(Q, K, V) = Concat(head_1, …, head_h) W_O
head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V), i = 1, …, h
where W_O is the output weight matrix and W_i^Q, W_i^K and W_i^V are the projection matrices of the i-th head.
In multi-head self-attention, Q, K and V are all the same input.
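The NumPy sketch below illustrates the masked multi-head self-attention computation described above. The dimensions, the causal-mask convention and the use of a single shared input X are illustrative assumptions and are not taken from the patent.

    import numpy as np

    def softmax(x, axis=-1):
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def attention(Q, K, V, mask=None):
        # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
        d_k = Q.shape[-1]
        scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)
        if mask is not None:
            scores = np.where(mask, scores, -1e9)   # masked positions get a very negative score
        return softmax(scores) @ V

    def masked_multi_head_self_attention(X, W_q, W_k, W_v, W_o, h):
        # X: (seq_len, d_model); W_q, W_k, W_v, W_o: (d_model, d_model); h: number of heads
        seq_len, d_model = X.shape
        d_head = d_model // h
        Q, K, V = X @ W_q, X @ W_k, X @ W_v          # self-attention: Q, K and V come from the same input
        split = lambda M: M.reshape(seq_len, h, d_head).transpose(1, 0, 2)   # -> (h, seq_len, d_head)
        causal = np.tril(np.ones((seq_len, seq_len), dtype=bool))            # mask out future positions
        heads = attention(split(Q), split(K), split(V), mask=causal)
        concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)          # Concat(head_1, ..., head_h)
        return concat @ W_o                                                  # multiply by W_O

    # toy usage: 5 tokens, model width 8, 2 heads
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 8))
    W_q, W_k, W_v, W_o = (rng.normal(size=(8, 8)) for _ in range(4))
    out = masked_multi_head_self_attention(X, W_q, W_k, W_v, W_o, h=2)   # out.shape == (5, 8)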
Step S302, feature extraction is carried out on the text data to obtain a feature vector corresponding to the text data.
According to the pre-designed features, the text data can be compared with the key information contained in those features; the feature value of the text data corresponding to each feature containing key information is determined according to the comparison result of the text data and the key information, and the feature vector corresponding to the text data is determined according to the feature values of the text data corresponding to the features.
The features mainly fall into two types: regular-expression + label combinations and combinations of multiple labels. The pre-designed features can be shown in the following table:
(The table of pre-designed features is provided as an image in the original publication and is not reproduced here.)
specifically, there may be a plurality of pre-designed features, each feature may include a configuration table, and each configuration table includes a plurality of words and configuration requirements. The word segmentation processing may be performed on the text data to obtain each word segmentation corresponding to the text data, and then each word segmentation corresponding to the text data is compared with a word included in the configuration table of each feature, when there is a word that meets the configuration requirement of a certain configuration table, the feature value of the text data corresponding to the feature including the configuration table is 1, and when each word does not meet the configuration requirement of the certain configuration table, the feature value of the text data corresponding to the feature including the configuration table is 0.
In one embodiment, the configuration requirement of a configuration table is that a segmented word contains a word in the configuration table. For example, the configuration table of feature a contains the 2 words "meal" and "things", and the text data is "I want to eat". After segmenting the text data, the 3 segmented words "I", "want" and "eat" are obtained and compared with the 2 words contained in the configuration table. Because the segmented word "eat" contains the word "meal" and thus meets the configuration requirement, the feature value of the text data "I want to eat" corresponding to feature a is 1. For another example, the text data is "I want to watch a movie"; segmentation yields the 3 segmented words "I", "want" and "watch a movie", which are compared with the words contained in the configuration table. Since none of the segmented words contains the 2 words "meal" and "things", the feature value of the text data "I want to watch a movie" corresponding to feature a is 0.
In another embodiment, the configuration requirement of the configuration table is that a word in the configuration table contains the segmented word. For example, the configuration table of feature b contains the word "listen to music", and the text data is "open music". After segmenting the text data, the 2 segmented words "open" and "music" are obtained and compared with the words in the configuration table. Because the word "listen to music" contains the segmented word "music", the configuration requirement is met and the feature value of the text data corresponding to feature b is 1. For another example, the text data is "eat"; it is compared with the words in the configuration table, and since the word "listen to music" does not contain "eat", the configuration requirement is not met and the feature value of the text data "eat" corresponding to feature b is 0.
According to these rules, feature extraction can be performed on each segmented word of the text data to obtain the feature value of the text data corresponding to each feature, and the feature vector corresponding to the text data is then determined according to those feature values. For example, 5 features are designed in advance: feature a, feature b, feature c, feature d and feature e. The configuration table of feature a contains the word "movie", that of feature b the word "meal", that of feature c the word "clothes", that of feature d the word "map", and that of feature e the word "translation"; the configuration requirement of all 5 features is that a segmented word contains a word in the configuration table.
When the text data determined from the voice data input by the user is "I want to eat", word segmentation yields the 3 segmented words "I", "want" and "eat", and each segmented word is compared one by one with the words contained in the configuration table of each feature. Because the segmented word "eat" contains the word "meal", the feature value of "I want to eat" corresponding to feature b is 1; since none of the segmented words contains the words in the configuration tables of feature a, feature c, feature d or feature e, the feature values corresponding to those features are all 0, and the feature vector of the text data "I want to eat" is [0, 1, 0, 0, 0]. When the text data determined from the voice data input by the user is "eat and watch a movie", word segmentation yields the 3 segmented words "eat", "and" and "watch a movie". The segmented word "eat" contains the word "meal" and the segmented word "watch a movie" contains the word "movie", so the feature values corresponding to feature a and feature b are both 1; none of the segmented words contains the words in the configuration tables of feature c, feature d or feature e, so the feature vector of the text data "eat and watch a movie" is [1, 1, 0, 0, 0]. When the text data determined from the voice data input by the user is "I want to read", word segmentation yields 3 segmented words, and since none of them contains the words in the configuration tables of feature a, feature b, feature c, feature d or feature e, all feature values are 0 and the feature vector of the text data "I want to read" is [0, 0, 0, 0, 0].
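A minimal Python sketch of this rule-based feature extraction is given below; the feature names, the words and the two matching rules mirror the examples above, but the data structures and function names are assumptions.

    # Each feature has a configuration table of words and a configuration requirement (matching rule);
    # the feature value is 1 if and only if some segmented word satisfies the rule.
    FEATURES = [
        {"name": "a", "words": ["movie"],       "rule": "token_contains_word"},
        {"name": "b", "words": ["meal"],        "rule": "token_contains_word"},
        {"name": "c", "words": ["clothes"],     "rule": "token_contains_word"},
        {"name": "d", "words": ["map"],         "rule": "token_contains_word"},
        {"name": "e", "words": ["translation"], "rule": "token_contains_word"},
    ]

    def matches(token, word, rule):
        if rule == "token_contains_word":   # e.g. the segmented word "eat a meal" contains the word "meal"
            return word in token
        if rule == "word_contains_token":   # e.g. the word "listen to music" contains the segmented word "music"
            return token in word
        return False

    def feature_vector(tokens, features=FEATURES):
        return [
            1 if any(matches(t, w, f["rule"]) for t in tokens for w in f["words"]) else 0
            for f in features
        ]

    # feature_vector(["I", "want", "eat a meal"])               -> [0, 1, 0, 0, 0]
    # feature_vector(["eat a meal", "and", "watch a movie"])    -> [1, 1, 0, 0, 0]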
Step S303, determining whether the text data contains set key information according to the feature vector; if not, executing step S304; if so, step S305 is performed.
And step S304, determining the field of the text data according to the first classification result.
Step S305, determining a second classification result corresponding to the text data according to the feature vector through the machine learning network model.
Deep learning does not classify well proper nouns that do not appear in the vocabulary library; such words are mainly music titles, movie titles, game and cast names, and the like. New nouns of this kind keep being added as content becomes popular. Because the word vectors of these words are not contained in the vocabulary library, while the deep learning network model relies mainly on the word vectors of the vocabulary library for training, deep learning falls back to a default word vector and obtains a wrong classification result.
On this basis, whether the text data contains the set key information can be determined according to the feature vector, which in turn determines whether the machine learning network model is introduced. For example, the feature vector obtained from the text data "I want to eat" is [0, 1, 0, 0, 0], so it can be determined that this text data contains the set key information "eat", and the machine learning network model can be introduced. For another example, the feature vector obtained from the text data "I want to read" is [0, 0, 0, 0, 0], so it can be determined that the text data "I want to read" does not contain set key information; in this case the machine learning network model does not need to be introduced, and the deep learning network model alone can be used.
When the text data does not contain the set key information, the field to which the text data belongs is determined according to the first classification result obtained by the deep learning network model. The first classification result includes the first probability of the text data corresponding to each set field, and the field corresponding to the highest probability value in the first probability is taken as the field to which the text data belongs. For example, the text data is "I want to eat"; according to the first classification result obtained by the deep learning network model, the probability value of the food field is 0.8, the probability value of the movie field is 0.1, and the probability value of the sports field is 0.05, so the field to which the text data "I want to eat" belongs can be determined to be the food field.
When the text data contains the set key information, the feature vector can be input into the machine learning network model, and a second classification result corresponding to the text data is determined according to the feature vector.
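The branch in steps S303 to S305 can be sketched as follows (the function names and the dictionary form of the classification results are assumptions; the deep learning and machine learning models are passed in as callables):

    def contains_key_information(feature_vec):
        # the set key information is present if and only if at least one feature value is 1
        return any(v == 1 for v in feature_vec)

    def classify(tokens, feature_vec, deep_model, ml_model):
        first_probs = deep_model(tokens)                  # step S301: first classification result
        if not contains_key_information(feature_vec):     # step S303
            return max(first_probs, key=first_probs.get)  # step S304: field with the highest first probability
        second_probs = ml_model(feature_vec)              # step S305: second classification result
        # step S306 combines first_probs and second_probs (see the sketch after that step)
        return first_probs, second_probs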
In some embodiments, before the machine learning network model is trained, feature extraction needs to be performed on the text data in the training sample data set. Since the target can support feature extraction in multiple dimensions, features of 2 or more dimensions can be designed according to the intent of the text data when designing the features. Feature extraction is performed on the text data in the training sample data set according to the pre-designed features, and about 300 to 400 training samples are determined. The training samples do not need to be numerous and a single training sample does not need to be long, but each training sample must have a feature value of 1 in at least one dimension, and the distribution of the training samples must be balanced, avoiding the situation where one dimension has a large amount of training data while another has only a few samples.
In one embodiment, the machine learning network model may be an xgboost model, which has the following advantages. First, in terms of the objective function, the xgboost model performs a second-order Taylor expansion of the loss function. The second-order derivative it introduces increases precision on the one hand and allows a custom loss function on the other, since a second-order Taylor expansion can approximate a large class of loss functions. Second, in terms of regularization, the xgboost model adds a regularization term to the objective function to control the complexity of the model. The regularization term includes the number of leaf nodes of the tree and the L2 norm of the leaf node weights; it reduces the variance of the model, making the learned model simpler and helping to prevent overfitting.
The xgboost model is a massively parallel boosted-tree tool; it is currently the fastest and best open-source boosted-tree package, more than 10 times faster than common packages. The xgboost model is an additive combination of K base models and can be expressed by the following formula:
ŷ_i = Σ_{k=1}^{K} f_k(x_i)
where f_k is the k-th base model and ŷ_i is the predicted value of the i-th sample.
The loss function can be determined from the predicted value ŷ_i and the true value y_i:
L = Σ_{i=1}^{n} l(y_i, ŷ_i)
where n is the number of samples.
The prediction accuracy of the model is jointly determined by its bias and variance. The objective function of the xgboost model consists of the loss function L of the model and a regularization term Ω that suppresses the complexity of the model, so the overall objective function is:
Obj = Σ_{i=1}^{n} l(y_i, ŷ_i) + Σ_{k=1}^{K} Ω(f_k)
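The sketch below trains such a model with the xgboost Python package on the rule-based feature vectors from step S302; the toy data, labels and hyperparameters are assumptions and are not taken from the patent.

    import numpy as np
    import xgboost as xgb

    # feature vectors extracted as in step S302, with their field labels
    X = np.array([[0, 1, 0, 0, 0],     # "I want to eat"            -> food
                  [1, 0, 0, 0, 0],     # "I want to watch a movie"  -> film
                  [0, 0, 0, 0, 1]])    # "help me translate apple"  -> translation
    y = np.array([0, 1, 2])

    model = xgb.XGBClassifier(
        n_estimators=50,              # number of base tree models (the K additive terms above)
        max_depth=3,
        learning_rate=0.1,
        objective="multi:softprob",   # per-field probabilities, i.e. the second classification result
        reg_lambda=1.0,               # L2 regularization on leaf weights, as described above
    )
    model.fit(X, y)
    second_probabilities = model.predict_proba(X[:1])[0]   # second probability for each set field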
and S306, determining the field to which the text data belongs according to the first classification result and the second classification result.
After the first classification result corresponding to the text data is obtained from the deep learning network model and the second classification result corresponding to the text data is obtained from the machine learning network model, the field to which the text data belongs can be determined according to the first probability, included in the first classification result, of the text data corresponding to each set field and the second probability, included in the second classification result, of the text data corresponding to each set field.
Specifically, the highest probability value among the first probabilities of the text data corresponding to each set field and the highest probability value among the second probabilities are determined. When the highest probability value in the first probability is greater than the first threshold of 0.98, or the highest probability value in the second probability is less than the second threshold of 0.5, the field to which the text data belongs can be determined to be the field corresponding to the highest probability value in the first probability. For example, the text data is "I want to eat and watch a movie"; according to the deep learning network model, the highest probability value, 0.99, corresponds to the leisure field, and according to the machine learning network model, the highest probability value, 0.4, corresponds to the food field, so the field to which the text data "I want to eat and watch a movie" belongs can be determined to be the leisure field.
The confusable fields may be preset. When the field corresponding to the highest probability value in the second probability belongs to a set confusable field, the field to which the text data belongs is determined to be the field corresponding to the highest probability value in the second probability. For example, the food field and the translation field can be preset as confusable fields, and the text data is "help me translate apple"; according to the deep learning network model, the highest probability value of the text data "help me translate apple" corresponds to the food field, and according to the machine learning network model, the highest probability value corresponds to the translation field. Because the translation field determined according to the machine learning network model belongs to the set confusable fields, the field to which the text data "help me translate apple" belongs can be determined to be the translation field.
When the field corresponding to the highest probability value in the second probability does not belong to the confusable fields and the highest probability value in the first probability is greater than the third threshold of 0.94, the field to which the text data belongs can be determined to be the field corresponding to the highest probability value in the first probability. For example, the preset confusable fields are the food field and the translation field, and the text data is "open the map"; according to the deep learning network model, the highest probability value of the text data, 0.96, corresponds to the navigation field, and according to the machine learning network model, the highest probability value, 0.96, corresponds to the travel field. The travel field does not belong to the confusable fields and the probability value of the navigation field determined according to the deep learning network model is greater than 0.94, so the field to which the text data "open the map" belongs can be determined to be the navigation field.
When the field corresponding to the highest value among the second probabilities does not belong to the confusable fields and the highest value among the first probabilities is less than or equal to the third threshold of 0.94, the field to which the text data belongs may be determined to be the field corresponding to the highest value among the second probabilities. For example, with the automobile field and the translation field preset as confusable fields and the text data being "I want to eat apples", the deep learning network model gives its highest probability, 0.92, to the mobile phone field, and the machine learning network model gives its highest probability to the food field; the food field determined by the machine learning network model does not belong to the set confusable fields, and the probability of the mobile phone field determined by the deep learning network model is less than 0.94, so the text data "I want to eat apples" can be determined to belong to the food field.
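Taken together, the four rules above form a simple fusion step. The following is a minimal sketch of that step, assuming the two models already return per-field probability dictionaries; the threshold values and the confusable-field set are taken from the examples above, and all function and variable names are illustrative rather than part of the patent.

```python
# Illustrative sketch of the result-fusion rules described above.
# first_probs / second_probs map each set field to a probability;
# the thresholds and the confusable-field set follow the examples in the text.

FIRST_THRESHOLD = 0.98   # first threshold, applied to the deep learning result
SECOND_THRESHOLD = 0.5   # second threshold, applied to the machine learning result
THIRD_THRESHOLD = 0.94   # third threshold, applied to the deep learning result
CONFUSABLE_FIELDS = {"food", "translation"}   # example confusable fields

def determine_field(first_probs: dict, second_probs: dict) -> str:
    """Return the field the text data belongs to, given both classification results."""
    first_field, first_max = max(first_probs.items(), key=lambda kv: kv[1])
    second_field, second_max = max(second_probs.items(), key=lambda kv: kv[1])

    # Rule 1: trust the deep learning model when it is very confident,
    # or when the machine learning model is clearly unconfident.
    if first_max > FIRST_THRESHOLD or second_max < SECOND_THRESHOLD:
        return first_field

    # Rule 2: if the machine learning result falls in a confusable field,
    # prefer the machine learning model.
    if second_field in CONFUSABLE_FIELDS:
        return second_field

    # Rules 3 and 4: otherwise decide with the third threshold.
    if first_max > THIRD_THRESHOLD:
        return first_field
    return second_field

# Example corresponding to "help me translate apple": the deep model favours the food
# field but not above 0.98, the machine learning model favours translation, and
# translation is a confusable field, so rule 2 selects the translation field.
print(determine_field({"food": 0.7, "translation": 0.2},
                      {"translation": 0.85, "food": 0.1}))   # -> translation
```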
In an embodiment, after determining the field to which the text data belongs, the terminal device may start the application corresponding to that field. For example, if the text data is "I want to order takeout", the terminal device may start the application corresponding to the food field after determining that the text data belongs to the food field. As another example, if the text data is "I want to open a map", the terminal device may start the application corresponding to the navigation field after determining that the text data belongs to the navigation field.
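As a rough illustration of this step, the terminal device could keep a table mapping each set field to an application; the field names, application identifiers, and the launch call in the sketch below are hypothetical and not taken from the patent.

```python
# Hypothetical mapping from a determined field to an application to start.
FIELD_TO_APP = {
    "food": "takeout_app",
    "navigation": "map_app",
    "translation": "translator_app",
}

def start_app_for_field(field: str) -> None:
    app = FIELD_TO_APP.get(field)
    if app is not None:
        print(f"launching {app}")   # stand-in for the platform-specific launch call
    else:
        print(f"no application registered for field '{field}'")

start_app_for_field("navigation")   # -> launching map_app
```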
Based on the same inventive concept as the method for determining the field to which data belongs shown in fig. 3, an embodiment of the present application further provides an apparatus for determining the field to which data belongs, and the apparatus may be disposed in a terminal device. Because the apparatus corresponds to the method for determining the field to which data belongs and solves the problem on a similar principle, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.
Fig. 5 is a schematic structural diagram of an apparatus for determining the field to which data belongs according to an embodiment of the present application. As shown in fig. 5, the apparatus includes a first classification result determining unit 501, a feature extraction unit 502, a second classification result determining unit 503, and a field determining unit 504.
The first classification result determining unit 501 is configured to determine, through a deep learning network model, a first classification result corresponding to text data to be classified;
a feature extraction unit 502, configured to perform feature extraction on the text data to obtain a feature vector corresponding to the text data, and determine whether the text data includes set key information according to the feature vector;
a second classification result determining unit 503, configured to determine, when the text data includes the set key information, a second classification result corresponding to the text data according to the feature vector through a machine learning network model;
a field determining unit 504, configured to determine the field to which the text data belongs according to the first classification result and the second classification result.
In a possible implementation manner, the first classification result determining unit 501 is specifically configured to:
performing word segmentation processing on the text data to be classified to obtain the word segments corresponding to the text data;
and inputting the word segments corresponding to the text data into the deep learning network model, and determining the first classification result corresponding to the text data according to those word segments (one possible realisation is sketched below).
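As one possible realisation of the two steps just listed (not prescribed by the patent), the Chinese text could be segmented with an off-the-shelf tokenizer such as jieba and the resulting word segments passed to a trained deep learning text classifier; the predict_proba interface of deep_model below is an assumption made for illustration.

```python
import jieba  # a widely used Chinese word-segmentation library; any segmenter would do

def first_classification(text: str, deep_model) -> dict:
    """Segment the text and ask the deep learning model for per-field probabilities.

    deep_model is assumed to expose predict_proba(tokens) -> {field: probability};
    this interface is an illustrative assumption, not taken from the patent.
    """
    word_segments = jieba.lcut(text)   # e.g. "我想吃苹果" -> ["我", "想", "吃", "苹果"]
    return deep_model.predict_proba(word_segments)
```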
In a possible implementation manner, the feature extraction unit 502 is specifically configured to:
comparing the text data with the set key information, and determining, according to the comparison result of the text data and the key information, the feature value of the feature indicating that the text data contains the key information;
and determining the feature vector corresponding to the text data according to the feature value of each such feature (a minimal sketch follows below).
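A minimal sketch of this comparison, assuming that the set key information is simply a list of keywords and that each keyword contributes one binary feature, could look as follows; the keyword list is invented for illustration.

```python
# Illustrative feature extraction: one feature per set keyword,
# with value 1 if the text contains that keyword and 0 otherwise.
SET_KEYWORDS = ["translate", "map", "takeout", "movie"]   # example key information

def extract_features(text: str) -> list:
    return [1 if keyword in text else 0 for keyword in SET_KEYWORDS]

def contains_key_information(feature_vector: list) -> bool:
    # The text contains set key information if any feature value is non-zero.
    return any(feature_vector)

vec = extract_features("help me translate apple")
print(vec, contains_key_information(vec))   # -> [1, 0, 0, 0] True
```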
In a possible implementation manner, the second classification result determining unit 503 is specifically configured to:
and inputting the feature vector into the machine learning network model, and determining the second classification result corresponding to the text data according to the feature vector (one possible choice of model is sketched below).
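One conventional choice for such a machine learning network model (the patent does not name a specific one) is a classical classifier trained on the keyword feature vectors, for example scikit-learn's logistic regression; the training data in the sketch below is assumed to be supplied by the caller.

```python
from sklearn.linear_model import LogisticRegression  # one possible classical classifier

def train_second_model(X_train, y_train) -> LogisticRegression:
    """Fit the machine learning network model on keyword feature vectors.

    X_train (feature vectors) and y_train (field labels) are assumed to be
    provided by the caller; the patent only requires *some* trained model.
    """
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    return model

def second_classification(model: LogisticRegression, feature_vector: list) -> dict:
    """Return {field: probability} for a single feature vector."""
    probs = model.predict_proba([feature_vector])[0]
    return dict(zip(model.classes_, probs))
```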
In one possible implementation, the first classification result includes a first probability that the text data corresponds to each set field, and the second classification result includes a second probability that the text data corresponds to each set field; the field determining unit 504 is specifically configured to:
if the highest probability value in the first probability is larger than a first threshold value or the highest probability value in the second probability is smaller than a second threshold value, determining the field to which the text data belongs as the field corresponding to the highest probability value in the first probability;
if the field corresponding to the highest probability value in the second probability belongs to the set confusable field, determining the field to which the text data belongs as the field corresponding to the highest probability value in the second probability;
if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is greater than a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the first probability;
and if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is less than or equal to a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the second probability.
In a possible implementation manner, the field determining unit 504 is further configured to:
if the text data does not contain the set key information, determining the field to which the text data belongs according to the first classification result;
the field to which the text data belongs is then the field corresponding to the highest value among the first probabilities, as in the one-line sketch below.
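For completeness, this fallback path (no set key information, so no second classification result) amounts to taking the arg-max of the first probabilities alone; the dictionary convention follows the earlier fusion sketch.

```python
def determine_field_without_key_info(first_probs: dict) -> str:
    # No second classification result is available: take the field with the
    # highest first probability returned by the deep learning network model.
    return max(first_probs, key=first_probs.get)
```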
In a possible implementation manner, as shown in fig. 6, the apparatus may further include an application starting unit 601, configured to:
and starting the application corresponding to the field according to the field to which the text data belongs.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (9)

1. A terminal device, comprising:
the memory is used for storing program codes and data information generated when the terminal equipment runs;
a processor for executing the program code to implement the following processes: determining a first classification result corresponding to text data to be classified through a deep learning network model; the first classification result comprises a first probability that the text data corresponds to each set field; extracting features of the text data to obtain feature vectors corresponding to the text data, and determining whether the text data contains set key information according to the feature vectors; if the text data contains set key information, determining a second classification result corresponding to the text data according to the feature vector through a machine learning network model; the second classification result comprises a second probability that the text data corresponds to each set domain; if the highest probability value in the first probability is larger than a first threshold value or the highest probability value in the second probability is smaller than a second threshold value, determining the field to which the text data belongs as the field corresponding to the highest probability value in the first probability; if the field corresponding to the highest probability value in the second probability belongs to a set confusable field, determining the field to which the text data belongs as the field corresponding to the highest probability value in the second probability; if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is greater than a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the first probability; and if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is less than or equal to a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the second probability.
2. The terminal device of claim 1, wherein the terminal device further comprises a voice capture component; the voice acquisition component is used for acquiring voice data;
the processor is further configured to convert voice data collected by the voice collection component into text data.
3. The terminal device of claim 1, wherein the processor is specifically configured to:
performing word segmentation processing on the text data to be classified to obtain word segments corresponding to the text data;
and inputting the word segments corresponding to the text data into a deep learning network model, and determining a first classification result corresponding to the text data according to the word segments corresponding to the text data.
4. The terminal device of claim 1, wherein the processor is further configured to:
comparing the text data with the set key information, and determining, according to the comparison result of the text data and the key information, a feature value of a feature indicating that the text data contains the key information;
and determining the feature vector corresponding to the text data according to the feature value of each such feature.
5. The terminal device of claim 1, wherein the processor is further configured to:
and inputting the feature vector into a machine learning network model, and determining a second classification result corresponding to the text data according to the feature vector.
6. The terminal device of claim 1, wherein the processor is further configured to:
and starting the application corresponding to the field according to the field to which the text data belongs.
7. A method for determining a domain to which data belongs, comprising:
determining a first classification result corresponding to text data to be classified through a deep learning network model; the first classification result comprises a first probability that the text data corresponds to each set field;
extracting features of the text data to obtain feature vectors corresponding to the text data, and determining whether the text data contains set key information according to the feature vectors;
if the text data contains set key information, determining a second classification result corresponding to the text data according to the feature vector through a machine learning network model; the second classification result comprises a second probability that the text data corresponds to each set domain;
if the highest probability value in the first probability is larger than a first threshold value or the highest probability value in the second probability is smaller than a second threshold value, determining the field to which the text data belongs as the field corresponding to the highest probability value in the first probability;
if the field corresponding to the highest probability value in the second probability belongs to a set confusable field, determining the field to which the text data belongs as the field corresponding to the highest probability value in the second probability;
if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is greater than a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the first probability;
and if the field corresponding to the highest probability value in the second probability does not belong to the confusable field and the highest probability value in the first probability is less than or equal to a third threshold, determining that the field to which the text data belongs is the field corresponding to the highest probability value in the second probability.
8. The method of claim 7, wherein determining a first classification result corresponding to the text data to be classified by a deep learning network model comprises:
performing word segmentation processing on the text data to be classified to obtain word segments corresponding to the text data;
and inputting the word segments corresponding to the text data into a deep learning network model, and determining a first classification result corresponding to the text data according to the word segments corresponding to the text data.
9. The method of claim 7, wherein performing feature extraction on the text data to obtain a feature vector corresponding to the text data comprises:
comparing the text data with the set key information, and determining, according to the comparison result of the text data and the key information, a feature value of a feature indicating that the text data contains the key information;
and determining the feature vector corresponding to the text data according to the feature value of each such feature.
CN202011559098.6A 2020-12-25 2020-12-25 Terminal equipment and method for determining data belonging field Active CN112632222B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011559098.6A CN112632222B (en) 2020-12-25 2020-12-25 Terminal equipment and method for determining data belonging field

Publications (2)

Publication Number Publication Date
CN112632222A CN112632222A (en) 2021-04-09
CN112632222B true CN112632222B (en) 2023-02-03

Family

ID=75324938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011559098.6A Active CN112632222B (en) 2020-12-25 2020-12-25 Terminal equipment and method for determining data belonging field

Country Status (1)

Country Link
CN (1) CN112632222B (en)

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10832680B2 (en) * 2018-11-27 2020-11-10 International Business Machines Corporation Speech-to-text engine customization
US10937416B2 (en) * 2019-02-01 2021-03-02 International Business Machines Corporation Cross-domain multi-task learning for text classification

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111324831A (en) * 2018-12-17 2020-06-23 中国移动通信集团北京有限公司 Method and device for detecting fraudulent website
CN110021439A (en) * 2019-03-07 2019-07-16 平安科技(深圳)有限公司 Medical data classification method, device and computer equipment based on machine learning
CN112052331A (en) * 2019-06-06 2020-12-08 武汉Tcl集团工业研究院有限公司 Method and terminal for processing text information
CN110364185A (en) * 2019-07-05 2019-10-22 平安科技(深圳)有限公司 A kind of Emotion identification method, terminal device and medium based on voice data
KR20190112680A (en) * 2019-09-16 2019-10-07 엘지전자 주식회사 Voice sythesizer using artificial intelligence and operating method thereof
CN111309909A (en) * 2020-02-13 2020-06-19 北京工业大学 Text emotion classification method based on hybrid model
CN111414451A (en) * 2020-02-27 2020-07-14 中国平安财产保险股份有限公司 Information identification method and device, computer equipment and storage medium
CN111428028A (en) * 2020-03-04 2020-07-17 中国平安人寿保险股份有限公司 Information classification method based on deep learning and related equipment
CN111695352A (en) * 2020-05-28 2020-09-22 平安科技(深圳)有限公司 Grading method and device based on semantic analysis, terminal equipment and storage medium
CN111985234A (en) * 2020-09-08 2020-11-24 四川长虹电器股份有限公司 Voice text error correction method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a Neural Network Text Classification Algorithm Combining a Self-Attention Mechanism; Jia Hongyu et al.; Computer Applications and Software (《计算机应用与软件》); 2020-02-12 (Issue 02); full text *

Also Published As

Publication number Publication date
CN112632222A (en) 2021-04-09

Similar Documents

Publication Publication Date Title
EP3940638A1 (en) Image region positioning method, model training method, and related apparatus
CN107943860B (en) Model training method, text intention recognition method and text intention recognition device
US10956771B2 (en) Image recognition method, terminal, and storage medium
WO2020199932A1 (en) Model training method, face recognition method, device and apparatus, and storage medium
CN111476306B (en) Object detection method, device, equipment and storage medium based on artificial intelligence
KR102315732B1 (en) Speech recognition method, device, apparatus, and storage medium
CN106792003B (en) Intelligent advertisement insertion method and device and server
CN109670174B (en) Training method and device of event recognition model
CN110334344A (en) A kind of semanteme intension recognizing method, device, equipment and storage medium
CN111816159B (en) Language identification method and related device
CN110853617A (en) Model training method, language identification method, device and equipment
CN111339737B (en) Entity linking method, device, equipment and storage medium
CN112820299B (en) Voiceprint recognition model training method and device and related equipment
CN110162600B (en) Information processing method, session response method and session response device
CN104850238A (en) Method and device for sorting candidate items generated by input method
CN111491123A (en) Video background processing method and device and electronic equipment
CN114357278B (en) Topic recommendation method, device and equipment
CN108255940A (en) A kind of cross-language search method and apparatus, a kind of device for cross-language search
CN111314771A (en) Video playing method and related equipment
CN111444321A (en) Question answering method, device, electronic equipment and storage medium
CN116758362A (en) Image processing method, device, computer equipment and storage medium
CN112632222B (en) Terminal equipment and method for determining data belonging field
CN112995757B (en) Video clipping method and device
CN113761195A (en) Text classification method and device, computer equipment and computer readable storage medium
CN108073294A (en) A kind of intelligent word method and apparatus, a kind of device for intelligent word

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant