CN115688742B - User data analysis method and AI system based on artificial intelligence

Info

Publication number: CN115688742B
Application number: CN202211568435.7A
Authority: CN
Inventors: 曹成海; 王瑞; 宋杨
Original assignee: Beijing Guolian Video Information Technology Co ltd
Current assignee: Beijing Guolian Video Information Technology Co ltd
Priority date: 2022-12-08
Filing date: 2022-12-08
Publication date: 2023-10-31
Anticipated expiration: 2042-12-08
Also published as: CN115688742A

Abstract

The application provides a user data analysis method and an AI system based on artificial intelligence. A comparison text knowledge field and a comparison numerical value field are each obtained from the same basic text knowledge field, so that the two fields have consistent indication capability: two topic text data sets whose comparison text knowledge fields match also have matching comparison numerical value fields. Because the comparison numerical value field is obtained together with the comparison text knowledge field, deviation in the numerical conversion process is reduced and debugging of the numerical conversion is accelerated. During comparison matching, the first matching array corresponding to the comparison numerical value field serves as the primary object and the comparison text knowledge field serves as the secondary object, which increases the speed and reliability of the comparison. The emotion polarity of the user is then quickly determined from the emotion polarity indicated by the matched comparison topic text data sets.

Description

User data analysis method and AI system based on artificial intelligence

Technical Field

The application relates to the field of data analysis, in particular to a user data analysis method and an AI system based on artificial intelligence.

Background

With the popularization of the Internet, more and more people go online to transmit information and acquire knowledge, and they also express their own views online in order to gain approval and prompt discussion. Emotion polarity analysis of the text data that a user publishes about a specific event is very valuable. In the consumption field, for example, customers' comments on commodities can be analyzed: now that online shopping is an indispensable consumption channel, people can directly post positive or negative evaluations of purchased commodities on each shopping platform, and many emotion visualization systems are designed specifically for this scenario. How to quickly and accurately analyze the text data that a user publishes about a topic is therefore a technical problem that needs attention.

Disclosure of Invention

The application aims to provide an artificial intelligence-based user data analysis method and an AI system so as to provide an efficient and accurate analysis scheme.

In order to achieve the above object, the embodiments of the present application are realized as follows:

in a first aspect, an embodiment of the present application provides an artificial intelligence based user data analysis method, which is applied to a server, where the server is communicatively connected to at least one terminal device, and the terminal device is configured to generate text information in response to an operation of a user, where the text information forms a user topic text data set, and the method includes:

acquiring a basic text knowledge field of the user topic text data set;

converting the basic text knowledge field of the user topic text data set to obtain a comparison text knowledge field of the user topic text data set;

performing numerical conversion operation on the basic text knowledge field of the user topic text data set to obtain a comparison numerical value field corresponding to the user topic text data set;

normalizing the comparison value field of the user topic text data set to obtain a first matching array of the comparison value field of the user topic text data set;

and carrying out commonality matching processing on a plurality of comparison topic text data sets based on comparison text knowledge fields of the user topic text data sets and a first matching array of the user topic text data sets to obtain comparison topic text data sets matched with the user topic text data sets, wherein the comparison topic text data sets are topic text data sets marked with emotion polarities.
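
The normalization step listed above can be pictured with a minimal sketch. It assumes, purely for illustration, that the comparison numerical value field is a list of real numbers and that normalization means thresholding each entry to obtain a binary array. The function name `to_matching_array` and the threshold of zero are hypothetical; the application does not fix the exact rule.

```python
from typing import Sequence, Tuple


def to_matching_array(value_field: Sequence[float],
                      threshold: float = 0.0) -> Tuple[int, ...]:
    """Map each entry of a comparison numerical value field to 0 or 1.

    Assumption: thresholding stands in for the normalization step;
    the source text does not specify the rule.
    """
    return tuple(1 if v > threshold else 0 for v in value_field)


# Hypothetical comparison numerical value field of one user topic text data set.
first_matching_array = to_matching_array([0.8, -0.3, 0.1, -0.9])
print(first_matching_array)
```

A tuple is used so that the array is hashable and can later serve as a lookup key when comparison topic text data sets are grouped by their own matching arrays.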

As a possible implementation manner, the comparison text knowledge field of the user topic text data set and the comparison numerical value field of the user topic text data set are obtained by calling a topic text analysis network, and before the basic text knowledge field of the user topic text data set is obtained, the method further comprises a debugging process of the topic text analysis network, including:

Performing text knowledge field mining on a preset topic text data set sample based on the topic text analysis network to obtain a comparison text knowledge field of the preset topic text data set sample and a comparison numerical value field of the preset topic text data set sample;

obtaining a cost index value based on a comparison text knowledge field of the preset topic text data set sample to obtain a text knowledge field space cost index value of the preset topic text data set sample;

obtaining a cost index value based on a comparison value field of the preset topic text data set sample to obtain a value field space cost index value and a value field cost index value of the preset topic text data set sample;

integrating a plurality of text knowledge field space cost index values, numerical field space cost index values and numerical field cost index values of the preset topic text data set sample, and determining the cost index value obtained by the integrating operation as a first cost index value corresponding to the topic text analysis network;

debugging the topic text analysis network based on the first cost index value corresponding to the topic text analysis network.

As a possible implementation manner, the preset topic text data set sample includes a topic text data set comparison sample, a topic text data set positive sample and a topic text data set negative sample, a commonality measurement result between the topic text data set positive sample and the topic text data set comparison sample is greater than a first preset commonality measurement result, a commonality measurement result between the topic text data set negative sample and the topic text data set comparison sample is less than a second preset commonality measurement result, and the first preset commonality measurement result is greater than the second preset commonality measurement result;

the topic text analysis network comprises a main knowledge mining module, a comparison knowledge conversion module and a comparison numerical value conversion module;

the text knowledge field mining is performed on a preset topic text data set sample based on the topic text analysis network to obtain a comparison text knowledge field of the preset topic text data set sample and a comparison numerical value field of the preset topic text data set sample, and the text knowledge field mining comprises the following steps:

based on the main knowledge mining module, main text knowledge field mining is respectively carried out on the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample, so that basic text knowledge fields respectively corresponding to the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample are obtained;

Respectively performing numerical conversion operation on basic text knowledge fields of the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample based on the comparison numerical conversion module to obtain comparison numerical fields respectively corresponding to the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample;

and respectively converting the topic text data set comparison sample, the topic text data set positive sample and the basic text knowledge field of the topic text data set negative sample based on the comparison knowledge conversion module to obtain the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample respectively corresponding comparison text knowledge fields.

As a possible implementation manner, the obtaining a cost index value based on the comparison text knowledge field of the preset topic text data set sample to obtain a text knowledge field space cost index value of the preset topic text data set sample includes:

determining a first comparison text knowledge field interval value between a comparison text knowledge field of the topic text data set comparison sample and a comparison text knowledge field of the topic text data set positive sample, and then acquiring a second comparison text knowledge field interval value between a comparison text knowledge field of the topic text data set comparison sample and a comparison text knowledge field of the topic text data set negative sample;

acquiring a first sum result of the first comparison text knowledge field interval value and a first preset cost index value, and then acquiring a first difference result between the first sum result and the second comparison text knowledge field interval value;

if the first difference result is a positive number result, determining the first difference result as the text knowledge field space cost index value;

and if the first difference result is not a positive number result, determining a zero value as the text knowledge field space cost index value.
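
The computation above has the shape of a hinge over two interval values. The following is a minimal sketch, assuming that the interval value is a Euclidean distance between vectors and that the first preset cost index value acts as a margin; the function names and the choice of distance are illustrative assumptions, not details fixed by the application.

```python
import math
from typing import Sequence


def interval_value(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed interval value: Euclidean distance between two fields."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def space_cost(comparison: Sequence[float],
               positive: Sequence[float],
               negative: Sequence[float],
               preset_cost: float) -> float:
    """Text knowledge field space cost index value for one triplet."""
    first_interval = interval_value(comparison, positive)
    second_interval = interval_value(comparison, negative)
    # First sum result minus the second interval value.
    difference = first_interval + preset_cost - second_interval
    # Keep the difference only when it is positive; otherwise use zero.
    return difference if difference > 0 else 0.0


# Positive sample far away, negative sample close: the cost is positive.
print(space_cost([0.0, 0.0], [3.0, 4.0], [0.0, 1.0], preset_cost=0.5))
# Positive sample close, negative sample far away: the cost is zero.
print(space_cost([0.0, 0.0], [0.0, 1.0], [3.0, 4.0], preset_cost=0.5))
```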

As a possible implementation manner, the obtaining the cost index value based on the comparison value field of the preset topic text data set sample to obtain the value field space cost index value of the preset topic text data set sample includes:

acquiring a first comparison value field interval value between a comparison value field of the topic text data set comparison sample and a comparison value field of the topic text data set positive sample, and then acquiring a second comparison value field interval value between the comparison value field of the topic text data set comparison sample and the comparison value field of the topic text data set negative sample;

determining a second sum result of the first comparison value field interval value and a second preset cost index value, and then obtaining a second difference result between the second sum result and the second comparison value field interval value;

if the second difference result is a positive number result, determining the second difference result as the numerical field space cost index value;

and if the second difference result is not a positive number result, determining a zero value as the numerical field space cost index value.

As a possible implementation manner, the obtaining the cost index value based on the comparison value field of the preset topic text data set sample to obtain the value field cost index value of the preset topic text data set sample includes:

performing debugging normalization operation on the comparative numerical value fields respectively corresponding to the topic text data set comparative sample, the topic text data set positive sample and the topic text data set negative sample to obtain a debugging numerical value corresponding to each field in each comparative numerical value field;

determining a preset calculation result of each field in each comparison value field and the corresponding debugging value;

and adding the preset calculation results corresponding to each field to obtain a numerical field cost index value of the preset topic text data set sample.
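
One way to read the three steps above is as a penalty on how far each entry of a comparison numerical value field lies from its normalized target. The sketch below rests on two assumptions that the application does not state: the debugging normalization maps each entry to +1 or -1 by its sign, and the preset calculation is a squared difference. The names `debug_value` and `value_field_cost` are hypothetical.

```python
from typing import Iterable, Sequence


def debug_value(x: float) -> float:
    """Assumed debugging normalization: the sign of the entry as +1 or -1."""
    return 1.0 if x >= 0 else -1.0


def value_field_cost(value_fields: Iterable[Sequence[float]]) -> float:
    """Sum the per-entry results over all comparison numerical value fields.

    Assumption: the preset calculation is the squared difference between
    an entry and its debugging value.
    """
    total = 0.0
    for field in value_fields:
        for x in field:
            total += (x - debug_value(x)) ** 2
    return total


# Fields of the comparison, positive and negative samples (toy values).
fields = [[0.5, -0.5], [1.0, -1.0], [0.0]]
print(value_field_cost(fields))
```

Under these assumptions the cost shrinks as the entries approach +1 or -1, so minimizing it pushes the network output toward values that lose little information when normalized.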

As a possible implementation manner, the debugging the topic text analysis network based on the first cost index value corresponding to the topic text analysis network includes:

if the first cost index value is obtained based on the text knowledge field space cost index value, the numerical field space cost index value and the numerical field cost index value, adjusting the network coefficients of the main knowledge mining module, the network coefficients of the comparison numerical conversion module and the network coefficients of the comparison knowledge conversion module based on the first cost index value; or alternatively, adjusting the network coefficients of the comparison numerical conversion module and the network coefficients of the comparison knowledge conversion module based on the first cost index value;

if the first cost index value is obtained based on the numerical field space cost index value and the numerical field cost index value, adjusting the network coefficients of the comparison numerical conversion module based on the first cost index value.

As a possible implementation manner, the comparison text knowledge field of the user topic text data set and the comparison numerical value field of the user topic text data set are obtained by calling a topic text analysis network, wherein the topic text analysis network comprises a main knowledge mining module, a comparison numerical value conversion module and a comparison knowledge conversion module, and before the basic text knowledge field of the user topic text data set is obtained, the method further comprises:

adjusting the network coefficient of the main knowledge mining module and the network coefficient of the comparison knowledge conversion module until a second cost index value corresponding to the topic text analysis network meets a preset condition;

and adjusting the network coefficient of the main knowledge mining module, the network coefficient of the comparison numerical value conversion module and the network coefficient of the comparison knowledge conversion module until a third cost index value corresponding to the topic text analysis network meets the preset condition.

As a possible implementation manner, the adjusting the network coefficients of the main knowledge mining module and the network coefficients of the comparison knowledge conversion module includes:

performing text knowledge field mining on a preset topic text data set sample based on the topic text analysis network to obtain a comparison text knowledge field of the preset topic text data set sample;

Obtaining a cost index value based on a comparison text knowledge field of the preset topic text data set sample, obtaining a text knowledge field space cost index value of the preset topic text data set sample, and determining the text knowledge field space cost index value as a second cost index value corresponding to the topic text analysis network;

and adjusting the network coefficients of the main knowledge mining module and the network coefficients of the comparison knowledge conversion module based on the second cost index value corresponding to the topic text analysis network.

As a possible implementation manner, the performing commonality matching processing on a plurality of comparison topic text data sets based on the comparison text knowledge field of the user topic text data set and the first matching array of the user topic text data set, to obtain a comparison topic text data set matched with the user topic text data set, includes:

acquiring, based on the second matching arrays of the plurality of comparison topic text data sets, the correspondence between the second matching arrays and the comparison topic text data sets, wherein each second matching array corresponds to one or more comparison topic text data sets;

identifying a locking matching array from a plurality of second matching arrays based on the first matching array, and determining a comparison topic text data set corresponding to the locking matching array as a matching topic text data set;

acquiring a field interval between a comparison text knowledge field of a comparison topic text data set corresponding to the locking matching array and a comparison text knowledge field of the user topic text data set, and determining the comparison topic text data set with the field interval larger than a preset field interval as the matching topic text data set.
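
The matching steps above amount to a coarse lookup by matching array followed by a finer comparison of text knowledge fields. The sketch below illustrates that two-stage structure under stated assumptions: a second matching array is "locked" when it equals the first matching array exactly, the field interval is a Euclidean distance, and the dictionary layout and function names are hypothetical. The comparison against the preset field interval is left to the caller, so the sketch only returns each locked candidate together with its field interval.

```python
import math
from collections import defaultdict


def build_index(comparison_sets):
    """Group comparison topic text data set ids by their second matching array."""
    index = defaultdict(list)
    for set_id, entry in comparison_sets.items():
        index[entry["array"]].append(set_id)
    return index


def match(first_array, user_field, comparison_sets, index):
    """Return (id, field interval) pairs for the locked candidates.

    Assumption: locking means exact equality of matching arrays.
    """
    locked_ids = index.get(first_array, [])
    scored = [
        (set_id, math.dist(user_field, comparison_sets[set_id]["field"]))
        for set_id in locked_ids
    ]
    # Order candidates by field interval so the caller can apply a threshold.
    return sorted(scored, key=lambda pair: pair[1])


# Toy comparison topic text data sets, each already marked with a polarity.
comparison_sets = {
    "c1": {"array": (1, 0, 1, 0), "field": [0.9, 0.1], "polarity": "positive"},
    "c2": {"array": (1, 0, 1, 0), "field": [0.2, 0.8], "polarity": "negative"},
    "c3": {"array": (0, 1, 0, 1), "field": [0.9, 0.1], "polarity": "positive"},
}
index = build_index(comparison_sets)
for set_id, interval in match((1, 0, 1, 0), [1.0, 0.0], comparison_sets, index):
    print(set_id, round(interval, 3), comparison_sets[set_id]["polarity"])
```

Because the matching array is compared first, the more expensive field interval is computed only for the few candidates that share the locked array, which is where the speed gain described in the text would come from.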

In a second aspect, an embodiment of the present application provides a user data analysis AI system, including a terminal device and a server in communication with each other, where the server includes a processor and a memory, where the memory stores a computer program, and when the processor executes the computer program, the above-mentioned artificial intelligence-based user data analysis method is performed.

According to the user data analysis method and the AI system based on the artificial intelligence, the comparison text knowledge field and the comparison numerical value field are respectively obtained based on the basic text knowledge field, so that the comparison numerical value field and the comparison text knowledge field have consistent indication capability, and then the matched comparison numerical value field is also contained between two topic text data sets containing the matched comparison text knowledge field.

In the following description, other features will be partially set forth. Upon review of the ensuing disclosure and the accompanying figures, those skilled in the art will in part discover these features or will be able to ascertain them through production or use thereof. The features of the present application may be implemented and obtained by practicing or using the various aspects of the methods, tools, and combinations that are set forth in the detailed examples described below.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings that are required to be used in the description of the embodiments of the present application will be briefly described below.

Fig. 1 is a flow chart of a user data analysis method based on artificial intelligence according to an embodiment of the present application.

Fig. 2 is a schematic diagram of a functional module architecture of a user data analysis device according to an embodiment of the present application.

Fig. 3 is a schematic diagram of a server according to an embodiment of the present application.

Detailed Description

Embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application. The terminology used in the description of the embodiments of the application is for the purpose of describing particular embodiments of the application only and is not intended to be limiting of the application.

The user data analysis method based on artificial intelligence in the embodiment of the application is executed by a server. The server includes, but is not limited to, a single network server, a server group formed by a plurality of network servers, or a cloud formed by a large number of computers or network servers based on cloud computing, where cloud computing is a type of distributed computing in which a group of loosely coupled computers forms a super virtual computer. The server can operate alone to implement the application, or can access a network and implement the application through interaction with other servers in the network. The network on which the server is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN network, and the like. The server is in communication connection with at least one terminal device to form the user data analysis AI system provided by the embodiment of the application. The terminal device is configured to generate text information in response to an operation of the user, the text information constituting a user topic text data set. The terminal device may include, but is not limited to, a personal computer, a smart phone, a tablet computer, a personal digital assistant, and the like.

In the user data analysis method and the AI system based on artificial intelligence provided by the embodiment of the application, the related flow is executed by means of artificial intelligence. A topic text analysis network that has been calibrated in advance is introduced; it can be any feasible artificial intelligence network such as a machine learning network or a deep learning network, for example a DNN, CNN, RNN, ResNet or LSTM network, and the application is not limited in this respect. The topic text analysis network can comprise a main knowledge mining module, a comparison knowledge conversion module and a comparison numerical value conversion module. The main knowledge mining module can comprise a plurality of cascaded convolution units (layers) with different output sizes, and parameters such as the number of subunits, the convolution kernel size, the number of channels and the stride of each convolution unit can be selected according to actual conditions. The comparison knowledge conversion module may include a downsampling unit (e.g., max pooling), a normalization unit (e.g., BN, LN, IN, GN, WS), and a class mapping unit (a fully connected layer). The architecture of the comparison numerical value conversion module may follow that of the comparison knowledge conversion module.

When the topic text analysis network is debugged, a preset topic text data set sample prepared in advance is loaded into the topic text analysis network. The main knowledge mining module performs convolution operations to mine the basic text knowledge field (for example, a feature vector of the topic text data) of each topic text data set in the preset topic text data set sample. The comparison knowledge conversion module processes the basic text knowledge field of each topic text data set to obtain a comparison text knowledge field; this processing can be a conversion operation on the text knowledge field (for example, mapping processing). The comparison numerical value conversion module processes the basic text knowledge field of each topic text data set to obtain a comparison numerical value field; this processing is a numerical conversion operation (for example, quantization, in which the field is expressed in binary and then encoded). Because the comparison text knowledge field and the comparison numerical value field are both obtained from the basic text knowledge field, they have consistent indication capability: two topic text data sets with matching comparison text knowledge fields also have matching comparison numerical value fields, and will not be separated from each other merely because their comparison numerical value fields differ. In addition, the comparison numerical value field is obtained together with the comparison text knowledge field, so the numerical conversion does not need to be learned after the comparison text knowledge field has been obtained. This avoids the redundant nodes that repeated learning would add and avoids accumulating multiple deviations, namely the deviation of the knowledge field and the deviation in the numerical conversion process. A text knowledge field space cost index value is obtained from the comparison text knowledge fields (a cost index value is used to evaluate the quality of an output result and is also called a cost value, loss value or quality evaluation factor), and a numerical field space cost index value and a numerical field cost index value are obtained from the comparison numerical value fields. A total cost index value is then obtained from the text knowledge field space cost index value, the numerical field space cost index value and the numerical field cost index value, and the network coefficients of the topic text analysis network are adjusted according to the total cost index value.

Specifically, in the user data analysis method based on artificial intelligence provided by the embodiment of the application, the comparison text knowledge field of the user topic text data set and the comparison numerical value field of the user topic text data set are obtained by calling the topic text analysis network. The debugging process of the topic text analysis network is described in detail below and may include the following steps S1 to S5:

Step S1: text knowledge field mining is performed on the preset topic text data set sample based on the topic text analysis network, and a comparison text knowledge field of the preset topic text data set sample and a comparison numerical value field of the preset topic text data set sample are obtained.

A topic text data set may include statements that a target user publishes on a specific topic. A statement may include text information and image information (such as pictures and emoticons). The topic text data set may be a single statement or a collection of multiple statements. In the e-commerce field, for example, the topic text data set may be one evaluation made by the user about product S, or a topic text data set obtained by integrating multiple comments made by the user about product S; it may of course also be a topic text data set obtained by integrating comments about product S and product D. The server analyzes the topic text data set to obtain the emotion polarity result of the user; for example, matching a topic text data set (already marked with an emotion polarity) that is highly similar to the target topic text data set helps to obtain the emotion polarity quickly.

Returning to the debugging process of the topic text analysis network in the embodiment of the application, the preset topic text data set sample comprises a topic text data set comparison sample, a topic text data set positive sample and a topic text data set negative sample, which together form a triplet sample. The commonality measurement result between the topic text data set positive sample and the topic text data set comparison sample is larger than a first preset commonality measurement result, the commonality measurement result between the topic text data set negative sample and the topic text data set comparison sample is smaller than a second preset commonality measurement result, and the first preset commonality measurement result is larger than the second preset commonality measurement result; the topic text analysis network comprises a main knowledge mining module, a comparison knowledge conversion module and a comparison numerical conversion module. In step S1, performing text knowledge field mining on the preset topic text data set sample based on the topic text analysis network to obtain a comparison text knowledge field and a comparison numerical value field of the preset topic text data set sample may include the following procedures: performing, based on the main knowledge mining module, main text knowledge field mining on the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample respectively, to obtain the basic text knowledge fields respectively corresponding to the three samples; performing, based on the comparison numerical conversion module, a numerical conversion operation on the basic text knowledge field of each of the three samples, to obtain the comparison numerical value fields respectively corresponding to the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample; and converting, based on the comparison knowledge conversion module, the basic text knowledge field of each of the three samples, to obtain the comparison text knowledge fields respectively corresponding to the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample.

For example, the preset topic text data set sample includes a topic text data set comparison sample, a topic text data set positive sample and a topic text data set negative sample. The comparison sample and the positive sample are similar topic text data sets, so the commonality measurement result between them is larger than the first preset commonality measurement result. The comparison sample and the negative sample are dissimilar topic text data sets, so the commonality measurement result between them is smaller than the second preset commonality measurement result, and the first preset commonality measurement result is larger than the second preset commonality measurement result. The preset topic text data set sample can be obtained from combinations of similar topic text data sets. For example, 10 topic text data set combinations are provided. Any topic text data set in combination A, denoted A-1, is taken as the topic text data set comparison sample, and one topic text data set is taken from each of the remaining 9 combinations, giving 9 topic text data sets. The 9 topic text data sets are ordered according to their commonality measurement results with the comparison sample, a topic text data set negative sample is determined from the ordered topic text data sets, and the remaining topic text data sets in combination A are determined as topic text data set positive samples. The topic text data sets in the order can be selected based on a third preset commonality measurement result: the commonality measurement result between the topic text data set comparison sample and the topic text data set negative sample is smaller than the second preset commonality measurement result but larger than the third preset commonality measurement result, and the third preset commonality measurement result is smaller than the second preset commonality measurement result. A negative sample obtained in this way is a sample of balanced difficulty, so a topic text analysis network with better adaptability can be debugged.

For example, the main knowledge mining module performs main text knowledge field mining on the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample respectively, to obtain the basic text knowledge field of the comparison sample, the basic text knowledge field of the positive sample and the basic text knowledge field of the negative sample. Based on the comparison numerical conversion module, a numerical conversion operation is performed on each of the three basic text knowledge fields to obtain the comparison numerical value field of the comparison sample, the comparison numerical value field of the positive sample and the comparison numerical value field of the negative sample. Based on the comparison knowledge conversion module, a conversion operation is performed on each of the three basic text knowledge fields to obtain the comparison text knowledge field of the comparison sample, the comparison text knowledge field of the positive sample and the comparison text knowledge field of the negative sample.

Step S2: obtaining a cost index value based on the comparison text knowledge field of the preset topic text data set sample, to obtain a text knowledge field space cost index value of the preset topic text data set sample.

As some embodiments, in step S2, obtaining a cost index value based on the comparison text knowledge field of the preset topic text data set sample to obtain a text knowledge field space cost index value of the preset topic text data set sample may include the following procedures: acquiring a first comparison text knowledge field interval value between the comparison text knowledge field of the topic text data set comparison sample and the comparison text knowledge field of the topic text data set positive sample (the interval value is, for example, the distance between the vectors of the knowledge fields, in which case the space cost index value can also be called a distance cost index value), and then acquiring a second comparison text knowledge field interval value between the comparison text knowledge field of the topic text data set comparison sample and the comparison text knowledge field of the topic text data set negative sample; acquiring a first sum result of the first comparison text knowledge field interval value and a first preset cost index value, and then acquiring a first difference result between the first sum result and the second comparison text knowledge field interval value; if the first difference result is a positive number (greater than 0), determining the first difference result as the text knowledge field space cost index value; if the first difference result is not a positive number (less than or equal to 0), determining a zero value as the text knowledge field space cost index value.
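The procedure above can be sketched as follows. This is a minimal illustration only, assuming the interval value is a Euclidean distance; the function names `euclidean` and `space_cost` and the sample vectors are hypothetical and are not taken from the application.

```python
import math

def euclidean(u, v):
    """Interval value between two fields, assumed here to be Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def space_cost(comparison, positive, negative, preset_cost):
    """Space cost index value for one (comparison, positive, negative) sample."""
    first_interval = euclidean(comparison, positive)    # comparison vs positive
    second_interval = euclidean(comparison, negative)   # comparison vs negative
    # (first interval + preset cost index value) - second interval
    difference = first_interval + preset_cost - second_interval
    # Keep the difference only when it is a positive number, otherwise use zero.
    return difference if difference > 0 else 0.0

comparison = [0.0, 0.0]
positive = [3.0, 4.0]    # interval 5 from the comparison sample
negative = [6.0, 8.0]    # interval 10 from the comparison sample

cost_small_preset = space_cost(comparison, positive, negative, preset_cost=1.0)
cost_large_preset = space_cost(comparison, positive, negative, preset_cost=7.0)
print(cost_small_preset, cost_large_preset)
```

With the smaller preset cost index value the negative sample is already far enough away, so the cost is zero; with the larger one the sample still contributes a positive cost.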

As a possible implementation manner, in step S3, obtaining the cost index value based on the comparison value field of the preset topic text data set sample to obtain the numerical field space cost index value of the preset topic text data set sample may include the following procedures: determining a first comparison value field interval value between the comparison value field of the topic text data set comparison sample and the comparison value field of the topic text data set positive sample, and then acquiring a second comparison value field interval value between the comparison value field of the topic text data set comparison sample and the comparison value field of the topic text data set negative sample; determining a second sum result of the first comparison value field interval value and a second preset cost index value, and then obtaining a second difference result between the second sum result and the second comparison value field interval value; if the second difference result is a positive number, determining the second difference result as the numerical field space cost index value; and if the second difference result is not a positive number, determining a zero value as the numerical field space cost index value.

Step S3: obtaining cost index values based on the comparison value field of the preset topic text data set sample, to obtain a numerical field space cost index value and a numerical field cost index value of the preset topic text data set sample.

As a possible implementation manner, in step S3, obtaining the cost index value based on the comparison value field of the preset topic text data set sample to obtain the numerical field cost index value of the preset topic text data set sample may include the following procedures: performing a debugging normalization operation (for example, unifying the values to 0 or 1) on the comparison value fields corresponding to the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample respectively, to obtain the debug value corresponding to each field in each comparison value field; obtaining a preset calculation result for each field in each comparison value field and its corresponding debug value, where the preset calculation result is, for example, the absolute value of the difference between the field value and the debug value; and adding the preset calculation results corresponding to the fields (namely, adding the absolute values) to obtain the numerical field cost index value of the preset topic text data set sample. As another feasible implementation manner, the squares of the absolute values of the fields can be added to obtain the numerical field cost index value of the preset topic text data set sample.
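A minimal sketch of this calculation is given below. It assumes that the debug value of a field is 1 when the field is positive and 0 otherwise, mirroring the normalization described for step 400; the names `debug_value` and `numerical_field_cost` are hypothetical.

```python
def debug_value(x):
    """Debug value of one field: assumed to be 1 for a positive value, else 0."""
    return 1.0 if x > 0 else 0.0

def numerical_field_cost(field, squared=False):
    """Sum over all fields of |value - debug value| (or of its square)."""
    total = 0.0
    for x in field:
        gap = abs(x - debug_value(x))
        total += gap * gap if squared else gap
    return total

field = [0.7, 0.31, 0.24, -0.54, 0.37]
cost_abs = numerical_field_cost(field)                 # sum of absolute values
cost_sq = numerical_field_cost(field, squared=True)    # sum of squared values
print(cost_abs, cost_sq)
```

The cost shrinks as each field value approaches its debug value, which reduces the deviation introduced when the comparison value field is later converted into a matching array.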

Step S4: integrating the text knowledge field space cost index value, the numerical field space cost index value and the numerical field cost index value of the preset topic text data set sample, and determining the cost index value obtained by the integrating operation as a first cost index value of the corresponding topic text analysis network.

For example, when the topic text analysis network is debugged, all network coefficients of the topic text analysis network are set as coefficients to be adjusted. In the debugging process, each topic text data set of a preset topic text data set sample is processed to obtain a comparison text knowledge field and a comparison numerical value field; a text knowledge field space cost index value is calculated from the comparison text knowledge fields of the preset topic text data set sample, a numerical field space cost index value and a numerical field cost index value are calculated from the comparison numerical value fields of the preset topic text data set sample, and the three cost index values are weighted and added to obtain a total cost index value, for example:

Loss = a·loss1 + b·loss2 + c·loss3

wherein a is the weight of the text knowledge field space cost index value loss1, b is the weight of the numerical field space cost index value loss2, c is the weight of the numerical field cost index value loss3, and the specific numerical value of each weight can be determined according to practice.
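As an illustration of the weighted sum, the following sketch combines three cost index values. The function name `total_cost` and the numeric inputs and weights are hypothetical values chosen only for the example.

```python
def total_cost(loss1, loss2, loss3, a=1.0, b=1.0, c=1.0):
    """Loss = a*loss1 + b*loss2 + c*loss3, with weights chosen in practice."""
    return a * loss1 + b * loss2 + c * loss3

# loss1: text knowledge field space cost, loss2: numerical field space cost,
# loss3: numerical field cost (all three values are made up for illustration).
total = total_cost(2.0, 1.0, 2.92, a=1.0, b=0.5, c=0.1)
print(total)
```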

Step S5: debugging the topic text analysis network based on the first cost index value of the corresponding topic text analysis network.

As a possible implementation manner, debugging the topic text analysis network based on the first cost index value of the corresponding topic text analysis network in step S5 may include the following procedures: if the first cost index value is obtained based on the text knowledge field space cost index value, the numerical field space cost index value and the numerical field cost index value, adjusting the network coefficients of the main knowledge mining module, the comparison numerical conversion module and the comparison knowledge conversion module based on the first cost index value, or adjusting the network coefficients of the comparison numerical conversion module and the comparison knowledge conversion module based on the first cost index value; and if the first cost index value is obtained based on the numerical field space cost index value and the numerical field cost index value, adjusting the network coefficients of the comparison numerical conversion module based on the first cost index value.

For example, in the process of adjusting the network coefficients of the topic text analysis network through the total cost index value, all network coefficients or only some network coefficients of the topic text analysis network may be adjusted. The comparison numerical conversion module and the comparison knowledge conversion module may be learned simultaneously in the debugging process: the network coefficients of the main knowledge mining module, the comparison numerical conversion module and the comparison knowledge conversion module may be adjusted based on the first cost index value, or the network coefficients of the comparison numerical conversion module and the comparison knowledge conversion module may be adjusted based on the first cost index value. In addition, only the comparison numerical conversion module may be debugged, with the comparison knowledge conversion module left unadjusted.

As a possible implementation manner, the comparison text knowledge field of the user topic text data set and the comparison numerical value field of the user topic text data set are obtained based on a topic text analysis network, and the topic text analysis network comprises a main knowledge mining module, a comparison numerical conversion module and a comparison knowledge conversion module. Before the basic text knowledge field of the user topic text data set is obtained based on the topic text analysis network, the network coefficients of the main knowledge mining module and the network coefficients of the comparison knowledge conversion module are adjusted until a second cost index value of the corresponding topic text analysis network meets a preset condition; then the network coefficients of the main knowledge mining module, the network coefficients of the comparison numerical conversion module and the network coefficients of the comparison knowledge conversion module are adjusted until a third cost index value of the corresponding topic text analysis network meets the preset condition, where the preset condition is that the change of the cost index value is smaller than a preset value.

As a possible implementation manner, adjusting the network coefficients of the main knowledge mining module and the network coefficients of the comparison knowledge conversion module may include the following procedures: performing text knowledge field mining on the preset topic text data set sample based on the topic text analysis network, to obtain a comparison text knowledge field of the preset topic text data set sample; obtaining a cost index value based on the comparison text knowledge field of the preset topic text data set sample to obtain a text knowledge field space cost index value of the preset topic text data set sample, and determining the text knowledge field space cost index value as the second cost index value of the corresponding topic text analysis network; and adjusting the network coefficients of the main knowledge mining module and the network coefficients of the comparison knowledge conversion module based on the second cost index value of the corresponding topic text analysis network.

For example, when the network coefficients of the main knowledge mining module and the network coefficients of the comparison knowledge conversion module are adjusted, only the text knowledge field space cost index value is obtained: text knowledge field mining is performed on the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample based on the topic text analysis network, to obtain the comparison text knowledge field of the topic text data set comparison sample, the comparison text knowledge field of the topic text data set positive sample and the comparison text knowledge field of the topic text data set negative sample, and the cost index value is obtained based on these three comparison text knowledge fields, so that the text knowledge field space cost index value is obtained.

Referring to fig. 1, which is a flowchart of an artificial intelligence based user data analysis method according to an embodiment of the present application, the method includes the following steps 100 to 500.

Step 100: a base text knowledge field of a user topic text dataset is obtained.

For example, the topic text analysis network provided by the embodiment of the application comprises a main knowledge mining module, a comparison knowledge conversion module and a comparison numerical conversion module, wherein the comparison knowledge conversion module and the comparison numerical conversion module are both connected to the main knowledge mining module. The basic text knowledge field of the user topic text data set is obtained by the main knowledge mining module of the topic text analysis network.

Step 200: performing a conversion operation on the basic text knowledge field of the user topic text data set to obtain the comparison text knowledge field of the user topic text data set.

As a possible implementation manner, the conversion operation is performed based on a comparison knowledge conversion module, where the comparison knowledge conversion module may include a first pooling unit (pooling), a first regularization unit (normalization) and a first FCL (Fully Connected Layer) unit. Performing the conversion operation on the basic text knowledge field of the user topic text data set in step 200 to obtain the comparison text knowledge field of the user topic text data set may include the following procedures: pooling the basic text knowledge field of the user topic text data set based on the first pooling unit to obtain a pooling result pooling-1 of the basic text knowledge field; normalizing the pooling result pooling-1 based on the first regularization unit to obtain a normalized result of the basic text knowledge field; and carrying out classified mapping on the normalized result based on the first FCL unit to obtain the comparison text knowledge field of the user topic text data set.

Step 300: performing a numerical conversion operation on the basic text knowledge field of the user topic text data set to obtain a comparison numerical value field corresponding to the user topic text data set.

As a possible implementation manner, the numerical conversion operation is performed based on a comparison numerical conversion module, where the comparison numerical conversion module includes a second pooling unit, a second regularization unit and a second FCL unit. Performing the numerical conversion operation on the basic text knowledge field of the user topic text data set in step 300 to obtain the comparison numerical value field corresponding to the user topic text data set may include the following procedures: pooling the basic text knowledge field of the user topic text data set based on the second pooling unit to obtain a pooling result pooling-2 of the basic text knowledge field; normalizing the pooling result pooling-2 based on the second regularization unit to obtain a normalized result of the basic text knowledge field; and carrying out classified mapping on the normalized result based on the second FCL unit to obtain the comparison numerical value field of the user topic text data set.
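Steps 200 and 300 share the same pooling, normalization and fully connected sequence, applied to the same basic text knowledge field with different coefficients. The sketch below illustrates that sequence under stated assumptions: mean pooling, L2 normalization and hand-picked weights are illustrative choices not specified in the application, and all function names are hypothetical.

```python
import math

def mean_pool(token_vectors):
    """Pooling unit: average the per-token vectors (mean pooling is assumed)."""
    n, dim = len(token_vectors), len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]

def l2_normalize(vec, eps=1e-12):
    """Regularization unit: scale the vector to unit length (L2 is assumed)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]

def fully_connected(vec, weights, bias):
    """FCL unit: one linear mapping, out[j] = sum_i weights[j][i]*vec[i] + bias[j]."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]

def conversion_head(base_field, weights, bias):
    """Pooling, then normalization, then fully connected mapping."""
    return fully_connected(l2_normalize(mean_pool(base_field)), weights, bias)

# Basic text knowledge field: two token vectors of dimension 2 (made-up values).
base_field = [[2.0, 0.0], [4.0, 8.0]]

# Two heads on the same basic field, each with its own illustrative coefficients.
text_field = conversion_head(base_field, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
value_field = conversion_head(
    base_field, [[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0]
)
print(text_field, value_field)
```

Because both heads read the same basic text knowledge field, the comparison text knowledge field and the comparison numerical value field are produced in one pass over the input.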

Step 400: carrying out a normalization operation on the comparison numerical value field of the user topic text data set to obtain a first matching array of the comparison numerical value field of the user topic text data set.

The normalization operation in the embodiment of the present application can be understood as binary encoding, in which a negative value in the comparison numerical value field is normalized to 0 and a positive value is normalized to 1. For example, let the comparison numerical value field be {0.7; 0.31; 0.24; -0.54; 0.37}; performing the normalization operation yields the first matching array {1; 1; 1; 0; 1}.
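The binary encoding of the example above can be reproduced with a few lines. The function name `to_matching_array` is hypothetical, and mapping a value of exactly zero to 0 is an assumption, since the text only covers negative and positive values.

```python
def to_matching_array(value_field):
    """Binary encoding: positive values become 1, all other values become 0."""
    return [1 if x > 0 else 0 for x in value_field]

first_matching_array = to_matching_array([0.7, 0.31, 0.24, -0.54, 0.37])
print(first_matching_array)
```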

Step 500: carrying out commonality matching processing on a plurality of comparison topic text data sets based on the comparison text knowledge field of the user topic text data set and the first matching array of the user topic text data set, to obtain the comparison topic text data set matched with the user topic text data set.

A comparison topic text data set is a topic text data set whose emotion polarity has been marked in advance. The comparison topic text data set matched with the user topic text data set is obtained through the commonality matching processing; because it is highly similar to the user topic text data set, its emotion polarity can be used as the emotion polarity of the user topic text data set. In the embodiment of the application, the comparison topic text data sets are generally representative, and through continuous data inflow and updating, the comparison topic text data sets obtained later become increasingly close to the user topic text data set, so that the emotion polarity is indicated with higher accuracy.

As a possible implementation manner, performing the commonality matching processing on the plurality of comparison topic text data sets in step 500, based on the comparison text knowledge field of the user topic text data set and the first matching array of the user topic text data set, to obtain the comparison topic text data set matched with the user topic text data set may specifically include the following steps S501 to S503.

Step S501: determining, based on the second matching arrays of the plurality of comparison topic text data sets, the correspondence between the second matching arrays and the comparison topic text data sets.

Each second matching array corresponds to one or more comparison topic text data sets. For example, there are 5 comparison topic text data sets, each of which has a second matching array, and some of them have identical second matching arrays; on this basis, each distinct second matching array corresponds to one or more comparison topic text data sets, forming a one-to-one or one-to-many correspondence.
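The correspondence of step S501 can be pictured as a lookup table keyed by the second matching array. The sketch below is illustrative only; `build_index`, the identifiers `set-1` to `set-5` and the array values are hypothetical.

```python
from collections import defaultdict

def build_index(second_matching_arrays):
    """Map each distinct second matching array to the data sets that share it."""
    index = defaultdict(list)
    for set_id, array in second_matching_arrays.items():
        index[tuple(array)].append(set_id)   # tuples are hashable, lists are not
    return dict(index)

# Five comparison topic text data sets, some with identical matching arrays.
second_matching_arrays = {
    "set-1": [1, 1, 1, 0, 1],
    "set-2": [1, 0, 0, 1, 1],
    "set-3": [1, 0, 0, 1, 1],
    "set-4": [0, 0, 1, 1, 0],
    "set-5": [1, 1, 1, 0, 1],
}
index = build_index(second_matching_arrays)
print(index)
```

Here five data sets collapse into three distinct second matching arrays, two of which correspond to more than one data set.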

Step S502: identifying, based on the first matching array, a lock matching array among the plurality of second matching arrays, and determining the comparison topic text data set corresponding to the lock matching array as a matching topic text data set.

As a possible implementation manner, identifying the lock matching array among the plurality of second matching arrays based on the first matching array in step S502 may include the following procedures: acquiring the array interval (such as a distance) between each second matching array and the first matching array, and determining each second matching array whose array interval is smaller than a preset array interval as a lock matching array; or, performing a replacing operation based on the preset array interval on the first matching array to obtain replacing matching arrays, and determining each second matching array that is consistent with a replacing matching array as a lock matching array.
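Both alternatives can be sketched as follows, assuming the array interval is a Hamming distance and assuming the replacing operation means flipping up to a given number of bits of the first matching array; neither assumption is stated in the application, and the function names are hypothetical.

```python
from itertools import combinations

def hamming(a, b):
    """Array interval, assumed here to be the number of differing positions."""
    return sum(x != y for x, y in zip(a, b))

def lock_by_interval(first, second_arrays, preset_interval):
    """Keep second matching arrays whose interval to `first` is below the preset."""
    return [arr for arr in second_arrays if hamming(arr, first) < preset_interval]

def lock_by_replacement(first, second_arrays, max_flips):
    """Enumerate variants of `first` with up to `max_flips` bits flipped,
    then keep second matching arrays identical to one of the variants."""
    variants = set()
    for k in range(max_flips + 1):
        for positions in combinations(range(len(first)), k):
            flipped = list(first)
            for p in positions:
                flipped[p] = 1 - flipped[p]
            variants.add(tuple(flipped))
    return [arr for arr in second_arrays if tuple(arr) in variants]

first = (1, 1, 1, 0, 1)
second_arrays = [(1, 1, 1, 0, 1), (1, 1, 0, 0, 1), (0, 0, 0, 1, 0)]

locked_a = lock_by_interval(first, second_arrays, preset_interval=2)
locked_b = lock_by_replacement(first, second_arrays, max_flips=1)
print(locked_a, locked_b)
```

The second variant trades distance computations against every stored array for a bounded number of exact lookups, which can be cheaper when the stored arrays are indexed by value.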

Step S503: determining a field interval between the comparison text knowledge field of the comparison topic text data set corresponding to the lock matching array and the comparison text knowledge field of the user topic text data set, and determining the comparison topic text data set with the field interval larger than a preset field interval as the matching topic text data set.

For example, a lock matching array corresponds to one or more comparison topic text data sets: the lock matching array x corresponds to the comparison topic text data set set-1, and the lock matching array y corresponds to the comparison topic text data sets set-2 and set-3. The field intervals between the comparison text knowledge fields of set-1, set-2 and set-3 and the comparison text knowledge field of the user topic text data set are determined, and the comparison topic text data set with the field interval larger than the preset field interval is determined as the matching topic text data set; alternatively, the comparison topic text data sets corresponding to the lock matching arrays are directly determined as matching topic text data sets, and the matching results are arranged in sequence according to the field interval. In the embodiment of the application, the first matching array of the comparison value field, which serves as the reference for the commonality matching, is obtained directly through steps 100, 300 and 400, so that no additional calculation is needed, which saves calculation cost and storage.
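The ordering of candidates by field interval can be sketched as below. Only the ranking is shown; the comparison against the preset field interval is left to the caller. The Euclidean distance, the ascending order, the function names and the sample vectors are all assumptions made for illustration.

```python
import math

def field_interval(u, v):
    """Field interval between two comparison text knowledge fields
    (Euclidean distance is assumed here)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_candidates(user_field, candidate_fields):
    """Return (set_id, interval) pairs ordered by ascending field interval."""
    scored = [(field_interval(user_field, f), sid) for sid, f in candidate_fields.items()]
    scored.sort()
    return [(sid, interval) for interval, sid in scored]

user_field = [0.0, 0.0]
# Comparison text knowledge fields of the data sets behind the lock matching arrays.
candidate_fields = {"set-1": [3.0, 4.0], "set-2": [1.0, 0.0], "set-3": [0.0, 2.0]}

ranking = rank_candidates(user_field, candidate_fields)
print(ranking)
```

Only the few data sets behind the lock matching arrays are compared on the full comparison text knowledge field, so the more expensive interval calculation runs on a small candidate list.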

As a possible implementation manner, before determining the correspondence between the second matching arrays and the comparison topic text data sets in step S501, the method further includes: obtaining basic text knowledge fields of the plurality of comparison topic text data sets; performing the numerical conversion operation on the basic text knowledge fields of the plurality of comparison topic text data sets to obtain the comparison value fields corresponding to the plurality of comparison topic text data sets; and normalizing the comparison value fields of the plurality of comparison topic text data sets to obtain the second matching arrays of the comparison value fields of the plurality of comparison topic text data sets. Before determining the field interval between the comparison text knowledge field of the comparison topic text data set corresponding to the lock matching array and the comparison text knowledge field of the user topic text data set in step S503, the method further includes: performing the conversion operation on the basic text knowledge fields of the comparison topic text data sets to obtain the comparison text knowledge fields of the comparison topic text data sets.

For example, the comparison topic text data sets are loaded into the topic text analysis network to obtain their comparison text knowledge fields and the coded values of their comparison value fields: the comparison value fields are obtained from the comparison numerical conversion module, the second matching arrays are obtained through the normalization operation, and the correspondence between the comparison topic text data sets and the comparison text knowledge fields is stored. The second matching arrays are then deduplicated to obtain a plurality of distinct binary coded values, each of which corresponds to one or more comparison topic text data sets.

As a possible implementation manner, in order to debug the topic text analysis network, preset topic text data set samples need to be prepared in advance. The preset topic text data set samples can be obtained from topic text data set combinations, each of which is recorded in advance as to whether its topic text data sets are consistent. When the commonality measurement is debugged (similar to training), the comparison value field obtained through debugging is expected to indicate the commonality measurement result between the topic text data sets of a combination, so the same topic text data set combinations can also be used to debug the comparison value field. A topic text data set combination only indicates that the two topic text data sets within it are similar; it does not indicate whether topic text data sets from different combinations are similar. Therefore, in each round, where a topic text data set combination A and a topic text data set combination B may themselves be similar, the first n of the remaining topic text data sets are determined to be false value samples, and each false value sample is combined with the topic text data set combination A to form a preset topic text data set sample, in which either topic text data set of the combination A serves as the true value sample and the other serves as the comparison sample. Each topic text data set combination thus yields n preset topic text data set samples, and n×m preset topic text data set samples are finally obtained, where m is the number of topic text data set combinations in one round.
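The construction of the n×m preset samples can be sketched as follows. This is a simplified illustration: it takes the first n items of the other combinations in their stored order as false value samples, whereas the application does not specify how the remaining topic text data sets are ordered; `build_samples` and the item names are hypothetical.

```python
def build_samples(combinations_in_round, n):
    """For every combination (comparison, true value), attach n false value
    samples taken from the other combinations of the same round."""
    samples = []
    for i, (comparison, positive) in enumerate(combinations_in_round):
        # All topic text data sets that belong to the other combinations.
        remaining = [
            item
            for j, combo in enumerate(combinations_in_round)
            if j != i
            for item in combo
        ]
        for negative in remaining[:n]:       # first n remaining items (assumed order)
            samples.append((comparison, positive, negative))
    return samples

# One round with m = 3 combinations of similar topic text data sets.
round_combinations = [("a1", "a2"), ("b1", "b2"), ("c1", "c2")]
samples = build_samples(round_combinations, n=2)
print(len(samples), samples[0])
```

With m = 3 combinations and n = 2 false value samples per combination, the round yields 6 preset samples.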

As a possible implementation manner, the topic text analysis network includes a main knowledge mining module, a comparison knowledge conversion module and a comparison numerical conversion module, whose frameworks have been exemplified above; in use, the output result of the comparison numerical conversion module is activated. The framework of the main knowledge mining module may be a ResNet network. The comparison knowledge conversion module and the comparison numerical conversion module may select other network frameworks, for example two stacked networks followed by an FCL unit that serves as the output of the comparison numerical conversion module, where each network includes an FCL unit and an activation unit; the stacked framework may be FCL unit-activation unit-FCL unit, and the activation unit may use a tanh function, so that the output result lies within [-1, 1], or a sigmoid function. Based on the above architecture, the comparison value field can be obtained together with the comparison text knowledge field, and the comparison value field is numerically converted based on the activation function, which prevents redundant nodes from being added over multiple rounds of learning. Through parallel debugging on top of the main knowledge mining module, the topic text analysis network generates the comparison text knowledge field and the numerical result of the comparison value field simultaneously, which avoids the situation in which two topic text data sets with matched comparison text knowledge fields cannot be matched because their comparison value fields differ.

As a possible implementation manner, when the topic text analysis network propagates in the input-to-output direction, all network coefficients of the network are set to be debugged, each topic text data set of the loaded preset topic text data set sample is processed in the debugging process to obtain a comparison text knowledge field and a comparison numerical value field, a text knowledge field space cost index value is obtained through the comparison text knowledge field of the preset topic text data set sample, a numerical field space cost index value is obtained through the comparison numerical value field of the preset topic text data set sample, a numerical field cost index value is obtained through the comparison numerical value field of the preset topic text data set sample, and the three cost index values are added to obtain the total cost index value.

In summary, the user data analysis method based on artificial intelligence provided by the embodiment of the application obtains the comparison text knowledge field and the comparison value field based on the basic text knowledge field respectively, so that the comparison value field and the comparison text knowledge field have consistent indicating capability, and then, the two topic text data sets including the matched comparison text knowledge field also include the matched comparison value field.

Based on the same principle as the method shown in fig. 1, there is also provided in an embodiment of the present application a user data analysis device 10, as shown in fig. 2, the device 10 includes:

an obtaining module 11, configured to obtain a basic text knowledge field of the user topic text data set.

The knowledge transformation module 12 is configured to perform a transformation operation on the basic text knowledge field of the user topic text data set, so as to obtain a comparison text knowledge field of the user topic text data set.

The numerical conversion module 13 is configured to perform a numerical conversion operation on the basic text knowledge field of the user topic text data set, and obtain a comparative numerical field corresponding to the user topic text data set.

The normalization module 14 is configured to normalize the comparison value field of the user topic text data set to obtain a first matching array of the comparison value field of the user topic text data set.

The matching module 15 is configured to perform a common matching process on the multiple comparison topic text data sets based on the comparison text knowledge field of the user topic text data set and the first matching array of the user topic text data set, so as to obtain a comparison topic text data set matched with the user topic text data set.

It will be readily appreciated that the specific implementation of the user data analysis device 10 described above is consistent with the concepts of the methods provided above, and thus will not be described in detail herein.

The above embodiment describes the user data analysis device 10 from the viewpoint of a virtual module, and the following describes a server from the viewpoint of a physical module, specifically as follows:

an embodiment of the present application provides a server, as shown in fig. 3, a server 100 includes: a processor 101 and a memory 103. Wherein the processor 101 is coupled to the memory 103, such as via bus 102. Optionally, the server 100 may also include a transceiver 104. It should be noted that, in practical applications, the transceiver 104 is not limited to one, and the structure of the server 100 is not limited to the embodiment of the present application.

The processor 101 may be a CPU, general purpose processor, GPU, DSP, ASIC, FPGA or other programmable logic device, transistor logic device, hardware component, or any combination thereof. Which may implement or perform the various exemplary logic blocks, modules and circuits described in connection with this disclosure. The processor 101 may also be a combination that implements computing functionality, e.g., comprising one or more microprocessor combinations, a combination of a DSP and a microprocessor, etc.

Bus 102 may include a path to transfer information between the aforementioned components. Bus 102 may be a PCI bus or an EISA bus, etc. The bus 102 may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, only one thick line is shown in fig. 3, but not only one bus or one type of bus.

Memory 103 may be, but is not limited to, a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disk storage, optical disk storage (including compact disks, laser disks, optical disks, digital versatile disks, blu-ray disks, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.

The memory 103 is used for storing application program codes for executing the inventive arrangements and is controlled to be executed by the processor 101. The processor 101 is configured to execute application code stored in the memory 103 to implement what is shown in any of the method embodiments described above.

The embodiment of the application provides a server, which comprises: one or more processors; a memory; one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs when executed by the processor, to implement the artificial intelligence based user data analysis method described above. According to the technical scheme provided by the application, the comparison text knowledge field and the comparison numerical value field are respectively obtained through the basic text knowledge field, so that the comparison numerical value field and the comparison text knowledge field have consistent indicating capability, and then, the matched comparison numerical value field is also contained between two topic text data sets containing the matched comparison text knowledge field.

Embodiments of the present application provide a computer readable storage medium having a computer program stored thereon, which when run on a processor, enables the processor to perform the corresponding content of the method embodiments described above.

It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time but may be performed at different times; these sub-steps or stages are not necessarily performed sequentially, and may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.

The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make modifications and adaptations without departing from the principles of the present application, and such modifications and adaptations are intended to fall within the scope of the present application.

Claims

1. A method of user data analysis based on artificial intelligence, applied to a server communicatively coupled to at least one terminal device configured to generate text information in response to user operation, the text information constituting a user topic text dataset, the method comprising:

acquiring a basic text knowledge field of the user topic text data set, wherein the basic text knowledge field is a feature vector of the topic text data;

performing conversion operation on the basic text knowledge field of the user topic text data set to obtain a comparison text knowledge field of the user topic text data set, wherein the conversion operation comprises: pooling the basic text knowledge field of the user topic text data set based on a first pooling unit to obtain a pooling result pooling-1 of the basic text knowledge field; normalizing the pooling result pooling-1 based on a first regularization unit to obtain a normalized result of the basic text knowledge field; classifying and mapping the normalized result based on a first FCL unit to obtain the comparison text knowledge field;

performing a numerical conversion operation on the basic text knowledge field of the user topic text data set to obtain a comparison value field corresponding to the user topic text data set, wherein the numerical conversion operation comprises: pooling the basic text knowledge field of the user topic text data set based on a second pooling unit to obtain a pooling result pooling-2 of the basic text knowledge field; normalizing the pooling result pooling-2 based on a second regularization unit to obtain a normalized result of the basic text knowledge field; classifying and mapping the normalized result based on a second FCL unit to obtain the comparison value field;

performing commonality matching processing on a plurality of comparison topic text data sets based on comparison text knowledge fields of the user topic text data sets and a first matching array of the user topic text data sets to obtain comparison topic text data sets matched with the user topic text data sets, wherein the comparison topic text data sets are topic text data sets marked with emotion polarities;

taking the emotion polarity marked on the matched comparison topic text data set as the emotion polarity of the user topic text data set;

wherein the comparison text knowledge field of the user topic text data set and the comparison value field of the user topic text data set are obtained based on a topic text analysis network, and before obtaining the basic text knowledge field of the user topic text data set, the method further comprises a debugging process of the topic text analysis network, the debugging process comprising the following steps:

obtaining a cost index value based on a comparison text knowledge field of the preset topic text data set sample to obtain a text knowledge field space cost index value of the preset topic text data set sample, wherein the text knowledge field space cost index value is a distance cost index value between text knowledge fields, and the obtaining comprises: determining a first comparison text knowledge field interval value between a comparison text knowledge field of the topic text data set comparison sample and a comparison text knowledge field of the topic text data set positive sample, and then acquiring a second comparison text knowledge field interval value between the comparison text knowledge field of the topic text data set comparison sample and a comparison text knowledge field of the topic text data set negative sample; acquiring a first sum result of the first comparison text knowledge field interval value and a first preset cost index value, and then acquiring a first difference result between the first sum result and the second comparison text knowledge field interval value; if the first difference result is a positive number, determining the first difference result as the text knowledge field space cost index value; if the first difference result is not a positive number, determining a zero value as the text knowledge field space cost index value;

obtaining a cost index value based on a comparison value field of the preset topic text data set sample to obtain a numerical field space cost index value and a numerical field cost index value of the preset topic text data set sample, wherein the obtaining comprises the following steps: acquiring a first comparison value field interval value between a comparison value field of the topic text data set comparison sample and a comparison value field of the topic text data set positive sample, and then acquiring a second comparison value field interval value between the comparison value field of the topic text data set comparison sample and a comparison value field of the topic text data set negative sample; determining a second sum result of the first comparison value field interval value and a second preset cost index value, and then obtaining a second difference result between the second sum result and the second comparison value field interval value; if the second difference result is a positive number, determining the second difference result as the numerical field space cost index value; if the second difference result is not a positive number, determining a zero value as the numerical field space cost index value; performing a debugging normalization operation on the comparison value fields respectively corresponding to the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample to obtain a debugging value corresponding to each field in each comparison value field; determining a preset calculation result of each field in each comparison value field and the corresponding debugging value; adding the preset calculation results corresponding to each field to obtain the numerical field cost index value of the preset topic text data set sample;

integrating a plurality of text knowledge field space cost index values, numerical field space cost index values and numerical field cost index values of the preset topic text data set sample, and determining the cost index value obtained by the integrating operation as a first cost index value corresponding to the topic text analysis network, wherein the first cost index value is used for evaluating the quality of an output result of the topic text analysis network;

debugging the topic text analysis network based on the first cost index value corresponding to the topic text analysis network;

the preset topic text data set samples comprise topic text data set comparison samples, topic text data set positive samples and topic text data set negative samples, a commonality measurement result between the topic text data set positive samples and the topic text data set comparison samples is larger than a first preset commonality measurement result, a commonality measurement result between the topic text data set negative samples and the topic text data set comparison samples is smaller than a second preset commonality measurement result, and the first preset commonality measurement result is larger than the second preset commonality measurement result;

based on the comparison knowledge conversion module, respectively converting basic text knowledge fields of the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample to obtain comparison text knowledge fields respectively corresponding to the topic text data set comparison sample, the topic text data set positive sample and the topic text data set negative sample;

the debugging the topic text analysis network based on the first cost index value corresponding to the topic text analysis network comprises:

before obtaining the basic text knowledge field of the user topic text dataset, the method further comprises:

2. The method of claim 1, wherein said adjusting network coefficients of the backbone knowledge mining module and network coefficients of the comparative knowledge transformation module comprises:

3. The method according to claim 1, wherein the performing, based on the comparison text knowledge field of the user topic text data set and the first matching array of the user topic text data set, a commonality matching process on a plurality of comparison topic text data sets to obtain a comparison topic text data set matched with the user topic text data set, includes:

4. A user data analysis AI system comprising a terminal device and a server in communication with each other, the server comprising a processor and a memory, the memory storing a computer program which, when executed by the processor, performs the artificial intelligence based user data analysis method according to any one of claims 1 to 3.
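For illustration only, the space cost index value recited in claim 1 has the form of a margin-based triplet cost: the first interval value plus a preset cost index value, minus the second interval value, with any non-positive result replaced by zero. The Python sketch below assumes that the interval value is a Euclidean distance, which the claim does not specify; the function names are hypothetical, and the same function applies to either the comparison text knowledge fields or the comparison value fields.

```python
import math
from typing import Sequence


def interval_value(a: Sequence[float], b: Sequence[float]) -> float:
    """Interval value between two fields (assumed: Euclidean distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def space_cost(comparison: Sequence[float],
               positive: Sequence[float],
               negative: Sequence[float],
               preset_cost: float) -> float:
    """Space cost index value for one comparison/positive/negative triple."""
    first_interval = interval_value(comparison, positive)
    second_interval = interval_value(comparison, negative)
    # Sum result, then difference result, as recited in the claim.
    difference = (first_interval + preset_cost) - second_interval
    # Keep the difference only when it is positive; otherwise use zero.
    return difference if difference > 0 else 0.0
```

With a comparison sample at `[0.0, 0.0]`, a positive sample at `[3.0, 4.0]`, a negative sample at `[0.0, 1.0]` and a preset cost of `0.5`, the cost is `4.5`, because the positive sample is farther away than the negative one. Swapping the positive and negative samples gives a cost of `0.0`, so such a triple contributes nothing to debugging.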