CN116050410A - Big data analysis method and deep learning system based on digital online session service - Google Patents

Info

Publication number
CN116050410A
CN116050410A CN202310141339.2A CN202310141339A CN116050410A CN 116050410 A CN116050410 A CN 116050410A CN 202310141339 A CN202310141339 A CN 202310141339A CN 116050410 A CN116050410 A CN 116050410A
Authority
CN
China
Prior art keywords
text
session
conversation
segments
text segments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310141339.2A
Other languages
Chinese (zh)
Inventor
郜佳敏
潘洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202310141339.2A
Publication of CN116050410A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the present application provide a big data analysis method and a deep learning system based on a digital online session service. Interactive interest analysis is performed on the session text in the digital online session big data of any target user to generate corresponding interactive interest analysis information. Based on the interactive interest analysis information, target interactive interest points that meet personalized service pushing conditions are obtained, and the forward behavior path data of the target user corresponding to the target interactive interest points is traced back. Behavior preference analysis is then performed on the forward behavior path data to obtain corresponding behavior preference data, and personalized service pushing is performed for the target user based on the behavior preference data and the target interactive interest points. Because the pushing process combines the target interactive interest points with the behavior preference data derived from the corresponding forward behavior path data, the accuracy of personalized service pushing can be improved.

Description

Big data analysis method and deep learning system based on digital online session service
Technical Field
The application relates to the technical field of digital online service, in particular to a big data analysis method and a deep learning system based on digital online session service.
Background
In digital online services, a common push service is the indiscriminate, force-fed pushing of advertisement information, latest news, activity information and the like. Such information is pushed to a user an unlimited number of times, which easily annoys the user. By contrast, newer personalized push services recommend personalized service content in a targeted manner according to the user's interest points, that is, personalized service information recommendation. However, in the related art, personalized service pushing is performed based only on the user's interactive interest points, which inevitably introduces noise and affects the accuracy of personalized service pushing.
Disclosure of Invention
In order to at least overcome the above-mentioned shortcomings in the prior art, an object of the present application is to provide a big data analysis method and a deep learning system based on a digital online session service.
In a first aspect, the present application provides a big data analysis method based on a digital online session service, applied to a deep learning system, the method comprising:
performing interactive interest analysis on the session text in the digital online session big data of any target user to generate corresponding interactive interest analysis information;
based on the interactive interest analysis information, acquiring target interactive interest points meeting personalized service pushing conditions, and tracing back forward behavior path data corresponding to the target interactive interest points of the target user;
and performing behavior preference analysis on the forward behavior path data to obtain corresponding behavior preference data, and performing personalized service pushing on the target user based on the behavior preference data and the target interaction interest point.
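The three steps of the first aspect can be read as a simple pipeline. The following is a minimal sketch under stated assumptions, not the application's implementation: the names `run_pipeline`, `PushDecision`, `analyze_interest`, `analyze_preference` and `push_threshold` are hypothetical, and the personalized service pushing condition is assumed, for illustration only, to be a score threshold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PushDecision:
    interest_point: str   # target interactive interest point
    preference: str       # behavior preference derived from the traced path


def run_pipeline(
    session_texts: List[str],
    behavior_log: Dict[str, List[str]],
    analyze_interest: Callable[[List[str]], Dict[str, float]],
    analyze_preference: Callable[[List[str]], str],
    push_threshold: float = 0.5,
) -> List[PushDecision]:
    """Hypothetical flow: interest analysis -> trace back paths -> preference analysis."""
    # Step 1: interactive interest analysis on the session text.
    interest_scores = analyze_interest(session_texts)
    decisions = []
    for point, score in interest_scores.items():
        # Step 2: keep interest points meeting the (assumed) pushing condition.
        if score < push_threshold:
            continue
        # Trace back the forward behavior path recorded for this interest point.
        path = behavior_log.get(point, [])
        # Step 3: behavior preference analysis on the traced path.
        decisions.append(PushDecision(point, analyze_preference(path)))
    return decisions
```

The interest and preference analyzers are passed in as callables so that the sketch stays independent of how either model is actually built.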
In a possible implementation manner of the first aspect, the step of performing interactive interest analysis on the session text in the digital online session big data of the arbitrary target user to generate corresponding interactive interest analysis information includes:
according to the keyword characteristics in the conversation text in the digital online conversation big data of any target user, text segment extraction is carried out on the conversation text, a plurality of first conversation text segments are generated, the keyword characteristics are determined according to the keyword vectors in the conversation text, and the occurrence frequency of the keyword characteristics is positively associated with the extraction frequency of the text segments;
according to a multi-modal significance focusing network carrying static embedded knowledge points and a multi-modal significance focusing network carrying dynamic embedded knowledge points, performing feature embedding on the plurality of first session text segments to generate dialogue embedded features of the plurality of first session text segments, and performing feature reduction on the dialogue embedded features of the plurality of first session text segments to generate salient text segments of the plurality of first session text segments, wherein each salient text segment carries a salient attention degree;
performing interactive interest analysis according to the salient text segments of the plurality of first session text segments to generate an interactive interest degree of each first session text segment;
and analyzing the interactive interest degree of the session text according to the interactive interest degree of each first session text segment, and generating interactive interest analysis information of the session text.
In a possible implementation manner of the first aspect, the step of extracting text segments from the session text according to the keyword features in the session text in the digital online session big data of any target user to generate a plurality of first session text segments includes:
performing thermodynamic diagram output on the conversation text to generate a knowledge thermal unit of the conversation text, wherein the knowledge thermal unit characterizes the occurrence frequency of keyword features at different conversation nodes in the conversation text;
determining text segment extraction frequency of corresponding session nodes according to the occurrence frequency of keyword features at different session nodes in the knowledge thermal unit;
and according to the text segment extraction frequency, extracting the text segments of the conversation text, and generating a plurality of first conversation text segments.
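As a rough illustration of how an extraction frequency can follow keyword occurrence frequency, the sketch below counts keyword hits per session node and allocates more segments to "hotter" nodes. The function names, the whitespace tokenization, and the `base`/`budget` allocation rule are all assumptions made for the example; the application does not specify them.

```python
from typing import List, Sequence


def keyword_heat(nodes: Sequence[str], keywords: Sequence[str]) -> List[int]:
    """Count keyword occurrences at each session node (stand-in for the knowledge thermal unit)."""
    return [sum(node.lower().split().count(k) for k in keywords) for node in nodes]


def extraction_counts(heat: Sequence[int], base: int = 1, budget: int = 6) -> List[int]:
    """Give every node `base` segments plus a share of `budget` proportional to its heat."""
    total = sum(heat)
    if total == 0:
        # No keywords anywhere: fall back to uniform extraction.
        return [base] * len(heat)
    return [base + (budget * h) // total for h in heat]


nodes = ["hello there", "price of phone and phone case", "ok thanks price"]
heat = keyword_heat(nodes, ["price", "phone"])
print(heat, extraction_counts(heat))  # [0, 3, 1] [1, 5, 2]
```

Keeping a non-zero `base` means no session node is skipped entirely, while nodes with more keyword features are sampled more densely.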
In a possible implementation manner of the first aspect, the step of performing feature embedding on the plurality of first session text segments and generating dialogue embedded features of the plurality of first session text segments according to the multi-modal significance focusing network carrying static embedded knowledge points and the multi-modal significance focusing network carrying dynamic embedded knowledge points includes:
for any first session text segment, splitting text words of the first session text segment to generate a plurality of second session text segments;
obtaining semantic characterization vectors of the plurality of second session text segments, wherein the semantic characterization vector of each second session text segment is generated by vector fusion of the session text word vector of the second session text segment and the session node vector of the second session text segment;
according to the multi-modal significance focusing network carrying the static embedded knowledge points and the multi-modal significance focusing network carrying the dynamic embedded knowledge points, feature embedding is carried out on semantic characterization vectors of the plurality of second session text segments, and dialogue embedded features of the first session text segments are generated;
the step of performing feature embedding on the semantic characterization vectors of the plurality of second session text segments according to the multi-modal significance focusing network carrying static embedded knowledge points and the multi-modal significance focusing network carrying dynamic embedded knowledge points to generate dialogue embedded features of the first session text segment includes:
according to the multi-modal significance focusing network carrying static embedded knowledge points, the regular spoken language conversion network and the feedforward artificial neural network, performing feature embedding on the semantic characterization vectors of the plurality of second session text segments to generate proposed dialogue embedded features of the plurality of second session text segments;
according to the multi-modal significance focusing network carrying dynamic embedded knowledge points, the regular spoken language conversion network and the feedforward artificial neural network, performing feature embedding on the proposed dialogue embedded features of the plurality of second session text segments to generate dialogue embedded features of the first session text segment;
the step of performing feature embedding on the semantic characterization vectors of the plurality of second session text segments according to the multi-modal significance focusing network carrying static embedded knowledge points, the regular spoken language conversion network and the feedforward artificial neural network to generate the proposed dialogue embedded features of the plurality of second session text segments comprises the following steps:
performing regular spoken language conversion on the semantic characterization vectors of the plurality of second session text segments according to a first regular spoken language conversion network;
according to the multi-modal significance focusing network carrying static embedded knowledge points, performing significance feature analysis on the semantic characterization vectors of the plurality of second session text segments after the regular spoken language conversion to generate a first proposed feature;
determining a second proposed feature according to the semantic characterization vectors of the plurality of second session text segments and the first proposed feature;
performing regular spoken language conversion on the second proposed feature according to a second regular spoken language conversion network;
and processing the second proposed features after the regular spoken language conversion according to the feedforward artificial neural network to generate proposed dialogue embedded features of the plurality of second session text segments.
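The five sub-steps above have the shape of a normalize, attend, combine, normalize, feed-forward block. The sketch below illustrates that data flow only, under loudly stated assumptions that are not confirmed by the application: the spoken language conversion networks are assumed to act like per-vector normalization, the significance focusing network like single-head dot-product self-attention with identity projections, the second proposed feature like an element-wise sum of input and attention output, and the feedforward network is reduced to a ReLU with no learned weights. All function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def normalize(x: Matrix) -> Matrix:
    """Per-row zero-mean, unit-variance scaling (assumed role of the conversion network)."""
    out = []
    for row in x:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + 1e-5) for v in row])
    return out


def self_attention(x: Matrix) -> Matrix:
    """Single-head dot-product attention, identity projections (assumed significance analysis)."""
    scale = math.sqrt(len(x[0]))
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in x]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[j] for w, v in zip(weights, x)) for j in range(len(q))])
    return out


def feed_forward(x: Matrix) -> Matrix:
    """Toy position-wise feedforward step: ReLU only, no learned weights."""
    return [[max(0.0, v) for v in row] for row in x]


def proposed_dialogue_embedding(semantic: Matrix) -> Matrix:
    converted = normalize(semantic)              # first spoken language conversion
    first_proposed = self_attention(converted)   # significance feature analysis
    second_proposed = [[a + b for a, b in zip(r, s)]  # assumed: sum of input and first feature
                       for r, s in zip(semantic, first_proposed)]
    return feed_forward(normalize(second_proposed))   # second conversion, then feedforward
```

Each row of `semantic` stands for the semantic characterization vector of one second session text segment; the output keeps the same shape.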
In a possible implementation manner of the first aspect, the step of performing feature reduction on dialogue embedded features of the plurality of first session text segments to generate salient text segments of the plurality of first session text segments includes:
for any first session text segment, performing feature reduction on dialogue embedded features of the first session text segment to generate a significant attention degree corresponding to a plurality of keyword vectors in the first session text segment;
and generating a salient text segment of the first session text segment according to the salient attention degree corresponding to the plurality of keyword vectors in the first session text segment, wherein the concentration degree of each keyword vector in the salient text segment represents the corresponding salient attention degree.
In a possible implementation manner of the first aspect, the step of performing an interaction interest analysis according to the salient text segments of the plurality of first session text segments to generate an interaction interest degree of each first session text segment includes:
for a salient text segment of any first session text segment, acquiring a focusing amount of a keyword vector corresponding to each salient attention degree in the salient text segment of the first session text segment;
and determining the interaction interest degree of the first session text segment according to the focusing amount of the keyword vector corresponding to each salient attention degree.
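One simple reading of these two steps is a count-weighted average: tally how many keyword vectors fall at each salient attention degree (the focusing amount), then combine the degrees according to those tallies. The sketch below assumes discrete attention levels and an averaging rule; both are illustrative choices, and `segment_interest` is a hypothetical name.

```python
from collections import Counter
from typing import Sequence


def segment_interest(attention_levels: Sequence[int]) -> float:
    """Count keyword vectors at each attention level, then average the levels by count."""
    if not attention_levels:
        return 0.0
    focus_amount = Counter(attention_levels)  # level -> number of keyword vectors
    weighted = sum(level * count for level, count in focus_amount.items())
    return weighted / sum(focus_amount.values())
```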
In a possible implementation manner of the first aspect, the step of performing an interaction interest analysis according to the salient text segments of the plurality of first session text segments to generate an interaction interest degree of each first session text segment includes:
for the salient text segment of any first session text segment, updating the salient attention degrees corresponding to the plurality of keyword vectors in the salient text segment according to the session node relation among the plurality of keyword vectors in the salient text segment;
and generating the interaction interest degree of the first session text segment according to the focusing amount of the keyword vector corresponding to each salient attention degree in the updated salient text segment.
In a possible implementation manner of the first aspect, the interactive interest analysis information includes an interactive interest degree of the session text;
the step of analyzing the interactive interest degree of the session text according to the interactive interest degree of each first session text segment to generate interactive interest analysis information of the session text includes:
determining the weight corresponding to each interaction interest degree according to the focusing quantity of the first session text segment corresponding to each interaction interest degree;
and analyzing the interactive interest degree of the session text according to the weight corresponding to each interactive interest degree, and generating the interactive interest degree of the session text.
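The weighting described above can be sketched as a normalized weighted sum, where each segment's weight is its share of the total focusing amount. The normalization rule and the name `session_interest` are assumptions for illustration; the application only states that the weights are determined from the focusing amounts.

```python
from typing import Sequence


def session_interest(segment_scores: Sequence[float], focus_amounts: Sequence[int]) -> float:
    """Combine per-segment interest degrees using weights proportional to focusing amount."""
    total = sum(focus_amounts)
    if total == 0:
        return 0.0
    weights = [n / total for n in focus_amounts]
    return sum(w * s for w, s in zip(weights, segment_scores))
```

For example, two segments scored 0.2 and 0.8 with focusing amounts 1 and 3 receive weights 0.25 and 0.75.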
In a possible implementation manner of the first aspect, the method further includes:
splicing the salient text segments of the plurality of first session text segments to generate salient text segments of the session text, wherein the salient text segments of the session text represent text segments of salient session nodes in the session text;
or splicing the updated salient text segments of the plurality of first session text segments to generate the salient text segments of the session text.
For example, in a possible implementation manner of the first aspect, the method further includes:
extracting text segments of a template session text according to keyword features in the template session text to generate a plurality of first template session text segments, wherein the template session text carries marked interactive interest degrees, the keyword features are determined according to keyword vectors in the template session text, and occurrence frequencies of the keyword features are positively associated with extraction frequencies of the text segments;
determining the a priori marked significance attention degree of each first template session text segment according to the session nodes of the plurality of first template session text segments in the template session text and training label information, wherein the training label information characterizes the significance attention degree of each session node in the template session text;
according to a multi-modal significance focusing network carrying static embedded knowledge points and a multi-modal significance focusing network carrying dynamic embedded knowledge points, feature embedding is carried out on the plurality of first template session text segments, dialogue embedded features of the plurality of first template session text segments are generated, feature reduction is carried out on the dialogue embedded features of the plurality of first template session text segments, template significance text segments of the plurality of first template session text segments are generated, and each template significance text segment carries significance focusing degree;
Performing interaction interest analysis according to the template salient text segments of the plurality of first template session text segments to generate interaction interest degree of each first template session text segment;
analyzing the interactive interest degree of the template session text according to the interactive interest degree of each first template session text segment to generate interactive interest analysis information of the template session text;
updating the network weight parameters of the interactive interest analysis network according to the significance attention degree carried in each template significance text segment, the significance attention degree marked a priori in each first template session text segment, the interaction interest degree of the template session text and the marked interaction interest degree;
the step of updating the network weight parameter of the interactive interest analysis network according to the significance attention degree carried in each template significance text segment, the significance attention degree marked a priori in each first template session text segment, the interactive interest degree of the template session text and the marked interactive interest degree, includes:
determining a first training cost value according to the significance attention degree carried in each template significance text segment and the significance attention degree marked in the corresponding first template session text segment in advance;
determining a second training cost value according to the interaction interest degree of the template session text and the marked interaction interest degree, wherein the second training cost value is a cross entropy training cost value;
updating the network weight parameters of the interactive interest analysis network according to the first training cost value and the second training cost value;
the step of determining a first training cost value according to the significance attention degree carried in each template significance text segment and the significance attention degree marked a priori in the corresponding first template session text segment, includes:
determining a third training cost value according to the significance attention degree carried in each template significance text segment and the significance attention degree marked in the corresponding first template session text segment in advance, wherein the third training cost value is a cross entropy training cost value;
determining a fourth training cost value according to the significance attention degree carried in each template significance text segment and the significance attention degree marked a priori in the corresponding first template session text segment;
and carrying out weight fusion on the third training cost value and the fourth training cost value to generate the first training cost value.
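The cost structure above can be sketched as follows. The third and second cost values are cross entropy, as stated; the form of the fourth cost value is not named in the text, so mean squared error is used here purely as a placeholder assumption. The fusion weight `alpha` and the final sum of the first and second cost values are likewise illustrative assumptions, and all function names are hypothetical.

```python
import math
from typing import Sequence


def cross_entropy(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean binary cross entropy between predicted and labeled degrees in [0, 1]."""
    eps = 1e-12
    terms = [-(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
             for p, t in zip(pred, target)]
    return sum(terms) / len(terms)


def mean_squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_training_cost(att_pred, att_label, interest_pred, interest_label, alpha=0.5):
    third = cross_entropy(att_pred, att_label)         # cross entropy on attention degrees
    fourth = mean_squared_error(att_pred, att_label)   # ASSUMPTION: form not given in the text
    first = alpha * third + (1 - alpha) * fourth       # weight fusion of third and fourth
    second = cross_entropy(interest_pred, interest_label)  # cross entropy on interest degree
    return first + second                              # ASSUMPTION: simple sum for the update
```

Using two different cost forms on the same attention-degree pair lets the fused first cost value penalize both confident mislabeling and overall deviation.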
For example, in a possible implementation manner of the first aspect, the step of performing behavior preference analysis on the forward behavior path data to obtain corresponding behavior preference data, and performing personalized service pushing on the target user based on the behavior preference data and the target interaction interest point includes:
loading the forward behavior path data of the target user into a behavior preference decision model meeting a model convergence condition, and acquiring behavior preference data of the target user generated by the behavior preference decision model, wherein the behavior preference decision model is generated by updating model weight parameters by adopting model learning sample data, the model learning sample data comprises template behavior path data and template behavior path derivative data, the template behavior path data is the actually collected forward behavior path data of the template user, and the template behavior path derivative data is sample data generated by performing data derivative expansion on the template behavior path data;
extracting target personalized service content data corresponding to the behavior preference field in the behavior preference data from a personalized service content database corresponding to the target interactive interest point, and pushing the target personalized service content data to the target user.
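The content extraction step amounts to a two-level lookup: first by target interactive interest point, then by behavior preference field. The sketch below models the personalized service content database as a nested dictionary; that layout, the type alias `ContentDB`, and the function name `select_push_content` are assumptions for illustration.

```python
from typing import Dict, List

# interest point -> behavior preference field -> content items (hypothetical layout)
ContentDB = Dict[str, Dict[str, List[str]]]


def select_push_content(db: ContentDB, interest_point: str,
                        preference_fields: List[str]) -> List[str]:
    """Collect the content items matching each preference field for one interest point."""
    by_field = db.get(interest_point, {})
    selected: List[str] = []
    for field in preference_fields:
        selected.extend(by_field.get(field, []))
    return selected
```

Content that matches the interest point but none of the preference fields is left out, which is how the preference data narrows what is pushed.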
For example, in a possible implementation manner of the first aspect, the behavior preference decision model includes an encoding unit and a decoding unit, and the behavior preference decision model is generated through training by:
selecting part of the template behavior path data from the template behavior path data to form a target model learning sequence number group, and executing the following steps for the target template behavior path data in the target model learning sequence number group:
loading the target template behavior path data to a behavior path derivative network to generate target template behavior path derivative data;
encoding the target template behavior path data and the target template behavior path derivative data respectively with the encoding unit to generate a first encoding vector set and a second encoding vector set;
loading the first encoding vector set and the second encoding vector set to the decoding unit respectively, and obtaining first behavior preference decision data and second behavior preference decision data generated by the decoding unit;
obtaining a sequence number group learning cost value based on the first coding vector set, the second coding vector set, the first behavior preference decision data, the second behavior preference decision data and the priori behavior preference data corresponding to each template behavior path data in the target model learning sequence number group;
updating the weight parameter information of the behavior path derivative network, the encoding unit and the decoding unit based on the sequence number group learning cost value, and re-executing the step of selecting part of the template behavior path data from the template behavior path data to form a target model learning sequence number group until the weight parameter information meets the model convergence condition.
For example, in a possible implementation manner of the first aspect, the step of obtaining a sequence number group learning cost value based on the first encoding vector set, the second encoding vector set, the first behavior preference decision data, the second behavior preference decision data and the prior behavior preference data corresponding to each template behavior path data in the target model learning sequence number group includes:
obtaining a first learning cost value based on prior behavior preference data corresponding to the target template behavior path data and the first behavior preference decision data;
loading the first encoding vector set and the second encoding vector set into a binary classification neural network respectively, obtaining classification discrimination data generated by the binary classification neural network, and obtaining a second learning cost value based on the classification discrimination data;
obtaining a template learning cost value corresponding to the target template behavior path data based on the first learning cost value and the second learning cost value;
acquiring a decision thermodynamic diagram of the first behavior preference decision data corresponding to each template behavior path data in the model learning sequence number group as a first decision thermodynamic diagram, and acquiring a decision thermodynamic diagram of the second behavior preference decision data corresponding to each template behavior path data in the model learning sequence number group as a second decision thermodynamic diagram;
obtaining a first unit learning cost value based on the first decision thermodynamic diagram and the second decision thermodynamic diagram;
summing the template learning cost values respectively corresponding to each template behavior path data in the model learning sequence number group to generate a second unit learning cost value;
obtaining the sequence number group learning cost value based on the first unit learning cost value and the second unit learning cost value;
the step of updating the weight parameter information of the behavior path derivative network, the encoding unit and the decoding unit based on the sequence number group learning cost value includes:
updating the weight parameter information of the behavior path derivative network, the encoding unit, the decoding unit and the binary classification neural network based on the sequence number group learning cost value.
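The group-level cost described above combines a term comparing the two decision thermodynamic diagrams with the sum of per-template costs. In the sketch below, each decision thermodynamic diagram is assumed to be a normalized histogram of decided preference classes over the group, the first unit learning cost value is assumed to be the L1 distance between the two histograms, and the final combination is assumed to be a plain sum. None of these forms is specified in the text, and the function names are hypothetical.

```python
from typing import List, Sequence


def heatmap(decisions: Sequence[int], num_classes: int) -> List[float]:
    """Normalized histogram of decided classes (assumed form of a decision thermodynamic diagram)."""
    counts = [0] * num_classes
    for d in decisions:
        counts[d] += 1
    return [c / len(decisions) for c in counts]


def group_learning_cost(template_costs: Sequence[float],
                        first_decisions: Sequence[int],
                        second_decisions: Sequence[int],
                        num_classes: int) -> float:
    first_map = heatmap(first_decisions, num_classes)    # from the original path data
    second_map = heatmap(second_decisions, num_classes)  # from the derivative path data
    # ASSUMPTION: L1 distance between the two diagrams as the first unit cost.
    first_unit = sum(abs(a - b) for a, b in zip(first_map, second_map))
    # Second unit cost: sum of the per-template learning cost values.
    second_unit = sum(template_costs)
    return first_unit + second_unit
```

A term of this kind would push decisions on derived data toward the same overall distribution as decisions on the collected data, which is one plausible motivation for comparing the two diagrams.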
For example, in a possible implementation manner of the first aspect, the step of encoding the target template behavior path data and the target template behavior path derivative data respectively with the encoding unit to generate a first encoding vector set and a second encoding vector set includes:
respectively carrying out noise characteristic cleaning on the target template behavior path data and the target template behavior path derivative data to generate first noise characteristic cleaning data and second noise characteristic cleaning data;
and loading the first noise characteristic cleaning data and the second noise characteristic cleaning data to the coding unit respectively to acquire the first coding vector set and the second coding vector set generated by the coding unit.
In a second aspect, embodiments of the present application also provide a deep learning system, the deep learning system including a processor and a machine-readable storage medium having stored therein a computer program loaded and executed in conjunction with the processor to implement the big data analysis method based on the digital online session service of the first aspect above.
With the technical solution of any of the above aspects, interactive interest analysis is performed on the session text in the digital online session big data of any target user to generate corresponding interactive interest analysis information. Target interactive interest points meeting personalized service pushing conditions are obtained based on the interactive interest analysis information, and the forward behavior path data of the target user corresponding to the target interactive interest points is traced back. Behavior preference analysis is performed on the forward behavior path data to obtain corresponding behavior preference data, and personalized service pushing is performed for the target user based on the behavior preference data and the target interactive interest points. Because the pushing process combines the target interactive interest points with the behavior preference data of the corresponding forward behavior path data, the accuracy of personalized service pushing can be improved.
Text segments are extracted from different session node portions of the session text at different text segment extraction frequencies, according to the different occurrence frequencies of keyword features in the session text. Feature embedding and feature reduction are performed on the extracted session text segments through the multi-modal significance focusing network carrying static embedded knowledge points and the multi-modal significance focusing network carrying dynamic embedded knowledge points, generating for each session text segment a salient text segment that carries salient attention degrees. The interactive interest degree of each session text segment is then predicted from its salient text segment, and the interactive interest degree of the session text is analyzed to generate the interactive interest analysis information of the session text. In this way, while ensuring that the interactive-interest portions of the session text are not omitted during text segment extraction, extraction from the non-interest portions can be reduced, so that feature embedding and feature reduction focus more on the interactive-interest portions. This reduces errors in the subsequent interactive interest analysis, improves the analysis accuracy of the salient attention degree, and in turn improves the accuracy of the interactive interest degree of the session text, which can provide a reference basis for subsequent personalized information pushing.
Drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly introduced below. It should be understood that the following drawings illustrate only some embodiments of the present application and should therefore not be considered as limiting its scope; those skilled in the art can obtain other related drawings from them without inventive effort.
Fig. 1 is a flow chart of a big data analysis method based on a digital online session service according to an embodiment of the present application;
fig. 2 is a schematic block diagram of a deep learning system for implementing the above-mentioned big data analysis method based on the digital online session service according to an embodiment of the present application.
Detailed Description
The following description is presented to enable one of ordinary skill in the art to make and use the application and is provided in the context of a particular application and its requirements. It will be apparent to those having ordinary skill in the art that various changes can be made to the disclosed embodiments and that the general principles defined herein may be applied to other embodiments and applications without departing from the principles and scope of the present application. Thus, the present application is not limited to the embodiments described, but is to be accorded the widest scope consistent with the claims.
Step S100, carrying out interactive interest analysis on the session text in the digital online session big data of any target user, and generating corresponding interactive interest analysis information.
Step S200, based on the interaction interest analysis information, obtaining target interaction interest points meeting personalized service pushing conditions, and tracing forward behavior path data corresponding to the target interaction interest points of the target user.
In this embodiment, an interest point whose interaction interest confidence is greater than a preset confidence may be taken as the target interaction interest point satisfying the personalized service pushing condition. The forward behavior path data of the target user corresponding to that target interaction interest point may then be traced; for example, behavior data of the target user related to the target interaction interest point within a preset time period before the current time node may be traced as the forward behavior path data.
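The selection and tracing described for step S200 can be sketched as follows. This is a minimal illustration only: the `InterestPoint` and `BehaviorEvent` records, the `topic` field that links a behavior to an interest point, and the numeric values are all hypothetical and are not specified by the application.

```python
from dataclasses import dataclass

@dataclass
class InterestPoint:
    name: str
    confidence: float  # interaction interest confidence (hypothetical 0..1 scale)

@dataclass
class BehaviorEvent:
    timestamp: float   # seconds on an arbitrary clock
    topic: str         # interest point the behavior relates to (assumed field)
    action: str

def select_target_points(points, min_confidence):
    """Keep interest points whose confidence exceeds the preset confidence."""
    return [p for p in points if p.confidence > min_confidence]

def trace_forward_path(events, point, now, window):
    """Return events about `point` in the `window` seconds before `now`, oldest first."""
    hits = [e for e in events
            if e.topic == point.name and now - window <= e.timestamp <= now]
    return sorted(hits, key=lambda e: e.timestamp)

points = [InterestPoint("refund", 0.91), InterestPoint("shipping", 0.42)]
events = [
    BehaviorEvent(100.0, "refund", "view_policy"),
    BehaviorEvent(950.0, "refund", "open_ticket"),
    BehaviorEvent(960.0, "shipping", "track_order"),
    BehaviorEvent(990.0, "refund", "chat"),
]
targets = select_target_points(points, min_confidence=0.8)
path = trace_forward_path(events, targets[0], now=1000.0, window=100.0)
print([e.action for e in path])  # ['open_ticket', 'chat']
```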
Step S300, performing behavior preference analysis on the forward behavior path data to obtain corresponding behavior preference data, and performing personalized service pushing on the target user based on the behavior preference data and the target interaction interest point.
Based on the above steps, in the embodiment, through performing interactive interest analysis on the session text in the digital online session big data of any target user, corresponding interactive interest analysis information is generated, based on the interactive interest analysis information, a target interactive interest point meeting personalized service pushing conditions is obtained, forward behavior path data corresponding to the target interactive interest point of the target user are traced, behavior preference analysis is performed on the forward behavior path data, corresponding behavior preference data is obtained, personalized service pushing is performed on the target user based on the behavior preference data and the target interactive interest point, and therefore in the personalized service pushing process, not only the interactive interest point but also the behavior preference data corresponding to the target interactive interest point are combined, and the accuracy of personalized service pushing can be improved.
In some exemplary design ideas, the embodiments of the present application provide an interactive interest analysis method based on artificial intelligence, which includes the following steps.
Step S101, extracting text segments of the conversation text according to keyword features in the conversation text in the digital online conversation big data of any target user, generating a plurality of first conversation text segments, determining the keyword features according to keyword vectors in the conversation text, and positively associating the occurrence frequency of the keyword features with the extraction frequency of the text segments.
In some exemplary design considerations, the target user may be any user using a digital online service (e.g., an e-commerce online service, a medical online service, an industrial online service, etc.), and the digital online session big data may be used to represent a data set of online session behaviors of the target user during use of the digital online service, where the online session behaviors may be, for example, problem session behaviors for a certain service item.
The keyword features of the conversation text, which are determined from the keyword vectors in the conversation text, may be used to characterize semantic features in the conversation text. The occurrence frequency of keyword features may differ between session nodes in the conversation text (e.g., stages of the session in progress, such as machine, pre-manual, and after-manual stages). In the conversation text, keyword features are denser at session nodes that contain persistent session content for the target keywords than at session nodes that do not. The persistent session content of a target keyword may contain both a non-interactive interest part and an interactive interest part. The deep learning system can extract text segments from the conversation text based on the occurrence frequency of the keyword features: at session nodes where keyword features occur frequently, text segments are extracted more times and more first session text segments are generated; at session nodes where keyword features occur rarely, text segments are extracted fewer times and fewer first session text segments are generated.
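As a rough sketch of the positive correlation between keyword occurrence frequency and extraction frequency, the following assumes that each session node is a list of tokens, that keyword features can be approximated by membership in a keyword set, and that every node receives at least one extraction so that no part is omitted. The linear scaling and the `max_segments` cap are illustrative assumptions.

```python
def keyword_frequency(node_tokens, keywords):
    """Fraction of tokens in one session node that are keywords."""
    if not node_tokens:
        return 0.0
    return sum(t in keywords for t in node_tokens) / len(node_tokens)

def extraction_counts(nodes, keywords, max_segments=4):
    """Extractions per node: from 1 up to max_segments, growing with keyword frequency."""
    freqs = [keyword_frequency(tokens, keywords) for tokens in nodes]
    peak = max(freqs, default=0.0)
    if peak == 0.0:
        # No keywords anywhere: fall back to one extraction per node.
        return [1] * len(nodes)
    return [1 + round((max_segments - 1) * f / peak) for f in freqs]

nodes = [
    ["hello", "there"],
    ["refund", "order", "refund", "please"],
    ["order", "thanks", "bye", "now"],
]
keywords = {"refund", "order"}
counts = extraction_counts(nodes, keywords)
print(counts)  # [1, 4, 2]
```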
Step S102, feature embedding is carried out on a plurality of first session text segments according to a multi-mode significance focusing network carrying static embedded knowledge points and a multi-mode significance focusing network carrying dynamic embedded knowledge points, dialogue embedding features of the plurality of first session text segments are generated, feature reduction is carried out on the dialogue embedding features of the plurality of first session text segments, and significance text segments of the plurality of first session text segments are generated, wherein each significance text segment carries significance focusing degrees.
In some exemplary design considerations, for any first session text segment, feature embedding is performed on the first session text segment according to a multimodal saliency attention network carrying statically embedded knowledge points. And then, according to the multi-mode significance focusing network carrying the dynamic embedded knowledge points, carrying out feature embedding again on the first session text segment after feature embedding to generate the dialogue embedded feature of the first session text segment. Then, feature reduction is performed on the dialogue embedded features of the first session text segment to generate a salient text segment of the first session text segment. In the feature restoration process, the corresponding significance attention degree at each session node in the first session text segment can be predicted. The level of prominence focus characterizes a level of prominence of persistent session content for the target keyword in the first session text segment. The deep learning system marks the predicted significance attention degree according to the corresponding session nodes to generate the significance text segment.
Step S103, performing interaction interest analysis according to the salient text segments of the first session text segments to generate interaction interest degree of each first session text segment.
In some exemplary design concepts, for any first session text segment, the salient text segment of the first session text segment carries a salient attention degree corresponding to each keyword vector of the first session text segment. The deep learning system can analyze the interactive interest of the first session text segment according to the salient attention degree corresponding to each keyword vector carried in the salient text segment. First session text segments of different categories correspond to different degrees of interactive interest. The interactive interest degree can represent the interest state of the persistent session content of the target keyword.
Step S104, analyzing the interactive interest degree of the conversation text according to the interactive interest degree of each first conversation text segment, and generating interactive interest analysis information of the conversation text.
In some exemplary design ideas, the interactive interest degrees of the plurality of first session text segments are integrated, and the overall interactive interest degree of the session text is analyzed to generate the interactive interest analysis information of the session text. That is, the interactive interest analysis information includes the interactive interest degree of the session text. The interactive interest analysis information can reflect the interest state of the target keywords. The deep learning system can feed the interactive interest analysis information back to the personalized service pushing platform, which can then execute the corresponding personalized pushing service based on the interactive interest analysis information.
Based on the above steps, in the embodiment, text segment extraction is performed on different session node parts of a session text according to different text segment extraction frequencies through different occurrence frequencies of keyword features in the session text, then feature embedding and feature restoration are performed on the session text segments obtained by text segment extraction through a multi-mode significant attention network carrying static embedded knowledge points and a multi-mode significant attention network carrying dynamic embedded knowledge points, significant text segments carrying significant attention degrees of all the session text segments are generated, and then interaction interest degrees of the session text segments are predicted through the significant text segments, so that interaction interest degrees of the session text are analyzed, and interaction interest analysis information of the session text is generated; therefore, under the condition of ensuring that the interactive interest part in the conversation text is not omitted in the text segment extraction process, the text segment extraction of the non-interactive interest part in the conversation text can be reduced, so that the interactive interest part in the conversation text is more focused in the feature embedding and restoring process, the error of subsequent interactive interest analysis is reduced, the analysis accuracy of the remarkable attention degree is improved, the accuracy of the interactive interest degree of the conversation text is further improved, and the method can be convenient for providing reference basis for the subsequent personalized information pushing.
In some exemplary design ideas, the embodiment may implement the above-mentioned interactive interest analysis method through an interactive interest analysis network. The interactive interest analysis network is a neural network model which is trained in advance and accords with network convergence conditions. The deep learning system is capable of updating (training) the network weight parameters of the interactive interest analysis network, and a training embodiment of the interactive interest analysis network is provided below, which includes the following steps.
Step S201, text segment extraction is carried out on a template session text according to keyword features in the template session text, a plurality of first template session text segments are generated, the template session text carries marked interactive interest degrees, the keyword features are determined according to keyword vectors in the template session text, and occurrence frequencies of the keyword features are positively correlated with text segment extraction frequencies.
In some exemplary design considerations, the template session text has certain persistent session content of the template user displayed therein. The template session text carries the annotation interaction interest degree, and the annotation interaction interest degree characterizes the accurate interest state of the continuous session content in the template session text. The principle of text segment extraction of the template session text by the deep learning system is the same as that of text segment extraction of the session text in step S101, and will not be described here again.
Step S202, according to session nodes of a plurality of first template session text segments in a template session text and training label information, determining the significance attention degree marked in advance in each first template session text segment, wherein the training label information represents the significance attention degree of each session node in the template session text.
In some exemplary design considerations, the training label information of the template session text is essentially a matrix. The training label information carries the significance attention degree corresponding to each keyword vector in the template session text. For example, each session node in the training label information carries a training reference value, and different training reference values represent different significance attention degrees. For any first template session text segment, the significance attention degree corresponding to each keyword vector at the corresponding session node can be looked up in the training label information based on the session node of that first template session text segment in the template session text. The significance attention degrees found in this way are then used as the priori-annotated significance attention degree of the first template session text segment. The priori-annotated significance attention degree is the accurate significance attention degree and serves as the reference for the significance attention degree obtained by subsequent analysis.
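The lookup described for step S202 can be pictured as slicing a label matrix. In the sketch below, the layout of the matrix (one row per session node, one column per keyword-vector position) and the `(row, start, end)` description of a segment are assumptions made for illustration; the application only states that the training label information is a matrix of training reference values.

```python
def prior_attention(label_matrix, segment):
    """Return the annotated attention degrees covered by one template segment.

    label_matrix[row][col] is the training reference value at a session node
    position; segment is a hypothetical (row, start, end) triple, end exclusive.
    """
    row, start, end = segment
    return label_matrix[row][start:end]

label_matrix = [
    [0, 0, 1, 2],
    [2, 2, 1, 0],
]
segments = [(0, 1, 4), (1, 0, 2)]
priors = [prior_attention(label_matrix, s) for s in segments]
print(priors)  # [[0, 1, 2], [2, 2]]
```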
Step S203, according to the multi-modal salient attention network carrying the static embedded knowledge points and the multi-modal salient attention network carrying the dynamic embedded knowledge points, feature embedding is carried out on the plurality of first template conversation text segments to generate dialogue embedded features of the plurality of first template conversation text segments, feature reduction is carried out on the dialogue embedded features of the plurality of first template conversation text segments to generate template salient text segments of the plurality of first template conversation text segments, and each template salient text segment carries salient attention degrees.
In some exemplary design ideas, feature embedding and feature restoration are performed on the plurality of first template session text segments according to the same principles as the feature embedding and feature restoration performed on the plurality of first session text segments in step S102, which are not described again here.
Step S204, performing interaction interest analysis according to the template significance text segments of the plurality of first template session text segments to generate interaction interest degree of each first template session text segment.
In some exemplary design ideas, the interaction interest degree of the plurality of first template session text segments is analyzed according to the same principle as the analysis of the interaction interest degree of the plurality of first session text segments in step S103, which is not described again here.
Step S205, analyzing the interactive interest degree of the template session text according to the interactive interest degree of each first template session text segment, and generating interactive interest analysis information of the template session text.
In some exemplary design ideas, the interactive interest level of the template session text is analyzed according to the same principle as that of the analysis of the interactive interest level of the session text in step S104, and will not be described herein.
Step S206, updating the network weight parameters of the interactive interest analysis network according to the significance attention degree carried in each template significance text segment, the significance attention degree marked in the prior in each first template session text segment, the interactive interest degree of the template session text and the marked interactive interest degree.
In some exemplary design ideas, the network weight parameters of the interactive interest analysis network are updated according to the difference between the significance attention degree carried in each template significance text segment and the significance attention degree marked in the corresponding first template session text segment in advance and the difference between the interactive interest degree of the template session text and the marked interactive interest degree, so that the two differences are reduced as much as possible, and the interactive interest analysis precision of the interactive interest analysis network is improved.
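One way to combine the two differences into a single training objective is sketched below. The application only says that both differences should be reduced; the use of mean squared error, the equal default weights, and the function names are assumptions for illustration.

```python
def mean_squared_error(predicted, target):
    """Average squared difference between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)

def combined_loss(pred_attention, prior_attention, pred_interest, labeled_interest,
                  local_weight=1.0, global_weight=1.0):
    """Local term: per-segment attention error. Global term: text-level interest error."""
    per_segment = [mean_squared_error(p, t)
                   for p, t in zip(pred_attention, prior_attention)]
    local = sum(per_segment) / len(per_segment)
    global_term = (pred_interest - labeled_interest) ** 2
    return local_weight * local + global_weight * global_term

loss = combined_loss(
    pred_attention=[[0.0, 1.0], [1.0, 1.0]],
    prior_attention=[[0.0, 1.0], [0.0, 1.0]],
    pred_interest=0.5,
    labeled_interest=1.0,
)
print(loss)  # 0.5
```

In an actual training loop this scalar would be minimized by whatever optimizer updates the network weight parameters; that part is outside the scope of the sketch.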
Based on the above steps, text segments are extracted from different areas of the template session text at different text segment extraction frequencies, according to the different occurrence frequencies of keyword features in the template session text. The priori-annotated significant attention degree in each extracted template session text segment is determined from the training label information. Feature embedding and feature reduction are then performed on the extracted template session text segments through the multi-mode significant attention network carrying static embedded knowledge points and the multi-mode significant attention network carrying dynamic embedded knowledge points, generating template salient text segments that carry the significant attention degree of each template session text segment. The interactive interest degree of each template session text segment is then predicted, and from it the interactive interest degree of the template session text. Finally, the network weight parameters of the interactive interest analysis network are updated according to the significant attention degree obtained by analysis in the template salient text segments, the priori-annotated significant attention degree in the template session text segments, the interactive interest degree of the template session text obtained by analysis, and the annotated interactive interest degree of the template session text. The text segment extraction process reduces extraction from the non-interactive interest part of the template session text while ensuring that the interactive interest part is not missed, so that the feature embedding and reduction process focuses more on the interactive interest part. This reduces the error of subsequent interactive interest analysis, improves the analysis accuracy of the significant attention degree, and allows the interactive interest degree of the template session text segments and of the template session text to be predicted more accurately. Because training updates the network weight parameters from both local information, such as the significant attention degree in the template session text segments, and global information, such as the interactive interest degree of the template session text, the interactive interest analysis accuracy of the interactive interest analysis network is higher.
Another embodiment of the present application is described below, including the following steps.
Step S301, extracting text segments from the conversation text according to the keyword features in the conversation text, and generating a plurality of first conversation text segments, wherein the keyword features are determined according to the keyword vectors in the conversation text, and the occurrence frequency of the keyword features and the extraction frequency of the text segments are positively correlated.
In some exemplary design ideas, this step S301 mainly includes two steps, step S3011-step S3012, as follows.
Step S3011, noise feature vectors are removed from the conversation text by a specified denoising algorithm, generating a conversation text that includes only substantive session content.
The substantive session content is a certain persistent session content of the target keyword.
Step S3012, extracting text segments from the conversation text that includes only the substantive session content, according to the keyword features in the conversation text.
In some exemplary design ideas, semantic characterization vectors of the conversation text can be extracted to determine the occurrence frequency of keyword features, and text segments are then extracted from the conversation text accordingly. Accordingly, this step S3012 includes the following steps. The deep learning system outputs a heat map of the conversation text and generates knowledge thermal units of the conversation text, wherein the knowledge thermal units represent the occurrence frequency of keyword features at different session nodes in the conversation text. Then, the text segment extraction frequency of each session node is determined according to the occurrence frequency of the keyword features at that session node in the knowledge thermal units. Then, text segments are extracted from the conversation text according to the text segment extraction frequency to generate a plurality of first conversation text segments. Because the keyword vectors of the interactive interest part and the non-interactive interest part of the conversation text differ, and the keyword features are determined from the keyword vectors, the keyword features can indicate the interactive interest part of the conversation text. Extracting text segments according to the keyword features, with the occurrence frequency of the keyword features positively correlated with the text segment extraction frequency, therefore reduces extraction from the non-interactive interest part while ensuring that the interactive interest part is not missed, which reduces the error that an excessive non-interactive interest part would introduce into the subsequent interactive interest analysis.
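The sequence "knowledge thermal units, then extraction frequency, then segments" can be sketched with a sliding window whose step shrinks where keyword density is high. Everything concrete here is an assumption: the density is a simple keyword ratio over neighbouring tokens, and the threshold, strides, and segment length are hypothetical parameters. Keeping the sparse stride no larger than the segment length ensures that no token is left out.

```python
def knowledge_heat(tokens, keywords, radius=1):
    """Per-position keyword density over a window of +/- radius tokens."""
    heat = []
    for i in range(len(tokens)):
        window = tokens[max(0, i - radius): i + radius + 1]
        heat.append(sum(t in keywords for t in window) / len(window))
    return heat

def extract_segments(tokens, heat, length=4, dense_stride=1,
                     sparse_stride=4, threshold=0.3):
    """Slide a fixed-length window; step by dense_stride where heat is high,
    otherwise by sparse_stride (which must not exceed length)."""
    segments, start = [], 0
    while start < len(tokens):
        segments.append((start, tokens[start:start + length]))
        start += dense_stride if heat[start] >= threshold else sparse_stride
    return segments

tokens = ["hi", "there", "ok", "so", "refund", "order",
          "refund", "now", "bye", "thanks", "all", "done"]
heat = knowledge_heat(tokens, {"refund", "order"})
segments = extract_segments(tokens, heat)
starts = [s for s, _ in segments]
covered = {s + k for s, seg in segments for k in range(len(seg))}
print(starts)  # [0, 4, 5, 6, 7, 8]
```

Five of the six segments start inside the keyword-dense region, while the sparse opening and closing parts are each sampled once.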
Step S302, for any first session text segment, splitting text words of the first session text segment to generate a plurality of second session text segments.
Step S303, semantic representation vectors of a plurality of second conversation text segments are obtained, wherein the semantic representation vectors of the plurality of second conversation text segments are generated by vector fusion of conversation text word vectors of the plurality of second conversation text segments and conversation node vectors of the plurality of second conversation text segments.
In some exemplary design considerations, for any second segment of conversation text, a conversation text word vector and a conversation node vector for the second segment of conversation text are extracted. The conversation text word vector characterizes content in the second conversation text segment. The session node vector characterizes session nodes of the second session text segment in the first session text segment. And then, fusing the conversation text word vector and the conversation node vector of the second conversation text segment to generate a semantic representation vector of the second conversation text segment.
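A minimal sketch of the vector fusion follows. The application does not say how the two vectors are fused; element-wise addition is assumed here (concatenation would be another option), and the `node_table` lookup of node vectors is hypothetical.

```python
def fuse(word_vector, node_vector):
    """Fuse content and position information by element-wise addition (assumed)."""
    if len(word_vector) != len(node_vector):
        raise ValueError("vectors must have the same dimension")
    return [w + n for w, n in zip(word_vector, node_vector)]

# Hypothetical node vectors, indexed by the position of the second
# session text segment inside its first session text segment.
node_table = {0: [0.0, 0.125], 1: [0.25, 0.0]}

word_vector = [0.5, -1.0]   # content of the second session text segment
fused = fuse(word_vector, node_table[1])
print(fused)  # [0.75, -1.0]
```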
Step S304, feature embedding is carried out on semantic characterization vectors of a plurality of second session text segments according to the multi-modal significance focusing network carrying the static embedded knowledge points and the multi-modal significance focusing network carrying the dynamic embedded knowledge points, so as to generate dialogue embedded features of the first session text segments.
In some exemplary design ideas, feature embedding is performed on semantic characterization vectors of a plurality of second session text segments according to a multi-modal significance interest network carrying static embedded knowledge points. And then, according to the multi-mode significance focusing network carrying the dynamic embedded knowledge points, carrying out feature embedding again on the semantic characterization vectors of the plurality of second conversation text segments after feature embedding to generate conversation embedded features of the first conversation text segments.
In some exemplary design considerations, this step S304 includes the following steps. The deep learning system performs feature embedding on the semantic characterization vectors of the plurality of second session text segments according to a multi-modal saliency attention network carrying static embedded knowledge points (such as a multi-head self-attention mechanism with a fixed window), a rule spoken language conversion network, and a feedforward artificial neural network (such as a multi-layer perceptron), so as to generate proposed dialogue embedded features of the plurality of second session text segments. Then, according to a multi-modal significance focusing network carrying dynamic embedded knowledge points (such as a multi-head self-attention mechanism with a shifted window), a rule spoken language conversion network, and a feedforward artificial neural network, feature embedding is performed on the proposed dialogue embedded features of the plurality of second session text segments to generate the dialogue embedded features of the first session text segment. The activation function in the feedforward artificial neural network is a GELU function. Because feature embedding is performed on the first session text segments through both the multi-modal significance focusing network carrying static embedded knowledge points and the multi-modal significance focusing network carrying dynamic embedded knowledge points, more attention is paid to the interactive interest part of the conversation text, and the dialogue embedded features of the first session text segments can represent the interactive interest part more accurately.
Specifically, rule spoken language conversion is performed on the semantic characterization vectors of the plurality of second session text segments according to a first rule spoken language conversion network. Then, according to the multi-modal significance focusing network carrying static embedded knowledge points, significance feature analysis is performed on the converted semantic characterization vectors to generate a first proposed feature. A second proposed feature is then determined from the semantic characterization vectors of the plurality of second session text segments and the first proposed feature. Next, rule spoken language conversion is performed on the second proposed feature according to a second rule spoken language conversion network, and the converted second proposed feature is processed by the feedforward artificial neural network to generate the proposed dialogue embedded features of the plurality of second session text segments. The first rule spoken language conversion network and the network layer of the multi-modal significance focusing network carrying static embedded knowledge points form a residual network; that is, the semantic characterization vectors of the plurality of second session text segments and the first proposed feature are added to generate the second proposed feature. The second rule spoken language conversion network and the feedforward artificial neural network also form a residual network; that is, the second proposed feature and the output of the feedforward artificial neural network are added to generate the proposed dialogue embedded features of the plurality of second session text segments.
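The two residual stages described above can be sketched in plain Python. Several points are assumptions rather than statements of the application: the "rule spoken language conversion network" is treated as a normalization layer without learned parameters, the attention is single-head and has no learned projections, the cyclic shift is applied without the masking a full shifted-window scheme would use, and the weight matrices `w1` and `w2` are arbitrary illustrative values. The fixed-window and shifted-window readings come from the parenthetical examples in the text.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalise one vector to zero mean and unit variance (no learned scale)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def gelu(v):
    """Exact GELU: v times the standard normal CDF at v."""
    return 0.5 * v * (1.0 + math.erf(v / math.sqrt(2.0)))

def window_attention(rows, window, shift=0):
    """Weight-free self-attention inside windows of `window` rows.
    shift=0 uses fixed windows; shift>0 cyclically shifts the rows first."""
    n = len(rows)
    rolled = rows[shift:] + rows[:shift]
    out = []
    for start in range(0, n, window):
        block = rolled[start:start + window]
        for q in block:
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))
                      for k in block]
            top = max(scores)
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            out.append([sum(w * k[d] for w, k in zip(weights, block)) / total
                        for d in range(len(q))])
    return out[n - shift:] + out[:n - shift]  # undo the cyclic shift

def feed_forward(x, w1, w2):
    """Two-layer perceptron with a GELU activation in between."""
    hidden = [gelu(sum(w * v for w, v in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def embedding_block(rows, w1, w2, window, shift=0):
    normed = [layer_norm(r) for r in rows]
    first = window_attention(normed, window, shift)          # first proposed feature
    second = [[a + b for a, b in zip(r, f)]                   # residual add
              for r, f in zip(rows, first)]
    ff = [feed_forward(layer_norm(s), w1, w2) for s in second]
    return [[a + b for a, b in zip(s, f)]                     # residual add
            for s, f in zip(second, ff)]

rows = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.0], [0.0, 1.0, 2.0], [2.0, -1.0, 0.0]]
w1 = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.2]]        # 3 -> 2 (illustrative)
w2 = [[0.5, 0.0], [0.0, 0.5], [0.1, 0.1]]       # 2 -> 3 (illustrative)
proposed = embedding_block(rows, w1, w2, window=2, shift=0)      # fixed window
embedded = embedding_block(proposed, w1, w2, window=2, shift=1)  # shifted window
```

With `shift=1` the window boundaries fall between different rows than in the first pass, so information can flow across the boundaries of the fixed windows.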
Step S305, feature reduction is performed on the dialogue embedded features of the plurality of first session text segments to generate salient text segments of the plurality of first session text segments, wherein each salient text segment carries a salient attention degree.
In some exemplary design ideas, for any first session text segment, feature reduction is performed on dialogue embedded features of the first session text segment to analyze a significant attention degree corresponding to each keyword vector in the first session text segment, and a significant text segment of the first session text segment is generated based on the significant attention degree corresponding to each keyword vector.
In some exemplary design considerations, different concentrations can be employed to display different degrees of significant attention. Accordingly, this step S305 includes the following steps. For any first session text segment, feature reduction is performed on the dialogue embedded features of the first session text segment to generate the salient attention degrees corresponding to the plurality of keyword vectors in the first session text segment. Then, the salient text segment of the first session text segment is generated according to those salient attention degrees, wherein the concentration of each keyword vector in the salient text segment characterizes the corresponding salient attention degree. Because different degrees of significant attention are displayed through different concentrations, the degree of significant attention of each session node in the first session text segment can be determined intuitively.
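As a toy illustration of displaying attention degrees through different concentrations, the sketch below prefixes each token with a shade character. The palette, the assumption that degrees lie in [0, 1], and the token-level granularity are all hypothetical.

```python
SHADES = " .:#"  # hypothetical palette, lightest to densest

def shade_index(degree):
    """Map an attention degree in [0, 1] to a palette index, clamping out-of-range input."""
    scaled = int(degree * len(SHADES))
    return max(0, min(len(SHADES) - 1, scaled))

def render(tokens, degrees):
    """Prefix each token with the shade that encodes its attention degree."""
    return [SHADES[shade_index(d)] + t for t, d in zip(tokens, degrees)]

rendered = render(["please", "refund", "my", "order"], [0.0, 1.0, 0.3, 0.6])
print(rendered)  # [' please', '#refund', '.my', ':order']
```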
Step S306, an interactive interest analysis is performed according to the salient text segments of the plurality of first session text segments, so as to generate an interactive interest degree of each first session text segment.
In some exemplary design ideas, each salient text segment carries a salient attention degree corresponding to each keyword vector in the corresponding first session text segment. The deep learning system is capable of predicting a degree of interactive interest of the corresponding first session text segment based on each salient text segment. The process of predicting the interactive interest degree of the first session text segments by the deep learning system is equivalent to performing interactive interest analysis on the first session text segments.
In some exemplary design considerations, the degree of interactive interest of the first session text segment can be predicted based on the amount of focus of the keyword vector corresponding to each salient degree of interest in the salient text segment. Accordingly, this step S306 includes the following steps. And for the salient text segment of any first session text segment, acquiring the focusing quantity of the keyword vector corresponding to each salient attention degree in the salient text segment of the first session text segment. And then, determining the interaction interest degree of the first session text segment according to the focusing amount of the keyword vector corresponding to each salient attention degree. Therefore, the interaction interest degree of the first session text segment is related to the significance attention degree of each session node in the first session text segment, and the interaction interest degree of the first session text segment is predicted through the focusing amount of the keyword vector corresponding to each significance attention degree in the significance text segment, so that the predicted interaction interest degree of the first session text segment is more accurate.
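A small sketch of deriving a segment's interactive interest degree from the focusing amounts: it assumes that attention degrees are discrete numeric levels, that the "focusing amount" is the number of keyword vectors at each level, and that the interest degree is the count-weighted mean level. None of these choices is fixed by the application.

```python
from collections import Counter

def segment_interest(attention_degrees):
    """Count keyword vectors per attention level, then take the count-weighted mean."""
    if not attention_degrees:
        raise ValueError("segment has no keyword vectors")
    counts = Counter(attention_degrees)       # focusing amount per level
    total = sum(counts.values())
    return sum(level * n for level, n in counts.items()) / total

print(segment_interest([0, 2, 2, 2]))  # 1.5
```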
In some exemplary design considerations, errors in the salient attention degree predicted by the deep learning system may affect the prediction of the interactive interest degree of the first session text segment. The deep learning system can therefore update the salient attention degrees obtained by analysis, adjusting those that contain errors, before analyzing the interactive interest degree of the first session text segment. Accordingly, step S306 includes the following steps. For the salient text segment of any first session text segment, the salient attention degrees corresponding to the plurality of keyword vectors in the salient text segment are updated according to the session node relation among those keyword vectors. The interaction interest degree of the first session text segment is then generated according to the focusing amount of the keyword vectors corresponding to each salient attention degree in the updated salient text segment.
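The description does not specify how the session node relation is used to update the salient attention degrees. As one illustrative possibility only, the sketch below replaces each degree with the median of its neighbourhood of adjacent session nodes, which removes isolated outliers. The function name `smooth_attention`, the integer encoding of degrees, and the window size are all assumptions.

```python
from statistics import median_low


def smooth_attention(levels, window=3):
    """Replace each attention degree by the median of its neighbourhood.

    levels: integer-coded salient attention degrees, ordered by session node.
    window: odd neighbourhood size (assumed; not given in the description).
    """
    half = window // 2
    smoothed = []
    for i in range(len(levels)):
        lo = max(0, i - half)
        hi = min(len(levels), i + half + 1)
        # median_low always returns an element of the data, so degrees stay integers.
        smoothed.append(median_low(levels[lo:hi]))
    return smoothed


updated = smooth_attention([2, 2, 0, 2, 2])
```

The updated degrees can then be passed to whatever count-based interest estimate is used for the segment.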
Step S307, analyzing the interactive interest degree of the conversation text according to the interactive interest degree of each first conversation text segment, and generating interactive interest analysis information of the conversation text.
In some exemplary design considerations, the interactive interest level of the conversation text is related to the predicted interactive interest level of each first conversation text segment in the conversation text. The deep learning system predicts the interactive interest level of the conversation text based on the interactive interest levels of the plurality of first conversation text segments.
In some exemplary design considerations, the degree of interactive interest of the conversation text can be predicted based on the amount of focus of the first conversation text segment corresponding to each degree of interactive interest. Accordingly, this step S307 includes the following steps. The deep learning system determines the weight corresponding to each interaction interest degree according to the focusing quantity of the first session text segment corresponding to each interaction interest degree. And then, analyzing the interactive interest degree of the session text according to the weight corresponding to each interactive interest degree, and generating the interactive interest degree of the session text. Therefore, the interactive interest degree of the conversation text is determined through the weights corresponding to the first conversation text segments corresponding to each interactive interest degree in the plurality of first conversation text segments, so that the interactive interest analysis information can accurately represent the interactive interest degree of the conversation text, and a reference basis can be provided for the follow-up personalized information pushing.
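A minimal sketch of this weighting, assuming the interest degrees are numeric and that the weight of a degree is the fraction of first session text segments that carry it. The function names `interest_weights` and `conversation_interest` are hypothetical, and the weighted sum is an assumed aggregation rule.

```python
from collections import Counter


def interest_weights(segment_degrees):
    """Weight of each interest degree = share of segments having that degree."""
    if not segment_degrees:
        raise ValueError("at least one segment is required")
    counts = Counter(segment_degrees)
    total = len(segment_degrees)
    return {degree: n / total for degree, n in counts.items()}


def conversation_interest(segment_degrees):
    """Aggregate segment-level degrees into one conversation-level degree."""
    weights = interest_weights(segment_degrees)
    return sum(degree * weight for degree, weight in weights.items())


weights = interest_weights([2, 2, 1, 0])
score = conversation_interest([2, 2, 1, 0])
```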
In some exemplary design ideas, the interactive interest level of the conversation text can also be predicted based on the focusing amount of the keyword vector corresponding to each significant attention level in the conversation text. Accordingly, this step S307 includes the following steps. The deep learning system acquires the focusing quantity of the keyword vector corresponding to each salient attention degree in the salient text segment of the conversation text. And then, determining the interactive interest degree of the conversation text according to the focusing amount of the keyword vector corresponding to each salient attention degree. Therefore, the interactive interest degree of the session text is related to the remarkable attention degree of each session node in the session text, and the interactive interest degree of the session text is predicted through the focusing amount of the keyword vector corresponding to each remarkable attention degree in the remarkable text segment, so that the predicted interactive interest degree of the session text is more accurate, and a reference basis can be provided for subsequent personalized information pushing conveniently.
Step S308, according to the salient text segments of the plurality of first conversation text segments, salient text segments of the conversation text are generated.
In some exemplary design ideas, while predicting the interactive interest degree of the conversation text according to the salient text segments of the plurality of first conversation text segments, the deep learning system can also generate the salient text segments of the conversation text based on those salient text segments. That is, step S308 may be executed at the same time as steps S306 to S307. Specifically, the salient text segments of the plurality of first conversation text segments are spliced to generate the salient text segments of the conversation text, or the updated salient text segments of the plurality of first conversation text segments are spliced to generate the salient text segments of the conversation text. The salient text segment of the conversation text characterizes the text segments of salient conversation nodes in the conversation text.
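The splicing in step S308 can be sketched as an ordered concatenation. The record layout used here (a dict with `start`, `tokens` and `attention` keys) is an assumption for illustration; the description does not define a data structure.

```python
def splice_salient_segments(segments):
    """Concatenate per-segment results in session-node order.

    segments: list of dicts with hypothetical keys
        "start"     - position of the segment in the conversation text
        "tokens"    - keywords of the salient text segment
        "attention" - salient attention degree for each token
    """
    ordered = sorted(segments, key=lambda seg: seg["start"])
    tokens, attention = [], []
    for seg in ordered:
        tokens.extend(seg["tokens"])
        attention.extend(seg["attention"])
    return tokens, attention


tokens, attention = splice_salient_segments([
    {"start": 3, "tokens": ["c", "d"], "attention": [2, 0]},
    {"start": 0, "tokens": ["a", "b"], "attention": [1, 1]},
])
```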
Therefore, according to this embodiment, text segments are extracted from different session node parts of the session text at different text segment extraction frequencies, according to the different occurrence frequencies of keyword features in the session text. Feature embedding and feature reduction are then performed on the extracted session text segments through the multi-modal salient attention network carrying static embedded knowledge points and the multi-modal salient attention network carrying dynamic embedded knowledge points, generating salient text segments that carry the salient attention degrees of the session text segments. The interaction interest degrees of the session text segments are then predicted from the salient text segments, so that the interaction interest degree of the session text is analyzed and the interaction interest analysis information of the session text is generated. In this way, while ensuring that the interactive interest parts of the session text are not omitted during text segment extraction, extraction of the non-interactive-interest parts can be reduced, so that the feature embedding and reduction process focuses more on the interactive interest parts. This reduces the error of the subsequent interactive interest analysis, improves the analysis accuracy of the salient attention degree, further improves the accuracy of the interactive interest degree of the session text, and provides a reference basis for subsequent personalized information pushing.
An embodiment of training an interactive interest analysis network according to another embodiment of the present application is described below, including the following steps.
Step S401, extracting text segments from a template session text according to keyword features in the template session text, generating a plurality of first template session text segments, wherein the template session text carries marked interactive interest degrees, the keyword features are determined according to keyword vectors in the template session text, and the occurrence frequency of the keyword features is positively correlated with the extraction frequency of the text segments.
In some exemplary design ideas, text segment extraction is performed on the template session text according to the same principle as that of text segment extraction performed on the session text in step S301, so as to generate a plurality of first template session text segments, which are not described herein again.
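The positive correlation between keyword occurrence frequency and extraction frequency can be illustrated with a sliding window whose stride shrinks where keywords are dense. This is a sketch under assumptions: the per-node frequency list, the function name `extraction_starts`, and the stride rule are all hypothetical and not taken from the description.

```python
def extraction_starts(keyword_freq, segment_len, min_stride=1):
    """Choose segment start positions, sampling dense regions more often.

    keyword_freq: keyword-feature occurrence frequency at each session node.
    segment_len: number of session nodes per extracted segment.
    """
    starts = []
    n = len(keyword_freq)
    peak = max(keyword_freq) if keyword_freq else 0
    pos = 0
    while pos + segment_len <= n:
        starts.append(pos)
        window = keyword_freq[pos:pos + segment_len]
        # Density in [0, 1]: mean frequency in the window relative to the peak.
        density = (sum(window) / len(window)) / peak if peak else 0.0
        # Higher density gives a smaller stride, hence more extracted segments.
        stride = max(min_stride, round(segment_len * (1 - density)))
        pos += stride
    return starts


starts = extraction_starts([0, 0, 0, 0, 4, 4, 4, 4], segment_len=2)
```

In this example the keyword-free first half yields two segments while the keyword-dense second half yields three overlapping ones.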
Step S402, according to session nodes of a plurality of first template session text segments in a template session text and training label information, determining the prior marked significance attention degree in each first template session text segment, wherein the training label information characterizes the significance attention degree of each session node in the template session text.
In some exemplary design considerations, the training label information of the template session text has the same size as the template session text and corresponds one-to-one with the session nodes in the template session text. The training label information records the a priori marked salient attention degree corresponding to each keyword vector in the template session text. Based on the session nodes that a first template session text segment occupies in the template session text, the deep learning system can look up, at the corresponding session nodes in the training label information, the salient attention degree corresponding to each keyword vector, thereby obtaining the a priori marked salient attention degree in the first template session text segment.
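Because the training label information is aligned node-by-node with the template session text, the lookup in step S402 reduces to taking the same span from the label sequence. The sketch below assumes the labels are stored as a flat list indexed by session node; the function name `prior_attention_for_segment` is hypothetical.

```python
def prior_attention_for_segment(label_info, start, length):
    """Return the a priori marked attention degrees for one template segment.

    label_info: per-node attention labels, same length as the template text.
    start, length: span of session nodes covered by the segment.
    """
    if start < 0 or length < 0 or start + length > len(label_info):
        raise ValueError("segment span lies outside the template session text")
    return label_info[start:start + length]


labels = [0, 0, 1, 2, 2, 0]
segment_labels = prior_attention_for_segment(labels, start=2, length=3)
```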
Step S403, according to the multi-modal salient attention network carrying the static embedded knowledge points and the multi-modal salient attention network carrying the dynamic embedded knowledge points, feature embedding is carried out on the plurality of first template conversation text segments to generate dialogue embedded features of the plurality of first template conversation text segments, feature reduction is carried out on the dialogue embedded features of the plurality of first template conversation text segments to generate template salient text segments of the plurality of first template conversation text segments, and each template salient text segment carries salient attention degrees.
In some exemplary design ideas, feature embedding and feature reduction are performed on the plurality of first template session text segments according to the same principle as the generation of the salient text segments of the plurality of first session text segments in steps S302 to S305, so as to generate the template salient text segments of the plurality of first template session text segments, which is not described again herein.
Step S404, performing interaction interest analysis according to the template significance text segments of the plurality of first template session text segments to generate interaction interest degree of each first template session text segment.
In some exemplary design ideas, the interaction interest level of the plurality of first template session text segments is analyzed according to the same principle as the analysis of the interaction interest level of the plurality of first session text segments in step S306, which is not described again herein.
Step S405, analyzing the interactive interest degree of the template session text according to the interactive interest degree of each first template session text segment, and generating interactive interest analysis information of the template session text.
In some exemplary design ideas, the interactive interest level of the template session text is analyzed according to the same principle as the analysis of the interactive interest level of the session text in step S307, which is not described again herein.
Step S406, determining a first training cost value according to the significance attention degree carried in each template significance text segment and the significance attention degree marked in the corresponding first template session text segment in advance.
In some exemplary design ideas, the first training cost value can be determined based on a difference between a salient attention degree carried in each template salient text segment and a salient attention degree marked a priori in a corresponding first template session text segment.
In some exemplary design ideas, a third training cost value is determined according to the salient attention degree carried in each template salient text segment and the a priori marked salient attention degree in the corresponding first template session text segment, wherein the third training cost value is a cross entropy training cost value. A fourth training cost value is then determined from the same two quantities, wherein the fourth training cost value is a Dice training cost value. The third training cost value and the fourth training cost value are then fused by weight to generate the first training cost value. The third training cost value characterizes the difference between the predicted salient attention degree corresponding to each keyword vector and the a priori marked salient attention degree. The fourth training cost value is associated with the session node portion and characterizes the difference between the predicted salient attention degree corresponding to the keyword vectors within one session node portion and the a priori marked salient attention degree.
Therefore, the predicted salient attention degree corresponding to each keyword vector depends not only on that keyword vector's own value but also on the values of the other keyword vectors. By calculating both the cross entropy training cost value and the Dice training cost value between the salient attention degree carried in the template salient text segment and the a priori marked salient attention degree in the corresponding first template session text segment, the training cost is evaluated both from the perspective of individual keyword vectors and from the perspective of session node portions. The calculated first training cost value therefore reflects the performance of the interaction interest analysis network more accurately, which facilitates training an interaction interest analysis network with better performance.
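The weighted fusion of a cross entropy cost and a Dice cost can be sketched as follows for the binary case (salient versus not salient). The function names, the equal default weights and the smoothing constant `eps` are assumptions; the description does not give the fusion weights or state whether the labels are binary.

```python
import math


def cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross entropy, evaluated per keyword vector."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)


def dice_loss(pred, target, eps=1e-7):
    """Dice cost over the whole region: 1 - 2|P.T| / (|P| + |T|)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1 - (2 * intersection + eps) / (denominator + eps)


def first_training_cost(pred, target, ce_weight=0.5, dice_weight=0.5):
    """Weighted fusion of the per-element and per-region costs."""
    return ce_weight * cross_entropy(pred, target) + dice_weight * dice_loss(pred, target)


perfect = first_training_cost([1.0, 0.0, 1.0], [1, 0, 1])
wrong = first_training_cost([0.0, 1.0, 0.0], [1, 0, 1])
```

The cross entropy term penalizes each keyword vector independently, while the Dice term measures overlap over the region as a whole, which matches the two perspectives described above.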
Step S407, determining a second training cost value according to the interactive interest degree of the template conversation text and the labeled interactive interest degree, wherein the second training cost value is the cross entropy training cost value.
In some exemplary design considerations, a second training cost value for the interactive interest analysis network is determined based on a distinction between the interactive interest level of the template session text and the annotated interactive interest level.
Step S408, updating the network weight parameters of the interactive interest analysis network according to the first training cost value and the second training cost value.
In some exemplary design ideas, the value convergence (no longer decline) of the first training cost value and the second training cost value is taken as a training direction, and the network weight parameters of the interactive interest analysis network are updated.
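One common way to decide that a cost value "no longer declines" is a patience-based check on the recorded cost history. The sketch below is an assumption about how such a stopping rule could look; `has_converged`, `patience` and `min_delta` are hypothetical names and are not taken from the description.

```python
def has_converged(history, patience=3, min_delta=1e-4):
    """Return True when the cost has not improved over the last `patience` steps.

    history: combined cost (for example first + second training cost) per step.
    """
    if len(history) <= patience:
        return False
    best_before = min(history[:-patience])
    recent_best = min(history[-patience:])
    # Converged if the recent steps did not beat the earlier best by min_delta.
    return recent_best > best_before - min_delta


flat = has_converged([1.0, 0.5, 0.3, 0.3, 0.3, 0.3])
falling = has_converged([1.0, 0.8, 0.6, 0.4, 0.2, 0.1])
```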
Based on the above steps, text segments are extracted from different areas of the template session text at different text segment extraction frequencies, according to the different occurrence frequencies of keyword features in the template session text, and the a priori marked salient attention degree in each extracted template session text segment is determined from the training label information. Feature embedding and feature reduction are performed on the extracted template session text segments through the multi-modal salient attention network carrying static embedded knowledge points and the multi-modal salient attention network carrying dynamic embedded knowledge points, generating template salient text segments that carry the salient attention degrees of the template session text segments. The interactive interest degree of each template session text segment is then predicted, and from it the interactive interest degree of the template session text. The network weight parameters of the interactive interest analysis network are then updated according to the salient attention degree obtained by analysis in the template salient text segments, the a priori marked salient attention degree in the template session text segments, the interactive interest degree of the template session text obtained by analysis, and the marked interactive interest degree of the template session text. The text segment extraction process reduces extraction of the non-interactive-interest parts of the template session text while ensuring that the interactive interest parts are not missed, so that the feature embedding and reduction process focuses more on the interactive interest parts. This reduces the error of the subsequent interactive interest analysis and improves the analysis accuracy of the salient attention degree, so that the interactive interest degree of the template session text segments and of the template session text can be predicted more accurately. Because the network weight parameters are updated during training from both local information, such as the salient attention degree in the template session text segments, and global information, such as the interactive interest degree of the template session text, the interactive interest analysis accuracy of the interactive interest analysis network is higher.
In some exemplary design considerations, for step S300, this may be achieved by the following steps.
Step S310, loading forward behavior path data of the target user into a behavior preference decision model meeting model convergence conditions, and acquiring behavior preference data of the target user generated by the behavior preference decision model, wherein the behavior preference decision model is generated by updating model weight parameters by adopting model learning sample data, the model learning sample data comprises template behavior path data and template behavior path derivative data, the template behavior path data is actually collected forward behavior path data of the template user, and the template behavior path derivative data is sample data generated by performing data derivative expansion on the template behavior path data;
step S320, extracting target personalized service content data corresponding to the behavior preference field in the behavior preference data from the personalized service content database corresponding to the target interactive interest point, and pushing the target personalized service content data to the target user.
The behavior preference decision model comprises an encoding unit and a decoding unit, and is generated through training of the following steps:
Selecting part of the template behavior path data from the template behavior path data to form a target model learning sequence number group, and executing the following steps for the target template behavior path data in the target model learning sequence number group:
(1) Loading the target template behavior path data to a behavior path derivative network to generate target template behavior path derivative data;
(2) Encoding the target template behavior path data and the target template behavior path derivative data respectively by the encoding unit, to generate a first encoding vector set and a second encoding vector set;
(3) Loading the first encoding vector set and the second encoding vector set to the decoding unit respectively, and obtaining first behavior preference decision data and second behavior preference decision data generated by the decoding unit;
(4) Obtaining a sequence number group learning cost value based on the first coding vector set, the second coding vector set, the first behavior preference decision data, the second behavior preference decision data and the priori behavior preference data corresponding to each template behavior path data in the target model learning sequence number group;
(5) And updating the weight parameter information of the behavior path derivative network, the coding unit and the decoding unit based on the serial number group learning cost value, and re-executing the step of selecting part of the template behavior path data from the template behavior path data to form a target model learning serial number group until the weight parameter information meets a model convergence condition.
Obtaining the sequence number group learning cost value based on the first coding vector set, the second coding vector set, the first behavior preference decision data, the second behavior preference decision data and the prior behavior preference data corresponding to each template behavior path data in the target model learning sequence number group includes:
obtaining a first learning cost value based on the prior behavior preference data corresponding to the target template behavior path data and the first behavior preference decision data;
loading the first coding vector set and the second coding vector set into a classification neural network respectively, obtaining classification discrimination data generated by the classification neural network, and obtaining a second learning cost value based on the classification discrimination data;
obtaining a template learning cost value corresponding to the target template behavior path data based on the first learning cost value and the second learning cost value;
acquiring a decision thermodynamic diagram of the first behavior preference decision data corresponding to each template behavior path data in the target model learning sequence number group as a first decision thermodynamic diagram, and acquiring a decision thermodynamic diagram of the second behavior preference decision data corresponding to each template behavior path data in the target model learning sequence number group as a second decision thermodynamic diagram;
obtaining a first unit learning cost value based on the first decision thermodynamic diagram and the second decision thermodynamic diagram;
summing the template learning cost values respectively corresponding to each template behavior path data in the target model learning sequence number group to generate a second unit learning cost value;
and obtaining the sequence number group learning cost value based on the first unit learning cost value and the second unit learning cost value.
Updating the weight parameter information of the behavior path derivative network, the encoding unit and the decoding unit based on the sequence number group learning cost value includes: updating the weight parameter information of the behavior path derivative network, the encoding unit, the decoding unit and the classification neural network based on the sequence number group learning cost value.
Encoding the target template behavior path data and the target template behavior path derivative data respectively by the encoding unit, to generate the first encoding vector set and the second encoding vector set, includes: performing noise feature cleaning on the target template behavior path data and the target template behavior path derivative data respectively, to generate first noise feature cleaning data and second noise feature cleaning data; and loading the first noise feature cleaning data and the second noise feature cleaning data to the encoding unit respectively, to acquire the first encoding vector set and the second encoding vector set generated by the encoding unit.
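The way the group-level learning cost is assembled from its parts can be sketched as follows. The description states only that the first unit learning cost value is obtained from the two decision thermodynamic diagrams (heat maps) and that the second is a sum of template learning cost values; the mean squared difference used here, the additive combination, the parameter `consistency_weight` and the function name are all assumptions for illustration.

```python
def group_learning_cost(template_costs, first_map, second_map, consistency_weight=1.0):
    """Combine per-sample costs with a consistency cost between two heat maps.

    template_costs: template learning cost value of each sample in the group.
    first_map, second_map: decision heat maps (lists of rows) computed from the
        original data and from the derived data respectively.
    """
    flat_first = [value for row in first_map for value in row]
    flat_second = [value for row in second_map for value in row]
    if len(flat_first) != len(flat_second) or not flat_first:
        raise ValueError("heat maps must be non-empty and of equal size")
    # First unit cost: how far apart the two heat maps are (assumed measure).
    first_unit = sum((a - b) ** 2 for a, b in zip(flat_first, flat_second)) / len(flat_first)
    # Second unit cost: sum of the template learning cost values.
    second_unit = sum(template_costs)
    return consistency_weight * first_unit + second_unit


same = group_learning_cost([0.5, 0.25], [[1.0, 0.0]], [[1.0, 0.0]])
different = group_learning_cost([0.5, 0.25], [[1.0, 0.0]], [[0.0, 1.0]])
```

Under this reading, the first unit cost encourages the model to make consistent decisions on original and derived behavior path data, while the second carries the per-sample supervised and discrimination costs.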
Fig. 2 schematically illustrates a deep learning system 100 that may be used to implement various embodiments described herein.
For one embodiment, FIG. 2 shows a deep learning system 100, the deep learning system 100 having a plurality of processors 102, a control module (chipset) 104 coupled to at least one of the processor(s) 102, a memory 106 coupled to the control module 104, a non-volatile memory (NVM)/storage device 108 coupled to the control module 104, a plurality of input/output devices 110 coupled to the control module 104, and a network interface 112 coupled to the control module 104.
Processor 102 may include a plurality of single-core or multi-core processors, and processor 102 may include any combination of general-purpose or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some alternative implementations, the deep learning system 100 can function as a server device such as a gateway as described in the embodiments herein.
In some alternative embodiments, the deep learning system 100 may include a plurality of computer-readable media (e.g., memory 106 or NVM/storage 108) having instructions 114 and a plurality of processors 102 combined with the plurality of computer-readable media configured to execute the instructions 114 to implement the modules to perform the actions described in this disclosure.
For one embodiment, the control module 104 may include any suitable interface controller to provide any suitable interface to one or more of the processor(s) 102 and/or any suitable device or component in communication with the control module 104.
The control module 104 may include a memory controller module to provide an interface to the memory 106. The memory controller modules may be hardware modules, software modules, and/or firmware modules.
The memory 106 may be used, for example, to load and store data and/or instructions 114 for the deep learning system 100. For one embodiment, memory 106 may comprise any suitable volatile memory, such as, for example, a suitable DRAM. In some alternative embodiments, memory 106 may comprise double data rate type four synchronous dynamic random access memory (DDR 4 SDRAM).
For one embodiment, the control module 104 may include a plurality of input/output controllers to provide interfaces to the NVM/storage 108 and the input/output device(s) 110.
For example, NVM/storage 108 may be used to store data and/or instructions 114. NVM/storage 108 may include any suitable nonvolatile memory (e.g., flash memory) and/or may include any suitable nonvolatile storage device(s) (e.g., a plurality of Hard Disk Drives (HDDs), a plurality of Compact Disc (CD) drives, and/or a plurality of Digital Versatile Disc (DVD) drives).
NVM/storage 108 may include storage resources that are physically part of the device on which deep learning system 100 is installed, or which may be accessible by the device, but may not be necessary as part of the device. For example, NVM/storage 108 may be accessed via input/output device(s) 110 in connection with a network.
The input/output device(s) 110 may provide an interface for the deep learning system 100 to communicate with any other suitable device, and the input/output device 110 may include a communication component, an audio component, a sensor component, and the like. The network interface 112 may provide an interface for the deep learning system 100 to communicate over a plurality of networks, and the deep learning system 100 may communicate wirelessly with a plurality of components of a wireless network in accordance with any of a plurality of wireless network standards and/or protocols, such as accessing a wireless network based on a communication standard such as WiFi, 2G, 3G, 4G, 5G, or a combination thereof.
For one embodiment, at least one of the processor(s) 102 may be packaged together with logic of a plurality of controllers (e.g., memory controller modules) of the control module 104. For one embodiment, at least one of the processor(s) 102 may be packaged together with logic of a plurality of controllers of the control module 104 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 102 may be integrated on the same die with logic of multiple controllers of the control module 104. For one embodiment, at least one of the processor(s) 102 may be integrated on the same die with logic of multiple controllers of the control module 104 to form a system-on-chip (SoC).
In various embodiments, the deep learning system 100 may be, but is not limited to being: a server, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.), among other terminal devices. In various embodiments, the deep learning system 100 may have more or fewer components and/or different architectures. For example, in some alternative embodiments, the deep learning system 100 includes a plurality of cameras, a keyboard, a liquid crystal display (LCD) screen (including a touch screen display), a non-volatile memory port, a plurality of antennas, a graphics chip, an application specific integrated circuit (ASIC), and a speaker.
The present application has been described in detail above, and specific examples have been used to illustrate its principles and embodiments. The description of the examples is provided solely to assist in understanding the method of the present application and its core concepts. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make modifications to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (10)

1. A big data analysis method based on a digital online session service, characterized in that it is applied to a deep learning system, the method comprising:
Performing interactive interest analysis on the session text in the digital online session big data of any target user to generate corresponding interactive interest analysis information;
based on the interaction interest analysis information, acquiring target interaction interest points meeting personalized service pushing conditions, and tracing forward behavior path data corresponding to the target interaction interest points of the target users;
and performing behavior preference analysis on the forward behavior path data to obtain corresponding behavior preference data, and performing personalized service pushing on the target user based on the behavior preference data and the target interaction interest point.
2. The big data analysis method based on the digital online session service according to claim 1, wherein the step of performing interactive interest analysis on the session text in the digital online session big data of any target user, and generating corresponding interactive interest analysis information includes:
according to the keyword characteristics in the conversation text in the digital online conversation big data of any target user, text segment extraction is carried out on the conversation text, a plurality of first conversation text segments are generated, the keyword characteristics are determined according to the keyword vectors in the conversation text, and the occurrence frequency of the keyword characteristics is positively associated with the extraction frequency of the text segments;
According to a multi-modal significance concern network carrying static embedded knowledge points and a multi-modal significance concern network carrying dynamic embedded knowledge points, feature embedding is carried out on the plurality of first session text segments, dialogue embedded features of the plurality of first session text segments are generated, feature reduction is carried out on the dialogue embedded features of the plurality of first session text segments, and significant text segments of the plurality of first session text segments are generated, wherein each significant text segment carries a significant attention degree;
performing interaction interest analysis according to the salient text segments of the plurality of first session text segments to generate interaction interest degree of each first session text segment;
and analyzing the interactive interest degree of the session text according to the interactive interest degree of each first session text segment, and generating interactive interest analysis information of the session text.
3. The big data analysis method based on the digital online session service according to claim 2, wherein the step of performing text segment extraction on the session text according to the keyword features in the session text in the digital online session big data of any target user to generate a plurality of first session text segments includes:
performing heat map output on the session text to generate a knowledge thermal unit of the session text, wherein the knowledge thermal unit characterizes the occurrence frequency of keyword features at different session nodes in the session text;
determining the text segment extraction frequency of the corresponding session nodes according to the occurrence frequency of keyword features at different session nodes in the knowledge thermal unit;
and performing text segment extraction on the session text according to the text segment extraction frequency to generate the plurality of first session text segments.
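The three steps of claim 3 can be illustrated with a minimal sketch. It is not the claimed implementation: the whitespace tokenization, the proportional allocation of an extraction budget, and the growing context window around each node are all assumptions made for illustration, and the function names `keyword_heat`, `extraction_counts` and `extract_segments` are hypothetical.

```python
def keyword_heat(nodes, keywords):
    """Count keyword occurrences at each session node (one node = one utterance).

    Assumption: tokens are split on whitespace and matched case-insensitively.
    """
    kw = {k.lower() for k in keywords}
    return [sum(1 for tok in node.lower().split() if tok in kw) for node in nodes]


def extraction_counts(heat, budget):
    """Allocate `budget` extractions across nodes in proportion to their heat,
    so that a higher keyword frequency yields a higher extraction frequency."""
    total = sum(heat)
    if total == 0:
        return [0] * len(heat)
    return [round(budget * h / total) for h in heat]


def extract_segments(nodes, counts):
    """Emit `count` segments per node, each with a wider context window.

    Assumption: the k-th segment for node i spans nodes i-k .. i+k.
    """
    segments = []
    for i, count in enumerate(counts):
        for radius in range(count):
            lo, hi = max(0, i - radius), min(len(nodes), i + radius + 1)
            segments.append(" ".join(nodes[lo:hi]))
    return segments


if __name__ == "__main__":
    session = ["hello there", "refund my order refund", "ok thanks", "order status please"]
    heat = keyword_heat(session, ["refund", "order"])
    counts = extraction_counts(heat, budget=4)
    for segment in extract_segments(session, counts):
        print(segment)
```

Session nodes with no keyword hits receive no extractions in this sketch, which is one simple way to make the extraction frequency follow the keyword frequency.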
4. The big data analysis method based on digital online session service according to claim 2, wherein the step of feature embedding the plurality of first session text segments and generating the dialogue embedded features of the plurality of first session text segments according to the multi-modal significance focusing network carrying static embedded knowledge points and the multi-modal significance focusing network carrying dynamic embedded knowledge points comprises:
for any first session text segment, splitting text words of the first session text segment to generate a plurality of second session text segments;
acquiring semantic characterization vectors of the plurality of second session text segments, wherein each semantic characterization vector is generated by vector fusion of the session text word vector of the corresponding second session text segment and the session node vector of that second session text segment;
according to the multi-modal significance focusing network carrying static embedded knowledge points and the multi-modal significance focusing network carrying dynamic embedded knowledge points, performing feature embedding on the semantic characterization vectors of the plurality of second session text segments to generate the dialogue embedded features of the first session text segment;
wherein the step of performing feature embedding on the semantic characterization vectors of the plurality of second session text segments according to the multi-modal significance focusing network carrying static embedded knowledge points and the multi-modal significance focusing network carrying dynamic embedded knowledge points to generate the dialogue embedded features of the first session text segment includes:
according to the multi-modal significance focusing network carrying static embedded knowledge points, a regular spoken language conversion network and a feedforward artificial neural network, performing feature embedding on the semantic characterization vectors of the plurality of second session text segments to generate proposed dialogue embedded features of the plurality of second session text segments;
according to the multi-modal significance focusing network carrying dynamic embedded knowledge points, a regular spoken language conversion network and a feedforward artificial neural network, performing feature embedding on the proposed dialogue embedded features of the plurality of second session text segments to generate the dialogue embedded features of the first session text segment;
wherein the step of performing feature embedding on the semantic characterization vectors of the plurality of second session text segments according to the multi-modal significance focusing network carrying static embedded knowledge points, the regular spoken language conversion network and the feedforward artificial neural network to generate the proposed dialogue embedded features of the plurality of second session text segments includes:
according to a first regular spoken language conversion network, performing regular spoken language conversion on the semantic characterization vectors of the plurality of second session text segments;
according to the multi-modal significance focusing network carrying static embedded knowledge points, performing significance feature analysis on the semantic characterization vectors of the plurality of second session text segments after the regular spoken language conversion to generate a first proposed feature;
determining a second proposed feature according to the semantic characterization vectors of the plurality of second session text segments and the first proposed feature;
according to a second regular spoken language conversion network, performing regular spoken language conversion on the second proposed feature;
and processing the second proposed feature after the regular spoken language conversion according to the feedforward artificial neural network to generate the proposed dialogue embedded features of the plurality of second session text segments.
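The order of operations in the last five steps of claim 4 resembles a pre-normalization residual block. The sketch below is one possible reading only, and every mapping in it is an assumption: the spoken language conversion steps are replaced by layer normalization, the significance focusing network by single-head scaled dot-product self-attention with identity projections, and the second proposed feature by a residual sum. The claim does not define these components, and the static versus dynamic embedded knowledge points are not modelled. All function names are hypothetical.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize one vector to zero mean and unit variance (stand-in for a conversion step)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(rows):
    """Single-head scaled dot-product attention with identity projections
    (stand-in for the significance focusing network)."""
    d = len(rows[0])
    out = []
    for q in rows:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in rows])
        out.append([sum(w * v[j] for w, v in zip(weights, rows)) for j in range(d)])
    return out


def feed_forward(x, w1, w2):
    """Two-layer feedforward network with ReLU; w1 and w2 are lists of weight columns."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, col))) for col in w1]
    return [sum(h * w for h, w in zip(hidden, col)) for col in w2]


def encoder_block(vectors, w1, w2):
    """Apply the five steps in order to the semantic characterization vectors."""
    normed = [layer_norm(v) for v in vectors]            # first conversion step
    first = self_attention(normed)                       # first proposed feature
    second = [[a + b for a, b in zip(v, f)]              # second proposed feature
              for v, f in zip(vectors, first)]           # (assumed residual sum)
    normed2 = [layer_norm(s) for s in second]            # second conversion step
    return [feed_forward(s, w1, w2) for s in normed2]    # proposed dialogue embedded features


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    vectors = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
    for row in encoder_block(vectors, identity, identity):
        print(row)
```

Under this reading, the step that uses the network carrying dynamic embedded knowledge points would apply a second block of the same shape to the output of the first.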
5. The method of claim 2, wherein the step of feature reducing dialogue embedded features of the plurality of first dialogue text segments to generate salient text segments of the plurality of first dialogue text segments comprises:
for any first session text segment, performing feature reduction on dialogue embedded features of the first session text segment to generate a significant attention degree corresponding to a plurality of keyword vectors in the first session text segment;
and generating a salient text segment of the first session text segment according to the salient attention degree corresponding to the plurality of keyword vectors in the first session text segment, wherein the concentration degree of each keyword vector in the salient text segment represents the corresponding salient attention degree.
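As a rough illustration of the feature reduction in claim 5, each keyword vector's embedded feature can be reduced to a single score and the scores normalized across the segment. The use of the L2 norm as the reduction and of softmax as the normalization are assumptions chosen for this sketch; the claim does not specify either, and `salient_attention` is a hypothetical name.

```python
import math


def salient_attention(features):
    """Map {keyword: embedded feature vector} to {keyword: salient attention degree}.

    Assumption: the reduction is the L2 norm of each feature vector, and the
    degrees are softmax-normalized so that they sum to 1 within the segment.
    """
    norms = {k: math.sqrt(sum(v * v for v in vec)) for k, vec in features.items()}
    peak = max(norms.values())
    exps = {k: math.exp(n - peak) for k, n in norms.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


if __name__ == "__main__":
    segment_features = {"refund": [3.0, 4.0], "order": [0.0, 1.0], "hello": [0.0, 0.0]}
    for keyword, degree in salient_attention(segment_features).items():
        print(keyword, round(degree, 3))
```

The resulting mapping plays the role of the salient text segment: each keyword carries a degree that can be rendered as its concentration.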
6. The method for analyzing big data based on a digital online session service according to claim 2, wherein the step of performing an interactive interest analysis based on the salient text segments of the plurality of first session text segments to generate the interactive interest level of each of the first session text segments comprises:
for a salient text segment of any first session text segment, acquiring a focusing amount of a keyword vector corresponding to each salient attention degree in the salient text segment of the first session text segment;
and determining the interaction interest degree of the first session text segment according to the focusing amount of the keyword vector corresponding to each salient attention degree.
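One way to read claim 6 is sketched below. It assumes that the "focusing amount" is the number of keyword vectors falling at each salient attention level, and that the interest degree is the count-weighted mean of the level midpoints. Both choices, the number of levels, and the name `interest_degree` are assumptions for illustration, not taken from the claims.

```python
from collections import Counter


def interest_degree(saliencies, levels=4):
    """Compute an interest degree in [0, 1] from per-keyword saliency values in [0, 1].

    Assumption: saliencies are bucketed into `levels` equal-width levels, the
    focusing amount of a level is the number of keywords in it, and the result
    is the focusing-amount-weighted mean of the level midpoints.
    """
    if not saliencies:
        return 0.0
    buckets = Counter(min(int(s * levels), levels - 1) for s in saliencies)
    total = sum(buckets.values())
    return sum(((level + 0.5) / levels) * amount for level, amount in buckets.items()) / total


if __name__ == "__main__":
    print(interest_degree([0.9, 0.8, 0.1, 0.1]))
```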
7. The method for analyzing big data based on a digital online session service according to claim 2, wherein the step of performing an interactive interest analysis based on the salient text segments of the plurality of first session text segments to generate the interactive interest level of each of the first session text segments comprises:
for a salient text segment of any first session text segment, updating the salient attention degrees corresponding to a plurality of keyword vectors in the salient text segment according to the session node relation among the plurality of keyword vectors in the salient text segment;
and generating the interaction interest degree of the first session text segment according to the focusing amount of the keyword vector corresponding to each salient attention degree in the updated salient text segment.
8. The big data analysis method based on the digital online conversation service according to claim 2, wherein the interactive interest analysis information includes an interactive interest degree of the conversation text;
the step of analyzing the interactive interest degree of the session text according to the interactive interest degree of each first session text segment to generate interactive interest analysis information of the session text includes:
determining the weight corresponding to each interaction interest degree according to the focusing amount of the first session text segment corresponding to each interaction interest degree;
and analyzing the interactive interest degree of the session text according to the weight corresponding to each interactive interest degree, and generating the interactive interest degree of the session text.
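The weighting in claim 8 can be sketched as a weighted mean, assuming that each weight is the segment's focusing amount divided by the total focusing amount. That normalization is an assumption, and `session_interest` is a hypothetical name.

```python
def session_interest(segments):
    """Combine per-segment interest degrees into one session-level degree.

    `segments` is a list of (interest_degree, focusing_amount) pairs.
    Assumption: each weight is the segment's share of the total focusing amount.
    """
    total = sum(amount for _, amount in segments)
    if total == 0:
        return 0.0
    return sum(degree * amount for degree, amount in segments) / total


if __name__ == "__main__":
    print(session_interest([(0.8, 3), (0.2, 1)]))
```

With this choice, a segment containing more salient keywords pulls the session-level interest degree further toward its own value.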
9. The digital online session service-based big data analysis method of claim 2, wherein the method further comprises:
splicing the salient text segments of the plurality of first session text segments to generate salient text segments of the session text, wherein the salient text segments of the session text represent text segments of salient session nodes in the session text;
or splicing the updated salient text segments of the plurality of first session text segments to generate the salient text segments of the session text.
10. A deep learning system comprising a processor and a machine-readable storage medium having stored therein machine-executable instructions loaded and executed by the processor to implement the digital online session service-based big data analysis method of any of claims 1-9.
CN202310141339.2A 2023-02-21 2023-02-21 Big data analysis method and deep learning system based on digital online session service Pending CN116050410A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310141339.2A CN116050410A (en) 2023-02-21 2023-02-21 Big data analysis method and deep learning system based on digital online session service

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310141339.2A CN116050410A (en) 2023-02-21 2023-02-21 Big data analysis method and deep learning system based on digital online session service

Publications (1)

Publication Number Publication Date
CN116050410A true CN116050410A (en) 2023-05-02

Family

ID=86131455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310141339.2A Pending CN116050410A (en) 2023-02-21 2023-02-21 Big data analysis method and deep learning system based on digital online session service

Country Status (1)

Country Link
CN (1) CN116050410A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116956030A (en) * 2023-07-21 2023-10-27 广州一号家政科技有限公司 Household business processing method and system based on artificial intelligence
CN116956030B (en) * 2023-07-21 2024-02-02 广州一号家政科技有限公司 Household business processing method and system based on artificial intelligence
CN118379083A (en) * 2024-06-25 2024-07-23 青州市坦博尔服饰股份有限公司 Outdoor sport clothing system behavior data analysis method and system based on digital intelligence

Similar Documents

Publication Publication Date Title
US11314946B2 (en) Text translation method, device, and storage medium
US11328129B2 (en) Artificial intelligence system using phrase tables to evaluate and improve neural network based machine translation
US11775761B2 (en) Method and apparatus for mining entity focus in text
CN116050410A (en) Big data analysis method and deep learning system based on digital online session service
CN110415679B (en) Voice error correction method, device, equipment and storage medium
CN109783824B (en) Translation method, device and storage medium based on translation model
CN111460290B (en) Information recommendation method, device, equipment and storage medium
CN111611797B (en) Method, device and equipment for marking prediction data based on Albert model
CN115270003B (en) Information recommendation method and system based on Internet of things platform behavior data mining
CN112509562B (en) Method, apparatus, electronic device and medium for text post-processing
US20230049562A1 (en) Document creation and editing via automated assistant interactions
CN110457578A (en) A kind of customer service demand recognition methods and device
CN115203394A (en) Model training method, service execution method and device
CN112380876B (en) Translation method, device, equipment and medium based on multilingual machine translation model
CN114896454A (en) Short video data recommendation method and system based on label analysis
CN110489730B (en) Text processing method, device, terminal and storage medium
CN113343085B (en) Information recommendation method and device, storage medium and electronic equipment
CN116881851B (en) Internet of things data processing method and device based on machine learning and server
CN116028626A (en) Text matching method and device, storage medium and electronic equipment
CN113344590A (en) Method and device for model training and complaint rate estimation
CN110019068B (en) Log text processing method and device
CN112927714B (en) Data processing method and device
CN113011165B (en) Method, device, equipment and medium for identifying blocked keywords
CN112989013B (en) Conversation processing method and device, electronic equipment and storage medium
CN114817469B (en) Text enhancement method, training method and training device for text enhancement model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20230502