CN116668968B - Cross-platform communication information processing method and system

Info

Publication number: CN116668968B
Application number: CN202310914445.XA
Authority: CN
Inventors: 吴庆军; 程怀玺; 王昌; 许淑婷
Original assignee: Xi'an Excellent Spectrum Information Technology Co ltd
Current assignee: Xi'an Excellent Spectrum Information Technology Co ltd
Priority date: 2023-07-25
Filing date: 2023-07-25
Publication date: 2023-10-13
Anticipated expiration: 2043-07-25
Also published as: CN116668968A

Abstract

The present application relates to the field of information processing technologies, and in particular to a cross-platform communication information processing method and system. Before cross-platform communication information to be shared is shared across platforms, target application demand items can be detected through an application sharing demand detection network, so that whether the information contains a target application demand item can be determined accurately. When the information is then shared, targeted sharing processing can be performed according to the target application demand item, so that the shared information matches and suits the related application requirements/application scenarios.

Description

Cross-platform communication information processing method and system

Technical Field

The present application relates to the field of information processing technologies, and in particular, to a method and a system for processing information in cross-platform communication.

Background

With the continuous development and maturation of communication systems, exchanging information through communication systems has become increasingly common. Some interactive scenarios of online/digital services involve sharing data between different communication platforms. However, conventional cross-platform information sharing schemes have difficulty judging the requirements associated with the communication information, so it is difficult to ensure that the sharing is targeted.

Disclosure of Invention

To address the above technical problems in the related art, the application provides a cross-platform communication information processing method and system.

In a first aspect, an embodiment of the present application provides an information processing method for cross-platform communication, which is applied to a communication information processing system, where the method includes:

acquiring cross-platform communication information to be shared, and generating a communication information set to be processed according to the cross-platform communication information to be shared;

detecting, through an application sharing demand detection network, whether the to-be-shared communication content in the to-be-processed communication information set contains a target application demand item, wherein the application sharing demand detection network is obtained by debugging with learning examples that have undergone filtering processing, and the filtering processing is used to reduce the annotation amount of the learning examples;

and under the condition that at least one communication content to be shared in the communication information set to be processed contains the target application requirement item, determining that the cross-platform communication information to be shared contains the target application requirement item.
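
The decision rule in the last step above can be sketched as follows. This is a minimal illustration, assuming the detection network is available as a callable that returns a boolean per content item; `keyword_detector` is a hypothetical stand-in for the trained network, not part of the application.

```python
from typing import Callable, Iterable


def information_contains_target_item(
    contents: Iterable[str],
    detect: Callable[[str], bool],
) -> bool:
    """Return True if at least one content item in the set is flagged."""
    return any(detect(content) for content in contents)


def keyword_detector(content: str) -> bool:
    # Hypothetical stand-in for the trained detection network.
    return "invoice" in content.lower()


pending_set = ["hello team", "please share the Invoice draft", "see you tomorrow"]
print(information_contains_target_item(pending_set, keyword_detector))
```

Because the rule is an "at least one" test, an empty set is treated as not containing the target item.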

In some preferred examples, the method further comprises:

performing target application demand item detection on the to-be-shared cross-platform communication information through the application sharing demand detection network to obtain a plurality of initial detection results;

obtaining a final detection result of at least one piece of cross-platform communication information to be shared among the plurality of pieces of cross-platform communication information to be shared;

when the initial detection result of any piece of cross-platform communication information to be shared does not contain the target application requirement item but the corresponding final detection result does, taking the communication content to be shared that contains the target application requirement item, from the communication information set of that piece of cross-platform communication information, as added prior cross-platform communication content, which is used for screening an added debugging learning example set for the application sharing requirement detection network.
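
A minimal sketch of this feedback step is given below. It assumes the final detection result is available as a mapping from each piece of information to the content items confirmed to contain the target item; the function name, identifiers, and data layout are hypothetical.

```python
from typing import Dict, List


def mine_added_priors(
    initial_hit: Dict[str, bool],
    confirmed: Dict[str, List[str]],
) -> List[str]:
    """Collect confirmed contents from information the network initially missed."""
    added: List[str] = []
    for info_id, hit in initial_hit.items():
        contents = confirmed.get(info_id, [])
        # Initial result negative, final result positive: a missed detection.
        if not hit and contents:
            added.extend(contents)
    return added


initial_hit = {"msg-1": True, "msg-2": False, "msg-3": False}
confirmed = {
    "msg-1": ["quote for order 7"],
    "msg-2": ["need a joint schedule"],
    "msg-3": [],
}
print(mine_added_priors(initial_hit, confirmed))
```

Only `msg-2` contributes here, since it is the only piece that the initial detection missed and the final result confirmed.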

In some preferred examples, the step of debugging the application sharing demand detection network includes:

acquiring X priori cross-platform communication contents for network debugging and a priori communication information set, wherein the X priori cross-platform communication contents comprise target application demand items, and the priori cross-platform communication information in the priori communication information set carries annotation data related to the target application demand items, and X is an integer greater than 1;

obtaining a communication information set according to the prior communication information set, wherein the communication information set comprises a plurality of communication contents to be shared;

screening a debugging learning example set for the application sharing requirement detection network from the plurality of communication contents to be shared according to the X prior cross-platform communication contents and the involvement descriptions of the plurality of communication contents to be shared, wherein the debugging learning example set comprises a positive learning example set containing target application requirement items and a negative learning example set not containing target application requirement items;

and debugging the application sharing requirement detection network through the debugging learning example set.

In some preferred examples, the annotation data is at least one of a mask, a mark, and a theme, and the a priori cross-platform communication information in the a priori communication information set is extracted from the original a priori communication information set according to the characterization information of the target application requirement item.

In some preferred examples, the involvement descriptions of the plurality of communication contents to be shared are obtained by content clustering of the plurality of communication contents to be shared, and screening the debugging learning example set for the application sharing requirement detection network from the plurality of communication contents to be shared according to the X prior cross-platform communication contents and the involvement descriptions of the plurality of communication contents to be shared includes:

performing content clustering on the plurality of communication contents to be shared to obtain X1 clusters, wherein X1 is an integer greater than 1 and less than the number of pieces of prior cross-platform communication information in the prior communication information set;

and obtaining the positive learning example set and the negative learning example set according to the quantization difference between each priori cross-platform communication content and the core cluster members of the X1 clusters and the quantization difference between the core cluster members of the X1 clusters.

In some preferred examples, deriving the positive learning example set and the negative learning example set from a quantization difference between each a priori cross-platform communication content and the core cluster members of the X1 clusters, and a quantization difference between the core cluster members of the X1 clusters, comprises:

obtaining a primary positive learning example set according to the quantization difference between each prior cross-platform communication content and the core cluster members of the X1 clusters;

obtaining an advanced positive learning example set and the negative learning example set according to the quantization differences among the core cluster members of the X1 clusters;

taking the primary positive learning example set and the advanced positive learning example set together as the positive learning example set.

In some preferred examples, deriving the primary positive learning example set from the quantization difference between each prior cross-platform communication content and the core cluster members of the X1 clusters comprises:

for each prior cross-platform communication content, taking, among the core cluster members of the X1 clusters, the core cluster members of the X2 clusters having the smallest quantization difference from that prior cross-platform communication content as the primary core cluster members corresponding to that prior cross-platform communication content, wherein X2 is an integer greater than 1;

and taking all communication content learning examples in the clusters where all primary core cluster members corresponding to the X priori cross-platform communication contents are located as primary positive learning examples forming the primary positive learning example set.
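
The two steps above can be sketched as follows. The sketch assumes that the core cluster member of a cluster is its centroid and that the quantization difference is the Euclidean distance between feature vectors; neither is fixed by the text, and all names are illustrative.

```python
import math
from typing import List, Sequence, Set

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    # Assumed "quantization difference": Euclidean distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def primary_core_members(
    priors: List[Vector], centroids: List[Vector], x2: int
) -> Set[int]:
    """Indices of the X2 nearest centroids of every prior content."""
    primary: Set[int] = set()
    for prior in priors:
        ranked = sorted(
            range(len(centroids)), key=lambda j: distance(prior, centroids[j])
        )
        primary.update(ranked[:x2])
    return primary


def primary_positive_examples(labels: List[int], primary: Set[int]) -> List[int]:
    """Indices of all examples whose cluster has a primary core member."""
    return [i for i, label in enumerate(labels) if label in primary]


centroids = [(0.0, 0.0), (1.0, 1.0), (10.0, 10.0), (20.0, 20.0)]
priors = [(0.2, 0.1)]
labels = [0, 1, 2, 3, 0]  # cluster index of each content to be shared
primary = primary_core_members(priors, centroids, x2=2)
print(primary, primary_positive_examples(labels, primary))
```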

In some preferred examples, deriving a set of advanced positive learning examples from the quantized differences between the core cluster members of the X1 clusters includes:

for each primary core cluster member, determining, among the core cluster members of the X1 clusters, at least one core cluster member that is not a primary core cluster member and is closest to the primary core cluster member, taking the at least one closest core cluster member as the corresponding advanced core cluster member, and taking the quantization difference between the primary core cluster member and the corresponding advanced core cluster member as a first quantization difference;

determining the communication content learning examples that meet a first requirement in the cluster where the corresponding advanced core cluster member is located as the advanced positive learning examples forming the advanced positive learning example set, wherein the quantization difference between a communication content learning example meeting the first requirement and the primary core cluster member is a second quantization difference, and the second quantization difference and the first quantization difference satisfy a first set quantization discrimination requirement.
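
A possible reading of these two steps is sketched below. The text does not specify the first set quantization discrimination requirement; the sketch assumes it is "second difference <= ratio × first difference", and also assumes centroids as core cluster members and Euclidean distance as the quantization difference. It takes one advanced core member per primary member.

```python
import math
from typing import List, Sequence, Set

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def advanced_positive_examples(
    points: List[Vector],
    labels: List[int],
    centroids: List[Vector],
    primary: Set[int],
    ratio: float = 1.0,  # assumed form of the first discrimination requirement
) -> List[int]:
    others = [j for j in range(len(centroids)) if j not in primary]
    selected: Set[int] = set()
    if not others:
        return []
    for p in primary:
        # Nearest non-primary centroid is the advanced core member for p.
        adv = min(others, key=lambda j: distance(centroids[p], centroids[j]))
        first_diff = distance(centroids[p], centroids[adv])
        for i, (point, label) in enumerate(zip(points, labels)):
            if label == adv and distance(point, centroids[p]) <= ratio * first_diff:
                selected.add(i)
    return sorted(selected)


centroids = [(0.0, 0.0), (4.0, 0.0), (20.0, 0.0)]
points = [(0.0, 0.0), (3.0, 0.0), (5.0, 0.0), (20.0, 0.0)]
labels = [0, 1, 1, 2]
print(advanced_positive_examples(points, labels, centroids, primary={0}))
```

In this example the advanced cluster is cluster 1; only the point at distance 3 from the primary centroid passes, because the first quantization difference is 4.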

In some preferred examples, deriving the negative learning example set from the quantization differences among the core cluster members of the X1 clusters includes:

for each primary core cluster member, determining, among the core cluster members of the X1 clusters, the core cluster members of the X3 clusters having the largest quantization difference from the primary core cluster member, and taking them as X3 to-be-processed negative core cluster member examples corresponding to the primary core cluster member, wherein X3 is an integer greater than 1;

screening one of the X3 to-be-processed negative core cluster member examples as the negative core cluster member example corresponding to the primary core cluster member according to the quantization difference between each to-be-processed negative core cluster member example and the primary core cluster member and the quantization difference between each to-be-processed negative core cluster member example and its corresponding advanced core cluster member;

taking all communication content learning examples in the cluster where the negative core cluster member example corresponding to the primary core cluster member is located as negative learning examples forming the negative learning example set.

In some preferred examples, screening one of the X3 to-be-processed negative core cluster member examples as the negative core cluster member example corresponding to the primary core cluster member, according to the quantization difference between each to-be-processed negative core cluster member example and the primary core cluster member and the quantization difference between each to-be-processed negative core cluster member example and its corresponding advanced core cluster member, includes:

for the to-be-processed negative core cluster member examples corresponding to the primary core cluster member, performing the following steps on each in turn until one to-be-processed negative core cluster member example is determined to be the negative core cluster member example corresponding to the primary core cluster member:

determining the quantization difference between the to-be-processed negative core cluster member example and the primary core cluster member as a third quantization difference;

taking, among all the advanced core cluster members, the advanced core cluster member having the smallest quantization difference from the to-be-processed negative core cluster member example as the advanced core cluster member corresponding to that example;

determining the quantization difference between the to-be-processed negative core cluster member example and its corresponding advanced core cluster member as a fourth quantization difference;

and when the third quantization difference and the fourth quantization difference satisfy a second set quantization discrimination requirement, determining the to-be-processed negative core cluster member example as the negative core cluster member example corresponding to the primary core cluster member.
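
The candidate loop above can be sketched as follows. The second set quantization discrimination requirement is not specified in the text; the sketch assumes the form "fourth difference >= ratio × third difference", that is, the candidate must not lie much closer to an advanced core member than to the primary one. Centroids and Euclidean distance are likewise assumptions.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_negative_core_member(
    p: int,
    centroids: List[Vector],
    advanced: List[int],
    x3: int,
    ratio: float = 0.5,  # assumed form of the second discrimination requirement
) -> Optional[int]:
    """Return the first of the X3 farthest centroids that passes the check."""
    if not advanced:
        return None
    others = [j for j in range(len(centroids)) if j != p]
    candidates = sorted(
        others, key=lambda j: distance(centroids[j], centroids[p]), reverse=True
    )[:x3]
    for c in candidates:
        third = distance(centroids[c], centroids[p])
        nearest_adv = min(advanced, key=lambda a: distance(centroids[c], centroids[a]))
        fourth = distance(centroids[c], centroids[nearest_adv])
        if fourth >= ratio * third:
            return c
    return None


centroids = [(0.0, 0.0), (2.0, 0.0), (30.0, 0.0), (3.0, 0.0)]
print(pick_negative_core_member(0, centroids, advanced=[1], x3=2))
```

All examples in the cluster of the returned centroid would then be taken as negative learning examples; `None` means no candidate satisfied the assumed requirement.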

In some preferred examples, debugging the application sharing requirement detection network through the set of debug learning examples includes:

pre-debugging the application sharing demand detection network with a set of preset samples to determine the pre-trained network variables of the application sharing demand detection network and to optimize the variables of its generation module;

splitting the debugging learning example set into a plurality of debugging learning example groups;

for each debug learning example group:

randomly screening a positive learning example group and a negative learning example group from the positive learning example set and the negative learning example set, respectively, according to set rules, to serve as the auxiliary learning example set for the debugging learning example group;

determining selected communication content learning examples of the debug learning example group according to the description and the theme of the auxiliary communication content learning examples in the auxiliary learning example set and the theme of the communication content learning examples in the debug learning example group;

and performing a round of optimization on the current network variable of the application sharing requirement detection network through the selected communication content learning examples in the debugging learning example group.
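
The per-group loop above can be outlined as follows. This is a structural sketch only: the pre-debugging step is omitted, uniform random sampling stands in for the unspecified "set rules", and `select` and `optimize_round` are hypothetical callbacks for the selection step and for one optimization round of the network.

```python
import random
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_groups(examples: Sequence[T], group_size: int) -> List[List[T]]:
    return [list(examples[i:i + group_size]) for i in range(0, len(examples), group_size)]


def debug_network(
    examples: Sequence[T],
    positives: Sequence[T],
    negatives: Sequence[T],
    select: Callable[[List[T], List[T], List[T]], List[T]],
    optimize_round: Callable[[List[T]], None],
    group_size: int = 4,
    aux_size: int = 2,
    seed: int = 0,
) -> int:
    """Run one optimization round per group; return the number of rounds."""
    rng = random.Random(seed)
    rounds = 0
    for group in split_into_groups(examples, group_size):
        # Auxiliary groups are drawn afresh for every debugging group.
        aux_pos = rng.sample(list(positives), min(aux_size, len(positives)))
        aux_neg = rng.sample(list(negatives), min(aux_size, len(negatives)))
        selected = select(group, aux_pos, aux_neg)
        if selected:
            optimize_round(selected)
            rounds += 1
    return rounds


seen: List[List[int]] = []
n_rounds = debug_network(
    list(range(10)), [100, 101, 102], [200, 201, 202],
    select=lambda group, aux_pos, aux_neg: group,  # keep everything
    optimize_round=seen.append,
)
print(n_rounds, seen)
```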

In some preferred examples, determining the selected communication content learning examples of the debugging learning example group according to the involvement descriptions and themes of the auxiliary communication content learning examples in the auxiliary learning example set and the themes of the communication content learning examples in the debugging learning example group comprises:

content clustering is respectively carried out on the positive learning example group and the negative learning example group to obtain X4 positive learning example clusters and X4 negative learning example clusters, wherein X4 is a positive integer greater than 1;

determining, in the positive learning example group, the auxiliary communication content learning example having the smallest quantization difference from each positive core cluster member of the X4 positive learning example clusters, and determining, in the negative learning example group, the auxiliary communication content learning example having the smallest quantization difference from each negative core cluster member of the X4 negative learning example clusters, so as to obtain 2 × X4 auxiliary communication content learning examples;

for each communication content learning example in the debugging learning example group, determining, among the 2 × X4 auxiliary communication content learning examples, the one having the smallest quantization difference from that communication content learning example, and determining the communication content learning example as a selected communication content learning example when the theme of that auxiliary communication content learning example is the same as the theme of the communication content learning example.
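
The selection rule above can be sketched as follows, assuming each learning example is a (feature vector, theme) pair, the cluster centroids have already been computed, and the quantization difference is the Euclidean distance. The theme strings and function names are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Example = Tuple[Vector, str]  # (feature vector, theme)


def distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def representatives(group: List[Example], centroids: List[Vector]) -> List[Example]:
    """For each centroid, the example of the group closest to it."""
    return [min(group, key=lambda ex: distance(ex[0], c)) for c in centroids]


def select_examples(group: List[Example], reps: List[Example]) -> List[Example]:
    """Keep examples whose nearest representative carries the same theme."""
    selected = []
    for ex in group:
        nearest = min(reps, key=lambda r: distance(ex[0], r[0]))
        if nearest[1] == ex[1]:
            selected.append(ex)
    return selected


pos_group = [((0.0, 0.0), "need"), ((1.0, 0.0), "need")]
neg_group = [((10.0, 0.0), "none"), ((11.0, 0.0), "none")]
# X4 = 2 centroids per group, giving 2 x X4 = 4 representatives.
reps = representatives(pos_group, [(0.0, 0.0), (1.0, 0.0)]) + representatives(
    neg_group, [(10.0, 0.0), (11.0, 0.0)]
)
debug_group = [((0.5, 0.2), "need"), ((9.0, 0.0), "need"), ((10.5, 0.0), "none")]
print(select_examples(debug_group, reps))
```

The second example is dropped: its theme says "need" but its nearest representative is a negative one, so the rule treats it as unreliable for this round.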

In a second aspect, the present application also provides a communication information processing system, including a processor and a memory; the processor is in communication with the memory, and the processor is configured to read and execute a computer program from the memory to implement the method described above.

In a third aspect, the present application also provides a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the method described above.

By applying the embodiment of the application, before the cross-platform communication information to be shared is shared across platforms, the target application demand item can be detected through the application sharing demand detection network, so that whether the cross-platform communication information to be shared contains the target application demand item can be determined accurately. When the cross-platform communication information to be shared is then shared, targeted sharing processing can be performed according to the target application requirement item, so that the shared information matches and suits the related application requirements/application scenarios.

When a debugging learning example set for debugging the application sharing requirement detection network is generated from the original prior communication information set, a plurality of positive and negative learning examples can be determined adaptively from the communication information set of the prior communication information set, based on a small amount of prior cross-platform communication content containing the target application requirement item and on the prior communication information set related to the target application requirement item (obtained by sampling the original prior communication information set according to annotation data). Communication content to be shared that has low learning value can thus be filtered out, reducing the annotation amount of the debugging learning examples. In addition, when the target application demand item to be detected changes, only the prior cross-platform communication content needs to be adjusted and the key summary features for the prior communication information set need to be extracted, so the approach to generating the debugging learning example set can be flexibly adapted to the detection of other target application demand items.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.

Fig. 1 is a flow chart of an information processing method for cross-platform communication according to an embodiment of the present application.

Detailed Description

Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.

The method embodiments provided by the present application may be implemented in a communication information processing system, a computer device, or a similar computing device. Taking a communication information processing system as an example, the system may include one or more processors (which may include, but are not limited to, a microprocessor (MCU) or a programmable logic device (FPGA)), a memory for storing data, and, optionally, a transmission device for communication functions. Those of ordinary skill in the art will appreciate that this configuration is merely illustrative and does not limit the configuration of the communication information processing system. For example, the communication information processing system may include more or fewer components than described above, or may have a different configuration.

The memory may be used to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the cross-platform communication information processing method in the embodiment of the present application. The processor executes the computer program stored in the memory to perform various functional applications and data processing, that is, to implement the method described above. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, which can be connected to the communication information processing system through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission device is used to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the communication information processing system. In one example, the transmission device comprises a network adapter (network interface controller, NIC) that can be connected to other network devices via a base station to communicate with the internet. In one example, the transmission device may be a radio frequency (RF) module, which communicates with the internet wirelessly.

Referring to Fig. 1, Fig. 1 is a flow chart of a cross-platform communication information processing method according to an embodiment of the present application. The method is applied to a communication information processing system and corresponds to the debugging (training) phase of the application sharing demand detection network; it may include STEP210-STEP240.

STEP210 obtains X prior cross-platform communication contents and a prior communication information set for network debugging.

The X prior cross-platform communication contents comprise target application demand items, and the prior cross-platform communication information in the prior communication information set carries annotation data related to the target application demand items, wherein X is an integer greater than 1.

For example, the prior cross-platform communication content may be understood as authenticated cross-platform communication content serving as a debug training sample, and the target application requirement item may be understood as an application requirement label corresponding to related content in the prior cross-platform communication content. In the embodiment of the application, the cross-platform communication content can be communication content shared at different application program ends or PC ends, including but not limited to user behavior data, business session data and the like. Based on the above, the application requirement label can be a user portrait label, a business collaboration label and the like, and the application requirement label can reflect related application requirements/application scenes and the like which can be matched or applicable by prior cross-platform communication content.

Further, the prior communication information set is composed of a plurality of prior cross-platform communication information, each prior cross-platform communication information can comprise corresponding prior cross-platform communication content, and annotation data can be understood as an identifier which is carried by the prior cross-platform communication information and is connected with the target application requirement item. For example, the annotation data is at least one of a mask, a mark, and a theme, and the prior cross-platform communication information in the prior communication information set is extracted from an original prior communication information set (an initial prior communication information set) according to key summary features (such as a requirement keyword) of the requirement item of the target application.
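
As an illustration of the extraction described above, the following sketch filters an original prior communication information set by requirement keywords and attaches the matched keywords as a "mark" annotation. The dictionary layout, the field names, and the use of plain substring matching are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Sequence


def extract_prior_set(
    original: Iterable[Dict], keywords: Sequence[str]
) -> List[Dict]:
    """Keep information whose contents mention a keyword; record the hits."""
    lowered = [k.lower() for k in keywords]
    prior_set = []
    for info in original:
        text = " ".join(info["contents"]).lower()
        hits = [k for k in lowered if k in text]
        if hits:
            # The matched keywords serve as the annotation data ("mark").
            prior_set.append({**info, "mark": hits})
    return prior_set


original_set = [
    {"id": "a", "contents": ["Weekly sync notes", "Budget request for Q3"]},
    {"id": "b", "contents": ["Lunch?"]},
]
prior_set = extract_prior_set(original_set, ["budget", "schedule"])
print(prior_set)
```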

By way of example, prior cross-platform communication content may be understood as content that has been identified as containing the target application requirement item.

In the embodiment of the application, only a small amount of prior cross-platform communication content is needed to screen, from the communication information set, a debugging learning example set containing sufficient communication content to be shared; the idea can be understood as mining the debugging learning example set from the communication information set.

STEP220 obtains a communication information set according to the prior communication information set, wherein the communication information set comprises a plurality of communication contents to be shared.

For example, each piece of prior cross-platform communication information in the prior communication information set may be split into its constituent contents, that is, all communication contents to be shared that make up each piece of prior cross-platform communication information may be stored, so as to obtain a plurality of communication contents to be shared as the communication information set. Communication content to be shared that contains the target application requirement item then needs to be mined from the communication information set to serve as positive learning examples, and communication content to be shared that does not contain the target application requirement item is obtained to serve as negative learning examples. The positive learning examples and negative learning examples can be understood as positive samples and negative samples, respectively.

STEP230 screens the debugging learning example set for the application sharing requirement detection network from the plurality of communication contents to be shared according to the X prior cross-platform communication contents and the involvement descriptions of the plurality of communication contents to be shared.

Wherein the debug learning example set includes a positive learning example set including target application demand items and a negative learning example set not including target application demand items.

The involvement description of a communication content to be shared may be a grouping feature of that content, so it may be understood as a distribution feature or a clustering feature. The X prior cross-platform communication contents can serve as the standard: among the plurality of communication contents to be shared, those whose commonality score (the text similarity between communication contents) with the prior cross-platform communication contents is large enough are used as positive learning examples, and those whose commonality score is small enough are used as negative learning examples. In other words, the positive and negative learning examples are mined from the communication contents to be shared in the communication information set according to the prior cross-platform communication contents, and are thus derived from the prior cross-platform communication contents.

STEP240 debugs the application sharing requirement detection network through the debugging learning example set.

The application sharing requirement detection network provided by the embodiment of the application may be a residual model. A residual model helps avoid the gradient explosion problem during feature mining and processing, which helps preserve the quality of the features extracted from the communication content.
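
For reference, the general form of a residual block and its gradient can be written as below. This is the standard textbook formulation rather than a detail taken from the application; the symbols $x$, $y$, $F$, $W$, and the loss $\mathcal{L}$ are introduced here for illustration only.

```latex
% Residual block: the output is the input plus a learned residual F.
y = x + F(x; W)

% Backpropagated gradient: the identity term gives the gradient a
% direct path around F, independent of the learned weights W.
\frac{\partial \mathcal{L}}{\partial x}
  = \frac{\partial \mathcal{L}}{\partial y}
    \left( I + \frac{\partial F(x; W)}{\partial x} \right)
```

The identity term is the usual explanation for why networks built from such blocks tend to keep gradients well-behaved as depth grows.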

Therefore, according to the key summary features of the target application demand item, the prior communication information set carrying annotation data related to the target application demand item can be extracted from the original prior communication information set. The debugging learning example set is then screened out from the communication information set according to the involvement descriptions of the plurality of communication contents to be shared in the communication information set of the prior communication information set and a small amount of prior cross-platform communication content containing the target application demand item, and the application sharing demand detection network is debugged through the debugging learning example set. In this way, communication content to be shared with low learning value can be filtered out to reduce the annotation amount of the debugging learning examples. In addition, when the target application demand item to be detected changes, only the prior cross-platform communication content needs to be adjusted and the key summary features for the prior communication information set need to be extracted.

For STEP230, the involvement descriptions of the communication contents to be shared in this step are obtained by content clustering of the communication contents to be shared. In other words, the communication contents to be shared are divided into clusters so that the contents within each cluster are as similar to one another as possible and as different as possible from the contents in the other clusters.

For example, each communication content to be shared has a feature, and may be represented by description knowledge (such as a feature vector), where the content grouping of the plurality of communication contents to be shared may be that the description knowledge of the plurality of communication contents to be shared is content-grouped according to a K-means grouping rule, and exemplary STEPs may refer to STEP2301-STEP2302.

In STEP2301, content clustering is performed on the plurality of communication contents to be shared to obtain X1 clusters, where X1 is an integer greater than 1 and less than the number of prior cross-platform communication information sets.

Accordingly, the number of core cluster members is also X1. X1 is set to be smaller than the number of prior cross-platform communication information in the prior communication information set, so as to avoid the situation in which the communication content learning examples in every cluster correspond to the same prior cross-platform communication information. For example, the value of X1 may be 1/10 of the number of prior cross-platform communication information.
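Illustratively, the content clustering of STEP2301 can be sketched as a plain K-means (Lloyd) loop over feature vectors. This is only a minimal sketch under stated assumptions: the function name `kmeans`, the fixed iteration count, the random initialization and the two-dimensional sample vectors are hypothetical and are not specified by the embodiment, which obtains the description knowledge from the residual model.

```python
import math
import random

def kmeans(vectors, x1, iterations=20, seed=0):
    """Group feature vectors (tuples of floats) into x1 clusters.

    Returns (centers, labels); each center plays the role of a
    "core cluster member" and labels[i] is the cluster of vectors[i].
    """
    rng = random.Random(seed)
    centers = list(rng.sample(vectors, x1))  # assumed initialization
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest center.
        labels = [min(range(x1), key=lambda c: math.dist(v, centers[c]))
                  for v in vectors]
        # Update step: each center moves to the mean of its members.
        for c in range(x1):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return centers, labels

# Hypothetical 2-D description knowledge for six contents to be shared.
features = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1),
            (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
centers, labels = kmeans(features, x1=2)
```

In this toy input the first three vectors end up in one cluster and the last three in the other; in practice X1 would be far larger, as discussed above.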

Illustratively, the descriptive knowledge of each communication to be shared (and the descriptive knowledge of the prior cross-platform communication required for the next step) can be obtained by feature mining through a residual model.

In addition, the pre-debugging network variables of the residual model are not fixed. For example, when the residual model is further used as the application sharing requirement detection network, the network variables of the application sharing requirement detection network are continuously updated and optimized during debugging, and when an additional debugging learning example set needs to be mined, the corresponding description knowledge can be output by the application sharing requirement detection network with the updated or optimized network variables.

STEP2302 obtains the positive learning example set and the negative learning example set according to quantization differences between each prior cross-platform communication content and the core cluster members of the X1 clusters and quantization differences between the core cluster members of the X1 clusters.

In STEP2302, it may be understood that, according to the a priori cross-platform communications, active/passive learning examples are mined from the several communications to be shared based on the above-mentioned quantization differences, where the active/passive learning examples are derived from the a priori cross-platform communications.

Illustratively, STEP2302 may include STEP23021-STEP23023.

STEP23021 obtains a first-order positive learning example set according to the quantization difference between each prior cross-platform communication content and the core cluster members of the X1 clusters.

Illustratively, the quantization difference between a prior cross-platform communication content and a core cluster member can be understood as the quantization difference (feature distance) between the description knowledge of the prior cross-platform communication content and the core cluster member (a feature of the same dimension as the description knowledge). For the description knowledge of each prior cross-platform communication content, the X2 core cluster members, among the X1 core cluster members, that have the smallest quantization difference from (are most similar to) that description knowledge are taken as first-order core cluster members, where X2 is an integer larger than 1. All communication content learning examples in the clusters where the first-order core cluster members corresponding to the X prior cross-platform communication contents are located are taken as the first-order positive learning examples included in the first-order positive learning example set. Here, first-order may be understood as primary or initial, and a core cluster member may be understood as a cluster center or a classification center.

For example, X2 may be a preset value, such as 10. In that case, for each prior cross-platform communication content, the 10 core cluster members with the smallest quantization difference from the description knowledge of the prior cross-platform communication content, among the X1 core cluster members (for example, 1000), are used as the first-order core cluster members of that prior cross-platform communication content. Because the prior cross-platform communication content necessarily contains the target application requirement item, the probability that the communication content learning examples in the clusters of the first-order core cluster members contain the target application requirement item is also high; for example, at least half of the communication content learning examples in those clusters contain the target application requirement item.
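Illustratively, STEP23021 can be sketched as a nearest-center lookup followed by a union over the selected clusters. The following is a minimal sketch that assumes Euclidean distance as the quantization difference; the function name `first_order_positives` and the sample data are hypothetical.

```python
import math

def first_order_positives(prior_vecs, centers, labels, x2):
    """Return (first-order center indices, indices of first-order positives).

    prior_vecs: description knowledge of the prior cross-platform contents.
    centers:    core cluster members (cluster centers), as tuples.
    labels:     labels[i] is the cluster index of learning example i.
    """
    first_order = set()
    for p in prior_vecs:
        # The X2 centers with the smallest quantization difference to p.
        nearest = sorted(range(len(centers)),
                         key=lambda c: math.dist(p, centers[c]))[:x2]
        first_order.update(nearest)
    # Every example in a first-order cluster is a first-order positive.
    positives = [i for i, lab in enumerate(labels) if lab in first_order]
    return first_order, positives

centers = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
labels = [0, 0, 1, 2, 2]          # cluster of each of five examples
prior = [(0.2, 0.0)]              # one prior cross-platform content
first_order, positives = first_order_positives(prior, centers, labels, x2=2)
```

Here the two centers nearest to the prior content are centers 0 and 1, so the examples in those two clusters form the first-order positive learning example set.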

However, in view of errors in content clustering, there may be individual core cluster members that are not first-order core cluster members but whose clusters still contain at least some communication content learning examples with the target application demand item, so communication content learning examples in other clusters may be appropriately introduced. The examples in the clusters where the first-order core cluster members of STEP23021 are located (also referred to as first-order clusters) are first-order positive learning examples, and at least a part of the communication content learning examples in the clusters where the advanced core cluster members obtained in STEP23022 are located (also referred to as advanced or second-order clusters) are advanced positive learning examples. In addition, the debug learning example set should also include negative learning examples.

STEP23022 obtains a set of advanced positive learning examples and the set of negative learning examples based on the quantized differences between the core cluster members of the X1 clusters.

STEP23023 takes the first-order positive learning example set and the advanced positive learning example set as the positive learning example set.

For example, the acquisition of the advanced active learning examples in the advanced active learning example set may include the following.

The following sub-step 1 to sub-step 3 are performed for each first-order core cluster member.

Sub-step 1: among the core cluster members of the X1 clusters, determine at least one core cluster member that is not a first-order core cluster member and is closest to the first-order core cluster member (smallest quantization difference) as an advanced core cluster member corresponding to the first-order core cluster member.

For example, the closer a non-first-order core cluster member is to the first-order core cluster member, the higher the probability that the communication content learning examples in its cluster contain the target application requirement item. Therefore, only the one or two core cluster members with the smallest quantization difference from the first-order core cluster member may be screened as the corresponding advanced core cluster members; in the remaining clusters only a small portion of the communication content learning examples, or none, may contain the target application requirement item, so those clusters can be ignored. When more computing capacity is available, more core cluster members can be screened as advanced core cluster members so as to obtain more advanced positive learning examples containing the target application requirement item.

Sub-step 2: determine the quantization difference between the first-order core cluster member and each corresponding advanced core cluster member, and take the determined quantization difference as the first quantization difference.

Sub-step 3: determine each communication content learning example that meets the first requirement in the cluster where the corresponding advanced core cluster member is located as an advanced positive learning example of the advanced positive learning example set, where the quantization difference between such a communication content learning example and the first-order core cluster member is the second quantization difference, and the second quantization difference and the first quantization difference meet a first set quantization discrimination requirement.

For example, the first set quantization discrimination requirement met by the second quantization difference (V2) and the first quantization difference (V1) may be that the first ratio V2/V1 is less than or equal to P1, where P1 is greater than or equal to 0 and less than or equal to 1.

For example, let P1 be 3/4. For the first-order core cluster member M1, suppose the corresponding advanced core cluster member is M1' (the core cluster member with the smallest quantization difference from M1), and the quantization difference (first quantization difference) between M1' and M1 is V1'. If the quantization differences (second quantization differences) V2' and V2'' between M1 and the description knowledge of the communication content learning examples T1' and T2' in the cluster of M1' each have a ratio of 1/2 to V1' (less than 3/4), then T1' and T2' may be regarded as advanced positive learning examples. Alternatively, when there are two advanced core cluster members, the advanced core cluster members corresponding to M1 are M11 and M12 (the two core cluster members with the smallest quantization difference from M1), the first quantization difference between M11 and M1 is V11, and the first quantization difference between M12 and M1 is V12. For M11, the second quantization differences V21 and V22 between M1 and the description knowledge of the communication content learning examples T1 and T2 in the cluster of M11 each have a ratio of 1/2 to V11 (less than 3/4); for M12, the second quantization difference V23 between M1 and the description knowledge of the communication content learning example T3 in the cluster of M12 has a ratio of 3/4 to V12 (equal to P1). Therefore, the communication content learning examples T1 to T3 in the clusters of M11 and M12 can all be regarded as advanced positive learning examples.
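Illustratively, the ratio test of sub-step 3 (V2/V1 less than or equal to P1) can be sketched as follows. This is a minimal sketch assuming Euclidean distance as the quantization difference and distinct cluster centers (so V1 is non-zero); the function name `advanced_positives` and the coordinates are hypothetical.

```python
import math

def advanced_positives(m1, advanced_center, members, p1=0.75):
    """Keep the members of an advanced cluster that satisfy V2 / V1 <= p1.

    V1: distance between the first-order center m1 and the advanced center.
    V2: distance between m1 and a member of the advanced cluster.
    """
    v1 = math.dist(m1, advanced_center)  # assumed non-zero
    return [t for t in members if math.dist(m1, t) / v1 <= p1]

m1 = (0.0, 0.0)                 # first-order core cluster member
m1_adv = (4.0, 0.0)             # advanced core cluster member, V1 = 4
members = [(2.0, 0.0), (3.0, 0.0), (5.0, 0.0)]   # V2 = 2, 3, 5
kept = advanced_positives(m1, m1_adv, members)
```

With P1 = 3/4 the ratios are 1/2, 3/4 and 5/4, so the first two members are kept as advanced positive learning examples and the third is discarded.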

STEP23021 and STEP23022 describe the process, in the embodiment of the present application, of mining positive learning examples (including first-order positive learning examples and advanced positive learning examples) from the communication information set according to the prior cross-platform communication contents and the involved descriptions of the communication contents to be shared in the communication information set. The following describes the process of mining negative learning examples. Sub-step a and sub-step b may also be performed for each first-order core cluster member.

Sub-step a: among the core cluster members of the X1 clusters, determine the core cluster members of the X3 clusters with the largest quantization difference from the first-order core cluster member, and take them as the X3 passive core cluster member instances to be processed corresponding to the first-order core cluster member, where X3 is an integer greater than 1.

For example, the X3 passive core cluster member instances to be processed may be sorted from high to low according to their quantization difference from the first-order core cluster member, which simplifies the next step of determining the passive core cluster member instance corresponding to each first-order core cluster member.

Sub-step b: screen one of the X3 passive core cluster member instances to be processed as the passive core cluster member instance corresponding to the first-order core cluster member, according to the quantization difference between each of the X3 passive core cluster member instances to be processed and the first-order core cluster member and the quantization difference between each of them and the corresponding advanced core cluster member.

Illustratively, this step may specifically include, for each first-order core cluster member, the following: for each passive core cluster member instance to be processed corresponding to the primary core cluster member, performing the following steps until the passive core cluster member instance to be processed is determined to be the passive core cluster member instance corresponding to the primary core cluster member: determining a quantization difference between the negative core cluster member example to be processed and the first-order core cluster member, and taking the determined quantization difference as a third quantization difference; taking the advanced core cluster member with the smallest quantization difference with the passive core cluster member example to be processed among all the advanced core cluster members as the advanced core cluster member corresponding to the passive core cluster member example to be processed; determining quantization differences of the negative core cluster member examples to be processed and the corresponding advanced core cluster members, and taking the determined quantization differences as fourth quantization differences; and determining the passive core cluster member example to be processed as the passive core cluster member example corresponding to the first-order core cluster member under the condition that the third quantization difference and the fourth quantization difference meet a second set quantization discrimination requirement.

For example, the second set quantization discrimination requirement may be that the second ratio of the fourth quantization difference (V4) to the third quantization difference (V3) is greater than P2, where P2 is greater than 0 and less than or equal to 1, for example P2 = 0.5. In other words, the quantization difference between the passive core cluster member instance corresponding to the first-order core cluster member and all the advanced core cluster members needs to be as large as possible. The reason is as follows: only a few examples in the clusters where the advanced core cluster members are located may contain the target application requirement item, and the advanced core cluster members are close to the first-order core cluster member. If the quantization difference between the passive core cluster member instance to be processed and all of the advanced core cluster members is still sufficiently large (judged by its quantization difference from the closest advanced core cluster member), the examples in the cluster where the passive core cluster member instance to be processed is located are unlikely to contain the target application requirement item.

Each passive core cluster member instance to be processed is checked in turn to decide whether it can serve as the passive core cluster member instance of the first-order core cluster member, and the check stops as soon as one is found. The larger the quantization difference between a core cluster member and the first-order core cluster member, the more likely that core cluster member is a passive core cluster member instance. Therefore, the X3 passive core cluster member instances to be processed can be sorted from high to low by their quantization difference from the first-order core cluster member, so that the candidates most likely to qualify are checked first and the passive core cluster member instance corresponding to the first-order core cluster member is found as early as possible.

In addition, for a given first-order core cluster member, a corresponding passive core cluster member instance may not exist (the quantization difference between the other core cluster members and the first-order core cluster member is not large enough). In this case, the determination for that first-order core cluster member is terminated, and the passive core cluster member instance corresponding to the next first-order core cluster member is determined.

Further, in sub-step b, all communication content learning examples in the cluster where the passive core cluster member instance corresponding to the first-order core cluster member is located are taken as passive communication content learning examples, which form the passive learning example set.
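Illustratively, sub-step a and sub-step b can be sketched as a sorted candidate scan with an early exit. This is a minimal sketch assuming Euclidean distance as the quantization difference and distinct cluster centers; the function name `pick_negative_center`, its default values and the sample centers are hypothetical.

```python
import math

def pick_negative_center(first, centers, advanced, x3=3, p2=0.5):
    """Return the index of the passive (negative) center for the first-order
    center `first`, or None when no candidate qualifies.

    advanced: indices of the advanced core cluster members.
    """
    others = [c for c in range(len(centers)) if c != first]
    # Sub-step a: the X3 centers farthest from the first-order center,
    # sorted from high to low quantization difference.
    candidates = sorted(others,
                        key=lambda c: math.dist(centers[first], centers[c]),
                        reverse=True)[:x3]
    # Sub-step b: accept the first candidate with V4 / V3 > P2.
    for c in candidates:
        v3 = math.dist(centers[c], centers[first])            # third difference
        v4 = min(math.dist(centers[c], centers[a]) for a in advanced)  # fourth
        if v4 / v3 > p2:
            return c
    return None

centers = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (1.2, 0.0)]
chosen = pick_negative_center(first=0, centers=centers, advanced=[1])
none_found = pick_negative_center(first=0, centers=centers[:2] + centers[3:],
                                  advanced=[1])
```

In the first call the center at (10, 0) has V3 = 10 and V4 = 9, so the ratio 0.9 exceeds P2 and it is selected. In the second call the only remaining candidates are too close to the advanced center, so no passive core cluster member instance exists and `None` is returned.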

The beneficial effects achieved by applying the embodiment at least comprise: the prior communication information set (communication information set) can be mined to a plurality of positive learning examples and negative learning examples, and the positive learning examples and the negative learning examples can form a debugging learning example set to debug the application sharing demand detection network. The application sharing requirement detection network debugged according to the debugging learning example set can detect the target application requirement item to a certain extent, the thought can reduce the operation amount of annotation processing, and flexible reuse of detection processing can be realized when the target application requirement item to be detected changes.

In some examples, with respect to STEP240, embodiments of the present application also provide further implementations of debugging the application sharing demand detection network through a set of debug learning examples including perturbation examples, such as STEP2401-STEP2403.

STEP2401 pre-debugs the application sharing demand detection network through a preset sample set, so as to determine the pre-debugging network variables of the application sharing demand detection network and to optimize the module variables of the generation module of the application sharing demand detection network.

STEP2402 disassembles the set of debug learning examples into a number of debug learning example groups.

For example, the set of debug learning examples may be batched.

STEP2403 performs the following processing for each debug learning example group.

A part of active learning examples (active learning example group) and a part of passive learning examples (passive learning example group) are arbitrarily screened from the active learning example set and the passive learning example set, respectively, according to a set rule, as an auxiliary learning example set for the debug learning example group.

For example, if the debug learning example set includes 5000 communication contents to be shared (for example, 3000 in the active learning example set and 2000 in the passive learning example set) and the preset ratio is 1/5, then 1/5 of the communication contents to be shared in the active learning example set and in the passive learning example set are respectively screened (600 active learning examples and 400 passive learning examples, 1000 communication contents to be shared in total) as the auxiliary learning example set (reference samples) of the debug learning example group.

Further, some positive/negative learning examples can be arbitrarily screened, the operation difficulty can be simplified, and different auxiliary learning example sets can be arbitrarily screened for different debugging learning example groups, so that the filtering precision of the disturbance examples can be improved.
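Illustratively, the arbitrary screening of an auxiliary learning example set at a preset ratio can be sketched as follows. This is a minimal sketch; the function name `sample_auxiliary` and the use of an integer denominator (5 for the ratio 1/5) are assumptions made for illustration.

```python
import random

def sample_auxiliary(positives, negatives, denominator=5, seed=None):
    """Randomly draw 1/denominator of each set as the auxiliary examples."""
    rng = random.Random(seed)
    pos_group = rng.sample(positives, len(positives) // denominator)
    neg_group = rng.sample(negatives, len(negatives) // denominator)
    return pos_group, neg_group

# Hypothetical example identifiers: 3000 active and 2000 passive examples.
positives = list(range(3000))
negatives = list(range(3000, 5000))
pos_group, neg_group = sample_auxiliary(positives, negatives, seed=0)
```

This yields 600 active and 400 passive auxiliary examples. Calling the function again with a different seed for each debug learning example group gives each group its own auxiliary learning example set.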

Further, the selected communication content learning examples in the debug learning example group are determined based on the involved descriptions and topics (labels) of the auxiliary communication content learning examples in the auxiliary learning example set and the topics of the communication content learning examples in the debug learning example group.

For example, the description of the involvement of the auxiliary communication content learning examples in the auxiliary learning example set may also be obtained through clustering, that is, content clustering is performed on the active learning example group and the passive learning example group respectively, so as to obtain X4 active learning example clusters and X4 passive learning example clusters, where X4 is a positive integer greater than 1.

For example, with X4 = 20, content clustering is performed on the active learning example group and the passive learning example group respectively to obtain 20 active learning example clusters and 20 passive learning example clusters. There are thus 20 core cluster members for the active learning example group (called active core cluster members) and 20 core cluster members for the passive learning example group (called passive core cluster members). A cluster may contain only communication contents to be shared whose topic is "1" (active learning examples), only communication contents to be shared whose topic is "0" (passive learning examples), or both at once.

Further, for each active core cluster member of the X4 active learning example clusters, the auxiliary communication content learning example in the active learning example group with the smallest quantization difference from that active core cluster member is determined, and for each passive core cluster member of the X4 passive learning example clusters, the auxiliary communication content learning example in the passive learning example group with the smallest quantization difference from that passive core cluster member is determined, so as to obtain 2×X4 auxiliary communication content learning examples.

For example, under the condition that x4=20, 40 auxiliary communication content learning examples can be obtained, and the subject of each auxiliary communication content learning example is determined.

Then, for each of the plurality of communication content learning examples in the debug learning example group, the one of the 2×X4 auxiliary communication content learning examples having the smallest quantization difference from the communication content learning example is determined, and the communication content learning example is determined as a selected communication content learning example on the condition that the topic of that auxiliary communication content learning example is the same as the topic of the communication content learning example.

In this way, the perturbation examples among the plurality of communication content learning examples can be filtered out, that is, the perturbation examples are not used for network variable optimization of the application sharing demand detection network, while the remaining communication content learning examples are used for network variable optimization of the application sharing demand detection network.
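Illustratively, the filtering of perturbation examples can be sketched as a nearest-neighbor topic check against the auxiliary examples. This is a minimal sketch assuming Euclidean distance as the quantization difference; the function name `filter_perturbations` and the sample vectors are hypothetical, and only two auxiliary examples are used instead of 2×X4.

```python
import math

def filter_perturbations(group, auxiliary):
    """Keep the examples whose topic matches the nearest auxiliary example.

    group, auxiliary: lists of (feature_vector, topic) pairs, where the
    topic is 1 for active and 0 for passive learning examples.
    """
    selected = []
    for vec, topic in group:
        _, nearest_topic = min(auxiliary,
                               key=lambda a: math.dist(vec, a[0]))
        if nearest_topic == topic:
            selected.append((vec, topic))
        # Otherwise the example is treated as a perturbation example.
    return selected

auxiliary = [((0.0, 0.0), 1), ((10.0, 10.0), 0)]
group = [((0.5, 0.2), 1), ((9.0, 9.5), 0), ((0.1, 0.3), 0)]
selected = filter_perturbations(group, auxiliary)
```

The third example carries topic "0" but lies next to an auxiliary example with topic "1", so it is dropped and does not take part in the network variable optimization.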

In addition, the quantization difference determination and clustering described above use the description knowledge of the communication content learning examples in the debug learning example group and the description knowledge of the auxiliary communication content learning examples in the auxiliary learning example set. Therefore, before the clustering is performed, the communication content learning examples of the debug learning example group and all auxiliary communication content learning examples (including perturbation examples) in the auxiliary learning example set arbitrarily screened for that group are each input into the application sharing requirement detection network, and the network outputs the description knowledge of each communication content learning example at its max-pooling layer according to the current network variable.

Further, when optimizing the network variables of the application sharing demand detection network, the method can be realized based on forward processing and loss function combination.

Still within STEP2403, the current network variable of the application sharing requirement detection network is optimized through the selected communication content learning examples in the debug learning example group.

For example, the network variable can be optimized by back-propagating the loss function in combination with the Adam algorithm, until the loss function converges or the number of debugging iterations reaches the set number.
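Illustratively, the stopping rule (loss convergence or a set number of iterations) together with the Adam update can be sketched on a toy one-variable objective. This is only a sketch under stated assumptions: the quadratic loss stands in for the network loss, and the function name `adam_minimize`, the hyperparameter defaults and the tolerance are hypothetical values, not those of the embodiment.

```python
import math

def adam_minimize(grad_fn, loss_fn, w, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, tol=1e-10, max_steps=5000):
    """Adam on a single scalar variable; stops when the loss change falls
    below tol or when max_steps iterations have been performed."""
    m = v = 0.0
    prev = loss_fn(w)
    for t in range(1, max_steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
        cur = loss_fn(w)
        if abs(prev - cur) < tol:                # loss has converged
            return w, t
        prev = cur
    return w, max_steps                          # iteration cap reached

def loss(w):
    return (w - 3.0) ** 2      # toy loss with its minimum at w = 3

def grad(w):
    return 2.0 * (w - 3.0)

w_final, steps = adam_minimize(grad, loss, w=0.0)
```

In an actual debugging run the gradient would come from back-propagation through the application sharing requirement detection network over the selected learning examples of each group.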

The above describes the content of the debugging phase of the application sharing demand detection network, based on which the subsequent content is presented for the application phase of the application sharing demand detection network.

STEP310, obtain cross-platform communication information to be shared.

For example, the steps of the application stage of the application sharing requirement detection network may be implemented in the communication information processing system, where the cross-platform communication information to be shared may be pre-cached in the communication information processing system or may be obtained through another system.

STEP320 generates a set of communication information to be processed according to the cross-platform communication information to be shared.

For example, the to-be-shared cross-platform communication information is extracted to generate a to-be-processed communication information set.

STEP330 detects whether the to-be-shared communication content in the to-be-processed communication information set includes the target application requirement item through an application sharing requirement detection network, wherein the application sharing requirement detection network is obtained by debugging according to the learning example after filtering processing, and the filtering processing is used for reducing the annotation amount of the learning example.

STEP340, under the condition that at least one communication content to be shared in the communication information set to be processed includes the target application requirement item, determining that the cross-platform communication information to be shared includes the target application requirement item.
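Illustratively, the decision rule of STEP330-STEP340 is an "at least one" aggregation over the communication contents to be shared. The following is a minimal sketch in which `keyword_detector` is a hypothetical stand-in for the debugged application sharing requirement detection network, and the sample strings are invented for illustration.

```python
from typing import Callable, Iterable

def contains_target_item(contents: Iterable[str],
                         detector: Callable[[str], bool]) -> bool:
    """The information contains the target application requirement item
    when at least one of its contents is detected as containing it."""
    return any(detector(content) for content in contents)

def keyword_detector(content: str) -> bool:
    # Hypothetical placeholder for the detection network's per-content output.
    return "invoice" in content.lower()

pending_set = ["Meeting moved to 3 pm", "Please share the invoice template"]
result = contains_target_item(pending_set, keyword_detector)
empty_result = contains_target_item(["Hello", "See you soon"], keyword_detector)
```

The first information set is judged to contain the target application requirement item because its second content is detected; the second set is not.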

It can be appreciated that, by applying STEP310-STEP340, before cross-platform sharing is performed on the cross-platform communication information to be shared, the application sharing requirement detection network can be used to detect the target application requirement item, so that whether the cross-platform communication information to be shared contains the target application requirement item can be accurately determined. Therefore, when the cross-platform communication information to be shared is shared, targeted sharing processing can be performed according to the target application requirement items, so that the cross-platform communication information to be shared can be matched with and suitable for related application requirements/application scenes.

In addition, a re-inspection stage can be introduced in the implementation of the debugged application sharing demand detection network (which is actually used for target application demand item detection), more prior cross-platform communication content is obtained based on the re-inspection result, and errors are further checked, so that the network is further debugged.

Based on this, the target application demand item detection method may further include the following steps.

STEP350, performing target application demand item detection on the to-be-shared cross-platform communication information through the application sharing demand detection network to obtain a plurality of initial detection results.

For example, in an application stage of the application sharing requirement detection network, an initial detection result of each time is stored.

STEP360, obtaining a final detection result of at least one piece of cross-platform communication information to be shared in the plurality of pieces of cross-platform communication information to be shared.

The method includes the steps of carrying out rechecking on at least one piece of cross-platform communication information to be shared in the plurality of pieces of cross-platform communication information to be shared according to manual work to obtain at least one final detection result. The cross-platform communication information to be shared, which does not contain the target application requirement item, is generally sampled as the initial detection result, because if the cross-platform communication information to be shared actually contains the target application requirement item, the added prior cross-platform communication content can be obtained from the cross-platform communication information to realize the utilization of the past data.

STEP370, under the condition that the initial detection result of any cross-platform communication information to be shared does not include the target application requirement item, but the final detection result reflects that the target application requirement item is included, takes the communication content to be shared including the target application requirement item in the information set of the cross-platform communication information to be shared as the increased prior cross-platform communication content, and is used for screening the increased debugging learning example set of the application sharing requirement detection network.

In the exemplary re-inspection process, when it is determined that the neural network detects that the target application requirement item is not included in the to-be-shared cross-platform communication information, but actually includes the target application requirement item, the communication content learning example with an error may be captured (for example, positioning processing is performed) to extract the communication content learning example, and the communication content learning example is stored as the added prior cross-platform communication content, and when the later round of debugging of the application sharing requirement detection network is applied, the debugging learning example set may be mined according to the adjusted prior cross-platform communication content, and is used for the later round of debugging of the application sharing requirement detection network.

In the exemplary embodiment, when it is determined in the rechecking process that the certain cross-platform communication information to be shared, which is detected by the neural network to include the target application requirement item, is actually the case including the target application requirement item, the communication content to be shared including the target application requirement item may also be captured from the cross-platform communication information to be shared, and may also be used as the added prior cross-platform communication content.

For example, when the rechecking process determines that a piece of cross-platform communication information to be shared, which the neural network detected as containing the target application requirement item, does not actually contain it, no prior cross-platform communication content can be added. Because a certain degree of detection error can be tolerated, the embodiments of the present application may ignore this case.

In this way, the prior cross-platform communication content is expanded whenever the final detection result is inconsistent with the initial detection result (that is, the initial detection result does not contain the target application requirement item while the final detection result does), so that the prior cross-platform communication content is enriched as much as possible and the debugging quality of the application sharing requirement detection network is improved through the rechecking processing.
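The three recheck cases described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the `ReviewedItem` structure, its field names, the `include_confirmed` flag, and the use of plain strings for communication contents are all assumptions introduced for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReviewedItem:
    """One piece of cross-platform communication information after rechecking
    (hypothetical structure, not defined in the source)."""
    contents: list[str]        # communication contents to be shared
    initial_has_item: bool     # initial detection result from the network
    final_has_item: bool       # final detection result after rechecking
    # Indices of the contents located as containing the requirement item.
    located: list[int] = field(default_factory=list)


def expand_prior_contents(prior, reviewed, include_confirmed=True):
    """Return a new list of prior contents extended per the recheck outcome."""
    expanded = list(prior)  # do not mutate the caller's list
    for item in reviewed:
        if not item.final_has_item:
            # The item is actually absent: nothing can be added. A false
            # positive from the network is tolerated and ignored.
            continue
        if item.initial_has_item and not include_confirmed:
            # Confirmed detection: adding it is optional.
            continue
        # Missed detection (initial False, final True) is always added.
        for idx in item.located:
            content = item.contents[idx]
            if content not in expanded:
                expanded.append(content)
    return expanded
```

The expanded list would then serve as the prior cross-platform communication content when the debugging learning example set is screened for a later round of debugging.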

Further, there is also provided a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the above-described method.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus and method embodiments described above are merely illustrative. For example, the flow diagrams and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.

If the functions are implemented in the form of software functional modules and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a network device, or the like) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

The above description covers only the preferred embodiments of the present application and is not intended to limit the present application; those skilled in the art may make various modifications and variations to the present application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. An information processing method for cross-platform communication, which is applied to a communication information processing system, the method comprising:

under the condition that at least one communication content to be shared in the communication information set to be processed contains the target application requirement item, determining that the cross-platform communication information to be shared contains the target application requirement item;

The step of debugging the application sharing requirement detection network comprises the following steps:

debugging the application sharing requirement detection network through the debugging learning example set;

the annotation data is an identifier, carried by the prior cross-platform communication information, that is associated with the target application requirement item, and the annotation data is at least one of a mask, a mark and a theme; the involvement description of the plurality of communication contents to be shared is a grouping, distribution or clustering characteristic of the plurality of communication contents to be shared.

2. The method of claim 1, wherein the method further comprises:

3. The method of claim 1, wherein the annotation data is at least one of a mask, a mark, and a theme, and the prior cross-platform communication information in the prior communication information set is extracted from an original prior communication information set based on characterization information of the target application requirement item.

4. The method of claim 1, wherein the involvement description of the plurality of communication contents to be shared is obtained by content clustering the plurality of communication contents to be shared, and wherein selecting a debugging learning example set for the application sharing requirement detection network from the plurality of communication contents to be shared according to the X prior cross-platform communication contents and the involvement description of the plurality of communication contents to be shared comprises:

5. The method of claim 4, wherein deriving the set of positive learning examples and the set of negative learning examples based on a quantization difference between each prior cross-platform communication content and the core cluster members of the X1 clusters, and a quantization difference between the core cluster members of the X1 clusters, comprises:

taking the initial positive learning example set and the advanced positive learning example set as the positive learning example set;

wherein obtaining the initial positive learning example set according to the quantization difference between each prior cross-platform communication content and the core cluster members of the X1 clusters comprises: for each prior cross-platform communication content, taking, from among the core cluster members of the X1 clusters, the core cluster members of the X2 clusters having the smallest quantization difference from that prior cross-platform communication content as the initial core cluster members corresponding to that prior cross-platform communication content, wherein X2 is an integer greater than 1; and taking all communication content learning examples in the clusters where the initial core cluster members corresponding to the X prior cross-platform communication contents are located as initial positive learning examples forming the initial positive learning example set;

wherein obtaining the advanced positive learning example set according to the quantization differences between the core cluster members of the X1 clusters comprises: for each initial core cluster member, determining, among the core cluster members of the X1 clusters, at least one core cluster member that is not an initial core cluster member and is closest to the initial core cluster member, taking the closest at least one core cluster member as the corresponding advanced core cluster member, and taking the quantization difference between the initial core cluster member and the corresponding advanced core cluster member as a first quantization difference; and determining the communication content learning examples that meet a first requirement in the cluster where the corresponding advanced core cluster member is located as advanced positive learning examples forming the advanced positive learning example set, wherein the quantization difference between a communication content learning example meeting the first requirement and the initial core cluster member is a second quantization difference, and the second quantization difference and the first quantization difference satisfy a first set quantization discrimination requirement.

6. The method of claim 5, wherein deriving the set of negative learning examples from a quantization difference between core cluster members of the X1 clusters comprises:

taking all communication content learning examples in the cluster where the negative core cluster member corresponding to the initial core cluster member is located as negative learning examples forming the negative learning example set;

wherein selecting one of X3 candidate negative core cluster members as the negative core cluster member corresponding to the initial core cluster member, according to the quantization differences between the X3 candidate negative core cluster members and the initial core cluster member and the quantization differences between the X3 candidate negative core cluster members and the corresponding advanced core cluster member, comprises:

7. The method of claim 1, wherein debugging the application sharing requirement detection network through the debugging learning example set comprises:

for each debug learning example group:

performing a round of optimization on the current network variable of the application sharing requirement detection network through the selected communication content learning examples in the debugging learning example group;

wherein determining the selected communication content learning examples in the debugging learning example group, based on the involvement description and theme of the auxiliary communication content learning examples in the auxiliary learning example set and the theme of the communication content learning examples in the debugging learning example group, comprises:

8. A communication information processing system, comprising a processor and a memory; the processor is communicatively connected to the memory, the processor being configured to read a computer program from the memory and execute the computer program to implement the method of any of claims 1-7.

9. A computer readable storage medium, characterized in that a program is stored thereon, which program, when being executed by a processor, implements the method of any of claims 1-7.
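As an illustration of the positive learning example selection recited in claim 5, the following Python sketch treats each communication content as a numeric feature vector and uses Euclidean distance as the quantization difference. These choices are assumptions, as are the function name, the parameter names, the selection of exactly one nearest advanced core cluster member, and the rule `d2 <= ratio * d1` standing in for the first set quantization discrimination requirement, which the claims do not spell out.

```python
import math


def select_positive_examples(priors, cores, members, x2=2, ratio=1.0):
    """Sketch of initial and advanced positive example selection.

    priors  : feature vectors of the X prior contents
    cores   : core cluster member (one vector) for each of the X1 clusters
    members : members[i] lists the example vectors in cluster i
    x2      : number of closest cores kept per prior (X2 > 1 in the claim)
    ratio   : hypothetical threshold factor for the assumed rule d2 <= ratio * d1
    """
    # Initial core cluster members: the x2 cores closest to each prior.
    initial = set()
    for p in priors:
        ranked = sorted(range(len(cores)), key=lambda i: math.dist(p, cores[i]))
        initial.update(ranked[:x2])

    # Every example in an initial core's cluster is an initial positive example.
    initial_positive = [ex for i in sorted(initial) for ex in members[i]]

    # Advanced positives: look into the nearest non-initial cluster.
    advanced_positive = []
    others = [j for j in range(len(cores)) if j not in initial]
    for i in sorted(initial):
        if not others:
            break
        j = min(others, key=lambda k: math.dist(cores[i], cores[k]))
        d1 = math.dist(cores[i], cores[j])      # first quantization difference
        for ex in members[j]:
            d2 = math.dist(cores[i], ex)        # second quantization difference
            if d2 <= ratio * d1 and ex not in advanced_positive:
                advanced_positive.append(ex)
    return initial_positive, advanced_positive
```

The positive learning example set is then the union of the two returned lists; a different distance measure or discrimination rule could be substituted without changing the overall flow.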