WO2022121801A1 - Information processing method, apparatus and electronic device - Google Patents

Information processing method, apparatus and electronic device

Info

Publication number
WO2022121801A1
Authority
WO
WIPO (PCT)
Prior art keywords
cluster center
clustered
feature vector
model
center
Prior art date
Application number
PCT/CN2021/135402
Other languages
English (en)
French (fr)
Inventor
吴培昊
谭言信
雷孝钧
Original Assignee
北京有竹居网络技术有限公司
Priority date
Filing date
Publication date
Application filed by 北京有竹居网络技术有限公司
Publication of WO2022121801A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval of unstructured textual data
    • G06F16/33 - Querying
    • G06F16/332 - Query formulation
    • G06F16/3329 - Natural language query formulation or dialogue systems
    • G06F16/3331 - Query processing
    • G06F16/334 - Query execution
    • G06F16/3344 - Query execution using natural language analysis
    • G06F16/35 - Clustering; Classification
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques
    • G06F18/232 - Non-hierarchical techniques
    • G06F18/2321 - Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 - Non-hierarchical techniques with a fixed number of clusters, e.g. K-means clustering

Definitions

  • the present disclosure relates to the field of Internet technologies, and in particular, to an information processing method, an apparatus, and an electronic device.
  • high-frequency question answering is an important basic capability, which relies on the standard question and answer library in the background.
  • the sources of content in the Q&A database include offline manual planning and online high-frequency question collection. The addition of the latter can greatly enrich the standard Q&A database and improve FAQ coverage.
  • Online high-frequency problems often come from data analysis of online problems. Therefore, the ability to analyze and process online issues is crucial.
  • an embodiment of the present disclosure provides an information processing method, the method including: importing at least two problems to be clustered into a clustering model to obtain at least one target cluster center, wherein the target cluster center indicates a cluster; and determining, based on the at least one target cluster center, the at least two problems to be clustered as at least one cluster.
  • an embodiment of the present disclosure provides an information processing apparatus, including: a generating unit, configured to import at least two problems to be clustered into a clustering model to obtain at least one target cluster center, wherein the target cluster center indicates a cluster; and a determining unit, configured to determine the at least two problems to be clustered as at least one cluster based on the at least one target cluster center.
  • embodiments of the present disclosure provide an electronic device, including: one or more processors; and a storage device for storing one or more programs, the one or more programs, when executed by the one or more processors, causing the one or more processors to implement the information processing method as described in the first aspect.
  • an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, and when the program is executed by a processor, implements the steps of the information processing method described in the first aspect.
  • FIG. 1 is a flowchart of one embodiment of an information processing method according to the present disclosure
  • FIG. 2A is a schematic diagram of a training flow of a classification model according to the present disclosure.
  • FIG. 2B is a schematic diagram of an application scenario of the information processing method according to the present disclosure.
  • FIG. 3 is a schematic diagram of an optional implementation of step 202 according to the present disclosure.
  • FIG. 4 is a schematic diagram of an optional implementation of step 101 of the information processing method according to the present disclosure.
  • FIG. 5 is a schematic diagram of an optional implementation of step 402 of the information processing method of the present disclosure.
  • FIG. 6 is a schematic diagram of another optional implementation manner of step 402 of the information processing method of the present disclosure.
  • FIG. 7 is a schematic structural diagram of an embodiment of an information processing apparatus according to the present disclosure.
  • FIG. 8 is an exemplary system architecture to which an information processing method of an embodiment of the present disclosure may be applied.
  • FIG. 9 is a schematic diagram of a basic structure of an electronic device provided according to an embodiment of the present disclosure.
  • the term “including” and variations thereof are open-ended, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • FIG. 1 shows a flow of an embodiment of an information processing method according to the present disclosure.
  • the information processing method includes the following steps:
  • Step 101 Import at least two problems to be clustered into a clustering model to obtain at least one target cluster center.
  • an execution body (e.g., a server) of the information processing method may import at least two problems to be clustered into a clustering model to obtain at least one target cluster center.
  • the number of questions to be clustered may be at least two.
  • the problem to be clustered can be textual information.
  • the fields involved in the clustering problem may be various fields, which are not limited here.
  • the above-mentioned at least one target cluster center may be used to indicate a cluster, and may also be understood as indicating a problem type of the problem to be clustered. Problems to be clustered belonging to the same cluster can be understood as problems belonging to the same type.
  • the type of problem corresponding to a cluster exists objectively; however, the name of the type of problem corresponding to a cluster may be determined before the appearance of the cluster, or it may be determined after the cluster is determined.
  • the problems in the clusters can be further analyzed to obtain corresponding analysis results.
  • the problems in the cluster can be analyzed to find the problems that have not appeared in the collected problem set, so as to realize the mining of new problems.
  • the problem types corresponding to each cluster can be analyzed to find the problem types that have not appeared in the collected problem set.
  • if the problem types involved in an entire cluster were not included before, new problem types can thereby be mined.
  • the clustering model described above may include a feature extraction sub-model.
  • the feature extraction sub-model can generate the feature vector corresponding to the problem to be clustered, and the feature vector is used for clustering to determine the target cluster center.
  • the above-mentioned feature extraction sub-model is obtained based on the feature extraction layer of a pre-trained classification model.
  • the classification model may be pre-trained.
  • a classification model can include feature extraction layers and classification layers. Then, the feature extraction layer of the trained classification model can be used as a feature extraction sub-model.
  • the feature extraction layer in the classification model has the ability to extract type features, that is, it can expand the difference between different types of problems to be clustered, and reduce the difference between problems of the same type to be clustered.
  • Step 102 Determine the at least two problems to be clustered as at least one cluster based on at least one target cluster center.
  • the above-mentioned execution subject may determine the at least two problems to be clustered as at least one cluster according to the at least one target cluster center determined in step 101 .
  • the cluster center to which the problem to be clustered belongs can be determined by determining the distance between the problem to be clustered and the center of each target cluster. Therefore, the problems to be clustered under the center of each target cluster can be regarded as a cluster, so that at least two problems to be clustered can be divided into at least one cluster.
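  • As an illustration of the distance-based assignment just described, the following minimal sketch (assuming NumPy; the variable names are placeholders rather than terms from the disclosure) assigns each problem to be clustered to its nearest target cluster center:

```python
import numpy as np

def assign_clusters(features: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Assign each problem's feature vector to its nearest target cluster center.

    features: (num_problems, dim) feature vectors of the problems to be clustered.
    centers:  (num_clusters, dim) target cluster centers.
    Returns one cluster index per problem.
    """
    # Pairwise Euclidean distance between every feature vector and every center.
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
    # Each problem belongs to the cluster whose center is closest.
    return dists.argmin(axis=1)

# Problems that receive the same index form one cluster, so at least two
# problems to be clustered are divided into at least one cluster.
```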
  • In the information processing method provided by this embodiment, by importing the problems to be clustered into the clustering model, at least one target cluster center is obtained; then, according to the target cluster center, the at least two problems to be clustered are determined as at least one cluster.
  • a new clustering method can be provided, which improves the clustering speed and clustering accuracy for problems.
  • the feature extraction sub-model in the clustering model that determines the center of the target cluster has the ability to extract type features, so the feature vector that is the basis for clustering can have a better type representation. Therefore, the clustering efficiency can be improved and the time consumed by the clustering can be reduced; and the accuracy of the clustering can be improved.
  • the above classification model can be obtained through the first step.
  • the first step can be implemented through the flow shown in FIG. 2A .
  • the flow shown in FIG. 2A may include step 201 and step 202 .
  • Step 201 acquiring training samples.
  • the above training samples can have labels, and the labels can indicate the text content type.
  • the text content type may involve various fields and is not limited here.
  • Step 202 Based on the training samples and corresponding labels, the classification network to be trained is trained to obtain the classification model.
  • the above classification network to be trained may include a feature extraction layer to be trained and a classification layer to be trained.
  • the feature extraction layer in the classification model can be obtained by training the feature extraction layer to be trained.
  • the specific structure of the feature extraction layer to be trained can be set according to the actual application scenario, which is not limited here.
  • the feature extraction layer to be trained may comprise a convolutional neural network.
  • the feature extraction layer to be trained may adopt a BERT (Bidirectional Encoder Representations from Transformers) structure.
  • the specific structure of the classification layer to be trained can be set according to the actual application scenario, which is not limited here.
  • the classification layer to be trained may include a pooling layer and a fully connected layer; the fully connected layer is used to map features to types.
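  • For concreteness, the sketch below shows one way such a classification network could look: a BERT feature extraction layer followed by a pooling layer and a fully connected layer that maps features to types. PyTorch, the Hugging Face transformers package, and the pretrained model name are assumptions of this sketch rather than requirements of the disclosure.

```python
import torch.nn as nn
from transformers import BertModel

class QuestionClassifier(nn.Module):
    """Classification network: feature extraction layer (BERT) + classification layer."""

    def __init__(self, num_types: int, bert_name: str = "bert-base-chinese"):
        super().__init__()
        # Feature extraction layer to be trained.
        self.encoder = BertModel.from_pretrained(bert_name)
        hidden = self.encoder.config.hidden_size
        # Fully connected layer of the classification layer: maps features to types.
        self.classifier = nn.Linear(hidden, num_types)

    def encode(self, input_ids, attention_mask):
        # Token-level vectors from BERT, pooled into one sentence-level representation.
        token_vecs = self.encoder(input_ids=input_ids,
                                  attention_mask=attention_mask).last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()
        return (token_vecs * mask).sum(dim=1) / mask.sum(dim=1)

    def forward(self, input_ids, attention_mask):
        return self.classifier(self.encode(input_ids, attention_mask))
```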
  • the training samples can be imported into the classification network to be trained, and the classification results can be obtained. Then compare the classification results with the labels corresponding to the training samples to determine the loss value. Afterwards, the loss value can be used for backpropagation to adjust the parameters of the classification network to be trained.
  • Through multiple iterations of this step, until the condition for stopping the iteration is satisfied, the trained classification network is determined as the classification model.
  • FIG. 2B shows a schematic diagram of an exemplary application scenario of the embodiment of the present application.
  • the classification task of the training samples can be used to train the pre-established classification network to be trained to obtain the classification model.
  • the classification network to be trained may include a feature extraction layer to be trained and a classification layer to be trained.
  • the trained feature extraction layer can be taken from the classification model as a feature extraction sub-model in the clustering model.
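  • Continuing the hypothetical sketch above, reusing the trained feature extraction layer to produce feature vectors for the problems to be clustered might look like this (the tokenizer name is again a placeholder):

```python
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")

@torch.no_grad()
def extract_features(trained_model, questions):
    """Feature vectors produced by the feature extraction sub-model.

    trained_model is a trained QuestionClassifier from the earlier sketch;
    only its encoder (the trained feature extraction layer) is exercised here.
    """
    batch = tokenizer(questions, padding=True, truncation=True, return_tensors="pt")
    return trained_model.encode(batch["input_ids"], batch["attention_mask"])
```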
  • the clustering model may include a feature extraction sub-model and a clustering sub-model.
  • the distance between the problem to be clustered and the center of each target cluster can be determined to determine the target cluster center to which the problem to be clustered belongs. Therefore, the problems to be clustered under the center of each target cluster can be regarded as a cluster, so that at least two problems to be clustered can be divided into at least one cluster.
  • the above-mentioned step 202 may include the steps shown in FIG. 3 .
  • the steps shown in FIG. 3 may include step 301 , step 302 , step 303 and step 304 .
  • Step 301 Import at least two training samples into the classification network to be trained, and obtain prediction types corresponding to the at least two training samples.
  • the labels of the above at least two training samples are different.
  • Step 302 Determine a single sample loss value of each training sample according to the prediction type and label of each training sample.
  • various loss calculation methods can be used to determine the loss value of a single sample.
  • a cross-entropy loss function can be employed to determine a single sample loss value.
  • Step 303 Determine the total sample loss value according to the determined single sample loss value.
  • the individual sample loss values can be combined in various ways to determine the total sample loss value.
  • the determined individual sample loss values may be added together, and the resulting sum taken as the sample total loss value.
  • Step 304 based on the total loss value of the samples, adjust the parameters of the classification network to be trained.
  • the total sample loss value can be used for backpropagation to adjust the parameters of the classification network to be trained.
  • Two different types of problem sample sets can be obtained as training sets for the classification model. Then, in each training iteration, problem samples are extracted from each problem sample set to form a pair of training samples. After that, each training sample is vectorized by BERT, and a sentence-level overall representation is obtained by pooling the vectorized output. The pooled output is then mapped to the type dimension through the linear layer for classification, and the classification result is obtained. A single-sample loss value is calculated from each classification result and the corresponding label; the single-sample loss values are added to obtain the total sample loss value, and the total sample loss value is used for backpropagation to update the parameters of BERT.
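  • A hedged sketch of the training step just described, reusing the QuestionClassifier and tokenizer from the earlier sketches (sampling strategy, optimizer, and hyperparameters are illustrative assumptions): a pair of samples with different labels is drawn, each is classified, the single-sample cross-entropy losses are summed into the total sample loss, and backpropagation updates the BERT parameters.

```python
import random
import torch
import torch.nn.functional as F

def train_step(model, tokenizer, optimizer, type_a_samples, type_b_samples):
    """One training iteration on a pair of problem samples with different labels."""
    texts = [random.choice(type_a_samples), random.choice(type_b_samples)]
    labels = torch.tensor([0, 1])          # type A -> label 0, type B -> label 1
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

    logits = model(batch["input_ids"], batch["attention_mask"])
    # Single-sample loss values, then the total sample loss as their sum.
    single_losses = F.cross_entropy(logits, labels, reduction="none")
    total_loss = single_losses.sum()

    optimizer.zero_grad()
    total_loss.backward()                  # backpropagation updates BERT's parameters
    optimizer.step()
    return total_loss.item()

# Usage sketch:
# model = QuestionClassifier(num_types=2)
# optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
# for _ in range(num_steps):               # repeat until the stopping condition is met
#     train_step(model, tokenizer, optimizer, type_a_samples, type_b_samples)
```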
  • Because the labels of the at least two training samples are different, the classification network to be trained develops good generalization ability during training, that is, it can extract relatively accurate features for various types of training samples.
  • If a single sample loss value were used to adjust the parameters of the classification network to be trained, it could be difficult for the network to take various types of problem samples into account. For example, after a classification network whose parameters were adjusted on a problem sample set of type A is further updated with a problem sample set of type B, the updated network may have poor representation ability for problem samples of type A.
  • the foregoing step 101 may be implemented through the flow shown in FIG. 4.
  • the flow shown in FIG. 4 may include step 401 and step 402 .
  • Step 401 Import the problem to be clustered into the feature extraction sub-model to obtain a first feature vector.
  • Step 402 based on the back-propagation algorithm and the first feature vector, update the initial cluster center to obtain the at least one target cluster center.
  • the initial cluster center can be determined by random setting.
  • updating the initial cluster center through the first feature vector and the back-propagation algorithm can be understood as using deep learning to determine the target cluster center, which can improve the accuracy of determining the target cluster center.
  • the above-mentioned initial cluster centers may be obtained by clustering the first feature vector by using a mean clustering algorithm.
  • the mean clustering algorithm may include the k-means clustering algorithm (K-means).
  • the principle of the K-means algorithm is briefly described as follows: First, K objects are randomly selected as the initial clustering centers. Then calculate the distance between each object and each seed cluster center, and assign each object to its nearest cluster center. Cluster centers and the objects assigned to them represent a cluster. Once all objects have been assigned, the cluster center for each cluster is recalculated based on the existing objects in the cluster. This process will repeat until a certain termination condition is met.
  • Generating the initial cluster centers by means of the mean clustering algorithm can make the initial cluster centers more suitable for the actual scene of this clustering, improve the accuracy of the initial cluster centers, and reduce the time and computation required to obtain the target cluster centers based on the initial cluster centers.
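  • As a sketch, the initial cluster centers could be produced by running K-means over the first feature vectors, for example with scikit-learn (an implementation choice assumed here, not prescribed by the disclosure):

```python
import numpy as np
from sklearn.cluster import KMeans

def initial_cluster_centers(first_feature_vectors: np.ndarray, k: int) -> np.ndarray:
    """Cluster the first feature vectors with K-means and return the K centers."""
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
    kmeans.fit(first_feature_vectors)
    return kmeans.cluster_centers_          # shape: (k, feature_dim)

# Usage sketch, with features produced by the feature extraction sub-model:
# init_centers = initial_cluster_centers(features.detach().cpu().numpy(), k=10)
```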
  • step 402 may include the steps shown in FIG. 5 .
  • the steps shown in FIG. 5 may include step 501 , step 502 and step 503 .
  • Step 501 Determine the initial cluster center as the first candidate cluster center.
  • Step 502 Based on the first candidate cluster center, the following first iterative step is performed: based on the first candidate cluster center and the first feature vector, determine the first probability value that the problem to be clustered belongs to each first candidate cluster center; perform reinforcement processing on each first probability value to obtain a first reinforcement value; generate a first loss value according to the first reinforcement value and the first probability value; and, in response to determining that the first stopping condition is satisfied, determine the first candidate cluster center as the target cluster center and output it.
  • the first candidate cluster center can be continuously updated.
  • the first candidate cluster centers may be different each time the iterative steps are performed.
  • the number of the first candidate cluster centers may be at least one; that is, it may be one or at least two.
  • determining the first probability value that the problem to be clustered belongs to each first candidate cluster center can be realized in various ways. This is not limited.
  • For each first feature vector, the distance between the feature vector and each first candidate cluster center is calculated; the ratio of the distance between the first feature vector and a target first candidate cluster center to the sum of the distances to all first candidate cluster centers is calculated, and the ratio is determined as the first probability value.
  • The square of the distance between the feature vector and each first candidate cluster center can also be calculated, and 1 is then added to the square to obtain a first sum.
  • Each first candidate cluster center thus corresponds to a first sum; the ratio of the first sum corresponding to a first candidate cluster center to the sum of the first sums is calculated, and the ratio is determined as the first probability value.
  • reinforcement processing is used to widen the gap between the first probability values.
  • The reinforcement processing can increase the proportion, within the totality of the first probability values, of those first probability values with higher confidence.
  • The square of each first probability value can be computed, and the ratio of that square to the sum of the squares of the first probability values is used to determine the first reinforcement value.
  • generating the first loss value according to the first reinforcement value and the first probability value may be implemented in various ways, which are not limited herein.
  • the logarithm of the ratio of the first reinforcement value to the first probability value may be taken as the first loss value.
  • As the first probability values move toward the first reinforcement values, the first loss value becomes smaller and smaller and tends to converge (e.g., to a constant). Therefore, the iteration of the cluster centers can be realized as the first iterative step proceeds.
  • Step 503 in response to determining that the first stopping condition is not satisfied, perform backpropagation based on the generated first loss value, adjust the first candidate cluster center to obtain a new first candidate cluster center, and jump to execute the first iteration step .
  • the first stop condition can be set according to the actual application scenario.
  • the first stopping condition may include, but is not limited to, at least one of the following: the number of iterations is not less than a preset number of times threshold, and the first loss value is not less than a preset loss value threshold.
  • backpropagation can be performed based on the first loss value, and the value of the first candidate cluster center can be adjusted to obtain a new first candidate cluster center. Then, jump to the first iterative step and continue to perform it (the first candidate cluster center on which the first iterative step is performed this time is different from that of the previous round).
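  • The first iterative step can be pictured with the PyTorch sketch below. The description leaves the exact formulas partly open, so this sketch makes assumptions: a Student's-t style soft assignment for the first probability values, squaring and renormalizing for the reinforcement values, and a KL-divergence-style loss built from the logarithm of the ratio of reinforcement value to probability value. Stopping thresholds are placeholders, and only the candidate cluster centers receive gradients; the first feature vectors stay fixed.

```python
import torch

def soft_assignment(features: torch.Tensor, centers: torch.Tensor) -> torch.Tensor:
    """First probability values of each problem belonging to each candidate cluster center.

    Assumed form: proportional to 1 / (1 + squared distance), normalized over centers.
    """
    sq_dist = torch.cdist(features, centers).pow(2)       # (n_problems, n_centers)
    q = 1.0 / (1.0 + sq_dist)
    return q / q.sum(dim=1, keepdim=True)

def reinforce(q: torch.Tensor) -> torch.Tensor:
    """Reinforcement values: square the probabilities and renormalize to widen their gaps."""
    p = q.pow(2)
    return p / p.sum(dim=1, keepdim=True)

def first_iterative_step(features, init_centers, max_iters=100, tol=1e-4, lr=1e-2):
    centers = torch.nn.Parameter(init_centers.clone())    # first candidate cluster centers
    optimizer = torch.optim.Adam([centers], lr=lr)
    prev_loss = None
    for _ in range(max_iters):                             # stopping condition: iteration count
        q = soft_assignment(features, centers)
        p = reinforce(q).detach()                          # treated as a fixed target this round
        # Loss built from log(reinforcement / probability), weighted by the reinforcement values.
        loss = (p * (p.clamp_min(1e-9) / q.clamp_min(1e-9)).log()).sum(dim=1).mean()
        if prev_loss is not None and abs(prev_loss - loss.item()) < tol:
            break                                          # stopping condition: loss has converged
        optimizer.zero_grad()
        loss.backward()                                    # backpropagation adjusts only the centers
        optimizer.step()
        prev_loss = loss.item()
    return centers.detach()                                # the target cluster centers
```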
  • step 102 may include: determining the cluster to which the problem to be clustered belongs according to the first feature vector and the target cluster center.
  • each target cluster center may have its own problem to be clustered, that is, cluster division is performed for at least two problems to be clustered.
  • step 402 may include the steps shown in FIG. 6 .
  • the steps shown in FIG. 6 may include step 601 , step 602 and step 603 .
  • Step 601 Determine the initial cluster center as the second candidate cluster center, and determine the first feature vector as the second feature vector.
  • the first feature vector is the first feature vector generated by the initial feature extraction sub-model.
  • the term second feature vector can be understood as a name for distinguishing it from the first feature vector, and does not mean that the specific value of the first feature vector has changed.
  • Step 602 Based on the second candidate cluster center and the second feature vector, perform the following second iterative step: based on the second candidate cluster center and the second feature vector, determine the second probability value that the problem to be clustered belongs to each second candidate cluster center; perform enhancement processing on each second probability value to obtain a second enhancement value; generate a second loss value according to the second enhancement value and the second probability value; and, in response to determining that the second stopping condition is satisfied, determine the second candidate cluster center as the target cluster center and output it, and determine the feature extraction sub-model as the adjusted feature extraction sub-model.
  • the second candidate cluster center can be continuously updated.
  • the second candidate cluster center may be different each time the iterative step is performed.
  • the number of the second candidate cluster centers may be at least one; that is, it may be one or at least two.
  • determining the second probability value that the problem to be clustered belongs to each second candidate cluster center can be realized in various ways. This is not limited.
  • For each second feature vector, the distance between the feature vector and each second candidate cluster center is calculated; the ratio of the distance between the second feature vector and a target second candidate cluster center to the sum of the distances to all second candidate cluster centers is calculated, and the ratio is determined as the second probability value.
  • The square of the distance between the feature vector and each second candidate cluster center can also be calculated, and 1 is then added to the square to obtain a second sum.
  • Each second candidate cluster center thus corresponds to a second sum; the ratio of the second sum corresponding to a second candidate cluster center to the sum of the second sums is calculated, and the ratio is determined as the second probability value.
  • reinforcement processing is used to widen the gap between the second probability values.
  • The enhancement processing can increase the proportion, within the totality of the second probability values, of those second probability values with higher confidence.
  • The square of each second probability value can be computed, and the ratio of that square to the sum of the squares of the second probability values is used to determine the second enhancement value.
  • the second loss value is generated according to the second enhancement value and the second probability value, which can be implemented in various ways, which are not limited here.
  • the logarithm of the ratio of the second reinforcement value to the second probability value may be taken as the second loss value.
  • As the second probability values move toward the second enhancement values, the second loss value tends more and more toward zero.
  • the second iterative step can be performed to realize the iteration of the clustering.
  • Step 603 In response to determining that the second stopping condition is not satisfied, adjust the second candidate cluster center based on the generated second loss value to obtain a new second candidate cluster center, adjust the parameters of the feature extraction sub-model based on the generated second loss value, import the problems to be clustered into the adjusted feature extraction sub-model to obtain a new second feature vector, and jump to execute the second iterative step.
  • the second stop condition can be set according to the actual application scenario.
  • the second stopping condition may include, but is not limited to, at least one of the following: the number of iterations is not less than a preset number of times threshold, and the second loss value is not less than a preset loss value threshold.
  • backpropagation can be performed based on the second loss value, and the value of the second candidate cluster center can be adjusted to obtain a new second candidate cluster center. Then, jump to the second iterative step and continue to perform it (the second candidate cluster center on which the second iterative step is performed this time is different from that of the previous round).
  • the parameters of the feature extraction sub-model can be adjusted based on the second loss value.
  • the feature extraction sub-model is constantly updated.
  • For the problems to be clustered, a new feature vector can thus be obtained from the updated feature extraction sub-model.
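  • The second iterative step differs mainly in that the parameters of the feature extraction sub-model are updated together with the candidate cluster centers, and the problems to be clustered are re-encoded after every update. A hedged sketch, reusing `soft_assignment` and `reinforce` from the previous sketch and the `QuestionClassifier.encode` interface assumed earlier (optimizer settings and thresholds are placeholders):

```python
import torch

def second_iterative_step(model, batch, init_centers, max_iters=100, tol=1e-4, lr=1e-4):
    """Jointly refine the candidate cluster centers and the feature extraction sub-model."""
    centers = torch.nn.Parameter(init_centers.clone())     # second candidate cluster centers
    optimizer = torch.optim.Adam(list(model.parameters()) + [centers], lr=lr)
    prev_loss = None
    for _ in range(max_iters):                              # stopping condition: iteration count
        # Second feature vectors: re-encoded with the continually adjusted sub-model.
        features = model.encode(batch["input_ids"], batch["attention_mask"])
        q = soft_assignment(features, centers)              # second probability values
        p = reinforce(q).detach()                           # second enhancement values (fixed target)
        loss = (p * (p.clamp_min(1e-9) / q.clamp_min(1e-9)).log()).sum(dim=1).mean()
        if prev_loss is not None and abs(prev_loss - loss.item()) < tol:
            break                                           # stopping condition: loss has converged
        optimizer.zero_grad()
        loss.backward()                                     # gradients reach both the centers and the encoder
        optimizer.step()
        prev_loss = loss.item()
    return centers.detach(), model                          # target centers and adjusted sub-model
```

  • After the loop, the adjusted feature extraction sub-model re-encodes the problems to be clustered, and the resulting vectors are assigned to the nearest target cluster centers, as described next.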
  • the step 102 may include: importing the problem to be clustered into the adjusted feature extraction sub-model to obtain a third feature vector; and determining the problem to be clustered according to the third feature vector and the target cluster center the class cluster to which it belongs.
  • In step 603, backpropagation is performed based on the second loss value to adjust the parameters of the feature extraction sub-model, so the feature extraction sub-model is constantly updated.
  • The feature extraction sub-model used in each second iterative step can be the latest feature extraction sub-model retained after the update. Therefore, those skilled in the art can understand that when the problem to be clustered is imported into the adjusted feature extraction sub-model, it is imported into the latest feature extraction sub-model retained after the update, and the obtained third feature vector can thus more accurately express the type features of the problem to be clustered.
  • Each target cluster center may have its own problems to be clustered; that is, the at least two problems to be clustered are grouped, thereby achieving the determination of at least one cluster.
  • In this way, the feature characterization ability of the feature extraction sub-model can be further improved, thereby improving the accuracy of clustering.
  • the present disclosure provides an embodiment of an information processing apparatus.
  • the apparatus embodiment corresponds to the method embodiment shown in FIG. 1, and the apparatus may specifically be used in various electronic devices.
  • the information processing apparatus of this embodiment includes: a generating unit 701 and a determining unit 702 .
  • the generating unit is used for importing at least two problems to be clustered into the clustering model to obtain at least one target cluster center, wherein the target cluster center indicates a cluster;
  • the determining unit is used for determining the at least two problems to be clustered as at least one cluster based on the at least one target cluster center.
  • For the specific processing of the generating unit 701 and the determining unit 702 of the information processing apparatus and the technical effects they bring, reference may be made to the relevant descriptions of step 101 and step 102 in the embodiment corresponding to FIG. 1, which are not repeated here.
  • the clustering model includes a feature extraction sub-model, wherein the feature extraction sub-model is obtained based on a feature extraction layer of a pre-trained classification model, and the feature extraction sub-model is used to generate the feature vector corresponding to the problem to be clustered, the feature vector being used for clustering to determine the target cluster center.
  • the classification model is obtained through a first step, wherein the first step includes: acquiring training samples, wherein the labels of the training samples indicate text content types; and, based on the training samples and corresponding labels, training the classification network to be trained to obtain the classification model, wherein the classification network to be trained includes a feature extraction layer to be trained and a classification layer, and the feature extraction layer in the classification model is obtained by training the feature extraction layer to be trained.
  • the training of the classification network to be trained based on the training samples and the corresponding labels to obtain the classification model includes: importing at least two training samples into the classification network to be trained to obtain prediction types corresponding to the at least two training samples, wherein the labels of the at least two training samples are different; determining a single sample loss value of each training sample according to the prediction type and label of each training sample; determining a total sample loss value according to the determined single sample loss values; and adjusting the parameters of the classification network to be trained based on the total sample loss value.
  • importing at least two problems to be clustered into a clustering model to obtain at least one target cluster center includes: importing the problems to be clustered into the feature extraction sub-model to obtain a first feature vector; Based on the back-propagation algorithm and the first feature vector, the initial cluster center is updated to obtain the at least one target cluster center.
  • the initial cluster center is obtained by clustering the first feature vector using a mean clustering algorithm.
  • updating the initial cluster center based on the back-propagation algorithm and the first feature vector to obtain the at least one target cluster center includes: determining the initial cluster center as the first candidate cluster center; based on the first candidate cluster center, performing the following first iterative step: based on the first candidate cluster center and the first feature vector, determining the first probability value that the problem to be clustered belongs to each first candidate cluster center; performing reinforcement processing on each first probability value to obtain a first reinforcement value; generating a first loss value according to the first reinforcement value and the first probability value; in response to determining that the first stopping condition is satisfied, determining the first candidate cluster center as the target cluster center and outputting it; and, in response to determining that the first stopping condition is not satisfied, performing backpropagation based on the generated first loss value, adjusting the first candidate cluster center to obtain a new first candidate cluster center, and jumping to execute the first iterative step.
  • determining the at least two problems to be clustered as at least one cluster based on the at least one target cluster center includes: determining the to-be-clustered problem according to the first feature vector and the target cluster center The cluster to which the clustering problem belongs.
  • updating the initial cluster center based on the back-propagation algorithm and the first feature vector to obtain the at least one target cluster center includes: determining the initial cluster center as the second candidate cluster center, and determining the first feature vector as the second feature vector; based on the second candidate cluster center and the second feature vector, performing the following second iterative step: based on the second candidate cluster center and the second feature vector, determining the second probability value that the problem to be clustered belongs to each second candidate cluster center; performing enhancement processing on each second probability value to obtain a second enhancement value; generating a second loss value according to the second enhancement value and the second probability value; in response to determining that the second stopping condition is satisfied, determining the second candidate cluster center as the target cluster center and outputting it; and, in response to determining that the second stopping condition is not satisfied, performing backpropagation based on the generated second loss value to adjust the second candidate cluster center to obtain a new second candidate cluster center, adjusting the parameters of the feature extraction sub-model by backpropagation based on the generated second loss value, importing the problems to be clustered into the adjusted feature extraction sub-model to obtain a new second feature vector, and jumping to execute the second iterative step.
  • determining the at least two problems to be clustered as at least one cluster based on the at least one target cluster center includes: importing the problems to be clustered into the adjusted feature extraction sub-model to obtain a third feature vector; and determining, according to the third feature vector and the target cluster center, the cluster to which the problem to be clustered belongs.
  • FIG. 8 illustrates an exemplary system architecture to which an information processing method according to an embodiment of the present disclosure may be applied.
  • the system architecture may include terminal devices 801 , 802 , and 803 , a network 804 , and a server 805 .
  • the network 804 is a medium used to provide a communication link between the terminal devices 801 , 802 , 803 and the server 805 .
  • Network 804 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the terminal devices 801, 802, and 803 can interact with the server 805 through the network 804 to receive or send messages and the like.
  • Various client applications may be installed on the terminal devices 801 , 802 and 803 , such as web browser applications, search applications, and news information applications.
  • the client applications in the terminal devices 801, 802, and 803 can receive the user's instruction, and complete corresponding functions according to the user's instruction, such as adding corresponding information to the information according to the user's instruction.
  • the terminal devices 801, 802, and 803 may be hardware or software.
  • the terminal devices 801, 802, and 803 can be various electronic devices that have a display screen and support web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
  • If the terminal devices 801, 802, and 803 are software, they can be installed in the electronic devices listed above. They can be implemented as a plurality of software or software modules (e.g., software or software modules for providing distributed services), or as a single software or software module. There is no specific limitation here.
  • the server 805 may be a server that provides various services, for example, one that receives information acquisition requests sent by the terminal devices 801, 802, and 803, acquires, in various ways according to the information acquisition requests, display information corresponding to the requests, and sends the related data of the display information to the terminal devices 801, 802, and 803.
  • the information processing methods provided by the embodiments of the present disclosure may be executed by terminal devices, and correspondingly, the information processing apparatuses may be set in the terminal devices 801 , 802 , and 803 .
  • the information processing method provided by the embodiment of the present disclosure may also be executed by the server 805 , and accordingly, the information processing apparatus may be provided in the server 805 .
  • terminal devices, networks and servers in FIG. 8 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • FIG. 9 shows a schematic structural diagram of an electronic device (eg, the terminal device or the server in FIG. 8 ) suitable for implementing an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablets), PMPs (portable multimedia players), vehicle-mounted terminals (eg, mobile terminals such as in-vehicle navigation terminals), etc., and stationary terminals such as digital TVs, desktop computers, and the like.
  • the electronic device shown in FIG. 9 is only an example, and should not impose any limitation on the function and scope of use of the embodiments of the present disclosure.
  • the electronic device may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 901, which may execute various appropriate operations and processes according to a program stored in a read only memory (ROM) 902 or a program loaded from a storage device 908 into a random access memory (RAM) 903.
  • In the RAM 903, various programs and data necessary for the operation of the electronic device 900 are also stored.
  • the processing device 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904.
  • An input/output (I/O) interface 905 is also connected to bus 904 .
  • the following devices can be connected to the I/O interface 905: an input device 906 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 907 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; and a storage device 908 including, for example, a magnetic tape, a hard disk, etc.
  • the communication means 909 may allow the electronic device to communicate wirelessly or by wire with other devices to exchange data. While FIG. 9 illustrates an electronic device having various means, it should be understood that not all of the illustrated means are required to be implemented or available. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 909, or from the storage device 908, or from the ROM 902.
  • When the computer program is executed by the processing apparatus 901, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples of computer readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable Programmable read only memory (EPROM or flash memory), fiber optics, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device .
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, electrical wire, optical fiber cable, RF (radio frequency), etc., or any suitable combination of the foregoing.
  • the client and server can communicate using any currently known or future developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communication network).
  • Examples of communication networks include local area networks (“LAN”), wide area networks (“WAN”), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device is caused to: import at least two problems to be clustered into a clustering model to obtain at least one target cluster center, wherein the target cluster center indicates a cluster; and, based on the at least one target cluster center, determine the at least two problems to be clustered as at least one cluster.
  • Computer program code for performing operations of the present disclosure may be written in one or more programming languages or combinations thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the “C” language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., connected through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations can be implemented in dedicated hardware-based systems that perform the specified functions or operations , or can be implemented in a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the name of the unit does not constitute a limitation of the unit itself in some cases, for example, the generation unit can also be described as "the unit that generates the center of the target cluster".
  • exemplary types of hardware logic components include: Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chips (SOCs), Complex Programmable Logic Devices (CPLDs), and more.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), fiber optics, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • the clustering model includes a feature extraction sub-model, the feature extraction sub-model is used to generate a feature vector corresponding to the problem to be clustered, and the feature vector is used for clustering to determine a target class cluster center.
  • the feature extraction sub-model is obtained based on a feature extraction layer of a pre-trained classification model.
  • the classification model is obtained through a first step, wherein the first step includes: acquiring a training sample, wherein a label of the training sample indicates a text content type; and, based on the training sample and the corresponding label, training the classification network to be trained to obtain the classification model, wherein the classification network to be trained includes a feature extraction layer to be trained and a classification layer, and the feature extraction layer in the classification model is obtained by training the feature extraction layer to be trained.
  • the training of the classification network to be trained based on the training samples and the corresponding labels to obtain the classification model includes: importing at least two training samples into the classification network to be trained to obtain prediction types corresponding to the at least two training samples, wherein the labels of the at least two training samples are different; determining a single sample loss value of each training sample according to the prediction type and label of each training sample; determining a total sample loss value according to the determined single sample loss values; and adjusting the parameters of the classification network to be trained based on the total sample loss value.
  • importing at least two problems to be clustered into a clustering model to obtain at least one target cluster center includes: importing the problems to be clustered into the feature extraction sub-model to obtain a first feature vector; and, based on the back-propagation algorithm and the first feature vector, updating the initial cluster center to obtain the at least one target cluster center.
  • the initial cluster center is obtained by clustering the first feature vector using a mean clustering algorithm.
  • updating the initial cluster center based on the backpropagation algorithm and the first feature vector to obtain the at least one target cluster center includes: determining the initial cluster center as the first candidate cluster center; based on the first candidate cluster center, performing the following first iterative step: based on the first candidate cluster center and the first feature vector, determining the first probability value that the problem to be clustered belongs to each first candidate cluster center; performing reinforcement processing on each first probability value to obtain a first reinforcement value; generating a first loss value according to the first reinforcement value and the first probability value; in response to determining that the first stopping condition is satisfied, determining the first candidate cluster center as the target cluster center and outputting it; and, in response to determining that the first stopping condition is not satisfied, performing backpropagation based on the generated first loss value, adjusting the first candidate cluster center to obtain a new first candidate cluster center, and jumping to execute the first iterative step.
  • determining the at least two problems to be clustered as at least one cluster based on the at least one target cluster center includes: determining, according to the first feature vector and the target cluster center, the cluster to which the problem to be clustered belongs.
  • updating the initial cluster center based on the backpropagation algorithm and the first feature vector to obtain the at least one target cluster center includes: determining the initial cluster center as the second candidate cluster center, and determining the first feature vector as the second feature vector; based on the second candidate cluster center and the second feature vector, performing the following second iterative step: based on the second candidate cluster center and the second feature vector, determining the second probability value that the problem to be clustered belongs to each second candidate cluster center; performing enhancement processing on each second probability value to obtain a second enhancement value; generating a second loss value according to the second enhancement value and the second probability value; in response to determining that the second stopping condition is satisfied, determining the second candidate cluster center as the target cluster center and outputting it; and, in response to determining that the second stopping condition is not satisfied, performing backpropagation based on the generated second loss value to adjust the second candidate cluster center to obtain a new second candidate cluster center, adjusting the parameters of the feature extraction sub-model by backpropagation based on the generated second loss value, importing the problems to be clustered into the adjusted feature extraction sub-model to obtain a new second feature vector, and jumping to execute the second iterative step.
  • determining the at least two problems to be clustered as at least one cluster based on the at least one target cluster center includes: importing the problems to be clustered into the adjusted feature extraction sub-model to obtain a third feature vector; and determining, according to the third feature vector and the target cluster center, the cluster to which the problem to be clustered belongs.
  • an information processing apparatus includes: a generating unit, configured to import at least two problems to be clustered into a clustering model to obtain at least one target cluster center, wherein the target cluster center indicates a cluster; and a determining unit, configured to determine the at least two problems to be clustered as at least one cluster based on the at least one target cluster center.
  • the clustering model includes a feature extraction sub-model, wherein the feature extraction sub-model is obtained based on a feature extraction layer of a pre-trained classification model, the feature extraction sub-model is used to generate the feature vector corresponding to the problem to be clustered, and the feature vector is used for clustering to determine the target cluster center.
  • the classification model is obtained through a first step, wherein the first step includes: acquiring a training sample, wherein a label of the training sample indicates a text content type; and, based on the training sample and the corresponding label, training the classification network to be trained to obtain the classification model, wherein the classification network to be trained includes a feature extraction layer to be trained and a classification layer, and the feature extraction layer in the classification model is obtained by training the feature extraction layer to be trained.
  • the training of the classification network to be trained based on the training samples and the corresponding labels to obtain the classification model includes: importing at least two training samples into the classification network to be trained to obtain prediction types corresponding to the at least two training samples, wherein the labels of the at least two training samples are different; determining a single sample loss value of each training sample according to the prediction type and label of each training sample; determining a total sample loss value according to the determined single sample loss values; and adjusting the parameters of the classification network to be trained based on the total sample loss value.
  • importing at least two problems to be clustered into a clustering model to obtain at least one target cluster center includes: importing the problems to be clustered into the feature extraction sub-model to obtain a first feature vector; and, based on the back-propagation algorithm and the first feature vector, updating the initial cluster center to obtain the at least one target cluster center.
  • the initial cluster center is obtained by clustering the first feature vector using a mean clustering algorithm.
  • updating the initial cluster center based on the backpropagation algorithm and the first feature vector to obtain the at least one target cluster center includes: determining the initial cluster center as a first candidate cluster center; based on the first candidate cluster center, performing the following first iterative step: based on the first candidate cluster center and the first feature vector, determining a first probability value that the problem to be clustered belongs to each first candidate cluster center; performing reinforcement processing on each first probability value to obtain a first reinforcement value; generating a first loss value according to the first reinforcement value and the first probability value; in response to determining that a first stopping condition is satisfied, determining the first candidate cluster center as the target cluster center and outputting it; in response to determining that the first stopping condition is not satisfied, performing backpropagation based on the generated first loss value to adjust the first candidate cluster center and obtain a new first candidate cluster center, and jumping back to the first iterative step.
  • determining the at least two problems to be clustered as at least one cluster based on the at least one target cluster center includes: determining the cluster to which a problem to be clustered belongs according to the first feature vector and the target cluster center.
  • updating the initial cluster center based on the backpropagation algorithm and the first feature vector to obtain the at least one target cluster center includes: determining the initial cluster center as a second candidate cluster center, and determining the first feature vector as a second feature vector; based on the second candidate cluster center and the second feature vector, performing the following second iterative step: based on the second candidate cluster center and the second feature vector, determining a second probability value that the problem to be clustered belongs to each second candidate cluster center; performing reinforcement processing on each second probability value to obtain a second reinforcement value; generating a second loss value according to the second reinforcement value and the second probability value; in response to determining that a second stopping condition is satisfied, determining the second candidate cluster center as the target cluster center and outputting it; in response to determining that the second stopping condition is not satisfied, performing backpropagation based on the generated second loss value to adjust the second candidate cluster center and obtain a new second candidate cluster center, adjusting the parameters of the feature extraction sub-model by backpropagation based on the generated second loss value, importing the problems to be clustered into the adjusted feature extraction sub-model to obtain a new second feature vector, and jumping back to the second iterative step.
  • determining the at least two problems to be clustered as at least one cluster based on the at least one target cluster center includes: importing the problems to be clustered into the adjusted feature extraction sub-model to obtain a third feature vector; and determining the cluster to which a problem to be clustered belongs according to the third feature vector and the target cluster center.
  • an electronic device includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any one of the information processing methods described above.
  • a computer-readable medium has a computer program stored thereon which, when executed by a processor, implements the steps of any one of the information processing methods described above.

Abstract

An information processing method, apparatus, and electronic device. The method comprises: importing at least two questions to be clustered into a clustering model to obtain at least one target cluster center (101), wherein a target cluster center indicates a cluster; and determining the at least two questions to be clustered as at least one cluster based on the at least one target cluster center (102). A new way of clustering questions is thereby provided.

Description

信息处理方法、装置和电子设备
相关申请的交叉引用
本申请要求于2020年12月07日提交的,申请号为202011432971.5、发明名称为“信息处理方法、装置和电子设备”的中国专利申请的优先权,该申请的全文通过引用结合在本申请中。
技术领域
本公开涉及互联网技术领域,尤其涉及一种信息处理方法、装置和电子设备。
背景技术
随着互联网的发展,用户越来越多的使用终端设备浏览各类信息。用户在浏览各种信息的时候,可能会提出各种问题。智能客服技术的发展,可以机器自动回复用户的提问成为现实。
在智能客服场景下,高频问答(FAQ)是一项重要的基础能力,其依赖于后台的标准问答库。问答库中内容的来源包括线下人工规划以及线上高频问题的收集,后者的添加可以大大丰富标准问答库,提升FAQ覆盖率。线上高频问题往往来源于对线上问题进行数据分析得到。因此,对于线上问题的分析处理能力至关重要。
发明内容
提供该公开内容部分以便以简要的形式介绍构思,这些构思 将在后面的具体实施方式部分被详细描述。该公开内容部分并不旨在标识要求保护的技术方案的关键特征或必要特征,也不旨在用于限制所要求的保护的技术方案的范围。
第一方面,本公开实施例提供了一种信息处理方法,该方法包括:将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,其中,目标类簇中心指示类簇;基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇。
第二方面,本公开实施例提供了一种信息处理装置,包括:生成单元,用于将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,其中,目标类簇中心指示类簇;确定单元,用于基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇。
第三方面,本公开实施例提供了一种电子设备,包括:一个或多个处理器;存储装置,用于存储一个或多个程序,当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如第一方面所述的信息处理方法。
第四方面,本公开实施例提供了一种计算机可读介质,其上存储有计算机程序,该程序被处理器执行时实现如第一方面所述的信息处理方法的步骤。
附图说明
结合附图并参考以下具体实施方式,本公开各实施例的上述和其他特征、优点及方面将变得更加明显。贯穿附图中,相同或相似的附图标记表示相同或相似的元素。应当理解附图是示意性的,原件和元素不一定按照比例绘制。
图1是根据本公开的信息处理方法的一个实施例的流程图;
图2A是根据本公开的分类模型的训练流程示意图;
图2B是根据本公开的信息处理方法的一个应用场景的示意图;
图3是根据本公开的步骤202的一种可选的实现方式的示意图;
图4是根据本公开的信息处理方法的步骤101的一种可选的实 现方式的示意图;
图5是本公开的信息处理方法的步骤402的一种可选的实现方式的示意图;
图6是本公开的信息处理方法的步骤402的另一种可选的实现方式的示意图;
图7是根据本公开的信息处理装置的一个实施例的结构示意图;
图8是本公开的一个实施例的信息处理方法可以应用于其中的示例性系统架构;
图9是根据本公开实施例提供的电子设备的基本结构的示意图。
具体实施方式
下面将参照附图更详细地描述本公开的实施例。虽然附图中显示了本公开的某些实施例,然而应当理解的是,本公开可以通过各种形式来实现,而且不应该被解释为限于这里阐述的实施例,相反提供这些实施例是为了更加透彻和完整地理解本公开。应当理解的是,本公开的附图及实施例仅用于示例性作用,并非用于限制本公开的保护范围。
应当理解,本公开的方法实施方式中记载的各个步骤可以按照不同的顺序执行,和/或并行执行。此外,方法实施方式可以包括附加的步骤和/或省略执行示出的步骤。本公开的范围在此方面不受限制。
本文使用的术语“包括”及其变形是开放性包括,即“包括但不限于”。术语“基于”是“至少部分地基于”。术语“一个实施例”表示“至少一个实施例”;术语“另一实施例”表示“至少一个另外的实施例”;术语“一些实施例”表示“至少一些实施例”。其他术语的相关定义将在下文描述中给出。
需要注意,本公开中提及的“第一”、“第二”等概念仅用于对不同的装置、模块或单元进行区分,并非用于限定这些装置、 模块或单元所执行的功能的顺序或者相互依存关系。
需要注意,本公开中提及的“一个”、“多个”的修饰是示意性而非限制性的,本领域技术人员应当理解,除非在上下文另有明确指出,否则应该理解为“一个或多个”。
本公开实施方式中的多个装置之间所交互的消息或者信息的名称仅用于说明性的目的,而并不是用于对这些消息或信息的范围进行限制。
Please refer to Fig. 1, which shows the flow of an embodiment of the information processing method according to the present disclosure. As shown in Fig. 1, the information processing method includes the following steps:

Step 101: import at least two questions to be clustered into a clustering model to obtain at least one target cluster center.

In this embodiment, the executing body of the information processing method (for example, a server) may import at least two questions to be clustered into the clustering model to obtain at least one target cluster center.

Here, the number of questions to be clustered may be at least two, and a question to be clustered may be text information.

The questions to be clustered may relate to any of a variety of domains, which are not limited here.

In this embodiment, each of the at least one target cluster center may be used to indicate a cluster, which can also be understood as indicating a question type of the questions to be clustered. Questions to be clustered that belong to the same cluster can be understood as questions of the same type.

It can be understood that the question type corresponding to a cluster exists objectively; however, the name of that question type may be determined before the cluster emerges, or may be determined only after the cluster has been determined.

In some application scenarios, after the questions to be clustered have been grouped into clusters of same-type questions, the questions in a cluster may be further analyzed to obtain corresponding analysis results.

As an example, the questions in a cluster may be analyzed to find questions that have not appeared in the set of already-collected questions, thereby mining new questions.

As another example, the question types corresponding to the clusters may be analyzed to find question types that have not appeared in the set of already-collected questions. In other words, the question type covered by an entire cluster may not have been collected before, so that new question types can be mined.

In some embodiments, the clustering model may include a feature extraction sub-model. The feature extraction sub-model may generate feature vectors corresponding to the questions to be clustered, and the feature vectors are used for clustering to determine the target cluster centers.

In some embodiments, the feature extraction sub-model is obtained based on the feature extraction layer of a pre-trained classification model.

In some embodiments, the classification model may be trained in advance. The classification model may include a feature extraction layer and a classification layer. The feature extraction layer of the trained classification model may then be used as the feature extraction sub-model.

It should be noted that the feature extraction layer in the classification model has the ability to extract type-related features, i.e. it can enlarge the differences between questions to be clustered of different types and reduce the differences between questions of the same type.

Step 102: determine the at least two questions to be clustered as at least one cluster based on the at least one target cluster center.

In this embodiment, the executing body may determine the at least two questions to be clustered as at least one cluster according to the at least one target cluster center determined in step 101.

Here, the cluster center to which a question to be clustered belongs may be determined by computing the distances between the question and the respective target cluster centers. Thus, the questions assigned to each target cluster center can be taken as one cluster, so that the at least two questions to be clustered are divided into at least one cluster.

It should be noted that in the information processing method provided by this embodiment, the questions to be clustered are imported into the clustering model to obtain at least one target cluster center, and the at least two questions to be clustered are then determined as at least one cluster according to the target cluster centers. This provides a new way of clustering and improves both the speed and the accuracy of clustering questions.

It should also be noted that, in some application scenarios, the feature extraction sub-model in the clustering model used to determine the target cluster centers has the ability to extract type features, so the feature vectors on which clustering is based have good type-representation capability. This improves clustering efficiency, reduces the time consumed by clustering, and also improves clustering accuracy.
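To make step 102 concrete, the following is a minimal sketch, in Python with NumPy, of assigning each question's feature vector to the nearest target cluster center; the disclosure does not mandate any particular library, and the function and array names as well as the use of Euclidean distance here are illustrative assumptions.

```python
import numpy as np

def assign_clusters(features: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Assign each question's feature vector to the nearest target cluster center.

    features: (num_questions, dim) feature vectors from the feature extraction sub-model.
    centers:  (num_clusters, dim) target cluster centers.
    Returns one cluster index per question.
    """
    # Pairwise squared Euclidean distances between questions and centers.
    dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return dists.argmin(axis=1)
```

Questions sharing the same returned index form one cluster, which realizes the partition of the at least two questions into at least one cluster.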
In some embodiments, the classification model may be obtained through a first step. Here, the first step may be implemented by the flow shown in Fig. 2A, which may include step 201 and step 202.

Step 201: acquire training samples.

Here, the training samples may have labels, and a label may indicate a text content type. The text content types may relate to various domains, which are not limited here.

Step 202: train the classification network to be trained based on the training samples and the corresponding labels to obtain the classification model.

Here, the classification network to be trained may include a feature extraction layer to be trained and a classification layer to be trained. The feature extraction layer in the classification model may be obtained by training the feature extraction layer to be trained.

The specific structure of the feature extraction layer to be trained may be set according to the actual application scenario and is not limited here. As an example, it may include a convolutional neural network. As another example, it may adopt a BERT (Bidirectional Encoder Representations from Transformers) structure.

Likewise, the specific structure of the classification layer to be trained may be set according to the actual application scenario and is not limited here. As an example, it may include a pooling layer and a fully connected layer, where the fully connected layer maps features to types.

Here, the training samples may be imported into the classification network to be trained to obtain classification results. The classification results are then compared with the labels corresponding to the training samples to determine a loss value. Backpropagation is then performed with the loss value to adjust the parameters of the classification network to be trained. After multiple iterations, until a condition for stopping the iteration is satisfied, the trained classification network is determined as the classification model.

Please refer to Fig. 2B, which shows a schematic diagram of an exemplary application scenario of an embodiment of the present application.

First, in the training stage of the classification model, a training-sample classification task may be used to train a pre-established classification network to be trained, obtaining the classification model. The classification network to be trained may include a feature extraction layer to be trained and a classification layer to be trained.

Then, the trained feature extraction layer may be taken out of the classification model and used as the feature extraction sub-model in the clustering model.

Next, in the clustering stage, the questions to be clustered may be imported into the clustering model to obtain the target cluster centers. The clustering model may include the feature extraction sub-model and a clustering sub-model.

Finally, in the inference stage, the distances between a question to be clustered and the respective target cluster centers may be determined in order to decide which target cluster center the question belongs to. Thus, the questions assigned to each target cluster center can be taken as one cluster, so that the at least two questions to be clustered are divided into at least one cluster.
In some embodiments, step 202 may include the steps shown in Fig. 3, which may include step 301, step 302, step 303, and step 304.

Step 301: import at least two training samples into the classification network to be trained to obtain prediction types corresponding to the at least two training samples.

Here, the labels of the at least two training samples are different from one another.

Step 302: determine a single-sample loss value for each training sample according to the prediction type and label of that training sample.

Here, various loss computation methods may be used to determine the single-sample loss value.

As an example, a cross-entropy loss function may be used to determine the single-sample loss value.

Step 303: determine a total sample loss value according to the determined single-sample loss values.

Here, the single-sample loss values may be combined in various ways to determine the total sample loss value. As an example, the determined single-sample loss values may be added together, and the resulting sum used as the total sample loss value.

Step 304: adjust the parameters of the classification network to be trained based on the total sample loss value.

Here, backpropagation may be performed with the total sample loss value to adjust the parameters of the classification network to be trained.

In some application scenarios, two question sample sets of different types may be obtained as the training set of the classification model. For each training step, a question sample is drawn from each question sample set to form a pair of training samples. BERT is then used to vectorize each training sample, and the vectorized output is pooled to obtain a sentence-level overall representation. The pooled output is mapped to the type dimension by a linear layer for classification to obtain a classification result. A single-sample loss value is computed from each classification result and its corresponding label, the single-sample loss values are added to obtain the total sample loss value, and backpropagation is performed with the total sample loss value to update the parameters of BERT.

It should be noted that training the classification network on at least two training samples at the same time, with the labels of the jointly input samples being different from one another, gives the network good generalization during training, i.e. it can extract features fairly accurately for training samples of various types. By contrast, if the parameters of the classification network were adjusted with a single-sample loss value, the network might struggle to accommodate question samples of different types; for example, after a network tuned on question samples of type A is further updated with question samples of type B, its ability to represent question samples of type A may degrade.
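As a purely illustrative sketch of the pairwise training described above, the following Python/PyTorch code trains a network made of a feature extraction layer plus a pooling-and-linear classification layer on a pair of differently-labelled samples per step. The toy encoder stands in for BERT (which could, for example, be loaded from the transformers library); the class names, dimensions, and hyperparameters are assumptions and not details fixed by the disclosure.

```python
import torch
import torch.nn as nn

class ToyEncoder(nn.Module):
    """Stand-in for the feature extraction layer to be trained (e.g. BERT).
    Maps token ids (batch, seq_len) to hidden states (batch, seq_len, dim)."""
    def __init__(self, vocab_size: int = 30000, dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(token_ids)

class ClassificationNetwork(nn.Module):
    """Feature extraction layer + classification layer (pooling followed by a linear map to types)."""
    def __init__(self, encoder: nn.Module, dim: int, num_types: int):
        super().__init__()
        self.encoder = encoder
        self.fc = nn.Linear(dim, num_types)

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        # Sentence-level representation obtained by pooling the token states.
        return self.encoder(token_ids).mean(dim=1)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.fc(self.encode(token_ids))

def training_step(model, optimizer, sample_a, label_a, sample_b, label_b):
    """One step on a pair of differently-labelled samples: per-sample losses are summed."""
    loss_fn = nn.CrossEntropyLoss()
    logits_a, logits_b = model(sample_a), model(sample_b)
    total_loss = loss_fn(logits_a, label_a) + loss_fn(logits_b, label_b)
    optimizer.zero_grad()
    total_loss.backward()   # backpropagate the total sample loss
    optimizer.step()        # adjust the parameters of the network to be trained
    return total_loss.item()
```

In use, one might build `model = ClassificationNetwork(ToyEncoder(), dim=128, num_types=2)` and `optimizer = torch.optim.Adam(model.parameters())`; after training, `model.encoder` plays the role of the feature extraction sub-model.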
In some embodiments, step 101 may be implemented by the steps in the flow shown in Fig. 4, which may include step 401 and step 402.

Step 401: import the questions to be clustered into the feature extraction sub-model to obtain first feature vectors.

Step 402: update initial cluster centers based on a backpropagation algorithm and the first feature vectors to obtain the at least one target cluster center.

Here, the initial cluster centers may be determined by random initialization.

It should be noted that updating the initial cluster centers through the first feature vectors and a backpropagation algorithm can be understood as determining the target cluster centers in a deep learning manner, which improves the accuracy with which the target cluster centers are determined.

In some embodiments, the initial cluster centers may be obtained by clustering the first feature vectors using a means clustering algorithm.

Here, the means clustering algorithm may include the k-means clustering algorithm (K-means). The principle of K-means is briefly as follows: K objects are first selected at random as the initial cluster centers. The distance between each object and each seed cluster center is then computed, and each object is assigned to the cluster center closest to it. A cluster center together with the objects assigned to it represents one cluster. Once all objects have been assigned, the cluster center of each cluster is recomputed from the objects currently in that cluster. This process is repeated until some termination condition is satisfied.

It should be noted that generating the initial cluster centers with a means clustering algorithm makes them better suited to the actual scenario of the current clustering and more accurate, which in turn reduces the time and computation needed to obtain the target cluster centers from the initial cluster centers.
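The following is a minimal sketch of this initialization using scikit-learn's K-means; the disclosure does not prescribe any particular library, and the function name, random seed, and `n_init` value are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def init_centers_kmeans(features: np.ndarray, num_clusters: int, seed: int = 0) -> np.ndarray:
    """Cluster the first feature vectors with K-means and return the resulting
    centers, which serve as the initial cluster centers for step 402."""
    km = KMeans(n_clusters=num_clusters, n_init=10, random_state=seed)
    km.fit(features)
    return km.cluster_centers_
```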
In some embodiments, step 402 may include the steps shown in Fig. 5, which may include step 501, step 502, and step 503.

Step 501: determine the initial cluster centers as first candidate cluster centers.

Step 502: based on the first candidate cluster centers, perform the following first iterative step: based on the first candidate cluster centers and the first feature vectors, determine first probability values that a question to be clustered belongs to each first candidate cluster center; perform reinforcement processing on each first probability value to obtain first reinforcement values; generate a first loss value according to the first reinforcement values and the first probability values; and, in response to determining that a first stopping condition is satisfied, determine the first candidate cluster centers as the target cluster centers and output them.

Here, the first candidate cluster centers may be continually updated; they may differ each time the iterative step is performed. The number of first candidate cluster centers may be at least one, i.e. one, or two or more.

Here, determining the first probability values that a question to be clustered belongs to each first candidate cluster center, based on the first candidate cluster centers and the first feature vectors, may be implemented in various ways, which are not limited here.

As an example, for each first feature vector, the distances between the feature vector and the respective first candidate cluster centers are computed; the ratio of the distance between the first feature vector and a given first candidate cluster center to the sum of its distances to all first candidate cluster centers is taken, and this ratio is determined as the first probability value.

As another example, for each first feature vector, a first square of the distance between the feature vector and each first candidate cluster center may be computed; the first square is added to 1 to obtain a first sum, each first candidate cluster center corresponding to its own first sum; the ratio of the first sum corresponding to a first candidate cluster center to the total of the first sums is taken, and this ratio is determined as the first probability value.

Here, the reinforcement processing is used to widen the gaps between the first probability values. In other words, it strengthens the share that the higher-confidence portion of the first probability values takes up in the whole.

The specific manner of reinforcement processing may be set according to the actual application scenario and is not limited here.

As an example, each first probability value may be squared, and the ratio of the square to the sum of the squares of the first probability values is determined as the first reinforcement value.

Generating the first loss value from the first reinforcement values and the first probability values may be implemented in various ways, which are not limited here.

As an example, the logarithm of the ratio of the first reinforcement value to the first probability value may be taken as the first loss value.

It should be noted that as the first probability values move toward the first reinforcement values, the first loss becomes smaller and smaller and approaches convergence (for example, converges to a constant). Thus, as the first iterative step proceeds, the cluster centers are iterated.

Step 503: in response to determining that the first stopping condition is not satisfied, perform backpropagation based on the generated first loss value, adjust the first candidate cluster centers to obtain new first candidate cluster centers, and jump back to the first iterative step.

Here, the first stopping condition may be set according to the actual application scenario. As an example, it may include, but is not limited to, at least one of the following: the number of iterations is not less than a preset count threshold, or the first loss value is not greater than a preset loss value threshold.

Here, backpropagation may be performed based on the first loss value to adjust the values of the first candidate cluster centers and obtain new first candidate cluster centers. Execution then jumps back to the first iterative step and continues (the first candidate cluster centers on which this round of the first iterative step is based differ from those of the previous round).

In some embodiments, step 102 may include: determining the cluster to which a question to be clustered belongs according to the first feature vectors and the target cluster centers.

Here, the distances between a first feature vector and the respective target cluster centers may be determined, and the target cluster center closest to the feature vector is determined as the target cluster center to which the question to be clustered belongs. It can be understood that after a target cluster center has been determined for each question to be clustered, each target cluster center has its own questions, i.e. the at least two questions to be clustered are partitioned into clusters.

It should be noted that performing the iterative step by updating the first candidate cluster centers without updating the feature extraction sub-model, so as to continually determine new cluster centers, preserves the feature-representation ability of the pre-trained feature extraction sub-model on the one hand, and reduces computation and speeds up calculation on the other.
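A minimal PyTorch sketch of this first iterative step follows. Where the description leaves the exact formulas open ("various ways ... not limited here"), the sketch uses a Student-t soft assignment and the squaring-and-renormalizing reinforcement given as an example above, together with a KL-divergence aggregation of the per-cluster log-ratio loss; these concrete choices, which closely resemble Deep Embedded Clustering (DEC), and all variable names and hyperparameters are assumptions for illustration. Only the candidate cluster centers receive gradients; the feature extraction sub-model stays frozen.

```python
import torch

def soft_assign(features: torch.Tensor, centers: torch.Tensor) -> torch.Tensor:
    """First probability values: similarity of each feature vector to each candidate center."""
    dist_sq = torch.cdist(features, centers) ** 2          # (N, K)
    q = 1.0 / (1.0 + dist_sq)
    return q / q.sum(dim=1, keepdim=True)

def reinforce(q: torch.Tensor) -> torch.Tensor:
    """Reinforcement values: square the probabilities and renormalize, widening
    the gap in favour of high-confidence assignments."""
    p = q ** 2
    return p / p.sum(dim=1, keepdim=True)

def refine_centers(features, init_centers, max_iters: int = 100, tol: float = 1e-4):
    """First iterative step: only the candidate cluster centers are updated by backpropagation."""
    features = torch.as_tensor(features, dtype=torch.float32)
    centers = torch.nn.Parameter(torch.as_tensor(init_centers, dtype=torch.float32).clone())
    optimizer = torch.optim.Adam([centers], lr=1e-2)
    for _ in range(max_iters):
        q = soft_assign(features, centers)
        p = reinforce(q).detach()                           # target treated as fixed per round
        # KL(P || Q): expectation of log(reinforcement / probability) under P.
        loss = (p * ((p + 1e-9).log() - (q + 1e-9).log())).sum(dim=1).mean()
        if loss.item() < tol:                               # assumed concrete form of the stopping condition
            break
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return centers.detach()
```

In use, the centers returned by `init_centers_kmeans` above could be passed in as `init_centers`, and the refined centers then fed to `assign_clusters` to realize step 102.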
In some embodiments, step 402 may include the steps shown in Fig. 6, which may include step 601, step 602, and step 603.

Step 601: determine the initial cluster centers as second candidate cluster centers, and determine the first feature vectors as second feature vectors.

Here, the first feature vectors are those generated by the initial feature extraction sub-model.

The term "second feature vector" can be understood simply as a name that distinguishes it from the first feature vector; it does not imply that the concrete values of the first feature vector have changed.

Step 602: based on the second candidate cluster centers and the second feature vectors, perform the following second iterative step: based on the second candidate cluster centers and the second feature vectors, determine second probability values that a question to be clustered belongs to each second candidate cluster center; perform reinforcement processing on each second probability value to obtain second reinforcement values; generate a second loss value according to the second reinforcement values and the second probability values; and, in response to determining that a second stopping condition is satisfied, determine the second candidate cluster centers as the target cluster centers and output them, and determine the feature extraction sub-model as the adjusted feature extraction sub-model.

Here, the second candidate cluster centers may be continually updated; they may differ each time the iterative step is performed. The number of second candidate cluster centers may be at least one, i.e. one, or two or more.

Here, determining the second probability values that a question to be clustered belongs to each second candidate cluster center, based on the second candidate cluster centers and the second feature vectors, may be implemented in various ways, which are not limited here.

As an example, for each second feature vector, the distances between the feature vector and the respective second candidate cluster centers are computed; the ratio of the distance between the second feature vector and a given second candidate cluster center to the sum of its distances to all second candidate cluster centers is taken, and this ratio is determined as the second probability value.

As another example, for each second feature vector, a second square of the distance between the feature vector and each second candidate cluster center may be computed; the second square is added to 1 to obtain a second sum, each second candidate cluster center corresponding to its own second sum; the ratio of the second sum corresponding to a second candidate cluster center to the total of the second sums is taken, and this ratio is determined as the second probability value.

Here, the reinforcement processing is used to widen the gaps between the second probability values. In other words, it strengthens the share that the higher-confidence portion of the second probability values takes up in the whole.

The specific manner of reinforcement processing may be set according to the actual application scenario and is not limited here.

As an example, each second probability value may be squared, and the ratio of the square to the sum of the squares of the second probability values is determined as the second reinforcement value.

Generating the second loss value from the second reinforcement values and the second probability values may be implemented in various ways, which are not limited here.

As an example, the logarithm of the ratio of the second reinforcement value to the second probability value may be taken as the second loss value.

It should be noted that as the second probability values move toward the second reinforcement values, the second loss approaches zero. Thus, as the second iterative step proceeds, the clustering is iterated.

Step 603: in response to determining that the second stopping condition is not satisfied, adjust the second candidate cluster centers based on the generated second loss value to obtain new second candidate cluster centers, adjust the parameters of the feature extraction sub-model based on the generated second loss value, import the questions to be clustered into the adjusted feature extraction sub-model to obtain new second feature vectors, and jump back to the second iterative step.

Here, the second stopping condition may be set according to the actual application scenario. As an example, it may include, but is not limited to, at least one of the following: the number of iterations is not less than a preset count threshold, or the second loss value is not greater than a preset loss value threshold.

Here, backpropagation may be performed based on the second loss value to adjust the values of the second candidate cluster centers and obtain new second candidate cluster centers. Execution then jumps back to the second iterative step and continues (the second candidate cluster centers on which this round of the second iterative step is based differ from those of the previous round).

Here, the parameters of the feature extraction sub-model may also be adjusted based on the second loss value. In other words, as the iterative step proceeds, the feature extraction sub-model is continually updated. In each round of the second iterative step, the second feature vectors may be the new feature vectors produced for the questions to be clustered by the updated feature extraction sub-model.

In some embodiments, step 102 may include: importing the questions to be clustered into the adjusted feature extraction sub-model to obtain third feature vectors; and determining the cluster to which a question to be clustered belongs according to the third feature vectors and the target cluster centers.

Here, in step 603, backpropagation is performed based on the second loss value to adjust the parameters of the feature extraction sub-model, so the feature extraction sub-model is continually updated. As the updates proceed, the feature extraction sub-model used in each round of the second iterative step can be the latest sub-model retained after updating; therefore, when the questions to be clustered are imported into the adjusted feature extraction sub-model, those skilled in the art will understand that they are imported into this latest retained sub-model. The resulting third feature vectors can thus express the type features of the questions to be clustered more accurately.

Here, the distances between a third feature vector and the respective target cluster centers may be determined, and the target cluster center closest to the feature vector is determined as the target cluster center to which the question to be clustered belongs. It can be understood that after a target cluster center has been determined for each question to be clustered, each target cluster center has its own questions, i.e. the at least two questions to be clustered are grouped, realizing the determination of at least one cluster.

It should be noted that performing the iterative step by updating both the second candidate cluster centers and the feature extraction sub-model, so as to continually determine new cluster centers and second feature vectors, can further improve the feature-representation ability of the feature extraction sub-model and improve clustering accuracy.
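The following sketch adapts the previous example to this second iterative step, jointly updating the candidate cluster centers and the parameters of the feature extraction sub-model. It assumes that the `soft_assign` and `reinforce` helpers from the earlier sketch are in scope and that `model` exposes an `encode` method as in the classification sketch above; all names and hyperparameters remain illustrative assumptions rather than details fixed by the disclosure.

```python
import torch

def refine_centers_and_encoder(model, token_ids, init_centers,
                               max_iters: int = 100, tol: float = 1e-4):
    """Second iterative step: backpropagating the second loss adjusts both the
    candidate cluster centers and the feature extraction sub-model."""
    centers = torch.nn.Parameter(torch.as_tensor(init_centers, dtype=torch.float32).clone())
    optimizer = torch.optim.Adam(list(model.parameters()) + [centers], lr=1e-3)
    for _ in range(max_iters):
        feats = model.encode(token_ids)          # new second feature vectors each round
        q = soft_assign(feats, centers)          # second probability values
        p = reinforce(q).detach()                # second reinforcement values
        loss = (p * ((p + 1e-9).log() - (q + 1e-9).log())).sum(dim=1).mean()
        if loss.item() < tol:                    # assumed concrete form of the second stopping condition
            break
        optimizer.zero_grad()
        loss.backward()                          # adjusts both the centers and the encoder
        optimizer.step()
    # Third feature vectors from the adjusted feature extraction sub-model.
    with torch.no_grad():
        third_feats = model.encode(token_ids)
    return centers.detach(), third_feats
```

Compared with the first iterative step, the additional gradient flow into the encoder lets the feature space itself adapt to the clusters, at the cost of extra computation; the returned third feature vectors and centers can then be passed to a nearest-center assignment as in the step 102 sketch above.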
进一步参考图7,作为对上述各图所示方法的实现,本公开提供了一种信息处理装置的一个实施例,该装置实施例与图1所示的方法实施例相对应,该装置具体可以应用于各种电子设备中。
如图7所示,本实施例的信息处理装置包括:生成单元701和确定单元702。其中,生成单元,用于将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,其中,目标类簇中心指示类簇;确定单元,用于基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇。
在本实施例中,信息处理装置的记录单元生成单元701和确定单元702的具体处理及其所带来的技术效果可分别参考图1对应实施例中步骤101和步骤102的相关说明,在此不再赘述。
在一些实施例中,其中所述聚类模型包括特征提取子模型,其中,所述特征提取子模型基于预先训练的分类模型的特征提取层得到,所述特征提取子模型用于生成待聚类问题对应的特征向量,特征向量用于聚类以确定目标类簇中心。在一些实施例中,所述分类模型通过第一步骤得到,其中,所述第一步骤包括:获取训练样本,其中,训练样本的标签指示文本内容类型;基于所述训练样本和对应的标签,对待训练分类网络进行训练,得到所述分类模型,其中,所述待训练分类网络包括待训练特征提取层和分类层,分类模型中的特征提取层通过对待训练特征提取层进行训练得到。
在一些实施例中,所述基于所述训练样本和对应的标签,对待训练分类网络进行训练,得到所述分类模型,包括:将至少两个训练样本导入待训练分类网络,得到所述至少两个训练样本对应的预测类型,其中,所述至少两个训练样本的标签各不相同;根据各个训练样本的预测类型和标签,确定各个训练样本的单个样本损失值;根据所确定的单个样本损失值,确定样本总损失值;基于所述样本总损失值,调整所述待训练分类网络的参数。
在一些实施例中,所述将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,包括:将待聚类问题导入所述特征 提取子模型,得到第一特征向量;基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心。
在一些实施例中,所述初始类簇中心通过采用均值聚类算法对第一特征向量进行聚类得到。
在一些实施例中,所述基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心,包括:将初始类簇中心确定为第一候选类簇中心;基于第一候选类簇中心,执行以下第一迭代步骤:基于第一候选类簇中心和第一特征向量,确定待聚类问题属于各个第一候选类簇中心的第一概率值;对各个第一概率值进行强化处理,得到第一强化值;根据第一强化值和第一概率值,生成第一损失值;响应于确定第一停止条件满足,将第一候选类簇中心确定为目标类簇中心并输出;响应于确定第一停止条件不满足,基于生成的第一损失值进行反向传播,调整第一候选类簇中心得到新的第一候选类簇中心,以及跳转执行第一迭代步骤。
在一些实施例中,所述基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇,包括:根据第一特征向量和目标类簇中心,确定待聚类问题所属的类簇。
在一些实施例中,所述基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心,包括:将初始类簇中心确定为第二候选类簇中心,以及将第一特征向量确定为第二特征向量;基于第二候选类簇中心和第二特征向量,执行以下第二迭代步骤:基于第二候选类簇中心和第二特征向量,确定待聚类问题属于各个第二候选类簇中心的第二概率值;对各个第二概率值进行强化处理,得到第二强化值;根据第二强化值和第二概率值,生成第二损失值;响应于确定第二停止条件满足,将第二候选类簇中心确定为目标类簇中心并输出;响应于确定第二停止条件不满足,基于生成的第二损失值进行反向传播,调整第二候选类簇中心得到新的第二候选类簇中心,以及基于生成的第二损失值进行反向传播调整特征提取子模型的参数,以及将待 聚类问题导入调整后的特征提取子模型得到新的第二特征向量,以及跳转执行第二迭代步骤。
在一些实施例中,所述基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇,包括:将待聚类问题导入经调整的特征提取子模型,得到第三特征向量;根据第三特征向量和目标类簇中心,确定待聚类问题所属的类簇。
请参考图8,图8示出了本公开的一个实施例的信息处理方法可以应用于其中的示例性系统架构。
如图8所示,系统架构可以包括终端设备801、802、803,网络804,服务器805。网络804用以在终端设备801、802、803和服务器805之间提供通信链路的介质。网络804可以包括各种连接类型,例如有线、无线通信链路或者光纤电缆等等。
终端设备801、802、803可以通过网络804与服务器805交互,以接收或发送消息等。终端设备801、802、803上可以安装有各种客户端应用,例如网页浏览器应用、搜索类应用、新闻资讯类应用。终端设备801、802、803中的客户端应用可以接收用户的指令,并根据用户的指令完成相应的功能,例如根据用户的指令在信息中添加相应信息。
终端设备801、802、803可以是硬件,也可以是软件。当终端设备801、802、803为硬件时,可以是具有显示屏并且支持网页浏览的各种电子设备,包括但不限于智能手机、平板电脑、电子书阅读器、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、膝上型便携计算机和台式计算机等等。当终端设备801、802、803为软件时,可以安装在上述所列举的电子设备中。其可以实现成多个软件或软件模块(例如用来提供分布式服务的软件或软件模块),也可以实现成单个软件或软件模块。在此不做具体限定。
服务器805可以是提供各种服务的服务器,例如接收终端设备801、802、803发送的信息获取请求,根据信息获取请求通过各种 方式获取信息获取请求对应的展示信息。并展示信息的相关数据发送给终端设备801、802、803。
需要说明的是,本公开实施例所提供的信息处理方法可以由终端设备执行,相应地,信息处理装置可以设置在终端设备801、802、803中。此外,本公开实施例所提供的信息处理方法还可以由服务器805执行,相应地,信息处理装置可以设置于服务器805中。
应该理解,图8中的终端设备、网络和服务器的数目仅仅是示意性的。根据实现需要,可以具有任意数目的终端设备、网络和服务器。
下面参考图9,其示出了适于用来实现本公开实施例的电子设备(例如图8中的终端设备或服务器)的结构示意图。本公开实施例中的终端设备可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、PDA(个人数字助理)、PAD(平板电脑)、PMP(便携式多媒体播放器)、车载终端(例如车载导航终端)等等的移动终端以及诸如数字TV、台式计算机等等的固定终端。图9示出的电子设备仅仅是一个示例,不应对本公开实施例的功能和使用范围带来任何限制。
如图9所示,电子设备可以包括处理装置(例如中央处理器、图形处理器等)901,其可以根据存储在只读存储器(ROM)902中的程序或者从存储装置908加载到随机访问存储器(RAM)903中的程序而执行各种适当的动作和处理。在RAM 903中,还存储有电子设备900操作所需的各种程序和数据。处理装置901、ROM902以及RAM 903通过总线904彼此相连。输入/输出(I/O)接口905也连接至总线904。
通常,以下装置可以连接至I/O接口905:包括例如触摸屏、触摸板、键盘、鼠标、摄像头、麦克风、加速度计、陀螺仪等的输入装置909;包括例如液晶显示器(LCD)、扬声器、振动器等的输出装置907;包括例如磁带、硬盘等的存储装置908;以及通 信装置909。通信装置909可以允许电子设备与其他设备进行无线或有线通信以交换数据。虽然图9示出了具有各种装置的电子设备,但是应理解的是,并不要求实施或具备所有示出的装置。可以替代地实施或具备更多或更少的装置。
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在非暂态计算机可读介质上的计算机程序,该计算机程序包含用于执行流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信装置909从网络上被下载和安装,或者从存储装置908被安装,或者从ROM 902被安装。在该计算机程序被处理装置901执行时,执行本公开实施例的方法中限定的上述功能。
需要说明的是,本公开上述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的系统、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑磁盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本公开中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。而在本公开中,计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读信号介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。 计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:电线、光缆、RF(射频)等等,或者上述的任意合适的组合。
在一些实施方式中,客户端、服务器可以利用诸如HTTP(HyperText Transfer Protocol,超文本传输协议)之类的任何当前已知或未来研发的网络协议进行通信,并且可以与任意形式或介质的数字数据通信(例如,通信网络)互连。通信网络的示例包括局域网(“LAN”),广域网(“WAN”),网际网(例如,互联网)以及端对端网络(例如,ad hoc端对端网络),以及任何当前已知或未来研发的网络。
上述计算机可读介质可以是上述电子设备中所包含的;也可以是单独存在,而未装配入该电子设备中。
上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备:将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,其中,目标类簇中心指示类簇;基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇。
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的操作的计算机程序代码,上述程序设计语言包括但不限于面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(LAN)或广域网(WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。
附图中的流程图和框图,图示了按照本公开各种实施例的系统、方法和计算机程序产品的可能实现的体系架构、功能和操作。 在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的系统来实现,或者可以用专用硬件与计算机指令的组合来实现。
描述于本公开实施例中所涉及到的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。其中,单元的名称在某种情况下并不构成对该单元本身的限定,例如,生成单元还可以被描述为“生成目标类簇中心的单元”。
本文中以上描述的功能可以至少部分地由一个或多个硬件逻辑部件来执行。例如,非限制性地,可以使用的示范类型的硬件逻辑部件包括:现场可编程门阵列(FPGA)、专用集成电路(ASIC)、专用标准产品(ASSP)、片上系统(SOC)、复杂可编程逻辑设备(CPLD)等等。
在本公开的上下文中,机器可读介质可以是有形的介质,其可以包含或存储以供指令执行系统、装置或设备使用或与指令执行系统、装置或设备结合地使用的程序。机器可读介质可以是机器可读信号介质或机器可读储存介质。机器可读介质可以包括但不限于电子的、磁性的、光学的、电磁的、红外的、或半导体系统、装置或设备,或者上述内容的任何合适组合。机器可读存储介质的更具体示例会包括基于一个或多个线的电气连接、便携式计算机盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦除可编程只读存储器(EPROM或快闪存储器)、光纤、便捷式紧凑盘只读存储器(CD-ROM)、光学储存设备、磁储存设备、或上述内容的任何合适组合。
根据本公开的一个或多个实施例,所述聚类模型包括特征提取子模型,所述特征提取子模型用于生成待聚类问题对应的特征向量,特征向量用于聚类以确定目标类簇中心。
根据本公开的一个或多个实施例,所述特征提取子模型基于预先训练的分类模型的特征提取层得到。
根据本公开的一个或多个实施例,所述分类模型通过第一步骤得到,其中,所述第一步骤包括:获取训练样本,其中,训练样本的标签指示文本内容类型;基于所述训练样本和对应的标签,对待训练分类网络进行训练,得到所述分类模型,其中,所述待训练分类网络包括待训练特征提取层和分类层,分类模型中的特征提取层通过对待训练特征提取层进行训练得到。
根据本公开的一个或多个实施例,所述基于所述训练样本和对应的标签,对待训练分类网络进行训练,得到所述分类模型,包括:将至少两个训练样本导入待训练分类网络,得到所述至少两个训练样本对应的预测类型,其中,所述至少两个训练样本的标签各不相同;根据各个训练样本的预测类型和标签,确定各个训练样本的单个样本损失值;根据所确定的单个样本损失值,确定样本总损失值;基于所述样本总损失值,调整所述待训练分类网络的参数。
根据本公开的一个或多个实施例,所述将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,包括:将待聚类问题导入所述特征提取子模型,得到第一特征向量;基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心。
根据本公开的一个或多个实施例,所述初始类簇中心通过采用均值聚类算法对第一特征向量进行聚类得到。
根据本公开的一个或多个实施例,所述基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心,包括:将初始类簇中心确定为第一候选类簇中心;基于第一候选类簇中心,执行以下第一迭代步骤:基于第一候选类簇中 心和第一特征向量,确定待聚类问题属于各个第一候选类簇中心的第一概率值;对各个第一概率值进行强化处理,得到第一强化值;根据第一强化值和第一概率值,生成第一损失值;响应于确定第一停止条件满足,将第一候选类簇中心确定为目标类簇中心并输出;响应于确定第一停止条件不满足,基于生成的第一损失值进行反向传播,调整第一候选类簇中心得到新的第一候选类簇中心,以及跳转执行第一迭代步骤。
根据本公开的一个或多个实施例,所述基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇,包括:根据第一特征向量和目标类簇中心,确定待聚类问题所属的类簇。
根据本公开的一个或多个实施例,所述基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心,包括:将初始类簇中心确定为第二候选类簇中心,以及将第一特征向量确定为第二特征向量;基于第二候选类簇中心和第二特征向量,执行以下第二迭代步骤:基于第二候选类簇中心和第二特征向量,确定待聚类问题属于各个第二候选类簇中心的第二概率值;对各个第二概率值进行强化处理,得到第二强化值;根据第二强化值和第二概率值,生成第二损失值;响应于确定第二停止条件满足,将第二候选类簇中心确定为目标类簇中心并输出;响应于确定第二停止条件不满足,基于生成的第二损失值进行反向传播,调整第二候选类簇中心得到新的第二候选类簇中心,以及基于生成的第二损失值进行反向传播调整特征提取子模型的参数,以及将待聚类问题导入调整后的特征提取子模型得到新的第二特征向量,以及跳转执行第二迭代步骤。
根据本公开的一个或多个实施例,所述基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇,包括:将待聚类问题导入经调整的特征提取子模型,得到第三特征向量;根据第三特征向量和目标类簇中心,确定待聚类问题所属的类簇。
根据本公开的一个或多个实施例,信息处理装置,包括:生成单元,用于将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,其中,目标类簇中心指示类簇;确定单元,用于基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇。
根据本公开的一个或多个实施例,其中所述聚类模型包括特征提取子模型,其中,所述特征提取子模型基于预先训练的分类模型的特征提取层得到,所述特征提取子模型用于生成待聚类问题对应的特征向量,特征向量用于聚类以确定目标类簇中心。根据本公开的一个或多个实施例,所述分类模型通过第一步骤得到,其中,所述第一步骤包括:获取训练样本,其中,训练样本的标签指示文本内容类型;基于所述训练样本和对应的标签,对待训练分类网络进行训练,得到所述分类模型,其中,所述待训练分类网络包括待训练特征提取层和分类层,分类模型中的特征提取层通过对待训练特征提取层进行训练得到。
根据本公开的一个或多个实施例,所述基于所述训练样本和对应的标签,对待训练分类网络进行训练,得到所述分类模型,包括:将至少两个训练样本导入待训练分类网络,得到所述至少两个训练样本对应的预测类型,其中,所述至少两个训练样本的标签各不相同;根据各个训练样本的预测类型和标签,确定各个训练样本的单个样本损失值;根据所确定的单个样本损失值,确定样本总损失值;基于所述样本总损失值,调整所述待训练分类网络的参数。
根据本公开的一个或多个实施例,所述将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,包括:将待聚类问题导入所述特征提取子模型,得到第一特征向量;基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心。
根据本公开的一个或多个实施例,所述初始类簇中心通过采用均值聚类算法对第一特征向量进行聚类得到。
根据本公开的一个或多个实施例,所述基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心,包括:将初始类簇中心确定为第一候选类簇中心;基于第一候选类簇中心,执行以下第一迭代步骤:基于第一候选类簇中心和第一特征向量,确定待聚类问题属于各个第一候选类簇中心的第一概率值;对各个第一概率值进行强化处理,得到第一强化值;根据第一强化值和第一概率值,生成第一损失值;响应于确定第一停止条件满足,将第一候选类簇中心确定为目标类簇中心并输出;响应于确定第一停止条件不满足,基于生成的第一损失值进行反向传播,调整第一候选类簇中心得到新的第一候选类簇中心,以及跳转执行第一迭代步骤。
根据本公开的一个或多个实施例,所述基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇,包括:根据第一特征向量和目标类簇中心,确定待聚类问题所属的类簇。
根据本公开的一个或多个实施例,所述基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心,包括:将初始类簇中心确定为第二候选类簇中心,以及将第一特征向量确定为第二特征向量;基于第二候选类簇中心和第二特征向量,执行以下第二迭代步骤:基于第二候选类簇中心和第二特征向量,确定待聚类问题属于各个第二候选类簇中心的第二概率值;对各个第二概率值进行强化处理,得到第二强化值;根据第二强化值和第二概率值,生成第二损失值;响应于确定第二停止条件满足,将第二候选类簇中心确定为目标类簇中心并输出;响应于确定第二停止条件不满足,基于生成的第二损失值进行反向传播,调整第二候选类簇中心得到新的第二候选类簇中心,以及基于生成的第二损失值进行反向传播调整特征提取子模型的参数,以及将待聚类问题导入调整后的特征提取子模型得到新的第二特征向量,以及跳转执行第二迭代步骤。
根据本公开的一个或多个实施例,所述基于所述至少一个目 标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇,包括:将待聚类问题导入经调整的特征提取子模型,得到第三特征向量;根据第三特征向量和目标类簇中心,确定待聚类问题所属的类簇。
根据本公开的一个或多个实施例,一种电子设备,包括:一个或多个处理器;存储装置,用于存储一个或多个程序,当所述一个或多个程序被所述一个或多个处理器执行,使得所述一个或多个处理器实现如任一信息处理方法所述的方法。
根据本公开的一个或多个实施例,一种计算机可读介质,其上存储有计算机程序,该程序被处理器执行时实现任一如信息处理方法所述的方法。
以上描述仅为本公开的较佳实施例以及对所运用技术原理的说明。本领域技术人员应当理解,本公开中所涉及的公开范围,并不限于上述技术特征的特定组合而成的技术方案,同时也应涵盖在不脱离上述公开构思的情况下,由上述技术特征或其等同特征进行任意组合而形成的其它技术方案。例如上述特征与本公开中公开的(但不限于)具有类似功能的技术特征进行互相替换而形成的技术方案。
此外,虽然采用特定次序描绘了各操作,但是这不应当理解为要求这些操作以所示出的特定次序或以顺序次序执行来执行。在一定环境下,多任务和并行处理可能是有利的。同样地,虽然在上面论述中包含了若干具体实现细节,但是这些不应当被解释为对本公开的范围的限制。在单独的实施例的上下文中描述的某些特征还可以组合地实现在单个实施例中。相反地,在单个实施例的上下文中描述的各种特征也可以单独地或以任何合适的子组合的方式实现在多个实施例中。
尽管已经采用特定于结构特征和/或方法逻辑动作的语言描述了本主题,但是应当理解所附权利要求书中所限定的主题未必局限于上面描述的特定特征或动作。相反,上面所描述的特定特 征和动作仅仅是实现权利要求书的示例形式。

Claims (14)

  1. 一种信息处理方法,其特征在于,包括:
    将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,其中,目标类簇中心指示类簇;
    基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇。
  2. 根据权利要求1所述的方法,其特征在于,所述聚类模型包括特征提取子模型,所述特征提取子模型用于生成待聚类问题对应的特征向量,特征向量用于聚类以确定目标类簇中心。
  3. 根据权利要求2所述的方法,其特征在于,所述特征提取子模型基于预先训练的分类模型的特征提取层得到。
  4. 根据权利要求3所述的方法,其特征在于,所述分类模型通过第一步骤得到,其中,所述第一步骤包括:
    获取训练样本,其中,训练样本的标签指示文本内容类型;
    基于所述训练样本和对应的标签,对待训练分类网络进行训练,得到所述分类模型,其中,所述待训练分类网络包括待训练特征提取层和分类层,分类模型中的特征提取层通过对待训练特征提取层进行训练得到。
  5. 根据权利要求4所述的方法,其特征在于,所述基于所述训练样本和对应的标签,对待训练分类网络进行训练,得到所述分类模型,包括:
    将至少两个训练样本导入待训练分类网络,得到所述至少两个训练样本对应的预测类型,其中,所述至少两个训练样本的标签各不相同;
    根据各个训练样本的预测类型和标签,确定各个训练样本的单个样本损失值;
    根据所确定的单个样本损失值,确定样本总损失值;
    基于所述样本总损失值,调整所述待训练分类网络的参数。
  6. 根据权利要求1-5中任一项所述的方法,其特征在于,所述将至少两个待聚类问题导入聚类模型,得到至少一个目标类簇中心,包括:
    将待聚类问题导入所述特征提取子模型,得到第一特征向量;
    基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心。
  7. 根据权利要求6所述的方法,其特征在于,所述初始类簇中心通过采用均值聚类算法对第一特征向量进行聚类得到。
  8. 根据权利要求6所述的方法,其特征在于,所述基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心,包括:
    将初始类簇中心确定为第一候选类簇中心;
    基于第一候选类簇中心,执行以下第一迭代步骤:基于第一候选类簇中心和第一特征向量,确定待聚类问题属于各个第一候选类簇中心的第一概率值;对各个第一概率值进行强化处理,得到第一强化值;根据第一强化值和第一概率值,生成第一损失值;响应于确定第一停止条件满足,将第一候选类簇中心确定为目标类簇中心并输出;
    响应于确定第一停止条件不满足,基于生成的第一损失值进行反向传播,调整第一候选类簇中心得到新的第一候选类簇中心,以及跳转执行第一迭代步骤。
  9. 根据权利要求8所述的方法,其特征在于,所述基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇,包括:
    根据第一特征向量和目标类簇中心,确定待聚类问题所属的类簇。
  10. 根据权利要求6所述的方法,其特征在于,所述基于反向传播算法和第一特征向量,更新初始类簇中心,得到所述至少一个目标类簇中心,包括:
    将初始类簇中心确定为第二候选类簇中心,以及将第一特征向量确定为第二特征向量;
    基于第二候选类簇中心和第二特征向量,执行以下第二迭代步骤:基于第二候选类簇中心和第二特征向量,确定待聚类问题属于各个第二候选类簇中心的第二概率值;对各个第二概率值进行强化处理,得到第二强化值;根据第二强化值和第二概率值,生成第二损失值;响应于确定第二停止条件满足,将第二候选类簇中心确定为目标类簇中心并输出;
    响应于确定第二停止条件不满足,基于生成的第二损失值进行反向传播,调整第二候选类簇中心得到新的第二候选类簇中心,以及基于生成的第二损失值进行反向传播调整特征提取子模型的参数,以及将待聚类问题导入调整后的特征提取子模型得到新的第二特征向量,以及跳转执行第二迭代步骤。
  11. 根据权利要求10所述的方法,其特征在于,所述基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇,包括:
    将待聚类问题导入经调整的特征提取子模型,得到第三特征向量;
    根据第三特征向量和目标类簇中心,确定待聚类问题所属的类簇。
  12. 一种信息处理装置,其特征在于,包括:
    生成单元,用于将至少两个待聚类问题导入聚类模型,得到 至少一个目标类簇中心,其中,目标类簇中心指示类簇;
    确定单元,用于基于所述至少一个目标类簇中心,将所述至少两个待聚类问题确定为至少一个类簇。
  13. 一种电子设备,其特征在于,包括:
    至少一个处理器;
    存储装置,用于存储至少一个程序,
    当所述一个或多个程序被所述至少一个处理器执行,使得所述至少一个处理器实现如权利要求1-11中任一所述的方法。
  14. 一种计算机可读介质,其上存储有计算机程序,其特征在于,该程序被处理器执行时实现如权利要求1-11中任一所述的方法。
PCT/CN2021/135402 2020-12-07 2021-12-03 信息处理方法、装置和电子设备 WO2022121801A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011432971.5 2020-12-07
CN202011432971.5A CN112650841A (zh) 2020-12-07 2020-12-07 信息处理方法、装置和电子设备

Publications (1)

Publication Number Publication Date
WO2022121801A1 true WO2022121801A1 (zh) 2022-06-16

Family

ID=75350578

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/135402 WO2022121801A1 (zh) 2020-12-07 2021-12-03 信息处理方法、装置和电子设备

Country Status (2)

Country Link
CN (1) CN112650841A (zh)
WO (1) WO2022121801A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115422480A (zh) * 2022-10-31 2022-12-02 荣耀终端有限公司 事件发生地区域的确定方法、设备及存储介质
CN115709356A (zh) * 2022-08-31 2023-02-24 深圳前海瑞集科技有限公司 焊接工艺参数获取方法、装置、电子设备及存储介质
CN116340831A (zh) * 2023-05-24 2023-06-27 京东科技信息技术有限公司 一种信息分类方法、装置、电子设备及存储介质
CN117273313A (zh) * 2023-09-08 2023-12-22 中关村科学城城市大脑股份有限公司 水网调蓄方法、装置、电子设备和计算机可读介质

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112650841A (zh) * 2020-12-07 2021-04-13 北京有竹居网络技术有限公司 信息处理方法、装置和电子设备
CN115495793B (zh) * 2022-11-17 2023-04-07 中关村科学城城市大脑股份有限公司 多集问题安全发送方法、装置、设备和介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180114055A1 (en) * 2016-10-25 2018-04-26 VMAXX. Inc. Point to Set Similarity Comparison and Deep Feature Learning for Visual Recognition
CN108564102A (zh) * 2018-01-04 2018-09-21 百度在线网络技术(北京)有限公司 图像聚类结果评价方法和装置
CN108764319A (zh) * 2018-05-21 2018-11-06 北京京东尚科信息技术有限公司 一种样本分类方法和装置
CN109344154A (zh) * 2018-08-22 2019-02-15 中国平安人寿保险股份有限公司 数据处理方法、装置、电子设备及存储介质
CN110020022A (zh) * 2019-01-03 2019-07-16 阿里巴巴集团控股有限公司 数据处理方法、装置、设备及可读存储介质
CN112650841A (zh) * 2020-12-07 2021-04-13 北京有竹居网络技术有限公司 信息处理方法、装置和电子设备

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107656948B (zh) * 2016-11-14 2019-05-07 平安科技(深圳)有限公司 自动问答系统中的问题聚类处理方法及装置
US11436428B2 (en) * 2017-06-06 2022-09-06 Sightline Innovation Inc. System and method for increasing data quality in a machine learning process
CN109388674B (zh) * 2018-08-31 2022-11-15 创新先进技术有限公司 数据处理方法、装置、设备及可读存储介质
CN109389166A (zh) * 2018-09-29 2019-02-26 聚时科技(上海)有限公司 基于局部结构保存的深度迁移嵌入聚类机器学习方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180114055A1 (en) * 2016-10-25 2018-04-26 VMAXX. Inc. Point to Set Similarity Comparison and Deep Feature Learning for Visual Recognition
CN108564102A (zh) * 2018-01-04 2018-09-21 百度在线网络技术(北京)有限公司 图像聚类结果评价方法和装置
CN108764319A (zh) * 2018-05-21 2018-11-06 北京京东尚科信息技术有限公司 一种样本分类方法和装置
CN109344154A (zh) * 2018-08-22 2019-02-15 中国平安人寿保险股份有限公司 数据处理方法、装置、电子设备及存储介质
CN110020022A (zh) * 2019-01-03 2019-07-16 阿里巴巴集团控股有限公司 数据处理方法、装置、设备及可读存储介质
CN112650841A (zh) * 2020-12-07 2021-04-13 北京有竹居网络技术有限公司 信息处理方法、装置和电子设备

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115709356A (zh) * 2022-08-31 2023-02-24 深圳前海瑞集科技有限公司 焊接工艺参数获取方法、装置、电子设备及存储介质
CN115422480A (zh) * 2022-10-31 2022-12-02 荣耀终端有限公司 事件发生地区域的确定方法、设备及存储介质
CN116340831A (zh) * 2023-05-24 2023-06-27 京东科技信息技术有限公司 一种信息分类方法、装置、电子设备及存储介质
CN116340831B (zh) * 2023-05-24 2024-02-06 京东科技信息技术有限公司 一种信息分类方法、装置、电子设备及存储介质
CN117273313A (zh) * 2023-09-08 2023-12-22 中关村科学城城市大脑股份有限公司 水网调蓄方法、装置、电子设备和计算机可读介质
CN117273313B (zh) * 2023-09-08 2024-05-24 中关村科学城城市大脑股份有限公司 水网调蓄方法、装置、电子设备和计算机可读介质

Also Published As

Publication number Publication date
CN112650841A (zh) 2021-04-13

Similar Documents

Publication Publication Date Title
WO2022121801A1 (zh) 信息处理方法、装置和电子设备
US11620532B2 (en) Method and apparatus for generating neural network
CN108416310B (zh) 用于生成信息的方法和装置
KR102308002B1 (ko) 정보 생성 방법 및 장치
CN111666416B (zh) 用于生成语义匹配模型的方法和装置
CN111104599B (zh) 用于输出信息的方法和装置
CN112364860A (zh) 字符识别模型的训练方法、装置和电子设备
CN111738010B (zh) 用于生成语义匹配模型的方法和装置
US11763204B2 (en) Method and apparatus for training item coding model
WO2023143016A1 (zh) 特征提取模型的生成方法、图像特征提取方法和装置
CN111460288B (zh) 用于检测新闻事件的方法和装置
CN110457325B (zh) 用于输出信息的方法和装置
CN111680799B (zh) 用于处理模型参数的方法和装置
CN111008213B (zh) 用于生成语言转换模型的方法和装置
US20210004406A1 (en) Method and apparatus for storing media files and for retrieving media files
CN113033707B (zh) 视频分类方法、装置、可读介质及电子设备
CN113051933A (zh) 模型训练方法、文本语义相似度确定方法、装置和设备
CN113033682B (zh) 视频分类方法、装置、可读介质、电子设备
CN112241761B (zh) 模型训练方法、装置和电子设备
CN112417260B (zh) 本地化推荐方法、装置及存储介质
CN115700548A (zh) 用户行为预测的方法、设备和计算机程序产品
CN112685516A (zh) 一种多路召回推荐方法、装置、电子设备及介质
CN112734462B (zh) 一种信息推荐方法、装置、设备及介质
CN113283115B (zh) 图像模型生成方法、装置和电子设备
CN117857388B (zh) 交换机运行信息检测方法、装置、电子设备与计算机介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21902508

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21902508

Country of ref document: EP

Kind code of ref document: A1