CN116527620A - Machine learning transmission method, device and storage medium based on multiple message bodies - Google Patents

Machine learning transmission method, device and storage medium based on multiple message bodies

Info

Publication number
CN116527620A
Authority
CN
China
Prior art keywords
cluster
machine learning
classification
user database
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310744857.3A
Other languages
Chinese (zh)
Inventor
沈浩
韩松乔
李威伟
毛杨
毛志国
段云湖
李万松
孙晓明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhixun Information Technology Co ltd
Original Assignee
Shanghai Zhixun Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Zhixun Information Technology Co ltd
Priority to CN202310744857.3A
Publication of CN116527620A

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/06Message adaptation to terminal or network requirements
    • H04L51/063Content adaptation, e.g. replacement of unsuitable content
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/21Monitoring or handling of messages
    • H04L51/214Monitoring or handling of messages using selective forwarding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/2866Architectures; Arrangements
    • H04L67/30Profiles
    • H04L67/306User profiles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/55Push-based network services
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00Reducing energy consumption in communication networks
    • Y02D30/70Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the application disclose a machine learning transmission method, device and storage medium based on multiple message bodies. The method comprises: acquiring existing user data, and classifying users with a machine learning classification algorithm and a clustering algorithm to obtain a corresponding classification cluster user database and clustering cluster user database; acquiring information of multiple message bodies to be sent, and, based on the classification cluster user database, selecting the corresponding classification clusters to which the information of each message body is sent; when a reissue strategy is triggered, selecting the corresponding cluster based on the clustering cluster user database and sending to it the information that triggered the reissue strategy; and acquiring the users' behavior feedback data after the information is sent, and using it as input data for machine learning with the classification algorithm and the clustering algorithm, so as to update the classification cluster user database and the clustering cluster user database.

Description

Machine learning transmission method, device and storage medium based on multiple message bodies
Technical Field
The present disclosure relates to the field of computer information processing technologies, and in particular, to a machine learning sending method and device based on multiple message bodies, and a storage medium.
Background
As the number of message types grows, the ways of sending messages become increasingly diverse. Current types include conventional short messages (text messages), 5G messages, multimedia messages and rich media card messages. Different messages need to be sent to different user groups. For example, users whose terminals support 5G messages should be sent 5G messages; users whose handset manufacturers support card messages should be sent rich media card messages; ordinary smartphone users should be sent multimedia messages; and all other users should be sent short text messages. At present, sending is done simply by short message or by multimedia-type message, without knowing which users should receive which kind of message. In many cases the information sending operator collects the users' terminals and other user attributes offline and divides the sending groups manually before sending, which is laborious and yields low efficiency and poor sending accuracy. There is currently no method that continuously optimizes the sending groups by means of machine learning.
Disclosure of Invention
An object of the embodiments of the present application is to provide a machine learning transmission method, device and storage medium based on multiple message bodies, so as to solve the problem in the prior art that the information sending operator collects the users' terminals and other user attributes offline and divides the sending groups manually before sending, which is laborious and yields low efficiency and poor sending accuracy.
To achieve the above object, an embodiment of the present application provides a machine learning transmission method based on multiple message bodies, including: acquiring existing user data, and classifying users by using a machine learning classification algorithm and a clustering algorithm to obtain a corresponding classification cluster user database and a clustering cluster user database;
acquiring information of a plurality of message bodies to be transmitted, and selecting corresponding classification cluster groups to respectively transmit information of different message bodies based on the classification cluster group user database;
when triggering the reissue strategy, selecting a corresponding cluster group based on the cluster group user database to send information triggering the reissue strategy;
and acquiring behavior feedback data of the user after information is sent out, and performing machine learning of a classification algorithm and a clustering algorithm as machine learning input data so as to update the classification cluster user database and the clustering cluster user database.
Optionally, the information with multiple message bodies includes:
text messages, 5G messages, multimedia messages, and/or rich media card messages.
Optionally, the classifying the user by using the machine learning classification algorithm to obtain a corresponding classification cluster user database includes:
obtaining information samples of the various message bodies, and measuring the similarity between the user data and the samples by distance; specifically, computing the distance over the multiple attributes of a sample using the Euclidean distance in multi-dimensional space to obtain the similarity, and obtaining the corresponding classification cluster user database based on a distance threshold.
Optionally, the classifying the users by using the clustering algorithm of machine learning to obtain a corresponding clustered user database includes:
acquiring characteristics of user data, carrying out characteristic standardization, selecting the most effective characteristics, converting the selected characteristics, and extracting representative characteristics;
performing similarity measurement based on a specific measurement function to obtain centers of all clusters and user groups of each sample;
and analyzing the clustering result by using a clustering similarity measurement method and a data element distance measurement method.
Optionally, the obtaining the center of each cluster includes:
sequentially updating the values of the clustering centers by an iterative method;
specifically: randomly selecting suitable initial centers for the 4 clusters, denoted $\mu_1, \mu_2, \mu_3, \mu_4$;
in the k-th iteration, for any sample, computing its distance to each center and assigning the sample to the cluster whose center is nearest, according to the formula $c_i = \arg\min_j \lVert x_i - \mu_j \rVert$, where $c_i$ identifies the center point at minimum distance from the i-th sample $x_i$, $\mu_j$ is the j-th center point, $\lVert x_i - \mu_j \rVert$ is the distance from the i-th sample to the j-th center point, and $\arg\min$ selects the center point to which the sample's distance is shortest;
updating the center value of each cluster using the formula $\mu_j = \frac{1}{N_j} \sum_{x_i \in C_j} x_i$, where $\mu_j$ is the sought center point of the j-th cluster, $C_j$ is the corresponding j-th cluster, $\sum_{x_i \in C_j} x_i$ is the sum of the samples assigned to the j-th cluster, and $N_j$ is the total number of those samples.
Optionally, the method further comprises:
taking a random point as a starting point during initialization;
in the iterative process, the center of gravity or the mass center of all data points of the same cluster is taken as a new center point;
all data points are assigned to the closest center point thereto.
Optionally, when information needs to be sent to a new user not in the classification cluster user database and the cluster user database, the information is sent in a mode designated by the new user.
Optionally, when the behavior feedback data of the new user not in the classification cluster user database and the cluster user database is acquired, machine learning of a classification algorithm and a clustering algorithm is performed as machine learning input data to update the classification cluster user database and the cluster user database.
To achieve the above object, the present application further provides a machine learning transmission device based on multiple message bodies, including: a memory; and
a processor coupled to the memory, the processor configured to perform the steps of the method as described above.
To achieve the above object, the present application also provides a computer storage medium having stored thereon a computer program which, when executed by a machine, implements the steps of the method as described above.
The embodiment of the application has the following advantages:
the embodiment of the application provides a machine learning sending method based on various message bodies, which comprises the following steps: acquiring existing user data, and classifying users by using a machine learning classification algorithm and a clustering algorithm to obtain a corresponding classification cluster user database and a clustering cluster user database; acquiring information of a plurality of message bodies to be transmitted, and selecting corresponding classification cluster groups to respectively transmit information of different message bodies based on the classification cluster group user database; when triggering the reissue strategy, selecting a corresponding cluster group based on the cluster group user database to send information triggering the reissue strategy; and acquiring behavior feedback data of the user after information is sent out, and performing machine learning of a classification algorithm and a clustering algorithm as machine learning input data so as to update the classification cluster user database and the clustering cluster user database.
With this method, the clusters of sending users are first divided from the initially available base data, and the cluster database is then updated through machine learning, so that information of the corresponding message body type can be sent more accurately. This addresses the current problem of not knowing which type of message should be sent to which account. It also solves the prior-art problem that the information sending operator collects the users' terminals and other user attributes offline and divides the sending groups manually before sending, which is laborious and yields low efficiency and poor sending accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It will be apparent to those skilled in the art from this disclosure that the drawings described below are merely exemplary and that other embodiments may be derived from the drawings provided without undue effort.
Fig. 1 is a flowchart of a machine learning transmission method based on multiple message bodies according to an embodiment of the present application;
Fig. 2 is a logic block diagram of a machine learning transmission method based on multiple message bodies according to an embodiment of the present application;
Fig. 3 is a block diagram of a machine learning transmitting device based on multiple message bodies according to an embodiment of the present application.
Detailed Description
Other advantages and effects of the present application will become apparent to those skilled in the art from the following description of specific embodiments. The embodiments described are some, but not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art from the present disclosure without inventive effort fall within the scope of the present disclosure.
In addition, the technical features described below in the different embodiments of the present application may be combined with each other as long as they do not collide with each other.
Because there are now more types of information to send, the method introduces machine learning. It first uses a classification algorithm to classify existing users by supervised learning, for example into designated 5G user groups, rich media card user groups, multimedia message user groups and text message user groups.
In actual sending business there are also user groups that supervised learning alone cannot divide. To supplement the classification algorithm, an unsupervised clustering algorithm divides the sending users into multiple clusters; long-term training of the model yields a large number of clusters, which refine the user group division so that the appropriate information type can be sent to each user in other business scenarios. For example, if a 5G message sent to a user falls back (meaning the user's terminal does not support 5G), a reissue strategy is looked up in the clusters produced by the clustering algorithm (for example, if clustering shows that the user's attributes suit multimedia messages, a multimedia message is reissued to the user). Specific examples follow.
An embodiment of the present application provides a machine learning transmission method based on multiple message bodies. Referring to Fig. 1 and Fig. 2, Fig. 1 is a flowchart of the method and Fig. 2 is a logic block diagram of the method. It should be understood that the method may include additional blocks not shown and/or may omit blocks that are shown; the scope of the present application is not limited in this respect.
At step 101, existing user data is obtained, and the users are classified by using a machine learning classification algorithm and a clustering algorithm, so as to obtain a corresponding classification cluster user database and a clustering cluster user database.
Specifically, the users' basic data and behaviors are collected from the data accumulated by the company's business: the various message bodies sent to the user's mobile phone number, user behaviors of a business nature, and features such as the brand and model of the user's handset and the user's gender and age.
The message types to be sent determine how the user groups are divided, and each user is placed into the corresponding group according to their data (for example, 5G messages correspond to users whose terminals support 5G messages; rich media card messages correspond to users whose handsets and manufacturers support them; short text messages correspond to non-smartphones; and multimedia messages correspond to general smart terminal users).
In some embodiments, based on the group division above: (1) the number of classes is 4; (2) the classified user groups are: 5G message, rich media message, multimedia message and text short message; (3) classification criterion: computation or logical judgment.
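The four-way division by logical judgment can be illustrated with a minimal Python sketch. The `User` record, its field names, the group labels and the priority order of the checks are all assumptions made for illustration; the source only states that there are four groups and that the criterion is computation or logical judgment.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical user record; these fields are illustrative assumptions.
    phone: str
    supports_5g_message: bool
    supports_rich_card: bool
    is_smartphone: bool


def assign_group(user: User) -> str:
    """Place a user into one of the four message groups by simple rules.

    The order of the checks (richest message type first) is an assumption.
    """
    if user.supports_5g_message:
        return "5g_message"
    if user.supports_rich_card:
        return "rich_media_card"
    if user.is_smartphone:
        return "multimedia_message"
    return "text_message"


users = [
    User("13800000001", True, True, True),
    User("13800000002", False, True, True),
    User("13800000003", False, False, True),
    User("13800000004", False, False, False),
]
groups = [assign_group(u) for u in users]
```

Rules of this kind could supply the labeled samples that the supervised classifier later learns from.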
In some embodiments, the classifying the user by using the machine learning classification algorithm to obtain a corresponding classification cluster user database includes:
obtaining information samples of the various message bodies, and measuring the similarity between the user data and the samples by distance; specifically, computing the distance over the multiple attributes of a sample using the Euclidean distance in multi-dimensional space to obtain the similarity, and obtaining the corresponding classification cluster user database based on a distance threshold.
Specifically, using the KNN (k nearest neighbor) classification algorithm, a sample of the message type is given, such as: 5G message samples. KNN uses distance to measure similarity to samples (steps are distance calculation, neighbor finding, classification), for example:
for a feature point $(x_1, y_1)$ and a sample $(x_2, y_2)$, the distance is $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$, where $x_2, y_2$ are the sample coordinate values, $x_1, y_1$ are the feature coordinate values, and $d$ is the distance from the feature to the sample (the distance between two points in plane analytic geometry);
for computing distance over the multiple attributes of a sample, the Euclidean distance in multi-dimensional space is used:
$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$, where $x, y$ are two samples, $x_i$ and $y_i$ are their features, and $n$ is the total number of features;
Taking 5G message users as an example, the device attributes relevant to carrying 5G messages include dimensions such as: 1. the smartphone operating system; 2. the handset manufacturer; 3. the handset's release date; 4. the 5G message app version. Apple iOS devices as a whole, and Android phones produced before 2020, do not support 5G messages. During learning by the classification algorithm their computed distance therefore exceeds the KNN distance threshold, and such users are not classified as 5G users.
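The KNN steps described above (distance calculation, neighbor finding, classification, with a distance threshold) can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the numeric feature encoding `[is_android, years_after_2020, has_5g_app]`, the sample data, `k=3` and the threshold `1.5` are all assumed values.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Multi-dimensional Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_classify(user, samples, k=3, threshold=1.5):
    """Vote among the k nearest labeled samples that lie within the threshold.

    samples is a list of (feature_vector, label) pairs. Returns None when no
    neighbor is close enough, i.e. the user cannot be classified.
    """
    ranked = sorted(samples, key=lambda s: euclidean(user, s[0]))
    neighbours = [s for s in ranked[:k] if euclidean(user, s[0]) <= threshold]
    if not neighbours:
        return None
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Assumed encoding: [is_android, years_after_2020, has_5g_app]
samples = [
    ([1, 1, 1], "5g_message"),
    ([1, 2, 1], "5g_message"),
    ([1, 3, 1], "5g_message"),
    ([1, -3, 0], "multimedia_message"),
    ([1, -4, 0], "multimedia_message"),
    ([0, 1, 0], "multimedia_message"),
]

recent_android = knn_classify([1, 2, 1], samples)  # close to the 5G samples
very_old_phone = knn_classify([1, -8, 0], samples)  # beyond the threshold
```

A user whose nearest samples all lie beyond the threshold is left unclassified, which mirrors the handling of pre-2020 handsets described above.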
In some embodiments, the classifying the users by using the clustering algorithm of machine learning to obtain the corresponding clustered user database includes:
acquiring characteristics of user data, carrying out characteristic standardization, selecting the most effective characteristics, converting the selected characteristics, and extracting representative characteristics;
performing similarity measurement based on a specific measurement function to obtain centers of all clusters and user groups of each sample;
and analyzing the clustering result by using a clustering similarity measurement method and a data element distance measurement method.
In some embodiments, the deriving the center of each cluster includes:
and successively updating the values of the clustering centers by an iterative method.
Specifically, a K-means algorithm of partitional clustering (the mainstream clustering algorithm is divided into two types, namely partitional clustering and hierarchical clustering) is used;
the steps of using partitional clustering are:
(1) Data preparation: the features are normalized.
(2) Feature selection: the most efficient feature is selected.
(3) Feature extraction: the selected features are converted to extract representative features.
(4) Clustering: measure similarity with a specific metric function (see the similarity metrics in Table 1), so that data in the same cluster are as similar as possible and data in different clusters are as far apart as possible, yielding the center of each cluster and the user group of each sample.
(5) Evaluation: analyze the clustering result using cluster similarity measures such as distance error and sum of squared errors (SSE), together with distance measures between data elements.
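Two of the steps above, feature standardization in step (1) and SSE evaluation in step (5), can be sketched in a few lines of Python. The z-score form of standardization is an assumption, since the source does not say which normalization is used; the function names are likewise illustrative.

```python
import math


def standardize(rows):
    """Z-score each feature column (assumed form of 'standardization').

    A constant column has zero spread, so its divisor falls back to 1.0.
    """
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sse(points, labels, centers):
    """Sum of squared errors between each point and its assigned center."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(point, centers[label]))
        for point, label in zip(points, labels)
    )


scaled = standardize([[1.0, 10.0], [3.0, 10.0]])
error = sse([[0.0, 0.0], [2.0, 0.0]], [0, 0], [[1.0, 0.0]])
```

A lower SSE for the same number of clusters indicates tighter clusters, which is one way to compare clustering runs.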
Table 1: Similarity metrics
Euclidean distance: $d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$, where $x, y$ are two samples, $x_i$ and $y_i$ are their features, and $n$ is the total number of features;
Manhattan distance: $d(x_i, \mu_j) = \sum_{k=1}^{n} \lvert x_{ik} - \mu_{jk} \rvert$, the Manhattan distance from sample $x_i$ to center point $\mu_j$; for example, if $x_i$ has coordinates $(a_1, a_2)$ and $\mu_j$ has coordinates $(b_1, b_2)$, the Manhattan distance from the sample to the center point is $\lvert a_1 - b_1 \rvert + \lvert a_2 - b_2 \rvert$;
Chebyshev distance: $d(a, b) = \max_{i} \lvert a_i - b_i \rvert$, the Chebyshev distance between the n-dimensional points (two n-dimensional vectors) $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$;
Minkowski distance: $d(x, y) = \left( \sum_{i=1}^{n} \lvert x_i - y_i \rvert^{p} \right)^{1/p}$, a general expression covering several distance metric formulas, where $p$ is a variable parameter and $n$ is the total number of features; when $p = 2$ it is the Euclidean distance.
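The four metrics of Table 1 are straightforward to express in code. The sketch below uses only the Python standard library; the function names are illustrative and not taken from the source.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


def euclidean(x, y):
    """Euclidean distance: the Minkowski distance with p = 2."""
    return minkowski(x, y, 2)


def manhattan(x, y):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def chebyshev(x, y):
    """Chebyshev distance: largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


sample, center = (0, 0), (3, 4)
distances = {
    "euclidean": euclidean(sample, center),
    "manhattan": manhattan(sample, center),
    "chebyshev": chebyshev(sample, center),
}
```

For the sample `(0, 0)` and center `(3, 4)` the three metrics give 5, 7 and 4 respectively, so the choice of metric changes which center counts as nearest.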
Flow using the K-means algorithm:
(1) Finding a center point using an iterative algorithm:
the algorithm clusters around k points in space, classifying the objects closest to them. And successively updating the values of the clustering centers by an iterative method.
Description of algorithm:
(1) randomly select suitable initial centers for the 4 clusters, denoted $\mu_1, \mu_2, \mu_3, \mu_4$;
(2) in the k-th iteration, for any sample, compute its distance to each center and assign the sample to the cluster whose center is nearest:
$c_i = \arg\min_j \lVert x_i - \mu_j \rVert$, where $c_i$ identifies the center point at minimum distance from the i-th sample $x_i$, $\mu_j$ is the j-th center point, $\lVert x_i - \mu_j \rVert$ is the distance from the i-th sample to the j-th center point, and $\arg\min$ selects the center point to which the sample's distance is shortest; in short, the formula computes the distance from each sample to every center point and records the smallest;
(3) update the center value of each cluster, for example by taking the mean:
$\mu_j = \frac{1}{N_j} \sum_{x_i \in C_j} x_i$, where $\mu_j$ is the sought center point of the j-th cluster, $C_j$ is the corresponding j-th cluster, $\sum_{x_i \in C_j} x_i$ is the sum of the samples assigned to the j-th cluster, and $N_j$ is the total number of those samples; that is, the center value is the mean obtained by dividing the sum of the cluster's samples by the number of samples in the cluster.
For example, taking the handset release date as the feature: for 5G phones, after k iterations over the samples the cluster center settles at a release date some time after 2020.
(2) After the center point is determined, each data point belongs to the center point closest to it.
(3) Updating the center point: taking a random point as a starting point during initialization; in the iterative process, the center of gravity (or centroid) of all data points of the same cluster is taken as a new center point.
(4) Assigning data points: all data points are assigned to the center point closest to it.
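The K-means flow above (initial centers, assignment to the nearest center, centroid update, repeat) can be sketched as below. This is a minimal standard-library illustration under stated assumptions: the `kmeans` signature, the `init` parameter and the release-year data are hypothetical, and the example uses 2 clusters rather than the 4 named in the source to keep it short.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0, init=None):
    """Plain K-means: assign each point to its nearest center, then recompute
    each center as the mean (centroid) of its assigned points."""
    rng = random.Random(seed)
    centers = [list(c) for c in (init or rng.sample(points, k))]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every point.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        # Update step: centroid of each cluster; an empty cluster keeps its center.
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # centers stopped moving, so the clustering has converged
        centers = new_centers
    return centers, labels


# Hypothetical one-dimensional feature: handset release year.
release_years = [(2015,), (2016,), (2017,), (2021,), (2022,), (2023,)]
centers, labels = kmeans(release_years, k=2, init=[(2015,), (2023,)])
```

With these inputs the two centers settle at 2016 and 2022, which is consistent with the release-date example in the text, where the 5G cluster center lies after 2020.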
At step 102, information of multiple message bodies to be sent is acquired, and corresponding classification cluster groups are selected to respectively send information of different message bodies based on the classification cluster group user database.
Specifically, through the steps, the existing user data of the system is subjected to a machine learning classification algorithm (KNN) and a clustering algorithm (K-means) to obtain a corresponding classification cluster user database and a clustering cluster database.
At step 103, when triggering the reissue strategy, selecting a corresponding cluster group based on the cluster group user database to send information triggering the reissue strategy.
Specifically, when information is sent (with multiple message bodies available), the corresponding information is sent to each cluster as divided by the classification algorithm. If another strategy applies, such as a reissue strategy, the cluster user database produced by the clustering algorithm is used as the basis for sending the information.
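The interplay of the two databases in steps 102 and 103 can be sketched as follows. Everything here is a hypothetical illustration: the dictionaries standing in for the databases, the `deliver` callback and its boolean return value are assumptions, not interfaces described in the source.

```python
def send_with_reissue(phone, classification_db, cluster_db, deliver):
    """Send by the classified type first; on failure, reissue by the type
    suggested by the clustering database. Returns the type that succeeded,
    or None if nothing was delivered."""
    primary = classification_db[phone]
    if deliver(phone, primary):
        return primary
    fallback = cluster_db.get(phone)
    if fallback and fallback != primary and deliver(phone, fallback):
        return fallback
    return None


# Stand-ins for the classification and clustering cluster user databases.
classification_db = {"13800000001": "5g_message"}
cluster_db = {"13800000001": "multimedia_message"}

attempts = []


def fake_deliver(phone, message_type):
    """Simulated gateway: every 5G message falls back, other types succeed."""
    attempts.append((phone, message_type))
    return message_type != "5g_message"


sent_as = send_with_reissue("13800000001", classification_db, cluster_db, fake_deliver)
```

Here the 5G attempt fails and the message is reissued as a multimedia message, matching the fallback example given earlier in the description.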
At step 104, behavior feedback data of the user after the information is sent is obtained and used as machine learning input data, and machine learning of a classification algorithm and a clustering algorithm is performed to update the classification cluster user database and the clustering cluster user database.
Specifically, after the information is sent, the users' behavior feedback data is used as learning input for machine learning (the classification algorithm and the clustering algorithm), and the user clusters are adjusted to obtain the user databases corresponding to the updated clusters.
In some embodiments, when information needs to be sent to a new user not in the categorized cluster user database and the clustered cluster user database, the information is sent in a manner specified by the new user.
In some embodiments, when the behavior feedback data of the new user not in the classified cluster user database and the clustered cluster user database is acquired, machine learning of a classification algorithm and a clustering algorithm is performed as machine learning input data to update the classified cluster user database and the clustered cluster user database.
Specifically, for any new user to be sent to (a user not in the classification and clustering databases), the sending mode approved by the user is adopted (typically multimedia messaging). The information fed back by users not in the cluster databases is used as learning input data, and the user databases are updated using the method described above.
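The handling of new users and the feedback update can be sketched as below. Note the loud assumption: a simple count of successful deliveries stands in for retraining the KNN and K-means models, which the source describes but does not specify in detail. The function names, the event tuple layout and the default type are illustrative.

```python
from collections import defaultdict

# Assumed default, following "typically multimedia messaging" in the text.
DEFAULT_TYPE = "multimedia_message"


def choose_message_type(phone, classification_db, user_choice=None):
    """Known users get their classified type; new users get the mode they
    specified, or the assumed default when none was specified."""
    if phone in classification_db:
        return classification_db[phone]
    return user_choice or DEFAULT_TYPE


def apply_feedback(feedback_events, classification_db):
    """Stand-in for retraining: move each user to the message type with the
    most successful deliveries. Events are (phone, message_type, delivered)."""
    successes = defaultdict(lambda: defaultdict(int))
    for phone, message_type, delivered in feedback_events:
        if delivered:
            successes[phone][message_type] += 1
    for phone, counts in successes.items():
        classification_db[phone] = max(counts, key=counts.get)
    return classification_db


db = {"a": "5g_message"}
first_send = choose_message_type("b", db)  # "b" is a new user
events = [
    ("b", "multimedia_message", True),
    ("b", "multimedia_message", True),
    ("b", "text_message", True),
    ("a", "5g_message", False),
]
db = apply_feedback(events, db)
```

After the feedback is applied, the new user "b" appears in the database, so later sends no longer need the default path.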
With this method, the clusters of sending users are first divided from the initially available base data, and the cluster database is then updated through machine learning, so that information of the corresponding message body type can be sent more accurately. This addresses the current problem of not knowing which type of message should be sent to which account. It also solves the prior-art problem that the information sending operator collects the users' terminals and other user attributes offline and divides the sending groups manually before sending, which is laborious and yields low efficiency and poor sending accuracy.
Fig. 3 is a block diagram of a machine learning transmitting device based on multiple message bodies according to an embodiment of the present application. The device comprises:
a memory 201; and a processor 202 connected to the memory 201, the processor 202 configured to: acquiring existing user data, and classifying users by using a machine learning classification algorithm and a clustering algorithm to obtain a corresponding classification cluster user database and a clustering cluster user database;
acquiring information of a plurality of message bodies to be transmitted, and selecting corresponding classification cluster groups to respectively transmit information of different message bodies based on the classification cluster group user database;
when triggering the reissue strategy, selecting a corresponding cluster group based on the cluster group user database to send information triggering the reissue strategy;
and acquiring behavior feedback data of the user after information is sent out, and performing machine learning of a classification algorithm and a clustering algorithm as machine learning input data so as to update the classification cluster user database and the clustering cluster user database.
In some embodiments, the information of multiple message bodies handled by the processor 202 includes:
text messages, 5G messages, multimedia messages, and/or rich media card messages.
In some embodiments, the processor 202 is further configured to classify the users using the machine learning classification algorithm to obtain the corresponding classification cluster user database by:
obtaining information samples of the various message bodies, and measuring the similarity between the user data and the samples by distance; specifically, computing the distance over the multiple attributes of a sample using the Euclidean distance in multi-dimensional space to obtain the similarity, and obtaining the corresponding classification cluster user database based on a distance threshold.
In some embodiments, the processor 202 is further configured such that classifying users using the machine learning clustering algorithm to obtain the corresponding clustering cluster user database comprises:
acquiring features of the user data, standardizing the features, selecting the most effective features, transforming the selected features, and extracting representative features;
performing similarity measurement based on a specific measurement function to obtain the center of each cluster and the user group to which each sample belongs;
and analyzing the clustering result using a cluster similarity measurement method and a data element distance measurement method.
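As an illustration of the feature standardization step only, the sketch below applies z-score scaling per feature column. The application does not name a standardization method, so z-scoring, the function name `standardize`, and the handling of constant columns are assumptions.

```python
import statistics


def standardize(rows):
    """Z-score each feature column: subtract the column mean, divide by the column std.

    A constant column has zero deviation; its divisor is set to 1.0 so the
    column maps to all zeros instead of raising a division error.
    """
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stds))
        for row in rows
    ]
```

Standardizing first keeps a feature with a large numeric range from dominating the distance computations used in the later clustering steps.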
In some embodiments, the processor 202 is further configured such that obtaining the center of each cluster comprises:
successively updating the value of each cluster center by an iterative method;
specifically: randomly selecting suitable initial centers for 4 clusters, denoted by $\mu_1, \mu_2, \mu_3, \mu_4$;
in each of the K iterations, for any sample, calculating its distance to each center and assigning the sample to the cluster whose center is nearest, using the formula: $c_i = \arg\min_j \lVert x_i - \mu_j \rVert$, where $c_i$ denotes the center point at minimum distance from the i-th sample, $\mu_j$ denotes the j-th center point, $\lVert x_i - \mu_j \rVert$ denotes the distance from the i-th sample to the j-th center point, and $\arg\min$ selects the center point at the shortest distance from the sample;
updating the center value of each cluster using the formula: $\mu_j = \frac{1}{N_j} \sum_{x_i \in C_j} x_i$, where $\mu_j$ is the center point of the j-th cluster being sought, $C_j$ denotes the j-th cluster, $\sum_{x_i \in C_j} x_i$ denotes the sum over the samples assigned to the j-th cluster, and $N_j$ denotes the total number of samples in that cluster.
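The iterative procedure described above (random initial centers, assignment of each sample to its nearest center, and recomputation of each center from its assigned samples) corresponds to standard k-means. The sketch below is a minimal version under assumptions: the function name, the `seed` parameter, drawing the initial centers from the samples themselves, and keeping an empty cluster's previous center are illustrative choices not stated in the application.

```python
import math
import random


def kmeans(points, n_clusters=4, n_iter=10, seed=0):
    """Cluster a list of equal-length tuples; return (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)  # random initial centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each sample goes to the cluster with the nearest center.
        labels = [
            min(range(n_clusters), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center becomes the mean of the samples assigned to it.
        for j in range(n_clusters):
            members = [p for p, c in zip(points, labels) if c == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels
```

Because the result depends on the random initial centers, different seeds can yield different clusterings; fixing `seed` makes a run reproducible.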
In some embodiments, the processor 202 is further configured to:
take randomly selected points as the starting points during initialization;
during iteration, take the center of gravity (centroid) of all data points in the same cluster as the new center point;
and assign every data point to its nearest center point.
In some embodiments, the processor 202 is further configured to: when information needs to be sent to a new user who is not in the classification cluster user database or the clustering cluster user database, send the information in the manner designated by the new user.
In some embodiments, the processor 202 is further configured to: when behavior feedback data is acquired for a new user who is not in the classification cluster user database or the clustering cluster user database, use that data as machine learning input data for the classification algorithm and the clustering algorithm, so as to update the classification cluster user database and the clustering cluster user database.
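The handling of new users in the two embodiments above can be sketched as follows. The function names, the dictionary `designated_mode` holding each new user's chosen sending manner, and the shape of the feedback rows are hypothetical; the sketch also assumes that both databases are built from the same user set, so membership is checked against the classification database only.

```python
def choose_body_type(user_id, classification_db, designated_mode):
    """Use the learned choice for known users and the user's own choice for new users."""
    if user_id in classification_db:
        return classification_db[user_id]
    return designated_mode[user_id]  # new user: send in the manner they designated


def record_feedback(user_id, body_type, opened, training_rows):
    """Queue behavior feedback (from known or new users) as input for retraining."""
    training_rows.append({"user": user_id, "body": body_type, "opened": opened})
```

Once a new user's feedback has been used for retraining and the databases are rebuilt, later sends to that user would take the first branch.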
Reference is made to the foregoing method embodiments for specific implementation methods, and details are not repeated here.
The present application may be a method, apparatus, system, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for performing the various aspects of the present application.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch cards or raised structures in a groove having instructions stored thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber optic cable), or electrical signals transmitted through a wire.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present application may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), is personalized using state information of the computer readable program instructions, and the electronic circuitry may execute the computer readable program instructions to implement aspects of the present application.
Various aspects of the present application are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Note that all features disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is only one example of a generic series of equivalent or similar features. Where the terms "further", "preferably", "still further" or "more preferably" are used, they introduce an abbreviated description of another embodiment that builds on the foregoing embodiment: the content following the term, combined with the foregoing embodiment, constitutes the complete other embodiment. Several such "further", "preferably", "still further" or "more preferably" provisions following the same embodiment may be combined arbitrarily to form yet another embodiment.
While the application has been described in detail with respect to the general description and specific embodiments thereof, it will be apparent to those skilled in the art that certain modifications and improvements may be made thereto based upon the application. Accordingly, such modifications or improvements may be made without departing from the spirit of the application and are intended to be within the scope of the invention as claimed.

Claims (10)

1. A machine learning sending method based on multiple message bodies, characterized by comprising:
acquiring existing user data, and classifying users using a machine learning classification algorithm and a machine learning clustering algorithm, respectively, to obtain a corresponding classification cluster user database and a corresponding clustering cluster user database;
acquiring information having multiple message bodies to be sent, and, based on the classification cluster user database, selecting the corresponding classification clusters to which the information of the different message bodies is respectively sent;
when a reissue strategy is triggered, selecting a corresponding clustering cluster based on the clustering cluster user database and sending to it the information that triggered the reissue strategy;
acquiring behavior feedback data of users after the information is sent, and using it as machine learning input data for the classification algorithm and the clustering algorithm, so as to update the classification cluster user database and the clustering cluster user database.
2. The machine learning sending method based on multiple message bodies according to claim 1, characterized in that the information having multiple message bodies comprises:
text messages, 5G messages, multimedia messages, and/or rich media card messages.
3. The machine learning sending method based on multiple message bodies according to claim 1, characterized in that classifying users using the machine learning classification algorithm to obtain the corresponding classification cluster user database comprises:
acquiring information samples of the multiple message bodies, and using distance to measure the similarity between the user data and the samples, specifically by computing the Euclidean distance in multi-dimensional space over the multiple attributes of the samples to obtain the similarity, and then obtaining the corresponding classification cluster user database based on a distance threshold.
4. The machine learning sending method based on multiple message bodies according to claim 1, characterized in that classifying users using the machine learning clustering algorithm to obtain the corresponding clustering cluster user database comprises:
acquiring features of the user data, standardizing the features, selecting the most effective features, transforming the selected features, and extracting representative features;
performing similarity measurement based on a specific measurement function to obtain the center of each cluster and the user group to which each sample belongs;
analyzing the clustering result using a cluster similarity measurement method and a data element distance measurement method.
5. The machine learning sending method based on multiple message bodies according to claim 4, characterized in that obtaining the center of each cluster comprises:
successively updating the value of each cluster center by an iterative method;
specifically: randomly selecting suitable initial centers for 4 clusters, denoted by $\mu_1, \mu_2, \mu_3, \mu_4$;
in each of the K iterations, for any sample, calculating its distance to each center and assigning the sample to the cluster whose center is nearest, using the formula: $c_i = \arg\min_j \lVert x_i - \mu_j \rVert$, where $c_i$ denotes the center point at minimum distance from the i-th sample, $\mu_j$ denotes the j-th center point, $\lVert x_i - \mu_j \rVert$ denotes the distance from the i-th sample to the j-th center point, and $\arg\min$ selects the center point at the shortest distance from the sample;
updating the center value of each cluster using the formula: $\mu_j = \frac{1}{N_j} \sum_{x_i \in C_j} x_i$, where $\mu_j$ is the center point of the j-th cluster being sought, $C_j$ denotes the j-th cluster, $\sum_{x_i \in C_j} x_i$ denotes the sum over the samples assigned to the j-th cluster, and $N_j$ denotes the total number of samples in that cluster.
6. The machine learning sending method based on multiple message bodies according to claim 5, characterized by further comprising:
taking randomly selected points as the starting points during initialization;
during iteration, taking the center of gravity (centroid) of all data points in the same cluster as the new center point;
assigning every data point to its nearest center point.
7. The machine learning sending method based on multiple message bodies according to claim 1, characterized by comprising:
when information needs to be sent to a new user who is not in the classification cluster user database or the clustering cluster user database, sending the information in the manner designated by the new user.
8. The machine learning sending method based on multiple message bodies according to claim 1, characterized by comprising:
when behavior feedback data is acquired for a new user who is not in the classification cluster user database or the clustering cluster user database, using that data as machine learning input data for the classification algorithm and the clustering algorithm, so as to update the classification cluster user database and the clustering cluster user database.
9. A machine learning sending device based on multiple message bodies, characterized by comprising:
a memory; and
a processor connected to the memory, the processor being configured to perform the steps of the method according to any one of claims 1 to 8.
10. A computer storage medium having a computer program stored thereon, characterized in that the computer program, when executed by a machine, implements the steps of the method according to any one of claims 1 to 8.
CN202310744857.3A 2023-06-25 2023-06-25 Machine learning transmission method, device and storage medium based on multiple message bodies Pending CN116527620A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310744857.3A CN116527620A (en) 2023-06-25 2023-06-25 Machine learning transmission method, device and storage medium based on multiple message bodies

Publications (1)

Publication Number Publication Date
CN116527620A true CN116527620A (en) 2023-08-01

Family

ID=87401432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310744857.3A Pending CN116527620A (en) 2023-06-25 2023-06-25 Machine learning transmission method, device and storage medium based on multiple message bodies

Country Status (1)

Country Link
CN (1) CN116527620A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9501744B1 (en) * 2012-06-11 2016-11-22 Dell Software Inc. System and method for classifying data
CN108734355A (en) * 2018-05-24 2018-11-02 国网福建省电力有限公司 A kind of short-term electric load method of parallel prediction and system applied to power quality harnessed synthetically scene
CN110097066A (en) * 2018-01-31 2019-08-06 阿里巴巴集团控股有限公司 A kind of user classification method, device and electronic equipment
CN110532429A (en) * 2019-09-04 2019-12-03 重庆邮电大学 It is a kind of based on cluster and correlation rule line on user group's classification method and device
CN112241494A (en) * 2020-12-10 2021-01-19 平安科技(深圳)有限公司 Key information pushing method and device based on user behavior data
WO2022105525A1 (en) * 2020-11-17 2022-05-27 深圳壹账通智能科技有限公司 Method and apparatus for predicting user probability, and computer device
CN114553813A (en) * 2022-01-17 2022-05-27 中国工商银行股份有限公司 Banking-based message push method and device, processor and electronic device
US11438439B1 (en) * 2021-03-31 2022-09-06 Microsoft Technology Licensing, Llc Detecting non-personal network and connectivity attributes for classifying user location
CN115208518A (en) * 2022-07-15 2022-10-18 腾讯科技(深圳)有限公司 Data transmission control method, device and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
熊赟等: "《PM2.5卫星遥感技术及其应用》", 上海科学技术出版社, pages: 100 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230801