CN114090869A - Target object processing method and device, electronic equipment and storage medium - Google Patents

Target object processing method and device, electronic equipment and storage medium

Info

Publication number
CN114090869A
Authority
CN
China
Prior art keywords
information
target object
target
clustering
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111194472.1A
Other languages
Chinese (zh)
Inventor
李艳艳
谭国铭
李晴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Taou Science & Technology Development Co ltd
Original Assignee
Beijing Taou Science & Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Taou Science & Technology Development Co ltd
Priority to CN202111194472.1A
Publication of CN114090869A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application provides a target object processing method and device, electronic equipment and a storage medium, and relates to the technical field of data processing. The method comprises the following steps: acquiring object information of a target object; the object information comprises behavior information and attribute information of the target object; clustering the object information, and determining the category information corresponding to the target object according to the clustering result; and determining resource pushing strategy information corresponding to the target object according to the category information. The embodiment of the application realizes accurate classification of the target object.

Description

Target object processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a target object processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
In scientific research in various fields, a plurality of variables reflecting objects need to be observed in order to analyze and find rules among the variables, so as to classify the objects.
In the prior art, objects, i.e. target objects, are generally classified based on measurement results represented by variables that can be measured directly. For example, students are classified and evaluated based on their learning performance, which can be represented by variables such as midterm grades, final grades and homework completion. However, the factors reflecting learning performance also include variables that cannot be measured directly, such as enthusiasm for learning. Using only the statistics of variables such as midterm and final grades and homework completion, students cannot be classified and evaluated in a multi-dimensional and objective way, so the classification of target objects is inaccurate.
Disclosure of Invention
The embodiment of the application provides a target object processing method and device, electronic equipment and a computer readable storage medium, which can solve the problem of inaccurate classification of target objects.
According to an aspect of an embodiment of the present application, there is provided a target object processing method, including:
acquiring object information of a target object; the object information comprises behavior information and attribute information of the target object;
clustering the object information, and determining the category information corresponding to the target object according to the clustering result;
and determining resource pushing strategy information corresponding to the target object according to the category information.
Optionally, the clustering the object information includes:
reducing the dimension of the object information to obtain the object information after dimension reduction;
and clustering the object information subjected to dimension reduction.
Optionally, the performing dimension reduction on the object information to obtain dimension-reduced object information includes:
extracting a first feature vector of each target object aiming at the object information;
splicing the first feature vectors to obtain a feature matrix;
and reducing the dimension of the characteristic matrix to obtain a target matrix, and taking the target matrix as the object information after dimension reduction.
Optionally, the performing dimension reduction on the feature matrix to obtain the target matrix includes:
inputting the characteristic matrix into a pre-trained mapper to obtain a target matrix; wherein the mapper indicates a mapping relationship between the feature matrix and the target matrix.
Optionally, the clustering the object information after the dimensionality reduction includes:
determining the space distance between the target objects based on the object information after dimension reduction;
constructing a minimum spanning tree based on the spatial distance;
generating a plurality of clustering clusters according to the minimum spanning tree;
and determining a clustering result based on the clustering cluster.
Optionally, the determining the spatial distance between the target objects based on the object information after the dimension reduction includes:
extracting a second feature vector of each target object based on the target matrix;
and respectively calculating the space distance between the target objects based on the second feature vectors.
Optionally, the building a minimum spanning tree based on the spatial distance includes:
and taking the mark of each target object as a vertex, taking the space distance as the weight of an edge between adjacent vertices, and constructing a minimum spanning tree according to the vertex and the weight of the edge.
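The minimum-spanning-tree clustering described above can be sketched as follows. This is only an illustration under stated assumptions, not the applicant's implementation: the spatial distance is taken to be Euclidean, the tree is built with Prim's algorithm, and clusters are obtained by removing tree edges longer than a hypothetical `cut_threshold` (the text here does not say how clusters are derived from the tree). The function name `mst_clusters` is likewise made up for the example.

```python
import math
from collections import defaultdict


def mst_clusters(points, cut_threshold):
    """Cluster identified points by cutting long edges of a minimum spanning tree.

    points: dict mapping an object identifier (vertex) to its reduced feature vector.
    cut_threshold: assumed rule; tree edges longer than this value are removed.
    """
    ids = list(points)
    if not ids:
        return []

    # Prim's algorithm on the complete graph; edge weight = Euclidean distance.
    # best[v] holds the cheapest known edge (weight, parent) linking v to the tree.
    best = {v: (math.dist(points[ids[0]], points[v]), ids[0]) for v in ids[1:]}
    mst_edges = []
    while best:
        v = min(best, key=lambda k: best[k][0])
        weight, parent = best.pop(v)
        mst_edges.append((parent, v, weight))
        for u in best:
            d = math.dist(points[v], points[u])
            if d < best[u][0]:
                best[u] = (d, v)

    # Keep only short edges; each connected component that remains is a cluster.
    adjacency = defaultdict(set)
    for a, b, weight in mst_edges:
        if weight <= cut_threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    clusters, seen = [], set()
    for start in ids:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

For instance, four objects forming two well-separated pairs yield two clusters when the threshold lies between the within-pair and between-pair distances.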
Optionally, the determining, according to the category information, resource pushing policy information corresponding to the target object includes:
counting the behavior information to obtain behavior result information of the target object; wherein the behavior result information indicates behavior characteristics of the target object;
and determining resource pushing strategy information corresponding to the target object according to the behavior result information and the category information.
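A minimal sketch of combining behavior statistics with category information might look like the following. The event format, the category labels, the policy names ("maintain", "boost", "re-engage") and the `active_threshold` parameter are all hypothetical and serve only to make the two steps concrete.

```python
from collections import Counter


def behavior_result(events):
    """Count behavior events by type (e.g. posts, comments, interactions)."""
    return Counter(event["type"] for event in events)


def choose_policy(category, result, active_threshold=5):
    """Pick a pushing policy from the category and the behavior statistics.

    The rules below are illustrative assumptions, not rules from the application.
    """
    total = sum(result.values())
    if category == "high-activity":
        return "maintain"
    if category == "potential" and total >= active_threshold:
        return "boost"
    return "re-engage"
```

The point of the design is that the same category can lead to different policies depending on the counted behavior.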
According to another aspect of the embodiments of the present application, there is provided a target object processing apparatus including:
the acquisition module is used for acquiring the object information of the target object; the object information comprises behavior information and attribute information of the target object;
the clustering module is used for clustering the object information and determining the category information corresponding to the target object according to the clustering result;
and the determining module is used for determining the resource pushing strategy information corresponding to the target object according to the category information.
Optionally, the clustering module includes:
the dimension reduction unit is used for reducing the dimension of the object information to obtain the object information after dimension reduction;
and the clustering unit is used for clustering the object information subjected to dimension reduction.
Optionally, the dimension reduction unit is configured to:
extracting a first feature vector of each target object aiming at the object information;
splicing the first feature vectors to obtain a feature matrix;
and reducing the dimension of the characteristic matrix to obtain a target matrix, and taking the target matrix as the object information after dimension reduction.
Optionally, the dimension reduction unit is configured to:
inputting the characteristic matrix into a pre-trained mapper to obtain a target matrix; wherein the mapper indicates a mapping relationship between the feature matrix and the target matrix.
Optionally, the clustering unit is configured to:
determining the space distance between the target objects based on the object information after dimension reduction;
constructing a minimum spanning tree based on the spatial distance;
generating a plurality of clustering clusters according to the minimum spanning tree;
and determining a clustering result based on the clustering cluster.
Optionally, the clustering unit includes:
extracting a second feature vector of each target object based on the target matrix;
and respectively calculating the space distance between the target objects based on the second feature vectors.
Optionally, the clustering unit includes:
and taking the mark of each target object as a vertex, taking the space distance as the weight of an edge between adjacent vertices, and constructing a minimum spanning tree according to the vertex and the weight of the edge.
Optionally, the determining module is configured to:
counting the behavior information to obtain behavior result information of the target object; wherein the behavior result information indicates behavior characteristics of the target object;
and determining resource pushing strategy information corresponding to the target object according to the behavior result information and the category information.
According to another aspect of an embodiment of the present application, there is provided an electronic apparatus including:
the device comprises a memory, a processor and a computer program stored on the memory, wherein the processor executes the computer program to realize the steps of the method shown in the first aspect of the embodiment of the application.
According to a further aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method as set forth in the first aspect of embodiments of the present application.
According to an aspect of embodiments of the present application, there is provided a computer program product comprising a computer program that, when executed by a processor, performs the steps of the method illustrated in the first aspect of embodiments of the present application.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
In the embodiment of the application, object information of the target object is first obtained and clustered to determine the category information of the target object. Because the object information includes both behavior information and attribute information, the target object is characterized along both static and dynamic dimensions and can therefore be classified accurately. In the prior art, objects, i.e. target objects, are classified based on measurement results represented by directly measurable variables. By contrast, the present application determines the resource pushing strategy information corresponding to the target object based on its category information, which allows multi-dimensional classification and evaluation of the target object, timely adjustment of the resource pushing strategy based on the category information, personalized resource pushing to the target object, and effective guidance of the target object's future behavior.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic view of an application scenario of a target object processing method according to an embodiment of the present application;
fig. 2 is a schematic flowchart of a target object processing method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of generating category information in a target object processing method according to an embodiment of the present application;
fig. 4 is a schematic flowchart illustrating training of an initial mapper in a target object processing method according to an embodiment of the present disclosure;
fig. 5 is a schematic flowchart of a clustering process in a target object processing method according to an embodiment of the present application;
fig. 6 is a flowchart illustrating an exemplary target object processing method according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a target object processing apparatus according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a target object processing electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application are described below in conjunction with the drawings in the present application. It should be understood that the embodiments set forth below in connection with the drawings are exemplary descriptions for explaining technical solutions of the embodiments of the present application, and do not limit the technical solutions of the embodiments of the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should be further understood that the terms "comprises" and/or "comprising", when used in this specification in connection with embodiments of the present application, specify the presence of stated features, information, data, steps, operations, elements, and/or components, but do not preclude the presence or addition of other features, information, data, steps, operations, elements, components, and/or groups thereof that are supported in the art. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein indicates at least one of the items defined by the term, e.g., "A and/or B" may be implemented as "A", as "B", or as "A and B".
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
In scientific research in various fields, it is often necessary to make a large number of observations of multiple variables that reflect the things under study, and to collect a large amount of data, in order to analyze the data and find regularities. Multivariate large samples undoubtedly provide rich information for scientific research, but they also increase the workload of data acquisition to some extent; more importantly, in most cases the variables may be correlated with one another, which increases the complexity of the problem and makes analysis inconvenient. If each index is analyzed separately, the analysis tends to be isolated rather than integrated, while blindly reducing the number of indices may lose much information and lead to erroneous conclusions. A reasonable method is therefore needed that reduces the number of indices to be analyzed while losing as little as possible of the information contained in the original indices, so that the collected data can be analyzed comprehensively. Because the variables are correlated to some degree, a small number of composite indices can be used to summarize the various kinds of information present in the variables. These composite indices are uncorrelated with one another, that is, the information they represent does not overlap; they are generally called factors, and the method is accordingly called factor analysis.
Factor analysis is a statistical technique for extracting common factors from a group of variables. It was first proposed by the British psychologist C. E. Spearman, who found that students' scores in different subjects were correlated: a student who did well in one subject often did well in the others. He therefore conjectured that some latent common factor, such as a general level of intelligence, influenced the students' academic performance. Factor analysis can find hidden representative factors among many variables. Grouping variables of the same nature into one factor reduces the number of variables, and hypotheses about the relationships between variables can also be tested.
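The idea can be written in the usual notation of the common factor model. The notation is not given in the application and is supplied here only as an aid: each of the p observed variables x_i is modeled as a linear combination of m < p latent common factors f_j plus a variable-specific term.

```latex
x_i = \mu_i + \sum_{j=1}^{m} \lambda_{ij} f_j + \varepsilon_i,
\qquad i = 1, \dots, p, \quad m < p
```

Here \mu_i is the mean of x_i, \lambda_{ij} is the loading of variable i on factor j, and \varepsilon_i is the specific error. Under the usual assumptions (uncorrelated unit-variance factors, and errors uncorrelated with the factors and with each other), the covariance of the observations decomposes as

```latex
\operatorname{Cov}(x) = \Lambda \Lambda^{\top} + \Psi
```

where \Lambda is the p-by-m loading matrix and \Psi is the diagonal matrix of specific variances. In Spearman's example, m = 1 and the single factor plays the role of the common ability that influences the scores in all subjects.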
Take the classification and evaluation of users on a production platform as an example. To achieve a production goal, it is necessary to know which user behaviors affect the production result and which factors can be manipulated to change it; different user behaviors may lead to different results. In this process, users need to be classified, and different strategies are applied to different groups of users according to their behaviors.
Specifically, the production goal may be an overall production increase of 5%, with different production increase targets set for different user categories, for example a 10% increase for user population 1 (high-potential users) and a 3% increase for user population 2 (low-potential users). It is therefore necessary to determine which factors affect production behavior and which factors (causes) can be manipulated to change the production state (effect). At the same time, a reasonable and durable way of grouping users is needed so that targeted incentive strategies can be explored; for example, incentive hypotheses can be formulated for specific groups and their characteristics. Factor analysis can be used to determine the user classification strategy, the characteristics of each user category, and the distribution of users across categories.
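As a worked illustration of how per-group targets relate to the overall target, assume (this is not stated in the application) that there are exactly two user populations and that population 1 contributes a share w of the baseline production. The figures quoted above are then mutually consistent when

```latex
w \cdot 10\% + (1 - w) \cdot 3\% = 5\%
\;\Longrightarrow\; 7w = 2
\;\Longrightarrow\; w = \tfrac{2}{7} \approx 28.6\%
```

that is, when high-potential users account for roughly 28.6% of baseline production. With a different share, or with more than two populations, the per-group targets would need to be adjusted to reach the same overall figure.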
The application provides a target object processing method, a target object processing device, electronic equipment and a computer-readable storage medium, and aims to solve the technical problem that in the prior art, target object classification is inaccurate.
The embodiment of the application provides a target object processing method, which can be realized by a terminal or a server. The terminal or the server related to the embodiment of the application can perform clustering processing on the obtained object information of the target object, so that the category information corresponding to the target object is determined, the technical scheme of the embodiment of the application can determine the resource pushing strategy information corresponding to the target object based on the category information, the purpose of accurately classifying the target object is achieved, and personalized resource pushing on the target object is realized.
The technical solutions of the embodiments of the present application and the technical effects produced by the technical solutions of the present application will be described below through descriptions of several exemplary embodiments. It should be noted that the following embodiments may be referred to, referred to or combined with each other, and the description of the same terms, similar features, similar implementation steps and the like in different embodiments is not repeated.
The target object processing method of the present application may be applied to the scenario shown in fig. 1. Specifically, the server 101 may first obtain object information of a target object from the client 102, cluster the object information to determine the category information corresponding to the target object, and then determine the resource pushing policy information corresponding to the target object based on the category information.
In the scenario shown in fig. 1, the target object processing method may be performed in the server, or in another scenario, may be performed in the terminal.
Those skilled in the art will understand that the "terminal" used herein may be a Mobile phone, a tablet computer, a PDA (Personal Digital Assistant), an MID (Mobile Internet Device), etc.; a "server" may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
An embodiment of the present application provides a target object processing method, and as shown in fig. 2, the method includes:
S201, acquiring object information of a target object; the object information includes behavior information and attribute information of the target object.
Specifically, the terminal or the server for processing the target object may obtain object information corresponding to the target from a preset database based on an identification query of the target object, and may also collect the object information of the target object in a specified time period in real time.
The target object may be an object to be statistically analyzed, the object information may include static data and dynamic data of the target object, the static data may be attribute information of the target object, and the dynamic data may be behavior information of the target object. Specifically, the attribute information may represent an identity of the target object, and the behavior information may represent a behavior activity performed by the target object.
In the embodiment of the application, the target object is described, by way of example, as a user of target social software. Object information may be obtained from a database corresponding to the social software, and includes user behavior information and user attribute information. The user behavior information may include production data of original content published by the user, comment data published by the user, interaction data of the user in the target social software, and the like. The user attribute information may include data such as the industry in which the user works, the user's years of work, and the user's educational background and work experience.
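One way to picture the object information is as a record that keeps dynamic (behavior) and static (attribute) data side by side. The sketch below is an assumption made for illustration: the class names, field names and the `as_text` helper do not come from the application, although the fields mirror the examples listed above.

```python
from dataclasses import dataclass, field


@dataclass
class BehaviorInfo:
    """Dynamic data: what the user did in the social software."""
    posts: list = field(default_factory=list)         # original content published
    comments: list = field(default_factory=list)      # comments published
    interactions: list = field(default_factory=list)  # other interaction records


@dataclass
class AttributeInfo:
    """Static data: who the user is."""
    industry: str = ""
    years_of_work: int = 0
    education: str = ""
    work_experience: str = ""


@dataclass
class ObjectInfo:
    object_id: str
    behavior: BehaviorInfo = field(default_factory=BehaviorInfo)
    attributes: AttributeInfo = field(default_factory=AttributeInfo)

    def as_text(self):
        """Flatten all non-empty fields into one string for feature extraction."""
        parts = [self.attributes.industry, str(self.attributes.years_of_work),
                 self.attributes.education, self.attributes.work_experience]
        parts += self.behavior.posts + self.behavior.comments + self.behavior.interactions
        return " ".join(p for p in parts if p)
```

Keeping the two kinds of data in separate sub-records makes it easy to count the behavior data on its own later while still feeding both into clustering.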
S202, clustering the object information, and determining the category information corresponding to the target object according to the clustering result.
Specifically, each target object corresponds to a piece of multidimensional object information, and a terminal or a server for processing the target object may perform dimensionality reduction on the object information of the target object, perform clustering processing on the object information after dimensionality reduction, and obtain multiple categories according to a clustering result, thereby determining category information corresponding to each target object.
Data dimensionality reduction is a preprocessing method for high-dimensional feature data: it retains the most important features of the high-dimensional data and removes noise and unimportant features, thereby increasing data processing speed. In actual production and application, dimension reduction can save a great deal of time and cost, provided the information loss stays within an acceptable range. It can also make data easier to use, reduce the computational overhead of algorithms, and make data processing results easier to understand.
In practical applications, dimension reduction algorithms are divided into linear and nonlinear algorithms. Linear dimension reduction algorithms include Singular Value Decomposition (SVD), Principal Component Analysis (PCA), and the like; nonlinear dimension reduction algorithms include UMAP (Uniform Manifold Approximation and Projection) and t-SNE (t-Distributed Stochastic Neighbor Embedding).
The clustering process groups and merges adjacent, similar classified regions using morphological operators. Clustering differs from classification in that the classes into which the data are to be divided are not known in advance. Clustering is the process of dividing data into different classes or clusters, so that objects in the same cluster are highly similar and objects in different clusters are highly dissimilar. From a statistical point of view, cluster analysis is a way of simplifying data through data modeling. Traditional statistical cluster analysis methods include systematic (hierarchical) clustering, the decomposition method, the addition method, dynamic clustering, ordered-sample clustering, overlapping clustering, fuzzy clustering, and the like.
In the embodiment of the application, the target object is again taken to be a user of target social software. The corresponding object information includes behavior information, such as production data of original content published by the user, comment data published by the user, and interaction data of the user in the target social software, and attribute information, such as the industry in which the user works, the user's years of work, educational background and work experience. The object information is multi-dimensional feature data. It can be reduced to two-dimensional data, which greatly reduces the number of feature dimensions while retaining as much as possible of the features of the original object information; the two-dimensional data is then clustered by a clustering algorithm to obtain the category information corresponding to each piece of object information.
S203, determining resource pushing strategy information corresponding to the target object according to the category information.
Specifically, the terminal or the server for processing the target object may display, in a preset interaction interface, interface controls corresponding to the category information and the preset resource pushing policy information, respectively, and then determine the resource pushing policy information corresponding to the target object based on the operation of the user.
The operation of the user includes:
an operation of dragging or moving the interface control to within a preset range of the interaction interface;
a click operation on the interface control;
and an input operation on the category information or the resource pushing policy information in the interface control.
In the embodiment of the application, the target object is again taken to be a user of target social software. The corresponding object information includes behavior information, such as production data of original content published by the user, comment data published by the user, and interaction data of the user in the target social software, and attribute information, such as the industry in which the user works, the user's years of work, educational background and work experience. The object information is multi-dimensional feature data; it can be reduced to two-dimensional data, which is then clustered by a clustering algorithm to obtain the category information corresponding to each piece of object information. For example, the object information can be divided into three categories: high-activity users, low-activity users and potential users. Technical personnel then configure corresponding resource pushing policy information for each category of users.
In the embodiment of the application, object information of the target object is first obtained and clustered to determine the category information of the target object. Because the object information includes both behavior information and attribute information, the target object is characterized along both static and dynamic dimensions and can therefore be classified accurately. In the prior art, objects, i.e. target objects, are classified based on measurement results represented by directly measurable variables. By contrast, the present application determines the resource pushing strategy information corresponding to the target object based on its category information, which allows multi-dimensional classification and evaluation of the target object, timely adjustment of the resource pushing strategy based on the category information, personalized resource pushing to the target object, and effective guidance of the target object's future behavior.
A possible implementation manner is provided in the embodiment of the present application, as shown in fig. 3, the clustering process performed on the object information in step S202 includes:
(1) reducing the dimension of the object information to obtain the object information after dimension reduction.
Specifically, the object information may be converted into a feature vector, and then the feature vector is subjected to dimension reduction to obtain the dimension-reduced object information, and the specific steps of the dimension reduction will be described in detail below.
In one possible implementation of the embodiments of the present application, reducing the dimension of the object information to obtain the dimension-reduced object information includes:
a. Extracting a first feature vector of each target object from the object information.
Specifically, data cleaning and data splicing may be performed on the object information in sequence to generate object text data, and word vector mapping may then be performed on the object text data to obtain the first feature vector.
Data cleaning may include the following two steps: first, punctuation and special symbols are removed from the object information to obtain initial text data; then, word segmentation is performed on the initial text data and stop words are removed, generating at least one phrase. When there is only one phrase, that phrase is used directly as the object text data; when there are two or more phrases, the phrases may be spliced together to obtain the object text data.
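The cleaning and splicing steps above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `clean_and_splice`, the tiny `STOP_WORDS` set, and whitespace-based word segmentation are hypothetical choices that do not come from the application. Chinese text would need a dedicated word segmenter in place of `split()`.

```python
import re

# Hypothetical stop-word list; a real system would load a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}


def clean_and_splice(object_info: str) -> str:
    """Turn raw object information into object text data."""
    # Step 1: remove punctuation and special symbols to get the initial text data.
    initial_text = re.sub(r"[^\w\s]", " ", object_info)
    # Step 2: segment into words (whitespace split, an assumption) and drop stop words.
    phrases = [w for w in initial_text.lower().split() if w not in STOP_WORDS]
    # A single phrase is used directly; two or more phrases are spliced together.
    if len(phrases) == 1:
        return phrases[0]
    return " ".join(phrases)


object_text = clean_and_splice("Posted 3 comments, in the #AI board!")
```

The resulting object text data is what would then be passed to word vector mapping.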
In the embodiments of the present application, after the object information of each target object is converted into a first feature vector, the head or the tail of each first feature vector may be padded with zero vectors, taking the first feature vector with the largest number of words as the reference, so that all first feature vectors have the same length. The object information of every target object then has a vector representation of the same length, and vector representations of object information with similar content have a larger cosine similarity, which facilitates the subsequent clustering.
b. Splicing the first feature vectors to obtain a feature matrix.
Each target object corresponds to one piece of object information, each piece of object information can be represented as a first feature vector, and the first feature vectors can be spliced to obtain the feature matrix.
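As a rough illustration of zero-padding, splicing, and cosine similarity, the sketch below treats each first feature vector as a flat list of floats and pads at the tail. The helper names `pad_and_stack` and `cosine_similarity` and the sample values are assumptions made for the example, not part of the application.

```python
import math


def pad_and_stack(vectors):
    """Zero-pad every vector at the tail to the longest length and stack them as rows."""
    max_len = max(len(v) for v in vectors)
    return [list(v) + [0.0] * (max_len - len(v)) for v in vectors]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up first feature vectors of different lengths, one per target object.
first_feature_vectors = [
    [1.0, 0.0, 2.0, 1.0],
    [1.0, 0.0],
    [0.0, 3.0, 0.0],
]
feature_matrix = pad_and_stack(first_feature_vectors)
```

After padding, every row of `feature_matrix` has the same length, so any two rows can be compared directly with `cosine_similarity`.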
c. Reducing the dimension of the feature matrix to obtain a target matrix, and taking the target matrix as the dimension-reduced object information.
In the embodiments of the present application, the object information includes multi-dimensional information such as the behavior information and attribute information of the target object, and the multi-dimensional data of the feature matrix can be reduced in dimension to facilitate the subsequent clustering.
In one possible implementation of the embodiments of the present application, reducing the dimension of the feature matrix to obtain the target matrix includes:
inputting the feature matrix into a pre-trained mapper to obtain the target matrix, wherein the mapper indicates a mapping relationship between the feature matrix and the target matrix.
The pre-trained mapper may be a UMAP (Uniform Manifold Approximation and Projection) mapper. UMAP is a non-linear dimensionality reduction algorithm based on three assumptions about the data: the data is uniformly distributed on a Riemannian manifold; the Riemannian metric is locally constant (or can be approximated as such); and the manifold is locally connected. Based on these assumptions, the manifold can be modeled with a fuzzy topological structure, and dimension reduction is accomplished by searching for a low-dimensional projection of the data that has the closest possible equivalent fuzzy topological structure.
Specifically, as shown in FIG. 4, the training of the UMAP mapper includes:
acquiring a preset initial mapper and a training set, wherein the training set includes a sample feature matrix and a sample target matrix corresponding to the sample feature matrix;
inputting the sample feature matrix into the initial mapper, which constructs an initial graph in the high-dimensional space based on the sample feature matrix, maps the initial graph to a low-dimensional space, and outputs a target matrix label;
calculating, based on a loss function, the difference between the target matrix label and the sample target matrix, and iteratively adjusting the parameters of the initial mapper; when the difference is smaller than a preset threshold, the initial mapper is considered to have converged, and the UMAP mapper is obtained.
(2) Clustering the dimension-reduced object information.
Specifically, the dimension-reduced object information may be clustered based on the HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) algorithm. The specific clustering steps are described in detail below.
In one possible implementation of the embodiments of the present application, clustering the dimension-reduced object information includes:
a. Determining the spatial distance between the target objects based on the dimension-reduced object information.
Specifically, each target object may be regarded as a spatial point, the dimension-reduced object information corresponding to the target object may be regarded as the value of that spatial point, and the distance between spatial points, that is, the spatial distance between the target objects, may be calculated based on the values of the spatial points.
In practical applications, determining clusters based on the object information requires finding regions where the spatial points are dense. Calculating the mutual reachability metric distance between spatial points pushes low-density spatial points further apart, so that the space is transformed according to density.
A possible implementation manner is provided in the embodiment of the present application, as shown in fig. 5, the determining a spatial distance between target objects based on the object information after the dimension reduction includes:
a1, extracting a second feature vector of each target object based on the target matrix;
a2, respectively calculating the space distance between the target objects based on the second feature vectors.
Wherein the spatial distance may be a mutual reachability metric distance, the mutual reachability metric distance between two spatial points may be calculated based on the following formula (1):
$$d_{\mathrm{mreach}\text{-}k}(a, b) = \max\{\mathrm{core}_k(a),\ \mathrm{core}_k(b),\ d(a, b)\} \tag{1}$$
where $d_{\mathrm{mreach}\text{-}k}(a, b)$ is the mutual reachability metric distance between spatial points $a$ and $b$, $\mathrm{core}_k(a)$ is the core distance of spatial point $a$, $\mathrm{core}_k(b)$ is the core distance of spatial point $b$, and $d(a, b)$ is the actual distance between spatial points $a$ and $b$.
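Formula (1) can be illustrated with a short sketch. The application does not define the core distance, so the code assumes the definition commonly used with HDBSCAN: the distance from a point to its k-th nearest neighbour, here excluding the point itself. Euclidean distance as the "actual distance", the helper names, and the sample points are likewise assumptions.

```python
import math


def euclidean(p, q):
    """Actual distance d(a, b), assumed here to be Euclidean."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def core_distance(points, i, k):
    """Assumed definition: distance from point i to its k-th nearest neighbour."""
    dists = sorted(
        euclidean(points[i], points[j]) for j in range(len(points)) if j != i
    )
    return dists[k - 1]


def mutual_reachability(points, a, b, k):
    """Formula (1): max{core_k(a), core_k(b), d(a, b)}."""
    return max(
        core_distance(points, a, k),
        core_distance(points, b, k),
        euclidean(points[a], points[b]),
    )


# Three points close together and one isolated point (made-up 2-D data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
d01 = mutual_reachability(points, 0, 1, k=2)
d03 = mutual_reachability(points, 0, 3, k=2)
```

For points in a dense region the result is governed by the core distances, while for an isolated point it stays large, which is the density-based spreading described above.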
b. Constructing a minimum spanning tree based on the spatial distance.
Specifically, Prim's algorithm (a graph-theory algorithm that finds a minimum spanning tree in a weighted connected graph) may be used to generate the minimum spanning tree, with the spatial distances as the edge weights.
In one possible implementation of the embodiments of the present application, constructing the minimum spanning tree based on the spatial distance includes:
taking the identifier of each target object as a vertex, taking the spatial distance as the weight of the edge between adjacent vertices, and constructing the minimum spanning tree according to the vertices and the edge weights.
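A minimal sketch of Prim's algorithm over a dense distance matrix is given below. The identifiers `u1` to `u4` and the distance values are made up for illustration; in the method described here, `dist` would hold the spatial distances between target objects.

```python
def prim_mst(ids, dist):
    """Return the minimum spanning tree edges as (weight, u, v) tuples.

    ids  : list of target-object identifiers (the vertices)
    dist : dist[i][j] is the spatial distance between ids[i] and ids[j]
    """
    n = len(ids)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known edge weight into the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((best[u], ids[parent[u]], ids[u]))
        # Update the cheapest connection of every vertex still outside the tree.
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


ids = ["u1", "u2", "u3", "u4"]
dist = [
    [0.0, 1.0, 4.0, 6.0],
    [1.0, 0.0, 2.0, 5.0],
    [4.0, 2.0, 0.0, 3.0],
    [6.0, 5.0, 3.0, 0.0],
]
mst = prim_mst(ids, dist)
```

This array-based form runs in O(n^2) time, which is a reasonable fit when every pair of vertices has a distance.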
c. Generating a plurality of clusters according to the minimum spanning tree.
Specifically, the edges of the minimum spanning tree may be sorted in ascending order of spatial distance and then processed iteratively: for each edge, a new merged cluster is created, and the clusters are determined based on the merged clusters.
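The edge-by-edge merging can be sketched with a union-find structure, as below. Note the simplification: the full HDBSCAN algorithm condenses the merge hierarchy and selects clusters by stability, whereas this sketch simply stops merging at a hypothetical `cut_distance`. The function name, the identifiers, and the edge weights are assumptions for illustration.

```python
def merge_clusters(ids, mst_edges, cut_distance):
    """Merge clusters along MST edges in ascending order of weight.

    Returns (merges, clusters): the merged cluster created for each processed
    edge, and the flat clusters left when edges longer than cut_distance are
    ignored (a simplification of HDBSCAN's cluster selection).
    """
    parent = {i: i for i in ids}
    members = {i: [i] for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    merges = []
    for weight, u, v in sorted(mst_edges):
        if weight > cut_distance:
            break  # edges are sorted, so all remaining edges are longer
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # guard; cannot happen for a true tree
        parent[rv] = ru
        members[ru] = members[ru] + members.pop(rv)
        merges.append((weight, sorted(members[ru])))
    clusters = sorted(sorted(m) for m in members.values())
    return merges, clusters


ids = ["u1", "u2", "u3", "u4"]
mst_edges = [(3.0, "u3", "u4"), (1.0, "u1", "u2"), (2.0, "u2", "u3")]
merges, clusters = merge_clusters(ids, mst_edges, cut_distance=2.5)
```

Each entry of `merges` is the new merged cluster created for one edge, and `clusters` is the resulting flat grouping of the target objects.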
d. Determining a clustering result based on the clusters.
For example, when the target objects are users of the target social software, the two-dimensional data obtained by dimension reduction is clustered to obtain the category information corresponding to each piece of object information, and the clustering result may divide the object information into three categories: highly active users, low-activity users, and potential users.
A possible implementation manner is provided in this embodiment of the present application, where the determining, according to the category information in step S203, the resource pushing policy information corresponding to the target object includes:
(1) counting the behavior information to obtain behavior result information of the target object; wherein the behavior result information indicates behavior characteristics of the target object.
In the embodiments of the present application, the target object is described using a user of target social software as an example. The corresponding object information includes attribute information, such as the industry the user works in, the user's years of work, and the user's educational background and work experience, and behavior information, such as production data of original content published by the user, comment data published by the user, and interaction data of the user in the target social software. Specifically, the production data of the original content published by the user may be counted to obtain the number of times original content was published and the numbers of comments and likes received; the comment data published by the user may be counted to determine the user's monthly number of comments, the number of likes on the comments, the number of reply interactions on the comments, the comment activity in each board, and so on; and the interaction data of the user in the target social software may be counted to obtain the number of friends the user has added, the average online time, and so on.
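The counting step can be pictured with a small sketch. The event format (dictionaries with hypothetical `type` and `board` keys), the event type names, and the function `summarize_behavior` are assumptions made for illustration; the application does not specify how behavior information is stored. Here "board" means a section of the social software.

```python
from collections import Counter


def summarize_behavior(events):
    """Count behavior events into behavior result information.

    events: list of dicts with a hypothetical 'type' key and, for comments,
    an optional 'board' key naming the section where the comment was posted.
    """
    type_counts = Counter(e["type"] for e in events)
    board_comments = Counter(
        e["board"] for e in events if e["type"] == "comment" and "board" in e
    )
    return {
        "posts": type_counts["post"],
        "comments": type_counts["comment"],
        "likes_received": type_counts["like_received"],
        "friends_added": type_counts["friend_added"],
        "comments_per_board": dict(board_comments),
    }


# Made-up behavior events for one user.
events = [
    {"type": "post"},
    {"type": "comment", "board": "A"},
    {"type": "comment", "board": "A"},
    {"type": "comment", "board": "B"},
    {"type": "like_received"},
    {"type": "friend_added"},
]
behavior_result = summarize_behavior(events)
```

The resulting dictionary plays the role of the behavior result information that is combined with the category information in the next step.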
(2) Determining the resource pushing policy information corresponding to the target object according to the behavior result information and the category information.
Specifically, the behavior result information and the category information may be displayed in a preset interactive interface, and then resource pushing policy information corresponding to the target object is determined based on selection information of the user.
The resource pushing policy information may be preset, or may be customized based on the selection information of the user.
For example, when the target objects are users of the target social software and the users are divided into highly active users, low-activity users, and potential users, a resource pushing policy related to board A may be customized for a target user in the highly active category according to that user's comment activity in board A.
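One way to picture how category information and behavior result information combine into a policy is the sketch below. Everything in it is hypothetical: the category keys, the preset policy texts, and the rule that a highly active user is pushed resources from the board (section) where the user comments most are illustrative assumptions, not the policy logic of the application.

```python
# Hypothetical preset policies, keyed by category information.
PRESET_POLICIES = {
    "high_activity": "push content from the user's most active board",
    "low_activity": "push popular content to encourage interaction",
    "potential": "push onboarding and follow recommendations",
}


def choose_policy(category, behavior_result, custom_policy=None):
    """Pick resource pushing policy information for one target object."""
    # A policy customized through the operator's selection takes priority.
    if custom_policy is not None:
        return custom_policy
    per_board = behavior_result.get("comments_per_board")
    if category == "high_activity" and per_board:
        # Tailor the policy to the board with the highest comment activity.
        top_board = max(per_board, key=per_board.get)
        return f"push resources related to board {top_board}"
    return PRESET_POLICIES[category]


policy = choose_policy("high_activity", {"comments_per_board": {"A": 5, "B": 2}})
```

The `custom_policy` argument stands in for a policy customized from selection information, while the dictionary stands in for preset policies.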
In order to better understand the above target object processing method, an example of the target object processing method of the present application is described in detail below with reference to fig. 6, which includes the following steps:
S601, acquiring object information of a target object; the object information includes behavior information and attribute information of the target object.
The target object may be an object to be statistically analyzed, the object information may include static data and dynamic data of the target object, the static data may be attribute information of the target object, and the dynamic data may be behavior information of the target object. Specifically, the attribute information may represent an identity of the target object, and the behavior information may represent a behavior activity performed by the target object.
S602, reducing the dimension of the object information to obtain the object information after dimension reduction.
Specifically, the object information may be converted into a feature vector, and then the feature vector is subjected to dimension reduction, so as to obtain the dimension-reduced object information.
S603, based on the object information after dimension reduction, the space distance between the target objects is determined.
Specifically, each target object may be regarded as a space point, the reduced-dimension object information corresponding to the target object may be regarded as the value of the space point, and the distance between the space points, that is, the space distance between the target objects may be calculated based on the value of the space point.
S604, constructing a minimum spanning tree based on the space distance.
Specifically, Prim's algorithm (a graph-theory algorithm that finds a minimum spanning tree in a weighted connected graph) may be used to generate the minimum spanning tree, with the spatial distances as the edge weights.
S605, generating a plurality of clustering clusters according to the minimum spanning tree, determining a clustering result based on the clustering clusters, and determining the category information of the target object according to the clustering result.
Specifically, the edges of the minimum spanning tree may be sorted according to the ascending order of spatial distance, and then each edge is subjected to iteration processing, a new merged cluster is created for each edge, and a cluster is determined based on the merged clusters.
S606, counting the behavior information to obtain the behavior result information of the target object; wherein the behavior result information indicates behavior characteristics of the target object.
S607, determining the resource pushing strategy information corresponding to the target object according to the behavior result information and the category information.
The resource pushing policy information may be preset, or may be customized based on the selection information of the user.
Specifically, the behavior result information and the category information may be displayed in a preset interactive interface, and then resource pushing policy information corresponding to the target object is determined based on selection information of the user.
An embodiment of the present application provides a target object processing apparatus. As shown in FIG. 7, the target object processing apparatus 70 may include an acquisition module 701, a clustering module 702, and a determining module 703, wherein:
the acquisition module 701 is configured to acquire object information of a target object, the object information including behavior information and attribute information of the target object;
the clustering module 702 is configured to cluster the object information and determine the category information corresponding to the target object according to the clustering result;
the determining module 703 is configured to determine, according to the category information, the resource pushing policy information corresponding to the target object.
A possible implementation manner is provided in the embodiment of the present application, and the clustering module 702 includes:
the dimension reduction unit is used for reducing the dimension of the object information to obtain the object information after dimension reduction;
and the clustering unit is used for clustering the object information subjected to dimension reduction.
In an embodiment of the present application, a possible implementation manner is provided, where the dimension reduction unit is configured to:
extracting a first feature vector of each target object aiming at the object information;
splicing the first feature vectors to obtain a feature matrix;
and reducing the dimension of the characteristic matrix to obtain a target matrix, and taking the target matrix as the object information after dimension reduction.
In an embodiment of the present application, a possible implementation manner is provided, where the dimension reduction unit is configured to:
inputting the characteristic matrix into a pre-trained mapper to obtain a target matrix; wherein the mapper indicates a mapping relationship between the feature matrix and the target matrix.
The embodiment of the present application provides a possible implementation manner, and the clustering unit is configured to:
determining the space distance between the target objects based on the object information after dimension reduction;
constructing a minimum spanning tree based on the spatial distance;
generating a plurality of clustering clusters according to the minimum spanning tree;
and determining a clustering result based on the clustering cluster.
In one possible implementation of the embodiments of the present application, the clustering unit is configured to perform:
extracting a second feature vector of each target object based on the target matrix;
and respectively calculating the space distance between the target objects based on the second feature vectors.
In one possible implementation of the embodiments of the present application, the clustering unit is configured to perform:
taking the identifier of each target object as a vertex, taking the spatial distance as the weight of the edge between adjacent vertices, and constructing the minimum spanning tree according to the vertices and the edge weights.
In an embodiment of the present application, a possible implementation manner is provided, and the determining module 703 is configured to:
counting the behavior information to obtain behavior result information of the target object; wherein the behavior result information indicates behavior characteristics of the target object;
and determining resource pushing strategy information corresponding to the target object according to the behavior result information and the category information.
The apparatus of the embodiment of the present application may execute the method provided by the embodiment of the present application, and the implementation principle is similar, the actions executed by the modules in the apparatus of the embodiments of the present application correspond to the steps in the method of the embodiments of the present application, and for the detailed functional description of the modules of the apparatus, reference may be specifically made to the description in the corresponding method shown in the foregoing, and details are not repeated here.
An embodiment of the present application provides an electronic device comprising a memory, a processor, and a computer program stored on the memory, wherein the processor executes the computer program to implement the steps of the target object processing method. Compared with the prior art, the electronic device achieves the same beneficial effects as the method: the target object can be classified accurately along static and dynamic dimensions, and the resource pushing policy can be adjusted in a timely manner based on the category information to achieve personalized resource pushing to the target object.
In an optional embodiment, an electronic device is provided. As shown in FIG. 8, the electronic device 80 includes a processor 801 and a memory 803, where the processor 801 is coupled to the memory 803, for example via a bus 802. Optionally, the electronic device 80 may further include a transceiver 804, which may be used for data interaction between the electronic device and other electronic devices, such as transmitting and/or receiving data. It should be noted that, in practical applications, the number of transceivers 804 is not limited to one, and the structure of the electronic device 80 does not constitute a limitation on the embodiments of the present application.
The processor 801 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the present disclosure. The processor 801 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 802 may include a path that transfers information between the above components. The bus 802 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 802 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 8, but this is not intended to represent only one bus or type of bus.
The memory 803 may be a ROM (Read Only Memory) or another type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store a computer program and that can be read by a computer, without limitation.
The memory 803 is used for storing computer programs for executing the embodiments of the present application, and is controlled by the processor 801 to execute the computer programs. The processor 801 is adapted to execute computer programs stored in the memory 803 to implement the steps shown in the foregoing method embodiments.
The electronic devices include, but are not limited to, mobile terminals such as mobile phones, notebook computers, and tablet computers (PADs), and fixed terminals such as digital TVs and desktop computers.
Embodiments of the present application provide a computer-readable storage medium, on which a computer program is stored, and when being executed by a processor, the computer program may implement the steps and corresponding contents of the foregoing method embodiments.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device realizes the following when executed:
acquiring object information of a target object; the object information comprises behavior information and attribute information of the target object;
clustering the object information, and determining the category information corresponding to the target object according to the clustering result;
and determining resource pushing strategy information corresponding to the target object according to the category information.
It should be understood that, although each operation step is indicated by an arrow in the flowchart of the embodiment of the present application, the implementation order of the steps is not limited to the order indicated by the arrow. In some implementation scenarios of the embodiments of the present application, the implementation steps in the flowcharts may be performed in other sequences as desired, unless explicitly stated otherwise herein. In addition, some or all of the steps in each flowchart may include multiple sub-steps or multiple stages based on an actual implementation scenario. Some or all of these sub-steps or stages may be performed at the same time, or each of these sub-steps or stages may be performed at different times, respectively. In a scenario where execution times are different, an execution sequence of the sub-steps or the phases may be flexibly configured according to requirements, which is not limited in the embodiment of the present application.
The foregoing is only an optional implementation manner of a part of implementation scenarios in this application, and it should be noted that, for those skilled in the art, other similar implementation means based on the technical idea of this application are also within the protection scope of the embodiments of this application without departing from the technical idea of this application.

Claims (11)

1. A target object processing method, comprising:
acquiring object information of a target object; wherein the object information includes behavior information and attribute information of the target object;
clustering the object information, and determining the category information corresponding to the target object according to a clustering result;
and determining resource pushing strategy information corresponding to the target object according to the category information.
2. The target object processing method according to claim 1, wherein the clustering the object information includes:
reducing the dimension of the object information to obtain the object information after dimension reduction;
and clustering the object information subjected to dimension reduction.
3. The method for processing the target object according to claim 2, wherein the performing dimension reduction on the object information to obtain the dimension-reduced object information includes:
extracting a first feature vector of each target object aiming at the object information;
splicing the first feature vectors to obtain a feature matrix;
and reducing the dimension of the characteristic matrix to obtain a target matrix, and taking the target matrix as the object information after dimension reduction.
4. The target object processing method according to claim 3, wherein the performing dimension reduction on the feature matrix to obtain the target matrix comprises:
inputting the characteristic matrix into a pre-trained mapper to obtain a target matrix; wherein the mapper indicates a mapping relationship between the feature matrix and the target matrix.
5. The method according to claim 3, wherein the clustering the reduced-dimension object information includes:
determining a spatial distance between the target objects based on the object information after dimension reduction;
constructing a minimum spanning tree based on the spatial distance;
generating a plurality of clustering clusters according to the minimum spanning tree;
determining a clustering result based on the clustering cluster.
6. The method as claimed in claim 5, wherein said determining the spatial distance between the target objects based on the reduced-dimension object information comprises:
extracting a second feature vector of each target object based on the target matrix;
and respectively calculating the spatial distance between the target objects based on the second feature vectors.
7. The target object processing method of claim 5, wherein the constructing a minimum spanning tree based on the spatial distance comprises:
taking the identifier of each target object as a vertex, taking the spatial distance as the weight of the edge between adjacent vertices, and constructing the minimum spanning tree according to the vertices and the edge weights.
8. The target object processing method according to claim 1, wherein the determining, according to the category information, resource push policy information corresponding to the target object includes:
counting the behavior information to obtain behavior result information of the target object; wherein the behavior result information indicates a behavior feature of the target object;
and determining resource pushing strategy information corresponding to the target object according to the behavior result information and the category information.
9. A target object processing apparatus, comprising:
an acquisition module configured to acquire object information of a target object, wherein the object information includes behavior information and attribute information of the target object;
a clustering module configured to cluster the object information and determine category information corresponding to the target object according to a clustering result; and
a determining module configured to determine resource push policy information corresponding to the target object according to the category information.
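The three modules of the apparatus can be read as a simple pipeline. The sketch below wires them together as interchangeable callables; the class name `TargetObjectProcessor`, the in-memory `FAKE_STORE`, and the stub functions are hypothetical, and the stub `cluster` uses a fixed threshold purely as a placeholder for a real clustering routine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

ObjectInfo = Dict[str, Dict[str, float]]


@dataclass
class TargetObjectProcessor:
    acquire: Callable[[str], ObjectInfo]                # acquisition module
    cluster: Callable[[List[ObjectInfo]], List[str]]    # clustering module
    determine: Callable[[str], str]                     # determining module

    def process(self, object_ids: List[str]) -> Dict[str, str]:
        infos = [self.acquire(oid) for oid in object_ids]
        categories = self.cluster(infos)
        return {oid: self.determine(cat) for oid, cat in zip(object_ids, categories)}


# Hypothetical data and stubs, for illustration only.
FAKE_STORE: Dict[str, ObjectInfo] = {
    "u1": {"behavior": {"clicks": 1.0}, "attributes": {"tenure_years": 2.0}},
    "u2": {"behavior": {"clicks": 9.0}, "attributes": {"tenure_years": 5.0}},
}


def acquire(oid: str) -> ObjectInfo:
    return FAKE_STORE[oid]


def cluster(infos: List[ObjectInfo]) -> List[str]:
    # Placeholder rule standing in for real clustering.
    return ["active" if info["behavior"]["clicks"] > 2 else "quiet" for info in infos]


def determine(category: str) -> str:
    return f"policy_for_{category}"


processor = TargetObjectProcessor(acquire, cluster, determine)
policies = processor.process(["u1", "u2"])
```

Keeping the modules as separate callables mirrors the claim's structure and lets each one be replaced independently.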
10. An electronic device comprising a memory, a processor and a computer program stored on the memory, wherein the processor executes the computer program to implement the steps of the target object processing method of any one of claims 1 to 8.
11. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the target object processing method of any one of claims 1 to 8.
CN202111194472.1A 2021-10-13 2021-10-13 Target object processing method and device, electronic equipment and storage medium Pending CN114090869A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111194472.1A CN114090869A (en) 2021-10-13 2021-10-13 Target object processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114090869A true CN114090869A (en) 2022-02-25

Family

ID=80296896

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111194472.1A Pending CN114090869A (en) 2021-10-13 2021-10-13 Target object processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114090869A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114679386A (en) * 2022-05-25 2022-06-28 杭州海康威视数字技术股份有限公司 Cloud-edge cooperative Internet of things device role judgment and management method, system and device
CN114679386B (en) * 2022-05-25 2022-08-05 杭州海康威视数字技术股份有限公司 Cloud-edge cooperative Internet of things device role judgment and management method, system and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination