CN116579791A - User mining method and device - Google Patents
User mining method and device
- Publication number
- CN116579791A (application number CN202310358724.2A)
- Authority
- CN
- China
- Prior art keywords
- user
- bill
- vector
- business
- feature vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0202—Market predictions or forecasting for commercial activities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/02—Banking, e.g. interest calculation or account maintenance
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Finance (AREA)
- Accounting & Taxation (AREA)
- Theoretical Computer Science (AREA)
- Development Economics (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Entrepreneurship & Innovation (AREA)
- Economics (AREA)
- Marketing (AREA)
- General Business, Economics & Management (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Game Theory and Decision Science (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Technology Law (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The application provides a user mining method and device. The method comprises: acquiring business behavior data of bill transactions between enterprises and a bill panoramic graph; obtaining user word vectors from the business behavior data, and obtaining user network feature vectors from the bill panoramic graph; concatenating the user network feature vector and the user word vector to obtain mixed feature information; calculating a seed user vector set from the mixed feature information; and performing user mining according to the seed user vector set to obtain target potential customers for bills. Implementing this embodiment makes it possible to mine the value of enterprise customers and to find target potential customers accurately and quickly, which helps locate potential business and identify non-entity customers.
Description
Technical Field
The application relates to the technical field of data processing, in particular to a user mining method and device.
Background
Currently, the bill market is an important channel through which enterprises obtain bank financing and credit support. Bill financing costs are relatively low and can effectively reduce an enterprise's financial costs, so large clients and small and medium-sized enterprises alike have an urgent demand for bill discounting. Combining bills effectively with traditional credit tools helps banks develop new customers, retain existing customers, attract deposits, create cross-selling opportunities, and gain a competitive advantage. Mining the value of enterprise clients as fully as possible through bill financing has become a consensus among banks for developing the market. In business practice, however, banks often struggle to find target potential customers, to locate potential business, and to identify non-entity customers.
Disclosure of Invention
The embodiments of the application aim to provide a user mining method and device that can mine the value of enterprise clients and find target potential customers accurately and quickly, which helps locate potential business and identify non-entity clients.
A first aspect of the embodiment of the present application provides a user mining method, including:
acquiring business behavior data of bill transactions between enterprises and a bill panoramic graph;
obtaining a user word vector according to the business behavior data, and obtaining a user network feature vector according to the bill panoramic graph;
concatenating the user network feature vector and the user word vector to obtain mixed feature information;
calculating a seed user vector set according to the mixed feature information;
and performing user mining according to the seed user vector set to obtain target potential customers for bills.
In the implementation process, the method first acquires business behavior data of bill transactions between enterprises and a bill panoramic graph; next, it obtains a user word vector according to the business behavior data and a user network feature vector according to the bill panoramic graph; it then concatenates the user network feature vector and the user word vector to obtain mixed feature information; after that, it calculates a seed user vector set according to the mixed feature information; and finally, it performs user mining according to the seed user vector set to obtain target potential customers for bills. The method can therefore mine the value of enterprise clients and find target potential customers accurately and quickly, which helps locate potential business and identify non-entity clients.
Further, the acquiring of business behavior data of bill transactions between enterprises and the bill panoramic graph includes:
acquiring business behavior data of bill transactions between enterprises and historical transaction behavior data of bill clients;
constructing a client bill panoramic network graph according to the historical transaction behavior data of the bill clients;
training the client bill panoramic network graph through a Node2vec model to obtain the bill panoramic graph; the bill panoramic graph comprises a feature vector for each user in the network of the bill panoramic graph.
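The graph construction and sampling described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the transaction tuples, the function names `build_bill_graph` and `weighted_walk`, and the walk length are all assumptions. The walks use only edge weights, which corresponds to Node2vec with its return and in-out parameters both set to 1; a full Node2vec model would bias the walks with those parameters and then train a skip-gram model on the sampled sequences to obtain the node vectors.

```python
import random
from collections import defaultdict

def build_bill_graph(transactions):
    """Undirected weighted graph: nodes are enterprises, edge weight is the summed amount."""
    graph = defaultdict(lambda: defaultdict(float))
    for payer, payee, amount in transactions:
        graph[payer][payee] += amount
        graph[payee][payer] += amount
    return graph

def weighted_walk(graph, start, length, rng):
    """One random walk; the next node is drawn in proportion to the edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk

# Hypothetical bill transactions: (payer, payee, amount).
transactions = [("A", "B", 100.0), ("B", "C", 50.0), ("A", "B", 25.0), ("C", "D", 10.0)]
graph = build_bill_graph(transactions)
rng = random.Random(0)
# Three walks of length 5 from every enterprise; these sequences would feed a skip-gram model.
walks = [weighted_walk(graph, node, 5, rng) for node in sorted(graph) for _ in range(3)]
```

Summing repeated transactions into one edge weight is one possible design choice; the source only states that the transaction amount is the weight of the network.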
Further, the obtaining of the user word vector according to the business behavior data includes:
mapping the business behavior data to obtain mapping feature vectors; the business behavior data comprises at least the enterprise's years since establishment, the enterprise's industry, the enterprise's annual EVA, the enterprise's deposit and loan information, and the enterprise's current-year daily average bill discount balance; the mapping feature vectors comprise at least an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise annual EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector, and an enterprise bill discount feature vector;
processing the mapping feature vectors through pre-configured neural network fully connected layers to obtain a processing vector set;
and adding the vectors in the processing vector set to obtain the user word vector.
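The three sub-steps above (embedding mapping, fully connected processing, addition) can be sketched in plain Python. Everything concrete here is an assumption made for illustration: the field labels, the bucket counts, the embedding width `DIM`, the use of a separate dense layer per field, the ReLU activation, and the random untrained weights. The source does not specify any of these.

```python
import random

DIM = 4  # embedding width, an illustrative choice
rng = random.Random(0)

def random_matrix(rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

# One embedding table per feature field; field labels and bucket counts are hypothetical.
FIELDS = {"attribute": 5, "industry": 10, "annual_eva": 8, "deposit": 8, "loan": 8, "discount": 8}
embedding_tables = {name: random_matrix(n, DIM) for name, n in FIELDS.items()}
fc_weights = {name: random_matrix(DIM, DIM) for name in FIELDS}
fc_bias = {name: [0.0] * DIM for name in FIELDS}

def dense(vec, weights, bias):
    """Fully connected layer with ReLU: relu(W @ vec + b)."""
    return [max(0.0, sum(w * x for w, x in zip(row, vec)) + b) for row, b in zip(weights, bias)]

def user_word_vector(bucket_ids):
    """Embed each field, pass it through its dense layer, then add element-wise."""
    total = [0.0] * DIM
    for name, idx in bucket_ids.items():
        hidden = dense(embedding_tables[name][idx], fc_weights[name], fc_bias[name])
        total = [t + h for t, h in zip(total, hidden)]
    return total

# Each raw feature is assumed to be discretized into a bucket index beforehand.
vec = user_word_vector(
    {"attribute": 2, "industry": 7, "annual_eva": 3, "deposit": 1, "loan": 0, "discount": 4}
)
```

Adding the six processed vectors keeps the output width fixed at `DIM` regardless of how many fields are used, whereas concatenating them would grow the width with each field.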
Further, the performing of user mining according to the seed user vector set to obtain target potential customers for bills includes:
acquiring a data source to be mined;
marking a seed customer group in the data source to be mined, and determining a diffusion range according to the seed customer group;
and searching for similar vectors within the diffusion range according to the seed user vector set to obtain the target potential customers for bills.
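A brute-force version of this search step can be sketched as follows. It is a stand-in for a vector search library, written under assumptions: the function name `top_n_similar`, the dictionary layout, and the sample vectors are hypothetical, and excluding the seeds themselves from the results is one plausible reading of "marking" the seed customer group.

```python
import heapq

def l2_distance(a, b):
    """Squared Euclidean distance (the ranking is the same as for the true L2 distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_n_similar(seed_vectors, candidates, n, metric="l2"):
    """For each seed user, return the ids of the n most similar non-seed candidates."""
    results = {}
    for seed_id, seed_vec in seed_vectors.items():
        pool = [(cid, vec) for cid, vec in candidates.items() if cid not in seed_vectors]
        if metric == "l2":
            best = heapq.nsmallest(n, pool, key=lambda item: l2_distance(seed_vec, item[1]))
        else:
            best = heapq.nlargest(n, pool, key=lambda item: inner_product(seed_vec, item[1]))
        results[seed_id] = [cid for cid, _ in best]
    return results

# Hypothetical mixed feature vectors for one seed user and three other users.
candidates = {
    "seed1": [1.0, 0.0],
    "u1": [0.9, 0.1],
    "u2": [0.0, 1.0],
    "u3": [3.0, 0.0],
}
seeds = {"seed1": candidates["seed1"]}
by_l2 = top_n_similar(seeds, candidates, n=2, metric="l2")
by_ip = top_n_similar(seeds, candidates, n=2, metric="ip")
```

The two metrics need not agree. With unnormalized vectors, inner product favors long vectors such as `u3`, while L2 favors vectors that lie close to the seed, so the choice of metric should match how the vectors were trained.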
A second aspect of an embodiment of the present application provides a user mining apparatus, including:
a first acquisition unit, configured to acquire business behavior data of bill transactions between enterprises and a bill panoramic graph;
a second acquisition unit, configured to obtain a user word vector according to the business behavior data;
a third acquisition unit, configured to obtain a user network feature vector according to the bill panoramic graph;
a concatenation unit, configured to concatenate the user network feature vector and the user word vector to obtain mixed feature information;
a calculating unit, configured to calculate a seed user vector set according to the mixed feature information;
and a mining unit, configured to perform user mining according to the seed user vector set to obtain target potential customers for bills.
In the implementation process, the device acquires business behavior data of bill transactions between enterprises and a bill panoramic graph through the first acquisition unit; obtains a user word vector according to the business behavior data through the second acquisition unit; obtains a user network feature vector according to the bill panoramic graph through the third acquisition unit; concatenates the user network feature vector and the user word vector through the concatenation unit to obtain mixed feature information; calculates a seed user vector set from the mixed feature information through the calculating unit; and finally performs user mining according to the seed user vector set through the mining unit to obtain target potential customers for bills. The device can therefore mine the value of enterprise clients and find target potential customers accurately and quickly, which helps locate potential business and identify non-entity clients.
Further, the first acquisition unit includes:
a first acquisition subunit, configured to acquire business behavior data of bill transactions between enterprises and historical transaction behavior data of bill clients;
a construction subunit, configured to construct a client bill panoramic network graph according to the historical transaction behavior data of the bill clients;
a training subunit, configured to train the client bill panoramic network graph through a Node2vec model to obtain the bill panoramic graph; the bill panoramic graph comprises a feature vector for each user in the network of the bill panoramic graph.
Further, the second acquisition unit includes:
a mapping subunit, configured to map the business behavior data to obtain mapping feature vectors; the business behavior data comprises at least the enterprise's years since establishment, the enterprise's industry, the enterprise's annual EVA, the enterprise's deposit and loan information, and the enterprise's current-year daily average bill discount balance; the mapping feature vectors comprise at least an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise annual EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector, and an enterprise bill discount feature vector;
a processing subunit, configured to process the mapping feature vectors through pre-configured neural network fully connected layers to obtain a processing vector set;
and an adding subunit, configured to add the vectors in the processing vector set to obtain the user word vector.
Further, the mining unit includes:
a second acquisition subunit, configured to acquire a data source to be mined;
a marking subunit, configured to mark a seed customer group in the data source to be mined and determine a diffusion range according to the seed customer group;
and a searching subunit, configured to search for similar vectors within the diffusion range according to the seed user vector set to obtain the target potential customers for bills.
A third aspect of the embodiment of the present application provides an electronic device, including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to execute the computer program to cause the electronic device to execute the user mining method according to any one of the first aspect of the embodiment of the present application.
A fourth aspect of the embodiments of the present application provides a computer readable storage medium storing computer program instructions which, when read and executed by a processor, perform the user mining method according to any one of the first aspect of the embodiments of the present application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope, and other related drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a user mining method according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of another user mining method according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a user mining device according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of another user mining device according to an embodiment of the present application;
Fig. 5 is a schematic flow chart of an example of the overall model according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Example 1
Referring to fig. 1, fig. 1 is a flow chart of a user mining method according to the present embodiment. The user mining method comprises the following steps:
S101, acquiring business behavior data of bill transactions between enterprises and a bill panoramic graph.
S102, obtaining a user word vector according to the business behavior data, and obtaining a user network feature vector according to the bill panoramic graph.
S103, concatenating the user network feature vector and the user word vector to obtain mixed feature information.
S104, calculating a seed user vector set according to the mixed feature information.
S105, performing user mining according to the seed user vector set to obtain target potential customers for bills.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
In this embodiment, the execution body of the method may be an intelligent device such as a smart phone or a tablet computer, which is not limited in this embodiment.
Therefore, the user mining method described in this embodiment takes into account the homogeneity of clients within the network graph structure and additionally processes enterprise profile information with embedding techniques from natural language processing, so that the two partial vectors can be concatenated into a mixed user feature vector. Because the Node2vec graph algorithm is essentially a feature extractor, the method samples the graph, builds a model on the sampled sequences, and finally converts the nodes into feature vectors, which lets the model mine high-order hidden features among clients. In addition, the method uses the Faiss vector search library to retrieve the topN vectors in the database that are most similar to the seed customer group. This makes it well suited to big-data scenarios: when facing a very large customer base, Faiss can support similarity search over tens of millions of vectors, which improves the running efficiency of the model, reduces memory use, and saves cluster resources.
Example 2
Referring to fig. 2, fig. 2 is a flow chart of a user mining method according to the present embodiment. The user mining method comprises the following steps:
S201, acquiring business behavior data of bill transactions between enterprises and historical transaction behavior data of bill clients.
S202, constructing a client bill panoramic network graph according to the historical transaction behavior data of the bill clients.
S203, training the client bill panoramic network graph through a Node2vec model to obtain a bill panoramic graph.
In this embodiment, the bill panoramic graph includes a feature vector for each user in the network of the bill panoramic graph.
S204, mapping the business behavior data to obtain mapping feature vectors.
In this embodiment, the business behavior data includes at least the enterprise's years since establishment, the enterprise's industry, the enterprise's annual EVA, the enterprise's deposit and loan information, and the enterprise's current-year daily average bill discount balance; the mapping feature vectors include at least an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise annual EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector, and an enterprise bill discount feature vector.
S205, processing the mapping feature vectors through pre-configured neural network fully connected layers to obtain a processing vector set.
S206, adding the vectors in the processing vector set to obtain a user word vector, and obtaining a user network feature vector according to the bill panoramic graph.
S207, concatenating the user network feature vector and the user word vector to obtain mixed feature information.
S208, calculating a seed user vector set according to the mixed feature information.
S209, acquiring a data source to be mined.
S210, marking a seed customer group in the data source to be mined, and determining a diffusion range according to the seed customer group.
S211, searching for similar vectors within the diffusion range according to the seed user vector set to obtain target potential customers for bills.
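Steps S207 and S208 can be sketched as a concatenation followed by a max pooling layer and a fully connected layer, the two layers this embodiment names for producing the mixed user vector. The sketch is illustrative only: the pooling window of 2, the output width, the random untrained weights, and all function names are assumptions, and the source does not say how pooling is applied to the concatenated vector.

```python
import random

def concatenate(network_vec, word_vec):
    """Join the Node2vec-style vector and the user word vector end to end."""
    return list(network_vec) + list(word_vec)

def max_pool_1d(vec, window):
    """Non-overlapping max pooling; a trailing partial window is pooled as-is."""
    return [max(vec[i:i + window]) for i in range(0, len(vec), window)]

def dense(vec, weights, bias):
    """Fully connected layer without activation: W @ vec + b."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]

def hybrid_vector(network_vec, word_vec, weights, bias, window=2):
    pooled = max_pool_1d(concatenate(network_vec, word_vec), window)
    return dense(pooled, weights, bias)

rng = random.Random(0)
network_vec = [0.2, -0.5, 0.1, 0.7]   # hypothetical user network feature vector
word_vec = [0.3, 0.9, -0.4, 0.0]      # hypothetical user word vector
pooled_dim, out_dim = 4, 3
weights = [[rng.uniform(-1, 1) for _ in range(pooled_dim)] for _ in range(out_dim)]
bias = [0.0] * out_dim
mixed = hybrid_vector(network_vec, word_vec, weights, bias)

# The seed user vector set is then the mixed vectors of the users marked as seeds.
seed_user_vectors = {"seed1": mixed}
```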
In this embodiment, the architecture flow of the whole model can be understood with reference to Fig. 5. The specific flow is as follows:
(1) Extract features such as the enterprise's years since establishment, industry, annual EVA, deposit and loan information, and current-year daily average bill discount balance, and apply embedding mapping to each of them to obtain a feature vector for each dimension: enterprise attribute features, industry features, annual EVA features, deposit features, loan features, and bill discount features.
(2) Pass the six feature vectors produced in step (1) through neural network fully connected layers and add the results to obtain the user's final feature vector.
(3) Construct a network from the historical transaction behavior of bill clients, for example with the open-source NetworkX library. Nodes are enterprise entities, bill relationships between enterprises are the edges, and the transaction amounts are the edge weights. After the client bill panoramic network graph is built, train it with a Node2vec model to obtain a feature vector for each user in the network.
(4) Concatenate the two user feature vectors obtained in steps (2) and (3), feed the result into a neural network model with a max pooling layer and a fully connected layer, and take the output of these two layers as the mixed user feature vector.
(5) After the data source is retrieved from the database system, mark the seed customer group to distinguish it from other customer groups, and determine the diffusion range. For example, each seed user is diffused to the topN users most similar to it; this parameter can be set manually according to business requirements.
(6) Calculate the feature vectors of the seed clients according to steps (1) to (4), and use Faiss to run a similarity vector search that returns the topN users most similar to each seed user. Faiss supports two similarity measures: L2 (Euclidean distance) and inner product. Faiss search can be sped up by choosing a suitable index type, but such an index requires an additional training phase, which is performed on a set of vectors with the same distribution as the database vectors.
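The remark in step (6) about indexes that need a training phase can be illustrated with a toy inverted-file index. This is not Faiss and does not reproduce its API; the class `TinyIVFIndex`, its parameters, and the synthetic data are hypothetical. The sketch only shows the general idea: training clusters a representative sample into cells, and a query then scans only the nearest cell or cells instead of the whole database, which is why the training vectors should follow the same distribution as the database vectors.

```python
import random

def sq_l2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centroids, vec):
    return min(range(len(centroids)), key=lambda i: sq_l2(centroids[i], vec))

class TinyIVFIndex:
    """Toy inverted-file index: k-means cells, then search restricted to a few cells."""

    def __init__(self, n_cells, seed=0):
        self.n_cells = n_cells
        self.rng = random.Random(seed)
        self.centroids = None
        self.cells = None

    def train(self, vectors, iterations=10):
        """Learn cell centroids with plain k-means on a representative sample."""
        self.centroids = [list(v) for v in self.rng.sample(vectors, self.n_cells)]
        for _ in range(iterations):
            groups = [[] for _ in self.centroids]
            for v in vectors:
                groups[nearest(self.centroids, v)].append(v)
            for i, group in enumerate(groups):
                if group:  # keep the old centroid if a cell ends up empty
                    self.centroids[i] = [sum(col) / len(group) for col in zip(*group)]
        self.cells = [[] for _ in self.centroids]

    def add(self, ids, vectors):
        if self.centroids is None:
            raise RuntimeError("train() must be called before add()")
        for vid, v in zip(ids, vectors):
            self.cells[nearest(self.centroids, v)].append((vid, v))

    def search(self, query, k, n_probe=1):
        """Scan only the n_probe cells whose centroids are closest to the query."""
        order = sorted(range(len(self.centroids)), key=lambda i: sq_l2(self.centroids[i], query))
        pool = [item for i in order[:n_probe] for item in self.cells[i]]
        pool.sort(key=lambda item: sq_l2(item[1], query))
        return [vid for vid, _ in pool[:k]]

# Synthetic data: ids 0-49 cluster near (0, 0), ids 50-99 cluster near (5, 5).
rng = random.Random(1)
data = [[rng.gauss(cx, 0.1), rng.gauss(cy, 0.1)] for cx, cy in [(0, 0), (5, 5)] for _ in range(50)]
index = TinyIVFIndex(n_cells=2)
index.train(data)
index.add(range(len(data)), data)
hits = index.search([5.0, 5.0], k=3)
```

Probing fewer cells makes the search faster but approximate, since a true neighbor that falls in an unprobed cell is missed; raising `n_probe` trades speed back for recall.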
In this embodiment, the method may be applied to the financial field.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
In this embodiment, the execution body of the method may be an intelligent device such as a smart phone or a tablet computer, which is not limited in this embodiment.
Therefore, the user mining method described in this embodiment takes into account the homogeneity of clients within the network graph structure and additionally processes enterprise profile information with embedding techniques from natural language processing, so that the two partial vectors can be concatenated into a mixed user feature vector. Because the Node2vec graph algorithm is essentially a feature extractor, the method samples the graph, builds a model on the sampled sequences, and finally converts the nodes into feature vectors, which lets the model mine high-order hidden features among clients. In addition, the method uses the Faiss vector search library to retrieve the topN vectors in the database that are most similar to the seed customer group. This makes it well suited to big-data scenarios: when facing a very large customer base, Faiss can support similarity search over tens of millions of vectors, which improves the running efficiency of the model, reduces memory use, and saves cluster resources.
Example 3
Referring to Fig. 3, Fig. 3 is a schematic structural diagram of a user mining device according to the present embodiment. As shown in Fig. 3, the user mining device includes:
a first acquisition unit 310, configured to acquire business behavior data of bill transactions between enterprises and a bill panoramic graph;
a second acquisition unit 320, configured to obtain a user word vector according to the business behavior data;
a third acquisition unit 330, configured to obtain a user network feature vector according to the bill panoramic graph;
a concatenation unit 340, configured to concatenate the user network feature vector and the user word vector to obtain mixed feature information;
a calculating unit 350, configured to calculate a seed user vector set according to the mixed feature information;
and a mining unit 360, configured to perform user mining according to the seed user vector set to obtain target potential customers for bills.
In this embodiment, for an explanation of the user mining device, refer to the description in embodiment 1 or embodiment 2, which is not repeated here.
Therefore, the user mining device described in this embodiment takes into account the homogeneity of clients within the network graph structure and additionally processes enterprise profile information with embedding techniques from natural language processing, so that the two partial vectors can be concatenated into a mixed user feature vector. Because the Node2vec graph algorithm is essentially a feature extractor, the device samples the graph, builds a model on the sampled sequences, and finally converts the nodes into feature vectors, which lets the model mine high-order hidden features among clients. In addition, the device uses the Faiss vector search library to retrieve the topN vectors in the database that are most similar to the seed customer group. This makes it well suited to big-data scenarios: when facing a very large customer base, Faiss can support similarity search over tens of millions of vectors, which improves the running efficiency of the model, reduces memory use, and saves cluster resources.
Example 4
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a user mining device according to the present embodiment. As shown in Fig. 4, the user mining device includes:
a first acquisition unit 310, configured to acquire business behavior data of bill transactions between enterprises and a bill panoramic graph;
a second acquisition unit 320, configured to obtain a user word vector according to the business behavior data;
a third acquisition unit 330, configured to obtain a user network feature vector according to the bill panoramic graph;
a concatenation unit 340, configured to concatenate the user network feature vector and the user word vector to obtain mixed feature information;
a calculating unit 350, configured to calculate a seed user vector set according to the mixed feature information;
and a mining unit 360, configured to perform user mining according to the seed user vector set to obtain target potential customers for bills.
As an alternative embodiment, the first acquisition unit 310 includes:
a first acquisition subunit 311, configured to acquire business behavior data of bill transactions between enterprises and historical transaction behavior data of bill clients;
a construction subunit 312, configured to construct a client bill panoramic network graph according to the historical transaction behavior data of the bill clients;
a training subunit 313, configured to train the client bill panoramic network graph through a Node2vec model to obtain the bill panoramic graph; the bill panoramic graph comprises a feature vector for each user in the network of the bill panoramic graph.
As an alternative embodiment, the second acquisition unit 320 includes:
a mapping subunit 321, configured to map the business behavior data to obtain mapping feature vectors; the business behavior data comprises at least the enterprise's years since establishment, the enterprise's industry, the enterprise's annual EVA, the enterprise's deposit and loan information, and the enterprise's current-year daily average bill discount balance; the mapping feature vectors comprise at least an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise annual EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector, and an enterprise bill discount feature vector;
a processing subunit 322, configured to process the mapping feature vectors through pre-configured neural network fully connected layers to obtain a processing vector set;
an adding subunit 323, configured to add the vectors in the processing vector set to obtain the user word vector.
As an alternative embodiment, the mining unit 360 includes:
a second obtaining subunit 361, configured to obtain a data source to be mined;
a marking subunit 362, configured to mark the seed customer group in the data source to be mined, and determine the diffusion range according to the seed customer group;
and the searching subunit 363 is used for searching for similar vectors in the diffusion range according to the seed user vector set to obtain the target potential customers of the bill.
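The similarity search performed by the searching subunit can be sketched with a brute-force cosine ranking. This is a stand-in written in plain Python rather than a vector search library, and it assumes that each candidate in the diffusion range is scored by its best cosine similarity to any seed vector; the scoring rule, function names, and toy vectors are assumptions for illustration only.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_n_similar(seed_vectors: List[Vector],
                  candidates: Dict[str, Vector],
                  n: int) -> List[Tuple[str, float]]:
    """Rank candidates by their best similarity to any seed vector."""
    if not seed_vectors:
        return []
    scored = [(uid, max(cosine(vec, seed) for seed in seed_vectors))
              for uid, vec in candidates.items()]
    # Highest score first; ties broken by user id for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:n]


# Toy seed set and diffusion range (users not yet marked as seed customers).
seed_vectors = [[1.0, 0.0], [0.9, 0.1]]
diffusion_range = {"U1": [1.0, 0.05], "U2": [0.0, 1.0], "U3": [0.7, 0.7]}

result = top_n_similar(seed_vectors, diffusion_range, 2)
```

The brute-force loop costs time proportional to the number of seeds times the number of candidates, which is why an indexed vector search library becomes attractive for very large customer groups.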
In this embodiment, for an explanation of the user mining device, reference may be made to the description in embodiment 1 or embodiment 2, which is not repeated here.
Therefore, the user mining device described in this embodiment, while accounting for the homogeneity of clients in the network graph structure, can also process the enterprise portrait information using embedding techniques from natural language processing, and then splice the two partial vectors to obtain a mixed user feature vector. Because the Node2vec graph algorithm is essentially a feature extractor, the device can sample the graph, build a model on the sampled sequences, and finally convert the nodes into feature vectors, so that the model can mine high-order hidden features among clients. In addition, the device can use the Faiss vector search library to retrieve the top-N vectors in the database that are most similar to the seed customer group. This makes it particularly suitable for big-data scenarios: when facing a very large customer group, Faiss can readily support similarity search over tens of millions of vectors, which improves the running efficiency of the model, reduces memory usage, and saves cluster resources.
An embodiment of the present application provides an electronic device, including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to execute the computer program to cause the electronic device to execute a user mining method in embodiment 1 or embodiment 2 of the present application.
An embodiment of the present application provides a computer readable storage medium storing computer program instructions that, when read and executed by a processor, perform the user mining method of embodiment 1 or embodiment 2 of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Claims (10)
1. A user mining method, comprising:
acquiring business behavior data of bills exchanged between enterprises and a bill panorama;
acquiring a user word vector according to the business behavior data, and acquiring a user network feature vector according to the bill panorama;
performing vector splicing on the user network feature vector and the user word vector to obtain mixed feature information;
calculating a seed user vector set according to the mixed characteristic information;
and carrying out user mining according to the seed user vector set to obtain target potential customers of the bill.
2. The user mining method according to claim 1, wherein the acquiring of business behavior data of bills exchanged between enterprises and a bill panorama comprises:
acquiring business behavior data of bills exchanged between enterprises and historical transaction behavior data of bill clients;
constructing a client bill panoramic network diagram according to the bill client historical transaction behavior data;
training the client bill panoramic network diagram through a Node2vec model to obtain a bill panorama; the bill panorama comprises a feature vector for each user in the bill panorama network.
3. The method of claim 1, wherein the obtaining the user word vector according to the business behavior data comprises:
mapping the business behavior data to obtain a mapping feature vector; the business behavior data at least comprises business establishment years, business location industries, business aging EVA, business loan deposit information and business post current year average balance; the mapping feature vector at least comprises an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise aging EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector and an enterprise impression feature vector;
processing the mapping feature vector through a pre-configured fully connected neural network layer to obtain a processing vector set;
and summing the vectors in the processing vector set to obtain the user word vector.
4. The user mining method according to claim 1, wherein the performing user mining according to the seed user vector set to obtain target potential customers of the ticket includes:
acquiring a data source to be mined;
marking a seed customer group in the data source to be mined, and determining a diffusion range according to the seed customer group;
and searching for similarity vectors in the diffusion range according to the seed user vector set to obtain target potential customers of the bill.
5. A user mining apparatus, the user mining apparatus comprising:
the first acquisition unit is used for acquiring business behavior data of bills exchanged between enterprises and a bill panorama;
the second acquisition unit is used for acquiring the user word vector according to the business behavior data;
the third acquisition unit is used for acquiring a user network feature vector according to the bill panorama;
the splicing unit is used for carrying out vector splicing on the user network feature vector and the user word vector to obtain mixed feature information;
a calculating unit, configured to calculate a seed user vector set according to the mixed feature information;
and the mining unit is used for carrying out user mining according to the seed user vector set to obtain target potential customers of the bill.
6. The user mining apparatus of claim 5, wherein the first acquisition unit comprises:
the first acquisition subunit is used for acquiring business behavior data of bills between enterprises and historical transaction behavior data of bill clients;
the construction subunit is used for constructing a client bill panoramic network diagram according to the bill client historical transaction behavior data;
the training subunit is used for training the client bill panoramic network diagram through a Node2vec model to obtain a bill panorama; the bill panorama comprises a feature vector for each user in the bill panorama network.
7. The user mining apparatus of claim 5, wherein the second acquisition unit comprises:
the mapping subunit is used for mapping the business behavior data to obtain a mapping feature vector; the business behavior data at least comprises business establishment years, business location industries, business aging EVA, business loan deposit information and business post current year average balance; the mapping feature vector at least comprises an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise aging EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector and an enterprise impression feature vector;
the processing subunit is used for processing the mapping feature vectors through a pre-configured fully connected neural network layer to obtain a processing vector set;
and the adding subunit is used for summing the vectors in the processing vector set to obtain a user word vector.
8. The user mining apparatus of claim 5, wherein the mining unit includes:
the second acquisition subunit is used for acquiring a data source to be mined;
the marking subunit is used for marking the seed customer group in the data source to be mined and determining a diffusion range according to the seed customer group;
and the searching subunit is used for searching for similar vectors in the diffusion range according to the seed user vector set to obtain the target potential customers of the bill.
9. An electronic device comprising a memory for storing a computer program and a processor that runs the computer program to cause the electronic device to perform the user mining method of any of claims 1 to 4.
10. A readable storage medium having stored therein computer program instructions which, when read and executed by a processor, perform the user mining method of any of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310358724.2A CN116579791A (en) | 2023-03-29 | 2023-03-29 | User mining method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310358724.2A CN116579791A (en) | 2023-03-29 | 2023-03-29 | User mining method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116579791A (en) | 2023-08-11 |
Family
ID=87532960
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310358724.2A Pending CN116579791A (en) | 2023-03-29 | 2023-03-29 | User mining method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116579791A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117271886A (en) * | 2023-08-25 | 2023-12-22 | 广东美亚旅游科技集团股份有限公司 | Data searching method, system, equipment and medium based on air ticket order management |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108153824B (en) | Method and device for determining target user group | |
CN108388559A (en) | Name entity recognition method and system, computer program of the geographical space under | |
CN110309432B (en) | Synonym determining method based on interest points and map interest point processing method | |
US20190034816A1 (en) | Methods and system for associating locations with annotations | |
CN106033416A (en) | A string processing method and device | |
CN109255000B (en) | Dimension management method and device for label data | |
CN106776897A (en) | A kind of user's portrait label determines method and device | |
CN110598066B (en) | Bank full-name rapid matching method based on word vector expression and cosine similarity | |
JP2018537760A (en) | Method and apparatus for account mapping based on address information | |
CN112330342A (en) | Method and system for optimally matching enterprise name and system user name | |
CN112241458B (en) | Text knowledge structuring processing method, device, equipment and readable storage medium | |
CN116579791A (en) | User mining method and device | |
CN111309834A (en) | Method and device for matching wireless hotspot with interest point | |
Chi et al. | Creating a new dataset to analyse house prices in England | |
CN117151429B (en) | Government service flow arranging method and device based on knowledge graph | |
CN114328808A (en) | Address fuzzy matching method, address processing method, address fuzzy matching device and electronic equipment | |
CN113761867A (en) | Address recognition method and device, computer equipment and storage medium | |
CN112711645A (en) | Method and device for expanding position point information, storage medium and electronic equipment | |
CN109144999B (en) | Data positioning method, device, storage medium and program product | |
CN103514167B (en) | Data processing method and equipment | |
CN115309995A (en) | Scientific and technological resource pushing method and device based on demand text | |
CN112214494B (en) | Retrieval method and device | |
CN110309312B (en) | Associated event acquisition method and device | |
CN113434706A (en) | Academic collaboration relation analysis method and device | |
CN111767722A (en) | Word segmentation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||