CN116579791A - User mining method and device - Google Patents

User mining method and device

Info

Publication number
CN116579791A
Authority
CN
China
Prior art keywords
user
bill
vector
business
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310358724.2A
Other languages
Chinese (zh)
Inventor
蔡凡华
胡万利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202310358724.2A priority Critical patent/CN116579791A/en
Publication of CN116579791A publication Critical patent/CN116579791A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Technology Law (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application provides a user mining method and device. The method comprises: acquiring business behavior data of bill transactions between enterprises and a bill panorama graph; obtaining a user word vector from the business behavior data, and obtaining a user network feature vector from the bill panorama graph; concatenating the user network feature vector and the user word vector to obtain mixed feature information; calculating a seed user vector set from the mixed feature information; and performing user mining according to the seed user vector set to obtain target potential customers for the bills. Implementing this embodiment makes it possible to mine the value of enterprise customers and to find target potential customers accurately and quickly, which helps locate potential business and identify non-entity customers.

Description

User mining method and device
Technical Field
The application relates to the technical field of data processing, in particular to a user mining method and device.
Background
Currently, the bill market is an important channel through which enterprises obtain bank financing and credit support. Bill financing costs are relatively low and can effectively reduce an enterprise's financial costs, so large customers and small and medium-sized enterprises alike have an urgent demand for bill discounting. Combining bills effectively with traditional credit tools helps banks acquire new customers, retain existing customers, attract deposits, create cross-selling opportunities, and gain a competitive advantage. Maximizing the value of enterprise customers through bill financing has become a consensus among banks in developing their markets. In business practice, however, banks often face challenges such as being unable to find target potential customers, being unable to discover potential business, and having difficulty identifying non-entity customers.
Disclosure of Invention
The embodiment of the application aims to provide a user mining method and device, which can mine the value of enterprise clients and accurately and quickly search target potential clients, thereby being beneficial to positioning potential services and identifying non-entity clients.
A first aspect of the embodiment of the present application provides a user mining method, including:
acquiring business behavior data of bill transactions between enterprises and a bill panorama graph;
obtaining a user word vector according to the business behavior data, and obtaining a user network feature vector according to the bill panorama graph;
concatenating the user network feature vector and the user word vector to obtain mixed feature information;
calculating a seed user vector set according to the mixed feature information;
and performing user mining according to the seed user vector set to obtain target potential customers for the bills.
In the implementation process, the method first acquires the business behavior data of bill transactions between enterprises and the bill panorama graph; then obtains the user word vector according to the business behavior data and the user network feature vector according to the bill panorama graph; next concatenates the user network feature vector and the user word vector to obtain the mixed feature information; then calculates the seed user vector set according to the mixed feature information; and finally performs user mining according to the seed user vector set to obtain the target potential customers for the bills. The method can therefore mine the value of enterprise customers and find target potential customers accurately and quickly, which helps locate potential business and identify non-entity customers.
Further, acquiring the business behavior data of bill transactions between enterprises and the bill panorama graph includes:
acquiring business behavior data of bill transactions between enterprises and historical transaction behavior data of bill customers;
constructing a customer bill panorama network graph according to the historical transaction behavior data of the bill customers;
training on the customer bill panorama network graph with a Node2vec model to obtain the bill panorama graph; the bill panorama graph comprises a feature vector of each user in the bill panorama network.
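As a rough illustration of the graph construction and sampling described above, the following Python sketch builds a weighted customer graph from transaction records and draws weight-biased random walks over it. It is a simplified stand-in and not the Node2vec model itself: it omits Node2vec's return and in-out parameters and the subsequent embedding training. The names `build_bill_graph` and `weighted_walk`, the `(payer, payee, amount)` record layout, and the choice to treat edges as undirected are all assumptions made for this example.

```python
import random
from collections import defaultdict


def build_bill_graph(transactions):
    """Build an adjacency map {node: {neighbor: total amount}}.

    transactions: iterable of (payer, payee, amount) tuples (assumed layout).
    Edges are treated as undirected and repeated transactions are summed.
    """
    graph = defaultdict(dict)
    for payer, payee, amount in transactions:
        graph[payer][payee] = graph[payer].get(payee, 0.0) + amount
        graph[payee][payer] = graph[payee].get(payer, 0.0) + amount
    return dict(graph)


def weighted_walk(graph, start, length, rng):
    """Random walk in which the next node is chosen in proportion to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1])
        if not neighbors:
            break  # isolated node: stop the walk early
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


if __name__ == "__main__":
    g = build_bill_graph([("A", "B", 100.0), ("B", "C", 50.0), ("A", "B", 25.0)])
    print(weighted_walk(g, "A", 5, random.Random(0)))
```

Walks like these are the node sequences on which an embedding model would then be trained to produce one feature vector per enterprise.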
Further, obtaining the user word vector according to the business behavior data includes:
mapping the business behavior data to obtain mapping feature vectors; the business behavior data at least comprises the enterprise's years since establishment, the enterprise's industry, the enterprise's annualized EVA, the enterprise's deposit and loan information, and the enterprise's annual daily average discount balance; the mapping feature vectors at least comprise an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise annualized EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector and an enterprise discount feature vector;
processing the mapping feature vectors through a pre-configured fully connected neural network layer to obtain a processing vector set;
and summing the vectors in the processing vector set to obtain the user word vector.
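The three steps above (mapping, fully connected processing, summation) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: it assumes each business feature has already been discretized into an integer bucket id, that each feature has its own embedding table and its own linear layer, and that no activation function is applied. The names `dense`, `user_word_vector`, `tables`, and `layers` are hypothetical.

```python
def dense(vec, weights, bias):
    """Fully connected layer without activation.

    Computes y[j] = sum_i vec[i] * weights[i][j] + bias[j].
    """
    return [
        sum(v * row[j] for v, row in zip(vec, weights)) + bias[j]
        for j in range(len(bias))
    ]


def user_word_vector(feature_ids, tables, layers):
    """Embed each feature, pass it through its dense layer, and sum the results.

    feature_ids: {feature name: bucket id}          (assumed input format)
    tables:      {feature name: list of embedding rows}
    layers:      {feature name: (weights, bias)}
    """
    total = None
    for name, idx in feature_ids.items():
        embedded = tables[name][idx]           # mapping feature vector
        weights, bias = layers[name]
        processed = dense(embedded, weights, bias)
        if total is None:
            total = processed
        else:
            total = [a + b for a, b in zip(total, processed)]
    return total
```

Summation requires every dense layer to produce vectors of the same length, which is why the output width is fixed by the bias length in each layer.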
Further, performing user mining according to the seed user vector set to obtain the target potential customers for the bills includes:
acquiring a data source to be mined;
marking a seed customer group in the data source to be mined, and determining a diffusion range according to the seed customer group;
and performing a similarity vector search within the diffusion range according to the seed user vector set to obtain the target potential customers for the bills.
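A brute-force version of this mining step might look like the sketch below. It is illustrative only: it treats the diffusion range as "all customers that are not marked as seeds, limited to the `top_n` nearest per seed", which is one possible reading of the text, and it does an exhaustive scan rather than using a vector search library. The function names and the dictionary-based input format are assumptions.

```python
def l2_distance(a, b):
    """Squared Euclidean distance (monotonic in the true L2 distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def mine_potential_customers(vectors, seed_ids, top_n, metric="l2"):
    """Return {seed id: the top_n most similar non-seed customer ids}.

    vectors:  {customer id: mixed feature vector}
    seed_ids: ids of the marked seed customer group
    metric:   "l2" (smaller is more similar) or "ip" (larger is more similar)
    """
    seeds = set(seed_ids)
    candidates = [cid for cid in vectors if cid not in seeds]
    result = {}
    for seed in seed_ids:
        query = vectors[seed]
        if metric == "l2":
            ranked = sorted(candidates, key=lambda c: l2_distance(query, vectors[c]))
        else:
            ranked = sorted(candidates, key=lambda c: -inner_product(query, vectors[c]))
        result[seed] = ranked[:top_n]
    return result
```

An exhaustive scan costs time proportional to the number of seeds times the number of candidates, which is the cost that a dedicated vector search library is meant to reduce at scale.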
A second aspect of an embodiment of the present application provides a user mining apparatus, including:
the first acquisition unit is used for acquiring business behavior data of bill transactions between enterprises and a bill panorama graph;
the second acquisition unit is used for obtaining a user word vector according to the business behavior data;
the third acquisition unit is used for obtaining a user network feature vector according to the bill panorama graph;
the splicing unit is used for concatenating the user network feature vector and the user word vector to obtain mixed feature information;
the calculating unit is used for calculating a seed user vector set according to the mixed feature information;
and the mining unit is used for performing user mining according to the seed user vector set to obtain target potential customers for the bills.
In the implementation process, the device acquires the business behavior data of bill transactions between enterprises and the bill panorama graph through the first acquisition unit; obtains the user word vector according to the business behavior data through the second acquisition unit; obtains the user network feature vector according to the bill panorama graph through the third acquisition unit; concatenates the user network feature vector and the user word vector through the splicing unit to obtain the mixed feature information; calculates the seed user vector set from the mixed feature information through the calculating unit; and finally performs user mining according to the seed user vector set through the mining unit to obtain the target potential customers for the bills. The device can therefore mine the value of enterprise customers and find target potential customers accurately and quickly, which helps locate potential business and identify non-entity customers.
Further, the first acquisition unit includes:
the first acquisition subunit is used for acquiring business behavior data of bill transactions between enterprises and historical transaction behavior data of bill customers;
the construction subunit is used for constructing a customer bill panorama network graph according to the historical transaction behavior data of the bill customers;
the training subunit is used for training on the customer bill panorama network graph with a Node2vec model to obtain the bill panorama graph; the bill panorama graph comprises a feature vector of each user in the bill panorama network.
Further, the second acquisition unit includes:
the mapping subunit is used for mapping the business behavior data to obtain mapping feature vectors; the business behavior data at least comprises the enterprise's years since establishment, the enterprise's industry, the enterprise's annualized EVA, the enterprise's deposit and loan information, and the enterprise's annual daily average discount balance; the mapping feature vectors at least comprise an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise annualized EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector and an enterprise discount feature vector;
the processing subunit is used for processing the mapping feature vectors through a pre-configured fully connected neural network layer to obtain a processing vector set;
and the adding subunit is used for summing the vectors in the processing vector set to obtain the user word vector.
Further, the mining unit includes:
the second acquisition subunit is used for acquiring a data source to be mined;
the marking subunit is used for marking the seed customer group in the data source to be mined and determining a diffusion range according to the seed customer group;
and the searching subunit is used for performing a similarity vector search within the diffusion range according to the seed user vector set to obtain the target potential customers for the bills.
A third aspect of the embodiment of the present application provides an electronic device, including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to execute the computer program to cause the electronic device to execute the user mining method according to any one of the first aspect of the embodiment of the present application.
A fourth aspect of the embodiments of the present application provides a computer readable storage medium storing computer program instructions which, when read and executed by a processor, perform the user mining method according to any one of the first aspect of the embodiments of the present application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and should not be considered as limiting the scope, and other related drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a user mining method according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of another user mining method according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a user mining device according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of another user mining device according to an embodiment of the present application;
Fig. 5 is a schematic flow chart of an example of the overall model according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only to distinguish the description, and are not to be construed as indicating or implying relative importance.
Example 1
Referring to Fig. 1, Fig. 1 is a schematic flow chart of a user mining method according to this embodiment. The user mining method comprises the following steps:
S101, acquiring business behavior data of bill transactions between enterprises and a bill panorama graph.
S102, obtaining a user word vector according to the business behavior data, and obtaining a user network feature vector according to the bill panorama graph.
S103, concatenating the user network feature vector and the user word vector to obtain mixed feature information.
S104, calculating a seed user vector set according to the mixed feature information.
S105, performing user mining according to the seed user vector set to obtain target potential customers for the bills.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
In this embodiment, the execution body of the method may be an intelligent device such as a smart phone or a tablet computer, which is not limited in this embodiment.
Therefore, by implementing the user mining method described in this embodiment, the homophily of customers in the network graph structure is taken into account and, on that basis, the enterprise profile information is processed with the embedding technique used in natural language processing, so that the two partial vectors can be concatenated to obtain the user's mixed feature vector. Because the Node2vec graph algorithm is essentially a feature extractor, the method can sample the graph, build a model on the sampled sequences, and finally convert the nodes into feature vectors, so that the model can mine high-order hidden features among customers. In addition, the method can use the Faiss vector search library to search the database for the topN vectors most similar to the seed customer group. This makes it particularly suitable for big data scenarios: when facing a very large customer base, Faiss can readily support similarity search over tens of millions of vectors, which improves the running efficiency of the model, reduces memory usage, and saves cluster resources.
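To make "build a model on the sampled sequences" more concrete: Node2vec trains a skip-gram model on node sequences produced by random walks, and the training examples for such a model are (center node, context node) pairs taken from a sliding window over each walk. The sketch below only generates those pairs. The function name `skipgram_pairs` and the window size are illustrative assumptions, and the embedding training itself is not shown.

```python
def skipgram_pairs(walks, window):
    """Turn node sequences into (center, context) training pairs.

    walks:  list of node sequences, e.g. produced by random walks on the graph
    window: how many positions to the left and right count as context
    """
    pairs = []
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    pairs.append((center, walk[j]))
    return pairs


if __name__ == "__main__":
    for pair in skipgram_pairs([["A", "B", "C"]], window=1):
        print(pair)
```

Enterprises that often appear close together in walks share many pairs, which is what pushes their learned vectors toward each other.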
Example 2
Referring to Fig. 2, Fig. 2 is a schematic flow chart of a user mining method according to this embodiment. The user mining method comprises the following steps:
S201, acquiring business behavior data of bill transactions between enterprises and historical transaction behavior data of bill customers.
S202, constructing a customer bill panorama network graph according to the historical transaction behavior data of the bill customers.
S203, training on the customer bill panorama network graph with a Node2vec model to obtain a bill panorama graph.
In this embodiment, the bill panorama graph includes a feature vector of each user in the bill panorama network.
S204, mapping the business behavior data to obtain mapping feature vectors.
In this embodiment, the business behavior data at least includes the enterprise's years since establishment, the enterprise's industry, the enterprise's annualized EVA, the enterprise's deposit and loan information, and the enterprise's annual daily average discount balance; the mapping feature vectors include at least an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise annualized EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector, and an enterprise discount feature vector.
S205, processing the mapping feature vectors through a pre-configured fully connected neural network layer to obtain a processing vector set.
S206, summing the vectors in the processing vector set to obtain a user word vector, and obtaining a user network feature vector according to the bill panorama graph.
S207, concatenating the user network feature vector and the user word vector to obtain mixed feature information.
S208, calculating a seed user vector set according to the mixed feature information.
S209, acquiring a data source to be mined.
S210, marking a seed customer group in the data source to be mined, and determining a diffusion range according to the seed customer group.
S211, performing a similarity vector search within the diffusion range according to the seed user vector set to obtain target potential customers for the bills.
In this embodiment, the architecture flow of the whole model can be understood with reference to Fig. 5. The specific flow is as follows:
(1) Extract features of the enterprise such as years since establishment, industry, annualized EVA, deposit and loan information, and annual daily average discount balance, and apply embedding mapping to each of them to obtain feature vector representations for each dimension: enterprise attribute features, industry features, annualized EVA features, deposit features, loan features, and discount features.
(2) Pass the six feature vector representations generated in step (1) through a fully connected neural network layer and add the results to obtain the final feature vector of the user.
(3) Construct a network from the historical transaction behavior of bill customers; the open-source NetworkX library can be used for this. The nodes are enterprise entities, the bill relationships between enterprises are the edges of the network, and the transfer transaction amount is the edge weight. After the customer bill panorama network graph is constructed, train on it with a Node2vec model to obtain a feature vector representation of each user in the network.
(4) Concatenate the two user feature vectors obtained in steps (2) and (3) and input the result into a neural network model to which a max pooling layer and a fully connected layer are added; after these two layers of transformation, the user's mixed feature vector is obtained.
(5) After the data source is retrieved from the database system, mark the seed customer group to distinguish it from other customer groups, and determine the diffusion range. For example, the diffusion range can specify how many most similar users (topN) are to be found for each seed user; this parameter can be set manually according to business requirements.
(6) Calculate the feature vectors of the seed customers according to steps (1) to (4), and use Faiss to perform a similarity vector search to obtain the topN users most similar to each seed user. Faiss supports two similarity measures: 1. L2 (Euclidean distance); 2. inner product. The Faiss search can be sped up by configuring a suitable index, but this type of index requires an additional training phase, which is performed on a set of vectors with the same distribution as the database vectors.
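The remark in step (6) about indexes that need a training phase can be illustrated with a toy inverted-file style index in plain Python. This is not the Faiss API and is not taken from the application; it assumes that the index in question partitions the database around trained centroids, which is one common design for trained indexes. All function names (`train_centroids`, `build_lists`, `search`) are choices made for this sketch, and `nprobe` is borrowed as a parameter name from Faiss's inverted-file indexes.

```python
import random


def l2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))


def train_centroids(train_vecs, k, iters, rng):
    """Training phase: plain k-means on vectors distributed like the database."""
    centroids = [list(v) for v in rng.sample(train_vecs, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in train_vecs:
            buckets[nearest(v, centroids)].append(v)
        for c, members in enumerate(buckets):
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_lists(db, centroids):
    """Assign every database vector id to the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(db):
        lists[nearest(v, centroids)].append(i)
    return lists


def search(query, db, centroids, lists, top_n, nprobe=1):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    probe = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))[:nprobe]
    candidates = [i for c in probe for i in lists[c]]
    return sorted(candidates, key=lambda i: l2(query, db[i]))[:top_n]
```

The speed-up comes from scanning only a few lists per query; the trade-off is that a small `nprobe` can miss true neighbors that fall in an unvisited list, while setting `nprobe` to the number of centroids reproduces the exhaustive result.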
In this embodiment, the method may be applied to the financial field.
In this embodiment, the execution subject of the method may be a computing device such as a computer or a server, which is not limited in this embodiment.
In this embodiment, the execution body of the method may be an intelligent device such as a smart phone or a tablet computer, which is not limited in this embodiment.
Therefore, by implementing the user mining method described in this embodiment, the homophily of customers in the network graph structure is taken into account and, on that basis, the enterprise profile information is processed with the embedding technique used in natural language processing, so that the two partial vectors can be concatenated to obtain the user's mixed feature vector. Because the Node2vec graph algorithm is essentially a feature extractor, the method can sample the graph, build a model on the sampled sequences, and finally convert the nodes into feature vectors, so that the model can mine high-order hidden features among customers. In addition, the method can use the Faiss vector search library to search the database for the topN vectors most similar to the seed customer group. This makes it particularly suitable for big data scenarios: when facing a very large customer base, Faiss can readily support similarity search over tens of millions of vectors, which improves the running efficiency of the model, reduces memory usage, and saves cluster resources.
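The concatenation of the two partial vectors, followed by the max pooling and fully connected layers mentioned in step (4) of the flow, can be sketched as below. The source does not give layer sizes, pooling window, or layer order, so this sketch assumes non-overlapping 1-D max pooling with a window of 2 applied before a single linear layer with no activation. The function names and parameters are hypothetical.

```python
def concat(a, b):
    """Vector concatenation ("splicing") of two feature vectors."""
    return list(a) + list(b)


def max_pool_1d(vec, window):
    """Non-overlapping max pooling; a trailing partial window is pooled too."""
    return [max(vec[i:i + window]) for i in range(0, len(vec), window)]


def dense(vec, weights, bias):
    """Fully connected layer: y[j] = sum_i vec[i] * weights[i][j] + bias[j]."""
    return [
        sum(v * row[j] for v, row in zip(vec, weights)) + bias[j]
        for j in range(len(bias))
    ]


def mixed_feature_vector(network_vec, word_vec, weights, bias, window=2):
    """Concatenate, pool, then apply the dense layer (assumed order)."""
    pooled = max_pool_1d(concat(network_vec, word_vec), window)
    return dense(pooled, weights, bias)


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(mixed_feature_vector([1.0, 3.0], [2.0, 0.5], identity, [0.5, 0.5]))
```

In a trained model the dense weights would be learned; fixed values are used here only so that the data flow is easy to follow.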
Example 3
Referring to Fig. 3, Fig. 3 is a schematic structural diagram of a user mining device according to this embodiment. As shown in Fig. 3, the user mining device includes:
a first acquisition unit 310, configured to acquire business behavior data of bill transactions between enterprises and a bill panorama graph;
a second acquisition unit 320, configured to obtain a user word vector according to the business behavior data;
a third acquisition unit 330, configured to obtain a user network feature vector according to the bill panorama graph;
a splicing unit 340, configured to concatenate the user network feature vector and the user word vector to obtain mixed feature information;
a calculating unit 350, configured to calculate a seed user vector set according to the mixed feature information;
and a mining unit 360, configured to perform user mining according to the seed user vector set to obtain target potential customers for the bills.
In this embodiment, for an explanation of the user mining device, reference may be made to the description in Embodiment 1 or Embodiment 2, which is not repeated here.
Therefore, the user mining device described in this embodiment takes the homophily of customers in the network graph structure into account and, on that basis, processes the enterprise profile information with the embedding technique used in natural language processing, so that the two partial vectors can be concatenated to obtain the user's mixed feature vector. Because the Node2vec graph algorithm is essentially a feature extractor, the device can sample the graph, build a model on the sampled sequences, and finally convert the nodes into feature vectors, so that the model can mine high-order hidden features among customers. In addition, the device can use the Faiss vector search library to search the database for the topN vectors most similar to the seed customer group. This makes it particularly suitable for big data scenarios: when facing a very large customer base, Faiss can readily support similarity search over tens of millions of vectors, which improves the running efficiency of the model, reduces memory usage, and saves cluster resources.
Example 4
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a user mining device according to this embodiment. As shown in Fig. 4, the user mining device includes:
a first acquisition unit 310, configured to acquire business behavior data of bill transactions between enterprises and a bill panorama graph;
a second acquisition unit 320, configured to obtain a user word vector according to the business behavior data;
a third acquisition unit 330, configured to obtain a user network feature vector according to the bill panorama graph;
a splicing unit 340, configured to concatenate the user network feature vector and the user word vector to obtain mixed feature information;
a calculating unit 350, configured to calculate a seed user vector set according to the mixed feature information;
and a mining unit 360, configured to perform user mining according to the seed user vector set to obtain target potential customers for the bills.
As an alternative embodiment, the first acquisition unit 310 includes:
a first acquisition subunit 311, configured to acquire business behavior data of bill transactions between enterprises and historical transaction behavior data of bill customers;
a construction subunit 312, configured to construct a customer bill panorama network graph according to the historical transaction behavior data of the bill customers;
a training subunit 313, configured to train on the customer bill panorama network graph with a Node2vec model to obtain the bill panorama graph; the bill panorama graph comprises a feature vector of each user in the bill panorama network.
As an alternative embodiment, the second acquisition unit 320 includes:
a mapping subunit 321, configured to map the business behavior data to obtain mapping feature vectors; the business behavior data at least comprises the enterprise's years since establishment, the enterprise's industry, the enterprise's annualized EVA, the enterprise's deposit and loan information, and the enterprise's annual daily average discount balance; the mapping feature vectors at least comprise an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise annualized EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector and an enterprise discount feature vector;
a processing subunit 322, configured to process the mapping feature vectors through a pre-configured fully connected neural network layer to obtain a processing vector set;
an adding subunit 323, configured to sum the vectors in the processing vector set to obtain the user word vector.
As an alternative embodiment, the mining unit 360 includes:
a second acquisition subunit 361, configured to acquire a data source to be mined;
a marking subunit 362, configured to mark the seed customer group in the data source to be mined and determine the diffusion range according to the seed customer group;
and a searching subunit 363, configured to perform a similarity vector search within the diffusion range according to the seed user vector set to obtain the target potential customers for the bills.
In this embodiment, for an explanation of the user mining device, reference may be made to the description in Embodiment 1 or Embodiment 2, which is not repeated here.
Therefore, the user mining device described in this embodiment takes the homophily of customers in the network graph structure into account and, on that basis, processes the enterprise profile information with the embedding technique used in natural language processing, so that the two partial vectors can be concatenated to obtain the user's mixed feature vector. Because the Node2vec graph algorithm is essentially a feature extractor, the device can sample the graph, build a model on the sampled sequences, and finally convert the nodes into feature vectors, so that the model can mine high-order hidden features among customers. In addition, the device can use the Faiss vector search library to search the database for the topN vectors most similar to the seed customer group. This makes it particularly suitable for big data scenarios: when facing a very large customer base, Faiss can readily support similarity search over tens of millions of vectors, which improves the running efficiency of the model, reduces memory usage, and saves cluster resources.
An embodiment of the present application provides an electronic device, including a memory and a processor, where the memory is configured to store a computer program, and the processor is configured to execute the computer program to cause the electronic device to execute a user mining method in embodiment 1 or embodiment 2 of the present application.
An embodiment of the present application provides a computer readable storage medium storing computer program instructions that, when read and executed by a processor, perform the user mining method of embodiment 1 or embodiment 2 of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application. It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A user mining method, comprising:
acquiring business behavior data of bill transactions between enterprises, and a bill panorama;
acquiring a user word vector according to the business behavior data, and acquiring a user network feature vector according to the bill panorama;
performing vector splicing on the user network feature vector and the user word vector to obtain mixed feature information;
calculating a seed user vector set according to the mixed feature information;
and performing user mining according to the seed user vector set to obtain target potential customers of the bill.
2. The user mining method according to claim 1, wherein the acquiring of business behavior data of bill transactions between enterprises and a bill panorama comprises:
acquiring business behavior data of bill transactions between enterprises and historical transaction behavior data of bill clients;
constructing a client bill panoramic network graph according to the historical transaction behavior data of bill clients;
training on the client bill panoramic network graph through a Node2vec model to obtain the bill panorama; the bill panorama comprises a feature vector of each user in the bill panoramic network.
3. The method of claim 1, wherein the obtaining the user word vector according to the business behavior data comprises:
mapping the business behavior data to obtain a mapping feature vector; the business behavior data at least comprises business establishment years, business location industries, business aging EVA, business loan deposit information and business post current year average balance; the mapping feature vector at least comprises an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise aging EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector and an enterprise impression feature vector;
processing the mapping feature vector through a pre-configured fully connected neural network layer to obtain a processing vector set;
and summing the vectors in the processing vector set to obtain the user word vector.
4. The user mining method according to claim 1, wherein the performing user mining according to the seed user vector set to obtain target potential customers of the ticket includes:
acquiring a data source to be mined;
marking a seed customer group in the data source to be mined, and determining a diffusion range according to the seed customer group;
and searching for similar vectors within the diffusion range according to the seed user vector set to obtain target potential customers of the bill.
5. A user mining apparatus, the user mining apparatus comprising:
a first acquisition unit, configured to acquire business behavior data of bill transactions between enterprises and a bill panorama;
a second acquisition unit, configured to acquire a user word vector according to the business behavior data;
a third acquisition unit, configured to acquire a user network feature vector according to the bill panorama;
a splicing unit, configured to perform vector splicing on the user network feature vector and the user word vector to obtain mixed feature information;
a calculating unit, configured to calculate a seed user vector set according to the mixed feature information;
and a mining unit, configured to perform user mining according to the seed user vector set to obtain target potential customers of the bill.
6. The user mining apparatus of claim 5, wherein the first acquisition unit comprises:
a first acquisition subunit, configured to acquire business behavior data of bill transactions between enterprises and historical transaction behavior data of bill clients;
a construction subunit, configured to construct a client bill panoramic network graph according to the historical transaction behavior data of bill clients;
a training subunit, configured to train on the client bill panoramic network graph through a Node2vec model to obtain the bill panorama; the bill panorama comprises a feature vector of each user in the bill panoramic network.
7. The user mining apparatus of claim 5, wherein the second acquisition unit comprises:
the mapping subunit is used for mapping the business behavior data to obtain a mapping feature vector; the business behavior data at least comprises business establishment years, business location industries, business aging EVA, business loan deposit information and business post current year average balance; the mapping feature vector at least comprises an enterprise attribute feature vector, an enterprise industry feature vector, an enterprise aging EVA feature vector, an enterprise deposit feature vector, an enterprise loan feature vector and an enterprise impression feature vector;
a processing subunit, configured to process the mapping feature vector through a pre-configured fully connected neural network layer to obtain a processing vector set;
and an adding subunit, configured to sum the vectors in the processing vector set to obtain a user word vector.
8. The user mining apparatus of claim 5, wherein the mining unit includes:
a second acquisition subunit, configured to acquire a data source to be mined;
a marking subunit, configured to mark a seed customer group in the data source to be mined and determine a diffusion range according to the seed customer group;
and a searching subunit, configured to search for similar vectors within the diffusion range according to the seed user vector set to obtain target potential customers of the bill.
9. An electronic device comprising a memory for storing a computer program and a processor that runs the computer program to cause the electronic device to perform the user mining method of any of claims 1 to 4.
10. A readable storage medium having stored therein computer program instructions which, when read and executed by a processor, perform the user mining method of any of claims 1 to 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310358724.2A CN116579791A (en) 2023-03-29 2023-03-29 User mining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310358724.2A CN116579791A (en) 2023-03-29 2023-03-29 User mining method and device

Publications (1)

Publication Number Publication Date
CN116579791A (en) 2023-08-11

Family

ID=87532960

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310358724.2A Pending CN116579791A (en) 2023-03-29 2023-03-29 User mining method and device

Country Status (1)

Country Link
CN (1) CN116579791A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117271886A (en) * 2023-08-25 2023-12-22 广东美亚旅游科技集团股份有限公司 Data searching method, system, equipment and medium based on air ticket order management

Similar Documents

Publication Publication Date Title
CN108153824B (en) Method and device for determining target user group
CN108388559A (en) Name entity recognition method and system, computer program of the geographical space under
CN110309432B (en) Synonym determining method based on interest points and map interest point processing method
US20190034816A1 (en) Methods and system for associating locations with annotations
CN106033416A (en) A string processing method and device
CN109255000B (en) Dimension management method and device for label data
CN106776897A (en) A kind of user's portrait label determines method and device
CN110598066B (en) Bank full-name rapid matching method based on word vector expression and cosine similarity
JP2018537760A (en) Method and apparatus for account mapping based on address information
CN112330342A (en) Method and system for optimally matching enterprise name and system user name
CN112241458B (en) Text knowledge structuring processing method, device, equipment and readable storage medium
CN116579791A (en) User mining method and device
CN111309834A (en) Method and device for matching wireless hotspot with interest point
Chi et al. Creating a new dataset to analyse house prices in England
CN117151429B (en) Government service flow arranging method and device based on knowledge graph
CN114328808A (en) Address fuzzy matching method, address processing method, address fuzzy matching device and electronic equipment
CN113761867A (en) Address recognition method and device, computer equipment and storage medium
CN112711645A (en) Method and device for expanding position point information, storage medium and electronic equipment
CN109144999B (en) Data positioning method, device, storage medium and program product
CN103514167B (en) Data processing method and equipment
CN115309995A (en) Scientific and technological resource pushing method and device based on demand text
CN112214494B (en) Retrieval method and device
CN110309312B (en) Associated event acquisition method and device
CN113434706A (en) Academic collaboration relation analysis method and device
CN111767722A (en) Word segmentation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination