CN112541042B - Method for generating lightweight social network under ten million orders of magnitude - Google Patents

Info

Publication number
CN112541042B
Authority
CN
China
Prior art keywords
row number
array
node
nodes
social network
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011497282.2A
Other languages
Chinese (zh)
Other versions
CN112541042A (en)
Inventor
赵伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan XW Bank Co Ltd
Original Assignee
Sichuan XW Bank Co Ltd
Application filed by Sichuan XW Bank Co Ltd
Priority to CN202011497282.2A
Publication of CN112541042A
Application granted
Publication of CN112541042B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering

Abstract

The invention discloses a method for generating a lightweight social network at the ten-million order of magnitude, in the technical field of social relationship data generation. It addresses the following problems in small and medium-sized demand scenarios: deploying and maintaining a highly available graph database environment incurs a large maintenance cost; the highly available editions of some mature graph databases require payment; developers and business parties must specially learn a graph database query language, at a high learning cost; all the data of a given relationship network cannot be queried quickly and accurately; and introducing a graph database increases the complexity of the system architecture for poor cost-effectiveness. The invention provides a lightweight, highly available scheme for forming networks of up to ten million data nodes, which can satisfy most scenarios in which systems use social networks for data analysis. The method is used to generate social relationship networks.

Description

Method for generating lightweight social network under ten million orders of magnitude
Technical Field
The invention belongs to the technical field of social relationship data generation, and in particular relates to a method for generating a lightweight social network at the ten-million order of magnitude.
Background
Relationship-network data is often used in businesses such as risk control, social platforms, and e-commerce. For example, when a person who is close to you in your relationship network buys a mother-and-baby product on an e-commerce platform, the system can recommend that product to you to improve the purchase rate. In financial risk control, if the relationship network you belong to contains many intermediaries and electronic fraud perpetrators, that network is a high-risk network, and loan applications from the people in it are likely to be rejected.
There are many data modelling methods for analyzing and generating social relationship network data. At present, such data is mainly established by building a graph database, and the specific steps include:
step one: constructing a graph database environment according to the official documentation;
step two: importing the relationship data;
step three: learning a data query language and using it to operate on the data;
step four: operating and maintaining the graph database.
The prior art has the following problems: deploying and maintaining a highly available graph database environment incurs a large maintenance cost; the highly available editions of some mature graph databases require payment; developers and business parties must specially learn the query language of the graph database, at a high learning cost; all the data of a given relationship network cannot be queried quickly and accurately; and introducing a graph database for small and medium-sized demand scenarios increases the complexity of the system architecture for poor cost-effectiveness.
Disclosure of Invention
In the prior art, for small and medium-sized demand scenarios, deploying and maintaining a highly available graph database environment incurs a large maintenance cost; the highly available editions of some mature graph databases require payment; developers and business parties must specially learn a graph database query language, at a high learning cost; all the data of a given relationship network cannot be queried quickly and accurately; and introducing a graph database increases the complexity of the system architecture for poor cost-effectiveness. To solve these problems, the invention provides a method for generating a lightweight social network at the ten-million order of magnitude. The requirement for a social network is met with the technology stack already present in current Web projects, without introducing a heavy third-party component or a graph database that the system would have to maintain additionally; network formation is efficient, and most small and medium-sized usage scenarios can be satisfied.
In order to achieve this purpose, the invention adopts the following technical scheme:
A method for generating a lightweight social network at the ten-million order of magnitude comprises the following steps:
step A: acquiring original data from a data source and generating nodes and node attributes, wherein the original data comprises the identity card number, mobile phone number, company address, and home address of an individual user;
step B: processing the node attributes into key-value pairs that map each attribute to its corresponding nodes, and replacing each node attribute with a row number to form a list of the nodes corresponding to each row number;
step C: declaring a base row number array in which the value of each column is that column's subscript, i.e. 0, 1, 2, 3, 4, 5, …, n, and inverting the list of nodes corresponding to each row number into a list of row numbers corresponding to each node;
step D: sequentially traversing the row number array corresponding to each node and, for each such array, updating the values at the corresponding subscripts of the base row number array to the minimum row number value in that array, and generating a relation set of the minimum subscript and the row numbers;
step E: generating the final base row number array after the row number arrays of all nodes have been traversed, merging the nodes that have the same row number value in the final base row number array, and placing nodes with the same row number value in the same social network.
Further, in step A, the data source is configurable, and the configurable sources include a third-party interface, a database, and a file.
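As a minimal sketch of how a configurable data source for step A might look in Python: the registry `SOURCE_LOADERS`, the loader functions, and the field names (`id_card`, `phone`, `company_address`, `home_address`) are all assumptions made for illustration; the text only states that a third-party interface, a database, and a file can be configured as sources.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]

def load_from_file(text: str) -> List[Record]:
    """Parse CSV text into one record per individual user (hypothetical layout)."""
    return list(csv.DictReader(io.StringIO(text)))

def load_from_database(rows: Iterable[tuple]) -> List[Record]:
    """Turn already-fetched database rows into records (column order assumed)."""
    fields = ("id_card", "phone", "company_address", "home_address")
    return [dict(zip(fields, row)) for row in rows]

# Hypothetical registry: the configured source type selects the loader.
SOURCE_LOADERS: Dict[str, Callable[..., List[Record]]] = {
    "file": load_from_file,
    "database": load_from_database,
    # A loader that calls a third-party interface would be registered here.
}

def load_records(source_type: str, payload) -> List[Record]:
    """Dispatch to the loader configured for this source type."""
    try:
        loader = SOURCE_LOADERS[source_type]
    except KeyError:
        raise ValueError(f"unsupported data source: {source_type}") from None
    return loader(payload)

sample_csv = (
    "id_card,phone,company_address,home_address\n"
    "510130,13800000001,Addr X,Addr A\n"
)
records = load_records("file", sample_csv)
```

Keeping the loaders behind one lookup means a new source type can be added without touching the later steps, which only consume the resulting records.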
Further, in step C, in the generated list of row numbers corresponding to each node, the row numbers of each node are arranged in ascending order.
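Steps B and C can be sketched in Python as follows. This is an illustrative reading of the text, not the patented implementation: the function name `build_row_tables` and the attribute strings are made up, and the input is assumed to be a mapping from each node to its attribute values.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def build_row_tables(
    records: Dict[str, List[str]],
) -> Tuple[Dict[int, List[str]], Dict[str, List[int]], List[int]]:
    """records maps a node (e.g. an identity card number) to its attribute values."""
    # Step B: key-value pairs, attribute value -> nodes that carry it.
    attr_to_nodes: Dict[str, List[str]] = defaultdict(list)
    for node, attrs in records.items():
        for attr in attrs:
            attr_to_nodes[attr].append(node)

    # Step B: replace each attribute with an increasing row number.
    row_to_nodes = dict(enumerate(attr_to_nodes.values()))

    # Step C: base row number array, each value equal to its subscript.
    father = list(range(len(row_to_nodes)))

    # Step C: invert to node -> row numbers. Rows are visited in increasing
    # order, so each node's list comes out in ascending order.
    node_to_rows: Dict[str, List[int]] = defaultdict(list)
    for row, nodes in row_to_nodes.items():
        for node in nodes:
            node_to_rows[node].append(row)
    return row_to_nodes, dict(node_to_rows), father

# Made-up attribute values; only the shape of the data matters here.
records = {
    "510130": ["phone-1", "addr-a"],
    "510159": ["phone-2", "addr-a"],
    "510162": ["phone-3", "addr-b"],
}
row_to_nodes, node_to_rows, father = build_row_tables(records)
```

In this toy input the shared attribute `addr-a` becomes row 1, so nodes 510130 and 510159 both list row 1, which is what later links them.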
Further, in step D, after the row number array of one node has been traversed, the base row number array is updated and the updated relation set of minimum subscripts and row numbers is maintained; the row number array of the next node is then traversed on the basis of the updated base row number array, the values at the corresponding subscripts are updated in the same manner, and the relation set is updated, until the row number arrays of all nodes have been traversed and the update of the base row number array is complete.
Further, the nodes and edges of the social network data are processed to form a visualized social network, and whether to store the node and edge data is decided according to the actual application.
In summary, by adopting the above technical scheme, the invention has the following beneficial effects: the requirement for a social network is met with the technology stack already present in current Web projects, without introducing a heavy third-party component or a graph database that the system would have to maintain additionally; network formation is efficient, and the resulting relationship network at the ten-million-node order of magnitude can satisfy most small and medium-sized usage scenarios; the whole solution needs no separate high-availability deployment, since it depends only on the application system it is integrated into.
Drawings
FIG. 1 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 2 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 3 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 4 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 5 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 6 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 7 is a schematic view of an embodiment of the present invention;
FIG. 8 is a schematic view of an embodiment of the present invention;
FIG. 9 is a flow chart illustrating an embodiment of the present invention.
Detailed Description
All of the features disclosed in this specification, or all of the steps in any method or process so disclosed, may be combined in any combination, except combinations of features and/or steps that are mutually exclusive.
The invention will be further described with reference to the accompanying drawings and the detailed description.
As shown in the figure, the invention discloses a method for generating a lightweight social network under ten million orders of magnitude, which comprises the following steps:
step A: acquiring original data from a data source and generating nodes and node attributes, wherein the original data comprises the identity card number, mobile phone number, company address, and home address of an individual user;
the data source is configurable, and the configurable sources include a third-party interface, a database, and a file.
step B: processing the node attributes into key-value pairs that map each attribute to its corresponding nodes, and replacing each node attribute with a row number to form a list of the nodes corresponding to each row number;
step C: declaring a base row number array in which the value of each column is that column's subscript, i.e. 0, 1, 2, 3, 4, 5, …, n, and inverting the list of nodes corresponding to each row number into a list of row numbers corresponding to each node;
in step C, in the generated list of row numbers corresponding to each node, the row numbers of each node are arranged in ascending order.
step D: sequentially traversing the row number array corresponding to each node and, for each such array, updating the values at the corresponding subscripts of the base row number array to the minimum row number value in that array, and generating a relation set of the minimum subscript and the row numbers;
in step D, after the row number array of one node has been traversed, the base row number array is updated and the updated relation set of minimum subscripts and row numbers is maintained; the row number array of the next node is then traversed on the basis of the updated base row number array, the values at the corresponding subscripts are updated in the same manner, and the relation set is updated, until the row number arrays of all nodes have been traversed and the update of the base row number array is complete.
step E: generating the final base row number array after the row number arrays of all nodes have been traversed, merging the nodes that have the same row number value in the final base row number array, and placing nodes with the same row number value in the same social network.
The nodes and edges of the social network data are processed to form a visualized social network, and whether to store the node and edge data is decided according to the actual application.
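The final grouping in step E can be illustrated with a short Python sketch. The function name `group_nodes`, the node labels, and the sample array are hypothetical; the sketch assumes step D has finished, so that every row of a given node holds the same value in the base row number array.

```python
from collections import defaultdict
from typing import Dict, List

def group_nodes(father: List[int], node_to_rows: Dict[str, List[int]]) -> Dict[int, List[str]]:
    """Group nodes whose rows share the same value in the final base array."""
    networks: Dict[int, List[str]] = defaultdict(list)
    for node, rows in node_to_rows.items():
        # All rows of one node hold the same value after step D, so the
        # first row is enough to identify the node's network.
        networks[father[rows[0]]].append(node)
    return dict(networks)

# Hypothetical final state: rows 0-2 merged under 0, rows 3-4 merged under 3.
father = [0, 0, 0, 3, 3]
node_to_rows = {"n1": [0, 1], "n2": [1, 2], "n3": [3, 4]}
networks = group_nodes(father, node_to_rows)
```

Here `n1` and `n2` end up in the network keyed by 0, while `n3` forms a separate network keyed by 3.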
The embodiment is as follows:
Data is acquired for five persons with identity card numbers 510130, 510159, 510162, 510216, and 510224, and the attributes of the five persons are processed into key-value pairs that map each attribute to the corresponding identity card numbers; for ease of understanding, the nodes of all five persons are regarded as master nodes. The key-value pairs are shown in FIG. 1;
then, each attribute is replaced with an increasing row number from 0 to 8 to form a list of the identity card numbers corresponding to each row number, as shown in FIG. 2;
a base row number array father [0,1,2,3,4,5,6,7,8] is declared, in which the value of each column is its subscript. The list of identity card numbers corresponding to each row number is then inverted into a list of row numbers corresponding to each identity card number, with the row numbers of each identity card number arranged in ascending order so that traversal starts from the smallest row number, as shown in FIG. 3.
The row numbers [1,3,4,7,8] of identity card number 510162 are traversed, and the values at the corresponding subscripts of the base row number array are updated to the minimum row number value in the current row number array, giving father [0,1,2,1,1,5,6,1,1]; the relation set of the minimum subscript and the row numbers, 1 -> [1,3,4,7,8], is then maintained, as shown in FIG. 4.
Next, the row numbers [2,3,4,5,6] are traversed: father[2] is 2; at row 3, father[3] is 1, and since 1 < 2 the minimum becomes 1, so the rows are merged into the relation set of 1, giving 1 -> [1,2,3,4,7,8]; after the traversal, father is [0,1,1,1,1,1,1,1,1], as shown in FIG. 5.
The row numbers [0,5], [4,7], and [0,1,2,6,7,8] are then traversed in the same way, giving father [0,0,0,0,0,0,0,0,0]; therefore [510130,510159,510162,510216,510224] are in the same social network, as shown in FIG. 6.
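The traversal in this example can be replayed with a short Python sketch. It is one possible reading of step D, not necessarily the patented implementation: the function name `traverse` is made up, and relabelling whole groups through the relation set is an assumption chosen because it yields the `father` arrays quoted in the example. Only the row list of 510162 is attributed to a person in the text, so the lists are simply processed in the order they appear.

```python
from typing import Dict, List, Tuple

def traverse(row_lists: List[List[int]], n_rows: int) -> Tuple[List[int], Dict[int, List[int]]]:
    father = list(range(n_rows))          # base row number array
    relation: Dict[int, List[int]] = {}   # minimum subscript -> rows in its group
    for rows in row_lists:
        # Values currently stored at this node's rows identify the groups it touches.
        roots = {father[r] for r in rows}
        smallest = min(roots)
        merged = set(rows)
        for root in roots:
            # A row that was never grouped before forms the singleton group [root].
            merged.update(relation.pop(root, [root]))
        for r in merged:
            father[r] = smallest
        relation[smallest] = sorted(merged)
    return father, relation

row_lists = [[1, 3, 4, 7, 8], [2, 3, 4, 5, 6], [0, 5], [4, 7], [0, 1, 2, 6, 7, 8]]
after_one, _ = traverse(row_lists[:1], 9)   # [0, 1, 2, 1, 1, 5, 6, 1, 1]
after_two, _ = traverse(row_lists[:2], 9)   # [0, 1, 1, 1, 1, 1, 1, 1, 1]
final, relation = traverse(row_lists, 9)    # [0, 0, 0, 0, 0, 0, 0, 0, 0]
```

Because every row ends with the value 0, all five row lists, and therefore all five persons, fall into a single network. Relabelling a whole group on each merge is simple but can touch the same row many times; whether that matters at the ten-million scale depends on the data.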
The nodes and edges of the data in the same social network are processed so that the social network can be visualized on a page, and the node and edge data is stored in Redis. Here a number denotes a mobile phone number and a letter denotes a home address, and the resulting node and edge data format is shown in FIG. 7;
the presentation of the network information on a web page according to the nodes and edges is shown in FIG. 8.
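As a rough Python sketch of how node and edge data for visualization might be assembled: the actual format of FIG. 7 is not reproduced in the text, so the `nodes`/`edges` layout, the field names, and the function `to_graph` are assumptions. The sketch links each person to the attributes they carry and serializes the result to a JSON string, which is the kind of value that could be written to a key-value store such as Redis; the storage call itself is left out.

```python
import json
from typing import Dict, List

def to_graph(attr_to_nodes: Dict[str, List[str]]) -> dict:
    """Build node and edge lists linking each person to the attributes they carry."""
    nodes: List[dict] = []
    edges: List[dict] = []
    seen = set()
    for attr, persons in attr_to_nodes.items():
        nodes.append({"id": attr, "type": "attribute"})
        for person in persons:
            if person not in seen:
                seen.add(person)
                nodes.append({"id": person, "type": "person"})
            edges.append({"source": person, "target": attr})
    return {"nodes": nodes, "edges": edges}

# Made-up values: a digit string stands for a mobile phone number,
# a letter for a home address.
attr_to_nodes = {"13800000001": ["510130"], "A": ["510130", "510159"]}
graph = to_graph(attr_to_nodes)
payload = json.dumps(graph)  # string form suitable for storing and for a web front end
```

A front-end graph library can typically consume such a node list and edge list directly, which is why the data is kept in that shape rather than as the base row number array.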
The above are merely representative examples of the many specific applications of the present invention, and do not limit the scope of the invention in any way. All the technical solutions formed by using the conversion or the equivalent substitution fall within the protection scope of the present invention.

Claims (5)

1. A method for generating a lightweight social network at the ten-million order of magnitude, characterized by comprising the following steps:
step A: acquiring original data from a data source and generating nodes and node attributes, wherein the original data comprises the identity card number, mobile phone number, company address, and home address of an individual user;
step B: processing the node attributes into key-value pairs that map each attribute to its corresponding nodes, and replacing each node attribute with a row number to form a list of the nodes corresponding to each row number;
step C: declaring a base row number array in which the value of each column is that column's subscript, i.e. 0, 1, 2, 3, 4, 5, …, n, and inverting the list of nodes corresponding to each row number into a list of row numbers corresponding to each node;
step D: sequentially traversing the row number list corresponding to each node and, for each such list, updating the corresponding column values of the base row number array to the minimum column value in that row number list, and generating a relation set of the minimum column value and the row numbers;
step E: generating the final base row number array after the row number lists of all nodes have been traversed, merging the nodes that have the same column value in the final base row number array, and placing nodes with the same column value in the same social network.
2. The method for generating a lightweight social network at the ten-million order of magnitude according to claim 1, wherein: in step A, the data source is configurable, and the configurable sources include a third-party interface, a database, and a file.
3. The method for generating a lightweight social network at the ten-million order of magnitude according to claim 1, wherein: in step C, in the generated list of row numbers corresponding to each node, the row numbers of each node are arranged in ascending order.
4. The method for generating a lightweight social network at the ten-million order of magnitude according to claim 1, wherein: in step D, after the row number list of one node has been traversed, the base row number array is updated and the updated relation set of minimum column values and row numbers is maintained; the row number list of the next node is then traversed on the basis of the updated base row number array, the corresponding column values are updated in the same manner, and the relation set is updated, until the row number lists of all nodes have been traversed and the update of the base row number array is complete.
5. The method for generating a lightweight social network at the ten-million order of magnitude according to any one of claims 1 to 4, wherein: the nodes and edges of the social network data are processed to form a visualized social network, and whether to store the node and edge data is decided according to the actual application.
CN202011497282.2A 2020-12-17 2020-12-17 Method for generating lightweight social network under ten million orders of magnitude Active CN112541042B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011497282.2A CN112541042B (en) 2020-12-17 2020-12-17 Method for generating lightweight social network under ten million orders of magnitude

Publications (2)

Publication Number Publication Date
CN112541042A (en) 2021-03-23
CN112541042B (en) 2022-11-04

Family

ID=75019068

Country Status (1)

Country Link
CN (1) CN112541042B (en)

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant