CN112541042B - Method for generating lightweight social network under ten million orders of magnitude - Google Patents
- Publication number
- CN112541042B (application number CN202011497282.2A)
- Authority
- CN
- China
- Prior art keywords
- row number
- array
- node
- nodes
- social network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
Abstract
The invention discloses a method for generating a lightweight social network at the ten-million-node scale, belonging to the technical field of social relationship data generation. It addresses the following problems in small and medium-sized demand scenarios: deploying and maintaining a highly available graph database environment is very costly; the highly available editions of some mature graph databases require payment; developers and business parties must learn a graph database query language, which carries a high learning cost; all the data of a given relationship network cannot be queried quickly and accurately; and introducing a graph database increases the complexity of the system architecture for a poor cost-performance ratio. The invention provides a lightweight, highly available scheme for forming a network of up to ten million data nodes, which can satisfy most scenarios in which a system uses social networks for data analysis. The method is used for generating social relationship networks.
Description
Technical Field
The invention belongs to the technical field of social relationship data generation, and in particular relates to a method for generating a lightweight social network at the ten-million-node scale.
Background
Relationship network data is widely used in businesses such as risk control, social platforms and e-commerce. For example, when a person who is close to you in your relationship network buys a mother-and-baby product on an e-commerce site, the system can recommend that product to you to raise the purchase rate. In financial risk control, if your relationship network contains many intermediaries and telecom fraudsters, it is treated as a high-risk network, and loan applications from people in it are rejected with high probability.
Many data modeling methods exist for analyzing and generating social relationship network data. At present, such data is mainly established by building a graph database, with the following specific steps:
Step 1: setting up a graph database environment according to the official documentation;
Step 2: importing the relationship data;
Step 3: learning a data query language in order to operate on the data;
Step 4: operating and maintaining the graph database.
The prior art has the following problems: deploying and maintaining a highly available graph database environment is very costly; the highly available editions of some mature graph databases require payment; developers and business parties must learn the query language of the graph database, which carries a high learning cost; all the data of a given relationship network cannot be queried quickly and accurately; and introducing a graph database for small and medium-sized demand scenarios increases the complexity of the system architecture for a poor cost-performance ratio.
Disclosure of Invention
The prior art has the following problems in small and medium-sized demand scenarios: deploying and maintaining a highly available graph database environment is very costly; the highly available editions of some mature graph databases require payment; developers and business parties must learn the query language of the graph database, which carries a high learning cost; all the data of a given relationship network cannot be queried quickly and accurately; and introducing a graph database increases the complexity of the system architecture for a poor cost-performance ratio. To solve these problems, the invention provides a method for generating a lightweight social network at the ten-million-node scale. Its purpose is to meet the requirements of a social network with the technology stack of the current Web project, without introducing a heavy third-party component or a graph database that the system must additionally maintain; network formation is efficient, and most small and medium-sized usage scenarios can be satisfied.
In order to achieve the purpose, the invention adopts the following technical scheme:
A method for generating a lightweight social network at the ten-million-node scale comprises the following steps:
Step A: acquiring original data from a data source, and generating nodes and node attributes, wherein the original data comprises the ID card number, mobile phone number, company address and home address of an individual user;
Step B: processing the node attributes into key-value pairs that map each attribute to its corresponding nodes, and replacing each attribute with a row number to form a list of the nodes corresponding to each row number;
Step C: declaring a base row number array in which the value of each element is its own subscript, i.e. 0, 1, 2, 3, 4, 5, …, n, and inverting the list of nodes corresponding to each row number into a list of the row numbers corresponding to each node;
Step D: traversing the row number array of each node in turn; when a node's row number array is traversed, updating the elements of the base row number array at the corresponding subscripts to the minimum row number value in that row number array, and generating a relation set of the minimum subscript and the row numbers;
Step E: after the row number arrays of all nodes have been traversed, obtaining the final base row number array, and merging the nodes whose row numbers have the same value in the final base row number array into the same social network.
Further, in step A, the data source is configurable, and the configurable sources include third-party interfaces, databases and files.
Further, in step C, in the generated list of row numbers corresponding to each node, the row numbers of each node are arranged in ascending order.
Further, in step D, after the row number array of one node has been traversed, the base row number array is updated and the relation set of the minimum subscript and the row numbers is maintained; the row number array of the next node is then traversed on the basis of the updated base row number array, the corresponding elements of the base row number array and the relation set are updated in the same way, and this continues until the row number arrays of all nodes have been traversed, which completes the update of the base row number array.
Further, nodes and edges are derived from the data of the social network to form a visualized social network, and whether to store the node and edge data is decided according to the actual application.
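The data preparation in steps B and C can be sketched as follows. This is a minimal illustration, not the patented implementation: the records in `raw` are invented, and the names `attr_to_nodes`, `row_to_nodes` and `node_to_rows` are hypothetical. It assumes each distinct attribute value becomes one row, which is how the embodiment appears to use row numbers.

```python
from collections import defaultdict

# Hypothetical input: node (ID number) -> attribute values (phones, addresses).
raw = {
    "510130": ["13800000001", "addr-A"],
    "510159": ["13800000001", "addr-B"],
    "510162": ["addr-B", "13800000002"],
}

# Step B: key-value pairs mapping each attribute to the nodes that share it.
attr_to_nodes = defaultdict(list)
for node, attrs in raw.items():
    for attr in attrs:
        attr_to_nodes[attr].append(node)

# Replace each attribute with an increasing row number (the list index).
row_to_nodes = list(attr_to_nodes.values())

# Step C: base row number array, every element initialized to its subscript.
father = list(range(len(row_to_nodes)))

# Invert row -> nodes into node -> rows. Rows are visited in increasing
# order, so each node's row list comes out in ascending order.
node_to_rows = defaultdict(list)
for row, nodes in enumerate(row_to_nodes):
    for node in nodes:
        node_to_rows[node].append(row)

print(father)              # [0, 1, 2, 3]
print(dict(node_to_rows))  # {'510130': [0, 1], '510159': [0, 2], '510162': [2, 3]}
```

Only integer row numbers are carried into the later steps, so the attribute strings are no longer needed once `row_to_nodes` is built.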
In summary, by adopting the above technical scheme, the invention has the following beneficial effects: the requirements of a social network are met with the technology stack of the current Web project, without introducing a heavy third-party component or a graph database that the system must additionally maintain; network formation is efficient, and the resulting relationship network at the ten-million-node scale can satisfy most small and medium-sized usage scenarios; and the solution as a whole requires no separate high-availability deployment, since it depends only on the application system into which it is integrated.
Drawings
FIG. 1 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 2 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 3 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 4 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 5 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 6 is a schematic illustration of the steps of one embodiment of the present invention;
FIG. 7 is a schematic view of an embodiment of the present invention;
FIG. 8 is a schematic view of an embodiment of the present invention;
FIG. 9 is a flow chart illustrating an embodiment of the present invention.
Detailed Description
All of the features disclosed in this specification, or all of the steps in any method or process so disclosed, may be combined in any combination, except combinations of features and/or steps that are mutually exclusive.
The invention will be further described with reference to the accompanying drawings and the detailed description.
As shown in the figures, the invention discloses a method for generating a lightweight social network at the ten-million-node scale, which comprises the following steps:
Step A: acquiring original data from a data source, and generating nodes and node attributes, wherein the original data comprises the ID card number, mobile phone number, company address and home address of an individual user;
the data source is configurable, and the configurable sources include third-party interfaces, databases and files.
Step B: processing the node attributes into key-value pairs that map each attribute to its corresponding nodes, and replacing each attribute with a row number to form a list of the nodes corresponding to each row number;
Step C: declaring a base row number array in which the value of each element is its own subscript, i.e. 0, 1, 2, 3, 4, 5, …, n, and inverting the list of nodes corresponding to each row number into a list of the row numbers corresponding to each node;
in step C, in the generated list of row numbers corresponding to each node, the row numbers of each node are arranged in ascending order.
Step D: traversing the row number array of each node in turn; when a node's row number array is traversed, updating the elements of the base row number array at the corresponding subscripts to the minimum row number value in that row number array, and generating a relation set of the minimum subscript and the row numbers;
in step D, after the row number array of one node has been traversed, the base row number array is updated and the relation set of the minimum subscript and the row numbers is maintained; the row number array of the next node is then traversed on the basis of the updated base row number array, the corresponding elements of the base row number array and the relation set are updated in the same way, and this continues until the row number arrays of all nodes have been traversed, which completes the update of the base row number array.
Step E: after the row number arrays of all nodes have been traversed, obtaining the final base row number array, and merging the nodes whose row numbers have the same value in the final base row number array into the same social network.
Nodes and edges are derived from the data of the social network to form a visualized social network, and whether to store the node and edge data is decided according to the actual application.
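One possible reading of steps C to E is sketched below. It is an interpretation under stated assumptions rather than the patent's actual code: the function name `build_networks`, the `groups` dictionary (standing in for the "relation set of the minimum subscript and the row numbers") and the sample input are all hypothetical.

```python
def build_networks(n_rows, node_to_rows):
    """Merge rows that co-occur on a node; return (father, networks)."""
    father = list(range(n_rows))  # Step C: each row starts as its own label
    groups = {}                   # minimum row number -> rows merged under it

    # Step D: traverse each node's row list in turn.
    for rows in node_to_rows.values():
        if not rows:
            continue
        labels = {father[r] for r in rows}
        smallest = min(labels)
        merged = set(rows)
        # Pull in every row already merged under any label seen here.
        for label in labels:
            merged.update(groups.pop(label, [label]))
        for r in merged:
            father[r] = smallest
        groups[smallest] = sorted(merged)

    # Step E: nodes whose rows carry the same final value form one network.
    networks = {}
    for node, rows in node_to_rows.items():
        if rows:
            networks.setdefault(father[rows[0]], []).append(node)
    return father, networks


# Hypothetical input: "a" and "b" share row 1, "c" is isolated.
father, networks = build_networks(4, {"a": [0, 1], "b": [1, 2], "c": [3]})
print(father)    # [0, 0, 0, 3]
print(networks)  # {0: ['a', 'b'], 3: ['c']}
```

Relabeling every row of a merged group keeps lookups in `father` constant-time, at the cost of repeated rewrites when large groups merge many times.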
Embodiment:
Data is acquired for five persons with ID numbers 510130, 510159, 510162, 510216 and 510224, and the attributes of the five persons are processed into key-value pairs that map each attribute to the corresponding ID numbers. For ease of understanding, the nodes of all five persons are regarded as master nodes. The key-value pairs are shown in FIG. 1.
The attributes are then replaced with increasing row numbers 0 to 8 to form a list of the ID numbers corresponding to each row number, as shown in FIG. 2.
A base row number array is declared as father [0,1,2,3,4,5,6,7,8], in which the value of each element is its own subscript. The list of ID numbers corresponding to each row number is then inverted into a list of the row numbers corresponding to each ID number, with the row numbers of each ID number arranged in ascending order so that traversal starts from the smallest row number, as shown in FIG. 3.
The row numbers [1,3,4,7,8] of ID number 510162 are traversed, and the elements of the base row number array at those subscripts are updated to the minimum row number value in the current row number array; the updated array is father [0,1,2,1,1,5,6,1,1]. The relation set of the minimum subscript and the row numbers, 1 -> [1,3,4,7,8], is then maintained, as shown in FIG. 4.
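The first traversal can be checked with a few lines of Python. The variable names are hypothetical; the numbers are those of the embodiment.

```python
father = list(range(9))          # base row number array [0, 1, ..., 8]
rows_510162 = [1, 3, 4, 7, 8]    # row numbers of ID number 510162

# The minimum value currently stored at these subscripts is 1.
smallest = min(father[r] for r in rows_510162)
for r in rows_510162:
    father[r] = smallest

relation = {smallest: rows_510162}   # relation set: 1 -> [1, 3, 4, 7, 8]
print(father)                        # [0, 1, 2, 1, 1, 5, 6, 1, 1]
```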
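Next, the row numbers [2,3,4,5,6] are traversed. father[2] is 2, but on reaching row number 3, father[3] is 1; since 1 < 2, the minimum value becomes 1 and the rows are merged under subscript 1, so the relation set becomes 1 -> [1,2,3,4,7,8]. After the remaining row numbers are processed, the array is father [0,1,1,1,1,1,1,1,1], as shown in FIG. 5.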
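The remaining row number arrays [0,5], [4,7] and [0,1,2,6,7,8] are then traversed in the same way, after which the array is father [0,0,0,0,0,0,0,0,0]. All elements have the same value, so the ID numbers [510130,510159,510162,510216,510224] belong to the same social network, as shown in FIG. 6.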
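The same grouping can be cross-checked with a standard disjoint-set (union-find) structure. This is a swapped-in textbook technique, not the procedure the patent describes. Because the smaller root is always kept, every component is labeled by its minimum row number, which is comparable to the final father array of the embodiment. The helper names `find` and `union` are hypothetical.

```python
def find(parent, x):
    """Return the root of x, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets of a and b, keeping the smaller root as the label."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


# The five row number arrays from the embodiment, in traversal order.
row_lists = [[1, 3, 4, 7, 8], [2, 3, 4, 5, 6], [0, 5], [4, 7], [0, 1, 2, 6, 7, 8]]

parent = list(range(9))
for rows in row_lists:
    for r in rows[1:]:
        union(parent, rows[0], r)

roots = [find(parent, r) for r in range(9)]
print(roots)  # [0, 0, 0, 0, 0, 0, 0, 0, 0]
```

All nine rows resolve to root 0, so the five ID numbers fall into a single network.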
Nodes and edges are derived from the data in the same social network so that the network can be visualized on a page, and the node and edge data is stored in Redis. With numbers denoting mobile phone numbers and letters denoting home addresses, the data format of the nodes and edges is as shown in FIG. 7.
The network information is presented on a web page according to the nodes and edges, as shown in FIG. 8.
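Since FIG. 7 is not reproduced here, the actual node and edge format is unknown. The sketch below shows one plausible shape, purely as an assumption: an edge links two persons who share an attribute, labeled with that attribute (a number for a phone, a letter for an address). The field names `id`, `source`, `target` and `label`, and the sample data, are hypothetical. The resulting JSON string is the kind of value that could be written to Redis with a client's string `SET` command.

```python
import json
from itertools import combinations

# Hypothetical shared attributes: attribute -> ID numbers that hold it.
attr_to_nodes = {
    "13800000001": ["510130", "510159"],  # a mobile phone number
    "A": ["510159", "510162"],            # a home address
}

# One node per person appearing in the network.
nodes = sorted({n for members in attr_to_nodes.values() for n in members})

# One edge per pair of persons sharing an attribute.
edges = [
    {"source": a, "target": b, "label": attr}
    for attr, members in attr_to_nodes.items()
    for a, b in combinations(sorted(members), 2)
]

payload = json.dumps({"nodes": [{"id": n} for n in nodes], "edges": edges})
print(payload)
```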
The above are merely representative examples of the many specific applications of the invention and do not limit its scope of protection in any way. All technical solutions formed by transformation or equivalent substitution fall within the scope of protection of the invention.
Claims (5)
1. A method for generating a lightweight social network at the ten-million-node scale, characterized by comprising the following steps:
Step A: acquiring original data from a data source, and generating nodes and node attributes, wherein the original data comprises the ID card number, mobile phone number, company address and home address of an individual user;
Step B: processing the node attributes into key-value pairs that map each attribute to its corresponding nodes, and replacing each attribute with a row number to form a list of the nodes corresponding to each row number;
Step C: declaring a base row number array in which the value of each column is its own subscript, i.e. 0, 1, 2, 3, 4, 5, …, n, and inverting the list of nodes corresponding to each row number into a list of the row numbers corresponding to each node;
Step D: traversing the row number list of each node in turn; when a node's row number list is traversed, updating the corresponding column values of the base row number array to the minimum column value in that row number list, and generating a relation set of the minimum column value and the row numbers;
Step E: after the row number lists of all nodes have been traversed, obtaining the final base row number array, and merging the nodes whose row numbers have the same column value in the final base row number array into the same social network.
2. The method for generating a lightweight social network at the ten-million-node scale according to claim 1, wherein in step A, the data source is configurable, and the configurable sources include third-party interfaces, databases and files.
3. The method for generating a lightweight social network at the ten-million-node scale according to claim 1, wherein in step C, in the generated list of row numbers corresponding to each node, the row numbers of each node are arranged in ascending order.
4. The method for generating a lightweight social network at the ten-million-node scale according to claim 1, wherein in step D, after the row number list of one node has been traversed, the base row number array is updated and the relation set of the minimum column value and the row numbers is maintained; the row number list of the next node is then traversed on the basis of the updated base row number array, the corresponding column values of the base row number array and the relation set are updated in the same way, and this continues until the row number lists of all nodes have been traversed, which completes the update of the base row number array.
5. The method for generating a lightweight social network at the ten-million-node scale according to any one of claims 1 to 4, wherein nodes and edges are derived from the data of the social network to form a visualized social network, and whether to store the node and edge data is decided according to the actual application.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011497282.2A CN112541042B (en) | 2020-12-17 | 2020-12-17 | Method for generating lightweight social network under ten million orders of magnitude |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112541042A (en) | 2021-03-23 |
CN112541042B (en) | 2022-11-04 |
Family
ID=75019068
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011497282.2A Active CN112541042B (en) | 2020-12-17 | 2020-12-17 | Method for generating lightweight social network under ten million orders of magnitude |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112541042B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN112541042A (en) | 2021-03-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||